Problem in restart after merging the parallel code #133

Closed
giovannipizzi opened this issue Apr 3, 2017 · 5 comments · Fixed by #149 or #151

@giovannipizzi
Member

From Arash:
With the benzene example, if I run a wannierisation and then do a second run setting "restart = wannierise" (or indeed "restart = plot"), I get an error:

forrtl: severe (24): end-of-file during read, unit 11, file /export111/work/aam/w90-examples/example12-val/benzene.chk

Or:
With example01, example02, etc., instead of the end-of-file error on reading the .chk file, I sometimes get the following error on the first cycle of the restart:

Cycle:      1
 wann_main: ZHEEV in internal_new_u_and_m failed, info=            3
            trying Schur decomposition instead
 wann_main: SCHUR failed, info=            4
 Exiting.......
 wann_main: problem computing schur form 1
@jryates
Member

jryates commented Apr 10, 2017

I think this is one problematic part of the code: in the comms_gatherv call there is a reference to m_matrix which should be m_matrix_1b. Changing it fixes the lead restart on my machine.
However, this has nothing to do with the gamma-point code, so the original issue must be something different. I can't reproduce any problems with the benzene example (various compilers, valgrind, etc.).

do nn = 1, nntot
  m_matrix_1b_loc=m_matrix_loc(:,:,nn,:)
  call comms_gatherv(m_matrix_1b_loc,num_wann*num_wann*counts(my_node_id),&
       m_matrix,num_wann*num_wann*counts,num_wann*num_wann*displs)
  call comms_bcast(m_matrix_1b(1,1,1),num_wann*num_wann*num_kpts)
  m_matrix(:,:,nn,:)=m_matrix_1b(:,:,:)
end do !nn
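
For readers who want to try the per-neighbour gather in isolation, here is a minimal standalone sketch of the pattern with m_matrix_1b as the receive buffer, as suggested in the comment above. It is not the Wannier90 source. It assumes plain MPI_Gatherv and MPI_Bcast calls in place of the comms_gatherv and comms_bcast wrappers (whose internals are not shown in this thread), toy array sizes, rank 0 as root, and a simple contiguous split of k-points across ranks. The helper arrays recvcounts and recvdispls and the fill values are hypothetical.

```fortran
program gather_one_neighbour_at_a_time
  use mpi
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Toy sizes, chosen only for illustration (assumption, not from the issue)
  integer, parameter :: num_wann = 2, nntot = 3, num_kpts = 5
  integer :: ierr, my_node_id, num_nodes, i, nn, nkp, nkp_loc
  integer, allocatable :: counts(:), displs(:), recvcounts(:), recvdispls(:)
  complex(dp), allocatable :: m_matrix_loc(:,:,:,:), m_matrix_1b_loc(:,:,:)
  complex(dp), allocatable :: m_matrix(:,:,:,:), m_matrix_1b(:,:,:)
  logical :: ok

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, my_node_id, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, num_nodes, ierr)

  ! Assumed distribution: contiguous chunks of k-points, one chunk per rank
  allocate(counts(0:num_nodes-1), displs(0:num_nodes-1))
  allocate(recvcounts(0:num_nodes-1), recvdispls(0:num_nodes-1))
  do i = 0, num_nodes - 1
    counts(i) = num_kpts/num_nodes
    if (i < mod(num_kpts, num_nodes)) counts(i) = counts(i) + 1
  end do
  displs(0) = 0
  do i = 1, num_nodes - 1
    displs(i) = displs(i-1) + counts(i-1)
  end do
  ! Counts and offsets in matrix elements rather than k-points
  recvcounts = num_wann*num_wann*counts
  recvdispls = num_wann*num_wann*displs
  nkp_loc = counts(my_node_id)

  allocate(m_matrix_loc(num_wann, num_wann, nntot, nkp_loc))
  allocate(m_matrix_1b_loc(num_wann, num_wann, nkp_loc))
  allocate(m_matrix_1b(num_wann, num_wann, num_kpts))
  allocate(m_matrix(num_wann, num_wann, nntot, num_kpts))

  ! Fill each local block with a value encoding (neighbour, global k-point)
  do nkp = 1, nkp_loc
    do nn = 1, nntot
      m_matrix_loc(:, :, nn, nkp) = cmplx(nn, displs(my_node_id) + nkp, kind=dp)
    end do
  end do

  m_matrix = (0.0_dp, 0.0_dp)
  do nn = 1, nntot
    ! Contiguous copy of the local data for neighbour nn
    m_matrix_1b_loc = m_matrix_loc(:, :, nn, :)
    ! Gather into the 3D buffer m_matrix_1b, not into the 4D m_matrix
    call MPI_Gatherv(m_matrix_1b_loc, num_wann*num_wann*nkp_loc, MPI_DOUBLE_COMPLEX, &
                     m_matrix_1b, recvcounts, recvdispls, MPI_DOUBLE_COMPLEX, &
                     0, MPI_COMM_WORLD, ierr)
    call MPI_Bcast(m_matrix_1b, num_wann*num_wann*num_kpts, MPI_DOUBLE_COMPLEX, &
                   0, MPI_COMM_WORLD, ierr)
    ! Only now place the gathered data into slice nn of the full array
    m_matrix(:, :, nn, :) = m_matrix_1b
  end do

  ! Every rank checks that each block carries the value it was filled with
  ok = .true.
  do nkp = 1, num_kpts
    do nn = 1, nntot
      if (any(m_matrix(:, :, nn, nkp) /= cmplx(nn, nkp, kind=dp))) ok = .false.
    end do
  end do
  if (my_node_id == 0) print *, 'gathered m_matrix is consistent: ', ok

  call MPI_Finalize(ierr)
end program gather_one_neighbour_at_a_time
```

The counts and displacements describe a layout with num_kpts as the last dimension, which matches m_matrix_1b. In the 4D m_matrix the neighbour index nn sits between the matrix indices and the k-point index, so data gathered straight into it with the same offsets would not land in slice nn.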

@giovannipizzi
Member Author

@jryates
Member

jryates commented Dec 5, 2017

I think the test is wrong. The problem was not with reading the chk file, it was with writing it. So the test has a wrong chk file.

@giovannipizzi
Member Author

Interesting, thanks! I'll see if I can fix this with a new chk file.

@giovannipizzi
Member Author

It was even simpler than that... there was no reference file! I committed one and created a PR so that we also have this test case in; feel free to merge if it's OK. Thanks!
