VASP Drone and onsite_density_matrix causing large document sizes #577

Open
mkhorton opened this issue Jan 27, 2021 · 2 comments
Labels
bug, improvement (reported issues that considered further improvement to atomate)

Comments

@mkhorton
Contributor

I have seen an example of a calculation (~200 atoms, ~50 SCF steps) where the task document size goes past 16 MB, with the vast majority of this due to the onsite_density_matrix in the OUTCAR.

Creating this issue to keep an eye on it. Possibilities are (1) a bug in parsing the matrix, (2) a sub-optimal representation of the matrix, or (3) that we shouldn't be storing this at all, except perhaps for the last SCF step. I have not had an opportunity to investigate further yet; if anyone wants a test file, let me know.
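
As a starting point for telling these possibilities apart, one could measure how much each key contributes to the serialized document. The following is only a rough sketch: it uses the JSON-encoded length as a proxy for the stored size, `field_sizes` is a hypothetical helper (not part of atomate or pymatgen), and `toy_outcar` is made-up data, not a real parsed OUTCAR.

```python
import json


def field_sizes(doc):
    """Return {key: approximate serialized size in bytes}, largest first.

    JSON length is used as a rough stand-in for the stored document size.
    """
    sizes = {
        key: len(json.dumps(value, default=str).encode("utf-8"))
        for key, value in doc.items()
    }
    return dict(sorted(sizes.items(), key=lambda kv: kv[1], reverse=True))


# Toy data (assumption): a list of dicts keyed by spin, each holding a 5x5 matrix.
toy_outcar = {
    "efermi": 1.23,
    "onsite_density_matrix": [
        {"1": [[0.0] * 5 for _ in range(5)], "-1": [[0.0] * 5 for _ in range(5)]}
        for _ in range(1000)
    ],
}

sizes = field_sizes(toy_outcar)
largest = next(iter(sizes))
print(largest, sizes[largest])
```

Running something like this on the real task document would show whether the matrix key really dominates before deciding how to handle it.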

@utf
Member

utf commented Jan 27, 2021

Thanks @mkhorton.

This is actually something we had to deal with in emmet-cli recently: https://github.com/materialsproject/emmet/blob/e30cbf2d6856d51dd7149ee253c4eb1ea969ddc9/emmet-cli/emmet/cli/utils.py#L394

I agree it would be better to handle this in the drone directly. Do you know of any potential uses for the onsite_density_matrix data? As in, is there any downside to always removing it?

@mkhorton
Contributor Author

@acrutt brought this to my attention; we can share the example file privately if it's helpful.

I don't think this is data we'd commonly need... I think I'm actually to blame for this: I added the parsing to the Outcar two years ago, though I can't recall the context now.

In the example file, it ends up being a list of dicts (15504 elements) keyed by spin (+1, -1).

I think we could probably safely remove the key from the drone. The way this data is represented could probably also be improved at a later date in pymatgen: I think the current representation is basically a 1-to-1 equivalent of how it's stored in the OUTCAR, except as a list of dicts, and I don't think this is very sensible.
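
A minimal sketch of what removing the key could look like, operating on a plain dictionary. The layout assumed here (`calcs_reversed` entries with the OUTCAR data under `output` then `outcar`) and the key name taken from this issue's title are assumptions for illustration. `strip_outcar_key` is a hypothetical helper, not an existing atomate function, and the real key name in the parsed data may differ.

```python
import copy


def strip_outcar_key(task_doc, key="onsite_density_matrix"):
    """Return a copy of task_doc with `key` dropped from each calculation's OUTCAR data.

    Assumes OUTCAR data lives at calcs_reversed[i]["output"]["outcar"].
    """
    cleaned = copy.deepcopy(task_doc)  # leave the caller's document untouched
    for calc in cleaned.get("calcs_reversed", []):
        outcar = calc.get("output", {}).get("outcar")
        if isinstance(outcar, dict):
            outcar.pop(key, None)  # no error if the key is absent
    return cleaned


# Toy task document (made-up values) for illustration only.
toy_doc = {
    "calcs_reversed": [
        {
            "output": {
                "outcar": {
                    "efermi": 1.23,
                    "onsite_density_matrix": [{"1": [[0.5]], "-1": [[0.5]]}],
                }
            }
        }
    ]
}

cleaned = strip_outcar_key(toy_doc)
print(sorted(cleaned["calcs_reversed"][0]["output"]["outcar"]))
```

Working on a deep copy keeps the original document available, which may be useful if the full data is wanted elsewhere (for example, stored separately).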

@itsduowang added the bug, enhancement, and improvement labels and removed the enhancement label on Feb 8, 2022