Dice required tremendous amount of RAM #21

Open
vvp-nsk opened this issue Mar 15, 2024 · 11 comments

@vvp-nsk

vvp-nsk commented Mar 15, 2024

Hej!

Please correct me if I am wrong, but the SHCI method in Dice is not shared-memory parallelized. I am struggling to get it to work for a moderate-size problem (74e, 179o). It requires so much memory per process that even 64 GB of RAM is not sufficient to run it with epsilon_1=1.0e-4. My idea now is to reduce the number of virtual orbitals. Could you please explain the purpose of the 'nvirt' keyword implemented in Dice?

Thank you in advance!

With best regards,
Victor

P.S. Neither the 'diskdavidson' nor the 'directdavidson' algorithm helps.

@sanshar
Owner

sanshar commented Mar 15, 2024 via email

@vvp-nsk
Author

vvp-nsk commented Mar 15, 2024

Hej!

I agree that HCI approximates FCI at the cost of exponential scaling. In other words, smaller epsilon thresholds lead to an extremely large CI space. That is still OK.

> With 179 orbitals, the cost of storing the integrals themselves, which roughly scales as n^4, will be tremendous.

Actually, it is about 8 GB, which is not much.
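The 8 GB figure can be checked with a back-of-the-envelope estimate. The sketch below assumes double-precision (8-byte) values; the helper name `eri_memory_gb` is hypothetical, and the 8-fold-symmetric count applies only to real orbitals. How Dice actually stores the integrals is not stated in this thread.

```python
def eri_memory_gb(norb: int, bytes_per_value: int = 8) -> dict:
    """Estimate storage for two-electron integrals over norb orbitals."""
    dense = norb ** 4                  # full (pq|rs) array, no symmetry
    npair = norb * (norb + 1) // 2     # unique pq pairs with p >= q
    sym8 = npair * (npair + 1) // 2    # unique (pq|rs) with 8-fold symmetry
    to_gb = bytes_per_value / 1e9      # bytes per value, expressed in GB
    return {"dense_gb": dense * to_gb, "sym8_gb": sym8 * to_gb}


est = eri_memory_gb(179)
print(f"dense: {est['dense_gb']:.1f} GB, 8-fold: {est['sym8_gb']:.1f} GB")
# prints "dense: 8.2 GB, 8-fold: 1.0 GB"
```

Either way, the integrals alone do not explain exhausting 64 GB per process; the memory pressure would have to come from the determinant space and the Hamiltonian built over it.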

I am puzzled about how to force Dice to treat only a certain subset of occupied and virtual orbitals as active, i.e., how to instruct Dice that:

  • the first N doubly occupied orbitals are frozen
  • the last K virtual orbitals are inactive

In this way, one can drastically reduce the dimensionality of the Hilbert space but still get a realistic CI expansion. For instance, performing HCI with very tight thresholds in a valence-only active space might still provide a reasonable multi-Slater-determinant (MSD) trial for subsequent AFQMC, IMHO.
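To put the reduction in perspective, here is a minimal sketch that counts determinants in the full and in a reduced space. It ignores spatial symmetry and assumes a closed-shell electron count; N = 20 and K = 50 are purely illustrative values, and `n_determinants` is a hypothetical helper. HCI never stores the full space, but its size bounds what the selection can grow into.

```python
from math import comb, log10


def n_determinants(nelec: int, norb: int) -> int:
    """Count determinants with the lowest |Sz|, ignoring spatial symmetry."""
    nalpha = nelec // 2
    nbeta = nelec - nalpha
    # Alpha and beta occupations are chosen independently.
    return comb(norb, nalpha) * comb(norb, nbeta)


n_frozen, n_dropped = 20, 50  # illustrative N and K
full = n_determinants(74, 179)
reduced = n_determinants(74 - 2 * n_frozen, 179 - n_frozen - n_dropped)

print(f"full space:    about 10^{log10(full):.0f} determinants")
print(f"reduced space: about 10^{log10(reduced):.0f} determinants")
```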

Another question: if one has the CI coefficients and the corresponding determinant list (produced by 'fci.addons.large_ci'), how can this information be dumped into the 'dets.bin' file? Perhaps you can point to a dedicated function already implemented and available in QMCUtils.py?

Thank you in advance!

With best regards,
Victor

@xubwa
Collaborator

xubwa commented Mar 15, 2024

The nvirt keyword is no longer used in the HCI code. 64 GB of RAM is not large; you can try a larger epsilon, which will require fewer resources, or use more resources, either by providing more RAM or by distributing the calculation over several nodes. If you want to freeze several core and virtual orbitals, you may just exclude them in the CI calculation.

@vvp-nsk
Author

vvp-nsk commented Mar 15, 2024

Hej!

> If you want to freeze several core and virtual orbitals, you may just exclude them in the CI calculation.

Could you please provide a working example of how to do so?

With best regards,
Victor

@xubwa
Collaborator

xubwa commented Mar 15, 2024

Just shrink your (74e,179o) calculation to whatever space you actually want to correlate. If you want to freeze 20 doubly occupied orbitals and drop 50 empty orbitals, then do a calculation with (34e,109o).
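The bookkeeping behind those numbers is simple: each frozen doubly occupied orbital removes two electrons and one orbital, and each dropped virtual removes one orbital. A minimal sketch, assuming a closed-shell reference (the helper `shrink_active_space` is hypothetical, not part of Dice):

```python
def shrink_active_space(nelec, norb, n_frozen_core, n_dropped_virt):
    """Return (nelec, norb) after freezing core and dropping virtual orbitals."""
    n_occ = nelec // 2  # doubly occupied orbitals in a closed-shell reference
    if n_frozen_core > n_occ:
        raise ValueError("cannot freeze more orbitals than are occupied")
    if n_dropped_virt > norb - n_occ:
        raise ValueError("cannot drop more virtuals than exist")
    active_nelec = nelec - 2 * n_frozen_core
    active_norb = norb - n_frozen_core - n_dropped_virt
    return active_nelec, active_norb


print(shrink_active_space(74, 179, 20, 50))  # (34, 109)
```

The integrals passed to Dice then have to be generated for this smaller space, with the frozen orbitals' contribution folded into the core energy and the one-electron integrals.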

@vvp-nsk
Author

vvp-nsk commented Mar 15, 2024

Hej!

To the best of my knowledge, Dice is not capable of performing such shrinking. Please correct me if I am wrong. Do you mean shrinking the orbital space at the stage of generating the FCIDUMP?

Also, could you please shed light on the 'restricted' keyword? Sorry to ask so many questions, but the documentation is a bit scarce.

With best regards,
Victor

@xubwa
Collaborator

xubwa commented Mar 15, 2024

Yes, using a smaller active space would be appropriate for the purpose you described. The restrict keyword is only experimental (correct me if I'm wrong, @sanshar); it is used to mimic a restricted active space or generalized active space.

@ankit76
Collaborator

ankit76 commented Mar 15, 2024

If you are looking for a script that does core averaging, you can make a dummy casscf or hciscf object; for example, see https://github.com/sanshar/Dice/blob/master/examples/DQMC/hci_trials/n2.py#L27

For the dets file, I don't have a python function that writes dets from pyscf, but it should be easy to write based on this function https://github.com/sanshar/Dice/blob/master/scripts/QMCUtils.py#L271

@vvp-nsk
Author

vvp-nsk commented Mar 20, 2024

Hej!

> For the dets file, I don't have a python function that writes dets from pyscf, but it should be easy to write based on this function https://github.com/sanshar/Dice/blob/master/scripts/QMCUtils.py#L271

If I grab the CI vector from PySCF (or Arrow), do I need to worry about phase conversion, as is implemented in IPIE?
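For reference, the kind of phase in question can be sketched as follows. This assumes one code orders creation operators as all alpha followed by all beta, while the other interleaves them orbital by orbital (alpha before beta within an orbital). Which conventions PySCF, Dice, and IPIE actually use is not stated in this thread and should be checked; `reorder_phase` is a hypothetical helper.

```python
def reorder_phase(occ_alpha, occ_beta):
    """Sign from moving alpha-then-beta operators into interleaved order.

    occ_alpha and occ_beta are lists of occupied orbital indices.
    Every (alpha i, beta j) pair with j < i must be swapped once, and
    each swap of two fermionic operators contributes a factor of -1.
    """
    inversions = sum(1 for i in occ_alpha for j in occ_beta if j < i)
    return -1 if inversions % 2 else 1


# Orbitals 0 and 1 doubly occupied: alpha(1) must pass beta(0), one swap.
print(reorder_phase([0, 1], [0, 1]))  # -1
```

If the two codes share the same ordering, no conversion is needed and every coefficient keeps its sign.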

Thank you in advance!

With best regards,
Victor

@sanshar
Owner

sanshar commented Mar 20, 2024 via email

@ndattani
Collaborator

I agree with @sanshar that (74e,179o) is not a "moderate-sized problem" and I will even say that it's near (or beyond) the limit of what can be done in 2024 with state-of-the-art supercomputers, depending on the accuracy that you want (e.g. ±20 mH, or ±10 mH, or ±1 mH, or ±0.1 mH).

I also agree with @xubwa that 64 GB is not "so much RAM".

We did (113e,76o) for FeMoco in this paper but did not get any meaningful energies (the small calculation was just done to show the occupation numbers, and how nasty the problem is). Even for (54e,54o) in the same paper, the difference between E_var and E_tot was more than 10 mH for the smallest value of epsilon (which I believe required nodes with 1 TB of RAM). The final extrapolated SHCI energy differed from the extrapolated DMRG energy by more than 2 mH, and based on how the extrapolated SHCI energy was changing as more and more values of epsilon were used and plotted, I would not "bet" that the final SHCI energy was within ±1 mH of the "true" ground-state eigenvalue for this Hamiltonian.

Therefore, for a system of your size, a reasonable target precision might be ±20 mH or ±10 mH, which is much larger than what is usually desired in quantum chemistry. So (74e,179o) is not a "moderate-sized" problem for SHCI, but rather quite a large one!


5 participants