Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP for openacc (GPU) for MulticolvarTemplate #1206

Open
wants to merge 4 commits into
base: final-backpropegation
Choose a base branch
from

Conversation

Iximiel
Copy link
Member

@Iximiel Iximiel commented Feb 20, 2025

I am opening this PR to have a clearer diff and to make a few checks

In particular I need to understand why some tests do not pass anymore the forces part. I did not touch that part yet... (could be that skipping of the pbc passage 🤔)
So I need a clearer diff.

At the time of writing this the /targetGPU/rt-matrixsum that is adjmat/rt-matrixsum passes the "value" test but not the "force test"

If you try to compile/modify this remember that the dependency part of the build tree is somehow broken with nvc++ (at least with mine) and as now I am not trying to fix it. So to compile after a modification you have to manually delete the interested files (like this rm src/colvar/Distance.o ; make -j12)

@Iximiel Iximiel marked this pull request as draft February 20, 2025 10:27
@Iximiel Iximiel changed the title [] WIP for openacc (GPU) for MulticolvarTemplate Feb 20, 2025
@Iximiel Iximiel force-pushed the gpu/dfbrun branch 3 times, most recently from 8f15f0d to 35b742a Compare February 21, 2025 14:59
@Iximiel
Copy link
Member Author

Iximiel commented Feb 21, 2025

@GiovanniBussi @gtribello
As it is now it seems to work, IF you do not look at the virial

Now if you run the new targetGPU/rt-basic-print/, the one with the USEGPU flags, it passes 😄, I think the forward loop is ok

The next thing to do (after finding a solution for the virial) will be documenting what I did, since working with openacc feels more or less like playing with an escherian jenga tower

@gtribello
Copy link
Member

@Iximiel, the stuff for the secondary structure using the new ParallelTaskManager is in the branch called gpuwork/secondarystructure

@gtribello
Copy link
Member

Hi @Iximiel

Further to my message the code for the new secondary structure stuff is in the gpuwork/secondarystructure-2

This version of the branch was built off final-backpropegation

@Iximiel
Copy link
Member Author

Iximiel commented Feb 26, 2025

I managed to make also the virial work 😄

as now is not mergeable, becasue we are spending more time in printf-debugging that other. But I wanted to show that the code was working.

I would like to do a small change while I a prepare this to be merged:
@gtribello do you mind if I change the name of the variable nderivatives_per_task into nderivatives_per_component in the ParallelTaskManager? It is more clear, and it was the casue of the confusion in the error of the virial.

I have already did something similar in ColvarOutput.h (where I also unrolled and removed the memory copies in setBoxDerivativesNoPbc, becasue I tought that there where a problem there...), now it it slighly clear, I think with the names of the variables.

@gtribello
Copy link
Member

Hi @Iximiel

I made that change of variable name in the final-backpropegation branch that I pushed yesterday. If you have other changes that you think should be incorporated from the work you have done here then please PR them onto that branch.

Thanks
Gareth

@Iximiel Iximiel changed the base branch from gpu-derivatives-from-backpropegation to final-backpropegation February 26, 2025 14:50
@Iximiel
Copy link
Member Author

Iximiel commented Feb 27, 2025

The changes in configure.ac in 37388af make working with nvhpc compiler much much easier

@Iximiel Iximiel force-pushed the gpu/dfbrun branch 5 times, most recently from db55c29 to 0891e5b Compare February 27, 2025 09:03
@Iximiel
Copy link
Member Author

Iximiel commented Feb 27, 2025

rebased on final-backpropagation, need to adapt it to the new changes

@Iximiel
Copy link
Member Author

Iximiel commented Feb 27, 2025

@gtribello it works (at least on my PC), on the branch more aligned with the final-backpropagation

I have included the change I proposed here

I think I just need to refine a few things, but it is basically done:

  • the configure part
  • understand/advise/workaround the fact that if in an input file you specify POSITION and DISTANCE both using the GPU you'll get a crash and I have NULL ideas why
  • maybe adding a few more tests, just to be sure.
  • understand why all the CI does not like my code

@Iximiel Iximiel force-pushed the gpu/dfbrun branch 3 times, most recently from bc858c1 to efb7b76 Compare February 28, 2025 08:59
@Iximiel Iximiel marked this pull request as ready for review February 28, 2025 09:01
Comment on lines +22 to +25
//this is temporary:
#ifdef __PLUMED_HAS_OPENACC
#define __PLUMED_USE_OPENACC 1
#endif //__PLUMED_HAS_OPENACC
Copy link
Member Author

@Iximiel Iximiel Feb 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gtribello I think it is done: with this last change I should have solved the cppcheck about __PLUMED_HAS_OPENACC not being defined somewhere

My idea is to activate __PLUMED_USE_OPENACC with __PLUMED_HAS_OPENACC while it is "experimental" and then remove these "preheaders" and use __PLUMED_HAS_OPENACC directly

Moreover now there is a "standard" way of compiling with openacc

I think now this is ready to be merged into final-backpropagation, so that I can rebase on these changes the work I am doing on the secondarystructure-2 branch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants