Evolution to GALAHAD 5 #272

Merged
merged 77 commits into from
Jun 5, 2024

amontoison
Member

@amontoison amontoison commented Apr 9, 2024

Nick: These are all of the preparatory changes for GALAHAD 5. The new edition has new interfaces to Julia, and supports the optional multiprecision HSL subset, along with other less significant changes.

@amontoison amontoison mentioned this pull request Apr 9, 2024
@amontoison
Member Author

@nimgould You forgot to push hsl_subset.h.

@nimgould
Contributor

nimgould commented Apr 9, 2024

I added it. Did I need to? I am a bit lost as to where things are, what is mine and what has been created by you

@amontoison
Member Author

amontoison commented Apr 9, 2024

> @nimgould You forgot to push hsl_subset.h.

I just created an empty meson.build file in src/metis.
You can click directly on the commits here to see what I modified but it's almost nothing.

@nimgould
Contributor

I can no longer commit, I get

nothing added to commit but untracked files present (use "git add" to track)
To https://github.com/ralna/GALAHAD
! [rejected] galahad5-nick -> galahad5-nick (non-fast-forward)
error: failed to push some refs to 'https://github.com/ralna/GALAHAD'
hint: Updates were rejected because the tip of your current branch is behind
hint: its remote counterpart. Integrate the remote changes (e.g.
hint: 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

I tried a git pull, but that didn't help.

Can we get rid of the old galahad5, or are we keeping that for anything?

@nimgould
Contributor

git status says

On branch galahad5-nick
Your branch and 'origin/galahad5-nick' have diverged,
and have 1 and 1 different commits each, respectively.
(use "git pull" to merge the remote branch into yours)

Last command done (1 command done):
pick 448dac2 Evolution to GALAHAD 5
No commands remaining.
You are currently editing a commit while rebasing branch 'galahad5' on '82e7fa91'.

So I seem to be on galahad5-nick, but it is complaining about a rebase of galahad5. How do I stop the rebase, and is this what is confusing everything?
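
The "diverged, 1 and 1 different commits" status and the earlier non-fast-forward rejection describe the same situation: each branch tip holds a commit that the other lacks. A minimal Python sketch of that check follows. It is a toy model, not git's actual implementation, and the commit names `base`, `local_commit`, `remote_commit` and `rebased_commit` are hypothetical stand-ins.

```python
def ancestors(commit, parents):
    """Return the commit itself plus everything reachable through parent links."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def ahead_behind(local_tip, remote_tip, parents):
    """Count commits only on the local side and only on the remote side."""
    local = ancestors(local_tip, parents)
    remote = ancestors(remote_tip, parents)
    return len(local - remote), len(remote - local)


def is_fast_forward(local_tip, remote_tip, parents):
    """A push is a fast-forward when the remote tip is an ancestor of the local tip."""
    return remote_tip in ancestors(local_tip, parents)


if __name__ == "__main__":
    # Hypothetical history: both sides added one commit on top of the same base.
    history = {"base": [], "local_commit": ["base"], "remote_commit": ["base"]}
    print(ahead_behind("local_commit", "remote_commit", history))
    print(is_fast_forward("local_commit", "remote_commit", history))
```

In this model, replaying the local commit on top of the remote tip (which is what a completed rebase does) makes the remote tip an ancestor of the local one, so the push would then count as a fast-forward.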

@nimgould
Contributor

% git pull
fatal: It seems that there is already a rebase-merge directory, and
I wonder if you are in the middle of another rebase. If that is the
case, please try
git rebase (--continue | --abort | --skip)
If that is not the case, please
rm -fr ".git/rebase-merge"
and run me again. I am stopping in case you still have something
valuable there.

@nimgould
Contributor

(PS I tried to fix the broken galahad_modules.h, but it won't accept the push)

@amontoison
Member Author

amontoison commented Apr 10, 2024

git rebase --abort
git fetch origin
git rebase origin/galahad5-nick

Fix the conflicts, then add the files that had conflicts:
git add include/galahad_modules.h
git add fileXYZ.f90

git rebase --continue
git push

@nimgould
Contributor

Git is really broken. I follow your instructions, update all of the compromised files, git rebase --continue, but then at the next commit, all the files I updated have conflicts reinserted. How do I get out of this endless loop?

@nimgould
Contributor

Despite using the git rebase origin/galahad5-nick command, it keeps doing things to galahad5 (no nick), and then complains that all of the changes I made are now conflicted again. I've been through the same cycle of fixing conflicts four times now, and I cannot see how this will ever converge.

@nimgould
Contributor

This was my mistake, I missed the git checkout galahad5-nick, and all is now fine. How do I get it to run the actions?

@amontoison
Member Author

> This was my mistake, I missed the git checkout galahad5-nick, and all is now fine. How do I get it to run the actions?

git push

@nimgould
Contributor

Thank you. I thought I'd done that, but obviously not.

For some reason the build gives

./src/external/hsl/hsl_ma48/hsl_ma48r.f90:139:11:

  139 | USE GALAHAD_SYMBOLS
      | 1
Fatal Error: Cannot open module file ‘galahad_symbols.mod’ for reading at (1): No such file or directory

But the earlier

[656/2783] Compiling Fortran object libgalahad_single.so.p/meson-generated_single_symbols.f90.o

should provide this module file. Is something missing in meson?

@nimgould
Contributor

That was my blunder, a programming bug that wasn't picked up locally.

@nimgould
Contributor

Another one: this looks like a Meson issue. This is supposed to be a 32-bit integer, single-precision compile, but the -DINTEGER_64 flag is being passed to the compiler:

gfortran -Ilibgalahad_single.so.p -I. -I.. -Iinclude -I../include -I../src/dum/include -I../src/metis/include -Isrc/ampl -I../src/ampl -fdiagnostics-color=always -D_FILE_OFFSET_BITS=64 -O0 -g -fopenmp -fPIC -DSINGLE -DGALAHAD_BLAS -DGALAHAD_LAPACK -DDUMMY_SMUMPS -DDUMMY_MKL_PARDISO -DDUMMY_PARDISO -DDUMMY_PASTIXF -DDUMMY_SPMF -DDUMMY_WSMP -DDUMMY_MPI -DDUMMY_HSL -DINTEGER_64 -Jlibgalahad_single.so.p -o libgalahad_single.so.p/meson-generated_single_uls.f90.o -c libgalahad_single.so.p/single_uls.f90
../src/uls/uls.F90:44:10:

   44 | USE hsl_ma48_precision
      | 1
Fatal Error: Cannot open module file ‘gal_hsl_ma48_single_64.mod’ for reading at (1): No such file or directory

@amontoison
Member Author

amontoison commented Apr 10, 2024

Nick, you looked at a 64-bit integer build.

But we have a similar issue with a 32-bit integer build; something is wrong with the preprocessor macros for HSL:
https://github.com/ralna/GALAHAD/actions/runs/8633938903/job/23668183354?pr=272#step:12:825

@nimgould
Contributor

How do we find out what the compiler did when compiling hsl_ma57r.f90? What name did it give the resulting .mod file?

@amontoison
Member Author

amontoison commented Apr 10, 2024

@nimgould I found the culprit. With galahad_modules.h, I add the prefix gal_ to the dummy HSL modules.
I added a bunch of macros to hsl_subset.h to fix that.
I think that we should merge some of the header files.
It is getting hard to maintain all these macros spread across different files.
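
To illustrate the kind of mismatch described here, the following Python sketch models a rename table that is applied at the `USE` site but not where the dummy module is defined. It is a toy model under stated assumptions: the mapping from `hsl_ma48_precision` to `gal_hsl_ma48_single_64` is inferred from the error message earlier in this thread, while the unrenamed definition side is a hypothetical guess, not something the thread confirms. The helper names `mod_file` and `rename` are also hypothetical.

```python
def mod_file(module_name):
    """gfortran stores a module as <lower-cased module name>.mod."""
    return module_name.lower() + ".mod"


def rename(name, macros):
    """Apply a one-step rename table, standing in for a header of #define lines."""
    return macros.get(name, name)


# Inferred from the log: USE hsl_ma48_precision looked for gal_hsl_ma48_single_64.mod
use_site_macros = {"hsl_ma48_precision": "gal_hsl_ma48_single_64"}
# Hypothetical: the file defining the dummy module saw no such rename
definition_macros = {}

wanted = mod_file(rename("hsl_ma48_precision", use_site_macros))
produced = mod_file(rename("hsl_ma48_precision", definition_macros))

if __name__ == "__main__":
    print("USE site wants:", wanted)
    print("compiler wrote:", produced)
    print("names match:", wanted == produced)
```

Whenever the two tables disagree, the compiler looks for a module file that was never written, which is the shape of the "Cannot open module file" errors above.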

@amontoison
Member Author

New error:

 gfortran -Ilibgalahad_double.so.p -I. -I.. -Iinclude -I../include -I../src/dum/include -I../src/metis/include -Isrc/ampl -I../src/ampl -fdiagnostics-color=always -D_FILE_OFFSET_BITS=64 -O0 -g -fopenmp -fPIC -DDOUBLE -DGALAHAD_BLAS -DGALAHAD_LAPACK -DDUMMY_DMUMPS -DDUMMY_MKL_PARDISO -DDUMMY_PARDISO -DDUMMY_PASTIXF -DDUMMY_SPMF -DDUMMY_WSMP -DDUMMY_MPI -DDUMMY_HSL -Jlibgalahad_double.so.p -o libgalahad_double.so.p/meson-generated_double_blas_interface.f90.o -c libgalahad_double.so.p/double_blas_interface.f90
../src/lapack/blas_interface.F90:40:10:
   40 |        DOUBLE PRECISION :: DNRM2
      |          1
Error: Unclassifiable statement at (1)
../src/lapack/blas_interface.F90:51:31:
   51 |        SUBROUTINE SROTG( a, b, c, s )
      |                               1
......
   57 |        SUBROUTINE DROTG( a, b, c, s )
      |                               2
Error: Ambiguous interfaces in generic interface 'rotg' for ‘galahad_srotg’ at (1) and ‘galahad_drotg’ at (2)

@amontoison
Member Author

@nimgould Can we remove the folder src/zd11?
I don't compile it anymore because we now have src/external/hsl/hsl_zd11 and Meson is not happy to detect the same Fortran module twice.

@nimgould
Contributor

Thanks for finding the include issue. I realised that it must be something like this in the middle of the night!

Yes, we should simplify the hsl header files. The main complication is, of course, the dummy names you need for your build.

I'll see what the blas_interface problem is and fix it.

And, yes, it should be trivial to remove the explicit zd11. I will look into it.

Thanks for your help on this

@nimgould
Contributor

Unfortunately your fix to hsl_subset.h broke things here! I will recover and do it again.

#define KB21GI GALAHAD_KB21GI_64
#define KB21HI GALAHAD_KB21HI_64
#define kb07ai galahad_kb07ai_64
#ifdef DUMMY_HSL
Member Author

@nimgould
I made an error here. This #ifdef should be at line 648.

@nimgould
Contributor

I am working on this at the moment, it is a bit of a mess!

@nimgould
Contributor

I believe that the issue you reported for blas_interface.F90 is that you are using the -DDOUBLE flag; unfortunately DOUBLE is a reserved keyword in C++; we are careful elsewhere not to use DOUBLE in any of the header files. See what happens if you remove this flag
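
One way to see why a bare `-DDOUBLE` could produce the `Unclassifiable statement` error reported for `blas_interface.F90`: a `-DNAME` flag defines `NAME` as `1`, and a C-style preprocessor substitutes whole identifiers case-sensitively. The Python sketch below is a toy model of that substitution, not the real preprocessor, and `expand_macros` is a hypothetical helper name.

```python
import re


def expand_macros(line, macros):
    """Replace each whole identifier that names a macro with the macro's value."""
    def substitute(match):
        word = match.group(0)
        return macros.get(word, word)
    return re.sub(r"[A-Za-z_][A-Za-z0-9_]*", substitute, line)


# -DDOUBLE on the command line behaves like "#define DOUBLE 1"
flag_macros = {"DOUBLE": "1"}

if __name__ == "__main__":
    # Statement taken from line 40 of the error report
    print(expand_macros("DOUBLE PRECISION :: DNRM2", flag_macros))
    # Lower-case spelling is a different identifier, so it is left alone
    print(expand_macros("double precision :: dnrm2", flag_macros))
```

Under this model the declaration loses its `DOUBLE` keyword after preprocessing, which would leave the compiler with a statement it cannot classify.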

@amontoison
Member Author

> I believe that the issue you reported for blas_interface.F90 is that you are using the -DDOUBLE flag; unfortunately DOUBLE is a reserved keyword in C++; we are careful elsewhere not to use DOUBLE in any of the header files. See what happens if you remove this flag

It fixed the issue with the BLAS interface.
I checked CI builds and we can almost compile.
We have an error during linking but it's because of the preprocessed symbols (HSL / METIS).
I can fix it later today.

@nimgould
Contributor

I forgot to add a couple of local METIS files; I have them at home and will commit from there. The last actions run complained about not having mc23b; I need to check again later.

@amontoison
Member Author

We have the following error with the dummy hsl_ma57r (64-bit integer):

FAILED: libgalahad_single.so.p/meson-generated_single_ma27r.f.o 
gfortran -Ilibgalahad_single.so.p -I. -I.. -Iinclude -I../include -I../src/dum/include -I../src/metis/include -Isrc/ampl -I../src/ampl -fdiagnostics-color=always -D_FILE_OFFSET_BITS=64 -O0 -g -fopenmp -fPIC -DSINGLE -DGALAHAD_BLAS -DGALAHAD_LAPACK -DDUMMY_SMUMPS -DDUMMY_MKL_PARDISO -DDUMMY_PARDISO -DDUMMY_PASTIXF -DDUMMY_SPMF -DDUMMY_WSMP -DDUMMY_MPI -DDUMMY_HSL -DINTEGER_64 -Jlibgalahad_single.so.p -o libgalahad_single.so.p/meson-generated_single_ma27r.f.o -c libgalahad_single.so.p/single_ma27r.f
../src/external/hsl/ma27/ma27r.f:69:70:

   69 |       INTEGER ( KIND = ip_ ), INTENT( IN ), DIMENSION( n, 3 ) :: IKEEP
      |                                                                      1
Error: Symbol at (1) is not a DUMMY variable
../src/external/hsl/ma27/ma27r.f:65:68:

   65 |       INTEGER ( KIND = ip_ ), INTENT( IN ) :: n, nz, la, liw, nsteps
      |                                                                    1
Error: Symbol at (1) is not a DUMMY variable
../src/external/hsl/ma27/ma27r.f:111:71:

  111 |       INTEGER ( KIND = ip_ ), INTENT( OUT ), DIMENSION( nsteps ) :: IW1
      |                                                                       1
Error: Symbol at (1) is not a DUMMY variable
../src/external/hsl/ma27/ma27r.f:109:72:

  109 |       INTEGER ( KIND = ip_ ), INTENT( IN ) :: n, la, liw, maxfrt, nsteps
      |                                                                        1
Error: Symbol at (1) is not a DUMMY variable
../src/external/hsl/ma27/ma27r.f:188:68:

  188 |       INTEGER ( KIND = ip_ ), INTENT( IN ), DIMENSION( nblk ) :: IW2
      |                                                                    1
Error: Symbol at (1) is not a DUMMY variable
../src/external/hsl/ma27/ma27r.f:186:70:

  186 |       INTEGER ( KIND = ip_ ), INTENT( IN ) :: n, la, liw, maxfnt, nblk
      |                                                                      1
Error: Symbol at (1) is not a DUMMY variable

@nimgould
Contributor

nimgould commented Jun 5, 2024

I probably already did!
Git told me to!

@amontoison
Member Author

It's not your fault Nick. Jari is the culprit!

@nimgould
Contributor

nimgould commented Jun 5, 2024

Yup, I've lost all of the changes I made ... Of course, the messages for be3c7c8 and 64e0850 say what the changes were, so in the worst case I could try to redo them.

@nimgould
Contributor

nimgould commented Jun 5, 2024

Is it not possible to tell git to reapply the ones above?

@amontoison
Member Author

Can you try this command Nick?

git checkout 64e0850ea5ef8af8b6860f11bb033df3156612da

If it works then:

git checkout -b nick-backup
git push

@nimgould
Contributor

nimgould commented Jun 5, 2024

The checkout seems to have all the changes I made. Right, I'll try your two commands. Wish me luck ...

@amontoison
Member Author

amontoison commented Jun 5, 2024

If it's not working, we can still go to the page
https://github.com/ralna/GALAHAD/tree/64e0850ea5ef8af8b6860f11bb033df3156612da
and download the files modified by your two commits one by one, Nick.

Fortunately, GitHub keeps a copy of these commits somewhere.
It's possible that the commits are still on your computer too 🤞

@nimgould
Contributor

nimgould commented Jun 5, 2024

% git checkout -b nick-backup
Switched to a new branch 'nick-backup'
% git push
fatal: The current branch nick-backup has no upstream branch.
To push the current branch and set the remote as upstream, use

git push --set-upstream origin nick-backup

Should I do that?

@amontoison
Member Author

> % git checkout -b nick-backup
> Switched to a new branch 'nick-backup'
> % git push
> fatal: The current branch nick-backup has no upstream branch.
> To push the current branch and set the remote as upstream, use
>
> git push --set-upstream origin nick-backup
>
> Should I do that?

Yes!!!

@nimgould
Contributor

nimgould commented Jun 5, 2024

Done. It looks ok

% git push --set-upstream origin nick-backup
Enumerating objects: 64, done.
Counting objects: 100% (64/64), done.
Delta compression using up to 16 threads
Compressing objects: 100% (38/38), done.
Writing objects: 100% (38/38), 8.39 KiB | 8.39 MiB/s, done.
Total 38 (delta 30), reused 2 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (30/30), completed with 25 local objects.
remote:
remote: Create a pull request for 'nick-backup' on GitHub by visiting:
remote: https://github.com/ralna/GALAHAD/pull/new/nick-backup
remote:
To https://github.com/ralna/GALAHAD

 * [new branch] nick-backup -> nick-backup
Branch 'nick-backup' set up to track remote branch 'nick-backup' from 'origin'

@nimgould
Contributor

nimgould commented Jun 5, 2024

Of course, now I am completely lost. I will sit down on that small hill over there, and open a cold beer

@amontoison amontoison merged commit 603b07e into master Jun 5, 2024
11 of 21 checks passed
@amontoison amontoison deleted the galahad5-nick branch June 5, 2024 15:24
@amontoison
Member Author

All modifications are on the branch master. The PR is merged. 😎

@nimgould
Contributor

nimgould commented Jun 5, 2024

Hurrah! So I can now
git checkout master
right?

@amontoison
Member Author

Yep yep

@nimgould
Contributor

nimgould commented Jun 5, 2024

Thank you so much, you are my hero of the hour, perhaps day

@amontoison
Member Author

You're welcome Nick!
I have earned the right to correct my copies now...

@nimgould
Contributor

nimgould commented Jun 5, 2024

Of course, of course, you can correct as much as you like. Now I've lost that beer in the excitement

@jfowkes
Collaborator

jfowkes commented Jun 5, 2024

Thank you both and please accept my sincere apologies. I have now fixed the README on the master branch.

@jfowkes
Collaborator

jfowkes commented Jun 5, 2024

@amontoison can we kill the GitHub Pages build? No idea why that's running on master, docs are hosted from https://github.com/ralna/galahad_docs

@amontoison
Member Author

> Thank you both and please accept my sincere apologies. I have now fixed the README on the master branch.

If you pay me a beer at Montréal during your visit for ISMP, I accept your apologies 😛

@jfowkes
Collaborator

jfowkes commented Jun 5, 2024

> > Thank you both and please accept my sincere apologies. I have now fixed the README on the master branch.
>
> If you pay me a beer at Montréal during your visit for ISMP, I accept your apologies 😛

You're on, one of those famous American Pale Ales of yours 😜

@amontoison
Member Author

> @amontoison can we kill the GitHub Pages build? No idea why that's running on master, docs are hosted from https://github.com/ralna/galahad_docs

I think that I killed the build. Let's see when a new commit will be added on master.

@jfowkes
Collaborator

jfowkes commented Jun 6, 2024

> > @amontoison can we kill the GitHub Pages build? No idea why that's running on master, docs are hosted from https://github.com/ralna/galahad_docs
>
> I think that I killed the build. Let's see when a new commit will be added on master.

Thanks @amontoison. @nimgould we seem to have an old stray index.html in the root GALAHAD folder. I assume this dates from when the GALAHAD website was planned to be hosted from this repository, and that it is no longer in use?

@nimgould
Contributor

nimgould commented Jun 6, 2024

Yes, indeed. In fact it is only a week old; it was from my failed attempt to host the Fortran PDF docs from the GALAHAD site. Fortunately, following Jari's suggestion, these are now on the galahad_docs site.

@nimgould
Contributor

nimgould commented Jun 6, 2024

It will be gone at my next commit

4 participants