
Conversation

@soleti commented Jul 11, 2024

I added functionality that allows running convolve and fftconvolve with CuPy on the GPU, when one is available.
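As a sketch of this kind of CPU/GPU dispatch (assuming SciPy on the CPU and cupyx.scipy.signal on the GPU; fftconvolve_auto is an illustrative name, not the actual IC code):

```python
import numpy as np
from scipy.signal import fftconvolve as cpu_fftconvolve

def fftconvolve_auto(image, kernel):
    """Run fftconvolve on the GPU when CuPy and a CUDA device are
    present, otherwise fall back to SciPy on the CPU.
    Illustrative sketch only; not a function in IC."""
    try:
        import cupy as cp
        from cupyx.scipy.signal import fftconvolve as gpu_fftconvolve
        if cp.cuda.is_available():
            out = gpu_fftconvolve(cp.asarray(image), cp.asarray(kernel), mode="same")
            return cp.asnumpy(out)  # hand a NumPy array back to the caller
    except ImportError:
        pass  # CuPy not installed: use the CPU path below
    return cpu_fftconvolve(image, kernel, mode="same")
```

Returning a NumPy array in both branches keeps the call site oblivious to where the convolution actually ran.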

The performance gain becomes significant only when the image is large (in my tests, when bin_size < 1, see plot).

Two things I am not sure of:

  • How to test this on machines that don't have a GPU
  • Using a global flag to check if there is a GPU (how are similar cases handled in IC?)

[plot: CPU vs GPU performance as a function of bin size]

@jwaiton (Member) commented Jul 11, 2024

Hi Stefano!

I'm not sure how IC considers the specifications of the computer in use, but I believe you can avoid using a global flag by using the cupy function cupy.cuda.is_available(), as is described in this thread.
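For illustration, such a check (guarded so it also runs where CuPy is not installed; not IC code) might look like:

```python
# Sketch of the suggestion above: query CuPy directly instead of keeping
# a global flag. The ImportError guard makes it safe on machines
# without CuPy installed.
def gpu_is_usable() -> bool:
    try:
        import cupy
        return bool(cupy.cuda.is_available())
    except ImportError:
        return False
```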

I'm not too familiar with cupy, but I assume it requires a CUDA toolkit to be installed before use, right? If so, installing that toolkit could be part of building IC, and a build-time check for a CUDA-compatible GPU would help prevent users without one from installing it. I'm not sure how tricky it would be to check this within bash, but I can have a look.

@soleti (Author) commented Jul 11, 2024

@jwaiton the global flag is not needed to check if CUDA is available; I am already doing that. It is needed because the code contains a couple of ifs that depend on the presence of CUDA, and I wanted to avoid calling cupy.cuda.is_available() every time. Setting a cupy_available flag at installation time is a good idea, though, but I don't know how to pass that information to the module. If you could help with that, it would be great!

@jwaiton (Member) commented Jul 11, 2024

@soleti I think a nice way of doing it would be to add a check to manage.sh that looks for a CUDA installation (something like command -v nvcc, checking that it prints a path to the binary) and then sets a bash variable as a CUDA flag. You can pass the bash variable into deconv_functions.py quite easily using os.environ, although I'm not sure whether there is some other method for this that is standard within IC. I can try to write something up tomorrow to test this.
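The os.environ hand-off described above might look like this on the Python side (IC_CUDA_AVAILABLE is a hypothetical variable name; the idea is that manage.sh would export it after command -v nvcc succeeds):

```python
import os

def cuda_flag_from_env(var: str = "IC_CUDA_AVAILABLE") -> bool:
    # Hypothetical hand-off: manage.sh would export IC_CUDA_AVAILABLE=1
    # when a CUDA toolkit is found; Python only reads the flag back.
    return os.environ.get(var, "0") == "1"
```

Defaulting to "0" means the GPU path stays off on machines where the build script never set the variable.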

Edit: checking specifically for a CUDA-compatible GPU is more complicated; the methods I can see are platform-specific, but I'll keep looking into it.

@soleti (Author) commented Jul 12, 2024

Passing an environment variable is a possibility, but I want to double-check with IC experts whether this is a viable solution or there are other guidelines for this use case. @gonzaponte ?

@gonzaponte (Collaborator) commented

> How to test this on machines that don't have a GPU

To test what exactly?

> Using a global flag to check if there is a GPU (how are similar cases handled in IC?)

We try not to rely on global flags in IC. If the reason to use one is:

> I wanted to avoid calling cupy.cuda.is_available() every time

I suggest replacing the variable with a function with cached output that checks for that (and can even do imports, etc.).

Also, I would make this an opt-in feature, probably controlled from the config file.
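A cached check along these lines (an illustrative sketch, not the PR's actual code) could be:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def is_gpu_available() -> bool:
    # The body runs once; every later call is served from the cache,
    # so the repeated-ifs in the code stay cheap.
    try:
        import cupy
        return bool(cupy.cuda.is_available())
    except ImportError:
        return False
```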

@soleti (Author) commented Jul 26, 2024

> > How to test this on machines that don't have a GPU
>
> To test what exactly?

That the GPU code is giving correct results.

> > Using a global flag to check if there is a GPU (how are similar cases handled in IC?)
>
> We try not to rely on global flags in IC. If the reason to use one is:
>
> > I wanted to avoid calling cupy.cuda.is_available() every time
>
> I suggest replacing the variable with a function with cached output that checks for that (and can even do imports, etc.).

I am trying to do this, but I am not sure how to do the import in the cached function and make that module available outside the function. The only way I see is to use a global, but we probably want to avoid that. Did you mean something like this, or did I misunderstand?

from functools import lru_cache

@lru_cache
def is_gpu_available() -> bool:
    '''
    Check if GPUs are available, import the necessary libraries and
    return True if they are available, False otherwise.
    '''
    try:
        import cupy as cp
        ...
    except ImportError:
        return False

Maybe making is_gpu_available an inner function of richardson_lucy?

> Also, I would make this an opt-in feature, probably controlled from config file.

Yes.

@gonzaponte (Collaborator) commented

> > > How to test this on machines that don't have a GPU
> >
> > To test what exactly?
>
> That the GPU code is giving correct results.

By definition this is not possible, right? We will need to find a way to run the tests in machines with GPUs...

> Did you mean something like this or did I misunderstand?

Including the imports in the function was a stupid suggestion from my side, sorry. The function should only check the availability of gpus.

Can the libraries be installed on any machine, even on those without a GPU? If so, the try/except clause around the imports can be omitted, provided the libraries are included in the IC environment.

Going back to...

> I wanted to avoid calling cupy.cuda.is_available() every time

Is this slow, or are there other reasons for it? Caching was the solution I proposed because I assumed speed was the concern.

@soleti (Author) commented Jun 12, 2025

Ok, this was stuck for a while, so I took the opportunity to finish it, since @jwaiton might be interested. It would be good to have an independent tester :)

@soleti soleti requested review from Copilot, gonzaponte and jwaiton June 16, 2025 14:39

Copilot AI left a comment

Pull Request Overview

This PR adds GPU support for deconvolution operations by integrating Cupy into the convolution/FFT convolution functions.

  • Added an optional GPU flag (use_gpu) to the deconvolution and satellite mask functions.
  • Introduced an is_gpu_available helper function to check for available GPUs and adjusted array operations accordingly.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Files changed:

  • invisible_cities/reco/deconv_functions.py: Added GPU flag support, integrated Cupy functions, and updated array operations.
  • invisible_cities/cities/beersheba.py: Updated deconvolve_signal to accept and propagate the use_gpu parameter to deconvolution.
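A rough sketch of how such a use_gpu switch could thread through a Richardson-Lucy loop (hypothetical code: this is a textbook Richardson-Lucy iteration, not the contents of deconv_functions.py):

```python
import numpy as np
from scipy.signal import fftconvolve

def richardson_lucy(image, psf, iterations=10, use_gpu=False):
    """Deconvolve image with psf via Richardson-Lucy; use_gpu switches
    the array module and convolution backend. Illustrative only."""
    if use_gpu:
        import cupy as xp                                # GPU arrays
        from cupyx.scipy.signal import fftconvolve as conv
        image, psf = xp.asarray(image), xp.asarray(psf)
    else:
        xp, conv = np, fftconvolve                       # CPU fallback
    estimate   = xp.full(image.shape, 0.5)               # flat first guess
    psf_mirror = psf[::-1, ::-1]
    for _ in range(iterations):
        relative_blur = image / conv(estimate, psf, mode="same")
        estimate     *= conv(relative_blur, psf_mirror, mode="same")
    # Always hand a NumPy array back to the caller.
    return estimate if xp is np else xp.asnumpy(estimate)
```

Keeping the branch at the top and sharing the loop body is what makes a single use_gpu flag enough, rather than duplicating the algorithm.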

@jwaiton (Member) commented Jun 16, 2025

Thanks for updating the PR again, I'll pull the branch and start testing to crosscheck that everything works as expected.

I'll also look into a quick comparison with larger bin sizes soon (a 2.5 mm bin size increased runtime by a factor of 20 on CPUs, with minimal difference in reconstruction).

Do you happen to still have the data from your first plot? For NEW/N100 the number of iterations is on the order of tens to hundreds, where the difference in time complexity may be less drastic. Would be interesting to check 😸

@soleti (Author) commented Jun 16, 2025

Oh, I didn't use real data; I just generated random arrays to produce the scaling plot. I agree it would be interesting to see it at work on real data. If you have a dataset at hand, I think that's the fastest way to verify it.
