Extend FFTBasedPoissonSolver to work on AMDGPU #4593

navidcy · 2025-06-10T06:52:43Z

This PR adds FFTBasedPoissonSolver methods for AMDGPU.

navidcy · 2025-06-10T10:53:07Z

ext/OceananigansAMDGPUExt.jl

+
+function plan_backward_transform(A::ROCArray, ::Union{Bounded, Periodic}, dims, planner_flag)
+    length(dims) == 0 && return nothing
+    return AMDGPU.rocFFT.plan_bfft!(A, dims)


This is plan_bfft!, not plan_ifft!.
Do we need to add a normalisation or something?

navidcy · 2025-06-10T13:13:51Z

With the current modifications I can create an FFTBasedPoissonSolver with a uniformly-spaced grid.

julia> using Oceananigans, AMDGPU

julia> grid = RectilinearGrid(GPU(AMDGPU.ROCBackend()), size=(8, 8, 8), extent=(1, 2, 3))
8×8×8 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on ROCGPU with 3×3×3 halo
├── Periodic x ∈ [0.0, 1.0)  regularly spaced with Δx=0.125
├── Periodic y ∈ [0.0, 2.0)  regularly spaced with Δy=0.25
└── Bounded  z ∈ [-3.0, 0.0] regularly spaced with Δz=0.375

julia> pressure_solver = Oceananigans.Solvers.FFTBasedPoissonSolver(grid)
FFTBasedPoissonSolver on GPU{ROCBackend}: 
├── grid: 8×8×8 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on ROCGPU with 3×3×3 halo
├── storage: ROCArray{ComplexF64, 3, AMDGPU.Runtime.Mem.HIPBuffer}
├── buffer: ROCArray{ComplexF64, 3, AMDGPU.Runtime.Mem.HIPBuffer}
└── transforms:
    ├── forward: Oceananigans.Solvers.DiscreteTransform, Oceananigans.Solvers.DiscreteTransform
    └── backward: Oceananigans.Solvers.DiscreteTransform, Oceananigans.Solvers.DiscreteTransform

julia> model = NonhydrostaticModel(; grid, pressure_solver)
NonhydrostaticModel{GPU, RectilinearGrid}(time = 0 seconds, iteration = 0)
├── grid: 8×8×8 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on ROCGPU with 3×3×3 halo
├── timestepper: RungeKutta3TimeStepper
├── advection scheme: Centered(order=2)
├── tracers: ()
├── closure: Nothing
├── buoyancy: Nothing
└── coriolis: Nothing

julia> simulation = Simulation(model, Δt=10, stop_iteration=3)
Simulation of NonhydrostaticModel{GPU, RectilinearGrid}(time = 0 seconds, iteration = 0)
├── Next time step: 10 seconds
├── Elapsed wall time: 0 seconds
├── Wall time per iteration: NaN days
├── Stop time: Inf days
├── Stop iteration: 3.0
├── Wall time limit: Inf
├── Minimum relative step: 0.0
├── Callbacks: OrderedDict with 4 entries:
│   ├── stop_time_exceeded => Callback of stop_time_exceeded on IterationInterval(1)
│   ├── stop_iteration_exceeded => Callback of stop_iteration_exceeded on IterationInterval(1)
│   ├── wall_time_limit_exceeded => Callback of wall_time_limit_exceeded on IterationInterval(1)
│   └── nan_checker => Callback of NaNChecker for u on IterationInterval(100)
├── Output writers: OrderedDict with no entries
└── Diagnostics: OrderedDict with no entries

julia> run!(simulation)
[ Info: Initializing simulation...
[ Info:     ... simulation initialization complete (416.478 μs)
[ Info: Executing initial time step...
[ Info:     ... initial time step complete (3.048 ms).
[ Info: Simulation is stopping after running for 6.118 ms.
[ Info: Model iteration 3 equals or exceeds stop iteration 3.

navidcy · 2025-06-10T13:18:04Z

With AMDGPU, scalar operations need AMDGPU.@allowscalar, right?
But all the code is sprinkled with CUDA.@allowscalar. What do we do about that?

amontoison · 2025-06-10T13:33:18Z

We have a generic @allowscalar in GPUArraysCore.jl, if I remember correctly.

navidcy · 2025-06-10T13:46:20Z

This is ready to review.
The only thing I'm not sure about is the use of bfft instead of ifft.

glwagner · 2025-06-10T16:22:06Z

@allowscalar does not belong to CUDA and it is wrong that we use CUDA.@allowscalar. This is addressed by #4499

@michel2323 , I feel that with AMDGPU starting to get some use, we should try to merge #4499. What do you think? If we wait too long we may get some conflicts.

glwagner · 2025-06-10T16:22:23Z

> This is ready to review. The only thing I'm not sure about is the use of bfft instead of ifft.

Can you please test this? An empirical test should reveal whether this code is correct and also will show what the normalization needs to be (if documentation is insufficient, which I am not surprised that it is).

michel2323 · 2025-06-10T20:05:09Z

> @michel2323 , I feel that with AMDGPU starting to get some use, we should try to merge #4499. What do you think? If we wait too long we may get some conflicts.

@glwagner I continuously rebased, so it shouldn't be too far off. However, while the majority of tests pass, there are still a few that fail. I'll do another pass tomorrow morning and give you a short summary. I might need some help.

navidcy · 2025-06-10T20:54:33Z

> Can you please test this? An empirical test should reveal whether this code is correct and also will show what the normalization needs to be (if documentation is insufficient, which I am not surprised that it is).

Yes, I wanted an idea of how to test. I know that ifft and bfft differ by a factor, e.g.,

julia> Nx, Ny = 3, 4
(3, 4)

julia> a = rand(Nx, Ny)
3×4 Matrix{Float64}:
 0.453795  0.778266  0.499822    0.749407
 0.646126  0.855627  0.00391171  0.158878
 0.307381  0.898621  0.453264    0.604466

julia> ah = fft(a)
3×4 Matrix{ComplexF64}:
  6.40956+0.0im        0.450305-1.01976im    -1.68096+0.0im        0.450305+1.01976im
 0.517153+0.518913im   -0.64285-0.21592im  -0.0206029-0.327336im  0.0544652-1.14911im
 0.517153-0.518913im  0.0544652+1.14911im  -0.0206029+0.327336im   -0.64285+0.21592im

julia> ifft(ah)
3×4 Matrix{ComplexF64}:
 0.453795+0.0im  0.778266+0.0im    0.499822+0.0im  0.749407+0.0im
 0.646126+0.0im  0.855627+0.0im  0.00391171+0.0im  0.158878+0.0im
 0.307381+0.0im  0.898621+0.0im    0.453264+0.0im  0.604466+0.0im

julia> bfft(ah) / (Nx * Ny)
3×4 Matrix{ComplexF64}:
 0.453795+0.0im  0.778266+0.0im    0.499822+0.0im  0.749407+0.0im
 0.646126+0.0im  0.855627+0.0im  0.00391171+0.0im  0.158878+0.0im
 0.307381+0.0im  0.898621+0.0im    0.453264+0.0im  0.604466+0.0im

But I don't know exactly how to incorporate this function in the plans so that it's the same as elsewhere in the pressure solver.

navidcy · 2025-06-10T22:31:03Z

Hm... If I understand correctly, we need to manually apply the scaling after the inverse transform. What is the best way to do that? The normalization should be

N = prod(size(A, d) for d in dims)

function backward_transform!(x)
    p * x           # in-place unnormalized backward FFT (p is the bfft plan)
    @. x = x / N    # in-place normalization
    return x
end

glwagner · 2025-06-10T23:54:56Z

What prevents you from using that exact code?

navidcy · 2025-06-11T00:11:45Z

I tried and failed. Perhaps I need help?
I don't know exactly how to make the pressure solver use this rescaling only if it's a ROCArray.

navidcy · 2025-06-11T02:29:35Z

I tried doing this in the AMD extension:

function plan_backward_transform(A::ROCArray, ::Union{Bounded, Periodic}, dims, planner_flag)
    length(dims) == 0 && return nothing

    p = AMDGPU.rocFFT.plan_bfft!(A, dims)
    N = prod(size(A, d) for d in dims)

    function backward_transform!(x, y)
        @. x = p * y           # in-place inverse FFT
        @. x = x / N    # in-place normalization
        return x
    end
    return backward_transform!
end

but I got an error related to binary operations... The pressure_solver is constructed, but while time stepping I get:

julia> using Oceananigans, AMDGPU
[ Info: Precompiling OceananigansAMDGPUExt [8da284e0-aa90-5603-9506-0dba59ddb112]

julia> grid = RectilinearGrid(GPU(AMDGPU.ROCBackend()), size=(16, 8, 4), x=(0, 16), y=(0, 1), z=[0, 1, 2, 3, 4])
16×8×4 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on ROCGPU with 3×3×3 halo
├── Periodic x ∈ [0.0, 16.0) regularly spaced with Δx=1.0
├── Periodic y ∈ [0.0, 1.0)  regularly spaced with Δy=0.125
└── Bounded  z ∈ [0.0, 4.0]  variably spaced with min(Δz)=1.0, max(Δz)=1.0

julia> pressure_solver = Oceananigans.Solvers.FFTBasedPoissonSolver(grid)
FFTBasedPoissonSolver on GPU{ROCBackend}: 
├── grid: 16×8×4 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on ROCGPU with 3×3×3 halo
├── storage: ROCArray{ComplexF64, 3, AMDGPU.Runtime.Mem.HIPBuffer}
├── buffer: ROCArray{ComplexF64, 3, AMDGPU.Runtime.Mem.HIPBuffer}
└── transforms:
    ├── forward: Oceananigans.Solvers.DiscreteTransform
    └── backward: Oceananigans.Solvers.DiscreteTransform

julia> model = NonhydrostaticModel(; grid, pressure_solver)
NonhydrostaticModel{GPU, RectilinearGrid}(time = 0 seconds, iteration = 0)
├── grid: 16×8×4 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on ROCGPU with 3×3×3 halo
├── timestepper: RungeKutta3TimeStepper
├── advection scheme: Centered(order=2)
├── tracers: ()
├── closure: Nothing
├── buoyancy: Nothing
└── coriolis: Nothing

julia> simulation = Simulation(model, Δt=10, stop_iteration=3)
Simulation of NonhydrostaticModel{GPU, RectilinearGrid}(time = 0 seconds, iteration = 0)
├── Next time step: 10 seconds
├── Elapsed wall time: 0 seconds
├── Wall time per iteration: NaN days
├── Stop time: Inf days
├── Stop iteration: 3.0
├── Wall time limit: Inf
├── Minimum relative step: 0.0
├── Callbacks: OrderedDict with 4 entries:
│   ├── stop_time_exceeded => Callback of stop_time_exceeded on IterationInterval(1)
│   ├── stop_iteration_exceeded => Callback of stop_iteration_exceeded on IterationInterval(1)
│   ├── wall_time_limit_exceeded => Callback of wall_time_limit_exceeded on IterationInterval(1)
│   └── nan_checker => Callback of NaNChecker for u on IterationInterval(100)
├── Output writers: OrderedDict with no entries
└── Diagnostics: OrderedDict with no entries

julia> run!(simulation)
[ Info: Initializing simulation...
[ Info:     ... simulation initialization complete (4.818 seconds)
[ Info: Executing initial time step...
ERROR: MethodError: no method matching *(::OceananigansAMDGPUExt.var"#backward_transform!#3"{…}, ::ROCArray{…})

Closest candidates are:
  *(::Any, ::Any, ::Any, ::Oceananigans.Grids.AbstractGrid, ::Any, ::Any, ::Number, ::Oceananigans.AbstractOperations.BinaryOperation)
   @ Oceananigans ~/Oceananigans.jl/src/AbstractOperations/binary_operations.jl:91
  *(::Any, ::Any, ::Any, ::Oceananigans.Grids.AbstractGrid, ::Any, ::Any, ::Oceananigans.AbstractOperations.BinaryOperation, ::Oceananigans.AbstractOperations.BinaryOperation)
   @ Oceananigans ~/Oceananigans.jl/src/AbstractOperations/binary_operations.jl:70
  *(::Any, ::Any, ::Any, ::Oceananigans.Grids.AbstractGrid, ::Any, ::Any, ::Oceananigans.AbstractOperations.BinaryOperation, ::Oceananigans.Fields.AbstractField)
   @ Oceananigans ~/Oceananigans.jl/src/AbstractOperations/binary_operations.jl:76
  ...

Stacktrace:
  [1] apply_transform!(A::ROCArray{…}, B::ROCArray{…}, plan::Function, ::Nothing)
    @ Oceananigans.Solvers ~/Oceananigans.jl/src/Solvers/discrete_transforms.jl:142
  [2] (::Oceananigans.Solvers.DiscreteTransform{…})(A::ROCArray{…}, buffer::ROCArray{…})
    @ Oceananigans.Solvers ~/Oceananigans.jl/src/Solvers/discrete_transforms.jl:118
  [3] solve!(ϕ::Field{…}, solver::Oceananigans.Solvers.FFTBasedPoissonSolver{…}, b::ROCArray{…}, m::Int64)
    @ Oceananigans.Solvers ~/Oceananigans.jl/src/Solvers/fft_based_poisson_solver.jl:119
  [4] solve!(ϕ::Field{…}, solver::Oceananigans.Solvers.FFTBasedPoissonSolver{…})
    @ Oceananigans.Solvers ~/Oceananigans.jl/src/Solvers/fft_based_poisson_solver.jl:96
  [5] solve_for_pressure!(pressure::Field{…}, solver::Oceananigans.Solvers.FFTBasedPoissonSolver{…}, Δt::Float64, args::@NamedTuple{…})
    @ Oceananigans.Models.NonhydrostaticModels ~/Oceananigans.jl/src/Models/NonhydrostaticModels/solve_for_pressure.jl:85
  [6] compute_pressure_correction!(model::NonhydrostaticModel{…}, Δt::Float64)
    @ Oceananigans.Models.NonhydrostaticModels ~/Oceananigans.jl/src/Models/NonhydrostaticModels/pressure_correction.jl:13
  [7] time_step!(model::NonhydrostaticModel{…}, Δt::Float64; callbacks::Tuple{})
    @ Oceananigans.TimeSteppers ~/Oceananigans.jl/src/TimeSteppers/runge_kutta_3.jl:114
  [8] time_step!(sim::Simulation{…})
    @ Oceananigans.Simulations ~/Oceananigans.jl/src/Simulations/run.jl:149
  [9] run!(sim::Simulation{…}; pickup::Bool)
    @ Oceananigans.Simulations ~/Oceananigans.jl/src/Simulations/run.jl:105
 [10] run!(sim::Simulation{…})
    @ Oceananigans.Simulations ~/Oceananigans.jl/src/Simulations/run.jl:92
 [11] top-level scope
    @ REPL[6]:1
Some type information was truncated. Use `show(err)` to see complete types.

glwagner · 2025-06-11T03:34:17Z

The stack trace shows that the error originates from here:

Oceananigans.jl/src/Solvers/discrete_transforms.jl

Lines 141 to 144 in 3489204

    
function apply_transform!(A, B, plan, ::Nothing)
    plan * A
    return nothing
end

You need to extend apply_transform! (or extend * for the plan you are creating if you want to reuse the apply_transform! function itself, but I don't see the value in that)
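The second option mentioned above (extending `*` for the plan) can be sketched as follows. This is a minimal, CPU-only illustration and not the code in this PR: `NaiveBackwardPlan`, `NormalizedPlan`, and `naive_fft` are hypothetical names, and the naive DFT merely stands in for an unnormalized `plan_bfft!`-style plan so that the snippet runs without a GPU.

```julia
# Hypothetical stand-in for an unnormalized in-place backward plan
# (the kind of object `plan_bfft!` returns); NOT a rocFFT or FFTW type.
struct NaiveBackwardPlan end

# Unnormalized backward DFT that overwrites x:
# x[n] = Σₖ x̂[k] * exp(+2πi (k-1)(n-1) / N)
function Base.:*(::NaiveBackwardPlan, x::Vector{ComplexF64})
    N = length(x)
    x̂ = copy(x)
    for n in 1:N
        x[n] = sum(x̂[k] * cis(2π * (k - 1) * (n - 1) / N) for k in 1:N)
    end
    return x
end

# Hypothetical wrapper: an unnormalized plan plus the scale to apply afterwards.
struct NormalizedPlan{P, T}
    plan :: P
    scale :: T
end

# Defining `*` for the wrapper lets code that calls `plan * A` use it unchanged.
function Base.:*(p::NormalizedPlan, x::AbstractArray)
    p.plan * x        # in-place, unnormalized backward transform
    x .*= p.scale     # in-place normalization
    return x
end

# Naive forward DFT, used only for the round-trip check below.
naive_fft(x) = [sum(x[n] * cis(-2π * (k - 1) * (n - 1) / length(x)) for n in eachindex(x))
                for k in eachindex(x)]

x = ComplexF64[1, 2, 3, 4]
x̂ = naive_fft(x)
plan = NormalizedPlan(NaiveBackwardPlan(), 1 / length(x))

plan * x̂              # overwrites x̂ with the normalized inverse transform
@assert x̂ ≈ x         # the round trip recovers the input
```

Returning a closure instead, as in the attempt above, fails because `apply_transform!` calls `plan * A` and no `*` method exists for a `Function`.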

glwagner · 2025-06-11T03:36:53Z

Although, the DiscreteTransform interface already includes a normalization option, which is applied here:

Oceananigans.jl/src/Solvers/discrete_transforms.jl

Line 120 in 3489204

maybe_normalize!(A, transform.normalization)

glwagner · 2025-06-11T03:37:20Z

Oceananigans.jl/src/Solvers/discrete_transforms.jl

Lines 8 to 17 in 3489204

    
struct DiscreteTransform{P, D, G, Δ, Ω, N, T, Σ}
               plan :: P
               grid :: G
          direction :: D
               dims :: Δ
           topology :: Ω
      normalization :: N
    twiddle_factors :: T # https://en.wikipedia.org/wiki/Twiddle_factor
     transpose_dims :: Σ
end

navidcy · 2025-06-11T03:39:03Z

Yes. I am confused, though, because the Fourier transforms don't include a normalisation factor, but then it seems that the DiscreteTransform methods were called. (Are the Fourier transforms <: DiscreteTransform?)

Perhaps I have a lack of understanding of how things work in general with the transforms and that's where the essence of my difficulty lies... :(

glwagner · 2025-06-11T04:53:49Z

Yes but nobody understands this so the only solution is to figure it out

glwagner · 2025-06-11T04:55:39Z

this file organizes the transforms:

Oceananigans.jl/src/Solvers/plan_transforms.jl

Line 137 in 3489204

backward_transforms = (

it looks like plans are made and then inserted into DiscreteTransform.

navidcy · 2025-06-11T04:56:18Z

> Yes but nobody understands this so the only solution is to figure it out

:D

OK! Will try to penetrate this!

navidcy · 2025-06-11T04:56:39Z

(it's helpful to hear that we are at the same denominator)

glwagner · 2025-06-11T05:02:15Z

The constructor for DiscreteTransform is here:

Oceananigans.jl/src/Solvers/discrete_transforms.jl

Line 83 in 3489204

function DiscreteTransform(plan, direction, grid, dims)

this builds normalization factors here:

Oceananigans.jl/src/Solvers/discrete_transforms.jl

Line 90 in 3489204

    
           normalization = prod(normalization_factor(arch, topo[d](), direction, N[d]) for d in dims)

The normalization_factor takes the architecture as first argument, eg

Oceananigans.jl/src/Solvers/discrete_transforms.jl

Line 26 in 3489204

normalization_factor(arch, topo, direction, N) = 1

for fallback and

Oceananigans.jl/src/Solvers/discrete_transforms.jl

Line 34 in 3489204

normalization_factor(::CPU, ::Bounded, ::Backward, N) = 1 / 2N

on CPU.

You can try extending normalization_factor for AMDGPU?

It is a little dirty, because it assumes that every architecture is tied to a particular transform implementation. Nevertheless, that is the case for all existing code, and here as well.
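The dispatch pattern suggested here can be sketched with mock types. Everything below is an illustration under stated assumptions: the structs only mirror names used in this thread and are not the Oceananigans types, `MockROCGPU` is a hypothetical architecture whose backward FFT is unnormalized, `total_normalization` is a hypothetical helper, and the `1 / N` factor is what a `bfft`-style transform over `N` points would need.

```julia
# Mock stand-ins for illustration; NOT the Oceananigans types.
struct CPU end
struct MockROCGPU end      # hypothetical architecture with an unnormalized backward FFT
struct Periodic end
struct Forward end
struct Backward end

# Fallback, as quoted above: no rescaling.
normalization_factor(arch, topo, direction, N) = 1

# Hypothetical extension: a bfft-style backward transform over N points needs 1/N.
normalization_factor(::MockROCGPU, ::Periodic, ::Backward, N) = 1 / N

# Same reduction over the transformed dimensions as in the quoted constructor.
total_normalization(arch, topo, direction, N, dims) =
    prod(normalization_factor(arch, topo[d](), direction, N[d]) for d in dims)

topo = (Periodic, Periodic)
N = (3, 4)

# Backward transform on the mock architecture: (1/3) * (1/4), i.e. 1 / (Nx * Ny).
@assert total_normalization(MockROCGPU(), topo, Backward(), N, (1, 2)) ≈ 1 / 12

# Every other combination hits the fallback and is left unscaled.
@assert total_normalization(MockROCGPU(), topo, Forward(), N, (1, 2)) == 1
@assert total_normalization(CPU(), topo, Backward(), N, (1, 2)) == 1
```

The resulting factor matches the `bfft(ah) / (Nx * Ny)` check earlier in the thread, and a factor of exactly 1 lets `maybe_normalize!` skip the extra kernel launch.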

glwagner · 2025-06-11T05:02:53Z

The normalization factor is applied here

Oceananigans.jl/src/Solvers/discrete_transforms.jl

Lines 108 to 122 in 3489204

    
function (transform::DiscreteTransform{P, <:Forward})(A, buffer) where P
    maybe_permute_indices!(A, buffer, architecture(transform), transform.grid, transform.dims, transform.topology)
    apply_transform!(A, buffer, transform.plan, transform.transpose_dims)
    maybe_twiddle_forward!(A, transform.twiddle_factors)
    maybe_normalize!(A, transform.normalization)
    return nothing
end

function (transform::DiscreteTransform{P, <:Backward})(A, buffer) where P
    maybe_twiddle_backward!(A, transform.twiddle_factors)
    apply_transform!(A, buffer, transform.plan, transform.transpose_dims)
    maybe_unpermute_indices!(A, buffer, architecture(transform), transform.grid, transform.dims, transform.topology)
    maybe_normalize!(A, transform.normalization)
    return nothing
end

via the function maybe_normalize!, which is defined here

Oceananigans.jl/src/Solvers/discrete_transforms.jl

Lines 178 to 184 in 3489204

    
function maybe_normalize!(A, normalization)
    # Avoid a tiny kernel launch if possible.
    if normalization != 1
        @. A *= normalization
    end
    return nothing
end

if this fits your use case, you can try it.

It is all a little overengineered and therefore overcomplex.

navidcy · 2025-07-16T21:39:57Z

OK, if I understand correctly the FFT-related methods were already included in #4499, right @michel2323? So perhaps this PR is not needed? Or it might be a good idea to add a NonhydrostaticModel test; perhaps this is what the PR should do.

navidcy · 2025-07-16T22:11:18Z

Hm... I seem to be getting an error that plan_ifft! is not defined?

https://buildkite.com/julialang/oceananigans-dot-jl/builds/1788#0198152b-f0c3-432f-9372-6ee7d06a56a0/122-1089

giordano · 2026-01-01T23:13:19Z

> Hm... I seem to be getting an error that plan_ifft! is not defined?

I just came across this. The FFT stuff in the AMDGPU extension is simply broken: NumericalEarth/Breeze.jl#369 (comment). Oceananigans shouldn't call AMDGPU.rocFFT.plan_ifft!, but AbstractFFTs.plan_ifft! on an ROCArray{<:Complex}, similarly for all the other FFT functions. This error would have been avoided if ExplicitImports.jl was enforced in the entirety of Oceananigans (we are still a long way from that goal).
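A minimal sketch of calling the generic AbstractFFTs functions rather than backend-internal ones is shown below. It uses FFTW on the CPU purely so that it runs without a GPU; the assumption, following the comment above, is that the same generic calls on a `ROCArray{<:Complex}` would dispatch to the rocFFT-backed methods.

```julia
using FFTW                                   # CPU backend, only so the sketch runs without a GPU
using AbstractFFTs: plan_fft!, plan_ifft!    # the generic functions that backends extend

A = ComplexF64.(reshape(1:12, 3, 4))
A₀ = copy(A)

forward  = plan_fft!(A, 1:2)     # in-place forward plan over both dimensions
backward = plan_ifft!(A, 1:2)    # normalized inverse plan; no manual 1/N needed

forward * A      # overwrites A with its transform
backward * A     # overwrites A with the inverse transform, already scaled

@assert A ≈ A₀   # the round trip recovers the input
```

Because `plan_ifft!` already carries the scaling, no separate `bfft`-plus-normalization step is needed with this approach.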

glwagner · 2026-01-02T00:58:07Z

Makes sense, so we don’t have to figure out how to do this translation from ifft to bfft ourselves

giordano · 2026-01-04T18:02:59Z

New errors, but the undefined plan_fft!/plan_ifft! errors are resolved: https://buildkite.com/julialang/oceananigans-dot-jl/builds/4986#019b8a21-ade7-487e-9248-0a15db839156/L2086

AMDGPU on RectilinearGrids: Error During Test at /var/lib/buildkite-agent/builds/amdgpu1-luraess-com/julialang/oceananigans-dot-jl/test/test_amdgpu.jl:22
  Got exception outside of a @test
  MethodError: no method matching plan_transforms(::RectilinearGrid{Float32, Periodic, Periodic, Bounded, Oceananigans.Grids.StaticVerticalDiscretization{OffsetVector{Float32, ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}}, OffsetVector{Float32, ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}}, OffsetVector{Float32, ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}}, OffsetVector{Float32, ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}}}, Float32, Float32, OffsetVector{Float32, StepRangeLen{Float32, Float64, Float64, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float64, Float64, Int64}}, GPU{ROCBackend}}, ::ROCArray{ComplexF32, 3, AMDGPU.Runtime.Mem.HIPBuffer}, ::UInt32)
  The function `plan_transforms` exists, but no method is defined for this combination of argument types.
  Closest candidates are:
    plan_transforms(::Any, ::Any, ::Any, ::Any)
     @ Oceananigans /var/lib/buildkite-agent/builds/amdgpu1-luraess-com/julialang/oceananigans-dot-jl/src/Solvers/plan_transforms.jl:139
    plan_transforms(::RectilinearGrid{<:Any, <:Any, <:Any, <:Any, <:Union{MutableVerticalDiscretization{<:Any, <:Any, <:Number}, Oceananigans.Grids.StaticVerticalDiscretization{<:Any, <:Any, <:Number}}, <:Number, <:Number}, ::Any, ::Any)
     @ Oceananigans /var/lib/buildkite-agent/builds/amdgpu1-luraess-com/julialang/oceananigans-dot-jl/src/Solvers/plan_transforms.jl:68
  Stacktrace:
    [1] FFTBasedPoissonSolver(grid::RectilinearGrid{Float32, Periodic, Periodic, Bounded, Oceananigans.Grids.StaticVerticalDiscretization{OffsetVector{Float32, ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}}, OffsetVector{Float32, ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}}, OffsetVector{Float32, ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}}, OffsetVector{Float32, ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}}}, Float32, Float32, OffsetVector{Float32, StepRangeLen{Float32, Float64, Float64, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float64, Float64, Int64}}, GPU{ROCBackend}}, planner_flag::UInt32)
      @ Oceananigans.Solvers /var/lib/buildkite-agent/builds/amdgpu1-luraess-com/julialang/oceananigans-dot-jl/src/Solvers/fft_based_poisson_solver.jl:67
    [2] FFTBasedPoissonSolver(grid::RectilinearGrid{Float32, Periodic, Periodic, Bounded, Oceananigans.Grids.StaticVerticalDiscretization{OffsetVector{Float32, ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}}, OffsetVector{Float32, ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}}, OffsetVector{Float32, ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}}, OffsetVector{Float32, ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}}}, Float32, Float32, OffsetVector{Float32, StepRangeLen{Float32, Float64, Float64, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float64, Float64, Int64}}, GPU{ROCBackend}})
      @ Oceananigans.Solvers /var/lib/buildkite-agent/builds/amdgpu1-luraess-com/julialang/oceananigans-dot-jl/src/Solvers/fft_based_poisson_solver.jl:53
    [3] macro expansion
      @ /var/lib/buildkite-agent/builds/amdgpu1-luraess-com/julialang/oceananigans-dot-jl/test/test_amdgpu.jl:69 [inlined]
    [4] macro expansion
      @ ~/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.12/julia-1.12-latest-linux-x86_64/share/julia/stdlib/v1.12/Test/src/Test.jl:1776 [inlined]
    [5] top-level scope
      @ /var/lib/buildkite-agent/builds/amdgpu1-luraess-com/julialang/oceananigans-dot-jl/test/test_amdgpu.jl:23
[...]
AMDGPU on LatitudeLongitudeGrid with HydrostaticFreeSurfaceModel: Error During Test at /var/lib/buildkite-agent/builds/amdgpu1-luraess-com/julialang/oceananigans-dot-jl/test/test_amdgpu.jl:96
  Test threw exception
  Expression: parent(grid.Δyᶠᶜᵃ) isa ROCArray
  MethodError: no method matching parent(::Float32)
  The function `parent` exists, but no method is defined for this combination of argument types.
  Closest candidates are:
    parent(::DataFrames.SubDataFrame)
     @ DataFrames ~/.cache/julia-buildkite-plugin/depots/01962b57-ae40-45a7-b229-560c2d4f6b07/packages/DataFrames/b4w9K/src/subdataframe/subdataframe.jl:137
    parent(::SubArray)
     @ Base subarray.jl:80
    parent(::DataFrames.AbstractDataFrame)
     @ DataFrames ~/.cache/julia-buildkite-plugin/depots/01962b57-ae40-45a7-b229-560c2d4f6b07/packages/DataFrames/b4w9K/src/abstractdataframe/abstractdataframe.jl:1914
    ...
  Stacktrace:
   [1] macro expansion
     @ ~/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.12/julia-1.12-latest-linux-x86_64/share/julia/stdlib/v1.12/Test/src/Test.jl:677 [inlined]
   [2] macro expansion
     @ /var/lib/buildkite-agent/builds/amdgpu1-luraess-com/julialang/oceananigans-dot-jl/test/test_amdgpu.jl:96 [inlined]
   [3] macro expansion
     @ ~/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.12/julia-1.12-latest-linux-x86_64/share/julia/stdlib/v1.12/Test/src/Test.jl:1776 [inlined]
   [4] top-level scope
     @ /var/lib/buildkite-agent/builds/amdgpu1-luraess-com/julialang/oceananigans-dot-jl/test/test_amdgpu.jl:84
AMDGPU on LatitudeLongitudeGrid with HydrostaticFreeSurfaceModel: Error During Test at /var/lib/buildkite-agent/builds/amdgpu1-luraess-com/julialang/oceananigans-dot-jl/test/test_amdgpu.jl:97
  Test threw exception
  Expression: parent(grid.Δyᶜᶠᵃ) isa ROCArray
  MethodError: no method matching parent(::Float32)
  The function `parent` exists, but no method is defined for this combination of argument types.
  Closest candidates are:
    parent(::DataFrames.SubDataFrame)
     @ DataFrames ~/.cache/julia-buildkite-plugin/depots/01962b57-ae40-45a7-b229-560c2d4f6b07/packages/DataFrames/b4w9K/src/subdataframe/subdataframe.jl:137
    parent(::SubArray)
     @ Base subarray.jl:80
    parent(::DataFrames.AbstractDataFrame)
     @ DataFrames ~/.cache/julia-buildkite-plugin/depots/01962b57-ae40-45a7-b229-560c2d4f6b07/packages/DataFrames/b4w9K/src/abstractdataframe/abstractdataframe.jl:1914
    ...
  Stacktrace:
   [1] macro expansion
     @ ~/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.12/julia-1.12-latest-linux-x86_64/share/julia/stdlib/v1.12/Test/src/Test.jl:677 [inlined]
   [2] macro expansion
     @ /var/lib/buildkite-agent/builds/amdgpu1-luraess-com/julialang/oceananigans-dot-jl/test/test_amdgpu.jl:97 [inlined]
   [3] macro expansion
     @ ~/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.12/julia-1.12-latest-linux-x86_64/share/julia/stdlib/v1.12/Test/src/Test.jl:1776 [inlined]
   [4] top-level scope
     @ /var/lib/buildkite-agent/builds/amdgpu1-luraess-com/julialang/oceananigans-dot-jl/test/test_amdgpu.jl:84
AMDGPU on LatitudeLongitudeGrid with HydrostaticFreeSurfaceModel: Error During Test at /var/lib/buildkite-agent/builds/amdgpu1-luraess-com/julialang/oceananigans-dot-jl/test/test_amdgpu.jl:83
  Got exception outside of a @test
  UndefVarError: `build_and_time_step_simulation` not defined in `Main`
  Suggestion: check for spelling errors or missing imports.
  Stacktrace:
    [1] macro expansion
      @ /var/lib/buildkite-agent/builds/amdgpu1-luraess-com/julialang/oceananigans-dot-jl/test/test_amdgpu.jl:115 [inlined]
    [2] macro expansion
      @ ~/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.12/julia-1.12-latest-linux-x86_64/share/julia/stdlib/v1.12/Test/src/Test.jl:1776 [inlined]
    [3] top-level scope
      @ /var/lib/buildkite-agent/builds/amdgpu1-luraess-com/julialang/oceananigans-dot-jl/test/test_amdgpu.jl:84
    [4] include(mapexpr::Function, mod::Module, _path::String)
      @ Base ./Base.jl:307
    [5] IncludeInto
      @ ./Base.jl:308 [inlined]
    [6] macro expansion
      @ /var/lib/buildkite-agent/builds/amdgpu1-luraess-com/julialang/oceananigans-dot-jl/test/runtests.jl:303 [inlined]
    [7] macro expansion
      @ ~/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.12/julia-1.12-latest-linux-x86_64/share/julia/stdlib/v1.12/Test/src/Test.jl:1776 [inlined]
    [8] macro expansion
      @ /var/lib/buildkite-agent/builds/amdgpu1-luraess-com/julialang/oceananigans-dot-jl/test/runtests.jl:303 [inlined]
    [9] macro expansion
      @ ~/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.12/julia-1.12-latest-linux-x86_64/share/julia/stdlib/v1.12/Test/src/Test.jl:1776 [inlined]
   [10] (::var"#6#7")()
      @ Main /var/lib/buildkite-agent/builds/amdgpu1-luraess-com/julialang/oceananigans-dot-jl/test/runtests.jl:25
   [11] task_local_storage(body::var"#6#7", key::Symbol, val::GPUArraysCore.ScalarIndexing)
      @ Base ./task.jl:298
   [12] allowscalar(f::Function)
      @ GPUArraysCore ~/.cache/julia-buildkite-plugin/depots/01962b57-ae40-45a7-b229-560c2d4f6b07/packages/GPUArraysCore/aNaXo/src/GPUArraysCore.jl:179
   [13] top-level scope
      @ /var/lib/buildkite-agent/builds/amdgpu1-luraess-com/julialang/oceananigans-dot-jl/test/runtests.jl:21
   [14] include(mapexpr::Function, mod::Module, _path::String)
      @ Base ./Base.jl:307
   [15] top-level scope
      @ none:6
   [16] eval(m::Module, e::Any)
      @ Core ./boot.jl:489
   [17] exec_options(opts::Base.JLOptions)
      @ Base ./client.jl:283
   [18] _start()
      @ Base ./client.jl:550

giordano · 2026-01-04T18:07:00Z

test/test_amdgpu.jl

-
-        @test iteration(simulation) == 3
-        @test time(simulation) == 3minutes
+        build_and_time_step_simulation(model)


@glwagner you added this line in f835c0d, but this function doesn't exist anywhere as far as I can tell.

giordano · 2026-01-04T23:04:15Z

Remaining errors are the method error with plan_transforms (see full error in #4593 (comment)) and #4593 (comment)

extend plan transforms for Flat

009bff7

navidcy added GPU 👾 Where Oceananigans gets its powers from extensions 🧬 labels Jun 10, 2025

add methods for plan_fwd/bwd_transform

8e2621b

navidcy requested review from amontoison and simone-silvestri and removed request for simone-silvestri June 10, 2025 10:52

navidcy marked this pull request as ready for review June 10, 2025 10:53

navidcy added 2 commits June 10, 2025 19:01

add tests for NonhydrostaticModel

661887f

compactify tests

0c97835

navidcy added 2 commits June 10, 2025 21:44

reorder

30eec3a

test few grids

495be48

simone-silvestri mentioned this pull request Jun 17, 2025

Issues using Distributed with AMD GPUs #4597

Open

navidcy added 2 commits July 17, 2025 07:34

merge main

f06ae9b

merge main

2f54d0e

navidcy mentioned this pull request Dec 12, 2025

Test NonhydrostaticModel on AMDGPU #5033

Closed

glwagner added 2 commits December 12, 2025 16:00

Add tests for AMDGPU with HydrostaticFreeSurfaceModel

f835c0d

Merge branch 'main' into ncc/amdgpu-fft

039bcce

giordano added 2 commits January 4, 2026 18:44

Merge branch 'main' into ncc/amdgpu-fft

f9a2daa

[AMDGPUExt] Use FFT functions from AbstractFFTs instead non-existin…

b07b3c0

…g ones in `rocFFT`

giordano reviewed Jan 4, 2026

glwagner reviewed Jan 4, 2026

glwagner and others added 2 commits January 4, 2026 11:19

Update test/test_amdgpu.jl

de4e9e9

Add compat bound for AbstractFFTs

fa511bf
