-
Notifications
You must be signed in to change notification settings - Fork 260
Extend FFTBasedPoissonSolver to work on AMDGPU #4593
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
ext/OceananigansAMDGPUExt.jl
Outdated
|
|
||
| function plan_backward_transform(A::ROCArray, ::Union{Bounded, Periodic}, dims, planner_flag) | ||
| length(dims) == 0 && return nothing | ||
| return AMDGPU.rocFFT.plan_bfft!(A, dims) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the plan_bfft!, not the plan_ifft!
do we need to add a normalisation or something?
|
With the current modifications I can create an FFTBasedPoissonSolver with a uniformly-spaced grid. julia> using Oceananigans, AMDGPU
julia> grid = RectilinearGrid(GPU(AMDGPU.ROCBackend()), size=(8, 8, 8), extent=(1, 2, 3))
8×8×8 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on ROCGPU with 3×3×3 halo
├── Periodic x ∈ [0.0, 1.0) regularly spaced with Δx=0.125
├── Periodic y ∈ [0.0, 2.0) regularly spaced with Δy=0.25
└── Bounded z ∈ [-3.0, 0.0] regularly spaced with Δz=0.375
julia> pressure_solver = Oceananigans.Solvers.FFTBasedPoissonSolver(grid)
FFTBasedPoissonSolver on GPU{ROCBackend}:
├── grid: 8×8×8 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on ROCGPU with 3×3×3 halo
├── storage: ROCArray{ComplexF64, 3, AMDGPU.Runtime.Mem.HIPBuffer}
├── buffer: ROCArray{ComplexF64, 3, AMDGPU.Runtime.Mem.HIPBuffer}
└── transforms:
├── forward: Oceananigans.Solvers.DiscreteTransform, Oceananigans.Solvers.DiscreteTransform
└── backward: Oceananigans.Solvers.DiscreteTransform, Oceananigans.Solvers.DiscreteTransform
julia> model = NonhydrostaticModel(; grid, pressure_solver)
NonhydrostaticModel{GPU, RectilinearGrid}(time = 0 seconds, iteration = 0)
├── grid: 8×8×8 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on ROCGPU with 3×3×3 halo
├── timestepper: RungeKutta3TimeStepper
├── advection scheme: Centered(order=2)
├── tracers: ()
├── closure: Nothing
├── buoyancy: Nothing
└── coriolis: Nothing
julia> simulation = Simulation(model, Δt=10, stop_iteration=3)
Simulation of NonhydrostaticModel{GPU, RectilinearGrid}(time = 0 seconds, iteration = 0)
├── Next time step: 10 seconds
├── Elapsed wall time: 0 seconds
├── Wall time per iteration: NaN days
├── Stop time: Inf days
├── Stop iteration: 3.0
├── Wall time limit: Inf
├── Minimum relative step: 0.0
├── Callbacks: OrderedDict with 4 entries:
│ ├── stop_time_exceeded => Callback of stop_time_exceeded on IterationInterval(1)
│ ├── stop_iteration_exceeded => Callback of stop_iteration_exceeded on IterationInterval(1)
│ ├── wall_time_limit_exceeded => Callback of wall_time_limit_exceeded on IterationInterval(1)
│ └── nan_checker => Callback of NaNChecker for u on IterationInterval(100)
├── Output writers: OrderedDict with no entries
└── Diagnostics: OrderedDict with no entries
julia> run!(simulation)
[ Info: Initializing simulation...
[ Info: ... simulation initialization complete (416.478 μs)
[ Info: Executing initial time step...
[ Info: ... initial time step complete (3.048 ms).
[ Info: Simulation is stopping after running for 6.118 ms.
[ Info: Model iteration 3 equals or exceeds stop iteration 3. |
|
With AMDGPU, scalar operations need |
|
We have a generic |
|
This is ready to review. |
|
@michel2323 , I feel that with AMDGPU starting to get some use, we should try to merge #4499. What do you think? If we wait too long we may get some conflicts. |
Can you please test this? An empirical test should reveal whether this code is correct and also will show what the normalization needs to be (if documentation is insufficient, which I am not surprised that it is). |
@glwagner I continuously rebased, so it shouldn't be too far off. However, while the majority of tests pass, there are still a few that fail. I'll do another pass tomorrow morning and give you a short summary. I might need some help. |
Yes, I wanted an idea of how to test. I know that julia> Nx, Ny = 3, 4
(3, 4)
julia> a = rand(Nx, Ny)
3×4 Matrix{Float64}:
0.453795 0.778266 0.499822 0.749407
0.646126 0.855627 0.00391171 0.158878
0.307381 0.898621 0.453264 0.604466
julia> ah = fft(a)
3×4 Matrix{ComplexF64}:
6.40956+0.0im 0.450305-1.01976im -1.68096+0.0im 0.450305+1.01976im
0.517153+0.518913im -0.64285-0.21592im -0.0206029-0.327336im 0.0544652-1.14911im
0.517153-0.518913im 0.0544652+1.14911im -0.0206029+0.327336im -0.64285+0.21592im
julia> ifft(ah)
3×4 Matrix{ComplexF64}:
0.453795+0.0im 0.778266+0.0im 0.499822+0.0im 0.749407+0.0im
0.646126+0.0im 0.855627+0.0im 0.00391171+0.0im 0.158878+0.0im
0.307381+0.0im 0.898621+0.0im 0.453264+0.0im 0.604466+0.0im
julia> bfft(ah) / (Nx * Ny)
3×4 Matrix{ComplexF64}:
0.453795+0.0im 0.778266+0.0im 0.499822+0.0im 0.749407+0.0im
0.646126+0.0im 0.855627+0.0im 0.00391171+0.0im 0.158878+0.0im
0.307381+0.0im 0.898621+0.0im 0.453264+0.0im 0.604466+0.0imBut I don't know exactly how to incorporate this function in the plans so that it's the same as elsewhere in the pressure solver. |
|
Hm... If I understand correctly we need to manually apply the scaling after the inverse transform. How do I do that best way? The normalization should be N = prod(size(A, d) for d in dims)
function backward_transform!(x)
p * x # in-place inverse FFT
@. x = x / N # in-place normalization
return x
end |
|
What prevents you from using that exact code? |
|
I tried and failed. I need help perhaps? |
|
I tried doing this in the AMD extension: function plan_backward_transform(A::ROCArray, ::Union{Bounded, Periodic}, dims, planner_flag)
length(dims) == 0 && return nothing
p = AMDGPU.rocFFT.plan_bfft!(A, dims)
N = prod(size(A, d) for d in dims)
function backward_transform!(x, y)
@. x = p * y # in-place inverse FFT
@. x = x / N # in-place normalization
return x
end
return backward_transform!
endbut I got an error related with binary operations... The pressure_solver is constructed but while time stepping I get: julia> using Oceananigans, AMDGPU
[ Info: Precompiling OceananigansAMDGPUExt [8da284e0-aa90-5603-9506-0dba59ddb112]
julia> grid = RectilinearGrid(GPU(AMDGPU.ROCBackend()), size=(16, 8, 4), x=(0, 16), y=(0, 1), z=[0, 1, 2, 3, 4])
16×8×4 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on ROCGPU with 3×3×3 halo
├── Periodic x ∈ [0.0, 16.0) regularly spaced with Δx=1.0
├── Periodic y ∈ [0.0, 1.0) regularly spaced with Δy=0.125
└── Bounded z ∈ [0.0, 4.0] variably spaced with min(Δz)=1.0, max(Δz)=1.0
julia> pressure_solver = Oceananigans.Solvers.FFTBasedPoissonSolver(grid)
FFTBasedPoissonSolver on GPU{ROCBackend}:
├── grid: 16×8×4 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on ROCGPU with 3×3×3 halo
├── storage: ROCArray{ComplexF64, 3, AMDGPU.Runtime.Mem.HIPBuffer}
├── buffer: ROCArray{ComplexF64, 3, AMDGPU.Runtime.Mem.HIPBuffer}
└── transforms:
├── forward: Oceananigans.Solvers.DiscreteTransform
└── backward: Oceananigans.Solvers.DiscreteTransform
julia> model = NonhydrostaticModel(; grid, pressure_solver)
NonhydrostaticModel{GPU, RectilinearGrid}(time = 0 seconds, iteration = 0)
├── grid: 16×8×4 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on ROCGPU with 3×3×3 halo
├── timestepper: RungeKutta3TimeStepper
├── advection scheme: Centered(order=2)
├── tracers: ()
├── closure: Nothing
├── buoyancy: Nothing
└── coriolis: Nothing
julia> simulation = Simulation(model, Δt=10, stop_iteration=3)
Simulation of NonhydrostaticModel{GPU, RectilinearGrid}(time = 0 seconds, iteration = 0)
├── Next time step: 10 seconds
├── Elapsed wall time: 0 seconds
├── Wall time per iteration: NaN days
├── Stop time: Inf days
├── Stop iteration: 3.0
├── Wall time limit: Inf
├── Minimum relative step: 0.0
├── Callbacks: OrderedDict with 4 entries:
│ ├── stop_time_exceeded => Callback of stop_time_exceeded on IterationInterval(1)
│ ├── stop_iteration_exceeded => Callback of stop_iteration_exceeded on IterationInterval(1)
│ ├── wall_time_limit_exceeded => Callback of wall_time_limit_exceeded on IterationInterval(1)
│ └── nan_checker => Callback of NaNChecker for u on IterationInterval(100)
├── Output writers: OrderedDict with no entries
└── Diagnostics: OrderedDict with no entries
julia> run!(simulation)
[ Info: Initializing simulation...
[ Info: ... simulation initialization complete (4.818 seconds)
[ Info: Executing initial time step...
ERROR: MethodError: no method matching *(::OceananigansAMDGPUExt.var"#backward_transform!#3"{…}, ::ROCArray{…})
Closest candidates are:
*(::Any, ::Any, ::Any, ::Oceananigans.Grids.AbstractGrid, ::Any, ::Any, ::Number, ::Oceananigans.AbstractOperations.BinaryOperation)
@ Oceananigans ~/Oceananigans.jl/src/AbstractOperations/binary_operations.jl:91
*(::Any, ::Any, ::Any, ::Oceananigans.Grids.AbstractGrid, ::Any, ::Any, ::Oceananigans.AbstractOperations.BinaryOperation, ::Oceananigans.AbstractOperations.BinaryOperation)
@ Oceananigans ~/Oceananigans.jl/src/AbstractOperations/binary_operations.jl:70
*(::Any, ::Any, ::Any, ::Oceananigans.Grids.AbstractGrid, ::Any, ::Any, ::Oceananigans.AbstractOperations.BinaryOperation, ::Oceananigans.Fields.AbstractField)
@ Oceananigans ~/Oceananigans.jl/src/AbstractOperations/binary_operations.jl:76
...
Stacktrace:
[1] apply_transform!(A::ROCArray{…}, B::ROCArray{…}, plan::Function, ::Nothing)
@ Oceananigans.Solvers ~/Oceananigans.jl/src/Solvers/discrete_transforms.jl:142
[2] (::Oceananigans.Solvers.DiscreteTransform{…})(A::ROCArray{…}, buffer::ROCArray{…})
@ Oceananigans.Solvers ~/Oceananigans.jl/src/Solvers/discrete_transforms.jl:118
[3] solve!(ϕ::Field{…}, solver::Oceananigans.Solvers.FFTBasedPoissonSolver{…}, b::ROCArray{…}, m::Int64)
@ Oceananigans.Solvers ~/Oceananigans.jl/src/Solvers/fft_based_poisson_solver.jl:119
[4] solve!(ϕ::Field{…}, solver::Oceananigans.Solvers.FFTBasedPoissonSolver{…})
@ Oceananigans.Solvers ~/Oceananigans.jl/src/Solvers/fft_based_poisson_solver.jl:96
[5] solve_for_pressure!(pressure::Field{…}, solver::Oceananigans.Solvers.FFTBasedPoissonSolver{…}, Δt::Float64, args::@NamedTuple{…})
@ Oceananigans.Models.NonhydrostaticModels ~/Oceananigans.jl/src/Models/NonhydrostaticModels/solve_for_pressure.jl:85
[6] compute_pressure_correction!(model::NonhydrostaticModel{…}, Δt::Float64)
@ Oceananigans.Models.NonhydrostaticModels ~/Oceananigans.jl/src/Models/NonhydrostaticModels/pressure_correction.jl:13
[7] time_step!(model::NonhydrostaticModel{…}, Δt::Float64; callbacks::Tuple{})
@ Oceananigans.TimeSteppers ~/Oceananigans.jl/src/TimeSteppers/runge_kutta_3.jl:114
[8] time_step!(sim::Simulation{…})
@ Oceananigans.Simulations ~/Oceananigans.jl/src/Simulations/run.jl:149
[9] run!(sim::Simulation{…}; pickup::Bool)
@ Oceananigans.Simulations ~/Oceananigans.jl/src/Simulations/run.jl:105
[10] run!(sim::Simulation{…})
@ Oceananigans.Simulations ~/Oceananigans.jl/src/Simulations/run.jl:92
[11] top-level scope
@ REPL[6]:1
Some type information was truncated. Use `show(err)` to see complete types. |
|
The stack trace shows that the error originates from here: Oceananigans.jl/src/Solvers/discrete_transforms.jl Lines 141 to 144 in 3489204
You need to extend |
|
although the
|
|
Oceananigans.jl/src/Solvers/discrete_transforms.jl Lines 8 to 17 in 3489204
|
|
Yes. I am confused though because the Fourier transforms don't include a normalisation factor but then it seems that the DiscreteTransforms method were called. (Is the Fourier transforms <: DiscreteTransforms)? Perhaps I have a lack of understanding of how things work in general with the transforms and that's where the essence of my difficulty lies... :( |
|
Yes but nobody understands this so the only solution is to figure it out |
|
this file organizes the transforms:
it looks like plans are made and then inserted into |
:D OK! Will try to penetrate this! |
|
(it's helpful to hear that we are at the same denominator) |
|
The constructor for
this builds normalization factors here:
The
for fallback and
on CPU. You can try extending It is a little dirty, because it makes an assumption that every architecture is tied to a particular transform implementation. Nevertheless, this is the case for all existing code and also here as well. |
|
The normalization factor is applied here Oceananigans.jl/src/Solvers/discrete_transforms.jl Lines 108 to 122 in 3489204
via functions Oceananigans.jl/src/Solvers/discrete_transforms.jl Lines 178 to 184 in 3489204
if this fits your use case, you can try it. It is all a little overengineered and therefore overcomplex. |
|
OK, if I understand correctly the FFT-related methods were already included in #4499, right @michel2323? So perhaps this PR is not needed? Or it might be a good idea to add a NonhydrostaticModel test; perhaps this is what the PR should do. |
|
Hm... I seem to be getting an error that plan_ifft! is not defined? |
I just came across this. The FFT stuff in the AMDGPU extension is simply broken: NumericalEarth/Breeze.jl#369 (comment). Oceananigans shouldn't call |
|
Makes sense, so we don’t have to figure out how to do this translation from ifft to bfft ourselves |
|
New errors, but the undefined |
|
|
||
| @test iteration(simulation) == 3 | ||
| @test time(simulation) == 3minutes | ||
| build_and_time_step_simulation(model) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
Remaining errors are the method error with |
This PR adds FFTBasedPoisonSolve method for AMDGPU.
Closes #4591