# Solving the Burgers Equation with the Fourier Neural Operator

This example mostly replicates the original work by [Li et al.](https://github.com/zongyi-li/fourier_neural_operator/blob/master/fourier_1d.py).

We try to create an operator for the Burgers equation

$$ \partial_t u(x,t) + \partial_x (u^2(x,t)/2) = \nu \partial_{xx} u(x,t) $$

in one dimension on a unit spatial and temporal domain. The operator maps the initial condition $u(x,0) = u_0(x)$ to the flow field at the final time $u(x,1)$.

So overall, we need an approximation function that does the following:

```julia
function foo(u0, x)
    # Do something
    return u1
end
```

We sample from a dataset that contains several instances of the initial condition (`a`) and the final velocity field (`u`).
The data is given on a grid of 8192 points; however, we only want to sample 1024 of them.

```julia
using Flux: length, reshape, train!, throttle, @epochs
using OperatorLearning, Flux, MAT

device = gpu;

# Read the data from the MAT file and store it in a dict
vars = matread("burgers_data_R10.mat") |> device

# For trial purposes, we might want to train with different resolutions,
# so we sample only every n-th element
subsample = 2^3;

# Create the x training array, according to our desired grid size
xtrain = vars["a"][1:1000, 1:subsample:end] |> device;
# Create the x test array
xtest = vars["a"][end-99:end, 1:subsample:end] |> device;

# Create the y training array
ytrain = vars["u"][1:1000, 1:subsample:end] |> device;
# Create the y test array
ytest = vars["u"][end-99:end, 1:subsample:end] |> device;
```
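
At this point it is worth verifying that the slicing and subsampling did what we expect; a minimal sanity check (the shapes follow from the 1000/100 train/test split and the 8192-to-1024 subsampling above):

```julia
# Both arrays should now be (samples × grid points)
@assert size(xtrain) == (1000, 1024)
@assert size(xtest) == (100, 1024)
```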

For now, we only have one input and one output array. In addition, we need the corresponding x values for a(x) and u(x) as a second input channel, which are still missing at this point. The data were sampled on an equispaced grid (otherwise the FFT in our architecture wouldn't work anyway), so creating them manually is fairly straightforward:

```julia
# The data is missing the grid coordinates, so we create them;
# `collect` converts the `range` into an array
grid = collect(range(0, 1, length=length(xtrain[1,:]))) |> device

# Merge the created grid with the data.
# Output has the dims: batch x grid points x 2 (a(x), x)
# First, stack copies of the grid row-wise so that every sample
# carries the same x coordinates
gridtrain = reshape(repeat(reshape(grid,1,:),1000),(1000,1024,1)) |> device
gridtest = reshape(repeat(reshape(grid,1,:),100),(100,1024,1)) |> device

# Then reshape the data to a 3D tensor and concatenate
# the grid along the newly created 3rd dim
xtrain = cat(reshape(xtrain,(1000,1024,1)), gridtrain; dims=3) |> device
ytrain = cat(reshape(ytrain,(1000,1024,1)), gridtrain; dims=3) |> device
# Same treatment for the test data
xtest = cat(reshape(xtest,(100,1024,1)), gridtest; dims=3) |> device
ytest = cat(reshape(ytest,(100,1024,1)), gridtest; dims=3) |> device
```
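
A quick way to convince ourselves that the grid channel ended up where we intended; a minimal check, assuming the arrays support elementwise comparison on the chosen device:

```julia
# Channel 2 of every sample should hold a copy of the grid
@assert size(xtrain) == (1000, 1024, 2)
@assert xtrain[1, :, 2] == grid
```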

Next we need to consider the shape that the `FourierLayer` expects its inputs to have, i.e. `[numInputs, grid, batch]`. Our dataset has the batch dim first, however, so we need to do some permuting:

```julia
# Our net wants the input in the form (2, grid, batch),
# so we permute the dims. Note that `|>` binds tighter than `,`,
# so each array is piped to the device separately.
xtrain, xtest = permutedims(xtrain,(3,2,1)) |> device, permutedims(xtest,(3,2,1)) |> device
ytrain, ytest = permutedims(ytrain,(3,2,1)) |> device, permutedims(ytest,(3,2,1)) |> device
```

In order to slice the data into mini-batches, we pass the arrays to the Flux `DataLoader`:

```julia
# Pass the data to the Flux DataLoader and give it a batch size of 20
train_loader = Flux.Data.DataLoader((xtrain, ytrain), batchsize=20, shuffle=true) |> device
test_loader = Flux.Data.DataLoader((xtest, ytest), batchsize=20, shuffle=false) |> device
```
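
To double-check the layout the network will actually see, we can peek at one mini-batch; a small sketch (the names `xbatch` and `ybatch` are just illustrative):

```julia
# Grab the first batch and inspect its dimensions;
# both should come out as (channels, grid points, batch) = (2, 1024, 20)
xbatch, ybatch = first(train_loader)
@show size(xbatch) size(ybatch)
```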

We can now set up the architecture. We lift the inputs to a higher-dimensional space via a simple linear transform using a `Dense` layer; the input dimensionality is 2, and we transform it to 128. After that, we set up 4 instances of a Fourier Layer, where we keep only 16 of the `N/2 + 1 = 513` modes that the FFT provides, and use the GeLU activation. Finally, we reduce the latent space back to the two output arrays we wish to obtain, `u1(x)` and `x`:

```julia
# Set up a constructor for the Fourier Layer:
# 128 in- and outputs, grid size 1024,
# 16 modes to keep, GeLU activation, no bias in the Fourier path
fourier() = FourierLayer(128,128,1024,16,gelu,bias_fourier=false)

# The whole architecture:
# linear transform into the latent space, 4 Fourier Layers
# (constructed separately so they don't share their weights),
# then transform back
model = Chain(Dense(2,128;bias=false), fourier(), fourier(), fourier(), fourier(),
              Dense(128,2;bias=false)) |> device
```
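
Before training, a single forward pass makes a useful smoke test; a sketch using the batch variables from above:

```julia
# An untrained forward pass; the model should map a
# (2, 1024, 20) batch to a (2, 1024, 20) output
@show size(model(xbatch))
```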

The rest is more or less boilerplate training code for a DNN. We employ the ADAM optimizer with a fixed learning rate of 1e-3, use the mean squared error as the loss, evaluate the test loss as a callback, and train the FNO for 500 epochs.

```julia
# We use the ADAM optimizer for training
learning_rate = 0.001
opt = ADAM(learning_rate)

# Specify the model parameters
parameters = params(model)

# The loss function
loss(x,y) = Flux.Losses.mse(model(x),y)

# Define a callback function that gives some output during training
evalcb() = @show(loss(xtest,ytest))
# Print the callback only every 5 seconds
throttled_cb = throttle(evalcb, 5)

# Do the training loop
Flux.@epochs 500 train!(loss, parameters, train_loader, opt, cb = throttled_cb)
```
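
Once training has finished, one way to gauge the result is a relative L2 error over the test set; a minimal sketch (the name `rel_err` and the choice of metric are illustrative, not part of the original example):

```julia
# Compare the predicted velocity field (channel 1) against the truth
prediction = model(xtest)
rel_err = sqrt(sum(abs2, prediction[1,:,:] .- ytest[1,:,:]) /
               sum(abs2, ytest[1,:,:]))
@show rel_err
```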