
The AIFS generates different predictions with the same inputs. How to fix it? #102

Open
LQscience opened this issue Jan 12, 2025 · 1 comment
Labels
bug Something isn't working

Comments

@LQscience

What happened?

I'm running AIFS with the same inputs and random seed, but I'm getting inconsistent results. Could you please guide me on how to properly set the random seed so that I can obtain the same predictions under the same conditions?

Here's the random seed setting I'm using:

import torch
import numpy as np
import random

seed = 42
# Seed PyTorch on CPU and on all CUDA devices
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
torch.cuda.manual_seed_all(seed)
# Request deterministic cuDNN kernels and disable autotuning
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
# Seed NumPy and Python's built-in RNG
np.random.seed(seed)
random.seed(seed)
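For reference, the reproducibility check being applied here boils down to a seed-then-compare pattern: seed, run, reseed, run again, and verify the outputs match. A minimal stdlib-only sketch of that pattern (independent of AIFS and of any CUDA state, which is where the actual non-determinism arises):

```python
import random

def run_twice_with_seed(seed: int) -> bool:
    """Seed, draw a sequence, reseed, draw again; equal sequences
    indicate the computation is deterministic under that seed."""
    random.seed(seed)
    first = [random.random() for _ in range(5)]
    random.seed(seed)
    second = [random.random() for _ in range(5)]
    return first == second

print(run_twice_with_seed(42))  # True: the stdlib RNG is fully seed-determined
```

For a GPU model the same check can fail even with identical seeds, because some CUDA kernels are non-deterministic unless explicitly constrained.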

What are the steps to reproduce the bug?

Here are the inconsistent results I have gotten:
[Screenshot (2025-01-12 22:12:09) showing the differing predictions]

Version

v2.1.0

Platform (OS and architecture)

Linux I1ed537bc0000401ffb 5.4.0-126-generic #142-Ubuntu SMP Fri Aug 26 12:12:57 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Relevant log output

No response

Accompanying data

No response

Organisation

No response

@LQscience LQscience added the bug Something isn't working label Jan 12, 2025
@anaprietonem

anaprietonem commented Jan 16, 2025

Hello @LQscience !

Thanks for opening this issue and for your interest in Anemoi. The behaviour you are seeing is not a bug but the expected behaviour, unless, as you correctly suggested, one constrains some of the PyTorch settings to enforce determinism. Regarding those settings, could you try the following:

# First, in your terminal:
export CUBLAS_WORKSPACE_CONFIG=:4096:8

# Then, before running inference:
import torch
torch.backends.cudnn.benchmark = False
torch.backends.cudnn.deterministic = True
torch.use_deterministic_algorithms(True)
Note that enforcing deterministic behaviour may come at the cost of increased runtime.
We will aim to update the anemoi-inference docs with a section that illustrates this better!
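One practical detail worth noting: `CUBLAS_WORKSPACE_CONFIG` is read when the CUDA context is created, so it must be in the process environment before inference starts. Exporting it in the shell, as above, is the usual approach; an equivalent alternative is to inject it when launching the inference script from Python. A minimal sketch of that (the `-c` payload stands in for a hypothetical inference script):

```python
import os
import subprocess
import sys

# Build an environment for the child process with the cuBLAS
# workspace setting required by torch.use_deterministic_algorithms(True).
env = dict(os.environ, CUBLAS_WORKSPACE_CONFIG=":4096:8")

# Launch a child Python process with that environment; a real invocation
# would run the inference script instead of this echo of the variable.
out = subprocess.run(
    [sys.executable, "-c",
     "import os; print(os.environ['CUBLAS_WORKSPACE_CONFIG'])"],
    env=env, capture_output=True, text=True,
)
print(out.stdout.strip())  # :4096:8
```

Setting `os.environ` inside the inference script itself also works, but only if it happens before the first CUDA kernel runs.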
