
test: triangular attention snapshots#128

Open
jandom wants to merge 14 commits into main from jandom/2026-02/test/triangular-attention-exploration-snapshots

Conversation

@jandom
Collaborator

@jandom jandom commented Feb 17, 2026

Summary

Adding our 1st snapshot test. I'm not sure how this will work across different GPU types, especially with jit.

Changes

  • added comments to the test to ease the reader in
  • converted the test to pytest
  • added a snapshot test for TriangleAttention (.npz-backed)
  • added a new test dependency: pytest-regressions

Question for reviewers

I think we should factor out the shared test code somewhat; otherwise the test cases repeat a lot of the same boilerplate:

  • seed torch
  • eval
  • call with no_grad()
  • assert that outputs are not all zeros
  • snapshot comparison
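The steps above could be factored into a shared helper. A rough sketch, assuming a hypothetical helper name and signature that are not part of this PR:

```python
# Hypothetical shared helper (name and signature are illustrative only).
import torch


def run_module_snapshot(module: torch.nn.Module, inputs: dict, seed: int = 123) -> dict:
    """Run the shared snapshot-test boilerplate: seed torch, switch the
    module to eval mode, forward under no_grad, and check that the
    outputs are not all zeros.

    Returns a dict of numpy arrays ready for ndarrays_regression.check().
    """
    torch.manual_seed(seed)
    module.eval()
    with torch.no_grad():
        outputs = module(**inputs)
    # Normalize single-tensor outputs into a dict for the snapshot fixture.
    if not isinstance(outputs, dict):
        outputs = {"output": outputs}
    arrays = {k: v.detach().cpu().numpy() for k, v in outputs.items()}
    # Guard against a silently broken forward pass producing all zeros.
    assert any(a.any() for a in arrays.values()), "all outputs are zero"
    return arrays
```

A test body would then shrink to something like `ndarrays_regression.check(run_module_snapshot(module, inputs))`.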

Related Issues

Testing

Other Notes

@jandom jandom requested review from christinaflo and jnwei February 17, 2026 14:23
@jandom jandom self-assigned this Feb 17, 2026
@jandom jandom added the safe-to-test Internal only label used to indicate PRs that are ready for automated CI testing. label Feb 17, 2026
@jandom jandom marked this pull request as ready for review February 18, 2026 17:35
Contributor

@jnwei jnwei left a comment


Overall, this is awesome. I did not know about pytest-regressions, but it seems perfect for our use case.

As this PR is hopefully the first of a few PRs to introduce snapshot tests, I have a few general questions regarding testing / organization

  • Which system did you create the snapshots on? I can help test the snapshots on other machines to see if they pass.

  • How do you envision organizing the snapshot files? It seems that by default, a folder is created for each test. If we end up with many snapshot files, I wonder if it is better to place all of the snapshots in one directory so that we can zip the files.

@jandom
Collaborator Author

jandom commented Feb 23, 2026

Which system did you create the snapshots on? I can help test the snapshots on other machines to see if they pass.

These were all created on my CPU VM but passed on g5.xlarge, so either the run was CPU-only or the results were within tolerance on the GPU. Maybe we can include something in the fixture filename so that snapshots are arch-specific. Ideally we can automate generating them for the hardware we need; in a past life it was very tedious to keep these from going stale manually.
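An arch-specific filename suffix could look something like this. Purely illustrative; the helper name and naming scheme are not part of this PR:

```python
# Hypothetical helper: derive a hardware-specific suffix so CPU and
# per-GPU baselines can coexist as separate snapshot files.
import platform


def snapshot_arch_suffix() -> str:
    """Return e.g. 'cpu-x86_64' or 'cuda-<device-name>-aarch64'."""
    try:
        import torch  # fall back to CPU-only naming if torch is absent
        if torch.cuda.is_available():
            device = torch.cuda.get_device_name(0).lower().replace(" ", "-")
            return f"cuda-{device}-{platform.machine()}"
    except ImportError:
        pass
    return f"cpu-{platform.machine()}"
```

The suffix could then be folded into the snapshot name via the `basename` argument that `ndarrays_regression.check` accepts.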

How do you envision organizing the snapshot files? It seems that by default, a folder is created for each test. If we end up with many snapshot files, I wonder if it is better to place all of the snapshots in one directory so that we can zip the files.

Indeed, I don't have a good answer. Some things are better local, some things are better global. For tests, we chose to keep them separate but organized like the codebase. I'll explore how easy it'd be to keep them in a separate location.

The compression is a separate thing; ideally we don't expose the internals to the developer, and instead "hide" pytest-regressions behind a fixture that handles compression/location for them.

@jandom
Collaborator Author

jandom commented Mar 31, 2026

Now that we have the VCR snapshots, this could actually live alongside those

@jandom jandom requested a review from jnwei March 31, 2026 17:07
Comment on lines +83 to +86
@pytest.fixture(scope="module")
def original_datadir(request: pytest.FixtureRequest) -> Path:
"""Redirect pytest-regressions snapshot storage to test_data/snapshots/."""
return Path(__file__).parent / "test_data" / "snapshots" / Path(request.path).stem
Collaborator Author


This is what allows us to place the snapshots alongside other test_data, as siblings to the 'cassettes'.

Contributor


Nice! Does it make sense to update the snapshot paths for the templates test you added in https://github.com/aqlaboratory/openfold-3/tree/main/openfold3/tests/test_data/cassettes/test_rscb

Can be a different PR if it is cumbersome to update here.

Collaborator Author


These are different snapshots – we have two types

  • the VCR snapshots for web requests
  • ndarrays_regression snapshots for numerics

I'd be wary of mixing those up – they mean different things and solve different problems.

@jandom jandom added safe-to-test Internal only label used to indicate PRs that are ready for automated CI testing. and removed safe-to-test Internal only label used to indicate PRs that are ready for automated CI testing. labels Mar 31, 2026
Contributor

@jnwei jnwei left a comment


This all looks good, just one suggestion regarding a context manager for the random state.


# NOTE: seeding may need further work — torch.manual_seed controls both
# the random input and the module's weight init. If init changes upstream,
# regenerate snapshots with: pytest --force-regen
torch.manual_seed(123)
Contributor


I think you can use a context manager to handle the rng state, something like this (I have not tested this myself)

# conftest.py

import random
import numpy as np
import torch
import pytest
from torch.random import fork_rng

@pytest.fixture()
def seeded_rng():
    """Isolate all RNG state (torch, numpy, python) for the duration of the test."""
    py_state = random.getstate()
    np_state = np.random.get_state()
    with fork_rng():          # saves/restores torch (+ CUDA) state
        torch.manual_seed(123)
        # set other random states if needed
        random.seed(123)
        np.random.seed(123)
        yield
    # torch state restored by fork_rng on exit
    random.setstate(py_state)         # restore python state manually
    np.random.set_state(np_state)     # restore numpy state manually

Docs for fork_rng

Collaborator Author


Oh, this is great – in general I think something like this will be a good fit for these tests, because they're very similar to each other, and we'll duplicate a lot of code between different module tests.

@jandom
Copy link
Copy Markdown
Collaborator Author

jandom commented Apr 3, 2026

Good news on the cross-platform front. There will definitely be some hiccups eventually, but so far I'm testing on DGX/aarch64 and all of these pass – no update needed.

@jandom jandom requested a review from jnwei April 3, 2026 14:08