
Conversation

@mgorny (Contributor) commented Dec 16, 2025

Checklist

  • Used a personal fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

@conda-forge-admin (Contributor) commented Dec 16, 2025

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe/meta.yaml) and found it was in an excellent condition.

I do have some suggestions for making it better though...

For recipe/meta.yaml:

  • ℹ️ The magma output has been superseded by libmagma-devel.
  • ℹ️ The recipe is not parsable by parser conda-souschef (grayskull). This parser is not currently used by conda-forge, but may be in the future. We are collecting information to see which recipes are compatible with grayskull.
  • ℹ️ The recipe is not parsable by parser conda-recipe-manager. The recipe can only be automatically migrated to the new v1 format if it is parseable by conda-recipe-manager.

This message was generated by GitHub Actions workflow run https://github.com/conda-forge/conda-forge-webservices/actions/runs/20469266995. Examine the logs at this URL for more detail.

@mgorny (Contributor, Author) commented Dec 17, 2025

 $SRC_DIR/aten/src/ATen/native/mps/OperationUtils.mm:862:33: error: use of undeclared identifier 'MTLLanguageVersion3_1'
  862 |     [options setLanguageVersion:MTLLanguageVersion3_1];
      |                                 ^

I'm going to try bumping the macOS SDK to 14.5.

@mgorny (Contributor, Author) commented Dec 17, 2025

win64:

  -- Could NOT find nvtx3 (missing: nvtx3_dir)
  CMake Warning at cmake/Dependencies.cmake:988 (message):
    Cannot find NVTX3, find old NVTX instead
  Call Stack (most recent call first):
    CMakeLists.txt:940 (include)

I see a missing CUDA_INCLUDE_DIRS substitution; I'll update the CUDAToolkit patch.

@mgorny (Contributor, Author) commented Dec 17, 2025

Also, it looks like the system-fmt patch is no longer working correctly, probably due to some upstream fixes.

@mgorny (Contributor, Author) commented Dec 17, 2025

(For the next testing round, I'll run fewer builds.)

@mgorny (Contributor, Author) commented Dec 17, 2025

The Linux non-CUDA builds are failing with:

=================================== FAILURES ===================================
_______ TestTorchDeviceTypeCPU.test_cpp_warnings_have_python_context_cpu _______
[gw0] linux -- Python 3.12.12 $PREFIX/bin/python3.12

self = <test_torch.TestTorchDeviceTypeCPU testMethod=test_cpp_warnings_have_python_context_cpu>
device = 'cpu'

    @skipIfTorchDynamo("Not a suitable test for TorchDynamo")
    @unittest.skipIf(TEST_WITH_CROSSREF, "crossref perturbs line numbering")
    def test_cpp_warnings_have_python_context(self, device):
        # Creates long string in advance to avoid a too-long Python line
        s = ".+Triggered internally at.+RangeFactories.+"
        # nvfuser deprecation warning filter
        warnings.filterwarnings("ignore", "torch::jit::fuser::cuda", UserWarning)
    
        def cpp_warn_fn():
            out = torch.empty((5,))
            torch.arange(0, 3, out=out)
            return out
    
        # Checks eager-mode cpp warning
        with warnings.catch_warnings(record=True) as w:
            cpp_warn_fn()
            frameinfo = inspect.getframeinfo(inspect.currentframe())
            warning = w[0]
    
            # Checks for cpp context in the warning message
            escaped_warning_message = str(warning.message).encode('unicode_escape')
            self.assertTrue(re.search(s, repr(escaped_warning_message), re.IGNORECASE) is not None)
    
            # Checks the Python features of the warning
            # Note: the eager mode warning refers to the line in the function
            # that throws the warning.
            self.assertEqual(frameinfo.lineno - 6, warning.lineno)
            self.assertEqual(len(w), 1)
    
        # Checks jitted cpp warning
        with warnings.catch_warnings(record=True) as w:
            scripted_cpp_warn_fn = torch.jit.script(cpp_warn_fn)
            scripted_cpp_warn_fn()
            warning = w[0]
    
            # Checks for cpp context in the warning message
            escaped_warning_message = str(warning.message).encode('unicode_escape')
>           self.assertTrue(re.search(s, repr(escaped_warning_message), re.IGNORECASE) is not None)
E           AssertionError: False is not true
E           
E           To execute this test, run the following from the base repo dir:
E               python test/test_torch.py TestTorchDeviceTypeCPU.test_cpp_warnings_have_python_context_cpu
E           
E           This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0

test/test_torch.py:905: AssertionError

To be honest, I wouldn't bother and would just skip this test. It's only a check on a warning message, and my educated guess is that the check is being confused by an extra warning.
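The test's `w[0]` indexing assumes the cpp warning is the first one recorded; a minimal sketch (with hypothetical warning texts) of how one stray earlier warning would break that assumption:

```python
import warnings

def check_first_warning():
    """Mimic the test's pattern of asserting on w[0] (hypothetical warnings)."""
    with warnings.catch_warnings(record=True) as w:
        warnings.simplefilter("always")
        # An unrelated warning fires first, e.g. from an import or a filter change
        warnings.warn("some unrelated deprecation", DeprecationWarning)
        # The warning the test actually wants to inspect
        warnings.warn("Triggered internally at RangeFactories", UserWarning)
        # The test reads w[0], which is now the *wrong* warning
        return str(w[0].message)

print(check_first_warning())  # -> "some unrelated deprecation"
```

It would also explain the later `assertEqual(len(w), 1)` passing in eager mode but the regex failing in the jit section, if the extra warning only fires there.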

@mgorny (Contributor, Author) commented Dec 17, 2025

Oh, or maybe it's --disable-warnings 🤷.

@mgorny (Contributor, Author) commented Dec 17, 2025

AArch64:

FAILED [66.9597s] test/test_nn.py::TestNN::test_upsampling_bfloat16 - AssertionError: Tensor-likes are not close!

Mismatched elements: 1361 / 1386 (98.2%)
Greatest absolute difference: nan at index (0, 0, 10, 6, 0) (up to 0.01 allowed)
Greatest relative difference: nan at index (0, 0, 10, 6, 0) (up to 0.01 allowed)

To execute this test, run the following from the base repo dir:
    python test/test_nn.py TestNN.test_upsampling_bfloat16

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
FAILED [0.1908s] test/

Presumably emulation inaccuracy?
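That seems plausible: bfloat16 keeps only 7 mantissa bits, so its relative precision is roughly 2^-8 to 2^-7, already close to the test's 0.01 tolerance before any emulation error is added. A quick pure-Python sketch of (truncation-mode) float32→bfloat16 conversion shows the scale of the rounding — the constants here are illustrative, not taken from the failing test:

```python
import struct

def to_bfloat16(x: float) -> float:
    """Truncate a float32 to bfloat16 by keeping its top 16 bits
    (1 sign + 8 exponent + 7 mantissa). Real hardware typically
    rounds to nearest instead; truncation is the simplest mode and
    is enough to show the magnitude of the precision loss."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

x = 3.14159
bx = to_bfloat16(x)
print(bx)               # 3.140625
print(abs(bx - x) / x)  # relative error on the order of 1e-4 to 1e-3
```

The NaNs in the report suggest something beyond plain rounding, though, so an emulation bug on AArch64 is still a reasonable suspect.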

@mgorny (Contributor, Author) commented Dec 19, 2025

FAILED [1.1162s] test/inductor/test_torchinductor.py::GPUTests::test_lite_regional_compile_flex_attention_cuda - torch._dynamo.exc.Unsupported: Attempt to trace generator
  Explanation: Generators cannot be compiled directly with `torch.compile`.
  Hint: Call a generator from inside of a non-generator Python function and compile that function instead.
  Hint: This graph break is fundamental - it is unlikely that Dynamo will ever be able to trace through your code. Consider finding a workaround.

  Developer debug context: 

 For more details about this graph break, please visit: https://meta-pytorch.github.io/compile-graph-break-site/gb/gb0003.html

(@mgorny posted a comment that was marked as outdated.)

Signed-off-by: Michał Górny <[email protected]>
…5.12.21.08.15.33

Other tools:
- conda-build 25.11.1
- rattler-build 0.55.0
- rattler-build-conda-compat 1.4.10
@mgorny (Contributor, Author) commented Dec 23, 2025

Oops, I've just learned that there's an rc3 already. Since most of the builds are clearly green (and I expect the ones that haven't run yet would be green too), I'll update the PR in a minute.

Signed-off-by: Michał Górny <[email protected]>
Thanks to Andrey Talman for fixing the builds!

Signed-off-by: Michał Górny <[email protected]>
@mgorny changed the title from "2.10.0rc1" to "2.10.0 RCs" on Dec 23, 2025
@mgorny (Contributor, Author) commented Dec 24, 2025

Looks like we've just hit the 20h timeout on osx-arm64. I've retried in case it was a fluke, but I suppose we may want to extend the "test one python impl" limit to osx-arm64.

@mgorny marked this pull request as ready for review on December 25, 2025 at 16:02
@mgorny (Contributor, Author) commented Dec 25, 2025

Okay, it looks like the 20h timeout was a fluke; probably the jobs hit a busy period or something. Perhaps long-term it would make sense to increase the timeout a bit, but for now 2.10.0rc3 passed.

@h-vetinari (Member) left a comment

Thanks a lot! I wanted to incorporate some learnings from the test-skip hygiene PRs (Windows is ready), but we can do that for GA.

Looks like we've just hit the 20h timeout on osx-arm64. I've retried in case it was a fluke

Okay, looks like the 20h timeout is a fluke

Not a fluke, unfortunately; see #462. So that won't be solved "for free" by the version update (though it also isn't directly related to 2.10). Looking at a previous run that hit it, it's a remarkably bad regression, c.f.

= 8673 passed, 4907 skipped, 57 xfailed, 65981 warnings in 24776.23s (6:52:56) =
                                                                      ^^^^^^^
                                                                      !!!!!!!

with

== 7555 passed, 1589 skipped, 31 xfailed, 65834 warnings in 230.02s (0:03:50) ==

from the last run.

@h-vetinari merged commit 13893f4 into conda-forge:main on Dec 28, 2025 (21 of 24 checks passed)
@Tobias-Fischer (Contributor) commented
Just wanted to say big congrats on getting this in so quickly - very promising to see, thank you!
