test_cp_with_fused_attention[False-None-False-False-False-p2p-thd-cp_1_1-bf16]::subprocess.CalledProcessError: Command '['python3', '-m', 'torch.distributed.launch', '--nproc-per-node=2', '/workspace/tests/pytorch/attention/run_attention_with_cp.py', 'dtype=bf16', 'model=cp_1_1', 'qkv_format=thd', 'kernel_backend=FusedAttention', 'cp_comm_type=p2p', 'fp8_bwd=False', 'fp8_dpa=False', 'fp8_mha=False', 'scaling_mode=None', 'f16_O=False', 'is_training=True', 'log_level=WARNING']' returned non-zero exit status 1.

failed: tests.pytorch.attention.test_attention_with_cp

test_cp_with_fused_attention[False-None-False-False-False-p2p-thd-cp_1_0-bf16]::subprocess.CalledProcessError: Command '['python3', '-m', 'torch.distributed.launch', '--nproc-per-node=2', '/workspace/tests/pytorch/attention/run_attention_with_cp.py', 'dtype=bf16', 'model=cp_1_0', 'qkv_format=thd', 'kernel_backend=FusedAttention', 'cp_comm_type=p2p', 'fp8_bwd=False', 'fp8_dpa=False', 'fp8_mha=False', 'scaling_mode=None', 'f16_O=False', 'is_training=True', 'log_level=WARNING']' returned non-zero exit status 1.

failed: tests.pytorch.attention.test_attention_with_cp

test_cp_with_fused_attention[False-None-False-False-False-p2p-sbhd-cp_3_4-bf16]::subprocess.CalledProcessError: Command '['python3', '-m', 'torch.distributed.launch', '--nproc-per-node=2', '/workspace/tests/pytorch/attention/run_attention_with_cp.py', 'dtype=bf16', 'model=cp_3_4', 'qkv_format=sbhd', 'kernel_backend=FusedAttention', 'cp_comm_type=p2p', 'fp8_bwd=False', 'fp8_dpa=False', 'fp8_mha=False', 'scaling_mode=None', 'f16_O=False', 'is_training=True', 'log_level=WARNING']' returned non-zero exit status 1.

failed: tests.pytorch.attention.test_attention_with_cp

test_cp_with_fused_attention[False-None-False-False-False-p2p-sbhd-cp_3_2-bf16]::subprocess.CalledProcessError: Command '['python3', '-m', 'torch.distributed.launch', '--nproc-per-node=2', '/workspace/tests/pytorch/attention/run_attention_with_cp.py', 'dtype=bf16', 'model=cp_3_2', 'qkv_format=sbhd', 'kernel_backend=FusedAttention', 'cp_comm_type=p2p', 'fp8_bwd=False', 'fp8_dpa=False', 'fp8_mha=False', 'scaling_mode=None', 'f16_O=False', 'is_training=True', 'log_level=WARNING']' returned non-zero exit status 1.

failed: tests.pytorch.attention.test_attention_with_cp

test_cp_with_fused_attention[False-None-False-False-False-p2p-sbhd-cp_2_3-bf16]::subprocess.CalledProcessError: Command '['python3', '-m', 'torch.distributed.launch', '--nproc-per-node=2', '/workspace/tests/pytorch/attention/run_attention_with_cp.py', 'dtype=bf16', 'model=cp_2_3', 'qkv_format=sbhd', 'kernel_backend=FusedAttention', 'cp_comm_type=p2p', 'fp8_bwd=False', 'fp8_dpa=False', 'fp8_mha=False', 'scaling_mode=None', 'f16_O=False', 'is_training=True', 'log_level=WARNING']' returned non-zero exit status 1.

failed: tests.pytorch.attention.test_attention_with_cp

test_cp_with_fused_attention[False-None-False-False-False-p2p-sbhd-cp_2_2-bf16]::subprocess.CalledProcessError: Command '['python3', '-m', 'torch.distributed.launch', '--nproc-per-node=2', '/workspace/tests/pytorch/attention/run_attention_with_cp.py', 'dtype=bf16', 'model=cp_2_2', 'qkv_format=sbhd', 'kernel_backend=FusedAttention', 'cp_comm_type=p2p', 'fp8_bwd=False', 'fp8_dpa=False', 'fp8_mha=False', 'scaling_mode=None', 'f16_O=False', 'is_training=True', 'log_level=WARNING']' returned non-zero exit status 1.

failed: tests.pytorch.attention.test_attention_with_cp

test_cp_with_fused_attention[False-None-False-False-False-p2p-sbhd-cp_2_0-bf16]::subprocess.CalledProcessError: Command '['python3', '-m', 'torch.distributed.launch', '--nproc-per-node=2', '/workspace/tests/pytorch/attention/run_attention_with_cp.py', 'dtype=bf16', 'model=cp_2_0', 'qkv_format=sbhd', 'kernel_backend=FusedAttention', 'cp_comm_type=p2p', 'fp8_bwd=False', 'fp8_dpa=False', 'fp8_mha=False', 'scaling_mode=None', 'f16_O=False', 'is_training=True', 'log_level=WARNING']' returned non-zero exit status 1.

failed: tests.pytorch.attention.test_attention_with_cp

test_cp_with_fused_attention[False-None-False-False-False-p2p-sbhd-cp_1_4-bf16]::subprocess.CalledProcessError: Command '['python3', '-m', 'torch.distributed.launch', '--nproc-per-node=2', '/workspace/tests/pytorch/attention/run_attention_with_cp.py', 'dtype=bf16', 'model=cp_1_4', 'qkv_format=sbhd', 'kernel_backend=FusedAttention', 'cp_comm_type=p2p', 'fp8_bwd=False', 'fp8_dpa=False', 'fp8_mha=False', 'scaling_mode=None', 'f16_O=False', 'is_training=True', 'log_level=WARNING']' returned non-zero exit status 1.

failed: tests.pytorch.attention.test_attention_with_cp

test_cp_with_fused_attention[False-None-False-False-False-p2p-sbhd-cp_1_1-bf16]::subprocess.CalledProcessError: Command '['python3', '-m', 'torch.distributed.launch', '--nproc-per-node=2', '/workspace/tests/pytorch/attention/run_attention_with_cp.py', 'dtype=bf16', 'model=cp_1_1', 'qkv_format=sbhd', 'kernel_backend=FusedAttention', 'cp_comm_type=p2p', 'fp8_bwd=False', 'fp8_dpa=False', 'fp8_mha=False', 'scaling_mode=None', 'f16_O=False', 'is_training=True', 'log_level=WARNING']' returned non-zero exit status 1.

failed: tests.pytorch.attention.test_attention_with_cp

test_cp_with_fused_attention[False-None-False-False-False-p2p-sbhd-cp_1_0-bf16]::subprocess.CalledProcessError: Command '['python3', '-m', 'torch.distributed.launch', '--nproc-per-node=2', '/workspace/tests/pytorch/attention/run_attention_with_cp.py', 'dtype=bf16', 'model=cp_1_0', 'qkv_format=sbhd', 'kernel_backend=FusedAttention', 'cp_comm_type=p2p', 'fp8_bwd=False', 'fp8_dpa=False', 'fp8_mha=False', 'scaling_mode=None', 'f16_O=False', 'is_training=True', 'log_level=WARNING']' returned non-zero exit status 1.
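
To see the underlying error behind a `CalledProcessError` like the ones above, it can help to rerun a single case outside pytest and capture its stderr. The following is a minimal sketch, not the test suite's own code: the helper names `build_cp_command` and `run_case` are hypothetical, and the argument list is copied from the annotations rather than from the test source. It assumes a 2-GPU node with the same `/workspace` layout as the CI container.

```python
import subprocess

# Script path as it appears in the annotations; adjust for a local checkout.
SCRIPT = "/workspace/tests/pytorch/attention/run_attention_with_cp.py"


def build_cp_command(model, qkv_format, dtype="bf16", nproc=2, script=SCRIPT):
    """Rebuild the argv list shown in the failing annotations (hypothetical helper)."""
    return [
        "python3", "-m", "torch.distributed.launch", f"--nproc-per-node={nproc}",
        script,
        f"dtype={dtype}",
        f"model={model}",
        f"qkv_format={qkv_format}",
        "kernel_backend=FusedAttention",
        "cp_comm_type=p2p",
        "fp8_bwd=False",
        "fp8_dpa=False",
        "fp8_mha=False",
        "scaling_mode=None",
        "f16_O=False",
        "is_training=True",
        "log_level=WARNING",
    ]


def run_case(model, qkv_format):
    """Run one case and print the tail of stderr if it exits non-zero."""
    cmd = build_cp_command(model, qkv_format)
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        # The annotation only reports the exit status; the traceback is in stderr.
        print(proc.stderr[-4000:])
    return proc.returncode


if __name__ == "__main__":
    # Print the command only; call run_case("cp_1_1", "thd") on a GPU node to execute it.
    print(" ".join(build_cp_command("cp_1_1", "thd")))
```

Building the command separately from running it keeps the sketch usable on a machine without GPUs, where only the printed command line is of interest.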

CI Level 3 / mGPU Torch (mi35x)

Process completed with exit code 1.

CI Level 3 / mGPU Torch (mi35x)

PyTorch mGPU tests FAILED.

CI Level 3 / sGPU Tests (mi35x)

Canceling since a higher priority waiting request for PR Automatic CI-refs/pull/608/merge exists

CI Level 3 / sGPU Tests (mi35x)

The operation was canceled.

PR Automatic CI

Canceling since a higher priority waiting request for PR Automatic CI-refs/pull/608/merge exists

CI Level 3 / build / Build ROCm Docker image and TransformerEngine wheels

Node.js 20 actions are deprecated. The following actions are running on Node.js 20 and may not work as expected: actions/upload-artifact@v4. Actions will be forced to run with Node.js 24 by default starting June 16th, 2026. Node.js 20 will be removed from the runner on September 16th, 2026. Please check if updated versions of these actions are available that support Node.js 24. To opt into Node.js 24 now, set the FORCE_JAVASCRIPT_ACTIONS_TO_NODE24=true environment variable on the runner or in your workflow file. Once Node.js 24 becomes the default, you can temporarily opt out by setting ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION=true. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/

CI Level 3 / mGPU JAX (mi35x)

Node.js 20 actions are deprecated. The following actions are running on Node.js 20 and may not work as expected: actions/download-artifact@v4, actions/upload-artifact@v4. Actions will be forced to run with Node.js 24 by default starting June 16th, 2026. Node.js 20 will be removed from the runner on September 16th, 2026. Please check if updated versions of these actions are available that support Node.js 24. To opt into Node.js 24 now, set the FORCE_JAVASCRIPT_ACTIONS_TO_NODE24=true environment variable on the runner or in your workflow file. Once Node.js 24 becomes the default, you can temporarily opt out by setting ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION=true. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/

CI Level 3 / mGPU JAX (mi30x)

CI Level 3 / mGPU Torch (mi30x)

CI Level 3 / mGPU Torch (mi35x)

31 more failures omitted from annotations; see the job summary for the full list.

CI Level 3 / sGPU Tests (mi30x)

CI Level 3 / sGPU Tests (mi35x)

CI Level 3 / Select Docker Image

Using ci/ci_config.json from dev

Artifacts

Produced during runtime

Name	Status	Size	Digest
logs-mgpu-mi30x-jax	Expired	48.8 KB	`sha256:37d4cdc0e4ddc5079e385e2d54c99c0d6ccbe6535686a1b4c69c45daf92bf150`
logs-mgpu-mi30x-pytorch	Expired	113 KB	`sha256:12bed27efb59fedaa54b6cbf2a42e82b2c9a328cbcdc5cbfc43fda9911b7589e`
logs-mgpu-mi35x-jax	Expired	53 KB	`sha256:080befbd928585bdfdf04237b8153e42741c9cbfbae19150acca03d9cba5e8ea`
logs-mgpu-mi35x-pytorch	Expired	137 KB	`sha256:4ebb64d83ef85b57e50b649bef660af925032db4e3743116203dd8f727462f26`
logs-sgpu-mi30x	Expired	3.42 MB	`sha256:a0337a44069940d31dde88467c7683bc82bb2af6997394c1bfb9d5e5b4ef3037`
logs-sgpu-mi35x	Expired	2.71 MB	`sha256:974bb7e5ea0478f02e6fe25e2977aabe809d782cc17e4a9dde6c22228f65fd2b`
te-rocm-wheels	Expired	722 MB	`sha256:a4c4f3320051701647fb7b52f9072752d789dd6e0da4c9e09b4356edd09eaf87`
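
The artifacts listed here have expired, but for a fresh run a downloaded archive can be checked against its listed digest. This is a minimal sketch under the assumption that the digest is the SHA-256 of the downloaded archive file as a whole; the helper names `sha256_digest` and `matches_listed_digest` are hypothetical.

```python
import hashlib


def sha256_digest(path, chunk_size=1 << 20):
    """Return 'sha256:<hex>' for the file at path, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return "sha256:" + h.hexdigest()


def matches_listed_digest(path, listed):
    """Compare a file against a digest string copied from the artifacts table."""
    # Tolerate the surrounding backticks and whitespace used in the table.
    return sha256_digest(path) == listed.strip().strip("`")
```

Reading in chunks keeps memory use flat for large archives such as the 722 MB wheel bundle.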

Updated test logging and timeouts #697

rocm-ci-dispatch.yml