
Device agnostic for DCP #19


Open — wants to merge 457 commits into base: master

457 commits
ced5cf0
Revert "Cleanup old caffe2 scripts (#158475)"
pytorchmergebot Jul 17, 2025
a004424
[CI][TD] Enable TD on all test configs (#158163)
clee2000 Jul 17, 2025
af66240
[dynamo] Skip training flag check id already guarding on nn modules (…
anijain2305 Jul 17, 2025
41b2c4d
Reduce random reads for offset metadata when calling torch.load under…
mikaylagawarecki Jul 17, 2025
1b91954
Suppress volatile type error (#158435)
cyyever Jul 17, 2025
74f4cf4
Add missing <vector> in c10/util/WaitCounter.h (#158354)
pganssle-google Jul 17, 2025
0ecfb93
Avoid globally modifying torch.testing._internal.common_methods_invoc…
kundaMwiza Jul 17, 2025
2df2e3b
[ROCm][CI] Last known good HIP patch (#158596)
jeffdaily Jul 17, 2025
f63988a
[BE]Clean up old APIs in AOTI c shim (#158400)
yiming0416 Jul 17, 2025
b0e325c
[Dynamo][Better Engineering] Add type coverage to decorators (#158509)
Lucaskabela Jul 17, 2025
33c9b41
[CI][MPS] Enable test_indexing on MPS (#158582)
malfet Jul 17, 2025
7b72e5b
Fix Pandas version mismatch upon reinstalling numpy (#158584)
exclamaforte Jul 18, 2025
6673ac7
Fix test linalg for MKL upgrading (#158312)
CaoE Jul 18, 2025
ef38edb
Add stride check for attn_mask on non-cpu device (#158424)
CaoE Jul 18, 2025
ddbecdf
[DTensor] Document redistribute_costs (#158495)
wconstab Jul 17, 2025
6fd6fc4
[B200] Fix flex-attention heuristic for `test_tma_with_customer_kerne…
eqy Jul 18, 2025
583138d
[Dynamo][Better Engineering] Add typing for comptime, cache, and conv…
Lucaskabela Jul 18, 2025
ce45543
Shunt fx_interpreter graphmodule print on error into tlparse (#158469)
wconstab Jul 17, 2025
89d842f
Make torch.distributed.breakpoint() set a long timeout (#158481)
wconstab Jul 17, 2025
86dbc0e
[NativeRT] Remove makeProxyExecutor from ModelRunner interface (#158587)
SherlockNoMad Jul 18, 2025
1e86fa2
Add stack trace to Inductor IR nodes if `inductor.config.trace.prove…
yushangdi Jul 18, 2025
d8b0843
[DTensor] Fix default_strategy and rename for clarity (#158490)
wconstab Jul 17, 2025
9a7c2f1
Revert "Add torch compile force disable caches alias (#158072)"
pytorchmergebot Jul 18, 2025
9308261
[ROCm][CI] update fbgemm_gpu hash used by inductor tests (#158602)
jeffdaily Jul 18, 2025
eb73650
[BE] Make PyObjectSlot use a global PyInterpreter and remove (#158427)
PaliC Jul 18, 2025
a00cd8c
Add a way to disable compile for debugging flex-attention (#158534)
drisspg Jul 18, 2025
fda3f3b
[while_loop] fix constant tensor used as carried inputs (#158381)
ydwu4 Jul 15, 2025
a3396a9
[hop] set capture_scalar_outputs=True by default for compiled hops (#…
ydwu4 Jul 17, 2025
be896d6
Revert "Forward-fix unused variables warning/error (#158549)"
pytorchmergebot Jul 18, 2025
32aade9
Revert "Support DeepSeek-style blockwise scaling scaled-mm for fp8 on…
pytorchmergebot Jul 18, 2025
ead80f3
Fix s390x CI: ensure that all python dependencies are installed when …
AlekseiNikiforovIBM Jul 18, 2025
7b05bdd
[DTensor] fix copy_ strategy (#158538)
wconstab Jul 17, 2025
27af877
[ATen][CUDA][SDPA] Flash Attention: Refactor sm version checks (#158558)
Aidyn-A Jul 18, 2025
a4ec381
[build] pin `setuptools>=77` to enable PEP 639 (#158104)
XuehaiPan Jul 18, 2025
0eae6b6
Unify torch.tensor and torch.ops.aten.scalar_tensor behavior (#158537)
bobrenjc93 Jul 17, 2025
e882c76
Add STD_TORCH_CHECK to headeronly (#158377)
janeyx99 Jul 18, 2025
036eb1f
[precompile] Filter out ID_MATCH family of guards with caching_precom…
zhxchen17 Jul 18, 2025
193b29e
[BE][EZ] Minor doc fixes (#158574)
ZainRizvi Jul 18, 2025
35df895
[AOTI] package loader normalize path separator (#158630)
xuhancn Jul 18, 2025
50f33a6
Revert "[DTensor] fix copy_ strategy (#158538)"
pytorchmergebot Jul 18, 2025
bf4aa78
Revert "[DTensor] Fix default_strategy and rename for clarity (#158490)"
pytorchmergebot Jul 18, 2025
acffd1a
[iter] Update some of the tests to not call pickle (#156369)
guilhermeleobas Jul 18, 2025
6f73e06
[iter] exhaust `ListIterator` when `unpack_var_sequence` is called (#…
guilhermeleobas Jul 18, 2025
8c3f849
[aot] fix greater_than_max build fail on Windows. (#158479)
xuhancn Jul 18, 2025
725cdb2
Name threads in caffe2/torch/distributed/checkpoint AsyncCheckpointEx…
dsesh Jul 18, 2025
86675af
Revert "[ROCm][CI] update fbgemm_gpu hash used by inductor tests (#15…
pytorchmergebot Jul 18, 2025
b4358c5
[inductor] Explicitly link c10 in inductor. (#158622)
yuchengliu1 Jul 18, 2025
6e07d6a
[Dynamo][Better Engineering] Add typing support for _dynamo/repro and…
Lucaskabela Jul 18, 2025
656885b
[Dynamo][Better Engineering] Type devices, resume_execution and testi…
Lucaskabela Jul 18, 2025
b87e50d
[BE][testing] Fix internal test failures in test/dynamo/test_unspec (…
masnesral Jul 16, 2025
79e49ef
Pull latest Sphinx theme (#158595)
svekars Jul 18, 2025
75e2628
Add lower bounds for fsspec and networkx dependencies (#158565)
dsashidh Jul 18, 2025
1b5fdb2
[BE] Add pre-push hook for lintrunner to the PyTorch repo (#158389)
ZainRizvi Jul 18, 2025
04ac258
[BE][testing] Fix test_cudacodecache.py (#158259)
masnesral Jul 14, 2025
599f94e
[AOTI] add Windows file ext to package loader. (#158578)
xuhancn Jul 18, 2025
ec0b538
[inductor] Make times and repeat parameters command line args (#158590)
A-Kokolis Jul 18, 2025
8b2a650
pt2_remote_cache: Log sample for failures, and log the explicit reaso…
c00w Jul 18, 2025
1ab1ab3
Use linux.12xlarge.memory to build for H100/sm_90 (#158598)
huydhn Jul 18, 2025
e3351b3
Revert "[DCP][HF] [ez]Change where sharded tensors are saved (#158069)"
pytorchmergebot Jul 18, 2025
3bb729d
Revert "Fix test consolidate hf safetensors (#157386)"
pytorchmergebot Jul 18, 2025
89850bb
[Dynamo] Use proper sources for constructing dataclass defaults (#157…
mlazos Jul 18, 2025
07c4c2a
[dynamo][be] hide warnings without invalidating warnings cache (#158520)
xmfan Jul 17, 2025
bc7b1f5
[AOTI] Use libstdc++ only for fbcode cpu case (#158659)
hl475 Jul 18, 2025
be483a5
setup pinned commit for vllm in pytorch ci (#158591)
yangw-dev Jul 18, 2025
f76f4ab
Track monitor (#156907)
yangw-dev Jul 18, 2025
a835dbc
[c10d][ez] Fix error message to reflect the correct API name (#158668)
fduwjj Jul 18, 2025
60b9b06
[caffe2] Fix Missing override in get_buffer of NCCLSymmetricMemory (…
wenxin0319 Jul 18, 2025
15ef4f2
Fused RMSNorm implementation (#153666)
AaronWang04 Jul 18, 2025
36bddcd
[DTensor] Fix default_strategy and rename for clarity (#158490)
wconstab Jul 18, 2025
a3aacd6
[DTensor] fix copy_ strategy (#158538)
wconstab Jul 18, 2025
d42c409
[AOTI] windows package load dev (#158671)
xuhancn Jul 19, 2025
5b40f65
Revert "Add warning about removed sm50 and sm60 arches (#158301)"
pytorchmergebot Jul 19, 2025
c2c8884
Revert "[Easy] Show some clear error when torch.ops.load_library fail…
pytorchmergebot Jul 19, 2025
2c16eb9
[dynamo] Support more basic output types for `nonstrict_trace` (#157969)
StrongerXi Jul 18, 2025
2955aca
Clean up some unused build env variables (#158599)
huydhn Jul 18, 2025
a741094
Build domain libraries on the build job (#158600)
huydhn Jul 18, 2025
90b082e
enable_caching_generated_triton_templates=True by default (#158592)
laithsakka Jul 17, 2025
ab55742
[cca] [c10d] Refactor CUDAEventCache into separate files (#158616)
d4l3k Jul 19, 2025
64dabb2
only fail regressions>10% on pr_time benchmarks (#158577)
laithsakka Jul 18, 2025
fac0be7
[async-TP] Turn asserts back into silent skips (#158572)
lw Jul 18, 2025
5cde344
Fix `MakeTensor::computeStorageSize()` (#158690)
malfet Jul 18, 2025
22d8222
GenAI Layer Benchmark (#158536)
BoyuanFeng Jul 19, 2025
a9f8402
[CI] Fixes CI for CUDA Version > 12.9 (#157385)
AaronWang04 Jul 19, 2025
f735941
[BE] document Adadelta and Adagrad APIs properly (#158483)
janeyx99 Jul 18, 2025
7cc5d03
Document the rest of the specific optimizer module APIs (#158669)
janeyx99 Jul 18, 2025
7cc1a95
[AOTI] fix extract file failed on Windows. (#158702)
xuhancn Jul 19, 2025
d36afac
Build domain libraries for all workflows with TorchBench config (#158…
huydhn Jul 19, 2025
a1cfe7f
[nativert] benchmark util (#158678)
dolpm Jul 20, 2025
b64f338
[DLPack] add NumPy exchange tests. (#150216)
ysiraichi Jul 19, 2025
1d526fe
Fix DLPack stream logic. (#150217)
ysiraichi Jul 19, 2025
a10f157
[DLPack] Add support for missing keyword-arguments. (#150218)
ysiraichi Jul 19, 2025
b4abf41
Raise `BufferError` for DLPack buffer-related errors. (#150691)
ysiraichi Jul 19, 2025
4869f71
don't set CUDA_MODULE_LOADING (#158712)
ngimel Jul 20, 2025
badf002
[Reland] Add warning about removed sm50 and sm60 arches (#158700)
atalman Jul 20, 2025
5e149a6
Add deprecation warning (#158203)
tugsbayasgalan Jul 20, 2025
2e03879
[inductor][templates] Finalize all registered hooks (#157270)
kundaMwiza Jul 20, 2025
4b02bd7
DCP safetensors test fix (#158685)
ankitageorge Jul 20, 2025
2cdafab
[BE] Raise ValueError from `torch.cat` meta func (#158249)
malfet Jul 20, 2025
ff0da08
[AOTI] normalize path and process model files. (#158705)
xuhancn Jul 21, 2025
5e12328
Revert "[build] pin `setuptools>=77` to enable PEP 639 (#158104)"
pytorchmergebot Jul 21, 2025
70b4a88
[SymmMem] Add NVSHMEM barrier_all, my_pe, n_pes support into Triton …
codingwithsurya Jul 20, 2025
1c6328a
[EZ][BE] Fix compilation warning in Pooling.metal (#158729)
malfet Jul 21, 2025
a527e81
[CI] update flake8 and mypy lint dependencies (#158720)
XuehaiPan Jul 21, 2025
bbc32d6
[SymmMem] Add NVSHMEM sync_all support into Triton (#158512)
codingwithsurya Jul 20, 2025
1eb6b20
[Inductor] Set the default value of min_chunk_size to 512 (#150762)
jiayisunx Jul 21, 2025
979fae7
Rename modules in AOTAutograd (#158449)
ezyang Jul 21, 2025
d5a29fc
De-abstract premature generalization with InductorWrapper (#158528)
ezyang Jul 21, 2025
8e57cdb
Still run TritonBundler with BundledAOTAutogradCache, save autotune r…
jamesjwu Jul 18, 2025
393377d
Revert "[CI] update flake8 and mypy lint dependencies (#158720)"
pytorchmergebot Jul 21, 2025
f168cf4
[BE] Always use python 3.9 for pre-push hook's lintrunner (#158693)
ZainRizvi Jul 21, 2025
9894d43
[AOTI] explicit aoti wrapper functions for Windows. (#158713)
xuhancn Jul 21, 2025
cbe1cb7
[CMake] Move xpu flag to xpu.cmake (#158542)
guangyey Jul 21, 2025
35f1b4a
Revert "Fused RMSNorm implementation (#153666)"
pytorchmergebot Jul 21, 2025
7205458
[Easy] Show some clear error when torch.ops.load_library fails. (#157…
FFFrog Jul 21, 2025
a78fb63
[build] pin `setuptools>=77` to enable PEP 639 (#158104)
XuehaiPan Jul 21, 2025
637e754
[BE] always use `uv pip` if possible in `pip_init.py` for `lintrunner…
XuehaiPan Jul 15, 2025
9285b82
[BE][testing] fix test_cat_max_autotune_triton (#158589)
masnesral Jul 18, 2025
393fecb
[Optimus][Unit test] clean up the unit test (#158696)
mengluy0125 Jul 21, 2025
8ed5e18
[AOTI] Convert C-struct zip handling to RAII container (#158687)
benjaminglass1 Jul 21, 2025
72db0a9
Revert "[DTensor] Assert DTensorSpec has valid placements (#158133)"
pytorchmergebot Jul 21, 2025
662dd7d
[cutlass backend] cache maybe_append_choices (#156781)
henrylhtsang Jul 21, 2025
ad2dec1
[SymmMem] Add NVSHMEM alltoall support into Triton (#158513)
codingwithsurya Jul 20, 2025
22920c9
Grab bag of (mostly) typing improvements (#158075)
benjaminglass1 Jul 21, 2025
25fbf09
Use more fine-grained locks in sym mem kernels (#158523)
ngimel Jul 21, 2025
ea5b06e
[Dynamo][BetterEngineering] Type side_effects.py (#158605)
Lucaskabela Jul 21, 2025
b66f429
Fix `torch.randint`, `torch.mul` param missing description (#158731)
zeshengzong Jul 21, 2025
851e953
ci: Only run lint jobs on relevant files (#158773)
seemethere Jul 21, 2025
b1a0c34
[pt2 event logging] add configurable prefix (#157678)
coconutruben Jul 21, 2025
bc379ae
Revert "Still run TritonBundler with BundledAOTAutogradCache, save au…
pytorchmergebot Jul 21, 2025
6b0526a
ban fusion of large amount of reads (#158667)
xuanzhang816 Jul 21, 2025
5e17932
[DCP] Add support for ShardedTensor to PgTransport (#158573)
H-Huang Jul 21, 2025
9e0473b
removed zero dim cpu logic from fake_tensor.py (#147501)
zero000064 Jul 21, 2025
a991e28
[AOTI] Add more default options to compile_standalone (#158560)
desertfire Jul 21, 2025
c774180
Bump requests from 2.32.2 to 2.32.4 in /tools/build/bazel (#158006)
dependabot[bot] Jul 21, 2025
216ba6e
Fix `MaskedTensor` to device ignored mask (#151205)
zeshengzong Jul 21, 2025
0e46f54
[ROCm][CI] update HIP patch for 6.4.1 (#158651)
jeffdaily Jul 21, 2025
9498d95
[Dynamo][BetterEngineering] Type trace_rules.py (#158679)
Lucaskabela Jul 21, 2025
97d7dc1
Revert "[AOTI] Convert C-struct zip handling to RAII container (#1586…
pytorchmergebot Jul 21, 2025
e8af168
Revert "[AOTI] normalize path and process model files. (#158705)"
pytorchmergebot Jul 21, 2025
5a56e6a
Revert "[AOTI] fix extract file failed on Windows. (#158702)"
pytorchmergebot Jul 21, 2025
734826d
Revert "[AOTI] windows package load dev (#158671)"
pytorchmergebot Jul 21, 2025
dd0adc9
[SymmMem] Add NVSHMEM broadcast support into Triton (#158514)
codingwithsurya Jul 20, 2025
4366610
[c10d] block_current_stream: correctness fixes (#158757)
d4l3k Jul 21, 2025
cab2833
Setup TorchBench in Docker (#158613)
huydhn Jul 21, 2025
b3c868d
[vllm]Add vllm.txt for pinned commit (#158754)
yangw-dev Jul 21, 2025
feaa02f
Revert "[build] pin `setuptools>=77` to enable PEP 639 (#158104)"
pytorchmergebot Jul 21, 2025
f09a484
Remove is_arvr_mode() from xnnpack.buck.bzl (#158682)
kambati-meta Jul 21, 2025
2bb6843
Fix the typos in the right nav by pulling the latest theme (#158746)
svekars Jul 21, 2025
1227ed6
[dynamic shapes] fix _maybe_evaluate_static axioms bug (#158672)
pianpwk Jul 21, 2025
15a50dc
Revert "[BE] Make PyObjectSlot use a global PyInterpreter and remove …
pytorchmergebot Jul 21, 2025
99cc363
Revert "[BE] Modify PyObjectSlot the assume only a single interpreter…
pytorchmergebot Jul 21, 2025
920f26c
Revert "[BE] Remove __reduce_deploy__ (#158291)"
pytorchmergebot Jul 21, 2025
4c18e85
Revert "[BE] Remove torch deploy | remove torch deploy specific files…
pytorchmergebot Jul 21, 2025
ee5a434
Revert "[BE] remove torch deploy - conditionals (#158288)"
pytorchmergebot Jul 21, 2025
d293022
[cutass backend] memorize parts of cache key to reduce general overhe…
henrylhtsang Jul 15, 2025
67be2f2
[CI][lintrunner] Only run on non deleted changed files (#158794)
clee2000 Jul 21, 2025
187c2de
Fix clamp(min/max) strategy (#158619)
zpcore Jul 21, 2025
08540b1
Use cuda error code instead of error text in get_cuda_error_help (#15…
Raymo111 Jul 21, 2025
2c37acf
[AOTI][CPU] Consider bias=None case for fbgemm_linear_fp16_weight (#1…
hl475 Jul 21, 2025
9281625
Revert "Setup TorchBench in Docker (#158613)"
pytorchmergebot Jul 22, 2025
350d6af
[AOTI] add windows support for get_cpp_compile_command (#158732)
xuhancn Jul 22, 2025
6341311
Revert "Add unified memory APIs for torch.accelerator (#152932)"
pytorchmergebot Jul 22, 2025
95b6584
Revert "Add DeviceAllocator as the base device allocator (#138222)"
pytorchmergebot Jul 22, 2025
abe0c95
[BE] Fix extra-semi warnings (#158730)
malfet Jul 22, 2025
1a6b21c
[AOTI] fix load_pt2 split wrong model name on Windows (#158711)
xuhancn Jul 22, 2025
eac777c
[Inductor] Expose decomposeK knobs as envvars (#158745)
PaulZhang12 Jul 21, 2025
aee8a2e
Remove duplicated installation for python dependencies. (#158339)
FFFrog Jul 16, 2025
3639d29
Fix warnings of unused-variable (#158627)
cyyever Jul 22, 2025
a155f74
[benchmark] allow default mode for compile (#158792)
BoyuanFeng Jul 22, 2025
21c97bd
[reland] Transfer "stack_trace" in post_grad passes (#158752)
yushangdi Jul 22, 2025
d984143
[ci][cutlass backend] Add ci for cutlass backend tests (#156626)
henrylhtsang Jul 21, 2025
3a67bf9
[PGNCCLx] Bring split and merge for PGNCCLx (#158790)
fduwjj Jul 22, 2025
392fa75
Change from import trace to import config (#158796)
yushangdi Jul 22, 2025
91b69de
[ROCm][CI] update fbgemm_gpu hash used by inductor tests (#158602)
naromero77amd Jul 22, 2025
0142d5f
Revert "Remove is_arvr_mode() from xnnpack.buck.bzl (#158682)"
pytorchmergebot Jul 22, 2025
9b4d938
[dynamo][fsdp] Consistent behavior of int attributes (#157262)
anijain2305 Jul 22, 2025
8e99714
[EZ][BE][MPS] Remove unused `ndArrayFromTensor` (#158823)
malfet Jul 22, 2025
1b772de
Still run TritonBundler with BundledAOTAutogradCache, save autotune r…
jamesjwu Jul 21, 2025
371ffaf
[bucketing] Support case of several pgs in graph (#158632)
IvanKobzarev Jul 22, 2025
d0c00d9
[MPS] Do not crash if tensor dim > INT_MAX (#158824)
malfet Jul 22, 2025
9a28e23
Revert "removed zero dim cpu logic from fake_tensor.py (#147501)"
pytorchmergebot Jul 22, 2025
4060f30
[AOTI] Convert C-struct zip handling to RAII container (#158687)
benjaminglass1 Jul 21, 2025
7d6f340
Revert "[AOTI] Add more default options to compile_standalone (#158560)"
pytorchmergebot Jul 22, 2025
0971637
Fix torch.tensor warning in ONNX symbolic_opset10 export (#158835)
novikov-alexander Jul 22, 2025
52c2940
[hop] allow non fake inputs when check input alias and mutation (#158…
ydwu4 Jul 21, 2025
2a249f1
We do support 3.14
albanD Jul 22, 2025
7d2ceaf
[dynamo] skip tracing functions registered in sys.monitoring (#158171)
williamwen42 Jul 21, 2025
55ff4f8
[FP8][CUTLASS] xFail `honor_sm_carveout` on `sm100` (#152378)
eqy Jul 22, 2025
56df025
Add caching for `_rename_without_collisions` (#158594)
hsjts0u Jul 22, 2025
832ab99
Use init_device_mesh API for select tests where possible (#158675)
Electron4444 Jul 22, 2025
659bfbf
Revert "We do support 3.14" (#158856)
ZainRizvi Jul 22, 2025
c917c63
[ROCm][tunableop] UT tolerance increase for matmul_small_brute_force_…
naromero77amd Jul 22, 2025
7677919
[ONNX] Set default opset to 20 (#158802)
justinchuby Jul 22, 2025
37ded2a
Using torch.accelerator in comm_mode_features_example.py and visualiz…
githubsgi Jul 22, 2025
e175380
Making input dynamically adjust. (#157324)
githubsgi Jul 22, 2025
6499420
[DeviceMesh] Make the repr shorter when debug ENV not set (#158822)
fduwjj Jul 22, 2025
823e223
[ROCm] logsumexp on ROCm needs scaling back to natural base. (#156903)
xinyazhang Jul 22, 2025
ddd74d1
More fixes to `MakeTensor::computeStorageSize()` (#158813)
malfet Jul 22, 2025
e44e05f
[dynamo] Move skipIf decorator to class level in test_fx_graph_runnab…
skarjala Jul 10, 2025
fd47401
[doc] Updates to distributed.md for XCCL backend (#155834)
pkourdis Jul 22, 2025
a626dc8
[AOTI] windows package load dev (#158671)
xuhancn Jul 22, 2025
04a3935
Fused RMSNorm implementation (#153666)
AaronWang04 Jul 22, 2025
fc5a404
[gtest][listing] fixing caffe2:verify_api_visibility - main (#158229)
yahayaohinoyi Jul 22, 2025
badfebf
Revert "[Inductor] Expose decomposeK knobs as envvars (#158745)"
pytorchmergebot Jul 22, 2025
6100ed4
[ROCm] Improve Type Safety of C10_WARP_SIZE (#158271)
xinyazhang Jul 22, 2025
cab96b5
[tests] Reduce sizes of unnecessarily large tensors to reduce OOM fla…
benjaminglass1 Jul 22, 2025
d3f9107
Remove top limit for cpython version and fix lint appropriately. (#15…
albanD Jul 22, 2025
3703dab
[ROCm] delete un-needed workaround for tensor.item() (#158486)
naromero77amd Jul 23, 2025
39b54b7
[export] runtime asserts for while HOP subgraphs (#158467)
pianpwk Jul 23, 2025
56d07d0
Add merge_rules category for Dynamo; add guilhermeleobas (#158620)
zou3519 Jul 18, 2025
096dc35
[aoti][mps] Fix update constants buffer (#158349)
angelayi Jul 22, 2025
84058d1
[aoti][mps] Fix cpu kernel generation (#158350)
angelayi Jul 22, 2025
cc372ad
[aoti][mps] Improve tabbing in cpp generation (#158351)
angelayi Jul 22, 2025
91602a9
Cleanup old caffe2 scripts (#158475)
albanD Jul 23, 2025
9df0f56
Fix Triton GEMM templates with k=1 (#158650)
PaulZhang12 Jul 21, 2025
dec0d31
[export] fix unbacked range deserialization (#158681)
pianpwk Jul 23, 2025
2dccff7
[inductor] pass_fds not supported on Windows, skip them on Windows. (…
xuhancn Jul 23, 2025
f10e443
[AOTI] normalize path and process model files. (#158705)
xuhancn Jul 23, 2025
b87471e
[MTIA Aten Backend] Migrate addcdiv.out / addcmul.out / eq.Tensor_out…
andyanwang Jul 22, 2025
42a69f7
[MTIA Aten Backend] Migrate addmm.out / baddbmm.out / bmm.out (#158749)
andyanwang Jul 22, 2025
f80f97d
[audio hash update] update the pinned audio hash (#158807)
pytorchupdatebot Jul 23, 2025
be72bcf
[vllm hash update] update the pinned vllm hash (#158806)
pytorchupdatebot Jul 23, 2025
a6b7bea
[inductor] support linear & layer_norm unbacked (#155267)
ColinPeppler Jul 22, 2025
1d302ea
[vllm] add vllm test base docker image (#158755)
yangw-dev Jul 23, 2025
255a04b
[pt2 event logging] send autotuning data for strides and hinted shape…
coconutruben Jul 23, 2025
c665594
[AOTI] fix extract file failed on Windows. (#158702)
xuhancn Jul 23, 2025
ee72338
[Inductor] MSVC use pointer when generating temporary array pointer (…
yuchengliu1 Jul 23, 2025
5702491
Fix decorators skipping NCCL tests (#158846)
Flamefire Jul 23, 2025
5998cd4
[MPS] Speedup torch.full for 1-byte types (#158874)
malfet Jul 23, 2025
d898d0d
[Precompile] Various small bugfixes, add CachingPrecompile to torchbe…
jamesjwu Jul 22, 2025
2a60b8f
[export][ez] Fix packaging (#158855)
zhxchen17 Jul 23, 2025
7d296d5
[aoti][mps] Enable more tests (#158703)
angelayi Jul 22, 2025
d3d9bc1
[inductor] Allow backends to register their own custom config object …
kundaMwiza Jul 23, 2025
671e22a
[math] Raise exception in Dynamo if constant fold call fail (#156975)
guilhermeleobas Jul 22, 2025
f5314f8
[struct] Add `struct.pack` and `struct.unpack` polyfills (#156977)
guilhermeleobas Jul 22, 2025
576253c
[math] Trace `float.fromhex` (#156976)
guilhermeleobas Jul 22, 2025
00da8e6
CI for Windows Arm64 (#148753)
iremyux Jul 23, 2025
5e386ee
[AOTI] enable aot inductor on Windows (#158915)
xuhancn Jul 23, 2025
1b456c5
[dynamo][guards] Add type info of the guarded value in guard managers…
anijain2305 Jul 23, 2025
41b6cda
Revert "Fix Triton GEMM templates with k=1 (#158650)"
pytorchmergebot Jul 23, 2025
30b0ad5
Revert "Fix decorators skipping NCCL tests (#158846)"
pytorchmergebot Jul 23, 2025
9905ed6
[Inductor] Expose decomposeK knobs as envvars (#158745)
PaulZhang12 Jul 23, 2025
76be282
Revert "[Precompile] Various small bugfixes, add CachingPrecompile to…
pytorchmergebot Jul 23, 2025
fef236d
Add zero_() and empty_like(t) to torch/csrc/stable/ops.h (#158866)
mikaylagawarecki Jul 22, 2025
5fe1f5f
Device agnostic for DCP
Chao1Han Jul 14, 2025
cecca5e
Commit suggestion
Chao1Han Jul 15, 2025
b804495
Update test/distributed/checkpoint/_experimental/test_staging.py
Chao1Han Jul 23, 2025
12e06c2
Update test/distributed/checkpoint/_experimental/test_staging.py
Chao1Han Jul 23, 2025
adb5261
Update test/distributed/checkpoint/_experimental/test_builder.py
Chao1Han Jul 23, 2025
615fb77
acc comment
Chao1Han Jul 24, 2025
2 changes: 1 addition & 1 deletion .bazelrc
@@ -2,7 +2,7 @@ build --cxxopt=--std=c++17
build --copt=-I.
# Bazel does not support including its cc_library targets as system
# headers. We work around this for generated code
# (e.g. c10/macros/cmake_macros.h) by making the generated directory a
# (e.g. torch/headeronly/macros/cmake_macros.h) by making the generated directory a
# system include path.
build --copt=-isystem --copt bazel-out/k8-fastbuild/bin
build --copt=-isystem --copt bazel-out/darwin-fastbuild/bin
6 changes: 3 additions & 3 deletions .ci/caffe2/test.sh
@@ -5,7 +5,7 @@ source "$(dirname "${BASH_SOURCE[0]}")/common.sh"

if [[ ${BUILD_ENVIRONMENT} == *onnx* ]]; then
pip install click mock tabulate networkx==2.0
pip -q install --user "file:///var/lib/jenkins/workspace/third_party/onnx#egg=onnx"
pip -q install "file:///var/lib/jenkins/workspace/third_party/onnx#egg=onnx"
fi

# Skip tests in environments where they are not built/applicable
@@ -147,8 +147,8 @@ export DNNL_MAX_CPU_ISA=AVX2
if [[ "${SHARD_NUMBER:-1}" == "1" ]]; then
# TODO([email protected]) remove this when the linked issue resolved.
# py is temporary until https://github.com/Teemu/pytest-sugar/issues/241 is fixed
pip install --user py==1.11.0
pip install --user pytest-sugar
pip install py==1.11.0
pip install pytest-sugar
# NB: Warnings are disabled because they make it harder to see what
# the actual erroring test is
"$PYTHON" \
102 changes: 102 additions & 0 deletions .ci/docker/README.md
@@ -36,3 +36,105 @@ See `build.sh` for valid build environments (it's the giant switch).
# Set flags (see build.sh) and build image
sudo bash -c 'TRITON=1 ./build.sh pytorch-linux-bionic-py3.8-gcc9 -t myimage:latest'
```

## [Guidance] Adding a New Base Docker Image

### Background

The base Docker images in directory `.ci/docker/` are built by the `docker-builds.yml` workflow. Those images are used throughout the PyTorch CI/CD pipeline. You should only create or modify a base Docker image if you need specific environment changes or dependencies before building PyTorch on CI.

1. **Automatic Rebuilding**:
- The Docker image building process is triggered automatically when changes are made to files in the `.ci/docker/*` directory
- This ensures all images stay up-to-date with the latest dependencies and configurations

2. **Image Reuse in PyTorch Build Workflows** (example: linux-build):
- The images generated by `docker-builds.yml` are reused in `_linux-build.yml` through the `calculate-docker-image` step
- The `_linux-build.yml` workflow:
- Pulls the Docker image determined by the `calculate-docker-image` step
- Runs a Docker container with that image
- Executes `.ci/pytorch/build.sh` inside the container to build PyTorch

3. **Usage in Test Workflows** (example: linux-test):
- The same Docker images are also used in `_linux-test.yml` for running tests
- The `_linux-test.yml` workflow follows a similar pattern:
- It uses the `calculate-docker-image` step to determine which Docker image to use
- It pulls the Docker image and runs a container with that image
- It installs the wheels from the artifacts generated by PyTorch build jobs
- It executes test scripts (like `.ci/pytorch/test.sh` or `.ci/pytorch/multigpu-test.sh`) inside the container

### Understanding File Purposes

#### `.ci/docker/build.sh` vs `.ci/pytorch/build.sh`
- **`.ci/docker/build.sh`**:
- Used for building base Docker images
- Executed by the `docker-builds.yml` workflow to pre-build Docker images for CI
- Contains configurations for different Docker build environments

- **`.ci/pytorch/build.sh`**:
- Used for building PyTorch inside a Docker container
- Called by workflows like `_linux-build.yml` after the Docker container is started
- Builds PyTorch wheels and other artifacts

#### `.ci/docker/ci_commit_pins/` vs `.github/ci_commit_pins`
- **`.ci/docker/ci_commit_pins/`**:
- Used for pinning dependency versions during base Docker image building
- Ensures consistent environments for building PyTorch
- Changes here trigger base Docker image rebuilds

- **`.github/ci_commit_pins`**:
- Used for pinning dependency versions during PyTorch building and tests
- Ensures consistent dependencies for PyTorch across different builds
- Used by build scripts running inside Docker containers

### Step-by-Step Guide for Adding a New Base Docker Image

#### 1. Add Pinned Commits (If Applicable)

We use pinned commits for build stability. The `nightly.yml` workflow checks and updates pinned commits for certain repository dependencies daily.

If your new Docker image needs a library installed from a specific pinned commit or built from source:

1. Add the repository you want to track in `nightly.yml` and `merge-rules.yml`
2. Add the initial pinned commit in `.ci/docker/ci_commit_pins/`. The text file's name should match the repository name defined in step 1.
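The second step above can be sketched as a one-liner (the dependency name `mylib` and the commit hash are hypothetical placeholders, not a real pin):

```shell
# Hypothetical sketch: create the initial pin file for an assumed dependency
# named "mylib". The filename must match the name used in nightly.yml.
mkdir -p .ci/docker/ci_commit_pins
echo "0123456789abcdef0123456789abcdef01234567" > .ci/docker/ci_commit_pins/mylib.txt
```

The nightly workflow then advances the hash in this file automatically; CI consumes whatever commit is pinned here when rebuilding the base image.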

#### 2. Configure the Base Docker Image
1. **Add new Base Docker image configuration** (if applicable):

Add the configuration in `.ci/docker/build.sh`. For example:
```bash
pytorch-linux-jammy-cuda12.8-cudnn9-py3.12-gcc11-new1)
CUDA_VERSION=12.8.1
CUDNN_VERSION=9
ANACONDA_PYTHON_VERSION=3.12
GCC_VERSION=11
VISION=yes
KATEX=yes
UCX_COMMIT=${_UCX_COMMIT}
UCC_COMMIT=${_UCC_COMMIT}
TRITON=yes
NEW_ARG_1=yes
;;
```

2. **Add build arguments to Docker build command**:

If you're introducing a new argument to the Docker build, make sure to add it in the Docker build step in `.ci/docker/build.sh`:
```bash
docker build \
....
--build-arg "NEW_ARG_1=${NEW_ARG_1}"
```

3. **Update Dockerfile logic**:

Update the Dockerfile to use the new argument. For example, in `ubuntu/Dockerfile`:
```dockerfile
ARG NEW_ARG_1
# Set up environment for NEW_ARG_1
RUN if [ -n "${NEW_ARG_1}" ]; then bash ./do_something.sh; fi
```
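The `RUN` instruction above executes plain shell, so the gating pattern can be tried outside Docker (a minimal sketch; `NEW_ARG_1` is the hypothetical build arg from the earlier steps):

```shell
# Run the setup step only when the build arg is non-empty -- the same
# `[ -n ... ]` test used in the Dockerfile's RUN line above.
NEW_ARG_1=yes
if [ -n "${NEW_ARG_1}" ]; then
  echo "NEW_ARG_1 is set: running setup"
fi
```

If the image is built without `--build-arg NEW_ARG_1=...`, the variable is empty and the setup step is skipped, so the same Dockerfile serves both configurations.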

4. **Add the Docker configuration** in `.github/workflows/docker-builds.yml`:

The `docker-builds.yml` workflow pre-builds the Docker images whenever changes occur in the `.ci/docker/` directory. This includes the pinned commit updates.
45 changes: 34 additions & 11 deletions .ci/docker/build.sh
@@ -91,6 +91,17 @@ tag=$(echo $image | awk -F':' '{print $2}')
# configuration, so we hardcode everything here rather than do it
# from scratch
case "$tag" in
pytorch-linux-jammy-cuda12.4-cudnn9-py3-gcc11)
CUDA_VERSION=12.4
CUDNN_VERSION=9
ANACONDA_PYTHON_VERSION=3.10
GCC_VERSION=11
VISION=yes
KATEX=yes
UCX_COMMIT=${_UCX_COMMIT}
UCC_COMMIT=${_UCC_COMMIT}
TRITON=yes
;;
pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11)
CUDA_VERSION=12.8.1
CUDNN_VERSION=9
@@ -149,6 +160,17 @@ case "$tag" in
UCC_COMMIT=${_UCC_COMMIT}
TRITON=yes
;;
pytorch-linux-jammy-cuda12.8-cudnn9-py3.12-gcc11-vllm)
CUDA_VERSION=12.8.1
CUDNN_VERSION=9
ANACONDA_PYTHON_VERSION=3.12
GCC_VERSION=11
VISION=yes
KATEX=yes
UCX_COMMIT=${_UCX_COMMIT}
UCC_COMMIT=${_UCC_COMMIT}
TRITON=yes
;;
pytorch-linux-jammy-cuda12.6-cudnn9-py3-gcc9-inductor-benchmarks)
CUDA_VERSION=12.6
CUDNN_VERSION=9
@@ -220,33 +242,34 @@ case "$tag" in
VISION=yes
TRITON=yes
;;
pytorch-linux-jammy-rocm-n-1-py3)
ANACONDA_PYTHON_VERSION=3.10
pytorch-linux-jammy-rocm-n-py3 | pytorch-linux-noble-rocm-n-py3)
if [[ $tag =~ "jammy" ]]; then
ANACONDA_PYTHON_VERSION=3.10
else
ANACONDA_PYTHON_VERSION=3.12
fi
GCC_VERSION=11
VISION=yes
ROCM_VERSION=6.3
ROCM_VERSION=6.4
NINJA_VERSION=1.9.0
TRITON=yes
KATEX=yes
UCX_COMMIT=${_UCX_COMMIT}
UCC_COMMIT=${_UCC_COMMIT}
INDUCTOR_BENCHMARKS=yes
;;
pytorch-linux-jammy-rocm-n-py3 | pytorch-linux-noble-rocm-n-py3)
if [[ $tag =~ "jammy" ]]; then
ANACONDA_PYTHON_VERSION=3.10
else
ANACONDA_PYTHON_VERSION=3.12
fi
pytorch-linux-noble-rocm-alpha-py3)
ANACONDA_PYTHON_VERSION=3.12
GCC_VERSION=11
VISION=yes
ROCM_VERSION=6.4
ROCM_VERSION=7.0
NINJA_VERSION=1.9.0
TRITON=yes
KATEX=yes
UCX_COMMIT=${_UCX_COMMIT}
UCC_COMMIT=${_UCC_COMMIT}
INDUCTOR_BENCHMARKS=yes
PYTORCH_ROCM_ARCH="gfx90a;gfx942;gfx950"
;;
pytorch-linux-jammy-xpu-2025.0-py3)
ANACONDA_PYTHON_VERSION=3.9
@@ -264,7 +287,7 @@ case "$tag" in
NINJA_VERSION=1.9.0
TRITON=yes
;;
pytorch-linux-jammy-py3.9-gcc11-inductor-benchmarks)
pytorch-linux-jammy-py3.9-gcc11-inductor-benchmarks)
ANACONDA_PYTHON_VERSION=3.9
GCC_VERSION=11
VISION=yes
2 changes: 1 addition & 1 deletion .ci/docker/ci_commit_pins/triton.txt
Original file line number Diff line number Diff line change
@@ -1 +1 @@
ae848267bebc65c6181e8cc5e64a6357d2679260
11ec6354315768a85da41032535e3b7b99c5f706
9 changes: 2 additions & 7 deletions .ci/docker/common/install_conda.sh
@@ -4,12 +4,8 @@ set -ex

# Optionally install conda
if [ -n "$ANACONDA_PYTHON_VERSION" ]; then
BASE_URL="https://github.com/conda-forge/miniforge/releases/latest/download" # @lint-ignore
CONDA_FILE="Miniforge3-Linux-$(uname -m).sh"

MAJOR_PYTHON_VERSION=$(echo "$ANACONDA_PYTHON_VERSION" | cut -d . -f 1)
MINOR_PYTHON_VERSION=$(echo "$ANACONDA_PYTHON_VERSION" | cut -d . -f 2)
@@ -21,7 +17,6 @@ if [ -n "$ANACONDA_PYTHON_VERSION" ]; then
exit 1
;;
esac

mkdir -p /opt/conda
chown jenkins:jenkins /opt/conda

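With the Miniconda fallback removed, every build now resolves the same Miniforge asset. A small standalone sketch (not the installer itself) of how these two variables compose the download URL:

```shell
#!/usr/bin/env bash
# The installer filename embeds the machine architecture, so the same two
# lines work on x86_64 and aarch64 hosts alike.
BASE_URL="https://github.com/conda-forge/miniforge/releases/latest/download"
CONDA_FILE="Miniforge3-Linux-$(uname -m).sh"
echo "${BASE_URL}/${CONDA_FILE}"
# e.g. .../releases/latest/download/Miniforge3-Linux-x86_64.sh on an x86_64 host
```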
49 changes: 49 additions & 0 deletions .ci/docker/common/install_cuda.sh
@@ -78,6 +78,19 @@ function install_nvshmem {
echo "nvSHMEM ${nvshmem_version} for CUDA ${cuda_major_version} (${arch_path}) installed."
}

function install_124 {
CUDNN_VERSION=9.1.0.70
echo "Installing CUDA 12.4.1 and cuDNN ${CUDNN_VERSION} and NCCL and cuSparseLt-0.6.2"
install_cuda 12.4.1 cuda_12.4.1_550.54.15_linux

install_cudnn 12 $CUDNN_VERSION

CUDA_VERSION=12.4 bash install_nccl.sh

CUDA_VERSION=12.4 bash install_cusparselt.sh

ldconfig
}
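install_124 leans on the one-shot environment override form `VAR=value cmd`, which sets the variable only for that single child process. A toy demonstration (the child script here is hypothetical, not part of the repo):

```shell
#!/usr/bin/env bash
# VAR=value cmd exports VAR into cmd's environment without touching the
# current shell; this is how CUDA_VERSION reaches install_nccl.sh above.
cat > /tmp/show_cuda_version.sh <<'EOF'
echo "CUDA_VERSION=${CUDA_VERSION:-unset}"
EOF
CUDA_VERSION=12.4 bash /tmp/show_cuda_version.sh   # prints: CUDA_VERSION=12.4
```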

function install_126 {
CUDNN_VERSION=9.10.2.21
@@ -113,6 +126,40 @@ function install_129 {
ldconfig
}

function prune_124 {
echo "Pruning CUDA 12.4"
#####################################################################################
# CUDA 12.4 prune static libs
#####################################################################################
export NVPRUNE="/usr/local/cuda-12.4/bin/nvprune"
export CUDA_LIB_DIR="/usr/local/cuda-12.4/lib64"

export GENCODE="-gencode arch=compute_50,code=sm_50 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_90,code=sm_90"
export GENCODE_CUDNN="-gencode arch=compute_50,code=sm_50 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_90,code=sm_90"

if [[ -n "$OVERRIDE_GENCODE" ]]; then
export GENCODE=$OVERRIDE_GENCODE
fi
if [[ -n "$OVERRIDE_GENCODE_CUDNN" ]]; then
export GENCODE_CUDNN=$OVERRIDE_GENCODE_CUDNN
fi

# all CUDA libs except CuDNN and CuBLAS
ls $CUDA_LIB_DIR/ | grep "\.a" | grep -v "culibos" | grep -v "cudart" | grep -v "cudnn" | grep -v "cublas" | grep -v "metis" \
| xargs -I {} bash -c \
"echo {} && $NVPRUNE $GENCODE $CUDA_LIB_DIR/{} -o $CUDA_LIB_DIR/{}"

# prune CuDNN and CuBLAS
$NVPRUNE $GENCODE_CUDNN $CUDA_LIB_DIR/libcublas_static.a -o $CUDA_LIB_DIR/libcublas_static.a
$NVPRUNE $GENCODE_CUDNN $CUDA_LIB_DIR/libcublasLt_static.a -o $CUDA_LIB_DIR/libcublasLt_static.a

#####################################################################################
# CUDA 12.4 prune visual tools
#####################################################################################
export CUDA_BASE="/usr/local/cuda-12.4/"
rm -rf $CUDA_BASE/libnvvp $CUDA_BASE/nsightee_plugins $CUDA_BASE/nsight-compute-2024.1.0 $CUDA_BASE/nsight-systems-2023.4.4/
}
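The library-selection pipeline in prune_124 is just `ls | grep | grep -v … | xargs`. A toy run of the same filter chain (made-up library names, `echo` standing in for nvprune) shows which files would be pruned:

```shell
#!/usr/bin/env bash
# Keep static archives but exclude the cudnn/cublas families, exactly as the
# pipeline above does before handing each surviving file to nvprune.
printf '%s\n' libfoo_static.a libcudnn_static.a libcublas_static.a libbar_static.a \
  | grep "\.a" | grep -v "cudnn" | grep -v "cublas" \
  | xargs -I {} echo "would prune {}"
# prints:
#   would prune libfoo_static.a
#   would prune libbar_static.a
```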

function prune_126 {
echo "Pruning CUDA 12.6"
#####################################################################################
@@ -169,6 +216,8 @@ function install_128 {
while test $# -gt 0
do
case "$1" in
12.4) install_124; prune_124
;;
12.6|12.6.*) install_126; prune_126
;;
12.8|12.8.*) install_128;
2 changes: 2 additions & 0 deletions .ci/docker/common/install_cudnn.sh
@@ -8,6 +8,8 @@ if [[ -n "${CUDNN_VERSION}" ]]; then
CUDNN_NAME="cudnn-linux-x86_64-9.10.2.21_cuda12-archive"
elif [[ ${CUDA_VERSION:0:4} == "12.6" ]]; then
CUDNN_NAME="cudnn-linux-x86_64-9.10.2.21_cuda12-archive"
elif [[ ${CUDA_VERSION:0:4} == "12.4" ]]; then
CUDNN_NAME="cudnn-linux-x86_64-9.10.2.21_cuda12-archive"
elif [[ ${CUDA_VERSION:0:2} == "11" ]]; then
CUDNN_NAME="cudnn-linux-x86_64-9.1.0.70_cuda11-archive"
else
8 changes: 8 additions & 0 deletions .ci/docker/common/install_cusparselt.sh
@@ -13,6 +13,14 @@ if [[ ${CUDA_VERSION:0:4} =~ ^12\.[5-9]$ ]]; then
fi
CUSPARSELT_NAME="libcusparse_lt-linux-${arch_path}-0.7.1.0-archive"
curl --retry 3 -OLs https://developer.download.nvidia.com/compute/cusparselt/redist/libcusparse_lt/linux-${arch_path}/${CUSPARSELT_NAME}.tar.xz
elif [[ ${CUDA_VERSION:0:4} == "12.4" ]]; then
arch_path='sbsa'
export TARGETARCH=${TARGETARCH:-$(uname -m)}
if [ ${TARGETARCH} = 'amd64' ] || [ "${TARGETARCH}" = 'x86_64' ]; then
arch_path='x86_64'
fi
CUSPARSELT_NAME="libcusparse_lt-linux-${arch_path}-0.6.2.3-archive"
curl --retry 3 -OLs https://developer.download.nvidia.com/compute/cusparselt/redist/libcusparse_lt/linux-${arch_path}/${CUSPARSELT_NAME}.tar.xz
else
echo "Not sure which libcusparselt version to install for this ${CUDA_VERSION}"
fi
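Both branches above key off `${CUDA_VERSION:0:4}`, bash's substring expansion, and the first branch's regex `^12\.[5-9]$` matches 12.5 through 12.9 only. A standalone check of that routing for 12.4 (sample value):

```shell
#!/usr/bin/env bash
CUDA_VERSION=12.4.1                  # sample value for illustration
prefix=${CUDA_VERSION:0:4}           # first four characters of the version
echo "$prefix"                       # prints 12.4
if [[ $prefix =~ ^12\.[5-9]$ ]]; then matched=yes; else matched=no; fi
echo "$matched"                      # prints no -> 12.4 takes the 0.6.2.3 branch
```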