fix(deps): update dependency accelerate to v1 #80
This PR contains the following updates:
| Package | Type | Update | Change |
|---|---|---|---|
| accelerate | extras | major | `^0.30.1` -> `^1.0.0` |

Generated Summary

This PR updates the `accelerate` dependency version from `^0.30.1` to `^1.0.0`, a major release of the `accelerate` library.

This summary was generated with ❤️ by rigging

Warning
Some dependencies could not be looked up. Check the Dependency Dashboard for more information.
Release Notes
huggingface/accelerate (accelerate)
v1.6.0: FSDPv2, DeepSpeed TP and XCCL backend support
Compare Source
FSDPv2 support
This release introduces support for FSDPv2, thanks to @S1ro1.
If you are using Python code, you need to set fsdp_version=2 in FullyShardedDataParallelPlugin. If you want to convert a YAML config that contains the FSDPv1 config to FSDPv2, use our conversion tool.
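As a minimal sketch of the Python route (other plugin arguments keep their defaults here; exact fields should be checked against your installed accelerate version):

```python
from accelerate import Accelerator
from accelerate.utils import FullyShardedDataParallelPlugin

# Opt in to FSDPv2 via the plugin; everything else keeps its defaults.
fsdp_plugin = FullyShardedDataParallelPlugin(fsdp_version=2)
accelerator = Accelerator(fsdp_plugin=fsdp_plugin)
```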
To learn more about the difference between FSDPv1 and FSDPv2, read the following documentation.
DeepSpeed TP support
We have added initial support for DeepSpeed + TP. Not many changes were required, as the DeepSpeed APIs were already compatible: we only needed to make sure that the dataloader was compatible with TP and that we were able to save the TP weights. Thanks @inkcherry for the work! https://github.com/huggingface/accelerate/pull/3390.
To use TP with DeepSpeed, you need to update the settings in the DeepSpeed config file by including a tensor_parallel key. More details in this DeepSpeed PR.
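As a hedged illustration of what that config change might look like (here as a Python dict; the tensor_parallel/autotp_size field names follow the linked DeepSpeed PR and should be verified against your DeepSpeed version, and the other keys are placeholder values):

```python
# Sketch of a DeepSpeed config enabling tensor parallelism.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,    # placeholder value
    "zero_optimization": {"stage": 1},      # placeholder value
    "tensor_parallel": {"autotp_size": 4},  # new key: TP across 4 ranks (assumed schema)
}
```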
Support for XCCL distributed backend
We've added support for XCCL, an Intel distributed backend that can be used with XPU devices. More details in this torch PR. Thanks @dvrogozh for the integration!
What's Changed
Add log_artifact, log_artifacts and log_figure capabilities to the MLflowTracker by @luiz0992 in https://github.com/huggingface/accelerate/pull/3419
New Contributors
Full Changelog: huggingface/accelerate@v1.5.2...v1.6.0
v1.5.2: Patch
Compare Source
Bug Fixes:
torch.get_default_device() requiring a higher version than what we support
pytest import in prod
Full Changelog: huggingface/accelerate@v1.5.0...v1.5.2
v1.5.1
Compare Source
v1.5.0: HPU support
Compare Source
HPU Support
What's Changed
import torch by @faaany in https://github.com/huggingface/accelerate/pull/3396
Use device=torch.get_default_device() in torch.Generators by @saforem2 in https://github.com/huggingface/accelerate/pull/3420
New Contributors
Full Changelog: huggingface/accelerate@v1.4.0...v1.5.0
v1.4.0: torchao FP8, TP & dataloader support, fix memory leak
Compare Source
torchao FP8, initial Tensor Parallel support, and memory leak fixes

torchao FP8

This release introduces a new FP8 API and brings in a new backend: torchao. To use, pass in AORecipeKwargs to the Accelerator while setting mixed_precision="fp8". This is initial support; as it matures, we will incorporate more into it (such as accelerate config/yaml) in future releases. See our benchmark examples here.
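A minimal sketch of that pattern (the toy model and optimizer stand in for real ones; FP8 additionally requires supported hardware):

```python
import torch
from accelerate import Accelerator
from accelerate.utils import AORecipeKwargs

model = torch.nn.Linear(16, 16)  # toy stand-in for a real model
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

# Select the torchao FP8 backend by combining mixed_precision="fp8"
# with an AORecipeKwargs handler.
accelerator = Accelerator(
    mixed_precision="fp8",
    kwargs_handlers=[AORecipeKwargs()],
)
model, optimizer = accelerator.prepare(model, optimizer)
```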
TensorParallel

We have initial support for an in-house solution to TP when working with accelerate dataloaders. Check out the PR here.
Bug fixes

[memory leak] Replace GradientState -> DataLoader reference with weakrefs by @tomaarsen in https://github.com/huggingface/accelerate/pull/3391
What's Changed
tests/test_quantization.py on XPU by @faaany in https://github.com/huggingface/accelerate/pull/3349
require_non_xpu test markers by @faaany in https://github.com/huggingface/accelerate/pull/3301
[memory leak] Replace GradientState -> DataLoader reference with weakrefs by @tomaarsen in https://github.com/huggingface/accelerate/pull/3391
get_quantized_model_device_map by @faaany in https://github.com/huggingface/accelerate/pull/3397
New Contributors
Full Changelog: huggingface/accelerate@v1.3.0...v1.4.0
v1.3.0: Bug fixes + Require torch 2.0
Compare Source
Torch 2.0
As it's been ~2 years since torch 2.0 was first released, we are now requiring it as the minimum version for Accelerate, as was similarly done in transformers as of its last release.
Core
Add keep_torch_compile param to unwrap_model and extract_model_from_parallel for distributed compiled models by @ggoggam in https://github.com/huggingface/accelerate/pull/3282
Big Modeling
Examples
Full Changelog
What's Changed
Add keep_torch_compile param to unwrap_model and extract_model_from_parallel for distributed compiled models by @ggoggam in https://github.com/huggingface/accelerate/pull/3282
New Contributors
Full Changelog: huggingface/accelerate@v1.2.1...v1.3.0
v1.2.1: Patchfix
Compare Source
Full Changelog: huggingface/accelerate@v1.2.0...v1.2.1
v1.2.0: Bug Squashing & Fixes across the board
Compare Source
Core
find_executable_batch_size on XPU by @faaany in https://github.com/huggingface/accelerate/pull/3236
Use numpy._core instead of numpy.core by @qgallouedec in https://github.com/huggingface/accelerate/pull/3247
[data_loader] Optionally also propagate set_epoch to batch sampler by @tomaarsen in https://github.com/huggingface/accelerate/pull/3246
accelerate config prompt text by @faaany in https://github.com/huggingface/accelerate/pull/3268
Big Modeling
align_module_device, ensure only cpu tensors for get_state_dict_offloaded_model by @kylesayrs in https://github.com/huggingface/accelerate/pull/3217
get_state_dict_from_offload by @kylesayrs in https://github.com/huggingface/accelerate/pull/3253
preload_module_classes is lost for nested modules by @wejoncy in https://github.com/huggingface/accelerate/pull/3248
DeepSpeed
Documentation
Update code in tracking documentation by @faaany in https://github.com/huggingface/accelerate/pull/3235
Replaced set/check breakpoint with set/check trigger in the troubleshooting documentation by @relh in https://github.com/huggingface/accelerate/pull/3259
Update set-seed by @faaany in https://github.com/huggingface/accelerate/pull/3228
Fix typo by @faaany in https://github.com/huggingface/accelerate/pull/3221
Use real path for checkpoint by @faaany in https://github.com/huggingface/accelerate/pull/3220
Fixed multiple typos for Tutorials and Guides docs by @henryhmko in https://github.com/huggingface/accelerate/pull/3274
New Contributors
Full Changelog
align_module_device, ensure only cpu tensors for get_state_dict_offloaded_model by @kylesayrs in https://github.com/huggingface/accelerate/pull/3217
find_executable_batch_size on XPU by @faaany in https://github.com/huggingface/accelerate/pull/3236
[data_loader] Optionally also propagate set_epoch to batch sampler by @tomaarsen in https://github.com/huggingface/accelerate/pull/3246
Use numpy._core instead of numpy.core by @qgallouedec in https://github.com/huggingface/accelerate/pull/3247
accelerate config prompt text by @faaany in https://github.com/huggingface/accelerate/pull/3268
get_state_dict_from_offload by @kylesayrs in https://github.com/huggingface/accelerate/pull/3253
preload_module_classes is lost for nested modules by @wejoncy in https://github.com/huggingface/accelerate/pull/3248
Use real path for checkpoint by @faaany in https://github.com/huggingface/accelerate/pull/3220
Code Diff
Release diff: huggingface/accelerate@v1.1.1...v1.2.0
v1.1.1
Compare Source
v1.1.0: Python 3.9 minimum, torch dynamo deepspeed support, and bug fixes
Compare Source
Internals:
data_seed argument in https://github.com/huggingface/accelerate/pull/3150
weights_only=True by default for all compatible objects when checkpointing and saving with torch.save in https://github.com/huggingface/accelerate/pull/3036
dim input in pad_across_processes in https://github.com/huggingface/accelerate/pull/3114
DeepSpeed
Megatron
Big Model Inference
has_offloaded_params utility added in https://github.com/huggingface/accelerate/pull/3188
Examples
Full Changelog
dim input in pad_across_processes by @mariusarvinte in https://github.com/huggingface/accelerate/pull/3114
data_seed by @muellerzr in https://github.com/huggingface/accelerate/pull/3150
save_model by @muellerzr in https://github.com/huggingface/accelerate/pull/3146
weights_only=True by default for all compatible objects by @muellerzr in https://github.com/huggingface/accelerate/pull/3036
get_xpu_available_memory by @faaany in https://github.com/huggingface/accelerate/pull/3165
has_offloaded_params by @kylesayrs in https://github.com/huggingface/accelerate/pull/3188
torch.nn.Module model into account when moving to device by @faaany in https://github.com/huggingface/accelerate/pull/3167
torchrun by @faaany in https://github.com/huggingface/accelerate/pull/3166
align_module_device by @kylesayrs in https://github.com/huggingface/accelerate/pull/3204
New Contributors
Full Changelog: huggingface/accelerate@v1.0.1...v1.1.0
v1.0.1: Bugfix
Compare Source
Bugfixes
auto values were no longer being parsed when using deepspeed
Full Changelog: huggingface/accelerate@v1.0.0...v1.0.1
v1.0.0: Accelerate 1.0.0 is here!
Compare Source
🚀 Accelerate 1.0 🚀
With accelerate 1.0, we are officially stating that the core parts of the API are now "stable" and ready for the future of what the world of distributed training and PyTorch has to handle. In these release notes, we will focus first on the major breaking changes, to get your code fixed, followed by what is new specifically between 0.34.0 and 1.0.
To read more, check out our official blog here.
Migration assistance
Passing dispatch_batches, split_batches, even_batches, and use_seedable_sampler to the Accelerator() should now be handled by creating an accelerate.utils.DataLoaderConfiguration() and passing this to the Accelerator() instead (Accelerator(dataloader_config=DataLoaderConfiguration(...)))
Accelerator().use_fp16 and AcceleratorState().use_fp16 have been removed; this should be replaced by checking accelerator.mixed_precision == "fp16"
Accelerator().autocast() no longer accepts a cache_enabled argument. Instead, an AutocastKwargs() instance should be used, which handles this flag (among others), passing it to the Accelerator (Accelerator(kwargs_handlers=[AutocastKwargs(cache_enabled=True)]))
accelerate.utils.is_tpu_available should be replaced with accelerate.utils.is_torch_xla_available
accelerate.utils.modeling.shard_checkpoint should be replaced with split_torch_state_dict_into_shards from the huggingface_hub library
accelerate.tqdm.tqdm() no longer accepts True/False as the first argument; instead, main_process_only should be passed in as a named argument
A sketch of the new patterns follows this list.
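A hedged before/after sketch of the new patterns (the argument values are illustrative, not recommendations):

```python
from accelerate import Accelerator
from accelerate.utils import AutocastKwargs, DataLoaderConfiguration

# Before 1.0: Accelerator(split_batches=True, even_batches=True, ...)
dataloader_config = DataLoaderConfiguration(split_batches=True, even_batches=True)

# Before 1.0: autocast(cache_enabled=True)
autocast_kwargs = AutocastKwargs(cache_enabled=True)

accelerator = Accelerator(
    dataloader_config=dataloader_config,
    kwargs_handlers=[autocast_kwargs],
)

# Before 1.0: accelerator.use_fp16
is_fp16 = accelerator.mixed_precision == "fp16"
```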
Multiple Model DeepSpeed Support

After long requests, we finally have multiple model DeepSpeed support in Accelerate (though it is still quite early)! Read the full tutorial here; essentially:
When using multiple models, a DeepSpeed plugin should be created for each model (and, as a result, a separate config). A few examples are below:
Knowledge distillation
(Where we train only one model, with zero3, and another is used for inference, with zero2)
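A sketch of that setup (the config file names and the "student"/"teacher" labels are hypothetical, and the dict-of-plugins keyword follows the linked tutorial; verify against your installed version):

```python
from accelerate import Accelerator
from accelerate.utils import DeepSpeedPlugin

# One plugin (and config) per model: the trained model uses ZeRO-3,
# the inference-only model uses ZeRO-2.
deepspeed_plugins = {
    "student": DeepSpeedPlugin(hf_ds_config="zero3_config.json"),  # trained
    "teacher": DeepSpeedPlugin(hf_ds_config="zero2_config.json"),  # inference
}
accelerator = Accelerator(deepspeed_plugins=deepspeed_plugins)
```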
To then select which plugin is to be used at a certain time (i.e., when calling prepare), we call accelerator.state.select_deepspeed_plugin("name"), where the first plugin is active by default:
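Continuing the sketch above (student_model and teacher_model are toy placeholders):

```python
import torch

student_model = torch.nn.Linear(8, 8)  # toy stand-ins for real models
teacher_model = torch.nn.Linear(8, 8)

# The first plugin in the dict ("student" here) is active by default.
student = accelerator.prepare(student_model)

# Switch the active plugin before preparing the next model.
accelerator.state.select_deepspeed_plugin("teacher")
teacher = accelerator.prepare(teacher_model)
```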
Multiple disjoint models

For disjoint models, separate accelerators should be used for each model, and their own .backward() should be called later:
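A hedged sketch of that pattern (the config files are hypothetical and the models are toy stand-ins; see the linked tutorial for the authoritative version):

```python
import torch
from accelerate import Accelerator
from accelerate.utils import DeepSpeedPlugin

# One accelerator (with its own plugin) per disjoint model.
first_accelerator = Accelerator(deepspeed_plugin=DeepSpeedPlugin(hf_ds_config="model1_config.json"))
second_accelerator = Accelerator(deepspeed_plugin=DeepSpeedPlugin(hf_ds_config="model2_config.json"))

model1, model2 = torch.nn.Linear(8, 8), torch.nn.Linear(8, 8)  # toy stand-ins
opt1 = torch.optim.SGD(model1.parameters(), lr=1e-3)
opt2 = torch.optim.SGD(model2.parameters(), lr=1e-3)

model1, opt1 = first_accelerator.prepare(model1, opt1)
model2, opt2 = second_accelerator.prepare(model2, opt2)

x = torch.randn(4, 8)
loss1 = model1(x).sum()
loss2 = model2(x).sum()

# Each accelerator backpropagates its own loss.
first_accelerator.backward(loss1)
opt1.step()
second_accelerator.backward(loss2)
opt2.step()
```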
FP8

We've enabled
Configuration
📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
🔕 Ignore: Close this PR and you won't be reminded about this update again.
This PR has been generated by Renovate Bot.