docs/source-fabric/advanced/model_init.rst
@@ -75,6 +75,10 @@ When training sharded models with :doc:`FSDP <model_parallel/fsdp>` or DeepSpeed
 
     model = fabric.setup(model)  # parameters get sharded and initialized at once
 
+    # Make sure to create the optimizer only after the model has been set up
+    optimizer = torch.optim.Adam(model.parameters())
+    optimizer = fabric.setup_optimizers(optimizer)
+
 .. note::
     Empty-init is experimental and the behavior may change in the future.
     For FSDP on PyTorch 2.1+, it is required that all user-defined modules that manage parameters implement a ``reset_parameters()`` method (all PyTorch built-in modules have this too).
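
For context, a minimal end-to-end sketch of the pattern this diff documents might look like the following. The ``Linear`` model, ``accelerator="cpu"``, and tensor sizes are placeholder assumptions for illustration, not part of the diff:

.. code-block:: python

    import torch
    from lightning.fabric import Fabric

    fabric = Fabric(accelerator="cpu")
    fabric.launch()

    # Allocate the model without initializing real weights; with a sharding
    # strategy such as FSDP, the parameters are materialized later by
    # fabric.setup().
    with fabric.init_module(empty_init=True):
        model = torch.nn.Linear(128, 10)

    model = fabric.setup(model)  # parameters get sharded and initialized at once

    # Create the optimizer only after setup, so it references the
    # materialized (and possibly sharded) parameters.
    optimizer = torch.optim.Adam(model.parameters())
    optimizer = fabric.setup_optimizers(optimizer)

Creating the optimizer before ``fabric.setup(model)`` would bind it to the un-materialized parameter tensors, which is why the added lines place the optimizer creation after the model setup.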