1 parent 68a894d, commit 2a32b98
README.md
@@ -21,11 +21,6 @@ Easy Per Step Fault Tolerance for PyTorch
 
 ---
 
-> ⚠️ WARNING: This is an alpha prototype for PyTorch fault tolerance and may have bugs
-> or breaking changes as this is actively under development. We'd love to collaborate
-> and contributions are welcome. Please reach out if you're interested in torchft
-> or want to discuss fault tolerance in PyTorch
-
 This repository implements techniques for doing a per-step fault tolerance so
 you can keep training if errors occur without interrupting the entire training
 job.