**pytorch-optimizer** is a collection of optimizers and learning-rate schedulers for PyTorch.

The algorithms are re-implemented from the original papers, with speed and memory tweaks and plug-in support. The library also includes other useful, practical optimization ideas.

## Why use `pytorch-optimizer`?

1. Wide range of supported optimizers: currently **83 optimizers (+ `bitsandbytes`, `qgalore`, `torchao`)**, **16 lr schedulers**, and **13 loss functions**!
2. Many variants such as `Cautious`, `AdamD`, and `Gradient Centralization`
3. Easy-to-use, clean, and tested code
4. Active maintenance
5. Somewhat more optimized than the original implementations

Highly inspired by [pytorch-optimizer](https://github.com/jettify/pytorch-optimizer).
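
A minimal usage sketch is shown below. It assumes the `AdamP` class and the `load_optimizer` helper exported by this package; check the documentation for the exact names and signatures.

```python
import torch
from torch import nn

# Hypothetical two-layer model, purely for illustration.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))

# Option 1: import an optimizer class directly.
from pytorch_optimizer import AdamP

optimizer = AdamP(model.parameters(), lr=1e-3, weight_decay=1e-2)

# Option 2: look an optimizer up by name (assumed `load_optimizer` helper).
from pytorch_optimizer import load_optimizer

optimizer_class = load_optimizer(optimizer='adamp')
optimizer = optimizer_class(model.parameters(), lr=1e-3)
```

Either form yields a standard `torch.optim.Optimizer`, so the usual `optimizer.zero_grad()` / `loss.backward()` / `optimizer.step()` training loop applies unchanged.
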
| Optimizer | Description | Official Code | Paper | Citation |
| :---: | :---: | :---: | :---: | :---: |
| MicroAdam |*Accurate Adaptive Optimization with Low Space Overhead and Provable Convergence*|[github](https://github.com/IST-DASLab/MicroAdam)|<https://arxiv.org/abs/2405.15593>|[cite](https://github.com/IST-DASLab/MicroAdam?tab=readme-ov-file#citing)|
| Muon |*MomentUm Orthogonalized by Newton-Schulz*|[github](https://github.com/KellerJordan/Muon)|<https://x.com/kellerjordan0/status/1842300916864844014>|[cite](https://github.com/KellerJordan/Muon)|
| LaProp |*Separating Momentum and Adaptivity in Adam*|[github](https://github.com/Z-T-WANG/LaProp-Optimizer)|<https://arxiv.org/abs/2002.04839>|[cite](https://github.com/Z-T-WANG/LaProp-Optimizer?tab=readme-ov-file#citation)|
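
The Muon row above refers to a Newton-Schulz iteration that approximately orthogonalizes the 2-D momentum matrix before the update. A rough sketch of that step, using the quintic coefficients published with the original Muon code, is given below; it is an illustration, not this library's implementation.

```python
import torch


def newton_schulz_orthogonalize(g: torch.Tensor, steps: int = 5) -> torch.Tensor:
    """Approximately orthogonalize a 2-D momentum matrix via a quintic
    Newton-Schulz iteration (illustrative sketch of the step Muon uses)."""
    a, b, c = 3.4445, -4.7750, 2.0315  # coefficients from the Muon repository
    x = g.float()
    transposed = x.size(0) > x.size(1)
    if transposed:  # work on the wider orientation so x @ x.T stays small
        x = x.T
    x = x / (x.norm() + 1e-7)  # keep the spectral norm <= 1 for convergence
    for _ in range(steps):
        A = x @ x.T
        x = a * x + (b * A + c * A @ A) @ x
    if transposed:
        x = x.T
    return x
```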