Commit d16a368

Merge pull request #314 from kozistr/feature/mars-optimizer
[Feature] Implement MARS optimizer
2 parents: 42b1d76 + e915efd

File tree: 4 files changed (+13 −3 lines)

docs/changelogs/v3.3.1.md (+3 −2)

@@ -8,7 +8,8 @@
 * [SGD-like Memory, AdamW-level Performance](https://arxiv.org/abs/2412.05270)
 * Rename the `Apollo` (`An Adaptive Parameter-wise Diagonal Quasi-Newton Method for Nonconvex Stochastic Optimization`) optimizer name to `ApolloDQN` not to overlap with the new optimizer name `APOLLO`. (#312)
 * Implement `MARS` optimizer. (#313, #314)
-    * [Unleashing the Power of Variance Reduction for Training Large Models](https://arxiv.org/abs/2411.10438)
+    * [Unleashing the Power of Variance Reduction for Training Large Models](https://arxiv.org/abs/2411.10438)
+* Support `Cautious` variant to `MARS` optimizer. (#314)
 
 ### Bug
 
@@ -17,7 +18,7 @@
 
 ### Docs
 
-* Add more visualizations. (#310)
+* Add more visualizations. (#310, #314)
 
 ### Contributions
 
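For context, here is a minimal usage sketch of the new optimizer through `load_optimizer`, the same entry point the tests below use. The keyword arguments are only those exercised in `tests/constants.py`; default values, and the exact flag name that enables the `Cautious` variant, are not visible in this diff, so treat anything beyond these names as an assumption.

```python
# Minimal sketch: constructing the new MARS optimizer via load_optimizer.
# Only the keyword arguments exercised in tests/constants.py are used;
# the flag enabling the Cautious variant is not shown in this diff, so it
# is omitted here. Values are illustrative.
import torch
from pytorch_optimizer import load_optimizer

model = torch.nn.Linear(16, 4)

optimizer = load_optimizer('mars')(
    model.parameters(),
    lr=5e-1,             # LR for 2D+ parameters, per the test matrix
    lr_1d=5e-1,          # separate LR for 1D parameters
    weight_decay=1e-3,
    mars_type='adamw',   # 'adamw', 'lion', or 'shampoo', per the test matrix
    optimize_1d=False,   # True applies the MARS update to 1D parameters too
)

loss = model(torch.randn(8, 16)).square().mean()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```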

pyproject.toml (+1 −1)

@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "pytorch_optimizer"
-version = "3.3.0"
+version = "3.3.1"
 description = "optimizer & lr scheduler & objective function collections in PyTorch"
 license = "Apache-2.0"
 authors = ["kozistr <[email protected]>"]

tests/constants.py (+1)

@@ -537,6 +537,7 @@
     (MARS, {'lr': 5e-1, 'lr_1d': 5e-1, 'weight_decay': 1e-3, 'mars_type': 'adamw'}, 5),
     (MARS, {'lr': 1e-1, 'weight_decay': 1e-3, 'mars_type': 'lion', 'optimize_1d': True}, 5),
     (MARS, {'lr': 5e-1, 'lr_1d': 5e-1, 'weight_decay': 1e-3, 'mars_type': 'shampoo'}, 5),
+    (MARS, {'lr': 5e-1, 'lr_1d': 5e-1, 'weight_decay': 1e-3, 'mars_type': 'adamw', 'ams_bound': True}, 5),
 ]
 ADANORM_SUPPORTED_OPTIMIZERS: List[Tuple[Any, Dict[str, Union[float, bool, int]], int]] = [
     (AdaBelief, {'lr': 5e-1, 'weight_decay': 1e-3, 'adanorm': True}, 10),
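The added test case turns on `ams_bound`, the library's AMSGrad-style option. As a rough sketch of what that option does in Adam-type updates (the tensor names here are illustrative, not taken from the MARS implementation): the update denominator is built from a running element-wise maximum of the second-moment EMA instead of the EMA itself, so the effective step size for each element can only shrink over time.

```python
import torch

# AMSGrad-style bounding, the behavior ams_bound=True enables in Adam-type
# updates (names are illustrative, not from the MARS implementation):
exp_avg_sq = torch.tensor([0.04, 0.09])      # current EMA of squared gradients
max_exp_avg_sq = torch.tensor([0.16, 0.01])  # element-wise maximum seen so far

# keep the largest second-moment estimate observed for each element...
torch.maximum(max_exp_avg_sq, exp_avg_sq, out=max_exp_avg_sq)
# ...and build the update denominator from it instead of exp_avg_sq
denom = max_exp_avg_sq.sqrt().add_(1e-8)
```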

tests/test_optimizers.py (+8)

@@ -812,3 +812,11 @@ def test_muon_rank(rank):
     model[2].weight.grad = torch.randn(1, 1, 1)
 
     optimizer.step()
+
+
+def test_mars_c_t_norm():
+    param = simple_parameter(True)
+    param.grad[0] = 100.0
+
+    optimizer = load_optimizer('mars')([param], optimize_1d=True)
+    optimizer.step()
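`test_mars_c_t_norm` plants an oversized gradient (`param.grad[0] = 100.0`) so that the step exercises the branch that rescales the MARS correction term. In the MARS paper (arXiv:2411.10438), the corrected gradient is c_t = g_t + γ·(β1/(1−β1))·(g_t − g_{t−1}), clipped back to unit norm when ‖c_t‖ > 1. A standalone sketch of that rule, with illustrative constants and no claim to match the implementation's internals:

```python
import torch

# Sketch of the MARS correction-term clipping that test_mars_c_t_norm targets
# (following the paper, arXiv:2411.10438; constants are illustrative).
gamma, beta1 = 0.025, 0.95
grad = torch.tensor([100.0])        # oversized gradient, as planted by the test
prev_grad = torch.zeros_like(grad)  # g_{t-1}

# variance-reduced correction: c_t = g_t + gamma * beta1 / (1 - beta1) * (g_t - g_{t-1})
c_t = grad + gamma * (beta1 / (1.0 - beta1)) * (grad - prev_grad)

# clip back to unit norm when the correction blows up
c_t_norm = c_t.norm()
if c_t_norm > 1.0:
    c_t = c_t / c_t_norm
```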
