Having Muon available in Optax would be very cool. It's also the optimizer used to train Kimi 2. References: implementation, PyTorch pull request, paper, paper showing Muon efficiency. Would the team ...
I recommend following the same strategy as for metrics: a small number of well-known optimizers should be callable by simple names, given as the value of an optimizer.name entry in the config file. For more ...
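A minimal sketch of the name-based lookup suggested above, assuming the config is loaded into a nested dict with an optimizer.name entry. The registry OPTIMIZERS, the helper build_optimizer, and the stand-in factories sgd and adam are hypothetical names introduced for illustration. In a real project the registry values would presumably be library constructors (for example optax.sgd or optax.adam) rather than the plain stand-ins used here to keep the example dependency-free.

```python
from typing import Any, Callable, Dict


# Stand-in factories (hypothetical). A real registry would point at library
# constructors instead of returning plain dicts.
def sgd(learning_rate: float, momentum: float = 0.0) -> Dict[str, Any]:
    return {"kind": "sgd", "learning_rate": learning_rate, "momentum": momentum}


def adam(learning_rate: float, b1: float = 0.9, b2: float = 0.999) -> Dict[str, Any]:
    return {"kind": "adam", "learning_rate": learning_rate, "b1": b1, "b2": b2}


# Hypothetical registry: simple names -> optimizer factories.
OPTIMIZERS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "sgd": sgd,
    "adam": adam,
}


def build_optimizer(config: Dict[str, Any]) -> Dict[str, Any]:
    """Build an optimizer from config["optimizer"], keyed by its "name" entry."""
    opt_cfg = dict(config["optimizer"])  # copy so the caller's config is not mutated
    name = opt_cfg.pop("name")
    try:
        factory = OPTIMIZERS[name]
    except KeyError:
        known = ", ".join(sorted(OPTIMIZERS))
        raise ValueError(f"Unknown optimizer {name!r}; expected one of: {known}") from None
    # Remaining entries are passed through as keyword arguments.
    return factory(**opt_cfg)


if __name__ == "__main__":
    config = {"optimizer": {"name": "adam", "learning_rate": 1e-3}}
    print(build_optimizer(config))
```

Keeping the registry small and explicit means an unsupported name fails early with a clear message, while less common optimizers can still be handled through a separate, more verbose config path.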