News

Having Muon available in Optax would be very cool. It's also the optimizer used to train Kimi 2. References: implementation, PyTorch pull request, paper, paper showing Muon efficiency. Would the team ...
I recommend following the same strategy as for metrics: a small number of well-known optimizers should be callable by simple names given in an optimizer.name entry in the config file. For more ...
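
The name-based lookup suggested above could look roughly like the sketch below. Everything in it is an assumption for illustration: the function names `default_registry` and `build_optimizer`, the nested-dict config layout, and the choice of `adam`, `adamw`, and `sgd` as the "well-known" set are hypothetical and not taken from the project. It also assumes that every other key under `optimizer` is passed straight through as a keyword argument.

```python
from typing import Any, Callable, Dict, Mapping, Optional


def default_registry() -> Dict[str, Callable[..., Any]]:
    """Short names for a few well-known optimizers (an assumed, illustrative set)."""
    # Imported lazily so the lookup logic below can be used without Optax installed.
    import optax

    return {"adam": optax.adam, "adamw": optax.adamw, "sgd": optax.sgd}


def build_optimizer(
    config: Mapping[str, Any],
    registry: Optional[Mapping[str, Callable[..., Any]]] = None,
) -> Any:
    """Build an optimizer from a config with an `optimizer.name` entry.

    Hypothetical layout: config["optimizer"]["name"] holds the short name,
    and the remaining keys under "optimizer" become keyword arguments.
    """
    opt_cfg = dict(config["optimizer"])  # copy so the caller's config is untouched
    name = opt_cfg.pop("name")
    if registry is None:
        registry = default_registry()
    if name not in registry:
        raise KeyError(f"Unknown optimizer {name!r}; known names: {sorted(registry)}")
    return registry[name](**opt_cfg)
```

With Optax installed, `build_optimizer({"optimizer": {"name": "adamw", "learning_rate": 1e-3}})` would return the corresponding gradient transformation. Passing an explicit `registry` keeps the lookup testable and leaves room for project-specific entries.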