News

This repository contains the official implementation of Scale-Distribution Decoupling (SDD), a novel method developed to stabilize the training of large language models (LLMs) by effectively ...