This research explores the nuanced role of synthetic data in LLM pre-training, revealing that mixing 30% synthetic data with natural data can accelerate trai...
Level: advanced
By Unknown
Category: research