Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability
This research challenges the assumption that supervised fine-tuning only memorizes, revealing that reasoning generalization is conditional on optimization, d...