How will we do SFT on models with opaque reasoning?

This research explores the critical failure modes of supervised fine-tuning (SFT) in models with opaque reasoning, proposing structural safeguards to ensure stable...

Level: advanced

By Unknown

Category: research