A Unified Understanding of Offline Data Selection and Online Self-refining Generation for Post-training LLMs

This research introduces a bilevel optimization framework for offline data selection and online self-refining generation to enhance LLM post-training. It det...

Level: advanced

By Quan Xiao, Tianyi Chen

Category: research