Diversity-Aware Reverse Kullback-Leibler Divergence for Large Language Model Distillation
This research introduces Diversity-aware RKL, a novel method to overcome the overconfidence and diversity loss inherent in standard Reverse Kullback-Leibler ...
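To make the "overconfidence and diversity loss" concrete, the sketch below contrasts standard reverse KL with forward KL on a toy bimodal teacher distribution. A student collapsed onto a single teacher mode incurs a far smaller reverse-KL penalty than forward-KL penalty, which is the mode-seeking behaviour the paper targets. This is a minimal illustrative example, not the paper's Diversity-aware RKL method; the function names and logit values are assumptions for demonstration only.

```python
import math

def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reverse_kl(student_logits, teacher_logits, eps=1e-12):
    """Standard reverse KL, KL(q_student || p_teacher).

    The expectation is taken under the student q, so the student is only
    penalised where it places mass the teacher lacks. It can therefore
    drop entire teacher modes cheaply (mode-seeking), which drives the
    overconfidence and diversity loss described in the abstract.
    """
    q = softmax(student_logits)
    p = softmax(teacher_logits)
    return sum(qi * math.log((qi + eps) / (pi + eps)) for qi, pi in zip(q, p))

def forward_kl(student_logits, teacher_logits, eps=1e-12):
    """Forward KL, KL(p_teacher || q_student), for comparison.

    The expectation is taken under the teacher p, so every teacher mode
    the student ignores is penalised heavily (mode-covering).
    """
    q = softmax(student_logits)
    p = softmax(teacher_logits)
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

# Bimodal teacher: two tokens are (nearly) equally likely.
teacher = [2.0, 2.0, -3.0]
# Overconfident student: nearly all mass on just one of those modes.
collapsed = [4.0, -2.0, -2.0]

rkl = reverse_kl(collapsed, teacher)
fkl = forward_kl(collapsed, teacher)
# Reverse KL barely punishes the dropped second mode, while forward KL
# punishes it heavily, so rkl is much smaller than fkl here.
```

Running this, `rkl` is roughly a third of `fkl`: under a reverse-KL training objective, collapsing onto a single high-probability continuation is a cheap optimum, which is why distilled students lose generation diversity.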