Efficient Mathematical Reasoning Models via Dynamic Pruning and Knowledge Distillation
This research introduces a dynamic attention head pruning framework combined with knowledge distillation to optimize mathematical reasoning in large language models.
Level: advanced
By Fengming Yu, Qingyu Meng, Haiwei Pan, Kejia Zhang
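The two techniques named in the abstract can be illustrated with a minimal sketch. The helper names (`head_pruning_mask`, `distillation_loss`) and the specific importance and loss choices here are illustrative assumptions, not the authors' actual method: heads are ranked by a precomputed importance score and the lowest-ranked ones are masked, while a temperature-softened KL divergence transfers the teacher's output distribution to the pruned student.

```python
import math

def softmax(xs, temperature=1.0):
    # Temperature-softened softmax over a list of logits.
    exps = [math.exp(x / temperature) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def head_pruning_mask(importance, keep_ratio=0.5):
    # Keep the top fraction of attention heads by importance score;
    # return a 0/1 mask over head indices (illustrative scoring only).
    k = max(1, int(len(importance) * keep_ratio))
    kept = set(sorted(range(len(importance)),
                      key=lambda i: -importance[i])[:k])
    return [1 if i in kept else 0 for i in range(len(importance))]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) on temperature-softened distributions,
    # the standard soft-label knowledge-distillation objective.
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return sum(ti * math.log(ti / si) for ti, si in zip(t, s))

# Example: prune half the heads, then measure teacher/student agreement.
mask = head_pruning_mask([0.9, 0.1, 0.5, 0.2], keep_ratio=0.5)
loss = distillation_loss([2.0, 1.0, 0.1], [1.8, 1.1, 0.0])
```

In a dynamic variant, the importance scores would be recomputed per input batch rather than fixed once, which is what distinguishes dynamic pruning from static one-shot pruning.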