Reasoning Language Model Inference Serving Unveiled: An Empirical Study

This empirical study dissects the serving dynamics of reasoning large language models, revealing critical performance trade-offs and optimization strategies ...

Level: advanced

By Qi Li and 8 other authors

Category: research