This research introduces a training-free online routing algorithm that uses approximate nearest neighbor (ANN) search to optimize high-volume multi-LLM serving with asymptotic optimality ...
Level: advanced
By Fangzhou Wu, Sandeep Silwal
Category: research