ParetoBandit: Budget-Paced Adaptive Routing for Non-Stationary LLM Serving

Explore ParetoBandit, an open-source adaptive router designed to optimize Large Language Model serving by balancing cost and quality in non-stationary enviro...

Level: advanced

By Annette Taberner-Miller

Category: research