3D Optimization for AI Inference Scaling: Balancing Accuracy, Cost, and Latency
Explore a novel 3D multi-objective optimization framework designed to balance accuracy, cost, and latency in AI inference scaling. This research details how ...
Level: advanced
By Minseok Jung, Abhas Ricky, Muhammad Rameez Chatni