CHESS: Context-aware Hierarchical Efficient Semantic Selection for Long-Context LLM Inference

Explore CHESS, a novel algorithm-system co-design framework that dynamically reconstructs semantic context to achieve 4.56x throughput in long-context LLM in...

Level: advanced

By Chao Fei, Guozhong Li, Chenxi Liu, Panos Kalnis

Category: research