CHESS: Context-aware Hierarchical Efficient Semantic Selection for Long-Context LLM Inference
Explore CHESS, a novel algorithm-system co-design framework that dynamically reconstructs semantic context to achieve 4.56x throughput in long-context LLM in...
Level: advanced
By Chao Fei, Guozhong Li, Chenxi Liu, Panos Kalnis