This research introduces KV Packet, a novel framework that eliminates recomputation overhead in LLM inference by treating cached documents as immutable packe...
Level: advanced
By Chuangtao Chen
Category: research