KV Packet: Recomputation-Free Context-Independent KV Caching for LLMs

This research introduces KV Packet, a framework that eliminates recomputation overhead in LLM inference by treating cached documents as immutable packe...

Level: advanced

By Chuangtao Chen

Category: research