Revisiting Multimodal KV Cache Compression: A Frequency-Domain-Guided Outlier-KV-Aware Approach

Explore FlashCache, a novel framework leveraging frequency-domain analysis to compress KV caches in multimodal LLMs, achieving significant speedups and memor...

Level: advanced

By Yaoxin Yang, Peng Ye, Xudong Tan, Chongjun Tu, Maosen Zhao, Jia Hao, Tao Chen

Category: research