FLRC introduces a novel low-rank compression framework that optimizes layer-specific rank allocation and progressive decoding to enhance LLM inference effici...
Level: advanced
By Unknown
Category: research