ParetoQ: Improving Scaling Laws in Extremely Low-bit LLM Quantization

Explore ParetoQ, a framework for LLM quantization at bit widths from 1-bit to 4-bit, revealing critical insights into the balance between model efficiency and ...

Level: advanced

By Unknown

Category: research