This research introduces Neuralink, a novel approach leveraging sparsity and flash memory co-design to accelerate LLM inference on smartphones with 1.49x low...
Level: advanced
By Unknown
Category: research