Challenges and Research Directions for Large Language Model Inference Hardware

This research identifies memory bandwidth and interconnect latency as the dominant constraints in LLM inference and proposes advanced architectural solutions ...

Level: advanced

By Xiaoyu Ma, David Patterson

Category: research