Automated Attention Pattern Discovery at Scale in Large Language Models

This research introduces AP-MAE, a vision transformer-based approach to mining attention patterns in large language models, offering scalable interpretabilit...

Level: advanced

By Jonathan Katzy, Razvan-Mihai Popescu, Erik Mekkes, Arie van Deursen, Maliheh Izadi

Category: research