H-Node Attack and Defense in Large Language Models

This research introduces H-Node ANC, a mechanistic framework that identifies hallucinations in transformers by targeting high-variance hidden-state dimension...

Level: advanced

By Eric Yocam, Varghese Vaidyan, Yong Wang

Category: research