This research investigates whether transformer models adaptively utilize their depth for relational reasoning, revealing distinct behaviors between pretraine...
Level: advanced
By Alicia Curth
Category: research