Higher Embedding Dimension Creates a Stronger World Model for a Simple Sorting Task
This research investigates how increasing the embedding dimension in transformers enhances the formation of structured internal world models during sorting task...
Level: advanced
By Brady Bhalla, Honglu Fan, Nancy Chen, Tony Yue YU