This technical report details the UK AI Security Institute's framework for assessing whether advanced AI systems reliably follow goals, specifically examinin...
Level: advanced
By Alexandra Souly, Robert Kirk, Jacob Merizian, Abby D'Cruz, Xander Davies
Category: discussion