InterpBench introduces a semi-synthetic dataset that uses Strict Interchange Intervention Training (Strict IIT) to rigorously evaluate mechanistic interpretability techniques while maintaining realis...
Level: advanced
By Unknown
Category: research