RADAR leverages mechanistic interpretability to detect data contamination in LLM evaluations by distinguishing between recall and reasoning patterns. This re...
Level: advanced
By Unknown
Category: research