This research audits frontier LLM judges against human ratings to reveal critical gaps in automated disinformation risk assessment, challenging the validity ...
Level: advanced
By Zonghuan Xu
Category: discussion