ChEmREF: Evaluating Language Model Readiness for Chemical Emergency Response

This study introduces ChEmREF, a rigorous benchmark that evaluates large language model (LLM) reliability in chemical emergency response, highlighting critical gaps in safety-critical ...

Level: advanced

By Risha Surana, Qinyuan Ye, Swabha Swayamdipta

Category: research