This research challenges the viability of constraining parameter regions for LLM safety, revealing poor generalizability and instability in current detection...
Level: advanced
By Zongmin Li, Jian Su, Farah Benamara, Aixin Sun
Category: discussion