Can LLM Safety Be Ensured by Constraining Parameter Regions?

This research challenges the viability of constraining parameter regions for LLM safety, revealing poor generalizability and instability in current detection...

Level: advanced

By Zongmin Li, Jian Su, Farah Benamara, Aixin Sun

Category: discussion