This research introduces GOSV, a framework that uses global optimization to extract and exploit safety vectors in LLMs, revealing critical vulnerabilities i...
Level: advanced
By Fengheng Chu, Jiahao Chen, Yuhong Wang, Jun Wang, Zhihui Fu, Shouling Ji, Songze Li
Category: research