AutoBackdoor: Automating Backdoor Attacks via LLM Agents

This research introduces AutoBackdoor, an agent-based framework that automates backdoor attacks on LLMs through semantic prompt engineering. It exposes critical ...

Level: advanced

By Yige Li, Zhe Li, Wei Zhao, Nay Myat Min, Hanxun Huang, Xingjun Ma, Jun Sun

Category: discussion