FALCON: Fine-grained Activation Manipulation for Large Language Models

FALCON introduces a novel framework using contrastive orthogonal unalignment to precisely isolate and unlearn specific knowledge components in large language...

Level: expert

By Jinwei Hu and 6 other authors

Category: research