This research introduces a microsaccade-inspired probing technique that perturbs positional encodings to detect LLM misbehaviors without fine-tuning, of...
Level: advanced
By Unknown
Category: research