Counterfactual Simulation Training for Chain-of-Thought Faithfulness

Explore Counterfactual Simulation Training, a novel objective designed to enhance Chain-of-Thought faithfulness in large language models by detecting spuriou...

Level: advanced

By Peter Hase, Christopher Potts

Category: research