Position: The Complexity of Perfect AI Alignment -- Formalizing the RLHF Trilemma

This research formalizes the Alignment Trilemma in RLHF, proving that global representativeness, tractability, and robustness cannot coexist under current co...

Level: expert

By Subramanyam Sahoo, Aman Chadha, Vinija Jain, Divya Chaudhary

Category: discussion