This research introduces a theoretical framework for offline constrained RLHF using multiple preference oracles, ensuring safety and fairness through convex ...
Level: expert
By Brenden Latham, Mehrdad Moharrami
Category: research