Adaptive Simulation Experiment for LLM Policy Optimization

This research introduces LLM-PO, a novel framework that optimizes Large Language Model (LLM) policies by treating them as stochastic simulators within adaptive experim...

Level: expert

By Mingjie Hu

Category: research