This paper introduces PRAISE, a framework for training agentic search models that overcomes reward sparsity and expensive rollouts by leveraging prefix...
Level: advanced
By Erhan Zhang
Category: research