Docs2Synth: A Synthetic Data Trained Retriever Framework for Scanned Visually Rich Documents Understanding

Docs2Synth introduces an agent-based framework for training retrievers on visually rich documents using synthetic data, eliminating manual annotation bottlen...

Level: advanced

By Unknown

Category: research