Docs2Synth: A Synthetic Data Trained Retriever Framework for Scanned Visually Rich Documents Understanding
Docs2Synth introduces an agent-based framework for training retrievers on visually rich documents using synthetic data, eliminating manual annotation bottlen...