SceneCOT: Eliciting Grounded Chain-of-Thought Reasoning in 3D Scenes

SceneCOT introduces a novel hierarchical framework for grounded chain-of-thought reasoning in 3D scenes, leveraging cross-modal attention to minimize halluci...

Level: advanced

By Xiongkun Linghu and 4 other authors

Category: research