
# multimodal → evaluation → benchmark → artificial-intelligence

1 link tagged with all of: multimodal + evaluation + benchmark + artificial-intelligence


Links

Abstract

SpatialScore is a comprehensive benchmark for evaluating the spatial understanding of multimodal large language models (MLLMs). It combines the VGBench dataset with an extensive collection of 28K samples. The work also presents SpatialAgent, a multi-agent system designed for enhanced spatial reasoning. Quantitative and qualitative evaluations reveal both persistent challenges and improvements in spatial tasks.

Saved by tldr-importer · Last saved October 29, 2025 · 2 min read

Tags: spatial-understanding, multimodal, evaluation, benchmark, artificial-intelligence