ScanNet++ Spatial Reasoning

3D Vision-Language Spatial Understanding Benchmark

Overview

Current methods that use Large Language Models (LLMs) to interpret 3D visual information and generate segmentation prompts have not fully explored spatial reasoning, largely because high-quality datasets that test both reasoning and spatial understanding are scarce. ScanNet++ Spatial Reasoning addresses this gap: each sample pairs a spatial reasoning question with the 3D object masks that answer it, built on the ScanNet++ scene collection.

Dataset

  • 1,000+ 3D scenes
  • 10,000+ text-object pairs

Key Features

  • Text inputs as spatial reasoning questions
  • 3D object masks as answers
  • Built on ScanNet++ scene collection
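To make the question/answer format above concrete, a single annotation record might look like the sketch below. The 'question' and 'object_ids' fields match those used in the Usage Guide; 'scene_id' and the example values are assumptions for illustration, not the released schema.

# Hypothetical annotation record (illustrative only).
sample_record = {
    "scene_id": "scene0001_00",   # assumed ScanNet++ scene identifier
    "question": "Which chair is closest to the window?",
    "object_ids": [17],           # IDs of the 3D object masks that answer the question
}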
Benchmark

Evaluation Metrics

Models are evaluated by comparing their predicted 3D object masks against the ground-truth masks for each question; results are reported as accuracy, mask intersection-over-union (IoU), and F1 score.
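As a rough illustration of how a mask-based metric such as IoU could be computed, the sketch below assumes the predicted and ground-truth masks are boolean arrays over the same scene point cloud; it is not the benchmark's official evaluation script.

import numpy as np

def mask_iou(pred_mask, gt_mask):
    """Intersection-over-union between two boolean per-point masks.

    Illustrative sketch only; assumes both masks index the same scene
    point cloud. Not the official evaluation code.
    """
    intersection = np.logical_and(pred_mask, gt_mask).sum()
    union = np.logical_or(pred_mask, gt_mask).sum()
    return float(intersection) / union if union > 0 else 0.0

# For accuracy, a prediction might be counted as correct when its IoU
# exceeds a threshold (e.g. 0.5); the threshold here is an assumption.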

Baseline Results

Method      Accuracy   IoU    F1 Score
--------    --------   ----   --------
Method 1    85.2%      0.72   0.80

Challenge Cases

Download & Use

Complete Dataset

Full ScanNet++ Spatial Reasoning dataset with all annotations.

Size: 4.2 GB

Sample Version

Lightweight sample with 50 scenes for quick exploration.

Size: 500 MB

Code & Models

Evaluation code and baseline models are available in the GitHub repository.

Usage Guide


# Example code for loading and using the dataset
import json

# Load dataset
with open('spatial_reasoning_dataset.json', 'r') as f:
    dataset = json.load(f)

# Access a sample
sample = dataset[0]
question = sample['question']
object_ids = sample['object_ids']

print(f"Question: {question}")
print(f"Answer object IDs: {object_ids}")