VLA Simulation Benchmarks · Task & Robustness Atlases

The Benchmark Atlas Index

One front door to every dashboard in this collection: interactive, single-file, blueprint-style atlases of the task suites and leaderboards used to evaluate Vision-Language-Action (VLA) models. The atlases are grouped into tiers that run from capability (what models do in-distribution), through robustness and memory, to the sim ↔ real reality gap. Click any card to open its atlas in a new tab.

22 dashboards · 6 tiers · 20+ benchmarks