Researched H5 proposal | June 5, 2026

The data layer for world models is becoming a business category.

Across video world models, simulation, robot planners, autonomous driving, and digital twins, the recurring bottleneck is not only compute. It is rights-cleared multimodal data connected to state, action, physics, scene context, and human judgment.

What the literature says about training data.

The concrete examples below show that world modeling is not one data problem. It is a stack of modality-specific data problems that need curation, annotation, validation, and reuse rights.

Observation data is abundant but messy.

Sora, Genie, Cosmos, V-JEPA, and GAIA-1 all point to video as a core substrate, but video alone does not cleanly expose action, state, object properties, or legal training rights.

State data is scarce and expensive.

Simulators need geometry, materials, object states, articulated assets, collision meshes, and physics labels. This is where generic web data runs thin.

Action data creates the commercial wedge.

Robot policies need demonstrations, trajectories, instructions, failures, and environment variation. These are precisely the assets Humanbased can source and verify.

Capture: video, robot logs, 3D scans, motion capture, driving scenes, simulated rollouts.

Structure: segment tasks, align sensors, normalize formats, attach instructions and timestamps.

Annotate: object state, affordance, action phase, material, physics issue, success or failure.

Validate: consensus, expert review, simulator-real comparison, safety and realism scoring.

Package: dataset card, API export, license, lineage, quality report, resale window.
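As an illustration of the consensus step in the Validate stage, the sketch below accepts a label only when enough annotators agree on it. The function name `consensus_label` and the 0.66 agreement threshold are assumptions made for this example; they are not a documented Humanbased method.

```python
from collections import Counter


def consensus_label(votes, min_agreement=0.66):
    """Return (label, agreement) for a list of annotator votes.

    The label is None when the most common vote falls below the
    agreement threshold (hypothetical default: 0.66).
    """
    if not votes:
        return None, 0.0
    # most_common(1) yields the single most frequent vote and its count.
    label, count = Counter(votes).most_common(1)[0]
    agreement = count / len(votes)
    if agreement < min_agreement:
        return None, agreement
    return label, agreement


# Three annotators judge whether a grasp episode succeeded.
label, agreement = consensus_label(["success", "success", "failure"])
```

Episodes that come back with no label would presumably be routed to the expert-review step rather than discarded, since disagreement often marks the ambiguous cases that are most useful for evaluation.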

Evidence catalog: actual data used across world-model venues.

Each card names publicly described training data or benchmark data, how it is used, and what Humanbased could supply or validate. Some frontier labs publish only partial data recipes, so the catalog separates documented inputs from Humanbased's inferred opportunity.

Business strategy: convert the data gap into pilots.

The strongest initial motion is not to sell "world models." It is to sell fixed-scope data products tied to model training or evaluation pain.

Humanbased should become the human-verified reality layer for physical AI: source the data, label the state, validate the physics, and prove the rights.

First ICP: robotics foundation-model teams

Offer a 4-to-6-week household or warehouse manipulation pilot: 1,000 interaction episodes, action/state/affordance labels, a validation report, and a dataset card.

Second ICP: simulation and digital twin teams

Offer sim-to-real validation: compare synthetic scenes, robot rollouts, or driving scenarios against human and expert realism judgments.

Third ICP: 3D and world-generation labs

Offer 3D asset QA, scene capture, material/scale/collision metadata, and human preference scoring for generated interactive worlds.

Product packaging.

These are features Humanbased can bring to market without waiting for a full foundation-model partnership.

World Data Campaigns

Guided capture and annotation for physical interactions: household, warehouse, retail, healthcare support, human navigation, and task failures.

Physical Annotation Schemas

Reusable schemas for object state, action phase, affordance, material, collision, force/contact, and success/failure outcomes.
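A reusable schema of this kind could look roughly like the sketch below, which types one annotated segment of an episode. Every field name, the enum values, and the single sanity check are hypothetical choices for illustration; a real schema would be agreed with the customer's training or evaluation pipeline.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class ActionPhase(str, Enum):
    # Hypothetical phase vocabulary for manipulation tasks.
    APPROACH = "approach"
    GRASP = "grasp"
    MANIPULATE = "manipulate"
    RELEASE = "release"


class Outcome(str, Enum):
    SUCCESS = "success"
    FAILURE = "failure"


@dataclass
class SegmentAnnotation:
    episode_id: str
    start_s: float            # segment start, seconds from episode start
    end_s: float              # segment end, seconds from episode start
    object_name: str
    object_state: str         # e.g. "closed", "open", "full"
    action_phase: ActionPhase
    affordance: str           # e.g. "graspable", "pourable"
    material: str
    in_contact: bool          # force/contact flag
    collision: bool
    outcome: Optional[Outcome] = None  # set only on the final segment

    def problems(self) -> List[str]:
        """Return human-readable issues; an empty list means the record passes."""
        issues = []
        if not self.episode_id:
            issues.append("episode_id is empty")
        if self.end_s <= self.start_s:
            issues.append("end_s must be greater than start_s")
        return issues
```

Using enums for the closed vocabularies and free text for the open ones keeps the schema portable across household, warehouse, and retail campaigns while still letting malformed records be caught before validation.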

Sim-To-Real Validation

Human and expert comparison of generated or simulated predictions against real-world outcomes and safety expectations.

Contributor Credential Pools

Verified household contributors, device-gated 3D scanners, robot teleoperators, warehouse workers, and domain experts.

Dataset Marketplace Packs

Reusable data products such as kitchen manipulation, indoor navigation, 3D rooms, common-object affordances, and synthetic-data QA.

Rights And Lineage Layer

Consent scope, collection protocol, contributor reputation, acceptance history, reuse rights, and auditable dataset cards.
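One way to make such a layer auditable is to record consent scope on every dataset card and derive the scope of a combined pack from its sources. The sketch below assumes a conservative rule, namely that a derived pack may be used only for purposes every source card permits; the `DatasetCard` fields and that rule are illustrative assumptions, not legal guidance or an existing Humanbased format.

```python
from dataclasses import dataclass
from typing import FrozenSet, Sequence


@dataclass(frozen=True)
class DatasetCard:
    name: str
    collection_protocol: str
    consent_scope: FrozenSet[str]   # permitted uses, e.g. {"training", "evaluation"}
    resale_allowed: bool


def permits(card: DatasetCard, use: str) -> bool:
    """True if the card's consent scope covers the intended use."""
    return use in card.consent_scope


def derived_scope(sources: Sequence[DatasetCard]) -> FrozenSet[str]:
    """Scope of a pack built from several sources: the intersection of their scopes."""
    if not sources:
        return frozenset()
    return frozenset.intersection(*(c.consent_scope for c in sources))


kitchen = DatasetCard("kitchen-manipulation", "guided-capture-v1",
                      frozenset({"training", "evaluation"}), True)
rooms = DatasetCard("indoor-3d-rooms", "device-gated-scan-v1",
                    frozenset({"evaluation"}), False)
pack_scope = derived_scope([kitchen, rooms])
```

Under this rule the combined pack inherits only the uses shared by both sources, which is the kind of check a buyer's legal team would want to see reflected in the dataset card.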

Primary references.

These are the source links used for the example catalog and the claims on this page.