Verifier — drone SAR correction loop

Select a rollout.

Legend: model flight · correction branch · survivor beacon · no-fly zone

tick 0 / 0

Corrections on this rollout

None yet.

Fleet — live operations

Four staged robots fly different policies over simulated video feeds. Watch them, take control when a flight goes wrong, and the takeover lands in the record store: the correction loop, entered from the ops floor.

MIRA 2v2 — four views, one world

Recorded 2v2 Rocket League from the kyutai/rocket-science corpus (General Intuition · Kyutai · Epic Games), played back the way MIRA §4.5 frames it: four synchronized perspectives of one shared world on a single master clock. Judge the play, trace an object across views, and export the segments as schema v0.2 records.
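As a rough illustration of what "four synchronized perspectives on a single master clock" can mean in practice, the sketch below looks up, for each view, the latest frame at or before a given master time. The view names, the per-view timestamp lists, and the function names are hypothetical; nothing here is taken from the corpus format or from MIRA itself.

```python
from bisect import bisect_right

def frame_at(timestamps, t):
    """Index of the latest frame at or before master time t, or None if no frame yet.

    Assumes `timestamps` is sorted ascending (hypothetical per-view frame times).
    """
    i = bisect_right(timestamps, t) - 1
    return i if i >= 0 else None

def synced_frames(views, t):
    """Map each view name to the frame index to show at master time t."""
    return {name: frame_at(ts, t) for name, ts in views.items()}

# Hypothetical example: two of the four views, with slightly offset frame times.
views = {
    "player_1": [0.0, 0.5, 1.0],
    "player_2": [0.1, 0.6],
}
print(synced_frames(views, 0.55))
```

Driving every view from one clock value, rather than advancing each stream independently, is what keeps an object traceable across perspectives.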

Judge which behavior you would rather see deployed. Choosing "Both bad" routes the segment to the takeover queue.

queue:

A

B

margin:

Takeover queue (from both-bad)

Empty.

Browser Trainer — close the loop

Corrections become a new checkpoint; the new checkpoint gets evaluated on held-out seeds; the eval attributes improvement by failure category. This panel is the half of the loop nobody builds.

No corrections yet — fly takeovers in Review, or synthesize expert corrections to simulate volume.

Before / after — held-out worlds, by failure category

Correction shelf life

Which corrections does the new checkpoint still visit? Stale corrections are the DAgger decay made visible — the reason this platform is a flow, not a dataset.
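One possible way to operationalize "still visits" is a proximity test: a correction counts as live if the new checkpoint's trajectory on the same world passes within some radius of the state where the correction was made. The sketch below assumes 2D positions, a `world_seed` key, and a radius threshold, all of which are illustrative assumptions rather than the platform's actual definition.

```python
import math

def still_visited(correction_pos, trajectory, radius=1.0):
    """True if any trajectory point lies within `radius` of the correction state."""
    return any(math.dist(correction_pos, p) <= radius for p in trajectory)

def shelf_life(corrections, trajectories_by_world, radius=1.0):
    """Split correction ids into (live, stale) for a new checkpoint.

    `corrections` items are hypothetical dicts with 'id', 'world_seed', 'pos'.
    `trajectories_by_world` maps world_seed -> list of positions flown by the
    new checkpoint on that world.
    """
    live, stale = [], []
    for c in corrections:
        traj = trajectories_by_world.get(c["world_seed"], [])
        (live if still_visited(c["pos"], traj, radius) else stale).append(c["id"])
    return live, stale

corrections = [
    {"id": "c1", "world_seed": 7, "pos": (2.0, 2.0)},
    {"id": "c2", "world_seed": 7, "pos": (9.0, 9.0)},
]
new_trajs = {7: [(0.0, 0.0), (1.5, 2.0), (3.0, 4.0)]}
live, stale = shelf_life(corrections, new_trajs)
```

Under this reading, a stale correction supervises a state the current policy no longer reaches, which is the decay the panel is meant to expose.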

Training runs

Training runs from the Python / Prime Intellect pipeline appear here once the pipeline has produced artifacts.

Runs Explorer — the platform at scale

Scale — 1M clips/sec and the storage split

Lake — the whole record store, visualized

Every pane is a live query against the real Vortex lake of sim rollouts. Requires the local sidecar (train/lake_server.py) — the deployed demo shows this surface offline by design.

Operator corps — quality, calibration, economics

Five simulated operators judge the same gold-standard pairs. Calibration decides each operator's export weight — and the value table asks the guild-vs-crowd question: is one veteran worth more than the crowd?
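The text does not say how calibration maps to export weight, so the following is only a sketch under a stated assumption: weight grows linearly from 0 at chance accuracy (0.5 on pairwise gold tasks) to 1 at perfect accuracy, and effective signal is judgments times weight. The function names and the formula are hypothetical, chosen to make the value-table columns concrete.

```python
def export_weight(gold_accuracy):
    """Assumed mapping: 0 at chance (0.5), 1 at perfect accuracy, never negative."""
    return max(0.0, 2.0 * gold_accuracy - 1.0)

def value_row(judgments, gold_accuracy, cost_units):
    """Compute the value-table columns for one operator under the assumed mapping."""
    effective = judgments * export_weight(gold_accuracy)
    return {
        "judgments": judgments,
        "effective_signal": effective,
        "cost_units": cost_units,
        "signal_per_cost": effective / cost_units,
    }

# Hypothetical numbers: one veteran versus one crowd worker.
veteran = value_row(judgments=100, gold_accuracy=1.0, cost_units=200)
crowd = value_row(judgments=100, gold_accuracy=0.75, cost_units=50)
```

With these made-up numbers the crowd worker yields more signal per cost unit (1.0 versus 0.5) even though the veteran yields more signal per judgment, which is the trade-off the guild-vs-crowd question turns on.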

Calibration (gold tasks)

operator	archetype	gold accuracy	export weight

Guild vs crowd

Value table

operator	judgments	effective signal	cost units	signal / cost

Organic play mining

Free-play runs from Review are auto-paired against model rollouts on the same world. Directed correction is performance under observation; organic play is the Medal thesis — capture people at their best because nobody's watching.
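Auto-pairing on the same world can be sketched as a join on a world identifier. The record shape (`id`, `world_seed`) and the function name below are assumptions for illustration; the actual record schema is not shown in this text.

```python
from collections import defaultdict

def pair_by_world(free_play, model_rollouts):
    """Pair each free-play run with every model rollout flown on the same world.

    Both inputs are lists of hypothetical dicts with 'id' and 'world_seed'.
    Returns a list of (free_play_id, model_rollout_id) tuples.
    """
    by_world = defaultdict(list)
    for r in model_rollouts:
        by_world[r["world_seed"]].append(r)
    pairs = []
    for fp in free_play:
        for m in by_world.get(fp["world_seed"], []):
            pairs.append((fp["id"], m["id"]))
    return pairs

free_play = [{"id": "fp1", "world_seed": 3}, {"id": "fp2", "world_seed": 8}]
model_rollouts = [{"id": "m1", "world_seed": 3}, {"id": "m2", "world_seed": 3}]
pairs = pair_by_world(free_play, model_rollouts)
```

Free-play runs on a world with no model rollout simply produce no pairs.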

No free-play rollouts yet — press F in Review and just fly.

Learned verifier — online Bradley-Terry over trajectory features

Trains live on your Compare judgments. Human preferences are the only supervision signal — this is the verifiable-domain loop in miniature. Same math as MIRA's Bayesian-Elo human studies (App. I): Bradley-Terry is the Elo model, run online instead of as a one-off study.
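A minimal sketch of an online Bradley-Terry update over trajectory features, assuming a linear score per rollout and one stochastic-gradient step per judgment. The two features, the learning rate, and the function names are illustrative assumptions, not the demo's actual feature set or optimizer.

```python
import math

def bt_prob(w, feat_a, feat_b):
    """P(A preferred over B) under Bradley-Terry with linear scores w . features."""
    diff = sum(wi * (a - b) for wi, a, b in zip(w, feat_a, feat_b))
    return 1.0 / (1.0 + math.exp(-diff))

def bt_update(w, feat_a, feat_b, a_wins, lr=0.1):
    """One gradient-ascent step on the log-likelihood of a single judgment."""
    p = bt_prob(w, feat_a, feat_b)
    y = 1.0 if a_wins else 0.0
    return [wi + lr * (y - p) * (a - b) for wi, a, b in zip(w, feat_a, feat_b)]

# Hypothetical features: (survivor found, no-fly-zone violations).
good, bad = (1.0, 0.0), (0.0, 2.0)
w = [0.0, 0.0]
for _ in range(50):
    # The human judge prefers the rollout that found the survivor cleanly.
    w = bt_update(w, good, bad, a_wins=True)
```

Because each judgment triggers one small update, the ranking can shift as soon as a preference is entered instead of waiting for a batch fit.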

Rollout ranking (by current verifier score)

#	rollout	checkpoint	outcome	score

Failure taxonomy by checkpoint

Auto events from the sim plus your annotations. This table is the seed of the eval loop: did corrections in a category reduce failures in that category on the next checkpoint?
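The question posed above reduces to a per-category comparison of failure rates between two checkpoints. The sketch below assumes failure events are dicts with a `category` key and that both checkpoints are evaluated on a known number of rollouts; the category names are made up for the example.

```python
from collections import Counter

def failure_rates(events, n_rollouts):
    """Failures per rollout for each category."""
    counts = Counter(e["category"] for e in events)
    return {cat: n / n_rollouts for cat, n in counts.items()}

def category_deltas(before, after):
    """after minus before, per category; negative means fewer failures."""
    cats = sorted(set(before) | set(after))
    return {c: after.get(c, 0.0) - before.get(c, 0.0) for c in cats}

# Hypothetical events from two checkpoints, four rollouts each.
old_events = [{"category": "no_fly_zone"}, {"category": "no_fly_zone"},
              {"category": "missed_beacon"}]
new_events = [{"category": "no_fly_zone"}, {"category": "missed_beacon"}]

deltas = category_deltas(failure_rates(old_events, 4), failure_rates(new_events, 4))
```

Here the hypothetical `no_fly_zone` rate drops by 0.25 per rollout while `missed_beacon` is unchanged, which is the shape of evidence needed to attribute an improvement to corrections in one category.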

Export — schema v0.2 records

Records are the source of truth; training formats are views. The bundle strips internal fields and carries a manifest.
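A sketch of what "strips internal fields and carries a manifest" could look like, assuming internal fields are marked by a leading underscore and that the manifest records the schema version, a record count, and a content hash. Both conventions and all field names are assumptions; the actual schema v0.2 layout is not given here.

```python
import hashlib
import json

def export_bundle(records, schema_version="0.2"):
    """Return (payload, manifest) for a list of record dicts.

    Assumption: keys starting with '_' are internal and must not be exported.
    The payload is JSON Lines with sorted keys so the hash is reproducible.
    """
    public = [{k: v for k, v in r.items() if not k.startswith("_")} for r in records]
    payload = "\n".join(json.dumps(r, sort_keys=True) for r in public)
    manifest = {
        "schema_version": schema_version,
        "record_count": len(public),
        "sha256": hashlib.sha256(payload.encode("utf-8")).hexdigest(),
    }
    return payload, manifest

records = [
    {"id": "r1", "outcome": "found", "_debug_trace": "internal only"},
    {"id": "r2", "outcome": "crashed"},
]
payload, manifest = export_bundle(records)
```

Any training format can then be derived from the payload as a view, with the manifest hash identifying exactly which records it was built from.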

VERIFIER v1.0

Rollouts


Annotate correction

The correction loop