GAUNTLET

Leaderboard

Quality and cost are separate axes. Quality % is measured against the frontier ceiling, or against the best-in-cohort score when the ceiling has not run. Voided runs (leakage or uncontrolled effort) are excluded from ranking.
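The scoring rule above can be sketched as follows. This is a minimal illustration only, not the project's scorer (which is not built yet). The `Run` record, its field names, and the `quality_percent` function are hypothetical. It also assumes that voided runs are left out of the best-in-cohort reference as well as the ranking, which the text does not state.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Run:
    """Hypothetical record for one scored run; field names are assumptions."""
    cell: str
    raw_score: float
    cost: float          # kept as a separate axis, never mixed into quality
    voided: bool = False  # e.g. leakage or uncontrolled effort


def quality_percent(runs: List[Run],
                    ceiling_score: Optional[float] = None) -> Dict[str, float]:
    """Return quality % per cell, relative to the ceiling or best-in-cohort."""
    # Voided runs are excluded before anything else is computed.
    valid = [r for r in runs if not r.voided]
    if not valid:
        return {}

    # Use the frontier ceiling when it has run; otherwise fall back to
    # the best score among the valid runs in the cohort.
    if ceiling_score is not None:
        reference = ceiling_score
    else:
        reference = max(r.raw_score for r in valid)

    if reference <= 0:
        return {}  # no meaningful reference to normalise against

    return {r.cell: 100.0 * r.raw_score / reference for r in valid}


runs = [
    Run("cell-a", raw_score=40.0, cost=1.0),
    Run("cell-b", raw_score=80.0, cost=2.0),
    Run("cell-c", raw_score=90.0, cost=0.5, voided=True),
]
cohort_view = quality_percent(runs)                        # no ceiling yet
ceiling_view = quality_percent(runs, ceiling_score=160.0)  # ceiling has run
```

In this sketch cost is carried on the record but never enters the quality figure, which matches the statement that the two are separate axes.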

No baseline runs yet

This board populates at Phase 3 (Baseline runs), once the cell-runner and scorer (Phase 2) are built and the first comparison runs in the sandbox. Nothing is faked here: real scores or nothing.

What will be scored — the rubric (0 dimensions)

The cells that will compete — reasonability bands

Tier | Agents | Candidate cells | Ceiling (100%)