Leaderboard
Quality and cost are separate axes. Quality % is measured against the frontier ceiling, or against the best-in-cohort score when the ceiling hasn't run. Voided runs (leakage or uncontrolled effort) are excluded from ranking.
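The ranking rule above can be sketched in a few lines. This is only an illustration under stated assumptions, not the board's actual scorer: the `Run` type, its field names, and the `rank` function are hypothetical, and it assumes Quality % is a simple ratio of a run's raw score to the reference score, with higher raw scores being better.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Run:
    # All field names are hypothetical, chosen for illustration only.
    cell: str             # identifier of the competing cell
    score: float          # raw rubric score (assumed: higher is better)
    cost: float           # kept as its own axis, never folded into quality
    voided: bool = False  # True for leakage or uncontrolled effort


def rank(runs: list[Run], ceiling_score: Optional[float] = None) -> list[tuple[str, float, float]]:
    """Return (cell, quality_pct, cost) rows sorted by quality, best first."""
    # Voided runs are excluded before anything else is computed.
    valid = [r for r in runs if not r.voided]
    if not valid:
        return []  # no valid runs: show nothing rather than a placeholder

    # Reference is the frontier ceiling if it has run, else best-in-cohort.
    reference = ceiling_score if ceiling_score is not None else max(r.score for r in valid)
    if reference <= 0:
        raise ValueError("reference score must be positive")

    # Assumed formula: quality % = 100 * score / reference.
    rows = [(r.cell, 100.0 * r.score / reference, r.cost) for r in valid]
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

Cost is carried through each row untouched, so a cheap low-quality cell and an expensive high-quality cell stay comparable on both axes rather than being merged into one number.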
No baseline runs yet
This board populates at Phase 3 (Baseline runs), once the cell-runner and scorer (Phase 2) are built and the first comparison runs in the sandbox. Nothing is faked here: real scores or nothing.
What will be scored — the rubric (0 dimensions)
The cells that will compete — reasonability bands
| Tier | Agents | Candidate cells | Ceiling (100%) |
|---|---|---|---|