POLYMARKET LAB
artbreguez@lab:~/polymarket-lab$ status --all --live
● F1 champion loading…
● TMAX champion loading…
● ECMWF backfill loading…
● TMAX tracking loading…
● F1 paper loading…
● Next GP loading…
updated every 6h · —
F1 Walk-Forward PnL
—
—
F1 ROI
—
multitask_qr r4
TMAX Gate PnL
—
—
TMAX Hit Rate
—
—
TMAX Live Positions
—
—
ECMWF Backfill
—
30 cities · 2602 markets
// next grand prix
● champion r4 ready
auto-scheduled
paper trading only
// process status
GFS Backfill — 30 cities
2602/2602 · complete
DONE
ECMWF Backfill (ifs025 + aifs025)
loading…
…
F1 Autoresearch v3
450 candidates · r4 champion held
DONE
TMAX Champion Publish + HF Upload
lgbm_emos → huggingface/artbreguez
DONE
TMAX Paper Trading (every 4h)
loading…
…
F1 Paper Trading (auto-scheduled)
loading…
AUTO
Post-GP Retrain Cron (Sundays)
auto sync + retrain + HF publish
ACTIVE
// F1 Lab / Champion Model
// F1 Lab / Autoresearch History
// F1 Lab / PnL History — Walk-Forward
// cumulative pnl by gp (r4 champion)
36 GPs
Unitized PnL (1 unit = 1 share). Multiply × bet_size for dollars.
[chart: cumulative PnL by GP; x-axis ticks 2024 R12, 2025 R1, 2025 R6]
// per-family cumulative pnl
| Family | Total PnL | ROI | Bets | Hit Rate |
|---|---|---|---|---|
| head_to_head | +$14.13 | 47% | 60 | — |
| race_winner | +$5.62 | 166% | 56 | — |
| driver_pole | +$2.16 | 257% | 28 | — |
| TOTAL | +$25.17 | 17.1% | 654 | — |
// F1 Lab / Retrain Pipeline
// TMAX Lab / PnL History — Historical Backtest
// champion model performance — 30 cities · 2,602 markets
tuned_ensemble
PnL is dollar-denominated (real market prices). Backtest uses walk-forward with stride=30 days. Quote-proxy PnL simulates the Polymarket bid/ask spread.
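A minimal sketch of how a walk-forward schedule with stride=30 days could be laid out. The function name `walk_forward_windows`, the date range, and the reading of "stride" as the length of each consecutive test window are assumptions for illustration; the lab's actual splitter is not shown on this page.

```python
from datetime import date, timedelta


def walk_forward_windows(start: date, end: date, stride_days: int = 30):
    """Split [start, end) into consecutive test windows of stride_days.

    Assumption: for each window the model is fit only on data dated
    before the window start, then scored on the window itself.
    """
    windows = []
    cutoff = start
    while cutoff < end:
        test_end = min(cutoff + timedelta(days=stride_days), end)
        windows.append((cutoff, test_end))
        cutoff = test_end
    return windows


# Hypothetical date range, not the lab's real backtest period.
windows = walk_forward_windows(date(2026, 1, 1), date(2026, 4, 1))
for test_start, test_end in windows:
    print(f"train: < {test_start}   test: [{test_start}, {test_end})")
```

Each window only ever sees training data from before its own start date, which is the property that keeps a walk-forward backtest free of look-ahead.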
+$381.67
Backtest PnL · 1,245 trades
+$138.89
Quote-proxy PnL (live simulation)
+$201.54
Recent-core gate · 218 trades
29.6%
Hit Rate
31.1%
Avg Edge
// BACKTEST PNL BY MODEL
lgbm_emos CHAMPION
+$381
gaussian_emos no live edge
+$461
lgbm_emos (v1)
+$206
det2prob_nn skip
+$25
// model comparison — historical backtest
stride=30

| Model | Backtest PnL | qp_PnL | Hit Rate | Trades | Brier | Status |
|---|---|---|---|---|---|---|
| tuned_ensemble (lgbm_emos) | +$381.67 | +$138.89 | 29.6% | 1,245 | 0.1137 | CHAMPION |
| gaussian_emos | +$461.85 | +$3.92 | 21.7% | 1,242 | 0.1119 | NO LIVE EDGE |
| lgbm_emos (v1) | +$206.73 | -$12.13 | 25.0% | — | — | RETIRED |
| det2prob_nn | +$25.76 | -$173.94 | 20.8% | 1,254 | 0.1176 | SKIP |
// why gaussian_emos loses despite higher backtest PnL
▸ qp_pnl is the true signal. Backtest assumes perfect execution at mid-price. Quote-proxy simulates the real Polymarket spread (2%). gaussian_emos: +$461 backtest vs +$3.92 qp_pnl; the edge evaporates at execution.
▸ tuned_ensemble has real edge. +$138.89 qp_pnl means consistently beating the spread. 29.6% hit rate on binary markets signals real mispricings.
▸ ECMWF upgrade pending. neighbor_spread feature (|ifs025 − aifs025|) expected to improve calibration. Auto-triggers at 100% backfill.
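The gap between backtest PnL and qp_pnl described above can be illustrated with a small sketch. It assumes the 2% spread is an absolute 0.02 in price and that a buyer pays the mid-price plus half the spread; both are assumptions, as are the function name `trade_pnl` and the toy trades, which are not real market data.

```python
def trade_pnl(outcome: int, mid_price: float, spread: float = 0.0) -> float:
    """PnL per share for buying one YES share of a binary market.

    Assumption: entry is mid_price + spread / 2 (crossing half the
    spread); the share pays 1.0 if outcome == 1, otherwise 0.0.
    """
    entry = mid_price + spread / 2.0
    return float(outcome) - entry


# Hypothetical trades as (outcome, mid price). Not real market data.
trades = [(1, 0.20), (0, 0.20), (0, 0.20), (0, 0.20), (1, 0.20)]

# Backtest-style PnL: perfect execution at mid-price.
backtest_pnl = sum(trade_pnl(o, p) for o, p in trades)

# Quote-proxy-style PnL: same trades, but paying half of a 0.02 spread.
quote_proxy_pnl = sum(trade_pnl(o, p, spread=0.02) for o, p in trades)

print(f"backtest PnL:    {backtest_pnl:+.2f}")
print(f"quote-proxy PnL: {quote_proxy_pnl:+.2f}")
```

Under these assumptions every trade gives up the same fixed cost, so a model with many trades and a thin per-trade edge loses most of its backtest PnL once the spread is charged.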
// recent-core gate — 3 city validation
GO ✓

| City | Trades | PnL | Hit Rate | Gate |
|---|---|---|---|---|
| Atlanta | 76 | +$162.45 | — | PASS ✓ |
| Buenos Aires | 76 | +$28.84 | — | PASS ✓ |
| Dallas | 66 | +$10.25 | — | PASS ✓ |
| AGGREGATE | 218 | +$201.54 | 20.2% | GO ✓ |
// TMAX Lab / Benchmark History
// model evolution — versions & decisions
Unlike F1 (neural network hill-climbing), TMAX uses LGBM + EMOS statistical calibration. Each version is benchmarked on the full historical backtest and a recent-core gate (3 cities, last ~3 months). Champion auto-promoted when qp_pnl > 0 and gate GO.
▸ v1 (lgbm_emos): GFS-only, 20 cities. +$206 backtest PnL. Baseline established April 2026.
▸ v2 (tuned_ensemble): GFS backfill to 30 cities + tuned head weights. +$381 backtest, +$138 qp_pnl. Champion since 2026-04-29.
▸ v3 (pending): ECMWF IFS025 + AIFS025 neighbor_spread feature. Retrain auto-triggers at 100% ECMWF backfill.
// full leaderboard — all candidates
historical_real · stride=30

| Model | Backtest PnL | qp_PnL | Hit Rate | Brier | Trades | Score |
|---|---|---|---|---|---|---|
| tuned_ensemble | +$381.67 | +$138.89 | 29.6% | 0.1137 | 1,245 | 1.85 |
| gaussian_emos | +$461.85 | +$3.92 | 21.7% | 0.1119 | 1,242 | 1.15 |
| lgbm_emos (v1) | +$206.73 | -$12.13 | 25.0% | — | — | — |
| det2prob_nn | +$25.76 | -$173.94 | 20.8% | 0.1176 | 1,254 | 0.0 |
// promotion gate criteria
| Criterion | Threshold | Rationale |
|---|---|---|
| qp_pnl | > $0 | Must beat spread in live simulation |
| recent-core gate | GO (3 cities pass) | Validates on most recent market data |
| hit_rate | > 18% | Minimum statistical signal |
| sample_adequacy | passed | Enough trades for significance |
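The four criteria in the table above could be combined into a single check along these lines. This is a sketch only: the `Candidate` dataclass, its field names, and `passes_promotion_gate` are hypothetical, and the gate fields for det2prob_nn are placeholders because this page does not report them.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    qp_pnl: float             # quote-proxy PnL in dollars
    hit_rate: float           # fraction in [0, 1]
    gate_cities_passed: int   # cities passing the recent-core gate
    sample_adequate: bool     # result of the sample-adequacy check


def passes_promotion_gate(c: Candidate,
                          required_cities: int = 3,
                          min_hit_rate: float = 0.18) -> bool:
    """Return True only if every criterion in the table holds."""
    return (
        c.qp_pnl > 0
        and c.gate_cities_passed >= required_cities
        and c.hit_rate > min_hit_rate
        and c.sample_adequate
    )


# qp_pnl and hit rate are taken from the leaderboard on this page.
champion = Candidate("tuned_ensemble", 138.89, 0.296, 3, True)
# Gate fields below are placeholders; the page does not list them.
rejected = Candidate("det2prob_nn", -173.94, 0.208, 0, True)

for cand in (champion, rejected):
    print(cand.name, "PROMOTE" if passes_promotion_gate(cand) else "HOLD")
```

Because the criteria are joined with `and`, a negative qp_pnl alone is enough to block promotion regardless of the other fields.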
// next research directions
▸ ECMWF neighbor_spread. |ifs025 − aifs025_single| as uncertainty signal. Improves calibration on high-variance days.
▸ Expand to 40+ cities. London, NYC, Paris have higher Polymarket volume. Currently 30 covered.
▸ Horizon-specific models. morning_of vs market_open have different dynamics. Splitting may improve edge consistency.
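The neighbor_spread feature from the first direction above is an absolute difference between two forecasts. A minimal sketch, assuming both inputs are daily maximum temperature forecasts in the same units for the same city and date; the function signature and the sample values are hypothetical.

```python
def neighbor_spread(ifs025_tmax: float, aifs025_tmax: float) -> float:
    """Absolute disagreement between the IFS025 and AIFS025 forecasts.

    Assumption: both values are TMAX forecasts in the same units for
    the same city and target date.
    """
    return abs(ifs025_tmax - aifs025_tmax)


# Hypothetical forecast pairs (ifs025, aifs025), not real model output.
forecast_pairs = [(31.5, 29.0), (18.0, 18.0), (24.0, 26.5)]
spreads = [neighbor_spread(a, b) for a, b in forecast_pairs]
print(spreads)
```

A larger value means the two models disagree more, which is the sense in which the feature is meant to act as an uncertainty signal.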
// POLYQUANT · QUANTITATIVE RESEARCH TERMINAL
// UPCOMING EVENTS
🏎 CANADIAN GP — 22–24 MAY
🌡 TMAX scoring — loading…
Next retrain: post-Canadian GP
// TMAX Lab / Champion Model
tuned_ensemble (lgbm_emos) [alias: champion]
promoted: 2026-04-29 14:51 UTC · gate: GO ✓ · 3 cities passed · 30 cities covered
-$5479.37
Backtest PnL
-$48119.07
qp_pnl (live sim)
+$201.54
gate aggregate
0.1070
Brier Score
43.9%
Hit Rate
30
Coverage
// champion model — full metrics
LIVE · 2026-04-29

| Metric | Value |
|---|---|
| Model | tuned_ensemble (lgbm_emos) |
| Backtest PnL | +$381.67 |
| Backtest Trades | 1820 |
| Hit Rate | 29.6% |
| Avg Edge | 31.1% |
| Brier Score | 0.1137 |
| Quote-Proxy PnL | +$138.89 |
| Gate Decision | GO ✓ |
| Gate PnL | +$201.54 |
| Gate Trades | 218 |
| Markets Covered | 2,602 |
| Cities | 30 |
| Published | 2026-04-29 |
// model leaderboard
historical backtest

| Model | PnL | qp_pnl | Brier | Trades | Status |
|---|---|---|---|---|---|
| lgbm_emos (tuned_ensemble) | +$381 | +$138 | 0.1137 | 1245 | CHAMPION |
| gaussian_emos | +$461 | +$4 | 0.1119 | 1242 | NO EDGE |
| det2prob_nn | +$25 | -$173 | 0.1176 | 1254 | SKIP |
// recent-core gate — city results
GO

| City | Trades | PnL | ok_ratio | Gate |
|---|---|---|---|---|
| Atlanta | 76 | +$162.45 | 100% | PASS ✓ |
| Buenos Aires | 76 | +$28.84 | 100% | PASS ✓ |
| Dallas | 66 | +$10.25 | 100% | PASS ✓ |
// key metric: qp_pnl > brier
! gaussian_emos has better Brier (0.1119) but qp_pnl ≈ $4. A well-calibrated model that matches Polymarket prices offers no tradeable edge. Always select champion by quote_proxy_pnl, not raw Brier score.
▸ London and NYC failed the gate (negative PnL, z-scores of -5.57 and -8.08), which is too large to be noise. Recent-core validation now uses Atlanta, Buenos Aires, and Dallas. Dallas replaced Madrid in May 2026 after Madrid showed persistent negative PnL (-$44) in recent benchmarks.
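The selection rule above can be made concrete with the leaderboard values from this page. The `brier_score` helper uses the standard definition (mean squared error between forecast probabilities and 0/1 outcomes); the dictionary layout and variable names are illustrative assumptions, not the lab's actual code.

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


# qp_pnl and Brier values copied from the leaderboard on this page.
leaderboard = {
    "tuned_ensemble": {"qp_pnl": 138.89, "brier": 0.1137},
    "gaussian_emos": {"qp_pnl": 3.92, "brier": 0.1119},
    "det2prob_nn": {"qp_pnl": -173.94, "brier": 0.1176},
}

# Lower Brier is better; higher qp_pnl is better.
best_by_brier = min(leaderboard, key=lambda m: leaderboard[m]["brier"])
best_by_qp_pnl = max(leaderboard, key=lambda m: leaderboard[m]["qp_pnl"])

print("best by Brier: ", best_by_brier)
print("best by qp_pnl:", best_by_qp_pnl)
```

The two rankings disagree on these numbers: the lowest Brier score belongs to gaussian_emos, while the highest qp_pnl belongs to tuned_ensemble.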
// TMAX Lab / Retrain Pipeline
// Ops / VPS Recovery Guide
// Ops / Cron Registry