
Model Health

Live calibration drift. Aggregated across all FreshLoop users, anonymized. If the numbers here drift, we retrain.

What do these numbers mean?

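Brier score measures how accurate our probabilistic predictions are: it is the mean squared difference between the predicted probability and the actual outcome (1 if it happened, 0 if not). Lower is better.
  • 0.00 = perfect (every prediction was fully confident and correct)
  • 0.14-0.18 = well-calibrated professional model (our target)
  • 0.25 = same as always predicting 50% (no better than flipping a coin)
  • 0.30+ = model is broken / needs retrain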
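As a rough illustration of where these reference points come from, here is a minimal sketch of the standard Brier score for binary outcomes. It is not FreshLoop's actual pipeline; the function name `brier_score` and the sample inputs are assumptions made for this example.

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes.

    Hypothetical helper for illustration; assumes binary outcomes
    (1 = the predicted event happened, 0 = it did not).
    """
    if not probs or len(probs) != len(outcomes):
        raise ValueError("need two non-empty sequences of equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


# Fully confident and always right: the perfect score.
perfect = brier_score([1.0, 0.0], [1, 0])

# Always predicting 50%: the coin-flip baseline.
coin_flip = brier_score([0.5, 0.5], [1, 0])
```

Always predicting 50% gives a squared error of 0.25 on every sample regardless of the result, which is why 0.25 is the coin-flip baseline.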
Bias (pp) is how much the model over- or under-predicts on average, in percentage points.
  • 0 pp = predictions match reality exactly, on average
  • +5 pp = we predict ~5 percentage points higher than actual (overrate teams)
  • -5 pp = we predict ~5 percentage points lower than actual (underrate teams)
  • |bias| > 10 pp = time to retrain
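One way to read the bias figure is as the average predicted probability minus the observed outcome rate, scaled to percentage points. The sketch below assumes that definition and binary outcomes; `bias_pp` and `needs_retrain` are hypothetical names, and the retrain check simply restates the thresholds listed above rather than FreshLoop's actual retraining logic.

```python
def bias_pp(probs, outcomes):
    """Mean predicted probability minus observed rate, in percentage points.

    Hypothetical helper; positive means the model over-predicts on average.
    """
    if not probs or len(probs) != len(outcomes):
        raise ValueError("need two non-empty sequences of equal length")
    n = len(probs)
    return 100.0 * (sum(probs) / n - sum(outcomes) / n)


def needs_retrain(brier, bias, brier_limit=0.30, bias_limit_pp=10.0):
    """Apply the thresholds from this page: Brier 0.30+ or |bias| > 10 pp."""
    return brier >= brier_limit or abs(bias) > bias_limit_pp


# Predicting 60% for events that happen half the time: about +10 pp.
example_bias = bias_pp([0.6, 0.6, 0.6, 0.6], [1, 1, 0, 0])
```

A model can have near-zero bias and still score poorly on Brier (errors in both directions cancel out in the average), so the two numbers are worth reading together.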
Data source: anonymized prediction-outcome pairs from every FreshLoop user whose daemon is set to share calibration data (opt-in, default on). No user IDs are stored with these samples. See privacy policy.

Want to see individual match results?

View Track Record