Model Health

Live calibration drift, aggregated and anonymized across FreshLoop users who share calibration data. If these numbers move outside the target ranges below, we retrain the model.


What do these numbers mean?

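Brier score measures how accurate our probabilistic predictions are. It is the mean squared difference between the predicted probability and the actual outcome (1 if it happened, 0 if not), so lower is better.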

· 0.00 = perfect (every prediction was 100% confident and correct)
· 0.14-0.18 = well-calibrated professional model (our target)
· 0.25 = same as always predicting 50% (a coin flip)
· 0.30+ = model is broken / needs retrain
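
To make the scale above concrete, here is a minimal sketch of a Brier score calculation for binary outcomes. The function name and the sample inputs are illustrative assumptions, not FreshLoop's actual pipeline.

```python
from typing import Sequence


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared difference between predicted probabilities and 0/1 outcomes.

    Hypothetical helper for illustration; not FreshLoop's production code.
    """
    if len(probs) != len(outcomes) or len(probs) == 0:
        raise ValueError("need equally sized, non-empty inputs")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


# Always predicting 50% scores 0.25, whatever the outcomes turn out to be.
coin_flip = brier_score([0.5, 0.5, 0.5, 0.5], [1, 0, 0, 1])

# Fully confident and always right scores 0.0.
perfect = brier_score([1.0, 0.0, 1.0], [1, 0, 1])
```

Because each error is squared, one confident miss costs more than several mildly off predictions, which is why a model that hedges sensibly lands below the 0.25 coin-flip line.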

Bias (pp) is how much the model over- or under-predicts on average.

· 0 pp = predictions match reality exactly, on average
· +5 pp = we predict about 5 percentage points higher than actual (we overrate teams)
· -5 pp = we predict about 5 percentage points lower than actual (we underrate teams)
· |bias| > 10 pp = time to retrain
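
The bias figure can be sketched as the average predicted probability minus the observed hit rate, expressed in percentage points. This is an assumed formulation consistent with the description above; the function names are hypothetical, and only the 10 pp threshold comes from this page.

```python
from typing import Sequence

# Retrain threshold taken from the list above; everything else is illustrative.
RETRAIN_BIAS_PP = 10.0


def mean_bias_pp(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Average prediction minus observed frequency, in percentage points."""
    if len(probs) != len(outcomes) or len(probs) == 0:
        raise ValueError("need equally sized, non-empty inputs")
    n = len(probs)
    mean_predicted = sum(probs) / n
    observed_rate = sum(outcomes) / n
    return (mean_predicted - observed_rate) * 100.0


def bias_needs_retrain(bias_pp: float) -> bool:
    """True when the bias exceeds the threshold in either direction."""
    return abs(bias_pp) > RETRAIN_BIAS_PP


# Predictions average 65% but only half the events happened:
# a positive bias, meaning the model overrates.
bias = mean_bias_pp([0.6, 0.7], [1, 0])
flag = bias_needs_retrain(bias)
```

A positive result means the model overrates on average and a negative one means it underrates; the retrain check looks only at the magnitude.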

Data source: anonymized prediction-outcome pairs from every FreshLoop user whose daemon is set to share calibration data (sharing is on by default and can be turned off). No user IDs are stored with these samples. See privacy policy.

Want to see individual match results?

View Track Record