The hypothesis
A three-leg ensemble (xG from a shot-quality model, an Elo-based pace-of-play adjustment, and XGBoost over engineered features) consistently beats any single component at predicting top-5 European league outcomes. The sample is small, but the SHA-256 commitments make every retrospective claim auditable.
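The simplest way to picture the ensemble is a weighted average of the three legs' (home, draw, away) probability vectors. The weights below are illustrative placeholders, not the tuned values:

```python
def blend(p_xg, p_elo, p_xgb, weights=(0.4, 0.2, 0.4)):
    """Weighted average of three (home, draw, away) probability vectors.

    p_xg, p_elo, p_xgb are length-3 lists summing to ~1.0; the weights
    here are illustrative, not the values actually used in the pipeline.
    """
    legs = (p_xg, p_elo, p_xgb)
    mixed = [sum(w * leg[i] for w, leg in zip(weights, legs)) for i in range(3)]
    total = sum(mixed)  # renormalise in case the inputs drift from 1.0
    return [x / total for x in mixed]

# Three legs that disagree on how likely the home win is:
p = blend([0.50, 0.28, 0.22], [0.44, 0.30, 0.26], [0.55, 0.25, 0.20])
```

In practice the XGBoost leg can also sit on top as a stacker rather than a peer, but the flat blend is the honest baseline to beat.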
Stack & architecture
- Python for the modeling pipeline (xG model, Elo updates, XGBoost ensemble).
- FastAPI for the inference service.
- PostgreSQL for match data + commitment ledger.
- Next.js for the frontend at https://predictor.nullshift.sh.
- Probabilities for each upcoming match are written to the ledger and hashed; the hash is posted publicly before kickoff. After the match, the raw probabilities are revealed and verified against the published hash.
What I learned
XGBoost is forgiving on small samples if your features are honest. The Elo component matters most for teams with sparse recent-form data. Honest probability commitment (SHA-256 before kickoff) makes the accuracy log worth keeping; without it, the temptation to retroactively explain misses poisons the project.
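The Elo leg's value on sparse data comes from how little it needs: a rating pair and a K-factor. A minimal sketch of the standard logistic Elo expectation and update, where the 60-point home advantage and K=20 are illustrative assumptions, not the tuned values:

```python
def elo_expected(r_home, r_away, home_adv=60):
    """Expected score for the home side on the standard logistic Elo curve.

    home_adv is an illustrative offset in rating points, not the tuned value.
    """
    return 1.0 / (1.0 + 10 ** ((r_away - (r_home + home_adv)) / 400))

def elo_update(r_home, r_away, score_home, k=20):
    """Update both ratings after a match.

    score_home is 1.0 for a home win, 0.5 for a draw, 0.0 for a loss.
    K=20 is a common club-football choice, assumed here for illustration.
    """
    expected = elo_expected(r_home, r_away)
    delta = k * (score_home - expected)
    return r_home + delta, r_away - delta

# Evenly rated sides: home advantage alone tilts the expectation above 0.5.
e = elo_expected(1500, 1500)
```

Because each match moves the rating by at most K points, the estimate degrades gracefully when a team has only a handful of recent results, which is exactly where per-match feature models starve.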