The hypothesis
A three-leg ensemble (xG from a shot-quality model, an Elo-based pace-of-play adjustment, and XGBoost over engineered features) consistently beats any single component at predicting top-5 European league outcomes. The sample is small, but the SHA-256 commitments make every retrospective claim auditable.
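The simplest way to picture the ensemble is a weighted average of the three legs' (home, draw, away) probability vectors. The weights below are illustrative placeholders, not the tuned values:

```python
def blend(p_xg, p_elo, p_xgb, weights=(0.4, 0.2, 0.4)):
    """Weighted average of three (home, draw, away) probability vectors.

    p_xg, p_elo, p_xgb are length-3 lists summing to ~1.0; the weights
    here are illustrative, not the values actually used in the pipeline.
    """
    legs = (p_xg, p_elo, p_xgb)
    mixed = [sum(w * leg[i] for w, leg in zip(weights, legs)) for i in range(3)]
    total = sum(mixed)  # renormalise in case the inputs drift from 1.0
    return [x / total for x in mixed]

# Three legs that disagree on how likely the home win is:
p = blend([0.50, 0.28, 0.22], [0.44, 0.30, 0.26], [0.55, 0.25, 0.20])
```

In practice the XGBoost leg can also sit on top as a stacker rather than a peer, but the flat blend is the honest baseline to beat.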
Stack & architecture
- Python for the modeling pipeline (xG model, Elo updates, XGBoost ensemble).
- FastAPI for the inference service.
- PostgreSQL for match data + commitment ledger.
- Next.js for the frontend at https://predictor.nullshift.sh.
- Probabilities for each upcoming match are written to the ledger and hashed; the hash is posted publicly before kickoff. After the match, the raw probabilities are revealed and verified against the published hash.
What I learned
XGBoost is forgiving on small samples if your features are honest. The Elo component matters most for teams with sparse recent-form data. Honest probability commitment (SHA-256 before kickoff) makes the accuracy log worth keeping; without it, the temptation to retroactively explain misses poisons the project.
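The Elo leg's value on sparse data comes from how little it needs: a rating pair and a K-factor. A minimal sketch of the standard logistic Elo expectation and update, where the 60-point home advantage and K=20 are illustrative assumptions, not the tuned values:

```python
def elo_expected(r_home, r_away, home_adv=60):
    """Expected score for the home side on the standard logistic Elo curve.

    home_adv is an illustrative offset in rating points, not the tuned value.
    """
    return 1.0 / (1.0 + 10 ** ((r_away - (r_home + home_adv)) / 400))

def elo_update(r_home, r_away, score_home, k=20):
    """Update both ratings after a match.

    score_home is 1.0 for a home win, 0.5 for a draw, 0.0 for a loss.
    K=20 is a common club-football choice, assumed here for illustration.
    """
    expected = elo_expected(r_home, r_away)
    delta = k * (score_home - expected)
    return r_home + delta, r_away - delta

# Evenly rated sides: home advantage alone tilts the expectation above 0.5.
e = elo_expected(1500, 1500)
```

Because each match moves the rating by at most K points, the estimate degrades gracefully when a team has only a handful of recent results, which is exactly where per-match feature models starve.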