I Pitted XGBoost Against Logistic Regression on 358 Matches. The Boring Model Won.
A practical experiment comparing five classifiers (logistic regression, random forest, KNN, neural network, XGBoost) on 358 international football matches reveals that logistic regression wins on log-loss while XGBoost performs worse than random guessing. The explanation centers on the bias-variance tradeoff: with only 358 samples and three features, high-capacity models overfit and produce confidently wrong probabilities, which log-loss penalizes heavily. The post explains why logistic regression's inductive bias matches the near-linear relationship in the data, discusses how to rescue tree-based models via regularization, and offers learning curves as a diagnostic for when complex models become worth it. The key takeaway is to match model complexity to data size, not to hype.
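To make the log-loss argument concrete, here is a minimal sketch in plain Python. It is not the code or data from the experiment: the labels and probability tables are made-up toy values, and the three-class setup (home win, draw, away win) is an assumption for illustration. It compares a uniform guesser, a cautious model, and an overconfident model that pick the same winners but spread their probabilities differently.

```python
import math

def log_loss(y_true, probs, eps=1e-15):
    """Mean negative log of the probability assigned to the true class."""
    total = 0.0
    for label, p in zip(y_true, probs):
        # Clip so a predicted probability of exactly 0 does not give log(0).
        p_true = min(max(p[label], eps), 1.0 - eps)
        total += -math.log(p_true)
    return total / len(y_true)

# Hypothetical toy labels (assumed encoding): 0 = home win, 1 = draw, 2 = away win.
y_true = [0, 1, 2, 0]

# Baseline: equal probability for every outcome.
uniform = [[1 / 3, 1 / 3, 1 / 3] for _ in y_true]

# Cautious model: right on the first three matches, wrong on the last,
# but never far from the middle.
cautious = [
    [0.5, 0.3, 0.2],
    [0.3, 0.4, 0.3],
    [0.2, 0.3, 0.5],
    [0.3, 0.4, 0.3],
]

# Overconfident model: same picks as the cautious one, but near-certain
# every time, including on the match it gets wrong.
overconfident = [
    [0.98, 0.01, 0.01],
    [0.01, 0.98, 0.01],
    [0.01, 0.01, 0.98],
    [0.01, 0.98, 0.01],
]

for name, probs in [("uniform", uniform),
                    ("cautious", cautious),
                    ("overconfident", overconfident)]:
    print(f"{name:14s} log-loss = {log_loss(y_true, probs):.4f}")
```

Both non-uniform models get three of four matches right, yet the overconfident one scores worse than the uniform guesser because its single near-certain miss dominates the average. That is the mechanism by which a high-capacity model can be reasonably accurate and still lose to random guessing on log-loss.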