NCAA Basketball March Mania 2023
The goal of this project is to train a model that predicts win probabilities for March Madness tournament games
based on game history from previous NCAA seasons.
Project Details / Background
The NCAA Division I men's basketball tournament is branded as NCAA March Madness and commonly called March Madness. The tournament was first held in 1939
and now consists of 68 teams competing in seven rounds of a single-elimination bracket.
The NCAA project has been ongoing for six years, since 2018. Previous analysts and researchers have
contributed many important findings and insights through exploratory data analysis and statistical analysis.
Based on these previous results and statistics, I selected several important features that are correlated with game outcomes.
Related Features
Processed dataset sample
Most of the features are computed from game statistics as percentage-based metrics. Combined with objective team ratings
from the internet, 10 different features were used for model building. Model evaluation was then performed on the processed dataset. XGBoost gave the
best results when previous years' results were used as folds in K-fold cross-validation. Finally, some manual override predictions were made to polarize the predicted
probabilities for a better log-loss score. The model predicted the results for 2023 and
achieved a Brier score of 0.18795 (baseline: 0.25). The code can be found on GitHub.
Credit: March Machine Learning Mania 2023
Image Gallery
XGBoost Importance Plot
XGBoost Explanation Plot