I’m a real baller, so I tried to understand the game that I play on the court and watch on television. That was the motivation for this fun little experiment. Sure, the game isn’t that unpredictable, but I was amazed how easy it is to obtain good results with the simplest AI models.
Result prediction is a very interesting field in every sport. Based on statistics and hidden dependencies, an artificial intelligence can learn to predict a final score. Current news shows how relevant this topic is: Warren Buffett will pay a billion dollars to anyone who provides a complete and correct bracket for the upcoming March Madness event in US college basketball.
A look through the sports betting landscape reveals other platforms that offer score prediction, such as vitibet.com and scorepredictor.net. These tools predict the winner of a game based on statistics. Unfortunately, I had no accuracy figures for their algorithms and no time to play around with them long enough to give a qualified assessment. So I started with some feature ideas of my own and their preprocessing. The first results are listed below, together with the next steps and future plans for this project.
I retrieved data on NBBL games (a German youth league) from nbbl-basketball.de. For each team, I collected the data manually. It was simple, meditative work. After about 2 hours of wading through statistics, the data was collected and, after some preprocessing steps (relations between statistics), ready for some tests.
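The "relations between statistics" are not spelled out here, so the following is only a sketch of what such a preprocessing step could look like: raw box-score counts turned into ratios. The field names and the chosen ratios are assumptions for illustration, not the features actually used in the experiment.

```python
def relation_features(stats):
    """Turn raw per-game counts into ratio features (hypothetical choice of ratios)."""
    def ratio(numerator, denominator):
        # Guard against division by zero, e.g. a team without any attempts.
        return numerator / denominator if denominator else 0.0

    return {
        "fg_pct": ratio(stats["fg_made"], stats["fg_attempted"]),
        "ft_pct": ratio(stats["ft_made"], stats["ft_attempted"]),
        "ast_to_ratio": ratio(stats["assists"], stats["turnovers"]),
    }


# Example box score with made-up numbers (not real NBBL data).
game = {
    "fg_made": 30, "fg_attempted": 60,
    "ft_made": 15, "ft_attempted": 20,
    "assists": 18, "turnovers": 12,
}
features = relation_features(game)
print(features)
```

Ratios like these put teams with different numbers of possessions on a comparable scale, which is one plausible reason to relate statistics to each other before training.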
The first task was to determine the winner and loser of a basketball game from its result and statistics using a neural network. I kept tuning the configuration of the neural net for as long as the results improved. Some feature improvements using PCA (Principal Component Analysis) and GLVQ (Generalized Learning Vector Quantization) brought more accurate results. I divided the data into a training set and a test set at an 80/20 ratio. The numbers below are accuracies: 91.62%, for example, means that 91.62% of all games were predicted correctly.
- 88.94% … without preprocessing
- 90.02% … with preprocessing
- 91.62% … baseline feature set
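To make the evaluation procedure concrete, here is a minimal sketch of an 80/20 split followed by an accuracy measurement. It is not the network used above: a single sigmoid neuron (i.e. logistic regression) stands in for the neural net, PCA and GLVQ are left out, and the data is synthetic. All function names and numbers are assumptions for illustration.

```python
import math
import random


def split_games(rows, test_ratio=0.2, seed=0):
    """Shuffle the rows and split them into a training and a test part."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(round(len(rows) * (1 - test_ratio)))
    return rows[:cut], rows[cut:]


def predict(weights, bias, x):
    """Single sigmoid neuron: estimated probability that the home team wins."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    z = max(-60.0, min(60.0, z))  # clamp so exp() cannot overflow
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, epochs=200, lr=0.1):
    """Plain stochastic gradient descent on the log loss."""
    weights, bias = [0.0] * len(rows[0][0]), 0.0
    for _ in range(epochs):
        for x, y in rows:
            error = predict(weights, bias, x) - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def accuracy(weights, bias, rows):
    """Share of games whose winner was predicted correctly."""
    hits = sum((predict(weights, bias, x) >= 0.5) == (y == 1) for x, y in rows)
    return hits / len(rows)


# Synthetic stand-in data: two "home minus away" difference features per game,
# label 1 if the home team won. Generated numbers, not real NBBL data.
rng = random.Random(42)
games = []
for _ in range(200):
    x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    y = 1 if x[0] + 0.5 * x[1] + rng.gauss(0, 0.2) > 0 else 0
    games.append((x, y))

train_rows, test_rows = split_games(games)
weights, bias = train(train_rows)
test_accuracy = accuracy(weights, bias, test_rows)
print(f"test accuracy: {test_accuracy:.2%}")
```

The important detail is that the accuracy is computed only on the 20% of games the model never saw during training. Otherwise the number would say more about memorization than about prediction.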
After this experiment, I had a simple predictor that estimates basketball games with a good baseline accuracy (91.62%). The second experiment was a bit more complex. The goal was to find the winner of a match between teams A and B when the two have never played each other before, taking each team's current form and trend into account. More colloquially, I wanted the neural network to learn:
- Dependency of the last games played
- Influence of trend / form over the season
- Time-relevant facts (recurrent neural nets)
Not all of these ideas made it into the first attempt, but most did, and I ended up with a network that predicts the game winner using the stats of the last 3 games plus preprocessed features (which are still secret). The aim wasn’t to reach a higher accuracy (than 91.62%) but rather to find out which of these statistics affect the result and how they affect the prediction. The first results are outlined below:
- 84.67% … without time features
- 88.13% … with time features
This suggests that features describing a team's past games and its corresponding trend/form affect its chance to win or lose. Sure, this isn’t a new insight, but here it was observed in a statistical experiment, and the machine learned it to better predict the result of a game.
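Since the actual features are still secret, the following is only a sketch of how "time features" over the last 3 games could be built. The dictionary keys and the two form indicators (win rate and average point margin) are assumptions chosen for illustration.

```python
from collections import defaultdict, deque


def add_form_features(games, window=3):
    """Attach both teams' form over their previous `window` games to each game.

    `games` must be in chronological order. Each game is a dict with the
    hypothetical keys "home", "away", "home_pts" and "away_pts".
    """
    history = defaultdict(lambda: deque(maxlen=window))  # team -> recent margins
    rows = []
    for g in games:
        row = dict(g)
        for side in ("home", "away"):
            recent = history[g[side]]
            # Only earlier games are used, so the result to predict never leaks in.
            row[f"{side}_games_seen"] = len(recent)
            row[f"{side}_win_rate"] = (
                sum(m > 0 for m in recent) / len(recent) if recent else 0.0
            )
            row[f"{side}_avg_margin"] = sum(recent) / len(recent) if recent else 0.0
        rows.append(row)
        # Update the history only after this game's features were built.
        margin = g["home_pts"] - g["away_pts"]
        history[g["home"]].append(margin)
        history[g["away"]].append(-margin)
    return rows


# Made-up mini season in chronological order (not real NBBL results).
season = [
    {"home": "A", "away": "B", "home_pts": 80, "away_pts": 70},
    {"home": "B", "away": "A", "home_pts": 60, "away_pts": 75},
    {"home": "A", "away": "C", "home_pts": 65, "away_pts": 70},
    {"home": "C", "away": "A", "home_pts": 90, "away_pts": 85},
    {"home": "A", "away": "B", "home_pts": 77, "away_pts": 70},
]
rows = add_form_features(season)
print(rows[-1])
```

Because each team's history is keyed by the team alone, the features are also available when A and B have never met before. That is the situation the second experiment is about.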
This project can be an interesting journey through the statistical patterns in sports, not only basketball. I plan to continue it (if time allows) to develop more features and a more accurate model. A possible result could be scouting software based on probabilistic methods, or betting software to earn enough money to develop other things ;). Either way, for this and other prospects, a larger dataset is needed to train the neural nets better.