I have always subscribed to the idea that the best way to learn is to just pick a project and begin working. Each issue that inevitably comes up is another problem to solve and another lesson to learn. A couple of weeks ago, I decided to try my hand at developing an NBA draft model. I don’t expect to produce anything groundbreaking, but I do see it as an intriguing challenge to tackle. I also thought it would be interesting to keep an ongoing blog documenting my progress, trials, and tribulations. Even if nobody else reads this, it can serve as my journal for posterity. There is also some small comfort in sending this off into the void that is the internet.
Anyways, let’s get started.
- Collect several seasons of NCAA Division I men’s college basketball box score and advanced statistics data
- Develop a model (or several) for projecting NBA success based on this NCAA data
- For now, NBA success will be defined by performance during each player’s first 4 seasons
- Compare the model’s ranking of prospects against the real-life NBA draft order
- Project forward by applying this model to the incoming 2019 NBA Draft class
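As a rough illustration of the comparison step above, one simple option is Spearman's rank correlation between the actual draft order and a model's ranking. This is only a sketch under stated assumptions: the metric is a stand-in rather than a settled choice for this project, the prospects and ranks are made-up placeholders, and `spearman_rho` is a hypothetical helper name.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two tie-free rankings of the same items."""
    if len(rank_a) != len(rank_b):
        raise ValueError("rankings must have the same length")
    n = len(rank_a)
    if n < 2:
        raise ValueError("need at least two items to compare")
    # Sum of squared differences between each item's two ranks.
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical placeholder data: (prospect, actual draft pick, model rank).
# These are NOT real players or real model output.
prospects = [
    ("Prospect A", 1, 2),
    ("Prospect B", 2, 1),
    ("Prospect C", 3, 3),
    ("Prospect D", 4, 5),
    ("Prospect E", 5, 4),
]

draft_order = [pick for _, pick, _ in prospects]
model_rank = [rank for _, _, rank in prospects]

rho = spearman_rho(draft_order, model_rank)
print(f"Spearman rho between draft order and model rank: {rho:.2f}")
```

A value near 1 would mean the model largely agrees with the draft order, while a value near 0 would mean the two rankings are close to unrelated. Agreement alone says nothing about which ranking was better; that judgment would need the NBA performance data from the first 4 seasons.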
Key Questions to Answer
- Has the league as a whole “improved” at drafting over time?
- Are there individual teams who consistently overperform or underperform in the NBA draft?
- How does a model built solely using publicly available data perform against the actual draft decisions made by teams?
- Which college statistics and metrics are most important in identifying NBA-level talent?
- Can this model be used to determine the relative strength of each draft class?
For now, I’ll try to stick to a weekly schedule of posting project updates. As of today, I am actually a couple of weeks into the project, so there will be some lag as I catch up with blog posts. Thanks for reading!
Click here for Part 2 of my NBA Draft Model Developer Blog Series