OnTheBall

an analytics visualization tool for the basketball analyst

Motivation

Data-driven analysis of basketball has surged in popularity in recent years. While large amounts of data are collected about the game, effective analysis and visualization tools are still in their infancy. To aid in the analysis of this data, we have created a visualization tool that performs domain-specific tasks in basketball analytics. Our approach provides a robust visualization tool for use by basketball analysts, coaches, and general managers.

One domain problem is the need to directly compare the visualized statistics of individual players and teams to discover where some excel and where others falter. These insights can suggest novel approaches and strategies for the game. Another domain problem is identifying which statistics are conducive to winning. While a game is decided by the number of points scored by each team, certain statistics may be heavily correlated with favorable game outcomes.

The NBA has recently surpassed 10 billion dollars in revenue, marking significant growth in interest in the sport of basketball. Given the immense financial incentives, NBA teams are constantly looking for ways to improve their play and gain an edge over their competitors.

This tool will compile data on various team statistics (such as field goal percentage, opponent field goal percentage, shot selection distributions, and rebound differential) and demonstrate how they contribute to winning. Specifically, the tool will allow users to analyze the interactions between multiple attributes and see the statistical profiles of the teams with the most (and fewest) wins.

This tool will be usable by NBA coaches and general managers to determine areas of improvement for their teams, as a specific team's statistics will be available by query for comparison. For example, a coach can see how their team's opponent field goal percentage and rebound rate compare to those of winning teams, as well as which is more important to improve for a better record next year. This can inform roster moves, coaching emphases, and playbook changes, among other possible actions. Furthermore, if a team's win rate does not match the typical win rates of teams with similar statistics, this suggests that another, more pressing change is needed.

Background

Data

The data used for the visualization comes from a Kaggle dataset by Nathan Lauga, who scraped it from the official NBA website. This data is collected on a per-game basis by professional scorekeepers and uploaded to the website after each game. The dataset covers every individual game, including each player's statistical contributions as well as important team statistics. From this, we have derived averages of each player's per-game statistics (such as points per game, assists per game, and turnovers per game) and compiled statistical totals (games played, minutes played, total field goals made). The data presented in this visualization is from the 2021-22 NBA season.
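The derivation of per-game averages and season totals described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the row layout, the field names (`pts`, `ast`, `tov`, `min`, `fgm`), and the `summarize_players` helper are all assumptions made for the example and do not reflect the real column names in the Kaggle dataset.

```python
from collections import defaultdict

# Hypothetical per-game box-score rows. Field names are illustrative only
# and are NOT the actual column names of the Kaggle dataset.
game_rows = [
    {"player": "Player A", "pts": 30, "ast": 5, "tov": 3, "min": 36.0, "fgm": 11},
    {"player": "Player A", "pts": 20, "ast": 9, "tov": 1, "min": 34.0, "fgm": 8},
    {"player": "Player B", "pts": 12, "ast": 2, "tov": 2, "min": 25.0, "fgm": 5},
]


def summarize_players(rows):
    """Aggregate per-game rows into season totals and per-game averages."""
    totals = defaultdict(lambda: defaultdict(float))
    games = defaultdict(int)

    for row in rows:
        name = row["player"]
        games[name] += 1  # one row is assumed to be one game played
        for stat, value in row.items():
            if stat != "player":
                totals[name][stat] += value

    summary = {}
    for name, stat_totals in totals.items():
        n = games[name]
        summary[name] = {
            "games_played": n,
            "totals": dict(stat_totals),
            "per_game": {stat: total / n for stat, total in stat_totals.items()},
        }
    return summary


summary = summarize_players(game_rows)
print(summary["Player A"]["per_game"]["pts"])
```

A games-played threshold, such as the 10-game minimum used in the scatterplot, could then be applied by filtering on `games_played`.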

The raw dataset contains little inherent bias: basic statistics are collected in their totality with no missing values, and the sourcing of the data raises no ethical concerns. However, one must consider the imperfections and tradeoffs associated with more advanced metrics. For example, player efficiency rating (PER) disproportionately values shotmaking over every other metric, which overvalues catch-and-shoot players while undervaluing shot creation and the non-scoring aspects of the game. Because these advanced metrics are commonly used, they will still be presented to users, accompanied by explanations of their potential limitations and biases.

The datasets can be found at the following link: https://www.kaggle.com/datasets/nathanlauga/nba-games

Demo Video

Report

The report can be found at the following link: Report Article

Visualization

Instructions

The scatterplot shows the distribution of all NBA players over the 2021-22 season (min. 10 games played) based on two selected statistics.
Mouse over a point to get more information on its respective player (as well as how the player's selected stats compare to the league as a whole).
Click on a point for a full player profile.
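One plausible way to express how a player's selected stat compares to the league as a whole is a percentile rank. The sketch below is an assumption about how such a comparison could be computed, not a description of what the tool actually displays; the `percentile_rank` helper and the sample values are hypothetical.

```python
def percentile_rank(value, league_values):
    """Return the percentage of league values less than or equal to `value`."""
    if not league_values:
        raise ValueError("league_values must not be empty")
    at_or_below = sum(1 for v in league_values if v <= value)
    return 100.0 * at_or_below / len(league_values)


# Made-up points-per-game averages for ten players (illustrative only).
league_ppg = [4.2, 8.1, 10.5, 12.0, 15.3, 18.7, 22.4, 27.1, 30.3, 9.9]

# Seven of the ten values are at or below 18.7.
print(percentile_rank(18.7, league_ppg))  # 70.0
```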

The histogram on each axis represents the distribution of players with respect to the chosen statistic (i.e., the larger the bar, the more players have averages or totals within a given range).
Click and drag on either histogram to zoom into a certain range of values.
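The histogram and zoom behavior can be illustrated with a small sketch: values are counted into equal-width bins, and a dragged range keeps only the values inside it. The function names, the equal-width binning, and the inclusive range are assumptions for illustration; the tool's actual binning and brushing logic may differ.

```python
def histogram_counts(values, bin_count):
    """Count values into `bin_count` equal-width bins spanning min..max."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when all values are identical.
    width = (hi - lo) / bin_count or 1.0
    counts = [0] * bin_count
    for v in values:
        # The maximum value lands in the last bin rather than past it.
        index = min(int((v - lo) / width), bin_count - 1)
        counts[index] += 1
    return counts


def select_range(values, low, high):
    """Keep only the values inside the dragged range (inclusive)."""
    return [v for v in values if low <= v <= high]


# Made-up stat values for ten players (illustrative only).
stat_values = [0, 1, 2, 3, 4, 5, 6, 7, 8, 10]
print(histogram_counts(stat_values, 5))
print(select_range(stat_values, 2, 5))
```

A taller bar corresponds to a larger count in that bin, and re-running the binning on the output of `select_range` mimics zooming into the selected range.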

Depending on the data, the legend in the top-right corner may hide some players. Click and drag the legend to move it out of the way.

Acknowledgements

All stats courtesy of NBA.com and Nathan Lauga.

Image credits: