Yishu Xue
2021-10-22, University of Missouri Columbia
Most basketball visualization tasks fall into the following categories:
We will cover how each of these types of plots is drawn using ggplot2.
In ggplot2, customized lines, dots, and shapes can be plotted using polygons. This post explains how they are defined.
I have put the defined polygon into an .rds file (the description is in French!).
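As a rough illustration of what such a polygon definition looks like, the sketch below builds the two baseline rectangles by hand. The helper `rect_polygon()` and the object `toy_court` are hypothetical names introduced here, not part of `court.rds`; the dimensions are read off the first rows of that file printed below, and everything else about the real file is left out.

```r
library(ggplot2)

# Hypothetical helper (not from the original post): the four corners of an
# axis-aligned rectangle, listed in drawing order and tagged with a polygon id.
rect_polygon <- function(xmin, xmax, ymin, ymax, group) {
  data.frame(
    x = c(xmin, xmin, xmax, xmax),
    y = c(ymin, ymax, ymax, ymin),
    group = group
  )
}

half_width <- 25 + 1 / 6  # 25.16667, as in the printed rows of court.rds
line_width <- 1 / 6       # thickness of a painted line (assumed from the same rows)

# One rectangle per baseline; `group` tells geom_polygon() which rows
# belong to the same shape.
toy_court <- rbind(
  rect_polygon(-half_width, half_width, -line_width, 0, group = 1),
  rect_polygon(-half_width, half_width, 94, 94 + line_width, group = 2)
)

p <- ggplot(toy_court, aes(x = x, y = y, group = group)) + geom_polygon()
```

Every additional court marking would be one more set of rows with its own `group` value.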
library(tidyverse); library(ggpubr); library(ggsci)
court <- readRDS("./court.rds"); head(court)
          x          y group side        descri
1 -25.16667 -0.1666667     1    1 ligne de fond
2 -25.16667  0.0000000     1    1 ligne de fond
3  25.16667  0.0000000     1    1 ligne de fond
4  25.16667 -0.1666667     1    1 ligne de fond
5 -25.16667 94.1666667     2    2 ligne de fond
6 -25.16667 94.0000000     2    2 ligne de fond
## visualize a court first - what might be wrong with the court?
ggplot(court, aes(x = y, y = x, group = group)) + geom_polygon()
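For basketball courts and football or soccer fields, it is important to preserve the true length-to-width ratio so that the visualization looks realistic. By default ggplot2 stretches the plot to fill the available space; coord_fixed() forces one unit on the x-axis to span the same length as one unit on the y-axis.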
ggplot(court, aes(x = y, y = x, group = group)) + geom_polygon() +
coord_fixed()
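To make the ratio concrete, here is a minimal sketch using a bare court outline. The data frame `outline` is made up for illustration and assumes the standard 94 ft by 50 ft court; only `coord_fixed()` itself comes from the code above.

```r
library(ggplot2)

# Outline of a full court: 94 ft long, 50 ft wide (corners in drawing order)
outline <- data.frame(
  x = c(0, 94, 94, 0),
  y = c(-25, -25, 25, 25)
)

# Length-to-width ratio that the plot should preserve
court_ratio <- diff(range(outline$x)) / diff(range(outline$y))

# Without coord_fixed() the panel is stretched to the device's shape
p_free <- ggplot(outline, aes(x = x, y = y)) +
  geom_polygon(fill = NA, colour = "black")

# ratio = 1 (the default) draws one x unit as long as one y unit
p_fixed <- p_free + coord_fixed(ratio = 1)
```

With `p_fixed`, the drawn rectangle keeps the 1.88 length-to-width ratio no matter how the plotting window is resized.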
Another refinement: ggplot2's default theme has a grey background with white grid lines, which clutter the visualization, so removing them is a good idea. We also do not really care about the \( x \) or \( y \) axis values, so the axis text and ticks can be removed too.
A clean visualization of the basketball court:
ggplot(court, aes(x = y, y = x, group = group)) + geom_polygon() +
coord_fixed() + theme_pubr(border = TRUE) +
theme(axis.text.x = element_blank(),
axis.text.y = element_blank(),
axis.ticks = element_blank()
) +
xlab("") + ylab("") +
scale_y_continuous(expand = c(0,0)) + # removes the padding between border and plot
scale_x_continuous(expand = c(0,0))
A shot chart, given the locations of shots, is essentially a scatterplot overlaid on the court. Let us use an example dataset to illustrate.
shots <- readRDS("shotDataf.rds")
ad_shots <- shots %>% filter(PLAYER_NAME == "Anthony Davis")
head(ad_shots)
    PLAYER_NAME SHOT_ZONE_BASIC Made          EVENT_TYPE SHOT_DISTANCE LOC_X
1 Anthony Davis               5    1 Alley Oop Dunk Shot             1  25.6
2 Anthony Davis               2    1 Alley Oop Dunk Shot             0  24.7
3 Anthony Davis               4    1 Alley Oop Dunk Shot             0  25.3
4 Anthony Davis               1    1 Alley Oop Dunk Shot             0  25.2
5 Anthony Davis               1    1 Alley Oop Dunk Shot             0  24.7
6 Anthony Davis               2    1 Alley Oop Dunk Shot             0  25.2
  LOC_Y    Result
1   6.7 Made Shot
2   4.4 Made Shot
3   4.6 Made Shot
4   5.6 Made Shot
5   5.8 Made Shot
6   5.4 Made Shot
## a scatterplot of the shots only first
ggplot(ad_shots, aes(x = LOC_X, y = LOC_Y)) + geom_point()
Three things need attention in this plot (fixes in parentheses):
The \( x \) locations range over (0, 50), not (-25, 25) as in the court definition (shift the \( x \) locations by 25).
The shots are all in the offensive half court (use only half of the court polygon).
Made shots and missed shots are not distinguished (map the outcome to point shape and color).
## shot chart: shifted shot locations overlaid on the half-court polygon
ad_shots$lab <- ifelse(ad_shots$Made == 1, "Made", "Missed")
half_court <- court %>% filter(y <= 47)
ad_shots %>%
ggplot(aes(x = LOC_X - 25, y = LOC_Y)) +
geom_point(aes(shape = lab, col = lab), size = 3, alpha = 0.5) +
## overlay the polygon of half court
geom_polygon(data = half_court, aes(x = x, y = y, group = group)) +
coord_fixed() + theme_pubr(border = TRUE, base_size = 16) +
theme(axis.text.x = element_blank(), axis.text.y = element_blank(),
axis.ticks = element_blank()) + xlab("") + ylab("") +
scale_y_continuous(expand = c(0,0)) + # removes the padding between border and plot
scale_x_continuous(expand = c(0,0)) + ggsci::scale_color_lancet()
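The two data adjustments used above (centering the \( x \) locations and labelling the outcome) can be checked in isolation on a few made-up shots. The sketch below is only an illustration: the data frame `toy`, the column `x_centered`, and the explicit shape codes passed to `scale_shape_manual()` are choices made here, not taken from the original chart, which relies on ggplot2's default shapes.

```r
library(ggplot2)

# Made-up shots using the same 0-50 ft x-convention as LOC_X
toy <- data.frame(
  LOC_X = c(5, 25, 45, 30),
  LOC_Y = c(10, 5, 20, 25),
  Made  = c(1, 0, 1, 0)
)

# Shift x so that 0 is the middle of the court, matching the court polygon
toy$x_centered <- toy$LOC_X - 25

# A factor with explicit levels keeps the legend order stable
toy$lab <- factor(ifelse(toy$Made == 1, "Made", "Missed"),
                  levels = c("Made", "Missed"))

p <- ggplot(toy, aes(x = x_centered, y = LOC_Y)) +
  geom_point(aes(shape = lab, colour = lab), size = 3) +
  # 16 = filled circle, 4 = cross; any pair of distinct shapes would do
  scale_shape_manual(values = c(Made = 16, Missed = 4)) +
  coord_fixed()
```

Naming the shapes explicitly is optional, but it makes the made/missed encoding independent of the order in which the labels happen to appear in the data.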
Oftentimes, intensities are estimated by fitting a model for count data after partitioning the court into a grid of small squares. Here we use the fitted intensity from a log-Gaussian Cox process (LGCP) as an example.
# read in the data of Draymond Green and James Harden
green <- readRDS("green.rds")
harden <- readRDS("harden.rds")
head(green)
           mean         x         y         Player
5012 0.03444147 0.3892617 0.5033557 Draymond Green
5013 0.03510861 1.0402685 0.5033557 Draymond Green
5014 0.03664086 1.6912752 0.5033557 Draymond Green
5015 0.03948377 2.3422819 0.5033557 Draymond Green
5016 0.04425573 2.9932886 0.5033557 Draymond Green
5017 0.04962592 3.6442953 0.5033557 Draymond Green
What are \( x \) and \( y \)? They are the centers of small squares over the court. The mean column contains the intensity values.
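To make the grid idea concrete, the base-R sketch below partitions a half court into squares, computes each square's center, and counts simulated shots per square. It only produces the raw counts that a count model could be fitted to; it does not fit an LGCP. The shot locations are simulated, and the 1 ft cell size and the 47 ft by 50 ft extent are assumptions made for this example (the grid in `green.rds` uses a different resolution).

```r
set.seed(1)

# Simulated shot locations in feet (illustrative only, not real data)
toy_shots <- data.frame(
  x = runif(500, 0, 47),  # distance along the half court
  y = runif(500, 0, 50)   # position across the court
)

cell <- 1  # assumed cell size in feet
x_breaks <- seq(0, 47, by = cell)
y_breaks <- seq(0, 50, by = cell)

# Cell centers: left edge of each cell plus half a cell
x_mid <- head(x_breaks, -1) + cell / 2
y_mid <- head(y_breaks, -1) + cell / 2

# Integer index of the cell each shot falls into
x_bin <- cut(toy_shots$x, x_breaks, include.lowest = TRUE, labels = FALSE)
y_bin <- cut(toy_shots$y, y_breaks, include.lowest = TRUE, labels = FALSE)

# One row per cell center; x varies fastest, matching the column-major
# layout of the count table below
grid <- expand.grid(x = x_mid, y = y_mid)
counts <- table(factor(x_bin, levels = seq_along(x_mid)),
                factor(y_bin, levels = seq_along(y_mid)))
grid$n <- as.vector(counts)
```

The resulting `grid` has the same shape as the player files: one row per square, with the center coordinates and a value attached to it.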
plot(green$x, green$y, pch = 18, cex = 0.3)
ggplot(green, aes(x = x, y = y)) +
geom_tile(aes(fill = mean)) +
coord_fixed() + theme_pubr(border = TRUE, base_size = 16) +
theme(axis.text.x = element_blank(), axis.text.y = element_blank(),
axis.ticks = element_blank()) + xlab("") + ylab("") +
scale_y_continuous(expand = c(0,0)) +
scale_x_continuous(expand = c(0,0))
We need to customize the color palette. One thing I personally like to do is to break the continuous intensity into disjoint levels, because this helps us to distinguish the small values more clearly.
library(RColorBrewer)
breaks <- c(0, 0.05, 0.1, 0.15, 0.2, 0.3, 0.4, 0.5, 1, 2, 4, 6, 8, 10, 11)
# expand the "Blues" palette to contain more discrete values as needed
colors <- colorRampPalette(brewer.pal(9, "Blues"))(length(breaks) - 1)
green$Intensity <- cut(green$mean, breaks = breaks)
ggplot(green, aes(x = x, y = y)) + geom_tile(aes(fill = Intensity)) +
geom_polygon(data = half_court,
aes(y = x + 25, x = y, group = group)) + # some coordinate adjustments
coord_fixed() + theme_pubr(border = TRUE, base_size = 16) +
theme(axis.text.x = element_blank(), axis.text.y = element_blank(),
axis.ticks = element_blank()) + xlab("") + ylab("") +
scale_y_continuous(expand = c(0,0)) +
scale_x_continuous(expand = c(0,0)) +
scale_fill_manual(values = colors)
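It can help to see what `cut()` does with these breaks on a handful of values. The sketch below uses made-up intensities in `vals` and swaps in a plain white-to-navy ramp from base R in place of the "Blues" palette, so it runs without RColorBrewer; both are assumptions for illustration only.

```r
breaks <- c(0, 0.05, 0.1, 0.15, 0.2, 0.3, 0.4, 0.5, 1, 2, 4, 6, 8, 10, 11)

# Made-up intensity values, including the two edge cases 0 and 12
vals <- c(0, 0.03, 0.12, 0.7, 9.5, 12)

# Default bins are half-open, (lower, upper], so a value of exactly 0 and
# anything above the last break become NA
binned <- cut(vals, breaks = breaks)

# include.lowest = TRUE closes the first bin on the left, so 0 is kept
binned_closed <- cut(vals, breaks = breaks, include.lowest = TRUE)

# One color per bin: the number of levels is length(breaks) - 1
n_bins <- nlevels(binned)
ramp <- colorRampPalette(c("white", "navy"))(n_bins)
```

Tiles whose intensity falls into an `NA` bin are drawn in the scale's `na.value` color, so it is worth checking that the breaks cover the full range of the fitted values.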
What if we want contours of the intensities, as on a map?
library(RColorBrewer)
breaks <- c(0, 0.05, 0.1, 0.15, 0.2, 0.3, 0.4, 0.5, 1, 2, 4, 6, 8, 10, 11)
# expand the "Blues" palette to contain more discrete values as needed
colors <- colorRampPalette(brewer.pal(9, "Blues"))(length(breaks) - 1)
ggplot(green, aes(x = x, y = y)) +
geom_contour_filled(aes(z = mean), breaks = breaks, col = 'grey') +
geom_polygon(data = half_court,
aes(y = x + 25, x = y, group = group)) + # some coordinate adjustments
coord_fixed() + theme_pubr(border = TRUE, base_size = 16) +
theme(axis.text.x = element_blank(), axis.text.y = element_blank(),
axis.ticks = element_blank()) + xlab("") + ylab("") +
scale_y_continuous(expand = c(0,0)) +
scale_x_continuous(expand = c(0,0)) +
scale_fill_manual(values = colors)
## give harden the same columns as green so that rbind() works, then draw one panel per player
harden$Intensity <- cut(harden$mean, breaks = breaks)
rbind(green, harden) %>%
ggplot(aes(x = x, y = y)) +
geom_contour_filled(aes(z = mean), breaks = breaks, col = 'grey') +
geom_polygon(data = half_court,
aes(y = x + 25, x = y, group = group)) + # some coordinate adjustments
coord_fixed() + theme_pubr(border = TRUE, base_size = 16) +
theme(axis.text.x = element_blank(), axis.text.y = element_blank(),
axis.ticks = element_blank()) + xlab("") + ylab("") +
scale_y_continuous(expand = c(0,0)) +
scale_x_continuous(expand = c(0,0)) +
scale_fill_manual(values = colors) +
facet_wrap(~Player)
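The same stack-and-facet pattern can be tried without the player files. The sketch below is a toy version under stated assumptions: the surfaces are made-up Gaussian bumps produced by a hypothetical helper `bump()`, the players are placeholders, and the breaks are chosen to suit the toy values rather than the fitted intensities.

```r
library(ggplot2)

# Regular grid of cell centers; geom_contour_filled() expects a regular grid
grid <- expand.grid(x = seq(0.5, 46.5, by = 1), y = seq(0.5, 49.5, by = 1))

# Hypothetical smooth surface standing in for a fitted intensity
bump <- function(x, y, cx, cy) exp(-((x - cx)^2 + (y - cy)^2) / 100)

player_a <- transform(grid, mean = 5 * bump(x, y, 5, 25),  Player = "Player A")
player_b <- transform(grid, mean = 5 * bump(x, y, 25, 10), Player = "Player B")

# rbind() needs identical column names in both data frames
both <- rbind(player_a, player_b)

toy_breaks <- c(0, 0.5, 1, 2, 3, 4, 5)

p <- ggplot(both, aes(x = x, y = y)) +
  geom_contour_filled(aes(z = mean), breaks = toy_breaks) +
  coord_fixed() +
  facet_wrap(~Player)  # one panel per value of Player
```

Because both panels share one set of breaks and one fill scale, the colors are directly comparable across players.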