Visualizing Basketball Data Using ggplot2

Yishu Xue
2021-10-22, University of Missouri Columbia

Outline

Most basketball visualization tasks fall into the following categories:

  • the court
  • shot chart
  • intensity plot
    • with grids
    • with contours

We will cover how each of these types of plots is drawn using ggplot2.

The Basketball Court

In ggplot2, custom lines, dots, and shapes can all be drawn as polygons. This post explains how they are defined. I have saved the court polygons in an .rds file (the descriptions are in French!).
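To make the polygon idea concrete, here is a minimal sketch. It is not the definition used in court.rds: the data frame `shapes` and its coordinates are made up for illustration, and the only structure borrowed from the court data is the x, y, and group columns. Each shape is a set of vertices listed in drawing order, and group tells geom_polygon() which rows belong to the same shape. Even a line on the court can be stored this way, as a very thin rectangle.

```r
library(ggplot2)

# Hypothetical shapes (not taken from court.rds):
# group 1 is a rectangle, group 2 is a triangle.
# Vertices are listed in the order in which they should be connected.
shapes <- data.frame(
  x     = c(-8, -8,  8, 8,  -3,  0,  3),
  y     = c( 0, 19, 19, 0,  22, 27, 22),
  group = c( 1,  1,  1, 1,   2,  2,  2)
)

# group = group keeps the two shapes separate instead of joining all
# seven vertices into a single polygon
p <- ggplot(shapes, aes(x = x, y = y, group = group)) +
  geom_polygon(fill = "grey30") +
  coord_fixed()
p
```

Dropping group = group from aes() would connect all seven points into one shape, which is why the court data carries a group column.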

library(tidyverse); library(ggpubr); library(ggsci)
court <- readRDS("./court.rds"); head(court)
          x          y group side        descri
1 -25.16667 -0.1666667     1    1 ligne de fond
2 -25.16667  0.0000000     1    1 ligne de fond
3  25.16667  0.0000000     1    1 ligne de fond
4  25.16667 -0.1666667     1    1 ligne de fond
5 -25.16667 94.1666667     2    2 ligne de fond
6 -25.16667 94.0000000     2    2 ligne de fond
## visualize a court first - what might be wrong with the court?
ggplot(court, aes(x = y, y = x, group = group)) + geom_polygon()


The Basketball Court

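For basketball, football, or soccer, it is important to preserve the length-to-width ratio of the playing surface so that the visualization looks realistic. Adding coord_fixed() makes one unit on the x axis the same length as one unit on the y axis.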

ggplot(court, aes(x = y, y = x, group = group)) + geom_polygon() +
  coord_fixed()


The Basketball Court

Another refinement: ggplot2's default theme has a grey background with white grid lines, which can clutter the visualization, so it is a good idea to remove the grid. We also do not really care about the \( x \) or \( y \) values, so the axis text and ticks can be removed too.

A clean visualization of the basketball court:

ggplot(court, aes(x = y, y = x, group = group)) + geom_polygon() +
  coord_fixed() + theme_pubr(border = TRUE) +
  theme(axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks = element_blank()
  ) +
  xlab("") + ylab("") +
  scale_y_continuous(expand = c(0,0)) + # removes the padding between border and plot
  scale_x_continuous(expand = c(0,0))

A Clean Court

Shot Chart

A shot chart, given the locations of shots, is essentially a scatterplot overlaid on the court. Let us use an example data set to illustrate.

shots <- readRDS("shotDataf.rds")
ad_shots <- shots %>% filter(PLAYER_NAME == "Anthony Davis")
head(ad_shots)
    PLAYER_NAME SHOT_ZONE_BASIC Made          EVENT_TYPE SHOT_DISTANCE LOC_X
1 Anthony Davis               5    1 Alley Oop Dunk Shot             1  25.6
2 Anthony Davis               2    1 Alley Oop Dunk Shot             0  24.7
3 Anthony Davis               4    1 Alley Oop Dunk Shot             0  25.3
4 Anthony Davis               1    1 Alley Oop Dunk Shot             0  25.2
5 Anthony Davis               1    1 Alley Oop Dunk Shot             0  24.7
6 Anthony Davis               2    1 Alley Oop Dunk Shot             0  25.2
  LOC_Y    Result
1   6.7 Made Shot
2   4.4 Made Shot
3   4.6 Made Shot
4   5.6 Made Shot
5   5.8 Made Shot
6   5.4 Made Shot

Shot Chart

## a scatterplot of the shots only first
ggplot(ad_shots, aes(x = LOC_X, y = LOC_Y)) + geom_point()

Three things need fixing in this first scatterplot:

  • the \( x \) locations range over (0, 50), not (-25, 25) as in the court definition (shift the \( x \) locations by 25)

  • the shots are all in the offensive half court (use only half of the court polygon)

  • made shots and missed shots are not distinguished (use different point shapes and colors)

Shot Chart

## label made and missed shots, then overlay the half court on the shifted shots
ad_shots$lab <- ifelse(ad_shots$Made == 1, "Made", "Missed")
half_court <- court %>% filter(y <= 47)
ad_shots %>%
  ggplot(aes(x = LOC_X - 25, y = LOC_Y)) +
  geom_point(aes(shape = lab, col = lab), size = 3, alpha = 0.5) +
  ## overlay the polygon of half court
  geom_polygon(data = half_court, aes(x = x, y = y, group = group)) +
  coord_fixed() + theme_pubr(border = TRUE, base_size = 16) +
  theme(axis.text.x = element_blank(), axis.text.y = element_blank(),
        axis.ticks = element_blank()) + xlab("") + ylab("") +
  scale_y_continuous(expand = c(0,0)) + # removes the padding between border and plot
  scale_x_continuous(expand = c(0,0)) + ggsci::scale_color_lancet()


Intensity Plot with Grids

Intensities are often estimated by partitioning the court into a grid of small squares and then fitting a model to the shot counts in those squares. Here we use the intensity fitted by a log-Gaussian Cox process (LGCP) as an example.
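The partition-and-count step described above can be sketched in a few lines of base R. This is only the counting step, not the LGCP fit, and everything in it is an assumption made for illustration: the shot locations are simulated, and the names `shots_sim`, `cell`, and `counts` as well as the cell size of 1 are hypothetical.

```r
set.seed(1)

# Simulated shot locations on a 50 x 47 half court (made-up data, not real shots)
shots_sim <- data.frame(x = runif(2000, 0, 50), y = runif(2000, 0, 47))

cell <- 1  # assumed side length of each square cell

# Map every shot to the center of the cell that contains it
shots_sim$cx <- floor(shots_sim$x / cell) * cell + cell / 2
shots_sim$cy <- floor(shots_sim$y / cell) * cell + cell / 2
shots_sim$n  <- 1

# One row per non-empty cell, with the number of shots in that cell
counts <- aggregate(n ~ cx + cy, data = shots_sim, FUN = sum)
head(counts)
```

Cells without any shot do not appear in `counts`; a count model would normally need them added back with a count of zero.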

# read in the data of Draymond Green and James Harden
green <- readRDS("green.rds")
harden <- readRDS("harden.rds")

head(green)
           mean         x         y         Player
5012 0.03444147 0.3892617 0.5033557 Draymond Green
5013 0.03510861 1.0402685 0.5033557 Draymond Green
5014 0.03664086 1.6912752 0.5033557 Draymond Green
5015 0.03948377 2.3422819 0.5033557 Draymond Green
5016 0.04425573 2.9932886 0.5033557 Draymond Green
5017 0.04962592 3.6442953 0.5033557 Draymond Green

What are \( x \) and \( y \)? Centers of small squares over the court. The mean column contains the intensity values.
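One quick way to confirm that a set of coordinates really is a regular grid of cell centers is to look at the spacing between neighbouring centers. The sketch below uses a made-up 10 x 5 grid called `grid_sim` rather than green.rds, so the numbers are illustrative only. With real data the spacings are floating-point values, so rounding them before calling unique() may be needed.

```r
# Hypothetical 10 x 5 grid of cell centers with unit spacing (not the real grid)
grid_sim <- expand.grid(x = seq(0.5, 9.5, by = 1), y = seq(0.5, 4.5, by = 1))

# Spacing between neighbouring centers along each axis = cell width and height
dx <- unique(diff(sort(unique(grid_sim$x))))
dy <- unique(diff(sort(unique(grid_sim$y))))
c(dx = dx, dy = dy)

# A complete rectangular grid has (number of x centers) * (number of y centers) rows
nrow(grid_sim) == length(unique(grid_sim$x)) * length(unique(grid_sim$y))
```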

plot(green$x, green$y, pch = 18, cex = 0.3)


Intensity Plot with Grids

ggplot(green, aes(x = x, y = y)) +
  geom_tile(aes(fill = mean)) +
  coord_fixed() + theme_pubr(border = TRUE, base_size = 16) +
  theme(axis.text.x = element_blank(), axis.text.y = element_blank(),
        axis.ticks = element_blank()) + xlab("") + ylab("") +
  scale_y_continuous(expand = c(0,0)) +
  scale_x_continuous(expand = c(0,0))


  • the default palette is dark, so a basketball court overlaid on it would be nearly invisible
  • most tiles have small values that look the same, so differences among them are hard to see

Intensity Plot with Grids

We need to customize the color palette. One thing I personally like to do is to break the continuous intensity into disjoint levels, because this helps us to distinguish the small values more clearly.
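As a small illustration of the binning idea, the sketch below applies cut() to a handful of made-up values. The vectors `breaks_demo` and `vals` are hypothetical and much shorter than the breaks used for the real plot. It shows that k break points produce k - 1 levels, which is why the palette needs length(breaks) - 1 colors.

```r
# Made-up break points and intensity values, for illustration only
breaks_demo <- c(0, 0.05, 0.1, 0.5, 1, 11)
vals        <- c(0.03, 0.07, 0.3, 0.9, 10.5)

# cut() assigns each value to a half-open interval (a, b]
lev <- cut(vals, breaks = breaks_demo)
table(lev)

# Number of levels is one less than the number of break points
nlevels(lev) == length(breaks_demo) - 1

# A value exactly equal to the lowest break falls outside (0, 0.05] and becomes NA
# unless include.lowest = TRUE is set
cut(0, breaks = breaks_demo)
cut(0, breaks = breaks_demo, include.lowest = TRUE)
```

Because the intervals are open on the left, intensities that are exactly zero would be dropped from the plot as NA; include.lowest = TRUE is one way to keep them.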

library(RColorBrewer)
breaks <- c(0, 0.05, 0.1, 0.15, 0.2, 0.3, 0.4, 0.5, 1, 2, 4, 6, 8, 10, 11)
# expand the "Blues" palette to contain more discrete values as needed
colors <- colorRampPalette(brewer.pal(9, "Blues"))(length(breaks) - 1)

green$Intensity <- cut(green$mean, breaks = breaks)
ggplot(green, aes(x = x, y = y)) + geom_tile(aes(fill  = Intensity)) +
  geom_polygon(data = half_court,
               aes(y = x + 25, x = y, group = group)) + # some coordinate adjustments
  coord_fixed() + theme_pubr(border = TRUE, base_size = 16) +
  theme(axis.text.x = element_blank(), axis.text.y = element_blank(),
        axis.ticks = element_blank()) + xlab("") + ylab("") +
  scale_y_continuous(expand = c(0,0)) +
  scale_x_continuous(expand = c(0,0)) +
  scale_fill_manual(values = colors)


Intensity Plot with Contours

What if we want contours of the intensity, like the contour lines on a map?

library(RColorBrewer)
breaks <- c(0, 0.05, 0.1, 0.15, 0.2, 0.3, 0.4, 0.5, 1, 2, 4, 6, 8, 10, 11)
# expand the "Blues" palette to contain more discrete values as needed
colors <- colorRampPalette(brewer.pal(9, "Blues"))(length(breaks) - 1)

ggplot(green, aes(x = x, y = y)) +
  geom_contour_filled(aes(z = mean), breaks = breaks, col = 'grey') +
  geom_polygon(data = half_court,
               aes(y = x + 25, x = y, group = group)) + # some coordinate adjustments
  coord_fixed() + theme_pubr(border = TRUE, base_size = 16) +
  theme(axis.text.x = element_blank(), axis.text.y = element_blank(),
        axis.ticks = element_blank()) + xlab("") + ylab("") +
  scale_y_continuous(expand = c(0,0)) +
  scale_x_continuous(expand = c(0,0)) +
  scale_fill_manual(values = colors)


Multiple Intensity Plots Together

# green already has an Intensity column; add it to harden so the columns match for rbind()
harden$Intensity <- cut(harden$mean, breaks = breaks)
rbind(green, harden) %>%
  ggplot(aes(x = x, y = y)) +
  geom_contour_filled(aes(z = mean), breaks = breaks, col = 'grey') +
  geom_polygon(data = half_court,
               aes(y = x + 25, x = y, group = group)) + # some coordinate adjustments
  coord_fixed() + theme_pubr(border = TRUE, base_size = 16) +
  theme(axis.text.x = element_blank(), axis.text.y = element_blank(),
        axis.ticks = element_blank()) + xlab("") + ylab("") +
  scale_y_continuous(expand = c(0,0)) +
  scale_x_continuous(expand = c(0,0)) +
  scale_fill_manual(values = colors) +
  facet_wrap(~Player)
