What is ggplot2?

Components

  • ggplot - dataset and aesthetic mapping
  • Layers
library(ggplot2)

# ?mpg

ggplot(mpg, mapping = aes(x = displ, y = hwy, colour = class)) + 
  geom_point()

https://ggplot2.tidyverse.org/reference/

Besides the coordinates (x, y), it is also possible to map other aesthetics in aes, such as fill, size, shape, color/colour, etc.

ggplot(mpg, mapping = aes(x = displ, y = hwy, size = class)) + 
  geom_point()
## Warning: Using size for a discrete variable is not advised.

ggplot(mpg, mapping = aes(x = displ, y = hwy, shape = class)) + 
  geom_point()
## Warning: The shape palette can deal with a maximum of 6 discrete values
## because more than 6 becomes difficult to discriminate; you have 7.
## Consider specifying shapes manually if you must have them.
## Warning: Removed 62 rows containing missing values (geom_point).

ggplot(mpg, mapping = aes(x = displ, y = hwy, alpha = class)) + 
  geom_point()
## Warning: Using alpha for a discrete variable is not advised.

These parameters can also be set outside the mapping, as fixed values instead of variables taken from the data.
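
As a minimal sketch of the difference (the fixed values below are arbitrary choices for illustration, not part of the original material): an aesthetic inside aes() is mapped from a column of the data, while one passed directly to the geom is a constant applied to every point.

```r
library(ggplot2)

# colour is mapped from the data (inside aes), so it varies by class;
# size and alpha are fixed values (outside aes), so they apply to all points.
p_fixed <- ggplot(mpg, mapping = aes(x = displ, y = hwy, colour = class)) +
  geom_point(size = 3, alpha = 0.6)

p_fixed
```

Fixed values do not produce a legend, since they carry no information about the data.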

Exercise: What is wrong with the next code?

ggplot(mpg, mapping = aes(x = displ, y = hwy), color = "skyblue3", alpha = .6, shape = 8) + 
  geom_point()

The desired output is:

Exercise: Change size of suv to 3 and the other classes to 2

Style

Exercise: Create a barplot showing the number of cars for each class

It is also possible to change labels and specify colors manually

List of colors: http://www.stat.columbia.edu/~tzheng/files/Rcolor.pdf

p + labs(x = "Class", y = "N", title = "# of Classes") +
  scale_fill_manual(values = c("gray0", "gray10", "gray20", "gray30", "gray40", "gray50", "gray60")) +
  guides(fill = "none")

Many packages provide premade palettes; RColorBrewer (http://applied-r.com/rcolorbrewer-palettes/) and ggsci (https://cran.r-project.org/web/packages/ggsci/vignettes/ggsci.html) are two examples.
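
A premade palette is applied through a scale function. The sketch below uses scale_fill_brewer(), which ships with ggplot2 and draws on the RColorBrewer palettes. The plot drv_bars and the choice of the "Set2" palette are assumptions made for this example only.

```r
library(ggplot2)

# Hypothetical example plot: number of cars per drive type (4, f, r).
drv_bars <- ggplot(mpg, aes(x = drv, fill = drv)) +
  geom_bar()

# "Set2" is one of the qualitative RColorBrewer palettes;
# each level of the fill variable receives one colour from it.
drv_brewer <- drv_bars + scale_fill_brewer(palette = "Set2")

drv_brewer
```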

Question: What colors are going to appear for the classes?

library(ggsci)

p + scale_fill_manual(values = c("gray0", "gray10", "gray20", "gray30", "gray40", "gray50", "gray60")) +
  scale_fill_npg()

Other parameters can be changed with theme()

#?theme
p +  scale_fill_npg() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, face = "italic"))

Question: what is hjust doing?

ggplot2 also has premade themes (https://ggplot2.tidyverse.org/reference/ggtheme.html).
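
A short sketch of how a premade theme might be combined with individual theme() tweaks. The scatter plot used here is a stand-in chosen for this example, and theme_bw() is just one of the themes listed on that page.

```r
library(ggplot2)

# Hypothetical example plot.
scatter <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

# A complete theme sets every theme element at once; a theme() call added
# afterwards adjusts single elements on top of it.
scatter_themed <- scatter +
  theme_bw() +
  theme(axis.text.x = element_text(face = "italic"))

scatter_themed
```

The order matters: placing theme_bw() after theme() would overwrite the individual adjustment.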

Arranging multiple plots

Now, I want to know the number of cars in each class per year
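
One way such counts could be tabulated, sketched here with base R's table() on the mpg data (the variable name class_by_year is made up for this example):

```r
library(ggplot2)  # provides the mpg dataset

# Cross-tabulate car class (rows) against model year (columns).
class_by_year <- table(mpg$class, mpg$year)

class_by_year
```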

             1999 2008
  2seater       2    3
  compact      25   22
  midsize      20   21
  minivan       6    5
  pickup       16   17
  subcompact   19   16
  suv          29   33

p + facet_wrap(~year) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

p + facet_wrap(year~ .) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

p + facet_grid(~year) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

p + facet_grid(year~.) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

p + facet_wrap(year~cyl) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

p + facet_grid(year~cyl) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

Exercise: figure out a way to only show the classes that have counts for each manufacturer

Question: is there a difference between using facet_grid and facet_wrap?

Different plots

Exercise: using the iris dataset, we are going to create 5 plots: a barplot showing the number of observations per species (p1) and four density plots by species for Sepal.Length (p2), Sepal.Width (p3), Petal.Length (p4), and Petal.Width (p5)

Grid

library(grid)
library(gridExtra)

grid.arrange(p1,p2,p3,p4,p5)

You can also pass to grid.arrange a matrix specifying the position of each plot.

mat <- matrix(c(1,1,2,3,4,5), byrow = TRUE, ncol = 2)
grid.arrange(p1,p2,p3,p4,p5, layout_matrix = mat)

Since we are using the same colors …

Exercise: remove the legends from the density plots and leave only the one for the bar plot. Also remove the x label for p2 and p3 and the y label for p2 and p4

Exercise: remove the x labels from p1 and put the color legend at the top. Then plot only p1 and p2 in the same row. What happens?

Egg

library(egg)

ggarrange(p1,p2, labels = c("A", "B"), widths = c(1.5,2))

Saving plots

ggplot2 has its own command to save plots: ggsave.

# ? ggsave

ggsave(filename = "PlotIris.png", plot = p1)
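
ggsave also accepts the output size and resolution. The sketch below is an illustration only: the plot iris_scatter, the temporary output path, and the chosen dimensions are all assumptions made for this example.

```r
library(ggplot2)

# Hypothetical plot to save; any ggplot object can take its place.
iris_scatter <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, colour = Species)) +
  geom_point()

# Write to a temporary directory so the sketch does not clutter the working directory.
out_file <- file.path(tempdir(), "PlotIris_sized.png")

# width and height are interpreted in `units`; dpi sets the raster resolution.
# The file format is inferred from the extension of the file name.
ggsave(filename = out_file, plot = iris_scatter,
       width = 6, height = 4, units = "in", dpi = 300)
```

Setting the size explicitly makes the saved figure reproducible; otherwise ggsave uses the size of the current graphics device.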

Interactive plots

suppressMessages(library(plotly))

p <- ggplot(iris, aes(x = Species, y = Sepal.Width, fill = Species)) +
  geom_boxplot() +
  guides(fill = "none") +
  scale_fill_jama()

ggplotly(p)
plot_ly(iris, x = ~Sepal.Length, color = ~Species, type = "box")

Resources