This R worksheet does not include any assessed questions.


On the last R worksheet, we saw the basic commands for plotting data visualisations: hist() for histograms, boxplot() for boxplots, and plot() for scatterplots.

We also saw some optional arguments these functions could take to change the appearance of the plot, such as freq = FALSE for histograms.

In this worksheet, we will see some other optional arguments that can be used to improve the appearance of your plots. Many of these arguments can be applied to any of the three plot types we’ve studied. The most important arguments are about labelling or titling plots and about setting the limits on axes.

Labelling and titling

When drawing a plot, it’s very important that it’s clear what the data represents. To do this, we almost always need to label the x and y axes. It’s often helpful to give the plot an explanatory title too. Axis labels are set with the arguments xlab = ... and ylab = ... respectively. A title is set with main = .... The text to appear in the labels/title must always be put in quotation marks " ". Remember that multiple arguments must be separated by commas.

Let’s use the Met Office historical temperature data again.

temperature <- read.csv("https://mpaldridge.github.io/math1710/data/met-office.csv")

In the last worksheet, we drew a histogram of December temperatures with the command

hist(temperature$dec, freq = FALSE)

But this picture would be clearer if it were properly labelled and titled.

hist(temperature$dec, freq = FALSE, xlab = "Average temperature (degrees Celsius)", ylab = "Frequency density", main = "Historical December temperatures in the UK")

Adding all these extra arguments can make your code difficult to read. Remember that R allows you to use line breaks when it’s obvious a command isn’t finished – for example, when a pair of brackets has been opened but not yet closed. You can also add spaces to make your code easier to read. For example, you may find the following formatting of exactly the same command as above more pleasant.

hist(temperature$dec, freq = FALSE,
  xlab = "Average temperature (degrees Celsius)",
  ylab = "Frequency density",
  main = "Historical December temperatures in the UK"
)

Exercise 5.1. Read the Met Office temperature data into R. Last time, in Exercise 4.5, you drew a scatterplot of January and August temperatures. Do this again, but with explanatory labels on the axes and an explanatory title.

For diagrams with multiple boxplots, you will also want to label the individual boxplots. This is done with the names = ... argument to boxplot(). Here, names should be equal to a vector of the names given to the boxplots, in the same order as the data. Remember that vectors are set with c(), and each of the names must be in quotation marks.

Adapting another example from last time, we get

boxplot(temperature$jul, temperature$aug, temperature$sep,
  xlab  = "Month",
  names = c("July", "August", "September"),
  ylab  = "Average temperature (degrees Celsius)",
  main  = "Historical UK temperatures"
)

Exercise 5.2. Draw a figure consisting of boxplots for two or more months of temperature data. Make sure all axes are labelled, including each of the individual months.

Axis limits

When you draw a plot, R tries to choose the upper and lower limits of the x and y axes sensibly, to ensure that all the data fits, and that the numbers displayed next to the axes can be nice round numbers. (For histograms, R also ensures the lower limit on the y-axis is 0.) However, sometimes, you may wish to choose the axis limits yourself. The main reason we might want to fix the axis limits ourselves is that, in some circumstances, it’s appropriate to ensure one or both of the axes start at 0, to put changes in the corresponding value in proper context. R, on the other hand, may choose to zoom in on relatively unimportant small changes.

We can choose the axis limits ourselves using the xlim = ... and ylim = ... arguments. Here xlim or ylim should be set to be a vector of length 2, with the first number being the lower limit for the axis and the second number being the upper limit. If you just set one of the axis limits, R will still try to choose the other one sensibly.

For example, if making a scatterplot of February and April temperatures, I may want to put both months’ temperatures on the same scale – say from -2 to 10 – to give a fair comparison between them.

plot(temperature$feb, temperature$apr,
  xlab = "Average February temperature (degrees Celsius)",
  ylab = "Average April temperature (degrees Celsius)",
  xlim = c(-2, 10),
  ylim = c(-2, 10)
)
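
Rather than choosing the shared limits by eye, one option is to compute them from the data. The sketch below assumes the same temperature data frame as above and uses R's range() function, which returns the smallest and largest values among its arguments. The name lims is just an illustrative choice, and na.rm = TRUE is included in case any values are missing.

```r
# Read the data again so this example runs on its own
temperature <- read.csv("https://mpaldridge.github.io/math1710/data/met-office.csv")

# Smallest and largest value across both months, ignoring missing values.
# "lims" is an illustrative name, not something used elsewhere in the worksheet.
lims <- range(temperature$feb, temperature$apr, na.rm = TRUE)

# Use the same vector of length 2 for both axes
plot(temperature$feb, temperature$apr,
  xlab = "Average February temperature (degrees Celsius)",
  ylab = "Average April temperature (degrees Celsius)",
  xlim = lims,
  ylim = lims
)
```

Limits computed this way fit the data exactly, so they will usually not be round numbers. If you prefer round limits, typing them in by hand as above may be the better choice.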

Exercise 5.3. Draw a histogram of August temperatures. Make sure the axes of your plot are appropriately labelled and give your plot a title. Now redraw the histogram with the temperature axis going down to 0 (and an appropriate upper limit). Do you think changing the axis limits was helpful here?

Other arguments

There are many other optional arguments that can be passed to our plot-drawing functions. You can find out about many of these by reading the relevant help files in R. Try typing ?boxplot, ?hist, or ?plot.default (or ?par, for the adventurous) into the console and pressing Enter – you should see the help files open in the bottom-right quadrant of RStudio.

Some of these arguments include:
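
As an illustrative selection (our own choice of examples, not a complete list): col sets a colour, border sets the colour of the outlines of histogram bars, breaks suggests the number of bins in a histogram, pch chooses the plotting symbol in a scatterplot, and horizontal = TRUE draws boxplots sideways. A minimal sketch using the same temperature data follows. The particular colours and numbers are arbitrary choices.

```r
# Read the data again so this example runs on its own
temperature <- read.csv("https://mpaldridge.github.io/math1710/data/met-office.csv")

# Histogram with coloured bars and a suggested number of bins
hist(temperature$dec, freq = FALSE,
  breaks = 20,           # a suggestion only; R may use a nearby round number
  col    = "lightblue",  # fill colour of the bars
  border = "white",      # colour of the bar outlines
  xlab   = "Average temperature (degrees Celsius)",
  ylab   = "Frequency density",
  main   = "Historical December temperatures in the UK"
)

# Scatterplot with filled circles (pch = 16) in a chosen colour
plot(temperature$feb, temperature$apr,
  pch  = 16,
  col  = "darkred",
  xlab = "Average February temperature (degrees Celsius)",
  ylab = "Average April temperature (degrees Celsius)"
)

# Boxplots drawn horizontally; the temperature axis is now the x-axis
boxplot(temperature$jul, temperature$aug, temperature$sep,
  names      = c("July", "August", "September"),
  horizontal = TRUE,
  col        = "lightgreen",
  xlab       = "Average temperature (degrees Celsius)",
  main       = "Historical UK temperatures"
)
```

The help files list many more options than these, so it is worth checking ?hist, ?plot.default and ?boxplot before relying on any one argument.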

Exercise 5.4. Draw a plot of your choice based on the Met Office temperature data, but use as many extra wacky options as you can. Make sure to title and label your plot, of course.