Let X be a random variable with expectation μ and variance σ². Suppose you want to predict the value of a sample from X. Then your best guess, in the sense of minimising the mean-square error, is to choose the expectation μ. That best possible mean-square error is equal to the variance σ².
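To see why, note the standard decomposition of the mean-square error of a guess c:

\[
\mathbb{E}\big[(X - c)^2\big] = \mathbb{E}\big[(X - \mu)^2\big] + (\mu - c)^2 = \sigma^2 + (\mu - c)^2,
\]

since the cross term vanishes. This is minimised by taking c = μ, leaving a mean-square error of σ².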

To put it another way, the variance σ² measures how hard it is to guess the outcome of X.

Now suppose we have two random variables X and Y. To keep things simple, let’s suppose they both have the same variance σ² (although the basic message of what I write holds more generally). Suppose now that we want to predict the difference between the outcomes of X and Y; that is, we want to predict the outcome of X − Y.

If X and Y are independent, then the variance of X − Y is 2σ². In other words, it’s twice as difficult (in the sense of minimum mean-square error) to predict the difference of X and Y as it is to predict either X or Y individually.
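This is just the additivity of variance for independent random variables:

\[
\operatorname{Var}(X - Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) = \sigma^2 + \sigma^2 = 2\sigma^2.
\]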

But what if X and Y are not independent? If X and Y have correlation ρ (a number between −1 and +1), then the variance of their difference is 2(1 − ρ)σ². This gives the curious result that if we have a strong positive correlation ρ > 1/2, then 2(1 − ρ)σ² < σ², so it is in fact easier to predict the difference X − Y than it is to predict either X or Y individually.
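This comes from the standard formula for the variance of a difference, using Cov(X, Y) = ρσ² when both variables have standard deviation σ:

\[
\operatorname{Var}(X - Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) - 2\operatorname{Cov}(X, Y) = 2\sigma^2 - 2\rho\sigma^2 = 2(1 - \rho)\sigma^2.
\]

Taking ρ = 0 recovers the independent case 2σ² above.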

The reason is that if X happens to come out above average then, when there’s positive correlation, Y is more likely to come out above average too, so the shared fluctuation largely cancels in the difference, which stays close to its expected value (at least approximately).
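If you want to check this numerically, here is a minimal simulation sketch in Python. It assumes NumPy is available and, purely for concreteness, takes X and Y to be jointly normal with illustrative values μ = 7, σ = 3 and ρ = 0.8 – none of these numbers come from the discussion above.

```python
import numpy as np

rng = np.random.default_rng(0)

mu, sigma, rho = 7.0, 3.0, 0.8  # illustrative values only
n = 100_000

# Draw n correlated pairs (X, Y), each marginally N(mu, sigma^2),
# with correlation rho between them.
cov = sigma**2 * np.array([[1.0, rho], [rho, 1.0]])
x, y = rng.multivariate_normal([mu, mu], cov, size=n).T

# Mean-square error of the best guesses: mu for X, and 0 for X - Y.
mse_x = np.mean((x - mu) ** 2)    # should be close to sigma^2 = 9
mse_diff = np.mean((x - y) ** 2)  # should be close to 2*(1 - rho)*sigma^2 = 3.6

print(f"MSE predicting X by mu:      {mse_x:.2f}")
print(f"MSE predicting X - Y by 0:   {mse_diff:.2f}")
```

Since ρ = 0.8 > 1/2 here, the second error comes out well below the first, matching the formula 2(1 − ρ)σ².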

One common example of X and Y having high correlation is when they measure the same quantity at different points in the future. For example, suppose X is the maximum temperature in Leeds on Christmas Day this year and Y the maximum temperature in Leeds on Boxing Day this year. Then X and Y are individually difficult to predict – perhaps somewhere between 1 and 13 degrees Celsius? But predicting the temperature difference between those two days is pretty easy – if you guess 0 degrees difference, there’s a pretty good chance that you’ll be within a couple of degrees either way.

A second example is when X and Y measure the same quantity under different “treatments” or interventions. For example, suppose half of a field is treated with a standard fertiliser and half with a more expensive but more effective fertiliser; let X be the height of the wheat in the half with the more expensive fertiliser three months later and Y the height of the wheat in the other half three months later. Then X and Y might individually be difficult to predict, because they depend on the quality of the soil, the weather conditions, and so on. But if the respective qualities of the fertilisers are well understood, then it might be much easier to predict the improvement X − Y that the more expensive fertiliser makes.

In 2016, the UK Treasury released estimates of how much leaving the EU would affect the GDP of the UK. That is, it was predicting X − Y, where X is the GDP if the UK leaves the EU and Y is the GDP if the UK remains in the EU. These are highly correlated – they are both strongly affected by general economic conditions, productivity, governance, wars, pandemics, etc. After the UK voted Leave, it became clear that the prediction for X was not very accurate – but that doesn’t necessarily mean that the prediction for X − Y was inaccurate too. The problem here, of course, is that there’s no way to assess the accuracy of predictions of Y or of X − Y.