Cranberry picking short-term temperature trends

If you look at the data and sort of cherry-pick a micro-trend within a bigger trend, that technique is particularly suspect
John Grego, professor of statistics at the University of South Carolina.

If you choose your start points and your end points carefully enough you can make it look as if any trend you want is happening.

Christopher Monckton, sacked UKIP climate change spokesman
& former chair and demolisher of UKIP Scotland.

Monckton is a heavy user of this disingenuous "technique" and should thus know better: you cannot get any trend, but people like Monckton unfortunately do have much leeway to deceive the population. This post will show that political activists can nearly always pick a politically correct period to get a short-term trend that is smaller than the long-term trend. After this careful selection they can pretend to be shocked that scientists did not tell them about this slowdown in warming.

Traditionally this strategy to pick only the data you like is called "cherry picking". It is such a deplorable deceptive strategy that "cherry picking" sounds too nice to me. I would suggest calling it "cranberry picking", on the assumption that people only eat cranberries when the burning while peeing is worse. Another good new name could be "wishful picking."

In a previous post, I showed that the uncertainty of short-term trends is huge, probably much larger than you think; the uncertainty monster can only stomach a few short-term trends for breakfast. Because of this large uncertainty, the influence of cranberry picking is probably also larger than you think. Even I was surprised by the calculations. I hope the uncertainty monster does not upset his stomach; he does not get the uncertainties he needs to thrive.

Size of short-term temperature fluctuations

To get some realistic numbers we first need to know how large the fluctuations around the long-term trend are. Thus let's first have a look at the size of these fluctuations in two surface temperature and two tropospheric temperature datasets:

the surface temperature of Berkeley Earth (formerly known as BEST),
the surface temperature of NASA-GISS: GISTEMP,
the satellite Temperature of the Total Troposphere (TTT) of Remote Sensing Systems (RSS),
the satellite Temperature of the Lower Troposphere (TLT version 6 beta) of the University of Alabama in Huntsville (UAH).

The four graphs below have two panels. The top panel shows the yearly average temperature anomalies over time as red dots. The Berkeley Earth data series starts earlier, but I only use data starting in 1880 because earlier data is too sparse and may thus not show actual climatic changes in the global mean temperature. For both surface temperature datasets the Second World War is removed because its values are not reliable. The long-term trend is estimated using a LOESS smoother and shown as a blue line.

The lower panel shows the deviations from the long-term trend as red dots. The standard deviation of these fluctuations over the full period is written in red. The graphs for the surface temperature also give the standard deviation of the deviations over the shorter satellite period, written in blue, for comparison with the satellite data. The period does not make much difference.
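As a rough illustration of this procedure, here is a minimal sketch that smooths a yearly series with a bare-bones LOESS (local linear regression with tricube weights) and computes the standard deviation of the deviations. The span `frac=0.3` is an assumption, not the setting used for the graphs, and the series is synthetic noise around a made-up linear trend, not one of the four datasets.

```python
import numpy as np

def loess(x, y, frac=0.3):
    """Local linear regression with tricube weights (a bare-bones LOESS).

    frac is the share of all points used in each local fit (assumed value).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    k = max(3, int(np.ceil(frac * len(x))))  # points in each local window
    fitted = np.empty_like(y)
    for i, x0 in enumerate(x):
        dist = np.abs(x - x0)
        bandwidth = np.sort(dist)[k - 1]  # distance to the k-th nearest point
        weights = np.clip(1.0 - (dist / bandwidth) ** 3, 0.0, None) ** 3
        # np.polyfit multiplies the residuals by w before squaring, hence the sqrt
        coef = np.polyfit(x - x0, y, deg=1, w=np.sqrt(weights))
        fitted[i] = coef[1]  # intercept = fitted value at x0
    return fitted

# Synthetic stand-in for a yearly anomaly series (NOT real temperature data)
rng = np.random.default_rng(42)
years = np.arange(1880, 2017)
anomalies = 0.007 * (years - 1880) + rng.normal(0.0, 0.086, size=years.size)

long_term_trend = loess(years, anomalies)
deviations = anomalies - long_term_trend
print(f"standard deviation of the deviations: {deviations.std(ddof=1):.3f} °C")
```

A wider span gives a stiffer trend line and slightly larger deviations, a narrower span the opposite, so the exact number depends somewhat on this choice.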

Both tropospheric datasets have fluctuations with a typical size (standard deviation) of 0.14 °C. The standard deviation of the surface datasets varies a little depending on the dataset or period. For the rest of this post I will use 0.086 °C as a typical value for the surface temperature.

The tropospheric temperature clearly shows more short-term variability. This mainly comes from El Nino, which has a stronger influence on the temperature high up in the air than on the surface temperature. This larger noise level gives the impression that the trend in the tropospheric temperature is smaller, but the trend in the RSS dataset is actually about the same as the surface trend; see below.

The trend in the preliminary UAHv6 temperature is currently lower than all others. Please note that the changes from the previous version of UAH to the recent one are large and that the previous version of UAH showed more (recent) warming* and about the same trend as the other datasets.

Uncertainty of short-term trends

Already without cranberry picking, short-term trends are problematic because of the strong influence of short-term fluctuations. While an average value computed over 10 years of data is only 3 times as uncertain as a 100-year average, the uncertainty of a 10-year trend is 32 times as large as that of a 100-year trend.**
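These two factors can be checked with the textbook formulas for the standard error of a mean and of a least-squares slope. The sketch below assumes N equally spaced yearly values with independent (white) noise of standard deviation σ; real temperature fluctuations are somewhat correlated from year to year, so take it as an approximation.

$$
\sigma_{\bar{y}} = \frac{\sigma}{\sqrt{N}},
\qquad
\sigma_{b} = \frac{\sigma}{\sqrt{\sum_{i=1}^{N} (t_i - \bar{t})^2}} = \sigma \sqrt{\frac{12}{N(N^2 - 1)}}
$$

$$
\frac{\sigma_{\bar{y}}(10)}{\sigma_{\bar{y}}(100)} = \sqrt{\frac{100}{10}} \approx 3.2,
\qquad
\frac{\sigma_{b}(10)}{\sigma_{b}(100)} = \sqrt{\frac{100 \cdot 9999}{10 \cdot 99}} \approx 31.8
$$

The uncertainty of a mean shrinks with the square root of the length, while the uncertainty of a trend shrinks roughly with the length to the power 1.5.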

To study how accurate a trend is you can generate random numbers and compute their trend. On average this trend will be zero, but due to the short-term fluctuations any individual realization will have some trend. By repeating this procedure often you can study how much the trend varies due to the short-term fluctuations, how uncertain the trend is, or more positively formulated: what the confidence interval of the trend is. See my previous post for details. I have done this for the graph below; for the satellite temperatures the random numbers have a standard deviation of 0.14 °C, for the surface temperatures 0.086 °C.

The graph below shows the confidence interval of the trends, which is two times the standard deviation of 10,000 trends computed from 10,000 series of random numbers. A 10-year trend of the satellite temperatures, which may sound like a decent period, has a whopping uncertainty of 3 °C per century.*** This means that with no long-term trend the short-term trend will vary between -3 °C and +3 °C per century in 95% of the cases, and in the other 5% even more. That is the uncertainty from the fluctuations alone; there are additional uncertainties due to changes in the orbit, the local time the satellite observes, calibration and so on.
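A minimal sketch of such a simulation could look as follows. It assumes independent, normally distributed yearly values; the function name `trend_uncertainty` and the seed are my own choices for illustration, not the code behind the graph.

```python
import numpy as np

def trend_uncertainty(n_years, sigma, n_series=10_000, seed=0):
    """Two standard deviations of the trends of pure noise, in °C per century."""
    rng = np.random.default_rng(seed)
    years = np.arange(n_years)
    # One column per random series, one row per year
    noise = rng.normal(0.0, sigma, size=(n_years, n_series))
    # With a 2-D y, np.polyfit fits every column separately; row 0 holds the slopes
    slopes = np.polyfit(years, noise, deg=1)[0]
    return 2.0 * slopes.std() * 100.0  # per year -> per century

for label, sigma in [("satellite", 0.14), ("surface", 0.086)]:
    for n_years in (10, 20, 30):
        value = trend_uncertainty(n_years, sigma)
        print(f"{label:9s} {n_years:2d}-year trend: ±{value:.1f} °C per century")
```

For 10-year trends and a noise level of 0.14 °C this should land near the 3 °C per century mentioned above; the surface value is smaller because the noise is smaller.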

Cherry picking the begin year

To look at the influence of cranberry picking, I generated series of 30 values, computed all possible trends between 10 and 30 years and selected the smallest trend. The confidence intervals of these cranberry picked satellite temperature trends are shown below in red. For comparison the intervals for trends without cranberry picking, like above, are shown in blue. To show both cases clearly in the same graph, I have shifted both bars a little away from each other.

The situation is similar for the surface temperature trends. However, because the data is less noisy, the confidence intervals of the trends are smaller; see below.

While the short-term trends without cranberry picking have a huge uncertainty, on average they are zero. With cranberry picking the average trends are clearly negative, especially for shorter trends, showing the strong influence of selecting a specific period. Without cranberry picking half of the trends are below zero, with cranberry picking 88% of the trends are negative.
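The experiment can be sketched in a few lines. This is a minimal version under my own assumptions (white noise, all trends end in the last year of the series, trend lengths of 10 to 30 years); the exact percentage it prints need not match the number above.

```python
import numpy as np

def cranberry_pick_begin_year(sigma, n_years=30, min_len=10,
                              n_series=10_000, seed=0):
    """Smallest trend (°C per century) over all begin years, per noise series."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=(n_years, n_series))
    candidates = []
    for length in range(min_len, n_years + 1):
        window = noise[n_years - length:]  # last `length` years of every series
        candidates.append(np.polyfit(np.arange(length), window, deg=1)[0])
    # For every series keep only the trend a cranberry picker would show
    return np.min(candidates, axis=0) * 100.0

picked = cranberry_pick_begin_year(sigma=0.14)
print(f"mean picked trend: {picked.mean():.2f} °C per century")
print(f"share of negative picked trends: {(picked < 0).mean():.0%}")
```

Every single candidate trend is zero on average; the negative bias comes purely from taking the minimum.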

Cherry picking the period

For some, the record temperatures of the last two years are not a sign that they were wrong to see a "hiatus". Some claim that there was something like a "pause" or a "slowdown" since 1998, but that it recently stopped. This claim gives even more freedom for cranberry picking. Now the end year is cranberry picked as well. To see how bad this is, I again generated noise and selected the period lasting at least 10 years with the lowest trend and ending this year, one year earlier, or two years earlier.

The graphs below compare the range of trends you can get with cranberry picking the begin and end year in green with "only" cranberry picking the begin year like before in red. With double cranberry picking 96% of the trends are negative and the trends are going down even more. (Mitigation skeptics often use this "technique" by showing an older plot, when the newer plot would not be as "effective".)
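To see why the extra freedom can only make things worse, here is a small sketch that applies both kinds of picking to the same noise. The helper names and the series length of 30 years are assumptions for illustration; the printed shares will not necessarily reproduce the percentages above.

```python
import numpy as np

def ols_slope(window):
    """Least-squares slope per year for every column (time runs along axis 0)."""
    t = np.arange(window.shape[0]) - (window.shape[0] - 1) / 2.0
    return t @ window / (t @ t)

def lowest_trend(noise, end_offsets, min_len=10):
    """Lowest trend (°C per century) over all allowed begin and end years."""
    n_years = noise.shape[0]
    candidates = [
        ols_slope(noise[end - length:end])
        for end in (n_years - offset for offset in end_offsets)
        for length in range(min_len, end + 1)
    ]
    return np.min(candidates, axis=0) * 100.0

rng = np.random.default_rng(1)
noise = rng.normal(0.0, 0.14, size=(30, 10_000))  # 10,000 trendless series

begin_only = lowest_trend(noise, end_offsets=(0,))
begin_and_end = lowest_trend(noise, end_offsets=(0, 1, 2))

print(f"begin year picked:         {(begin_only < 0).mean():.0%} negative")
print(f"begin and end year picked: {(begin_and_end < 0).mean():.0%} negative")
```

Because the second search includes every period of the first, its lowest trend can never be higher for any individual series.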

A negative trend in the above examples of random numbers without any trend would be comparable to a real dataset where a short-term trend is below the long-term trend. Thus by selecting the "right" period, political activists can nearly always claim that scientists talking about the long-term trend are exaggerating because they do not look at this highly interesting short period.

In US political practice the cranberry picking will be worse. Activists will not only pick a period of their political liking, but also the dataset, variable, region, depth, season, or resolution that produces a graph that can be misinterpreted. The more degrees of freedom, the stronger the influence of cranberry picking.

Solutions

There are a few things you can do to protect yourself against making spurious judgements.

1. Use large datasets. You can see in the plots above that the influence of cranberry picking is much smaller for the longer trends. For a 30-year period the difference between the blue confidence intervals for a typical 30-year period and the red confidence intervals for a cranberry picked 30-year period is small. Had I generated series of 50 random numbers rather than 30 numbers, this would likely have shown a larger effect of cranberry picking on 30-year trends, but still a lot smaller than on 10-year trends.

2. Only make statistical tests for relationships you expect to exist. This limits your freedom and the chance that one of the many possible statistical tests is spuriously significant. If you make 100 statistical tests of pure noise, 5 of them will on average be spuriously significant.

There was no physical reason for global warming to stop or slow down after 1998. No one computed the trend since 1998 because they had a reason to expect a change. They computed it because their eyes had seen something; that makes the trend test cranberry picking by definition. The absence of a reason should have made people very careful. The more so because there was a good reason to expect spurious results starting in a large El Nino year.
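The rule of thumb in point 2, that about 5 out of 100 tests on pure noise come out spuriously significant, is easy to check numerically. The sketch below tests trends in white noise at the 5% level and, as a simplifying assumption, treats the noise level as known, so that a plain 1.96-sigma threshold applies.

```python
import numpy as np

rng = np.random.default_rng(7)
n_years, n_tests, n_repeats, sigma = 30, 100, 1_000, 1.0

# Centred time axis and the standard error of a least-squares slope
t = np.arange(n_years) - (n_years - 1) / 2.0
slope_se = sigma / np.sqrt(t @ t)

# n_repeats batches of n_tests trendless series each
noise = rng.normal(0.0, sigma, size=(n_repeats, n_tests, n_years))
slopes = noise @ t / (t @ t)

# Two-sided test at the 5% level: "significant" although there is no trend
significant = np.abs(slopes) > 1.96 * slope_se
hits_per_batch = significant.sum(axis=1)

print(f"average number of spurious hits per 100 tests: {hits_per_batch.mean():.1f}")
print(f"batches with at least one spurious hit: {(hits_per_batch > 0).mean():.0%}")
```

The second number is the more sobering one: if you are free to run many tests, finding at least one "significant" result in pure noise is nearly guaranteed.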

3. Study the reasons for the relationship you found. Even if I had wrongly seen the statistical evidence for a trend decrease as credible, I would not have made a big point of it before I had understood the reason for this trend change. In the "hiatus" case the situation was even reversed: it was clear from the beginning that most of the fluctuations that gave the appearance of a "hiatus" in the eyes of some were due to El Nino. Thus there was a perfectly fine physical reason not to claim that there was a change in the trend.

There is currently a strong decline in global sea ice extent. Before I cry wolf, accuse scientists of fraud and understating the seriousness of climate change, I would like to understand why this decline happened.

4. Use the right statistical test. People have compared the trends before and after 1998 and their uncertainties. These trend uncertainties are not valid for cherry picked periods. In this case, the right test would have been one for a trend change at an unknown position/year. There was no physical reason to expect a real trend change in 1998, thus the statistical test should take into account that the actual reason you make the test is that your eye sampled all possible years.
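To give an idea of what point 4 means in practice, the sketch below estimates by simulation how large a trend difference has to be before it is significant when the break year was chosen by looking at the data. It is a simplified illustration, not an established break test from the literature: it assumes white noise with a known standard deviation, fits the two segments independently, and only considers segments of at least 10 years. All function names are hypothetical.

```python
import numpy as np

def slope(y):
    """Least-squares slope for every column of y (time runs along axis 0)."""
    t = np.arange(y.shape[0]) - (y.shape[0] - 1) / 2.0
    return t @ y / (t @ t)

def slope_se(n, sigma):
    """Standard error of the slope of n white-noise values."""
    t = np.arange(n) - (n - 1) / 2.0
    return sigma / np.sqrt(t @ t)

def max_break_statistic(y, sigma, min_len=10):
    """Largest standardized trend difference over all candidate break years."""
    n = y.shape[0]
    stats = []
    for k in range(min_len, n - min_len + 1):
        diff = slope(y[k:]) - slope(y[:k])
        se = np.hypot(slope_se(k, sigma), slope_se(n - k, sigma))
        stats.append(np.abs(diff) / se)
    return np.max(stats, axis=0)

def critical_value(n_years, sigma=1.0, n_series=10_000, seed=0):
    """95th percentile of the maximum statistic for series without any break."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=(n_years, n_series))
    return np.quantile(max_break_statistic(noise, sigma), 0.95)

crit = critical_value(n_years=40)
print(f"threshold for a break year fixed in advance: 1.96")
print(f"threshold for a break year picked by eye:    {crit:.2f}")
```

The simulated threshold is clearly higher than 1.96: a trend difference that looks significant for a break year fixed in advance may be entirely unremarkable once the search over years is taken into account.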

Against activists doing this kind of thing we cannot do much, except try to inform their readers how deceptive this strategy is. For example by linking to this post. Hint, hint.

Let me leave you with a classic Potholer54 video delicately mocking Monckton's cranberry picking to get politically convenient global cooling and melting ice trends.

Variable Variability

Monday 16 January 2017