Remember the Watts et al. manuscript in 2012? Anthony Watts putting his blog on hold to urgently finish his draft? This study is now a poster at the AGU conference and Watts promises to submit it soon to an undisclosed journal.
At first sight, the study now has a higher technical quality and some problems have been solved. The two key weaknesses are, however, not discussed in the press release to the poster. This is strange. I have had long discussions with second author Evan Jones about this. Scientists (real sceptics) have to be critical about their own work. You would expect a scientist to devote a large part of a study to any weaknesses, if possible showing that they probably do not matter, or else at least honestly confronting them rather than simply ignoring them.
Watts et al. is about the immediate surroundings, also called micro-siting, of weather stations that measure the surface air temperature. The American weather stations have been assessed for their quality in five categories by volunteers of the blog WUWT. Watts and colleagues call the two best categories "compliant" and the three worst ones "non-compliant". For these two groups they then compare the average temperature signal for the 30-year period 1979 – 2008.
An important problem of the 2012 version of this study was that historical records typically also contain temperature changes because the method of observation has changed. An important change in the USA is the time of observation bias. In the past, observations were more often made in the afternoon than in the morning. Morning measurements result in somewhat lower temperatures. This change in the time of observation creates a bias of about 0.2°C per century and was ignored in the 2012 study. Even the auditor, Steve McIntyre, who was then a co-author, admitted this was an error. This problem is now fixed; stations with a change in the time of observation have been removed from the study.
A much used type of AWS in the USA is the MMTS. America was one of the first countries to automate its network, with then analogue equipment that did not allow for long cables between the sensor and the display, which is installed inside a building. Furthermore, the technicians only had one day per station and as a consequence many of the MMTS systems were badly sited. Although they are badly sited, these MMTS systems typically measure
Two weaknesses
Weakness 1 is that the authors only know the siting quality at the end of the period. Stations in the compliant categories may have been less well sited earlier on, while stations in the non-compliant categories may have been better sited before.
Someone has a weather station in a parking lot. Noticing their error, they move the station to a field, creating a great big cooling-bias inhomogeneity. Watts comes along, and seeing the station correctly set up says: this station is sited correctly, and therefore the raw data will provide a reliable trend estimate.
The study tries to reduce this problem by creating a subset of stations that is unperturbed by Time of Observation changes, station moves, or rating changes, at least according to the station history (metadata). The problem is that metadata is never perfect.
The scientists working on homogenization thus advise to always also detect changes in the observational methods (inhomogeneities) by comparing a station to its neighbours. I have told Evan Jones how important this is, but they refuse to use homogenization methods because they feel homogenization does not work. In a scientific paper, they will have to provide evidence to explain why they reject an established method that could ameliorate a serious problem with their study. The irony is that the MMTS adjustments, which the Watts et al. study does use, depend on the same principle.
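The principle of comparing a station to its neighbours can be sketched in a few lines. This is only a toy illustration with made-up numbers, not the algorithm used by NOAA or by any published homogenization method; the function names `difference_series` and `largest_step` are hypothetical. The idea is that the regional climate signal is shared by nearby stations, so it cancels in the difference series, and a change that only affects the candidate station stands out as a jump.

```python
from statistics import mean


def difference_series(candidate, neighbours):
    """Candidate minus the mean of its neighbours, year by year."""
    return [c - mean(vals) for c, vals in zip(candidate, zip(*neighbours))]


def largest_step(diff, min_segment=3):
    """Return (index, size) of the split with the largest jump in segment means."""
    best_index, best_size = None, 0.0
    for k in range(min_segment, len(diff) - min_segment + 1):
        size = mean(diff[k:]) - mean(diff[:k])
        if abs(size) > abs(best_size):
            best_index, best_size = k, size
    return best_index, best_size


# Made-up regional signal (degrees C) shared by all stations.
signal = [10.0, 10.4, 9.8, 10.6, 10.1, 10.9, 10.3, 11.0, 10.5, 11.2, 10.8, 11.4]

# Neighbours differ from the regional signal only by a constant offset.
neighbours = [
    [t + 0.3 for t in signal],
    [t - 0.2 for t in signal],
    [t + 0.1 for t in signal],
]

# Candidate station with an undocumented 0.5 degree cooling from year 6 onward.
candidate = [t + (0.0 if i < 6 else -0.5) for i, t in enumerate(signal)]

index, size = largest_step(difference_series(candidate, neighbours))
print(f"largest step at year index {index}, size {size:+.2f} degrees C")
```

In real data the neighbours have their own noise and their own breaks, which is why operational methods compare many station pairs and test the jumps for statistical significance.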
Weakness 2 is that the result is purely statistical and that no physical explanation is provided for the result. It is clear that bad micro-siting will lead to a temperature bias, but a constant bias does not affect the trend, while the study shows a difference in trend. I would not know how a constant siting quality, whether good or bad, would change a trend. The press release also does not offer an explanation.
What makes this trend difference even more mysterious, if it were real, is that it mainly happens in the 1980s and 1990s, but has stopped in the last decade. See the graph below showing the trend for compliant (blue) and non-compliant stations (orange).
[UPDATE. The initial period in which the difference builds up, and the fact that since 1996 the trends for "compliant" and "non-compliant" stations are the same, can be seen better in the graph below, computed from the data in the above figure as digitized by George Bailley. (No idea what the unit of the y-axis is on either of these graphs. Maybe 0.001°C.)]
That the Watts phenomenon has stopped is also suggested by a comparison of the standard USA climate network (USHCN) and a new climate-quality network with perfect siting (USCRN) shown below. The pristine network even warms a little more. (Too little to be interpreted.)
While I am unable to see a natural explanation for the trend difference, the fact that the difference is mainly seen in the first two decades fits the hypothesis that the siting quality of the compliant stations was worse in the past: that in the past these stations were less compliant and a little too warm. The further you go back in time, the more likely it becomes that some change has happened. And the further you go back in time, the more likely it is that this change is no longer known.
Six key findings
Below I have quoted the six key findings of Watts et al. (2015) according to the press release.
1. Comprehensive and detailed evaluation of station metadata, on-site station photography, satellite and aerial imaging, street level Google Earth imagery, and curator interviews have yielded a well-distributed 410 station subset of the 1218 station USHCN network that is unperturbed by Time of Observation changes, station moves, or rating changes, and a complete or mostly complete 30-year dataset. It must be emphasized that the perturbed stations dropped from the USHCN set show significantly lower trends than those retained in the sample, both for well and poorly sited station sets.
The temperature network in the USA has on average one detectable break every 15 years (and a few more breaks that are too small to be detected, but can still influence the result). The 30-year period studied should thus contain on average 2 breaks and likely only 12.6% of the stations do not have a break (154 stations). According to Watts et al. 410 of 1218 stations have no break. 256 stations (more than half their "unperturbed" dataset) thus likely have a break that Watts et al. did not find.
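The arithmetic behind these numbers can be checked with a short script. It is a sketch under one assumption that is mine, not something stated by Watts et al.: every year a station independently has a 1-in-15 chance of a detectable break. The variable names are only for illustration.

```python
# Assumption: each year independently has a 1-in-15 chance of a detectable break.
BREAK_INTERVAL_YEARS = 15   # one detectable break every 15 years on average
PERIOD_YEARS = 30           # 1979 - 2008
N_USHCN = 1218              # stations in the USHCN network
N_UNPERTURBED = 410         # stations Watts et al. classify as unperturbed

# Expected number of breaks per station over the study period.
expected_breaks = PERIOD_YEARS / BREAK_INTERVAL_YEARS

# Probability that a station has no break in any of the 30 years.
p_no_break = (1 - 1 / BREAK_INTERVAL_YEARS) ** PERIOD_YEARS

# Number of stations expected to be truly break-free.
expected_clean = round(p_no_break * N_USHCN)

# "Unperturbed" stations that likely still contain an undetected break.
likely_missed = N_UNPERTURBED - expected_clean

print(f"expected breaks per station: {expected_breaks:.0f}")
print(f"probability of no break:     {p_no_break:.1%}")
print(f"expected break-free stations: {expected_clean}")
print(f"likely missed: {likely_missed} ({likely_missed / N_UNPERTURBED:.0%} of the subset)")
```

With a different break model the percentages would shift somewhat, but the conclusion that a large part of the "unperturbed" subset likely contains breaks would remain.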
That the "perturbed" stations have a smaller trend than the "unperturbed" stations confirms what we know: that in the USA the inhomogeneities have a cooling bias. In the "raw" data the "unperturbed" subset has a trend in the mean temperature of 0.204°C per decade; see table below. In the "perturbed" subset the trend is only 0.126°C per decade. That is a whopping cooling difference of 0.2°C over this period.
Table 1 of Watts et al. (2015)
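The 0.2°C difference follows directly from the two raw trends quoted above. A minimal check, treating the 1979 – 2008 period as three decades (the variable names are illustrative):

```python
TREND_UNPERTURBED = 0.204  # degrees C per decade, raw data, "unperturbed" subset
TREND_PERTURBED = 0.126    # degrees C per decade, raw data, "perturbed" subset
DECADES = 3                # 1979 - 2008 treated as 30 years

# Trend difference between the two subsets, per decade and over the period.
diff_per_decade = TREND_UNPERTURBED - TREND_PERTURBED
diff_over_period = diff_per_decade * DECADES

print(f"difference per decade:  {diff_per_decade:.3f} degrees C")
print(f"difference over period: {diff_over_period:.2f} degrees C")
```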
2. Bias at the microsite level (the immediate environment of the sensor) in the unperturbed subset of USHCN stations has a significant effect on the mean temperature (Tmean) trend. Well sited stations show significantly less warming from 1979 – 2008. These differences are significant in Tmean, and most pronounced in the minimum temperature data (Tmin). (Figure 3 and Table 1 [shown above])
The stronger trend difference for the minimum temperature would also need an explanation.
3. Equipment bias (CRS [Cotton Region Shelter] v. MMTS [Automatic Weather station] stations) in the unperturbed subset of USHCN stations has a significant effect on the mean temperature (Tmean) trend when CRS stations are compared with MMTS stations. MMTS stations show significantly less warming than CRS stations from 1979 – 2008. (Table 1 [shown above]) These differences are significant in Tmean (even after upward adjustment for MMTS conversion) and most pronounced in the maximum temperature data (Tmax).
The trend for the stations that use a Cotton Region Shelter is 0.3°C per decade. That is large and should be studied. This was the typical shelter in the past. Thus we can be quite sure that in these cases the shelter did not change, but there could naturally have been other changes.
4. The 30-year Tmean temperature trend of unperturbed, well sited stations is significantly lower than the Tmean temperature trend of NOAA/NCDC official adjusted homogenized surface temperature record for all 1218 USHCN stations.
It is natural that the trend in the raw data is smaller than the trend in the adjusted data. Mainly for the above-mentioned reasons (TOBS and MMTS), the biases in the USA are large compared to the rest of the world and the trend in the USA is adjusted upwards by 0.4°C per century.
5. We believe the NOAA/NCDC homogenization adjustment causes well sited stations to be adjusted upwards to match the trends of poorly sited stations.
Well, they already wrote "we believe". There is no evidence for this claim.
6. The data suggests that the divergence between well and poorly sited stations is gradual, not a result of spurious step change due to poor metadata.
The year-to-year variation in a single station series is about 1°C. With that much noise, I am not sure one could tell whether the inhomogeneity consists of one or more step changes or of a gradual change.
Review
If I were a reviewer of this manuscript, I would ask about some choices that seem arbitrary and I would like to know whether they matter. One example is using the period 1979 – 2008 and not continuing the data to 2015. It is fine to also show data until 2008 for better comparison with earlier papers, but stopping 7 years earlier is suspicious. The choice to drop stations with TOBS changes, but to correct stations with MMTS changes, also sounds strange. It would be of interest whether any of the other 3 combinations of dropping or correcting show different results. Anomalies should be computed over a period, not relative to the starting year.
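A minimal sketch of why the reference for the anomalies matters, with made-up numbers and hypothetical function names. Two series are identical except for one noisy first year. Relative to the starting year, that single value shifts the whole series; relative to a period mean, the effect is diluted over all years of the period.

```python
from statistics import mean


def anomalies_vs_period(temps, start, end):
    """Anomalies relative to the mean over temps[start:end]."""
    baseline = mean(temps[start:end])
    return [t - baseline for t in temps]


def anomalies_vs_first_year(temps):
    """Anomalies relative to the first value only."""
    return [t - temps[0] for t in temps]


# Made-up annual means (degrees C); only the first year differs.
series_a = [10.0, 10.2, 10.1, 10.3, 10.2]
series_b = [10.6, 10.2, 10.1, 10.3, 10.2]

# Offset between the two series in the final year, for both references.
offset_first = anomalies_vs_first_year(series_b)[-1] - anomalies_vs_first_year(series_a)[-1]
offset_period = anomalies_vs_period(series_b, 0, 5)[-1] - anomalies_vs_period(series_a, 0, 5)[-1]

print(f"offset relative to first year:  {offset_first:+.2f} degrees C")
print(f"offset relative to period mean: {offset_period:+.2f} degrees C")
```

The longer the reference period, the less a single unusual year influences the comparison between groups of stations.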
I hope that Anthony Watts and Evan M. Jones find the above comments useful. Jones wrote earlier this year:
Oh, a shout-out to Dr. Venema, one of the earlier critics of Watts et al. (2012) who pointed out to us things that needed to be accounted for, such as TOBS, a stricter hand on station moves, and MMTS equipment conversion.
Watts wrote in the side notes to his press release:
Note to Anthony: In terms of reasonable discussion, VV is way up there. He actually has helped to point us in a better direction. I think both Victor Venema and William Connolley should get a hat-tip in the paper (if they would accept it!) because their well considered criticism was of such great help to us over the months since the 2012 release. It was just the way science is supposed to be, like you read about in books.
Even input from openly hostile professional people, such as Victor Venema, have been highly useful, and I thank him for it.
Glad to have been of help. I do not recall having been "openly hostile" to this study. It would be hard to come to a positive judgement of the quality of the blog posts at WUWT, whether they are from the pathological misquoter Monckton or greenhouse effect denier Tim Ball.
However, it is always great when people contribute to the scientific literature. When the quality of their work meets the scientific standard, it does not matter what their motivation is; science can then learn something. The surface stations project is useful to learn more about the quality of the measurements, and also for trend studies if it is continued over the coming decades.
Comparison of Temperature Trends Using an Unperturbed Subset of The U.S. Historical Climatology Network
Anthony Watts, Evan Jones, John Nielsen-Gammon and John Christy
Abstract. Climate observations are affected by variations in land use and land cover at all scales, including the microscale. A 410-station subset of U.S. Historical Climatology Network (version 2.5) stations is identified that experienced no changes in time of observation or station moves during the 1979-2008 period. These stations are classified based on proximity to artificial surfaces, buildings, and other such objects with unnatural thermal mass using guidelines established by Leroy (2010). The relatively few stations in the classes with minimal artificial impact are found to have raw temperature trends that are collectively about 2/3 as large as stations in the classes with greater expected artificial impact. The trend differences are largest for minimum temperatures and are statistically significant even at the regional scale and across different types of instrumentation and degrees of urbanization. The homogeneity adjustments applied by the National Centers for Environmental Information (formerly the National Climatic Data Center) greatly reduce those differences but produce trends that are more consistent with the stations with greater expected artificial impact. Trend differences between the Cooperative Observer Network and the Climate Reference Network are not found during the 2005-2014 sub-period of relatively stable temperatures, suggesting that the observed differences are caused by a physical mechanism that is directly or indirectly caused by changing temperatures.
[UPDATE. I forgot to mention the obvious: After homogenization the trends Watts et al. (2015) computed are nearly the same for all five siting categories, just as they were for Watts et al. (2012) and the published study Fall et al. Thus for the data used by climatologists, the homogenized data, the siting quality does not matter. Just like before, they did not study homogenization algorithms and thus cannot draw any conclusions about them, but unfortunately they do.]
Related reading
Anthony Watts' #AGU15 poster on US temperature trends
Blog review of the Watts et al. (2012) manuscript on surface temperature trends
A short introduction to the time of observation bias and its correction
Comparing the United States COOP stations with the US Climate Reference Network
WUWT not interested in my slanted opinion
Some history from 2010
On Weather Stations and Climate Trends
The conservative family values of Christian man Anthony Watts
Watts not to love: New study finds the poor weather stations tend to have a slight COOL bias, not a warm one
Poorly sited U.S. temperature instruments not responsible for artificial warming