Sunday, March 12, 2017

Residuals of monthly global temperatures.

I have frequently written about the task of getting a global average surface temperature as one of spatial integration, as here or here. But there is more to be said about the statistical aspect; this is a continuation of what I wrote here about spatial sampling error. In this post, I'll follow a path rather like ANOVA, with a hierarchy of improving approximations leading to smaller and more random residuals. I'll also follow through on the more substantial application of the new WebGL system that I foreshadowed, to show how the residuals change over the surface.

So the aims of this post are:
  1. To see how various levels of approximation reduce the variance
  2. To see graphically how predictability is removed from the residuals. The idea here is that if we can get to iid residuals in known locations, that distribution should be extendable to unknown locations, giving a clear basis for estimation of coverage uncertainty.
  3. To consider the implications for accurate estimation of the global average. If each approximation is itself integrable, then the residuals contribute a smaller error. Unfortunately, they also become harder to integrate themselves, since smoothness is deliberately removed.
A table of contents will be useful:
  • Temperatures and mesh
  • Statistical modelling
  • Absolute temperatures and anomalies
  • Spherical harmonics spatial analysis
  • LOESS type prediction
  • WebGL residuals
  • Review of aims

Temperatures and mesh

I'll look at the GHCN/ERSST data for January 2017, as used in the TempLS calculation. It's the data I illustrated in the palettes post. The normals are actually those derived from spherical harmonics integration. Illustrations are based on the triangular mesh for TempLS, which is actually the convex hull of the stations.
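As a rough sketch of how such a mesh arises (the coordinates below are randomly generated, not the GHCN/ERSST stations), the convex hull of points on the unit sphere yields exactly this kind of triangulation:

```python
# Sketch: a triangular mesh of stations as the convex hull of points on
# the unit sphere. Coordinates are made up for illustration.
import numpy as np
from scipy.spatial import ConvexHull

def lonlat_to_xyz(lon_deg, lat_deg):
    """Convert longitude/latitude in degrees to unit-sphere Cartesian."""
    lon, lat = np.radians(lon_deg), np.radians(lat_deg)
    return np.column_stack([np.cos(lat) * np.cos(lon),
                            np.cos(lat) * np.sin(lon),
                            np.sin(lat)])

rng = np.random.default_rng(0)
lon = rng.uniform(-180.0, 180.0, 500)                      # hypothetical stations
lat = np.degrees(np.arcsin(rng.uniform(-1.0, 1.0, 500)))   # uniform on sphere

xyz = lonlat_to_xyz(lon, lat)
hull = ConvexHull(xyz)        # for points on a sphere, the convex hull
triangles = hull.simplices    # triangulates the entire surface
print(triangles.shape)        # (n_triangles, 3) vertex indices
```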

Statistical modelling

I am following an ANOVA pattern here. It's not exactly ANOVA because we don't have subsets with different treatments; I am looking at different predictors for the same data set. A statistical model for this data would divide it into an a priori predictable part and a remainder:
y = p + e
The remainder is identified with the residuals - what you get when you subtract the prediction. The first aim is to improve the prediction, as measured by the sum of squares, or variance, of the remainder. But I haven't yet called e the random component. That is because individual values may still be somewhat predictable from some pattern of the data (or residuals). The classic pattern here is autocorrelation. So that needs a further equation:
e = f + ε
where ε is an independently distributed random variable - no further prediction is possible. That is aspirational. The second equation may well involve lags or shifts, as in ARIMA models.
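As a small synthetic illustration of this two-stage split (none of the numbers are real temperature data), a remainder e can be unexplained by the prediction and yet still forecastable from its own lag:

```python
# Sketch: after subtracting a prediction p, the remainder e can still be
# partly predictable - here through lag-1 autocorrelation - so e = f + eps
# with eps the genuinely iid part. All data synthetic.
import numpy as np

rng = np.random.default_rng(1)
n = 1000
p = np.sin(np.linspace(0.0, 20.0, n))   # stand-in predictable part
eps = rng.normal(0.0, 0.3, n)           # iid innovations
e = np.empty(n)
e[0] = eps[0]
for t in range(1, n):
    e[t] = 0.7 * e[t - 1] + eps[t]      # autocorrelated remainder
y = p + e

resid = y - p                           # residuals after prediction
r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]
print(f"lag-1 autocorrelation of residuals: {r1:.2f}")  # ~0.7, not iid
```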

The distinction I am making here is analogous to that between the fixed and random effects of ANOVA. In fact, I won't be talking about fixed effects, although the boundary can be fuzzy. BEST uses them - it tries to predict first on the basis of things like latitude. I (and most others) rely on observed normals for that part, which I think is more direct, but is deduced from the data rather than a priori. And in this post I go on to look at the spatial variation pattern as a predictor. NOAA, for example, does this with EOFs; I'll use fitted spherical harmonics here. But the things to keep track of are:
  1. How much variance has been explained so far? and
  2. How random are the residuals?
I'll calculate the first part, and show the residuals on the global plots below (using the new WebGL system). The point of the latter is to see whether we are progressing toward iid randomness, or whether there is still more that could be usefully predicted.

I should add that by sum of squares here, I will generally mean an area-weighted sum, using the weights used for temperature average.
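For concreteness, a minimal sketch of that area-weighted variance (the function name and arguments are mine; w stands in for the temperature-average weights):

```python
import numpy as np

def weighted_variance(x, w):
    """Area-weighted variance: weighted sum of squares about the
    weighted mean, normalised by the total weight."""
    m = np.average(x, weights=w)
    return np.average((x - m) ** 2, weights=w)

# Percentage of variance remaining in residuals r relative to data y:
# 100 * weighted_variance(r, w) / weighted_variance(y, w)
```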

Absolute temperatures and anomalies

I have often written about why one should never average absolute temperatures, but rather anomalies (as have GISS and NOAA). I often put it in terms of sampling and homogeneity, but in a recent post I quantified it in terms of location sampling error. It also shows up in a sum-of-squares (SS) analysis, and that seems like a good place to start. So the first plot I'll show below is just the plot of temperature in °C (for Jan 2017), and the second is the plot of normals that emerge from the TempLS analysis. The point is that they are pretty much identical. This implies two things:
  • The plot doesn't tell us much about Jan 2017
  • Much of the variation is due to the spatial variation of normals
So let's quantify that:
            Variance of temperature    Variance of residuals
Total       215.225                    1.388
%           100                        0.64
Clearly the normals explain most of the variation - that is why they should be subtracted out immediately. I have seen interesting effects of not doing this. Steven Goddard has been the most prominent of a number of people who have tried to measure the effect of adjustments in USHCN over time by subtracting the mean of the adjusted stations from the mean of the unadjusted. But they are different sets of stations, and because normals contribute so heavily to the variation, what is really being measured is the difference in normals between the two sets. This can be easily shown by doing the same calculation using the normals instead of the actual data. The graph is almost the same, despite the fact that there is now no variation of actual weather, let alone adjustment. The variability of the normals swamps everything. There is another example here.
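A tiny synthetic demonstration of the point (all numbers invented): two station sets see identical weather and receive no adjustments at all, yet the difference of their raw means is large, coming entirely from the normals:

```python
# Sketch: two different station sets, identical anomaly, no adjustments.
# The difference of raw means is driven purely by the normals.
import numpy as np

rng = np.random.default_rng(2)
normals_a = rng.normal(12.0, 8.0, 800)   # set A climatology, degC (invented)
normals_b = rng.normal(10.0, 8.0, 750)   # set B, different stations
anomaly = 0.5                            # the same true anomaly everywhere

mean_a = np.mean(normals_a + anomaly)
mean_b = np.mean(normals_b + anomaly)
print(mean_a - mean_b)   # ~2 degC of spurious "difference" from normals alone
```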


Anomalies have a special status, because we are familiar with the idea of deviations from normals, and use that as our conventional measure of temperature change. I have plotted it below as the third plot. And it's clear that there is a lot of spatial correlation. That is the reason why it can be successfully interpolated and so integrated. And it is why the anomaly plot is much more informative than the first two plots, which just tell us about normals, rather than what happened in Jan 2017.
Anyway, the spatial correlation suggests that we can reduce the variance further by some spatial approximation. So, on to spherical harmonics.

Spherical harmonics spatial analysis

I have described spherical harmonics here and elsewhere. They are the sphere equivalent of Fourier functions. I use harmonics of order up to 10 here, a total of 121 functions. The harmonics are fitted and the residual found. The fit is shown in the fourth figure below, and the residuals in the fifth. The SH fit is what I show each month for TempLS. The results of fitting were:
            Variance of anomalies    Variance of residuals after SH
Total       1.388                    0.76
%           100                      54.7
This time there is less success. Explaining 45% of the variance is not nothing, but it is not huge either. The plot of residuals (fifth) shows why: there is still a lot of spatial coherence. Although the SH fit picks up the broad pattern, much of the contribution is at higher frequencies. A general extension of SH analysis to those frequencies won't work, because it breaks down in sparsely sampled parts of the world like Africa.
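For the curious, here is a sketch of how such a fit can be done. This is not the TempLS code; lon, lat, anom and w are placeholders for station longitudes, latitudes, anomalies and area weights. It builds a design matrix of real spherical harmonics up to order 10 (121 functions) and solves a weighted least-squares problem:

```python
# Sketch: weighted least-squares fit of real spherical harmonics.
import numpy as np
from scipy.special import sph_harm

def real_sh_basis(lon_deg, lat_deg, nmax=10):
    """Design matrix of real spherical harmonics, shape
    (n_points, (nmax+1)**2) - 121 columns for nmax=10."""
    theta = np.radians(lon_deg) % (2.0 * np.pi)   # azimuth
    phi = np.radians(90.0 - lat_deg)              # colatitude
    cols = []
    for n in range(nmax + 1):
        for m in range(-n, n + 1):
            Y = sph_harm(abs(m), n, theta, phi)   # complex SH
            if m < 0:
                cols.append(np.sqrt(2.0) * (-1) ** m * Y.imag)
            elif m == 0:
                cols.append(Y.real)
            else:
                cols.append(np.sqrt(2.0) * (-1) ** m * Y.real)
    return np.column_stack(cols)

def fit_sh(lon, lat, anom, w, nmax=10):
    """Fitted SH field at the stations; residual = anom - fit."""
    A = real_sh_basis(lon, lat, nmax)
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], anom * sw, rcond=None)
    return A @ coef
```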

LOESS type prediction

LOESS uses a local weighting function to fit a low-order polynomial to the nearby points, and that fit then becomes the predictor for the central point. The weights are usually radially symmetric, and I use an exponential. The rate of decrease is interesting; Hansen found that you can reasonably use exponentials that stretch out to 1200 km. But having taken out the low-frequency variation with harmonics, I get a slightly better result with a much shorter characteristic decay distance, optimally about 60 km. Here is the result:
            Variance of anomalies after SH    Variance of residuals after LOESS
Total       0.76                              0.59
%           100                               77.6
So not a huge further gain in variance explained. But the plots below do show a reduction in apparent spatial correlation. Note the effect of the color scale change here; the LOESS-smoothed residuals are quite small, despite the reds and blues. There is still some low-amplitude regular variation at sea. I think this is because the 60 km characteristic distance, which gives the most variance reduction, does not have much effect on the regular 4° ocean grid.
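A sketch of that kind of local predictor, simplified to an exponentially weighted local mean (a degree-0 fit rather than the full polynomial version); xyz are unit-sphere station coordinates as in the mesh sketch above, and resid the post-SH residuals:

```python
# Sketch: LOESS-type smoothing of residuals with an exponential
# distance weight. O(n^2) memory, fine for a few thousand stations.
import numpy as np

def loess_smooth(xyz, resid, decay_km=60.0, earth_km=6371.0):
    """For each station, the exponentially weighted mean of all
    residuals, with weight exp(-d / decay_km) for great-circle
    distance d in km."""
    chord = np.linalg.norm(xyz[:, None, :] - xyz[None, :, :], axis=-1)
    d = 2.0 * earth_km * np.arcsin(np.clip(chord / 2.0, 0.0, 1.0))
    w = np.exp(-d / decay_km)
    return (w @ resid) / w.sum(axis=1)

# new_resid = resid - loess_smooth(xyz, resid)
```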

WebGL residuals

Here is the plot of the various fields described above. For details, see the recent descriptions here and here. I have moved the radio buttons to the far right column.



Review of aims

These were:
  1. To see how various levels of approximation reduce the variance. The initial taking of anomalies made a huge reduction. Spatial fitting of spherical harmonics made a smallish reduction, but captured the low frequency behaviour. Further reduction by local LOESS smoothing made a small difference.
  2. To see graphically how predictability is removed from the residuals. The first calculation of anomaly actually increased predictability. Taking out low frequency spatial variation produced a much more random pattern of residuals, but still with some spatial correlation. Taking out the LOESS component did seem to remove more spatial pattern, without much improvement in variance.
  3. To consider the implications for accurate estimation of the global average. Generally in averaging, the data is approximated in some way, the approximation is integrated, and the integral of the residual is assumed to be zero. Here, the SH approximation is entirely integrable (and I think it is one of the best ways to integrate), the residuals are moderate, and their apparent near-randomness suggests that cancellation will justify ignoring the remainder. LOESS increases that randomness, even though it doesn't much reduce the magnitude of the remainder. However, the LOESS fit does not have an analytic form, unlike SH. I think it may be worth integrating numerically; the weights are available. The main reason would be to use the new residuals as an improved error estimate, since they appear to be more independent. That may be the subject of another post.





