To get a spatial average, you need a spatial integral. This process has been at the heart of my development of TempLS over the years. A numerical integral from data points ends up being a weighted sum of those points. In the TempLS algorithm, what is actually needed are those weights. But you get them by figuring out how best to integrate.
I started, over five years ago, using a scheme sometimes used in indices. Divide the surface into lat/lon cells, find the average for each cell with data, then make an area-weighted sum of those. I've called that the grid version, and it has worked quite well. I noted last year that it tracked the NOAA index very closely. That is still pretty much true. But a problem is that some regions have many empty cells, and these are treated as if they were at the global average, which may be a biased estimate.
Then I added a method based on an irregular triangle mesh. basically, you linearly interpolate between data points and integrate that approximation, as in finite elements. The advantage is that every area is approximated by local data. It has been my favoured version, and I think it still is.
I have recently described two new methods, which I expect to be also good. My idea in pursuing them is that you can have more confidence if methods based on different principles give concordant results. This post reports on that.
The first new method, mentioned here, uses spherical harmonics (SH). Again you integrate an approximant, formed by least squares fitting (regression). Integration is easy, because all but one (the zeroth, constant) of the SH give zero.
The second I described more recently. It is an upgrade of the original grid method. First it uses a cubed sphere to avoid having the big range of element areas that lat/lon has near the poles. And then it has a scheme for locally interpolating grid values which have no internal data.
I have now incorporated all four methods as options in TempLS. That involved some reorganisation, so I'll call the result Ver 3.1, and post it some time soon. But for now, I just want to report on that question of whether the "better" methods actually do produce more concordant results with TempLS.
The first test is a simple plot. It's monthly data, so I'll show just the last five years. For "Infilled", (enhanced grid) I'm using a 16x16 grid on each face, with the optimisation described here. For SH, I'm using L=10 - 121 functions. "Grid" and "Mesh" are just the methods I use for monthly reports.
The results aren't very clear, except that the simple grid method (black) does seem to be a frequent outlier. Overall, the concordance does seem good. You can compare with the plots of other indices here.
So I've made a different kind of plot. It shows the RMS difference between the methods, pairwise. By RMS I mean the square root of the average sum squares of difference, from now back by the number of years on the x-axis. Like a running standard deviation of difference.
This is clearer. The two upper curves are of simple grid. The next down (black) is of simple grid vs enhanced; perhaps not surprising that they show more agreement. But the advanced methods agree more. Best is mesh vs SH, then mesh vs infill. An interesting aspect is that all the curves involving SH head north (bad) going back more than sixty years. I think this is because the SH set allows for relatively high frequencies, and when large datafree sections start to appear, they can engage in large fluctuations there without restraint.
There is a reason why there is somewhat better agreement in the range 25-55 years ago. This is the anomaly base region, where they are forced to agree in mean. But that is a small effect.
Of course, we don't have an absolute measure of what is best. But I think the fact that the mesh method is involved in the best agreements speaks in its favour. The best RMS agreement is less than 0.03°C which I think is pretty good. It gives more confidence in the methods, and, if that were needed, in the very concept of a global average anomaly.