19 Jan 2021
The blessing of dimensionality for the analysis of climate data
 Danish Meteorological Institute, Copenhagen, Denmark
Abstract. We give a simple description of the blessing of dimensionality with the main focus on the concentration phenomena. These phenomena imply that in high dimensions independent random vectors from the same distribution have almost the same length and that independent vectors are almost orthogonal. In climate and atmospheric sciences we rely increasingly on ensemble modelling and face the challenge of analysing large samples of long time series and spatially extended fields. We show how the properties of high dimensions allow us to obtain analytical results for, e.g., correlations between sample members and the behaviour of the sample mean when the size of the sample grows. We find that the properties of high dimensionality can be applied to climate data with reasonable success. This is the case although most climate data show strong anisotropy and both spatial and temporal dependence, resulting in effective dimensions around 25–100.
Bo Christiansen
Status: final response (author comments only)

RC1: 'Comment on npg-2021-2', Anonymous Referee #1, 22 Mar 2021
In the present manuscript, various aspects of the blessing of dimensionality are analysed theoretically and experimentally. Specifically, high-dimensional (HD) spaces are shown to feature the phenomena of concentration of measure and waist concentration, which are explained at an abstract level, although with clear references to previous climatological research. The author shows that independent vectors in high dimensions can be taken to be orthogonal and that sample means can be treated as a constant. The extension to dependent samples is discussed by reviewing the notion of effective dimension and the mixing/decorrelation rates of a time-evolving system. Then, three HD climate data sets are studied and shown to match, generally, the earlier theoretical discussion. The CMIP5 data set was argued to possess somewhat dependent samples and thus the blessing of dimensionality (in the sense of concentration of measure and waist concentration) did not fully apply in this case. Exploiting the orthogonality of independent vectors in HD, the author finds that the correlation between sample differences is equal to 0.5 and that the correlation between sample differences and differences between ensemble mean and ensemble members is equal to 2^{-0.5}. This calculation is reflected in the climate data. The importance of sample size is also discussed in the context of HD data, whereby the author shows that the ensemble mean length converges to the true mean at a rate inversely proportional to the sample size. This fact is illustrated in the conceptual cases of Gaussian and Gamma distributions as well as two of the climate data sets.
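The two correlation values summarised above can be checked with a small simulation. The sketch below is not taken from the manuscript: it assumes independent, zero-mean, isotropic Gaussian vectors, treats one extra vector as a common reference (for instance an observation), and reads "sample differences" as differences between ensemble members and that reference. The function name `difference_correlations` and all parameter values are illustrative choices. Under these assumptions the first correlation should come out near 0.5 and the second near 1/sqrt(2), roughly 0.71.

```python
import numpy as np


def difference_correlations(n_dim=10000, n_members=200, seed=0):
    """Correlations between differences of independent high-dimensional vectors.

    Assumption (not from the manuscript): members and reference are i.i.d.
    standard Gaussian vectors of dimension n_dim.
    """
    rng = np.random.default_rng(seed)
    members = rng.standard_normal((n_members, n_dim))  # ensemble members
    reference = rng.standard_normal(n_dim)             # common reference vector
    ensemble_mean = members.mean(axis=0)

    d0 = members[0] - reference        # difference for member 0
    d1 = members[1] - reference        # difference for member 1
    dm = ensemble_mean - reference     # difference for the ensemble mean

    # Pearson correlation across the n_dim components.
    pair_corr = float(np.corrcoef(d0, d1)[0, 1])
    mean_corr = float(np.corrcoef(d0, dm)[0, 1])
    return pair_corr, mean_corr


pair_corr, mean_corr = difference_correlations()
print(f"member difference vs member difference: {pair_corr:.3f}")
print(f"member difference vs ensemble-mean difference: {mean_corr:.3f}")
```

Both values follow from near-orthogonality: the only shared term in the two differences is the reference vector, so the numerator is approximately its squared length, while the squared lengths of the differences are about twice that (member) or about equal to it (ensemble mean of a large ensemble).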
The paper is concise and well explained, linking theoretical findings with experimental ones. I would encourage the Journal to accept this manuscript if some minor comments are taken into account.
Specific minor corrections and comments:
Line 107: are the vectors a and b sampled from the Gaussian distribution mentioned above or are they generic?
Line 119: The inequalities in Eqs. (4) and (5) very much resemble a large deviation principle. Making a reference to large deviations theory will strengthen the connection of high dimensionality with the statistical mechanics theory presented in the paragraph starting in line 99.
Line 185: When presenting the climate data, it would be very useful for the reader to know the dimensions N and sample size K in each case. Certainly, this is done later when analysing each case, but I believe doing it earlier could help the reading.
Line 228: The author mentions that x^k - \bar x is orthogonal to \bar x in high dimensions. This fact follows from the concentration of measure result presented in section 2.2, so I would encourage the author to make a clear reference to the theory.
Line 240: The reference to the spikes of the unit cube is very helpful and in fact it illustrates the comment made on Line 228. The author might consider making this analogy earlier.
Clearly, the CMIP5 outputs deviated the most from having waist concentration, i.e. they provided non-orthogonal samples. This was attributed to high dependence. To what extent could it be attributed, instead, to a low effective dimension of the climatological fields? From a personal perspective, I’d be interested in hearing the author’s further views on this general question.
Technical Minor Corrections:
Line 18: “This might seem…”
Figure 1: In the caption there is no reference to the right panel (two references to the “left”).
Line 171: “Other measure is…”

AC1: 'Reply on RC1', Bo Christiansen, 27 Mar 2021
Thanks for the positive review and the constructive comments. I will consider them all in a revised version.
More specifically:
I will make the connection to 'large deviation theory' near Eqs. 3 and 4 and include a few relevant references.
I agree that it is a good idea to mention the ensemble size and dimension in the description of the different datasets in section 4. I will do that and perhaps include this information also in Table 2.
I will now mention the sample mean and why it is special already in the description of the unit cube (l70).
Regarding the deviation from waist concentration in CMIP5, I believe the negative skewness is due to dependence amongst models. But the width of the bulk of the distribution is probably due to the effective dimension. The width corresponds roughly to an effective dimension around 50 (compare Fig. 1, left). This is also near the effective dimension found in near-surface temperature as mentioned in section 3 (l151). I will try to expand on these arguments in the revised version.

RC2: 'Comment on npg-2021-2', Maarten Ambaum, 06 Jul 2021
This paper has a strong didactic focus; much of it is a review of convergence theorems and properties when a high number of independent variables are available, i.e. high dimensionality. A lot of it is reasonably well known, but the way it has been put together in this paper seems quite valuable. I am particularly pleased to see that basic ideas that have existed for a long time in statistical mechanics (thermodynamic limits, essentially) are here being highlighted as potentially applicable to climate data. The main result seems to be that a selection of climate data is shown to behave as if it is drawn from a (moderately) high-dimensional space. This has important repercussions for much literature that has not been highlighted in this paper, in my opinion, particularly around regime behaviour in geophysical data. NB: I am not suggesting you should now include such a discussion, but it may be something to think about as a further application.
In my view the paper is well written and can be published pretty much as is, except that I invite the author to address a few minor points/questions first. I describe them in order of appearance, below.
l.85 & l.92: "a constant" I know this phraseology is used in related literature, but I do not like it very much. I would have preferred to explicitly say something like "... a constant for different realizations of the sampling process," or something similar. Essentially this is a frequentist statistical argument, suggesting there is a fixed distribution mean (the "constant") which can be approximated by a sample mean.
l.116: "sub-Gaussian" I am happy with this word, but it may be useful to include a one-sentence definition of it (I am not sure how widely known the word is).
Section 3: the discussion of effective dimensionality is a well-worn topic in geophysics, and it remains an important topic. It reminds me of a paper I wrote some years ago (doi:10.1175/2007jas2298.1) on how suggested multimodality in a wave amplitude index for the atmospheric circulation possibly is a statistical fluke: it hinges on exactly the high dimensionality argument you are discussing here, but interpreted slightly differently. In effect, standard statistics (moments) from a high-dimensional data set are always expected to exhibit these "blessing of dimensionality" properties, and more fancy, nonlinear properties, such as multimodality, are more or less by definition excluded. I think this is an important application of the high dimensionality property, and I wonder whether you care to comment on it.
Section 4.2: I thought this section, despite its simplicity, was really thought-provoking. I tried to interpret this in light of the well-known "signal-to-noise paradox" (https://doi.org/10.1038/s4161201800384), as it seems to be germane to that problem. Does the author agree that his discussion here may shed light on the signal-to-noise problem? It would be a rather important application.
l.333 and Abstract: Perhaps I did not catch it, but the dimensionality of 25–100 seems to come a little bit out of thin air. Can you please highlight what this estimate is based on? Furthermore: Figs 2 & 3 show empirical distributions of \phi, which have a given shape for independent data (Eq. 5). I would have thought that you could fit a Gaussian to those distributions and thus estimate a value of N. Did you do this? Does it give you the 25–100 estimate?
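The kind of fit the referee describes can be sketched as follows. This is an illustration only, not the procedure used in the manuscript. It relies on a standard fact for independent isotropic vectors in N dimensions: the cosine of the angle between two such vectors has mean 0 and second moment exactly 1/N, so the inverse of the mean squared cosine gives a moment-based estimate of N (a simple stand-in for fitting a Gaussian centred on orthogonality). The function name `effective_dimension` and the synthetic test data are hypothetical. Dependence between sample members would bias such an estimate.

```python
import numpy as np


def effective_dimension(samples):
    """Moment-based dimension estimate from pairwise angles.

    samples: array of shape (K, N), one sample member per row.
    Assumption: members are independent and isotropic, so that
    E[cos^2(angle)] = 1/N for every pair of members.
    """
    samples = np.asarray(samples, dtype=float)
    # Normalise each member to unit length.
    unit = samples / np.linalg.norm(samples, axis=1, keepdims=True)
    # Cosines of the angles between all pairs of members.
    cosines = unit @ unit.T
    upper = np.triu_indices(len(samples), k=1)  # each pair once, no diagonal
    pairwise = cosines[upper]
    # Second moment about zero (i.e. about orthogonality) estimates 1/N.
    return 1.0 / float(np.mean(pairwise ** 2))


# Synthetic check with a known dimension (illustrative values).
rng = np.random.default_rng(1)
synthetic = rng.standard_normal((300, 50))
print(f"estimated dimension: {effective_dimension(synthetic):.1f}")
```

For the synthetic isotropic data the estimate should land close to the true dimension of 50. For real fields with spatial or temporal dependence, the same number would be read as an effective dimension rather than the number of grid points or time steps.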

AC2: 'Reply on RC2', Bo Christiansen, 12 Jul 2021
Thanks for the positive and constructive review.
I am in particular pleased that the reviewer likes the connection between climate science and statistical mechanics.
In the revised manuscript I will make changes to address the reviewer's minor comments.
More specifically:
1) I will briefly define 'sub-Gaussian' as a distribution whose tails decay at least as fast as the tails of a Gaussian distribution.
2) I will consider the use of 'a constant', although I don't really understand the reviewer's concern.
3) There are many areas of machine learning that are influenced by the 'curse/blessing of dimensionality', e.g., distance-based methods may be affected by the almost identical distances between random vectors. But I haven't thought much about the relevance of the curse/blessing for regime detection and multimodality, and it is several years since I read the reviewer's paper. However, I will consider including a few lines about the relevance of the 'curse of dimensionality' for regime detection.
4) I have had the same idea about the 'signal-to-noise paradox' and the 'curse of dimensionality'. In fact, the 'curse of dimensionality' can be used to derive analytical approximations for correlations between observations and ensemble mean such as those given in Zhang et al. 2021 (10.1007/s00382020056218) and Siegert et al. 2016 (10.1175/JCLID150196.1). These approximations give the mean, but we can also, with some effort, derive approximations for the spread. I am in the process of writing this up.
5) There are actually a few more details and references regarding the number of degrees of freedom in the paragraph beginning at l151. The widths of the distributions of the angles are related to the degrees of freedom, but there is also an effect of the model dependence, as mentioned in l214, that will disturb a direct calculation of the degrees of freedom.
