The blessing of dimensionality for the analysis of climate data

We give a simple description of the blessing of dimensionality with the main focus on concentration phenomena. These phenomena imply that in high dimensions independent random vectors drawn from the same distribution have almost the same length and are almost orthogonal. In climate and atmospheric sciences we rely increasingly on ensemble modelling and face the challenge of analysing large samples of long time-series and spatially extended fields. We show how the properties of high dimensions allow us to obtain analytical results for, e.g., correlations between sample members and the behaviour of the sample mean when the size of the sample grows. We find that the properties of high dimensionality can be applied to climate data with reasonable success. This is the case although most climate data show strong anisotropy and both spatial and temporal dependence, resulting in effective dimensions around 25-100.

These advantageous properties of high dimensionality - often referred to as the blessing of dimensionality - have rarely been applied in the atmospheric and climate sciences. Exceptions are our previous papers on the subject. In Christiansen (2018) we described how the blessing of dimensionality explains why the ensemble mean often outperforms the individual ensemble members and why the ensemble mean often has an error that is 30 % smaller than the median error of the individual ensemble members. In Christiansen (2019) we used the properties of high dimensions to analyse a global ensemble reforecast. We described how the behaviour of the ensemble mean forecast can be captured by a simple model in which variances and bias depend on lead-time. In Christiansen (2020) we analyzed a multi-model climate ensemble using the properties of high dimensions to separate two competing understandings of the ensemble: the indistinguishable interpretation and the truth centered interpretation.

In this paper we aim to give a more comprehensive and coherent discussion of the blessing of dimensionality and of the extent to which it applies to the situation in atmospheric science. In section 2 we describe the properties of high dimensional spaces, focusing first on what is often called the curse/blessing of dimensionality (subsection 2.1) and then more specifically on the concentration of measures (subsection 2.2). The mathematical results are often only proved for independent and identically distributed (iid) random variables. In section 3 we discuss how this requirement can be loosened and how it relates to geophysical fields, which often contain strong temporal and spatial dependence. In section 4 we focus on the application to atmospheric and climate science. First, in subsection 4.1, we directly investigate to which extent climatic fields fulfill the requirements of high dimensionality.
We then (subsections 4.2 and 4.3) discuss analytical results for distances and correlations between samples and how well these hold for climate fields. In subsection 4.4 we likewise explore analytical results for how the ensemble mean depends on ensemble size. The paper closes with the conclusions in section 5.
2 Properties of high dimensional spaces

Here we give a brief overview of the properties of high dimensional spaces. We begin in subsection 2.1 with some general considerations about high dimensional spaces, while in subsection 2.2 we focus more on the concentration of measures. Some of the simple examples were also described, more briefly, in Christiansen (2018).

2.1 Curse of dimensionality
The properties of high dimensional spaces often defy our intuition based on two and three dimensions (Cherkassky and Mulier, 2007; Bishop, 2007; Blum et al., 2020). Apart from the well-known fact - sometimes called the empty space phenomenon - that the number of samples needed to obtain a given coverage grows exponentially with dimension, there are other less appreciated features of high dimensional spaces (Blum et al., 2020). For example, almost every point is an outlier in its own projection and independent vectors are almost always orthogonal. The latter property is called waist concentration and, more precisely, states that when the dimension increases the angles between independent vectors become narrowly distributed around the mean π/2 with a variance that converges towards zero.

Table 1. Results for a unit cube in N dimensions. The vertices of the unit cube [-1/2, 1/2]^N are (±1/2, ±1/2, ..., ±1/2). The number of vertices is 2^N and the distance of the vertices from the center is √N/2. The fraction of the volume within ε of the edge is 1 - (1 - ε)^N. The volume of the inscribed sphere is π^(N/2) (d/2)^N / Γ(N/2 + 1) with d = 1.

The properties of high dimensional spaces are sometimes called the curse and sometimes the blessing of dimensionality depending on the considered problem. In the present context these properties turn out to be a blessing as they strongly simplify the analysis and make analytical results possible.
As a simple example we consider a cube in N dimensions with side d and centered around 0. The cube has 2^N vertices at the positions d(±1/2, ±1/2, ..., ±1/2). The distance between each vertex and the center is d√N/2. The fraction of the cube's volume within a distance εd of the edge is (d^N - (d - εd)^N)/d^N = 1 - (1 - ε)^N, and the volume of the inscribed sphere is π^(N/2) (d/2)^N / Γ(N/2 + 1). The situation is shown in Table 1 for a unit cube (d = 1) for different values of N. For N = 100 there are more than 10^30 vertices and more than 99 % of the volume is within a distance 0.05 of the edge. The volume of the inscribed sphere - which for 2 dimensions contains the bulk of the cube - is virtually zero. Thus, the volume increasingly concentrates near the surface when the dimension increases. The form of the N-dimensional cube has been compared to that of a sea urchin (Hecht-Nielsen, 1990).
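The entries of Table 1 are easy to reproduce numerically. A minimal sketch (the function names are ours) using only the Python standard library:

```python
from math import gamma, pi, sqrt

def edge_fraction(N, eps=0.05):
    """Fraction of the volume of a unit N-cube within eps of its surface."""
    return 1.0 - (1.0 - eps) ** N

def inscribed_sphere_volume(N, d=1.0):
    """Volume of the sphere of radius d/2 inscribed in an N-cube of side d."""
    return pi ** (N / 2) * (d / 2) ** N / gamma(N / 2 + 1)

for N in (2, 10, 100):
    print(f"N={N}: vertices={2**N}, vertex distance={sqrt(N)/2:.2f}, "
          f"edge fraction={edge_fraction(N):.3f}, "
          f"inscribed sphere volume={inscribed_sphere_volume(N):.2e}")
```

For N = 2 the inscribed circle covers π/4 ≈ 0.79 of the cube, while for N = 100 its volume is vanishingly small and more than 99 % of the volume lies within 0.05 of the edge.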
Consider now a sample of points drawn independently from the high dimensional cube. For moderate sample sizes (≪ 2^N, which already for N = 25 is larger than 10^7, so moderate is perhaps not the right word) all samples will be located near different vertices. This means that all samples will have almost the same distance from the center and that all pairs of samples will be almost perpendicular. The distances between pairs of samples will also be almost identical, making concepts such as nearest neighbours problematic. However, the ensemble mean will be different as it will be located near the otherwise vacant center of the cube. These properties are not particular to the cube but are quite general, also for unbounded distributions, as we will see in the next subsection. The beneficial properties of high dimensionality are recognized in many areas of machine learning (Kainen, 1997; Gorban and Tyukin, 2018), but the lack of contrast between distances may also pose problems for algorithms such as clustering (Tomasev and Radovanović, 2016; Kabán, 2012).

2.2 Concentration of measures
We first look at a very simple example to describe the general idea of concentration of measures. Consider N iid random variables x_i, i = 1, 2, ..., N, each with mean µ and variance σ². For the sum of the variables, Σ_i x_i, we have for the expectation and variance E(Σ_i x_i) = Nµ and Var(Σ_i x_i) = Nσ². Therefore, when N grows both the expectation of the sum and the width of its distribution will grow, but the relative width, √N σ/(Nµ) ∼ 1/√N, will decrease. We can therefore, with some reason, say that the distribution of the sum becomes more and more sharply defined around its mean. If we normalize the sum with N to get the mean, x̄ = Σ_i x_i/N, we have E(x̄) = µ and Var(x̄) = σ²/N. Therefore, the mean becomes increasingly narrowly distributed around the constant µ. Thus, for large N we can in many situations treat the mean, x̄, as a constant.
The considerations above are basically the rationale behind the law of large numbers and are also closely related to the central limit theorem, which states that √N (x̄ − µ)/σ converges towards a standard Gaussian distribution, N(0, 1). The concentration of measures can be extended beyond the iid situation (see section 3) as indicated by the quotation from Chazottes (2015) in the introduction.
Let us organize the random variables into an N-vector x = (x_1, x_2, ..., x_N). Now ||x||² = x_1² + x_2² + ... + x_N² is a sum of independent variables and will therefore - according to the arguments above - for large N be approximately a constant.

Let us consider a multi-variate standard Gaussian distribution P(x) = (2π)^(-N/2) exp(−Σ_{n=1}^N x_n²/2). The surface area of a hyper-sphere with radius r in N dimensions is S_{N−1}(r) = 2π^(N/2) r^(N−1)/Γ(N/2). So as a function of r = ||x|| we get the χ-distribution P(r) = 2^(1−N/2) r^(N−1) exp(−r²/2)/Γ(N/2). The maximum of P(r) is reached for r = √(N−1) and the width (standard deviation) of the peak converges fast with N towards 1/√2. This is illustrated in Fig. 1.
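This concentration of the lengths is easy to demonstrate by simulation. A sketch with NumPy (the sample size is an arbitrary choice of ours):

```python
import numpy as np

rng = np.random.default_rng(0)

def length_distribution(N, samples=20000):
    """Mean and standard deviation of ||x|| for x drawn from N(0, I_N)."""
    x = rng.standard_normal((samples, N))
    r = np.linalg.norm(x, axis=1)
    return r.mean(), r.std()

for N in (2, 10, 100, 1000):
    m, s = length_distribution(N)
    print(f"N={N:4d}: mean length {m:6.2f} "
          f"(sqrt(N-1) = {np.sqrt(N - 1):6.2f}), width {s:.3f}")
```

For large N the mean length follows √(N−1) closely while the width settles near the constant 1/√2 ≈ 0.71.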
The concentration of measures is the backbone of statistical mechanics. As a simple example, we consider the canonical ensemble of weakly interacting identical particles. This ensemble describes a system with a constant number of particles, N, in a heat-bath. All particles have the same spectrum of energy states, E_i, and the probability for a particle to be in the ith state is proportional to exp(−βE_i). The total energy grows like N, while the fluctuations (standard deviation) in the total energy grow like √N. Thus, the relative fluctuations in the total energy go as ∼ 1/√N, and in the thermodynamic limit, N → ∞, these fluctuations and fluctuations in other macroscopic quantities can be neglected. This holds also for non-identical and interacting particles (see Gorban and Tyukin, 2018, for a recent discussion), just as the concentration of measures can be extended beyond the iid situation.
Let us take a brief look at waist concentration. Consider two independent unit vectors a and b. Without loss of generality we can set a = (1, 0, 0, ...). The dot-product a · b then becomes b_1. It is therefore easy to see that a · b has zero mean and that its spread converges to zero as ∼ 1/√N. This result does not require Gaussianity; see, e.g., Lehmann and Romano (2005) for a general derivation. The angle φ between a and b will therefore converge towards π/2 as cos φ = a · b. This is illustrated in Fig. 1 for Gaussian distributed vectors for different values of N: For N = 2 the distribution of angles is flat, but for larger values of N it becomes increasingly peaked around π/2.
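A quick simulation of the waist concentration (again a sketch with NumPy; the number of pairs is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

def angle_distribution(N, pairs=5000):
    """Mean and standard deviation of the angle between pairs of
    independent Gaussian random vectors in N dimensions."""
    a = rng.standard_normal((pairs, N))
    b = rng.standard_normal((pairs, N))
    cos_phi = np.sum(a * b, axis=1) / (
        np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1))
    phi = np.arccos(np.clip(cos_phi, -1.0, 1.0))
    return phi.mean(), phi.std()

for N in (2, 10, 100):
    m, s = angle_distribution(N)
    print(f"N={N:3d}: mean angle {m:.3f} (pi/2 = {np.pi/2:.3f}), spread {s:.3f}")
```

The spread of the angles decays roughly as 1/√N, consistent with Fig. 1.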

The topic of concentration properties is an active mathematical field with focus on probabilistic bounds on how fast empirical means converge to the ensemble means for different classes of random variables, including non-iid variables (Vershynin, 2018; Wainwright, 2019). Such bounds include Bernstein's and Hoeffding's inequalities and give strict mathematical meaning to the looser considerations above. As an example, the Hoeffding bound states that for all t ≥ 0

P(|(1/N) Σ_{i=1}^N (x_i − µ_i)| ≥ t) ≤ 2 exp(−N t²/(2σ²)).

This holds for independent random variables drawn from a distribution whose tails decay at least as fast as the tails of a Gaussian distribution (sub-Gaussian, Wainwright, 2019). Here µ_i is the mean of x_i and σ is a constant. For the angle φ between two independent vectors we have similarly for all t ≥ 0 (from Gorban and Tyukin, 2018)

P(|cos φ| ≥ t) ≤ 2 exp(−N t²/2).

Such concentration inequalities are closely related to the theory of large deviations, which deals with the asymptotic behaviour of probabilities of large fluctuations. See Touchette (2009) for a general review and Gálfi et al. (2019) for an application to weather extremes.

3 Extension to situations with dependent and non-identical variables
Like the central limit theorem (CLT), the concentration properties were originally developed for iid variables. However, as for the CLT, they can be extended to classes of dependent variables. Although no general condition exists for the CLT (Clusel and Bertin, 2008), an important factor for both the CLT and the concentration properties is the strength of the dependence (Kontorovich and Ramanan, 2008; Chazottes, 2015). Many properties of iid processes can be extended to processes where the rate of mixing is strong enough (Chazottes, 2015). Here, mixing processes are defined by a decay of correlations towards zero, i.e., x_i and x_j should become independent when |i − j| increases.
Here correlations generally refer to measures of the dependence, e.g., the distance between the joint distribution and the product of the marginal distributions. Generally, Chazottes (2015) finds that the concentration of measures holds for a random variable that smoothly depends on the influence of many weakly dependent random variables.

The mixing and decay of correlations are closely related to the concept of effective degrees of freedom, also known as the effective dimension (e.g., Clusel and Bertin, 2008). Shalizi (2006) shows an example of the CLT for dependent variables '.. only with the true sample size replaced by an effective sample size ..'. The basic idea is that N dependent variables with effective dimension N* give the same information as N* independent variables. Heuristically, consider a function on a 2-dimensional square region with each side of length L. If correlations decay exponentially with characteristic length ξ, then we can to a first approximation describe the function by N* = (L/ξ)² independent variables. For fixed ξ the number of independent variables goes to infinity with increasing L, and in this situation we may assume that the limit theorems hold. Note that some methods to calculate the number of effective dimensions of, e.g., surface temperature are directly based on these arguments using an average ξ (see the summary in Christiansen and Ljungqvist, 2017).
The situation is well-known in the study of 1-dimensional time-series (see, e.g., von Storch and Zwiers, 1999, section 17).
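For a single time-series the effective sample size can be sketched numerically. A minimal example (the function name and the AR(1) test series are ours) using the common lag-1 approximation N* ≈ N(1 − r1)/(1 + r1):

```python
import numpy as np

def effective_sample_size(x):
    """Effective number of independent values in a time series,
    from the lag-1 autocorrelation r1: N* = N (1 - r1) / (1 + r1)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    r1 = np.dot(x[:-1], x[1:]) / np.dot(x, x)
    r1 = min(max(r1, 0.0), 0.99)   # guard against pathological estimates
    return len(x) * (1.0 - r1) / (1.0 + r1)

# AR(1) series with coefficient 0.8; the formula then gives N* near N/9
rng = np.random.default_rng(0)
n, a = 10000, 0.8
x = np.empty(n)
x[0] = rng.standard_normal()
for t in range(1, n):
    x[t] = a * x[t - 1] + rng.standard_normal()
print(effective_sample_size(x))
```

The same decorrelation-length argument underlies the spatial estimate N* = (L/ξ)² discussed above.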
In the case of 2-dimensional fields different methods exist to estimate the number of effective dimensions N* (Wang and Shen, 1999; Bretherton et al., 1999). Some methods are directly based on the characteristic length, ξ, using an average over the different directions (Christiansen and Ljungqvist, 2017). The estimated number depends on the method used as well as on the field, the time-scale, and the geographical region. For annual mean surface-temperatures, values of N* vary between 50 and 100 depending on the method (Briffa and Jones, 1993; Hansen and Lebedeff, 1987; Shen et al., 1994) when the whole globe is considered. Values in the same range have been found for monthly surface temperatures in the northern hemisphere (Wang and Shen, 1999; Bretherton et al., 1999).

These numbers are of course small compared to Avogadro's number relevant for statistical mechanics, but they are still comparable to the dimensions in Fig. 1 where the concentration properties hold to a reasonable degree. In subsection 4.1 we directly investigate to which extent the concentration properties hold for atmospheric fields.

4 Atmospheric and climate science
As we saw in section 2, concentration of measures and waist concentration allow us in high dimensions to set dot products of independent vectors to zero and to substitute the length of a random vector with its expectation value. In section 3 we argued that when the components of the fields or time-series are dependent, the concentration phenomena hold when the effective dimension is large. However, to test the concentration properties we also need independent samples.
For initial condition ensembles, consisting of experiments with the same model but with different initial conditions, the different ensemble members can be considered independent (considering anomalies with respect to the ensemble center, as explained in the next subsection). For multi-model ensembles, where experiments are performed with models with different physical parameterizations (but the same external forcings), the situation is more complicated (e.g., Knutti et al., 2013; Boé, 2018; Christiansen, 2020, and references therein). The annual or monthly climatologies are obvious measures for comparing models or for validating the models against observations (Gleckler et al., 2008). Another frequently used measure is the forced response in, e.g., time-series of global means.

Another way to obtain independent samples from the same distribution is to consider a given variable at different times. For example, we could look at the spatial field of precipitation or temperature at different days or months. To ensure that the fields are drawn from the same distribution we need to avoid or remove the annual cycle and -if longer periods are considered -to make sure that there is no external forcing. The sample times should also be sufficiently separated.
In the next sections we will consider the following geophysical data-sets: 1) daily means of near-surface temperature and precipitation from the AgERA data-set, 2) the monthly climatology of near-surface temperature from the CMIP5 multi-model ensemble, and 3) the monthly climatology from the MPI-GE initial condition ensemble. For comparison we also consider two samples of simple random vectors. The first sample consists of independent vectors drawn from an N-dimensional spherical (all components have zero mean and unit variance) Gaussian distribution as in Fig. 1. The second sample is drawn from a standard Gamma distribution with shape parameter 3 (location and scale parameters 0 and 1). In the latter case we include anisotropy (not identically distributed components) by multiplying the nth component with 5n/N, so the mean and variance of the nth component become 15n/N and 75(n/N)², respectively.
For the simple random vectors we let the dimension, N, vary from 1 to 100. The sample size is chosen to be 50.

4.1 Concentration of measures in atmospheric fields
In this subsection we directly investigate the distributions of the lengths of the sample members and the distributions of the angles between them. The results from this and the following subsections are summarized in Table 2.
We center the sample, x_k, k = 1, ..., K, on the sample mean, x̄ = Σ_k x_k/K, and calculate the length of each sample member as the square root of ||x_k − x̄||²/N. The angle φ between two sample members, k and l, is given by (x_k − x̄) · (x_l − x̄) = ||x_k − x̄|| ||x_l − x̄|| cos φ. This gives us K lengths and K(K−1)/2 angles. This centering - the subtraction of the sample mean - is not important for the calculation of the lengths, as we explain at the end of this subsection.
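This computation can be sketched as follows (NumPy; the shapes and names are our choices, not taken from the paper's code):

```python
import numpy as np

def lengths_and_angles(X):
    """X has shape (K, N): K sample members of dimension N.
    Returns the K lengths sqrt(||x_k - xbar||^2 / N) and the
    K(K-1)/2 angles between pairs of centered members."""
    K, N = X.shape
    A = X - X.mean(axis=0)                 # center on the sample mean
    norms = np.linalg.norm(A, axis=1)
    lengths = norms / np.sqrt(N)
    U = A / norms[:, None]                 # unit vectors
    cos_phi = np.clip(U @ U.T, -1.0, 1.0)
    i, j = np.triu_indices(K, k=1)
    angles = np.arccos(cos_phi[i, j])
    return lengths, angles

# independent Gaussian members: lengths near 1, angles near pi/2
rng = np.random.default_rng(0)
lengths, angles = lengths_and_angles(rng.standard_normal((66, 5000)))
print(lengths.mean(), lengths.std(), angles.mean(), angles.std())
```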
We first consider the near-surface temperature and precipitation fields from the AgERA data-set. Figure 2 shows the lengths and angles for daily means taken every fifth day for June in the period 1980-1990. The 11 years give us 66 samples. We see that for temperature the lengths are relatively tightly distributed around 4.36 K (σ in Eq. 6) with a standard deviation of 0.58 K. The angles are likewise distributed around π/2 with a standard deviation of 0.21. For precipitation the distributions are somewhat narrower, in particular for the angles. This is what we would expect due to the larger number of effective degrees of freedom compared to temperature. However, this effect is reduced as we include both dry and wet days in the analysis. While the precipitation amount on wet days has a short decorrelation length, this does not hold for the spatial field indicating wet/dry days. Note also that the distribution of precipitation is extremely non-Gaussian. These results indicate that the concentration of measures and the waist concentration hold at least to some extent for these fields.

Figure 3 shows the lengths and angles for the monthly seasonal cycle in near-surface temperature, 1980-2015, for the multi-model CMIP5 ensemble. The models have been regridded to a common 144x73 grid so N = 144x73x12. The sample has a size of 45 and consists of one ensemble member from each of the models. The lengths are distributed around 2.57 K with a standard deviation of 0.46 K and the angles around π/2 with a standard deviation of 0.28. Thus, compared to the example in Fig. 2 the distributions are less tightly distributed. The main explanation is probably that the number of effective degrees of freedom in the monthly climatology is smaller than that of the daily fields. However, there are also reasons to believe that the multi-model ensemble is not totally independent (Knutti et al., 2013; Boé, 2018). Note the negative skewness in the distribution of the angles.
Angles close to zero indicate pairs of models that are almost parallel and therefore strongly dependent. These pairs correspond to variants of the same model, such as MIROC-ESM and MIROC-ESM-CHEM, which are well known to be close in the model genealogy (Knutti et al., 2013). A simple comparison of the distributions of φ in Figs. 2 and 3 with the distributions in Fig. 1 (from Gaussians) shows that the effective dimension is between 25 and 50 for temperature and several hundreds for precipitation.

Results for the MPI-GE 100-member initial condition ensemble are shown in Table 2. Here we have 192x96 grid-points so N = 192x96x12 and the sample size is 100. The distributions of lengths and angles are now narrower compared to the multi-model CMIP5 ensemble. This corresponds to a larger effective dimension in the monthly climatology, which now reflects only different initial conditions and not model differences. Also in this example the distributions for precipitation are narrower than those for temperature.
Reducing the spatial area decreases the effective dimension. As an example we have included in Table 2 the results for AgERA when applied to Northern Europe (50-65°N, 0-25°E). As expected we see an increase in the width of the distributions for both precipitation and near-surface temperature.
In the analysis above we centered the sample on the sample mean before calculating the lengths, i.e., we used ||x_k − x̄||²/N instead of ||x_k||²/N. But these expressions only differ by the squared length of the mean, ||x_k||²/N = ||x̄||²/N + ||x_k − x̄||²/N, as x̄ and x_k − x̄ are almost orthogonal in high dimensions.

4.2 Distances between samples and between samples and the ensemble mean
If the sample members are drawn independently from the same distribution in high dimensions they have approximately the same length, and we can write

||x_k − x̄||²/N ≈ σ².   (6)

For the distance between two different sample members we get

||x_k − x_l||²/N = ||x_k − x̄||²/N + ||x_l − x̄||²/N ≈ 2σ²,   (7)

where we have used that x_k − x̄ and x_l − x̄ are orthogonal.
Therefore, the distance between two sample members is a factor of √2 larger than the distance between a sample member and the sample mean. The geometric interpretation is that the sample mean and any two sample members form an isosceles right triangle with the right angle at the sample mean (Hall et al., 2005; Palmer et al., 2006). The factor of √2 then comes from Pythagoras's equation. It is worth noting that the sample mean is special and is not drawn from the same distribution as the sample members. As mentioned when discussing the example of the high-dimensional unit cube from subsection 2.1, the sample members would be located in the spikes while the sample mean would be close to the center.

In Figs. 4 and 5 the distributions of the distances between pairs of sample members are shown together with the distributions of the distances between the sample members and the sample mean. The mean and width of these distributions are also shown in Table 2, both for these and the other data-sets. In all cases the factor of √2 is clearly seen for the mean values, although the widths of the distributions are substantial in all cases. For the AgERA daily precipitation (Fig. 5 left) the two distributions are almost separated, while this is not the case for the CMIP5 ensemble.
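The factor of √2 is easily checked by simulation (a sketch; the sample size and dimension are arbitrary choices of ours):

```python
import numpy as np

rng = np.random.default_rng(0)
K, N = 50, 2000
X = rng.standard_normal((K, N))
xbar = X.mean(axis=0)

# distances between members and the sample mean
d_mean = np.linalg.norm(X - xbar, axis=1)

# distances between all pairs of members
i, j = np.triu_indices(K, k=1)
d_pair = np.linalg.norm(X[i] - X[j], axis=1)

ratio = d_pair.mean() / d_mean.mean()
print(ratio)   # close to sqrt(2) for large N
```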
Assuming additionally that the observations are drawn from the same distribution as the ensemble members, Christiansen (2018) explained the ubiquitous observation that the error (compared to observations) of the ensemble mean often is 30 % smaller (1 − 1/√2 ≈ 0.29) than the typical error of the individual ensemble members (e.g., Gleckler et al., 2008). We also explained why the ensemble mean very often has a smaller error than all individual ensemble members (Christiansen, 2019).
The results in this subsection and in subsections 4.3 and 4.4 do not only hold for the Euclidean (square) norm distance but also for, e.g., the maximum norm distance and the correlation distance (√(1 − r²), where r is the correlation).

4.3 Correlations between sample differences
Error correlations and correlations between model differences are important when studying the structure of a model ensemble and when comparing an ensemble to observations (Annan and Hargreaves, 2010; Pennell and Reichler, 2011; Bishop and Abramowitz, 2013). We have in general corr(x_k, x_l) = 1 − ½||x̂_k − x̂_l||²/N, where ˆ indicates variables standardized to zero mean and unit variance. Therefore, for differences with respect to a common member x_m drawn from the same distribution, with (x̂_k − x̂_m)/√2 the standardized difference,

corr(x_k − x_m, x_l − x_m) = 1 − ½ ||(x̂_k − x̂_m)/√2 − (x̂_l − x̂_m)/√2||²/N = 1 − ¼ ||x̂_k − x̂_l||²/N ≈ 1 − ½ = ½,   (8)

where in the last step we have used Eq. 7. Thus, in high dimensions the correlation between sample differences is 1/2.
Replacing x_l with the sample mean x̄ we get

corr(x_k − x_m, x̄ − x_m) ≈ (x_k − x_m) · (x̄ − x_m)/(√2 σ² N) = [(x_k − x̄) · (x̄ − x_m) + ||x̄ − x_m||²]/(√2 σ² N) ≈ (0 + Nσ²)/(√2 σ² N) = 1/√2.   (9)

In the last step we used the independence of the two terms x_k − x̄ and x̄ − x_m and applied Eq. 6 to ||x̄ − x_m||². Figure 6 shows the correlations for the simple random vectors as also used in Fig. 4. For all N the correlations are distributed around 1/2 and 1/√2. For small N the spread is large, but it decreases when N increases, and for large N the correlations are very narrowly distributed around 1/2 and 1/√2 ≈ 0.71.
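Both predictions are easily verified with a synthetic ensemble (a sketch; the "observations" are simply an extra member drawn from the same distribution, i.e., the indistinguishable interpretation):

```python
import numpy as np

rng = np.random.default_rng(0)
K, N = 40, 5000
X = rng.standard_normal((K, N))      # ensemble members
obs = rng.standard_normal(N)         # pseudo-observations, same distribution
xbar = X.mean(axis=0)

def corr(a, b):
    a = a - a.mean()
    b = b - b.mean()
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# correlation between the errors of two members -> about 1/2 (Eq. 8)
pair_corr = np.mean([corr(X[k] - obs, X[l] - obs)
                     for k in range(K) for l in range(k + 1, K)])

# correlation between a member's error and the mean's error -> about 1/sqrt(2) (Eq. 9)
mean_corr = np.mean([corr(X[k] - obs, xbar - obs) for k in range(K)])

print(pair_corr, mean_corr)
```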
The correlations for AgERA daily mean precipitation for June and for the CMIP5 monthly climatology of near-surface temperature are shown in Fig. 7. The mean values are close to the high-dimensional values from Eqs. 8 and 9, although the spread is rather high. This is also the case for the other fields, as reported in Table 2.

If we again assume that the observations are drawn from the same distribution as the ensemble members - the indistinguishable interpretation - the error correlation is 1/2 (Eq. 8). On the other hand, if observations are near the ensemble mean - the truth centered interpretation - the error correlations will be zero as x_k − x̄ and x_l − x̄ are orthogonal. Error correlations around 1/2 have been observed in many studies of climate models (e.g., Pennell and Reichler, 2011; Herger et al., 2018; Abramowitz et al., 2019), providing evidence for the indistinguishable interpretation. The correlation between the error of a sample member and the error of the ensemble mean is shown in Figure 2 of Pennell and Reichler (2011) for the climatology of different variables in the CMIP3 multi-model ensemble, and it is always close to 1/√2 ≈ 0.71 as predicted by Eq. 9.

4.4 Effect of sample size
We now consider how the sample mean depends on the sample size. The ensemble mean is often used to estimate the forced response from initial condition and multi-model ensembles (Frankcombe et al., 2018; Bengtsson and Hodges, 2019; Liang et al., 2020), and it is of interest to know how large an ensemble is needed for the estimation to be saturated (Milinski et al., 2019). For the squared length of the sample mean we have

E(||x̄||²/N) = µ² + σ²/K,   (12)

where µ² denotes the squared length per component of the distribution mean and K is the sample size.
Similar results have been presented by van Loon et al. (2007) and Potempski and Galmarini (2009) based on other arguments.
See also Christiansen (2020) for the decay of the error of the ensemble mean when compared to observations.
The practical way to estimate the effect of sample size is to apply a bootstrap procedure to a large sample of size K₀. From this sample we draw (with replacement) a number of sub-samples of size K, K = 1, ..., K₀. From these sub-samples we calculate the mean and spread of ||x̄||²/N for each K.
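The bootstrap procedure can be sketched as follows (NumPy; the synthetic ensemble and the number of bootstrap draws are our choices):

```python
import numpy as np

rng = np.random.default_rng(0)
K0, N = 100, 2000
mu = 0.4
X = rng.standard_normal((K0, N)) + mu   # K0 members with common mean mu

def bootstrap_mean_length(X, K, boots=200):
    """Bootstrap mean and spread of ||xbar||^2 / N for sub-samples of size K."""
    vals = np.empty(boots)
    for b in range(boots):
        idx = rng.integers(0, len(X), size=K)   # draw with replacement
        xbar = X[idx].mean(axis=0)
        vals[b] = np.mean(xbar ** 2)
    return vals.mean(), vals.std()

# expectation mu^2 + sigma^2 / K (cf. Eq. 12), here 0.16 + 1/K
for K in (1, 5, 25, 100):
    m, s = bootstrap_mean_length(X, K)
    print(f"K={K:3d}: bootstrap {m:.3f} +/- {s:.3f}, theory {mu**2 + 1.0/K:.3f}")
```

Note that drawing with replacement from a finite sample introduces a small positive bias for large K relative to Eq. 12.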
The mean of ||x̄||²/N is shown as a function of K - using the bootstrap procedure - in Fig. 8 for the simple examples with N = 100 and N = 10. For N = 100 (black curves) ||x̄||²/N is narrowly distributed around the theoretical mean (Eq. 12) for both the Gaussian (left) and the Gamma distributed samples (right). For N = 10 (cyan curves) ||x̄||²/N is also distributed around the theoretical mean but with a larger spread.
In the three previous subsections we studied the samples of daily June temperatures and of monthly climatologies. In the former the N-vectors consisted of spatial maps and in the latter of combined spatial climatologies for all 12 months. However, we also work in high dimensions when considering a single long time-series. The left panel in Fig. 9 shows time-series of the annual NH mean near-surface temperature for the MPI-GE 100-member initial condition ensemble and the 45-member CMIP5 multi-model ensemble. The right panel shows ||x̄||²/N as a function of K. As expected from Eq. 12, the initial condition ensemble converges faster than the multi-model ensemble due to its smaller variance. Note the excellent agreement with Eq. 12 (red curves), where σ² has been estimated as the variance over time and all ensemble members and µ² likewise estimated from the ensemble mean over all ensemble members. The large spread for the CMIP5 ensemble is due to the well-known fact that the bias in global mean temperature is different for different models (Wang et al., 2014), which leads to a break-down of the condition of independence.
This is not the case for the initial condition ensemble (see also Table 2). A smaller spread is obtained for the CMIP5 ensemble if each model is centered on its own (and not the ensemble) mean in the first 10 years.

Figure 9. Left: Time-series of annual NH mean temperature from MPI-GE (black) and CMIP5 (cyan). Thick solid curves are ensemble means, dashed curves ensemble means ± two standard deviations, and thin curves are individual models. Each ensemble has been centered on its ensemble mean in the first 10 years. Right: The length of the ensemble mean, ||x̄||²/N, as a function of ensemble size K for MPI-GE (black) and CMIP5 (cyan). The ensemble means ± two standard deviations are also shown. Theoretical results from Eq. 12 with σ² = 0.094, µ² = 0.164 K² for MPI-GE and σ² = 0.654, µ² = 0.193 K² for CMIP5 are shown in red.

5 Conclusions

It is well known that the number of samples necessary for a given coverage increases exponentially with the dimension. In this paper we have described other, more non-intuitive properties of high-dimensional space such as the concentration of measures and waist concentration. In loose terms these properties state that independent sample members from the same distribution have almost the same lengths and that pairs of independent sample members are almost orthogonal. While most results are derived for iid random variables, we discussed the extension to the non-iid situation and how the strength of the dependence is related to the effective dimension.
We directly investigated to which extent these properties hold for typical climate fields and time-series. Ensemble modelling provides an obvious source of samples, but samples can also be obtained by considering, e.g., different days or years. We investigated the monthly climatology of both an initial condition ensemble and a multi-model ensemble. We also investigated fields of daily means from a reanalysis. While the nominal dimensions of such fields are high, the effective dimensions are typically of the order 25-100, and it is not obvious to which degree the properties of high dimensionality apply to such fields.
We found that for the global scale fields of near-surface temperature and precipitation both the concentration of measures and the waist concentration hold to a reasonable degree. The lengths of the sample members are rather narrowly distributed around the mean length, with widths (standard deviation) around 1/5-1/10 of the mean value. The angles between pairs of sample members are also rather narrowly distributed around π/2. This holds both when the samples consist of the climatology of different ensemble members from a model and when the samples consist of different daily means from a reanalysis.
The distributions are narrower for the initial condition ensemble (MPI-GE) than for the multi-model ensemble (CMIP5). In the latter case the dependence of related models will result in these models being far from orthogonal.
Based on the concentration properties we derived simple analytical results that hold for large dimensions. These analytical results include: 1) The distance between two sample members is a factor of √2 larger than the distance between sample members and the sample mean.
2) The correlations between differences of pairs of sample members are 1/2, while the correlations between differences of sample members and the sample mean are 1/√2. 3) An expression for how the sample mean depends on the sample size and on the sample spread. We found that these results describe the behaviour of the climate fields reasonably well.
We conclude that in many cases the concentration properties allow us a deeper understanding of the behaviour of samples of climate fields. However, in each case it is important to investigate whether the conditions of high dimensionality and independence are fulfilled. Even for global fields there is a substantial spread around the values predicted for the high dimensional limit.
We have only briefly mentioned the relation between observations and models. The relation depends on whether we assume that observations are drawn from the same distribution as the model ensemble (the indistinguishable interpretation) or whether we assume that the ensemble members are centered around the observations (the truth centered interpretation). In the former case the results for individual model members also hold for observations, as we discussed in subsection 4.2, while in the latter case results may be different. Many of the simple analytical results can be extended to situations where, e.g., the models are biased, as explored in Christiansen (2020) using a simple statistical model that included both interpretations as limits.