Articles | Volume 28, issue 3
https://doi.org/10.5194/npg-28-409-2021
© Author(s) 2021. This work is distributed under
the Creative Commons Attribution 4.0 License.
The blessing of dimensionality for the analysis of climate data
Danish Meteorological Institute, Copenhagen, Denmark
Short summary
In geophysics we often need to analyse large samples of high-dimensional fields. Fortunately, if counterintuitively, such high dimensionality can be a blessing, and we demonstrate how it allows simple analytical results to be derived. These results include estimates of the correlations between sample members and of how the sample mean depends on the sample size. We show that the properties of high dimensionality can be applied successfully to climate fields, such as those from ensemble modelling.
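The two properties named in the summary, small correlations between sample members and a simple dependence of the sample mean on the sample size, can be illustrated with a toy experiment. The sketch below is not the paper's method or data: it assumes independent standard Gaussian components, and the function names (`gaussian_sample`, `summarize`) and the sizes used are hypothetical choices made only for illustration.

```python
import math
import random


def gaussian_sample(k, n, seed=0):
    """Draw k independent n-dimensional standard Gaussian vectors (toy assumption)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k)]


def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def cosine(u, v):
    """Cosine of the angle between u and v (uncentred correlation)."""
    return sum(a * b for a, b in zip(u, v)) / (norm(u) * norm(v))


def sample_mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    k = len(vectors)
    return [sum(column) / k for column in zip(*vectors)]


def summarize(k, n, seed=0):
    """Summarise norms, pairwise cosines and the sample-mean norm for a toy sample."""
    vecs = gaussian_sample(k, n, seed)
    # Norms scaled by sqrt(n): in high dimensions these cluster near 1.
    scaled_norms = [norm(v) / math.sqrt(n) for v in vecs]
    # Absolute cosines between all distinct pairs of members.
    abs_cosines = [
        abs(cosine(vecs[i], vecs[j]))
        for i in range(k)
        for j in range(i + 1, k)
    ]
    return {
        "norm_spread": max(scaled_norms) - min(scaled_norms),
        "max_abs_cosine": max(abs_cosines),
        "mean_norm": norm(sample_mean(vecs)) / math.sqrt(n),
        # For independent members the scaled norm of the mean tends to 1/sqrt(k).
        "expected_mean_norm": 1.0 / math.sqrt(k),
    }


if __name__ == "__main__":
    for n in (2, 2000):
        print(n, summarize(k=20, n=n, seed=1))
```

Under these assumptions, and for large n, the scaled norms of the members differ little, the pairwise cosines are of order 1/sqrt(n), and the scaled norm of the sample mean is close to 1/sqrt(k). Real climate fields are spatially dependent, so the relevant n would presumably be an effective dimension that is smaller than the number of grid points.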