Articles | Volume 27, issue 3
https://doi.org/10.5194/npg-27-453-2020
© Author(s) 2020. This work is distributed under
the Creative Commons Attribution 4.0 License.
Applications of matrix factorization methods to climate data
CSIRO Oceans and Atmosphere, Hobart, Australia
Terence J. O'Kane
CSIRO Oceans and Atmosphere, Hobart, Australia
Related authors
Serena Schroeter, Terence J. O'Kane, and Paul A. Sandery
The Cryosphere, 17, 701–717, https://doi.org/10.5194/tc-17-701-2023, 2023
Short summary
Antarctic sea ice has increased over much of the satellite record, but we show that the early, strongly opposing regional trends diminish and reverse over time, leading to overall negative trends in recent decades. The dominant pattern of atmospheric flow has changed from strongly east–west to more wave-like with enhanced north–south winds. Sea surface temperatures have also changed from circumpolar cooling to regional warming, suggesting recent record low sea ice will not rapidly recover.
Courtney Quinn, Terence J. O'Kane, and Vassili Kitsios
Nonlin. Processes Geophys., 27, 51–74, https://doi.org/10.5194/npg-27-51-2020, 2020
Short summary
This study presents a novel method for reduced-rank data assimilation of multiscale highly nonlinear systems. Time-varying dynamical properties are used to determine the rank and projection of the system onto a reduced subspace. The variable reduced-rank method is shown to outperform fixed-rank methods. This work has implications for performing strongly coupled data assimilation with a limited number of ensemble members on high-dimensional coupled climate models.
C. L. E. Franzke, T. J. O'Kane, D. P. Monselesan, J. S. Risbey, and I. Horenko
Nonlin. Processes Geophys., 22, 513–525, https://doi.org/10.5194/npg-22-513-2015, 2015
Related subject area
Subject: Time series, machine learning, networks, stochastic processes, extreme events | Topic: Climate, atmosphere, ocean, hydrology, cryosphere, biosphere | Techniques: Big data and artificial intelligence
Evaluation of forecasts by a global data-driven weather model with and without probabilistic post-processing at Norwegian stations
Characterisation of Dansgaard-Oeschger events in palaeoclimate time series using the Matrix Profile
The sampling method for optimal precursors of El Niño–Southern Oscillation events
A comparison of two causal methods in the context of climate analyses
A two-fold deep-learning strategy to correct and downscale winds over mountains
Downscaling of surface wind forecasts using convolutional neural networks
Representation learning with unconditional denoising diffusion models for dynamical systems
Data-driven methods to estimate the committor function in conceptual ocean models
Exploring meteorological droughts' spatial patterns across Europe through complex network theory
Integrated hydrodynamic and machine learning models for compound flooding prediction in a data-scarce estuarine delta
Predicting sea surface temperatures with coupled reservoir computers
Using neural networks to improve simulations in the gray zone
The blessing of dimensionality for the analysis of climate data
Producing realistic climate data with generative adversarial networks
Identification of droughts and heatwaves in Germany with regional climate networks
Extracting statistically significant eddy signals from large Lagrangian datasets using wavelet ridge analysis, with application to the Gulf of Mexico
Ensemble-based statistical interpolation with Gaussian anamorphosis for the spatial analysis of precipitation
Detecting dynamical anomalies in time series from different palaeoclimate proxy archives using windowed recurrence network analysis
Remember the past: a comparison of time-adaptive training schemes for non-homogeneous regression
John Bjørnar Bremnes, Thomas N. Nipen, and Ivar A. Seierstad
Nonlin. Processes Geophys., 31, 247–257, https://doi.org/10.5194/npg-31-247-2024, 2024
Short summary
During the last 2 years, tremendous progress has been made in global data-driven weather models trained on reanalysis data. In this study, the Pangu-Weather model is compared to several numerical weather prediction models with and without probabilistic post-processing for temperature and wind speed forecasting. The results confirm that global data-driven models are promising for operational weather forecasting and that post-processing can improve these forecasts considerably.
Susana Barbosa, Maria Eduarda Silva, and Denis-Didier Rousseau
Nonlin. Processes Geophys. Discuss., https://doi.org/10.5194/npg-2024-13, 2024
Revised manuscript accepted for NPG
Short summary
The characterisation of abrupt transitions in palaeoclimate records allows the understanding of millennial climate variability and of potential tipping points in the context of current climate change. In our study an algorithmic method, the matrix profile, is employed to characterise abrupt warmings designated as Dansgaard-Oeschger (DO) events and to identify the most similar transitions in the palaeoclimate time series.
Bin Shi and Junjie Ma
Nonlin. Processes Geophys., 31, 165–174, https://doi.org/10.5194/npg-31-165-2024, 2024
Short summary
In contrast to traditional deterministic optimization algorithms, we use a sampling method to compute the conditional nonlinear optimal perturbations (CNOPs) in a realistic and predictive coupled ocean–atmosphere model. The method replaces first-order (gradient) information with zeroth-order information, avoiding the costly computation of the gradient. The numerical performance highlights the importance of stochastic optimization algorithms for computing CNOPs and capturing initial optimal precursors.
David Docquier, Giorgia Di Capua, Reik V. Donner, Carlos A. L. Pires, Amélie Simon, and Stéphane Vannitsem
Nonlin. Processes Geophys., 31, 115–136, https://doi.org/10.5194/npg-31-115-2024, 2024
Short summary
Identifying causes of specific processes is crucial in order to better understand our climate system. Traditionally, correlation analyses have been used to identify cause–effect relationships in climate studies. However, correlation does not imply causation, which justifies the need to use causal methods. We compare two independent causal methods and show that these are superior to classical correlation analyses. We also find some interesting differences between the two methods.
Louis Le Toumelin, Isabelle Gouttevin, Clovis Galiez, and Nora Helbig
Nonlin. Processes Geophys., 31, 75–97, https://doi.org/10.5194/npg-31-75-2024, 2024
Short summary
Forecasting wind fields over mountains is of high importance for several applications and particularly for understanding how wind erodes and disperses snow. Forecasters rely on operational wind forecasts over mountains, which are currently only available on kilometric scales. These forecasts can also be affected by errors of diverse origins. Here we introduce a new strategy based on artificial intelligence to correct large-scale wind forecasts in mountains and increase their spatial resolution.
Florian Dupuy, Pierre Durand, and Thierry Hedde
Nonlin. Processes Geophys., 30, 553–570, https://doi.org/10.5194/npg-30-553-2023, 2023
Short summary
Forecasting near-surface winds over complex terrain requires high-resolution numerical weather prediction models, which drastically increase the duration of simulations and hinder them in running on a routine basis. A faster alternative is statistical downscaling. We explore different ways of calculating near-surface wind speed and direction using artificial intelligence algorithms based on various convolutional neural networks in order to find the best approach for wind downscaling.
Tobias Sebastian Finn, Lucas Disson, Alban Farchi, Marc Bocquet, and Charlotte Durand
EGUsphere, https://doi.org/10.5194/egusphere-2023-2261, 2023
Short summary
We train neural networks as denoising diffusion models for state generation in the Lorenz 1963 system and demonstrate that they learn an internal representation of the system. We make use of this learned representation and the pre-trained model in two downstream tasks: surrogate modelling and ensemble generation. For both tasks, the diffusion model can outperform other more common approaches. Thus, we see a potential of representation learning with diffusion models for dynamical systems.
Valérian Jacques-Dumas, René M. van Westen, Freddy Bouchet, and Henk A. Dijkstra
Nonlin. Processes Geophys., 30, 195–216, https://doi.org/10.5194/npg-30-195-2023, 2023
Short summary
Computing the probability of occurrence of rare events is relevant because of their high impact but also difficult due to the lack of data. Rare event algorithms are designed for that task, but their efficiency relies on a score function that is hard to compute. We compare four methods that compute this function from data and measure their performance to assess which one would be best suited to be applied to a climate model. We find neural networks to be most robust and flexible for this task.
Domenico Giaquinto, Warner Marzocchi, and Jürgen Kurths
Nonlin. Processes Geophys., 30, 167–181, https://doi.org/10.5194/npg-30-167-2023, 2023
Short summary
Although droughts are among the most severe climate extremes, assessing their features for specific regions remains challenging. In this paper we study meteorological droughts in Europe using concepts derived from climate network theory. By exploring the synchronization of drought occurrences across the continent, we unveil regional clusters, which are individually examined to identify the geographical propagation of droughts and source–sink systems, which could potentially support drought forecasting.
Joko Sampurno, Valentin Vallaeys, Randy Ardianto, and Emmanuel Hanert
Nonlin. Processes Geophys., 29, 301–315, https://doi.org/10.5194/npg-29-301-2022, 2022
Short summary
In this study, we successfully built and evaluated machine learning models for predicting water level dynamics as a proxy for compound flooding hazards in a data-scarce delta. The issues that we tackled here are data scarcity and low computational resources for building flood forecasting models. The proposed approach is suitable for use by local water management agencies in developing countries that encounter these issues.
Benjamin Walleshauser and Erik Bollt
Nonlin. Processes Geophys., 29, 255–264, https://doi.org/10.5194/npg-29-255-2022, 2022
Short summary
As sea surface temperature (SST) is vital for understanding the greater climate of the Earth and is also an important variable in weather prediction, we propose a model that effectively capitalizes on the reduced complexity of machine learning models while still being able to efficiently predict over a large spatial domain. We find that it is proficient at predicting the SST at specific locations as well as over the greater domain of the Earth’s oceans.
Raphael Kriegmair, Yvonne Ruckstuhl, Stephan Rasp, and George Craig
Nonlin. Processes Geophys., 29, 171–181, https://doi.org/10.5194/npg-29-171-2022, 2022
Short summary
Our regional numerical weather prediction models run at kilometer-scale resolutions. Processes that occur at smaller scales not yet resolved contribute significantly to the atmospheric flow. We use a neural network (NN) to represent the unresolved part of physical processes such as cumulus clouds. We test this approach on a simplified, yet representative, 1D model and find that the NN corrections vastly improve the model forecast up to a couple of days.
Bo Christiansen
Nonlin. Processes Geophys., 28, 409–422, https://doi.org/10.5194/npg-28-409-2021, 2021
Short summary
In geophysics we often need to analyse large samples of high-dimensional fields. Fortunately but counterintuitively, such high dimensionality can be a blessing, and we demonstrate how this allows simple analytical results to be derived. These results include estimates of correlations between sample members and how the sample mean depends on the sample size. We show that the properties of high dimensionality can be successfully applied to climate fields, such as those from ensemble modelling.
Camille Besombes, Olivier Pannekoucke, Corentin Lapeyre, Benjamin Sanderson, and Olivier Thual
Nonlin. Processes Geophys., 28, 347–370, https://doi.org/10.5194/npg-28-347-2021, 2021
Short summary
This paper investigates the potential of a type of deep generative neural network to produce realistic weather situations when trained from the climate of a general circulation model. The generator represents the climate in a compact latent space. It is able to reproduce many aspects of the targeted multivariate distribution. Some properties of our method open new perspectives such as the exploration of the extremes close to a given state or how to connect two realistic weather states.
Gerd Schädler and Marcus Breil
Nonlin. Processes Geophys., 28, 231–245, https://doi.org/10.5194/npg-28-231-2021, 2021
Short summary
We used regional climate networks (RCNs) to identify past heatwaves and droughts in Germany. RCNs provide information for whole areas and can provide many details of extreme events. The RCNs were constructed on the grid of the E-OBS data set. Time series correlation was used to construct the networks. Network metrics were compared to standard extreme indices and differed considerably between normal and extreme years. The results show that RCNs can identify severe and moderate extremes.
Jonathan M. Lilly and Paula Pérez-Brunius
Nonlin. Processes Geophys., 28, 181–212, https://doi.org/10.5194/npg-28-181-2021, 2021
Short summary
Long-lived eddies are an important part of the ocean circulation. Here a dataset for studying eddies in the Gulf of Mexico is created through the analysis of trajectories of drifting instruments. The method involves the identification of quasi-periodic signals, characteristic of particles trapped in eddies, from the displacement records, followed by the creation of a measure of statistical significance. It is expected that this dataset will be of use to other authors studying this region.
Cristian Lussana, Thomas N. Nipen, Ivar A. Seierstad, and Christoffer A. Elo
Nonlin. Processes Geophys., 28, 61–91, https://doi.org/10.5194/npg-28-61-2021, 2021
Short summary
An unprecedented amount of rainfall data is available nowadays, such as ensemble model output, weather radar estimates, and in situ observations from networks of both traditional and opportunistic sensors. Nevertheless, the exact amount of precipitation, to some extent, eludes our knowledge. The objective of our study is precipitation reconstruction through the combination of numerical model outputs with observations from multiple data sources.
Jaqueline Lekscha and Reik V. Donner
Nonlin. Processes Geophys., 27, 261–275, https://doi.org/10.5194/npg-27-261-2020, 2020
Moritz N. Lang, Sebastian Lerch, Georg J. Mayr, Thorsten Simon, Reto Stauffer, and Achim Zeileis
Nonlin. Processes Geophys., 27, 23–34, https://doi.org/10.5194/npg-27-23-2020, 2020
Short summary
Statistical post-processing aims to increase the predictive skill of probabilistic ensemble weather forecasts by learning the statistical relation between historical pairs of observations and ensemble forecasts within a given training data set. This study compares four different training schemes and shows that including multiple years of data in the training set typically yields a more stable post-processing while it loses the ability to quickly adjust to temporal changes in the underlying data.
Short summary
Different dimension reduction methods may produce profoundly different low-dimensional representations of multiscale systems. We perform a set of case studies to investigate these differences. When a clear scale separation is present, similar bases are obtained using all methods, but when this is not the case some methods may produce representations that are poorly suited for describing features of interest, highlighting the importance of a careful choice of method when designing analyses.
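The agreement between methods under clear separation can be illustrated with a minimal sketch. It is not the authors' code: the synthetic two-cluster data, the function names (`pca_factorization`, `kmeans_factorization`, `farthest_point_init`), and the deterministic farthest-point seeding are all assumptions made for illustration. Both PCA and k-means are written as factorizations of the data matrix into weights `Z` and basis vectors `W`. They differ only in the constraints on `Z`, which is unconstrained for PCA and one-hot for k-means. With two well-separated clusters, the leading principal direction should be nearly parallel to the difference between the two centroids.

```python
import numpy as np


def pca_factorization(X, rank):
    """Rank-`rank` PCA of X (samples x features): X - mean ~ Z @ W."""
    mean = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    Z = U[:, :rank] * s[:rank]  # unconstrained latent coordinates
    W = Vt[:rank]               # orthonormal basis vectors (rows)
    return Z, W, mean


def farthest_point_init(X, k):
    """Deterministic seeding (an assumption): pick k mutually distant samples."""
    idx = [int(np.argmax(((X - X.mean(axis=0)) ** 2).sum(axis=1)))]
    while len(idx) < k:
        d = ((X[:, None, :] - X[idx][None, :, :]) ** 2).sum(axis=2).min(axis=1)
        idx.append(int(np.argmax(d)))
    return X[idx].copy()


def kmeans_factorization(X, k, n_iter=50):
    """Lloyd's algorithm written as X ~ Z @ W with one-hot rows in Z."""
    W = farthest_point_init(X, k)  # rows of W are the centroids
    for _ in range(n_iter):
        dist = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster's centroid as is
                W[j] = X[labels == j].mean(axis=0)
    labels = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
    Z = np.eye(k)[labels]  # each sample belongs to exactly one cluster
    return Z, W


# Synthetic data: two well-separated Gaussian clusters along a direction u.
rng = np.random.default_rng(0)
n_per_cluster, n_features = 200, 10
u = rng.standard_normal(n_features)
u /= np.linalg.norm(u)
signs = np.repeat([1.0, -1.0], n_per_cluster)
X = 6.0 * signs[:, None] * u + rng.standard_normal((2 * n_per_cluster, n_features))

Z_pca, W_pca, mean = pca_factorization(X, rank=1)
Z_km, W_km = kmeans_factorization(X, k=2)

# Relative squared reconstruction errors of the two factorizations.
total = np.sum((X - mean) ** 2)
err_pca = np.sum((X - mean - Z_pca @ W_pca) ** 2) / total
err_km = np.sum((X - Z_km @ W_km) ** 2) / total

# Alignment between the leading PCA vector and the centroid difference.
diff = W_km[0] - W_km[1]
cosine = abs(W_pca[0] @ diff) / np.linalg.norm(diff)

print(f"PCA error: {err_pca:.3f}, k-means error: {err_km:.3f}")
print(f"|cos(angle)| between bases: {cosine:.4f}")
```

Reducing the separation factor `6.0` toward zero removes the cluster structure, and the two bases are then no longer expected to agree. That is the kind of regime in which the choice of method matters.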