Articles | Volume 27, issue 2
https://doi.org/10.5194/npg-27-329-2020
© Author(s) 2020. This work is distributed under
the Creative Commons Attribution 4.0 License.
From research to applications – examples of operational ensemble post-processing in France using machine learning
Maxime Taillardat
CORRESPONDING AUTHOR
Météo-France, Toulouse, France
CNRM UMR 3589, Toulouse, France
Olivier Mestre
Météo-France, Toulouse, France
CNRM UMR 3589, Toulouse, France
Short summary
Statistical post-processing of ensemble forecasts is now a well-established way to correct biased and misdispersed ensemble weather predictions, but its practical application in European national weather services is still in its infancy. Different applications of ensemble post-processing using machine learning at an industrial scale are presented. Forecast quality and value are improved compared to the raw ensemble, but several adjustments have to be made to meet operational constraints.