Previous work has shown that preserving physical properties within the data assimilation framework can significantly reduce forecast errors. Data assimilation methods that can impose such constraints on the calculation of the analysis, such as the quadratic programming ensemble (QPEns), are computationally more expensive, which severely limits their application to the high-dimensional prediction systems found in Earth sciences. We therefore propose using a convolutional neural network (CNN) trained on the difference between the analysis produced by a standard ensemble Kalman filter (EnKF) and the QPEns to correct any violations of imposed constraints. In this paper, we focus on the conservation of mass and show that, in an idealised set-up, the hybrid of a CNN and the EnKF is capable of reducing analysis and background errors to the same level as the QPEns.

The ensemble Kalman filter

NNs are powerful tools to approximate arbitrary nonlinear functions

Fully replacing data assimilation by a NN has been attempted by

We generate our training data by performing twin experiments with the 1D modified shallow water model

The modified shallow water model

The 1D model domain, representing 125 km, is discretised with

The nature run, which mimics the true state of the atmosphere, is a model simulation starting from an arbitrary initial state. The ensemble is chosen to be of a small size with

To deal with undersampling, covariance localisation using the fifth piecewise rational function
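As an illustration of the localisation step, the sketch below implements the fifth-order piecewise rational function in the form given by Gaspari and Cohn (1999), on the assumption that this is the function meant here. The function names, the `half_width` parameter and the periodic treatment of distances are hypothetical choices made for this example and are not taken from the text.

```python
def gaspari_cohn(distance, half_width):
    """Fifth-order piecewise rational function (Gaspari and Cohn, 1999).

    Returns a weight that is 1 at zero distance and exactly 0 at and
    beyond 2 * half_width. `half_width` is an illustrative parameter name.
    """
    z = abs(distance) / half_width
    if z <= 1.0:
        return (-0.25 * z**5 + 0.5 * z**4 + 0.625 * z**3
                - (5.0 / 3.0) * z**2 + 1.0)
    if z < 2.0:
        return (z**5 / 12.0 - 0.5 * z**4 + 0.625 * z**3
                + (5.0 / 3.0) * z**2 - 5.0 * z + 4.0 - 2.0 / (3.0 * z))
    return 0.0


def localisation_matrix(n_points, half_width, periodic=True):
    """Localisation weights between all pairs of grid points on a 1D grid.

    Distances are measured in grid points. The periodic wrap-around is an
    assumption of this sketch, not a statement about the model in the text.
    """
    rho = []
    for i in range(n_points):
        row = []
        for j in range(n_points):
            d = abs(i - j)
            if periodic:
                d = min(d, n_points - d)  # shortest way around the domain
            row.append(gaspari_cohn(d, half_width))
        rho.append(row)
    return rho
```

Multiplying the sample background covariance element-wise by such a matrix damps the spurious long-range correlations that arise from a small ensemble while leaving the variances unchanged.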

When the assimilation window

We aim to produce initial conditions of the same quality as the ones produced by the QPEns by upgrading the initial conditions produced by the EnKF using a CNN. To that end, we generate QPEns cycling data

Schematic of the generation of the data sets

The output of our training set

A validation data set

Value of the loss function

Root mean squared error (RMSE) of the ensemble averaged over 500 experiments of the variables (columns) for the background (top rows) and analysis (bottom rows) as functions of the assimilation cycles for the EnKF (blue), the QPEns (red) and the CNN (green). The panels in

We choose to use a CNN with four convolutional hidden layers, consisting of 32 filters each with kernels of size 3, and the “selu” activation function as follows:
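A minimal sketch of the forward pass of such an architecture is given here in plain Python, so that it runs without a deep-learning library. Only the four hidden layers, the 32 filters, the kernel size of 3 and the selu activation come from the text. The three input and output channels, the periodic padding, the linear convolutional output layer, the LeCun-normal initialisation and all function names are assumptions made for illustration, and the weights are random rather than trained.

```python
import math
import random

# Standard SELU constants (Klambauer et al., 2017).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def selu(x):
    """Scaled exponential linear unit applied to one scalar."""
    return SELU_SCALE * (x if x > 0.0 else SELU_ALPHA * (math.exp(x) - 1.0))


def conv1d(x, weights, bias, activation=None):
    """1D convolution that keeps the grid length, with periodic padding.

    x:       list of n_points rows, each a list of c_in channel values
    weights: nested list indexed [k][c_in][c_out], odd kernel size
    bias:    list of c_out values
    """
    n_points = len(x)
    half = len(weights) // 2
    out = []
    for i in range(n_points):
        row = list(bias)
        for k, w_k in enumerate(weights):
            x_row = x[(i + k - half) % n_points]  # periodic wrap-around
            for c, x_val in enumerate(x_row):
                w_kc = w_k[c]
                for o in range(len(row)):
                    row[o] += x_val * w_kc[o]
        out.append([activation(v) for v in row] if activation else row)
    return out


def init_layer(rng, kernel_size, c_in, c_out):
    """Random LeCun-normal weights and zero bias (illustrative only)."""
    std = math.sqrt(1.0 / (kernel_size * c_in))
    weights = [[[rng.gauss(0.0, std) for _ in range(c_out)]
                for _ in range(c_in)] for _ in range(kernel_size)]
    return weights, [0.0] * c_out


def build_cnn(n_vars=3, n_filters=32, kernel_size=3, n_hidden=4, seed=0):
    """Four selu hidden layers plus an assumed linear output layer."""
    rng = random.Random(seed)
    channels = [n_vars] + [n_filters] * n_hidden
    layers = [init_layer(rng, kernel_size, channels[i], channels[i + 1]) + (selu,)
              for i in range(n_hidden)]
    layers.append(init_layer(rng, kernel_size, n_filters, n_vars) + (None,))
    return layers


def apply_cnn(layers, x):
    """Apply all layers in sequence to a state of shape [n_points][n_vars]."""
    for weights, bias, activation in layers:
        x = conv1d(x, weights, bias, activation)
    return x
```

Because every layer is convolutional and preserves the grid length, the output has the same shape as the input, which is what allows the network to map one analysis state to a corrected one.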

We assign the name

The loss function, the mean RMSE of the variables
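To make the quantity concrete, the following sketch computes an RMSE per variable over the grid and then averages over the variables. It covers only the part of the loss named in the text. The dictionary layout and the function names are assumptions made for this example, and any further terms in the loss are not represented.

```python
import math


def rmse(prediction, target):
    """Root mean squared error between two equally long sequences."""
    squared = [(p - t) ** 2 for p, t in zip(prediction, target)]
    return math.sqrt(sum(squared) / len(squared))


def mean_rmse(prediction, target):
    """Average of the per-variable RMSEs.

    Both arguments map a variable name to its values on the grid
    (a hypothetical layout chosen for this sketch).
    """
    return sum(rmse(prediction[v], target[v]) for v in target) / len(target)
```

Averaging the per-variable RMSEs, rather than pooling all grid values into a single RMSE, prevents a variable with a large numerical range from dominating the loss only if the variables are comparably scaled, so in practice some normalisation of the inputs would be expected.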

Next, we are interested in how the CNNs perform when applied within the data assimilation cycling. In Fig.

Absolute mass error averaged over 500 experiments of

With respect to RMSEs, for

The same as Table

Correlation coefficient for increments of the output (left column) and the prediction for

The same as Fig.

To support this claim, we trained an additional CNN with the training set corresponding to

The same as Fig.

Truth (black) and ensemble mean snapshot for EnKF (blue), QPEns (red) and NN with

Figures

Geoscience phenomena differ from standard data science applications in several respects: they are governed by physical laws, the observations are noisy, come from many different sources and are non-uniform in space and time, and the events of interest are rare. This makes the use of NNs particularly challenging for convective-scale applications, although attempts have been made to predict rain, hail or tornadoes

These encouraging results raise the question of whether this approach is feasible for fully complex numerical weather prediction systems. The challenge here lies in the generation of the training data. First, the effectiveness of conserving different quantities has to be verified in a non-idealised numerical weather prediction framework, where the quantities to be conserved may not be known and may not be exactly conserved within the numerical weather prediction model

The provided source code (

YR set up and performed the experiments and prepared the paper with contributions from all coauthors. SR set up the code for the CNN. TJ contributed to the scientific design of the study and the analysis of the numerical results.

The authors declare that they have no conflict of interest.

This research has been supported by the German Research Foundation (DFG; subproject B6 of the Transregional Collaborative Research Project SFB/TRR 165, “Waves to Weather” and grant no. JA 1077/4-1).

This paper was edited by Alberto Carrassi and reviewed by Marc Bocquet and Svetlana Dubinkina.