Data assimilation is considered as a problem in Bayesian estimation, viz. to determine the probability distribution for the state of the observed system, conditioned on the available data. In the linear and additive Gaussian case, a Monte Carlo sample of the Bayesian probability distribution (which is Gaussian and known explicitly) can be obtained by a simple procedure: perturb the data according to the probability distribution of their own errors, and perform an assimilation on the perturbed data. The performance of that approach, called here ensemble variational assimilation (EnsVAR), also known as ensemble of data assimilations (EDA), is studied in this two-part paper on the non-linear low-dimensional Lorenz-96 chaotic system, with the assimilation being performed by the standard variational procedure. In this first part, EnsVAR is implemented first, for reference, in a linear and Gaussian case, and then in a weakly non-linear case (assimilation over 5 days of the system). The performances of the algorithm, considered either as a probabilistic or as a deterministic estimator, are very similar in the two cases. Additional comparison shows that the performance of EnsVAR is better, both in the assimilation and forecast phases, than that of standard algorithms for the ensemble Kalman filter (EnKF) and particle filter (PF), although at a higher cost. Globally similar results are obtained with the Kuramoto–Sivashinsky (K–S) equation.

The purpose of assimilation of observations is to reconstruct as accurately as possible the state of the system under observation, using all the relevant available information. In geophysical fluid applications, such as meteorology or oceanography, that relevant information essentially consists of the physical observations and of the physical laws which govern the evolution of the atmosphere or the ocean. Those physical laws are in practice available in the form of a discretized numerical model. Assimilation is therefore the process by which the observations are combined with a numerical model of the dynamics of the observed system in order to obtain an accurate description of the state of that system.

All the available information, the observations as well as the numerical model, is
affected (and, as far as we can tell, will always be affected) by some
uncertainty, and one may wish to quantify the resulting uncertainty in the
output of the assimilation process. If one chooses to quantify uncertainty in
the form of probability distributions (see e.g.

There is one situation in which the Bayesian probability distribution is readily obtained in analytical form. That is when the link between the available information on the one hand, and the unknown system state on the other, is linear, and affected by additive Gaussian error. The Bayesian probability distribution is then Gaussian, with explicitly known expectation and covariance matrix (see Sect. 2 below).

Now, the very large dimension of the numerical models used in meteorology and
oceanography (that dimension can lie in the range

The EnKF, contrary to the standard KF, evolves an ensemble of points in state space. One advantage is that it can be readily, if empirically, implemented on non-linear dynamics. On the other hand, it keeps the same linear Gaussian procedure as the KF for updating the current uncertainty with new observations. The EnKF exists in many variants and, even with relatively small ensembles (O(10–100)), produces results of high quality. It has now become, together with variational assimilation, one of the two most powerful algorithms used for assimilation in large-dimension geophysical fluid applications.
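As an illustration of the update just described, the following is a minimal sketch of the stochastic (perturbed-observation) variant of the EnKF analysis step, for a linear observation operator; the function name, dimensions, and values are purely illustrative, and operational variants add ingredients such as localization and inflation that are omitted here.

```python
import numpy as np

def enkf_analysis(ensemble, y, H, R, rng):
    """Stochastic (perturbed-observation) EnKF analysis step.

    ensemble : (n, N) array, N forecast members of an n-dimensional state
    y        : (p,) observation vector
    H        : (p, n) linear observation operator
    R        : (p, p) observation-error covariance matrix
    """
    N = ensemble.shape[1]
    xbar = ensemble.mean(axis=1, keepdims=True)
    X = (ensemble - xbar) / np.sqrt(N - 1)            # normalized anomalies
    Pf = X @ X.T                                      # sample forecast covariance
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)    # Kalman gain
    # Each member assimilates an independently perturbed copy of y, so
    # that the analysis ensemble acquires the correct posterior spread.
    perturbed = y[:, None] + rng.multivariate_normal(
        np.zeros(len(y)), R, size=N).T
    return ensemble + K @ (perturbed - H @ ensemble)
```

The same linear Gaussian update formula is applied whatever the dynamics that produced the forecast ensemble, which is the empirical step alluded to above.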

Concerning the Bayesian properties of EnKF,

Contrary to EnKF, which was from the start developed for geophysical
applications (but has since extended to other fields), particle filters (PFs)
have been developed totally independently of such applications. They are
based on general Bayesian principles and are thus independent of any
hypothesis of linearity or Gaussianity (see

There exist at least two other algorithms that can be utilized to build a
sample of a given probability distribution. The first one is the
acceptance–rejection algorithm described in

Coming back to the linear and Gaussian case, not only, as said above, is the (Gaussian) conditional probability distribution explicitly known, but a simple algorithm exists for obtaining independent realizations of that distribution. In succinct terms: additively perturb the data according to their own error probability distribution, and perform the assimilation on the perturbed data. Repetition of this procedure on successive sets of independently perturbed data produces a Monte Carlo sample of the Bayesian posterior distribution.
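In a finite-dimensional linear Gaussian setting, the procedure can be sketched as follows; this is a toy illustration with an uninformative prior, and the matrices and sizes are arbitrary rather than those of the paper. The mean and covariance of the resulting ensemble converge to the exact Bayesian posterior.

```python
import numpy as np

rng = np.random.default_rng(42)

# Linear data model z = Gamma x + eps, with eps ~ N(0, S)  (sizes illustrative).
n, p = 3, 5
Gamma = rng.normal(size=(p, n))
S = 0.5 * np.eye(p)
S_inv = np.linalg.inv(S)

x_true = rng.normal(size=n)
z = Gamma @ x_true + rng.multivariate_normal(np.zeros(p), S)

# Exact Bayesian posterior (uninformative prior): N(xa, Pa), with
#   Pa = (Gamma^T S^-1 Gamma)^-1   and   xa = Pa Gamma^T S^-1 z.
Pa = np.linalg.inv(Gamma.T @ S_inv @ Gamma)
A = Pa @ Gamma.T @ S_inv          # maps a data vector to the cost-function minimizer
xa = A @ z

# EnsVAR sample: perturb z according to its own error law, and redo the
# (here linear least-squares) assimilation for each perturbed copy.
N = 100000
noise = rng.multivariate_normal(np.zeros(p), S, size=N)   # (N, p)
members = (A @ (z[:, None] + noise.T)).T                  # (N, n) ensemble
```

Since each member is the exact minimizer `A @ (z + delta)` with `delta ~ N(0, S)`, the ensemble is Gaussian with mean `xa` and covariance `A S A^T = Pa`, i.e. a Monte Carlo sample of the posterior.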

The present work is devoted to the study of that algorithm, and of its
properties as a Bayesian estimator, in non-linear and/or non-Gaussian cases.
Systematic experiments are performed on two low-dimensional chaotic toy
models, namely the model defined by

This algorithm is not new. There exist actually a rather large number of
algorithms for assimilation that are variational (at least partially) and
build (at least at some stage) an ensemble of estimates of the state of the
observed system. A review of those algorithms has been recently given by

EnsVAR, as defined here, has been specifically studied under various names and
in various contexts by several authors

EnsVAR is also used operationally at the European Centre for Medium-Range
Weather Forecasts (ECMWF)

None of the above ensemble methods, however, seems to have been systematically and objectively evaluated as a probabilistic estimator. That is precisely the object of the present two papers.

The first of these is devoted to the exactly linear and weakly non-linear cases, and the second to the fully non-linear case. In this first one, Sect. 2 describes in detail the EnsVAR algorithm, as well as the experimental set-up that is to be used in both parts of the work. Section 3 describes the statistical tests to be used for objectively assessing EnsVAR as a probabilistic estimator. EnsVAR is implemented in Sect. 4, for reference, in an exactly linear and Gaussian case in which theory says it achieves exact Bayesian estimation. It is implemented in Sect. 5 on the non-linear Lorenz system, over a relatively short assimilation window (5 days), over which the tangent linear approximation remains basically valid and the performance of the algorithm is shown not to be significantly altered. Comparison is made in Sect. 6 with two standard algorithms for EnKF and PF. Experiments performed on the Kuramoto–Sivashinsky equation are summarized in Sect. 7. Partial conclusions, valid for the weakly non-linear case, are drawn in Sect. 8.

The second part is devoted to the fully non-linear situation, in which EnsVAR
is implemented over assimilation windows for which the tangent linear
approximation is no longer valid. Good performance is nevertheless achieved
through the technique of quasi-static variational assimilation (QSVA),
defined by

The general conclusion of both parts is that EnsVAR can produce good results which, in terms of performance as a probabilistic estimator and of numerical accuracy, are at least as good as the results of EnKF and PF.

In the sequel of the paper we denote by

We assume the available data make up a vector

In those conditions the Bayesian probability distribution

At first glance, the above equations seem to require the invertibility of the

The conditional expectation

In the case where the error

Minimization of Eq. (

Coming back to the linear and Gaussian case, consider the perturbed data
vector

That is the ensemble variational assimilation, or EnsVAR, algorithm that is
implemented below in non-linear and non-Gaussian situations, with the
analogue of the estimate

All the experiments presented in this work are of the standard identical-twin type, in which the observations to be assimilated are extracted from a prior reference integration of the assimilating model. And all experiments presented in this first part are of the strong-constraint variational assimilation type, in which the temporal sequence of states produced by the assimilation is constrained to satisfy exactly the equations of the assimilating model.

That model, which will emanate from either the Lorenz or the
Kuramoto–Sivashinsky equation, will be written as

Choosing an assimilation window

The following process is then implemented

(i) Perturb the observations

(ii) Assimilate the perturbed observations

The objective function (Eq.

The process (i)–(ii), repeated

In the perspective taken here, it is not the properties of those individual solutions that matter the most, but the properties of the ensemble considered as a sample of a probability distribution.

The ensemble assimilation process, starting from Eq. (

In variational assimilation as it is usually implemented, the objective
function to be minimized contains a so-called background term at the
initial time

The covariance matrix

We sum up the description of the experimental procedure and define precisely
the vocabulary to be used in the sequel. The output of one
experiment consists of

The minimizations (Eq.

We recall the general result that, among all deterministic functions from
data space into state space, the conditional expectation

What should ideally be done here for the validation of results is objectively
assess (if not on a case-by-case basis, at least in a statistical sense)
whether the ensembles produced by EnsVAR are samples of the corresponding
Bayesian probability distributions. In the present setting, where the
probability distribution of the errors

Through repeated independent realizations of the process defined by
Eqs. (

We have evaluated instead the weaker property of reliability (also called calibration). Reliability of a probabilistic estimation system (i.e. a system that produces probabilities for the quantities to be estimated) is the statistical consistency between the predicted probabilities and the observed frequencies of occurrence.
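Reliability can be illustrated on a toy binary-event system; the construction below, with hypothetical announced probabilities, mirrors the reliability diagram used later in the paper: cases are binned by announced probability, and for each bin the observed frequency of occurrence is compared with the announced value.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy probabilistic system: for each of M cases it announces a probability
# q for a binary event.  In the reliable system the event indeed occurs
# with frequency q; in the biased system it occurs with frequency q**2.
M = 200000
q = rng.uniform(size=M)
occurred_reliable = rng.uniform(size=M) < q
occurred_biased = rng.uniform(size=M) < q**2

def observed_frequencies(q, occurred, bins=10):
    """Bin the cases by announced probability; return, for each bin,
    its centre and the observed frequency of occurrence."""
    edges = np.linspace(0.0, 1.0, bins + 1)
    idx = np.clip(np.digitize(q, edges) - 1, 0, bins - 1)
    centres = 0.5 * (edges[:-1] + edges[1:])
    freqs = np.array([occurred[idx == b].mean() for b in range(bins)])
    return centres, freqs
```

For the reliable system the points (announced probability, observed frequency) lie on the diagonal of the reliability diagram; for the biased system they depart from it.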

Consider a probability distribution

Reliability can be objectively evaluated, provided a large enough
verification sample is available. Bayesianity clearly implies reliability.
For any data vector

Root-mean-square errors from the truth as functions of time along
the assimilation window (linear and Gaussian case). Blue curve: error in
individual minimizations. Red curve: error in the means of the ensembles.
Green curve: error in the assimilations performed with the unperturbed
observations

Diagnostics of statistical performance (linear and Gaussian case).

Another desirable property of a probabilistic estimation system, although not directly related to Bayesianity, is resolution (also called sharpness). It is the capacity of the system for a priori distinguishing between different outcomes. For instance, a system which always predicts the climatological probability distribution is perfectly reliable, but has no resolution. Resolution, like reliability, can be objectively evaluated if a large enough verification sample is available.

We will use several standard diagnostic tools for validation of our results.
We first note that the error in the mean of the predicted ensembles is itself
a measure of resolution. The smaller that error, the higher the capacity of
the system to a priori distinguish between different outcomes. Concerning
reliability, the classical rank histogram and the reduced centred random
variable (RCRV) (the latter is described in Appendix A) are (non-equivalent)
measures of the reliability of probabilistic prediction of a scalar variable.
The reliability diagram and the associated Brier score are relative to
probabilistic prediction of a binary event. The Brier score decomposes into
two parts, which measure respectively the reliability and the resolution of
the prediction. The definition used here for those components is given in
Appendix A (Eqs.

We present in this section results obtained in an exactly linear and Gaussian
case, in which theory says that EnsVAR must produce an exact Monte Carlo
Bayesian sample. These results are to be used as a benchmark for the
evaluation of later results. The numerical model (Eq.

Since conditions for exact Bayesianity are verified, any deviation of the
results from exact reliability can be due only to the finiteness

Figure

All errors are smaller than the observation error (horizontal dashed–dotted
line). The estimation errors are largest at both ends of the assimilation
window and smallest at some intermediate time. As is known, and as already
discussed by various authors

For a reliable system, the reduced centred random variable, which we denote
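Assuming the usual definition of the reduced centred random variable (the verification minus the ensemble mean, normalized by the predicted standard deviation, without finite-ensemble correction factors), its two reliability conditions, mean close to 0 and standard deviation close to 1, can be checked on a synthetic reliable system:

```python
import numpy as np

rng = np.random.default_rng(3)

def rcrv(truth, ensembles):
    """Reduced centred random variable, one value per case.

    truth     : (M,) verification values
    ensembles : (M, N) predicted ensembles
    For a reliable system the M values have mean ~0 and std ~1.
    """
    m = ensembles.mean(axis=1)
    s = ensembles.std(axis=1, ddof=1)
    return (truth - m) / s

# Synthetic reliable system: for each case, the truth and the N ensemble
# members are independent draws from the same Gaussian distribution.
M, N = 20000, 100
mu = rng.normal(size=M)                       # case-dependent centre
truth = mu + rng.normal(size=M)
ens = mu[:, None] + rng.normal(size=(M, N))
```

With a finite ensemble the standard deviation of the RCRV slightly exceeds 1 (by a factor depending on N), which is one reason why some definitions include an explicit correction factor.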

Histogram of (half) the minima of the objective function
(Eq.

Figure

Diagnostics relative to the non-linear and Gaussian case, with
assimilation over 5 days.

Same as Fig.

It is known that the minimum

The histogram of the minima

For the theoretical conditions of exact Bayesianity considered here, reliability should be perfect and should not be degraded when the information content of the observations decreases (through increased observation error and/or degraded spatial and/or temporal resolution of the observations). Statistical resolution should, on the other hand, be degraded. Experiments have been performed to check this aspect (the exact experimental procedure is described in Sect. 5). The numerical results (not shown) are that both components of the Brier score are actually degraded and can increase by 1 order of magnitude. The reliability component always remains much smaller than the resolution component, and the degradation of the latter is much more systematic. This is in good agreement with the fact that the degradation of reliability can be due only to numerical effects, such as less efficient minimizations.

The above results, obtained in the case of exact theoretical Bayesianity, are going to serve as reference for the evaluation of EnsVAR in non-linear and non-Gaussian situations where Bayesianity does not necessarily hold.

The non-linear Lorenz-96 model
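For reference, here is a sketch of the standard Lorenz-96 model, with the conventional choices of 40 variables and forcing F = 8 (the paper's own time units and experimental settings are specified in the text), integrated with a fourth-order Runge–Kutta scheme as would be done to produce the truth of an identical-twin experiment:

```python
import numpy as np

def lorenz96_tendency(x, F=8.0):
    """Tendency of the standard Lorenz-96 model:
    dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F   (indices cyclic)."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def rk4_step(x, dt, F=8.0):
    """One fourth-order Runge-Kutta time step."""
    k1 = lorenz96_tendency(x, F)
    k2 = lorenz96_tendency(x + 0.5 * dt * k1, F)
    k3 = lorenz96_tendency(x + 0.5 * dt * k2, F)
    k4 = lorenz96_tendency(x + dt * k3, F)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

# Truth trajectory for an identical-twin experiment: integrate a slightly
# perturbed rest state well past the initial transient, onto the attractor.
n, dt = 40, 0.05          # 0.05 time units ~ 6 h in Lorenz's original scaling
x = np.full(n, 8.0)
x[0] += 0.01
for _ in range(2000):
    x = rk4_step(x, dt)
```

Noisy observations for the twin experiment are then obtained by adding draws from the assumed observation-error distribution to values extracted from this trajectory.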

Impact of the informative content of observations on the two
components of the Brier score (non-linear case). The format of each panel is
the same as the format of the bottom panels of Figs.

Values of (half) the minima of the objective function for all realizations (non-linear case) (horizontal coordinate: realization number; vertical coordinate: value of the minima).

Except for the dynamical model, the experimental setup is fundamentally the
same as in the linear case. In particular, the model time step is 0.25 days (our
definition), the observation frequency is 0.5 days, and the values

The results are shown in Fig.

The bottom panel, which shows error statistics accumulated over all
assimilation windows, is in the same format as Fig.

Figure

Cross section of the objective function

Figure

Figure

In view of previous results, in particular results obtained by

Non-linearity is also obvious in Fig.

We have evaluated the Gaussian character of univariate marginals of the
ensembles produced by the assimilation by computing their
negentropy. The negentropy of a probability distribution is the
Kullback–Leibler divergence of that distribution with respect to the Gaussian
distribution with the same expectation and variance (see Appendix B). The
negentropy is positive and is equal to 0 for exact Gaussianity. The mean
negentropy of the ensembles is here
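For readers wishing to reproduce this diagnostic, a common moment-based approximation of negentropy (due to Jones and Sibson, and not necessarily the estimator used in Appendix B) can be sketched as follows:

```python
import numpy as np

def negentropy_approx(sample):
    """Moment-based approximation of the negentropy of a scalar sample:
    J ~ skewness^2 / 12 + (excess kurtosis)^2 / 48,
    computed on the standardized sample.  The value is non-negative and
    close to 0 (up to sampling error) for a Gaussian sample."""
    z = (sample - sample.mean()) / sample.std(ddof=1)
    skew = np.mean(z**3)
    exkurt = np.mean(z**4) - 3.0
    return skew**2 / 12.0 + exkurt**2 / 48.0
```

A Gaussian sample gives a value near 0, while a Laplace sample (excess kurtosis 3) gives a markedly positive value, consistent with the distinction discussed in the text.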

Experiments have been performed in which the observational error, instead of
being Gaussian, has been taken to follow a Laplace distribution (with still
the same variance

Same as Fig.

Same as Fig.

We present in this section a comparison with results obtained with the
ensemble Kalman filter (EnKF) and the particle filter (PF). As used here,
those filters are sequential in time. Fair comparison is therefore possible
only at the end of the assimilation window. Figure

Figure

Comparison with Fig.

Following comments from referees, we have performed a few experiments not using localization in the EnKF. The RMSE and the RCRV are significantly degraded, while the rank histogram and the resolution component of the Brier score are improved. The reliability component of the Brier score remains the same. All this is true for both assimilation and forecast. These results, not included in the paper, would deserve further study, which is postponed to future work.

Figure

RMS errors at the end of 5 days of assimilation (left column) and of 5 days of forecast (right column) for the three algorithms.

Same as Fig.

Same as Fig.

Same as Fig.

Same as Fig.

The left column of Table

Finally, the right column of Table

Similar experiments have been performed with the Kuramoto–Sivashinsky (K–S)
equation. It is a one-dimensional spatially periodic evolution equation, with
an advective non-linearity, a fourth-order dissipation term, and a second-order
anti-dissipative term. It reads
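Assuming the standard non-dimensional form (the paper's exact scaling and domain length are not reproduced here), an equation with the three terms just listed reads

```latex
\frac{\partial u}{\partial t}
  + u\,\frac{\partial u}{\partial x}
  + \frac{\partial^2 u}{\partial x^2}
  + \frac{\partial^4 u}{\partial x^4} = 0 ,
\qquad u(x+L,\,t) = u(x,\,t) ,
```

where the term in $u\,\partial u/\partial x$ is the advective non-linearity, the second-order term is anti-dissipative, and the fourth-order term is dissipative.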

With

Ensemble variational assimilation (EnsVAR) has been implemented on two small-dimension non-linear chaotic toy models, as well as on linearized versions of those models.

One specific goal of the paper was to stress what is in the authors' minds a critical aspect, namely the need to systematically evaluate ensembles produced by ensemble assimilation as probabilistic estimators. This requires us to consider these ensembles as defining probability distributions (instead of evaluating them principally, for instance, by the error in their mean). In view of the impossibility of objectively validating the Bayesianity of ensembles, the weaker property of reliability has been evaluated instead. In the linear and Gaussian case, where theory says that EnsVAR is exactly Bayesian, the reliability of the ensembles produced by EnsVAR is high, but not numerically perfect, showing the effect of sampling errors and, probably, of numerical conditioning.

In the non-linear case, EnsVAR, implemented on temporal windows of the order of magnitude of the predictability time of the systems, shows performance as good as (and in some cases slightly better than) in the exactly linear case. Comparison with the ensemble Kalman filter (EnKF) and the particle filter (PF) shows that EnsVAR is globally as good a statistical estimator as those two other algorithms.

On the other hand, EnsVAR, as it has been implemented here, is numerically more costly than either EnKF or PF. And the specific algorithms used for the latter two methods may not be the most efficient. But it is worthwhile to evaluate EnsVAR in the more demanding conditions of stronger non-linearity. That is the object of the second part of this work.

No data sets were used in this article.

This Appendix describes in some detail two of the scores that are used for
evaluation of results in the paper, namely the reduced centred random
variable and the reliability–resolution decomposition of the classical
Brier score. Given a predicted probability distribution for a scalar
variable

We recall the Brier score for a binary event

The first term on the right-hand side, which measures the horizontal
dispersion of the points on the reliability diagram about the diagonal, is a
measure of reliability. The second term, which is a (negative) measure of the
vertical dispersion of the points, is a measure of resolution (the larger the
dispersion, the higher the resolution, and the smaller the second term on the
right-hand side). It is those two terms, divided by the constant

Both measures are negatively oriented and have 0 as optimal value.
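The decomposition can be sketched as follows, assuming the standard binning of announced probabilities (Murphy's decomposition; any normalization by the climatological uncertainty is left to the caller):

```python
import numpy as np

def brier_decomposition(q, occurred, bins=10):
    """Murphy decomposition of the Brier score for a binary event.

    q        : announced probabilities in [0, 1]
    occurred : boolean outcomes
    Returns (brier, reliability, resolution, uncertainty).  The identity
    brier = reliability - resolution + uncertainty holds up to a small
    within-bin variance term when q varies inside the bins.
    """
    q = np.asarray(q, dtype=float)
    o = np.asarray(occurred, dtype=float)
    M = len(q)
    edges = np.linspace(0.0, 1.0, bins + 1)
    idx = np.clip(np.digitize(q, edges) - 1, 0, bins - 1)
    obar = o.mean()
    rel = res = 0.0
    for b in range(bins):
        sel = idx == b
        nk = sel.sum()
        if nk == 0:
            continue
        qk = q[sel].mean()          # mean announced probability in the bin
        ok = o[sel].mean()          # observed frequency in the bin
        rel += nk * (qk - ok) ** 2  # horizontal dispersion about the diagonal
        res += nk * (ok - obar) ** 2  # vertical dispersion of bin frequencies
    unc = obar * (1.0 - obar)
    brier = np.mean((q - o) ** 2)
    return brier, rel / M, res / M, unc
```

The reliability term vanishes for a reliable system, while the resolution term (negatively oriented after the sign convention of the decomposition) grows with the spread of the observed bin frequencies, in agreement with the description above.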

As said in the main text, more on the above diagnostics and, more generally,
on objective validation of probabilistic estimation systems can be found in
e.g. chap. 8 of the book by

The negentropy of a probability distribution with density

MJ and OT have defined together the scientific approach to the paper and the numerical experiments to be performed. MJ has written the codes and run the experiments. Most of the writing has been carried out by OT.

The authors declare that they have no conflict of interest.

This article is part of the special issue “Numerical modeling, predictability and data assimilation in weather, ocean and climate: A special issue honoring the legacy of Anna Trevisan (1946–2016)”. It is a result of a Symposium Honoring the Legacy of Anna Trevisan – Bologna, Italy, 17–20 October 2017.

This work has been supported by Agence Nationale de la Recherche, France, through the Prevassemble and Geo-Fluids projects, as well as by the programme Les enveloppes fluides et l'environnement of Institut national des sciences de l'Univers, Centre national de la recherche scientifique, Paris. The authors acknowledge fruitful discussions during the preparation of the paper with Julien Brajard and Marc Bocquet. The latter also acted as a referee, along with Massimo Bonavita. Both of them made further suggestions which significantly improved the paper.

Edited by: Alberto Carrassi
Reviewed by: Marc Bocquet and Massimo Bonavita