This contribution addresses the characterization of the model-error covariance matrix from the new theoretical perspective provided by the parametric Kalman filter method which approximates the covariance dynamics from the parametric evolution of a covariance model. The classical approach to obtain the modified equation of a dynamics is revisited to formulate a parametric modelling of the model-error covariance matrix which applies when the numerical model is dissipative compared with the true dynamics. As an illustration, the particular case of the advection equation is considered as a simple test bed. After the theoretical derivation of the predictability-error covariance matrices of both the nature and the numerical model, a numerical simulation is proposed which illustrates the properties of the resulting model-error covariance matrix.

A significant portion of the work being carried out in
state-of-the-art data assimilation concerns the
treatment of the forecast-error covariance matrix.
The forecast error is composed of two parts. While one part
of it is related to the uncertainty in the initial condition, another part is due to the
model error

Although some theoretical studies have been conducted in the past, which elucidate
the generic behaviour related to the model error from the dynamical system perspective
and in connection with the data assimilation
(e.g.

It has been noted in Kalman filtering and ensemble Kalman filtering (EnKF) that the propagation of error covariance
with a discretized advection model produces a model error (variance) in the form of a
variance loss

Recently,

The aims of the present work are to study how the parametric dynamics for
covariance matrix evolution can help to characterize the model-error covariance
matrix and, more precisely, to determine whether it is possible to capture
some part of the model-error covariance which is due to the numerical
scheme. In this methodological contribution, we will limit ourselves to
diffusive numerical errors whose uncertainty dynamics can be
explored from the results of

The paper is organized as follows: the background in data assimilation is
reviewed in Sect.

Here, we assume that the

Due to the imperfect knowledge of the nature and the limitations encountered
during the computation, the nature dynamics is only approximated by

In practice, the true state

The forecast-error covariance matrix is related to the analysis-error
covariance matrix through a deterministic relation as follows.
From the definition of the forecast error (Eq.

When the analysis error and the model error are
decorrelated, the forecast-error covariance matrix is written as

Note that, in the case where the true nature is used to forecast the uncertainty,
the forecast-error covariance matrix coincides with the predictability-error covariance
matrix. In the latter, the predictability error with respect to the nature dynamics
plays an important role. So in order to avoid any confusion with the predictability error associated
with the numerical model, the notation

The modelling of the model error can be seen as a trade-off between its real properties
and the lack of knowledge to address this error.
In particular, the various assumptions encountered in data assimilation may be considered as
suboptimal ways to model this error. For instance, assuming that the model error
is unbiased leads to modelling the bias as some variance and overestimates the
effective model-error variance. Likewise, assuming a decorrelation between
the analysis and the model errors is certainly wrong for a deterministic error, such as the
model error due to the discretization of the dynamics, but it may not apply for highly non-linear
processes such as turbulent processes and turbulent transport.
Again, assuming the decorrelation between the analysis and the model errors leads to
overestimating the true effect of the model error, and hence the true forecast-error
uncertainty.
However, with these assumptions, or rather with
this modelling, some part of the model-error statistics can be estimated from the data.
For instance, with the assumption that the analysis and the model errors are decorrelated,
leading to Eq. (

Illustration of the evolution of the uncertainty by the nature and the numerical
model: the generic situation

In some respects, the current understanding and specification of the model-error covariance
matrix resemble the state of development of the background-error covariance matrix some
decades ago.
Indeed, in variational data assimilation, the background-error covariance matrix was a constant matrix,
estimated from the climatology

Because the term model error can mean different things, we clarify the context in which we use
it by considering the situation sketched in
Fig.

Figure

Figure

Thereafter, we consider the situation sketched in Fig.

Thus,

The decomposition (Eq.

Compared with climatological modelling of the model-error covariance matrix, as usually
encountered in data assimilation, the model for

Note that the modelling equation (Eq.

Indeed, M2000s introduced a modelling of the model-error covariance matrix similar to
Eq. (

In particular, M2000s have observed that the
Kalman filter, with the corrected predictability-error covariance, required less residual
model error

At a computational level,

However, computing

To overcome the above limitations, a high-order discretization

The parametric formulation provides a framework where a limited number of covariance parameters
(based on the continuous PDE) of the nature can be computed. The parametric formulation works
as follows.
If

The family of covariance models parameterized by the variance field
and the local anisotropic tensors, the VLATcov models, are of particular
interest

For instance, the diffusion operator of

Following

Now, we apply the parametric covariance dynamics for model-error covariance estimation.

From now on, we will assume that

If the dynamics of the parameters

Hence, when the nature is known through its partial differential equations,
the parametric dynamics suggests a new method to compute the
model-error covariance matrix, as follows.
By considering the TL dynamics for the numerical model and for the nature,
Eq. (26) provides a way to
compute both the predictability-error covariance matrices

In the next section, we apply the parametric model-error dynamics to a transport equation.

The transport equation of a passive scalar

Two discretization methods are interesting to study for the transport
equation: the finite-difference approach and the semi-Lagrangian method
resulting from the Lagrangian interpretation of Eq. (

The aim of this section is to detail the model-error covariance matrix for both schemes. This theoretical part is organized as follows. The error covariance parametric dynamics for the nature is first described considering the covariance model based on the diffusion equation; then both finite-difference and semi-Lagrangian schemes are introduced with their particular parametric dynamics.

To describe the time evolution of the predictability-error covariance
matrix, Eq. (

The computation of the metric dynamics (Eq.

It follows from Eq. (36) that the variance and the diffusion are
independent quantities. The variance is conserved while it is transported by the wind.
The diffusion is not only transported, but it is also modified by the source
term

Hence, in this subsection, the predictability-error covariance for the nature
(Eq.

The finite-difference scheme is now considered as a first numerical integration
method for Eq. (

When the velocity field

While the numerical solution computed with the aid of a given numerical scheme
can converge toward the true solution as

More precisely, if

Compared with the nature (Eq.

From the PKF point of view, the modified equation is crucial since it
converts a discrete dynamics into a partial differential equation, which,
as appeared from P16 and P18, is much simpler to handle when considering error covariance
dynamics.
Thanks to the modified equation (Eq. 38), it is now possible to
compute the TL evolution of the predictability error for the Euler-upwind
scheme, which can be expressed as

Equations of the PKF forecast can be computed under a similar
derivation as in Sect.

Compared with the PKF dynamics of the nature (Eq. 36), the
PKF for the Euler-upwind scheme gives rise to additional terms
which result from the numerical diffusion of magnitude

The model-error covariance matrix, Eq. (

As another example, the model-error parameters for the semi-Lagrangian scheme are now discussed.

The modified equation technique has been previously considered
for semi-Lagrangian (SL) schemes. For instance,

Because we want to focus on the method to address the issue of the model error,
and since uncertainty prediction of diffusive dynamics has been detailed by P18,
we limit the presentation to the linear interpolation in the semi-Lagrangian scheme,
and we present the modified equation of
Eq. (

The Lagrangian perspective (Eqs. 31 to

In the Lagrangian way of thinking,
starting from a given position

In its present form, the semi-Lagrangian procedure is not suited to the PKF method
since it does not give rise to any partial differential equation, and such equations lie
at the core of the parametric approximation for covariance dynamics.
To proceed further and to obtain PDEs, additional assumptions are introduced
to translate the semi-Lagrangian procedure (Eq.

In the case where the discretization satisfies the CFL condition

Hence, since this corresponds mainly to the modified equation
(Eq. 38) encountered for the Euler-upwind scheme (Eq.

Note that the derivation leads to the Euler-upwind and Euler-downwind schemes because of the choice of the linear interpolation. The bridge between the SL and the Euler-upwind/downwind procedures is not new. The derivation has been carried out because it offers an insight into how to build a modified equation for the SL scheme, and for the self-consistency of the presentation. In the general situation, the modified equation for the SL scheme is hard to obtain, if at all possible, and we do not claim that the procedure is universal. It nevertheless provides a new insight into the model-error covariance matrix for the SL scheme, which is one of the main goals of the present contribution.
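The bridge between the SL scheme with linear interpolation and the Euler-upwind scheme can be checked numerically. The following is a minimal sketch under stated assumptions, not the code used for the experiments: the grid is periodic, the velocity is positive, the CFL number stays below one, and the departure point is computed with a first-order backward trajectory. The function names and the test profiles are hypothetical.

```python
import numpy as np


def euler_upwind_step(q, u, dt, dx):
    """One Euler-upwind step for dq/dt + u dq/dx = 0 (periodic grid, u > 0)."""
    return q - u * dt / dx * (q - np.roll(q, 1))


def semi_lagrangian_step(q, u, dt, dx):
    """One semi-Lagrangian step with linear interpolation (periodic grid)."""
    n = q.size
    x = np.arange(n) * dx
    # Departure points from a first-order backward trajectory (assumption).
    x_dep = (x - u * dt) % (n * dx)
    # Grid index to the left of each departure point and interpolation weight.
    pos = x_dep / dx
    left = np.floor(pos).astype(int) % n
    w = pos - np.floor(pos)
    return (1.0 - w) * q[left] + w * q[(left + 1) % n]


# Illustrative setting: heterogeneous positive wind, CFL number in [0.25, 0.75].
n = 64
dx = 1.0 / n
x = np.arange(n) * dx
u = 1.0 + 0.5 * np.sin(2.0 * np.pi * x)
dt = 0.5 * dx
q0 = np.exp(-100.0 * (x - 0.5) ** 2)

q_upwind = euler_upwind_step(q0, u, dt, dx)
q_sl = semi_lagrangian_step(q0, u, dt, dx)
max_gap = np.max(np.abs(q_upwind - q_sl))
```

Under these assumptions the two updates reduce to the same weighted average of a grid point and its upwind neighbour, so `max_gap` should be at round-off level. With a higher-order interpolation or a CFL number above one, the equivalence no longer holds.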

The next section presents the numerical experiments carried out to assess the ability of the PKF to characterize the model-error covariance matrix.

In this experimental test bed, the domain is assumed to be the one-dimensional
segment

Nature

The wind field

In order to verify the CFL condition, the time step for the numerical
simulation is set to

For the numerical experiment, the initial state for

For numerical validation, since no simple analytical solution of the partial
differential equation (Eq.

Figure

Having validated the two numerical models

Predictability-error variance field,

The length-scale counterpart of Fig.

Time evolution of the spatial average over the domain of the predictability-error variance

The PKF predictability-error covariance matrix dynamics for the
transport equation (Eq.

To do so, an ensemble of

Because the dynamics are linear, the TL nature and model are independent
of any analysis state, and the ensemble is computed from the forecasts
by the high-order discretization of the nature

The predictability-error covariance dynamics for the nature is first considered.
Since the variance of the nature (Eq.

The predictability-error covariance dynamics for the numerical model is now discussed.
For the Euler-upwind scheme, the numerical diffusion
resulting from the spatiotemporal discretization in Eq. (

As a conclusion of this section, the PKF appears able to predict the variance
and the length-scale features of the predictability-error covariance dynamics of
the nature (Eq.

From the previous section, the Euler-upwind discretization of the advection (Eq.

Flow-dependent model-error covariance, modelled from Eq. (

In order to focus on the flow-dependent part of Eq. (

At the initial time, as there is no model error, the model-error variance is zero. But then,
the model-error variance should increase linearly because the sink term

Then, the model-error variance continues to grow, with a peak of uncertainty that evolves
with the flow.
In this numerical experiment, with the magnitude of the

The model-error length scale, given by Eq. (

Note that the model-error length scale is much smaller, but non-zero, which will balance the
large length scale of the predictability-error covariance matrix

It is interesting to compare

Hence, the present numerical experiment illustrated and characterized the flow-dependent part of the
model-error covariance

Before concluding, we end this work by addressing some general points about the flow-dependent model which has been introduced here.

The originality of the present contribution is two-fold.
First, we have formulated a theoretical background corresponding
to the model-error covariance matrix and introduced a modelling for its flow-dependent part,
Eq. (

The flow-dependent component of the model-error covariance introduced here can be computed in practice, because it relies on (1) the analysis uncertainty as characterized by the analysis state and its error covariance that can be estimated in data assimilation; and (2) the time evolution of the analysis-error covariance by the nature and by the numerical model that can be computed from an ensemble method or from the PKF approach.
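The ensemble route mentioned in point (2) can be sketched as follows. This is an illustrative sketch under assumptions that are not taken from the experiments of this paper: the velocity is constant and positive, the domain is periodic, the nature is represented by an exact grid shift, the numerical model is an Euler-upwind scheme, and the flow-dependent part is taken as the nature covariance minus the model covariance. All names and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_ens = 64, 200
cfl = 0.5        # u * dt / dx, assumed constant
n_steps = 40     # even, so the exact shift is an integer number of cells

# Smooth random initial errors: white noise filtered in spectral space.
wavenumbers = np.fft.rfftfreq(n, d=1.0 / n)
spectral_filter = np.exp(-0.5 * (wavenumbers / 4.0) ** 2)
noise = rng.standard_normal((n_ens, n))
ens0 = np.fft.irfft(np.fft.rfft(noise, axis=1) * spectral_filter, n=n, axis=1)


def upwind_forecast(ens, cfl, n_steps):
    """Propagate every member with the Euler-upwind scheme (periodic, u > 0)."""
    for _ in range(n_steps):
        ens = ens - cfl * (ens - np.roll(ens, 1, axis=-1))
    return ens


def ensemble_covariance(ens):
    """Unbiased sample covariance; rows are members, columns are grid points."""
    anomalies = ens - ens.mean(axis=0)
    return anomalies.T @ anomalies / (ens.shape[0] - 1)


# Nature: exact transport, here a pure shift of cfl * n_steps cells.
ens_nature = np.roll(ens0, int(cfl * n_steps), axis=1)
ens_model = upwind_forecast(ens0, cfl, n_steps)

P_nature = ensemble_covariance(ens_nature)
P_model = ensemble_covariance(ens_model)

# Flow-dependent part of the model-error covariance (sign convention assumed).
Q_flow = P_nature - P_model
mean_model_error_variance = np.trace(Q_flow) / n
```

In this idealized setting the upwind scheme averages neighbouring values at each step, so the domain-averaged variance of the model ensemble cannot exceed that of the nature ensemble and `mean_model_error_variance` is non-negative. The same difference could be formed from PKF parameters instead of ensemble statistics.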

Note that, although the difference between a low- and a high-resolution forecast is often used to
compute the model error at a given time, it says nothing about the model-error
covariances at that time. At most, the model errors collected for a large number of dates,
and for the same forecast time, can be used to compute the climatological bias and the climatological
model-error covariance. To capture the error of the day following Eq. (

Hence, the use of the PKF is important because Eq. (

For the dynamics of a tracer, the PKF applies in 1-D as well as in 2-D and 3-D domains, where the number of equations is then five in 2-D and eight in 3-D (the additional equations are for the components of the local anisotropic tensor). However, in general, the use of the PKF is limited by the knowledge of the parameter dynamics. The formalism of the PKF is suited to dynamics given by partial differential equations, such as the advection of a tracer, but a multivariate PKF formulation still has to be designed to address multivariate dynamics. Note that, for the model error as presented here, knowledge of the modified equation is a prerequisite, and this equation can be difficult to determine in general.

While the PKF is designed from the TL approximation, it is a second-order Gaussian filter
that is a particular implementation of non-linear Kalman-like filters

In this contribution, the part of the model-error covariance due to the spatiotemporal discretization scheme is explored by considering the parametric approximation for the Kalman filter (PKF). The PKF approach applies to a system whose dynamics is given by a set of PDEs. In the PKF formulation, covariances are approximated by covariance models characterized by a set of covariance parameters, whose dynamics is deduced from the PDEs of the system, supplemented by an appropriate closure if necessary. We focused on the class of covariance models characterized by the variance field and the local anisotropic tensors (VLATcov). For VLATcov matrices, the covariance dynamics therefore reduces to the dynamics of the variance and of the local anisotropic tensors.

In the case where the numerical model presents a dissipation due to the discretization, or where the numerical model is more dissipative than the nature, we introduced a modelling of the model-error covariance, where its flow-dependent part is approximated as the difference between the parametric approximation of the predictability-error covariance matrix of the nature and that of the numerical model, plus a residual climatological covariance matrix. This modelling of the flow-dependent part can be computed in real applications because it relies on quantities that can be estimated: the analysis state and its analysis-error covariance matrix (or some of its characteristics). For a dynamics given by a partial differential equation, the parametric predictability-error covariance matrix of the nature is deduced from the evolution equation, while the predictability-error covariance matrix of the numerical model is computed from the modified equation, i.e. the partial differential equation that best fits the numerical solution.
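The idea that a modified equation is the partial differential equation that best fits the numerical solution can be illustrated with a small numerical check. The sketch below is not taken from the paper's experiments: it assumes a constant positive velocity on a periodic domain and uses the textbook leading-order numerical diffusion of the Euler-upwind scheme, kappa = u dx (1 - CFL) / 2, which is stated here as an assumption. All names and values are hypothetical.

```python
import numpy as np

n = 128
dx = 1.0 / n
x = np.arange(n) * dx
u = 1.0
cfl = 0.5
dt = cfl * dx / u
n_steps = 256
mode = 2                              # integer wavenumber on the unit domain
k = 2.0 * np.pi * mode

# Euler-upwind integration of a single Fourier mode.
q = np.sin(k * x)
for _ in range(n_steps):
    q = q - cfl * (q - np.roll(q, 1))

# Amplitude of the mode after integration, read from its Fourier coefficient.
measured_amplitude = 2.0 * np.abs(np.fft.rfft(q)[mode]) / n

# Amplitude predicted by the advection-diffusion modified equation
# with the assumed leading-order numerical diffusion coefficient.
kappa = 0.5 * u * dx * (1.0 - cfl)
predicted_amplitude = np.exp(-kappa * k**2 * n_steps * dt)

relative_gap = abs(measured_amplitude - predicted_amplitude) / predicted_amplitude
```

For a well-resolved mode such as this one, `relative_gap` is expected to be small, which indicates that the damping produced by the scheme is captured by the diffusion term of the modified equation. For poorly resolved modes, higher-order terms of the modified equation would matter and the agreement degrades.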

The ability of the parametric approach to characterize part of the model-error covariance dynamics has been illustrated in a numerical test bed in 1-D. We have considered the transport of a scalar by a heterogeneous velocity field. In this case, the parametric dynamics of the forecast error shows that the variance is conserved along the flow, while the local anisotropic tensor is transported by the flow and deformed by the gradient of the velocity.
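The two properties stated above can be recovered with a short derivation in 1-D. This is a sketch under stated assumptions, not a transcription of the paper's equations: the error is assumed to be unbiased and to obey the same linear transport equation as the scalar, and the symbols for the normalized error, the metric g and the 1-D anisotropy s = 1/g are notation introduced here for illustration.

```latex
% Assumed notation: \varepsilon is the (zero-mean) error, V = E[\varepsilon^2],
% \tilde\varepsilon = \varepsilon / \sqrt{V}, g = E[(\partial_x \tilde\varepsilon)^2], s = 1/g.
\begin{align}
\partial_t \varepsilon + u\,\partial_x \varepsilon &= 0
  && \text{(error dynamics for linear transport)} \\
\partial_t V + u\,\partial_x V &= 0
  && \text{(multiply by } 2\varepsilon \text{ and take the expectation)} \\
\partial_t \tilde\varepsilon + u\,\partial_x \tilde\varepsilon &= 0
  && \text{(because } \sqrt{V} \text{ is transported as well)} \\
\partial_t g + u\,\partial_x g &= -2\,(\partial_x u)\,g
  && \text{(differentiate in } x \text{, multiply by } 2\,\partial_x \tilde\varepsilon \text{, take the expectation)} \\
\partial_t s + u\,\partial_x s &= 2\,(\partial_x u)\,s
  && \text{(using } s = 1/g \text{)}
\end{align}
```

The second line expresses the conservation of the variance along the flow, and the last line shows the transport of the anisotropy together with its deformation by the velocity gradient. No closure is needed in this case because the transport equation contains no second-order derivative.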

For this transport dynamics, two numerical schemes have been considered:
an Euler-upwind scheme and a semi-Lagrangian scheme in the case of a
linear interpolation.
The modified equations of both schemes exhibit an additional heterogeneous
dissipation and a perturbation of the velocity, whose characteristics depend
on the spatiotemporal discretization (

An ensemble of forecasts has been introduced, taken as the reference,
to compare the true covariance
evolution with the parametric approximation. The numerical experiment
shows the ability of the parametric dynamics to reproduce
the predictability-error covariance dynamics.
Then, the modelling of the flow-dependent part of the model-error
covariance matrix has been computed and discussed. In particular, we discussed
the growth of the model-error variance from the understanding of the
PKF dynamics, showing a linear increase in time followed by a saturation in

Since the flow-dependent formulation was introduced to model the situation where the numerical model is more dissipative than the nature, the model-error variance provided by the PKF should be a lower bound of the true model-error variance, and a residual climatological covariance is needed to account for the bias.

While there is no data assimilation experiment here,
this contribution provides a theoretical background on the model-error
covariance that sheds light on a study previously done by

The methodology introduced here has shown the potential of exploring the
model-error covariance from the parametric dynamics of error covariance.
While the characterization of the model-error covariance is a challenge,
as in air quality forecasts

However, the parametric dynamics faces closure issues that have to
be addressed depending on applications.
Here, the investigation of diffusive model errors has been made possible thanks to
the Gaussian closure of P18. For other kinds of numerical errors,
an appropriate closure will have to be specified, either from theoretical closures
or from the data as suggested by the data-driven and physics-informed
identification of uncertainty dynamics of

The aim of this section is to derive several decompositions of the forecast error: the usual expression encountered in data assimilation, an expression where the model error is considered with respect to the analysis state, and an expression that makes the predictability error appear with respect to the nature.

The forecast error is defined in Eq. (

The forecast error (Eq.

Note that

As

Considering the definition of the predictability error (Eq.

Note that Eq. (

Here, we consider the particular case where the model-error covariance model
is approximated as Eq. (

The modified partial differential equation associated
with the numerical scheme (Eq.

The aims of this section are two-fold: the first goal is to obtain a discrete scheme from the semi-Lagrangian procedure, and the second goal is to deduce the modified equation of the discrete scheme.

For the sake of simplicity, the linear advection dynamics

From the characteristic curve resolution, it follows that

The modified differential equation is obtained by replacing

The data have been generated from a numerical experiment as described in the paper (see Sects. 4.1 and 4.2).

OP and RM conceived the idea to explore the influence of the numerical scheme on the error model. OP linked the modified equation to the parametric formulation of the uncertainty prediction. MEA contributed to the simulation during a training period, supervised by OP and MP.

The authors declare that they have no conflict of interest.

We would like to thank Mateusz Reszka for English proofreading. We would like to thank the three anonymous referees for their fruitful comments which have contributed to improving the manuscript.

This research has been supported by the LEFE INSU (KAPA grant).

This paper was edited by Wansuo Duan and reviewed by three anonymous referees.