When taking the model error into account in data assimilation, one needs to evaluate the prior distribution represented by the Onsager–Machlup functional. Through numerical experiments, this study clarifies how the prior distribution should be incorporated into cost functions for discrete-time estimation problems. Consistent with previous theoretical studies, the divergence of the drift term is essential in weak-constraint 4D-Var (w4D-Var), but it is not necessary in Markov chain Monte Carlo with the Euler scheme. Although the former property may cause difficulties when implementing w4D-Var in large systems, this paper proposes a new technique for estimating the divergence term and its derivative.

In traditional weak-constraint 4D-Var settings

The aim of this work is to organise existing knowledge about the OM functional into a form that can be used to represent model errors in data assimilation, i.e. numerical evaluation of non-linear smoothing problems.

Throughout this article, we consider non-linear smoothing problems of the
form

Before moving on to its applications, we review the concept of the OM
functional. To keep the presentation simple, we assume that

The model Eq. (

It is also discretised with the trapezoidal scheme (with the drift term at
the midpoint) as

Generally, we can assign a measure

Suppose we have a cylinder set

Measures

The change-of-measure argument (Appendix

If we perform path sampling with a sufficient number of paths, we can in
theory find the mean of the distribution by averaging the samples, or its
mode by organising them into a histogram. Still, in some practical
applications, we must find the mode efficiently by variational methods, which
are computationally much cheaper than path sampling. For that purpose, it is
tempting to use a quadratic cost function for the minimisation. However, a
simple example argues against
maximising the path probability (

Motivated by this example, we investigate a proper strategy for finding the
route that maximises the density of paths. In this regard, we ask how densely
the paths populate a small neighbourhood of a curve

Assuming that

Because

Consequently, we obtain the asymptotic expression for the ensemble average
when

Importantly, the control variable for the optimisation has changed from

Using the OM functional derived in Sect.

Following the derivation in Sect. 2.3 of

Based on the argument in Sect.

On the other hand, based on the argument in Sect.

The corresponding posterior probabilities are thus given as follows:

In the argument in Sects.

Euler scheme (E):

Euler scheme with divergence term (ED):

Trapezoidal scheme (T):

Trapezoidal scheme with divergence term (TD):
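As a rough illustration of how the four candidates differ, the sketch below evaluates the model-error part of the cost for a scalar path. It assumes the standard OM form for a process dx = f(x) dt + sigma dw, that is, a quadratic residual term weighted by 1/(2 sigma^2) plus, where included, half the divergence of the drift. The function name `om_cost`, its arguments, the scalar state, and the choice to evaluate the divergence at the same point as the drift are illustrative assumptions, not the formulas of this paper.

```python
import numpy as np

def om_cost(x, f, div_f, dt, sigma, scheme="ED"):
    """Model-error cost of a discrete scalar path x[0..N] (hypothetical helper).

    scheme: "E" (Euler), "ED" (Euler + divergence),
            "T" (trapezoidal), "TD" (trapezoidal + divergence).
    """
    if scheme not in ("E", "ED", "T", "TD"):
        raise ValueError(f"unknown scheme: {scheme}")
    x = np.asarray(x, dtype=float)
    x0, x1 = x[:-1], x[1:]
    # Drift evaluated at the left end (Euler) or at the midpoint (trapezoidal).
    xe = x0 if scheme in ("E", "ED") else 0.5 * (x0 + x1)
    resid = (x1 - x0) / dt - f(xe)
    cost = 0.5 * dt / sigma**2 * np.sum(resid**2)
    if scheme in ("ED", "TD"):
        # Assumed form of the divergence term: (dt / 2) * sum of div f.
        cost += 0.5 * dt * np.sum(div_f(xe))
    return cost

# Example with the linear drift f(x) = -x, whose divergence is -1 everywhere.
path = [0.0, 1.0, 2.0]
drift = lambda x: -x
div_drift = lambda x: -np.ones_like(x)
costs = {s: om_cost(path, drift, div_drift, dt=1.0, sigma=1.0, scheme=s)
         for s in ("E", "ED", "T", "TD")}
```

Passing the scheme as a string keeps the four variants in one function, so that the same minimiser or sampler can be run with each of them in turn.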

where

By using one of the above schemes adopted for the model error term in the
cost function, we can apply a data assimilation algorithm – either Markov
chain Monte Carlo (MCMC)

To investigate the applicability of the four candidate schemes in
Sect.

The results should be checked against “the correct answer”. A reference
solution that approximates the correct answer is provided by a particle
smoother (PS)

Generate samples of initial and
model errors, integrate

where

Reweight it according to Bayes' theorem:
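The two steps above can be sketched as follows for a scalar model. This is a minimal sketch under stated assumptions: Gaussian initial, model, and observation errors, Euler–Maruyama integration, and observations of the state itself at selected time indices. The function name `particle_smoother` and all of its parameters are hypothetical and are not taken from the code accompanying this paper.

```python
import numpy as np

def particle_smoother(f, x_b, sigma_b, sigma, dt, n_steps, obs, sigma_o,
                      n_particles=10000, seed=0):
    """Return sampled paths and their posterior weights (illustrative only).

    obs maps a time index to the observed value at that time.
    """
    rng = np.random.default_rng(seed)
    paths = np.empty((n_particles, n_steps + 1))
    # Step 1: draw initial errors and model errors, then integrate the model.
    paths[:, 0] = x_b + sigma_b * rng.standard_normal(n_particles)
    for n in range(n_steps):
        noise = sigma * np.sqrt(dt) * rng.standard_normal(n_particles)
        paths[:, n + 1] = paths[:, n] + f(paths[:, n]) * dt + noise
    # Step 2: reweight each path by the observation likelihood (Bayes' theorem).
    log_w = np.zeros(n_particles)
    for n, y in obs.items():
        log_w -= 0.5 * ((paths[:, n] - y) / sigma_o) ** 2
    w = np.exp(log_w - log_w.max())  # subtract the maximum for numerical stability
    return paths, w / w.sum()

# Hypothetical setting: drift tanh(x) and one observation at the final time.
paths, w = particle_smoother(np.tanh, x_b=0.0, sigma_b=0.3, sigma=0.5,
                             dt=0.01, n_steps=100, obs={100: 1.5},
                             sigma_o=0.3, n_particles=2000, seed=1)
expected_path = w @ paths  # weighted mean of the sampled paths
```

Working with log-weights and subtracting their maximum before exponentiation avoids underflow when many paths lie far from the observations.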

In our first example, we solve the non-linear smoothing problem for the
hyperbolic model

Figure

The results of 4D-Var, which represents the MAP estimates, are shown in
Fig.

Probability density of paths derived by MCMC and PS for the hyperbolic model.

Expected path derived by MCMC (hyperbolic model).

Most probable tube derived by 4D-Var (hyperbolic model).

In our second example, we solve the non-linear smoothing problem for the
stochastic Rössler model

The results of MCMC and 4D-Var for the Rössler model are shown in
Figs.

Figure

Expected path derived by MCMC (Rössler model).

Most probable tube derived by 4D-Var (Rössler model).

Applicable OM schemes.

When one computes the cost value

We examined several discretisation schemes of the OM functional,

This justifies, for instance, the use of the following cost function for the
MAP estimate given by 4D-Var:

For application in large systems, the Euler scheme without the divergence term is preferred for path sampling because it does not require cumbersome calculation of the divergence term. In 4D-Var, the divergence term can be incorporated into the cost function by utilising Hutchinson's trace estimator.
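To illustrate the idea, Hutchinson's estimator approximates the trace of the Jacobian of the drift, i.e. its divergence, by averaging v^T J v over random probe vectors v with independent entries of +1 or -1. The sketch below is one possible realisation and rests on assumptions that do not come from this paper: the Jacobian-vector product is approximated by central finite differences (a tangent linear model could be used instead), and the names `hutchinson_divergence`, `n_probes`, and `eps` are hypothetical.

```python
import numpy as np

def hutchinson_divergence(f, x, n_probes=64, eps=1e-6, seed=0):
    """Estimate div f(x) = tr(df/dx) without forming the Jacobian."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    est = 0.0
    for _ in range(n_probes):
        v = rng.choice([-1.0, 1.0], size=x.shape)  # Rademacher probe vector
        # Central-difference approximation of the Jacobian-vector product J v.
        jv = (f(x + eps * v) - f(x - eps * v)) / (2.0 * eps)
        est += v @ jv
    return est / n_probes

# Example: linear drift f(x) = A x, whose divergence is trace(A) = 5.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
div_est = hutchinson_divergence(lambda x: A @ x, np.zeros(2), n_probes=4000)
```

Each probe costs two evaluations of the drift regardless of the state dimension, which is what makes the estimator attractive for large systems; the price is a sampling error that decreases with the square root of the number of probes.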

The code for data assimilation is available at

Taylor expansion of the

For a sample path of the stochastic process, the scaling
is

In the case of a smooth curve, there is no stochastic term, and thus

Consider two stochastic processes (cf. Sect. 6.3.2 of

When weight is assigned to smooth tubes, there should always be a divergence term, for the following reason.

Let

If we assume

where

If we assume

where

In a similar manner as in 1,

where

The evaluation of

By applying Taylor's expansion to

By applying Itô's product rule
to

Regarding the second term on the RHS of
Eq. (

where

Equation (

Cost functions in Eqs. (

A realisation of the cost function is given as

The author declares that she has no conflict of interest.

The author is grateful to the referees for their comments which helped improve the readability of the paper. This work was partly supported by MEXT KAKENHI Grant-in-Aid for Scientific Research on Innovative Areas JP15H05819. All the numerical simulations were performed on the JAMSTEC SC supercomputer system.

Edited by: Zoltan Toth
Reviewed by: two anonymous referees