https://doi.org/10.5194/npg-25-429-2018
https://doi.org/10.5194/npg-25-429-2018
Research article |  | 21 Jun 2018

Sensitivity analysis with respect to observations in variational data assimilation for parameter estimation

Victor Shutyaev, Francois-Xavier Le Dimet, and Eugene Parmuzin
Abstract

The problem of variational data assimilation for a nonlinear evolution model is formulated as an optimal control problem to find unknown parameters of the model. The observation data, and hence the optimal solution, may contain uncertainties. A response function is considered as a functional of the optimal solution after assimilation. Based on the second-order adjoint techniques, the sensitivity of the response function to the observation data is studied. The gradient of the response function is related to the solution of a nonstandard problem involving the coupled system of direct and adjoint equations. The nonstandard problem is studied, based on the Hessian of the original cost function. An algorithm to compute the gradient of the response function with respect to observations is presented. A numerical example is given for the variational data assimilation problem related to sea surface temperature for the Baltic Sea thermodynamics model.

Dates
1 Introduction

The methods of data assimilation have become an important tool for analysis of complex physical phenomena in various fields of science and technology. These methods allow us to combine mathematical models, data resulting from observations, and a priori information. The problems of variational data assimilation can be formulated as optimal control problems (e.g., Lions1968; Le Dimet and Talagrand1986) to find unknown model parameters, such as initial and/or boundary conditions, right-hand sides in the model equations (forcing terms), and distributed coefficients, based on minimization of the cost function related to observations. A necessary optimality condition reduces an optimal control problem to an optimality system which involves the model equations, the adjoint problem, and input data functions. The optimal solution depends on the observation data, and for future forecasts it is very important to study the sensitivity of the optimal solution with respect to observation errors .

The necessary optimality condition is related to the gradient of the original cost function; thus, to study the sensitivity of the optimal solution, one should differentiate the optimality system with respect to observations. In this case, we come to the so-called second-order adjoint problem . The first studies of sensitivity of the response functions after assimilation with the use of second-order adjoint were done by for variational data assimilation problems aimed at restoration of initial condition, where sensitivity with respect to model parameters was considered. The equations of the forecast sensitivity to observations in a four-dimensional (4D-Var) data assimilation were derived by . Based on these results, a practical computational approach was given by to quantify the effect of observations in 4D-Var data assimilation.

The issue of sensitivity is related to the statistical properties of the optimal solution . General sensitivity analysis in variational data assimilation with respect to observations for a nonlinear dynamic model was given by to control the initial-value function. The dynamic formulation of the problem is important because it shows different implementation options .

This paper is based on the results of and presents the sensitivity analysis with respect to observations in variational data assimilation aimed at restoration of unknown parameters of a dynamic model. We should mention the importance of the parameter estimation problem itself. A precise determination of the initial condition is very important in view of forecasting; however, the use of variational data assimilation is not limited to operational forecasting. In many domains (e.g., hydrology) the uncertainty in the parameters is more crucial than the uncertainty in the initial condition (e.g., White at al.2003). In some problems the quantity of interest can be represented directly by the estimated parameters as controls. For example, in the sea surface heat flux is estimated in order to understand its spatial and temporal variability. The problems of parameter estimation are common inverse problems considered in geophysics and in engineering applications . In the last years, interest in the parameter estimation using 4D-Var is rising .

We consider a dynamic formulation of the variational data assimilation problem for parameter estimation in a continuous form, but the presented sensitivity analysis formulas with respect to observations do not follow from our previous results for the initial condition problem and constitute a novelty of this paper. Of course, the initial condition function may also be considered as a parameter; however, in our dynamic formulation we have two equations for the model: one equation for describing an evolution of the model operator (involving model parameters such as right-hand sides, coefficients, boundary conditions, etc.) and another equation is considered as an initial condition.

This paper is organized as follows. In Sect. 2, we give the statement of the variational data assimilation problem for a nonlinear evolution model to estimate the model parameters. In Sect. 3, sensitivity of the response function after assimilation with respect to observations is studied, and its gradient is related to the solution of a nonstandard problem. In Sect. 4 we derive an operator equation involving the Hessian to study the solvability of the nonstandard problem, and give an algorithm to compute the gradient of the response function. A proof-of-concept analytic example with a simple model is given in Sect. 5 to demonstrate how the sensitivity analysis algorithm works. Section 6 presents an application of the theory to the data assimilation problem for a sea thermodynamics model. Numerical examples are given in Sect. 7 for the Baltic Sea dynamics model. The main results are discussed in Sect. 8.

2 Statement of the problem

We consider the mathematical model of a physical process that is described by the evolution problem

where the initial state u belongs to a Hilbert space X, φ=φ(t) is the unknown function belonging to $Y={L}_{\mathrm{2}}\left(\mathrm{0},T;X\right)$ with the norm $‖\mathit{\phi }{‖}_{Y}=\left(\mathit{\phi },\mathit{\phi }{\right)}_{Y}^{\mathrm{1}/\mathrm{2}}=\left({\int }_{\mathrm{0}}^{T}‖\mathit{\phi }\left(t\right){‖}_{X}^{\mathrm{2}}\text{d}t{\right)}^{\mathrm{1}/\mathrm{2}}$, F is a nonlinear operator mapping Y×Yp into Y, Yp is a Hilbert space (space of control parameters, or control space), and fY. Suppose that for a given $u\in X,f\in Y$, and λYp, there exists a unique solution φY to Eq. (1) with $\frac{\partial \mathit{\phi }}{\partial t}\in Y$. The function λ is an unknown model parameter. Let us introduce the cost function

$\begin{array}{ll}\text{(2)}& J\left(\mathit{\lambda }\right)=& \frac{\mathrm{1}}{\mathrm{2}}\left({V}_{\mathrm{1}}\left(\mathit{\lambda }-{\mathit{\lambda }}_{\text{b}}\right),\mathit{\lambda }-{\mathit{\lambda }}_{\text{b}}{\right)}_{{Y}_{\text{p}}}& +\frac{\mathrm{1}}{\mathrm{2}}\left({V}_{\mathrm{2}}\left(C\mathit{\phi }-{\mathit{\phi }}_{\text{obs}}\right),C\mathit{\phi }-{\mathit{\phi }}_{\text{obs}}{\right)}_{{Y}_{\text{obs}}},\end{array}$

where λbYp is a prior (background) function, φobsYobs is a prescribed function (observational data), Yobs is a Hilbert space (observation space), $C:Y\to {Y}_{\text{obs}}$ is a linear bounded observation operator, and ${V}_{\mathrm{1}}:{Y}_{\text{p}}\to {Y}_{\text{p}}$ and ${V}_{\mathrm{2}}:{Y}_{\text{obs}}\to {Y}_{\text{obs}}$ are symmetric positive definite bounded operators.

Let us consider the following data assimilation problem with the aim to estimate the parameter λ: for a given $u\in X,$ and fY, find λYp and φY such that they satisfy Eq. (1), and on the set of solutions to Eq. (1), the functional J(λ) takes the minimum value, i.e.,

We suppose that the solution of Eq. (3) exists. Let us note that the solvability of the parameter estimation problems (or identifiability) has been addressed, e.g., in and . To derive the optimality system, we assume the solution φ and the operator F(φ,λ) in Eqs. (1)–(2) are regular enough, and for vYp find the gradient of the functional J with respect to λ:

$\begin{array}{ll}\text{(4)}& {J}^{\prime }\left(\mathit{\lambda }\right)v& =\left({V}_{\mathrm{1}}\left(\mathit{\lambda }-{\mathit{\lambda }}_{\text{b}}\right),v{\right)}_{{Y}_{\text{p}}}+\left({V}_{\mathrm{2}}\left(C\mathit{\phi }-{\mathit{\phi }}_{\text{obs}}\right),C\mathit{\varphi }{\right)}_{{Y}_{\text{obs}}}& =\left({V}_{\mathrm{1}}\left(\mathit{\lambda }-{\mathit{\lambda }}_{\text{b}}\right),v{\right)}_{{Y}_{\text{p}}}+\left({C}^{*}{V}_{\mathrm{2}}\left(C\mathit{\phi }-{\mathit{\phi }}_{\text{obs}}\right),\mathit{\varphi }{\right)}_{Y},\end{array}$

where ϕ is the solution to the problem:

Here ${F}_{\mathit{\phi }}^{\prime }\left(\mathit{\phi },\mathit{\lambda }\right):Y\to Y,\phantom{\rule{0.25em}{0ex}}{F}_{\mathit{\lambda }}^{\prime }\left(\mathit{\phi },\mathit{\lambda }\right):{Y}_{\text{p}}\to Y$ are the Fréchet derivatives of F with respect to φ and λ, correspondingly, and C* is the adjoint operator to C defined by $\left(C\mathit{\phi },\mathit{\psi }{\right)}_{{Y}_{\text{obs}}}=\left(\mathit{\phi },{C}^{*}\mathit{\psi }{\right)}_{Y},\phantom{\rule{0.25em}{0ex}}\mathit{\phi }\in Y,\mathit{\psi }\in {Y}_{\text{obs}}$.

Let us consider the adjoint operator $\left({F}_{\mathit{\phi }}^{\prime }\left(\mathit{\phi },\mathit{\lambda }\right){\right)}^{*}:Y\to Y$ and introduce the adjoint problem:

Then Eq. (4) with Eqs. (5) and (6) gives

$\begin{array}{ll}\text{(7)}& {J}^{\prime }\left(\mathit{\lambda }\right)v& =\left({V}_{\mathrm{1}}\left(\mathit{\lambda }-{\mathit{\lambda }}_{\text{b}}\right),v{\right)}_{{Y}_{\text{p}}}-\left({\mathit{\phi }}^{*},{F}_{\mathit{\lambda }}^{\prime }\left(\mathit{\phi },\mathit{\lambda }\right)v{\right)}_{Y}& =\left({V}_{\mathrm{1}}\left(\mathit{\lambda }-{\mathit{\lambda }}_{\text{b}}\right),v{\right)}_{{Y}_{\text{p}}}-\left(\left({F}_{\mathit{\lambda }}^{\prime }\left(\mathit{\phi },\mathit{\lambda }\right){\right)}^{*}{\mathit{\phi }}^{*},v{\right)}_{{Y}_{\text{p}}},\end{array}$

where $\left({F}_{\mathit{\lambda }}^{\prime }\left(\mathit{\phi },\mathit{\lambda }\right){\right)}^{*}:Y\to {Y}_{\text{p}}$ is the adjoint operator to ${F}_{\mathit{\lambda }}^{\prime }\left(\mathit{\phi },\mathit{\lambda }\right)$. Therefore, the gradient of J is defined by

${J}^{\prime }\left(\mathit{\lambda }\right)={V}_{\mathrm{1}}\left(\mathit{\lambda }-{\mathit{\lambda }}_{\text{b}}\right)-\left({F}_{\mathit{\lambda }}^{\prime }\left(\mathit{\phi },\mathit{\lambda }\right){\right)}^{*}{\mathit{\phi }}^{*}.$

From Eqs. (4)–(7) we get the optimality system (the necessary optimality conditions; ):

$\begin{array}{}\text{(10)}& {V}_{\mathrm{1}}\left(\mathit{\lambda }-{\mathit{\lambda }}_{\text{b}}\right)-\left({F}_{\mathit{\lambda }}^{\prime }\left(\mathit{\phi },\mathit{\lambda }\right){\right)}^{*}{\mathit{\phi }}^{*}=\mathrm{0}.\end{array}$

We assume that the system (Eqs. 810) has a unique solution. The system (Eqs. 810) may be considered as a generalized model 𝒜(U)=0 with the state variable $U=\left(\mathit{\phi },{\mathit{\phi }}^{*},\mathit{\lambda }\right)$, and it contains information about observations. In what follows we study the problem of the sensitivity of functionals of the optimal solution to the observation data.

If the observation operator C is nonlinear, i.e., Cφ=C(φ), then the right-hand side of the adjoint equation (Eq. 9) contains $\left({C}_{\mathit{\phi }}^{\prime }{\right)}^{*}$ instead of C* and all the analysis presented below is similar.

3 Sensitivity of functionals after assimilation

In geophysical applications the observation data cannot be measured precisely; therefore, it is important to be able to estimate the impact of uncertainties in observations on the outputs of the model after assimilation.

Let us introduce a response function G(φ,λ), which is supposed to be a real-valued function and can be considered as a functional on Y×Yp. We are interested in the sensitivity of G with respect to φobs, with φ and λ obtained from the optimality system (Eqs. 810). By definition, the sensitivity is defined by the gradient of G with respect to φobs:

$\begin{array}{}\text{(11)}& \frac{\text{d}G}{\text{d}{\mathit{\phi }}_{\text{obs}}}=\frac{\partial G}{\partial \mathit{\phi }}\frac{\partial \mathit{\phi }}{\partial {\mathit{\phi }}_{\text{obs}}}+\frac{\partial G}{\partial \mathit{\lambda }}\frac{\partial \mathit{\lambda }}{\partial {\mathit{\phi }}_{\text{obs}}}.\end{array}$

If δφobs is a perturbation on φobs, we get from the optimality system:

$\begin{array}{ll}\text{(14)}& {V}_{\mathrm{1}}\mathit{\delta }\mathit{\lambda }& -\left({F}_{\mathit{\lambda }\mathit{\phi }}^{\prime \prime }\left(\mathit{\phi },\mathit{\lambda }\right)\mathit{\delta }\mathit{\phi }{\right)}^{*}{\mathit{\phi }}^{*}-\left({F}_{\mathit{\lambda }\mathit{\lambda }}^{\prime \prime }\left(\mathit{\phi },\mathit{\lambda }\right)\mathit{\delta }\mathit{\lambda }{\right)}^{*}{\mathit{\phi }}^{*}& -\left({F}_{\mathit{\lambda }}^{\prime }\left(\mathit{\phi },\mathit{\lambda }\right){\right)}^{*}\mathit{\delta }{\mathit{\phi }}^{*}=\mathrm{0},\end{array}$

and

$\begin{array}{}\text{(15)}& {\left(\frac{\text{d}G}{\text{d}{\mathit{\phi }}_{\text{obs}}},\mathit{\delta }{\mathit{\phi }}_{\text{obs}}\right)}_{{Y}_{\text{obs}}}={\left(\frac{\partial G}{\partial \mathit{\phi }},\mathit{\delta }\mathit{\phi }\right)}_{Y}+{\left(\frac{\partial G}{\partial \mathit{\lambda }},\mathit{\delta }\mathit{\lambda }\right)}_{{Y}_{\text{p}}},\end{array}$

where δφ, δφ*, and δλ are the Gâteaux derivatives of φ, φ*, and λ in the direction δφobs (for example, $\mathit{\delta }\mathit{\phi }=\frac{\partial \mathit{\phi }}{\partial {\mathit{\phi }}_{\text{obs}}}\mathit{\delta }{\mathit{\phi }}_{\text{obs}}$).

To compute the gradient ${\mathrm{\nabla }}_{{\mathit{\phi }}_{\text{obs}}}G\left(\mathit{\phi },\mathit{\lambda }\right)$, let us introduce three adjoint variables P1Y, P2Y, and P3Yp. By taking the inner product of Eq. (12) by P1, Eq. (13) by P2, and of Eq. (14) by P3 and adding them,

Then, using integration by parts and adjoint operators, we get

Here we put

$\begin{array}{ll}-\frac{\partial {P}_{\mathrm{1}}}{\partial t}& -\left({F}_{\mathit{\phi }}^{\prime }\left(\mathit{\phi },\mathit{\lambda }\right){\right)}^{*}{P}_{\mathrm{1}}-\left({F}_{\mathit{\phi }\mathit{\phi }}^{\prime \prime }\left(\mathit{\phi },\mathit{\lambda }\right){P}_{\mathrm{2}}{\right)}^{*}{\mathit{\phi }}^{*}\\ & -\left({F}_{\mathit{\lambda }\mathit{\phi }}^{\prime \prime }\left(\mathit{\phi },\mathit{\lambda }\right){P}_{\mathrm{3}}{\right)}^{*}{\mathit{\phi }}^{*}+{C}^{*}{V}_{\mathrm{2}}C{P}_{\mathrm{2}}=\frac{\partial G}{\partial \mathit{\phi }},\end{array}$

and

$\begin{array}{ll}{V}_{\mathrm{1}}{P}_{\mathrm{3}}& -\left({F}_{\mathit{\phi }\mathit{\lambda }}^{\prime \prime }\left(\mathit{\phi },\mathit{\lambda }\right){P}_{\mathrm{2}}{\right)}^{*}{\mathit{\phi }}^{*}-\left({F}_{\mathit{\lambda }\mathit{\lambda }}^{\prime \prime }\left(\mathit{\phi },\mathit{\lambda }\right){P}_{\mathrm{3}}{\right)}^{*}{\mathit{\phi }}^{*}\\ & -\left({F}_{\mathit{\lambda }}^{\prime }\left(\mathit{\phi },\mathit{\lambda }\right){\right)}^{*}{P}_{\mathrm{1}}=\frac{\partial G}{\partial \mathit{\lambda }},\phantom{\rule{0.25em}{0ex}}{P}_{\mathrm{1}}{|}_{t\phantom{\rule{0.125em}{0ex}}=,T}=\mathrm{0},\end{array}$

$\frac{\partial {P}_{\mathrm{2}}}{\partial t}-{F}_{\mathit{\phi }}^{\prime }\left(\mathit{\phi },\mathit{\lambda }\right){P}_{\mathrm{2}}-{F}_{\mathit{\lambda }}^{\prime }\left(\mathit{\phi },\mathit{\lambda }\right){P}_{\mathrm{3}}=\mathrm{0},\phantom{\rule{0.25em}{0ex}}{P}_{\mathrm{2}}{|}_{t\phantom{\rule{0.125em}{0ex}}=\phantom{\rule{0.125em}{0ex}}\mathrm{0}}=\mathrm{0}.$

Thus, if ${P}_{\mathrm{1}},{P}_{\mathrm{2}},$ and P3 are the solutions of the following system of equations

$\begin{array}{ll}\text{(19)}& {V}_{\mathrm{1}}{P}_{\mathrm{3}}& -\left({F}_{\mathit{\phi }\mathit{\lambda }}^{\prime \prime }\left(\mathit{\phi },\mathit{\lambda }\right){P}_{\mathrm{2}}{\right)}^{*}{\mathit{\phi }}^{*}-\left({F}_{\mathit{\lambda }\mathit{\lambda }}^{\prime \prime }\left(\mathit{\phi },\mathit{\lambda }\right){P}_{\mathrm{3}}{\right)}^{*}{\mathit{\phi }}^{*}& -\left({F}_{\mathit{\lambda }}^{\prime }\left(\mathit{\phi },\mathit{\lambda }\right){\right)}^{*}{P}_{\mathrm{1}}=\frac{\partial G}{\partial \mathit{\lambda }},\end{array}$

then from Eq. (16) we get

${\left(\frac{\partial G}{\partial \mathit{\phi }},\mathit{\delta }\mathit{\phi }\right)}_{Y}+{\left(\frac{\partial G}{\partial \mathit{\lambda }},\mathit{\delta }\mathit{\lambda }\right)}_{{Y}_{\text{p}}}={\left(\mathit{\delta }{\mathit{\phi }}_{\text{obs}},{V}_{\mathrm{2}}C{P}_{\mathrm{2}}\right)}_{{Y}_{\text{obs}}},$

and due to Eq. (15) the gradient of G is given by

$\begin{array}{}\text{(20)}& \frac{\text{d}G}{\text{d}{\mathit{\phi }}_{\text{obs}}}={V}_{\mathrm{2}}C{P}_{\mathrm{2}}.\end{array}$

We get a coupled system of two differential equations (Eqs. 17 and 18) of the first order with respect to time, and Eq. (19). To study this nonstandard problem (Eqs. 1719), we reduce it to a single operator equation involving the Hessian of the original cost function.

4 Operator equation via Hessian and response function gradient

Let us denote the auxiliary variable v=P3 and rewrite the nonstandard problem (Eqs. 1719) in an equivalent form:

$\begin{array}{ll}\text{(23)}& {V}_{\mathrm{1}}v& -\left({F}_{\mathit{\phi }\mathit{\lambda }}^{\prime \prime }\left(\mathit{\phi },\mathit{\lambda }\right){P}_{\mathrm{2}}{\right)}^{*}{\mathit{\phi }}^{*}-\left({F}_{\mathit{\lambda }\mathit{\lambda }}^{\prime \prime }\left(\mathit{\phi },\mathit{\lambda }\right)v{\right)}^{*}{\mathit{\phi }}^{*}& -\left({F}_{\mathit{\lambda }}^{\prime }\left(\mathit{\phi },\mathit{\lambda }\right){\right)}^{*}{P}_{\mathrm{1}}=\frac{\partial G}{\partial \mathit{\lambda }}.\end{array}$

Here we have three unknowns: $v\in {Y}_{\text{p}},\phantom{\rule{0.25em}{0ex}}{P}_{\mathrm{1}}$, and P2Y. Let us write Eqs. (21)–(23) in the form of an operator equation for v. We define the operator , which acts on w belonging to Yp, by the successive solution of the following problems:

$\begin{array}{ll}\text{(26)}& \mathcal{H}w=& {V}_{\mathrm{1}}w-\left({F}_{\mathit{\phi }\mathit{\lambda }}^{\prime \prime }\left(\mathit{\phi },\mathit{\lambda }\right)\mathit{\varphi }{\right)}^{*}{\mathit{\phi }}^{*}& -\left({F}_{\mathit{\lambda }\mathit{\lambda }}^{\prime \prime }\left(\mathit{\phi },\mathit{\lambda }\right)w{\right)}^{*}{\mathit{\phi }}^{*}-\left({F}_{\mathit{\lambda }}^{\prime }\left(\mathit{\phi },\mathit{\lambda }\right){\right)}^{*}{\mathit{\varphi }}^{*}.\end{array}$

Here λ,φ, and φ* are the solutions of the optimality system (Eqs. 810). Then Eqs. (21)–(23) are equivalent to the following equation in Yp:

$\begin{array}{}\text{(27)}& \mathcal{H}v=\mathcal{F}\end{array}$

with the right-hand side defined by

$\begin{array}{}\text{(28)}& \mathcal{F}=\frac{\partial G}{\partial \mathit{\lambda }}+\left({F}_{\mathit{\lambda }}^{\prime }\left(\mathit{\phi },\mathit{\lambda }\right){\right)}^{*}{\stackrel{\mathrm{̃}}{\mathit{\varphi }}}^{*},\end{array}$

where ${\stackrel{\mathrm{̃}}{\mathit{\varphi }}}^{*}$ is the solution to the adjoint problem:

It is easily seen that the operator defined by Eqs. (24)–(26) is the Hessian of the original functional J considered on the optimal solution λ of the problem (Eqs. 810): ${J}^{\prime \prime }\left(\mathit{\lambda }\right)=\mathcal{H}$. Under the assumption that is positive definite, the operator equation (Eq. 27) is correctly and everywhere solvable in Yp (Vainberg1964), i.e., for every there exists a unique solution vYp and

$‖v{‖}_{{Y}_{\text{p}}}\le c‖\mathcal{F}{‖}_{{Y}_{\text{p}}},\phantom{\rule{0.25em}{0ex}}\phantom{\rule{0.25em}{0ex}}c=\text{const}>\mathrm{0}.$

Therefore, under the assumption that J′′(λ) is positive definite on the optimal solution, the nonstandard problem (Eqs. 1719) has a unique solution ${P}_{\mathrm{1}},{P}_{\mathrm{2}}\in Y$, and P3Yp.

Based on the above consideration, we can formulate the following algorithm to compute the gradient of the response function G:

1. For $\frac{\partial G}{\partial \mathit{\lambda }}\in {Y}_{\text{p}}$, and $\phantom{\rule{0.25em}{0ex}}\frac{\partial G}{\partial \mathit{\phi }}\in Y$ solve the adjoint problem

and put

$\mathcal{F}=\frac{\partial G}{\partial \mathit{\lambda }}+\left({F}_{\mathit{\lambda }}^{\prime }\left(\mathit{\phi },\mathit{\lambda }\right){\right)}^{*}{\stackrel{\mathrm{̃}}{\mathit{\varphi }}}^{*}.$
2. Find v by solving

$\mathcal{H}v=\mathcal{F}$

with the Hessian of the original functional J defined by Eqs. (24)–(26).

3. Solve the direct problem

4. Compute the gradient of the response function as

$\begin{array}{}\text{(32)}& \frac{\text{d}G}{\text{d}{\mathit{\phi }}_{\text{obs}}}={V}_{\mathrm{2}}C{P}_{\mathrm{2}}.\end{array}$

Equation (32) allows us to estimate the sensitivity of the functionals related to the optimal solution after assimilation, with respect to observation data.

Remark 1. In the above consideration, to show the solvability, we have assumed that the direct and adjoint tangent linear problems of the form

with $f,g\in Y$ have the unique solutions $\mathit{\varphi },{\mathit{\varphi }}^{*}\in Y$.

Remark 2. The analysis presented above is based on the hypothesis that the initial state of the system under observation is known, and that it is only model parameters (boundary conditions, forcing terms, distributed coefficients, etc.) that are to be determined from the observations. Often, a more realistic situation would be one where the assimilation is intended to determine both the initial conditions of the system and, in addition, model parameters . The sensitivity analysis can be applied as well to such a situation. To consider the joint state and parameter estimation problem, we should use the results both of this paper and of the previous one . In this case we need to introduce an additional term related to the initial condition into the cost function (Eq. 2) to find simultaneously u and λ. The optimality system (Eqs. 810) will be supplemented by an additional equation related to the gradient of the cost function with respect to u. The Hessian in this case is a 2 × 2 operator matrix, acting on the augmented vector $U=\left(u,\mathit{\lambda }{\right)}^{T}$, and all the derivations are made similarly, being, however, more cumbersome and lengthy.

Below we give a proof-of-concept analytic example to show how the algorithm (Eqs. 3032) works. Then, as an application, we consider a variational data assimilation problem for a sea thermodynamics model.

5 Proof-of-concept analytic example

Let us consider a simple evolution problem for the ordinary differential equation

where u∈ℝ; $a,\mathit{\lambda }\in \mathbb{R}$; $g=g\left(t\right)\ge \mathrm{0}$. Here, in the notations of Sect. 2, we have X=ℝ, $Y={L}_{\mathrm{2}}\left(\mathrm{0},T\right)$, and $F\left(\mathit{\phi },\mathit{\lambda }\right)=-a\mathit{\phi }+\mathit{\lambda }g$. Let us formulate the data assimilation problem to find the parameter λ if we have observation data for φ at the end of the time interval t=T. We need to minimize the cost function

$\begin{array}{}\text{(34)}& J\left(\mathit{\lambda }\right)=\underset{v\in \mathbb{R}}{inf}J\left(v\right),\end{array}$

where $J\left(v\right)=\frac{\mathrm{1}}{\mathrm{2}}|\stackrel{\mathrm{̃}}{\mathit{\phi }}{|}_{t\phantom{\rule{0.125em}{0ex}}=\phantom{\rule{0.125em}{0ex}}T}-{\mathit{\phi }}_{\text{obs}}{|}^{\mathrm{2}}$, and $\stackrel{\mathrm{̃}}{\mathit{\phi }}$ is the solution to Eq. (33) with λ=v. Thus, here we have Yp=ℝ, V1=0, V2=1, and $C\mathit{\phi }=\mathit{\phi }{|}_{t=T}$. In this case, the optimality system (Eqs. 810) has the form:

$\begin{array}{}\text{(37)}& \left(g,{\mathit{\phi }}^{*}\right)=\underset{\mathrm{0}}{\overset{T}{\int }}g\left(t\right){\mathit{\phi }}^{*}\left(t\right)=\mathrm{0}.\end{array}$

It is easy to see that the problem of data assimilation (Eqs. 3334) has a unique solution

$\begin{array}{}\text{(38)}& \mathit{\lambda }={\mathit{\lambda }}_{\text{opt}}=\frac{{\mathit{\phi }}_{\text{obs}}-{\mathit{\phi }}_{\mathrm{0}}}{{\mathit{\phi }}_{\mathrm{1}}},\end{array}$

where ${\mathit{\phi }}_{\mathrm{0}}={u}_{\mathrm{0}}{e}^{-aT}$, ${\mathit{\phi }}_{\mathrm{1}}=\underset{\mathrm{0}}{\overset{T}{\int }}{e}^{-a\left(T-{t}^{\prime }\right)}g\left({t}^{\prime }\right)\text{d}{t}^{\prime }$.

Indeed, if λ has the form (Eq. 38), the solution of the problem (Eq. 33) satisfies $\mathit{\phi }{|}_{t=T}={\mathit{\phi }}_{\text{obs}}$, and the functional J from Eq. (34) attains its minimal value J=0. In this case ${\mathit{\phi }}^{*}=\mathrm{0}$, and the optimality system (Eqs. 3537) is satisfied. Let us consider the response function in the form

$\begin{array}{}\text{(39)}& G\left(\mathit{\phi },\mathit{\lambda }\right)=\underset{\mathrm{0}}{\overset{T}{\int }}\mathit{\phi }\left(t\right)\text{d}t.\end{array}$

Let a≠0. After assimilation, taking into account the solution of the problem (Eq. 33), we have

$\begin{array}{}\text{(40)}& G\left(\mathit{\phi },\mathit{\lambda }\right)=\frac{u}{a}\left(\mathrm{1}-{e}^{-aT}\right)+\frac{{\mathit{\lambda }}_{\text{opt}}}{a}\left(\underset{\mathrm{0}}{\overset{T}{\int }}g\left(t\right)\text{d}t-{\mathit{\phi }}_{\mathrm{1}}\right),\end{array}$

where λopt is given by Eq. (38). Then, by differentiation of G with respect to φobs we have the gradient

$\begin{array}{}\text{(41)}& \frac{\text{d}G}{\text{d}{\mathit{\phi }}_{\text{obs}}}=\frac{\mathrm{1}}{a{\mathit{\phi }}_{\mathrm{1}}}\left(\underset{\mathrm{0}}{\overset{T}{\int }}g\left(t\right)\text{d}t-{\mathit{\phi }}_{\mathrm{1}}\right).\end{array}$

Let us now apply the algorithm (Eqs. 3032) to compute the gradient of the function G. Since $\frac{\partial G}{\partial \mathit{\phi }}=\mathrm{1}$, and $\left({F}_{\mathit{\phi }}^{\prime }\left(\mathit{\phi },\mathit{\lambda }\right){\right)}^{*}=-a$, then on the first step of the algorithm, we solve the problem (Eq. 30) and get the solution

$\begin{array}{}\text{(42)}& {\stackrel{\mathrm{̃}}{\mathit{\varphi }}}^{*}\left(t\right)=\frac{\mathrm{1}}{a}\left(\mathrm{1}-{e}^{-a\left(T-t\right)}\right).\end{array}$

Taking into account that $\partial G/\partial \mathit{\lambda }=\mathrm{0}$ and $\left({F}_{\mathit{\phi }}^{\prime }\left(\mathit{\phi },\mathit{\lambda }\right){\right)}^{*}{\stackrel{\mathrm{̃}}{\mathit{\varphi }}}^{*}=\left(g,{\stackrel{\mathrm{̃}}{\mathit{\varphi }}}^{*}\right)$, we get $\mathcal{F}=\left(g,{\stackrel{\mathrm{̃}}{\mathit{\varphi }}}^{*}\right)$, i.e.,

$\begin{array}{}\text{(43)}& \mathcal{F}=\underset{\mathrm{0}}{\overset{T}{\int }}g\stackrel{\mathrm{̃}}{{\mathit{\varphi }}^{*}}\text{d}t=\frac{\mathrm{1}}{a}\left(\underset{\mathrm{0}}{\overset{T}{\int }}g\left(t\right)\text{d}t-{\mathit{\phi }}_{\mathrm{1}}\right).\end{array}$

On the second step of the algorithm, one needs to solve the equation v=ℱ with the Hessian defined by Eqs. (24)–(26). Since all the second-order derivatives of F(φ,λ) equal zero, then it is easily seen that in this case is the operator of multiplication by the scalar

$\begin{array}{}\text{(44)}& \mathcal{H}=\underset{\mathrm{0}}{\overset{T}{\int }}g\left(t\right)\mathit{\psi }{|}_{t\phantom{\rule{0.125em}{0ex}}=\phantom{\rule{0.125em}{0ex}}T}{e}^{-a\left(T-t\right)}\text{d}t=\left(\mathit{\psi }{|}_{t\phantom{\rule{0.125em}{0ex}}=\phantom{\rule{0.125em}{0ex}}T}{\right)}^{\mathrm{2}},\end{array}$

where ψ(t) is the solution of the problem (Eq. 33) with u=0, and λ=1. Then, after the second step of the algorithm we get

$\begin{array}{}\text{(45)}& v={\mathcal{H}}^{-\mathrm{1}}\mathcal{F}=\left(\mathit{\psi }{|}_{t\phantom{\rule{0.125em}{0ex}}=\phantom{\rule{0.125em}{0ex}}T}{\right)}^{-\mathrm{2}}\mathcal{F}.\end{array}$

On the third step of the algorithm, we need to solve the problem (Eq. 31). Since ${F}_{\mathit{\lambda }}^{\prime }\left(\mathit{\phi },\mathit{\lambda }\right)=g$, the solution of this problem has the form P2(t)=vψ(t). Finally, using Eq. (32), we get the gradient of G with respect to φobs:

$\begin{array}{}\text{(46)}& \frac{\text{d}G}{\text{d}{\mathit{\phi }}_{\text{obs}}}={P}_{\mathrm{2}}{|}_{t\phantom{\rule{0.125em}{0ex}}=\phantom{\rule{0.125em}{0ex}}T}=v\mathit{\psi }\left(T\right)=\frac{\mathit{\psi }\left(T\right)\mathcal{F}}{{\mathit{\psi }}^{\mathrm{2}}\left(T\right)}=\frac{\mathcal{F}}{\mathit{\psi }\left(T\right)}.\end{array}$

Moreover, since φ1=ψ(T), then from Eqs. (46) and (43) we have

$\begin{array}{}\text{(47)}& \frac{\text{d}G}{\text{d}{\mathit{\phi }}_{\text{obs}}}=\frac{\mathrm{1}}{a{\mathit{\phi }}_{\mathrm{1}}}\left(\underset{\mathrm{0}}{\overset{T}{\int }}g\left(t\right)\text{d}t-{\mathit{\phi }}_{\mathrm{1}}\right).\end{array}$

Thus, the gradient obtained by the algorithm (Eqs. 3032) exactly coincides with the value of the gradient obtained in Eq. (41) by direct differentiation, which is the expected result.

6 Data assimilation problem for a sea thermodynamics model

Consider the sea thermodynamics problem in the form :

$\begin{array}{ll}& {T}_{t}+\left(\stackrel{\mathrm{‾}}{U},\mathrm{Grad}\right)T-\mathrm{Div}\left({\stackrel{\mathrm{^}}{a}}_{T}\cdot \mathrm{Grad}\phantom{\rule{0.25em}{0ex}}T\right)={f}_{T}\phantom{\rule{0.25em}{0ex}}\text{in}\phantom{\rule{0.25em}{0ex}}D×\left({t}_{\mathrm{0}},{t}_{\mathrm{1}}\right),\\ & T={T}_{\mathrm{0}}\phantom{\rule{0.25em}{0ex}}\text{for}\phantom{\rule{0.25em}{0ex}}t={t}_{\mathrm{0}}\phantom{\rule{0.25em}{0ex}}\text{in}\phantom{\rule{0.25em}{0ex}}D,\\ & -{\mathit{\nu }}_{T}\frac{\partial T}{\partial z}=Q\phantom{\rule{0.25em}{0ex}}\text{on}\phantom{\rule{0.25em}{0ex}}{\mathrm{\Gamma }}_{\text{S}}×\left({t}_{\mathrm{0}},{t}_{\mathrm{1}}\right),\phantom{\rule{1em}{0ex}}\frac{\partial T}{\partial n}=\mathrm{0}\phantom{\rule{0.25em}{0ex}}\text{on}\phantom{\rule{0.25em}{0ex}}{\mathrm{\Gamma }}_{w,\text{c}}×\left({t}_{\mathrm{0}},{t}_{\mathrm{1}}\right),\\ & {\stackrel{\mathrm{‾}}{U}}_{\text{n}}^{\left(-\right)}T+\frac{\partial T}{\partial n}={Q}_{T}\phantom{\rule{0.25em}{0ex}}\text{on}\phantom{\rule{0.25em}{0ex}}{\mathrm{\Gamma }}_{w,\text{op}}×\left({t}_{\mathrm{0}},{t}_{\mathrm{1}}\right),\end{array}$

$\begin{array}{}\text{(48)}& \frac{\partial T}{\partial n}=\mathrm{0}\phantom{\rule{0.25em}{0ex}}\text{on}\phantom{\rule{0.25em}{0ex}}{\mathrm{\Gamma }}_{\text{H}}×\left({t}_{\mathrm{0}},{t}_{\mathrm{1}}\right),\end{array}$

where $T=T\left(x,y,z,t\right)$ is an unknown temperature function, $t\in \left({t}_{\mathrm{0}},{t}_{\mathrm{1}}\right)$, $\left(x,y,z\right)\in D=\mathrm{\Omega }×\left(\mathrm{0},H\right)$, Ω⊂R2, $H=H\left(x,y\right)$ is the function of the bottom relief, $Q=Q\left(x,y,t\right)$ is the total heat flux, $\stackrel{\mathrm{‾}}{U}=\left(u,v,w\right)$, ${\stackrel{\mathrm{^}}{a}}_{T}=\text{diag}\left(\left({a}_{T}{\right)}_{ii}\right)$, $\left({a}_{T}{\right)}_{\mathrm{11}}=\left({a}_{T}{\right)}_{\mathrm{22}}={\mathit{\mu }}_{T}$, (aT)33=νT, and ${f}_{T}={f}_{T}\left(x,y,z,t\right)$ are given functions. The boundary of the domain $\mathrm{\Gamma }\equiv \partial D$ is represented as a union of four disjoint parts ΓS, Γw,op, Γw,c, and ΓH, where ΓS (the unperturbed sea surface), Γw,op is the liquid (open) part of vertical lateral boundary, Γw,c is the solid part of the vertical lateral boundary, ΓH is the sea bottom, ${\stackrel{\mathrm{‾}}{U}}_{\text{n}}^{\left(-\right)}=\left(|{\stackrel{\mathrm{‾}}{U}}_{\text{n}}|-{\stackrel{\mathrm{‾}}{U}}_{\text{n}}\right)/\mathrm{2},$ and ${\stackrel{\mathrm{‾}}{U}}_{\text{n}}$ is the normal component of $\stackrel{\mathrm{‾}}{U}$. The other notations and a detailed description of the problem statement can be found in .

Equation (48) can be written in the form of an operator equation:

$\begin{array}{}\text{(49)}& \begin{array}{ll}{T}_{t}+LT=\mathcal{F}+BQ,& t\in \left({t}_{\mathrm{0}},{t}_{\mathrm{1}}\right),\\ T={T}_{\mathrm{0}},\phantom{\rule{1em}{0ex}}\phantom{\rule{1em}{0ex}}t={t}_{\mathrm{0}},& \end{array}\end{array}$

where the equality is understood in the weak sense, namely,

$\begin{array}{}\text{(50)}& \left({T}_{t},\stackrel{\mathrm{^}}{T}\right)+\left(LT,\stackrel{\mathrm{^}}{T}\right)=\mathcal{F}\left(\stackrel{\mathrm{^}}{T}\right)+\left(BQ,\stackrel{\mathrm{^}}{T}\right)\phantom{\rule{0.25em}{0ex}}\phantom{\rule{0.25em}{0ex}}\forall \stackrel{\mathrm{^}}{T}\in {W}_{\mathrm{2}}^{\mathrm{1}}\left(D\right),\end{array}$

in this case L, , and B are defined by the following relations:

$\begin{array}{ll}\left(LT,\stackrel{\mathrm{^}}{T}\right)\equiv & \underset{D}{\int }\left(-T\mathrm{Div}\left(\stackrel{\mathrm{‾}}{U}\stackrel{\mathrm{^}}{T}\right)\right)\text{d}D+\underset{{\mathrm{\Gamma }}_{w,\text{op}}}{\int }{\stackrel{\mathrm{‾}}{U}}_{\text{n}}^{\left(+\right)}T\stackrel{\mathrm{^}}{T}\text{d}\mathrm{\Gamma }\\ & +\underset{D}{\int }{\stackrel{\mathrm{^}}{a}}_{T}\mathrm{Grad}\left(T\right)\cdot \mathrm{Grad}\left(\stackrel{\mathrm{^}}{T}\right)\text{d}D,\end{array}$

$\begin{array}{ll}& \mathcal{F}\left(\stackrel{\mathrm{^}}{T}\right)=\underset{{\mathrm{\Gamma }}_{w,\text{op}}}{\int }{Q}_{T}\stackrel{\mathrm{^}}{T}\text{d}\mathrm{\Gamma }+\underset{D}{\int }{f}_{T}\stackrel{\mathrm{^}}{T}\text{d}D,\left({T}_{t},\stackrel{\mathrm{^}}{T}\right)=\underset{D}{\int }{T}_{t}\stackrel{\mathrm{^}}{T}\text{d}D,\\ & \left(BQ,\stackrel{\mathrm{^}}{T}\right)=\underset{\mathrm{\Omega }}{\int }Q\stackrel{\mathrm{^}}{T}{|}_{z=\mathrm{0}}\text{d}\mathrm{\Omega },\end{array}$

and the functions ${\stackrel{\mathrm{^}}{a}}_{T}$, QT, fT, and Q are such that Eq. (50) makes sense. The properties of the operator L were studied by .

Due to Eq. (50), Eq. (49) is considered in ${Y}^{*}={L}_{\mathrm{2}}\left({t}_{\mathrm{0}},{t}_{\mathrm{1}};\left({W}_{\mathrm{2}}^{\mathrm{1}}\left(D\right){\right)}^{*}\right)$, and the operator $B:{L}_{\mathrm{2}}\left(\mathrm{\Omega }×\left({t}_{\mathrm{0}},{t}_{\mathrm{1}}\right)\right)\to {Y}^{*}$ maps the function $Q\in {L}_{\mathrm{2}}\left(\mathrm{\Omega }×\left({t}_{\mathrm{0}},{t}_{\mathrm{1}}\right)\right)$ into the function $BQ\in {Y}^{*}$ such that $\left(BQ,\stackrel{\mathrm{^}}{T}\right)=\underset{\mathrm{\Omega }}{\int }Q\stackrel{\mathrm{^}}{T}{|}_{z=\mathrm{0}}\text{d}\mathrm{\Omega }$, $\forall \stackrel{\mathrm{^}}{T}\in {W}_{\mathrm{2}}^{\mathrm{1}}\left(D\right)$. Therefore, BQ is a linear and bounded functional on ${L}_{\mathrm{2}}\left(\mathrm{0},T;{W}_{\mathrm{2}}^{\mathrm{1}}\left(D\right)\right)$.

Consider the data assimilation problem for the sea surface temperature (see Agoshkov et al.2008). Suppose that the function $Q\in {L}_{\mathrm{2}}\left(\mathrm{\Omega }×\left({t}_{\mathrm{0}},{t}_{\mathrm{1}}\right)\right)$ is unknown in Eq. (48). Let ${T}_{\text{obs}}\left(x,y,t\right)$ also be the function on $\stackrel{\mathrm{‾}}{\mathrm{\Omega }}\equiv \mathrm{\Omega }\cup \partial \mathrm{\Omega }$ obtained for $t\in \left({t}_{\mathrm{0}},{t}_{\mathrm{1}}\right)$ by processing the observation data, and this function in its physical sense is an approximation to the surface temperature function on Ω, i.e., to $T{|}_{z\phantom{\rule{0.125em}{0ex}}=\phantom{\rule{0.125em}{0ex}}\mathrm{0}}$. We suppose that ${T}_{\text{obs}}\in {L}_{\mathrm{2}}\left(\mathrm{\Omega }×\left({t}_{\mathrm{0}},{t}_{\mathrm{1}}\right)\right)$, but the function Tobs may not possess greater smoothness and hence it cannot be used for the boundary condition on ΓS. We admit the case when Tobs is defined only on some subset of $\mathrm{\Omega }×\left({t}_{\mathrm{0}},{t}_{\mathrm{1}}\right)$ and denote the indicator (characteristic) function of this set by m0. For definiteness sake, we assume that Tobs is zero outside this subset.

Consider the data assimilation problem for the surface temperature in the following form: find T and Q such that

where

$\begin{array}{ll}J\left(Q\right)=& \phantom{\rule{0.25em}{0ex}}\frac{\mathit{\alpha }}{\mathrm{2}}\underset{{t}_{\mathrm{0}}}{\overset{{t}_{\mathrm{1}}}{\int }}\underset{\mathrm{\Omega }}{\int }|Q-{Q}^{\left(\mathrm{0}\right)}{|}^{\mathrm{2}}\text{d}\mathrm{\Omega }\text{d}t\\ \text{(52)}& & +\frac{\mathrm{1}}{\mathrm{2}}\underset{{t}_{\mathrm{0}}}{\overset{{t}_{\mathrm{1}}}{\int }}\underset{\mathrm{\Omega }}{\int }{m}_{\mathrm{0}}|T{|}_{z\phantom{\rule{0.125em}{0ex}}=\phantom{\rule{0.125em}{0ex}}\mathrm{0}}-{T}_{\text{obs}}{|}^{\mathrm{2}}\text{d}\mathrm{\Omega }\text{d}t,\end{array}$

and ${Q}^{\left(\mathrm{0}\right)}={Q}^{\left(\mathrm{0}\right)}\left(x,y,t\right)$ is a given function, and $\mathit{\alpha }=\text{const}>\mathrm{0}$.

For α>0 this variational data assimilation problem has a unique solution. The existence of the optimal solution follows from the classic results of the theory of optimal control problems (Lions1968), because it is easy to show that the solution to Eq. (48) continuously depends on the flux Q (a priori estimates are valid in the corresponding functional spaces), the functional J is weakly lower semi-continuous, and the space of admissible controls ${L}_{\mathrm{2}}\left(\mathrm{\Omega }×\left({t}_{\mathrm{0}},{t}_{\mathrm{1}}\right)\right)$ is weakly compact.

For α=0 the problem does not always have a solution, but, as was shown by , there is unique and dense solvability, and it allows one to construct a sequence of regularized solutions minimizing the functional, which is related to a sequence of coefficients αn, with αn→0 when n→∞.

The optimality system determining the solution of the formulated variational data assimilation problem according to the necessary condition grad J=0 has the form:

$\begin{array}{ll}\text{(53)}& & {T}_{t}+LT=\mathcal{F}+BQ\phantom{\rule{0.25em}{0ex}}\text{in}\phantom{\rule{0.25em}{0ex}}\phantom{\rule{0.25em}{0ex}}D×\left({t}_{\mathrm{0}},{t}_{\mathrm{1}}\right),& T={T}_{\mathrm{0}},t={t}_{\mathrm{0}},\\ \text{(54)}& & -\left({T}^{*}{\right)}_{t}+{L}^{*}{T}^{*}=B{m}_{\mathrm{0}}\left({T}_{\text{obs}}-T\right)\phantom{\rule{0.25em}{0ex}}\text{in}\phantom{\rule{0.25em}{0ex}}D×\left({t}_{\mathrm{0}},{t}_{\mathrm{1}}\right),& {T}^{*}=\mathrm{0},t={t}_{\mathrm{1}},\\ \text{(55)}& & \mathit{\alpha }\left(Q-{Q}^{\left(\mathrm{0}\right)}\right)-{T}^{*}=\mathrm{0}\phantom{\rule{0.25em}{0ex}}\text{on}\phantom{\rule{0.25em}{0ex}}\mathrm{\Omega }×\left({t}_{\mathrm{0}},{t}_{\mathrm{1}}\right),\end{array}$

where L* is the operator adjoint to L.

Here the boundary-value function Q plays the role of λ from Sect. 2, φ=T, the operator F has the form $F\left(T,Q\right)=-LT+BQ$, and ${F}_{T}^{\prime }=-L,{F}_{Q}^{\prime }=B$. Since the operator F(T,Q) is bilinear in this case, the Hessian acting on some $\mathit{\psi }\in {L}_{\mathrm{2}}\left(\mathrm{\Omega }×\left({t}_{\mathrm{0}},{t}_{\mathrm{1}}\right)\right)$ is defined by the successive solution of the following problems:

$\begin{array}{}\text{(58)}& \mathcal{H}\mathit{\psi }=\mathit{\alpha }\mathit{\psi }-{B}^{*}{\mathit{\varphi }}^{*}.\end{array}$

To illustrate the above-presented theory, we consider the problem of sensitivity of functionals of the optimal solution Q to the observations Tobs. Let us introduce the following functional (response function):

$\begin{array}{}\text{(59)}& G\left(T\right)=\underset{{t}_{\mathrm{0}}}{\overset{{t}_{\mathrm{1}}}{\int }}\text{d}t\underset{\mathrm{\Omega }}{\int }k\left(x,y,t\right)T\left(x,y,\mathrm{0},t\right)\text{d}\mathrm{\Omega },\end{array}$

where $k\left(x,y,t\right)$ is a weight function related to the temperature field on the sea surface z=0. For example, if we are interested in the mean temperature of a specific region of the sea ω for z=0 in the interval $\stackrel{\mathrm{‾}}{t}-\mathit{\tau }\le t\le \stackrel{\mathrm{‾}}{t}$, then as k we take the function

where mes ω denotes the area of the region ω. Thus, Eq. (59) is written in the form:

Equation (61) represents the mean temperature averaged over the time interval $\stackrel{\mathrm{‾}}{t}-\mathit{\tau }\le t\le \stackrel{\mathrm{‾}}{t}$ for a given region ω. The functionals of this type are of most interest in the theory of climate change .

In our notations, Eq. (59) may be written as

$G\left(T\right)=\underset{{t}_{\mathrm{0}}}{\overset{{t}_{\mathrm{1}}}{\int }}\left(Bk,T\right)\text{d}t=\left(Bk,T{\right)}_{Y},\phantom{\rule{0.25em}{0ex}}\phantom{\rule{0.25em}{0ex}}Y={L}_{\mathrm{2}}\left(D×\left({t}_{\mathrm{0}},{t}_{\mathrm{1}}\right)\right).$

We are interested in the sensitivity of the functional G(T), obtained for T after data assimilation, with respect to the observation function Tobs.

By definition, the sensitivity is given by the gradient of G with respect to Tobs:

$\begin{array}{}\text{(62)}& \frac{\text{d}G}{\text{d}{T}_{\text{obs}}}=\frac{\partial G}{\partial T}\frac{\partial T}{\partial {T}_{\text{obs}}}.\end{array}$

Since $\frac{\partial G}{\partial T}=Bk$, then, according to the theory presented in Sect. 4, to compute the gradient (Eq. 62) we need to perform the following steps:

1. For k defined by Eq. (60) solve the adjoint problem

and put $\mathrm{\Phi }={B}^{*}{\stackrel{\mathrm{̃}}{\mathit{\varphi }}}^{*}.$

2. Find χ by solving χ with the Hessian defined by Eqs. (56)–(58).

3. Solve the direct problem

4. Compute the gradient of the response function as

$\begin{array}{}\text{(65)}& \frac{\text{d}G}{\text{d}{T}_{\text{obs}}}={m}_{\mathrm{0}}{P}_{\mathrm{2}}{|}_{z\phantom{\rule{0.125em}{0ex}}=\phantom{\rule{0.125em}{0ex}}\mathrm{0}}.\end{array}$

Equation (65) allows us to estimate the sensitivity of the functionals related to the mean temperature after data assimilation, with respect to the observations on the sea surface.

7 Numerical example for the Baltic Sea dynamics model

The numerical experiments have been performed using the three-dimensional numerical model of the Baltic Sea hydro-thermodynamics developed at the Institute of Numerical Mathematics, Russian Academy of Sciences on the base of the splitting method and supplied with the assimilation procedure for the surface temperature Tobs with the aim to reconstruct the heat fluxes Q.

The object of simulation is the Baltic Sea water area. The parameters of the considered domain and its geographic coordinates can be described in the following way: the σ-grid is $\mathrm{336}×\mathrm{394}×\mathrm{25}$ (latitude, longitude, and depth, respectively). The first point of the “grid C” has the coordinates 9.406 E and 53.64 N. The mesh sizes in x and y are constant and equal to 0.0625 and 0.03125 degrees, respectively. The time step is Δt=5 min. The initial condition for the whole model, including T0, was chosen in the following way: the model was started running with zero initial conditions and ran with atmospheric forcing obtained from reanalysis of about 20 years, and after that the result of calculation was taken as an initial condition for further running of the model. The assimilation procedure worked only during some time windows. To start the assimilation procedure for the heat flux estimation, the initial condition was taken as a model forecast for the previous time interval.

The Baltic Sea daily-averaged nighttime surface-temperature data were used for Tobs. These are the data of the Danish Meteorological Institute based on measurements of radiometers (AVHRR, AATSR, and AMSRE) and spectroradiometers (SEVIRI and MODIS) (Karagali2012). Data interpolation algorithms were used to convert observations on computational grid of the numerical model of the Baltic Sea thermodynamics. For each time step the heat flux was determined at each surface point; therefore, the number of scalar parameters to be determined were equal to the number of scalar observations.

The mean climatic flux obtained from the NCEP (National Centers for Environmental Prediction) reanalysis was taken for Q(0). We need to mention that Q(0) has a physical meaning here, it is not only an initial guess but also a parameter calculated from atmospheric data and taken in the model for temperature boundary condition on the sea surface when the model runs without the assimilation procedure.

Using the hydro-thermodynamics model mentioned above, which is supplied with the assimilation procedure for the surface temperature Tobs, we have performed calculations for the Baltic Sea area where the assimilation algorithm worked only at certain time moments t0; in this case ${t}_{\mathrm{1}}={t}_{\mathrm{0}}+\mathrm{\Delta }t$.

The aim of the experiment was the numerical study of the sensitivity of functionals of the optimal solution Q to observation errors in the interval (t0,t1).

Implementing the assimilation procedure, we considered a system of the form in Eqs. (53)–(55), where Eqs. (53)–(54) mean the finite-dimensional analogues of the corresponding problems . For the statement of a data assimilation problem we introduce the cost function (Eq. 52) with a regularization parameter α, which weights the squared difference $|Q-{Q}^{\left(\mathrm{0}\right)}{|}^{\mathrm{2}}$. Since in all numerical experiments α was chosen very small, the impact of the first term in the functional was also small, and therefore Q was different from Q(0).

We use here the SI units, namely K (kelvin) is used for temperature, m s−1 for velocities, and mK s−1 for the heat flux Q. The parameter α is defined as s2 m−2 to give both terms in Eq. (52) the same dimension. It is easily seen that in this case the units of the gradient $\frac{\text{d}G}{\text{d}{T}_{\text{obs}}}$ from Eq. (65) are defined as m−2 s−1.

Let us present some results of numerical experiments. The calculation results for t0=50 h (600 time steps for the model) are presented in Fig. 1 showing the gradient of the functional G(T) defined by Eq. (61) and related to the mean temperature after data assimilation, with respect to the observations on the sea surface, according to Eqs. (63)–(65). Here ω, τt, $\stackrel{\mathrm{‾}}{t}={t}_{\mathrm{1}}$, and $\mathit{\alpha }={\mathrm{10}}^{-\mathrm{5}}$ s2 m−2.

Figure 1The gradient of the functional G(T) (m−2 s−1).

We can see the sub-areas (in red) in which the functional G(T) is most sensitive to errors in the observations during assimilation. The largest values of the gradient of G(T) correspond to the points x,y lying near the regions with a small depth (cf. sea topography, Fig. 2). One explanation of this phenomenon may be the fact that in the areas with depths of about 50 m, rapid convection occurs in the upper mixed layer. With the assimilation of the surface temperature, information is transmitted faster to shallower depths, which in turn contributes to a higher sensitivity to data in these places, in contrast to deeper regions.

Figure 2Baltic Sea topography (m).

Remark 3. We use the discretize-then-optimize approach, and for numerical experiments all the presented equations are understood in a discrete form, as finite-dimensional analogues of the corresponding problems, obtained after approximation. This allows us to consider model equations as a perfect model, with no approximation errors. Therefore, the accuracy of the sensitivity estimates given by Eqs. (63)–(65) are determined by the accuracy of solving the Hessian equation χ (step 2 of the algorithm). Due to Eqs. (56)– (58), this equation is equivalent to a linear data assimilation problem, and an approximate solution to the minimization problem is obtained by an iterative procedure.

The above studies allow us to solve the problem of the definition of sea sub-areas in which the functional of the optimal solution is most sensitive to errors in the observations during variational data assimilation, when the error values are not a priori known.

8 Conclusions

In this paper we have considered numerical algorithms to study the sensitivity of functionals of the optimal solution of the variational data assimilation problem aimed at the reconstruction of unknown parameters of the model. The optimal solution obtained as a result of assimilation depends on the observations that may contain uncertainties. Computing the gradient of the functionals with respect to observations reduces to the solution of a nonstandard problem which is a coupled system involving direct and adjoint equations with mutually dependent variables. Solvability of the nonstandard problem is related to the properties of the Hessian of the original cost function. An algorithm developed to compute the gradient of the response function is based on the second-order adjoint techniques. A numerical example for the variational data assimilation problem related to sea surface temperature for the Baltic Sea thermodynamics model demonstrates the result of the gradient computation of the response function associated with the mean surface temperature. The presented algorithm may be used to determine the sea sub-areas in which the functionals of the optimal solution are most sensitive to errors in the observations during variational data assimilation.

Data availability

The Baltic Sea daily-averaged surface-temperature data (Copernicus product ID ST_BAL_SST_L4_REP_OBSERVATIONS_010_016) can be found at the Copernicus Marine Environment Monitoring Service website (http://marine.copernicus.eu/services-portfolio/access-to-products/).

Competing interests

The authors declare that they have no conflict of interest.

Special issue statement

This article is part of the special issue “Numerical modeling, predictability and data assimilation in weather, ocean and climate: A special issue honoring the legacy of Anna Trevisan (1946–2016)”. It is a result of A Symposium Honoring the Legacy of Anna Trevisan – Bologna, Italy, 17–20 October 2017.

Acknowledgements

The authors are greatly thankful to Olivier Talagrand and the reviewers for providing helpful comments that resulted in substantial improvements for the paper. This work was carried out within the Russian Science Foundation project 17-77-30001 (studies in Sects. 1–4), the AIRSEA project (INRIA Grenoble Rhône-Alpes), and the project 18-01-00267 of the Russian Foundation for the Basic Research.

Edited by: Olivier Talagrand
Reviewed by: Olivier Talagrand and one anonymous referee

References

Agoshkov, V. I., Parmuzin, E. I., and Shutyaev, V. P.: Numerical algorithm of variational assimilation of the ocean surface temperature data, Comp. Math. Math. Phys., 48, 1371–1391, 2008. a, b, c, d, e, f

Agoshkov, V. I., Parmuzin, E. I., Zalesny, V. B., Shutyaev, V. P., Zakharova, N. B., and Gusev, A. V.: Variational assimilation of observation data in the mathematical model of the Baltic Sea dynamics, Russ. J. Numer. Anal. Math. Modelling, 30, 203–212, 2015. a

Agoshkov, V. I. and Sheloput, T. O.: The study and numerical solution of some inverse problems in simulation of hydrophysical fields in water areas with “liquid” boundaries, Russ. J. Numer. Anal. Math. Modelling, 32, 147–164, 2017. a

Alifanov, O. M., Artyukhin, E. A., and Rumyantsev, S. V.: Extreme Methods for Solving Ill-posed Problems with Applications to Inverse Heat Transfer Problems, Begell House Publishers, Danbury, USA, 1996. a

Baker, N. L. and Daley, R.: Observation and background adjoint sensitivity in the adaptive observation-targeting problem, Q. J. Roy. Meteorol. Soc., 126, 1431–1454, 2000. a

Bocquet, M.: Parameter-field estimation for atmospheric dispersion: application to the Chernobyl accident using 4D-Var, Q. J. Roy. Meteorol. Soc., 138, 664–681, 2012. a

Chavent, G.: Local stability of the output least square parameter estimation technique, Math. Appl. Comp., 2, 3–22, 1983. a

Cioaca, A., Sandu, A., and de Sturler, E.: Efficient methods for computing observation impact in 4D-Var data assimilation, Comput. Geosci., 17, 975–990, 2013. a

Daescu, D. N.: On the sensitivity equations of four-dimensional variational (4D-Var) data assimilation, Mon. Weather Rev., 136, 3050–3065, 2008. a

Dee, D. P.: Bias and data assimilation, Q. J. Roy. Meteorol. Soc., 131, 3323–3343, 2005. a

Gejadze, I., Le Dimet, F.-X., and Shutyaev, V.: On analysis error covariances in variational data assimilation, SIAM J. Sci. Computing, 30, 1847–1874, 2008. a

Gejadze, I. Yu., Copeland, G. J. M., Le Dimet, F.-X., and Shutyaev, V. P.: Computation of the analysis error covariance in variational data assimilation problems with nonlinear dynamics, J. Comput. Phys., 230, 7923–7943, 2011. a

Gejadze, I. Yu., Shutyaev, V. P., and Le Dimet, F.-X.: Analysis error covariance versus posterior covariance in variational data assimilation, Q. J. Roy. Meteorol. Soc., 139, 1826–1841, 2013. a

Gejadze, I. Yu., Shutyaev, V. P., and Le Dimet, F.-X.: Hessian-based covariance approximations in variational data assimilation, Russ. J. Numer. Anal. Math. Modelling, 33, 25–39, 2018. a

Karagali, I., Hoyer, J., and Hasager, C. B.: SST diurnal variability in the North Sea and the Baltic Sea, Remote Sens. Environ., 121, 159–170, 2012. a

Le Dimet, F.-X. and Shutyaev, V.: On deterministic error analysis in variational data assimilation, Nonlin. Processes Geophys., 12, 481-490, https://doi.org/10.5194/npg-12-481-2005, 2005.

Le Dimet, F.-X. and Talagrand, O.: Variational algorithms for analysis and assimilation of meteorological observations: theoretical aspects, Tellus A, 38, 97–110, 1986. a

Le Dimet, F.-X., Navon, I. M., and Daescu, D. N.: Second-order information in data assimilation, Mon. Weather Rev., 130, 629–648, 2002. a

Le Dimet, F.-X., Ngodock, H. E., Luong, B., and Verron, J.: Sensitivity analysis in variational data assimilation, J. Meteorol. Soc. Jpn., 75, 245–255, 1997. a

Lions, J.-L.: Contrôle Optimal des Systèmes Gouvernés par des Équations aux Dérivées Partielles, Dunod, Paris, France, 1968. a, b, c

Marchuk, G. I.: Adjoint Equations and Analysis of Complex Systems, Kluwer, Dordrecht, the Netherlands, 1995. a

Marchuk, G. I., Dymnikov, V. P., and Zalesny, V. B.: Mathematical Models in Geophysical Hydrodynamics and Numerical Methods for their Realization, Gidrometeoizdat, Leningrad, USSR, 1987. a

Marchuk, G. I., Agoshkov, V. I., and Shutyaev, V. P.: Adjoint Equations and Perturbation Algorithms in Nonlinear Problems, CRC Press Inc., New York, USA, 1996. a, b

Navon I. M.: Practical and theoretical aspects of adjoint parameter estimation and identifiability in meteorology and oceanography, Dyn. Atmos. Oceans, 27, 55–79, 1998. a

Schirber, S., Klocke, D., Pincus, R., Quaas J., and Anderson, J. L.: Parameter estimation using data assimilation in an atmospheric general circulation model: From a perfect toward the real world, J. Adv. Model Earth Sy., 5, 58–70, 2013.  a

Shutyaev, V., Gejadze, I., Copeland, G. J. M., and Le Dimet, F.-X.: Optimal solution error covariance in highly nonlinear problems of variational data assimilation, Nonlin. Processes Geophys., 19, 177–184, https://doi.org/10.5194/npg-19-177-2012, 2012. a

Shutyaev, V., Le Dimet, F.-X., and Shubina E.: Sensitivity with respect to observations in variational data assimilation, Russ. J. Numer. Anal. Math. Modelling, 32, 61–71, 2017. a, b, c, d

Smith, P. J., Thornhill, G. D., Dance, S. L., Lawless, A. S., Mason, D. C., and Nichols, N. K.: Data assimilation for state and parameter estimation: application to morphodynamic modelling, Q. J. Roy. Meteorol. Soc., 139, 314–327, 2013. a

Storch, R. B., Pimentel, L. C. G., and Orlande H. R. B.: Identification of atmospheric boundary layer parameters by inverse problem, Atmos. Environ., 41, 417–1425, 2007. a

Sun, N.-Z.: Inverse Problems in Groundwater Modeling, Kluwer, Dordrecht, the Netherlands, 1994. a

Vainberg, M. M.: Variational Methods for the Study of Nonlinear Operators, Holden-Day, San Francisco, USA, 1964. a

White, L. W., Vieux, B., Armand, D., and Le Dimet, F.-X.: Estimation of optimal parameters for a surface hydrology model, Adv. Water Resour., 26, 337–348, 2003. a

Yuepeng, W., Yue, Ch., Navon, I. M., and Yuanhong, G.: Parameter identification techniques applied to an environmental pollution model, J. Ind. Manag. Optim., 14, 817–831, 2018. a

Zakharova, N. B., Agoshkov, V. I., and Parmuzin, E. I.: The new method of ARGO buoys system observation data interpolation, Russ. J. Numer. Anal. Math. Modelling, 28, 67–84, 2013. a

Zalesny, V., Agoshkov, V., Aps, R., Shutyaev, V., Zayachkovskiy, A., Goerlandt, F., and Kujala, P.: Numerical modeling of marine circulation, pollution assessment and optimal ship routes, J. Mar. Sci. Eng., 5, 1–20, 2017. a, b

Zhu, Y. and Navon, I. M.: Impact of parameter estimation on the performance of the FSU global spectral model using its full-physics adjoint, Mon. Weather Rev., 127, 1497–1517, 1999. a