Introduction
It is well understood that in chaotic physical systems, dynamical instability
is among the leading drivers of forecast uncertainty
.
Furthermore, recent mathematical and numerical results have established a
rigorous framework for understanding the relationship between dynamical
instability, in terms of the non-negative Lyapunov exponents, and the
asymptotic properties of the uncertainty in ensemble-based data assimilation
techniques: in perfect models, with weakly-nonlinear error growth, the
anomalies of ensemble Kalman filters project strongly on the span of the
unstable-neutral backward Lyapunov vectors ,
and the divergence of the
ensemble Kalman filter depends significantly upon whether error in this space
is sufficiently observed and corrected.
Inspired by the “assimilation in the unstable subspace” (AUS) methodology of
Anna Trevisan and her collaborators
, recent mathematical results have rigorously
validated the underlying hypothesis of AUS: for perfect, linear models, the
uncertainty of the Kalman filter asymptotically collapses to the span of the
backward Lyapunov vectors with non-negative exponents
. Furthermore, if a reduced rank filter has an
estimated covariance initialized only in these modes, and the
unstable–neutral subspace is uniformly, completely observed, the reduced rank
filter becomes asymptotically equivalent to the optimal Kalman filter
. Furthermore, this phenomenon has been
generalized as a necessary and sufficient criterion for the exponential
stability of continuous time filters, in perfect models, in terms of the
detectability of the unstable–neutral subspace
.
The above mathematical studies demonstrate how the stable dynamics in perfect
models dissipate forecast errors, in sequential filters, such that a reduced
rank representation of the error covariance matrix in the unstable–neutral
subspace alone suffices to control error growth. This behavior, similarly
understood in the smoothing problem ,
is now also mathematically verified for the linear Kalman smoother and its
nonlinear ensemble formulation is shown numerically to exhibit the same
behavior in a weakly-nonlinear regime for error dynamics
. With these results, AUS provides a robust
theoretical framework for interpreting the behavior of the ensemble Kalman
filter in terms of the model dynamics. However, this framework has, for the
most part, been limited to understanding the ensemble Kalman filter in models
without errors.
The sources of model error are varied and a common simplifying assumption in
data assimilation is that it takes the form of additive, Gaussian noise that
is white in time. The work of recently extended the
theoretical framework of AUS, so far established for perfect models, to the
presence of additive model errors with additional qualifications. The study
introduced novel bounds on the Kalman filter's asymptotic forecast
uncertainty, and a necessary criterion for filter stability, as an inverse
relationship between the model's dynamical instabilities and the relative
precision of observations. Particularly, in stationary dynamics and the
absence of corrections to forecast errors in the stable modes, the Grudzien et al. (2018) work
demonstrated that the model dynamics alone are once again sufficient to
uniformly bound the errors in the span of the stable backward Lyapunov
vectors.
However, the uniform bound may be impractically large due to the excitation
of model errors by the transient instabilities in stable directions. While
uncertainty is asymptotically dissipated by the stable dynamics, the
reintroduction of uncertainty from model error significantly differentiates
models with additive errors. Newly injected errors are subject to the growth
rates of the local (in time) Lyapunov exponents, and stable Lyapunov
exponents of sufficiently high variance may experience transient periods of
growth. Therefore, strategies for representing the forecast error with a low
rank ensemble must be adapted for imperfect models to account for a residual
error in the span of the stable, backward Lyapunov vectors which never
vanishes and, moreover, may go through transient periods of growth. As a
consequence, confining the error description within a reduced rank Kalman
filter to only the unstable–neutral subspace does not suffice when model
error is present and suggests that one must include additional,
asymptotically stable, modes (Grudzien et al., 2018).
Furthermore, in this current work we show that such an increase of the
ensemble span does not automatically render the filter optimal: one may also
need to account for the injection of error from unfiltered directions into
the ensemble span. In particular, when an ensemble-based Kalman gain is used
to correct the forecast errors, the dynamics induce error propagation which
transmits uncertainty from the uncorrected, complementary subspace into the
ensemble span. In this study, the propagation of error in the linear Kalman
filter, written in a basis of backward Lyapunov vectors, will reveal the
leading order evolution of the unfiltered uncertainty. Although the evolution
is derived for linear models, the mechanism for error propagation can be
considered a generic feature of ensemble Kalman filters. Under the condition
that error evolution is weakly-nonlinear, the ensemble span will align with
the span of the leading backward Lyapunov vectors; therefore, the error
decomposition in the basis of backward Lyapunov vectors will be valid for the
ensemble Kalman filter.
Similar to how we view AUS as a theoretical framework for understanding the
properties of ensemble-based covariances in the presence of chaotic dynamics
(and in the absence of model error), this work aims to be used as a
theoretical explanation for the empirically observed properties of
ensemble-based covariances in the presence of chaotic dynamics and additive
model errors, providing a theoretical motivation for the role of covariance
inflation in preventing filter divergence. We demonstrate that even when
issues of sampling error, truncation errors due to nonlinearity, and
misspecification of model and observation error distributions are all
excluded, there is an intrinsic deficiency of the standard reduced rank error
covariance recursion that leads to systematic underestimation of the forecast
errors in the ensemble span. While we believe this provides a new theoretical
explanation for the role of covariance inflation in the ensemble Kalman
filter, we also discuss possible strategies to obviate inflation with less ad
hoc methods that take into account the evolution of unfiltered errors more
directly.
This paper is structured as follows: Sect. concerns
essential results from the theory of Lyapunov vectors which are used
throughout; Sects. and describe
the basic framework for the Kalman filter, and will motivate our subsequent
results; Sect. contains our main analytical
result, the derivation of the exact forecast error under a reduced rank
filter in a basis of backward Lyapunov vectors; Sect.
will use numerics to qualitatively explore the forecast error of the reduced
rank filter, and its approximation in nonlinear models. Implications of the
results in this work are discussed in Sect. ,
with an emphasis on future research directions and their challenges. Final
conclusions are drawn in Sect. .
Preliminaries
We begin by introducing our notation and the problem formulation, including
relevant definitions. There is inconsistent use of the terminology for
Lyapunov vectors in the literature, and so we choose to use the nomenclature
of for its accessibility and self-consistency.
Lyapunov vectors
Throughout the entire text, the conventional notation k=0,1,2,… is
adopted to indicate that the quantity is defined at time tk. Let
zk-1∈Rn be an arbitrary vector, the matrix
propagator of the forward model from tk-1 to tk is given by
Mk, such that zk=Mkzk-1. We denote
the operator taking the system state from an arbitrary time tl to tk as
Mk:l≜MkMk-1…Ml+1, with the symbol ≜ used to signify that the
expression is a definition. We denote Mk:k≜In, where In is the identity matrix (of size n×n in this case). At all times we assume Mk to be non-singular
and to be uniformly bounded in k.
Although much of the derivations that follow are done for linear dynamics, we
are ultimately concerned with nonlinear systems; therefore, we will assume
that Oseledec's theorem holds, even for linear model propagators. In general,
this is a non-trivial assumption, but one which can be considered generic for
the tangent-linear model of a wide class of nonlinear systems, due to the
multiplicative ergodic theorem (MET): with probability one, Oseledec's
theorem holds, the Lyapunov exponents are well defined and the values of the
Lyapunov exponents are independent of the initial condition see their
theorems 2.1.1 and 2.1.2 for a full statement and proof. A
more general version of the MET, and its interpretation for several physical
systems, is provided by in their Theorem 1.1 and
Example 1.2.
We order the Lyapunov exponents
λ1≥⋯≥λn0≥0>λn0+1≥⋯≥λn,
such that the unstable–neutral subspace is of dimension n0 and the model
state dimension is n. Note that we do not assume that the Lyapunov
exponents are distinct.
Oseledec's theorem decomposes the (tangent-linear) model space into a direct
sum of time-varying, covariant Oseledec spaces, referred to as an Oseledec
splitting or decomposition. At times, we will refer to the covariant Oseledec
spaces, as well as to the covariant, and to the forward Lyapunov vectors.
These discussions will provide a deeper interpretation of our results for
those familiar with these technical points. However, these discussions are
not crucial to the understanding of our results; therefore, we limit the
use of formal definitions to the backward Lyapunov vectors. For a more formal
discussion of the Oseledec spaces, constructions for Lyapunov vectors and
related results for the full rank Kalman filter see . For
a survey on the mathematical and numerical construction of Lyapunov vectors
see . For a discussion of general Oseledec splitting and a
comparison of methods for its computation see .
The backward Lyapunov vectors can be defined by a choice of an orthonormal
eigenbasis for the far past operator, and/or by recursive QR factorizations
of the (tangent-linear) model propagator . Throughout the
text, we utilize the invariance of the backward Lyapunov vectors under the
recursive QR algorithm.
Definition 1. Define the matrix
Ek to be the orthogonal matrix at time k whose i-th
column is the i-th “backward Lyapunov vector” (BLV), corresponding to
the Lyapunov exponent λi.
Lemma 1. There is an n × n
upper triangular matrix Uk, such that the
(tangent-linear) model propagator satisfies
Mk=EkUkEk-1T.
Define the product of matrices
Uk:l≜Uk⋯Ul+1,
the i-th Lyapunov exponent is equal to the limit
λi=limk→∞1k-llogUk:lii,
where Uk:lii is the i-th diagonal element of the
matrix Uk:l. The local Lyapunov exponents are defined
by logUkii.
Proof. Equation () follows
from Eq. (31) of and is a consequence of the invariance of
the BLVs under the recursive QR decomposition . Computing
Lyapunov exponents via recursive QR factorizations as in
Eq. () is the standard method, described by, e.g.,
, and .
The decomposition in Eq. () represents a change of the
basis of the model space into the upper triangular dynamics of the moving
frame of BLVs, defining a basis for the backward Lyapunov filtration
. In particular, Ek-1T takes the
model state into the orthogonal projection coefficients in the basis of the
BLVs at time k-1. We will denote the projection coefficients of an
arbitrary vector zk into a basis of BLVs with a “hat”, i.e.,
EkTzk≜z^k. Using
the orthogonality of the matrix Ek, the invariant dynamics in the
BLVs is written as follows:
z^k=Ukz^k-1⇔zk=Mkzk-1.
Thus, the operator Uk describes the invariant, upper triangular
dynamics, transferring the model state into its forward representation in the
BLVs at time k.
The Kalman filter
We seek to estimate the distribution of a Gaussian random vector xk∈Rn evolved via a linear Markov model with additive white
noise,
xk=Mkxk-1+wk,
and with observations yk∈Rd given in the following form:
yk=Hkxk+vk.
The forecast mean, xkb, is propagated from the last
posterior mean, xk-1a by the deterministic component of
Eq. (), i.e.,
xkb=Mkxk-1a.
The model variables and observation vectors are related via the linear
observation operator Hk:Rn↦Rd.
The respective model and observation noise, wk and vk, are assumed
mutually independent, unbiased, Gaussian white sequences such that
E[vkvlT]=δk,lRkandE[wkwlT]=δk,lQk,
where E is the expectation, Rk∈Rd×d is the observation error covariance matrix at time tk, and
Qk∈Rn×n stands for the model error
covariance matrix. The error covariance matrix Rk can be assumed
invertible without losing generality. To avoid pathologies, we assume that
the model and the observation error covariance matrices are uniformly
bounded. For 1≤t<s≤n, and given a matrix A∈Rn×n, we define At:s∈Rn×(s-t+1) to be the matrix composed (inclusively) of columns s through
t of A.
Definition 2. The “forecast
error” is defined as the difference of the mean state estimated by the filter
and the unknown random state, i.e.,
ϵk≜xkb-xk.
The “innovation” is the measured difference between the
forecast in the observation space and the observation,
δk≜yk-Hkxkb=vk-Hkϵk.
We define the “exact forecast error covariance” at time k to
be
Bk≜EϵkϵkT.
Conversely, suppose some filter, yet to be identified, is used
to estimate the forecast mean and error covariance – the estimated
forecast error covariance will be denoted Pk, defined
according to the chosen estimation algorithm.
Suppose that Kk∈Rn×d is some estimator
which takes the forecast state to the analysis state. In the case of the
theoretical Kalman filter (KF), where the exact forecast error covariances are
computed Pk≡Bk, the gain Kk will be
defined as follows:
Kk≜PkHkTHkPkHkT+Rk-1=BkHkTHkBkHkT+Rk-1.
In this text, we will vary the choice of the analysis update operator
Kk, but the functional form of the recursion for the analysis
update of the mean will be unchanged and defined as
xka≜xkb+Kkyk-Hkxkb=xkb-KkHkϵk+Kkvk.
Therefore, for any estimator, the forecast mean can be derived recursively
from Eq. () and Eq. () as
xk+1b≜Mk+1xkb-KkHkϵk+Kkvk,
where Kk is some choice for the gain. The recursion on the
forecast error can be derived equal to
ϵk+1≜Mk+1In-KkHkϵk+Kkvk-wk+1,
although ϵk,vk and wk+1 are assumed
to be unknown.
Rank deficiency in the Kalman filter
In a linear model, with known Gaussian observation and model error
distributions, the estimated error covariances of the KF are exact: the posterior
error distribution for the state is Gaussian, and the KF completely describes
the Bayesian posterior through its recursive equations for the estimated mean
and covariance. However, it is often the case that the recursion for the
posterior error distribution is approximated with a reduced rank surrogate in
which the estimated covariance, Pk, and resulting exact error
covariance, Bk, may not be equal . This
mismatch can lead to the systematic underestimation of the forecast error and
filter divergence.
Nonetheless, it is possible in an ideal setting to analytically describe the
error statistics of a reduced rank Kalman filter – to illustrate this,
assume that we have a linear model with known Gaussian error distributions.
Suppose we apply the analysis update in a reduced rank set of BLVs, as has
been done in extended Kalman filter (EKF)-AUS
. Furthermore, suppose the exact error covariance,
Bk, is known. Then the gain
Kk≜Ek1:n0Ek1:n0TBkEk1:n0Ek1:n0THkT×HkEk1:n0Ek1:n0TBkEk1:n0Ek1:n0THkT+Rk-1
yields the exact Kalman estimator with respect to a subset of the anomaly
variables, defined by the span of the leading n0 BLVs. We may use
Eq. () to derive the analytical recursion for
the forecast error covariance, Bk+1, under the reduced rank
gain in Eq. (). The “rank deficiency (or
reduced rank)” is defined by the restriction of the Kalman estimator to a low
dimensional subspace. Note that, although the estimator is restricted to the
span of Ek1:n0, the observation operator is still applied to
the full state vector; thus, the analysis does not equal the restriction
of the Bayesian update to the leading n0 BLVs. We recover the restricted
Bayesian update using the estimator in Eq. ()
precisely when HkEkn0+1:n≡0d×(n-n0).
The significance of deriving an analytical recursion for the forecast error
under the reduced rank estimator in Eq. () is as
follows. The analysis operator in Eq. () is
characteristic of the typical gain for the ensemble Kalman filter
(EnKF) in large, geophysical models: the ensemble-based gain applies
its update with respect to the subspace defined by the span of the ensemble
of anomalies, which is usually of reduced rank and aligns with the span of
the leading BLVs . Therefore, the standard EnKF can,
be considered a Monte Carlo estimate of the error statistics
resulting from a rank deficient Kalman estimator as in
Eq. (). This is the motivation of
Sect. , where we will define a reduced rank
gain which operates within the span of an arbitrary number of the leading
BLVs and derive the resulting exact forecast error covariance.
Filtering in the backward Lyapunov basis vectors
Consider the forecast error recursion for the linear KF in
Eq. (). As we are motivated by ensemble
covariances, suppose Kk is defined as a reduced rank gain which
corrects only the leading r BLVs, with r<n. The subspace defined by
the span of the anomalies defines a subspace of “filtered variables” where
we perform our analysis. The “unfiltered subspace” is uniquely defined (up
to the inner product) as the orthogonal complement to the filtered space,
i.e., the subspace in which the reduced rank Kalman estimator makes no
correction.
Definition 3. For each k, we define the “filtered
subspace” by the column span of the vectors Ekf≜Ek1:r and the “unfiltered subspace” by the column
span of the vectors Eku≜Ekr+1:n. The projection coefficients of a vector z∈Rn
into the filtered and unfiltered subspace will be denoted
z^f≜EkfTz and
z^u≜EkuTz, respectively.
Thus, we decompose the forecast error into its orthogonal projections in the
filtered and unfiltered subspaces as
ϵk≜Ekfϵ^kf+Ekuϵ^ku.
For r=n, define Ekf≜Ek and
Eku≜0n such that
ϵ^kf is the full error written in
an orthogonal change of basis – this case will only be referred to for
comparison.
For i,j∈{f,u}, we write the sub-covariances in the basis
defined by Ek as
B^kij≜Eϵ^kiϵ^kjT.
Therefore, the exact forecast error covariance is given as
Bk≡EkB^kffB^kfuB^kufB^kuuEkT,
where B^kff and
B^kuu are symmetric matrices, and
B^kfu=B^kufT. We similarly
express Uk as a block matrix,
Uk≜UkffUkfu0(n-r)×rUkuu.
For an arbitrary rank filtered subspace, the reduced rank gain Kk
correcting the span of Ekf is defined by
Kk≜EkfK^k,K^k≜BkffEkfTHkTHkEkfBkffEkfTHkT+Rk-1,
where K^k represents the projection coefficients of the
reduced rank gain into the filtered variables.
For every k≥1, we decompose the model error covariance into the basis
of filtered and unfiltered BLVs as
Qk≜EkQ^kffQ^kfuQ^kufQ^kuuEkT,
where Q^kff and
Q^kuu are symmetric matrices, and
Q^kfu=Q^kufT.
With the above notation, and using Eq. (), the evolution
of the forecast error under the reduced rank gain is derived from
Eq. () as
ϵk+1=Mk+1In-EkfK^kHkϵk+Mk+1EkfK^kvk-wk+1=Ek+1Uk+1EkT-Ek+1Uk+1In×rK^kHkϵk+Ek+1Uk+1In×rK^kvk-wk+1.
Equation () describes the evolution of the forecast
error with respect to the suboptimal filter, and suggests, as in
Eq. (), how we may write the error evolution
into the upper triangular dynamics in the moving frame of BLVs. Computing the
evolution of ϵ^kf and
ϵ^ku under the forecast–analysis
update cycle in Eq. (), we will derive the exact
recursion for B^kff. This will describe the
exact forecast uncertainty in the filtered subspace under a gain which
operates in the span of the leading r BLVs.
Evolution of unfiltered error
We begin by deriving the evolution of error in the unfiltered subspace, by
verifying that it evolves according to the free evolution. Notice first the
following relation:
Ek+1uTEk+1Uk+1In×r=0(n-r)×r,
due to the fact that Ek+1 is an orthogonal matrix,
the above product is equal to the lower left block of
Uk+1, which is upper triangular. With the substitution of
Eq. () into in Eq. () for
ϵk, multiplying on the left by
EkuT to move into the
unfiltered subspace, and by utilizing Eq. () to cancel the
error in the filtered space, we find
ϵ^k+1u=Ek+1uTEk+1Uk+1EkTEkfϵ^kf+Ekuϵ^ku-Ek+1uTwk+1=Uk+1uuϵ^ku-w^k+1u.
Equation () demonstrates that the evolution of
the error in the unfiltered subspace exactly follows the free forecast
evolution. The covariance of unfiltered error at time k can be computed
from Eq. () as
B^kuu=Uk:0uuB^0uuUk:0uuT+∑l=1kUk:luuQ^luuUk:luuT.
The initial uncertainty in the unfiltered subspace evolves as
Uk:0uuB^0uuUk:0uuT; thus, when r>n0,
it vanishes exponentially. This implies that asymptotic unfiltered error is
independent of the initialization, similar to the results of
. The remaining sum in
Eq. () represents the contribution to the current
forecast uncertainty from the model error at all times after initialization,
propagated under the upper triangular evolution in the BLVs. Therefore, while
the initial error is forgotten, the asymptotic error in the reduced rank
filter here explicitly depends on the dimension of the unfiltered subspace
and the local variability of the stable BLVs therein.
The error in the i-th BLV in Eq. () is given by
the free evolution of perturbations, formerly studied by :
when the filtered subspace dimension is of dimension r≥n0, we can
recursively, and stably, compute the unfiltered uncertainty via
B^k+1uu=Q^k+1uu+Uk+1uuB^kuuUk+1uuT.
When r<n0, we explicitly see that the filter will diverge as a consequence
of leaving an unstable direction unfiltered.
Evolution of filtered error
We now consider the evolution of the projection of the forecast error into
the filtered space, with respect to the reduced rank gain. From
Eq. () we derive
ϵ^k+1f=Ek+1fTEk+1Uk+1EkTEkfϵ^kf+Ekuϵ^ku-Ek+1fTEk+1Uk+1In×rK^kHkEkfϵ^kf+Ekuϵ^ku+Ek+1fTEk+1Uk+1In×rK^kvk-wk+1.
Similar to Eq. (), we see the terms
Ek+1fTEk+1Uk+1EkTEkf=Uk+1ffEk+1fTEk+1Uk+1EkTEku=Uk+1fu,
using the
orthogonality of the BLVs. Therefore, substitution into
Eq. () yields
ϵ^k+1f=Uk+1ff-Uk+1ffK^kHkEkfϵ^kf+Uk+1ffK^kvk-w^k+1f+Uk+1fu-Uk+1ffK^kHkEkuϵ^ku.
The terms (a) and
(b) correspond to the standard recursion on
the KF forecast error. If the filtered subspace is the entire state space
(i.e., Ekf≜Ek) the term
(c) is identically zero, and the terms
(a) and (b) are
equivalent to a change of basis for the forecast error recursion in
Eq. (), written in the invariant dynamics for
the moving frame of the BLVs.
For r<n, the remaining term (c) is our
primary object of interest. Term (c) is
fundamentally different from the relationship described by terms
(a) and (b),
which represents the usual stabilizing effect of the forecast–analysis
cycle. Instead, term (c) describes two
different processes: (i) Uk+1fu represents the
purely dynamical upwelling of the unfiltered error into the filtered
variables; (ii) Uk+1ffK^kHkEku is the correction
in the filtered subspace, due to the sensitivity of these variables to
observations in the unfiltered subspace, forward propagated to time
tk+1. When Kk yields the restricted Bayesian update, i.e.,
when Kk is defined as in Eq. () and
HkEku≡0d×(n-r), term
(c) represents dynamical upwelling alone.
Generically, Uk+1fu-Uk+1ffK^kHkEku≠0r×(n-r) and ϵ^ku is Gaussian distributed with covariance given by
Eq. (); thus term (33c) is almost surely non-zero. This
demonstrates that the forecast error in the filtered subspace depends on the
unfiltered error via the forward evolution, whereas the unfiltered error does
not depend on the error in the filtered space.
This implies that the direct application of EKF-AUS from perfect dynamics
to a linear system with model error systematically
underestimates the uncertainty in the span of the leading r BLVs.
Specifically, EKF-AUS neglects the injection of the errors from the trailing
vectors, ϵ^ku, into the forecast
of the leading vectors, ϵ^k+1f,
represented in Eq. (c). Even when the
uncertainty in the stable BLVs is uniformly bounded ,
error in the trailing BLVs moves up the Lyapunov filtration, and may cause
filter divergence. In perfect, linear models, where uncertainty in the stable
BLVs vanishes exponentially, the injection of error from the stable BLVs into
the unstable subspace results in temporary misestimation although does not
pose an issue to the asymptotic stability .
However, with model error, term (c)
demonstrates that reduced rank Kalman filters must be augmented to correct a
persistent underestimation.
It is important to note that the error in the unfiltered subspace moves
upward through the backward Lyapunov filtration precisely because the
unfiltered subspace is defined by the span of the trailing BLVs, governed by
the invariant upper triangular dynamics. The span of the trailing BLVs is not
equal to the direct sum of the trailing Oseledec spaces, which are themselves
covariant with the dynamics. However, this choice for the unfiltered subspace comes
naturally because the filtered subspace (the image space of
Kk) is given by the span of the leading BLVs, and is equivalent
to the span of the leading covariant Lyapunov vectors see their
Eq. 43.
In principle, data assimilation could be designed to prevent dynamical
upwelling of unfiltered error by defining the unfiltered space to be the
direct sum of the trailing, stable Oseledec spaces. In this case, the
unfiltered error would be covariant with the dynamics and leave the filtered
error unaffected. To achieve this, the filtered space would need to be
defined by the orthogonal complement to trailing Oseledec spaces, i.e., the
span of the leading forward Lyapunov vectors see their
Eq. 43. However, the span of the leading forward Lyapunov
vectors has been numerically shown not to contain the largest mass of the
uncertainty . Similarly, uniformly completely observing the
leading d≥n0 forward Lyapunov vectors has been numerically shown to
put a weaker constraint on the growth of the uncertainty than uniformly
completely observing the leading d≥n0 BLVs .
Furthermore, the forward Lyapunov vectors are defined by the recursive QL
factorization , and the lower triangular dynamics for the
forecast error would transmit filtered uncertainty to the unfiltered
subspace, creating a dynamic downwelling which cannot be accounted for in the
ensemble subspace. These results suggest that it is preferable that the
unfiltered space is equal to the span of the trailing BLVs, or equivalently,
that the filtered space is defined equal to the span of the leading
covariant/backward Lyapunov vectors.
With the recursive form of the filtered error in
Eq. (), we directly compute the covariance of
the filtered error, and the cross covariance of the filtered and unfiltered
error, in the basis of BLVs. We define the operators
Φk+1≜Uk+1fu-Uk+1ffK^kHkEku,Σk≜Ir-K^kHkEkfB^kffIr-K^kHkEkfT
+K^kRkK^kT,
where Φk is the operator which describes the propagation
of unfiltered error into the filtered space and the operator
Σk corresponds to the analysis error covariance for the
standard KF, written in the basis of BLVs.
We first consider the recursion for the cross covariance. In particular, by
combining Eq. () and
Eq. (), we obtain
B^k+1fu=Φk+1B^kuuUk+1uuT+Q^k+1fu+Uk+1ffIr-K^kHkEkfB^kfuUk+1uuT.
We now consider the covariance of the forecast error in the filtered
variables. Using the identity in Eq. () we derive the
recursion for the filtered error covariance
B^k+1ff as
Bk+1ff=Uk+1ffΣkUk+1ffT+Q^k+1ff+Φk+1B^kuuΦk+1T+Uk+1ffIr-K^kHkEkfB^kfuΦk+1T+Φk+1B^kufIr-K^kHkEkfTUk+1ffT.
When the filtered space is the whole space, i.e., Ekf=Ek, term () entirely describes the
evolution of the forecast error in the basis of BLVs – this is indeed just
the forward propagation of the analysis error covariance for the KF. Term
() represents the contribution of uncertainty from the
unfiltered subspace, propagated via the Φk operator, while
terms () and () describe the
forward evolution of the cross covariances of the uncertainty into the
filtered space.
Assimilation in the unstable subspace exact (AUSE)
Having derived the exact error covariance associated to the reduced rank
Kalman estimator, characteristic of the ensemble-based Kalman gain in
geophysical models, we will summarize the result.
Definition 4. For all k, let the matrix
Bk be decomposed as in Eq. (). Then,
define the recursive relationship
K^k=BkffEkfTHkTHkEkfBkffEkfTHkT+Rk-1,B^k+1uu=Q^k+1uu+Uk+1uuB^kuuUk+1uuT,Φk+1=Uk+1fu-Uk+1ffK^kHkEku,B^k+1fu=Φk+1B^kuuUk+1uuT+Q^k+1fu+Uk+1ffIr-K^kHkEkfB^kfuUk+1uuT,Σk=Ir-K^kHkEkfB^kffIr-K^kHkEkfT+K^kRkK^kT,B^k+1ff=Uk+1ffΣkUk+1ffT+Q^k+1ff+Φk+1B^kuuΦk+1T+Uk+1ffIr-K^kHkEkfB^kfuΦkT+ΦkB^kufIr-K^kHkEkfTUk+1ffT,
to be the Kalman filter, assimilation in the unstable subspace exact
(KF-AUSE) Riccati equation, for a filtered subspace of dimension
1≤r<n.
Proposition 1. Assume that a Gaussian prior
distribution is given for x0, the state of the system defined by
Eq. (). Assume that the initial uncertainty,
ϵ0, is of mean zero and covariance B0,
and suppose observations of the state are given as in
Eq. (). Let Kk be defined as the Kalman
estimator restricted to the span of Ekf (rank 1≤r<n) as in Eq. (). Then, the forecast error defined by
Eq. () is Gaussian, mean zero, with covariance
matrix defined recursively by the KF-AUSE Riccati equation,
Eq. ().
Proof. Proving the covariance is given by
Eq. () is the content of Sects.
and . That the error is mean zero and Gaussian is
easily proven by induction.
It should be noted that the KF-AUSE Riccati equation is also valid for the
exact forecast error covariance of a reduced rank Kalman filter in perfect
models, where Qk≜0n for all k. Let
r=n0, Qk≜0n and Pk≜EkfΓkEkfT be defined as the estimated
forecast error covariance for EKF-AUS , then the
recursion is defined by
Γk+1≜Uk+1ffIr-K^kHkEkfΓkIr-K^kHkEkfTUk+1ffT+Uk+1ffK^kRkK^kTUk+1ffT,
analogous to term (). Comparing
Eqs. () and (), we see that even in
perfect models the estimated error covariance of EKF-AUS in the filtered
subspace and the exact error covariance do not agree, i.e.,
Γk+1≠B^k+1ff.
This is because the estimated AUS error in Eq. ()
neglects the upwelling of the initial error in the unfiltered subspace,
described by terms (), () and
(). However, the unfiltered error decays exponentially
and the misestimation in the filtered space does not threaten filter
stability: the AUS estimated error covariance converges to the exact error in
its asymptotic limit, although possibly arithmetically
.
Discussion: dynamical upwelling and covariance inflation
We emphasize that the KF-AUSE Riccati equation () is not
intended to provide a computational advantage – its computation requires
knowledge of error in the unfiltered subspace and, in nonlinear models, a
full rank representation of the tangent-linear dynamics. Nonetheless, this
recursion is demonstrative of an important concept: for a reduced rank Kalman
estimator that applies its analysis update in the span of the leading BLVs,
the exact error in the same span always depends on the unfiltered error in
the trailing vectors. This dependence is explicitly described by the terms
()–() for the recursion on the
filtered error covariance in KF-AUSE, representing the missing terms in the
Kalman filter recursion necessary to describe the exact uncertainty in the
ensemble span.
Thus, the upwelling of uncertainty from the unfiltered subspace to the ensemble
span highlights a dynamical mechanism, and provides a theoretical
motivation for why covariance inflation in the EnKF has been successful in
preventing filter divergence. In certain scenarios, due to the neglected
upwelling terms in the standard Kalman filter error recursion, covariance
inflation may emulate the process of upwelling in the ensemble span,
replicating the increased uncertainty in the ensemble span due to the
injection of terms ()–(), or
otherwise ameliorate the effect of neglecting these terms.
Generally, the reasons for using covariance inflation in the EnKF are wide,
including the treatment of model error, sampling error, intrinsic bias and
non-Gaussianity of error distributions see Sect. 2.2 for a
survey. However, Eq. ()
demonstrates that even when excluding nonlinearity, non-Gaussianity, and
intrinsic deficiencies of the EnKF, the exact correction to the error in the
ensemble span requires the covariance of the unfiltered error as well as the
cross covariance of the error in the filtered and unfiltered subspaces, as in
Eq. (). In practice, one must find a suitable
approximation of the upwelling phenomenon to prevent the systematic
underestimation of the forecast error, and/or, extend the rank of the
ensemble-based correction to control the transient growth of errors in the
stable modes.
Reduced rank Kalman filters have previously corrected for the upwelling of
model errors with both multiplicative and additive covariance inflation
methods. Although it was not explicitly formulated as such, the SEEK filter
of can been seen to compensate for model errors originating
in the unfiltered, stable subspace: while components of the model error
covariance which are orthogonal to the filtered subspace are left neglected,
there is an implicit treatment by utilizing its forgetting factor to inflate
the variance of the estimated error in the filtered subspace
. The contribution of the unfiltered error to the
estimated error was also studied in ensemble methods by
, in which the authors explored sampling
methodology to compensate for the unresolved model errors, residing outside
of the ensemble span. Our work adds to this discussion, now highlighting the
explicit mechanism which these covariance inflation techniques have
compensated for.
The dynamical upwelling of model error differs from the misrepresentation of
the covariance due to truncation error or sampling error induced by nonlinear
dynamics in perfect models, treated in the modified EKF-AUS-NL
and in the finite size ensemble Kalman filter,
(EnKF-N) . We have
shown that the upwelling of the unfiltered error through the Lyapunov
filtration acts as a linear effect and is acute in the presence of additive
model errors which are excited by transient instabilities. While the effect
of the dynamical upwelling could be neglected in perfect models
, the work of has
demonstrated that transient instability in the span of the stable BLVs can
drive the unfiltered error to become impractically large; furthermore, this
error is transmitted into the filtered subspace, driving filter divergence if
it is left uncorrected. However, the significance of perfect models is not
lost: if the dimension of the filtered space is sufficiently large such that
dynamical stability rapidly dissipates unfiltered errors, the effect of the
upwelling may become negligible.
Without otherwise augmenting the ensemble-based Kalman gain, the upwelling of
uncertainty into the filtered space can, in certain scenarios, be emulated
with multiplicative inflation. In the following section, we numerically
explore the interaction of the filtered subspace rank, the stability in the
unfiltered directions, and multiplicative covariance inflation in relation to
the effect of dynamical upwelling in reduced rank Kalman filters. However,
while the results of Sect. empirically validate the
hypothesis that multiplicative inflation can compensate for unrepresented
dynamical upwelling, they also reveal how multiplicative inflation may be
obviated by less ad hoc methods. Likewise, the observed structure of the
reduced rank covariance suggests that, even when the upwelling of error is
well parameterized, the greatest driver of forecast uncertainty may be due to
the presence of unfiltered errors in the trailing BLVs – this will be the
subject of the discussion in Sect. .
Numerical results
Experimental setup
We will explore two different discrete model configurations in which we vary
the effect of nonlinearity. In the continuous model configuration with
stochastic differential equations, we also achieve qualitatively similar
results which will not be included. It is important to remark that the
analytic form for the forecast error in Eq. () is only a
useful representation for weakly-nonlinear evolution of error, corresponding
to the error evolution of the EnKF on short timescales. As the effect of
nonlinearity is increased, the linear approximations utilized in our work
will no longer be adequate, leading to truncation errors as discussed by,
e.g., .
In the following, we use two different formulations of the standard Lorenz 96
equations (L96) , commonly used in data assimilation
literature (see, e.g., and references therein). For
each m∈{1,⋯,n}, the (L96) equations read
dxdt≜L(x),
Lm(x)=-xm-2xm-1+xm-1xm+1-xm+F
such that the components of the vector x are given by the variables
xm with periodic boundary conditions, x0=xn, x-1=xn-1 and
xn+1=x1. The term F in L96 is the forcing parameter.
The tangent-linear model is governed by the equations of
the Jacobian matrix, ∇L(x),
∇Lm(x)=0,⋯,-xm-1,xm+1-xm-2,-1,xm-1,0,⋯,0.
Discrete linear experiments
In linear experiments, we construct a discrete, linear model from the L96
system. Fixing the system dimension n≜10, the linear propagator
in our model Mk is generated by computing the discrete, tangent-linear
model from the resolvent of the Jacobian equation,
Eq. (). In generating the discrete, tangent-linear model,
the discretization time between observations is fixed at δk≜δ=0.1 for all k. We numerically integrate the Jacobian equation
with a fourth-order Runge–Kutta scheme with a fixed time step of
h≜0.01. For the forcing value of F=8, with 10 dimensions,
there are three unstable, one neutral and six stable Lyapunov exponents,
i.e, n0=4. The observation error covariance Rk, model error
covariance Qk and observation operator Hk are all
fixed as the identity I10 in this setup for simplicity.
Discrete nonlinear experiments
In our experiments with the discrete extended Kalman filter for nonlinear
systems, we use Eq. () directly for our model state evolution,
and fix the state dimension to n≜40. For the 40 dimensional L96,
with standard forcing F=8, the unstable neutral subspace is of dimension
n0=14, with one neutral Lyapunov exponent. The nonlinear trajectory is
integrated with a fourth-order Runge–Kutta scheme, with a fixed step size of
h≜0.05, and an interval between observation times of δk≜δ=0.1. At each observation time, before observations are
given, the true trajectory is perturbed (in model space) by additive Gaussian
noise with a prescribed covariance Q, fixed in time. In general,
the random model noise can be injected at different intervals than the
interval between observations, affecting the nonlinearity of the error
evolution. However, we fix these intervals to be equal for simplicity.
Let us define the nonlinear map Ψ(t0,t1):Rn→Rn to be the flow map, generated from
Eq. (), that takes the model state from time t0 to t1.
Then, noting that Ψ(t,t+δ)=Ψ(s,s+δ) for all t and s, we will define Ψδ≜Ψ(0,δ). Thus, in our experiments, the “truth”
is evolved via the equation,
xk+1=Ψδ(xk)+wk+1,
wk+1∼N(0,Q), while the mean trajectory of
the “model” state is given by the deterministic evolution,
xk+1b=Ψδ(xkb).
In our experimental design, the extended Kalman filter estimates the state of
the nonlinear “true” state, perturbed by the noise wk,
Eq. (), and Mk+1 (the linear propagator for
the covariance forward evolution) is defined by the map
Mk+1≜∇Ψδxkb.
The matrix Q is defined by the circulant matrix with c0=0.5,
c1=0.25, c2=0.125, c39=0.25, c38=0.125 and all other entries
zero,
Q≜c0c39⋯c2c1c1c0c39c2⋮c1c0⋱⋮c38⋱⋱c39c39c38⋯c1c0.
The choice of the circulant matrix reflects the stationary statistics and
periodic nature of the L96 model, and the fact that we wish to highlight the
effect of analytically resolving complex model error. The observation error
covariance matrix is fixed as 0.25*I40. The observation
operator is fixed in time as Hk≜I40.
This experimental configuration is mathematically consistent with the
extended Kalman filter for a discrete nonlinear map with model error, and is
a standard formulation for model error twin experiments, utilized by e.g,
Mitchell and Carrassi (2015) and Sakov et al. (2018), with the configuration
using the circulant covariance matrix, Q, drawn specifically from
. The interval between observations δ
controls the nonlinearity of the evolution of the forecast errors in the combined
forecast/analysis data assimilation cycle. Our chosen configuration for the
observation interval, and the interval of random forcing in the model, can be
considered weakly-nonlinear.
Linear Kalman filter
In a linear setting, we compute the exact forecast error covariance of
KF-AUSE via the recursive Riccati equation, Eq. (), and
compare it with that of the KF, for which the filtered space is the entire
model space. This illustrates the performance of a rank deficient filter
where the forecast error is treated analytically, without misestimation of
the error covariances. We compute the average eigenvalues of the forecast
covariance matrix for each the KF and KF-AUSE over 105 parallel forecast
cycles and examine the stratification of the uncertainty in a basis of BLVs,
i.e., how strongly the covariance projects into each direction. Specifically,
for both the KF and KF-AUSE we compute the average projection coefficient of
the forecast error covariance into the i-th BLV at each forecast time,
EkiTBkEki, and
average this coefficient over k.
In Fig. , the averaged eigenvalues of the KF and KF-AUSE
forecast error covariance are plotted, with triangle markers, differentiated
by color. In each subplot, the KF remains the same but we vary the dimension
of the filtered subspace, r, for KF-AUSE. In the top left panel of
Fig. the number of corrected modes is equal to n0,
corresponding to correcting the error in the unstable–neutral subspace.
Here, the leading eigenvalue of the forecast uncertainty of KF-AUSE is orders
of magnitude above the forecast uncertainty in the KF. This should be
contrasted with perfect models where, asymptotically, there can only be four
non-zero eigenvalues and, under
generic conditions, the KF and EKF-AUS will coincide
. In accordance with the results of
, correcting error in the first stable mode (r=5) brings
a substantial reduction in forecast uncertainty (see top right
Fig. ). We see that the forecast uncertainty also
diminishes as each additional mode is corrected, as the KF-AUSE covariance
converges to that of the KF.
Eigenvalues of the KF and KF-AUSE forecast error covariance plotted
with triangles. Projection coefficients of the KF-AUSE forecast error
covariance plotted with crosses. Dimension of the KF-AUSE filtered subspace is
r. Note the log scale of the y axis.
It is of special interest how the projection coefficients of the forecast
error covariance relate to the dimension of the filtered subspace, r. In
the KF, the projection coefficients are closely aligned with the eigenvalue
profile, descending in the order of the Lyapunov exponents, and this line is
not pictured due to the redundancy. However, in the forecast error covariance
of KF-AUSE, the leading uncorrected stable mode is the dominant direction for
the uncertainty among the BLVs, systematically across n0≤r<n, with
a projection coefficient on the order of the leading eigenvalue. This
distinguishes the setting of additive model error from perfect models where
the projection coefficients of the forecast error covariance in the stable
BLVs will be zero asymptotically .
Discrete extended Kalman filter
In our experiments with the discrete extended Kalman filter, we compute the
analysis root mean square error (RMSE) of each of the following: (i) the full rank extended
Kalman filter (EKF), (ii) the EKF-AUS and (iii) the EKF-AUSE, for which
Eq. () is used to compute the estimated covariance and
rank r gain. We will study the effect of analytically resolving the
unfiltered error as compared with the straightforward implementation of
EKF-AUS, which will make no correction to account for the unfiltered error in
the trailing BLVs, or its upwelling into the leading BLVs.
Recall that EKF-AUS has historically only been studied without additive model
errors – we implement EKF-AUS in the presence of model error by computing a
rank r estimated error covariance, which includes the projection of the
model error covariance, Qk into the span of the leading BLVs in
the forecast Riccati equation, i.e.,
EkfTQkEkf=Q^kff. This
corresponds to utilizing only the first line of the recursion for
B^kff, Eq. (), to compute
the estimated forecast error covariance of EKF-AUS. Thus, the implementation
of EKF-AUSE differs by utilizing a full rank ensemble of anomalies to compute
the complete Riccati equation, Eq. ().
We study the performance of EKF-AUS/E
when the dimension of the filtered
subspace is greater than, or equal to, the dimension of the unstable–neutral
subspace; the case r<n0 will trivially lead to divergence
. In Fig. , we plot the
analysis RMSE of EKF-AUS and EKF-AUSE with triangles and crosses,
respectively, while we vary over the dimension of the filtered subspace, with
the RMSE computed over 105 analysis cycles.
To benchmark the performance of EKF-AUS/E, we plot the observation error
standard deviation and the analysis RMSE of the standard, full rank EKF in
horizontal lines – the algorithms for EKF-AUS/E are tantamount to a change
of basis for the EKF when the filtered subspace is equal to the full space;
thus, this is the logical point of comparison. We are interested in
finding the necessary dimension of the filtered subspace such that EKF-AUS/E
has an RMSE which (i) performs better than the observation error standard
deviation and (ii) performs comparably to filtering the entire space. When
the RMSE of EKF-AUS/E falls below the observation error standard deviation,
the filter has a forecast performance superior to initializing observations
directly in the model; when it performs similarly to the EKF, the filter can be
considered close to optimal performance, while utilizing a suboptimal
correction based on only r<n directions.
Analysis RMSE of EKF-AUS plotted with triangles and EKF-AUSE plotted
with crosses, varying over the rank of the suboptimal gain. Horizontal lines are
the observation error standard deviation and the EKF analysis RMSE. Note the log
scale of the y axis.
In Fig. , when the dimension of the filtered subspace
reaches 28, the difference between EKF-AUS/E and the full-rank EKF becomes
negligible. The RMSE of each filter is as follows: (i) EKF is approximately
0.198; (ii) EKF-AUS, r=28, is approximately 0.213; and (iii) EKF-AUSE,
r=28, is approximately 0.205. The fact that EKF-AUS obtains near optimal
performance, representing the uncertainty in the leading r=28 BLVs while
neglecting the remaining, corroborates the claim of : in
the presence of model noise, the filter correction should also incorporate
weakly stable directions that can be instantaneously unstable. However, it is
of particular interest that the convergence of EKF-AUSE to the skill of the
full rank EKF is substantially faster: EKF-AUSE obtains adequate filter
performance (RMSE lower than observation error standard deviation) by
correcting the error in only 16 BLVs while EKF-AUS requires a correction of
rank 19. For other scalings of the matrix Q, multiplying
Q by 0.1, 0.2, 1.5, 2, changing the observation dimension,
e.g., d=20 or d=30 and by varying the time between observations, e.g.,
δk=0.01 or 0.5, we obtain qualitatively similar results, that are
not pictured here. The profiles of the curves in Fig. are
similar across these experimental configurations: the RMSE of EKF-AUSE is
improved over EKF-AUS by analytically resolving the effect of the upwelling,
and the RMSE approaches an adequate/optimal level with a smaller dimension
for the filtered space. We emphasize again that EKF-AUSE does not represent a
computational advantage as a full rank set of perturbations is used to
describe the analytic form for the upwelling of the error.
We look at the behavior of the local Lyapunov exponents for the L96 model to
explain the convergence of EKF-AUS to the full rank EKF. In
Fig. we show the box plot statistics of the local
Lyapunov exponents for exponents 14 through 28 of the L96 model. Exponent
λ14=0, and the remaining pictured exponents correspond to the
leading, stable BLVs. We emphasize that the local Lyapunov exponents of
λ15 through λ19, although they have a negative mean,
are sufficiently unstable locally that EKF-AUS diverges when it disregards
the upwelling of the error from these asymptotically stable modes.
Box plot statistics of the local Lyapunov
exponents, for Lyapunov exponents 14 through 29, over 105 realizations
for the 40 dimensional L96 model. The mean (i-th Lyapunov exponent) is
plotted as a triangle with median the horizontal line. Box contains inner two
quartiles of realizations, with whiskers extending to 1.5 the inner quartile
width from the third and first quartile. Outliers are realizations outside of
this range, plotted individually.
When the filtered subspace for EKF-AUS is of dimension 19, such that the
leading unfiltered BLV corresponds to λ20, all unfiltered Lyapunov
exponents have over 75 % of local realizations strictly stable; this
corresponds to the rank when EKF-AUS has adequate performance. Likewise, the
difference between EKF-AUS/E and the EKF is negligible when the leading
unfiltered BLV corresponds to λ29, with only 1.51 % of its
local realizations being non-negative. These findings are consistent with the
results from : in the presence of model error, unconstrained
forecast error is strongly forced by the error in BLVs, which are
asymptotically stable, but experience strong and frequent local
instabilities.
Finally, we are interested in how analytically computing the upwelling of
error from the unfiltered subspace, as in EKF-AUSE, compares with a
homogeneous, multiplicative inflation applied to the EKF-AUS algorithm.
Multiplicative scalar inflation is among the most common approaches to
mitigate for sampling and model error in Kalman filtering methods, and it is
widely used in operational environmental forecasts utilizing the EnKF. We
define Pk≜EkffTΓk+Q^kffEkff to be
the estimated forecast error of EKF-AUS, where Γk is
defined in Eq. (). The inflated covariance
PkI is defined as PkI=EkffTαΓk+Q^kffEkff for some
chosen scalar α. The inflated covariance PkI is
used to compute the reduced rank gain, as a simple way to compensate for the
underestimation of the forecast error when using the recursion in
Eq. (). Furthermore, the inflated covariance is
subsequently used in the recursion for the analysis and forecast error
covariances.
From the results in Fig. , we select the dimension of the
filtered subspace to be 17, such that EKF-AUSE has RMSE below the observation
error standard deviation while EKF-AUS (without inflation) has diverged. In
Fig. , we plot the analysis RMSE of EKF-AUSE, with
filtered subspace dimension 17, the observation error standard deviation and
the full-rank EKF analysis RMSE as in Fig. as horizontal
lines. Additionally, we plot the analysis RMSE (y axis) of EKF-AUS as a
function of the inflation value (the x axis) applied to the forecast error
covariance. The inflation values, α, are defined as the evenly spaced
points in the interval [1,4] at increments of 0.1, denoted by triangles.
The RMSE is again computed over 105 forecast cycles.
Analysis RMSE of EKF-AUS (y axis),
correction rank 17, with multiplicative inflation plotted versus the
inflation value α (x axis). Horizontal lines are the observation
error standard deviation, EKF-AUSE and EKF analysis RMSE. Note the log scale
of the y axis.
Figure distinctly highlights the impact of including
multiplicative inflation to EKF-AUS: the performance of EKF-AUS with
inflation quickly becomes comparable to the analytically resolved EKF-AUSE,
which in this case, represents the lowermost bound for the RMSE of EKF-AUS
with homogeneous inflation. The lowest RMSE for EKF-AUS with inflation,
realized in Fig. , is approximately 0.322 compared to
the RMSE of EKF-AUSE, approximately 0.304. Figure
confirms the role of multiplicative inflation as compensating for the
upwelling of unfiltered error under weakly-nonlinear error growth, and
explains the underlying dynamical mechanism: multiplicative inflation brings
the estimated forecast error covariance of EKF-AUS closer to the covariance
given by EKF-AUSE.
Discussion: the reduced rank KF covariance and gain augmentation
found evidence that additive inflation could better
compensate for the effects of unresolved model error, while multiplicative
inflation is best suited to account for sampling error, consistent with what
was noted by and . This
hypothesis is supported by our results as follows. The combination of rank
deficiency of the analysis and the presence of additive model error leads to
a persistent, residual unfiltered uncertainty, and its resultant upwelling
into the ensemble span of the EnKF. The dynamical upwelling forms the basis
for a systematic underestimation of the uncertainty in the ensemble space, as
demonstrated in Fig. . This can be compensated for with
multiplicative inflation in the ensemble span, which emulates the additional
uncertainty that is neglected in the standard, reduced rank Kalman filter
recursion – this effect is exhibited in Fig. .
Figure gives a conceptual diagram of the number of
samples (ensemble members) needed to prevent divergence of the EnKF in
different dynamical regimes, and the effect of multiplicative inflation on
this requirement.
Conceptual representation of the number of
samples necessary to prevent divergence of the EnKF in different filtering
regimes. Dark green represents near-optimal filter performance and dark red
represents filter divergence. In perfect-linear models, only n0 samples
are needed for an asymptotically optimal performance. Without inflation, in
noisy linear and perfect, weakly-nonlinear regimes, near optimal performance
can be obtained by correcting error in all modes up to the moderately-stable
BLVs – here nws corresponds to the number of
unstable/neutral/weakly-stable modes, while nms furthermore
includes moderately-stable modes. Additional samples may be necessary to
control error growth with noisy, weakly-nonlinear evolution. Multiplicative
inflation corrects for the upwelling from the uncorrected stable modes so
that near optimal performance can be obtained when the error growth in
unstable/neutral/weakly-stable modes are corrected.
However, multiplicative inflation (in the ensemble span) neglects the
fundamental issue that the unfiltered error lying outside of the ensemble
span can be the major driver of the uncertainty in a reduced rank filter with
model error. Figure shows that when the upwelling is
analytically resolved, the largest uncertainty typically lies in the leading
unfiltered BLV, even when this is an asymptotically stable mode. We provide a
conceptual, two-dimensional visualization of the difference between the
standard (full rank) Kalman filter forecast error covariance and the reduced
rank Kalman filter forecast error covariance in Fig. .
Note that the shape of the reduced rank Kalman filter forecast error
covariance may depend strongly on the model error covariance, Q.
Conceptual diagram of the shape of the exact
forecast error covariance of the full rank Kalman filter and the exact
reduced rank Kalman filter. The U axis represents the span of the
unstable-neutral BLVs, where the forecast uncertainty projects most strongly
in the standard (full rank) Kalman filter. The S axis represents
the span of the stable BLVs, where the uncertainty is the largest (although
bounded), for a reduced rank Kalman filter that neglects corrections to these
modes. The comparison between the full rank and reduced rank Kalman filter
covariance corresponds to the behavior exhibited in the curves in
Fig. .
Generally, unless local Lyapunov exponents in the unfiltered space are
strongly stable and thereby rapidly dissipate the unfiltered perturbations of
model error, transient instabilities can make the unfiltered errors large
enough to prevent useful state estimates . This is
evidenced in Fig. where neither EKF-AUSE nor EKF-AUS,
with multiplicative inflation, achieve an RMSE comparable with the full rank
EKF. For this reason, it is highly pertinent to explore the role of
augmenting the EnKF gain with a suboptimal correction which provides some
control on the transient error growth in the orthogonal complement to the
ensemble span. Ideally, some constraint on the unfiltered error, even if
suboptimal, would further close the gap between the RMSE of EKF-AUSE and EKF
in Fig. .
This issue of instability forcing unfiltered error is even more acute in
practice. If an EnKF applies a correction of a rank less than the number of
unstable and neutral Lyapunov exponents, it has been found that the filter's
estimated error can become small while the filter permanently loses track of
the true trajectory . This behavior is easily understood in
terms of the filter's failure to correct the error growth in the span of at
least one of the unstable-neutral BLVs. For large geophysical models,
computational limitations may prohibit the use of an ensemble of a size
sufficient to even span the unstable–neutral subspace, let alone the weakly-stable
modes which exhibit transient instabilities. In this case, the
unfiltered error in the unstable–neutral modes can grow, possibly
exponentially, and the filter may experience catastrophic filter divergence,
due to the failure of the ensemble-based gain to correct the error in the
span of all the unstable–neutral BLVs .
Hybridization of the ensemble-based gain and additive inflation of the
ensemble-based covariance are two historical methods for compensating for the
inability to correct for instabilities outside of the ensemble span. In
hybridization, the ensemble-based Kalman estimator is augmented by a static,
climatologically based estimator – using a background climatological
covariance, the rank of the estimator used for the analysis update is
increased, and has the effect of applying a correction to additional modes
outside of the ensemble span . Likewise, the use of
additive, random perturbations to the ensemble-based covariance has been
shown to prevent filter divergence by rectifying the rank deficiency of the
covariance, which in turn rectifies the rank deficiency of the ensemble-based gain
.
However, there is considerable difficulty in mathematically analyzing the
exact recursive form for a suboptimal augmentation of the ensemble-based
covariance and ensemble-based Kalman gain. Although the dynamical upwelling
of errors is a generic dynamical feature of these systems, the one-way
dependence of the error in the leading BLVs on the trailing BLVs does not
persist, due to the introduction of estimation errors into the trailing modes
via the augmented gain. Moreover, the surrogate covariance used to constrain
error in the trailing BLVs will not generally agree with the exact error
covariance in the trailing BLVs, making a closed form more difficult to
derive. In this setting, it may be more appropriate to derive heuristic
methods which attempt to (i) provide some corrections in the trailing BLVs,
albeit suboptimal, (ii) describe the dynamical upwelling of the residual
error from the trailing BLVs into the leading BLVs and (iii) describe the
cross covariances, between the leading and trailing BLVs, with respect to the
corrections.
Multiplicative inflation may be used in this case to account for
misestimation of forecast errors resulting from these approximations, but
this misestimation can also be accounted for using less ad hoc approaches
including parameterizing this error with hyperpriors
. We argue that the hyperprior in the EnKF-N can,
in principle, also be selected to take the dynamical upwelling
exhibited by KF-AUSE into account. Recently, an extension of the EnKF-N to the presence of
model error has utilized an adaptive multiplicative inflation term to
compensate for model errors , but we suggest that
an alternative approach including gain augmentation suggested in
Sect. 7, and a hyperprior parameterizing the resulting
error distribution, including dynamical upwelling, would be a logical
extension for future research.
Conclusions
Assimilation in the unstable
subspace (AUS) has provided a useful conceptual framework for understanding
the dynamical properties of data assimilation cycling in perfect models. Both
numerical and mathematical results have confirmed the underlying hypothesis
of Anna Trevisan: in the setting of perfect, chaotic models, the evolution of
uncertainty is confined to a space characterized by non-negative Lyapunov
exponents, typically of much lower dimension than the full model state space
. In ensemble data assimilation, we see that the
asymptotic characteristics of the anomalies exhibit these properties, which
can be exploited to reduce the computational burden of the assimilation cycle
. This phenomena has recently also been utilized to
reduce the numerical cost of synchronization in dynamical shadowing based
data assimilation methods . Furthermore, the work of
proposed an extension of the
EKF-AUS-NL algorithm to account for parametric model errors.
This paper now demonstrates that the framework of AUS can also be used to
understand the underlying mechanisms for the evolution of uncertainty for
ensemble-based filters in chaotic models with additive errors. Due to the
high dimensional models, and unresolved physical processes, this circumstance
is ubiquitous in high-dimensional geoscience applications where standard
EnKFs are extremely rank deficient. Utilizing the Lyapunov filtration for the
backward vectors, we have shown how unfiltered error, outside of the span of
the anomalies, is transmitted by the dynamics into the filtered subspace. In
perfect models, or when stability in the unfiltered subspace is sufficiently
strong, this effect can be neglected due to the rapid dissipation of
unfiltered errors. However, demonstrate how weakly stable
modes of high variance can go through periods of transient instability,
exciting unfiltered error. The dynamic upwelling of unfiltered error,
characterized by the term (c) in the forecast
error recursion, and by the terms
()–() in the filtered error
covariance, acts as a linear effect on filters with small ensemble sizes.
Under weakly-nonlinear error growth, the span of the anomalies projects
strongly onto the span of the leading BLVs – therefore, the Riccati
equation, Eq. (), highlights an important, and
previously unexplained, mechanism driving the systematic underestimation of
the forecast error in ensemble-based Kalman filters. This mechanism also
explains one reason why, in certain scenarios, covariance inflation has been
successful in preventing filter divergence.
The role of inflation we describe differs from previous studies, e.g., the
work of , which studied the nonlinear interactions of
error in perfect models. The phenomena of dynamical upwelling is also
independent of the misestimation of error due to a finite sample size
representing the error statistics . Rather, we
exhibit an effect which can contribute to filter divergence over short timescales
in ensemble data assimilation when the error dynamics are linear or
weakly-nonlinear, and uncertainty is forced by additive model errors. This
persistent dynamical upwelling of errors from the unfiltered space into the
ensemble subspace is a phenomena which we prove analytically in linear
models, and demonstrate numerically to be a valid approximation of
weakly-nonlinear error growth in nonlinear models for reduced rank extended
Kalman filters.
If we treat the standard EnKF as a Monte Carlo estimate of the error statistics
characteristic of the KF-AUSE covariance, Eq. (), the
dynamical upwelling explains a significant role for covariance inflation in
the EnKF. However, our results also suggest that covariance inflation may
potentially be obviated by (i) sufficiently increasing the ensemble size to
include asymptotically stable modes that produce transient instabilities,
(ii) increasing the rank of the analysis update itself, with a hybridized
gain, (iii) parameterizing the upwelling of error via a hyperprior which
targets the evolution of forecast errors, or (iv) some combination of the
above. The necessary ensemble size to mitigate the effect of transient
instabilities can, in principle, be studied empirically by examining the local
variability of the exponents as in Fig. , and their
forcing on the evolution of perturbations as in the numerical study performed
by in their Sect. 5. However, computational limits on
ensemble sizes in large geophysical models and the non-stationarity of the
system's dynamics can limit the effectiveness of this approach. Thus, our
understanding of the dynamics of error propagation opens new
opportunities in algorithm design, where a combination of the above
techniques may be used directly to ameliorate the effects of dynamical
upwelling, and produce more robust ensemble-based filters.
Where there is dynamical chaos, AUS will continue to be a robust framework
for the theory of data assimilation in physical models. Understanding the
dynamical mechanisms that govern the evolution of error in fully nonlinear
data assimilation, e.g., the unstable–neutral manifolds of a (stochastic)
chaotic attractor, will be the subject of future research and may be
considered the logical extension of the framework put forward by Anna
Trevisan – her insight to the underlying processes in assimilation will
continue to provide inspiration to both developers and practitioners of data
assimilation methods.