Controlling balance in an ensemble Kalman filter

We present a method to control unbalanced fast dynamics in an ensemble Kalman filter by introducing a weak constraint on the imbalance in a spatially sparse observational network. We show that the balance constraint produces significantly more balanced analyses than ensemble Kalman filters without balance constraints and than filters implementing incremental analysis updates (IAU). Furthermore, our filter with the weak constraint on imbalance produces good rms error statistics which outperform those of ensemble Kalman filters without balance constraints for the fast fields.


Introduction
In data assimilation one seeks to find the best estimate of the state of a dynamical system given a forecast model with possible model error and noisy observations at discrete observation intervals (Kalnay, 2002). This estimate is coined the analysis. This procedure, however, does not necessarily produce dynamically consistent analyses. In particular, the analysis may contain unbalanced gravity waves, which are absent in the true atmospheric state and which may spoil the subsequent forecast initialised with these dynamically inconsistent states. Ever since the early days of numerical weather prediction the creation of imbalance has been central to the problem of producing reliable forecasts (see for example Daley, 1993, Chapter 6, and Lynch, 2006 for a historical account). The heuristic reasoning behind the occurrence of unbalanced analyses is that there may be several states of the fast variables which are compatible with the observations of the slow state variables, most of them corresponding to unbalanced states. Furthermore, unbalanced states can be generated by the discontinuous nature of the data assimilation procedure, leading to unphysical readjustment processes of analyses by the subsequent nonlinear forecast model (Bloom et al., 1996; Ourmières et al., 2006). Examples of the creation of imbalance in variational data assimilation schemes are given, for example, in Bloom et al. (1996) and Lorenc (2003b). In the context of ensemble filters, unbalanced analyses are further created by the procedure of localisation which was introduced by Houtekamer and Mitchell (1998, 2001), Hamill et al. (2001), Ott et al. (2004) and Szunyogh et al. (2005) to mitigate spurious cross-correlations in the covariance matrices due to finite ensemble sizes. Localisation of any type can potentially cause imbalance in the initial conditions (Cohn et al., 1998; Lorenc, 2003a; Mitchell et al., 2002; Houtekamer and Mitchell, 2005; Oke et al., 2007; Kepert, 2009; Greybush et al., 2011).
There exist several strategies to combat undesired unbalanced analyses. These strategies can be divided into those which employ a re-balancing procedure after the data assimilation, and those which try to create balanced analyses within the data assimilation process itself. Post-processing methods include digital filtering (Lynch and Huang, 1992) and normal mode initialisation (Machenhauer, 1977; Baer and Tribbia, 1977). Within variational data assimilation algorithms balance constraints can be implemented to ensure sufficient balance (Thépaut and Courtier, 1991; Polavarapu et al., 2000; Gauthier and Thépaut, 2001; Neef et al., 2006; Watkinson et al., 2007; Cotter, 2013). To militate against the effects of intermittent discontinuous assimilations, several filtering approaches have been introduced to render the assimilation procedure more continuous. Bloom et al. (1996) introduced the method of incremental analysis updates (IAU) for 3D-Var in which the analysis increments are distributed over a fixed time window. It has since been applied to ensemble filters, see for example Polavarapu et al. (2004), and has found numerous applications in atmospheric and oceanic contexts (Zhu et al., 2003; Weaver et al., 2003; Ourmières et al., 2006). Bergemann and Reich (2010a) create balanced analyses by using a continuous formulation of the Kalman analysis step (Bergemann et al., 2009; Bergemann and Reich, 2010b). Kepert (2009) modified the covariance localisation procedure so that it respects balance.
Here we will present a novel approach to generating balanced analyses within an intermittent discontinuous data assimilation procedure. We will incorporate prior information on the amount of imbalance to augment given observational information for the slow variables. This implementation of a balance constraint within the data assimilation step eliminates unwanted spurious imbalance, leading to physical analysis states and to an improved analysis skill as measured by the rms error of the fast variables.
In the next section we briefly describe the framework of variance limiting Kalman filters developed in Gottwald et al. (2011) which will form the basis of our imbalance limiting filter. In Sect. 3 we present a modified slow-fast Lorenz-96 model which incorporates balanced dynamics, introduced in Bergemann and Reich (2010a). In Sect. 4 we present results showing how controlling imbalance can produce better skill than current ensemble Kalman filters. We conclude with a summary in Sect. 5.

The variance limiting Kalman filter
Gottwald et al. (2011) introduced a variation of the ensemble Kalman filter, coined variance limiting Kalman filter (VLKF). This filter was designed to control overestimation of error covariances caused by finite ensemble sizes in sparse observational grids. The filter imposes weak constraints on unobserved variables and data voids using climatological information. The effect of the weak constraint was shown to drive the analysis of the unobserved variables towards their climatic mean and furthermore to limit the posterior error covariance of the unobserved variables to not exceed their climatic covariance. This yielded a remarkable increase in the skill, even in the observed variables. The filter has since been used in Mitchell and Gottwald (2012) to control noise at the grid resolution scale caused by model error.
It is our aim here to employ the VLKF to control undesirable imbalance. In general the instantaneous amount of imbalance is not available through direct observations. We assume prior knowledge of the climatological mean and of the climatological covariance of imbalance. This statistical information may be available through historical observational data or through free running simulations. We will use the weak constraint in VLKF on imbalance to drive the analysis towards balance, inhibiting excessive unphysical unbalanced fast energy.
The filter described in Gottwald et al. (2011) and Mitchell and Gottwald (2012) was formulated for large ensemble sizes, ensuring invertibility of the forecast error covariance (a situation not satisfied for data assimilation in operational numerical weather forecast centres). We recast the VLKF here in a form which allows for small ensemble sizes, and redo the derivation in a slightly different manner.
Given a $D$-dimensional dynamical system which is observed at discrete times $t_i = i\,\Delta t_{\mathrm{obs}}$, data assimilation aims at producing the best estimate of the current state given a typically chaotic, possibly inaccurate model $\dot z = f(z)$ and noisy observations of the true state $z_t$ (Kalnay, 2002).
We assume that we are given observations $y_o \in \mathbb{R}^{D_o}$ of the form
$$y_o = H z_t + r_o, \tag{1}$$
where the observation operator $H: \mathbb{R}^{D} \to \mathbb{R}^{D_o}$ maps from the whole space into observation space, and $r_o \in \mathbb{R}^{D_o}$ is assumed to be i.i.d. observational Gaussian noise with associated error covariance matrix $R_o$. Additionally we incorporate climatological information of $D_w$ pseudo-observables, in particular their mean $a_{\mathrm{clim}} \in \mathbb{R}^{D_w}$ and their covariances $A_{\mathrm{clim}} \in \mathbb{R}^{D_w \times D_w}$. In general, it is not advisable to incorporate simultaneously direct observations and climatological information for a variable, as this may spoil the generally more accurate information of the direct observations. In our application here the climatological information will be the mean and covariances of some measure of imbalance, but pseudo-observables may be any subset of unobserved variables or their integrated quantities such as their energy. We assume that we can determine those quantities prior to the data assimilation procedure either through historical data or through long-time numerical simulations. We remark that one may use values other than the climatic covariance to control the analysis error covariance if one interprets the variance constraint merely as a numerical tool to stabilise and regularise the filter. Furthermore, in non-equilibrium situations when climatological information is irrelevant, for example during strong frontogenesis in a weather forecasting context, we may estimate the mean and the covariance of the unobserved pseudo-observables via a running average of the analysis (this requires the analysis to be tracking).
We introduce a pseudo-observation operator $h: \mathbb{R}^{D} \to \mathbb{R}^{D_w}$ which maps from the whole space into the space of the pseudo-observables. The (as yet unknown) error covariance of those pseudo-observations is denoted by $R_w$. Gottwald et al. (2011) considered $D_o + D_w = D$, which will be relaxed here.
The Kalman filter can be formulated as a minimisation problem of the following cost function (e.g. Kalnay, 2002; Simon, 2006) with a given background $z_f$, associated forecast error covariance $P_f$ and direct observations $y_o$:
$$J(z) = \tfrac{1}{2}(z - z_f)^T P_f^{-1}(z - z_f) + \tfrac{1}{2}(y_o - Hz)^T R_o^{-1}(y_o - Hz) + \tfrac{1}{2}(a_{\mathrm{clim}} - hz)^T R_w^{-1}(a_{\mathrm{clim}} - hz). \tag{2}$$
The error covariance matrix $R_w$ is so far undetermined. We will invoke below a constraint on the analysis error covariance, namely that the analysis error covariance projected onto the subspace spanned by the pseudo-observations equals the climatological covariance $A_{\mathrm{clim}}$. The analytical results below reveal that such a constraint cannot be imposed on the whole $D_w$-dimensional unobserved subspace whilst simultaneously ensuring positive definiteness of $R_w$, but only on a $\hat D_w \le D_w$-dimensional subspace of the unobserved subspace. In anticipation of this we introduce here a (so far undetermined) transformation matrix $S_w \in \mathbb{R}^{D_w \times \hat D_w}$. The transformation matrix satisfies $S_w^T S_w = I_{\hat D_w}$ (but not necessarily $S_w S_w^T = I_{D_w}$). We will formulate the filter restricted to this subspace and introduce the transformed pseudo-observation operator $\hat h = S_w^T h$, the transformed error covariances
$$\hat R_w = S_w^T R_w S_w, \qquad \hat A_{\mathrm{clim}} = S_w^T A_{\mathrm{clim}} S_w,$$
and the transformed climatological mean of the pseudo-observations $\hat a_{\mathrm{clim}} = S_w^T a_{\mathrm{clim}}$. We now combine direct observations and pseudo-observations, and write the cost function in the more compact form
$$J(z) = \tfrac{1}{2}(z - z_f)^T P_f^{-1}(z - z_f) + \tfrac{1}{2}(\hat y - \hat H z)^T R^{-1}(\hat y - \hat H z), \tag{3}$$
where we introduced combined observations $\hat y$, the observation operator $\hat H$ and the error covariance matrix $R$ with
$$\hat y = \begin{pmatrix} y_o \\ \hat a_{\mathrm{clim}} \end{pmatrix}, \qquad \hat H = \begin{pmatrix} H \\ \hat h \end{pmatrix}, \qquad R = \begin{pmatrix} R_o & 0 \\ 0 & \hat R_w \end{pmatrix}.$$
The analysis is given as the minimum of the cost function $J(z)$ and is readily calculated as
$$z_a = z_f + K(\hat y - \hat H z_f), \tag{4}$$
where the Kalman gain matrix is given by
$$K = P_a \hat H^T R^{-1}, \tag{5}$$
with the error covariance matrix of the analysis given by
$$P_a = \left(P_f^{-1} + \hat H^T R^{-1} \hat H\right)^{-1}. \tag{6}$$
Using a standard matrix inversion identity (see for example Simon, 2006), the analysis error covariance is recast in a form which does not involve the inverse of the forecast error covariance $P_f$ as
$$P_a = P_f - P_f \hat H^T\left(\hat H P_f \hat H^T + R\right)^{-1}\hat H P_f, \tag{7}$$
and the Kalman gain matrix can be rewritten in the computationally more convenient form
$$K = P_f \hat H^T\left(\hat H P_f \hat H^T + R\right)^{-1}, \tag{8}$$
which involves only taking the inverse of $\hat H P_f \hat H^T + R$. We remark that one can explicitly separate the updates according to the deviations from the observations and the pseudo-observations in the analysis and have
$$z_a = z_f + K_o(y_o - H z_f) + K_w(\hat a_{\mathrm{clim}} - \hat h z_f), \tag{9}$$
with
$$K_o = P_a H^T R_o^{-1}, \qquad K_w = P_a \hat h^T \hat R_w^{-1}.$$
This shows that, weighted by the error covariance of the weak constraint $\hat R_w$, the analysis of the pseudo-observables is driven towards their climatic mean. However, due to the generically global nature of the Kalman gain matrices, the inclusion of climatological information of the pseudo-observables also affects the observed degrees of freedom. So far the error covariance $\hat R_w$ associated with the weak constraint is undetermined. We will now determine $\hat R_w$ and thereby control the variance of the unresolved pseudo-observables $\hat h z$ by requiring that the analysis error covariance, projected onto the pseudo-observables, equals the climatological covariance, i.e.
$$\hat h P_a \hat h^T = \hat A_{\mathrm{clim}}. \tag{10}$$
We rewrite the analysis error covariance (Eq. 6) as
$$P_a = \left(P^{-1} + \hat h^T \hat R_w^{-1} \hat h\right)^{-1}, \tag{11}$$
where we introduced the analysis error covariance $P$ for a standard Kalman filter without any weak constraint, which only combines the forecast with direct observations:
$$P = \left(P_f^{-1} + H^T R_o^{-1} H\right)^{-1}. \tag{12}$$
Upon using the Sherman-Morrison-Woodbury formula $(P^{-1} + \hat h^T \hat R_w^{-1} \hat h)^{-1} \hat h^T \hat R_w^{-1} = P \hat h^T (\hat R_w + \hat h P \hat h^T)^{-1}$ (see for example Simon, 2006), the error covariance matrix for the pseudo-observables $\hat R_w$ is found from the constraint (Eq. 10) to be
$$\hat R_w^{-1} = \hat A_{\mathrm{clim}}^{-1} - \left(\hat h P \hat h^T\right)^{-1}. \tag{13}$$
Expanding Eq. (13) we obtain
$$\left(S_w^T R_w S_w\right)^{-1} = \left(S_w^T A_{\mathrm{clim}} S_w\right)^{-1} - \left(S_w^T h P h^T S_w\right)^{-1}. \tag{14}$$
This makes apparent the role of the transformation matrix $S_w$. $S_w \in \mathbb{R}^{D_w \times \hat D_w}$ can be chosen such that $\hat R_w = S_w^T R_w S_w$, being an error covariance, is positive definite: the transformation matrix $S_w$ projects onto the subspace of the space of pseudo-observables which in a standard Kalman filter would experience an analysis error covariance $h P h^T$ exceeding the climatological covariance $A_{\mathrm{clim}}$. All other $D_w - \hat D_w$ pseudo-observations are discarded in order to ensure a positive definite and invertible error covariance matrix $\hat R_w \in \mathbb{R}^{\hat D_w \times \hat D_w}$. In Appendix A we provide an algorithm to compute $S_w$.
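The variance constraint (Eq. 10) is easiest to see in the special case where both $h P h^T$ and $A_{\mathrm{clim}}$ are diagonal, so that the problem decouples into scalar ones. The following is a minimal sketch under that assumption only; the function name `variance_limiting` and the sample values are hypothetical and are not taken from the paper, and the general non-diagonal case is the one treated in Appendix A.

```python
def variance_limiting(p_diag, a_clim_diag):
    """Scalar (diagonal) illustration of the variance constraint.

    p_diag      : analysis variances of the pseudo-observables from a
                  standard Kalman filter (diagonal of h P h^T)
    a_clim_diag : climatological variances (diagonal of A_clim)
    Returns the constrained indices, their R_w entries, and the
    resulting analysis variances.
    """
    selected, r_w, p_a = [], [], []
    for j, (p, a) in enumerate(zip(p_diag, a_clim_diag)):
        if p > a:
            # 1/R_w = 1/A_clim - 1/P is positive only if P exceeds A_clim
            r = 1.0 / (1.0 / a - 1.0 / p)
            selected.append(j)
            r_w.append(r)
            # scalar update with the pseudo-observation: 1/P_a = 1/P + 1/R_w
            p_a.append(1.0 / (1.0 / p + 1.0 / r))
        else:
            # constraint switched off: this component is discarded
            p_a.append(p)
    return selected, r_w, p_a


# hypothetical values: components 0 and 2 exceed the climatological variance
selected, r_w, p_a = variance_limiting([4.0, 0.5, 2.0], [1.0, 1.0, 1.0])
print(selected, p_a)
```

In this sketch the selected indices play the role of the columns of $S_w$: only components whose unconstrained analysis variance exceeds the climatological value receive a pseudo-observation, and for those the analysis variance is pulled down to the climatological value.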
We formulate the filter in the framework of ensemble Kalman filters (EnKF) (Evensen, 2006; Hamill, 2006) where an ensemble with $k$ members $z_i$ is propagated by the model dynamics according to
$$\dot z_i = f(z_i), \qquad i = 1, \dots, k.$$
The forecast ensemble is split into its mean $\bar z_f$ and ensemble deviation matrix $Z_f$. The ensemble deviation matrix $Z_f$ can be used to provide a Monte Carlo estimate for the ensemble forecast covariance matrix via
$$P_f(t) = \frac{1}{k-1} Z_f(t) Z_f(t)^T.$$
Note that $P_f(t)$ is rank-deficient for $k < D$, which is the typical situation in numerical weather prediction where $D$ is of the order of $10^9$ and $k$ of the order of 100.
At the end of each analysis cycle an ensemble $Z_a$ is generated which must be consistent with the analysis error covariance $P_a$, and satisfies
$$P_a = \frac{1}{k-1} Z_a Z_a^T.$$
In previous work Gottwald et al. (2011) and Mitchell and Gottwald (2012) used the ensemble transform Kalman filter (ETKF) (Bishop et al., 2001; Tippett et al., 2003; Wang et al., 2004), which seeks a transformation $T \in \mathbb{R}^{k \times k}$ such that the analysis deviation ensemble $Z_a$ is given as a deterministic perturbation of the forecast ensemble $Z_f$ via $Z_a = Z_f T$. In order to easily incorporate the localisation needed for small ensemble sizes, we will implement for our VLKF here an approximate square root filter (DEnKF) proposed by Sakov and Oke (2008) where the analysis deviations are determined according to
$$Z_a = Z_f - \tfrac{1}{2} K \hat H Z_f.$$
A new forecast is obtained by propagating $Z_a$ with the nonlinear forecast model to the next observation time, where a new analysis cycle will be started.
We will use here diagonal target matrices $A_{\mathrm{clim}}$ where the diagonal entries are set to the mean value of the diagonal entries of the full climatic covariance. We found that otherwise the variance constraint is not "switched on" sufficiently often to drive the dynamics to the mean $a_{\mathrm{clim}}$ (due to a lack of simultaneous diagonalisability of $A_{\mathrm{clim}}$ and $h P h^T$; cf. Appendix A). This suggests that the variance constraint is a numerical tool to regularise the filter, with the advantage, however, that the regularisation is performed in a dynamically consistent way within the data assimilation procedure, using only dynamical quantities such as measured imbalance.
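As an illustration of the DEnKF analysis step of Sakov and Oke (2008), the sketch below updates the ensemble mean with the full Kalman gain and the ensemble deviations with half the gain. It is a simplified toy under stated assumptions: only the first state component is observed directly, no pseudo-observations are included, and the function name `denkf_update` and the sample ensemble are hypothetical.

```python
def denkf_update(ensemble, y_obs, r_obs):
    """One DEnKF analysis step observing state component 0 only.

    ensemble : list of k state vectors (lists of floats)
    y_obs    : scalar observation of component 0
    r_obs    : observational error variance
    """
    k, dim = len(ensemble), len(ensemble[0])
    mean = [sum(z[i] for z in ensemble) / k for i in range(dim)]
    # ensemble deviations Z_f
    dev = [[z[i] - mean[i] for i in range(dim)] for z in ensemble]
    # column P_f H^T of the sample covariance, with H = (1, 0, ..., 0)
    pf_ht = [sum(d[i] * d[0] for d in dev) / (k - 1) for i in range(dim)]
    # Kalman gain K = P_f H^T (H P_f H^T + R)^-1 (a vector here)
    gain = [c / (pf_ht[0] + r_obs) for c in pf_ht]
    # mean update with the full gain
    mean_a = [mean[i] + gain[i] * (y_obs - mean[0]) for i in range(dim)]
    # deviation update with half the gain: Z_a = Z_f - (1/2) K H Z_f
    dev_a = [[d[i] - 0.5 * gain[i] * d[0] for i in range(dim)] for d in dev]
    return [[mean_a[i] + d[i] for i in range(dim)] for d in dev_a]


# hypothetical 4-member ensemble of a 2-dimensional state
forecast = [[1.0, 2.0], [3.0, 1.0], [2.0, 4.0], [2.0, 1.0]]
analysis = denkf_update(forecast, y_obs=2.5, r_obs=0.5)
mean_x = sum(z[0] for z in analysis) / len(analysis)
var_x = sum((z[0] - mean_x) ** 2 for z in analysis) / (len(analysis) - 1)
print(mean_x, var_x)
```

In the VLKF setting the same update would be applied with the combined operator and error covariance, i.e. with the pseudo-observation rows appended to the direct observations.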

The modified Lorenz-96 model
The Lorenz-96 model (Lorenz, 1996; Lorenz and Emanuel, 1998)
$$\dot x_j = (x_{j+1} - x_{j-2})\,x_{j-1} - x_j + F, \qquad j = 1, \dots, d, \tag{16}$$
with periodic $x_j = x_{j+d}$ is a standard test bed for data assimilation as it is computationally manageable but still incorporates crucial ingredients of real mid-latitude atmospheric flows such as nonlinear energy conservation, advection, forcing and linear damping. Recently, Bergemann and Reich (2010a) introduced a modification of the standard Lorenz-96 model by coupling it to a purely dispersive fast wave equation mimicking the influence of fast gravity waves on slow Rossby waves in a quasi-geostrophic regime. The modified Lorenz system consists of a slow equation for the variables $\{x_j\}$ (Eq. 17) and a fast wave equation for the variables $\{h_j\}$ (Eq. 18). The fast wave part (Eq. 18) is purely dispersive; if the dissipation and the forcing in the slow $x$ equation (Eq. 17) are ignored, the system conserves the total energy (Eq. 19), with coupling strength $0 \le \eta \le 1$. The modified Lorenz-96 system (Eqs. 17-18) contains an approximate slow manifold (Eq. 20), which is obtained by formally setting $\varepsilon = 0$ in Eq. (18). Higher order balance relations could be derived by employing asymptotic theory.
We set the number of degrees of freedom to $d = 40$ and $F = 8$ for the forcing. We consider here weak coupling with $\eta = 0.1$, implying sufficiently nonlinear behaviour of the slow $x$ variables. The "Rossby number" is set to $\varepsilon = 0.0025$ and the "Burgers number" is set to $\alpha^2 = 0.25$. In Fig. 1 we show typical initially balanced fields. Note that the balance relation (Eq. 20) implies that the balanced field $\{h_j\} = (h_1, h_2, \dots, h_d)$ is smoother than $\{x_j\} = (x_1, x_2, \dots, x_d)$ ($h$ is obtained from $x$ via the application of an inverse Helmholtz operator). Note that this is different to the situation in realistic atmospheric models where the fast variables are small scale and rapidly oscillate around the slow manifold.
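For orientation, the slow part of the model can be sketched in a few lines. The code below implements only the standard Lorenz-96 tendency with periodic indices; the coupling to the fast wave equation of the modified model is deliberately not included, and the function name `lorenz96_tendency` is an illustrative choice.

```python
def lorenz96_tendency(x, forcing=8.0):
    """Right-hand side of the standard Lorenz-96 model with periodic indices."""
    d = len(x)
    return [
        (x[(j + 1) % d] - x[(j - 2) % d]) * x[(j - 1) % d] - x[j] + forcing
        for j in range(d)
    ]


d = 40
# the uniform state x_j = F is an (unstable) equilibrium of the model
equilibrium = [8.0] * d
tendency_at_equilibrium = lorenz96_tendency(equilibrium)

# a small perturbation at one site is advected to neighbouring sites
perturbed = list(equilibrium)
perturbed[0] += 0.01
tendency_perturbed = lorenz96_tendency(perturbed)
print(tendency_perturbed[0], tendency_perturbed[2], tendency_perturbed[d - 1])
```

The nonzero tendencies at the neighbouring sites show how the quadratic advection term spreads a local perturbation around the ring.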
We introduce the imbalance operator $B$ which acts on $z$ with $z_j = (x_j, h_j, \dot h_j)$ and which, according to Eq. (20), is zero to leading order if initially so. Figure 2 shows the temporal evolution of the site-averaged imbalance for initially balanced fields with a small value of $\varepsilon = 0.0025$. The figure clearly illustrates that balance is approximately preserved by the dynamics, provided the timescale separation is sufficiently large, i.e. $\varepsilon$ sufficiently small. This justifies the terminology of Eq. (20) defining a slow manifold, as the initially generated imbalance does not interact with the slow variables on long timescales. The situation is very different when the dynamics is interrupted by data assimilation cycles where the data assimilation procedure introduces imbalance. In Cohn et al. (1998), Lorenc (2003a), Mitchell et al. (2002), Houtekamer and Mitchell (2005), Oke et al. (2007), Kepert (2009) and Greybush et al. (2011) the imbalance was associated with the procedure of covariance localisation. In Fig. 3 we show that ensemble filters can generate unbalanced analyses in sparse observational grids due to the intermittent discontinuous analysis updates, even without localisation. We present results for an ETKF with a large ensemble of 1000 members where only the slow $\{x_j\}$ variables are observed, and compare it to the case when all variables $\{x_j\}$, $\{h_j\}$ and $\{\dot h_j\}$ are observed. Whereas in the fully observed case the imbalance $B$ exhibits the actual physical imbalance (cf. Fig. 2), increased imbalance is clearly seen in the sparser observational grid. We remark that this is not a finite size effect and cannot be mitigated by larger ensembles (we tested ensemble sizes of 3000), consistent with results for 3D-Var by Bloom et al. (1996). We remark that for smaller observational noise with $R_o = 0.21$ the imbalance exhibits the same mean values as in Fig. 3 with $R_o = 0.84$. In the next section we explore how this spurious imbalance can be controlled by using the VLKF framework established in Sect. 2.
Fig. 3. Imbalance $B$ of the analysis as a function of analysis cycles for ETKF with 1000 ensemble members, an observation interval of $\Delta t_{\mathrm{obs}} = 2$ h and observational noise error variance $R_o = 0.84$, without covariance inflation and localisation. Results are shown for the case when all variables $\{x_j\}$, $\{h_j\}$ and $\{\dot h_j\}$ of the modified slow-fast Lorenz-96 model (Eqs. 17-18) are observed (blue) and for the case of spatially sparse observations when only $\{x_j\}$ are observed (red).

Numerical results
We now present results from numerical data assimilation cycles of Eqs. (17)-(18). We consider a sparse observational grid in which only every second slow variable $\{x_{2j}\}$ is observed; the variables $\{h_j\}$ and $\{\dot h_j\}$ are not observed. We use $D = 3d = 3 \times 40$, and therefore in the notation of Sect. 2 we have $D_o = 20$. We observe the system in equidistant observation intervals $\Delta t_{\mathrm{obs}}$ ranging from 1 to 6.5 h, adopting the timescales suggested by the standard Lorenz-96 system (Eq. 16), i.e. $\Delta t = 1/120$ roughly corresponds to 1 h (see for example Lorenz and Emanuel, 1998). Observations are contaminated by Gaussian noise, with error variance $R_o = (0.25\,\sigma_{x,\mathrm{clim}})^2 I_{20} = 0.84\, I_{20}$ where $\sigma^2_{x,\mathrm{clim}} = 13.50$ is the climatic variance of $\{x_j\}$. We perform 4000 analysis cycles after a spin-up period of 1000 analysis cycles. All simulations are initialised with balanced data using Eq. (20). To generate the observations and to propagate forward the forecast model (Eqs. 17-18) we employ an implicit midpoint rule with a time step of $dt = 0.0025$ (see, for example, Leimkuhler and Reich, 2005).
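The implicit midpoint rule mentioned above can be sketched generically as follows. This is not the integrator used for the experiments: it solves the implicit equation by plain fixed-point iteration and is applied to a hypothetical fast linear oscillator (frequency `omega = 20`) rather than to Eqs. (17)-(18), purely to show why the scheme is attractive for purely dispersive fast dynamics, namely that it preserves quadratic energies.

```python
def implicit_midpoint_step(f, z, dt, tol=1e-12, max_iter=100):
    """One step of the implicit midpoint rule
    z_new = z + dt * f((z + z_new) / 2), solved by fixed-point iteration."""
    # explicit Euler predictor as the initial guess
    z_new = [zi + dt * fi for zi, fi in zip(z, f(z))]
    for _ in range(max_iter):
        mid = [0.5 * (a + b) for a, b in zip(z, z_new)]
        z_next = [zi + dt * fi for zi, fi in zip(z, f(mid))]
        converged = max(abs(a - b) for a, b in zip(z_next, z_new)) < tol
        z_new = z_next
        if converged:
            break
    return z_new


def oscillator(z, omega=20.0):
    """Fast linear oscillator h'' = -omega^2 h as a first-order system."""
    h, v = z
    return [v, -omega * omega * h]


def energy(z, omega=20.0):
    """Quadratic energy of the oscillator."""
    h, v = z
    return 0.5 * v * v + 0.5 * omega * omega * h * h


state = [1.0, 0.0]
initial_energy = energy(state)
for _ in range(1000):
    state = implicit_midpoint_step(oscillator, state, dt=0.0025)
final_energy = energy(state)
print(initial_energy, final_energy)
```

The energy after 1000 steps agrees with the initial energy up to the tolerance of the fixed-point solve, in contrast to explicit schemes of the same order, which would slowly gain or lose energy.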
Besides the variance limiting Kalman filter VLKF-B, where we impose a climatic constraint on the imbalance $Bz$, we also employ a variance limiting Kalman filter VLKF-$\dot h$, where we impose a climatic constraint on the unobserved fast variables $\{\dot h_j\}$. The climatic mean and variances of $\{\dot h_j\}$ and those of the imbalance $\{(Bz)_j\}$ were estimated through long-time simulations of the full modified Lorenz-96 system (Eqs. 17-18) with balanced initial data as mean $Bz = 0$ and $\sigma^2_{Bz,\mathrm{clim}} = 8.4 \times 10^{-4}$, and mean $\dot h = -0.01$ and $\sigma_{\dot h,\mathrm{clim}} = 224.35$, respectively. We set $a_{\mathrm{clim}} = 0$ and $A_{\mathrm{clim}} = \sigma^2_{Bz,\mathrm{clim}} I_{40}$ for VLKF-B, and $a_{\mathrm{clim}} = -0.01$ and $A_{\mathrm{clim}} = \sigma^2_{\dot h,\mathrm{clim}} I_{40}$ for VLKF-$\dot h$, respectively. We note that the climatic covariances of both $Bz$ and $\dot h$ are concentrated near the diagonal.
For comparison with our implementations of VLKF-B and VLKF-$\dot h$ we will employ a suite of ensemble filters. In particular, we will use the EnKF with perturbed observations as in Burgers et al. (1998) and the approximate square root filter DEnKF as in Sakov and Oke (2008). Furthermore, we implement an incremental analysis update (IAU) as in Bloom et al. (1996) and Polavarapu et al. (2004), where the analysis increments are calculated by a DEnKF.
All filter implementations use 10 ensemble members, which is smaller than the attractor dimension of the system (Eqs. 17-18). We employ covariance inflation whereby the prior forecast error covariance is increased by an inflation factor $\delta$ (Anderson and Anderson, 1999). Since the $\{h_j\}$ variables are not damped in the modified Lorenz system (Eq. 18), inflation of the unobserved $\{h_j\}$ variables would in general lead to an increasing growth in the associated forecast covariance. Inflation is therefore applied at each time step only to the $\{x_j\}$ components of the ensembles for DEnKF, EnKF and IAU, but to all components of the ensemble for VLKF-B and VLKF-$\dot h$, which impose explicit constraints. This was found to be advantageous for all respective filters. Results are obtained for a wide range of inflation factors and only the optimal result for each particular formulation of the filter is reported here. We have optimised over 1000 equally spaced values of $\delta \in (1, 1.16)$.
The small ensemble size chosen here requires localisation. We employ the localisation method along the lines of Houtekamer and Mitchell (1998, 2001) and Hamill et al. (2001) whereby the forecast error covariance $P_f$ is Schur-multiplied with a localisation matrix $C_{\mathrm{loc}}$. We use the compactly supported localisation function introduced by Gaspari and Cohn (1999) where correlations with distances larger than $2\rho_{\mathrm{loc}}$ are set to 0. We set the localisation radius to $\rho_{\mathrm{loc}} = 8$ for all filters.
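A localisation matrix of this type can be sketched as follows, using the fifth-order piecewise rational function of Gaspari and Cohn (1999) on a periodic ring of $d$ sites. The helper names are hypothetical, and the sketch covers a single field of $d = 40$ sites; how the localisation is extended across the three fields $\{x_j\}$, $\{h_j\}$ and $\{\dot h_j\}$ is not specified here and is left out.

```python
def gaspari_cohn(distance, rho_loc):
    """Fifth-order compactly supported correlation function of
    Gaspari and Cohn (1999); zero for distances of 2 * rho_loc or more."""
    r = abs(distance) / rho_loc
    if r >= 2.0:
        return 0.0
    if r <= 1.0:
        return (1.0 - (5.0 / 3.0) * r**2 + (5.0 / 8.0) * r**3
                + 0.5 * r**4 - 0.25 * r**5)
    return (4.0 - 5.0 * r + (5.0 / 3.0) * r**2 + (5.0 / 8.0) * r**3
            - 0.5 * r**4 + (1.0 / 12.0) * r**5 - 2.0 / (3.0 * r))


def localisation_matrix(d, rho_loc):
    """Localisation matrix C_loc on a periodic ring of d sites."""
    def ring_distance(i, j):
        return min(abs(i - j), d - abs(i - j))
    return [[gaspari_cohn(ring_distance(i, j), rho_loc) for j in range(d)]
            for i in range(d)]


c_loc = localisation_matrix(d=40, rho_loc=8.0)
# the localised covariance is the element-wise (Schur) product C_loc * P_f
print(c_loc[0][0], c_loc[0][8], c_loc[0][16])
```

With $\rho_{\mathrm{loc}} = 8$ the weights decay smoothly from 1 at zero separation to exactly 0 at a separation of 16 sites, so that correlations between sites further apart are removed from the localised covariance.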
In Fig. 4 we present results of the amount of imbalance as measured by the imbalance $B(t)$ and by the temporally averaged imbalance $\bar B$, accrued during the data assimilation procedure for our suite of filters. EnKF and DEnKF generate a significant amount of unphysical imbalance, with values much larger than those of the actual balanced toy model with $B \approx 0.018$ (cf. Fig. 2).
The increased imbalance in EnKF may be due to sampling errors introduced through the perturbed observations. Given the particular nature of imbalance present in the toy model (Eqs. 17-18), this may not be an issue in realistic atmospheric models. The IAU implementation strongly reduces imbalance, albeit to levels significantly larger than those expected from the actual dynamics. Our VLKF-B filter is able to constrain imbalance very close to the actual physical imbalance. Note that although the pseudo-observations used in VLKF-B were for the imbalance of each variable $Bz$, driving the dynamics towards $Bz = 0$, the analysis reproduces dynamically realistic values of the integrated measure of imbalance $B$. VLKF-$\dot h$ also achieves a pronounced reduction in imbalance, but to larger values than the IAU implementation, in particular for larger observation intervals $\Delta t_{\mathrm{obs}}$. Surprisingly, for filters which only constrain the climatic variance of the height variable $\{h_j\}$, and do not impose any explicit constraints on the imbalance, i.e. $h = (0_{40}\ I_{40}\ 0_{40})$, one also observes a significant reduction in imbalance (not shown).
We now investigate rms error statistics. We consider the site-averaged rms error of variables $z$ between the truth $z_t$ and the ensemble mean $\bar z_a$,
$$E = \sqrt{\frac{1}{N}\sum_{n=1}^{N}\frac{1}{D_G}\left\|\bar z_a(n\Delta t_{\mathrm{obs}}) - z_t(n\Delta t_{\mathrm{obs}})\right\|_G^2},$$
where $N$ is the number of analysis cycles and $D_G$ denotes the number of variables involved. We introduce the norm $\|a\|_G^2 = a^T G a$ to investigate the error over all $\{x_j\}$ variables, $E_x$, using $G = \delta_{ij}$ for $1 \le i \le 40$, and the error of the fast $\{h_j\}$ variables, $E_h$, using $G = \delta_{ij}$ for $41 \le i \le 80$. Figure 5 shows $E_x$ for our suite of filters. DEnKF, IAU and our VLKF-B and VLKF-$\dot h$ exhibit very similar rms errors, with values much smaller than the observational noise level $\sqrt{0.84} \approx 0.92$. EnKF produces consistently worse rms errors, which again may be due to sampling errors stemming from the randomly perturbed observations.
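The site-averaged rms error can be computed along the following lines. The sketch assumes that the squared error is averaged over the selected variables and over the analysis cycles before the square root is taken; the function name `rms_error` and the tiny data set are hypothetical and serve only to illustrate the selection of variables through the diagonal matrix $G$.

```python
import math


def rms_error(truth, analysis_mean, indices):
    """Site-averaged rms error over all analysis cycles.

    truth, analysis_mean : lists (one entry per cycle) of state vectors
    indices              : components selected by the diagonal matrix G
    """
    indices = list(indices)
    n_cycles = len(truth)
    total = 0.0
    for z_t, z_a in zip(truth, analysis_mean):
        # squared error averaged over the selected variables
        total += sum((z_a[i] - z_t[i]) ** 2 for i in indices) / len(indices)
    return math.sqrt(total / n_cycles)


# hypothetical data: 2 cycles of a 4-dimensional state
truth = [[0.0, 0.0, 1.0, 1.0], [1.0, 1.0, 0.0, 0.0]]
analysis_mean = [[0.3, 0.4, 1.0, 3.0], [1.0, 1.0, 0.0, 2.0]]
error_first_half = rms_error(truth, analysis_mean, indices=range(0, 2))
error_second_half = rms_error(truth, analysis_mean, indices=range(2, 4))
print(error_first_half, error_second_half)
```

For the model considered here, `indices=range(0, 40)` would select the slow variables and `indices=range(40, 80)` the height variables, assuming the state vector is ordered as all $x_j$, then all $h_j$, then all $\dot h_j$.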
Figure 6 shows that the correct balance statistics of VLKF-B manifest themselves in superior rms errors for the unobserved fast height field $\{h_j\}$ when compared to filters which do not incorporate a balance constraint. IAU and VLKF-B exhibit comparable rms error statistics for the height field. Furthermore, it is seen that constraining the covariance of $\{\dot h_j\}$, as done in VLKF-$\dot h$, also generates comparably good rms errors for the height field. The variations of the error in the height field $E_h$ with the observation interval $\Delta t_{\mathrm{obs}}$ mirror exactly the imbalance shown in Fig. 4.
We found that EnKF, DEnKF and IAU exhibit instances of catastrophic filter divergence whereby the forecast model develops numerical instabilities (Gottwald and Majda, 2013). Instances of this type of filter divergence were not observed in the variance constraining filters VLKF-B and VLKF-$\dot h$.
We have presented here an implementation of an ensemble filter which explicitly limits the amount of imbalance within the data assimilation procedure. We were able to produce balanced analyses by incorporating statistical information such as mean and variance of imbalance, available through long-time integration or historical data, as pseudo-observations. This procedure not only successfully constrains imbalance to its climatic values, but also produces very good filter performance in terms of rms errors of the fast unobserved variables.
We presented a comparison between a filter which explicitly constrains the amount of imbalance, coined VLKF-B, a filter which constrains the statistics of the fast variables but not the imbalance, coined VLKF-$\dot h$, and standard implementations of EnKF, DEnKF and IAU. It was found that our balance controlling filter VLKF-B is able to constrain the amount of imbalance to lie within the physically observed limits. Besides improved balance of the analyses this also implied very good error statistics for the unobserved height field. We tested our method against the widely used IAU implementation and found that it generates less unphysical imbalance and has very similar rms error statistics for the observed and the unobserved variables.
The variance constraint we employ requires the determination of the overestimating subspace, i.e. the eigenspace in which $P_f$ experiences covariances above the climatological covariance, which was achieved in this work by singular value decomposition. The extra computational cost implied has to be weighed against the cost of an additional application of the forecast model involved in IAU or of fast Fourier transforms when using digital filters, as well as against the benefit of the superior performance of VLKF-B in generating less unphysical imbalance.

Appendix A
We present here an algorithm to construct the transformation matrix $S_w$. This matrix projects onto the subspace of those pseudo-observables which in a standard Kalman filter would produce an analysis whose analysis error covariance matrix exceeds the prescribed (climatological) error covariance $A_{\mathrm{clim}}$. We need to find $S_w$ such that Eq. (14) produces a positive definite error covariance matrix $R_w$. [...] The construction also provides an expression for the transformation matrix $S_w^T \in \mathbb{R}^{\hat D_w \times D_w}$ as $S_w^T = S_{\mathrm{red}} Q^T M_0^T$. We note that one may formally define an effective pseudo-observation operator $\hat h \in \mathbb{R}^{\hat D_w \times D}$ as $\hat h = S_{\mathrm{red}} Q^T M_0^T h$.
G. A. Gottwald: Controlling balance in an ensemble Kalman filter. Published by Copernicus Publications on behalf of the European Geosciences Union & the American Geophysical Union.

Fig. 4. Top panel: imbalance $B$ of the analysis as a function of analysis cycles for $\Delta t_{\mathrm{obs}} = 5.5$ h in a log plot. In order of increasing values of $B$ are VLKF-B (magenta open circles), IAU (black crosses), VLKF-$\dot h$ (cyan squares), DEnKF (blue diamonds) and EnKF (red crosses). Bottom panel: temporally averaged imbalance $\bar B = \frac{1}{N}\sum_{n=1}^{N} B(n\Delta t_{\mathrm{obs}})$ of the analysis as a function of the observation interval $\Delta t_{\mathrm{obs}}$.

Fig. 5. Rms error of the analysis as a function of the observation interval $\Delta t_{\mathrm{obs}}$ (in hours) for the slow variables $\{x_j\}$.

Fig. 6. Rms error of the analysis as a function of the observation interval $\Delta t_{\mathrm{obs}}$ (in hours) for the unobserved height variables $\{h_j\}$.
