Inferring the instability of a dynamical system from the skill of data assimilation exercises

Data assimilation (DA) aims at optimally merging observational data and model outputs to create a coherent statistical and dynamical picture of the system under investigation. Indeed, DA aims at minimizing the effect of observational and model error, and at distilling the correct ingredients of the system's dynamics. DA is of critical importance for the analysis of systems featuring sensitive dependence on the initial conditions, as chaos wins over any finitely accurate knowledge of the state of the system, even in the absence of model error. Clearly, the skill of DA is guided by the properties of the dynamical system under investigation, as optimally merging observational data and model outputs is harder when strong instabilities are present. In this paper we reverse the usual angle on the problem and show that it is indeed possible to use the skill of DA to infer some basic properties of the tangent space of the system, which may be hard to compute in very high-dimensional systems. Here, we focus our attention on the first Lyapunov exponent and the Kolmogorov-Sinai entropy, and perform numerical experiments on the Vissio-Lucarini 2020 model, a recently proposed generalisation of the Lorenz 1996 model that is able to describe in a simple yet meaningful way the interplay between dynamical and thermodynamical variables.

ordered according to their value, with λ_1 being the largest. Unless the system features symmetries, all the LEs are distinct and, in the case of continuous-time dynamics, one of them vanishes, corresponding to the direction of the flow and defining the neutral tangent space. Once ordered from the largest to the smallest, the sum of the first k LEs gives the asymptotic growth rate of a k-volume element defined by k trajectories displaced infinitesimally nearby the reference trajectory plus the reference trajectory itself.
Additionally, if λ_{n_0} denotes the smallest non-negative LE, in many practical applications the Kolmogorov-Sinai (or metric) entropy σ_KS, which defines the rate of creation of information of the system due to its instabilities, can be identified with Σ_{i=1}^{n_0} λ_i (Pesin's identity). Finally, it is possible to use the spectrum of LEs to define a notion of dimension for the attractor of a chaotic system. The Kaplan-Yorke conjecture, which follows from the estimate of the rate of growth of the infinitesimal k-volume, indicates that the information dimension of a chaotic attractor is given by D_KY = p + Σ_{i=1}^{p} λ_i / |λ_{p+1}|, where p is the largest index such that Σ_{i=1}^{p} λ_i ≥ 0. In systems where the phase space contracts (the large class of dissipative systems), one has D_KY < n. Roughly speaking, larger values of λ_1, of σ_KS and of D_KY are associated with conditions of high instability and low predictability for the flow. This is clearly an extremely informal presentation of some of the features and properties of the LEs; see Eckmann and Ruelle (1985) for a now-classic discussion of these topics.
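As a sketch of how these two indicators are obtained from a given LE spectrum, the following Python snippet computes σ_KS via Pesin's identity and D_KY via the Kaplan-Yorke formula; the Lorenz 1963 exponents used in the example are approximate textbook values, quoted here only for illustration and not results of this paper.

```python
import numpy as np

def pesin_entropy(les):
    """Kolmogorov-Sinai entropy estimate via Pesin's identity:
    the sum of the non-negative Lyapunov exponents."""
    les = np.sort(les)[::-1]              # descending order
    return float(np.sum(les[les >= 0.0]))

def kaplan_yorke_dimension(les):
    """Kaplan-Yorke (information) dimension:
    D_KY = p + sum_{i<=p} lambda_i / |lambda_{p+1}|,
    with p the largest index whose partial sum is >= 0.
    Assumes a chaotic spectrum, i.e. lambda_1 >= 0."""
    les = np.sort(les)[::-1]
    csum = np.cumsum(les)
    p = int(np.max(np.nonzero(csum >= 0.0)[0])) + 1   # 1-based index
    if p == len(les):
        return float(p)                   # no net contraction
    return p + csum[p - 1] / abs(les[p])

# Approximate Lorenz 1963 spectrum, for illustration only
les = np.array([0.906, 0.0, -14.572])
print(pesin_entropy(les))                 # ≈ 0.906
print(kaplan_yorke_dimension(les))        # ≈ 2.062
```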
It is indeed possible to associate each LE with a physical mode. Ruelle (1979) proposed the idea of performing a covariant splitting of the tangent linear space such that the basis vectors are actual trajectories of linear perturbations. The average growth rate of each covariant Lyapunov vector (CLV) equals one of the LEs. This idea was first implemented by Trevisan and Pancotti (1998) for studying the properties of the Lorenz 1963 model (Lorenz, 1963). Separate algorithms for the computation of the CLVs were proposed in Ginelli et al. (2007) and Wolfe and Samelson (2007); see the comprehensive review by Froyland et al. (2013). Note that the CLVs corresponding to the positive (negative) LEs span the unstable (stable) tangent space.
Numerical evidence and recent analytical proofs have shown that, under certain conditions on the observations (their type, spatio-temporal distribution, and accuracy), the performance of DA with chaotic dynamics relates directly to the instability properties of the dynamical model where data are assimilated. One can thus in principle use the knowledge of the dynamical features to inform not only the design of the DA scheme that best suits the specific application (e.g., how many model realizations for Monte Carlo based DA methods, or the length of the assimilation window in variational DA), but also the best-possible observational deployment.
A stream of research has shed light on the mechanisms driving the response of ensemble-based DA (Evensen, 2009), i.e. its functioning and suitability, when applied to chaotic systems. A recent comprehensive review can be found in Carrassi et al. (2021); we succinctly recall the main findings in the following. In the deterministic linear and Gaussian case with the Kalman filter (KF) and smoother (KS), it has been analytically proved that the error covariance matrices converge in time onto the model's unstable-neutral subspace, i.e. the span of the backward Lyapunov vectors (BLVs), or of the covariant Lyapunov vectors (CLVs), associated with the non-negative Lyapunov exponents (LEs). These results have then been shown numerically to hold for the ensemble Kalman filter/smoother (EnKF/EnKS; Evensen, 2009) in weakly nonlinear regimes. In practice, for sufficiently well observed scenarios, the error of the state estimate is fully confined within the unstable-neutral subspace. Because this subspace is usually much smaller than the full system's phase space, the above convergence results imply that an ensemble size as large as the unstable-neutral subspace dimension suffices to achieve satisfactory performance, i.e. to track the "truth" and effectively reduce the estimation error, thus leading to substantial computational savings.
The picture slightly changes in the presence of a degenerate spectrum of LEs. This degeneracy often arises in systems with multiple scales or in coupled dynamics (Vannitsem and Lucarini, 2016a; De Cruz et al., 2018a), and it usually concerns the unstable-neutral portion of the LE spectrum. In these cases it is necessary to increase the ensemble size to account for all of the degenerate modes (Tondeur et al., 2020; Carrassi et al., 2021).
The necessity of going beyond the number of asymptotic unstable-neutral modes is also connected to the local variability of the instantaneous instabilities along a system's trajectory. A recent study performed on a quasi-geostrophic model of the atmosphere (Lucarini and Gritsun, 2020) provided strong evidence that the large heterogeneity of the atmosphere's predictability is due to the presence of substantial variability in the number of unstable dimensions (Lai, 1999) of the unstable periodic orbits (UPOs) populating the attractor and defining the skeletal dynamics of the system (Auerbach et al., 1987). Because the orbit of a chaotic system shadows the UPOs supported on the attractor in some of its regions, certain directions of the stable space experience finite-time error growth due to locally important instabilities, calling for an ensemble size larger than the dimension of the unstable-neutral subspace.
In the stochastic scenario, the above results still hold, although with some qualifications. Stochasticity usually injects noise into the system irrespective of the flow-dependent modes of instability. Consequently, and with non-zero probability, it also injects error onto stable directions that would not otherwise have been influential in the long term. The trade-off between the frequency of the noise injection and its amplitude on the one hand, and the dissipation rate of the stable modes on the other, determines the amplitude of the long-term error along the stable modes (Grudzien et al., 2018a). This mechanism implies the need to include additional ensemble members to encompass weakly stable modes that experience instantaneous growth; although exacerbated by the presence of system noise, it is present in any reduced-rank KF, as explained by Grudzien et al. (2018b).
Moving away from the Gaussian and weakly-nonlinear scenario, the impact of instabilities on the functioning and performance of nonlinear DA, in particular particle filters (PFs; see e.g. Van Leeuwen et al., 2019), has also recently been clarified. Carrassi et al. (2021) have shown that the number of particles needed to reach convergence depends on the size of the unstable-neutral subspace rather than on the size of the observation vector.
We have seen how the knowledge of the LEs and LVs can be used to make key choices in the implementation of ensemble-based DA schemes aimed at enhancing accuracy at the smallest possible computational cost. This view angle is made explicit in a class of DA algorithms that operate a reduction in the dimension of the model (e.g., the assimilation in the unstable subspace, AUS; Palatella et al., 2013), of the data, or both (Albarakati et al., 2021).
1.3 This paper: data assimilation as a tool to interrogate the dynamics

While extremely appealing from a theoretical viewpoint and practically useful in low-to-moderate dimensional problems, the use of dynamically informed DA approaches is difficult in high dimensions, where even just computing the asymptotic spectrum of LEs, let alone the very relevant state-dependent local LEs (LLEs), is very difficult or just impossible. As pointed out in Carrassi et al. (2021), the recent advent of powerful machine learning methods may open the path to efficiently emulating the otherwise very costly recursive procedure to compute the LEs and vectors.
In this work, we attempt to reverse the view angle by posing the question: can we use the results of DA to infer some fundamental quantities of the underlying dynamics, such as the spectrum of the LEs or the Kolmogorov-Sinai entropy (σ_KS), that are, as aforementioned, difficult to compute in high dimensions?
The paper is structured as follows: in Sect. 2, an upper bound of the root mean squared error of the Kalman filter for linear dynamics in the asymptotic limit is derived. In Sect. 3 we present the Vissio and Lucarini (2020) (VL20) model and its DA setup. The VL20 model is a recently proposed generalisation of the Lorenz (1996) model that is able to describe in a simple yet meaningful way the interplay between dynamical and thermodynamical variables. Additionally, the presence of qualitatively distinct sets of spatially extended variables allows one to consider non-trivial cases of partial observations in DA exercises. Sect. 4 presents the main results of the paper by comparing the skill of the performed DA exercises with some fundamental measures of instability of the VL20 model. Finally, in Sect. 5 we discuss our results and present perspectives for future investigations.

2 Kalman filter error bounds and Lyapunov spectrum
We are interested in establishing a further relation between the skill of EnKF-like methods applied to perfect (no model error) chaotic dynamics and the spectrum of LEs. We shall build our derivations on the results mentioned in Sect. 1.2 and reviewed in Carrassi et al. (2021). In this section we set ourselves in a linear and Gaussian context, whereby the Kalman filter (KF) yields the exact solution of the Gaussian estimation problem. The linear results will guide the interpretation of the findings in the general nonlinear setting with the EnKF.
At time t_k, let x_k ∈ R^n and y_k ∈ R^p be the state and observation vector, respectively. The (linear) model dynamics M_k ∈ R^{n×n} and observation model H_k ∈ R^{p×n} read

x_k = M_k x_{k-1},    y_k = H_k x_k + v_k.

The observation noise, v_k, is assumed to be a zero-mean Gaussian white sequence with statistics

E[v_k] = 0,    E[v_k v_l^T] = δ_{k,l} R_k,

with E[·] being the expectation operator, δ_{k,l} the Kronecker delta, and R_k the error covariance matrix of the observations at time t_k. For the sake of notational clarity, we assume that the model dynamics is non-degenerate, so that all its Lyapunov exponents are distinct; we note that the extension to the general degenerate case is possible.
The singular value decomposition (SVD) of the model dynamics between t_l and t_k > t_l reads

M_{k:l} = U_{k:l} Λ_{k:l} V_{k:l}^T,

where U_{k:l} and V_{k:l} are non-degenerate orthogonal matrices and Λ_{k:l} is the diagonal matrix of singular values. For t_l → -∞ the left singular vectors, U_{k:l}, converge to the backward Lyapunov vectors (BLVs) at t_k and, similarly, for t_k → ∞ the right singular vectors, V_{k:l}, converge to the forward Lyapunov vectors (FLVs) at t_l. The singular values (SVs) in Λ_{k:l} converge to n distinct values of the form diag(Λ_{k:l})_i = exp(λ_i (t_k - t_l)), in which the λ_i are the Lyapunov exponents (LEs) in descending order, λ_1 > λ_2 > ... > λ_{n_0} = 0 > λ_{n_0+1} > ... > λ_n. The n_0 non-negative LEs identify the n_0 unstable-neutral modes.
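The convergence of the propagator's growth rates to the LEs can be illustrated with a toy constant propagator, using the standard QR (Benettin-style) recursion; the matrix below is a hypothetical example with known multipliers, chosen by us for illustration.

```python
import numpy as np

# Hypothetical constant one-step propagator: lower-triangular, so its
# eigenvalues (1.5, 1.0, 0.25) are read off the diagonal, and the LEs
# are their log-moduli.
A = np.array([[1.5,  0.0, 0.0],
              [0.4,  1.0, 0.0],
              [0.2, -0.3, 0.25]])

k = 2000
Q = np.eye(3)
log_growth = np.zeros(3)
for _ in range(k):
    # QR re-orthonormalisation avoids overflow/underflow while
    # accumulating the log growth rates along the evolving directions.
    Q, Rt = np.linalg.qr(A @ Q)
    log_growth += np.log(np.abs(np.diag(Rt)))

les = log_growth / k                   # LE estimates, in descending order
print(les)                             # ≈ [log 1.5, 0.0, log 0.25]
```

For a constant propagator the LEs coincide with the log-moduli of its eigenvalues, which the recursion recovers up to an O(1/k) transient bias.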
Let us define the information matrix, which measures the "observability" of the state at t_k, with Ω_l = H_l^T R_l^{-1} H_l being the precision matrix of the observations mapped to the model space. Moreover, let U_{+,k} be a matrix whose columns are the n_0 unstable and neutral BLVs of the dynamics M_k. It has been shown that, if the following three conditions hold: (i) the unstable-neutral modes are sufficiently observed, with I_n ∈ R^{n×n} being the identity matrix; (ii) the neutral modes, u, are subject to a stronger constraint; and (iii) the initial error covariance matrix is confined to the space of the FLVs at time t_0, then the KF forecast error covariance matrix, P^f_k, converges asymptotically to the sequence given in Eq. (8). In real applications, the convergence (within numerical accuracy) occurs in long-but-finite times.
The asymptotic mean squared error of the forecast (MSEF) of the KF solution is given by the trace of Eq. (8), where, for the last equality, we made use of the cyclic property of the matrix trace and of the orthogonality of the BLVs, U^T_{+,k} U_{+,k} = I_{n_0}. Equation (9) makes evident that the asymptotic MSEF depends on the observational constraint through the information matrix (data accuracy, encapsulated in R, and data type and deployment, encapsulated in H), but also on the unstable-neutral BLVs. Despite this, it is quite involved to use Eq. (9) to derive a direct relation between the MSEF and the spectrum of LEs. This is because U_{+,k} is not invertible in general, and because one needs to make specific (often overly simplified) assumptions on the model dynamics and observations, i.e. on M_{k:l}, H_l and R_l, in order to get a tractable expression of the information matrix. As an alternative to a direct relation, we shall therefore seek informative bounds for the MSEF in terms of the LEs.
Let us substitute the SVD of M_{k:l}, Eq. (4), in the information matrix. For every t_l, the individual terms in the summation can be written as in Eq. (12). We now define the maximum projection of the precision matrix onto the FLVs, β_l, with ‖·‖ being the Euclidean norm, and use it to get an upper bound for the inverse of each term, Eq. (12), in the information matrix summation. The inequality is based on the Löwner partial ordering of R^{n×n} (i.e. the partial order defined by the convex cone of positive semi-definite matrices). We shall use this partial ordering in the following derivations.
By defining the maximum of β_l across all 0 ≤ t_l ≤ t_{k-1} as β_k = max_{0 ≤ t_l ≤ t_{k-1}} β_l, we get the lower bound for the information matrix of Eq. (16). The bound reflects the effect of assimilating observations (the rhs) compared to the unconstrained free model run (the lhs); note that U_{k:l} Λ²_{k:l} U^T_{k:l} = M_{k:l} M^T_{k:l}. Given that U_{k:l} Λ²_{k:l} U^T_{k:l} is symmetric positive definite, we can invoke the aforementioned partial ordering for this class of matrices and further develop the lower bound of the information matrix, where e^{2λ_1(t_k - t_l)} is the largest eigenvalue of U_{k:l} Λ²_{k:l} U^T_{k:l}. Under the assumption that the assimilation cycle is uniform in time, i.e. t_l - t_{l-1} = Δt for all l, the sum of the factors e^{2λ_1(t_k - t_l)} coincides with a geometric series with known sum, s_k = Σ_{j=1}^{k} e^{2λ_1 jΔt}. By using the lower bound, Eq. (19), and the orthogonality of the BLVs, U^T_{+,k} U_{+,k} = I_{n_0}, we get a lower bound for the information matrix projected onto the unstable-neutral subspace, Eq. (21). We can thus finally use Eq. (21) in the expression of the MSEF, Eq. (9), and derive the upper bound of Eq. (22). This upper bound incorporates the key players shaping the relation between the KF estimation error and the model dynamics.
The presence of β_k and Δt reflects the observational modulation of the MSEF: the stronger the data constraint, the smaller β_k and Δt. The signatures of the model instabilities are in the term n_0, the size of the unstable-neutral subspace, and in λ_1, the error growth rate along the leading mode of instability, both directly related to the amplitude of the bound. Under a Bayesian interpretation, the factor β_k can be seen as the likelihood of the data, and the remaining terms in the bound altogether as the prior distribution. Note that, if the dynamical model is stable (and independently of the data), λ_1 = 0, s_k = k, and the MSEF goes to zero asymptotically.
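As a quick sanity check of the geometric-series structure invoked above (assuming the uniform-cycle sum s_k = Σ_{j=1}^{k} e^{2λ_1 jΔt} as written here), one can verify the closed form and the neutral limit s_k = k:

```python
import numpy as np

def s_k(lam1, dt, k):
    """Sum of the squared leading growth factors over k uniform cycles."""
    return sum(np.exp(2.0 * lam1 * j * dt) for j in range(1, k + 1))

def s_k_closed(lam1, dt, k):
    """Closed-form geometric sum, valid for lam1 != 0."""
    q = np.exp(2.0 * lam1 * dt)
    return q * (q**k - 1.0) / (q - 1.0)

lam1, dt, k = 0.8, 0.05, 50
print(abs(s_k(lam1, dt, k) - s_k_closed(lam1, dt, k)))   # ≈ 0
print(s_k(0.0, dt, k))                                   # = k for a neutral model
```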
As alluded to at the beginning of the section, a direct expression (e.g. an equality in place of a bound) relating the model instabilities to the MSEF is difficult to obtain; the upper bound of Eq. (22) nevertheless indicates that the MSEF is determined by a convolution of model dynamics and observation error.
In the next sections we will perform numerical experiments under controlled scenarios to investigate the conditions for which the bound holds. In particular, we will study the conditions leading to the smallest possible upper bound, such that the output of a converged DA, i.e. its asymptotic MSEF, can be used to infer the LEs spectrum of the model dynamics.
3 Experimental setting

The Vissio-Lucarini 2020 model
Our test-bed for numerical experiments is the low-order model recently developed by Vissio and Lucarini (2020), hereafter referred to as VL20. The VL20 model is an extension of the classical Lorenz 96 model (Lorenz, 1996) that includes additional thermodynamic variables. The model is given by the following set of n ODEs (with n an even integer):

dX_i/dt = X_{i-1}(X_{i+1} - X_{i-2}) - γ X_i - α θ_i + F,
dθ_i/dt = X_{i+1} θ_{i+2} - X_{i-1} θ_{i-2} + α X_i - γ θ_i + G,

where X represents the momentum, θ is the thermodynamic variable, and the subscript 1 ≤ i ≤ n/2 is the gridpoint index. The model is spatially periodic, and the boundary conditions are expressed as X_{i+n/2} = X_i and θ_{i+n/2} = θ_i. In the VL20 model it is possible to introduce a notion of kinetic energy K = Σ_{i=1}^{n/2} X_i²/2 and potential energy P = Σ_{i=1}^{n/2} θ_i²/2. Additionally, the model features an energy cycle that allows for the conversion between the kinetic and potential forms and for introducing a notion of efficiency. The parameter α modulates the energy transfer between the two forms, while γ controls the energy dissipation rate, and F and G are external forcings defining the energy injection into the system. Table 1. Instability features of the VL20 model for the three forcing configurations, (F, G) = (10, 10), (10, 0) and (0, 10); n = 36, and α = γ = 1.
The model's evolution can be written as the sum of a quasi-symplectic term, which conserves the total energy, and a gradient term, which describes the impact of forcing and dissipation. In the turbulent regime, VL20 allows for the propagation of signals in the form of wave-like disturbances associated with unstable waves exchanging energy, in both potential and kinetic form, with the background. In terms of energetics, the difference between the L96 and the VL20 model mirrors the one between a one-layer and a two-layer quasi-geostrophic model: the former features only barotropic processes, while the latter features the coupling between dynamical and thermodynamic processes via baroclinic conversion, which makes its dynamics much more complex (Holton and Hakim, 2013). The VL20 model is thus a very good test-bed for research in DA, a further step toward realism from the very successful L96. Further details on the model, as well as an extensive analysis of its dynamical and statistical properties, can be found in Vissio and Lucarini (2020).
In all the following experiments we set n = 36, implying that both model variables X and θ have 18 components, and consider three model configurations differing in the values of the external forcings: F = G = 10; F = 10, G = 0; and F = 0, G = 10.
Unless otherwise stated, the model runs with the default parameters α = γ = 1, and it is numerically integrated using the standard fourth-order Runge-Kutta scheme with a time step Δt = 0.05 time units. A summary of the model instability properties in the chosen configurations is given in Tab. 1.
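A sketch of this integration setup is given below. The right-hand side encodes our reading of the VL20 equations (the stencils and signs are an assumption and should be checked against Vissio and Lucarini, 2020), integrated with the fourth-order Runge-Kutta scheme and Δt = 0.05 as in the paper.

```python
import numpy as np

def vl20_rhs(state, alpha=1.0, gamma=1.0, F=10.0, G=10.0):
    """Tendencies of the VL20 model (our reading of the equations)."""
    m = state.size // 2
    X, th = state[:m], state[m:]
    Xm1, Xp1, Xm2 = np.roll(X, 1), np.roll(X, -1), np.roll(X, 2)
    thp2, thm2 = np.roll(th, -2), np.roll(th, 2)
    dX = Xm1 * (Xp1 - Xm2) - gamma * X - alpha * th + F
    dth = Xp1 * thp2 - Xm1 * thm2 + alpha * X - gamma * th + G
    return np.concatenate([dX, dth])

def rk4_step(f, x, dt):
    """One standard fourth-order Runge-Kutta step."""
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

n, dt = 36, 0.05
x = np.random.default_rng(1).standard_normal(n) * 0.1
for _ in range(2000):                  # spin-up onto the attractor
    x = rk4_step(vl20_rhs, x, dt)
print(np.all(np.isfinite(x)))          # trajectory stays bounded
```

Whatever the exact stencil, the linear damping terms guarantee that the energy (and hence the state) stays bounded under the energy-conserving advection.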

Data assimilation setup
Synthetic observations are generated according to Eq. (2) by sampling a "true" solution of the VL20 model, Eq. (27), and then adding simulated observational error drawn from the Gaussian distribution N(0, R). Observational error is assumed to be spatially uncorrelated, so that the error covariance, R, is a diagonal matrix, and we observe the model components directly, implying that the observation operator is linear and in the form of a matrix, H ∈ R^{p×n}. The observation error variance is set to 5% of the variance (i.e. the squared temporal variability) of the climatology of the corresponding state vector component, such that R_{ii} = 0.05 σ_i², with σ_i² the climatological variance of the i-th component. Linking the observation error to the model variance makes the setup more realistic, but it ties the error amplitude to the choice of the model parameters. For example, the model's state vector variance gets very small when the dissipation is strong, potentially making the R matrix degenerate. Under such circumstances, the corresponding entries in R are set to 5 × 10^{-6}.
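The observation-generation recipe above can be sketched as follows; the function name and the surrogate "truth" trajectory are illustrative, not from the paper's code.

```python
import numpy as np

def make_obs(truth_traj, obs_frac=0.05, floor=5e-6, rng=None):
    """Synthetic observations of a (time, n) 'truth' trajectory:
    diagonal R with variance = 5% of each component's climatological
    variance, floored (as in the text) to avoid a degenerate R."""
    rng = rng if rng is not None else np.random.default_rng(2)
    clim_var = truth_traj.var(axis=0)            # climatological variance
    r_diag = np.maximum(obs_frac * clim_var, floor)
    noise = rng.standard_normal(truth_traj.shape) * np.sqrt(r_diag)
    return truth_traj + noise, np.diag(r_diag)

# usage with a surrogate 'truth' (any (time, n) array would do)
truth = np.cumsum(np.random.default_rng(3).standard_normal((500, 6)), axis=0) * 0.1
obs, R = make_obs(truth)
print(obs.shape, R.shape)     # (500, 6) (6, 6)
```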
In line with previous studies (e.g., Carrassi et al., 2021), we work with deterministic EnKFs, whereby it is possible to study the filter performance in relation to the model instabilities without the inclusion of the additional noise that is inherent to stochastic versions of the EnKFs (Evensen, 2009). In particular, we choose to use the finite-size ensemble Kalman filter (EnKF-N; Bocquet et al., 2015) because it automatically computes the required covariance inflation, thus saving us from running many inflation tuning experiments. The initial conditions for the ensemble are sampled from the Gaussian distribution N(x^t_0, R), with x^t_0 being the "truth" at t_0: this choice signifies that the initial condition error is taken to be equal to the observational error.
The performance of the DA experiments will be assessed primarily by using the root mean square error of the analysis, normalised by the observation variance (nRMSEa). The nRMSEa measures the analysis error independently of the observation error, allowing for a multivariate assessment of the performance. Unless otherwise stated, observations are taken at every time step, and the experiments last 2,000 model time units. With this setting, an experiment comprises 40,000 DA cycles, and when computing time averages of the nRMSEa we only consider the last 500. Finally, and again unless otherwise stated, we shall adopt N = 40 ensemble members in the EnKF-N.
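One possible reading of the nRMSEa definition (normalising each component's analysis error by its observation-error variance before averaging; the exact formula should be checked against the paper) is:

```python
import numpy as np

def nrmse_a(analysis, truth, r_diag):
    """RMSE of the analysis, with each component's squared error
    normalised by its observation-error variance (one reading of
    'normalised by the observation variance')."""
    err2 = (analysis - truth) ** 2 / r_diag
    return np.sqrt(err2.mean())

truth = np.zeros(4)
analysis = np.array([0.1, -0.1, 0.2, 0.0])
r_diag = np.full(4, 0.01)                  # obs std = 0.1 per component
print(nrmse_a(analysis, truth, r_diag))    # ≈ 1.22: error comparable to obs noise
```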

Numerical results
Our analysis focuses on the relation between observational design and filter accuracy, and on the relation between the model instabilities and the filter accuracy. By exploiting the novel dynamical-thermodynamical feature of VL20 over its L96 precursor, we will also study the EnKF-N under observational scenarios that alternatively measure the dynamical variable, X, or the thermodynamical one, θ. In line with previous results (Carrassi, 2017) and with coupled models (Tondeur et al., 2020), Fig. 2 shows that, even with a multivariate model, the error converges to a very low level as soon as the ensemble size exceeds the number of unstable-neutral modes, n_0, and that it does not decrease further when adding more members. This behaviour is possible because the error evolution is bound to be linear or weakly nonlinear. This means that one can in principle intentionally induce linearity in the error evolution to meet the aforementioned relation between filter accuracy and ensemble size, and use it to infer the number of unstable-neutral modes.

Data assimilation with the VL20 model: general features
In a DA experiment, a "practical" way to achieve this is to strengthen the observational constraint (i.e., to increase the spatial and temporal density of the measurements); here we observe the full system's state at every time step.
The VL20 model represents four main physical mechanisms: i) the conversion between kinetic and potential energy; ii) the energy injection from the external forcing; iii) the advection; and iv) the dissipation. Although these processes all participate in the evolution of the model, with nonlinear interplays that cannot be straightforwardly disentangled, we shall refer to them when interpreting the outcome of the DA experiments. In particular, in each experiment we will attempt to identify the prevailing mechanism among the aforementioned four. We perform three experiments, in which we observe the full system state (i.e., H = I_36), or alternatively X or θ alone (implying in both cases H ∈ R^{18×36}). Results are given in Fig. 3, which displays the time-averaged nRMSEa (global, or for the dynamics or the thermodynamics only) over a range of the coefficient α that modulates the energy transfer rate.
Overall, and as expected, the analysis error is smaller in the observed variables (cf. the left and mid columns and corresponding colour lines), and attains its smallest level when X and θ are simultaneously observed (right column). Nevertheless, a few remarkable points can be raised. First, when the system is fully observed, for large α (i.e. for large conversion between available potential energy and kinetic energy) the skills in X and θ get very similar (right column): we conjecture this to be a consequence of the system getting more evenly turbulent, with all variables sharing a similar internal variability as energy is exchanged efficiently between the kinetic and potential forms. Second, for small α (i.e., small energy conversion), the effect of the external forcing becomes dominant and determines the analysis error of X and θ (last column in Fig. 3). For instance, whenever the momentum is externally forced (F = 10), the error in X is systematically smaller than in θ (first and second rows of the last column): DA is more effective in controlling the dynamics than the thermodynamics even when they are subject to the same observational constraint. The situation is somewhat different when only the thermodynamics is forced (F = 0, G = 10): the analysis errors of the momentum and of the thermodynamic variable are undifferentiated. With small α and no forcing of the momentum, the nonlinear momentum advection is limited by the small magnitude of the momentum, which is not able to activate the dynamical variables much, so that we observe similar analysis errors in the thermodynamic variable and the momentum.
Finally, the effect of the energy transfer and of the advection can be revealed by looking at the partially observed experiments (left and mid columns). Both mechanisms involve the momentum, making it more efficacious to observe X than θ, especially in the energy-transfer-dominated regime (large α). However, in an advection-dominated regime (small α), if θ is unobserved, X has limited capability to constrain the error in θ due to the weak feedback from θ to X. On the other hand, observing θ reduces the error in X via the accurate estimate of the advection process of θ (see mid column).
Further insight on the role of the driving (unstable) variable, and on the interplay between the prevailing physical mechanisms and the analysis error, is given by looking at the CLVs (Kuptsov and Parlitz, 2012). In Fig. 4 we show the (normalised, time-averaged) absolute amplitude of the CLV components along the state vector: it tells us which variables/components have the largest influence on each CLV, thus indicating which processes contribute more to a specific direction of error growth/decay. As discussed above, the change of α induces the shift from the advection-dominated regime to an energy-mixing one, where the thermodynamics and the kinetic energy mix with each other: these two regimes are portrayed in Fig. 4 by selecting α = 0.4 and α = 2.2. Moreover, these two values of α correspond roughly to those giving the largest differences in nRMSEa between the momentum and the thermodynamic variable (cf. left and mid panels of Fig. 3). For small energy exchange (α = 0.4, left column in Fig. 4), the model instabilities are driven by the external forcing, with the driving variable being the one where energy is injected. This is clearly visible when comparing the amplitudes of the CLVs between X and θ in the left column of Fig. 4: larger amplitudes of the unstable-neutral CLVs are found in the forced variables. When the momentum and the thermodynamics are equally forced (blue lines), the amplitudes of the unstable-neutral CLVs for X and θ are close to each other.
The nonlinear advection process intensifies the error growth, especially for X. The nonlinear advection and the momentum are of lesser importance if the momentum is not forced (F = 0, G = 10, green lines), while the thermodynamic processes dominantly control both the stable and the unstable subspace. The thermodynamic variable on the stable subspace acts as an energy sink that stabilises the dynamical system. The effect of the thermodynamics shows up noticeably in the large relative amplitude of the CLVs of the thermodynamic variable in the stable subspace when the momentum is directly forced (F = 10, blue and red lines).
The situation changes considerably when the energy exchange is the dominant physical mechanism (α = 2.2, right column).
This causes a stronger mixing across the model variables, so that both X and θ play a comparable role in the unstable-neutral components of the CLVs, leading to similar CLV amplitudes for all types of forcing. Remarkably, the effect of the energy conversion also applies to the stable components of the CLVs, leading to similar CLV amplitudes between X and θ.
The results in Fig. 4 reveal the effect of the prevailing physical mechanisms in determining the driving unstable variables.
The figure suggests which variables should in principle be controlled, by targeting measurements on the portion of the system's state vector with larger amplitude on the unstable-neutral CLVs.
Along with α, the energy in the system is modulated by the dissipation, γ: larger values of γ imply an efficient removal of energy from the system, thus reducing the system's variability in both the potential and the kinetic energy. At the dynamical level, the parameter γ controls the contraction of the phase space, as the sum of all Lyapunov exponents (equal to the average flow divergence) is -nγ. Hence, one expects larger values of γ to correspond to weaker instability of the model, as in the case of the classical L96 model (Gallavotti and Lucarini, 2014). Fig. 5 is the same as Fig. 3 but for the dissipation, γ.
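The stated identity between the sum of the LEs and the average flow divergence, -nγ, can be checked numerically via the trace of the Jacobian of the tendencies; the right-hand side below is again our reading of the VL20 equations, which is an assumption to be checked against the original paper.

```python
import numpy as np

def vl20_rhs(state, alpha=1.0, gamma=1.0, F=10.0, G=10.0):
    """Tendencies of the VL20 model (our reading of the equations)."""
    m = state.size // 2
    X, th = state[:m], state[m:]
    dX = np.roll(X, 1) * (np.roll(X, -1) - np.roll(X, 2)) \
         - gamma * X - alpha * th + F
    dth = np.roll(X, -1) * np.roll(th, -2) - np.roll(X, 1) * np.roll(th, 2) \
          + alpha * X - gamma * th + G
    return np.concatenate([dX, dth])

n, gamma = 36, 1.0
x = np.random.default_rng(4).standard_normal(n)
eps = 1e-6
# finite-difference trace of the Jacobian = divergence of the flow
div = sum((vl20_rhs(x + eps * e) - vl20_rhs(x - eps * e))[i] / (2 * eps)
          for i, e in enumerate(np.eye(n)))
print(div)     # ≈ -36 = -n*gamma: state-independent phase-space contraction
```

The advection and coupling terms are divergence-free, so only the linear damping contributes, giving the constant contraction rate -nγ at any point of the phase space.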
Overall, we see that, with large γ, the system's internal variability reduces and we find similarly small errors in both X and θ. For weaker dissipation, the momentum is better controlled than the thermodynamics. With partial observations (left and mid columns), the error is much larger than in the corresponding fully observed cases. Similarly to Fig. 3, the momentum is generally better reconstructed by the DA than the thermodynamics, although observing the latter appears more efficacious (i.e. it leads to smaller analysis error) than observing the momentum. We think that this is due to the prevailing mechanism being the advection of the thermodynamics, given that α = 1 in these experiments (cf. also Fig. 3). The amplitudes of the CLVs along the state vector are studied in Fig. 6. We consider the cases γ = 0.4 and γ = 1.0, for which the difference in the nRMSEa between X and θ is roughly the largest (cf. Fig. 5).
With the leading CLVs strongly affected by the external forcing, the amplitude of the CLVs along the system's components is similar to the pattern of the low-energy-exchange case of Fig. 4, where α = 0.4, even though here α = 1. This confirms that our experiments lie in the regime dominated by advection: dissipation does not mix the kinetic and potential energy as the energy exchange does, but rather uniformly removes both types of energy without changing the prevailing physical mechanism. This is also reflected in the consistently low nRMSEa for the observed variable across dissipation rates in the partially observed experiments (see Fig. 5). The decreasing analysis error in Fig. 5 corresponds to increasing γ, which reduces the dimension of the unstable-neutral subspace, with an increased relative importance of the forced variables in the unstable-neutral subspace as the fast energy removal reduces the amount of energy mixing.
The results of Sect. 4.1 confirm the relation between the performance of DA (in terms of analysis error) and the dimension and characteristics of the unstable-neutral subspace. In particular, we conclude that successful DA relies on controlling the error in the unstable-neutral subspace by observing the variable that drives the error growth. The VL20 model enabled the investigation of the relation between DA and specific physical mechanisms such as advection, the energy transfer between dynamics and thermodynamics, and dissipation. The effect of DA (i.e. its efficacy) is strongly influenced by the form of the coupling between the unobserved and the observed variables, which is in turn shaped by the prevailing physical mechanisms.

Inferring the degree of model instability with data assimilation
The derivation in Sect. 2 shows that, in the linear setting, the assimilation error is asymptotically bounded from above by a factor depending on the observation error, the first LE, and the number of unstable-neutral modes of the underlying forecast model. In this section we explore the extent to which this result holds in a nonlinear scenario in which the observational constraint is strong enough that the error evolution remains approximately linear or weakly nonlinear. We shall make use of numerical experiments with the VL20 model.
A first insight into the existence of a direct relation between the model instabilities and the skill of the EnKF-N is already provided by Figs. 3 and 5. They display the Kolmogorov-Sinai entropy, σ KS (black line), and the first LE, λ 1 (amplified by a factor of 3; gray line), along with the nRMSEa (discussed in Sect. 4.1). Even by mere visual inspection, the figures clearly evidence the high correlation between the analysis error and both σ KS and λ 1.
The nature of this relation is further studied in Fig. 7, which shows scatter plots of the nRMSEa (black markers) against σ KS and λ 1. The linear relation does not hold when ln(nRMSEa) < −4 (see the distribution of the black markers in the panels' insets). We explain this behaviour as follows. The wide clouds of points all correspond to model configurations with very small σ KS and λ 1. In these quasi-stable dynamics, the error growth between successive analyses is very small, with occasional error decay. The observational error, which is random and white-in-time, will often be larger than the forecast error and will dominate the analysis error, thus breaking its direct dependence on the instability-driven forecast error. In addition, in the weakly unstable model configurations, the most unstable direction behaves almost like a neutral mode, which also breaks the assumptions behind the theoretical upper bound for unstable models. For these reasons, we do not show the weakly unstable results with σ KS < 1 in either the σ KS or the λ 1 panels.
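In the regime where the relation holds, the inference of σ KS (or λ 1) from the DA skill amounts to inverting a fitted linear law between ln(nRMSEa) and the instability indicator. The following Python sketch illustrates the idea only: the calibration pairs are synthetic placeholders, not values taken from our experiments.

```python
import numpy as np

# Hypothetical calibration data: pairs (sigma_KS, ln nRMSEa) from DA runs on
# model configurations whose instabilities are known. Synthetic placeholders.
sigma_ks = np.array([1.5, 2.0, 3.0, 4.5, 6.0, 8.0])
ln_nrmse = np.array([-3.1, -2.9, -2.5, -2.0, -1.5, -0.9])

# Least-squares fit of the approximately linear law ln(nRMSEa) = a*sigma_KS + b.
a, b = np.polyfit(sigma_ks, ln_nrmse, 1)

def infer_sigma_ks(nrmse_a):
    """Invert the fitted relation: estimate sigma_KS from an observed analysis error."""
    return (np.log(nrmse_a) - b) / a

# A new DA run with a given nRMSEa maps to an estimated sigma_KS:
print(infer_sigma_ks(0.15))
```

The same regression-and-inversion procedure applies to λ 1, with the caveat discussed above that it breaks down for quasi-stable configurations where the observational noise dominates the analysis error.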
The error bounds in Sect. 2 rely on the assumption of linear error evolution, a condition met in our experiments thanks to a strong observational constraint, with (synthetic) measurements covering the full state vector at each time step.
These conditions are rarely achievable in practice, so it is relevant to explore how the results change under a weaker observational constraint. There are three direct ways to achieve this, by acting on (i) the number/type of measurements, (ii) the measurement error, and/or (iii) the temporal frequency of the observations.
The effect of the first is studied in Fig. 8, which is similar to Fig. 7 but for DA experiments in which only one of the two sets of variables of VL20 is observed.
Partially observing the system causes the emergence of a weakly quadratic relationship between the analysis error and either σ KS or λ 1. Nevertheless, the analysis error is still uniquely and monotonically related to them, especially for σ KS.
A quadratic law requires one additional coefficient to be determined compared to a linear law; yet, the mere existence of such a law suggests again that one could in principle infer σ KS and/or λ 1 from the analysis error. With the relaxed observational constraint, the analysis error can (and in several instances does) exceed the theoretical upper bound. However, the general trend of the numerical experiments still follows the theoretical upper bound.
We study the effect of changing the amplitude of the observational error in Fig. 9. Results reveal that varying the observation error in the range 5%−10% does not break the quasi-linear relationship between the analysis error and σ KS or λ 1. The nRMSEa is quite insensitive to the observation variance due to the normalisation. Nevertheless, the upper bound is not violated, as in Fig. 7, and the slope of the nRMSEa from the numerical experiments is remarkably similar to the slope of the theoretical bound.
Finally, the impact of varying the observation frequency is explored in Fig. 10. It is patent that decreasing the frequency blurs the linear relation between the analysis error and σ KS or λ 1. There is a clear deviation from the trend of the theoretical upper bound, and from the uniqueness of the relation between analysis error and σ KS, as soon as the observational time interval exceeds the error doubling time (which is inversely related to λ 1) and the DA error evolves beyond the linear regime.
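The doubling-time criterion can be made explicit: if the forecast error growth is dominated by the first LE, ||e(t)|| ∼ exp(λ 1 t), so the error doubles over ln(2)/λ 1 time units. A minimal Python sketch of this check follows; the threshold interpretation (linear regime expected only when the observation interval is below the doubling time) is our heuristic reading of the experiments above, not a sharp theoretical result.

```python
import numpy as np

def error_doubling_time(lambda1):
    """||e(t)|| ~ exp(lambda1 * t) doubles when lambda1 * t = ln(2)."""
    return np.log(2.0) / lambda1

def linear_regime_expected(obs_interval, lambda1):
    """Heuristic: the linear error/instability relation degrades once the
    interval between observations exceeds the error doubling time."""
    return obs_interval < error_doubling_time(lambda1)
```

For instance, with λ 1 = 0.7 the doubling time is about 0.99 time units, so an assimilation interval of 0.5 would be expected to preserve the linear relation while an interval of 2 would not.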
However, for frequent enough observations, a linear relation similar to the upper bound appears and, again, one could in principle deduce σ KS and/or λ 1 based on DA.
Finally, note that the different effects of the observational noise and of the data frequency depicted in Fig. 9 and Fig. 10 are consistent with the findings in  (their appendix), where it is shown that the observation frequency is a much more effective driver to induce (or to break) linear error evolution.

Conclusions
It is sometimes necessary to obtain information on the instability of a system of interest by performing data analysis of suitably defined observables. This is of key importance when one does not have direct access to the evolution equations of the system or when the analysis of its tangent space is too computationally burdensome. As an example, quantitative information on the degree of instability of a chaotic system can be extracted using extreme value theory, by studying the statistics of close dynamical recurrences as well as of extremes of so-called physical observables (Lucarini et al., 2016). The use of such a strategy has shown great potential for the analysis of geophysical fluid dynamical models in a highly turbulent regime (Gálfi et al., 2017), as well as for the understanding of the properties of the actual atmosphere (Messori et al., 2017).
In this study, we have addressed this problem from the angle of DA. The relation between DA and the instability of the dynamical system to which it is applied has long been studied (see e.g. Miller et al., 1994; Carrassi et al., 2008), and has been used to design DA techniques in various fields of geosciences (Carrassi et al., 2021; Albarakati et al., 2021). Here, we have reversed this viewpoint and investigated the possibility of using DA to infer fundamental quantities of the underlying dynamics, in particular the Lyapunov exponents, λ i, or the Kolmogorov-Sinai entropy, σ KS. The basic idea is to look at DA as a control problem, and to relate our ability to control the system, ceteris paribus, to its underlying instability. We have built on a stream of previous works that set the theoretical foundation and proved the convergence of the error covariance of the Kalman filter onto the unstable-neutral subspace of the dynamical system. Based on this, we derived an upper bound of the Kalman filter forecast error, valid under the assumptions of linear model dynamics and a linear observation operator. The upper bound is very informative, as it relates the error's amplitude to all of the essential descriptors of the model instabilities on the one hand and of the DA on the other: the dimension of the unstable-neutral subspace, n 0, the first Lyapunov exponent, λ 1, the interval between assimilated observations, ∆t, and the observation error, β k. By properly normalising the bound by the observation error, it can be written as a function of the model's dynamical properties exclusively.
The existence of a relation between λ 1 or σ KS and the DA skill, as well as the validity of the bound, has then been investigated in a nonlinear scenario using numerical experiments. We have used the EnKF-N (Bocquet et al., 2015) as a prototype of a deterministic EnKF (Evensen, 2009), together with the model recently developed by Vissio and Lucarini (2020). The VL20 model is an extension of the widely used Lorenz 96 model that includes a thermodynamic component. While maintaining all of the virtues of a low-dimensional model suitable for investigating new methods at low computational cost, VL20 is conceptually much richer than the original L96 model. In particular, it allows for the exchange of energy between a kinetic and a potential form, which, together with forcing and dissipation, provides the fundamental framework for the Lorenz (1955) energy cycle. Additionally, as advection impacts the temperature-like variables, one can observe the emergence of more complex dynamical behaviours. By changing the values of its key parameters, in particular those determining its forcing and dissipation, the model explores various dynamical regimes, ranging from fixed points to periodic, quasi-periodic, and chaotic behaviour. In terms of DA, the VL20 model has the attractive feature of including two qualitatively different sets of variables, associated with dynamics and thermodynamics, respectively. Hence, it is possible to explore the problem of partial observations beyond focusing on their spatial extent only.
We have demonstrated that the skill of the EnKF-N is directly linked to both λ 1 and σ KS. Whenever the error within the EnKF-N cycles is kept sufficiently linear via a strong observational constraint, the relation is clearly linear too. By relaxing the observational constraint (either by reducing the frequency of the measurements or by increasing their noise), deviations from linearity emerge. Nevertheless, the linear relation is very robust against the level of observational noise (within a certain range), while it turns quadratic once the interval between successive measurements gets too large and exceeds the system's doubling time.
Similarly, we found that the theoretical upper bounds for the errors, derived for linear systems, still hold as long as the observational constraint is strong enough, but are violated otherwise.
The error bound and the linear relation between the error and λ 1 and σ KS represent a potentially powerful direct way to infer λ 1 and σ KS by looking at the output of a DA exercise. Knowing the analysis error (or a suitable approximate estimate of it), computing σ KS or λ 1 requires the unstable-neutral subspace dimension, n 0. This can be obtained by looking at the convergence of the analysis error when increasing the ensemble size, N: n 0 will be equal to N* − 1, where N* is the smallest ensemble size for which the error reaches its minimum.
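A minimal sketch of this last step, assuming one has already run the DA for a range of ensemble sizes and stored the corresponding analysis errors (the numbers below are hypothetical, chosen only to show a plateau):

```python
import numpy as np

def unstable_neutral_dim(ens_sizes, analysis_errors, tol=0.05):
    """Estimate n0 = N* - 1, where N* is the smallest ensemble size whose
    analysis error lies within a relative tolerance of the minimum error
    attained over all tested sizes (i.e. the start of the error plateau)."""
    errs = np.asarray(analysis_errors, float)
    emin = errs.min()
    for N, e in zip(ens_sizes, errs):
        if e <= emin * (1.0 + tol):
            return N - 1
    return None

# Hypothetical error-vs-ensemble-size curve: the error plateaus from N = 9
# onwards, suggesting n0 = 8. Illustrative values, not from our experiments.
sizes = range(2, 15)
errors = [1.9, 1.5, 1.1, 0.8, 0.55, 0.4, 0.31,
          0.25, 0.249, 0.25, 0.248, 0.25, 0.249]
print(unstable_neutral_dim(sizes, errors))  # -> 8
```

The tolerance parameter is an assumption of this sketch; in practice it should reflect the sampling noise of the error estimates across repeated DA runs.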
There are some follow-up questions that emerge naturally from this work. Among those, we are currently considering how these results change when performing DA for combined state and parameter estimation. In this context, a relevant recent study has shown that the minimum number of ensemble members, N*, needs to be increased to include as many members as the number of parameters to be estimated. By modifying its parameters, the model's instability properties will change too, potentially inducing a catastrophic change (a tipping point) of its long-term behaviour. Data assimilation will then need to infer the best parameter values to track the data signal and keep the DA solution in the same region of the bifurcation diagram.
Author contributions. Yumeng Chen designed and conducted the experiments, and prepared the manuscript. Alberto Carrassi and Valerio Lucarini provided the original idea and contributed to the writing of the manuscript. All authors then contributed to developing the work.