Multiple scale error growth in a convection-resolving model

The properties of the multiple scale instabilities present in a non-hydrostatic forecast model are investigated. The model simulates intense convection episodes occurring in Northern Italy. A breeding technique is used to construct ensembles of perturbations of the model trajectories aimed to represent the instabilities that are responsible for error growth at various time and space scales. By means of perfect model twin experiments it is found that for initial errors of the order of 5 present-day analysis error, a non-negligible fraction of the forecast error can be explained by a bred vector ensemble of reasonable size representing growth of errors at intermediate scales. In contrast, when the initial error is much smaller, the spectrum of bred vectors representing the fast convectivescale instabilities becomes flat and the number of ensemble members needed to explain even a small fraction of the forecast error becomes extremely large. The conclusion is that as the analysis error is 10 decreased, it becomes more and more computationally demanding to construct an ensemble that can describe the high-dimensional subspace of convective instabilities and that can thus be potentially useful for controlling the error growth.


Introduction
In the last 10-15 years there has been operational interest for non-hydrostatic, convection-resolving 15 models (e. g. Dixon et al., 2009;Seity et al., 2011;Baldauf et al., 2011) and for the possibility of making use of data assimilation (e. g. Zhang et al., 2003;Kain et al., 2010;Schenkman et al., 2011;Claussnitzer et al., 2011) to improve short range forecasting and nowcasting of weather fields, or the accuracy of a past event trajectory reconstruction. Some attention has been devoted to evaluate whether a higher resolution and a more realistic description of convective events are in fact associated 20 to a quantitative improvement in forecasting their location, timing and intensity (e. g. Weisman et al., 1 2008;Schwartz et al., 2009).
Prediction and data assimilation systems for the ocean and the atmosphere are largely funded upon the theory, developed over the last decades, of predictability and state estimation in chaotic dynamical systems. Atmospheric NWP (Numerical Weather Prediction) models that predict the synoptic 25 scale evolution of midlatitude weather systems are based on the hydrostatic assumption and include parameterisation of convective processes. Ensemble forecasting and data assimilation for such models have reached a mature stage, whereas non-hydrostatic models face the hard task of dealing with the fast instabilities typical of convection. Predictability in the presence of multiple scales has been investigated in several studies starting with the pioneering work of Lorenz (1969). Errors in the 30 small-scale weather features such as thunderstorms grow very rapidly and reach saturation within a timescale comparable to their brief lifetime, while errors in larger scales may still be in a linear stage of growth. A comparison of the time scales involved in prediction of hydrostatic versus nonhydrostatic models has been performed by Hohenegger and Schär (2007a), shedding some doubts on the feasibility of borrowing algorithms developed in the context of the former for the application 35 to the latter models.
In the present study we will try to look more deeply into this question, using as a "laboratory" a realistic model (MOLOCH,Sect. 3) and applying to a real case study the classical breeding method.
The relation between bred vectors and Lyapunov vectors, developed in a rigorous theoretical framework, states that as the rescaling amplitude goes to zero Lyapunov vectors are recovered from bred 40 vectors (BVs hereafter). This argument can be exploited when there fundamentally is one typical instability scale, as in synoptic scale models. In a convection-resolving model we expect to find a very large number of positive, competing Lyapunov exponents with very large values, making it unfeasible to compute all or even a reasonable number of their corresponding modes.
The flow-dependent character of forecast uncertainty and its estimation has become an important 45 issue in NWP in the last decades starting from the early works of Kalnay and Dalcher (1987) and Lorenz (1996), among others. In the forecasting practice, before Lorenz clarified the dynamical mechanism of selection of instabilities in his 1996 model (Lorenz, 1996), researchers at the National Meteorological Centre (NMC, now NCEP) were well aware of this problem and the breeding method Kalnay, 1993, 1997) was devised for its solution. When instabilities in a wide range of 50 scales is present, breeding provides a useful tool for selecting the instability scale one wants to investigate by tuning the breeding parameters, i. e. the rescaling amplitude and time interval. For instance, if the rescaling amplitude is larger than the typical saturation amplitude of a given scale, unstable structures characteristic of that scale will not appear in the bred vectors.
While the importance for prediction of a rigorous description of flow instabilities in terms of the 55 Lyapunov vectors with positive exponents that define the unstable subspace has long been recognised, the introduction of these concepts and the use of Lyapunov vectors in the state estimation problem and data assimilation algorithms is more recent. Starting with the paper by Trevisan and 2 Uboldi (2004), a number of studies (Uboldi and Trevisan, 2006;Carrassi et al., 2007Carrassi et al., , 2008aPalatella et al., 2013) have shown that, by confining the Assimilation of observations in the Unstable 60 Subspace (AUS) of the system, one can construct data assimilation algorithms that are computationally more efficient (EKF-AUS, Trevisan and Palatella, 2011b) and in some cases more accurate (4DVar-AUS, Trevisan et al., 2010) than their classical counterparts: the Extended Kalman Filter (EKF) and 4-Dimensional Variational assimilation (4DVar). A review on the role of instabilities in weather forecasting and data assimilation is found in Trevisan and Palatella (2011a).

65
For geophysical models, the huge number of degrees of freedom constitutes the main difficulty whenever a description of the error probability distribution is attempted, in particular in the application of data assimilation algorithms. During the forecast stage, errors grow in the unstable subspace, which can be of much lower dimension than the full phase-space dimension of the model: the algorithms that perform the assimilation in the unstable subspace take full advantage of the reduced 70 dimension in which they operate. In high resolution non-hydrostatic models, the presence of a large number of fast instabilities and of spatial scales as small as convective cells makes the problem of high dimensionality particularly critical.
One of the goals of the present paper is to investigate, in a realistic high-resolution, convectionresolving model, what is the structure of the forecast error in relation to the unstable modes at various 75 scales that can be estimated by breeding. One question we try to address is the following: in line of principle, can an assimilation algorithm keep under control a reasonable fraction of the errors that grow on such a wide spectrum of space and time scales?

Preliminary remarks
Limited-area models are forced by lateral boundary conditions, that represent an error source in 80 addition to initial state errors and to errors intrinsic to the model dynamics (model error), which are present in all operational systems. Moreover, initial and model errors evolve in a convoluted manner (a relevant subject in recent studies, e. g. Nicolis et al., 2009;Carrassi and Vannitsem, 2011).
The goal of this work is, however, to characterise the error growth at various dynamical scales. This is obtained by studying the evolution and growth of errors that are only introduced at initial 85 time: both model and boundary conditions are then "perfect", i. e. they are the same in the "true" trajectory and in all, control and perturbed, trajectories. The reason of this choice is that the introduction of model and/or boundary error would obscure the interpretation of results: it is important not to attribute to newly introduced errors any behaviour which is only and specifically due to error dynamical evolution. At the same time we feel the need to caution the reader against too optimistic 90 an interpretation of any results obtained in the absence of model and boundary errors.
A reader not particularly familiar with the perturbation breeding technique, besides the original papers Kalnay, 1993, 1997) may refer to the book by Kalnay (2003), where the relation between bred and Lyapunov vectors is discussed. See also Trevisan and Pancotti (1998); Legras and Vautard (1996); Wolfe and Samelson (2007). A classical method to compute Lyapunov exponents 95 and vectors is to make use of frequent Gram-Schmidt re-orthogonalisation of small perturbations (Benettin et al., 1980). The review by Trevisan and Palatella (2011a) discusses the role of unstable Lyapunov vectors in atmospheric predictability and data assimilation. Finally, the perturbative equations relevant for breeding in both an unforced dynamical system and a in system forced by the assimilation of observations are presented in Section 2.1 of the paper by Uboldi and Trevisan (2006).

100
Although throughout this work we broadly associate fast growth to small scales and slow growth to large scales, this should not be intended as a very rigid spatial characterisation of error and perturbation fields, especially in the nonlinear regime of growth.

Model and case study
The model is MOLOCH, developed at CNR-ISAC, Bologna, Italy (Buzzi et al., 2004;Zampieri et al., 105 2005; Malguzzi et al., 2006;Davolio et al., 2006Davolio et al., , 2009a 1 . MOLOCH is a convection-resolving, non-hydrostatic atmospheric model, running at about 2.2 km resolution with 50 hybrid vertical levels. State variables include temperature, pressure, three velocity components, specific humidity, concentrations of cloud water and ice and of three precipitation phases: rain, snow, and hail (plus temperature and water content at 5 ground levels). The domain includes the Alps, Northern Italy 110 and Corsica, with portions of Ligurian, Tyrrhenian and Adriatic Seas. The reference trajectory is a simulation of the real case of 26 September 2006 (Davolio et al., 2009b), with initial and boundary conditions from GFS (Global Forecasting System) 2 and BOLAM (Buzzi et al., 1998(Buzzi et al., , 2004Zampieri et al., 2005;Malguzzi et al., 2006). In this case-study, the circulation at 500 hPa is characterised by a deep trough West of the Alps. Near the surface, a pressure low is located in the Genoa Gulf,

Characterising the fastest instabilities: an extension of previous results
The results reported in this Section regard the fastest instabilities, associated to the smallest dynamical scales resolved by the model. Results presented in this Section were obtained by simulations performed in a smaller domain, not including Corsica and the Western Alps (Uboldi, 2010;Uboldi 130 et al., 2010), and the integrations only covered the time interval between 00:00 and 12:00 UTC on 26 September. The time scales that characterise the growth rate and saturation of instabilities associated to convection in the 26 September case are investigated in the line of Hohenegger and Schär (2007a).
First, a perturbation breeding technique has been used: two initial random, independent pertur- In order to estimate the linearity time, two perturbed states are obtained by adding, to the state of the reference trajectory, two perturbations with the same structure (an organised bred vector), but with opposite signs. These perturbed states are non-linearly evolved, and the spatial correlation of the perturbations (i. e. the cosine of the angle between the perturbation vectors) is used as a lin-150 earity indicator (Hohenegger and Schär, 2007a). The two perturbations, that initially (01:30 UTC) have the same direction (the cosine is −1), progressively loose their alignment as non-linearity becomes important. Conventionally, linearity is considered lost when the spatial correlation exceeds the threshold −0.25 (for random states it would be +0.5, Hohenegger and Schär, 2007b). The scalar product used to compute spatial correlation is extended to the whole domain, and to variables T, U 155 and V after scaling each of them with its own variability (estimated as described above). A minor dependence on variable has been found when the spatial correlation is computed for each variable separately: global linearity appears to be lost when it is lost in the horizontal velocity field. Estimates Recalling that the perturbed states evolve according to the full non-linear dynamics, the growth rate has been estimated by averaging during the linear growth regime, when the logarithm of the 165 amplification factor is approximately a linear function of time (and the growth rate is approximately constant). The estimates of the doubling time are T D 2.5 h, approximately the same value as T lin , during the night, and T D 2.0 h, smaller than T lin , during the morning.
Comparing T D with T lin is important because when T lin is shorter than T D , errors may reach nonlinear saturation so quickly that one can hardly see a linear growth phase. In principle, a reduction of 170 the initial error cannot guarantee a reduction of the forecast error after a time when non-linear error saturation has been reached (Hohenegger and Schär, 2007a).
By comparing T D with T lin in the present case, the morning episode of organised convection, though more intense, appears more stable and predictable than the scattered convection episode occurring during the night.

175
The values obtained here (for this system and case study) for the ratio T lin /T D appear a bit more promising than those obtained by Hohenegger and Schär (2007a), at least in the perspective of very short-range forecasting or nowcasting.

Experiments
In Sect. 4 we described perturbation growth and saturation properties with regard to the fastest-180 growth instabilities associated to the smallest dynamical scales. This was obtained by breeding perturbations, with a very high rescaling frequency and a very small rescaling amplitude. Such amplitude is much smaller, in fact, than the order of magnitude presently achievable in the operational practice for analysis error.
With a rescaling amplitude of the same order of magnitude as the analysis error, and an appro-185 priately lower rescaling frequency, the bred vectors do not describe the fast instabilities, which then have sufficient time to saturate between successive rescaling times.
As an example of a saturated small-scale error, the position of a particular, localised and isolated convective feature can be off by some distance in a "true" and in a "control" trajectory. As long as this offset is shorter than the spatial width of the disturbance itself, the error may grow in a linear fashion.

190
When the offset is larger than the disturbance, though, the error is determined by the presence, in the two state fields, of two separate, non-overlapping signals of approximately the same size. As they drift away from each other, the RMS state difference (any variable) does not change steadily, so that this error component does not effectively grow anymore. Saturated small-scale instabilities are not directly responsible for further error growth, but are expected to affect, by nonlinear interaction, 195 larger scales that may still be in a regime of linear growth. These large scales are mainly responsible for error growth at an error level comparable to the analysis error.
Under these premises, we perform an experiment where the error, i. e. the state difference between a trajectory representing the truth and a control run representing the forecast, has an initial amplitude (norm) compatible with the present-day analysis error. We then examine the ability of bred vectors 200 to capture the forecast error. In order to do so and in view of the different scales that may be involved in the growth of forecast errors, a range of values for rescaling amplitude and frequency is taken into consideration. In a second parallel experiment, an hypothetical analysis error an order of magnitude smaller is considered.
The experimental framework is that of a twin experiment, were a model trajectory is considered 205 to represent the "truth" and is compared with one or more control trajectories. Here the true trajectory is a free MOLOCH evolution initiated by an external model state (from the same BOLAM The second control trajectory has been obtained from the first one, by rescaling by 0.1 all components of its error vector at 21:00 UTC on 25 September in order to have a smaller initial error and again using the same boundary conditions. This second control trajectory, labelled R21, is used to construct bred vectors intended to estimate small-scale, fast instabilities (still slower, though, than 220 those studied in Sect. 4).
In any case no comparison is done before 00:00 UTC on 26 September, so that all trajectories have time enough to develop the dynamical scales typical of a MOLOCH non-hydrostatic, convection resolving simulation. The experiments described in Sect. 4 suggest to wait an initial 3-hour period for both control and "true" states to approach the system attractor and for the bred vectors to acquire 225 structures representative of the unstable directions of the system. Remark that the bred vectors are initiated as random noise (Sect. 5.2) at 21:00 UTC on 25 September 2006 and that, at this very initial time, the error is due to features present in the external model field. Instead, after 00:00 UTC on 26 September, forecast error and bred vector structures are those typical of the convection-resolving model dynamics, making it possible and significant to compare them.

230
The procedure used in the experiments is very similar to what is often done in the operational practice, when the state used as the initial condition for a non-hydrostatic model is taken from a hydrostatic model that also provides boundary conditions. This procedure implies the absence, in the initial control state, of small and fast scales typical of a convection-resolving model: these dynamic scales, present in reality, are also present in our experiments as we gave enough integration time for 235 7 them to develop.

Norm and scalar product definition
For both control trajectories, 18H and R21, the error vector is orthogonally projected onto orthonormalised bred vectors and the square error component, which is a function of the number of bred vectors used, is expressed as a percentage of the square norm of the whole error vector (Sects. 5.2 240 and 6.2). In order to orthogonalise the bred vectors and to compute the error orthogonal projection, the definition of a scalar product is needed. Our choice, appearing as the most natural, is the sum of component products extended to the whole three-dimensional domain (actually vertical levels 5-45 only) for the variables temperature and horizontal velocity, each normalised with its own variability (Sect. 4). This is called TUV scalar product hereafter. The norm of a vector is by definition the 245 square root of the scalar product of the vector with itself; the TUV norm is then dimensionless; it can be interpreted as a fraction of the overall state variability. Since the state vector is composed by many other variables besides T , U , and V (pressure, specific humidity, vertical velocity, concentration of condensed water phases, etc.), this is more properly a seminorm: it is considered sufficient, though, to evaluate errors and state differences, because the chosen variables have the most direct 250 impact on the dynamics. The error size is defined as the TUV norm of the vector difference between control and true state.

Breeding and orthonormalisation procedure
A breeding scheme is characterised by a rescaling period and amplitude. The rescaling period is set to 30 min for trajectory 18H (larger initial error), and, to estimate faster instabilities, to 15 min  Every three hours all bred vectors are Gram-Schmidt orthonormalised with the TUV scalar product; the error, i. e. the vector difference between control and true state, is then projected onto the orthonormal bred vectors. The ratio between the sum of square error projections and the square norm of the whole error is calculated and expressed as percentage: this is an increasing function of the number of bred vectors used (it would be 100 % for a complete orthonormal basis). A similar tech-270 8 nique was used by Wei and Toth (2003) to compare forecast error estimates with member subspaces in an ensemble prediction system.
In each experiment, 12 bred vectors are computed (this was the maximum possible allowed by computer memory). A second set of 12 bred vectors was computed in some cases, to assess the dependence on increasing subspace dimension of the projected error component.

Multiple scale instabilities in the forecast error and in Bred Vectors subspaces
As stated in the introduction, one of the aims of the present work is to extract information about the instabilities at play from the inspection of bred vectors and their associated growth rates. A second 285 related question is whether the forecast error can be described by BVs in a model such as the present one, which includes the fast small scales typical of convection. In the case of a positive answer, the algorithms developed for ensemble weather forecasting and ensemble data assimilation could be successfully used also in the context of nonhydrostatic models.
We start discussing Fig. 1, that shows the forecast error as a function of time for the two control 290 trajectories starting from initial errors with the same spatial structure, but with different amplitude: one, R21, an order of magnitude smaller than the other, 18H. For both trajectories the growth rate is positive in the initial phase, until about 05:00 UTC for trajectory 18H and 09:00 UTC for trajectory R21, approximately corresponding to episode (1) of Sect. 3, and in the last part of the simulation, from 11:00 UTC to the end, when the intense phenomena progressively extend from North-Eastern

295
Italy to almost the whole domain.
We observe that, whereas the ratio of the initial errors in the two control trajectories was initially as large as 10 (in view of the hypothetical assumption that the analysis error could be reduced by an order of magnitude) after 21 h the ratio of the forecast errors has become smaller than 1.5. The error, during the first 10 h in fact grows much faster in the small initial error trajectory (R21), whereas 300 during the second episode the error growth of the two trajectories is very similar. This is because the growth rate is dominated by (small scale) fast instabilities when the initial error is small (regardless of its initial spatial structure, which is the same in both cases), but, as forecast error grows in time, these components saturate and slower instabilities associated to larger dynamical scales become dominant 3 . This is the mechanism suggested in the Lorenz (1996) paper, as well as in Toth and 305 Kalnay (1993), that we here discuss relative to a real case study. We now will see how bred vectors can be useful to understand this behaviour and to quantify the characteristics of instabilities at various scales. Figure 2 shows the "spectrum" of BVs (3-h averaged) growth exponents for the two control trajectories of Fig. 1 describe errors on the various scales present in the forecast error. We observe that in the set of BVs that has been constructed for the control trajectory with larger initial error (18H) and meant to describe larger scale errors present in the forecast, the growth rate generally decreases with BV index.

320
In general, these BVs are more active (have larger growth rates) at the end of the integration period, when slower scales become dominant, and less active at the beginning. There are not many BVs with positive growth rate in this set except during the second episode of error growth.
In contrast, all the BVs of the set for the control trajectory with small initial error (R21) have large growth rates during the entire period, except for a time interval between the error growth episodes; 325 furthermore the growth rate in this set of BVs does not decrease at all with BV index, indicating that a large number of directions are present in the small scale unstable subspace. In Sect. 6.2 it is shown that this number is much larger than 12. In Sect. 6.2 we examine in more detail whether the forecast error of the two trajectories can be described in terms of the two sets of BVs.

Forecast error component in orthonormalised Bred Vectors subspaces
As described in Sect. 5, we performed the orthogonal projection of the forecast error on the subsets 340 of BVs that were constructed for each of the two control trajectories. The breeding parameters were varied and we explored a wide range of values. Results are shown for several values of the amplitude and the appropriate renormalisation frequency. On the contrary, as it can be seen in the bottom panels of both Figs. 4 and 5, for the small initial error trajectory R21 the maximum error component is obtained as time progresses for an increasing rescaling amplitude. As a consequence, no "optimal value" is found in this case. The maximum error component on bred vectors is found for smaller rescaling amplitudes at the beginning of the 360 simulation, and for larger rescaling amplitudes at later times. In other words, bred vectors corresponding to larger and slower scales progressively become dominant for increasing time. This result is consistent both with the results of Sect. 5.1 and with the theory: small scale unstable structures grow fast in the initial phases and, when they begin to saturate, they trigger growth at larger and slower scales, which become more important later on.

365
An important result is that, when the initial error is small and as long as it remains small, as in the initial part (00:00-03:00 UTC) of trajectory R21, because of the many fast instabilities present in the first stages of error growth, only a small fraction of the forecast error is captured by the bred modes. This is evident in the bottom panels of Fig. 4 and 5. 12 additional bred vectors have been computed for two selected cases 4 : for 18H control trajectory 370 with rescaling amplitudes 0.36 and period 30 min; for R21 control trajectory with 0.100 amplitude and 15 min period. A total of 24 orthonormal bred vectors is then available in these two cases, and the square norm of the error orthogonal projection onto each set is shown as a function of the subspace dimension in Fig. 6. The top panel shows that for trajectory 18H (large initial error), though very different error percentages are reached at different simulation times, increasing the number of 375 bred vectors above 12 does not increase the projected error component in an important way. Assuming that slower growth at very large scales (that could hypothetically be found by further increasing the subspace dimension) is suppressed by boundary forcing, this means that most of the error components are not, in fact, growing when the initial error is so large. Non-growing error components can be composed of localised unstable structures which have already reached growth saturation, and, 380 also, of wider signals, originally present in the large initial error, that are still present in areas that are dynamically rather inactive (the 18H initial state has been in fact constructed from a forecast state of the hydrostatic model BOLAM, Sect. 5). Such larger non-growing signals may appear, in the error field, as locally organised structures, which persist or drift in the domain, accounting for an important portion of the error.

385
Finally, in the bottom panel of Fig. 6, corresponding to trajectory R21 (small initial error), the error components appear to continue increasing, more or less steadily, with increasing subspace dimension. This means that the number of small-scale fast instabilities is large (larger than 24, at least).

Discussion and further developments 390
In a preliminary study, it was possible to evaluate growth and saturation properties of the fastest instabilities associated to the smallest (convective) dynamical scales (Sect. 4). Their doubling time, characterising linear error growth, and their tangent-linear time, characterising the duration of growth in the linear regime, resulted of comparable magnitude, leaving, on the one hand, some theoretical possibilities to make use of some characterisation of the linear regime to control error growth. On the 395 other hand, both these characteristic times resulted quite short (about 2 h) with respect to lead-time lengths of interest for weather forecasting, even if, in principle, still of some interest for nowcasting (6-12 h).
A possibility that would remain open in that context would be the implementation of a sequence of dynamically consistent analysis steps, with appropriate frequency, say every hour, with the per-400 spective of attempting control of the assimilation system trajectory.
The forecast error, built in a twin-experiment framework, has been compared with sets of orthogonalised bred vectors, and its orthogonal projection on bred vectors subspaces has been computed (Sect. 6.2).
When the initial error is comparable with the present-day analysis error, the largest forecast error 405 component on a subspace of 12 bred vectors has been found to have a square norm of about 30 % of the whole square error norm. This appears as a remarkable result, considering the huge linear dimension of the state vector, and the limited number of bred vectors used. Another remarkable point is, though, that when the number of orthogonal bred vectors is increased, the error component on the BVs subspace does not grow much further. Slower mesoscale instabilities, not mainly convective, 410 which remain longer in the linear regime of growth, appear then to be controllable using a small number of independent directions. The "unexplained" 70 % of (square) forecast error appears to be distributed on saturated small-scale unstable structures and on non-growing larger-scale signals present in the error field, directly inherited from the initial state: a field of the larger-scale hydrostatic model providing boundary conditions. These last error components could be reduced or eliminated 415 by assimilating observations, but in fact they are re-introduced each time the model is restarted from an external model field.
Starting with a smaller initial error (using a higher rescaling frequency), the "explained" square percentage of forecast error grows very slowly with the number of bred vectors. Fastest growing, small scale instabilities (purely convective), which are also known to saturate quickly, are important 420 when the error is initially small and appear to be active in a very large number, impossible to control within a reasonably small dimension subspace.

Error evolution after subtraction of the forecast error projection on the bred vectors subspace
In a successive experiment a new state at 03:00 UTC is obtained by taking the vector difference 425 between the 18H control state and its error projection on the 12-dimensional BVs subspace. In this way, the square error of the new state is 30 % smaller than before. together. Undetected error components include many very localised signals, but also a few larger structures (for example one located in the Northern Apennines).
The (RMS) error evolution of the new model trajectory restarted from this new state is shown in Fig. 8 together with, for comparison, the error of the 18H control forecast of Fig. 1. The error 435 appears to grow faster than before: the new trajectory error reaches in about three hours the level of the unperturbed control trajectory, even slightly exceeding it before settling on a similar behaviour.
After restart, the RMS error goes in 3 h from about 0.3 to about 0.4: this corresponds to a doubling time of about 7 h, much longer than that estimated in Sect. 4 for the fastest-growing instabilities, but similar to that of the initial part of control trajectory R21 started from a small initial error (Fig. 1).

440
The error reduction obtained by subtracting the error projection on the BV subspace brings some small-scale error features below their saturation level: in this condition fast error growth can be fueled by strong instabilities.

13
We think that we can summarise all this by drawing the following picture.
Perturbations that grow fast in a non-hydrostatic, convection-resolving system, are associated to 445 small dynamical scales, typical of convection. These fast-growing instabilities may be present in a large number and quickly reach linear growth saturation, and they do this at an error level smaller than the present-day analysis error. An important portion of larger scale instabilities are effectively controlled by boundary forcing. Between these "extremes", there exist some linear growth, so that part of the error can be estimated by breeding (30 % of its square norm in this case) and, in principle, 450 could be eliminated by assimilating observations. However, after subtracting the BV-estimated component, the growth of fast instabilities is reactivated and rapidly brings the error to its previous level in the subsequent free evolution. This means that it would be necessary to assimilate observations very frequently, with an interval of the order of one hour in the forecast-analysis cycle process, in order at least to control the error growth of slower instabilities. Moreover, setting up such a forecast-455 analysis cycle process might in principle make unnecessary the periodical restart from an external model field (as customary in the operational practice), thus avoiding the periodical re-introduction of errors due to the different nature of the external model solution.

Conclusions
In this paper, we have investigated the properties of error evolution in the forecast of a real case study 460 with a convection resolving model. We use the model as a laboratory where we assume that a model trajectory represents the truth and we can vary the amplitude and structure of the initial condition error. We produce forecasts that start from two hypothetical analyses, one with an error comparable to and one with an error ten times smaller than the present day analysis error. In this way it has been possible to quantify the properties of instabilities with different space and time scales that are 465 representative of the evolution of the forecast error. According to the theory, when the analysis and the ensuing forecast are so accurate to reproduce the details of the single convective cells, the error growth rate is that typical of this phenomenon, while as the error increases and convective-scale instabilities saturate, the error growth is due to slower instabilities. This result is based on the theory of Lyapunov exponents and vectors that characterise the unstable, neutral and stable subspace of the 470 system. It is impossible to compute the entire set of Lyapunov vectors with positive growth rate that would give us all the information on the instabilities characterising the linear growth regime of the non-hydrostatic model. To quantify the instabilities we then used the breeding technique, which, moreover, enables selecting those that are relevant for forecast errors of a given typical amplitude.
We tuned the breeding parameters and used different rescaling amplitude and frequency to adapt 475 them to the assumed initial condition error. The results can be summarised as follows. When the initial error is comparable to the present day analysis error, the unstable subspace, as estimated with the breeding technique, has the following properties: the growth rate always decreases with BV 14 index and the number of BVs that are active during the convective episodes is not extremely large: just one or two in this case, during the night and the morning; only in the afternoon their number 480 increases, to 8 after 12:00 UTC and to 12 (at least) after 15:00 UTC (Fig. 2), in agreement with the flow-dependent instabilities evident in the forecast error evolution (Fig. 1). By projecting the forecast error on this subspace we find that most of the forecast error projection is on the leading BVs, and it does not significantly increase when the number of BVs is raised from 12 to 24. Typical values of growth rates for the leading BVs are about 0.05-0.07 h −1 corresponding to doubling times 485 of 10-14 h.
When we decrease the analysis error by an order of magnitude, we find a completely different scenario. The spectrum of BVs is flat: there is a large number of BVs with competitive growth rates, three or four times larger than those found in the previous case (large initial error), corresponding to doubling times of 2-6 h. In addition, the projection of the forecast error on the leading 24 BVs 490 is very small and continues to increase very slowly and steadily, suggesting the need of a very large number of BVs to account even for a small fraction of the forecast error. This makes us believe that the unstable subspace at the convective scale really has a very large dimension. The conclusion is that as we attempt to significantly decrease the analysis error (e. g. by improving on the quality of observations and assimilation schemes), the need to increase the number of members in an ensemble 495 forecast or an ensemble-based assimilation scheme may very well turn out to be so computationally demanding as to become an insurmountable obstacle.
The situation appears to be different for errors typical of present day analysis errors, when the instabilities at play are not growing so fast, because the fast growth of errors at the small convective scales is saturated. In this case we have seen that a non-negligible fraction of the forecast error is 500 captured by the BVs.
We performed an experiment which can be interpreted as a preliminary test on the expected performance of an assimilation in the present situation. We assumed a hypothetical assimilation scheme, able to exactly subtract from the forecast error its projection on the BV subspace (once assigned the subspace where the analysis increment is confined, this would be an upper limit to scheme perfor-505 mance and observations quality), and looked at the forecast error of the subsequent free evolution (i. e. after this single "assimilation" step). A relevant error reduction is obtained (in this perfectmodel, perfect-boundary framework): subtracting 30 % of square norm means reducing the RMS "analysis" error to about 84 % of RMS forecast error. However, after restart, again the fast instabilities take over and the forecast error recovers its previous levels in the matter of 3 h. This means that 510 a frequent assimilation, every hour or so, would be required to control the large-scale, slower instabilities, but rapid growth should still be expected after each analysis, due to the action of small-scale fast instabilities. Their large number makes the possibility of controlling them by data assimilation (by means of an hypothetical "multiple-scale" breeding or ensemble) particularly challenging. Green: trajectory labelled R21, obtained from the previous one by rescaling, at 21:00 UTC on 25 September, its error to 0.03, i.e. an order of magnitude smaller than present-day analysis error. 20 Fig. 2. Growth rates of the 12 Bred vectors at different forecast times, as a function of the bred vector index (the growth rate has been averaged over three hours intervals, as indicated in the legend box). Different colors refer to different forecast times. As indicated in panel titles, the BVs are constructed differently for trajectory 18H (large initial error, top panel) and for trajectory R21 (small initial error, bottom panel).