Fluctuations in a Quasi-stationary Shallow Cumulus Cloud Ensemble

We propose an approach to stochastic parameteri-sation of shallow cumulus clouds to represent the convective variability and its dependence on the model resolution. To collect information about the individual cloud lifecycles and the cloud ensemble as a whole, we employ a large eddy simulation (LES) model and a cloud tracking algorithm, followed by conditional sampling of clouds at the cloud-base level. In the case of a shallow cumulus ensemble, the cloud-base mass flux distribution is bimodal, due to the different shallow cloud subtypes, active and passive clouds. Each distribution mode can be approximated using a Weibull distribution, which is a generalisation of exponential distribution by accounting for the change in distribution shape due to the diversity of cloud lifecycles. The exponential distribution of cloud mass flux previously suggested for deep convection parame-terisation is a special case of the Weibull distribution, which opens a way towards unification of the statistical convective ensemble formalism of shallow and deep cumulus clouds. Based on the empirical and theoretical findings, a stochas-tic model has been developed to simulate a shallow convec-tive cloud ensemble. It is formulated as a compound random process, with the number of convective elements drawn from a Poisson distribution, and the cloud mass flux sampled from a mixed Weibull distribution. Convective memory is accounted for through the explicit cloud lifecycles, making the model formulation consistent with the choice of the Weibull cloud mass flux distribution function. The memory of individual shallow clouds is required to capture the correct convective variability. The resulting distribution of the subgrid convective states in the considered shallow cumulus case is scale-adaptive – the smaller the grid size, the broader the distribution.


Introduction
To set a path towards the development of a stochastic shallow-cloud parameterisation for numerical atmospheric models, we study how the unresolved convective processes relate to the resolved grid-scale variables in an ensemble of shallow cumulus clouds.According to a conventional deterministic approach to cloud parameterisation, the outcome of shallow cumulus processes within a grid box of a numerical model is represented as an average over the cloud ensemble or as a bulk effect.However, different microscopic configurations of a convective cloud ensemble can lead to the same average outcome on the macroscopic grid scale (Plant and Craig, 2008).If a one-to-one relation between the subgrid and grid scales is assumed, the spatial and temporal variability of convection that is observed in nature and in the cloudresolving simulations will not be represented in atmospheric models.At the same time, the improvement in parameterisation should address the dependence of the subgrid-to gridscale relation on the model resolution and physics time step (e.g.Jung and Arakawa, 2004).This is especially important on the meso-γ atmospheric scales, since moist convection and rain formation are recognised as the most uncertain processes acting on these scales and the core reason for the short mesoscale predictability limit (e.g.Tan et al., 2004;Zhang et al., 2003Zhang et al., , 2006;;Hohenegger et al., 2006).
Commonly used tools to study convective cloud processes at a high temporal and spatial resolution in order to develop parameterisations are the cloud resolving models (CRMs).To represent deep convective clouds explicitly, CRMs are used at the grid scale of 1 km order of magnitude, while shallow convective clouds become explicitly resolved at a grid scale of O (10-100 m), which is the size of the largest Published by Copernicus Publications on behalf of the European Geosciences Union & the American Geophysical Union.
energy-producing eddies in the turbulent boundary layer, hence the name large eddy simulation (LES).To formulate the effects of clouds on their environment across the different scales of atmospheric flow, a technique of coarse graining can be applied to the CRM and LES fields (see, for example, Shutts and Palmer, 2007, Sect. 3).In this way, a relation between the subgrid convection and the resolved flow can be emulated to reveal the properties and components of the parameterisation and to reflect its dependence on the model grid resolution.
From the previous studies of deep convective cloud fields using CRMs and the coarse-graining methods, it is known that the subgrid-to grid-scale relation is neither fully deterministic nor diagnostic, which suggests that stochastic and memory components should be included in a parameterisation.These components are sensitive to the spatial and temporal scales of a numerical model.As the horizontal resolution of a model gets higher, the stochastic component of the subgrid-to grid-scale relation becomes more pronounced (Xu et al., 1992;Shutts and Palmer, 2007;Jones and Randall, 2011).At the same time, an increase in horizontal resolution implies a shorter model time step and, as a consequence, a larger impact of the memory component on parameterisation.In this case, changes in the resolved flow take place on a timescale close to or less than the convective response timescale, and the convective cloud system exhibits a nondiagnostic behaviour (e.g.Pan and Randall, 1998;Jones and Randall, 2011).Along with the effects of time lag in the convective response, memory of convection also comprises a feedback process by which the past interactions between convective elements and thermodynamics fields on the nearcloud scale modify convection at the current time (Davies et al., 2013).Furthermore, a delay in the convective response becomes longer with the emergence of mesoscale cloud organisation (Xu et al., 1992), and can be interpreted as an additional convective memory effect (Bengtsson et al., 2013).
A behaviour of the subgrid-to grid-scale relation similar to the behaviour of deep convection, but on the smaller spatial scales, can be confirmed in LES studies of shallow convection.The stochastic effects in a coarse-grained shallow convective cloud ensemble become dominant on the scales close to 10 km and less (see Fig. 2 in Dorrestijn et al., 2013).We will refer to these spatial scales as the "stochastic" scales for the shallow convective ensemble.
Parameterisation schemes developed specifically for shallow convection are in most cases based on the mass flux concept (Bechtold et al., 2001;von Salzen and McFarlane, 2002;Deng et al., 2003;Bretherton et al., 2004;Neggers, 2009).In a mass flux scheme, clouds within a model grid box are parameterised as a single bulk updraft or as a spectrum of cloud updrafts via a simple entraining-detraining plume model, and the vertical transport is determined by the upward mass flux through the cloud base.Estimation of the bulk or ensemble average cloud-base mass flux is a part of the model closure and is based on some form of the quasi-equilibrium assumption (Arakawa and Schubert, 1974).According to the quasi-equilibrium assumption, in a slowly varying large-scale environment, the subgrid convective ensemble is under control of the large-scale forcing with a statistical balance fulfilled between the unresolved and resolved processes.However, at the stochastic scales, the quasi-equilibrium assumption is no longer valid.The model grid box is not large enough to contain a robust statistical sample of shallow clouds and the timescale of parameterised processes can not be separated from the timescale of the resolved processes.This suggests that a stochastic and nondiagnostic approach to parameterisation is necessary not only for representing the small-scale variability of convection, but also for representing the cloud field adequately by providing a way to make the parameterisation scale-adaptive, and to avoid the scale-separation problem.
Increasing horizontal resolution of atmospheric models is also strongly connected to the mesoscale predictability limit, which is reached faster on the smaller scales of the resolved motion (Lorenz, 1969).The reason for a shorter predictability time on the smaller spatial scales comes from the faster error growth on these scales due to moist convection (Zhang et al., 2003(Zhang et al., , 2006)).In the simulations with the grid resolution of the order of 1 km, the small-scale initial errors spread fast throughout the domain and exponentially amplify over the regions with the convective instability (Hohenegger et al., 2006).Due to non-linear interactions, initial uncertainties propagate upscale in a process known as the "inverse error cascade" and degrade the forecast quality on the larger scales (Lorenz, 1969;Leith, 1971).Here the stochastic term of a parameterisation plays a role in representing the subgrid fluctuations that, due to the non-linearity of the process, lead to the error growth and upscale error propagation.Thus, the stochastic term provides a way to quantify the uncertainties coming from the formulation of the subgrid cloud processes and is necessary to improve the ensemble spread in the ensemble prediction systems -EPS (see the review of Palmer et al., 2005).
Recently, EPS have been developed for the limited area models at the convection-permitting grid resolution to address the sensitive dependence on initial conditions (e.g.Kong et al., 2007;Gebhardt et al., 2008;Clark et al., 2009;Migliorini et al., 2011).The main goal of this new field of research is the improvement of the quantitative precipitation forecasts and the forecasts of convective and storm events.In the convection-permitting models, deep convective clouds are explicitly represented on the grid scale, while the planetary boundary layer (PBL) convection and shallow clouds are still subgrid processes and have to be parameterised.Nevertheless, the introduction of the stochastic physics into the convection-permitting EPS has been limited so far.The stochastically perturbed parameterisation tendencies (SPPT) scheme of Buizza et al. (1999) is adapted and applied in a short-range convection-permitting EPS by Bouttier et al. (2012) to improve the ensemble reliability and the ensemble spread-error relationship.Another example is the recent work of Baker et al. (2014), where another similar method of parameter perturbation of the model physics tendencies called the random parameters (RP) scheme (Bowler et al., 2008) is modified and applied to a convection-permitting EPS.Both of these approaches are rather pragmatic and general in perturbing the physical tendencies in a model.The effect of stochastic schemes specifically developed for the shallow clouds and based on the underlying physical processes has not been investigated so far, mainly because stochastic schemes for shallow clouds have not been formulated until recently.One example is the scheme developed for stochastic parameterisation of convective transport by shallow cumulus convection (Dorrestijn et al., 2013), based on LES studies of non-precipitating shallow convection over the ocean.In this scheme, the pairs of turbulent heat and moisture fluxes are randomly selected as corresponding to different states of a data-inferred conditional Markov chain (CMC).In another approach, two stochastic processes are implemented in the eddy-diffusivity mass-flux (EDMF) scheme (Siebesma et al., 2007;Neggers, 2009), the Monte Carlo sampling of the convective plumes and the stochastic lateral entrainment (Sušelj et al., 2013).
The goal of our study is to formulate a shallow convective parameterisation that encompasses the stochastic and memory effects of convection, using the theoretical and empirical findings about the cloud ensemble.We study a shallow convective-cloud case (Rain in Cumulus over the Ocean -RICO) using large eddy simulation (LES).RICO is a precipitating quasi-stationary shallow convective case that also shows some mesoscale organisation.We coarse grain the cloud ensemble to study the subgrid-to grid-scale relation and its dependence on the horizontal resolution.The variability of shallow convection and its scaling with the horizontal resolution is then quantified.Individual cloud lifecycles and the role of the diversity of cloud lifetimes are examined employing the cloud tracking routine of Heus and Seifert (2013).This numerical study gives a path to apply the theory of fluctuations in an equilibrium convective ensemble of Craig and Cohen (2006b) to a shallow convective case.
In the following, we propose a generalisation of the theory of fluctuations in a convective ensemble by including the system memory and by considering the impact of the diversity in cloud lifecycles on the cloud-base mass flux distribution shape.This provides a stochastic and memory term in the subgrid-to grid-scale relation, and a deterministic component is also retained in adequate proportion, depending on the grid scale.This combined empirical-theoretical concept is then structured in a stochastic stand-alone model of a shallow cumulus ensemble, similar to the approach of Plant and Craig (2008) for deep convection, referred to as PC-2008 in the following text.A spectral representation of the cloud field with the cloud lifecycles modelled explicitly introduces the memory of individual clouds and opens the way to estimating the impact of this memory on the variability of con-vection.Sensitivity tests of the gradual generalisation of the convective-fluctuation theory provide a definition of a consistent and least complex model formulation.
Large eddy simulation and the cloud tracking algorithm necessary for the analysis are described in Sect. 2. Physical and statistical properties of a cloud ensemble are described here and the cloud mass flux distribution is analysed.A stand-alone stochastic model is constructed based on empirical and theoretical findings and the model formulation is derived for the different levels of system generalisation (Sect.3).Different formulations of the stochastic model are discussed, and tested against LES results, to decide on minimal and consistent representation of all relevant features of subgrid convection and its variability (Sect.4).

Shallow cumulus ensemble statistics
To develop a stochastic parameterisation for shallow cumuli that includes convective memory in its formulation, a detailed description of the cloud ensemble and the processes acting on the scale of an individual cloud is necessary.A large eddy simulation as a cloud-resolving model suffices for the detailed description of the shallow cumuli field in a large horizontal area, while the cloud tracking as a postprocessing routine collects the information about every simulated cloud during its lifetime.

Large eddy simulations and cloud tracking
We use the UCLA-LES (University of California, Los Angeles -Large Eddy Simulation) model, a version from Stevens (2010), to simulate shallow convection.The dynamical core of the LES model is based on the Ogura-Phillips anelastic equations, discretised over the doubly periodic uniform Arakawa C-grid (Stevens et al., 1999(Stevens et al., , 2005)).The set of anelastic equations is solved for the prognostic variables: velocity components (u, v, w), total water mixing ratio r t , liquid water potential temperature θ l , number ratio of rainwater N r and mass mixing ratio of rainwater r r .The time integration is solved using a third-order Runge-Kutta numerical method.A directionally split monotone upwind scheme is used for the advection of scalars, and directionally split fourth-order centered differences are used for the momentum advection.The subgrid fluxes are modelled by the Smagorinsky-Lilly scheme, and the warm-rain scheme of Seifert and Beheng (2001) is used for the cloud microphysics as described in Stevens and Seifert (2008).
In this study, the LES model is set up to simulate the GCSS (GEWEX Cloud Systems Studies) RICO shallow cumulus case, as in van Zanten et al. (2011).The RICO case is based on the Rain In Cumulus over the Ocean field study (Rauber et al., 2007) Atlantic.The focus of this field study was on the processes related to the rain formation in shallow cumuli and on how the rain modifies the individual cloud and the cloud ensemble statistics.
The standard RICO-GCSS case was simulated over a large domain of around 50 km × 50 km, with the horizontal grid spacing of 25 m and vertical resolution of 25 m up to 4 km in height.In such a large domain and on a high-resolution grid, a cloud field can evolve into an organised mesoscale convective system, forming clusters and line-like structures (Seifert and Heus, 2013).This transition to an organised cloud field depends mostly on precipitation rate and, for the RICO-GCSS simulation, the first organised cloud clusters develop around the twelfth hour of the simulation (Fig. 1d).In the RICO-140 case, which has a doubled cloud droplet number density, N c = 140 cm −3 , and is virtually non-precipitating, the cloud field remains quasi-random, but the individual clouds grow in size throughout the simulation time (Fig. 1a,  c, e and g).The convective variability in an organised case is, of course, very different from the variability of a quasirandom cloud field.This is discussed in more detail later in Sect.4.2, where we discuss RICO-GCSS and RICO-140 to quantify the effects of organisation, but for most of the analysis we focus on the simple case of the RICO-140 with a quasi-random cloud field.
The cloud tracking algorithm developed by Heus and Seifert (2013) is used as a post-processing tool for the LES simulation results.The tracking is based on the vertically integrated liquid water content, namely the liquid water path.The clouds are projected onto a two-dimensional plane and are identified as consisting of the adjacent points with the liquid water path exceeding a chosen threshold value.Cloud merging and splitting is done in two directions: forward and backward in time.Along with the projected cloud area, cloud buoyant cores, sub-cloud thermals and rain are tracked during the simulation, with the links among them retained.The choice of the two-dimensional tracking of the projected clouds came from the limitations imposed by the computational expenses and the large memory resources that are required.For more details and validation of the tracking method, see Heus and Seifert (2013).
To develop a cloud parameterisation based on the mass flux concept, the cloud mass flux has to be estimated at the cloud-base level.For the RICO case, we choose the level at 700 m, which is the first or second height level above the cloud base during most of the simulation time.Thus, it is necessary to identify the area that every cloud occupies at the 700 m level.Because the liquid water path threshold of 5 g m −2 is taken as a definition of a cloudy column in the cloud tracking algorithm and the clouds are projected onto a two-dimensional surface, we check what the error is introduced by the tracking regarding the domain average cloud variables at the 700 m height.We define the cloudy air at the 700 m height level as points holding the liquid water content q l larger than 0.01 g kg −1 , which is the same definition as in the LES model analysis.In this way, we are able to test the tracking and the cloud conditional sampling routine, comparing the outcome statistics with the original LES statistics.The relative difference in cloud fraction before and after the tracking is 1.93 %, which is a negligible difference in absolute value, and can be neglected.

Cloud definition and the distribution of cloud-base mass flux
Starting from the sixth hour of RICO simulation to avoid the model spin-up period, we choose several sequential time frames of 6 h duration and apply the tracking method to the cloud field.Each individual cloud in the simulated cloud field is tracked in space and time during its life and cloud properties are recorded each minute of the simulation.Clouds are taken into account only if their existence started during the selected time frame, but if their duration spanned beyond the time frame, they are tracked further on to complete their lifecycles.We study the lifetime average cloud properties, contrary to the instantaneous properties of the cloud field at a single model time step.How should clouds be defined in a parameterisation?A definition of the cloud entity is chosen depending on the processes that will be introduced in a parameterisation.We aim for a unified scheme, which will be used to reproduce the cloud fraction, cloud vertical transport of mass and scalars, and possibly also rain formation.Therefore, we test how the distribution of cloud mass flux depends on the choice of the cloud entity as a cloud condensate, cloud buoyant core or a cloud updraft.To identify the points that form the cloud entity on a certain height level, a conditional sampling is performed with the three different criteria (as in Siebesma and Cuijpers, 1995;de Roode et al., 2012): 1. cloud sampling over the points with liquid water content: q l > 0 g kg −1 ; 2. buoyant core sampling, by comparing the virtual potential temperature of each cloudy point with the slab average: θ v > θ v and q l > 0 g kg −1 ; 3. and cloud updraft sampling over the cloudy points with positive vertical velocity: w > 0 m s −1 and q l > 0 g kg −1 .
Following the work of Cohen and Craig (2006a), the mass flux of an individual cloud at a certain height level is defined as where ρ is the domain average density, a i is the cloud area, w i is the vertical velocity averaged over the cloudy points, and n is the number of clouds (Arakawa and Schubert, 1974).The cloud-base mass flux of each individual cloud that appeared during the time frame of 6 h (from the sixth to the twelfth hour) is averaged over the cloud lifetime and the distribution of lifetime-averaged mass flux is calculated for all three cloud entity definitions (Fig. 2).This distribution is defined as the cloud rate distribution of cloud-base mass flux g(m, t)dmdt, which gives the number of clouds with the lifetime-average mass flux in the range [m, m + dm] generated during the time interval [t, t + dt].The integration of g(m, t) with respect to m results in the cloud generation rate, G(t), which is the number density of clouds generated per unit time: (2) The total number of clouds in a domain, N (t), can be estimated by integrating the instantaneous distribution n(m , t) with respect to m : the cloud rate distribution carries the information about individual cloud lifecycles.A similar concept is introduced in astrophysics (e.g.Chabrier, 2003), where the time-dependent distribution function called the galactic stellar creation function, corresponding to our g(m, t), is introduced to relate the present-day stellar mass function to the initial stellar mass function.In this paper, we are limiting our case to 6 h time frames to stay within a stationary regime.Therefore, the dependence on time in g(m, t) can be left out for notational simplicity, and in the further text we will write g(m).When we are referring to the probability density function, g(m) normalised by G, the notation p(m) will be used.The shape of p(m) does not depend strongly on the choice of the cloud entity definition (Fig. 2).The main factor influencing the shape of p(m) is the liquid water content criterion, which is the reason for the similar look of the three lines in Fig. 2. Including buoyancy shifts the distribution slightly towards higher density values.The reason is that only the clouds that are positively buoyant at the 700 m level are taken into account, so the total number of clouds is reduced and some of the smallest clouds are left out.For further analysis we choose to sample the cloud mass flux from a distribution of the cloud ensemble whose elements are defined using the most general cloud definition: connected points holding a cloud condensate, q l > 0 g kg −1 .

Shallow cloud subtypes
The shallow cumulus cloud ensemble is composed of different cloud subtypes (Stull, 1985).Shallow clouds that originate from the convective updrafts overshoot into the inversion layer at the top of the mixed layer.If a cloud has enough inertia to overcome the convective inhibition and reaches the level of free convection (LFC), its growth is fuelled further up.Those are the active buoyant clouds.Clouds that never reach the LFC and remain negatively buoyant above the mixed layer are the forced clouds.Another cloud group is made of passive clouds, which are remnants of the old decaying clouds or are formed due to gravity waves.
Following the definition of an active cloud in the tracking routine as a cloud holding a buoyant core with the maximum in-cloud excess of θ v exceeding the threshold of 0.5 K (Heus and Seifert, 2013), we divide the cloud ensemble from the RICO-140 simulation (6-12 h time period) into two separate groups: the active-cloud group comprising the clouds with single or multiple buoyant cores, and all the other clouds in the passive-cloud group.
The two different groups of shallow cumuli form the two modes of the cloud rate distribution and the joint distribution of cloud mass flux and other cloud properties (Fig. 3).In the RICO cloud ensemble, passive clouds are large in number and can develop a smaller area at the cloud base and transport less mass compared to the active clouds.This can be identified at the cloud rate distribution of cloud-base mass flux, as the passive cloud group takes the lower range of the mass flux and higher probabilities in the distribution, and the active cloud group takes a higher mass flux range and the distribution tail (Fig. 3a).In a random shallow cumulus field, small-scale turbulent motion controls the in-cloud processes and the interaction of clouds with their environment.As a result of the quasi-random processes, the cloud fields are highly variable and the cloud properties are vastly diverse.It is obvious that clouds of equal area at the cloud base do not have for both cloud groups, with parameters α i and β i corresponding to the passive (1) and active (2) cloud groups.Vertical velocity w i , i = 1, 2 is averaged over all clouds in each group and plotted as a horizontal line.a unique magnitude of the other cloud properties; they are in fact highly dispersed.However, the joint distribution of cloud mass flux and cloud lifetime shows some correlation, with a Spearman rank correlation coefficient of r ρ = 0.79.This joint distribution can be well approximated with two powerlaw relations τ i = α i m β i with i = 1, 2 describing a power-law increase in cloud lifetime with the cloud mass flux for each cloud group separately (Fig. 3b).Similarly, the two different cloud groups form the two modes of the joint distribution of cloud mass flux and cloud vertical velocity (Fig. 3c).In this case the correlation coefficient is r ρ = 0.48 and it is evident that the cloud-base mass flux does not scale with the vertical velocity.Therefore, the lifetime averaged cloud-base mass flux of an individual cloud is mainly controlled by the horizontal area that it occupies at the cloud base.
During the selected 6 h time frame (6-12 h) of the RICO-140 simulation, passive clouds form around 72 % of the total cloud number in the ensemble.Even though a single passive cloud on average contributes less to the upward transport and cloud fractional cover than an active cloud, their collective contribution can not be neglected, because they are large in number and can also live long (see Fig. 3b).The contribution of active clouds to the vertical transport of mass and scalars is around 63 %, even though they form only 27 % of the total cloud number in the ensemble, while the contribution of active clouds to the cloud fraction is only slightly higher than the contribution of the passive cloud group, around 54 % (Table 1).

Canonical cloud ensemble distribution
According to the theory of fluctuations in an ensemble of weakly interacting deep convective clouds that is in statistical equilibrium with the large-scale environment (Craig and Cohen, 2006b), the cloud mass flux distribution follows an exponential law  These plots correspond to the RICO-140 simulation time frame of 6-12 h.The cloud rate density distribution is fitted using the mixdist R package (R Core Team, 2013), and the distribution shape parameter is set as equal for both distribution modes: where m > 0 is the average mass flux of an individual cloud, and m is the cloud ensemble average mass flux per cloud.This distribution was derived in analogy to the Gibbs canonical distribution of microstates of a physical system.In the case of shallow convection, the cloud rate distribution of mass flux at the 700 m height level is more complicated than a simple exponential function.This distribution is a superposition of two modes (Fig. 4a), due to the existence of different cloud subtypes forming the shallow cumulus ensemble (Stull, 1985): passive clouds in one mode and active buoyant clouds in the second mode (see Sect. 2.3).Forced clouds are not defined separately in the cloud tracking routine, but based on the buoyancy criterion, we can assign them to the passive cloud distribution mode.Furthermore, the cloud rate distribution deviates from the exponential distribution.This is observed from the semi-logarithmic plot in Fig. 4a, where the density distribution function does not form a straight line for either of the modes, and the best fit suggests a more general distribution function.
The cloud rate distribution of mass flux is a highly right-skewed distribution with a heavy tail and can be well where f is a fraction of the cloud ensemble belonging to the first passive mode and 1 − f is a fraction of the cloud ensemble belonging to the second active mode.The Weibull distribution is a special case of the generalised gamma distribution family and is frequently used in the survival analysis field of statistics to model the physical systems with components that age during the time towards their failure.The parameters θ 1 > 0 and θ 2 > 0 refer to the scale of the two distribution modes, and parameter k > 0 is the distribution shape.
Here we are making a parallel between the cloud mass flux distribution and a lifetime distribution to explain the deviation of the cloud rate distribution of mass flux from the exponential shape through the parameter k.The parameter k introduces the effect of system memory in the cloud rate distribution of mass flux.The two main types of convective memory effects recognised in the CRM studies (Davies et al., 2013) are a memory effect due to the time evolution of a cloud field in a changing environment, and a memory effect due to the finite individual cloud lifetimes.In our case, because of the stationarity assumption, we only include the latter effect, and the distribution shape k is smaller than 1 due to the different and finite lifetimes of individual clouds.This local memory effect is accounted for through the correlation of cloud-base mass flux of individual clouds with their lifetime.
If the shape parameter lies in the interval 0 < k < 1, the Weibull distribution describes a cloud population with the failure rate decreasing with the cloud mass flux by following the failure rate function where h(m) is the failure rate defined as the frequency of failures per unit mass flux, conditioned on the average mass flux of a cloud.If a cloud has already developed higher mass flux, it is more likely that it will be able to transport an additional portion of the mass through its cloud base compared to a cloud that has developed lower mass flux.The results from LES support the theoretical failure rate function of cloud population, showing a decrease in the failure rate with the cloud-base mass flux (see Fig. 4b).In the case of a shallow cumulus population, the Weibull distribution with 0 < k < 1 provides a good fit to the empirical data, since the cloud ensemble consists of a large number of short-lived clouds in the lower range of the cloud-base mass flux, and with fewer long-lived clouds in the high mass flux range (see Fig. 3b).
A special case of the Weibull distribution, when k = 1 and the failure rates are constant, i.e. h(m) = 1/ m , is the exponential distribution.A population would have an exponential distribution if the system was memoryless and if the system constituents had equal lifetimes.When describing a realistic cloud ensemble, this distribution is likely to be bimodal, with each mode being right skewed and heavy tailed (0 < k < 1).This comes from a reasoning that in any cloud ensemble, it is more likely that large clouds will live longer and develop higher mass flux compared to the smaller clouds.In the cloud ensemble of the RICO case, the best fit suggests the shape parameter k = 0.7 (Fig. 4a).However, the value of parameter k might change with the changes in the large-scale environment and with the emergence of the cloud field organisation, since both of these features carry a component of convective memory.We will discuss the sensitivity of the ensemble statistics to this parameter further in Sect. 4.
An important aspect of applying the Weibull distribution to the parameterisation of clouds is its potential universality as a cloud mass flux distribution.During the transition of a cloud field from shallow to deep convection, the shape parameter might change from approximately k = 0.5 in the case of a shallow cloud field to close to k = 1, corresponding to the exponential distribution function which has been suggested for deep convective clouds.With this in mind, it might be possible to unify the parameterisation of fluctuations in shallow and deep convective cloud systems within the same scheme.Furthermore, this approach can be considered to be an empirical generalisation of the Gibbs formalism to convective cloud systems with memory.

Variability of the small-scale convective states
The domain of the LES RICO-140 simulation is successively divided into areas of different sizes, to mimic the different grid sizes of the stochastic model, and cloud properties are averaged or summed over these areas.In this way, we obtain the distribution of compound subgrid convective states depending on the horizontal resolution of the model.
Figure 5 shows the subgrid cloud fraction histograms for the different coarse-graining resolutions: 1.6, 3.2, 6.4, 12.8, and 25.6 km.Small-scale states in each spatial bin vary from the realisations in the surrounding bins, even though the given forcing is spatially uniform and constant in time.The smaller the averaging area, the more possible states exist and histograms become significantly broader, since the averaged values of cloud properties can take wider ranges.The variability arises from a different number of clouds in areas of the same size and from the fact that individual clouds can be stronger or weaker (Plant and Craig, 2008).The distribution of compound cloud properties changes its shape from exponential-like in the case of high-resolution grids to Gaussian-like for the coarse grids.A grid box in a model with the coarser horizontal resolution will contain a larger number of clouds and the outcomes of the sub-sampling approach the expected value of distribution (the distribution becomes narrower), which is in agreement with the law of large numbers.This kind of variability results from the small-scale convective processes themselves and does not originate from the changes in large-scale dynamic forcing, though it can be influenced by these changes.

Empirical-theoretical model formulation
According to the parameterisation framework of Plant and Craig (2008), a model grid box contains a subset of a cloud ensemble, and represents one possible outcome of the response to the large-scale forcing.Therefore, around each model grid box, we choose a large area A containing the "full" cloud ensemble (Fig. 6), assuming that the total mass flux in a cloud ensemble is determined by the large-scale environment.By doing so, we assume that quasi-equilibrium is valid on a large scale.For this assumption to hold, the number of clouds in an ensemble has to be very large so that area A contains the full spectrum of the cloud sizes.For the purpose of this study, we set the large-scale area A to the domain size of the LES RICO simulation.
The initialisation of n clouds in the area A is modelled as a random Poisson process and the cloud mass flux m is drawn randomly for each individual cloud from the generalised ensemble distribution (Eq.5) defined for the selected area A around the grid box (see Fig. 6).After initialisation, the clouds are distributed uniformly over the area A so that in every grid box the distribution of the initialised cloud number also follows the Poisson distribution.A cloud lifetime is assigned to each initialised cloud as a function of the cloud mass flux, according to the fit obtained from Fig. 3b.During the model run, clouds are treated as individual objects with their own memory and duration.A lifecycle is assigned to each cloud, with the cloud properties changing accordingly, and after the lifetime expires the cloud is removed from the simulation.So, at each model time step, which is set to 1 min, the subgrid convective processes are represented by the effects of all clouds that exist in a grid box, at the different stages of their lifecycles.
The large-scale properties driving the model are the ensemble mean properties: total cloud number N and total cloud-base mass flux M .In addition, cloud fraction C is also taken as a third quantity, because we aim for a scheme that unifies the representation of the cloud vertical transport and cloud cover.Thus, as a result of the stochastic modelling, we get the fractional cloud cover C and the total mass flux M in each model grid box, and the correct variability, depending on the choice of the model horizontal resolution (see also Keane and Plant, 2012).With the cloud ensemble statistics formulated in this way, the variability of small-scale states is represented in a physically based manner, resulting from the random and limited sampling (Plant and Craig, 2008).

Counting the clouds
Initialisation of new clouds within a model time step is done through a Poisson counting process, after which the clouds are uniformly and randomly distributed over space.In this section we test whether the temporal Poisson distribution holds for the RICO case.
For a process to be described as a random Poisson process, events should be independent of each other and the distribution of events should follow the Poisson distribution.The Poisson distribution is often found in nature, since it results from a process subject to the law of rare events (Pinsky and Karlin, 2011).This law can be interpreted as a very low probability of occurrence of two exactly identical clouds in a given area, even though this area can contain a large number of clouds.Therefore, according to the law of rare events, the number of generated clouds in the area should approximately The distribution is fitted using the method of moments, while the histograms and Q-Q plots are made using R libraries (R Core Team, 2013).The time interval between the snapshots is 10 min.
follow the Poisson distribution.If we assume that the shallow cumuli are point-like events with a low probability of occurrence and that the events occur randomly but with a constant cloud production rate G, as in Craig and Cohen (2006b), the probability that n clouds will be generated in a domain during the time interval (t, t + t] is given by the Poisson distribution Consequently, we assume that the distribution of the total number of clouds in a domain also approximately follows the Poisson distribution.This approximation is necessary for the estimation of variance of the compound cloud mass flux distribution in Sect.3.3.To test the validity of an assumption for the Poisson distribution, we show the empirical histogram of the total number of clouds in the LES RICO case domain, and a fit to the theoretical Poisson model for the 6 h period of simulation (Fig. 7b).The rate parameter for the distribution fit is estimated from empirical LES-RICO results using the method of moments.Even though the RICO case is not ideally stationary (the number of clouds has a decreasing trend, see Fig. 7a), for a limited time period of 6 h, these two distributions are similar.Figure 7c shows the quantile-quantile plot (Q-Q plot as defined in Wilks, 2006) with the points representing the pairs of quantiles of the theoretical vs. empirical distributions.The two distributions match closely, with the points lying approximately on the straight x = y line.

Closure for the distribution parameters
The cloud rate distribution of cloud mass flux g(m) relates to the instantaneous distribution n(m ) through the information about the cloud lifetime τ (m).So, in the ensemble average limit, we can assume that Because of the stationarity, the ensemble average equals the time average in our case and will be denoted with . .Note that a similar relation is also used for the galactic stellar creation function as a product of the distribution of stars (mass function) and their formation rate (function of time) (e.g.Chabrier, 2003, Eq. 6).This relation is also implicitly used in the scheme of Plant and Craig (2008).We approach the formulation of closure by approximating the cloud rate distribution of mass flux with a two-component mixed Weibull function with scale parameters λ i and shape parameter k related to the average mass flux per cloud as m i = λ i (1 + 1 k ).The cloud generating rate G, as the number of generated clouds per second in a given area, is the intensity parameter of the Poisson distribution, and the index i refers to the two cloud subtypes (see Sect. 2.3).
The ensemble average number of clouds in a domain can be derived by integrating the instantaneous distribution of cloud mass flux: We use a power-law relation for the cloud lifetime dependence on the cloud mass flux: The parameters α i and β i for the two cloud subtypes are obtained from the non-linear least square fitting of the joint distribution of cloud mass flux and cloud lifetime (Fig. 3b).
After substitution of Eqs. ( 9) and ( 11) into Eq.( 10) and integration, we get an expression for the ensemble mean number of clouds: An expression for the ensemble mean cloud fraction C can be derived using the Riemann-Stieltjes integration of the instantaneous distribution function where a(m ) is the instantaneous cloud area just above the cloud base (700 m level).From the definition of the cloud mass flux it follows that the lifetime-averaged cloud area is a(m) = m/(ρw), and we assume that the density equals ρ = 1 kg m −3 for notational simplicity.The average vertical velocity is also a closure parameter, and here we simplify it by using an average over all clouds, w = M /( C A).By applying the relation between the instantaneous and cloud rate mass flux distribution Eq. ( 8), we get After substitution and integration, and assuming that w is constant among individual clouds, we find and, similarly, for the total cloud mass flux, When k = 1, Eqs. ( 12)-( 16) describe a system with exponentially distributed cloud-base mass flux.
In the case of a constant cloud lifetime among all clouds in the ensemble, Eqs. ( 12)-( 16) reduce to This formulation results in a system of two equations, Eqs. ( 12) and ( 15) or Eq. ( 16), with three unknowns, G, m = λ (1 + 1 k ) and k, for each cloud subtype.For the purpose of this study, we set the parameter k to 0.7 for both cloud groups, as estimated from the empirical RICO case distribution (Fig. 4a).The parameters of the power-law relation for the cloud lifetime Eq. ( 11), α i and β i , i = 1, 2, are estimated from the empirical results from LES and are of secondary importance for the variability in our model (see Sect. 4.2).This leaves us with a closed system, if the ensemble average number of clouds N i and cloud fraction C i or cloud-base mass flux M i are known, and the stochastic model can be constrained to reproduce the correct ensemble average statistics and the small-scale variability.In this study we focus on the variability of convection when the forcing is constant and the ensemble average properties are taken as known quantities from the results of the cloud tracking.
However, in a large-scale numerical model, it is not likely that the information about the total cloud number in a domain will be available.It would also be useful if the distribution parameters were constrained by the closure formulation as dependent on the large-scale model quantities, so that the distribution shape could change with the cloud field evolution.To avoid counting the clouds and fitting the cloud number and cloud mass flux distribution empirically, a more robust quantity could be used -the average lifetime per cloud, τ = N /G.In a large-scale model, the constraint on M or C is given from the resolved scales in an existing massflux parameterisation, and the information necessary to divide the cloud ensemble into passive and active cloud groups is available from the separate treatment of the active and passive cloudiness (for example, see Neggers, 2009).Therefore, the closure of m and τ has to be developed from empirical studies or from theory, so that we are left with the two equations and two unknowns: G and k.In the PC-2008 scheme, as a first approximation, the parameters m and τ are set to a constant value, though they might depend on the changes in the large-scale environment.We assume that this approximation holds for the RICO simulation, since the cloud evolution is quasi-stationary and the forcing is constant.Results from the cloud tracking of RICO clouds support this approximation (Table 2).For the three successive time frames from 6 to 24 h of simulation, the average mass flux per cloud is  around 1 × 10 5 kg s −1 for the active cloud group and around 1 × 10 4 kg s −1 for the passive cloud group, and the average lifetime is roughly 20 min for active clouds and 5 min for passive clouds.

The variance of compound distribution
The total mass flux M in a model grid box can be interpreted as a random sum of the individual cloud mass fluxes of a random number of clouds n (as in Craig and Cohen, 2006b): where the cloud mass flux is constant during the cloud lifetime, so that m = m.We assume that the total number of clouds in some region (or a model grid box) follows the Poisson distribution which can be justified with the good fit to the empirical results (Fig. 7).Here N is the average number of clouds within a model grid box.In the case of the Weibull-distributed lifetime-average cloud mass flux, the distribution at a certain instant in time is given by where τ is the average lifetime per cloud.
The probability distribution of the sum of n independent identically distributed random variables m, conditioned on the number n, is the compound distribution or the distribution of the random sum where f n (M) is the n-fold convolution of p(m ).Properties of this distribution depend on the random number of clouds n and are analysed empirically for the RICO case in Sect.2.5.
In the case of exponentially distributed individual cloud mass fluxes, this distribution is defined as the compound Poisson distribution of cloud population, and can be analytically expressed (Eq. 14 in Craig and Cohen, 2006b).By definition, the expected value of a compound distribution can be expressed as and the variance as (Pinsky and Karlin, 2011).
In a cloud field with variable cloud lifetime and Weibull distributed cloud mass flux, the expected value of the compound distribution is and the variance is The variance of the compound distribution that encompasses the diversity of cloud lifetimes depends on the average number of clouds in a region N, average cloud mass flux m functioning through λ and k, the β exponent from the lifetime relation, and the average lifetime per cloud τ .The average cloud lifetime is defined as Please note that in Eq. ( 28) N corresponds to the full convective ensemble in a large equilibrium area, while N introduced in this section corresponds to the model grid box of an arbitrary size.
To test the scale adaptivity of the compound distribution variance, we derive the relation to describe how the normalised variance of total mass flux changes with the average number of clouds: When k = 1, this reduces to the expression valid for the exponential function case with the cloud lifetimes defined as Eq. ( 11) for a single exponential mode: and furthermore, if it is assumed that the lifetimes of all clouds are equal, this reduces to (31) as in Craig and Cohen (2006b) (their Eq. 18).

Cloud lifecycle
In the case of shallow convection, large variability in the cloud size and cloud lifetime can be found.Individual shallow clouds can have a lifetime ranging from a couple of minutes to several hours.Therefore, in contrast to the PC-2008 where the cloud lifetime is constant among different clouds, we introduce the varying cloud lifetime depending on the cloud mass flux and we model the cloud lifecycles explicitly.
On the convection-permitting scales of resolved motion, the subgrid shallow convection is in a non-equilibrium regime; i.e. there is no timescale separation between the subgrid and resolved processes.To adjust to the changes in forcing, convection requires a finite time that can span longer than the model time step.This timescale is referred to as the convective adjustment or the closure timescale in the literature.Using cloud-resolving simulations of deep convection, Davies et al. (2013) identified another memory timescale that is not carried by the large-scale mean thermodynamic fields, but by the structures on the near-cloud scale.These structures are the result of individual clouds modifying their environment throughout their lifecycles.This type of convective memory expresses itself through the effects of past convection modifying the convection at the present time.A first step to introducing the aspects of these two timescales of convective memory into the parameterisation is to represent the cloud lifecycles explicitly.
The cloud lifetime of individual clouds τ (m) can be evaluated empirically from LES (Fig. 3b) by approximating the joint distribution of cloud mass flux and cloud lifetime with a simple power-law relation Eq. ( 11).This distribution is highly dispersed and the power-law fit is biased by the smallest clouds that are large in number.The implications of this crude simplification of a highly dispersed joint distribution are not significant, and will be explained further in Sect. 4.
Having the average mass flux of each cloud in a model grid box, an idealised cloud lifecycle can be assigned to each cloud following a simple lifecycle function (similar to Herbort and Etling, 2011, where a sine function was used for the temporal development of deep convective shower cells).The cloud mass flux of each cloud at each time step m is normalised by the lifetime average cloud mass flux m and changes according to Eq. ( 32) as a function of the normalised cloud time t/τ .The empirical cloud lifecycles from LES and cloud tracking results are more complicated than the idealised cloud lifecycle function (Fig. 8).Smaller, short-lived clouds follow the idealised cloud lifecycle function more closely (Fig. 8a), compared to the longer-lived clouds (Fig. 8b).The discrepancy from Eq. ( 32) is especially pronounced if the cloud is a long-lived multi-pulse entity (Fig. 8c).Please note that in the previous section, derivation of the total mass flux variance (Eq.29) did not incorporate the cloud lifecycle function (Eq.and only the variability in the cloud lifetimes in a convective ensemble was taken into account.

Tests with different levels of model complexity
The goal of every parameterisation is to represent the subgrid processes using a simple concept and as few parameters as possible, but on the other hand not to degrade the quality and level of produced information.The consistency of the parameterisation assumptions can provide a valuable guidance to choose a certain set of assumptions over another.In the following, we compare different formulations of the stochastic model, to test what the level of complexity necessary to model the shallow convective cloud ensemble is, and discuss possible inconsistencies, especially in simplified models.The stochastic model should reproduce the ensemble average quantities and the variability of subgrid convective states.
The stochastic model is run as an ensemble with 50 members on the horizontal domain of 51.2 km × 51.2 km.The ensemble model runs are performed multiple times with the different model formulation and each of these runs is repeated five times using the different horizontal grid resolutions of the stochastic model: 1.6, 3.2, 6.4, 12.8, and 25.6 km.The empirical coarse-grained LES quantities (Sect.2.5) are used for the validation of results from the stochastic model ensemble runs.To stay within the quasi-stationary regime of the RICO case, we limit the time frame to 6 h, focusing on the time period from the sixth to the twelfth hour of the sim-Distribution parameters are estimated as a function of the large-scale quantities: ensemble average cloud cover C i , total mass flux M i and total number of clouds in a domain N i , which are taken from the LES tracking results (Table 1).The distribution parameters, λ i , i = 1, 2 for the cloud rate mass flux distribution and G i , i = 1, 2 for the Poisson cloud number distribution, are calculated using Eqs.( 12)-( 16), and their values are given in Table 3. Estimation of the parameters in this way ensures that the model reproduces the correct ensemble average quantities.
The fraction of the active cloud mode is calculated as and the fraction of the passive cloud mode as f 1 = 1 − f 2 (Table 3).The cloud-base mass flux is sampled for each cloud individually, depending on the group it belongs to, following the procedure for generating the random variates from the mixed exponential function described in Wilks (2006, p. 127).The choice for the splitting into two groups is given by generating a random number f = [0, 1].The initialised cloud becomes active if the fraction f is less than f 2 ; otherwise, it is assigned to the passive cloud group.

Generalisation of the exponential distribution
In this section, we compare the performance of the stochastic model depending on the choice for the cloud rate distribution, starting from a single-parameter single-mode exponential function and then gradually increasing the distribution complexity by adding a second mode and one more parameter -the distribution shape.
Compared to the LES domain average statistics, the cloud ensemble average properties are reproduced well using the different formulations of the stochastic model, with the relative error below 0.6 % (Table 4 showing the mixed Weibull case).Low errors in the ensemble average quantities prove that the model equations and the numerical methods are consistent with the theoretical model formulation.
From the snapshots taken over six hours of simulation (6-12 h), the frequency distributions of the compound cloud mass flux at the 700 m height level are constructed for the different horizontal resolutions of the stochastic model and compared with the coarse-grained LES results (Fig. 9).It can be concluded, already by visual inspection, that the LES and the stochastic model frequency distributions are highly similar.Limited sampling of the cloud ensemble produces a correct frequency distribution of the subgrid convective states for the different choices of the model grid size.This signifies that the stochastic model is scale-adaptive and that the variability of small-scale convective states depends on the model grid resolution.There is a lack of variability when the cloud mass flux is sampled from an exponential function with constant cloud lifetime (exp.τ = 20 min, Fig. 9).This model set-up would correspond to the prescribed exponential function for deep convection in PC-2008, with the constant cloud lifetime τ = 45 min.Thus, in a shallow convective case, a more complicated distribution function that encompasses the effect of cloud lifecycles should be used.This statement is supported by the improvement in performance of the stochastic model in the case of a mixed Weibull distribution including the explicit cloud lifecycles (mix wei.τ = αm β , Fig. 9).The reason for this improvement could be the generalisation of the cloud rate distribution, the introduction of the second distribution mode, the introduction of the cloud lifecycles, or a combination of all three.We examine all three reasons in the rest of the paper.
As a tool for quantitative comparison between the frequency distribution resulting from different runs of the stochastic model and the reference distribution obtained from the LES coarse graining, we use the Hellinger distance as a measure of distribution similarity.The Hellinger distance H between the two discrete probability distributions P and Q is defined as relative frequency H = 0.01785 q q q q q q q q q q q q q q q q q q q q q LES mix wei.relative frequency H = 0.02345 q q q q q q q q q q q q q q q q q q q q (c) 6.4 km 0.00 0.02 0.04 0.06 0.08 0.10 0.00 0.05 0.10 0.15 0.20 total mass flux (kg s −1 m −2 ) relative frequency H = 0.04088 q q q q q q q q q q q q q q q q q q q q (d) 12.8 km relative frequency H = 0.06717 q q q q q q q q q q q q q q q q q q q q (e) 25.6 km relative frequency H = 0.11669 q q q q q q q q q q q q q q q q q q q q where p i and q i are the corresponding probability measures.
A useful property of the Hellinger distance is its skew independence, which enables us to compare the scores between the distribution pairs of different skewness resulting from the different choice of horizontal grid resolution (Fig. 9).The Hellinger distance H confirms a high level of similarity between the distributions of different resolution pairs, with the H values in a very low range, from 0.018 to 0.12 (Fig. 9a-e).Comparison of the results from the stochastic model set-up using a single exponential function vs. a mixed exponential or a mixed Weibull function via Hellinger distance shows the importance of modelling the two distribu-tion modes for each cloud group separately (Fig. 10).For the distribution similarity, the introduction of the second mode in the cloud rate distribution (mix exp. vs. exp., Fig. 10) has a larger impact than the explicit modelling of the cloud lifecycles (exp.τ = αm β vs. exp.τ = 20 min, Fig. 10).The difference in performance of a mixed exponential case vs. a mixed Weibull case (i.e.k = 1 vs. k = 0.7) is not so evident from the point of view of frequency distribution match, but it becomes distinct for evaluation of the variability measure (see Sect. 4.2).

Quantifying the variability
According to the theory of fluctuations in a convective ensemble (Sect.3.3), the normalised variance of the compound distribution scales inversely with the cloud number following Eqs.( 29)-( 31).With the increasing complexity of the cloud rate distribution, from a single mode exponential to a mixed Weibull distribution, the variance of subgrid convective states becomes more accurately represented (Fig. 11a), taking the LES coarse-grained variance scaling (RICO_140 6-12 h) as a reference case.The magnitude of normalised variance is controlled by the number of clouds in the subgrid regions.The smaller the grid box, the smaller the number of clouds it can contain, and the variance gets higher.Here, the cloud lifecycles play a role as well, since the cloud number will be influenced by the individual cloud lifetimes (see Sect. 4.3).The effect of introducing a second distribution mode (exp.to mix exp.) on the variance scaling is approximately equal to the effect of a generalisation of the cloud rate distribution from exponential to Weibull (mix exp. to mix wei., Fig. 11a).The latter points to the fact that the shape parameter k has a significant impact on the variance (Fig. 11b and Eq.29), since the change from a mixed exponential to a mixed Weibull distribution happens through the change in k from 1 to 0.7.The effect of excluding the explicit cloud lifecycles from the model formulation using a single exponential distribution mode (exp.τ = 20 min, Fig. 11a) is a minor and negligible improvement, but it still reveals a more correct formulation of the model.
The parameter k controls the range of the cloud mass flux that can be sampled from the probability density function in the model.Setting the value of the shape parameter to 0.6 ≤ k ≤ 0.7, the stochastic model generates a cloud ensemble with a large number of short-lived small clouds and fewer large clouds, which fits the cloud ensemble of the RICO case (Fig. 11b).When increased to k = 1 (mix exp.Fig. 11b), this parameter describes a cloud ensemble of equal lifetimes not depending on the cloud size.Constrained by the model formulation, the exponential probability distribution function, from which the cloud mass fluxes are sampled, does not span across a large enough range of the cloud mass flux values to match the results from the LES.With the decrease in k, the sensitivity of the variance scaling becomes higher, which means that in a cloud ensemble with more diversity in the cloud lifecycles, the shape of the distribution changes faster with the further increase in diversity.
The sensitivity of the stochastic model is also tested with regard to the exponent of the cloud lifetime relation, β.A relatively large range for β i , i = 1, 2 is explored (Fig. 11c), and Eq. ( 29) is used as a theoretical model for this test.The variability of convection does not depend highly on the exponent β of the cloud lifetime relation Eq. ( 11), as long as the lifetime increases with the cloud mass flux following a power law within the dispersion range of Fig. 3b.
The stochastic model was constructed using the assumption of a random cloud field with non-interacting cloud elements (clouds could interact only through the large-scale flow).From the results presented in Figs. 9 and 11, we conclude that this assumption is valid for a quasi-random cloud field (Fig. 1a-c, e, and g) before the emergence of cloud clusters or arcs.With the ageing of the cloud field, the variability does not change unless the cloud field starts to show a pronounced spatial organisation.Therefore, we test the effects of organisation on the variability of small-scale convective states (Fig. 11d).The variance produced by clustering of the clouds in the time frame from 12 to 18 h (Fig. 1d, f) and organisation into the mesoscale structures during the time frame from 18 to 24 h (Fig. 1f, h) have approximately the same magnitudes as the effects of the convective intensity in the domain in terms of the range of cloud mass flux of individual clouds in a domain.The emerging organisation of clouds will cause a decrease in the shape parameter of the mass flux distribution, though this decrease will be small and visible as a change in a distribution tail (not shown here).This indicates that the effects of organisation are important for the convective variability, but they are clearly not introduced solely through the mass flux distribution and the individual cloud lifecycles.We speculate here that the additional source of memory and spatial correlations related to the mesoscale organisation are a mechanism responsible for the increase in variance.Convective organisation and the correct convective variability are not represented in commonly used deterministic convective parameterisations in numerical weather and climate models.Stochastic approaches are a promising tool for addressing this problem; a good example of a mechanism for parameterisation of convective organisation is the cellular automaton (e.g.Palmer, 2001;Bengtsson et al., 2013).How a stochastic model, assuming a locally random cloud field, will be able to model convective or-for the scales larger than 20 km.In case (4) the error in the ensemble mean is between 0.42 and 0.74 %, which is larger than the error in case (5), but is not increasing with the coarsening of the resolution.However, due to the compensation of the error in the ensemble mean with the error of undersampling of the mass flux distribution function and the error introduced by excluding the cloud lifecycles, the Hellinger distance in case (4) is lower than in case (5) for the coarse grid resolutions.
As a result of equal lifetimes in a cloud population (cases 1-3), convective compound variance is overestimated by the same amount for the different choices of the cloud lifetime (Fig. 12b).This independence from the specific value of the constant lifetime (from 10 to 30 min) means that, on the gridscale level, the system has no memory and the effects of the individual clouds average out.The same would apply for case (4) if we test for a different constant τ , with the difference that the underestimation of the variance in this case comes from the distribution shape choice (k = 1 vs. k = 0.7).In case (5), the effects of convective memory will be carried on by the clouds that are small in number but that live longer.On the other hand, a large number of small short-lived clouds will have less effect on the future state of convection, which depicts a more realistic situation.
The question of consistency in the model formulation enters here.The error compensation in case (4) can be justified by the consistency in combining the different effects in the model formulation, which is more important than the accuracy and complexity in the representation of the separate processes.There are two options for the model formulation, consistent with our understanding of the cloud ensemble statistics: 1. a memoryless system, bearing in mind the stationarity of our case, which should be described using a mixed exponential distribution and a constant lifetime among clouds (similar to PC-2008), and 2. a system with memory, with diverse cloud lifecycles modelled explicitly and with the corresponding Weibull distribution for the cloud-base mass flux.
This raises the question of the importance of the system memory, introduced by the diversity of cloud lifecycles, for the parameterisation of convection.From the results shown in Fig. 12b we conclude that the convective memory, and hence the model set-up (2), is necessary to reproduce the convective variability accurately, with a higher importance of the system memory for the more diverse cloud field (smaller k) and for the higher model resolution.
In the reference case of the stochastic model test runs, which corresponds to model set-up (2), the cloud vertical velocity is set to a constant value applied to all clouds, and the cloud lifetime is sampled from a deterministic power-law relation to the cloud-base mass flux.This is in disagreement with the empirical results from LES, which show a highly scattered joint distribution for both quantities (Fig. 3).Is a deterministic relation between the mass flux and other cloud properties a valid approximation?The variance of compound Poisson distribution depends on the number of convective elements in a model grid box, and scales as Var[M]/(E[M]) 2 = 2/ N (Craig and Cohen, 2006b).With the of the cloud-base mass-flux-dependent cloud lifetime Eq. ( 30), this relation incorporates a weak dependence of variance on the cloud lifetime relation through the exponent β, while in the case of the more general Weibull distribution Eq. ( 29), also on parameter k.Having in mind such weak dependence of variance on the cloud lifetime relation (Fig. 11c), it is not likely that the variability could be enhanced by the conditional random sampling of the joint probability distribution of the cloud-base mass flux and cloud lifetime.Therefore, there is no need for the further sophistication of the stochastic model; i.e. a deterministic relation between the cloud mass flux and other cloud properties is sufficient.

Summary and conclusions
Subgrid-scale convective processes can be related to the mean large-scale field through a parameterisation that comprises a deterministic component, a stochastic component and the convective memory carried by the finite lifecycles of clouds.These three components change in their contribution to the overall subgrid effects, depending on the resolution of the model.Thus, a cloud parameterisation should be developed in such a way as to adapt to the different resolutions of model grid and model time step.
We have studied the fluctuations in a shallow convective ensemble of the Rain in Cumulus over the Ocean (RICO) case, which is a precipitating shallow convective case in the trade-wind region.Shallow cumulus ensemble statistics are analysed using LES, and cloud tracking is applied to study the cloud lifecycles.The theory of fluctuations in an equilibrium convective ensemble of Craig and Cohen (2006b) is extended and applied to shallow convection, combining it with the empirical findings.As a first step towards a stochastic shallow convective parameterisation, the stochastic standalone model has been developed.The model is based on an approach similar to the PC-2008 stochastic scheme, in which the subgrid convective state is represented as a sub-sample of the full convective ensemble.
The diversity of shallow cloud lifecycles causes the deviation of the cloud-base mass flux distribution from the exponential memoryless distribution.Therefore, we introduce the dependence of the cloud mass flux on the cloud lifetime by generalising the cloud mass flux distribution to a Weibull probability density function.In this way, the variability of cloud lifecycles is introduced in the stochastic representation of shallow convection.We also account for the different shallow cloud subtypes by defining two modes of the cloud-base mass flux distribution.
The convective ensemble average statistics and convective variability are constrained by the model closure by setting implicitly the value of two parameters, the average mass flux per cloud m and the average ensemble cloud lifetime τ .The model formulation is such that, depending on how these two parameters might change due to the forcing, the underlying distribution and its relation to the cloud lifecycles would dynamically adapt to these changes.
Clouds are initiated in a model grid box assuming that their number follows the Poisson distribution and the cloud-base mass flux is drawn randomly for each cloud from the mixed Weibull probability density function.The model is forced with the domain ensemble average cloud properties from LES and the probability density function parameters are fitted theoretically using a formulation for the system closure.Limited sampling of clouds in a model grid box results in the compound Poisson distribution of small-scale convective states, which possesses an inherent property of scale adaptivity.In this way the model is constrained to give the correct ensemble average values, and the variability of subgrid convective states is reproduced in a physically based manner.
As a measure of convective variability, the variance of the subgrid compound distribution is dependent on the number of clouds in a grid box and the range of their cloud-base properties.We show that the correct variability can be reproduced by the model by accounting for the system memory through the cloud-base mass flux distribution and by modelling the cloud lifecycles explicitly.The resulting histograms of subgrid convective states are simulated with a high level of agreement with LES across the different scales.Even though the individual cloud properties are highly dispersed, the compound distribution of subgrid convective states is robust and insensitive to the randomness of local cloud properties.This implies that the simplicity of the stochastic model can be retained and that the assumption about deterministic relations between the cloud mass flux and other cloud properties is valid.
This study provides a generalisation of the convective ensemble theory of Craig and Cohen (2006b), using a formulation that attempts to unify the stochastic parameterisation of shallow and deep convective clouds depending on two parameters: τ and m .These parameters are related to the large-scale information that is controlled by the convective regime, and are possibly also dependent on the changes in the large-scale forcing.Therefore, it is necessary to develop a closure for these two parameters, based on the largescale processes controlling the atmospheric boundary layer and transition to deep convection.In this paper, we establish the applicability of the convective fluctuation theory to shallow convection, generalising it by the introduction of system memory.
In future work, the stochastic model will be developed further by coupling it to an existing mass flux-based shallow convective parameterisation in a numerical model.

Figure 1 .
Figure 1.Snapshots taken every 6 h during the simulation showing the cloud albedo: the higher cloud droplet number density RICO case (RICO-140) vs. the standard RICO case (RICO-GCSS).

Figure 2 .
Figure 2. Semi-logarithmic plot of the cloud rate probability density function of cloud-base mass flux for the different cloudy point definitions (1-3).This plot corresponds to the RICO-140 simulation time frame of 6-12 h.
Figure 3. (a) Cloud rate density distribution of cloud-base mass flux with the split into active and passive distribution modes.(b) Scatter plot of the cloud lifetime and average cloud mass flux and (c) cloud lifetime and average in-cloud vertical velocity.Active clouds are shown as red points, while the passive clouds are in blue.The nonlinear least square fit of the relation τi = α i m β i , i = 1,2 is plotted for both cloud groups, with parameters α i and β i corresponding to the passive (1) and active (2) cloud groups.Vertical velocity w i , i = 1, 2 is averaged over all clouds in each group and plotted as a horizontal line.
failure rate function h as a count of failures per an interval of mass flux ∆m = 10000 kg/s, conditioned on the lifetime average mass flux

Figure 4 .
Figure 4. Semi-logarithmic plots of the cloud rate density distribution of cloud-base mass flux and the cloud failure rate function.These plots correspond to the RICO-140 simulation time frame of 6-12 h.The cloud rate density distribution is fitted using the mixdist R package (R Core Team, 2013), and the distribution shape parameter is set as equal for both distribution modes: k 1 = k 2 = k.

Figure 5 .
Figure 5. Histograms of the fractional cloud cover at the 700 m height level for the different horizontal resolutions of the LES coarse graining.

Figure 7 .
Figure7.The total cloud number time series, and a corresponding histogram plot with a fit to the Poisson model, and a Q-Q plot as a goodness of fit test.The distribution is fitted using the method of moments, while the histograms and Q-Q plots are made using R libraries(R Core Team, 2013).The time interval between the snapshots is 10 min.

Figure 8 .
Figure 8. Idealised function for the cloud lifecycle (red) and the examples of individual cloud lifecycles (gray dots) from the LES RICO-140 case, after the cloud tracking.

Figure 9 .
Figure 9. Histograms of the compound cloud mass flux at the 700 m height level normalised by the grid box area of the different horizontal resolution: coarse-grained LES tracking results vs. stochastic model results.Plots show the two stochastic model cases: a two-component mixed Weibull case with explicit cloud lifecycles (k = 0.7; coloured lines) and a single-mode exponential case without cloud lifecycles (k = 1; coloured dots).Colours also correspond to Fig. 5. Hellinger distance H stands for the mixed Weibull case.

Figure 10 .
Figure10.Comparison of the Hellinger distance between the distribution pairs from simulations using different model configurations: a single exponential (exp.)configuration with and without cloud lifecycles, and a mixed exponential (mix exp.) and mixed Weibull (mix wei.) configuration with explicit cloud lifecycles.

Figure 12 .
Figure 12.Comparison of the distribution pairs from the simulations using a constant and the mass-flux-dependent cloud lifetime.

Table 1 .
Contribution of the different cloud subtypes r N , r C and r M to the total cloud number N , cloud fraction C and vertical mass flux M , respectively.Given results are the time averages for the time frame 6-12 h of the LES RICO-140 simulation.

Table 2 .
Model closure parameters estimated from the cloud tracking results.

Table 3 .
Parameters for the model formulation with the twocomponent mixed Weibull distribution.

Table 4 .
Ensemble average cloud properties resulting from the stochastic model ensemble runs with the different horizontal resolutions.