Understanding the errors caused by spatial-scale transformation in Earth observations and simulations requires a rigorous definition of scale. These errors are also an important component of representativeness errors in data assimilation. Several relevant studies have been conducted, but the theory of the scale associated with representativeness errors is still not well developed. We addressed these problems by reformulating the data assimilation framework using measure theory and stochastic calculus. First, measure theory is used to propose that the spatial scale is a Lebesgue measure with respect to the observation footprint or model unit, and the Lebesgue integration by substitution is used to describe the scale transformation. Second, a scale-dependent geophysical variable is defined to consider the heterogeneities and dynamic processes. Finally, the structures of the scale-dependent errors are studied in the Bayesian framework of data assimilation based on stochastic calculus. All the results were presented on the condition that the scale is one-dimensional, and the variations in these errors depend on the differences between scales. This new formulation provides a more general framework to understand the representativeness error in a non-linear and stochastic sense and is a promising way to address the spatial-scale issue.

Introduction

The spatial scale in Earth observations and simulations refers to the observation footprint or model unit in which a geophysical variable is observed or modelled (scale is used below as an abbreviation for spatial scale). Scale is traditionally defined in terms of distance, which is not adequate both because distance is a one-dimensional quantity while scale generally refers to a two- or three-dimensional space and because the scale may change in a very complicated manner (for example, from an irregular observation footprint to a square observation footprint). Generally, the scale is not explicitly expressed in the dynamics of a geophysical variable, partially because a rigorous definition of scale is difficult to find, except for an intuitive conception (Goodchild and Proctor, 1997) and certain qualitative classifications of scale (Vereecken et al., 2007). This reflects the complexity of scale and consequently requires a more rigorous mathematical conceptualisation of scale.

The scale transformation of a geophysical variable may result in significant errors (Famiglietti et al., 2008; Crow et al., 2012; Gruber et al., 2013; Hakuba et al., 2013; Huang et al., 2016; Li and Liu, 2017; Ran et al., 2016). These errors are mainly caused by the strong spatial heterogeneities (Miralles et al., 2010; Li, 2014) and irregularities (Atkinson and Tate, 2000) that are associated with geophysical variables across different scales, and are also closely related to dynamic variations, e.g. in hydrological (Giménez et al., 1999; Vereecken et al., 2007; Merz et al., 2009; Narsilio et al., 2009), soil (Ryu and Famiglietti, 2006; Lin et al., 2010) and ecological (Wiens, 1989) processes. How to elucidate the scale transformation by developing mathematical tools has yet to be fully addressed.

Data assimilation could be an ideal tool to explore the scale transformation because it presents a unified and generalised framework in Earth system modelling and observation (Talagrand, 1997). Geophysical data are typically observed by various Earth observations; thus, updating the observation data in a data assimilation system may result in scale transformations between the observation space and system state space. If the observation operator is strongly non-linear and complex, the errors caused by the scale transformation are even more serious (Li, 2014). An important concept that is related to the scale transformation in data assimilation is “representativeness error”, which is associated with the inconsistency in the spatial and temporal resolutions between states, observations and operators (Lorenc, 1986; Janjić and Cohn, 2006; van Leeuwen, 2014; Hodyss and Nichols, 2015), and with the physical information that a numerical operator is missing compared to the ideal operator (van Leeuwen, 2014), such as the discretisation of a continuum model or the neglect of necessary physical processes. The representativeness error and instrument error make up the observation error of data assimilation. Under the Gaussian assumption, they are independent of each other (Lorenc, 1995; van Leeuwen, 2014). This study will not consider the instrument error when formulating the scale transformation in data assimilation.

Recently, approaches have been developed to assess the representativeness error. Janjić and Cohn (2006) studied the representativeness error by treating the system state as the sum of resolved and unresolved portions. Bocquet et al. (2011) used a pair of operators, namely, restriction and prolongation, to connect the finest regular scale with a coarse scale, and determined the representativeness error using a multi-scale data assimilation framework. van Leeuwen (2014) considered two complicated cases, i.e. an observation vector at a finer resolution than the system state vector, and the assimilation of retrieved variables. The solutions were formulated using an agent in observation or state space, and a particle filter was proposed to treat the non-linear relationship between observations, states and retrieved values. Hodyss and Nichols (2015) also estimated the representativeness error by investigating the difference between the truth and the inaccurate value that is generated by the forecasting model.

Although these approaches explored the structure of the representativeness error and offered various solutions, improvements are still necessary to investigate the exact expression of the errors caused by scale transformation in data assimilation. The authors believe that these approaches are optimal in linear systems but may not be suitable when observations are heterogeneous and sparse, or when operators are non-linear between states and observations, although the general equations in the non-linear case were given. Without taking heterogeneities and non-linear operators into account, the representativeness error cannot be fully understood. However, heterogeneity varies depending on the situation and is difficult to formulate in a general theoretical study.

Data assimilation studies based on stochastic processes (Apte et al., 2007; Miller, 2007) or a stochastic dynamic model (Miller et al., 1999; Eyink et al., 2004) have been proposed recently. Compared to deterministic models, stochastic data assimilation is more applicable in an integrated and time-continuous theoretical study (Bocquet et al., 2010) and creates an infinite sampling space of the system state (Apte et al., 2007). Although the theorems of calculus that are based on stochastic processes (or stochastic calculus) are different from those of ordinary calculus, these advantages suggest that stochastic data assimilation offers a more general framework to study scale transformation.

We attempt to explore the mathematical definitions of scale and scale transformation, and then formulate the errors caused by the scale transformation in stochastic data assimilation in a general theoretical study. The next section introduces the basic concepts and theorems of measure theory, stochastic calculus and data assimilation. In Sect. 3, we present the definitions of scale and scale transformation. The posterior probability of the system state is also reformulated by scale transformation in a stochastic data assimilation framework. In the final section, the contributions and deficiencies of this study are discussed.

Basic knowledge

The scale greatly depends on the geometric features of a certain observation footprint or model unit. The model unit is a specified subspace where a geophysical variable evolves in the model space; it could be a point, a rectangular grid, or an irregular unit such as a response unit (watershed, landscape patch, etc.). We offer a solution in which the definition of scale uses measure theory and the expression of a geophysical variable as a stochastic process uses stochastic calculus. Therefore, we first introduce several basic concepts of measure theory and stochastic calculus.

Measure theory

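Let $\Omega$ be an arbitrary non-empty space. $F$ is a σ-algebra (or σ-field) of subsets of $\Omega$ if it satisfies the following conditions:

(i) $\Omega \in F$, and the empty set $\Phi \in F$;

(ii) $A \in F$ implies that its complementary set $A^c \in F$;

(iii) $A_1, A_2, \ldots \in F$ implies that their union $A_1 \cup A_2 \cup \cdots \in F$.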

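A set function $\mu$ on $F$ is called a measure if it satisfies the following conditions:

(i) $\mu(A) \in [0, \infty)$ and $\mu(\Phi) = 0$;

(ii) if $A_1, A_2, \ldots \in F$ is any disjoint sequence and $\bigcup_{k=1}^{\infty} A_k \in F$, then $\mu$ is countably additive, i.e. $\mu\left(\bigcup_{k=1}^{\infty} A_k\right) = \sum_{k=1}^{\infty} \mu(A_k)$.

If $\mu(\Omega) = 1$, $\mu$ can be replaced by the probability measure $p$, and if $\mu$ is finite, $p$ can be calculated as $p(A) = \mu(A)/\mu(\Omega)$. The triples $(\Omega, F, \mu)$ and $(\Omega, F, p)$ are the measure space and probability measure space, respectively.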

Let $\Omega$ be the set of real numbers $R$ and let the σ-algebra $B$ be the Borel algebra, which is generated by all closed intervals in $R$. Then, for $A = [a, b] \in B$, the Lebesgue measure on $R$ is defined as $I(A) = b - a$. Intuitively, the Lebesgue measure on $R$ coincides with the length.

An $n$-dimensional Lebesgue volume is defined to measure the standard volumes of the subsets in $R^n$ based on $I^n(A) = \prod_{k=1}^{n} (b_k - a_k)$, where $A = \{x : a_k \le x_k \le b_k,\ k = 1, 2, \ldots, n\}$ is an $n$-dimensional regular cell in $R^n$. The $n$-dimensional Lebesgue volume is an ordinary volume, such as length ($n=1$), area ($n=2$) and volume ($n=3$).

Next, the outer measure is defined as $m^n(A) = \inf \sum_{i=1}^{+\infty} I^n(A_i)$, where $\inf$ is the infimum, $A_i = \{x : a_{i,k} \le x_k \le b_{i,k},\ k = 1, 2, \ldots, n\}$ is an $n$-dimensional regular cell in $R^n$, and $A \subseteq \bigcup_{i=1}^{+\infty} A_i$. Thus, if $A$ is any subset of $R^n$, one can collect many sets of $n$-dimensional regular cells $\{A_i\}$ to cover $A$. The outer measure is the infimum, over all such covers, of the total $n$-dimensional Lebesgue volume of the covering cells.

In general, the outer measure does not satisfy the two conditions of a measure on arbitrary subsets of $R^n$, but restricted to the Lebesgue σ-algebra $L^n$ of $R^n$ it defines the Lebesgue measure $m^n$ on the measure space $(R^n, L^n, m^n)$. The construction of the Lebesgue σ-algebra is based on the Caratheodory condition (Bartle, 1995, definition 13.3). Fortunately, almost all observation footprints and model units are bounded and closed; therefore, they are Lebesgue measurable. This ensures that the Lebesgue measure $m^n$ is a measure and that the triple $(R^n, L^n, m^n)$ is a measure space. The Lebesgue measure of a Lebesgue measurable subset of $R^n$ also coincides with its volume.
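The covering idea behind the outer measure can be illustrated numerically. The following is a minimal sketch, not part of the original formulation: it covers a disc with the cells of a regular grid (the grid, the disc and the helper names `cell_volume` and `disc_cover_area` are assumptions made for illustration) and sums their two-dimensional Lebesgue volumes. Each grid gives an upper bound on the area, and finer grids move this bound towards $\pi r^2$.

```python
import math


def cell_volume(bounds):
    """Lebesgue volume I^n(A) = prod_k (b_k - a_k) of a regular cell.

    `bounds` is a list of (a_k, b_k) pairs, one per dimension.
    """
    volume = 1.0
    for a, b in bounds:
        volume *= (b - a)
    return volume


def disc_cover_area(radius, n):
    """Total area of the cells of an n-by-n grid on [-r, r]^2 that meet the disc.

    The selected cells cover the closed disc of the given radius centred at
    the origin, so the returned value is an upper bound on the disc area.
    """
    h = 2.0 * radius / n
    total = 0.0
    for i in range(n):
        x_lo, x_hi = -radius + i * h, -radius + (i + 1) * h
        # Point of [x_lo, x_hi] closest to the disc centre (the origin).
        nearest_x = min(max(0.0, x_lo), x_hi)
        for j in range(n):
            y_lo, y_hi = -radius + j * h, -radius + (j + 1) * h
            nearest_y = min(max(0.0, y_lo), y_hi)
            # The cell meets the disc if its nearest point lies inside it.
            if nearest_x ** 2 + nearest_y ** 2 <= radius ** 2:
                total += cell_volume([(x_lo, x_hi), (y_lo, y_hi)])
    return total


for n in (10, 50, 200):
    print(n, disc_cover_area(1.0, n), math.pi)
```

A regular grid is only one family of covers; the outer measure itself is the infimum over all countable covers by regular cells.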

The $n$-dimensional Lebesgue integral in $(R^n, L^n, m^n)$ is $\int f \, dm^n$, where $f$ is a real function on $R^n$. The Lebesgue integral can be further denoted by $\int f \, dm^n = \int f(x) \, dx$, where $x \in R^n$ and $x = (x_1, \ldots, x_n)$.

In the two-dimensional case ($n=2$), the Lebesgue integral is $\iint_A f(x_1, x_2) \, dx_1 dx_2$, where $A \in L^2$. Next, we consider the Lebesgue integration by substitution on $R^2$. Let $T(x_1, x_2) = (t_1(x_1, x_2), t_2(x_1, x_2)) = (y_1, y_2)$ be a one-to-one mapping of a subset $X$ onto another subset $Y$ on $R^2$. Assuming that $T$ is continuous and has a continuous partial derivative matrix

$$T_x = \begin{pmatrix} \partial t_1/\partial x_1 & \partial t_1/\partial x_2 \\ \partial t_2/\partial x_1 & \partial t_2/\partial x_2 \end{pmatrix},$$

then

$$\iint_Y f(y_1, y_2) \, dy_1 dy_2 = \iint_X f(T(x_1, x_2)) \, |J(x_1, x_2)| \, dx_1 dx_2,$$

where the Jacobian determinant is $J(x_1, x_2) = \det T_x$. If $T$ is linear, $J$ is constant and the integral reduces to $\iint_Y f(y_1, y_2) \, dy_1 dy_2 = |J(x_1, x_2)| \iint_X f(T(x_1, x_2)) \, dx_1 dx_2$. By doing so, any observation footprint or model unit can be regarded as a Lebesgue measurable subset in a two-dimensional space $R^2$.

Additional details regarding measure theory can be found in the literature (for example, Billingsley, 1986; Bartle, 1995).

Stochastic calculus

We then introduce some necessary concepts and theorems of stochastic calculus without proofs; their detailed derivations can be found in the literature (Itô, 1944; Karatzas and Shreve, 1991; Shreve, 2005).

Stochastic calculus extends ordinary calculus to integrals with respect to stochastic processes. One of the simplest stochastic processes defined on $(\Omega, F, p)$ is Brownian motion $W$. It is characterised as follows:

(i) $W(0) = 0$.

(ii) $\forall\, t_1 > s_1 \ge t_2 > s_2 \ge 0$, the increments $W(t_1) - W(s_1)$ and $W(t_2) - W(s_2)$ are independent.

(iii) $\forall\, t > s \ge 0$, $W(t) - W(s) \sim N(0, t - s)$.

The last two conditions mean that, $\forall\, t_2 > s_2 \ge t_1 > s_1 \ge 0$, $W(t_2) - W(s_2)$ and $W(t_1) - W(s_1)$ are independent Gaussian random variables.

Stochastic calculus based on Brownian motion produces an Ito process. The differential form of the time-dependent Ito process is

$$dI = \varphi(t) \, dt + \sigma(t) \, dW(t), \tag{1}$$

where $\varphi(t)$, $\sigma(t)$ and $W(t)$ are the drift rate, volatility rate and Brownian motion, respectively. The integral form of Eq. (1) is

$$I(t) = I(0) + \int_0^t \varphi(u) \, du + \int_0^t \sigma(u) \, dW(u). \tag{2}$$

Theorem 1: For any Ito process defined as in Eq. (1), the quadratic variation that is accumulated on the interval $[0, t]$ is

$$[I, I](t) = \int_0^t \sigma^2(u) \, du, \tag{3}$$

and the drift of Eq. (1) is $I(0) + \int_0^t \varphi(u) \, du$.

As distinguishing features of stochastic calculus, the quadratic variation and drift can be regarded as stochastic versions of the variance and expectation, respectively. That is, the variance and expectation are instances of their stochastic counterparts within a certain integral path. Therefore, rather than being constants, the quadratic variation and drift are given in terms of probability.
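As a numerical illustration of the Ito process and of its quadratic variation, the sketch below discretises Eq. (1) with an Euler-Maruyama scheme. The scheme, the function name `simulate_ito` and the constant rates in the example call are assumptions chosen for illustration; they are not taken from the derivation above.

```python
import math
import random


def simulate_ito(phi, sigma, t_end, n_steps, i0=0.0, seed=0):
    """Euler-Maruyama path of dI = phi(t) dt + sigma(t) dW(t) on [0, t_end].

    Returns the sampled path and the realised quadratic variation, i.e. the
    sum of squared increments, which approximates int_0^t sigma^2(u) du.
    """
    rng = random.Random(seed)
    dt = t_end / n_steps
    path = [i0]
    quadratic_variation = 0.0
    for k in range(n_steps):
        u = k * dt
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment ~ N(0, dt)
        increment = phi(u) * dt + sigma(u) * dw
        quadratic_variation += increment ** 2
        path.append(path[-1] + increment)
    return path, quadratic_variation


# Hypothetical constant rates: phi = 1, sigma = 2, so int_0^1 sigma^2 du = 4.
path, qv = simulate_ito(lambda u: 1.0, lambda u: 2.0, t_end=1.0, n_steps=20000)
print("realised quadratic variation:", qv)
print("value at t = 1:", path[-1])
```

The realised quadratic variation is nearly the same for every seed, whereas the end point of the path is random around the drift.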

Theorem 2 (Ito's Lemma): If the partial derivatives of the function $f(u, I)$, viz. $f_u(u, I)$, $f_I(u, I)$ and $f_{II}(u, I)$, are defined and continuous, then for every $t \ge 0$ we have

$$f(t, I(t)) = f(0, I(0)) + \int_0^t f_u(u, I(u)) \, du + \int_0^t f_I(u, I(u)) \sigma(u) \, dW(u) + \int_0^t f_I(u, I(u)) \varphi(u) \, du + \frac{1}{2} \int_0^t f_{II}(u, I(u)) \sigma^2(u) \, du. \tag{4}$$

Ito's Lemma is typically used to build the differential of a stochastic model with Ito processes. In this study, Ito's Lemma is applied to study the scale-dependent relationship between the observation and state and the errors caused by scale transformation.
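The role of the second-order term in Ito's Lemma can be checked numerically. The following sketch is an illustration only: it assumes the test function $f(u, I) = I^2$ and constant drift and volatility rates (none of which come from the text), accumulates the right-hand side of Ito's Lemma along a discretised path, and compares it with $f(t, I(t))$.

```python
import math
import random


def ito_lemma_check(phi, sigma, t_end, n_steps, i0=1.0, seed=0):
    """Compare f(t, I(t)) with the right-hand side of Ito's Lemma for f = I**2.

    phi and sigma are constant rates (an assumption of this sketch).
    Returns (left-hand side, right-hand side).
    """
    rng = random.Random(seed)
    dt = t_end / n_steps
    i_val = i0
    rhs = i0 ** 2  # f(0, I(0))
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        f_i = 2.0 * i_val  # first derivative of f with respect to I
        f_ii = 2.0         # second derivative of f with respect to I (f_u = 0)
        rhs += f_i * sigma * dw + f_i * phi * dt + 0.5 * f_ii * sigma ** 2 * dt
        i_val += phi * dt + sigma * dw
    return i_val ** 2, rhs


lhs, rhs = ito_lemma_check(phi=0.5, sigma=1.0, t_end=1.0, n_steps=20000)
print("f(t, I(t)):", lhs)
print("Ito's Lemma:", rhs)
```

Dropping the term with $f_{II}$, as the ordinary chain rule would, shifts the right-hand side by $\sigma^2 t$ in this example, which is far larger than the discretisation error.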

Traditional formulation of data assimilation in the Bayesian theorem framework

We use the well-accepted Bayesian theory of data assimilation (Lorenc, 1995; van Leeuwen, 2015) to investigate its time- and scale-dependent errors. State and observation are first assumed to be one-dimensional.

A non-linear forecasting system can be described by

$$X(t_k) = M_{k-1:k}(X(t_{k-1})) + \eta(t_k), \tag{5}$$

where $M_{k-1:k}$, $X(t_k)$ and $\eta(t_k)$ represent a non-linear forecasting operator that transits the state from the discrete time $k-1$ to $k$, the state with prior probability distribution function (PDF) $p(X)$ and the model error at time $k$, respectively.

If a new observation is available at time $k$, the observation system is given by

$$Y^o(t_k) = H_k(X(t_k)) + \varepsilon(t_k), \tag{6}$$

where $H_k$, $Y^o(t_k)$ and $\varepsilon(t_k)$ represent the non-linear observation operator, true observation with prior PDF $p(Y)$ and observation error at time $k$, respectively.

Previous studies (e.g. Janjić and Cohn, 2006; Bocquet et al., 2011) described the origins of the components of $\varepsilon(t_k)$ and $\eta(t_k)$, such as white noise, the discretisation error of a continuum model, the errors that are caused by missing physical processes, and the scale-dependent bias. In this study, we assume that both forecasting and observation operators are perfect models; thus, errors caused by missing physical processes are discarded.

According to Bayesian theory, the posterior PDF of the state based on the addition of a new observation into the system is

$$p(X|Y) = \frac{p(Y|X) \, p(X)}{p(Y)}, \tag{7}$$

where $p(X|Y)$ is the posterior PDF that presents the PDF value of state $X$ given an available observation $Y$. $p(Y|X)$ is a likelihood function, which is the probability that an observation is $Y$ given a state $X$. $p(X)$ and $p(Y)$ are the prior PDF values of the state and observation, respectively. Here, $p(X)$ is supposed to be known and $p(Y)$ is a normalisation constant (van Leeuwen, 2014). The aim of data assimilation is equivalent to finding the posterior PDF $p(X|Y)$.
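The Bayesian update can be sketched on a one-dimensional grid, with the normalisation constant $p(Y)$ obtained by numerical integration. The Gaussian prior, the Gaussian likelihood with an identity observation operator, the observed value and the helper names below are all assumptions made for illustration; the formulation above does not require them.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def posterior_on_grid(grid, prior, likelihood):
    """Evaluate p(X|Y) = p(Y|X) p(X) / p(Y) on a uniform grid of state values.

    Returns the posterior values and the normalisation constant p(Y).
    """
    dx = grid[1] - grid[0]
    unnormalised = [likelihood(x) * prior(x) for x in grid]
    p_y = sum(unnormalised) * dx  # p(Y) as a Riemann sum
    return [u / p_y for u in unnormalised], p_y


def posterior_mean(grid, posterior):
    """Mean of a density tabulated on a uniform grid."""
    dx = grid[1] - grid[0]
    return sum(x * p for x, p in zip(grid, posterior)) * dx


# Hypothetical setting: prior X ~ N(0, 1), observation Y = X + noise, noise ~ N(0, 1).
grid = [-10.0 + 0.01 * i for i in range(2001)]
y_obs = 1.0
posterior, p_y = posterior_on_grid(
    grid,
    prior=lambda x: gaussian_pdf(x, 0.0, 1.0),
    likelihood=lambda x: gaussian_pdf(y_obs, x, 1.0),
)
print("p(Y):", p_y)
print("posterior mean:", posterior_mean(grid, posterior))
```

In this linear Gaussian setting the posterior mean lies halfway between the prior mean and the observation, which gives a simple check on the grid computation.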

Reformulation of scale transformation in data assimilation framework

Definition of scale

We define the scale based on the measure theory that was introduced in Sect. 2. The relationship between the Lebesgue measure in $(R^2, L^2, m^2)$ and scale is first introduced by the following measures of Earth observations.

(i) Measure of a single-point observation: when the observation footprint is very small and homogeneous, we assume that its footprint approaches zero, and its measure is accordingly zero under the condition of the Lebesgue measure.

(ii) Measure along a line: the measure is a one-dimensional Lebesgue measure.

(iii) Measure of a rectangular pixel (for example, remote sensing observation): for $A = \{x : a_k \le x_k \le b_k,\ k = 1, 2\}$, it is a two-dimensional Lebesgue volume, i.e. $\mu_{iii}(A) = I^2(A) = \prod_{k=1}^{2} (b_k - a_k)$.

(iv) Measure of a footprint-scale observation: the footprint is any bounded closed domain $A$, which need not be a regular rectangle but can also be a circle or an ellipse. We use the Lebesgue measure on $R^2$, i.e. $\mu_{iv}(A) = m^2(A) = \inf \sum_{i=1}^{+\infty} I^2(A_i)$, where $A_i = \{x : a_{i,k} \le x_k \le b_{i,k},\ k = 1, 2\}$ and $A \subseteq \bigcup_{i=1}^{+\infty} A_i$. Clearly, measures (i)–(iii) are special cases of the measure of a footprint-scale observation.

All of the above measures depend mainly on the shape and size of $A$. The Lebesgue measure on $R^2$ coincides with the area; thus, the Lebesgue integral of $\mu_{iv}(A)$ is $\iint_A dx_1 dx_2$, where the real function $f \equiv 1$.

Now, we can generalise the above examples by defining the scale as the Lebesgue measure with respect to the observation footprint. This definition can also be extended to a certain model unit. Thus, for any subset $A \in L^2$, the scale is $s = m^2(A) = \iint_A dx_1 dx_2$, where the real function $f \equiv 1$. From a geometric perspective, the measure function $m^2$ refers to the shape of the subset, and the scale further indicates its size.

We represent the scale as $s$, and let $s_0 = m_0^2(A_0) = \iint_{A_0} dx_1 dx_2 = 1$ be the standard scale, where $A_0 = \{x : 0 \le x_k \le 1,\ k = 1, 2\}$ is the unit square in $R^2$. The standard scale can be regarded as a basic unit of scale. It presents a standard reference by which one can make a quantitative comparison between different scales. The standard scale is also the origin of scales, which lets scales vary similarly to other physical quantities, such as time.

We can further define scale transformation. For $A_1, A_2 \in L^2$, if there are two different scales, $s_1 = m^2(A_1) = \iint_{A_1} dx_1 dx_2$ and $s_2 = m^2(A_2) = \iint_{A_2} dy_1 dy_2$, then we can obtain $s_2 = \iint_{A_2} dy_1 dy_2 = \iint_{A_1} |J(x_1, x_2)| \, dx_1 dx_2$ based on Lebesgue integration by substitution, where the Jacobian matrix $J(x_1, x_2)$ represents the geometric transformation from $A_1$ to $A_2$ and $|J(x_1, x_2)|$ denotes the absolute value of its determinant. In particular, if $J(x_1, x_2) = \mathrm{diag}(\xi, \xi)$, $\xi \in R$, which also indicates that the geometric transformation is linear, then the following expression is valid based on Lebesgue integration by substitution: $s_2 = |J(x_1, x_2)| \iint_{A_1} dx_1 dx_2 = \xi^2 s_1$. In this case, the change from $s_1$ to $s_2$ is said to follow the one-dimensional rule.

If two scales follow the one-dimensional rule, they are geometrically similar. This rule simplifies scale as a one-dimensional variable that corresponds to the scale transformations between most remote sensing images with various spatial resolutions. For example, let $A = \{x : a \le x_k \le b,\ k = 1, 2\}$, where $A$ and the unit square $A_0$ are geometrically similar; the scale $s = \mu_{iii}(A)$ can then be expressed by the one-dimensional rule of scale transformation: $s = \mu_{iii}(A) = |J(x_1, x_2)| \iint_{A_0} dx_1 dx_2 = (b - a)^2 s_0$. For another example, let $s = \iint_A dy_1 dy_2$ be the scale of a disc footprint $A$ with radius $r$. The mapping function between $A_0$ and $A$ is $T(x_1, x_2) = (r x_1 \cos 2\pi x_2,\ r x_1 \sin 2\pi x_2) = (y_1, y_2)$ with $0 \le x_1 \le 1$, $0 \le x_2 \le 1$, and the Jacobian determinant is

$$|J(x_1, x_2)| = \begin{vmatrix} r \cos 2\pi x_2 & -2\pi r x_1 \sin 2\pi x_2 \\ r \sin 2\pi x_2 & 2\pi r x_1 \cos 2\pi x_2 \end{vmatrix} = 2\pi r^2 x_1.$$

Therefore, $s = \iint_A dy_1 dy_2 = \iint_{A_0} |J(x_1, x_2)| \, dx_1 dx_2 = \pi r^2 s_0$, which is equal to its area. However, $s_0$ and $s$ do not obey the one-dimensional rule because the Jacobian matrix is not diagonal.
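Both examples of scale transformation can be reproduced numerically by integrating the absolute Jacobian determinant over the unit square $A_0$. The sketch below is illustrative only: the finite-difference Jacobian, the midpoint quadrature, the helper names and the numerical values of $a$, $b$ and $r$ are assumptions introduced here.

```python
import math


def jacobian_det(mapping, x1, x2, h=1e-6):
    """Determinant of the Jacobian of `mapping` at (x1, x2) by central differences."""
    y_p, y_m = mapping(x1 + h, x2), mapping(x1 - h, x2)
    z_p, z_m = mapping(x1, x2 + h), mapping(x1, x2 - h)
    d11 = (y_p[0] - y_m[0]) / (2.0 * h)  # dt1/dx1
    d21 = (y_p[1] - y_m[1]) / (2.0 * h)  # dt2/dx1
    d12 = (z_p[0] - z_m[0]) / (2.0 * h)  # dt1/dx2
    d22 = (z_p[1] - z_m[1]) / (2.0 * h)  # dt2/dx2
    return d11 * d22 - d12 * d21


def scale_by_substitution(mapping, n=100):
    """Scale s = iint_{A0} |J| dx1 dx2 with the midpoint rule on the unit square."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            x1, x2 = (i + 0.5) * h, (j + 0.5) * h
            total += abs(jacobian_det(mapping, x1, x2)) * h * h
    return total


def square_map(a, b):
    """Map the unit square onto the square [a, b]^2 (diagonal Jacobian)."""
    return lambda x1, x2: (a + (b - a) * x1, a + (b - a) * x2)


def disc_map(r):
    """Map the unit square onto the disc of radius r (non-diagonal Jacobian)."""
    return lambda x1, x2: (r * x1 * math.cos(2.0 * math.pi * x2),
                           r * x1 * math.sin(2.0 * math.pi * x2))


print("square [1, 4]^2:", scale_by_substitution(square_map(1.0, 4.0)), "expected", 9.0)
print("disc, r = 2:", scale_by_substitution(disc_map(2.0)), "expected", 4.0 * math.pi)
```

For the square the determinant is the constant $(b - a)^2$, which is the one-dimensional rule; for the disc it varies with $x_1$, so the scale cannot be obtained by a single multiplicative factor applied to the integrand-free integral.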

Layer 1 in Fig. 1 shows the relationship between the Lebesgue measure and scale. The measure space $\Omega = \{x : 0 \le x_k \le 4,\ k = 1, 2\}$ is regularly divided by the unit square $A_0$. Let scales $s_{C_1} = m^2_{C_1}(C_1)$, $s_{C_2} = m^2_{C_2}(C_2)$ and $s_{C_3} = m^2_{C_3}(C_3)$ be the Lebesgue measures of disc observation footprints $C_1$, $C_2$ and $C_3$, respectively. Then, $m^2_{C_1} = m^2_{C_2} = m^2_{C_3}$ because they are the same Lebesgue measure functions. That is, if $\{A_i\}$ is the set with the smallest volume that covers $C_1$, then similar sets $\{A_i + 2\}$ and $\{A_i \times 3 + 2\}$ can be used (with the origin located in the upper-left corner) to cover $C_3$ and $C_2$ with the smallest volumes, respectively. Here, $A_i + 2 = \{x_i : x_{i,k} + 2,\ x_{i,k} \in A_i,\ k = 1, 2\}$ and $A_i \times 3 + 2 = \{x_i : x_{i,k} \times 3 + 2,\ x_{i,k} \in A_i,\ k = 1, 2\}$, which shows that the functions $m^2_{C_1}$, $m^2_{C_2}$ and $m^2_{C_3}$ collect the desired set based on the same scheme; therefore, they are identical. Additionally, $s_{C_2} = m^2_{C_2}(C_2) = I^2(A_i \times 3 + 2)$ is much larger than $s_{C_1} = m^2_{C_1}(C_1) = I^2(A_i)$ and $s_{C_3} = m^2_{C_3}(C_3) = I^2(A_i + 2)$. Therefore, the scale of $C_2$ is not equal to the two other scales because the volumes of their subsets are different. However, their scales are governed by one-dimensional rules because their measures are identical and the Jacobian matrices between them are diagonal.

Figure 1. Diagram of the relationships among a Lebesgue measure, scale and geophysical variable.

Stochastic variables in data assimilation

Instead of using Eqs. (5) and (6), which are discrete in time, we use Ito process-formed expressions with the one-dimensional infinitesimals ds and dt to formulate a continuous-time (or continuous-scale) state and observation.

A geophysical variable can be regarded as a real function $V(s, t)$, which maps the space $(R^2, L^2, m^2)$ onto $R$, where $s$ is the scale, $s = m^2(A)$, $A \in L^2$, and $t$ is the time. In $n$-dimensional data assimilation, a geophysical variable $V$ is related to an element of the state vector $X$ at a specific scale $s$ and time $t$.

In Fig. 1, layer 2 presents a heterogeneous geophysical variable over the entire region. If we aggregate layer 2 into layer 1 and let each pixel intensity be the value of the geophysical variable in that pixel, then the measure space $\Omega$ is heterogeneous. A geophysical variable represents a spatial average over a specific observation footprint with a specific scale. Therefore, the geophysical variables in $C_1$ and $C_3$ are not equal because their observation footprints are different, and the geophysical variables in $C_2$ and $C_3$ also differ because the scale changes. The former shows that geophysical variables vary with location, and the latter shows that they are scale dependent.

If the statistical properties of the geophysical variable are available, we can construct an explicit stochastic equation for it. We introduce the time-dependent Ito process of Eq. (1) to define the geophysical variable process:

$$dV = p(t) \, dt + q(t) \, dW(t). \tag{9}$$

Similarly, the geophysical variable is supposed to evolve via a stochastic process for which the dynamic process and uncertainty are allowed to vary with scale,

$$dV = \varphi(s) \, ds + \sigma(s) \, dW(s), \tag{10}$$

where $\varphi(s)$ and $\sigma(s)$ are the scale-based drift rate and volatility rate, respectively. The geophysical variable is a probabilistic process with respect to scale and thus has scale-dependent errors, where the scale should shift forward or backward based on the condition that the scale follows the one-dimensional rule.

Equation (9) can be regarded as a continuous-time version of Eq. (5), i.e. the estimation of the state is equal to the integral of Eq. (9) over a time interval. Here, $p(t)$ indicates the physical process with respect to time, and $q(t)$ is the error caused only by the evolution of time; thus, the model error $\eta$ in Eq. (5) contains more parts than $q(t)$. Equation (10) implies that the value and variance of a geophysical variable may change if the scale changes. The formulation of $\varphi(s)$ should consider the spatial heterogeneities and physical process variations among different scales, which together constitute the deterministic part of a geophysical variable. However, neither of them is well understood in a general theoretical study. Therefore, $\varphi(s)$ is only conceptualised in Eq. (10). In particular, if the study region is homogeneous, then the values of a variable that are observed at the same place are identical between the large scale and fine scale, and $\varphi(s)$ can be left out. Due to the integral over the space of Brownian motion, $\sigma(s)$ is the stochastic part, meaning that scale transformation produces uncertainties.

The state in the forecasting step can be expressed by Eq. (9) because only time is involved. In the analysis step of data assimilation, the state does not pertain to time, and we assume that the scale has a quantifiable effect on the errors in this step; thus, both the states and observations can be defined by Eq. (10).

Expression of scale transformation in a stochastic data assimilation framework

First, we provide the following lemma.

Lemma 1: for $s_0 > 0$, let $\widetilde{W}(0) = W(s_0) - W(s_0), \ldots, \widetilde{W}(s) = W(s_0 + s) - W(s_0)$; then, $\{\widetilde{W}(s)\}_{s \ge 0}$ is a Brownian motion.

Remark on Lemma 1: $\widetilde{W}$ is a Brownian motion because $\widetilde{W}(0) = 0$ and the increments $\widetilde{W}(s_{i+1}) - \widetilde{W}(s_i)$ are equal to $W(s_0 + s_{i+1}) - W(s_0 + s_i)$. Therefore, $E[\widetilde{W}(s_{i+1}) - \widetilde{W}(s_i)] = 0$ and $\mathrm{Var}[\widetilde{W}(s_{i+1}) - \widetilde{W}(s_i)] = s_{i+1} - s_i$.

Note that in the definition of Brownian motion, the parameter starts at zero. However, the scale is realistically greater than zero, which means that it cannot be directly used as the parameter of a Brownian motion. Lemma 1 resolves this because it implies that $\{W(s)\}_{s \ge s_0}$ is an equivalent expression of $\{\widetilde{W}(s)\}_{s \ge 0}$. Therefore, beginning with the standard scale, the Brownian motion and stochastic calculus with respect to scale can be further developed.

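In the following content, we use Brownian motion with a parameter that starts at $s_0$ to define the scale-dependent geophysical variables; therefore, the classic expressions above are changed. According to Lemma 1, Eq. (3) is given by

$$[I, I](s) = \int_{s_0}^{s} \sigma^2(u) \, du. \tag{11}$$

Additionally, the integral form of Eq. (10) is

$$V(s) = V_0 + \int_{s_0}^{s} \varphi(u) \, du + \int_{s_0}^{s} \sigma(u) \, dW(u), \tag{12}$$

where $V_0 = V(s_0)$, and the drift of Eq. (12) is $V_0 + \int_{s_0}^{s} \varphi(u) \, du$. Similarly, Eq. (4) becomes

$$f(s, I(s)) = f(s_0, I(s_0)) + \int_{s_0}^{s} f_u(u, I(u)) \, du + \int_{s_0}^{s} f_I(u, I(u)) \sigma(u) \, dW(u) + \int_{s_0}^{s} f_I(u, I(u)) \varphi(u) \, du + \frac{1}{2} \int_{s_0}^{s} f_{II}(u, I(u)) \sigma^2(u) \, du. \tag{13}$$

Now, we make the following assumptions.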

Assumption 1: the scale transformations between the state and observation spaces of data assimilation obey the one-dimensional rule as defined in Sect. 3.1.

Assumption 2: in the forecasting step, the model unit equals the scale of the state space, and both of them are constant.

Assumption 3: in the analysis step, the state, observation and observation operator are scale dependent. Only one observation is added into the data assimilation system at a time.

In assumption 1, the one-dimensional rule ensures that scale changes in a sense of geometrical similarity (for example, from a larger square observation footprint to a smaller square observation footprint, or from C2 to C3 as presented in Fig. 1). Therefore, based on assumption 1, scale only varies in one-dimensional space, meaning that the corresponding scale transformation is an integral over one-dimensional space.

Assumption 2 indicates that the model unit and state scale are supposed to be the same and both invariant in space and time. Thus, there is no scale transformation in the forecasting step; thus, Eq. (9) can adequately describe this step.

Based on assumption 3, the analysis step is related to the scale. The scale transformation is only involved in the process of mapping the state vector from state space to observation space. According to Eq. (10), the state and observation in the analysis step are

$$dX = \varphi_X(s) \, ds + \sigma_X(s) \, dW(s) \quad \text{and} \quad dY = \varphi_Y(s) \, ds + \sigma_Y(s) \, dW(s), \tag{14}$$

where $\varphi_X(s)$, $\sigma_X(s)$, $\varphi_Y(s)$ and $\sigma_Y(s)$ represent the scale-dependent drift rates and volatility rates of state $X$ and observation $Y$, respectively. $\varphi(s)$ also implies the heterogeneities and physical processes from the standard scale to a specific scale, which may be hard to formulate. $\sigma(s)$ can be regarded as the stochastic perturbation with respect to scale.

Based on the above discussion, the integral form of the state is

$$X(s_X) = X_0 + \int_{s_0}^{s_X} \varphi_X(s) \, ds + \int_{s_0}^{s_X} \sigma_X(s) \, dW(s). \tag{15}$$

For the observation, we have

$$Y(s_Y) = Y_0 + \int_{s_0}^{s_Y} \varphi_Y(s) \, ds + \int_{s_0}^{s_Y} \sigma_Y(s) \, dW(s). \tag{16}$$

In Eqs. (15) and (16), the time $t$ is omitted, and $s_X$, $s_Y$, $X_0$ and $Y_0$ represent the scale of the state space, scale of the observation space, state in $s_0$ and observation in $s_0$, respectively. These formulas show that the value of the state varies with the changes of scale.

The Bayesian equation of data assimilation (Eq. 7) produces the posterior PDF $p(X|Y)$ that is associated with the likelihood function $p(Y|X)$ and the distributions of the state and observation. In addition, under the condition that the variances exist, assumption 1 states that the scales vary in one-dimensional space, which results in

$$X \sim N\left(X_0 + \int_{s_0}^{s_X} \varphi_X(s) \, ds,\ \int_{s_0}^{s_X} \sigma_X^2(s) \, ds\right) \tag{17}$$

and

$$Y \sim N\left(Y_0 + \int_{s_0}^{s_Y} \varphi_Y(s) \, ds,\ \int_{s_0}^{s_Y} \sigma_Y^2(s) \, ds\right). \tag{18}$$

Equations (17) and (18) are the prior PDFs of state and observation with respect to scale in state space and observation space, respectively. These two prior PDFs are introduced into the Bayesian theorem that is reformulated by scale.
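The mean and variance of the scale-dependent prior can be compared with a Monte Carlo simulation of the underlying Ito process. The sketch below is an illustration under stated assumptions: the drift rate `phi_x`, the volatility rate `sigma_x`, the scales and $X_0$ are hypothetical choices, and the Euler-Maruyama scheme and midpoint quadrature are numerical conveniences rather than part of the derivation.

```python
import math
import random


def phi_x(s):
    """Hypothetical scale-dependent drift rate of the state."""
    return 0.5 * s


def sigma_x(s):
    """Hypothetical scale-dependent volatility rate of the state."""
    return 1.0 / math.sqrt(s)


def prior_moments(x0, phi, sigma, s0, s_end, n_steps=2000):
    """Mean X0 + int phi ds and variance int sigma^2 ds by midpoint quadrature."""
    ds = (s_end - s0) / n_steps
    mids = [s0 + (k + 0.5) * ds for k in range(n_steps)]
    mean = x0 + sum(phi(u) for u in mids) * ds
    var = sum(sigma(u) ** 2 for u in mids) * ds
    return mean, var


def sample_state(x0, phi, sigma, s0, s_end, n_steps, rng):
    """One Euler-Maruyama sample of X(s_end), integrating dX from s0 to s_end."""
    ds = (s_end - s0) / n_steps
    x = x0
    for k in range(n_steps):
        u = s0 + k * ds
        x += phi(u) * ds + sigma(u) * rng.gauss(0.0, math.sqrt(ds))
    return x


def monte_carlo_moments(x0, s0, s_end, n_samples=2000, n_steps=100, seed=0):
    """Sample mean and variance of X(s_end) over independent paths."""
    rng = random.Random(seed)
    xs = [sample_state(x0, phi_x, sigma_x, s0, s_end, n_steps, rng)
          for _ in range(n_samples)]
    mean = sum(xs) / n_samples
    var = sum((x - mean) ** 2 for x in xs) / (n_samples - 1)
    return mean, var


# Hypothetical values: X0 = 2 at the standard scale s0 = 1, state scale s_X = 4.
print("quadrature :", prior_moments(2.0, phi_x, sigma_x, 1.0, 4.0))
print("Monte Carlo:", monte_carlo_moments(2.0, 1.0, 4.0))
```

With these choices the quadrature gives a mean of $X_0 + 3.75$ and a variance of $\ln 4$, and the sample moments agree with them up to sampling and discretisation error.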

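Then, we calculate the posterior PDF. The scale-dependent observation operator is $H(s, I)$, which suggests that the observation operator and its parameters are both susceptible to the scale. Assume that $H(s, I)$ is defined and has continuous partial derivatives $H_s(s, I)$, $H_I(s, I)$ and $H_{II}(s, I)$. In line with Ito's Lemma, we get an estimation of the observation in the observation space (the arguments $(u, X(u))$ and $(u)$ are omitted, i.e. $H_s = H_s(u, X(u))$, $\sigma_X = \sigma_X(u)$, etc.):

$$\begin{aligned} H(s_Y, X(s_Y)) &= H(s_0, X_0) + \int_{s_0}^{s_Y} H_s \, du + \int_{s_0}^{s_Y} H_I \sigma_X \, dW(u) + \int_{s_0}^{s_Y} H_I \varphi_X \, du + \frac{1}{2} \int_{s_0}^{s_Y} H_{II} \sigma_X^2 \, du \\ &= H(s_0, X_0) + \int_{s_0}^{s_Y} \left(H_s + H_I \varphi_X + \frac{1}{2} H_{II} \sigma_X^2\right) du + \int_{s_0}^{s_Y} H_I \sigma_X \, dW(u). \end{aligned} \tag{19}$$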

Assumption 1 suggests that the observation and state spaces have the same probability measure; thus, the Brownian motions in these two spaces are equivalent. Equation (19) can also be rewritten by replacing $s_0$ with $s_X$, namely

$$H(s_Y, X(s_Y)) = H(s_X, X(s_X)) + \int_{s_X}^{s_Y} \left(H_s + H_I \varphi_X + \frac{1}{2} H_{II} \sigma_X^2\right) du + \int_{s_X}^{s_Y} H_I \sigma_X \, dW(u),$$

and then we obtain

$$Y(s_Y) - H(s_Y, X(s_Y)) = Y(s_Y) - \left[H(s_X, X(s_X)) + \int_{s_X}^{s_Y} \left(H_s + H_I \varphi_X + \frac{1}{2} H_{II} \sigma_X^2\right) du\right] + \int_{s_X}^{s_Y} \left(-H_I \sigma_X\right) dW(u). \tag{20}$$

Equation (20) can be regarded as an Ito process, and its drift is

$$Y(s_Y) - \left[H(s_X, X(s_X)) + \int_{s_X}^{s_Y} \left(H_s + H_I \varphi_X + \frac{1}{2} H_{II} \sigma_X^2\right) du\right]. \tag{21}$$

The last integral term in Eq. (21) is the difference in the first-order differential observation operator between the state scale $s_X$ and the observation scale $s_Y$. This term illustrates that the mapping process should consider not only the observation operator but also the first-order differential term when the state is mapped to the observation space. The former is typically determined from the literature, whereas the latter was derived in this study for the first time. This result prompted us to further consider the first-order differential of the observation operator when calculating the representativeness error.

The quadratic variation of Eq. (20) is

$$\int_{s_X}^{s_Y} H_I^2 \sigma_X^2 \, du. \tag{22}$$

This equation suggests that the uncertainty in the observation error includes the change in the observation operator from scale $s_X$ to $s_Y$. Therefore, Eqs. (21) and (22) can be combined to produce

$$p(Y|X) = N\left(Y(s_Y) - \left[H(s_X, X(s_X)) + \int_{s_X}^{s_Y} \left(H_s + H_I \varphi_X + \frac{1}{2} H_{II} \sigma_X^2\right) du\right],\ \int_{s_X}^{s_Y} H_I^2 \sigma_X^2 \, du\right). \tag{23}$$

Based on Eqs. (17), (18) and (23), $p(Y|X)$, $p(X)$ and $p(Y)$ are stochastic functions that depend on the scale; thus, the posterior PDF of the state is scale-dependent as well.

In particular, suppose that $Y$ is a direct observation, which means that the observation is of the same physical quantity and scale as the state, and, for simplicity, that $X$ is only influenced by scale-dependent Gaussian noise, viz. $H(s, X(s)) = X(s) = X_0 + \int_{s_0}^{s} dW(s)$. Then the result becomes

$$Y(s_Y) - X(s_Y) = Y(s_Y) - X(s_X) - \int_{s_X}^{s_Y} dW(u) \tag{24}$$

and

$$p(Y|X) = N\left(Y(s_Y) - X(s_X),\ s_Y - s_X\right). \tag{25}$$

In Eq. (24), the integral $\int_{s_X}^{s_Y} dW(u)$ can be regarded as the noise based on the increment of Brownian motion with respect to scale, and its expectation equals zero.

The significance of Eqs. (20)–(25) is that the effect of scale on the posterior PDF can be determined quantitatively. In addition to the model error and instrument error (neither was introduced explicitly in this study because they have little influence on the error caused by scale transformation), a new type of error in data assimilation was identified in the analysis step. The expectation of the posterior PDF may vary with the scale of the state space if $Y$ is an indirect observation, and the variance of the drift depends on the difference between $s_Y$ and $s_X$ (based on Eq. 22). In addition, if $Y$ is a direct observation and $X$ is only influenced by scale-dependent Gaussian noise (Eqs. 24 and 25), the expectation of the posterior PDF is the difference between $Y$ and $X$, and the variance is equal to the variance of the increment of Brownian motion with respect to the scale, $s_Y - s_X$. Additionally, if the results are not derived from assumption 1, i.e. the scale varies randomly, the posterior PDF is more complex because the Jacobian matrix in the Lebesgue integration of scale transformation is arbitrary.

Example: the stochastic radiative transfer equation (SRTE)

To explicitly show how the stochastic scale transformations impact assimilation, we introduce an illustrative example based on the scales presented in Fig. 1. Assume that in the analysis step, the state has the standard scale $s_0$, whose observation footprint is the unit square $A_0$. If the scale of the observation space is $s_{C_1}$ and its observation footprint is the disc $C_1$, then the Jacobian matrix of the transformation between the scales of the state space and observation space is not diagonal according to the statements in Sect. 3.1, so the two scales do not obey the one-dimensional rule and violate assumption 1. However, if the scales of the state space and observation space are $s_{C_3}$ and $s_{C_2}$, respectively, assumption 1 is met, and it can be determined that $s_X = s_{C_3} = \frac{\pi}{4} s_0$ and $s_Y = s_{C_2} = \frac{9\pi}{4} s_0$.

Now that the scales of the state space and observation space obey the one-dimensional rule, we further presume that the measure space $\Omega$ in Fig. 1 is free of scale-dependent spatial heterogeneities and dynamic process variations. Consequently, the drift rate $\varphi(s) = 0$. If the value of the state at the standard scale is denoted as $X_0$ and we assume that $\sigma(s) = 1$, then the prior PDF of the state is $X \sim N\left(X_0, \frac{\pi}{4}s_0 - s_0\right)$ according to Eq. (17), where $\frac{\pi}{4}s_0 - s_0$ is not a real number and is only used to indicate the variation when the scale changes.

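If $H(s, X(s)) = X(s)$, the observation has the same physical quantity as the state, and according to Eq. (25), the likelihood function is

$$p(Y|X) = N\left(Y(s_Y) - X(s_X),\ s_Y - s_X\right) = N\left(Y(s_Y) - X(s_X),\ s_{C_2} - s_{C_3}\right) = N\left(Y(s_Y) - X(s_X),\ \frac{9\pi}{4}s_0 - \frac{\pi}{4}s_0\right).$$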
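As a rough numerical illustration of the variance term in this likelihood, the sketch below samples the discrepancy $Y(s_Y) - X(s_X)$ as a Brownian increment between the two scales. It assumes zero drift, unit volatility, $s_0 = 1$ and a single Brownian motion driving both state and observation; the function name and sample size are hypothetical choices.

```python
import math
import random

def sample_discrepancies(s_x, s_y, n_samples, seed=0):
    """Draw Y(s_Y) - X(s_X) = W(s_Y) - W(s_X) for a direct observation.

    With zero drift and unit volatility the discrepancy is a Brownian
    increment, i.e. a zero-mean Gaussian with variance s_Y - s_X.
    """
    rng = random.Random(seed)
    std = math.sqrt(s_y - s_x)
    return [rng.gauss(0.0, std) for _ in range(n_samples)]

s0 = 1.0
s_X = math.pi / 4 * s0
s_Y = 9 * math.pi / 4 * s0

draws = sample_discrepancies(s_X, s_Y, n_samples=20000)
mean = sum(draws) / len(draws)
variance = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
# The sample variance should be close to s_Y - s_X = 2 * pi * s0.
```

The sample variance approaches $2\pi s_0$ as the number of draws grows, which is the scale difference that appears as the second argument of the likelihood.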

To formulate the likelihood function when the observed quantity differs from the state, the SRTE is employed below. The SRTE is a stochastic integro-differential equation that describes radiative transfer through stochastically mixed immiscible media. Analytical and numerical methods have been developed for finding the stochastic moments of its solution, such as the ensemble average and the variance of the radiation intensity (Pomraning, 1998; Shabanov et al., 2000; Kassianov and Veron, 2011).

Consider the general expression of the SRTE (leaving out the scattering and emission),

$$-\mu\frac{dI(\tau)}{d\tau} = -I(\tau), \tag{26}$$

where $I(\tau)$, $\mu$ and $\tau$ are the radiation intensity, coefficient of radiation direction and optical depth, respectively.

To account for random optical properties of the transfer medium, such as absorption and scattering, the optical depth $\tau$ is assumed to be stochastic. That is, the optical depth is a scale-dependent Ito process and can be expressed as

$$d\tau(s) = \varphi_\tau(s)\,ds + \sigma_\tau(s)\,dW(s).$$

Consequently, the radiation intensity also depends on scale.

The analytical solution of Eq. (26) is $I(\tau) = I_0 e^{\tau/\mu}$, where $I_0 = I(\tau(s_0))$.
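To give a feel for the stochastic moments mentioned above, the following sketch estimates the ensemble mean of the radiation intensity by Monte Carlo and compares it with the lognormal mean. It is a minimal illustration under assumptions that are not in the text: the drift rate and volatility of the optical depth are constants, $\tau(s_0) = 0$, the scale `s` is measured from $s_0$, and all parameter values are hypothetical.

```python
import math
import random

def sample_intensity(s, i0, mu, phi, sigma, rng):
    """One draw of I = I0 * exp(tau(s) / mu) for constant phi and sigma.

    With tau(s0) = 0, tau(s) is Gaussian with mean phi * s and
    variance sigma**2 * s, where s is the scale measured from s0.
    """
    tau = phi * s + sigma * math.sqrt(s) * rng.gauss(0.0, 1.0)
    return i0 * math.exp(tau / mu)

# Hypothetical parameter values, for illustration only.
I0, MU, PHI, SIGMA, S = 1.0, 0.8, 0.1, 0.4, 1.0

rng = random.Random(42)
samples = [sample_intensity(S, I0, MU, PHI, SIGMA, rng) for _ in range(50000)]
mc_mean = sum(samples) / len(samples)

# Mean of a lognormal variable: I0 * exp(phi*s/mu + sigma^2*s/(2*mu^2)).
exact_mean = I0 * math.exp(PHI * S / MU + SIGMA ** 2 * S / (2 * MU ** 2))
```

The extra term $\sigma_\tau^2 s / (2\mu^2)$ in the exponent shows that randomness in the optical depth raises the mean intensity even when the mean optical depth is unchanged.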

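The SRTE can be considered a concrete instance of a stochastic observation operator by defining $H(s, x(s)) = I(x) = I_0 e^{x/\mu}$. Its first- and second-order derivatives are therefore $H_s(s, x(s)) = 0$, $H_x(s, x(s)) = \frac{1}{\mu}I_0 e^{x/\mu}$ and $H_{xx}(s, x(s)) = \frac{1}{\mu^2}I_0 e^{x/\mu}$. Based on Ito's Lemma,

$$
\begin{aligned}
dI(\tau(s)) &= dH(s, \tau(s)) \\
&= H_s(s, \tau(s))\,ds + H_x(s, \tau(s))\,d\tau(s) + \frac{1}{2}H_{xx}(s, \tau(s))\,d\tau(s)\,d\tau(s) \\
&= \frac{1}{\mu}I_0 e^{\tau(s)/\mu}\,d\tau(s) + \frac{1}{2\mu^2}I_0 e^{\tau(s)/\mu}\,d\tau(s)\,d\tau(s) \\
&= \frac{1}{\mu}I(\tau(s))\,d\tau(s) + \frac{1}{2\mu^2}I(\tau(s))\,d\tau(s)\,d\tau(s) \\
&= \frac{1}{\mu}I(\tau(s))\,\sigma_\tau(s)\,dW(s) + \frac{1}{\mu}I(\tau(s))\,\varphi_\tau(s)\,ds + \frac{1}{2\mu^2}I(\tau(s))\,\sigma_\tau^2(s)\,ds \\
&= \left[\frac{\sigma_\tau^2(s)}{2\mu^2} + \frac{\varphi_\tau(s)}{\mu}\right]I(\tau(s))\,ds + \frac{\sigma_\tau(s)}{\mu}I(\tau(s))\,dW(s).
\end{aligned}
\tag{28}
$$

The radiation intensity is thus a scale-dependent Ito process. The difference between Eq. (28) and the general Ito process is that the primitive function $I(\tau(s))$ appears in the integral terms. Therefore, the uncertainty of the radiation intensity is more complex because it is related to both the change of scale and the primitive function.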
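The Ito's Lemma result can be checked numerically on a single sample path. The sketch below integrates the stochastic differential equation for the intensity with the Euler-Maruyama scheme and compares the end value with $I_0 e^{\tau/\mu}$ computed from the same Brownian increments. Constant $\varphi_\tau$ and $\sigma_\tau$, $\tau(s_0) = 0$, and the parameter values are assumptions made only for this illustration.

```python
import math
import random

def simulate(i0, mu, phi, sigma, s_end, n_steps, seed=1):
    """Integrate tau and I along one Brownian path with Euler-Maruyama.

    Returns the intensity obtained from the SDE for I and the intensity
    obtained by mapping the simulated optical depth through I0*exp(tau/mu).
    """
    rng = random.Random(seed)
    ds = s_end / n_steps
    drift = sigma ** 2 / (2 * mu ** 2) + phi / mu  # coefficient of I ds
    vol = sigma / mu                               # coefficient of I dW
    tau, intensity = 0.0, i0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(ds))
        intensity += drift * intensity * ds + vol * intensity * dw
        tau += phi * ds + sigma * dw
    return intensity, i0 * math.exp(tau / mu)

sde_value, closed_form = simulate(
    i0=1.0, mu=0.8, phi=0.1, sigma=0.4, s_end=1.0, n_steps=10000
)
relative_gap = abs(sde_value - closed_form) / closed_form
```

The two values agree up to the discretisation error of the scheme, which shrinks as the step in scale is refined. Dropping the $\sigma_\tau^2 / (2\mu^2)$ term from the drift would leave a gap that does not vanish with refinement, which is one way to see why the second-order term of Ito's Lemma is needed.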

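Integrating both sides of Eq. (28) yields the general solution of the radiation intensity,

$$I(\tau(s)) = C\exp\left[\int\left(\frac{\sigma_\tau^2(s)}{2\mu^2} + \frac{\varphi_\tau(s)}{\mu}\right)ds + \int\frac{\sigma_\tau(s)}{\mu}\,dW(s)\right], \tag{29}$$

where the constant $C \in \mathbb{R}$. Equation (29) further indicates that $I(\tau(s))$ is a scale-dependent Ito process.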

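Considering that the optical depth $\tau$ is the state, the radiation intensity $I$ is the observation and $I(\tau(s))$ is the observation operator, the results in Sect. 3.3 can be applied directly. For example, Eqs. (20) and (23) become

$$Y(s_Y) - H(s_Y, X(s_Y)) = I(\tau(s_Y)) - I(\tau(s_X)) - \int_{s_X}^{s_Y}\frac{1}{\mu^2}\left[\frac{\sigma_\tau^2}{2\mu} + \varphi_\tau + \frac{\sigma_\tau^2 I(\tau)}{2\mu^2}\right]I^2(\tau)\,du - \int_{s_X}^{s_Y}\frac{\sigma_\tau}{\mu^2}I^2(\tau)\,dW(u), \tag{30}$$

$$p(Y|X) = N\left(I(\tau(s_Y)) - I(\tau(s_X)) - \int_{s_X}^{s_Y}\frac{1}{\mu^2}I^2(\tau)\left[\frac{\sigma_\tau^2}{2\mu} + \varphi_\tau + \frac{\sigma_\tau^2 I(\tau)}{2\mu^2}\right]du,\ \int_{s_X}^{s_Y}\frac{\sigma_\tau^2}{\mu^4}I^4(\tau)\,du\right). \tag{31}$$

Then, the posterior PDF of the data assimilation can be determined by Eqs. (27), (29) and (31).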
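The two integrals that define the mean and variance of this likelihood can be approximated by Riemann sums once the intensity is known on a grid of scales between $s_X$ and $s_Y$. The sketch below is a hedged illustration: it reads the drift integrand as $\frac{1}{\mu^2}\left[\frac{\sigma_\tau^2}{2\mu} + \varphi_\tau + \frac{\sigma_\tau^2 I}{2\mu^2}\right]I^2$ and the variance integrand as $\frac{\sigma_\tau^2}{\mu^4}I^4$, treats $\varphi_\tau$ and $\sigma_\tau$ as constants, and uses hypothetical function names and a deliberately trivial constant-intensity path.

```python
def likelihood_moments(intensity_path, ds, mu, phi, sigma):
    """Left Riemann sums of the drift and variance integrals.

    intensity_path holds I(tau(u)) on a uniform grid from s_X to s_Y;
    the last grid point is not used by the left rule.
    """
    drift_integral = 0.0
    variance_integral = 0.0
    for i_val in intensity_path[:-1]:
        bracket = (sigma ** 2 / (2 * mu) + phi
                   + sigma ** 2 * i_val / (2 * mu ** 2))
        drift_integral += bracket * i_val ** 2 / mu ** 2 * ds
        variance_integral += sigma ** 2 * i_val ** 4 / mu ** 4 * ds
    return drift_integral, variance_integral

def likelihood_mean(i_at_sy, i_at_sx, drift_integral):
    """Mean of the Gaussian likelihood: I(s_Y) - I(s_X) - drift integral."""
    return i_at_sy - i_at_sx - drift_integral

# Illustrative path: constant intensity 1.0 over a unit scale interval.
path = [1.0] * 101
drift, variance = likelihood_moments(path, ds=0.01, mu=1.0, phi=0.1, sigma=0.2)
mean = likelihood_mean(path[-1], path[0], drift)
```

For this constant path the sums reduce to the integrands themselves, which makes the result easy to check by hand; a realistic application would pass a simulated or modelled intensity path instead.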

Discussion and conclusions

Discussion

Our study offered a stochastic data assimilation framework to formulate the errors that are caused by scale transformations. The necessity of the methodology, the difference from previous works by other investigators, and the advantages and limitations of this study are discussed as follows.

The methodology focuses on a stochastic framework for the following reasons. First, the stochastic data assimilation framework is essentially consistent with the concepts of scale and scale transformation; both are associated with corresponding measure spaces $(\Omega, \mathcal{F}, \mu)$. Therefore, it is natural to regard the state space and observation space as two different measure spaces, and each element of the state (or observation) vector can be seen as a geophysical variable that maps the state (or observation) measure space onto $\mathbb{R}$. Correspondingly, stochastic calculus, which deals with the integrals of random processes with respect to random processes, was adopted. Second, stochastic calculus can also be used to formulate the errors caused by scale transformations. The study builds on and improves the understanding of representativeness error in terms of scale. The results not only confirmed the conventional point that the uncertainties of these errors mainly depend on the differences between scales but also indicated that the first-order differential of the non-linear observation operator should be incorporated in the representativeness error. Third, the error caused by scale transformation was presented in a general form. The drift and quadratic variation of the error were formulated by Eqs. (21) and (22), respectively, and both defined the probability distribution space of $p(Y|X)$. Last, stochastic calculus can be extended to handle a general scale transformation and formulate the corresponding representativeness error, which was unattainable in previous work. For example, if the scale changes randomly, say, from one irregular footprint to another irregular footprint, the stochastic equation can offer a multiple integral to represent this type of scale transformation, such as

$$V(x, y) = V_0 + \int_{Y_0}^{Y}\int_{X_0}^{X}\varphi(x, y)\,dx\,dy + \int_{Y_0}^{Y}\int_{X_0}^{X}\sigma(x, y)\,dW_1(x)\,dW_2(y),$$

where $W_1(x)$ and $W_2(y)$ are two independent Brownian motions.
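The multiple integral above can be approximated on a rectangular grid by summing the drift over cells and the noise over products of independent Brownian increments. The following sketch is a simple discretisation under stated assumptions, not a treatment of general irregular footprints: the domain is a rectangle, the grid is uniform, and the function name, arguments and test coefficients are hypothetical.

```python
import math
import random

def transformed_variable(v0, phi, sigma, x0, x1, y0, y1, nx, ny, seed=0):
    """Discretise V = V0 + double drift integral + double noise integral.

    phi and sigma are callables of (x, y). Returns the approximate value
    of V together with the total increments of W1 and W2.
    """
    rng = random.Random(seed)
    dx, dy = (x1 - x0) / nx, (y1 - y0) / ny
    dw1 = [rng.gauss(0.0, math.sqrt(dx)) for _ in range(nx)]  # increments of W1(x)
    dw2 = [rng.gauss(0.0, math.sqrt(dy)) for _ in range(ny)]  # increments of W2(y)
    drift, noise = 0.0, 0.0
    for j in range(ny):
        y = y0 + j * dy
        for i in range(nx):
            x = x0 + i * dx
            drift += phi(x, y) * dx * dy
            noise += sigma(x, y) * dw1[i] * dw2[j]
    return v0 + drift + noise, sum(dw1), sum(dw2)

# Constant coefficients make the result checkable in closed form.
V0 = 1.0
value, w1, w2 = transformed_variable(
    V0, lambda x, y: 0.5, lambda x, y: 2.0, 0.0, 2.0, 0.0, 3.0, nx=50, ny=50
)
area = (2.0 - 0.0) * (3.0 - 0.0)
expected = V0 + 0.5 * area + 2.0 * w1 * w2
```

With constant coefficients the drift term reduces to the coefficient times the area of the rectangle and the noise term to the coefficient times the product of the two total Brownian increments, so `value` and `expected` coincide up to rounding.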

The main innovations of this work are as follows. We developed a more rigorous formulation of the scale and scale transformation based on the Lebesgue measure, which places the related concepts in a rigorous mathematical framework and provides a new understanding of the errors caused by scale transformation. In addition, because the state and observation are formed as Ito processes, a stochastic data assimilation framework was proposed that considers non-linear operators, the heterogeneity of a geophysical variable and a general Gaussian representativeness error. The scale transformation is also non-linear if the one-dimensional rule is not applied. Additionally, the Ito process-formed state and observation offer the drift rate (i.e. $\varphi(s)$ in Eq. 10) to formulate the heterogeneity associated with scale transformation. This also permits the representativeness error to be general Gaussian in this framework. If all the integrands in Eqs. (13) and (14) are non-linear functions instead of constants, then these two equations can be integrated over the field of Brownian motion, and the state and observation are general Gaussian processes of scale. Based on these functions, the representativeness error is a general Gaussian process.

As a theoretical exploration of scale transformation and stochastic data assimilation, this work leaves much room for improvement. First, we simplified the scale transformation with the one-dimensional rule and let the variables in data assimilation evolve regularly according to assumptions 1–3; thus, only the ideal result was investigated. An in-depth and comprehensive exploration should therefore be conducted in the future to describe other situations in the real world. However, using either an arbitrary scale transformation or a geophysical variable whose drift rates are not ignored yields lengthy results. The second improvement therefore concerns how to make the formulation more concise. Lastly, because all the results in our framework were given in terms of probability, it is necessary to implement real-world applications of these theoretical results, such as introducing concrete dynamic models to formulate the Ito process-formed geophysical variable of scale.

Conclusions

In this study, we mainly addressed two basic problems associated with scale transformation in Earth observation and simulation. First, we produced a mathematical formalism of scale and scale transformation by employing measure theory. Second, we demonstrated how scale transformation and its associated errors could be presented in a stochastic data assimilation framework.

We revealed that the scale is the Lebesgue measure with respect to the observation footprint or model unit. The scale is related to the shape and size of a footprint, and scale transformation depends on the spatial change between different footprints. We then defined the geophysical variable, which further considers the heterogeneities and physical processes. A geophysical variable consequently expresses the spatial average at a specific scale.

We formulated the expression of scale transformation and investigated the error structure that is caused by scale transformation in data assimilation using basic theorems of stochastic calculus. The formulations show that the first-order differential of the non-linear observation operator should be considered in the representativeness error, and that the uncertainty of the representativeness error is directly associated with the difference between scales. A concrete physical model (the SRTE) was introduced to demonstrate the results when the observation operator is non-linear.

This work conducted a theoretical exploration of formulating the errors caused by scale transformation in a stochastic data assimilation framework. We hope that the stochastic methodology can benefit the study of these errors.

Notation

Basic notations.

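| Notation | Name |
| --- | --- |
| $\Omega$ | Non-empty space |
| $\mathcal{F}$ | $\sigma$-algebra |
| $\mu$ | Measure |
| $dV$ | Variable process |
| $W(s)$ | Brownian motion |
| $(\Omega, \mathcal{F}, \mu)$ | Measure space |
| $I_n$ | $n$-dimensional Lebesgue volume |
| $m_n$ | Lebesgue measure or an outer measure on $\mathbb{R}^n$ |
| $\mathcal{L}_n$ | Lebesgue $\sigma$-algebra of $\mathbb{R}^n$ |
| $\int f\,dm_n$ | Lebesgue integral |
| $J$ | Jacobian determinant |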

New notations.

| Notation | Name | Explanation | Index |
| --- | --- | --- | --- |
| $s$ | Scale | The observation footprint or model unit to observe or model a geophysical variable | Sects. 1 and 3.1 |
| $A_0$ | Unit square in $\mathbb{R}^2$ | | Sect. 3.1 |
| $s_0$ | Standard scale | A Lebesgue integral where $A_0$ is the unit area | Sect. 3.1 |
| | One-dimensional rule | Two scales are geometrically similar | Eq. (8) |
| $V$ | Geophysical variable | Estimation of a variable at a specific scale | Sect. 3.2 |
| $dX$ | State process | Ito process-formed state | Eq. (13) |
| $dY$ | Observation process | Ito process-formed observation | Eq. (14) |
| $X_0$ | State at $s_0$ | | Eq. (15) |
| $Y_0$ | Observation at $s_0$ | | Eq. (16) |
| $s_X$ | Scale of state space | | Eq. (15) |
| $s_Y$ | Scale of observation space | | Eq. (15) |

The authors declare that they have no conflict of interest.

Acknowledgements

We thank the executive editor of NPG, Olivier Talagrand, for his kind help and valuable comments on our manuscript. We also thank Peter Jan van Leeuwen and another anonymous reviewer for their valuable comments and suggestions. This work was supported by the NSFC projects (grant numbers 91425303 and 91625103) and the CAS Interdisciplinary Innovation Team of the Chinese Academy of Sciences.

Edited by: Olivier Talagrand

Reviewed by: Peter Jan van Leeuwen and one anonymous referee