The authors have done a fantastic job in clarifying all the mathematics used, and I find the manuscript very easy to read now that the arguments are presented in a logical order (for me). Furthermore, the results are very interesting. However, I suggest modifications along the following lines.
1. I still expect readers to get lost. Let me explain what I think their reasoning could be:
They understand that a geophysical variable is a random variable as a function of scale. The reason is twofold: at a certain scale the variable can have different values in different parts of the domain, and our prior knowledge of the value of the random variable at a certain scale is limited. This then leads to a pdf p(V,s) at each time instant. This joint pdf gives the likely range of V given likely ranges of s. Its marginal p_s(V) gives the pdf of V at each scale value s. This is as far as they understand the reasoning until page 10, line 17.
However, at line 18 on that page the authors introduce an SDE for V(s). This description has a different interpretation from the one above. Now the uncertainty, that is, the width of the pdf at a certain scale, depends on the scale at which the SDE is started. This means that the pdf p(V,s) also depends on a starting scale s_0, and a different s_0 leads to a different pdf. But, as far as they can see, s_0 is arbitrary, so this cannot work.
Even if one defines s_0 as the model unit scale or the observational footprint, there are still problems. Why would the uncertainty grow the further we are away from that scale? My guess would be that the pdf p_s(V) as defined above will be narrower at large scales, but this is not what the SDE will give us. Furthermore, the manuscript assumes s_0 to be the same for observation and model, and it is unclear why that is a good choice.
So their question will be: why is modelling the random character of the geophysical variable as function of scale via an SDE the right thing to do? Why can’t one work directly with p(V,s) as defined above?
If one does use p(V,s) as defined above for both state and observation they won't see the extra terms in 20-27 arising, apart from those terms already present in the representation error literature.
It is important that the authors make sure that this interpretation is avoided.
2. A completely different interpretation, and the one that I think the authors have in mind, is that one does know the variable at one scale and needs to transform it to a value for the variable at another scale. This is indeed what is needed to compare model results with observations. The interpretation of equation (10) is then that it describes how to transform the variable value from the model unit, or the observational footprint, to another scale at which the comparison will be made. There will in general be a deterministic part to this (e.g. due to statistical knowledge about the averaging process), and a stochastic part due to the heterogeneity of the variable. This should be made clearer.
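To make this reading concrete, here is a sketch of the form I have in mind. It assumes that equation (10) is an Itô SDE in scale with drift phi and diffusion sigma. The symbol s_* for the scale at which the variable is known (model unit or observational footprint) and the argument lists of phi and sigma are my own notation and may differ from the manuscript.

```latex
% Assumed generic form; s_* and the arguments of phi and sigma are illustrative.
V(s) = V(s_\ast)
     + \underbrace{\int_{s_\ast}^{s} \phi\bigl(u, V(u)\bigr)\,\mathrm{d}u}_{\text{deterministic part}}
     + \underbrace{\int_{s_\ast}^{s} \sigma\bigl(u, V(u)\bigr)\,\mathrm{d}W_u}_{\text{stochastic part}}
```

In this reading V(s_*) is a known value rather than a random starting point, which is what separates interpretation 2 from the one in point 1.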
3. The assumption is that the scale at which the comparison is made is larger than both the scale at which we know the model and the scale at which we know the observation. This makes the problem very different from the main issue in dealing with representation errors in data assimilation, in which we know the model only at a much coarser scale than the observations, and the observations only at the fine scale. In that case a direct comparison is not possible and extra information is needed to solve the problem.
This assumption is crucial to mention, and, although a limitation, it still makes this an interesting problem to study.
4. The authors need to add a motivation why an SDE is a good model for changing scales. What does the drift term represent? Does it represent that we can use more model units or more observations from different positions to describe the scale change via averaging? And if so, the SDE leads to an increasing stochastic variance for an increasing scale separation. Is that realistic if one can do spatial averaging with increasing scale?
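The tension in point 4 can be sketched numerically. The example below rests on assumptions of my own and is not taken from the manuscript: a driftless SDE dV = sigma dW in scale (phi = 0, constant sigma), and, for the averaging picture, a mean over n = s/s_0 independent units of equal variance. All function names are hypothetical.

```python
import numpy as np

def sde_variance(s0, s, sigma=1.0):
    """Variance of V(s) - V(s0) for dV = sigma dW with zero drift (assumed form)."""
    return sigma**2 * (s - s0)

def averaging_variance(s0, s, unit_var=1.0):
    """Variance of the mean of n = s/s0 independent units, each with variance unit_var.

    Independence between units is an assumption made for illustration only.
    """
    n = s / s0
    return unit_var / n

def simulate_sde_variance(s0, s, sigma=1.0, n_paths=20000, n_steps=200, seed=0):
    """Monte Carlo check of sde_variance by summing Brownian increments."""
    rng = np.random.default_rng(seed)
    ds = (s - s0) / n_steps
    increments = sigma * np.sqrt(ds) * rng.standard_normal((n_paths, n_steps))
    return increments.sum(axis=1).var()

if __name__ == "__main__":
    s0 = 1.0
    for s in (2.0, 4.0, 8.0):
        print(s, sde_variance(s0, s), averaging_variance(s0, s))
```

Under these assumptions the SDE variance grows linearly with the scale separation while the averaging variance shrinks as s_0/s, so the two pictures move in opposite directions. A drift term or a scale-dependent sigma could change this, which is why the motivation needs to be spelled out.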
5. The authors implicitly assume that both model unit and observational footprint are at scale s_0. Otherwise the Y_0 in equation (21) is unknown and the integrations in (21) from s_0 cannot be performed, so the equations cannot be used in practice. I think this assumption is not necessary: as long as s_0 for Y and s_0 for X are both smaller than the scale at which the comparison is made, they can be different.
6. Further to point 5, I actually think the authors should reformulate page 13. Assume the model is available at scale s_x and the observation at scale s_y. One cannot refer to scale s_0, as it is smaller than both of them, so X_0 and Y_0 are not known and cannot be used.
Let us first look at the case s_x < s_y. We first have to transform the model values from scale s_x to s_y before we can use H. This is done via the SDE. Equation (19) needs to be rewritten as (assuming sigma_x = sigma_y = 1, and phi_x=phi_y=0 as in the paper):
H(s_y,X(s_y)) = equation (19) with s_0 replaced by s_x, and s_x by s_y.
Equation (20) becomes
Y(s_y) - H(s_y,X(s_x)) = Y(s_y) - H(s_x,X(s_x)) + ½ int_s_x^s_y H_xx du + int_s_x^s_y dW - int_s_x^s_y H_x dW
This leads to a new drift and variance term easily extracted from the above.
In this way all reference to Y_0 and X_0 has disappeared, and all results become intuitive.
Now consider s_x> s_y. In that case one uses the SDE on Y, and finds:
Y(s_y) - H(s_y,X(s_x)) = Y(s_x) + int_s_y^s_x dW - H(s_y,X(s_x))
And again drift and variance can easily be extracted.
(In fact, I would suggest the authors keep the phi’s and sigma’s as it is good to have the full set of equations.)
7. It would be good if the authors would discuss methods to find the sigma’s and the phi’s.
8. Example 3.4 would benefit from specifying the phi’s and sigma’s and pushing the example to the end, perhaps showing pictures of the pdf’s involved.
9. I don’t see how 3.5 is relevant and/or needed. Why all this trouble, why the one-dimensional rule? Different variables will have different scaling rules, but as far as I can see one can follow the scaling rule ideas for each independent observation separately. What is the problem?
10. As far as I can see, one can just define scale as the Lebesgue integral, so why do we need section 2.1?