In the revised version the authors have answered many of my initial questions, but some questions remain and the revisions have raised new ones. As I now understand it, the major point of the paper is that earlier results that seemed to show that 4DVAR didn’t work so well in chaotic systems came about because, in the cases cited, the control space didn’t have enough degrees of freedom. With insufficient degrees of freedom in the control space, the problem becomes that of finding a minimum in the surface depicted in figure 2. From this point of view, the punch line to the manuscript is figure 7.
The conclusion that initial conditions are not sufficient to control a chaotic system in many cases of interest is well known, but it’s useful to see it stated succinctly in one place. The philosophical problem I have with the approach taken in this manuscript is that it seems to imply that a control space of greater dimension is always preferable to one of lesser dimension. This is definitely not the case. For the linear problem, the dimension of the control space should not exceed the number of observations, and conditioning issues will usually restrict the dimension further. This is not necessarily so for nonlinear problems, but it is a useful guide.
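To make the linear guide concrete, here is a minimal sketch in Python (NumPy). Everything in it is hypothetical and none of it comes from the manuscript: G stands for a generic linear map from controls to observations, and the sizes are arbitrary. The number of control directions that the data alone can determine is the rank of G, which cannot exceed the number of observations M.

```python
import numpy as np

rng = np.random.default_rng(2)
M = 4  # number of observations (hypothetical)

def data_rank(n_controls):
    """Rank of a random M x n_controls map G from controls to observations.

    The normal matrix G^T G has the same rank, so this counts the control
    directions that the data alone can determine.
    """
    G = rng.standard_normal((M, n_controls))
    return int(np.linalg.matrix_rank(G))

rank_few = data_rank(3)    # fewer controls than observations
rank_many = data_rank(10)  # more controls than observations

print(rank_few, rank_many)
```

With ten controls and four observations, at least six control directions are left to the prior statistics alone, which is why a larger control space is not automatically better.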
As I said before, it is good to see the chi-square test used systematically in this way, but the authors should note that it is J (equation 6) rather than J_d (equation 4) that is the useful chi-square random variable, the randomness arising only from the observation noise. If only J_d is considered, there will be no indication if the prior control estimates and their statistics were reasonable; see the texts by Bennett or Kalnay.
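The distinction can be illustrated with a small linear Gaussian experiment. This is only a sketch under my own assumptions, not the authors' model: G, the sizes M and N, and the standard deviations are hypothetical, J is written without a factor of 1/2, the prior mean is zero, and the true controls are drawn from the stated prior. Under those assumptions the minimized J should average close to the number of observations M, while J_d alone averages less.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes and statistics, chosen only for illustration.
M, N = 20, 5                       # number of observations, number of controls
G = rng.standard_normal((M, N))    # generic linear map from controls to data
sigma_x, sigma_n = 2.0, 0.5        # prior control and observation-noise std devs
P_inv = np.eye(N) / sigma_x**2     # inverse prior covariance
R_inv = np.eye(M) / sigma_n**2     # inverse observation-error covariance

def minimized_costs(y):
    """Return (J_d, J) at the minimum of J = J_d + J_c for data y."""
    x_hat = np.linalg.solve(G.T @ R_inv @ G + P_inv, G.T @ R_inv @ y)
    resid = y - G @ x_hat
    J_d = resid @ R_inv @ resid    # data misfit term
    J_c = x_hat @ P_inv @ x_hat    # control penalty term
    return J_d, J_d + J_c

J_d_vals, J_vals = [], []
for _ in range(2000):
    x_true = sigma_x * rng.standard_normal(N)          # drawn from the prior
    y = G @ x_true + sigma_n * rng.standard_normal(M)  # noisy observations
    J_d, J = minimized_costs(y)
    J_d_vals.append(J_d)
    J_vals.append(J)

mean_J_d = float(np.mean(J_d_vals))
mean_J = float(np.mean(J_vals))
print(f"mean J_d = {mean_J_d:.2f}, mean J = {mean_J:.2f}, M = {M}")
```

If the prior statistics were changed in the estimator but not in the simulated truth, the mean of J would move away from M. That is the diagnostic that is lost when only J_d is examined.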
p5, line 1: “Sx and Sf are weights that restrict the size of the perturbations to 5rad and 10rad s−2, respectively.” I wouldn’t say “restrict.” Greater perturbations are permitted, but penalized more severely.
p6, line 20: Please give the definition of the controllability matrix, and state the theorem that specifies the condition for controllability. If I don’t remember, neither do 90% of the readers of NPG. It’s not reasonable to send your readers scurrying to the literature to find those things at this point.
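For reference, the standard statement for a linear time-invariant system x_{k+1} = A x_k + B u_k with n state variables is that the controllability matrix is [B, AB, ..., A^(n-1) B], and the system is controllable if and only if that matrix has rank n. The manuscript's own definition may differ in detail, since its matrices are updated iteratively. A minimal Python sketch of the rank test follows; the 2-state matrices are hypothetical and are not the manuscript's pendulum matrices.

```python
import numpy as np

def controllability_matrix(A, B):
    """Stack [B, AB, ..., A^(n-1) B] column-wise for x_{k+1} = A x_k + B u_k."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

def is_controllable(A, B):
    """Rank condition: controllable iff the matrix above has rank n."""
    C = controllability_matrix(A, B)
    return bool(np.linalg.matrix_rank(C) == A.shape[0])

# Hypothetical 2-state example (position, velocity) with time step 0.1.
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B_forced = np.array([[0.0],
                     [0.1]])       # forcing enters the velocity equation
B_none = np.zeros((2, 1))          # no forcing at all

ctrl_forced = is_controllable(A, B_forced)
ctrl_none = is_controllable(A, B_none)
print(ctrl_forced, ctrl_none)
```

A short definition and statement of this kind in the manuscript would be enough to save readers the trip to the literature.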
p7 The “method of total inversion” needs some further explanation. The terminology is not standard in the community that constitutes the readership of NPG. Without further explanation, it’s hard to know what to make of (12). Consider the expression in the first square brackets. The matrix E is 1x2 (p4, line 24) and the matrix Q appears to be 3x3. It’s not even clear that (12) is a valid vector expression. Also, “…we update the state transition and controllability matrices iteratively.” Since that is the case, you probably want the last expression to reflect the evolution of the state from i\Delta t_y to (i+1)\Delta t_y (pardon my TeX, but I don’t see how to put symbols into this writeup).
Starting with the improved first guess and then iterating until a satisfactory chi-square statistic is achieved is a good strategy. I have seen very, very few inversions of realistic models of the ocean or atmosphere that satisfy the chi-square condition. It’s really the chi-square statistics that allow us to conclude that the black curve in the bottom panel of figure 3 is to be preferred to the gray curve. The real world is rarely so accommodating. The last sentence of section 3.2, “The method of Lagrange multipliers is therefore superior to the first-guess estimate because all components of the solution, both the state and forcing, can be physically interpreted.” is only meaningful to the extent that the chi-square statistics say it is.
p9: “…small-scale features that are added to the surface forcing of ocean models in order to fit observations (e.g., Stammer et al., 2002b)” The real problem with the reconstructed forcing in Stammer et al. (2002b) was not small scale features but large scale features because of inaccurate model dynamics.
p10: As is strongly implied in the following discussion, the statement that “…there appears to be no lower limit on the number of observations necessary in order to produce an acceptable state estimate with this model” is only valid if there are sufficient degrees of freedom in the control space. The Lagrange multiplier method would not be nearly so successful if the control space were restricted to the initial condition, as was the case in some of the earlier studies that cast doubt on the efficacy of 4DVAR in chaotic systems. This point is nicely made in figures 2 and 7 of the present manuscript.
p10: “…the null hypothesis that the model is consistent with observations.” It is true that, for the area above the heavy line in figure 6, a satisfactory fit to data can be obtained, but a too-low chi-square statistic is evidence of overfitting, and the reconstructed inputs cannot be trusted.
Also, figure 6 depicts a contour plot of J_d. It does not, therefore, distinguish among different choices of forcing fields.
last sentence: “In the ocean state estimation problem, all air-sea fluxes are uncertain and temporally variable so there is no shortage of controls that can be defined.” That is true, but the amount of independent air-sea flux data is limited. There may be no shortage of controls that can be defined, but there is a very real shortage of controls that can be constrained by observation.
p12: Relation to the Kalman filter/smoother. Line 21: “Both methods solve the same least squares problem …” This is not strictly true. The Kalman smoother solves the same least squares problem as 4DVAR, but the Kalman filter does not. The Kalman filter and 4DVAR are only guaranteed to agree at the final time.
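The difference is easy to see in a scalar linear example. The sketch below is an illustration under my own assumptions, not the manuscript's model: the coefficients a, q, r and the length K are hypothetical, and a Rauch-Tung-Striebel (RTS) smoother stands in for the least-squares solution over the whole window. The smoothed and filtered estimates coincide at the final time and differ at earlier times.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical scalar system: x_k = a x_{k-1} + w_k,  y_k = x_k + v_k
a, q, r = 0.95, 0.1, 0.5   # dynamics, model-error variance, obs-error variance
K = 30                     # number of time steps

# Synthetic truth and observations
x = np.zeros(K)
x[0] = rng.standard_normal()
for k in range(1, K):
    x[k] = a * x[k - 1] + np.sqrt(q) * rng.standard_normal()
y = x + np.sqrt(r) * rng.standard_normal(K)

# Forward pass: Kalman filter
xp = np.zeros(K); Pp = np.zeros(K)   # predicted mean and variance
xf = np.zeros(K); Pf = np.zeros(K)   # filtered mean and variance
xp[0], Pp[0] = 0.0, 1.0              # prior for the initial state
for k in range(K):
    if k > 0:
        xp[k] = a * xf[k - 1]
        Pp[k] = a * a * Pf[k - 1] + q
    gain = Pp[k] / (Pp[k] + r)
    xf[k] = xp[k] + gain * (y[k] - xp[k])
    Pf[k] = (1.0 - gain) * Pp[k]

# Backward pass: RTS smoother, started from the final filtered estimate
xs = xf.copy()
for k in range(K - 2, -1, -1):
    C = Pf[k] * a / Pp[k + 1]
    xs[k] = xf[k] + C * (xs[k + 1] - xp[k + 1])

final_gap = abs(xs[-1] - xf[-1])                     # agreement at final time
max_earlier_gap = float(np.max(np.abs(xs[:-1] - xf[:-1])))
print(final_gap, max_earlier_gap)
```

The earlier-time differences arise because the filter uses only past and present data at each step, whereas the smoother also uses the data that follow.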
Figure 9 gives a nice illustration of the effect of nonlinearity.

Thank you for the careful revision of your manuscript and your response to comments from the three expert Reviewers! Since all Reviewers recommended major revisions, I am asking them if they can comment on the revised manuscript. Beyond potential additional comments from the Reviewers, I also offer some suggestions below, mostly editorial. I hope that with some additional revisions your manuscript will become ready for publication in NPG. Thank you for submitting your research results for publication to NPG.

Zoltan Toth

p2 l8 - "and it does not pose"

p9, l35: "The nonlinear timescale here IS defined"

p12, l30 - "COMPLETE observability"

p13 l15 - "In THIS section"

Response to Rev. 1:

"The reviewer’s point about the underdetermined nature of the problem is consistent with our discus- sion in Sec. 3.3, where we acknowledge that the solution is not unique, but we focus on finding any acceptable fit. We view the problem as having two clear steps. It is a first step to find any solution that acceptably fits the data. Only then can we proceed to investigate the uniqueness of the solution. In real-world situations, the first step may be the only one that is practical."

- While I agree that the Reviewer's point is consistent with your discussion in Section 3.3, perhaps you could use more direct language in Section 3.3 (or elsewhere), similar to the language in your response to the Reviewer's comment?

Rev. 3 comments

"The controllability in this article is very vaguely defined."

- Perhaps you could clarify, as you define "controllability" on p.3, that in your study, you control the system via adjustments to the atmospheric boundary forcing?

comments on p3 & p14 - perhaps the dependence between omega and theta could be pointed out, in passing, in the manuscript