(note: I did not review the first version of this manuscript)
Bayesian inference can in principle be used to estimate model parameters and quantify their uncertainty. The paper uses an MCMC sampler to estimate five parameters, with uncertainties, and thereby complete a stochastic model for the Earth's geomagnetic field. In particular, a recent approach is used in which certain `features' are selected from the data, and a data model for the features is then constructed.
The paper is well written and interesting, but I have several concerns that should be addressed. My largest concern has already been raised by the former reviewer: the decision to reduce the time data to three points, while reducing the spectral data only to hundreds of points, may have introduced unknown flaws into the inference algorithm. The authors partially address this issue by tuning the variance of the three time-data points, but this has artificially created a narrow posterior in several parameters. I do not think there is sufficient information to trust e.g. the estimate for the parameter a, which changes by two standard deviations when *not* narrowing the variance.
This issue with the inference scheme is critical because the key contributions of the paper all concern the advantages and limitations of the chosen dipole model. But if the inference scheme is flawed, so is the inference, and we learn nothing about the model.
I think one of several strategies should be adopted by the authors. Either the feature-based data assimilation should be adjusted to remove the need for ad hoc tuning, or the tuning should be better justified, or the tuned results should come later in the paper. I have elaborated on these options in great detail (sorry) below.
1) Adjustment: the issue caused by the low-dimensional, high-variance time data should be resolved without resorting to ad hoc methods.
As already suggested by a prior reviewer, the spectral data could be reduced to dimension 3-10, commensurate with the time data. That reviewer suggested a parametrization of the spectral features, and I think something along these lines should be tried. The paper already lays the foundation for one simple approach: equations (24) and (25) describe the spectral `feature observation operator' in three components corresponding to f<0.05, 0.05<f<0.5, and 0.9<f<9.9. The high dimension comes from evaluating these components at many points, but the assumption has already been made that the spectral data split into three features.
Perhaps one could replace the high-dimensional spectral data with these three features. The first feature would be (e.g.) the average distance from y_lf to h_lf(theta) for f<0.05; the second feature is the same thing but for 0.05<f<0.5, and the third feature is again the mean innovation, but in the high-frequency regime.
This choice reduces the spectral data to the same dimension as the time data, and the appropriate covariance matrix can be calculated from the values already used in the paper.
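For concreteness, here is a sketch of the reduction I have in mind. The names (freqs, y_spec, h_spec) and the synthetic spectra are placeholders of my own, not the manuscript's notation; the bands are those of equations (24) and (25):

```python
import numpy as np

def band_features(freqs, y_spec, h_spec):
    """Collapse the spectral mismatch to one mean innovation per band."""
    bands = [(0.0, 0.05), (0.05, 0.5), (0.9, 9.9)]  # bands from eqs. (24)-(25)
    innovation = y_spec - h_spec                     # data minus model spectrum
    feats = []
    for lo, hi in bands:
        mask = (freqs > lo) & (freqs < hi)
        feats.append(innovation[mask].mean())
    return np.array(feats)  # dimension 3, commensurate with the time data

# toy usage with synthetic spectra (illustrative only)
rng = np.random.default_rng(0)
freqs = np.linspace(0.001, 10.0, 500)
y_spec = np.exp(-freqs) + 0.01 * rng.standard_normal(freqs.size)
h_spec = np.exp(-freqs)
f = band_features(freqs, y_spec, h_spec)
print(f.shape)  # (3,)
```

The corresponding 3x3 covariance follows by propagating the per-frequency variances already used in the paper through the averaging.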
2) If the tuning cannot be removed, then it needs to be discussed in detail. Obviously this was the goal of the authors, and I acknowledge the very relevant and interesting observations made from comparing the different observation configurations in Section 6. However, I think an analysis of the *direct* consequences of the variance tuning is missing - though much of the legwork has been done.
-In which parameters is it likely that the posterior is too narrow?
-What is the difference between configuration (e) and the (not shown) configuration that, like (e), only assimilates the time data, but that uses the correct variances for the data?
-Is it certain that there is no issue with degeneracy in the MCMC scheme due to the tight likelihood around the time data? (I note that the MCMC Hammer is affine invariant, and therefore the answer to this question is likely in the affirmative; but I would like to see something on this in the paper.)
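To convince myself on this last point, I toyed with a minimal serial version of the Goodman-Weare stretch move (the move underlying the MCMC Hammer / emcee) on a deliberately anisotropic Gaussian whose scales differ by a factor of 1000, mimicking a tight likelihood in one direction. All names and values are mine, purely illustrative, and not the paper's model:

```python
import numpy as np

def log_prob(x):
    # anisotropic Gaussian target: scales 1 and 1e-3 (illustrative only)
    return -0.5 * (x[0] ** 2 + (x[1] / 1e-3) ** 2)

def stretch_move(walkers, logp, rng, a=2.0):
    """One serial Goodman-Weare stretch-move sweep; returns acceptance rate."""
    n, d = walkers.shape
    lp = np.array([logp(w) for w in walkers])
    accepted = 0
    for k in range(n):
        j = rng.integers(n - 1)
        j = j if j < k else j + 1                    # complementary walker, j != k
        z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a  # z ~ g(z) on [1/a, a]
        prop = walkers[j] + z * (walkers[k] - walkers[j])
        lp_prop = logp(prop)
        if np.log(rng.random()) < (d - 1) * np.log(z) + lp_prop - lp[k]:
            walkers[k], lp[k] = prop, lp_prop
            accepted += 1
    return accepted / n

rng = np.random.default_rng(1)
walkers = rng.standard_normal((20, 2)) * np.array([1.0, 1e-3])
acc = np.mean([stretch_move(walkers, log_prob, rng) for _ in range(200)])
print(round(acc, 2))  # acceptance stays healthy despite the 1000:1 scale ratio
```

Because the proposal is a linear combination of walkers, the extreme scale ratio does not degrade acceptance, which is the intuition behind my "likely in the affirmative" above; still, a sentence or two in the paper would settle it.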
Once this analysis has been done, the claims of the abstract and introduction must be re-evaluated in terms of this analysis, in particular "Our approach thus results in a reliable stochastic model for selected aspects of the long term behavior of the geomagnetic dipole field whose limitations and errors are well understood". At the present moment I would not say the limitations and errors are well understood; not by me, at least.
3) If the tuning cannot be removed or justified better, then please present configurations (d) and (e) first. Then configuration (a) can be framed as a (partially successful!) attempt to obtain consistent parameters from two inconsistent data sources, and this can be achieved with minimal changes to the existing discussion. If the authors do not feel that requests (1) or (2) are achievable, I think this is a valid rearrangement of the manuscript. The chief issue with configuration (a) is that it is presented first, gets its own section, and is treated, in terms of manuscript structure, as the prominent method, with the other configurations present only for discussion. If configuration (a) instead came later as a comparison, rather than as the figurehead scheme, my objections to the variance reduction would be much weaker.
My second major contention is that the authors do not appear to investigate the validity of the Gaussian assumption for the feature likelihood(s). This assumption, while necessary for the approach taken, should be stated and, even if only briefly, investigated or justified with suitable references.
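One cheap check would be to simulate the model repeatedly at fixed parameters, recompute the feature each time, and inspect the moments of the resulting feature ensemble. The sketch below uses a synthetic stand-in feature (the mean of a skewed signal), not the paper's observation operator; averaging-type features tend to be near-Gaussian by the central limit theorem, which is exactly what such a check would confirm or refute:

```python
import numpy as np

rng = np.random.default_rng(2)

def simulated_feature(rng, n=200):
    # stand-in feature: the mean of a skewed synthetic signal
    signal = rng.exponential(size=n)
    return signal.mean()

# ensemble of features at fixed "parameters"
feats = np.array([simulated_feature(rng) for _ in range(5000)])

# standardize and look at the third and fourth moments; for a Gaussian,
# skewness and excess kurtosis should both be near zero
z = (feats - feats.mean()) / feats.std()
skew = (z ** 3).mean()
ex_kurt = (z ** 4).mean() - 3.0
print(abs(skew) < 0.4, abs(ex_kurt) < 0.5)
```

A QQ plot of the standardized features would serve the same purpose graphically and could go in an appendix.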
Minor comments:
p1 abstract: "Another important aspect of our overall approach is that it can reveal inconsistencies between model and data..."
You have identified inconsistencies; please strengthen the statement accordingly.
p.3 line 7: were->where
p.3 line 15, "For example, a point-wise mismatch of model and data is difficult to enforce when two different data sets report two different values for the same quantity": I am not convinced of the statement as written. If I have a 1d state x and two observations y1, y2 of the same state variable (here just x), then (assuming obs errors are independent) the observation operator is H = [1; 1] and the covariance matrix is R = [sigma1^2 0; 0 sigma2^2]. That is, there is no issue with multiple observations of the same quantity, in principle. I think the issue here is that the unknown observation error, or bias, makes the two observation sets very inconsistent, and that this bias is impossible to model (?).
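A quick numerical illustration of my point (all values are mine, purely illustrative): the Gaussian posterior for a scalar state with two independent observations is just the precision-weighted combination of prior and data, and the machinery runs whether or not the two observations agree.

```python
import numpy as np

def posterior(mu0, s0, obs, sig):
    """Gaussian posterior mean/std for a scalar x, independent observations."""
    prec = 1.0 / s0 ** 2 + sum(1.0 / s ** 2 for s in sig)
    mean = (mu0 / s0 ** 2 + sum(y / s ** 2 for y, s in zip(obs, sig))) / prec
    return mean, prec ** -0.5

# consistent observations: no difficulty, posterior sits between them
m, s = posterior(0.0, 10.0, obs=[1.0, 1.2], sig=[0.5, 0.5])
print(round(m, 3))  # 1.099

# inconsistent (biased) observations: the update still runs, but the
# posterior splits the difference and its spread does not reflect the conflict
m2, s2 = posterior(0.0, 10.0, obs=[1.0, 5.0], sig=[0.5, 0.5])
print(round(m2, 3))  # 2.996
```

The second case is, I believe, what the sentence is really about, and saying so explicitly would make the claim defensible.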
p3, line 25: use either `likelihood' or `likelihoods' consistently in both places
p9, last line: "This will lead to an improved fit, along with an improved understanding of model uncertainties." just repeats the previous sentence
p13, Section 4.2. The Feature-based likelihoods introduction repeats much from the introduction/motivation sections. Consider removing the first paragraph, and/or perhaps the last few sentences from p.13
p15 "...but alternative strategies are straightforward to implement within our overall feature-based Bayesian estimation approach."
This phrase will need updating one way or another; I recommend removing it regardless of what happens to the rest of the paper. I found it frustrating to read, because if a better strategy were straightforward to implement, why was it not implemented?
p23: "The configurations we consider are summarized in Table 4 and the corresponding parameter estimates are reported in Table 5." On my copy of the manuscript, the references are off by one (i.e. the configurations are summarized in Table *5* and the estimates reported in Table *6*).
p.29, code and data availability: If allowed by the journal, please clarify for the reader which repository/directory contains the appropriate code, or a starting example (e.g. suggesting the file to run to simulate the results for Figure 6), or upload a readme file to the repo. These are nice results and I think only a little work will make them more accessible.