03 Mar 2021
03 Mar 2021
Multivariate localization functions for strongly coupled data assimilation in the bivariate Lorenz ’96 system
 Department of Applied Mathematics, University of Colorado, Boulder, Colorado
 Department of Applied Mathematics, University of Colorado, Boulder, Colorado
Abstract. Localization is widely used in data assimilation schemes to mitigate the impact of sampling errors on ensemblederived background error covariance matrices. Strongly coupled data assimilation allows observations in one component of a coupled model to directly impact another component through inclusion of crossdomain terms in the background error covariance matrix. When different components have disparate dominant spatial scales, localization between model domains must properly account for the multiple length scales at play. In this work we develop two new multivariate localization functions, one of which is a multivariate extension of the fifthorder piecewise rational GaspariCohn localization function; the withincomponent localization functions are standard GaspariCohn with different localization radii while the crosslocalization function is newly constructed. The functions produce nonnegative definite localization matrices, which are suitable for use in variational data assimilation schemes. We compare the performance of our two new multivariate localization functions to two other multivariate localization functions and to the univariate analogs of all four functions in a simple experiment with the bivariate Lorenz '96 system. In our experiment the multivariate GaspariCohn function leads to better performance than any of the other localization functions.
Zofia Stanley et al.
Status: open (until 28 Apr 2021)

RC1: 'Comment on npg20218', S.G. Penny, 27 Mar 2021
reply
General Comments:
The development of localization schemes for coupled dynamics is an important activity that needs increased attention as operational forecast centers transition to greater reliance on coupled Earth system forecast models. The authors provide a promising advancement to address the localization of crossdomain error correlations. I believe the work should be published after the authors explore a larger parameter space for their experimental results, as described below. In exploring a larger parameter space, it may be sufficient to focus on one or two leading methods (e.g. GC and BW).
One concern is the choice of model, and how well the results can transfer to more realistic scenarios, given the near linear relationship between the slow and fast components in this system (e.g. see S. Rasp note referenced below). Do the authors have confidence that the results can translate in some way to more sophisticated systems? I would be interested to know how the results change as the coupling strength between the slow and fast components is weakened or strengthened from the baseline state used by the authors.
A second concern is the restriction to observing only the fast dynamics. I would like to see a complete investigation examining the observation of fastonly, slowonly, and the full slowfast coupled system. I’d like to see Figure 3 repeated for a few different scenarios, including those just mentioned, but also potentially varying parameters of the EnKF, such as the frequency of observations, the density of observations, the amount of observation noise, the length of the analysis cycle, etc. Not all results need to be reported in figures, but some indication that the authors have explored more variations in the problem specification would help to build confidence in the robustness of the final reported results.
Minor issue: There are a few instances where the present tense is used when it should be past tense.
Specific Comments:
L 8:
“The functions produce nonnegative definite localization matrices, which are suitable for use in
variational data assimilation schemes.”
I think the term ‘positive semidefinite’ is more common, and the one originally used by Gaspari and Cohn (1999). I would suggest changing all instances of this throughout the manuscript.
L 1416:
“The background error covariance statistics stored in B dictate how information from observations propagates through the domain during the assimilation step (Bannister, 2008)”
The term ‘propagates’ seems appropriate for 4DVar, but perhaps not for all DA methods. More generally, the background error covariance provides a structure function that determines how observed quantities affect the model state variables, which is of particular importance when the state space is not fully observed.
L 25:
“Localization is typically incorporated into an ensemble estimate of B through a Schur (or elementwise) product.”
I would change this to say that localization is typically incorporated into the data assimilation in one of two ways  either through the B matrix using a Shur product, or through the observation error covariance R (e.g. Greybush et al., 2011). You are focusing on the localization applied directly to the B matrix.
Greybush et al., 2011: Balance and Ensemble Kalman Filter Localization Techniques.
https://journals.ametsoc.org/view/journals/mwre/139/2/2010mwr3328.1.xml
L 3233:
“In Earth system modeling in particular, coupled DA shows improvements over single domain analyses (Penny et al., 2017; Zhang et al., 2020)”
Additional sources that determined this point clearly are Sluka et al. (2016) and Penny et al. (2019):
Sluka, T., S.G. Penny, E. Kalnay, and T. Miyoshi, 2016: Using Strongly Coupled Ensemble Data Assimilation to Assimilate Atmospheric Observations into the Ocean. Geophys. Res. Lett., 43, doi:10.1002/2015GL067238.
Penny, S.G., E. Bach, K. Bhargava, CC. Chang, C. Da, L. Sun, T. Yoshida, 2019: Strongly coupled data assimilation in multiscale media: experiments using a quasigeostrophic coupled model. Journal of Advances in Modeling Earth Systems, 11. https://doi.org/10.1029/2019MS001652 https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2019MS001652
L 3537:
“Schemes that include crossdomain [error] correlations in the B matrix are broadly classified
as strongly coupled, which is distinguished from weakly coupled schemes where B does not include any nonzero crossdomain [error] correlations. The inclusion of crossdomain [error] correlations in B offers advantages”
To be more precise, the term “crossdomain error correlations” should be used if referred to the error covariance matrix B.
L 55:
I’ll note that Lorenz himself cited this as (Lorenz, 1996). See comment below regarding line 461.
L 5758:
“We find that, in our set up, artificially decreasing the magnitude of the crossdomain correlation hinders the assimilation of observations.”
This is a positive sign for the advancement strongly coupled DA, but I wonder if this could be partly due to the use of the Lorenz system II, which has some highly linear relationships between the small and large scale systems. Some discussion was given, for example, in this blog post by Stephan Rasp:
https://raspstephan.github.io/blog/lorenz96istooeasy/
L 61:
“localization function[s] from the literature.”
L 7778:
“A fundamental difficulty in localization for strongly coupled DA is how to propose a crosslocalization function LXY to populate both LXY and LYX”
It might be useful to explain at this point which term controls the effect of system X on Y, and Y on X.
L 102:
“we define two processes Zj , j = X,Y”
I understood this on the third read through. Perhaps the authors could reword this sentence slightly to make it more clear. For example,
“we define two processes Zj, where j can represent either X or Y”
Or simply,
“we define two processes Zj, with j=X,Y”
L 105106:
“ Thus LXX, LYY ,LXY form a multivariate covariance function, and hence a multivariate, nonnegative definite function”
Based on the terminology defined so far, I’m not sure how to interpret the triple (LXX,LYY,LXY) forming a single function. Perhaps a line or two could be added to explain this step.
L 118:
The way I am interpreting the notation is that the term (1r/c)_+ is zero when the term in parentheses is less than or equal to 0, which would occur when r>=c. Can the authors explain the comment about the convolution being zero at distances greater than 2c in line 120, it is not immediately obvious.
L 139:
“who perform[ed] the”
L 140:
“in never develop[ed] multivariate”
L 156:
“Porcu et al. (2013) develop[ed] a multivariate version”
“et al.” is short for the Latin term “et alia,” meaning “and others.” It is strange to reference the actions of Porcu “and others” in the year 2013 using present tense.
L 157:
“Roh et al. (2015) [found] that”
L 159:
“Daley et al. (2015) extend[ed] the work”
L 166:
“with B the beta function”
Could you define this here for clarity.
L 168:
“Daley et al. (2015) [gave]”
L 184:
“This approach leads to a “weakly” coupled scheme, which is not the focus of this work.”
I understand this may not be the focus, but it seems that it would be appropriate to compare to this approach given that the weakly coupled DA scheme is the standard approach for current operational forecast systems.
L 184186:
“Additionally, in our setup we observe only one of the two processes and we find that when the assimilation is not allowed to update the unobserved process the result is prone to catastrophic divergence”
It might be appropriate to perform a few experiments where both components are observed, and results are compared using weakly and strongly coupled DA.
L 200/202/205:
“ Lorenz (199[6])”
See comment below for line 461.
L 203:
“using an adaptive fourthorder RungeKutta method”
Perhaps provide a citation for the method.
L 204205:
“The solutions are output with a time interval of 0.005 nondimensional units, or 36 minutes”
It seems strange to say there are nondimensional units and then indicate that it is the same as 36 minutes. Perhaps repeat some of the justification from Lorenz to indicate the relative error growth rates and its relation to more realistic applications that would be approximately equivalent to 36 minutes in operational prediction in the early 1990’s.
Figure 2 caption:
“setup” is a noun that means “the way in which something… is organized, planned, or arranged.” This should probably be used in most places where the authors current use two words: “set up“.
L 209:
“Increasing the coupling strength leads to larger covariances between the forecast errors in processes X and Y , thereby making the effect of crosslocalization more pronounced and easier to study.”
I believe this is the case. However, I would like to see some sensitivity study of how the benefit of strongly coupled DA paradigm breaks down as the coupling strength between the two components weakens and asymptotes to 0.
L 213:
“We choose to place the variable Xk in the middle”
Does the placement of the X variable have any influence on the results of localization? Is there any sensitivity here, or are the results generally the same regardless of how the placement of the X and Y variables are interpreted?
L 218:
“We develop localization functions for EnVar schemes where nonnegative definiteness of the localization matrix is essential to ensure convergence of the numerical optimization. Since the minimizer of the 3DEnVar objective function is the same as the EnKF analysis mean in the case of linear observation (Lorenc, 1986), in this experiment we make use of the EnKF rather than implement an ensemble of 3DEnVar assimilation scheme (Evensen, 1994; Houtekamer and Mitchell, 1998; Burgers et al., 1998)”
This discussion is a bit confusing. I think it could be cleaned up with a little reorganization, e.g.
We develop localization functions for data assimilation schemes that rely on Schur product modification of the background error covariance matrix B. In our experiments we use the stochastic EnKF (Evensen, 1994; Houtekamer and Mitchell, 1998; Burgers et al., 1998). However, because the minimizer of the 3DEnVar objective function is the same as the EnKF analysis mean in the case of linear observations (Lorenc, 1986), our results translate to EnVar schemes as well. The positive semidefiniteness of the localization matrix is essential to ensure convergence of the numerical optimization methods used to implement EnVar <cite>.
L 230231:
“In this experiment we use the adaptive inflation scheme of El Gharamti (2018) and apply the inflation to the prior estimate.”
Can this be added to the EnKF equations above for clarity?
L 233234:
“We run each DA scheme for 3,000 time steps, discarding the first 1,000 time steps and reporting statistics from the remaining 2,000 time steps.”
Is this referring to model time steps, or the number of analysis cycles?
L 235237:
“The observation operator H is such that all of the Y variables are observed, and none of the X variables are observed. In this way we can isolate the effect of the localization on the performance of the filter for the X variable.”
This means you are observing the fast dynamics and using this to update the slow dynamics through the error covariance statistics. This has been shown effective in a number of studies exploring strongly coupled DA. Penny et al. (2019) showed that the reverse was also possible, particularly if the size of the analysis window is decreased (or the frequency of observation updates is increased).
L 256:
“, so that we hypothesize that GC allows”
Change to:
“, so we hypothesize that GC allows”
L 266267:
“ By contrast, the BW and Askey functions show virtually no difference between the multivariate and univariate versions”
The BW method looks slightly improved, and the Askey method slightly degraded.
L 280:
It is up to the authors, but generally the conclusions reads more clearly if this is now written in past tense. E.g.
“In this work, we develop[ed]…”
“We compare[ed] multivariate GC to three…”
“We [found] that, in a toy model…”
“In this work we investigate[d]…”
“We [found] that this…”
L 284:
“the localized estimate of the background [error] covariance matrix”
L 294:
“A natural application of this work is localization in a coupled atmosphereocean model. Multivariate GC allows for within component covariances to be localized with GC exactly as they would be in an uncoupled setting, using the optimal localization length scale for each component Ying et al. (2018). In this work we discuss the importance of the crosslocalization radius in determining performance. However, this work does not address the question of optimal crosslocalization radius selection, which is an important area for future research”
This is certainly of interest  are there any conjectures that can be made about the applicability of the results here extending to an application like a coupled atmosphereocean model?
While the interpretation of localization is clear in the withincomponent covariances, how would you interpret localization on the crosscomponent covariances? Could a situation in which it might be desirable for an atmospheric observation to have an influence on the ocean state but not vice versa create difficulties with the symmetry relied on above in forming LXX, LYY, and LXY as a triple, and the need for maintaining positive semidefiniteness?
L 461:
The full citation for Lorenz96 is not given. It should be Lorenz (1996):
Lorenz, E.N., 1996: Predictability—A problem partly solved.Proc. Seminar on Predictability,Vol. 1, Reading, Berkshire, UnitedKingdom, ECMWF, 1–18.
Note that Lorenz cited it himself this way in:
Lorenz, E.N., 2005. Designing chaotic models. J. of the Atmos. Sci. 62, 1574–1587. DOI:10.1175/JAS3430.1.

RC2: 'Comment on npg20218', Anonymous Referee #2, 30 Mar 2021
reply
The paper by Stanley et.al. develops multisscale extensions to the traditional families of the parameterized localization functions such as the GaspariCohn 5^{th} order polynomial. A key contribution of the paper is to note that each of these polynomial expressions can be represented as a product of the squareroot kernels that can be crossmultiplied to achieve a positivedefinite crossscale localization. Authors, correctly, draw relevance of these new techniques to the problem of cross scale localization encountered in coupled data assimilation. Authors correctly choose the right type of the test problem that is usefulenough to test the mathematics of the developed extensions yet is not too complex to obscure the interpretation of the results. I agree with authors that there is no need to overinterpret the results of this simple experiment with regard to its relevance in a more complex oceanatmosphere problems.
I found this article very relevant, wellwritten, with adequate experimental plan, and appropriate interpretation of the experimental results. I congratulate the authors on a nice contribution to the literature and suggest this paper for publication after minor revision.
Summary of suggested changes:
 This contribution parallels the work of Mark Buehner on multiscale localization. I suggest that authors draw this parallel by referencing some of his work such as [https://doi.org/10.3402/tellusa.v67.28027]. By including these references, authors can then discuss how their work can also be related to the problem of multiscale localization (e.g. such as assimilation in convectiveresolving models).
 I made some minor suggestion to how authors might consider changing or extending other references in the introduction section (see pdf attached).
 I tried to read the paper from the perspective of someone who might want to implement some of the localization formulas discussed by the authors. I made some suggestions on clarifications. I highly appreciate that authors published the source code for their work. I suggest that authors mention that in the main body of the paper.
Zofia Stanley et al.
Model code and software
Multivariate Localization Functions Zofia Stanley and Ian Grooms https://doi.org/10.5281/zenodo.4574612
Zofia Stanley et al.
Viewed
HTML  XML  Total  BibTeX  EndNote  

280  39  7  326  2  0 
 HTML: 280
 PDF: 39
 XML: 7
 Total: 326
 BibTeX: 2
 EndNote: 0
Viewed (geographical distribution)
Country  #  Views  % 

Total:  0 
HTML:  0 
PDF:  0 
XML:  0 
 1