03 Mar 2021
03 Mar 2021
Multivariate localization functions for strongly coupled data assimilation in the bivariate Lorenz ’96 system
 Department of Applied Mathematics, University of Colorado, Boulder, Colorado
 Department of Applied Mathematics, University of Colorado, Boulder, Colorado
Abstract. Localization is widely used in data assimilation schemes to mitigate the impact of sampling errors on ensemblederived background error covariance matrices. Strongly coupled data assimilation allows observations in one component of a coupled model to directly impact another component through inclusion of crossdomain terms in the background error covariance matrix. When different components have disparate dominant spatial scales, localization between model domains must properly account for the multiple length scales at play. In this work we develop two new multivariate localization functions, one of which is a multivariate extension of the fifthorder piecewise rational GaspariCohn localization function; the withincomponent localization functions are standard GaspariCohn with different localization radii while the crosslocalization function is newly constructed. The functions produce nonnegative definite localization matrices, which are suitable for use in variational data assimilation schemes. We compare the performance of our two new multivariate localization functions to two other multivariate localization functions and to the univariate analogs of all four functions in a simple experiment with the bivariate Lorenz '96 system. In our experiment the multivariate GaspariCohn function leads to better performance than any of the other localization functions.
Zofia Stanley et al.
Status: final response (author comments only)

RC1: 'Comment on npg20218', S.G. Penny, 27 Mar 2021
General Comments:
The development of localization schemes for coupled dynamics is an important activity that needs increased attention as operational forecast centers transition to greater reliance on coupled Earth system forecast models. The authors provide a promising advancement to address the localization of crossdomain error correlations. I believe the work should be published after the authors explore a larger parameter space for their experimental results, as described below. In exploring a larger parameter space, it may be sufficient to focus on one or two leading methods (e.g. GC and BW).
One concern is the choice of model, and how well the results can transfer to more realistic scenarios, given the near linear relationship between the slow and fast components in this system (e.g. see S. Rasp note referenced below). Do the authors have confidence that the results can translate in some way to more sophisticated systems? I would be interested to know how the results change as the coupling strength between the slow and fast components is weakened or strengthened from the baseline state used by the authors.
A second concern is the restriction to observing only the fast dynamics. I would like to see a complete investigation examining the observation of fastonly, slowonly, and the full slowfast coupled system. I’d like to see Figure 3 repeated for a few different scenarios, including those just mentioned, but also potentially varying parameters of the EnKF, such as the frequency of observations, the density of observations, the amount of observation noise, the length of the analysis cycle, etc. Not all results need to be reported in figures, but some indication that the authors have explored more variations in the problem specification would help to build confidence in the robustness of the final reported results.
Minor issue: There are a few instances where the present tense is used when it should be past tense.
Specific Comments:
L 8:
“The functions produce nonnegative definite localization matrices, which are suitable for use in
variational data assimilation schemes.”
I think the term ‘positive semidefinite’ is more common, and the one originally used by Gaspari and Cohn (1999). I would suggest changing all instances of this throughout the manuscript.
L 1416:
“The background error covariance statistics stored in B dictate how information from observations propagates through the domain during the assimilation step (Bannister, 2008)”
The term ‘propagates’ seems appropriate for 4DVar, but perhaps not for all DA methods. More generally, the background error covariance provides a structure function that determines how observed quantities affect the model state variables, which is of particular importance when the state space is not fully observed.
L 25:
“Localization is typically incorporated into an ensemble estimate of B through a Schur (or elementwise) product.”
I would change this to say that localization is typically incorporated into the data assimilation in one of two ways  either through the B matrix using a Shur product, or through the observation error covariance R (e.g. Greybush et al., 2011). You are focusing on the localization applied directly to the B matrix.
Greybush et al., 2011: Balance and Ensemble Kalman Filter Localization Techniques.
https://journals.ametsoc.org/view/journals/mwre/139/2/2010mwr3328.1.xml
L 3233:
“In Earth system modeling in particular, coupled DA shows improvements over single domain analyses (Penny et al., 2017; Zhang et al., 2020)”
Additional sources that determined this point clearly are Sluka et al. (2016) and Penny et al. (2019):
Sluka, T., S.G. Penny, E. Kalnay, and T. Miyoshi, 2016: Using Strongly Coupled Ensemble Data Assimilation to Assimilate Atmospheric Observations into the Ocean. Geophys. Res. Lett., 43, doi:10.1002/2015GL067238.
Penny, S.G., E. Bach, K. Bhargava, CC. Chang, C. Da, L. Sun, T. Yoshida, 2019: Strongly coupled data assimilation in multiscale media: experiments using a quasigeostrophic coupled model. Journal of Advances in Modeling Earth Systems, 11. https://doi.org/10.1029/2019MS001652 https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2019MS001652
L 3537:
“Schemes that include crossdomain [error] correlations in the B matrix are broadly classified
as strongly coupled, which is distinguished from weakly coupled schemes where B does not include any nonzero crossdomain [error] correlations. The inclusion of crossdomain [error] correlations in B offers advantages”
To be more precise, the term “crossdomain error correlations” should be used if referred to the error covariance matrix B.
L 55:
I’ll note that Lorenz himself cited this as (Lorenz, 1996). See comment below regarding line 461.
L 5758:
“We find that, in our set up, artificially decreasing the magnitude of the crossdomain correlation hinders the assimilation of observations.”
This is a positive sign for the advancement strongly coupled DA, but I wonder if this could be partly due to the use of the Lorenz system II, which has some highly linear relationships between the small and large scale systems. Some discussion was given, for example, in this blog post by Stephan Rasp:
https://raspstephan.github.io/blog/lorenz96istooeasy/
L 61:
“localization function[s] from the literature.”
L 7778:
“A fundamental difficulty in localization for strongly coupled DA is how to propose a crosslocalization function LXY to populate both LXY and LYX”
It might be useful to explain at this point which term controls the effect of system X on Y, and Y on X.
L 102:
“we define two processes Zj , j = X,Y”
I understood this on the third read through. Perhaps the authors could reword this sentence slightly to make it more clear. For example,
“we define two processes Zj, where j can represent either X or Y”
Or simply,
“we define two processes Zj, with j=X,Y”
L 105106:
“ Thus LXX, LYY ,LXY form a multivariate covariance function, and hence a multivariate, nonnegative definite function”
Based on the terminology defined so far, I’m not sure how to interpret the triple (LXX,LYY,LXY) forming a single function. Perhaps a line or two could be added to explain this step.
L 118:
The way I am interpreting the notation is that the term (1r/c)_+ is zero when the term in parentheses is less than or equal to 0, which would occur when r>=c. Can the authors explain the comment about the convolution being zero at distances greater than 2c in line 120, it is not immediately obvious.
L 139:
“who perform[ed] the”
L 140:
“in never develop[ed] multivariate”
L 156:
“Porcu et al. (2013) develop[ed] a multivariate version”
“et al.” is short for the Latin term “et alia,” meaning “and others.” It is strange to reference the actions of Porcu “and others” in the year 2013 using present tense.
L 157:
“Roh et al. (2015) [found] that”
L 159:
“Daley et al. (2015) extend[ed] the work”
L 166:
“with B the beta function”
Could you define this here for clarity.
L 168:
“Daley et al. (2015) [gave]”
L 184:
“This approach leads to a “weakly” coupled scheme, which is not the focus of this work.”
I understand this may not be the focus, but it seems that it would be appropriate to compare to this approach given that the weakly coupled DA scheme is the standard approach for current operational forecast systems.
L 184186:
“Additionally, in our setup we observe only one of the two processes and we find that when the assimilation is not allowed to update the unobserved process the result is prone to catastrophic divergence”
It might be appropriate to perform a few experiments where both components are observed, and results are compared using weakly and strongly coupled DA.
L 200/202/205:
“ Lorenz (199[6])”
See comment below for line 461.
L 203:
“using an adaptive fourthorder RungeKutta method”
Perhaps provide a citation for the method.
L 204205:
“The solutions are output with a time interval of 0.005 nondimensional units, or 36 minutes”
It seems strange to say there are nondimensional units and then indicate that it is the same as 36 minutes. Perhaps repeat some of the justification from Lorenz to indicate the relative error growth rates and its relation to more realistic applications that would be approximately equivalent to 36 minutes in operational prediction in the early 1990’s.
Figure 2 caption:
“setup” is a noun that means “the way in which something… is organized, planned, or arranged.” This should probably be used in most places where the authors current use two words: “set up“.
L 209:
“Increasing the coupling strength leads to larger covariances between the forecast errors in processes X and Y , thereby making the effect of crosslocalization more pronounced and easier to study.”
I believe this is the case. However, I would like to see some sensitivity study of how the benefit of strongly coupled DA paradigm breaks down as the coupling strength between the two components weakens and asymptotes to 0.
L 213:
“We choose to place the variable Xk in the middle”
Does the placement of the X variable have any influence on the results of localization? Is there any sensitivity here, or are the results generally the same regardless of how the placement of the X and Y variables are interpreted?
L 218:
“We develop localization functions for EnVar schemes where nonnegative definiteness of the localization matrix is essential to ensure convergence of the numerical optimization. Since the minimizer of the 3DEnVar objective function is the same as the EnKF analysis mean in the case of linear observation (Lorenc, 1986), in this experiment we make use of the EnKF rather than implement an ensemble of 3DEnVar assimilation scheme (Evensen, 1994; Houtekamer and Mitchell, 1998; Burgers et al., 1998)”
This discussion is a bit confusing. I think it could be cleaned up with a little reorganization, e.g.
We develop localization functions for data assimilation schemes that rely on Schur product modification of the background error covariance matrix B. In our experiments we use the stochastic EnKF (Evensen, 1994; Houtekamer and Mitchell, 1998; Burgers et al., 1998). However, because the minimizer of the 3DEnVar objective function is the same as the EnKF analysis mean in the case of linear observations (Lorenc, 1986), our results translate to EnVar schemes as well. The positive semidefiniteness of the localization matrix is essential to ensure convergence of the numerical optimization methods used to implement EnVar <cite>.
L 230231:
“In this experiment we use the adaptive inflation scheme of El Gharamti (2018) and apply the inflation to the prior estimate.”
Can this be added to the EnKF equations above for clarity?
L 233234:
“We run each DA scheme for 3,000 time steps, discarding the first 1,000 time steps and reporting statistics from the remaining 2,000 time steps.”
Is this referring to model time steps, or the number of analysis cycles?
L 235237:
“The observation operator H is such that all of the Y variables are observed, and none of the X variables are observed. In this way we can isolate the effect of the localization on the performance of the filter for the X variable.”
This means you are observing the fast dynamics and using this to update the slow dynamics through the error covariance statistics. This has been shown effective in a number of studies exploring strongly coupled DA. Penny et al. (2019) showed that the reverse was also possible, particularly if the size of the analysis window is decreased (or the frequency of observation updates is increased).
L 256:
“, so that we hypothesize that GC allows”
Change to:
“, so we hypothesize that GC allows”
L 266267:
“ By contrast, the BW and Askey functions show virtually no difference between the multivariate and univariate versions”
The BW method looks slightly improved, and the Askey method slightly degraded.
L 280:
It is up to the authors, but generally the conclusions reads more clearly if this is now written in past tense. E.g.
“In this work, we develop[ed]…”
“We compare[ed] multivariate GC to three…”
“We [found] that, in a toy model…”
“In this work we investigate[d]…”
“We [found] that this…”
L 284:
“the localized estimate of the background [error] covariance matrix”
L 294:
“A natural application of this work is localization in a coupled atmosphereocean model. Multivariate GC allows for within component covariances to be localized with GC exactly as they would be in an uncoupled setting, using the optimal localization length scale for each component Ying et al. (2018). In this work we discuss the importance of the crosslocalization radius in determining performance. However, this work does not address the question of optimal crosslocalization radius selection, which is an important area for future research”
This is certainly of interest  are there any conjectures that can be made about the applicability of the results here extending to an application like a coupled atmosphereocean model?
While the interpretation of localization is clear in the withincomponent covariances, how would you interpret localization on the crosscomponent covariances? Could a situation in which it might be desirable for an atmospheric observation to have an influence on the ocean state but not vice versa create difficulties with the symmetry relied on above in forming LXX, LYY, and LXY as a triple, and the need for maintaining positive semidefiniteness?
L 461:
The full citation for Lorenz96 is not given. It should be Lorenz (1996):
Lorenz, E.N., 1996: Predictability—A problem partly solved.Proc. Seminar on Predictability,Vol. 1, Reading, Berkshire, UnitedKingdom, ECMWF, 1–18.
Note that Lorenz cited it himself this way in:
Lorenz, E.N., 2005. Designing chaotic models. J. of the Atmos. Sci. 62, 1574–1587. DOI:10.1175/JAS3430.1.
 AC1: 'Reply on RC1', Zofia Stanley, 10 Jun 2021

RC2: 'Comment on npg20218', Anonymous Referee #2, 30 Mar 2021
The paper by Stanley et.al. develops multisscale extensions to the traditional families of the parameterized localization functions such as the GaspariCohn 5^{th} order polynomial. A key contribution of the paper is to note that each of these polynomial expressions can be represented as a product of the squareroot kernels that can be crossmultiplied to achieve a positivedefinite crossscale localization. Authors, correctly, draw relevance of these new techniques to the problem of cross scale localization encountered in coupled data assimilation. Authors correctly choose the right type of the test problem that is usefulenough to test the mathematics of the developed extensions yet is not too complex to obscure the interpretation of the results. I agree with authors that there is no need to overinterpret the results of this simple experiment with regard to its relevance in a more complex oceanatmosphere problems.
I found this article very relevant, wellwritten, with adequate experimental plan, and appropriate interpretation of the experimental results. I congratulate the authors on a nice contribution to the literature and suggest this paper for publication after minor revision.
Summary of suggested changes:
 This contribution parallels the work of Mark Buehner on multiscale localization. I suggest that authors draw this parallel by referencing some of his work such as [https://doi.org/10.3402/tellusa.v67.28027]. By including these references, authors can then discuss how their work can also be related to the problem of multiscale localization (e.g. such as assimilation in convectiveresolving models).
 I made some minor suggestion to how authors might consider changing or extending other references in the introduction section (see pdf attached).
 I tried to read the paper from the perspective of someone who might want to implement some of the localization formulas discussed by the authors. I made some suggestions on clarifications. I highly appreciate that authors published the source code for their work. I suggest that authors mention that in the main body of the paper.
 AC2: 'Reply on RC2', Zofia Stanley, 10 Jun 2021

RC3: 'Comment on npg20218', Anonymous Referee #3, 19 Apr 2021
General Comments:
This manuscript presents a new multivariate extension of the standard GaspariCohn localization function which is compared with 3 other multivariate functions, plus their univariate versions. These techniques are extremely relevant to problem of crossdomain localization in strongly coupled data assimilation and this work is an encouraging step towards developing appropriate methods for such systems. The localization techniques are illustrated using the bivariate Lorenz 96 system.
Whilst it would be nice to have an example illustrating how the new multivariate GC method translates to a more realistic system I appreciate that it is always important to test new ideas like this in a relatively simple system, where results can be more easily interpreted. I hope that the authors have the opportunity to extend this work to a more complex coupled model system in the future.
It is always good to see new work addressing issues related to the application of coupled DA. The article is timely, highly relevant and clearly written; it will make a nice contribution to the coupled DA literature. I suggest it is published after minor revisions.
Specific comments:
 It is a shame that results were only shown for case where the fast (Y) component is fully observed, and further that the performance of each method was only measured/ reported in terms of the RMSE of the X (slow, unobserved) component. I would like to see some results from experiments where only the X (slow) component is observed, and also where both the X and Y components are observed, both fully and partially. I appreciate that this would potentially increase the number of figures/length of the manuscript, but it may not be necessary to explicitly show all the results. A brief discussion of the results in order to confirm that the general conclusions still hold under different observing scenarios would give the reader greater confidence in the performance of the new GC method.
 I am not entirely clear on how the univariate localization functions were implemented. Lines 181182 state:
"We compare the four multivariate localization functions in Sect. 2 to a simple approach to localization in coupled DA, which is to use the same localization function for all model components. We call this approach univariate localization."
I think this means that each block of the localization matrix L uses the same localization function and radius for all blocks, rather than a different radius for the X and Y blocks and a different function and radius for the cross X,Y block, is this correct? I think what is confusing is that you are calling it univariate localization but you are actually localizing the cross XY blocks of the matrix B. Perhaps this needs to be stated more explicitly somewhere. In systems with very different error correlation scales this type of univariate localization function could be not really be expected to perform well.
Minor comments:
 The references are a bit strange – there are multiple web links for a lot of the papers; the https://doi.org/xxx link will be sufficient in most cases.
 Further minor comments and technical corrections are marked in the attached pdf.
 AC3: 'Reply on RC3', Zofia Stanley, 10 Jun 2021
Zofia Stanley et al.
Model code and software
Multivariate Localization Functions Zofia Stanley and Ian Grooms https://doi.org/10.5281/zenodo.4574612
Zofia Stanley et al.
Viewed
HTML  XML  Total  BibTeX  EndNote  

555  71  11  637  4  0 
 HTML: 555
 PDF: 71
 XML: 11
 Total: 637
 BibTeX: 4
 EndNote: 0
Viewed (geographical distribution)
Country  #  Views  % 

Total:  0 
HTML:  0 
PDF:  0 
XML:  0 
 1