Quantifying variability in Lagrangian particle dispersal in ocean ensemble simulations: an information  theory approach

Pierard, Claudio M.; Rühs, Siren; Gómez-Navarro, Laura; Denes, Michael Charles; Meirer, Florian; Penduff, Thierry; van Sebille, Erik

doi:https://doi.org/10.5194/npg-32-411-2025

Articles | Volume 32, issue 4

https://doi.org/10.5194/npg-32-411-2025

© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.

https://doi.org/10.5194/npg-32-411-2025

© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.

Articles | Volume 32, issue 4

Research article

|

20 Oct 2025

Research article |

| 20 Oct 2025

Quantifying variability in Lagrangian particle dispersal in ocean ensemble simulations: an information theory approach

Claudio M. Pierard, Siren Rühs, Laura Gómez-Navarro, Michael Charles Denes, Florian Meirer, Thierry Penduff, and Erik van Sebille

Download

Final revised paper (published on 20 Oct 2025)
Preprint (discussion started on 20 Dec 2024)

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2024-3847', Anonymous Referee #1, 25 Jan 2025

This article works to obtain the variability in particle dispersion based on Eulerian velocity data, by generating variability in single (one initial condition) trajectories by choosing ensembles whose initial conditions are (i) close to the initial condition, and (ii) at same location as the initial condition, but are at nearby times. Information-theoretical measures are used to quantify the variability of these simulations in the Gulf Stream region.
Being able to gain insights into variability and uncertainty of simulations based on Eulerian velocity data is an important issue. To my mind, the crucial thing here would be in the ability to parameterize the input uncertainties (in the measurements of the Eulerian velocity, in using various forms of diffusivity in the advection process, in interpolation methods used to estimate data at subgrid scales where no data is available, etc), as well as give careful assessments and/or sensitivity analyses of the other variables (time-of-flow, radii used for initial seeding, release date, etc) in the process. Any quantifier of Lagrangian dispersal uncertainty obtained will of course be strongly dependent on the quantifiers of, and the perspectives on modeling the impact of, these input uncertainties. My main comments below are related to this main theme, on which any conclusions obtained from this study are crucially dependent.
1. The authors use spatial radii ranging from 0.1 to 2 degrees (9 to 180 km) for their initial clouds of particles in the spatial uncertainty situation. They quote that computed average decorrelation lengthscale of 0.41 degrees. It's not clear to me how long the particles were advected -- one would of course expect nearby fluid parcels to spread-apart more and more as time progresses (reflected in Figure 4A, for example, where the entropy is used rather than decorrelation). To my mind, these decorrelation numbers are therefore not "useful." What is the connection to this time-of-flow, and how does this decorrelation change in relation to that? Is the fact that 0.41 degrees comparable in size to the 0.25 degree of resolution in the NATL025-CJMCYC3 model, i..e., that the Eulerian velocity is not resolved below this lengthscale, relevant? What is the physical motivation for using distances of 9-180 km to seed particles in assessing the dispersal over a certain time for a single initial condition? Not highlighting (I don't know whether it's buried somewhere and I couldn't find it) the time-of-flow is a serious limitation, because this will have a profound effect on any results obtained.
2. It seems that all calculations were done with 2 January as the release date. Given the unsteadiness (time-dependence) of the Eulerian velocity field, one will of course get different results if using a different release date. From an oceanographer's viewpoint, why is 2 January so important? Alternatively, there needs to be convincing results to show that "similar" results are obtained for different choices of release dates. Also, one needs some understanding of the sensitivity of the results of the time-of-flow chosen.
3. For the temporally-varying release, the authors have used time windows of "4, 12 and 20 weeks, all starting from 2 January 2010." Once again, the time-of-flow is not clear, and it is moreover not clear whether the time-of-flow for each of these releases is the same. For example, if the time-of-flow was 2 weeks uniformly across all of these simulations, the "4 week time window" would, in my understanding correspond to having some particles released on 2 January and travelling until 16 January, but then also particles released on 30 January and travelling until 13 February. And presumably particles on intermediate days. All of this has to be made absolutely clear to enable any assessment of the results obtained. Moreover, as above, there needs to be some rationale for selecting these time windows.
4. If my above interpretation of the particle release is correct, I think there are some issues which interfere with the interpretation of this process giving an indicator of Lagrangian dispersal. Notice that in the example the particles travelling from 2 January to 16 January are driven by velocity field data which is *completely* different from the particles released on 30 January and travelling to 13 February. From the dynamical systems perspective, this means that one is comparing things which are driven by completely different dynamical systems -- so what does the usage of these different final locations in a dispersal assessment actually mean? On the other hand, if the above interpretation is wrong, and there is some other way of thinking about this, what would the interpretation of that be? For example, if all particles are flowed till 13 February despite being released on different days, then again the dynamical systems are different, as is the time-of-flow (which strongly impacts correlation and any measure of dispersal). This sort of issue is relevant in the later calculations in the article as well, for example when computing time series of histograms described in Section 2.4---exactly how "time" is interpreted is crucial in an unsteady flow such as this.
5. The authors say "to ensure particles released on the same day followed different trajectories, we added small random perturbations to their release locations using uniform noise with an amplitude of 0.1 degrees." Does this mean that one chooses uniformly from a ball of amplitude 0.1 degrees centered at the initial location? What this actually means is that both temporal and spatial variability in the initial condition is used here, and not just temporal variability, does it not?
6. Using temporal variability but from a set location is the concept of "streaklines" in fluid mechanics. This has seen importance in assessing transport due to Lagrangian motion principally when a streakline can sensibly be used as some sort of "barrier" between coherent structures, notably when the position from which the streakline emanates is fixed (on a boundary), for example see Haller (J Fluid Mech 2004), Zhang (SIAM Review, 2013), Karrasch (SIAM J Applied Dyn Sys, 2016), Balasuriya (SIAM J Applied Dyn Sys, 2017). For genuinely unsteady flows, since the velocity field around any fixed Eulerian location changes with time, the dispersal from a streakline approach (as the authors do here in their temporal release strategy) will provide a curve (a tube in this case since a small cloud is released at each time) at any given final instance in time -- but exactly how to interpret this tube from a Lagrangian dispersal perspective is unclear. It will consist of particles which have flowed for different times, and (for example) particles in a particular cross-section of the tube will not necessarily have flowed for the same time. If instead the interpretation I have in point 3 above is used, what this means is that a particular subsample of the points in this "streaktube" will be obtained for each day of release -- and how one uses the collection of these subsamples to quantify Lagrangian dispersal appears ambiguous.
7. The authors do not seem to have used any randomness in the Lagrangian advection process (only randomness in initializing) which to my mind does not take into account fundamental contributors to Lagrangian dispersal in the ocean: effects of eddy diffusivity and uncertainties in the driving velocity fields. Eddy diffusivity modeling is a vast and important area (Berner et al, Bull Amer Meteorol Soc, 2017), for which many different models exist (e.g., Griffa, in Stochastic Modeling in Physical Oceanography, Birkhauser, 1996). Here, though, dispersal seems to happen though the taking of nearby initial conditions both spatially and temporally. This seems artificial when there is a "natural" way of including the primary physical issues via advecting a stochastic differential equation model with small noise, or using the alternative representation via the Fokker-Planck equation (e.g., Chandrasekhar, Rev Modern Phys, 1943) which explicitly governs a probability density of particles, or using more sophisticated eddy diffusivity models. The results from the Fokker-Planck, for example, would be an explicit quantifier of Lagrangian dispersal. There is a substantial literature on these methods, also in usage in oceanic data. Of course, running stochastic simulations of this nature, or attempting to solve the Fokker-Planck may incur substantial computational costs, but these would seem more compelling approaches to Lagrangian dispersal.
8. There are emerging tools for avoiding stochastic simulation and explicitly obtaining some quantifiers which are related to dispersal, in particular the idea of stochastic sensitivity (Balasuriya, SIAM Review, 2020) which has been shown to also be robust when applying to oceanographic data (Badza, Physica D, 2023). The authors' approach here of seeking dispersal measures for each initial condition (as opposed to a large ensemble) at first glance appears to be similar to finding the stochastic sensitivity (a scaled variance of the final distribution around the deterministic final location, when the Lagrangian evolution is subject on ongoing model noise) at each member (initial) location. Another approach along these lines which appears relevant is that of Branicki and Uda (SIAM J Appl Dyn Sys, 2023).
9. The authors use several quantifiers for dispersal based on their Lagrangian simulations: mixture probability distributions, connectivity, entropy, and KL-divergence. Computing each of these requires the final Lagrangian distributions. So, interpreting any of these depends strongly on how those Lagrangian distributions were obtained in a physically reasonable fashion (see my earlier points on this). Because of this, it is hard for me to interpret any of these information-theoretic results.
10. It appears that a major point that is being made is that the full-ensemble variability is being approximated by single-member simulations. This is stated in many places, but I am a little uncertain as to whether my interpretation of this statement is correct. My understanding is that "single-member" means one particular initial condition is chosen. By choosing "50 members" (line 158), Lagrangian simulations associated with 50 different initial conditions are chosen. Then the probability distributions associated with each of these 50 are combined in a mixture model to get the "full" distribution. Is this understanding correct? If so, it would appear that the "full ensemble" can be thought of as the 50 simulations collectively, and so this is collectively the "full ensemble"? If this is so, the claim that the "full-ensemble" is obtained from "single-member" simulations, presumably demonstrating some advantage, isn't too different from simply saying that one is choosing a full-ensemble comprising 50 members, presumably chosen to cover a region of interest at the initial time sufficiently well? If this is not true, things have not been expressed clearly. Basically, I think some clarification on this claim is necessary, explaining exactly what is meant and how it is achieved (and exactly what a "single-member simulation" means).
11. And finally to return to my preface to the numbered points: to interpret any of the results on dispersal, one needs to be comfortable that the strategies for computationally determining dispersal here have something to do with the physical issues which lead to dispersal. Many of the points above are related to this, asking for clarification as to why the actions used in this article to generate dispersed trajectories are meaningful, whether they can be parameterized in terms of something physical, effects of time, why diffusivity/stochasticity in the evolution is ignored, why other techniques which explicitly capture dispersion are not used, etc.
Based on my comments above, I feel that a major revision would be necessary for this article to be acceptable for publication in Nonlinear Processes in Geophysics.

Citation: https://doi.org/10.5194/egusphere-2024-3847-RC1
- AC1: 'Reply on RC1', Erik van Sebille, 03 Jun 2025
  
  We thank the reviewer for recognizing the important issue underlying our work. We have carefully addressed all the comments in the attached pdf.
  
  Citation: https://doi.org/10.5194/egusphere-2024-3847-AC1
RC2:
'Comment on egusphere-2024-3847', Anonymous Referee #2, 01 Mar 2025

The manuscript concerns ensembles of trajectories of tracers generated by ocean models. The authors wish to generate the "variability" of multiple ensembles by manipulating a single trajectory. The manipulations are primarily perturbing an initial condition in space, or by starting the trajectory at a different time. In order for such a study to be useful, pinning down exactly what "variability" means is crucial. The word "variability" is used extensively in the first several pages of the manuscript without stating what it actually is; I think perhaps on page 7, in relation to "connectivity", the reader begins to see what might be meant. As far as I could tell, according to this manuscript, variability means either "connectivity", or various types of entropy of coarse-grained future distributions of trajectories. A case is not clearly made for why these quantities are useful for ocean dynamicists or oceanographers. That is, why are these quantities the gold standard by which oceanographers should assess "sameness" of (collections of) trajectories.
The manuscript does not mention models where subgrid-scale dynamics is simulated e.g. stochastically. This would appear to be a very relevant set of comparators. It is also well known that trajectories are influenced by the resolution of the model grid, and that very different dynamics can arise from the same model with different resolutions. This aspect is also not addressed; as far as I understand, only a single 1/4 degree model is used.
Line 90: "The first strategy varies the release locations". Isn't this exactly part of what one does with ensemble generation? What is the difference?
Section 2.2: I believe it is a poor choice to put the first results in an appendix. In fact, the manuscript reads as though it was recently chopped up and rearranged because it is impossible to read from start to finish via the Appendices.
The Appendices contain definitions and details that the reader has not yet come across when reading from start to finish. I would strongly recommend removing the appendices and putting the material in the body of the paper to help with the narrative flow.
Section 2.4: I could not understand how the probability distributions were being formed. The description is wordy, vague, and a bit sloppy. It needs precision and some formulas wouldn't hurt. Is a hexagon a bin? I could not find it stated.
Section 2.5: I could not understand what a mixture probability distribution is. This seems to be a crucial object in the manuscript, but the description was brief and ambiguous. Again, some formulas may help. There is a discussion about the optimal number of particles. In what sense optimal? Again the reader is referred to an Appendix that is too brief and does not provide any insight.
Section 2.6: Connectivity is not defined, and it is not explained in the Appendix. What is it?
Section 2.7: Similar to section 2.4, the section is written verbosely and somewhat sloppily, to the extent that I could not understand what the various definitions were.
Line 202: "ensemble of bins" what does this mean?
Line 203: "t is the particle age of the distribution". This may make sense if the distribution was unambiguously defined at some point.
Line 210: The authors write "P_A(X) = (1/2, 1/4, 1/8, 1/8)". As far as I understood from e.g. line 201, P(x_i) should be the probability of event x_i occuring, i.e. a number. Therefore since X=(x_1,x_2,x_3,x_4) -- see line 211 -- P_A(X) should equal 1, not a string of probabilities. This is just one example of the vague writing. If the authors really want P_A(X) to be a string of probabilities, that is fine, define some suitable object, and ensure that the writing is clear and consistent.
In summary, on the basis of both the writing and the scientific impact, I do not recommend publication.
The manuscript could be improved by going back to the drawing board and asking what exactly are the properties by which it is *most meaningful* to oceanographers to compare trajectories or ensembles of trajectories. Strong justifications and illustrations would need to be provided. Comparing with ocean models of different types, including stochastic components, and across different grid resolutions would add to the robustness and generalization of the subsequent results. A linear narrative and a much more precise presentation would also be required.

Citation: https://doi.org/10.5194/egusphere-2024-3847-RC2
- AC2: 'Reply on RC2', Erik van Sebille, 03 Jun 2025
  
  We thank the reviewer for their careful comments. We have now addressed all comments in the attached pdf
  
  Citation: https://doi.org/10.5194/egusphere-2024-3847-AC2

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload

AR by Erik van Sebille on behalf of the Authors (03 Jun 2025) Author's response Author's tracked changes Manuscript

ED: Referee Nomination & Report Request started (03 Jun 2025) by Irina I. Rypina

RR by Anonymous Referee #1 (23 Jun 2025)

RR by Anonymous Referee #3 (11 Jul 2025)

Suggestions for revision or reasons for rejection

This study uses numerical simulations to compare Lagrangian statistics in two configurations: (i) an ensemble of simulations in different realizations of the flow, and (ii) simulations in the same flow realization, but with different release times and locations. In the first case, the flow realizations can be assumed independent. In the second, the flow remains correlated in time and/or space across the different releases. It is my understanding that the central question is how much these correlations influence the overall statistics of Lagrangian trajectories. The second configuration is particularly relevant to practical applications, since in the real ocean, drifter releases may occur at different times and locations but within the same flow realization. This is an interesting and important topic, and the study provides a thorough and valuable analysis.

I have a few comments and requests for clarification listed below, but overall, I recommend publication following revision.

l.93: Please briefly clarify how this perturbation is applied.

ll.135–136: The model has an eddy-permitting resolution of ¼°, and the shortest meaningfully simulated spatial scale is most likely longer than 100 km, which implies that submesoscale motions are not represented. Please clarify why the release radius was made as small as 9 km, and what is meant by “submesoscale features” in this context. It is not surprising that particles released within 9 km and even 90 km radii exhibit only modest dispersion. I suggest the authors include a discussion of the spatial scales resolved by the model and how they relate to the choice of small release radii.

l.160: This comment echoes my earlier concern. At this resolution, the shortest resolved spatial scale exceeds 100 km, so the meaningfulness of the estimates at smaller spatial scales is unclear.

l.199: The choice of Kh and the purpose of this experiment require further justification. While such a low diffusivity may indeed be used in simulations at 25 km resolution, here it appears the diffusion is intended to mimic ensemble variability due to resolved mesoscale motions, rather than unresolved subgrid-scale processes. As such, the very limited spread of trajectories in Fig. 1d is not surprising. A much larger value, such as 1,000 m²/s or more—as used in coarse-resolution simulations—would likely be more appropriate. I would recommend a simulation with high diffusivity.

l.235: Is “mixture distribution” the same as the probability distribution referenced in Equation (5)? This is somewhat unclear. Since the mixture simulation is intended to represent the full ensemble, while this section discusses a single-member simulation, clarification would be helpful.

l.340: This sentence needs clarification. While I agree that choosing a release radius smaller than the model’s spatial resolution effectively results in a point release, it is unclear what is meant by “full ensemble variability” in this context.

l.460: It would be helpful to remind the reader how to interpret the results shown in Fig. 7, and to explain why low values of relative entropy indicate a good approximation of the reference case.

l.555: While I agree with the conclusion stated here, the purpose of this experiment remains unclear. If the goal is to mimic mesoscale-induced ensemble spread, it would make more sense to use a much larger diffusivity.

l.574: Please remove the extra “the.”

Hide

ED: Reconsider after major revisions (further review by editor and referees) (15 Jul 2025) by Irina I. Rypina

AR by Erik van Sebille on behalf of the Authors (09 Sep 2025) Author's response Author's tracked changes Manuscript

ED: Referee Nomination & Report Request started (09 Sep 2025) by Irina I. Rypina

RR by Anonymous Referee #3 (30 Sep 2025)

ED: Publish subject to technical corrections (30 Sep 2025) by Irina I. Rypina

AR by Erik van Sebille on behalf of the Authors (03 Oct 2025) Author's response Manuscript

Short summary

Particle-tracking simulations compute how ocean currents transport material. However, initializing these simulations is often ad hoc. Here, we explore how two different strategies (releasing particles over space or over time) compare. Specifically, we compare the variability in particle trajectories to the variability of particles computed in a 50-member ensemble simulation. We find that releasing the particles over 20 weeks gives variability that is most like that in the ensemble.