A method for objectively extracting the displacement signals associated with coherent eddies from Lagrangian trajectories is presented, refined, and applied to a large dataset of 3770 surface drifters from the Gulf of Mexico. The method, wavelet ridge analysis, is a general method for the analysis of modulated oscillations, here modified to be more suitable to the eddy-detection problem. A means for formally assessing statistical significance is introduced, addressing the issue of false positives arising by chance from an unstructured turbulent background and opening the door to confident application of the method to very large datasets. Significance is measured through a frequency-dependent comparison with a stochastic dataset having statistical and spectral properties that match the original, but lacking organized oscillations due to eddies or waves. The application to the Gulf of Mexico reveals major asymmetries between cyclones and anticyclones, with anticyclones dominating at radii larger than about 50 km, but an unexpectedly rich population of highly nonlinear cyclones dominating at smaller radii. Both the method and the Gulf of Mexico eddy dataset are made freely available to the community for noncommercial use in future research.

Trajectories from freely drifting, or Lagrangian, instruments are one of the major windows into observing the ocean circulation. A perennial theme in oceanography is the study of long-lived vortex structures, also known as coherent eddies, and their role in influencing the large-scale circulation. On account of these two factors, an important data analysis problem is to be able to accurately, objectively,

The term “objective” is used here in its conventional sense of not being shaped by personal opinion, that is, non-subjective. This is not to be confused with its more technical definition, used by, e.g.,

Obtaining a satisfactory solution to this problem that could scale to very large datasets would enable a rigorous eddy census of the entire surface drifter dataset from NOAA's Global Drifter Program

The problem of identifying and estimating the properties of coherent eddies in Lagrangian trajectories should be distinguished from the problem of describing the aggregate forms of trajectories due to the eddies they contain. To help clarify this, we introduce the term “eddy signal” to mean the displacement of a particle about an eddy's center. We would then see a trajectory as a superposition of different types of signals: e.g., an eddy signal, a near-inertial signal, and a mean flow. Whereas methods such as that of

Identifying eddies in trajectories is sometimes equated with finding trajectories that execute loops, and indeed the term “looper” is sometimes used to mean “a trajectory containing an eddy”. However, one must be very cautious about forming this equivalence as there is not a one-to-one relationship between trajectory loops and particle displacements due to eddy currents. Simply changing the value of an advecting flow will alter the appearance of, or even eliminate, trajectory loops for a given eddy signal. Similarly, a trajectory can form a loop for many reasons that do not involve a coherent eddy. For these reasons, eddy detections and property estimates based solely on the visual appearance of trajectories should be considered only as rough approximations.
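The dependence of loops on the advecting flow can be illustrated with a minimal sketch. It assumes a purely circular eddy signal superposed on a uniform flow, and it uses the reversal of the along-flow velocity as a simple proxy for the presence of loops. The function name and all numerical values are hypothetical and chosen only for illustration.

```python
import math

def moves_backward(advection_speed, radius, omega, n_samples=1000):
    """Return True if the along-flow velocity ever reverses sign.

    Assumed position model (illustrative only):
        x(t) = U*t + R*cos(omega*t),   y(t) = R*sin(omega*t)
    so that dx/dt = U - R*omega*sin(omega*t).
    """
    period = 2.0 * math.pi / omega
    for k in range(n_samples):
        t = period * k / n_samples
        dxdt = advection_speed - radius * omega * math.sin(omega * t)
        if dxdt < 0.0:
            return True
    return False

radius = 20e3                            # 20 km orbit radius (hypothetical)
omega = 2.0 * math.pi / (5 * 86400)      # one orbit every 5 days (hypothetical)
orbital_speed = radius * omega           # about 0.29 m/s

# Same eddy signal, two different advecting flows.
slow = moves_backward(0.5 * orbital_speed, radius, omega)
fast = moves_backward(2.0 * orbital_speed, radius, omega)
print(slow, fast)
```

In this idealized setting the eddy signal is identical in both cases, yet the trajectory doubles back only when the advection speed is smaller than the orbital speed.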

Various methods have been proposed over the years for identifying and extracting eddy signals from Lagrangian trajectories

A major step in the eddy extraction problem was taken by

An application of the wavelet ridge analysis method to trajectories from a numerical model of an unstable jet on a beta plane

The wavelet ridge analysis method allows one to readily analyze datasets with dozens or perhaps hundreds of trajectories. However, there is a major challenge which prevents it – or any other eddy-extraction method – from being immediately applicable to very large datasets consisting of thousands or tens of thousands of trajectories. This is the problem of false-positive features arising from the interaction of the detection method with the turbulent background flow. For small to medium-size datasets such events may be readily identified with a visual scan, but that subjective operation becomes unwieldy for larger datasets.

In previous work

The stochastic background flow can be the source of spurious features that masquerade as coherent eddies, a phenomenon that can be understood as follows. In the one-dimensional case, a discrete random walk is intuitively described as a drunk staggering between lampposts. In the two-dimensional case, the drunk has a grid of lampposts available for their staggering. From time to time the drunk will, by chance, happen to turn in a circle or oscillate back and forth between two lampposts. This illustrates why, in applying the wavelet ridge analysis to time series of stochastic processes analogous to the random walk, oscillatory events are occasionally detected. One would not wish to confuse random features arising from the turbulent background with the organized oscillations due to coherent eddies.
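The lamppost analogy can be made concrete with a small simulation. The sketch below, which is an illustration rather than part of the analysis method, counts how often a two-dimensional lattice random walk happens to circle a single grid cell in four consecutive steps. The function name, the seed, and the walk length are arbitrary choices.

```python
import random

def count_unit_loops(n_steps, seed=0):
    """Count 4-step windows in which a lattice random walk circles one cell.

    Headings are coded 0, 1, 2, 3 for east, north, west, south, so a
    difference of +1 (mod 4) is a left turn and +3 (mod 4) is a right turn.
    """
    rng = random.Random(seed)
    headings = [rng.randrange(4) for _ in range(n_steps)]
    loops = 0
    for i in range(n_steps - 3):
        window = headings[i:i + 4]
        turns = [(window[j + 1] - window[j]) % 4 for j in range(3)]
        # Three identical quarter turns, all left or all right, close a loop.
        if turns == [1, 1, 1] or turns == [3, 3, 3]:
            loops += 1
    return loops

n_loops = count_unit_loops(10_000)
print(n_loops)
```

Each window closes a loop with probability 2/64 = 1/32, so a walk of ten thousand steps is expected to contain roughly three hundred such chance "rotations", none of which reflects any organized structure.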

This paper has three objectives. Firstly, the ridge analysis method of

Surface drifter trajectories in the Gulf of Mexico from the consolidated dataset of

Ellipses from all statistically significant eddy signals in the dataset shown in Fig.

As motivation for the investment required in learning this method, the results of the application to the Gulf of Mexico will be briefly described at the outset. The dataset to be analyzed in this work is a set of 3770 surface drifter trajectories, compiled from a variety of experiments, processed, and quality controlled as described in

The distributions of cyclonic and anticyclonic events are completely different. The dense populations of large anticyclonic eddies filling the central Gulf of Mexico are the well-known Loop Current Eddies

The structure of the paper is as follows. A mathematical model for the motion of a particle trapped in an eddy is presented in Sect.

This section describes a mathematical model for the displacement signal of a particle trapped in a coherent eddy, building on the formulations in

The displacement signal of an instrument or particle advected by an eddy, within a moving frame of reference centered on the eddy's center – what we refer to as an “eddy signal” for short – will be modeled as the trajectory traced out by a particle orbiting a time-varying ellipse. We will use the complex-valued notation

A schematic of an ellipse is presented in Fig.

The lead author thanks Shane Elipot for pointing out an error in earlier published versions of Fig.

A schematic of a particle orbiting an ellipse, as described by Eq. (

The type of signal described by Eq. (

The ellipse generation equation of Eq. (

While oceanic vortices are often nearly circular, there are several reasons to permit elliptical motion in our conceptual model. Material ellipses arise naturally in considering a second-order Taylor expansion of a two-dimensional flow

The advection of a particle by a coherent eddy is not, however, the only type of physical phenomenon giving rise to a modulated elliptical signal. This model also matches the displacement signal expected for waves, most notably inertial oscillations. These are expected to have an anticyclonic circular polarization, i.e.,

It proves convenient to replace the ellipse semi-axes

The circularity is related to another measure of ellipse shape, the
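As a hedged illustration of how alternative shape measures relate to the semi-axes, the sketch below assumes that the signed circularity is defined as 2ab/(a² + b²) and that the companion shape measure is the linearity (a² − b²)/(a² + b²). These definitions are a common convention in the related literature and are assumptions here; the symbols and function name are hypothetical and are not taken from the equations of this paper.

```python
import math

def ellipse_shape(a, b):
    """Return (circularity, linearity) for semi-axes a and b, with a >= |b|.

    Assumed definitions (a convention, not quoted from the text):
        circularity = 2ab / (a^2 + b^2), signed by the sense of rotation
        linearity   = (a^2 - b^2) / (a^2 + b^2)
    """
    denom = a ** 2 + b ** 2
    circularity = 2.0 * a * b / denom
    linearity = (a ** 2 - b ** 2) / denom
    return circularity, linearity

xi_ccw, lam = ellipse_shape(3.0, 1.0)    # counterclockwise orbit
xi_cw, _ = ellipse_shape(3.0, -1.0)      # same shape, opposite rotation
print(xi_ccw, xi_cw, lam, math.hypot(xi_ccw, lam))
```

Under these assumed definitions the two shape measures satisfy circularity² + linearity² = 1, so either one determines the other up to the sign that encodes the direction of rotation.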

The ellipse generation equation, Eq. (

The ellipse generation equation, Eq. (

A synthetically constructed modulated elliptical signal

It is useful at this point to show an example. The signal in Fig.

To develop this method, we first turn to the case of a univariate signal. The modulated elliptical signal is a generalization to two dimensions of an amplitude-modulated and frequency-modulated univariate oscillation:

Given

In the analytic signal method, a particular amplitude–phase pair, known as the

The canonical amplitude is defined uniquely in terms of the analytic signal by

To better understand the analytic signal, we first note that the Hilbert transform has a simple action in the frequency domain. Let

Recall that a cosine and sine have the Fourier representations of

A compelling argument in favor of the analytic signal method is due to

Note that there are actually two different sets of amplitude–phase parameters: those used to

A condition for when the generating amplitude and phase are identical to the canonical amplitude and phase was found in a remarkable paper by

The analytic signal method of assigning a time-varying amplitude and phase to a univariate signal

To infer the ellipse parameters from

Using for convenience

Equations (

As in the case of a univariate signal, there is a distinction between the generating and the inferred, or canonical, ellipse parameters, both of which lead to the same time-varying ellipse. These sets of parameters are identical for a sinusoidally orbited fixed ellipse and are expected to become increasingly different as the modulation strength increases. Further examination of the conditions for exact recovery of the generating ellipse parameters is outside the scope of this paper.
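The statement that the two sets of parameters coincide for a sinusoidally orbited fixed ellipse can be checked numerically. The following sketch assumes the standard rotary decomposition, in which the analytic versions of the two displacement components are combined as z₊ = (x₊ + i y₊)/2 and z₋ = (x₊ − i y₊)/2, with a = |z₊| + |z₋| and b = |z₊| − |z₋|. These formulas and all names are assumptions made for illustration, and the analytic signal is computed with a direct discrete Fourier transform to keep the example self-contained.

```python
import cmath
import math

def analytic_signal(x):
    """Analytic signal x + i*H[x] of a real sequence via a direct DFT."""
    n = len(x)
    spectrum = [sum(x[m] * cmath.exp(-2j * math.pi * k * m / n)
                    for m in range(n)) for k in range(n)]
    # Double the positive frequencies, zero the negative ones,
    # and leave the mean and Nyquist terms unchanged.
    for k in range(n):
        if 0 < k < n / 2:
            spectrum[k] *= 2.0
        elif k > n / 2:
            spectrum[k] = 0.0
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * m / n)
                for k in range(n)) / n for m in range(n)]

# A fixed ellipse orbited sinusoidally, four cycles over 64 samples.
n, cycles = 64, 4
a_true, b_true, theta_true = 3.0, 1.0, math.pi / 6
phi = [2.0 * math.pi * cycles * m / n for m in range(n)]
z = [cmath.exp(1j * theta_true)
     * (a_true * math.cos(p) + 1j * b_true * math.sin(p)) for p in phi]

x_plus = analytic_signal([v.real for v in z])
y_plus = analytic_signal([v.imag for v in z])
z_plus = [(xp + 1j * yp) / 2.0 for xp, yp in zip(x_plus, y_plus)]
z_minus = [(xp - 1j * yp) / 2.0 for xp, yp in zip(x_plus, y_plus)]

mid = n // 2  # any sample works for a fixed ellipse
a_est = abs(z_plus[mid]) + abs(z_minus[mid])
b_est = abs(z_plus[mid]) - abs(z_minus[mid])
theta_est = cmath.phase(z_plus[mid] * z_minus[mid].conjugate()) / 2.0
print(a_est, b_est, theta_est)
```

For this unmodulated case the inferred semi-axes and orientation match the generating values to within rounding error. With strong modulation the two sets of parameters would be expected to differ, as noted above.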

The generalization of the univariate instantaneous frequency, Eq. (

The bivariate instantaneous frequency can be written in terms of the ellipse parameters as

In the previous section, it was shown that given a trajectory consisting of only a single modulated elliptical signal, a unique assignment of time-varying ellipse parameters that could have generated it may be found by forming the associated analytic signal. Real-world eddies, however, do not occur in isolation, but rather they are superposed onto flows due to large-scale turbulence, other eddies, waves, and any self-propagation tendency of the eddy itself. To adapt the analytic signal method to handle realistic trajectories, we therefore need to incorporate a filtering step. This is accomplished using the continuous wavelet transform, as described next. We remind the reader that the notation list in Table

A Lagrangian instrument records a time-varying longitude

The displacement signal

The corresponding model for the latitude and longitude signals is

Each of the modulated oscillations

The conceptual model of Eq. (

The analytic signal method presented in Sect.

Panel

To illustrate this, and to motivate the approach developed in this section, we return to the synthetic signal shown in Fig.

Ellipses inferred from the composite signal in Fig.

Applying the analytic signal method of the previous section to a detrended version of the total displacement signal

The extraction of unobserved modulated elliptical signals from the observed position signal

Systematic examination of continuous-time wavelets in common use by

The frequency at which

The frequency-domain wavelets can be approximated in the vicinity of their peak frequency as a Gaussian:

With

A generalized Morse wavelet

The generalized Morse wavelet we will use in this paper,
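For readers who wish to experiment, the frequency-domain form of the generalized Morse wavelets can be sketched in a few lines. The sketch assumes the standard form, in which the wavelet is a normalizing constant times ω^β exp(−ω^γ) for positive frequencies and zero otherwise, with the constant chosen so that the peak value is 2. The parameter values below are illustrative placeholders and are not claimed to be the settings used in this paper.

```python
import math

def morse_peak_frequency(gamma, beta):
    """Frequency at which the frequency-domain wavelet is maximized."""
    return (beta / gamma) ** (1.0 / gamma)

def morse_wavelet_frequency_domain(omega, gamma, beta):
    """Generalized Morse wavelet in the frequency domain (assumed standard form)."""
    if omega <= 0.0:
        return 0.0  # the wavelet is analytic: no negative-frequency support
    amplitude = 2.0 * (math.e * gamma / beta) ** (beta / gamma)
    return amplitude * omega ** beta * math.exp(-omega ** gamma)

gamma, beta = 3.0, 2.0  # illustrative values only
peak = morse_peak_frequency(gamma, beta)
peak_value = morse_wavelet_frequency_domain(peak, gamma, beta)

# Cross-check the analytic peak location against a brute-force grid search.
grid = [0.001 * k for k in range(1, 4001)]
numerical_peak = max(
    grid, key=lambda w: morse_wavelet_frequency_domain(w, gamma, beta))
print(peak, peak_value, numerical_peak)
```

The grid search is included only as a sanity check that the closed-form peak frequency and the normalization are mutually consistent.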

Next we use many rescaled versions of the wavelet to filter a univariate signal

In the frequency domain, the wavelet transform can be expressed as

The complex conjugate that would normally appear on the wavelet's Fourier transform

With these definitions, the wavelet transform of a phase-shifted sinusoid

Turning now to a vector-valued signal

The first author thanks Georgi Sutyrin for asking the question, with respect to this method, “What is the principle?”.

For a vector-valued signal with only a single modulated oscillation present,

When applied to a real-world signal

Application of the ridge analysis method to surface drifter trajectories leads to another type of error, namely false positives arising from the stochastic background

If the multiplicity exceeds unity, we may define the time-varying position of the center of the

Finally, the background process on the Cartesian plane,

For notational convenience, we will henceforth drop the superscripts “

Experience shows that it is important to set a threshold for the minimum length of a ridge. The ridge length is measured in terms of the number of oscillations executed along the ridge, which is found by integrating the estimated instantaneous frequency

In the case of a univariate signal

A threshold on

A ridge length threshold is therefore employed in which we keep only ridges with lengths
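The ridge length computation can be sketched as follows, assuming the instantaneous frequency is available as uniformly sampled values along the ridge. The trapezoidal integration, the function name, and the threshold `min_cycles` are hypothetical choices for illustration; the threshold value is a placeholder rather than the value adopted in this paper.

```python
import math

def ridge_length_cycles(omega, dt):
    """Number of oscillations along a ridge.

    omega : instantaneous frequency samples in radians per unit time
    dt    : sampling interval in the same time unit
    The phase is accumulated with the trapezoidal rule and divided by 2*pi.
    """
    total_phase = sum(0.5 * (abs(omega[i]) + abs(omega[i + 1])) * dt
                      for i in range(len(omega) - 1))
    return total_phase / (2.0 * math.pi)

dt = 0.25                                  # 6-hourly sampling, in days
omega = [2.0 * math.pi / 5.0] * 81         # 5-day period sustained for 20 days
length = ridge_length_cycles(omega, dt)

min_cycles = 3.0                           # hypothetical threshold
keep = length >= min_cycles
print(length, keep)
```

Measuring length in cycles rather than in time places fast and slow oscillations on an equal footing, since each must complete the same number of orbits to be retained.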

As a simple example, the wavelet ridge analysis for the synthetic composite signal from Fig.

Multivariate wavelet ridge analysis applied to the synthetic composite signal shown in Fig.

Evaluating the wavelet transform along this ridge curve

It is clear from Figs.

Importantly, we have not needed to specify any properties of the oscillatory signal apart from the frequency range of the transform. As mentioned earlier,

Because here we are particularly interested in eddies, which tend to be destroyed if strained to a high degree of eccentricity, a modification is made to the ridge analysis. In its general form, the ridge analysis places no constraints on the polarization, that is, the degree of eccentricity of the signal. Numerical experiments with noise datasets show that when ridges emerge, they can be of any polarization, and this polarization tends to wander with time, sometimes even changing sign across

It therefore seems preferable to explicitly exclude sign transitions across

The wavelet ridge analysis is then performed twice. Ridges of

The net result of this modification is that ridges lacking a sign transition are unchanged, ridges containing sign transitions are broken into shorter segments, and fewer spurious, false positive events survive the ridge length threshold. A desirable feature of this approach is that it does not involve setting an ad hoc threshold on the degree of eccentricity, as any numerical value of the circularity
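The net effect described above can be mimicked with a short sketch that breaks a ridge wherever the circularity changes sign. This is an illustration of the outcome only. The method itself performs separate ridge analyses, whereas the sketch post-processes a single hypothetical ridge, and the function name and sample values are invented for the example.

```python
from itertools import groupby

def split_at_sign_changes(circularity):
    """Split a ridge into index ranges over which the circularity keeps one sign.

    Returns a list of (start, stop) pairs, with stop exclusive.
    """
    segments = []
    start = 0
    for _, run in groupby(circularity, key=lambda xi: xi >= 0.0):
        length = len(list(run))
        segments.append((start, start + length))
        start += length
    return segments

# Hypothetical circularity along one ridge, wandering through zero twice.
xi = [0.8, 0.6, 0.2, -0.1, -0.5, -0.7, -0.4, 0.1, 0.3]
segments = split_at_sign_changes(xi)
print(segments)

# A ridge with no sign transition is returned as a single segment.
unchanged = split_at_sign_changes([0.9, 0.8, 0.7])
print(unchanged)
```

Each resulting segment would then be subject to the ridge length threshold separately, which is why short-lived noise features with wandering polarization are less likely to survive.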

In this section the wavelet ridge analysis method is applied to the Gulf of Mexico drifter dataset presented in Fig.

The first decision to be made is what band of scales, or frequencies, the wavelet transform should be performed over. The instantaneous frequency

It is helpful to examine the physical interpretation of the nondimensional instantaneous frequency

Let the eddy have a maximum azimuthal velocity at radius

Another measure of the bulk nonlinearity of an eddy is its vorticity Rossby number, defined as

If a time-varying lower boundary is set for the ridge frequency band at

Inertial oscillations typically occur at a Fourier frequency of

Therefore, both to avoid inertial oscillations and because no eddies are physically expected, the ridge analysis should be truncated to exclude events below the time-varying curve
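To give a sense of the magnitudes involved, the sketch below evaluates the Coriolis frequency and the corresponding inertial period at a representative Gulf of Mexico latitude, together with the ratio of orbital frequency to Coriolis frequency for a particle circling at a given speed and radius. The latitude, speed, and radius are hypothetical values chosen for illustration, and the orbital frequency is taken to be simply the speed divided by the radius.

```python
import math

EARTH_ROTATION_RATE = 7.2921e-5  # rad/s

def coriolis_frequency(latitude_deg):
    """Coriolis frequency f = 2*Omega*sin(latitude), in rad/s."""
    return 2.0 * EARTH_ROTATION_RATE * math.sin(math.radians(latitude_deg))

def nondimensional_frequency(speed, radius, latitude_deg):
    """Orbital frequency (speed/radius) divided by the Coriolis frequency."""
    return (speed / radius) / coriolis_frequency(latitude_deg)

latitude = 25.0                                   # degrees north (illustrative)
f = coriolis_frequency(latitude)
inertial_period_hours = 2.0 * math.pi / f / 3600.0

# A particle orbiting at 0.3 m/s on a 10 km radius circle (hypothetical).
ratio = nondimensional_frequency(0.3, 10e3, latitude)
print(f, inertial_period_hours, ratio)
```

At this latitude the inertial period is a little over 28 hours, so any frequency boundary tied to the Coriolis frequency varies along a trajectory as the drifter changes latitude.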

An example of applying the one-sided ridge analysis to a drifter trajectory from the Bay of Campeche in the southwestern Gulf of Mexico is presented in Figs.

An example of a trajectory decomposed according to the unobserved components model of Eq. (

One-sided wavelet ridge analysis of the signal presented in Fig.

This trajectory is analyzed using the same

The nonstationarity and multiscale variability that are apparent by eye are seen explicitly in the wavelet transforms. Small-scale variability is frequently attributable to inertial oscillations, as well as to a brief high-frequency cyclonic event around year day

The contributions of the various ridges to the latitude variability are seen in Fig.

This complex trajectory is a good example of a situation in which the multiplicity

During the second half of the record, the sum of three oscillations – two eddy-like signals and an inertial signal – accounts for most of the variability apart from an overall northward drift. This indicates that the unobserved components model of Eq. (

Because the eddies we are interested in studying do not occur in isolation, but are embedded within a turbulent background flow, it is necessary to take into account that the background flow itself may occasionally, by chance, give rise to features in the drifter trajectories that appear as modulated oscillations. Such features would be detected by the ridge analysis and therefore constitute false positives.

To address this issue, we will compare the results of the ridge analysis to the results of a parallel analysis applied to a stochastic or “noise” dataset. The noise dataset will be created to match key properties of the observed dataset, but lacking the explicit signatures of any eddies. This will act as a null hypothesis and enable a level of statistical significance to be determined. It amounts to an idealized approximation to the background process

The noise dataset is constructed as follows. For each trajectory, we form an estimate of the spectrum of the complex-valued velocity

At each frequency, we define the spectrum of the complex-valued noise velocities, which will be denoted

The basic idea is that since eddies and other oscillatory components will tend to raise the spectral levels above that due to the background, taking the minimum tends to isolate velocities associated with the background displacement process

It is straightforward now to create a stochastic velocity time series
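One simple way to realize such a construction is sketched below. It is a sketch under stated assumptions and not the implementation used for this paper: the minimum is assumed to be taken between the two rotary sides of the spectrum at each frequency, the spectrum is estimated with a raw periodogram rather than a smoothed estimate, and the noise is generated with random phases. All function names and the synthetic input are hypothetical.

```python
import cmath
import math
import random

def dft(z, sign=-1):
    """Direct discrete Fourier transform; sign=+1 gives the unnormalized inverse."""
    n = len(z)
    return [sum(z[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
                for m in range(n)) for k in range(n)]

def noise_velocity(velocity, seed=0):
    """Random-phase surrogate whose spectrum is the two-sided minimum.

    Assumption: at each frequency the noise amplitude is the smaller of the
    amplitudes found at that frequency and at the frequency of opposite sign.
    """
    rng = random.Random(seed)
    n = len(velocity)
    coeffs = dft(velocity)
    noise_coeffs = [coeffs[0]]  # keep the temporal mean velocity unchanged
    for k in range(1, n):
        # Index n - k holds the frequency of opposite sign.
        amplitude = min(abs(coeffs[k]), abs(coeffs[n - k]))
        noise_coeffs.append(amplitude * cmath.exp(2j * math.pi * rng.random()))
    return [v / n for v in dft(noise_coeffs, sign=+1)]

# Synthetic velocity: a random background plus a circularly polarized oscillation.
rng = random.Random(1)
n = 128
velocity = [complex(rng.gauss(0, 1), rng.gauss(0, 1))
            + 5.0 * cmath.exp(2j * math.pi * 8 * m / n) for m in range(n)]
noise = noise_velocity(velocity)

power = [abs(c) ** 2 for c in dft(velocity)]
noise_power = [abs(c) ** 2 for c in dft(noise)]
print(power[8], noise_power[8])
```

In this example the oscillation occupies only one rotary side of the spectrum, so taking the minimum removes its spectral peak from the surrogate while leaving the background level and the mean velocity intact.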

Proceeding in this way for each trajectory, we obtain a dataset that is the same size as the original dataset. Trajectory by trajectory, the time series duration, initial location, temporal mean velocity, velocity variance, and approximate spectral form all match by construction. One realization of this dataset is shown in Fig.

Note that the match between the temporal mean velocities obtained by differencing the trajectories is approximate rather than exact, due to minor differences between differentiating trajectories and integrating velocities on the sphere in the numerical implementation used here.

As in Fig.

To the various ellipse quantities introduced in Sect.

The quantities

To compactly summarize the results of the ridge analysis, we will look at properties averaged along the duration of a ridge. Denote the time average of some quantity

The results of applying the ridge analysis, using the settings described above, to both the surface drifter dataset and the noise dataset are summarized in Fig.

Scatterplots of ridge-averaged quantities for the drifter data

A number of important features in the real-world data are apparent in the left column of Fig.

As in the lower row of Fig.

Another presentation of the ridge-averaged properties is that in Fig.

Inertial oscillations, as expected, are seen to be strongly circularly polarized in Fig.

The central columns of Figs.

A final issue that should be mentioned is the possibility of contamination by the tides. The semidiurnal tide is excluded by our chosen frequency range,

The plots from the previous section emphasize the importance of excluding false positives through an assessment of statistical significance. In doing so, it is desirable to avoid ad hoc cutoffs that involve the very properties we are most interested in studying, such as the radius, velocity, or Rossby number. If we consider what the essence of coherent vortices is, we can say that they are (i) long-lived features by definition and (ii) roughly circular, with the possibility of small degrees of eccentricity arising due to various dynamical processes such as ambient strain. This suggests that

A novel significance criterion is created as follows. Some to-be-determined quantity characterizing the ridges will be chosen as the basis for our significance test, denoted

The cumulative distribution function corresponding to

A significance criterion for wavelet ridges in the Gulf of Mexico surface drifter dataset. The black dots in panel

Comparing the estimated survival function of the data to that of the noise will allow us to assess whether an event's properties are extreme enough to warrant classifying it as statistically significant. We will choose

The density functions

The ratio of the estimated survival function of the noise to that of the data
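The comparison of survival functions can be sketched with empirical estimates. The example below is a simplified illustration: it ignores the frequency dependence of the comparison, uses invented sample values in place of the ridge parameter, and adopts the convention that the survival function counts values greater than or equal to the evaluation point. The function names are hypothetical.

```python
def survival(values, x):
    """Empirical survival function: fraction of values at or above x."""
    return sum(1 for v in values if v >= x) / len(values)

def survival_ratio(noise_values, data_values, x):
    """Ratio of the noise survival function to the data survival function at x."""
    data_fraction = survival(data_values, x)
    if data_fraction == 0.0:
        raise ValueError("x exceeds every data value; the ratio is undefined")
    return survival(noise_values, x) / data_fraction

# Invented values of some ridge parameter for the data and for the noise.
data_values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
noise_values = [1, 1, 2, 2, 3]

ratio_at_3 = survival_ratio(noise_values, data_values, 3)
ratio_at_5 = survival_ratio(noise_values, data_values, 5)
print(ratio_at_3, ratio_at_5)
```

A ratio well below one indicates that the noise reaches such large values of the parameter far less often than the data do, which is the sense in which an event's properties may be considered extreme relative to the null hypothesis.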

The significance parameter

In a previous version of this analysis, we used a significance criterion computed on the

In establishing a measure of statistical significance, we have refrained from using the language of statistical hypothesis testing, e.g., significance levels and

Excluding now those ridges having a density ratio below

Ridge-averaged properties of the eddy ridges are shown in the right column of Fig.

The ellipses corresponding to these eddy ridges are those shown in the earlier Fig.

Overall there is a striking difference in geographic distribution between cyclones and anticyclones. Cyclones are concentrated in the eastern, western, and southern Gulf of Mexico, generally excluding the shallow-shelf regions. The anticyclones are concentrated primarily over the Loop Current and due west of it, corresponding to the deepest part of the basin

The distribution of anticyclonic events exhibits many large features with

A cluster of cyclonic activity is seen in Fig.

Two areas of energetic cyclonic eddy activity with roughly 10 to 50 km radius ellipses – the intermediate size class in the right-hand column of Fig.

Cyclone vs. anticyclone asymmetries at the mesoscale are expected on theoretical grounds

Finally, the small-radius (

This paper has presented a method for extracting the displacement signals associated with coherent eddies, and for estimating the properties of the features that generated those signals, from large Lagrangian datasets. The method is rooted in ideas taken from the signal analysis literature, such as the analytic signal, modulated oscillations, and wavelet analysis. As these ideas are not yet all widely known in oceanography, they were introduced in some detail and discussed in the context of the eddy-detection problem.

A modification of an existing analysis method, multivariate wavelet ridge analysis, to prohibit sign transitions between cyclonic and anticyclonic polarizations renders it more suitable for this application. The main innovation, however, is a means of determining statistical significance. This is done by constructing a null hypothesis in the form of a noise dataset that matches the basic statistical properties of the real-world data, yet lacks organized oscillatory features arising from spectral peaks. A significance criterion is proposed based on a combined parameter that reflects both the number of oscillations executed and the degree of circular polarization. The distribution of this parameter in the data is compared with that in the noise as a function of frequency, in order to identify parameter regimes in which the noise is unlikely to have generated the observed features.

The statistically significant features emerging from an application to a large surface drifter dataset from the Gulf of Mexico were briefly discussed. It is clear that there is much to be learned about the Gulf of Mexico eddy field through studying these eddy ellipses. Here, in order to maintain a focus on the analysis method, the attention paid to physical results has intentionally been kept to a minimum. These results will be thoroughly investigated in a follow-up paper.

The incorporation of a criterion for determining statistical significance makes possible the application of the eddy-extraction method to very large datasets. This has great potential not only for studying eddies in real-world data with a new level of detail, but also for providing a novel means of comparing eddy characteristics from numerical models with those from observations. In particular, the eddy census for the Gulf of Mexico indicates that large-scale studies of submesoscale cyclonic vortices from in situ data are now possible for the first time. The distribution of a self-contained analysis routine, discussed below, is intended to facilitate the use of this method by other investigators.

It is intended that this paper provides the groundwork for a multi-paper effort to solve the eddy extraction and property estimation problem as completely and rigorously as possible. Future efforts will include (i) examining the behavior of the method with regard to exact analytic vortex solutions such as those reviewed in

The synthetic signal shown in Fig.

The time-varying period corresponding to this signal, shown in Fig.

The composite signal shown in Fig.

Some important symbols used in this paper, together with variables from the GOMED dataset. The quantities in the second and third sections of the table are all the variables appearing in GOMED.

All numerical code required for the analysis carried out in this paper is distributed to the community as a part of jLab, the lead author's open-source data analysis toolbox for MATLAB, available at GitHub (

The centerpiece needed for this work is a new function,

Whereas for simplicity the handling of spherical geometry has been discussed herein using a small angle expansion about a fixed point, the numerical implementation in

The dataset utilized in this paper is the consolidated surface drifter dataset created by

The results of applying the multivariate wavelet ridge analysis to GulfDriftersAll are distributed to the community as the Gulf of Mexico Eddy Dataset (GOMED) at

JML was responsible for the theory, coding, analysis, and writing. PPB was responsible for obtaining funding, for the planning, deployment, and upstream processing of several of the datasets utilized herein, for managing the sharing of the GOMED dataset, and for finding the legal pathway to make this dataset publicly available. She also provided regional expertise and guidance throughout this project.

The authors declare that they have no conflict of interest.

This is a contribution of the Gulf of Mexico Research Consortium (CIGoM). We acknowledge the specific request of Petróleos Mexicanos (Pemex) to the Hydrocarbon Fund to address the environmental effects of oil spills in the Gulf of Mexico that made this project possible. We are grateful to Paula García, Argelia Ronquillo, Favio Medrano, and the support from Dirección de Telemática and Dirección de Impulso a la Innovación y el Desarrollo at CICESE, as well as Omar Monroy (Mink Global), for their help in the IT and legal aspects required for making GOMED available. Finally, we wish to thank the two anonymous reviewers for their constructive feedback.

This research has been partially funded by the Mexican National Council for Science and Technology – Mexican Ministry of Energy – Hydrocarbon Fund, project 201441. The work of Jonathan M. Lilly was supported in part by award no. 1658564 from the Physical Oceanography program of the United States National Science Foundation.

This paper was edited by Stefano Pierini and reviewed by two anonymous referees.