Network-based study of Lagrangian transport and mixing

Transport and mixing processes in fluid flows are crucially influenced by coherent structures and the characterization of these Lagrangian objects is a topic of intense current research. While established mathematical approaches such as variational or transfer operator based schemes require full knowledge of the flow field or at least high resolution trajectory data, this information may not be available in applications. Recently, different computational methods have been proposed to identify coherent behavior in flows directly from Lagrangian trajectory data : , ::: that ::: is, ::::::::: numerical :: or :::::::: measured ::::: times :::::: series :: of ::::::: particle 5 ::::::: positions :: in :: a :::: fluid :::: flow. In this context, spatio-temporal clustering algorithms have been proven to be very effective for the extraction of coherent sets from sparse and possibly incomplete trajectory data. Inspired by these recent approaches, we consider an unweighted, undirected network, where Lagrangian particle trajectories serve as network nodes. A link is established between two nodes if the respective trajectories come close to each other at least once in the course of time. Classical graph concepts are then employed to analyze the resulting network. In particular, local network measures such as the node degree : , 10 :: the ::::::: average :::::: degree :: of :::::::::: neighboring :::::: nodes, :::: and ::: the :::::::: clustering ::::::::: coefficient : serve as indicators of highly mixing regions, whereas spectral graph partitioning schemes allow us to extract coherent sets. The proposed methodology is very fast to run and we demonstrate its applicability in two geophysical flows the Bickley jet as well as the antarctic stratospheric polar vortex.


Introduction
The notion of coherence in time-dependent dynamical systems is used to describe mobile sets that do not freely mix with the surrounding regions in phase space.In particular, coherent behavior has a crucial impact on transport and mixing processes in fluid flows.The mathematical definition and numerical study of coherent flow structures has received considerable scientific interest for the last 2 decades.The proposed methods roughly fall into two different classes, geometric and probabilistic approaches; see Allshouse and Peacock (2015) for a discussion and comparison of different methods.Geometric concepts aim at defining the boundaries between coherent sets, i.e., codimension-1 material surfaces in the flow that can be characterized by variational criteria (see Haller, 2015, for a recent review).Central to these constructions is the Cauchy-Green strain tensor, which is derived from the derivative of the flow map.Thus, full knowledge of the flow field or at least high-resolution trajectory data is required for these methods to work successfully.This also applies to other geometric concepts such as shape coherence (Ma and Bollt, 2014).Probabilistic methods aim at defining sets that are minimally dispersive while moving with the flow.The main theoretical tools are transfer operators, i.e., linear Markov operators that describe the motion of probability densities under the action of the nonlinear, time-dependent flow.The different constructions in this family of approaches are reviewed in Froyland and Padberg-Gehle (2014), also highlighting the crucial role of diffusion in this setting.Recently, a dynamic Laplacian framework has been introduced by Froyland (2015), where explicit diffusion is no longer required in the analytical and computational framework.While for this approach fast and accurate algorithms have been developed in Froyland and Junge (2015), the classical transfer-operator setting requires the integration of many particle trajectories for the numerical approximation of the infinite-dimensional operator.Here again, full knowledge of the underlying dynamical system is needed, which Published by Copernicus Publications on behalf of the European Geosciences Union & the American Geophysical Union.
may not be available in applications.Moreover, all discussed approaches assume that the nonautonomous dynamics is represented by a flow map, which, by construction, only considers the starting and end points of each particle trajectory but neglects the dynamics between the initial and final points in time.
To overcome these problems, different computational methods have been proposed to identify coherent behavior in flows directly from Lagrangian trajectory data, such as obtained from particle tracking algorithms.One of the earliest attempts is the braiding approach proposed by Allshouse and Thiffeault (2012), where trajectories are classified according to their intertwining pattern in space-time.This method is mathematically sound but computationally demanding and currently restricted to two-dimensional flows.Trajectory-based approaches have also been introduced by Mancho et al. (2013) and Budišić and Mezić (2012).They use time-integrated quantities along trajectories, which again requires knowledge of the underlying dynamical system.Finally, Williams et al. (2015) attempt to reconstruct the transfer operator from a limited amount of trajectory data.
Very recently, spatio-temporal clustering algorithms have been proven to be very effective for the extraction of coherent sets from sparse and possibly incomplete trajectory data (see, e.g., Froyland and Padberg-Gehle, 2015;Hadjighasem et al., 2016;Banisch and Koltai, 2017;Schlueter-Kuck and Dabiri, 2017).Here, distance measures between trajectories are used to define groups of trajectories that remain close and/or behave similarly in the time span under investigation.All these methods can deal with sparse and incomplete trajectory data and do respect the dynamics of the entire trajectories, not just the end points.While c-means clustering as used by Froyland and Padberg-Gehle (2015) is computationally inexpensive and works well in example systems (see also Allshouse and Peacock, 2015), spectral clustering approaches as in Hadjighasem et al. (2016), Banisch andKoltai (2017), andSchlueter-Kuck andDabiri (2017) appear to be more robust, but require considerable computational effort.
Inspired by these recent approaches, our aim is to design a reliable but computationally inexpensive method for studying coherent behavior as well as mixing processes directly from Lagrangian trajectory data.For this, we consider an unweighted, undirected network, where Lagrangian particle trajectories serve as network nodes.A link is established between two nodes if the respective trajectories come close to each other at least once in the course of time.This construction is similar in spirit to the concept of recurrence networks (see, e.g., Donner et al., 2010a), but here in a spatio-temporal setting.Whereas in recurrence networks, two points on a trajectory or more generally of a time series are linked when they are close, in the present work we consider a whole trajectory as a single entity.We note that the discretized transfer operator has also been viewed and treated as a network: see, e.g., Dellnitz and Preis (2003), Dellnitz et al. (2005), Padberg et al. (2009), Lindner andDonner (2017), andSer-Giacomi et al. (2015).The latter used the directed, weighted network to analyze model data of the Mediterranean Sea with the main focus on in-and out-degrees.A different approach is taken in Donges et al. (2009).The authors compute the mutual information matrix M of a climate data set as an adjacency matrix A of an undirected and unweighted network.This way they use the betweenness centrality to identify regions of major importance for energy transport.
We use classical graph concepts and algorithms to analyze our trajectory-based undirected and unweighted flow network.Local network measures such as node degrees or clustering coefficients highlight regions of strong or weak mixing.These and other quantities have been considered in previous work on recurrence networks by Donner et al. (2010a) and Donner et al. (2010b), where the authors could link network measures to properties of the underlying dynamical system.In a similar fashion, Lindner and Donner (2017) as well as Ser-Giacomi et al. (2015) considered the in-and outdegrees of a weighted, directed network obtained from a discretized transfer operator and found these to highlight hyperbolic regions in the flow.We note that the node degree in our construction exactly corresponds to the trajectory encounter number very recently introduced by Rypina and Pratt (2017), a quantity that measures fluid exchange and thus mixing.Local clustering coefficients can be related to regular behavior, as has also been observed by Rodríguez-Méndez et al. (2017) in the context of transfer-operator-based networks.
In addition to considering local network measures, we will apply spectral graph partitioning schemes for the solution of a balanced cut problem (Shi and Malik, 2000).This allows us to efficiently extract coherent sets of the underlying flow, similar in spirit to the approaches proposed in Hadjighasem et al. (2016) and Banisch and Koltai (2017), who considered weighted networks, which are constructed based on using different metrics for determining the distance between two trajectories.
The paper is organized as follows.In Sect. 2 we describe our network construction.This is followed by a discussion of network analysis tools in Sect.3, where we review several simple local network measures as well as the spectral graph partitioning approach by Shi and Malik (2000).In Sect. 4 we apply the methodology to two different example systems, a Bickley jet as well as the stratospheric polar vortex.We close the paper with a discussion and an outlook on future work.

Networks of Lagrangian flow trajectories
In the following, we assume that we have n ∈ N Lagrangian particle trajectories from a flow simulation or from a particle tracking experiment in physical space R d , d = 2 or 3.In practice, the particle positions may be given at discrete times {0, 1, . .., T }.We denote the trajectories by x i , i = 1, . .., n and the particle positions at a certain time in-stance t = 0, . .., T by x i,t ∈ R d .We now set up a network in which the trajectories x 1 , . .., x n serve as nodes.We link two trajectories if they come close to each other at least once in the course of time.Such an undirected, unweighted network is uniquely described by a symmetric adjacency matrix A ∈ {0, 1} n,n .In practice, we construct this from the given trajectories by setting where χ B denotes the indicator function of a set B ⊂ R d .So A ij = 1; that is, a link is established between trajectories x i and x j , if and only if at one or more time instances t, x j,t can be found in an ball B (x i,t ) centered at x i,t and thus the trajectories x i and x j have come close.In this way, the network encodes in a compact manner how material is transported in the flow -in space and time.By an appropriate choice of one ensures that the network defined by Eq. ( 1) is connected and in this paper we will only consider connected networks.For instance, if the trajectories are initialized on a regular grid, then a natural lower bound to would be the mesh size.In the case that particles are randomly distributed, has to be chosen accordingly.We will study different choices of in Sect. 4.
Alternatively, the network might be set up by linking the k nearest neighboring trajectories at each time instance for some k ∈ N.While this allows us to get rid of the problem of a suitable choice of , it means that we have to choose a reasonable k.In two-dimensional systems a natural choice would be k = 4 mimicking the five-point stencil; similarly, k = 6 in three-dimensional systems.If trajectories are initialized on a regular grid, this choice again ensures that the resulting network is connected.Our own preliminary studies have indicated that this procedure gives very similar results to the -based definition in Eq. ( 1) but requires slightly longer computational run times.However, as the construction is not symmetric in general, we will not pursue this in the present work.
We note that the network depends on the time interval under consideration.While the study of different time intervals may reveal relevant information about the timescales and other inherent properties of the dynamics, this will not be the focus of our work here.

Network analysis
Here, we briefly discuss standard analysis concepts for networks (see, e.g., Newman, 2003) and relate them to features of the underlying flow.In particular, we will describe how to extract coherent structures by solving a graph-partitioning problem, the balanced minimum cut problem as proposed by Shi and Malik (2000) (see also Hadjighasem et al., 2016).

Degree matrix and graph Laplacian
From the adjacency matrix A one can derive two other important matrices to describe the network.The degree matrix D is a diagonal matrix with D ii = d i where d i is the degree of node x i , i.e., D ii = n j =1 A ij , that is, the number of links attached to node i.In our setting, d i ∈ N, i = 1, . .., n.By construction, for our network the degree of a node is non-zero, so there are no isolated nodes.
The non-normalized Laplacian is formed by L = D − A, where D is the degree matrix and A is the adjacency matrix.By the construction of A and D, L is symmetric and the entries of L are and thus L ∈ Z n,n .The normalized symmetric graph Laplacian L ∈ R n,n is defined as The other eigenvalues and corresponding eigenvectors can be characterized variationally in terms of the Rayleigh quotient of L. We come back to this in Sect.3.3.

Node degree
The degree of a node encodes how many other nodes are connected to it.In our setting, it measures how many different trajectories come close to the trajectory represented by the respective node, and thus it carries information about fluid exchange.The node degree d is encoded in the diagonal elements d i = D ii of the diagonal degree matrix D, with The node degree d corresponds to the trajectory encounter number as recently introduced by Rypina and Pratt (2017), who also compared this quantity to finite-time Lyapunov exponents and found good agreement in example systems.

Average degree of neighboring nodes
Here one considers the average node degree of the neighbors of a node x i , defined as Due to the averaging over all neighboring degrees, d nn tends to be smoother compared to the simple node degree d.
Both d and the average degree of neighboring nodes d nn will be large, when the corresponding trajectory comes close to many different other trajectories.In particular in the context of volume-preserving flows, this is only possible when fluid parcels get stretched and folded.Thus, both d and d nn are expected to be large in mixing regions and can be at least qualitatively related to finite-time Lyapunov exponents; see Donner et al. (2010a), Padberg et al. (2009), Froyland and Padberg-Gehle (2012), Lindner andDonner (2017), andSer-Giacomi et al. (2015) for related studies.However, whereas finite-time Lyapunov exponents measure the overall stretching at the final time, in our construction all intermediate times are also considered.Establishing a formal mathematical link to finite-time Lyapunov exponents is therefore subject to future research.

Local clustering coefficient
Here one considers the induced subgraph formed by the vertex x i under consideration and the vertices incident to it.The local clustering coefficient C indicates how strongly connected this subgraph is by measuring what proportion of the neighbors of x i are neighbors themselves: In the context of recurrence networks, large clustering coefficients have been found to indicate invariant sets of the underlying dynamics (Donner et al., 2010a).In flow networks obtained from a discretization of the transfer operator large clustering coefficients have been related to periodic behavior (Rodríguez-Méndez et al., 2017).In the aperiodic timedependent setting, invariant sets no longer exist, but instead mobile sets, such as vortices, in which the dynamics is regular.In these regions the dynamics is mainly characterized by rotation and translation.Therefore, in the course of time, trajectories tend to continue interacting with their initial neighbors and encounter only relatively few different trajectories.So the triples and triangles in the network that are due to initial neighborhoods (for sufficiently large ) continue to positively influence the value of the clustering coefficient in regular dynamics.A trajectory in a mixing region will be linked to many other trajectories, and due to the underlying stretching and folding, the proportion of triangles is small.Therefore, the local clustering coefficient C is expected to be large for trajectories in regular regions (i.e., for which d or d nn is small).The simple local network measures reviewed here depend on the local properties of the network and therefore, of course, on the choice of .We will study the dependence in our numerical studies in Sect. 4. In the context of recurrence networks, the problem of an appropriate choice of has been discussed in Donner et al. (2010b).They considered the edge , where |V | denotes the fixed number of vertices and |E( )| the -dependent number of edges of the network.In the literature, values of that maximize dρ d are proposed as optimal choices of .In the study of Donner et al. (2010b) (see also our own numerical investigations in Sect.4), however, it has been shown that such a choice typically results in very dense networks, which no longer encode the local properties of the underlying dynamics.Instead, a limit of ρ( ) ≤ 0.05 has been proposed to give reasonable results.

Spectral graph partitioning
Spectral graph partitioning aims at decomposing a network into components with specific properties.In our setting, the network encodes how material is transported by the flow, in both space and time.We are interested in identifying coherent structures in the flow, which are known to be organizers of fluid transport.From a spatio-temporal point of view, coherent sets are formed by trajectories that stay close to each other (Froyland and Padberg-Gehle, 2015) and thus are more tightly connected than others.Such information can be obtained by solving a balanced cut problem of the network (Hadjighasem et al., 2016).
As outlined above, the normalized symmetric graph Laplacian L has non-negative real eigenvalues 0 = λ 1 ≤ λ 2 ≤ . . .≤ λ n .The second smallest eigenvalue λ 2 ≥ 0 is called the algebraic connectivity or Fiedler eigenvalue of a graph (Fiedler, 1973).This eigenvalue is non-zero if and only if the network is connected.More generally, the number of connected components of the network appears as the multiplicity of the eigenvalue zero of the Laplacian matrix.If λ 2 > 0 but very close to zero, then the network is nearly decoupled and the sign structure of the corresponding eigenvector determines two communities in the network (Fiedler cut).If λ i , i = 2, . .., k for some k < n are close to zero and there is a spectral gap between λ k and λ k+1 , then the network is nearly decoupled into k communities.The corresponding eigenvectors w 2 , . .., w k carry information about the location of these communities, which can be verified by considering the Rayleigh quotient of the normalized graph Laplacian, as outlined in Shi and Malik (2000).They used this concept to solve a balanced cut problem for defining communities in the network that are characterized by minimum communication between different communities and maximum communication within communities.Such nearly decoupled subgraphs correspond to bundles of trajectories that are internally well connected but only loosely connected to other trajectories.This is indicative of coherent behavior (see also Hadjighasem et al., 2016).Instead of considering the eigenvalue problem Lw = λw, Shi and Malik (2000) propose to solve the equivalent generalized eigenvalue problem Nonlin.Processes Geophys., 24, 661-671, 2017 www.nonlin-processes-geophys.net/24/661/2017/ As both L and D are symmetric and have integer entries, eigenvalue problem Eq. ( 7) turns out to be numerically more convenient than the original one.It has the same eigenvalues 0 = λ 1 ≤ λ 2 ≤ . . .≤ λ n and the eigenvectors are related by w i = D 1 2 v i , i = 1, . .., n.In particular, v 1 = 1.The number of leading eigenvalues (i.e., eigenvalues close to zero) indicates the number of nearly decoupled subgraphs.An application of a standard k-means clustering algorithm (Lloyd, 1982) can then be employed to extract the sets of interest from the corresponding eigenvectors.

Bickley jet
As our first example we consider the Bickley jet proposed by Rypina et al. (2007).It is defined by the streamfunction  7) for the network constructed from high-resolution initial data in the Bickley jet (case i) with = 0.2.7) for the network constructed from 1000 random initial conditions in the Bickley jet (case ii) and = 0.5.
where r e = 6.371 as well as σ i = c i k i , i = 1, 2, 3. Here, we have dropped the physical units for brevity.The physical assumptions underlying the model equations and the parameters are described in detail in Rypina et al. (2007).For our choice of parameters and when considered on a cylinder, the system exhibits a meandering central jet and three regular vortices on each side of the jet.
Initial conditions are chosen in the domain M = [0, 20[ × [−3, 3] and are numerically integrated on the time interval [10, 30] using a fourth-order adaptive Runge-Kutta scheme and periodic boundary conditions in the x direction.We output the particle positions at integer time steps.We also tested finer temporal resolutions and different time intervals, but these did not significantly change our results for this system.We consider two sets of initial conditions, which we will refer to as cases (i) and (ii) in the following: i. 12 200 points from a regular grid on M with grid mesh size 0.1; and ii. 1000 random points uniformly distributed on M.
For the first high-resolution setting (i) we study different from 0.1 to 0.5 (in steps of 0.05), with = 0.1 corresponding to the distance between neighboring grid points.The different choices of result in values for the edge density ρ( ) between 0.002 and 0.04, which are well within the proposed limit of ρ( ) ≤ 0.05 as considered in Donner et al. (2010b).We found no local maximum of dρ d in this range.For = 0.5 the resulting network already has about 3 million links, so a possible peak of dρ d would lie well outside a computationally reasonable range of .
For the sparse setting (ii), we start with = 0.5, for which ρ( ) = 0.04.Significantly smaller values of did not produce a connected network in this case.A maximum of dρ d is detected at about = 1.9, yielding ρ = 0.45, which already corresponds to a dense network.So a reasonable range appears to be ∈ [0.5, 1.9].
In Fig. 1 the local network measures for case (i) are plotted with respect to the initial conditions.The left column contains the results for = 0.1, the middle column for = 0.2, and the right column for = 0.5.The top row displays the node degree d.As expected, d is high in mixing regions, i.e., where trajectories meet many other trajectories, and low in the regular regions, i.e., the six vortices and the jet core.Whereas the result for = 0.1 appears a bit fuzzy, those for = 0.2 and = 0.5 are much sharper.The average node degree of neighboring nodes d nn (middle row) gives a very pronounced indication of regular and mixing flow behavior for small , but at = 0.5, the jet core is no longer high- lighted by low values of d nn due to the increased neighborhood over which averages are taken.The bottom row shows the clustering coefficient C. For = 0.1, the vortex cores are characterized by a zero clustering coefficient.This is due to the fact that in this case is chosen as the distance between neighboring grid points.However, in this case, two neighbors of a grid point have initially a distance of at least √ 2, and therefore in the vortex core region, with its very regular dynamics, the network does not possess any triangles.For all other values of studied, the clustering coefficient gives a very clear indication of different dynamical flow regimes, with high values in regular regions as expected.
In Fig. 2 we repeat the study for the low-resolution case (ii), using = 0.5 (left column), = 1.5 (middle), and = 1.9 (right column).The results are very much comparable to the high-resolution case (i), with the average node degree d nn (middle row) giving again a good indication of the different flow regimes for small , where the node degree d only produces spurious results.At = 1.5, the average node degree d nn appears to be "switching" and starts to pick up regular regions instead of mixing regions as for smaller .This is again due to the enlarged neighborhood, where averages are now crucially influenced by flow regimes outside the local neighborhood of the trajectory under consideration.For instance, for a node with a small node degree, its neighborhood extends far into the mixing regions characterized by large node degrees, resulting in a large average degree for this node (and vice versa for nodes with large node degrees).For all choices of the local clustering coefficient C picks up the cores of the six vortices, whereas the node degree d is small in these regions and large along the jet, the major transport barrier in this flow.We note that in this sparse setting, the jet core is not resolved by any of the local network measures.
This study supports the local network measures being of course dependent, but in particular the node degree and the clustering coefficient are robust within a reasonable range of values, even for = 1.9 in the low-resolution case.As expected and as found in related work, the clustering coefficient indicates vortices, whereas the node degree highlights major transport barriers.The average node degree d nn appears to be a good choice for small , but turns out to be less robust for increasing , as larger and larger parts of the network are then considered for the averaging and thus the local nature of this network measure decreases.
In Fig. 3 the four (non-trivial) leading eigenvectors v 2 , . .., v 5 of the generalized eigenvalue problem Eq. ( 7) are  shown for the high-resolution initial conditions (case i) with = 0.2.The eigenvectors highlight the two regions delineated by the jet as well as the different vortices, comparable to the results in Banisch and Koltai (2017).We note that the corresponding figures for the other choices of would look the same.Surprisingly, in the study by Hadjighasem et al. (2016) only the six vortices have been identified but not the different flow regimes delineated by the jet core.
In the low-resolution case (ii), the leading eigenvectors match those of the high-resolution data case, but in a slightly  4 for the choice = 0.5).This comes from the fact that the four eigenvalues λ 3 , . .., λ 5 all have approximately the same magnitude and are therefore sensitive to perturbations.The 10 leading eigenvalues for case (i) and = 0.2 are displayed in Fig. 5a, the low-resolution case (ii) with = 0.5 in Fig. 5b.These spectra exhibit clear spectral gaps between the second and third and between the eighth and ninth eigenvalues.
The first spectral gap is related to the coherent behavior of the upper and lower parts of the cylinder, delineated by the jet core.The second (and larger) spectral gap indicates the existence of altogether eight coherent sets.These can be extracted via a standard k-means clustering (with k = 8) of the first eight eigenvectors.The resulting partitions are shown in Fig. 6.As expected, the six vortices and the two distinct stream regions are picked up in both the high-resolution (i) and sparse data (ii) cases.However, in the sparse case the clustering finds a few false green and blue points (Fig. 6a).For the low-resolution case (ii) and a choice of = 1 (or larger) the spectrum is no longer correctly recovered (see Fig. 5c for the choice = 1.9).
Finally, we note that the proposed approach is computationally inexpensive, with total run times of < 2 s for the sparse data case (ii) and ≈ 40 s for the high-resolution case (i) using MATLAB (R2016a) on a single processor; see Table 1 for details.

Stratospheric polar vortex
As a second example we study the transport and mixing dynamics in the stratospheric polar vortex over Antarctica.The coherent behavior of the polar vortex has already been numerically studied using transfer-operator methods (Froyland et al., 2010).For the computation of particle trajectories we use two-dimensional velocity data from the ECMWF-Interim data set 1 The global ECMWF data are given at a temporal resolution of 6 h and a spatial resolution of a 121 × 240 grid in the longitude and latitude directions, respectively.As in Froyland et al. (2010) we focus on the stratosphere over the Southern Hemisphere.We consider the flow from 1 September 2002 to 31 October 2002 on a 600 K isentropic surface.For the integration of particle trajectories, we seed initial data on a 64 × 64 grid centered at the South Pole (square of side lengths 12 000 km), with a mesh size of 187.5 km.
A fourth-order Runge-Kutta scheme with a constant step size of 45 min and linear interpolation in space and time are used and we output the particle positions every 6 h.For the construction of the trajectory network we choose = 375 km, i.e., twice the grid spacing.For this choice, we obtain an edge density of ρ( ) = 0.03, which lies well within the reasonable range proposed by Donner et al. (2010b).
In Fig. 7 we show the local network measures applied to this network.The node degree d and the average node degree d nn highlight again the strongly mixing regions that delineate the polar vortex.Similar observations have been made using other stretching measures (see, e.g., Joseph and Legras, 2002;Froyland and Padberg-Gehle, 2012).The local clustering coefficient is large in particular in the core of the vortex, where the node degree and the average node degree take on small values.As the dynamics is very irregular, the results are less pronounced than in the Bickley jet example, also for larger (not shown).
In Fig. 8a-c, the second eigenvector of the generalized graph Laplacian eigenvalue problem Eq. ( 7) is shown.It highlights the polar vortex as a coherent set, confirming the transfer-operator-based results obtained by Froyland et al. (2010) for a different data set (1-14 September 2008).However, the result of our computation appears spurious, with small, isolated yellow regions dispersed in the background flow.This is due to a bifurcation in the flow patterns: towards the end of September 2002, the polar vortex splits into two vortices.One of the two vortices becomes unstable and disperses, whereas the other vortex has increased again by the end of the computation (31 October 2002; see Fig. 8c).It would be very interesting to identify a precursor of the vortex splitting from the network properties, but this will be the subject of future work.
We repeat the study of the spectrum by considering a new network where the trajectories are restricted to the time span before the bifurcation (1-26 September 2002); see Fig. 8d and e.On this interval, the polar vortex can be clearly identified by the second eigenvector of the generalized graph Laplacian eigenvalue problem.

Discussion and conclusion
We have proposed a very simple and inexpensive approach to analyzing coherent behavior and thus transport and mixing phenomena in flows.It is based on a network in which Lagrangian particle trajectories form the nodes.A link is www.nonlin-processes-geophys.net/24/661/2017/ Nonlin.Processes Geophys., 24, 661-671, 2017 established between two nodes if the respective trajectories come close to each other at least once in the course of time.
The resulting network is unweighted and undirected and can be represented by a binary adjacency matrix.Classical local network measures such as node degree and clustering coefficient highlight regions of strong mixing and regular motion, respectively.While these network measures are dependent, they appear to be robust within a reasonable range of values.Overly large 's blur the local information of the underlying dynamics and an edge-density-dependent choice of as discussed in the context of recurrence networks (Donner et al., 2010b) has turned out to be useful in our setting as well.In addition, we have used a generalized graph Laplacian eigenvalue problem to efficiently and robustly extract coherent sets, even for the case of sparse data as illustrated by case (ii) in the Bickley jet investigations.
While in this paper we have only demonstrated our approach in examples that are volume-preserving and twodimensional, the extensions to three-dimensional flows and also to dissipative systems are straightforward.In addition, although not illustrated here, our method can easily deal with incomplete trajectory data as only one-time encounters of trajectories are required for setting up the network.The approach is not restricted to connected networks, and in particular in the presence of attracting sets in non-volumepreserving systems, these might be worthwhile considering as well.We have studied unweighted networks throughout the paper.Counting the number of times a trajectory comes close to another is one option for choosing weights.Our own preliminary studies indicate that in this case the node degree and average node degree become less meaningful, as these cannot distinguish any more between repeated encounters (as in regular regions) and many different encounters (as in mixing regions).Clustering coefficients and subdominant eigenvectors of the Laplacian appear to continue to highlight coherent regions.
There are some direct relations to other recently proposed methodologies such as the dynamic isoperimetry framework introduced by Froyland (2015), where a dynamic Laplacian and its spectrum play a central role.The graph Laplacian matrix studied in the present paper appears to be a very coarse but inexpensive and robust approximation of this operator and in a similar way it approximates the diffusion maps studied in Banisch and Koltai (2017).In this context, it might be interesting to analyze the networks resulting from the different choices of metrics used in Banisch and Koltai (2017), Schlueter-Kuck and Dabiri (2017), and Hadjighasem et al. (2016).A mathematical analysis of the commonalities and differences between these approaches and our novel network approach is the subject of future research.Finally, the node degree of our network construction exactly corresponds to the trajectory encounter number very recently introduced by Rypina and Pratt (2017), which has now obtained a wider interpretation in the context of flow networks.
Data availability.The velocity field that we used in Sect.4.2 is publicly available from the ECMWF website: http://apps.ecmwf.int/datasets/.
Competing interests.The authors declare that they have no conflict of interest.Special issue statement.This article is part of the special issue "Current perspectives in modelling, monitoring, and predicting geophysical fluid dynamics".It is not affiliated with a conference.

Figure 1 .Figure 2 .
Figure 1.Network measures for high-resolution initial conditions (case i) in the Bickley jet for = 0.1 (a), = 0.2 (b), and = 0.5 (c).From top to bottom: node degrees d and d nn and clustering coefficient C.

Figure 3 .
Figure 3.Leading eigenvectors v 2 -v 5 (from a to d) of the generalized graph Laplacian eigenvalue problem Eq. (7) for the network constructed from high-resolution initial data in the Bickley jet (case i) with = 0.2.

Figure 4 .
Figure 4. Leading eigenvectors v 2 -v 5 (from a to d) of the generalized graph Laplacian eigenvalue problem Eq. (7) for the network constructed from 1000 random initial conditions in the Bickley jet (case ii) and = 0.5.

Figure 7 .
Figure 7. Node degree d (a), average node degree of neighboring nodes d nn (b), and clustering coefficient C (c) of a network constructed from trajectories for the polar vortex flow between 1 September and 31 October 2002.