Nonlinear Processes in Geophysics (Creative Commons Attribution 3.0 License)

Small world in a seismic network: the California case

Recent work has shown that disparate systems can be described as complex networks, i.e. assemblies of nodes and links with nontrivial topological properties. Examples include technological, biological and social systems. Among them, earthquakes have been studied from this perspective. In the present work, we divide the Southern California region into cells of 0.1°, and calculate the correlation of activity between them to create functional networks for that seismic area, in the same way that brain activity is studied from the complex network perspective. We found that the network shows small world features.


Introduction
Physics, a major beneficiary of reductionism, has developed an arsenal of successful tools to predict the behavior of a system as a whole from the properties of its constituents. The success of these modeling efforts is based on the simplicity of the interactions between the elements: there is no ambiguity as to what interacts with what, and the interaction strength is uniquely determined by the physical distance. We are at a loss, however, in describing systems for which physical distance is irrelevant, or there is ambiguity whether two components interact (Albert and Barabási, 2002).
Historically, the study of networks has been mainly the domain of a branch of discrete mathematics known as graph theory. Since its birth in 1736, when the Swiss mathematician Leonhard Euler published the solution to the Königsberg bridge problem (consisting in finding a round trip that traversed each of the bridges of the Prussian city of Königsberg exactly once), graph theory has witnessed many exciting developments and has provided answers to a series of practical questions such as: what is the maximum flow per unit time from source to sink in a network of pipes, how to color the regions of a map using the minimum number of colors so that neighboring regions receive different colors, or how to fill n jobs by n people with maximum total utility. In addition to the developments in mathematical graph theory, the study of networks has seen important achievements in some specialized contexts, for instance in the social sciences. Social network analysis started to develop in the early 1920s and focuses on relationships among social entities such as communication between members of a group, trades among nations, or economic transactions between corporations (Boccaletti et al., 2006).

Correspondence to: A. Jiménez (ajlloret@ual.es)
Recent work has shown that disparate systems can be described as complex networks, that is, assemblies of nodes and links with nontrivial topological properties, examples of which include technological, biological and social systems (Eguíluz et al., 2005). Among them, earthquakes have also been studied from this perspective (Abe and Suzuki, 2004, 2006a; Baiesi and Paczuski, 2005). In the past few years, the discovery of small-world and scale-free properties of many natural and artificial complex networks has stimulated a great deal of interest in studying the underlying organizing principles of various complex networks, and has led to dramatic advances in this emerging and active field of research (Wang and Chen, 2003). Here we present for earthquake fault systems an approach similar to that of Eguíluz et al. (2005) for functional brain networks, and find that the analyzed catalog has small-world behavior.
The small world concept, in simple terms, describes the fact that despite their often large size, in many networks there is a relatively short path between any two nodes. The distance between two nodes is defined as the number of edges along the shortest path connecting them. The most popular manifestation of small worlds is the six degrees of separation concept, uncovered by the social psychologist Milgram (Milgram, 1967), who concluded that there was a path of acquaintances with typical length about six between most pairs of people in the United States (Kochen, 1998). Watts and Strogatz (1998), in their seminal paper, proposed to define small-world networks as those networks having both a small value of $L_p$ (characteristic path length), like random graphs, and a high clustering coefficient $C_p$, like regular lattices. They consider a one-dimensional graph with $N$ nodes, each vertex being connected to its $k$ nearest neighbors (where $N \gg k \gg \ln N$). The number $k$ of edges per vertex is also called the degree of the graph. Next, with a probability $P$, a randomly chosen edge is rewired to connect to a randomly chosen vertex. By varying $P$ between 0 and 1, graphs can be created which span the whole range from regular ($P = 0$) to random ($P = 1$).
Two measures were introduced to characterize such graphs. The characteristic path length $L_p$ is the mean of the shortest path (expressed in number of edges) connecting any two vertices on the graph. The clustering coefficient $C_p$ is the likelihood (between 0 and 1) that the $k_v$ neighbors of vertex $v$ are also connected to each other, averaged over all vertices. Regular networks or graphs have a high $C_p$ ($C_p \approx 3/4$) but a long characteristic path length ($L_p \approx N/2k$); random graphs have a low $C_p$ ($C_p \approx k/N$) but the shortest possible path length ($L_p \approx \ln N / \ln k$). The discovery of Watts and Strogatz was that some networks with $0 < P \ll 1$, thus regular networks with only a very small number of random edges, have a path length that is much smaller than that of a regular network, while $C_p$ is still close to that of a regular network. This dramatic drop in $L_p$ for $P$ only slightly higher than 0 implies that any vertex on the graph can be reached from any other vertex in only a small number of steps. This is equivalent to the small-world phenomenon, and this type of graph ($C_p$ close to that of a regular network; $L_p$ close to that of a random network) was called a small-world graph by Watts and Strogatz. They showed that many real world networks, such as networks of actors playing in the same movies, the power grid of North America, and the neuronal network of Caenorhabditis elegans, have small-world features. Furthermore, they suggested that such networks may be optimal for information processing in complex systems. Since then it has been shown that many real networks display small world features and that these may reflect an optimal architecture for information processing (Stam, 2004).
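The rewiring construction described above can be sketched as follows. This is only an illustration under stated assumptions: the function names `ring_lattice` and `rewire` are hypothetical, the degree k is taken to be even, and each lattice edge is visited once and rewired with probability P, avoiding self-loops and duplicate edges.

```python
import random


def ring_lattice(n, k):
    """Regular ring: each vertex is linked to its k nearest neighbours (k even, k < n)."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj


def rewire(adj, p, rng):
    """Visit every original edge once; with probability p move one end to a random vertex."""
    n = len(adj)
    edges = [(v, w) for v in adj for w in adj[v] if v < w]
    for v, w in edges:
        if rng.random() < p:
            # Candidate endpoints exclude v itself and its current neighbours.
            candidates = [u for u in range(n) if u != v and u not in adj[v]]
            if candidates:
                u = rng.choice(candidates)
                adj[v].discard(w)
                adj[w].discard(v)
                adj[v].add(u)
                adj[u].add(v)
    return adj


# Example: a mostly regular graph with a few random shortcuts (P = 0.05).
graph = rewire(ring_lattice(100, 6), 0.05, random.Random(42))
```

Because every rewiring removes one edge and adds one, the number of edges (and hence the mean degree) is the same for all values of P, so graphs with different P can be compared directly.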

Method
For a network (or graph) representation, we first have to define the nodes and the edges. Figure 1 shows a scheme of the method. The seismic region is divided into square cells (in latitude and longitude only in this particular case), which will be the nodes; the time is divided into intervals. At each time step (we try several, from days to several years), the activity $a(x, t)$ of the cell is calculated as the number of earthquakes in that cell and time interval. Now we have a time series for each cell. For each pair of cells, $x_1$ and $x_2$, we calculate their correlation coefficient in this way:

$$ r(x_1, x_2) = \frac{\langle a(x_1, t)\, a(x_2, t) \rangle - \langle a(x_1, t) \rangle \langle a(x_2, t) \rangle}{\sigma(a(x_1))\, \sigma(a(x_2))} \tag{1} $$

where $\sigma^2(a(x)) = \langle a(x, t)^2 \rangle - \langle a(x, t) \rangle^2$, and $\langle \cdot \rangle$ represents temporal averages. Then the correlation matrix is thresholded at different values $r_c$ of the correlation coefficient, so that when the correlation between two cells (nodes) is higher than the threshold value (positive values of $r_c$ only), we say that they are positively correlated, and the nodes (cells) are connected by an edge. Once our network is defined, we proceed to analyze its properties.
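The thresholding step can be sketched as below. This is a minimal sketch, not the authors' code: it assumes the correlation coefficient is the usual Pearson form built from temporal averages, the names `correlation` and `threshold_network` are hypothetical, and cells whose activity never changes are (by assumption) given no links.

```python
from math import sqrt


def correlation(a1, a2):
    """Pearson correlation of two equally long activity series, using temporal averages."""
    n = len(a1)
    mean1 = sum(a1) / n
    mean2 = sum(a2) / n
    cov = sum(x * y for x, y in zip(a1, a2)) / n - mean1 * mean2
    var1 = sum(x * x for x in a1) / n - mean1 * mean1
    var2 = sum(y * y for y in a2) / n - mean2 * mean2
    if var1 <= 0 or var2 <= 0:
        return 0.0  # assumption: a cell with constant activity is left unlinked
    return cov / sqrt(var1 * var2)


def threshold_network(activity, r_c):
    """Adjacency matrix: cells i and j are joined by an edge when r(i, j) > r_c."""
    n = len(activity)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if correlation(activity[i], activity[j]) > r_c:
                adj[i][j] = adj[j][i] = 1
    return adj


# Three toy cells: the first two rise and fall together, the third does not.
toy_activity = [[0, 1, 0, 3, 0], [0, 2, 0, 6, 0], [1, 0, 2, 0, 1]]
toy_network = threshold_network(toy_activity, 0.8)
```

The double loop is quadratic in the number of cells, which is acceptable for a sketch but would normally be replaced by a vectorized correlation matrix for a full catalog.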

Node degree and degree distributions
The degree (or connectivity) $d_i$ of a node $i$ is the number of edges incident with the node, and is defined in terms of the adjacency matrix $A$ as:

$$ d_i = \sum_{j} a_{ij} $$

where $a_{ij}$ is the element of $A$ that equals 1 if nodes $i$ and $j$ are connected and 0 otherwise. The most basic topological characterization of a graph can be obtained in terms of the degree distribution $p(d)$, defined as the probability that a node chosen uniformly at random has degree $d$ or, equivalently, as the fraction of nodes in the graph having degree $d$.
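As an illustration, the degree of each node and the degree distribution can be read directly from a 0/1 adjacency matrix. The function names below are hypothetical, and the matrix is assumed to be symmetric with a zero diagonal.

```python
from collections import Counter


def degrees(adj):
    """Degree of each node: the row sums of a symmetric 0/1 adjacency matrix."""
    return [sum(row) for row in adj]


def degree_distribution(adj):
    """Fraction of nodes having each degree d, as a dict {d: p(d)}."""
    deg = degrees(adj)
    counts = Counter(deg)
    return {d: c / len(deg) for d, c in sorted(counts.items())}


# A star with one hub and three leaves.
star = [[0, 1, 1, 1],
        [1, 0, 0, 0],
        [1, 0, 0, 0],
        [1, 0, 0, 0]]
star_distribution = degree_distribution(star)
```

For a scale-free network, p(d) plotted on doubly logarithmic axes would fall close to a straight line; this sketch only produces the distribution and does not fit it.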

Shortest path lengths and diameter
The shortest path length is the geodesic distance between vertex pairs in a network. The mean geodesic $l$ is then the average of $g_{ij}$ over all pairs of vertices, where $g_{ij}$ is the geodesic distance from vertex $i$ to vertex $j$, and $n$ is the number of nodes. The largest value of $g_{ij}$ is called the diameter of the graph. We used Dijkstra's algorithm to implement this calculation (Dijkstra, 1959).
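A minimal sketch of this calculation is given below, using Dijkstra's algorithm with unit edge weights. Two choices here are assumptions rather than statements from the text: the graph is stored as a dictionary of neighbour sets, and the mean is taken over pairs of distinct vertices that are connected by some path, so that disconnected pairs do not contribute an infinite distance.

```python
import heapq


def dijkstra(adj, source):
    """Geodesic distances (in edges) from source; unreachable nodes are omitted."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue  # stale heap entry
        for w in adj[v]:
            if d + 1 < dist.get(w, float("inf")):
                dist[w] = d + 1
                heapq.heappush(heap, (d + 1, w))
    return dist


def mean_geodesic_and_diameter(adj):
    """Mean geodesic over connected pairs of distinct vertices, and the largest geodesic."""
    total, pairs, diameter = 0, 0, 0
    for v in adj:
        for w, d in dijkstra(adj, v).items():
            if w != v:
                total += d
                pairs += 1
                diameter = max(diameter, d)
    return (total / pairs if pairs else 0.0), diameter


# A simple chain 0 - 1 - 2 - 3.
chain = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
chain_mean, chain_diameter = mean_geodesic_and_diameter(chain)
```

With unit weights a breadth-first search would give the same distances; Dijkstra's algorithm is kept here because it extends unchanged to weighted edges.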

Clustering
A clear deviation from the behavior of the random graph can be seen in the property of network transitivity, sometimes also called clustering (Newman, 2003). Here we use the definition by Watts and Strogatz (1998), which has found wide use in numerical studies and data analysis (Newman, 2003):

$$ C_i = \frac{\text{number of triangles connected to vertex } i}{\text{number of triples centered on vertex } i} $$

where triple means a single vertex with edges running to an unordered pair of others. The clustering coefficient $C$ of the whole network is the average of $C_i$ over all vertices, and measures the average density of triangles in the network. For random networks, $C$ tends to zero as $n^{-1}$ in the limit of large system size.

Fig. 3. Average length and clustering coefficient of the networks with different thresholds ($r_c$) compared to those corresponding to a random graph with the same number of nodes and the same average node degree, for time lags of 100 days.
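The Watts and Strogatz clustering coefficient used in this section can be sketched as follows. The function names are hypothetical, the graph is assumed to be stored as a dictionary of neighbour sets, and vertices with fewer than two neighbours are assigned a local coefficient of zero, which is a convention adopted here for the sketch.

```python
def local_clustering(adj, v):
    """Triangles through v divided by triples centered on v."""
    nbrs = list(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0  # convention assumed here: no triples means zero clustering
    triangles = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return triangles / (k * (k - 1) / 2)


def clustering_coefficient(adj):
    """Average of the local clustering over all vertices."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)


# A triangle 0-1-2 with one extra vertex 3 hanging from vertex 2.
toy_graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
toy_clustering = clustering_coefficient(toy_graph)
```

In the toy graph, vertices 0 and 1 have local clustering 1, vertex 2 has 1/3 (one triangle out of three triples) and vertex 3 has 0, so the average is 7/12.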

Data
The catalog belongs to the Southern California Earthquake Center (SCEC) and contains the seismic data for the period 1 January 1984 to 3 July 2001. The analyzed area ranges from 32° N to 37° N, and from 115° W to 121° W. The magnitude spans from 3.0 to 8.0. The catalog is complete above magnitude 3.
As explained in Sect. 2, the catalog has to be discretized in order to translate it into nodes and edges. We used a 2D approximation, with a box size of 0.1°, which is reasonable taking into account the typical size of a small fault and the accuracy of the hypocentral locations. We also need a discretized time, so that the activity at each time step is the number of events in that interval, and we then obtain a time series of activity for each cell. This is necessary in order to correlate the different activities by means of Eq. (1). We tested different time intervals: 1 day, 100 days, and 1000 days. An interval of 1 day is a natural choice to obtain an almost continuous time series of seismic activity; intervals between 100 and 1000 days correspond to the commonly accepted time scale for aftershock sequences, which can be related to nucleation processes (Dieterich, 1994).
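The discretization can be sketched as a simple binning of events into cells and time steps. Everything about the data layout here is an assumption made for illustration: events are taken to be (latitude, longitude, time in days) tuples with western longitudes negative, and the function name and parameters are hypothetical.

```python
from math import floor


def activity_series(events, lat0, lon0, cell_deg, t0, dt_days, n_steps):
    """Count events per cell and time step.

    events: iterable of (lat, lon, t_days) tuples (assumed layout).
    Returns {(row, col): [count at step 0, count at step 1, ...]}.
    """
    series = {}
    for lat, lon, t in events:
        step = floor((t - t0) / dt_days)
        if not 0 <= step < n_steps:
            continue  # event falls outside the analyzed period
        cell = (floor((lat - lat0) / cell_deg), floor((lon - lon0) / cell_deg))
        series.setdefault(cell, [0] * n_steps)[step] += 1
    return series


# Toy catalog binned into 0.1 degree cells and 100 day intervals.
toy_events = [
    (32.15, -120.95, 0.5),
    (32.12, -120.93, 20.0),
    (32.15, -120.95, 150.0),
    (36.55, -115.25, 99.0),
    (33.00, -118.00, 250.0),  # after the last time step, so ignored
]
toy_series = activity_series(toy_events, 32.0, -121.0, 0.1, 0.0, 100.0, 2)
```

Only cells that contain at least one event appear in the result, which keeps the number of nodes small when most of the region is quiet.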

Results
As can be seen in Figs. 2-4, the clustering coefficient for the connected components is always much higher than that of a random network. The figures also show that, for correlations between the cells higher than 0.8, the average path length is always lower than that of a random network. So, when we apply a threshold $r_c$ higher than 0.8, we obtain a complex network which behaves as a small world, as defined by Watts and Strogatz (1998). Note that we are interested in studying highly correlated cells, and $r_c > 0.8$ is therefore a good lower threshold for our seismicity network. Note also that the threshold $r_c$ affects the connectivity of the generated networks. A larger $r_c$ results in a disconnected network whose average shortest path length becomes very large in the sense of graph theory, which would differ from the results in the figures. In Fig. 5 we show the degree distribution for $r_c = 0.8$ and time intervals of 1, 100, and 1000 days, respectively. They are not scale free. This result is opposite to that found previously (Abe and Suzuki, 2004, 2006a; Baiesi and Paczuski, 2005). Thus, the scale invariance is violated by thresholding. This implies that thresholding eliminates an important element of complexity of a seismic network. We also analyze the scaling relationship between the clustering coefficient and the degree (Abe and Suzuki, 2006b). We see that it is also violated by thresholding (Fig. 6).

Fig. 5. Degree distribution for $r_c = 0.8$ and 1 day, 100 days, 1000 days lag, respectively. As can be seen, they are not scale free.
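The comparison with a random graph can be illustrated with the large-size estimates quoted in the Introduction, a path length of about ln(N)/ln(k) and a clustering of about k/N. This is a rough sketch under that assumption: the figures compare against a random graph with the same number of nodes and average degree, which need not coincide with these closed-form estimates, and the function names are hypothetical.

```python
from math import log


def random_graph_estimates(n, mean_degree):
    """Estimated path length and clustering of a random graph with n nodes."""
    if n < 2 or mean_degree <= 1:
        raise ValueError("estimates need n >= 2 and a mean degree above 1")
    l_rand = log(n) / log(mean_degree)
    c_rand = mean_degree / n
    return l_rand, c_rand


def small_world_ratios(l, c, n, mean_degree):
    """Ratios of the measured path length and clustering to the random estimates."""
    l_rand, c_rand = random_graph_estimates(n, mean_degree)
    return l / l_rand, c / c_rand


# Hypothetical measured values, chosen only to show how the ratios are read.
path_ratio, clustering_ratio = small_world_ratios(2.4, 0.5, 1000, 10)
```

A small-world network in the sense used here has a path ratio near or below one together with a clustering ratio much larger than one; no particular cutoff is implied by this sketch.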
In Fig. 7 we present the networks obtained with $r_c > 0.8$ and a 1 day time interval. The main component is related to the Landers earthquake of 1992. All the components are related to big earthquakes with their corresponding aftershocks. It is interesting to note that the Coalinga and Imperial Valley earthquakes have cells relatively far from them, but with a high correlation in the seismicity rate series.

Fig. 6. Clustering as a function of the degree for $r_c = 0.8$ and 1 day, 100 days, 1000 days lag, respectively. As can be seen, they are not scale free.

When we visualize the network for the 100 days lag, we can see that there are many more main components (56) than before (6) when using the hierarchical algorithm. The main component (first in Fig. 8) is the same as the main component for the 1 day lag. Other particular components are related to different earthquakes. Most of the clusters' links clearly outline the San Andreas fault direction.

Fig. 7. Networks obtained for $r_c > 0.8$, by using a betweenness and a hierarchical algorithm (with some of the main earthquakes in the region) to find the components, with Pajek (Batagelj and Mrvar, 1998), for time lags of 1 day.
For the 1000 days lag, the number of clusters decreases with respect to the network found for the 100 days lag. In Fig. 9, the main component for the 1000 days lag relates the whole area.
The way of constructing the seismic network is different from that proposed by Abe and Suzuki (2006a), but one conclusion obtained is very similar: the mainshock plays the role of a "hub" with a large degree of connectivity. For these highly correlated events, the network obtained represents a small world. It is also interesting to note that those components can be viewed as the different activities a brain processes. Each stimulus is answered by different cells in the area. However, the networks obtained are not scale-free (the degree distributions were not found to follow a power law). So, the model for the network seems to be as follows: links are much more likely to connect "neighbor nodes" than distant nodes. However, as can be deduced from the networks for the 1 day lag (Fig. 7), there are some long range links, in particular following the San Andreas fault. Since the rate of earthquakes is related to the stress transfer (Helmstetter et al., 2005), the networks reflect the way stresses are diffused in the area.

Fig. 9. Some of the 18 components for the 1000 days time interval and $r_c > 0.8$, by using a hierarchical algorithm to find the components, with Pajek (Batagelj and Mrvar, 1998).

Conclusions
We propose a different analysis of seismicity in terms of complex networks. They are obtained in a way similar to the way brain functional networks are studied (Eguíluz et al., 2005). In our preliminary results, we see that the different components of the obtained networks act as different responses to the stimulus given by the general plate motions in the region. The results of this study show that the functional connectivity matrix of seismic activity recordings can be converted into a sparsely connected graph by applying a suitable threshold $r_c$. So, it can be said that the most highly correlated cells in the region form a small world network. This small world property means that there are long-range connections in the seismic network. These connections might be related to the San Andreas or other large faults in the region, which transfer the stresses. This method could be useful to find triggered earthquakes, as well as for declustering the catalogs, since the correlation coefficients may offer important information about the earthquake activity. Another choice would be to let these "small-world" networks be weighted. In the present study we only made our analysis in two dimensions, due to limitations in the computations.