Magnitude correlations in a self-similar aftershock rates model of seismicity

Crucial to the development of earthquake forecasting schemes is the manifestation of spatiotemporal correlations between earthquakes as highlighted, for example, by the notion of aftershocks. Here, we present an analysis of the statistical relation between subsequent magnitudes for a recently proposed self-similar aftershock rates model of seismicity, whose main distinguishing feature is that of interdependence between trigger and triggered events in terms of a time-varying frequency magnitude distribution. By means of a particular statistical measure, we study the level of magnitude correlations under specific 5 types of time conditioning, explain their provenance within the model framework and show that the type of null model chosen in the analysis plays a pivotal role in the type and strength of observed correlations. Specifically, we show that while the variations in the magnitude distribution can give rise to large trivial correlations between subsequent magnitudes, the non-trivial magnitude correlations are rather minimal. Simulations mimicking Southern California show that these non-trivial correlations cannot be observed at the 3σ-level at the current level of completeness. We conclude that only the time variations in the 10 frequency-magnitude distribution might lead to significant improvements in earthquake forecasting.

of the Hawkes process (Hawkes, 1971), such as the epidemic type aftershock sequence (ETAS) model (Ogata, 1988(Ogata, , 1998, the branching aftershock sequence (BASS) model , the every earthquake is a precursor according to scale (EEPAS) Rhoades and Evison, 2004), or a branching model based on a dynamical scaling hypothesis (Lippiello et al., 2007b, a), all of which exhibit spatiotemporal clustering.
It is through the aforementioned constitutive statistical models that forecasting of seismicity is often implemented (Helmstet-5 ter et al., 2006;Schorlemmer et al., 2010;Woessner et al., 2010;Zechar et al., 2010). Temporal clustering is exemplified by the increased rate (number of earthquakes per unit time) of local seimicity after large earthquakes, where triggering of earthquakes by other earthquakes through either static or dynamic stress changes is one of the predominant physical processes occurring over a wide range of spatio-temporal scales Hainzl et al., 2014). The empirically derived Omori-Utsu relation (Omori, 1894;Utsu, 1957) is used to encode temporal clustering in many of the aforementioned models of seismic-10 ity, e.g., the extensively studied ETAS model employs this relation. Yet, in its original formulation the Omori-Utsu relation is typically not self-similar. By formulating the rate of earthquakes as a self-similar process which gives rise to a generalized Omori-Utsu relation, one finds greater agreement with observed seismic behavior in Southern California (SC) when compared to the standard non self-similar form (Davidsen and Baiesi, 2016). Such self-similarity provides support for the hypothesis that one can scale up constitutive rules derived from fracture and friction experiments in the lab to tectonic earthquakes (Scholz,15 1990; Mogi, 2007). The self-similar aftershock rates (SSAR) model proposed in (Davidsen and Baiesi, 2016) is a model of earthquake-earthquake triggering, similar to the ETAS model. The SSAR model contains two distinguishing features; event rates are self-similar with respect to the magnitude difference between the trigger and the triggered event, and the frequencymagnitude distribution of the triggered events for a single trigger varies in time. In the standard Omori-Utsu relation the number of triggered events of a particular energy does not simply depend on the magnitude difference between trigger and triggered 20 events (i.e. mother-daughter events). This relation is not self-similar unless very special conditions are met; conditions which are typically inconsistent with observations (Davidsen and Baiesi, 2016). The ansatz for a self-similar mother-daughter rate relation can be expressed as, where τ ∆m and c ∆m are two time scales that only depend on ∆m = m − m as , m and m as being the magnitudes of the 25 trigger and triggered event, respectively, and t is the time interval between the trigger and triggered event. This expression is a generalization of the dynamical scaling hypothesis proposed in (Lippiello et al., 2007b, a), which did not consider the time scale τ ∆m and is, thus, inconsistent with high-resolution SC earthquake catalogs (Davidsen and Baiesi, 2016). Dependence on the magnitude difference for the two time scales allows one to obtain two scale-free regimes when considering the frequencymagnitude distribution of events triggered by events of the same magnitude; a feature consistent with observed behavior in 30 the SC catalog (Davidsen and Baiesi, 2016) and shown to exist in the detailed analysis of aftershock rates in Japan (Peng et al., 2007). Furthermore, scaling of the form in Eq. 1 for a particular f t c∆m was shown to be consistent with all accepted empirical relations (Davidsen and Baiesi, 2016). Just as the ETAS model is widely used in forecasting efforts, one would wish to use the SSAR model for this particular purpose given that the latter was shown to better describe the SC seismic data. The main difference between the ETAS and SSAR model is that in the latter case the mother-daughter magnitudes are effectively coupled in accordance with Eq. 1. From this vantage point, one would like to quantify the strength of the statistical correlation of magnitudes in a time ordered catalog in the SSAR model, which may ultimately aid in developing more reliable forecasting methods. For this purpose, the magnitude correlations between subsequent events are of particular interest. In order to study 5 magnitude correlations between subsequent events we apply here a statistical method similar to the ones employed in (Lippiello et al., 2008;Davidsen and Green, 2011;Davidsen et al., 2012). An important aspect to highlight is that in our analysis we test two different types of null hypotheses against the SSAR model. We find that the null hypothesis plays a significant role in the types and strength of magnitude correlations observed. This allows us to distinguish between trivial magnitude correlations that are simply a consequence of the variations in frequency-magnitude distribution and non-trivial ones that are not.

Overview of this paper
We first give a brief overview of the SSAR model (Sect. 2), introduce the specifics of the surrogate catalogs (Sect. 2.1), followed by the methodology (Sect. 3.1) and analysis of the magnitude correlations between subsequent events through the lens of a particular statistical measure (Sect. 3.2, 3.3). In the latter, we show why it is important that in the analysis of magnitude correlations care must be taken with the methodology (on choosing the randomized magnitudes, i.e., the type of null model) if 15 one wishes to avoid confounding factors. In Sect. 4 we present a discussion of our results.

The self-similar aftershock rates (SSAR) model
The SSAR model recasts the standard Omori-Utsu rate equation into a self-similar version. A distinguishing feature of the rate equation in the SSAR model is that it only depends on the difference between mother-daughter events making it self-similar (e.g., the rates of a magnitude 3 mother and magnitude 2 daughter event are the same as those of a magnitude 5 mother and a 20 magnitude 4 daughter event), with time scales, c ∆m = c 0 10 g∆m and τ ∆m = τ 0 10 −z∆m , where g and z are universal scaling exponents, with constant prefactors c 0 and τ 0 , and p 1 (Davidsen and Baiesi, 2016) 25 (p ≤ 1 is unphysical if considering only daughter events aka directly triggered events, see, for example, (Davidsen et al., 2015)). A similar ansatz with related scaling relations has been proposed in (Shcherbakov et al., 2015) in a context in which the magnitude of the largest aftershock is assumed to play a significant role. To obtain the total number of triggered events of magnitude m as for a given trigger m , we integrate Eq. 2 in the time domain, which only depends on ∆m ensuring self-similarity. Integrating Eq. 4 from a chosen cutoff magnitude m cut to ∞, we find 5 which is simply the Gutenberg-Richter relation for triggered events (Davidsen and Baiesi, 2016), giving us the scaling relation, In contrast, for finite times the number of triggered events of magnitude m as up to a time t f is, 10 Plotting Eq. 7 for different values of t f in Fig. 1, we observe a defining characteristic of the SSAR model. Unlike another self-similar model (Lippiello et al., 2007a(Lippiello et al., , 2008, two scale-free regimes co-exist for all finite values of t f in the frequency-magnitude distribution; an effect consistent with the observed temporal variability of the b-value during aftershock sequences (Shcherbakov et al., 2004). Namely, b → z corresponds to the dominating exponent in the early time limit of Eq. 2, and a second regime with b → g + z that dominates at later times. In other words, at early times we can observe a b-value of 15 z over an extended regime of small magnitudes but as t f increases the transition point moves towards smaller magnitudes and we begin to see a more extended range with a b-value of g + z, which corresponds to the asymptotic behavior for t f → ∞.
Analogously to the ETAS model, the full SSAR model is given by a time-varying seismic rate (also called the conditional intensity or stochastic intensity), which takes on the following form where µ is a constant and s 0 (m) is the probability density function (PDF) of the magnitudes of background events, i.e., events that are not triggered by other events, and their product determines the background rate, which is assumed to be uniform in time and space. Similarly, ψ m,m (t) , s (m) , and ζ m (r) are the PDFs for the temporal distance, magnitude and spatial distance of daughter events triggered by a mother of magnitude m , respectively. κ (m ) corresponds to the total number of daughters triggered by a mother, often denoted as the productivity relation. For the purpose of our temporal analysis of magnitudes, we can ignore the spatial component in Eq. 8, we refer the reader to Davidsen and Baiesi, 2016) for a treatment of ζ m (r). The PDFs for the magnitude distribution of background and triggered events are the normalized Gutenberg-Richter relations: where m cut indicates the lower magnitude cut-off, β = b ln 10 and β as = b as ln 10. The productivity relation and the normalized temporal distribution for the generalized Omori-Utsu relation are, respectively,

SSAR model catalogs
Since the SSAR model was tested using a catalog from SC (Davidsen and Baiesi, 2016), we focus here on synthetic model catalogs that are comparable to those from SC. We would like to point out again that for the purpose of our analysis of magnitude correlations below, the spatial location of events is not relevant. A seismic catalog generated through the SSAR model (giving b as = 0.9) that agree with those of SC (Davidsen and Baiesi, 2016). This corresponds to a coverage of about 36 years, the current length of the relocated SC catalog from 1981 to 2017 (Hauksson et al., 2012). We have also imposed a hard up-10 per cutoff, M max = 7.40, for the largest magnitude possible in the SSAR-SC catalog in order to avoid an unphysical runway process.

Magnitude correlations in the SSAR model
In this section we aim to answer the question of what is the type and strenght of the effective magnitude correlations in the SSAR model (for an analysis of magnitude correlations in SC catalogs, see (Davidsen and Green, 2011)). Correlations arise by 15 means of the rate equation given by Eq. 2 since the timing of the daughters will depend on the magnitude difference between the daughters and mothers as captured by the functional dependencies of c ∆m and τ ∆m . We first draw attention to the methodology since this is a crucial aspect in understanding the types of correlations observed in the model.

Methodology
Our study of magnitude correlations, similar to methods used in (Lippiello et al., 2008;Davidsen and Green, 2011;Davidsen et al., 2012), considers subsequent events in the time ordered catalog. Specifically, it focuses on ∆m i = m i+1 − m i (for a particular magnitude thresholds m th ) and compares these to randomized magnitude differences averaged over 500 realizations, i.e., ∆m * i = m i * −m i where m i * is a magnitude chosen at random. If magnitude correlations between subsequent events in the 5 ordered catalog are present, the distribution of ∆m will deviate from the distribution of the randomized case; ∆m * . To asses whether magnitude correlations exist in the SSAR model we considered three types of conditioning (unconditioned, ∆t and ∆t For ∆t and ∆t & M-D (mother-daughter) conditioning we consider the corresponding quantities, 10 δP (m 0 |∆t < y) = P (∆m < m 0 |∆t < y) − P (∆m * < m 0 |∆t < y) , and Specifically, for ∆t and ∆t & M-D conditioning one only considers subsequent event pairs and their ∆m i if the time interval between the two events is not longer than ∆t. In addition, for ∆t & M-D conditioning these event pairs also have to 15 be a mother-daughter pair. The reason why we choose to condition on time intervals is motivated by the expectation that event pairs that are closer in time are more likely to be related -either by being mother-daughter pairs or by being daughters of the same mother -than those further apart. Note that in the SSAR model all dependencies are fundamentally encoded at the mother-daughter level, viz. Eq. 2. By conditioning on time we are also preferentially "picking" certain magnitude differences via the rate equation, Eq. 2. 20 Another important aspect in our analysis is how we randomly choose the magnitudes m i * in the case of ∆t or ∆t & M-D conditioning. One can either pick m i * 's from the already conditioned catalog -which we call sub-catalog randomizing -or one can pick m i * 's from the full unconditioned catalog -called full-catalog randomizing. It is important to point out that in sub-catalog randomizing the frequency-magnitude distributions of m i+1 and m i * are identical by construction. In full-catalog randomizing, however, the m i * might or might not follow a different frequency-magnitude distribution. This is possible since 25 the frequency-magnitude distribution can vary in the SSAR model as discussed above. A more detailed explanation of sub-and full-catalog randomizing can be found in the supplemental material. (non-trivial) magnitude correlations, while the correlations in the other case correspond to a mixing of non-trivial and trivial correlations, where the latter simply arise due to the differences in the frequency-magnitude distribution.

Magnitude correlations for the unconditioned case of the SSAR model
In Fig. 3 we show the previously described measure of magnitude correlations for subsequent events for the unconditioned case in the SSAR-SC catalog. Magnitude correlations that are significant at the 3σ-level exist in the SSAR-SC catalog in the 5 range m th = 1.60 -2.80. Inspecting the slope of δP (m 0 ) in Fig. 3, we see that the values ∆m have a higher tendency to lie in a given range when compared to the randomized case ∆m * , and less likely to lie outside said range: for m th < 2.4 the slope of δP is typically positive in the range -0.5 to 0.25 showing an ≈ 0.9% higher probability that ∆m lies within this range when compared to ∆m * . While significant, this difference is very small. The absence of significant correlation at the 3σ-level for m th = 3.40 (the current magnitude of completeness for SC (Schorlemmer and Woessner, 2008)) is simply a consequence of 10 an insufficient number of events.

∆t & mother-Daughter conditioning in the SSAR model
To test whether magnitude correlation become stronger if one considers pairs of events that are related, we now focus on the magnitude correlation analysis for ∆t & M-D conditioning. The correlations that arise under ∆t & M-D conditioning are shown in Fig. 4. The two randomization types produce vastly different results. For sub-catalog randomizing, Fig. 4 a), we see is now up to four times higher for the smallest ∆t than for δP (m 0 ) (Fig. 3). Even for δP (m 0 |∆t < ∞ & M-D) there is a significant increase in the excess of daughters that are on average the same size as the mother to about 2.6%. As before, this excess comes at the expense of significantly smaller and larger daughter events. As Fig. 4 c) shows, a similar behavior occurs for larger m th .
In contrast, when estimating δP (m 0 |∆t < y & M-D) using full-catalog randomizing, Fig. 4 b), the distribution is qualita-5 tively and quantitatively distinct compared to δP (m 0 ). Specifically, Fig. 4 b) demonstrates that independent of the ∆t values, the magnitudes of the daughter events tend to be on average larger than those of the mother events compared to what is expected based on the null model. The associated probability increases from a minimum of 8% to over 20% for increasing values of ∆t.
As Fig. 4 d) shows, a similar behavior occurs for larger m th .
As discussed above, the underlying difference between sub-catalog and full-catalog randomizing for ∆t & M-D condition-10 ing is the frequency-magnitude distribution of the randomized daughter events (c.f., end of Sect.3.1). Thus, our observations indicate that the non-trivial magnitude correlations as captured by sub-catalog randomizing are significant but smaller by up to a factor of 6 for ∆ = 10 2 s and m th = 2.4 compared to the mixture of trivial and non-trivial magnitude correlations measured by using full-catalog randomizing. This indicates that the trivial magnitude correlations arising from differences in the frequency-magnitude distribution significantly outweigh the non-trivial ones under appropriate conditioning and play the more While in our model simulations we can readily identify mother-daughter pairs, i.e., the ground truth is known, this is not the case for field data. Thus, for such catalogs -including the SC catalog -one would need to infer mother-daughter pairs m th =1.6 a) Δ t = 10 2 s Δ t = 10 2.8 s All events aka decluster the catalog first (Baiesi and Paczuski, 2004;Zaliapin et al., 2008;Marsan and Lengliné, 2008;Zaliapin and Ben-Zion, 2013;Gu et al., 2013) in order to estimate the magnitude correlations for ∆t & M-D conditioning. As an alternative, time conditioning alone has been used in the past (Lippiello et al., 2008;Davidsen and Green, 2011;Davidsen et al., 2012).
This is what is shown in Fig. 5. When comparing to Fig. 4, a significant decrease in amplitude of δP can be observed. This decrease is especially large for small m th . Yet, as before the trivial magnitude correlations arising from differences in the 5 frequency-magnitude distribution significantly outweigh the non-trivial ones.
To clarify the reason for the difference between the ∆t & M-D conditioning and the ∆t conditioning, we can examine the ratio of event pairs within ∆t that are mother-daughter to all event pairs that fall within the time interval ∆t in the SSAR model (Fig. 6). According to Fig. 6, in order to maximize the ratio of mother-daughter events one should choose subsequent event pairs that have a high m th and are close in time (small ∆t). This explains the differences between two different types of 10 conditioning shown in Fig. 4 and in Fig. 5, which become less pronounced for higher m th and smaller ∆t.
The aforementioned maximization comes with a trade-off; although a higher m th value captures more mother-daughter pairs, the total number of selected event pairs goes down at the same time leading to higher statistical uncertainties. It is also important to realize that the ratios shown in Fig. 6 are for the specific parameters used in our realization of the SSAR model

Discussion & Conclusion
Through a particular statistical measure (Sect. 3.1) we have shown how two different types of magnitude correlations between subsequent events arise in the SSAR model (Sect. 3.3). Trivial correlations are largely a consequence of variations in the frequency-magnitude distribution, while this is not the case for non-trivial correlations, similar to what has been discussed in the context of tectonic seismicity (Davidsen and Green, 2011). Both types of correlations can be estimated by using different 5 underlying null models, implemented here by the two different types of catalog randomizing. Given that magnitudes in the SSAR model are not independent (as exemplified in Eq. 2), it does not come as a surprise that non-zero magnitude correlations exist. We were able to explicitly show that it is indeed the mother-daughter pairs that are largely responsible for these correlations. Based on this, we were also able to show that one can increase the observed magnitude correlations by conditioning on shorter time intervals and considering higher values of m th (c.f., Fig. 6); an important fact one can use when triggering 10 relations or declustering algorithms are unknown and only information on time intervals are available, such as in the case of real world catalogs. When dealing with real-world catalogs one needs, however, to consider the effects due to magnitude of completeness and short term aftershock incompleteness as well (Kagan, 2004;Moradpour et al., 2014;Hainzl, 2016).
Finally, the significantly higher strength of the trivial correlations compared to the non-trivial correlations is the main outcome of our analysis. Thus, when it comes to improving earthquake forecasting efforts our analysis leads us to believe that 15 looking at the time variations in the frequency-magnitude distribution could perhaps be a more fruitful approach then focusing on non-trivial correlations. Using the SSAR model instead of the ETAS model in existing forecasting frameworks would be one way to utilize this. This remains a challenge for the future.
type of coding that would be required for the creation of the SSAR code and for helpful consultations, to Mohammed Yaghoobi for helpful discussions on the interpretation of the magnitude correlation plots and to Ayush Mandawal for lending an ear and participating in general dialogues on the topic. JD was financially supported by NSERC.