Scaling laws in earthquake memory for interevent times and distances

Yongwen Zhang,1,2,3,* Jingfang Fan ,4,2 Warner Marzocchi ,5 Avi Shapira ,6 Rami Hofstetter,7 Shlomo Havlin,2 and Yosef Ashkenazy1 1Department of Solar Energy and Environmental Physics, The Jacob Blaustein Institutes for Desert Research, Ben-Gurion University of the Negev, Midreshet Ben-Gurion 84990, Israel 2Department of Physics, Bar-Ilan University, Ramat Gan 52900, Israel 3Data Science Research Center, Kunming University of Science and Technology, Kunming 650500, Yunnan, China 4Potsdam Institute for Climate Impact Research, 14412 Potsdam, Germany 5Department of Earth, Environmental, and Resources Sciences, University of Naples, Federico II, Complesso di Monte Sant’Angelo, Via Cinthia 21, 80126 Napoli, Italy 6National Institute for Regulation of Emergency and Disaster, College of Law and Business, Bnei Brak 511080, Israel 7Geophysical Institute of Israel, Lod 7019802, Israel

Earthquakes involve complex processes that span a wide range of spatial and temporal scales. The limited predictability of earthquakes is partly due to their erratic nature and partly due to the lack of understanding of their underlying mechanisms. To improve our understanding, and possibly the predictability, of earthquakes, we develop here a lagged conditional probability method to study the spatial and temporal long-term memory of interevent times and distances of earthquakes above a certain magnitude. We find, in real data from different locations, that the lagged conditional probabilities show long-term memory for both the interevent times and interevent distances and that the memory functions obey scaling and decay slowly with time, while at a characteristic (crossover) time the decay rate becomes faster. We also show that the epidemic-type aftershock sequence model, which is often used to forecast earthquake events, fails to reproduce the scaling function of real catalogs as well as the crossover in the scaling function. Our results suggest that the aftershock rate is a critical factor controlling the long-term memory.

I. INTRODUCTION
The mechanisms of earthquakes are still not fully understood and remain a great scientific challenge [1]. Still, there are some well-known empirical laws regarding earthquakes: (i) the Gutenberg-Richter law, which relates the number of earthquakes N with magnitude larger than m, in a given region and time period, to the magnitude m as N(m) ∝ 10^(−bm) (b ≈ 1) [2], and (ii) the Omori law, according to which the rate of triggered events decays as ∼ t^(−p), where t is the time since a triggering earthquake (p ≈ 1 for large earthquakes) [3].
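As a concrete numerical illustration of the Gutenberg-Richter law (a minimal sketch on a synthetic catalog, not part of the analysis in this paper; the function names and the use of the standard Aki maximum-likelihood b-value estimator are our own choices):

```python
import numpy as np

def gutenberg_richter_counts(magnitudes, thresholds):
    """Number of events with magnitude >= m for each threshold m."""
    magnitudes = np.asarray(magnitudes)
    return np.array([(magnitudes >= m).sum() for m in thresholds])

def estimate_b_value(magnitudes, m0):
    """Maximum-likelihood b-value (Aki estimator) above completeness m0."""
    m = np.asarray(magnitudes)
    m = m[m >= m0]
    return np.log10(np.e) / (m.mean() - m0)

# Synthetic magnitudes obeying N(m) ~ 10^(-b m) with b = 1: magnitudes above
# the completeness level are exponentially distributed with rate b*ln(10).
rng = np.random.default_rng(0)
mags = rng.exponential(scale=1 / np.log(10), size=100_000)
print(round(estimate_b_value(mags, 0.0), 2))  # close to 1.0
```

The estimated b-value recovers the b = 1 used to generate the synthetic magnitudes, mirroring the empirical value quoted above.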
An important characteristic of some complex systems is scaling behavior [4], and earthquakes have been found to follow such behavior [5][6][7]. Using the Gutenberg-Richter law and the exponent of the Omori law, Bak et al. [8] found that the probability density function (PDF) of interevent times for different magnitude thresholds and different spatial grid sizes can be rescaled into a single function. This suggests a universal scaling law for earthquakes. Corral [9] extended this scaling to different regions and introduced a more general approach. Specifically, he introduced a unified function f(x) to describe the distribution of interevent times as D(τ) = Rf(Rτ), where τ is the interevent time and R = 1/⟨τ⟩ is the average occurrence rate, which depends on the magnitude threshold, spatial scale, and location. Corral also argued that the optimal fit for f(x) is the generalized gamma distribution [10]. This scaling function follows a power law at small scales and decays exponentially at large scales. Some questions have been raised regarding the universality of the scaling with region size [11]. In the context of the epidemic-type aftershock sequence (ETAS) model, the scaling function has been found to depend on the ratio between correlated and independent events [11,12]. In addition, a different study suggested that multiple characteristic timescales, which are controlled by the parameters of the ETAS model, are relevant for the universal scaling behavior of the interevent time distribution in the ETAS model [13].
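The rescaling D(τ) = Rf(Rτ) can be illustrated on synthetic data (a minimal sketch under the simplifying assumption of Poissonian occurrence, for which f(x) = e^(−x); the function name and tolerance are ours, and real catalogs deviate from this memoryless form):

```python
import numpy as np

def rescaled_interevent_pdf(event_times, bins):
    """Histogram of rescaled interevent times x = R*tau, with R = 1/<tau>.
    Returns bin centers and the estimated scaling function f(x)."""
    tau = np.diff(np.sort(np.asarray(event_times, dtype=float)))
    tau = tau[tau > 0]
    R = 1.0 / tau.mean()
    hist, edges = np.histogram(R * tau, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, hist

# Two Poissonian catalogs with very different rates collapse onto exp(-x):
rng = np.random.default_rng(1)
bins = np.linspace(0, 5, 21)
_, f1 = rescaled_interevent_pdf(np.cumsum(rng.exponential(1.0, 50_000)), bins)
_, f2 = rescaled_interevent_pdf(np.cumsum(rng.exponential(37.0, 50_000)), bins)
print(np.max(np.abs(f1 - f2)) < 0.05)  # True: same scaling function
```

Despite occurrence rates differing by a factor of 37, the rescaled histograms agree, which is the essence of Corral's collapse.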
The distribution of earthquake events alone does not reflect all the information about the dynamics, and further time series analysis can improve our understanding of the underlying dynamics of earthquakes. For example, Livina et al. [14] studied the conditional probability of consecutive interevent times and found that these are correlated and not random, i.e., a short interevent time tends to follow a short one and a long interevent time tends to follow a long one. Furthermore, detrended fluctuation analysis (DFA) of the interevent time series revealed long-range (power-law) correlations [15]. In addition, memory has been found in the time series of earthquake magnitudes [15,16]. The conditional probability method and DFA have been applied recently to study the memory in consecutive interevent interval time series of real and ETAS-model earthquake data [17].
The studies mentioned above [14][15][16][17] focused only on short-term memory, i.e., on successive interevent intervals of earthquakes. However, as shown below, there exists long-term memory in both interevent times and distances. Long-term memory in earthquakes has been found by previous studies using different techniques [18][19][20]. Here we study the long-term memory by considering the lagged conditional probabilities of interevent times and distances in real earthquake catalogs. Thus, we consider not only the dependence of an interevent interval on the previous one (as in Refs. [14,17]) but also the conditional probability of an interevent interval given the kth previous (lagged) interevent interval. Moreover, we study not only interevent times (as in previous studies) but also interevent distances.

II. MEMORY IN REAL SEISMIC CATALOGS
We start by analyzing the seismic catalog of Italy [21]. An interevent time τ_i is defined as the time interval between two consecutive earthquake events above a certain magnitude threshold, τ_i = t_(i+1) − t_i (in days). Following the Gutenberg-Richter law, the mean interevent time increases with the threshold magnitude. Similarly, we define an interevent distance r_i as the distance (in kilometers) between the locations of events i + 1 and i above a certain magnitude threshold. Figure 1 depicts the time series of interevent times and their corresponding distances for all of Italy. As can be seen, after the occurrence of a large earthquake, the interevent times decrease rapidly and then slowly increase, in agreement with the Omori law mentioned above [Fig. 1(a)]. Similarly, the interevent distances [Fig. 1(b)] also typically decrease fast after a large earthquake, but the subsequent gradual increase observed in the interevent times [see Fig. 1(a)] is less apparent here. We find that similar behavior occurs for two specific smaller areas in Italy (see Fig. S3 in [22]) and also for Japan and California [23] (Fig. S4 in [22]). Figure S5 in [22] depicts a scatter plot of interevent times vs interevent distances in which two well-separated blobs can be observed [see also Fig. 2(b)]. These two groups can be attributed to aftershocks (left blob, short distances) and main shocks (right blob, long distances) (see also [24,25]). Figure S5 in [22] shows that the interevent times and distances exhibit some dependence across the two blobs (shorter distances correspond to shorter interevent times) but seem almost uncorrelated within each blob.
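The construction of the two interevent series can be sketched as follows (a minimal illustration, not the authors' code; the great-circle haversine formula is one common choice for epicentral distances, and the column ordering of the toy catalog is ours):

```python
import numpy as np

def interevent_series(times_days, lats, lons, mags, m0):
    """Interevent times tau_i (days) and distances r_i (km) between
    consecutive events with magnitude >= m0."""
    keep = np.asarray(mags) >= m0
    t = np.asarray(times_days, dtype=float)[keep]
    lat = np.radians(np.asarray(lats, dtype=float)[keep])
    lon = np.radians(np.asarray(lons, dtype=float)[keep])
    tau = np.diff(t)
    # Haversine distance between epicenters of events i and i+1:
    dlat, dlon = np.diff(lat), np.diff(lon)
    h = np.sin(dlat / 2) ** 2 + np.cos(lat[:-1]) * np.cos(lat[1:]) * np.sin(dlon / 2) ** 2
    r = 2 * 6371.0 * np.arcsin(np.sqrt(h))  # mean Earth radius in km
    return tau, r

# Toy catalog of three events; the last pair is 1 degree of latitude apart,
# i.e., roughly 111 km.
tau, r = interevent_series([0.0, 1.5, 2.0], [32.0, 32.0, 33.0],
                           [35.0, 35.0, 35.0], [3.1, 4.2, 3.5], m0=3.0)
print(tau)  # [1.5 0.5]
print(round(float(r[1])))  # 111
```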
To study possible long-term memory in the interevent times and distances of earthquakes, we introduce a lagged conditional PDF method. First, we sort all the interevent times (distances) in ascending order and then divide the sorted series into three equal quantiles. Thus, the first quantile Q1 contains the smallest 1/3 of the interevent times (distances) and the third quantile Q3 contains the largest 1/3 of the interevent times (distances). We define the conditional PDF of interevent times and distances as ρ(τ_k|τ_0) [ρ(r_k|r_0)], where τ_0 (r_0) belongs to Q1 or Q3 and τ_k (r_k) is the lagged kth interevent time (distance) that follows τ_0 (r_0). Note that earlier studies [14,17] considered only the first lag (k = 1) and only interevent times (not interevent distances). Figure 2(a) shows the lagged k = 1 conditional PDF ρ(τ_1|τ_0) for the first (left peak) and third (right peak) quantiles. Both ρ(τ_1|τ_0) for Q1 and for Q3 are substantially different from the overall PDF ρ(τ) (indicated by the dashed line). These results are consistent with the significant short-term memory of the nearest time intervals reported in previous studies [14,17]. In addition, we find significant memory for the interevent distances, ρ(r_1|r_0), as shown in Fig. 2(b). Moreover, and interestingly, we find significant long-term and long-range memory (correlations) even for large lags, e.g., for k = 10 [Figs. 2(c) and 2(d)] and for k = 50 and k = 100 (Fig. S6 in [22]). The significance of this type of long-term memory can be verified by comparing the conditional PDFs to those of randomly shuffled time series. Since the randomly shuffled time series contain no memory, ρ(τ_k|τ_0) and ρ(r_k|r_0) should be identical to the unconditional overall PDFs of interevent times and distances, ρ(τ) and ρ(r), as indeed shown in Fig. S7 in [22].
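The lagged conditional sampling described above can be sketched as follows (a minimal illustration on a synthetic persistent series rather than a seismic catalog; the function name and the random-walk surrogate are ours):

```python
import numpy as np

def lagged_conditional_samples(series, k, quantile):
    """Values series[i+k] conditioned on series[i] lying in the smallest
    third ('Q1') or the largest third ('Q3') of the series."""
    x = np.asarray(series, dtype=float)
    q1, q2 = np.quantile(x, [1 / 3, 2 / 3])
    cond = x[:-k] <= q1 if quantile == "Q1" else x[:-k] > q2
    return x[k:][cond]

# A persistent surrogate: the magnitude of a random walk changes slowly,
# so values conditioned on Q1 stay smaller than those conditioned on Q3
# even k = 10 steps later.
rng = np.random.default_rng(2)
walk = np.abs(np.cumsum(rng.normal(size=20_000)))
small = lagged_conditional_samples(walk, k=10, quantile="Q1")
large = lagged_conditional_samples(walk, k=10, quantile="Q3")
print(small.mean() < large.mean())  # True for a persistent series
```

For a memoryless (shuffled) series the two conditional samples would instead share the unconditional distribution, which is exactly the null comparison used in the text.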
To quantify the level of memory expressed by the conditional PDF, we suggest as a measure the common area between the conditional PDFs of the smallest and largest quantiles Q1 and Q3. In the absence of memory the common area is one, while when the PDFs of the two quantiles are completely separated, the common area is zero. The common area is marked in Fig. 2 as s_13 (dark area). We therefore define the level of memory to be S(τ_k|τ_0) ≡ 1 − s_13, which ranges between 0 and 1, where a large S(τ_k|τ_0) indicates strong memory. Similarly, S(r_k|r_0) represents the level of memory for the interevent distances.
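A histogram-based estimate of the overlap measure S can be sketched as follows (a minimal implementation under our own binning choices, not the authors' code; the two synthetic series are surrogates for memoryless and persistent data):

```python
import numpy as np

def memory_measure(series, k, nbins=50):
    """S = 1 - s_13, where s_13 is the common area between the lagged-k
    conditional PDFs of the smallest (Q1) and largest (Q3) terciles."""
    x = np.asarray(series, dtype=float)
    q1, q2 = np.quantile(x, [1 / 3, 2 / 3])
    follow_q1 = x[k:][x[:-k] <= q1]
    follow_q3 = x[k:][x[:-k] > q2]
    bins = np.histogram_bin_edges(x, bins=nbins)
    p1, _ = np.histogram(follow_q1, bins=bins, density=True)
    p3, _ = np.histogram(follow_q3, bins=bins, density=True)
    s13 = np.sum(np.minimum(p1, p3) * np.diff(bins))  # common (dark) area
    return 1.0 - s13

rng = np.random.default_rng(3)
iid = rng.exponential(size=50_000)                  # memoryless surrogate
walk = np.abs(np.cumsum(rng.normal(size=50_000)))   # persistent surrogate
print(memory_measure(iid, k=1) < 0.1)   # True: S near 0, no memory
print(memory_measure(walk, k=1) > 0.5)  # True: strong memory
```

As in the text, S ≈ 0 when the two conditional PDFs coincide and S → 1 when they separate completely.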
Next we divide all of Italy into a grid of boxes of edge size L and construct the interevent times (distances) of the events within each box. Then, as described above, we obtain the memory measure S(τ_k|τ_0) for all the boxes (grid points) of size L. The conditional PDF for the small grid size L = 3.5° is depicted in Fig. S8 in [22]. Figure 3(a) shows S(τ_k|τ_0) for different grid sizes L (different colors) and magnitude thresholds M_0 (different symbols). The memory for the small grid size (red or dark gray) is stronger than the memory for the large grid size (green or light gray). The weaker memory for the larger grid size is due to the mixing of weakly correlated events from remote locations with nearby highly correlated events. The corresponding memory measure S(r_k|r_0) for the interevent distances exhibits a weaker dependence on the grid size [Fig. 3(b)]. Note that the memory measure S has two decay rates: It decays slowly for small k and faster for large k, for both the interevent times and interevent distances. Moreover, the crossover point is nearly the same for both.
Since the frequency of earthquakes decreases exponentially with magnitude (the Gutenberg-Richter law; see Fig. S1 in [22]), the mean interevent time grows exponentially with the magnitude threshold. Indeed, if we rescale k as k × 10^(bM_0) (where b = 1, as in the Gutenberg-Richter law) and multiply the memory measure S by L^(d_u)/10^(aM_0), we obtain a single scaling function [Figs. 3(c) and 3(d)]. Thus, the rescaled memory measure is

F(x) = S L^(d_u)/10^(aM_0), with x = k × 10^(bM_0), (1)

where d_u = 0.14 and a = 0.09 for the interevent times and d_u = −0.08 and a = 0.24 for the interevent distances. Thus, we are able to rescale the memory measures of different grid sizes and different magnitude thresholds into a single function (memory measure). The scaling parameters were obtained by minimizing the average of the standard deviation over all bins in Figs. 3(a) and 3(b). The average standard deviation for different parameter values is shown in Fig. S9 in [22]. It is apparent that the memory measure decreases for larger grid size L, and d_u > 0 represents this decrease for the interevent times. For the interevent distances we find the opposite, d_u < 0, indicating that the memory measure actually increases with the grid size. The PDF of the large quantile Q3 is limited by the size L, and as the grid size increases, the PDF of Q3 shifts to the right and the common area is reduced [Fig. 2(b)], indicating stronger memory. We find a positive a, indicating that a larger magnitude threshold M_0 tends to yield stronger memory after k is rescaled. Note that the parameter a of the interevent distances is larger than the corresponding parameter a of the interevent times.
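The collapse criterion (the average standard deviation over bins of the rescaled curves) can be sketched numerically as follows (a minimal illustration on synthetic curves constructed to collapse exactly; the function name, binning, and test parameters are ours, not the authors' fitting code):

```python
import numpy as np

def collapse_quality(curves, a, du, b=1.0, nbins=20):
    """Average standard deviation of the rescaled memory curves across
    log-spaced bins of x = k * 10**(b*M0); smaller means a better collapse.
    'curves' maps (L, M0) -> (k_values, S_values)."""
    xs, Fs = [], []
    for (L, M0), (k, S) in curves.items():
        xs.append(np.asarray(k) * 10 ** (b * M0))
        Fs.append(np.asarray(S) * L**du / 10 ** (a * M0))
    x, F = np.concatenate(xs), np.concatenate(Fs)
    edges = np.logspace(np.log10(x.min()), np.log10(x.max()), nbins + 1)
    idx = np.clip(np.digitize(x, edges) - 1, 0, nbins - 1)
    stds = [F[idx == i].std() for i in range(nbins) if np.any(idx == i)]
    return float(np.mean(stds))

# Synthetic curves built to obey F(x) = x**-0.19 exactly with a = 0.09, du = 0.14:
k = np.logspace(0, 3, 30)
curves = {(L, M0): (k, (k * 10**M0) ** -0.19 * 10 ** (0.09 * M0) / L**0.14)
          for L in (1.0, 2.0) for M0 in (3.0, 4.0)}
print(collapse_quality(curves, a=0.09, du=0.14) <
      collapse_quality(curves, a=0.0, du=0.0))  # True: true parameters collapse best
```

Minimizing this quantity over (a, d_u) is one way to reproduce the fitting procedure described above.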
The scaling functions shown in Figs. 3(c) and 3(d) indicate a crossover between two distinct power-law relations. The scaling function follows F(x) ∼ x^(−γ_1) for x = k × 10^(bM_0) in the range [10^0, 10^5], where γ_1 = 0.19 and 0.21 for the interevent times and interevent distances, respectively. Both scaling functions exhibit a significant crossover at x_c ≈ 10^5.0, and the approximate scaling exponent for large x (i.e., in the range [10^5, 10^5.5]) is γ_2 = 0.88 and 1.11 for the interevent times and distances, respectively. Clearly γ_1 ≪ γ_2, such that the decay at small scales is much slower than the decay at larger scales. Note that, due to the short range of large x ([10^5, 10^5.5]), the decay in this range could also be fitted by an exponential. We cannot rule out that the crossover is due to a transition from a correlated (short-range) regime to an uncorrelated (long-range) regime in which the decay is exponential. Figure S10 in [22] shows the relation between the average (and rescaled average) time differences between two earthquakes and their lag k (and rescaled lag). It indicates that the crossover x_c for a smaller grid size L corresponds to a longer time t_c. For all of Italy, the time t_c is around 130 days and is the same for different magnitude thresholds.

TABLE I. Estimated parameters a and d_u, power-law exponents of the scaling function γ_1 and γ_2, and the crossover points x_c and t_c for the interevent times and distances for the Italy, Japan, and California earthquake catalogs. (Columns: Parameters, Italy, Japan, California.)
To verify the generality of our scaling, we performed the same scaling analysis [using Eq. (1)] for the Japan and California earthquake catalogs and obtained good scaling. The scaling functions exhibit a crossover similar to that discussed above (Figs. S11 and S12 in [22]). The scaling parameters and exponents are slightly different for the different catalogs and are summarized in Table I. The rescaled memory measure F of the Japan catalog decays more slowly (as expressed by the smaller exponents γ_1 and γ_2) in comparison to the other locations, probably due to the high aftershock rate there. The rescaled lags of the crossover x_c are also listed in Table I. Note that the scaling curves of interevent times and distances have a similar crossover point x_c for the Italy, Japan, and California catalogs. Thus the crossover x_c seems to be unaffected by the time and size scales of the catalogs (see also Fig. S16 in [22]). It is likely that the crossover point x_c is related to the universal behavior of aftershocks. After a characteristic time, the sequences of independent and dependent events overlap with high probability. This could destroy correlations, leading to a fast decay of memory even when correlations exist in spatially separated but temporally overlapping sequences [11].

III. MEMORY IN THE ETAS MODEL
A good earthquake model should be able to reproduce the observed long-term and long-range memory features. Such a model could have the potential to significantly improve the forecasting capability of earthquakes. Thus, we next test the memory in the frequently used earthquake model, the epidemic-type aftershock sequence model (see the Supplemental Material in [22]) [26,27]. The ETAS model can provide statistically reliable forecasts of seismicity [28]. Figure S13 in [22] shows the conditional PDFs of the interevent times and distances; these exhibit a much larger overlap in comparison to the memory in the data (Fig. 2), indicating weaker memory in the model. Figures 4(a) and 4(b) show the memory measure for both interevent times and distances calculated from the ETAS model. As for the real catalogs, the memory measure of the model satisfies the scaling relation expressed in Eq. (1) [Figs. 4(c) and 4(d)]. The scaling parameters are d_u = 0.16 and a = 0.14 for the interevent times and d_u = −0.04 and a = 0.55 for the interevent distances. As expected, the grid-size scaling leads to d_u > 0 for the interevent times and d_u < 0 for the interevent distances, similar to the real catalogs. Also, the magnitude-threshold scaling parameter a is positive, as for the real catalogs; however, the value of a for the interevent distances in the model (a = 0.55) is two to three times larger than the value we obtain for the real catalogs (a = 0.24, Table I).

FIG. 4. Colors and shapes are the same as in Fig. 3. Also shown is the rescaled memory measure for (c) interevent times and (d) distances. The model's parameters are estimated for the Italian catalog [17,29] (see also [22]). The memory measure S is averaged over 50 independent realizations, each including 10^6 events. The error bars are the standard deviations.
Thus, the memory measure of the model is significantly smaller than that of the real catalog, even when using the optimal parameters for the ETAS model suggested in [17,29]. Moreover, the crossover in the power-law behavior at x_c that is observed in the real catalogs (Fig. 3) cannot be seen in the model (Fig. 4). We thus conclude that the ETAS model does not reproduce the main memory features found in the present study for real catalogs.
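For readers who wish to experiment, a purely temporal ETAS-type simulation can be sketched in its standard branching form (a simplified variant, not the spatiotemporal model or the fitted parameters of [17,29]; all parameter values here are illustrative and chosen to keep the branching ratio subcritical):

```python
import numpy as np

rng = np.random.default_rng(4)

def sample_magnitude(n, b=1.0, m0=2.0):
    """Gutenberg-Richter magnitudes above completeness m0."""
    return m0 + rng.exponential(1 / (b * np.log(10)), size=n)

def sample_omori_delays(n, c=0.01, p=1.2):
    """Inverse-CDF sampling from the normalized Omori kernel (p > 1)."""
    u = rng.random(n)
    return c * ((1 - u) ** (1 / (1 - p)) - 1)

def simulate_etas(T=1000.0, mu=0.5, K=0.1, alpha=0.8, b=1.0, m0=2.0):
    """Background events are Poissonian with rate mu; each event of
    magnitude m spawns Poisson(K * 10**(alpha*(m - m0))) aftershocks
    with Omori-law delays. Returns a time-sorted catalog."""
    n_bg = rng.poisson(mu * T)
    times = list(rng.random(n_bg) * T)
    mags = list(sample_magnitude(n_bg, b, m0))
    queue = list(zip(times, mags))
    while queue:
        t, m = queue.pop()
        n_aft = rng.poisson(K * 10 ** (alpha * (m - m0)))
        for dt in sample_omori_delays(n_aft):
            if t + dt < T:
                ma = sample_magnitude(1, b, m0)[0]
                times.append(t + dt); mags.append(ma)
                queue.append((t + dt, ma))
    order = np.argsort(times)
    return np.asarray(times)[order], np.asarray(mags)[order]

t, m = simulate_etas()
print(len(t) > 0 and np.all(np.diff(t) >= 0))  # True: a sorted catalog
```

Feeding such synthetic catalogs into the lagged conditional PDF analysis above is one way to probe why the model's memory decays differently from the real data.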
To better understand which parameters control the power-law exponents and to find the optimal model parameters that reproduce the characteristics of the real catalogs, we perform sensitivity tests by varying the parameters p and α for different choices of μ. We estimate the scaling exponents γ_1 and γ_2 for the different regimes of k [similar to Figs. 3(c) and 3(d) and Table I]. The ratio γ_2/γ_1 between the large-k and small-k exponents quantifies the level of crossover and the double power-law behavior. For the real catalogs the ratio γ_2/γ_1 is much larger than 1: It is about 5 for the interevent times and above 2 for the interevent distances. For the model, the results of the sensitivity tests for interevent times and distances are shown in Figs. S14 and S15 in [22]. They indicate that γ_1 and γ_2 depend on both p and α and are almost unaffected by the choice of the background-noise level μ. A small p and large α (the bottom right corner of the parameter plane) correspond to a small γ_1 and a large γ_2; these exponents mimic the exponents of the data, and the largest γ_2/γ_1 is found in this corner too. Still, γ_2/γ_1 for the model is very different from the ratio for the real catalogs summarized in Table I; for the ETAS model, the largest γ_2/γ_1 is about 1.0, indicating the absence of the crossover and double power-law behavior observed for the data (see Figs. S14 and S15 in [22]).
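The two-regime exponent estimation used for the ratio γ_2/γ_1 can be sketched as a pair of log-log least-squares fits on either side of the crossover (a minimal illustration on a synthetic double power law with the exponents of Table I's order of magnitude; the function name and crossover value are ours):

```python
import numpy as np

def fit_two_regime_exponents(x, F, x_c):
    """Least-squares power-law slopes of F(x) ~ x**-gamma below and above x_c."""
    lx, lF = np.log10(x), np.log10(F)
    def slope(mask):
        A = np.vstack([lx[mask], np.ones(mask.sum())]).T
        coef, *_ = np.linalg.lstsq(A, lF[mask], rcond=None)
        return -coef[0]
    return slope(x < x_c), slope(x >= x_c)

# Synthetic crossover: gamma_1 = 0.2 below x_c = 1e5, gamma_2 = 0.9 above,
# with the two branches matched at x_c so F is continuous.
x = np.logspace(0, 5.5, 60)
F = np.where(x < 1e5, x**-0.2, 1e5 ** (0.9 - 0.2) * x**-0.9)
g1, g2 = fit_two_regime_exponents(x, F, x_c=1e5)
print(round(g1, 2), round(g2, 2))  # 0.2 0.9
```

Applying such fits over a grid of (p, α) values is one way to reproduce the sensitivity maps of γ_1, γ_2, and their ratio.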

IV. CONCLUSION
In summary, we have proposed a lagged conditional PDF method and detected significant memory in both interevent times and distances for real data and for the model. We proposed a scaling function for this memory, for both interevent times and distances, in which the scaling curves of different magnitude thresholds and grid sizes collapse onto a single curve. However, we found that the memory function in real data is very different from that of the ETAS model: The model's memory is weaker (stronger) at short (long) timescales compared to the real catalogs. Moreover, the model does not exhibit the clear crossover observed in the real catalogs (see Figs. 3 and 4). The two power-law decays found here, and the crossover between them, may imply the existence of large and small aftershock productivity rates α, corresponding to the short- and long-term memories of earthquakes.