Hierarchical burst model for complex bursty dynamics

Temporal inhomogeneities observed in various natural and social phenomena have often been characterized in terms of scaling behaviors in the autocorrelation function with a decaying exponent $\gamma$, the interevent time distribution with a power-law exponent $\alpha$, and the burst size distributions. Here the interevent time is defined as a time interval between two consecutive events in the event sequence, and the burst size denotes the number of events in a bursty train detected for a given time window. In order to understand such temporal scaling behaviors implying a hierarchical temporal structure, we devise a hierarchical burst model by assuming that each observed event might be a consequence of the multi-level causal or decision-making process. By studying our model analytically and numerically, we confirm the scaling relation $\alpha+\gamma=2$, established for the uncorrelated interevent times, despite of the existence of correlations between interevent times. Such correlations between interevent times are supported by the stretched exponential burst size distributions, for which we provide an analytic argument. In addition, by imposing conditions for the ordering of events, we observe an additional feature of log-periodic behavior in the autocorrelation function. Our modeling approach for the hierarchical temporal structure can help us better understand the underlying mechanisms behind complex bursty dynamics showing temporal scaling behaviors.


I. INTRODUCTION
Events in temporal patterns of natural and social phenomena have often been found to be inhomogeneously distributed in time.Examples include solar flares [1], earthquakes [2,3], neuronal firing [4], and human social activities [5,6].Such temporal inhomogeneities in event sequences have been studied in terms of bursts, which are rapidly occurring events in short-time periods, alternating with long periods of inactivity.It is known that many dynamical processes, such as spreading or diffusion, taking place in a network of individuals are strongly influenced by the bursty temporal patterns of individuals and/or interactions between them [7][8][9][10][11][12][13]. Therefore, it is of utmost importance to comprehensively characterize temporal inhomogeneities, not only for understanding various complex dynamics but also for predicting and even controlling them, if possible.
In order to characterize the temporal inhomogeneities in event sequences, we first denote the event sequence by x(t) that has a value of 1 at the moment of event occurred, 0 otherwise.Then one can measure an autocorrelation function with delay time t d as where • t is a time average.For event sequences with long-range memory effects, the autocorrelation function often shows a power-law decaying behavior as with a decaying exponent γ.In general, temporal correlations characterized by A(t d ) can be understood in terms of (i) interevent times and (ii) correlations between interevent times [14].Here the interevent time is defined as a time interval between two consecutive events, denoted by τ .The statistics of interevent times have been described by the interevent time distribution, while the correlations between interevent times have been studied in terms of burst size distributions [6,15].In many empirical datasets showing temporal inhomogeneities, the interevent time distribution P (τ ) has been characterized by a power-law function as with α denoting the power-law exponent [6].It has been proved that when interevent times are fully uncorrelated with each other, the power-law exponent α of the interevent time distribution is related to the decaying exponent γ of the autocorrelation function such that α+γ = 2 for 1 < α < 2 [16,17].On the other hand, the correlations between interevent times have been studied in terms of bursty trains [15].A bursty train or burst is defined as a set of events such that interevent times between any two consecutive events in the same burst are less than or equal to a given time window ∆t, while those between events in different bursts are larger than ∆t.We denote the number of events in each burst by b, and its distribution by Q ∆t (b).Since the uncorrelated interevent times result in the exponential function of burst size distributions, any deviation from the exponential function may indicate the existence of correlations between interevent times, often called correlated bursts [14,15,[18][19][20][21].In particular, the power-law burst size distributions for a wide range of ∆t have been observed in earthquakes, neuronal activities, and human communication patterns [15], i.e., with β denoting the power-law exponent.Here one can ask a question about how strong correlations between interevent times should be present to violate the scaling relation α + γ = 2 derived for the uncorrelated case.Our understanding on this issue is far from complete, except for few recent works [14,17,22].
Along with various characterization methods for the bursty temporal patterns, a number of modeling approaches have been suggested to understand the underlying mechanisms behind such temporal inhomogeneities [6].In the case with human dynamics, we find several modeling approaches, such as priority queuing models [5,23], inhomogeneous Poissonian processes [24,25], self-exciting point processes [20,26], and reinforcement models [15,27].Although previous modeling approaches have been successful for understanding the observed temporal inhomogeneities to some extent, we here take an alternative modeling approach, inspired by the scaling behaviors in Eqs. ( 2)-( 4), indicating a hierarchical temporal structure in various complex systems.In order to understand the hierarchical temporal structure, we devise a hierarchical burst model by assuming that each observed event in an event sequence might be a consequence of the multi-level causal or decision-making process: A seed event at the zeroth level induces other events at the first level, each of which in turn leads to other events at the second level, and so on.Thanks to the simplicity of our model, we can derive a fractal dimension d f of the event sequence, the decaying exponent γ of the autocorrelation function, and the power-law exponent α of the interevent time distribution to confirm the scaling relation α + γ = 2, while the derivation of the burst size distributions turns out to be not straightforward.Our modeling approach can help us better understand the underlying mechanisms behind complex bursty dynamics, e.g., in terms of a hierarchical task organization for human dynamics.We also note that hierarchical document streams have been modeled using an infinite-state automaton [28], sharing the goal with our approach.
Our paper is organized as follows: In Sec.II, after introducing the hierarchical burst model, we study our model analytically and numerically in terms of the scaling behaviors of the fractal temporal structure, the autocorrelation function, and the interevent time distribution.Then we numerically obtain the stretched exponential burst size distributions, for which we provide an analytic argument.We also discuss the effect of imposing the ordering of events in terms of log-periodicity in the autocorrelation function.Finally, we conclude our work in Sec.III.

A. Model definition
We introduce the hierarchical burst model by assuming that each observed event in an event sequence might be a consequence of the multi-level causal or decision-making process: A seed event at the zeroth level induces other events at the first level, each of which in turn leads to other events at the second level, and so on.Then the events at the final level compose the event sequence, while events at other levels are considered to be unobservable or hidden [29].Precisely, for the levels of l = 0, 1, • • • , L − 1, each event at the lth level induces exactly η events at the (l + 1)th level, where η > 1.If the timing of one event at the lth level is denoted by t l , the events induced by that event take place uniformly at random in the time interval of (t l , t l +R/λ l ], which we call an induction interval.Here R is the induction interval assigned to the seed event, and λ > 1 denotes the contraction factor between the induction intervals of consecutive levels.See Fig. 1(a) for the schematic diagram and Fig. 1(b) for an event sequence generated using η = 2, λ = 2.5, R = 1, and L = 8, resulting in n = η L = 256 events.
We focus on the case with one seed event, enabling us to set R = 1 without loss of generality.Then the case with multiple seed events will be briefly discussed in Subsec.II E. Since η > 1 and λ > 1, the induction interval decreases exponentially as a function of the level index l, while the number of events at the lth level is an exponentially increasing function of l.Hence, our model can be interpreted as a successive division of a big task (the seed event) into smaller tasks, ending up with the smallest unit of tasks (events at the final level) that are supposed to be executed in a bursty way.Our model can also be mapped to the one-dimensional Soneira-Peebles model [30], which was originally introduced to Simulation results of the hierarchical burst model using η = 2.In (a-c), the box counting result N (r) for measuring the fractal dimension, the autocorrelation function A(t d ), and the interevent time distribution P (τ ) in the case with λ = 2.5 and L = 18 (circles) are compared to the analytical results for d f , γ, and α (solid lines), respectively.Each curve was averaged over 500 realizations of event sequences.In (d-f), for the same value of λ = 2.5, we plot the fractal dimension d f (n), the decaying exponent γ(n) of the autocorrelation function, and the power-law exponent α(n) of the interevent time distribution, as functions of n = η L , i.e., the number of events in the event sequence (circles).For fitting the data, we adopt the functional form of f (n) = f (∞) + an −ν with ν > 0 (solid lines).In (g), we plot numerical values of d f , γ, and α using various values of λ for fixed η = 2 and L = 18, where each point (circle) was averaged over 200 realizations of event sequences, to confirm the scaling relations of α = 1 + d f (dotted line) and γ = 1 − d f (solid line).
generate self-similar galaxy distributions in two-or threedimensional space [31,32], and recently applied to model a population landscape for human mobility [33].

B. Temporal scaling behaviors
Once an event sequence of n = η L events is generated by our model, we can derive its fractal dimension, autocorrelation function, and interevent time distribution, while the derivation of burst size distributions turns out to be not straightforward.
For calculating the fractal dimension d f of the event sequence, the box-counting method is used: We count the number of boxes of size r needed to cover all events, which is denoted by N (r).If N (r) decays as a power law according to r, the corresponding power-law exponent defines the fractal dimension d f , namely, When the box size is given as Here we have assumed that the boxes covering events or induction intervals at the same level do not necessarily overlap, or that even when they overlap, its effect would be negligible in estimating the fractal dimension.We will use this assumption for the following analysis, while its effect will be numerically studied in Subsec.II D. Note that d f cannot be larger than the spatial (or temporal) dimension of 1, even when η > λ.However, we will consider only the case with η < λ.
As evident in Eq. ( 1), the autocorrelation function A(t d ) with delay time t d essentially measures the possibility of finding two events observed in t and t + t d , no matter how many events occur between them.The number of events within the range of t d from any event is of the order of t d f d using Eq. ( 6).Therefore, the number of events in the range of (t d , t d + dt d ) is of the order of t

implying that the autocorrelation function has the form of
Next, we derive the interevent time distribution P (τ ).Let us consider η l+1 events at the (l + 1)th level.Among them, events induced by the same event in t = t l at the lth level will be found in the range of (t l , t l +1/λ l ].Thus, the interevent times between events induced by the same lth-level event must be of the order of 1/λ l .On the other hand, events induced by different lth-level events will be separated by interevent times larger than 1/λ l .The number of such cases corresponds to that of events at the lth level, i.e., ∼ η l .Hence one can write By using F (τ ) ≡ ´∞ τ P (τ )dτ and the relation η = λ d f in Eq. ( 6), one gets F (τ ) ∼ τ −d f , leading to Since 0 ≤ d f ≤ 1 in our model, the value of α is limited to the range of [1,2].Finally, combing the results in Eq. ( 7) and Eq. ( 9), we obtain the scaling relation between α and γ as follows: which turns out to hold irrespective of d f , i.e., irrespective of η and λ.This scaling relation has been derived for the case that interevent times are fully uncorrelated with each other [16,17].Hence, this result in Eq. ( 10) may indicate that the correlations between interevent times in our model are not strong enough to violate the scaling relation in Eq. (10).In order to tackle this issue, we will study the burst size distribution Q ∆t (b) in Subsec.II C.
For the numerical simulations of our model, we begin with one seed event in t = 0 at the zeroth level.Then η events are uniformly distributed in the range of (0, 1] at the first level.Each of these induced events in turn induces η events at the second level in the range of (t 1 , t 1 + 1/λ], with t 1 denoting the timing of one of events at the first level.This induction process is repeated until the Lth level is reached, leaving us n = η L events.For the demonstration, we focus on the case with η = 2 and λ = 2.5, with which one expects d f ≈ 0.756 from Eq. ( 6), consequently γ ≈ 0.244 and α ≈ 1.756 from Eq. ( 7) and Eq. ( 9), respectively.
We analyze the generated event sequences for various values of L. For example, Fig. 2(a-c) shows the numerical results of N (r), A(t d ), and P (τ ), all averaged over 500 event sequences using L = 18.We find that the estimated values of corresponding power-law exponents, i.e., d f , γ, and α, are comparable with those expected from the analysis, but with some visible deviations in cases of d f and γ.These deviations could be due to the finitesize effects.In order to study such finite-size effects, we estimate the above power-law exponents for various sizes of n, equivalently, for various values of L = 12, • • • , 18, as shown in Fig. 2(d-f).The size-dependent power-law exponents are denoted by d f (n), γ(n), and α(n), respectively.Each of these exponents is fitted with a functional form of f (n) = f (∞) + an −ν with ν > 0, from which the value of f (∞) is obtained.We find that d f (∞) = 0.75(1), γ(∞) = 0.25 (1), and α(∞) = 1.75(1), all consistent with those expected from the analysis within error bars.Based on these results, we conclude that the effects of overlapping induction intervals at the same level turn out to be negligible to the scaling relations.Finally, we numerically confirm the scaling relations in Eqs. ( 7) and ( 9) for various values of λ when η = 2 is fixed, as depicted in Fig. 2(g).

C. Burst size distributions
In order to scrutinize the existence of correlations between interevent times, we measure the burst size distributions Q ∆t (b) for various values of the time window ∆t.For example, the numerical results for L = 18 are shown in Fig. 3(a).The curves of Q ∆t (b) for a wide range of ∆t, when properly normalized, turn out to collapse into the same curve for the range of b > b ∆t , where b ∆t is the average burst size when the time window is given as ∆t.This curve is well described by the stretched exponential function, i.e., with µ ≈ 0.29(1) and c ∆t denoting a proper coefficient depending on ∆t.The fact that Q ∆t (b) deviates from the exponential function indicates the existence of correlations between interevent times.At the same time, such correlations depicted in terms of stretched exponential functions might not be strong enough to violate the scaling relation between α and γ in Eq. ( 10).This conclusion is indeed consistent with the observations in Ref. [14], in which even the power-law burst size distribution in the form of Q ∆t (b) ∝ b −β could not violate the relation of α+γ = 2 unless it has a sufficiently heavy tail with β < 3.
In order to understand why burst size distributions observed in our model are better described by a stretched exponential function rather than a power-law function, we study how likely it is to cluster events induced by the different events to the same burst for a given time window.The more likely such case is to happen, the burst size distribution can have a heavier tail.Precisely, for a given time window ∆t, we calculate the probability of events induced by the different events to be clustered to the same burst, which is then compared to the probability of events induced by the same event to be clustered to the same burst.
For the analysis we consider the minimal case with η = 2.When the time window is given as ∆t = 1/λ l , we only need to consider the events at the lth, (l − 1)th, and (l−2)th levels, as depicted in Fig. 3(b).The timescales at other levels are either too large or too small to be relevant to the analysis.We denote the timing of one event at the (l − 2)th level by t l−2 .This event induces two events at the (l − 1)th level, whose timings are respectively t l−1,0 and t l−1,1 , satisfying These two events at the (l − 1)th level induce four events at the lth level, whose timings are respectively t l,0 , t l,1 , t l,2 , and t l,3 , satisfying  19) and Pr(τ2 < ∆t) in Eq. ( 29) for ∆t = 1/λ l .
That is, the events in t l,0 and t l,1 are induced by the event in t l−1,0 , while the events in t l,2 and t l,3 are induced by the event in t l−1,1 .By assuming that t l,1 < t l,2 , we have three interevent times between events at the lth level: see Fig. 3(b).Using the order statistics [34][35][36], we get the distribution of τ 1 as for 0 < τ 1 ≤ 1/λ l−1 , which is the same as P (τ 3 ).Then the probability of clustering two events induced by the same (l − 1)th-level event for a given ∆t = 1/λ l is calculated as Next, in order to derive the distribution of τ 2 , we rewrite τ 2 in Eq. ( 16) as where x see Fig. 3(b).Here x 0 is indeed the interevent time between events at the (l − 1)th level, leading to its distribution as for 0 < x 0 ≤ 1/λ l−2 .We also get the distribution of x 1 as for 0 < x 1 ≤ 1/λ l−1 , which is the same as P (x 2 ).We now calculate Pr(τ 2 < ∆t), i.e., (26) As x i s for i = 0, 1, 2 are statistically independent of each other, using t ≡ 1/λ l +1/λ l−1 we rewrite the above equation as where θ(•) is the Heaviside step function.Taking the Laplace transform, one gets By plugging the Laplace transforms of P (x i ) in Eqs. ( 24) and (25) into the above equation, and then taking the inverse Laplace transform of h(s), one can get h(t).We finally obtain for the entire range of λ > η = 2.By comparing Eq. ( 19) with Eq. (29), we conclude that for λ > 2 as numerically shown in Fig. 3(c).This inequality holds for any level index l, implying that the chance of clustering events induced by the different events at the previous level must be low at any level.This low chance in turn lowers the possibility of finding big bursts, giving us a hint at the reason why burst size distributions in our model do not show a heavier tail than the stretched exponential function.
4. Log-periodic behaviors in the autocorrelation functions of the hierarchical burst model using several values of (η, L) = (2, 17), (3,11), and (4, 9) for a fixed λ = 4.8 under conditions in Eqs. ( 31) and ( 32) for non-overlapping induction intervals, where all curves were averaged over 200 realizations of event sequences.For each curve, we have used the best fit value of γ fit for the best presentation of the log-periodicity.The curve for η = 2 was vertically shifted for the clear presentation.

D. Effect of non-overlapping induction intervals
Our model allows induction intervals at the same level to overlap with each other, although its effects turn out to be irrelevant to the scaling relations between d f , γ, and α, as discussed in Subsec.II B. Let us consider two events at the (l −1)th level, which respectively take place in times t l−1,0 and t l−1,1 , with t l−1,0 < t l−1,1 .Since induction intervals can overlap, it is possible that some events induced by the event in t l−1,0 take place later than other events induced by the event in t l−1,1 , e.g., in the case when t l,0 < t l,2 < t l,1 < t l,3 in Subsec.II C.This situation can be called an event crossing.The occurrence of event crossing may cause some problems, e.g., in the context of task executions: Although the tasks are supposed to be executed sequentially, they can be executed out of order, if possible.In order to avoid such event crossings, we impose a rule for the non-overlapping induction intervals.When one event in t l−1 at the (l − 1)th level induces η events at the lth level, their timings, denoted by t l,i or t l,j for i, j = 1, • • • , η, are to satisfy the following conditions at each level: By the first condition in Eq. ( 31) any descendent events of the event in t l−1 are forced to remain in the range of (t l−1 , t l−1 + 1/λ l−1 ].The second condition in Eq. ( 32) prohibits the induction intervals at the same level from overlapping with each other.
By performing numerical simulations, we find that the fractal dimension, the interevent time distribution, and the burst size distributions show overall the same behaviors as in the original version of our model (not shown).However, the autocorrelation function shows a qualitatively different behavior, as shown in Fig. 4. We find a power-law decaying function coupled with log-periodic behavior, say, with appropriate constants c 0 , ω, and φ.In particular, we can relate the frequency ω to the contraction factor λ, based on the observation that the distance between consecutive peak times of A(t d ) increases by a factor of λ mainly due to Eq. ( 32).Precisely, let us consider a simple log-periodic function of g(t) = cos(ω ln t).The peak times are determined by g(t k ) = 1, i.e., t k = e 2πk/ω for integers k.As the distance between the kth and (k + 1)th peak times is larger than the distance between the (k − 1)th and kth peak times by a factor λ, one can write leading to the relation between ω and λ as follows: For example, when λ = 4.8, we get ω ≈ 4.006 from Eq. ( 35), which is comparable with the numercial value of ω = 4.02(3) estimated from the curve for η = 3 in Fig. 4. We also find that ω is not a function of η, which is probably because the condition in Eq. ( 32) can be imposed irrespective of η.

E. Case with multiple seed events
So far we have considered the case only with one seed event at the zeroth level.Here we test if our conclusions in the case with a single seed event are robust with respect to the number of seed events at the zeroth level.For this, we perform the numerical simulations of our original model, i.e., allowing induction intervals to overlap, with 10 seed events whose timings are randomly chosen in the range of [0, 1].In this case, we set the induction interval for each seed event as R = 0.1.The numerical results are summarized in Fig. 5, showing overall the same scaling behaviors of the fractal dimension, the autocorrelation function, and the interevent time distribution.We also find the stretched exponential function with the same value of µ = 0.29(1) in Eq. ( 11) fitted well to the burst size distributions.

III. CONCLUSION
We have studied the hierarchical burst model for the hierarchical temporal structure by assuming that an observed event sequence is generated by a multi-level causal or decision-making process.A seed event at the zeroth level induces η events at the first level, each of which in turn induces other η events at the second level, and so on.The interval for the induction is assumed to decrease by a contraction factor λ from one level to the next level.Only the events at the final level are considered to be observed in the event sequence.We first analyze the model by deriving the analytic solutions for the fractal dimension d f = ln η/ ln λ, the autocorrelation function with power-law exponent γ = 1 − d f , and the interevent time distribution with power-law exponent α = 1+d f .We immediately obtain α + γ = 2, irrespective of d f .This scaling relation has been derived for the case that interevent times are fully uncorrelated with each other [16,17].The scaling relations of γ = 1 − d f and α = 1 + d f have also been numerically confirmed.
On the other hand, it turns out that the burst size distributions are not straightforward to analyze.By performing numerical simulations, we find the stretched exponential function for the burst size distributions, implying the existence of correlations between interevent times, often called correlated bursts.However, such correlations are not strong enough to violate the scaling relation α + γ = 2.For the stretched exponential burst size distributions, we provide an argument based on an analytical calculation.We also find that by imposing non-overlapping induction intervals for the ordering of events, the autocorrelation function is described by a power-law decaying function coupled with log-periodic behavior, whose frequency is related to the contraction factor λ.
Despite the debate on the functional form of burst size distributions [15,37], one can extend our model to reproduce the power-law burst size distributions as evident in some empirical data analysis [15], which then can help us to understand the correlations between interevent times in the context of the hierarchical burst structure [14].We also remark that 1 ≤ α ≤ 2 in our model, while its empirical values are often found to be out of the range of [1,2], as summarized in Ref. [6].This requires us to devise more flexible hierarchical burst models showing a wide range of α.Finally, our model can be extended to incorporate a number of complex realistic situations.For example, we can consider the context of events [38] and a network of interacting individuals [11,[39][40][41], whose interaction activities are described by complex bursty dynamics.

FIG. 1 .
FIG. 1.(a) Schematic diagram of the hierarchical burst model with η = 2 up to the second level.Red vertical arrows indicate the events at each level, each of which is assigned an induction interval (light green shade with horizontal dotted arrow) for events at the next level.(b) An example of the event sequence generated by the model with η = 2, λ = 2.5, R = 1, and L = 8.See the text for the details of the model.

FIG. 3 .
FIG. 3. (a) Simulation results of the burst size distributions Q∆t(b) for various time windows ∆t by the hierarchical burst model using η = 2, λ = 2.5, and L = 18 (symbols), fitted with a stretched exponential function (black solid line).Here b ∆t is the average burst size for a given ∆t.(b) A schematic diagram for the analytic calculation of probability distributions of interevent time between events induced by the same (or different) events at the (l − 1)th level, denoted by P (τ1) [P (τ2)].(c) Comparison between Pr(τ1 < ∆t) in Eq. (19) and Pr(τ2 < ∆t) in Eq. (29) for ∆t = 1/λ l .

7 5 6 NFIG. 5 .
FIG. 5. Simulation results of the hierarchical burst model using η = 2, λ = 2.5, R = 0.1, and L = 18, with 10 seed events at the zeroth level: (a) Box counting result for measuring the fractal dimension, (b) the autocorrelation function, (c) the interevent time distribution, and (d) the burst size distributions for various values of ∆t.Each curve was averaged over 100 realizations of event sequences.Black solid lines in (a-c) represent the analytic results, while the black solid line in (d) shows a stretched exponential function fitted to the data.