Heterogeneous excitable systems exhibit Griffiths phases below hybrid phase transitions

In $d>2$ dimensional, homogeneous threshold models discontinuous transition occur, but the mean-field solution provides $1/t$ power-law activity decay and other power-laws, thus it is called mixed-order or hybrid type. It has recently been shown that the introduction of quenched disorder rounds the discontinuity and second order phase transition and Griffiths phases appear. Here we provide numerical evidence, that even in case of high graph dimensional hierarchical modular networks the Griffiths phase of the $K=2$ threshold model is present below the hybrid phase transition. This is due to the fragmentation of the activity propagation by modules, which are connected via single links. This provides a widespread mechanism in case of threshold type of heterogeneous systems, modeling the brain or epidemics for the occurrence of dynamical criticality in extended Griffiths phase parameter spaces. We investigate this in synthetic modular networks with and without inhibitory links as well as in the presence of refractory states.


I. INTRODUCTION
Phase transitions in genuine nonequilibrium systems have often been investigated among reaction-diffusion (RD) type of models exhibiting absorbing states [1,2]. In many cases mapping to surface growth, spin systems or stochastic cellular automata have been used. Criticality allows us to define universality classes, defined by the scaling exponents, which have been explored in homogenous systems [3,4]. In heterogeneous network models the situation is less clear. Hybrid phase transition (HPT) means that at the transition point the order parameter exhibits a jump, in conjunction with critical phenomena related to it. It can mean avalanches of activity at the transition point with power-law (PL) size distribution for example. Such type of transitions have been known for a long time [5], for example at tricriticality [6,7], but had not been the focus of research and the term appeared later. HPT-s have been found in network science in case of k-cores [8], interdependent networks [9] and multiplexes [10].
The "mixed-order" naming for the same phenomena in statistical physics arouse some years ago [11] by the exactly soluble one-dimensional Ising model with long range interactions. It is also known to appear in nonequilibrium models, exhibiting transition to absorbing states [12]. Further examples include critical models at extended surface defects [13,14] and synchronization [15][16][17].
Criticality is an ubiquitous phenomenon in nature as systems can benefit many ways from it. As correlations and fluctuations diverge [18] in neural systems working memory and long-connections can be generated spontaneously [19] and the sensitivity to external signals is maximal. Furthermore, it has also been shown that information-processing capabilities are optimal near the critical point. Therefore, systems tune themselves close to criticality via self-organization (SOC) [20,21], presumably slightly below to avoid blowing over excitation. Besides, if quenched heterogeneity (that is called disorder compared to homogeneous system) is present, rareregion (RR) effects [22] and an extended semi-critical region, known as Griffiths Phase (GP) [23] can emerge. RR-s are very slowly relaxing domains, remaining in the opposite phase than the whole system for a long time, causing slow evolution of the order parameter. In the entire GP, which is an extended control parameter region around the critical point, susceptibility diverges and auto-correlations exhibit fat tailed, power-law behavior, resulting in bursty behavior [24], frequently observed in nature [25]. Even in infinite dimensional systems, where mean-field behavior is expected, Griffiths effects [26] can occur in finite time windows.
It is known that strong disorder can round or smear phase transitions [22]. According to the arguments by Imry-Ma [27] and Aizenman-Wehr [28], first-order transitions do not exist in low-dimensional disordered systems. It has recently been shown [29,30] that this is true in genuinely nonequilibrium systems [1,4].
Experimental and theoretical research provide evidence that the brain operates in a critical state between sustained activity and an inactive phase [18,[31][32][33][34][35][36]. Criticality in general occurs at continuous, second order phase transitions. On the other hand, meta-stability and hysteresis are also common in the brain behavior. They are related to the ability to sustain stimulus-selective persistent activity for working memory [37]. The brain rapidly switches from one state to another in response to stimulus, and it may remain in the same state for a long time after the end of the stimulus. It suggests the existence of a repertoire of meta-stable states. There have been several model describing this [38,39]. It introduces an apparent contradiction, because meta-stability and hysteresis occur in general at first order, discontinuous phase transitions. But the brain can operate at different regimes close to the critical point which can provide the desired advantages for biological systems. Another possible resolution for the above controversy is the operation at a transition of hybrid type. It has also been suggested in a recent theoretical work [40].
Threshold type of systems, like the integrate and fire models of the brain [41], are also suggested to describe other phenomena, like power-grids [42][43][44], crack and fracture formation [45], contagion [46], etc. In these models HPT can emerge naturally, thus the present results can also be relevant.
Heterogeneity effects are very common in nature and result in dynamical criticality in extended GP-s, in case of quasi static quenched disorder approximation [47]. This leads to avalanche size and time distributions, with non-universal PL tails. It has been shown within the framework of modular networks [47][48][49] and a large human connectome graph [50]. In this study we re-use the hierarchical modular network of [48] and provide numerical evidence that above the GP a HPT emerges. Meta-stable states and hysteresis behavior can also be found, thus this system can oscillate between up and down states, depending on the level of oscillations, without the need of oscillators at the nodes, as in case of the Ginzburg-Landau theory, suggested to model cortical dynamics [51]. By extending our model we will show that the proposed mechanism is very general, providing an explanation for the observed wide range of scale-free behavior below the transition point.

II. MEAN-FIELD APPROXIMATION
Discrete threshold models can be defined as two-state systems: x i = 0, 1 (inactive, active) at sites i, with a conditional activation rule, depending on the sum of activity of neighbours compared to the threshold value K: where w i,j is the weight of the link connecting site j to i. In interacting homogenous systems w i,j is just the adjacency matrix element: A i,j , which is 1 if nodes are connected or 0 otherwise. To describe stochasticity this activity creation can be accepted with probability λ, competing with an activity removal process of probability ν. The mean-field description of the threshold model of N nodes can be obtained in a similar way as in case of RD systems [52]. That work is defined on the lattice, but we can apply it for other graphs. In [52] it was shown that discontinuous jump occurs in mean-field models of the n > m RD systems, in which n neighbouring particles are needed for creation and m neighbours for removal.
Here we don't have diffusion and particles, but the activity can be considered as site occupancy and we can map the threshold model with K = 2 to an RD system with n = 2 active neighbors necessary for creation and m = 1 for spontaneous removal. Let us rewrite the RD system (Eq.7) to the threshold model: 2A means two active neighbors (more would give higher order terms, which are less relevant for the scaling) and an inactive site to be activated, with the probability of occupancy. In the mean-field approximation the probability of site activity is ρ, and two active neighboring sites can occur in a (N − 1)(N − 2)/2 way thus the creation rate in case of a global acceptance probability Λ is Let us call: Λ(N − 1)(N − 2)/2 = λ. The spontaneous deactivation process occurs with the probability For a full graph of N nodes we can setup the rate equation which in the N → ∞ limit provides λ c = 0, but for finite graphs λ c > 0. In the steady state we have By imposing the condition we obtain which can be solved as The solution is real if at the threshold value At this transition point we can determine how the density approaches ρ c : Thus here we find PL dynamical behavior even though the transition is discontinuous, as in other known hybrid or mixed order transitions. For λ < λ c we have ρ = 0 stable solution and exponentially decaying activity. Right above the transition the steady state density vanishes with a square-root singularity as in case of k-core [8] or multiplex percolation hybrid transitions [10]: but unlike the contact process [53] near multiple junctions [12], or the Kuramoto model with uniform frequency distribution [15], which thus belong to another hybrid universality class. In the following sections we investigate what happens to this HPT if we implement the threshold model on a hierarchical modular network (HMN).

III. THRESHOLD MODEL ON HIERARCHICAL MODULAR NETWORKS
In this section we describe the HMN models we use for the simulations. It is important to note that we believe that hierarchy is not relevant, but that modularity is what enhances RR effects as in case of the study [49]. The models are motivated by brain networks originated from Kaiser and Hilgetag, who performed numerical studies to investigate the effects of different topologies on the activity spreading [54]. Their hierarchical model reflects general features of brain connectivity at large and mesoscopic scales, where the nodes were intended to represent cortical columns instead of individual neurons. The connections between them were modeled excitatory, since there appears to be no long-distance, inhibitory connections within the cerebral cortex [55].
The network was generated beginning with the highest level and adding modules to the next lower level with random connectivity within modules. Kaiser and Hilgetag explored hierarchical networks with different numbers of hierarchical levels and numbers of sub-modules at each level. The total, average node degree was set to a fixed value, motivated by comparative experimental studies. However, they investigated different topologies by varying the edge density across the levels. All the tested HMN-s were small-world type, i.e. exhibited infinite topological dimensions.
The spreading model they investigated was a two-state threshold model, in which nodes became activated with probability λ, when at least K of their directly connected node neighbors were active at the same time or spontaneously deactivated, with probability ν. Note that this model is very similar to RD models known in statistical physics [3,4], with a synchronous cellular automaton (SCA) update. Without loss of generality this algorithm produces faster dynamical scaling results for threshold models than those with random sequential updates.
In this paper we investigate versions of HMN-s, which possess increasing edge density from top to bottom levels. Clearly, such topologies can be expected to be more suitable for activity localization and for RR effects. One can also make a correspondence with the spatially embedded networks [56]. These networks have long links, with algebraically decaying probabilities in the Euclidean distance R as We added random long links by level-to-level from top to bottom, similar to in [48]. The levels l = 0, 1, ..., l max are numbered from bottom to top. The size of domains, i.e the number of nodes in a level, grows as N l = 4 l+1 in the case of the 4-module construction, related to a tiling of the 2d base lattice, due to the rough distance level relation: Here b is related to the average degree k of nodes , which was prescribed to be k = 12 for this construction. We connected nodes in a hierarchical modular way as if they were embedded in a regular two-dimensional lattice (HMN2d) as shown by the adjacency matrix on Fig. 1, similarly as in [48]. The 4 nodes of the level l = 0 were fully connected. The single connectedness of the networks is guaranteed by additional linking of these 4-node modules, by two edges to the subsequent ones: the first and the last nodes of module (i) to the first node of module (i+1). Accidental multiple connections were removed and self-connections were not allowed. Note that the single connectedness at low level does not result in stable steady states in case of the threshold value K = 2.
The in-degree distribution of 4 randomly selected graphs with N = 4096 nodes can be seen on Fig. 2. The lowest in-degree is always k in i ≥ 5. The modularity quo- tient of the networks is high: Q > 0.9, defined by where A ij is the adjacency matrix and δ(i, j) is the Kronecker delta function. The Watts-Strogatz clustering coefficient [57] of a network of N nodes is where n i denotes the number of direct edges interconnecting the k i nearest neighbors of node i. C = 0.295 is about 10 times higher than that of a random network of same size C r = 0.0029, defined by C r = k /N . The average shortest path length is defined as where d(i, j) is the graph distance between vertices i and j. In case of this typical network L = 6.74, about twice larger than that of the random network of same size: L r = 3.615, following from the formula [58]: So this is a small-world network, according to the definition of the coefficient [59]: because σ = 5.363 is much larger than unity. We estimated the effective topological dimension using the breadth-first search (BFS) algorithm: d = 4.18 (5), defined by N (r) ∼ r d , where we counted the number of nodes N (r) with chemical distance r or less from the seeds and calculated averages over the trials. Note, that this is just an estimate for the finite sized graph, because we know that d → ∞ is expected for s = 3. It renders this model into the mean-field region, because for threshold models the upper-critical dimension is d c ≤ 4. Still, due to the heterogeneous structure, we find very non-trivial dynamical GP scaling behavior as will be shown in the following sections.

IV. DYNAMICAL SIMULATIONS
Time dependent simulations were performed for single active seed initial conditions. It means that a pair of nodes is activated at neighboring sites: x i = x i+1 = 1, in an otherwise fully inactive system. It can trigger an avalanche, a standard technique in statistical physics to investigate critical initial slip [2]. We measured the spatio-temporal size s = N i=1 T t=1 x i and the duration of the avalanches (T ) for tens of thousands of random initial conditions: both initial sites and initial graph configurations. The graphs we investigated had l = 4, 5, 6, 7 levels, containing N = 1024, 4096, 16384, 65536 nodes, respectively. The average node degree was k ≃ 12, after the low level linking and the removal of accidental multiple edges. The ratio of short and long links was ≃ 0.6.
We have set ν = 1−λ and updated the sites at discrete time steps, i.e. set the state variables x ′ (i) = 1 if it was inactive x(i) = 0 and the sum of active neighbors j x(j) exceeded K = 2 with probability λ or to x ′ (i) = 0 with probability 1−λ if it was active x(i) = 1. Following a full sweep of sites we wrote x(i) = x ′ (i) for all nodes, corresponding to one Monte Carlo step (MCs), throughout the study we measure time in MCs units. We have measured the density of active nodes

A. Excitatory model
The simulations were run for T = 10 7 MCs, or until the system goes to a fully inactive state, corresponding to the end of the avalanche. We computed the probability density functions of avalanche sizes p(s) and final survival time distributions p(t). We repeated these simulations for different λ branching rates, by increasing its value. As Fig. 3 shows we don't see exponential decays as should be in the inactive phase of a mean-field model. Instead, there are PL-like tails for λ > 0.31, modulated slightly by oscillations, which is a well known phenomenon when discrete spatial periodicity is present, here the size of the modules. The slopes of the PL tails vary from τ = 2.02 to τ = 1.39 as we increase λ from 0.315 to 0.33. Non-universal PL tails are more clearly visible on the avalanche survival time distributions plotted on the graph, shown on Fig. 4. Here we can observe a greater variation as moving from λ = 0.315 with δ = 1.80(1) to λ = 0.33 with δ = 0.172(1). The avalanche duration distributions can be deduced from these curves as the time integral, thus δ is related to the duration exponent of P (t) ∝ t −τt as These non-universal PL-s suggest that Griffiths effects are present, as reported in [48] for this model at different parameters. By repeating the simulations at different sizes: l = 5, 6 the distribution curves do not change within error margin, and this size invariance implies the presence of real GP-s. The seminal experiments by Beggs and Plenz [31] reported neuronal avalanches with size (s) dependence, defined as either the number of electrodes with suprathreshold activity or the sum of the potentials, according to a power law, p(s) ∝ s −1.5 . For the duration distribution of such events P (t) ∝ t −2 , PL tails were observed. These exponents are in agreement with the mean-field (MF) exponents of the Directed Percolation (DP) criticality: τ = 3/2, τ t = 2 see [3]. Mean-field exponents are expected to occur if the fluctuation effects are weak, when the system dimensionality is above the upper critical dimension d c .
On the other hand, Palva et al. [60] have found that source-reconstructed M/EEG data exhibit robust powerlaw, long-range time correlations and scale-invariant avalanches with a broad range of exponents: 1 < τ < 1.6 and 1.5 < τ t < 2.4. These broad range exponent results have also been found in a recent cortical electrode experiment study on rodents [61]. An obvious explanation for this wide spread of critical exponents can be heterogeneity, which in the GP causes non-universal dynamical exponents [48,[62][63][64].

B. Inhibitory model
Although inhibitory links are not expected at long range links of the brain [55], we believe that our synthetic model may describe smaller cortical scales as well. Besides, inhibitory mechanisms can occur in other phenomena with threshold dynamics. In case of power-grids, for example, feedback is applied to prevent catastrophic blackout avalanches, or in models of social/epidemic contagion, nodes with inhibitory properties may also exist. For simplicity we modeled the inhibitions by the introduction of links with negative weight contribution (w i,j ) in the threshold comparison rule given by Eq. 1, although we think our results are easily transferred to the case of inhibitory nodes. As in [63,64], we randomly flipped the sign of 20% of the edges after the generation of the network.
The same analysis resulted in similar behavior as for the excitatory case. One can see p(s) distributions with non-universal PL-s ranging from λ = 0.5 with τ = 1.651(1) to λ = 0.55 with τ = 1.168(1) (Fig. 5). Finite size dependence is not visible by changing the size from N = 4096 to N = 16384.
Usually it is believed that overlapping avalanches distort the scaling behavior. From the point of view of statistical physics, this would contradict universal asymptotic scaling behavior. Indeed we can see the same cumu-lative p(s) distribution tails even in case of starting the system from half filled active state as shown on the inset of Fig. 5. The only difference is that the tails are shifted to larger s values following an initial growth, which might not be observed in case of short time measurements. The p(t) decays show GP behavior from λ = 0.505 with δ = 1.10(3) to λ = 0.52 with δ = 0.70(3) (Fig. 6). These values do not correspond to the ends of the GP, we did not aim to determine them precisely as the exponents are non-universal. Furthermore, as we will show in Sect. V the determination of the upper limit of the GP, corresponding to the critical decay is almost impossible by numerical simulations.
Again the τ and the τ t = 1 + δ values lie within the range obtained by experiments.

C. Inhibitory-refractory model
Finally, we extended the inhibitory case study with the possibility of refractory states. Refractorieness means that, following an activation, nodes cannot fall back immediately to inactive state on the next update, instead they stay for time ∆t in a refractory state. Thus they cannot be reactivated by the neighbors they excited. This refractoriness is generic in excitable systems and has been used in most of the neural studies [18,65,66]. One of the consequences of refractoriness is to induce oscillatory dynamics if ∆t is large enough and the spreading properties resemble to annular growth, corresponding to Dynamical Isotropic Percolation (DIP) [4]. However, real DIP occurs if re-activation is not possible, i.e. in the limit ∆t → ∞, and in high dimension the avalanche scaling exponents of DYP are the same as those of DP [4,67]. Thus analytic studies or simulations in high dimensions do not show differences. In the extensive GP simulations we used ∆t = 1, but on the inset of Fig. 7 we show oscillatory activity behavior of a single run for ∆t = 10, λ = 0.8 and l = 6.

V. STEADY STATE SIMULATIONS
In order to determine the steady state behavior we first performed long runs, up to T = 10 8 MCs, by starting the system from fully active state or from randomly half filled activity: ρ(0) = 0.5. Fig. 9 shows the results for the excitatory model. At λ = 0.3 the activity density falls exponentially fast to zero. We can see non-universal PL tails for 0.32 ≤ λ < 0.33, in agreement with the seed simulations. At λ = 0.33 the density does not saturate to a constant value. Examining it on log.-lin. scale and performing average over thousands of independent samples it turns out that even the λ = 0.34 curve decays very slowly. Only for λ ≥ 0.35 we can see saturation, corresponding to active steady state, thus, we estimate λ c = 0.345 (5). We plotted the steady state saturation values on Fig. 11. The same analysis has been done for the inhibitory and refractory-inhibitory cases and one can observe the shift of λ c to higher values as the consequence of the model modifications. We show the results for the inhibitory network on Fig. 10. Again, slow activity decays were observed, ending up with visible PL tails for 0.51 ≤ λ ≤ 0.54, while saturation starts from λ c ≥ 0.80(1). The saturation value is ρ c = 0.685(1), so the discontinuity is large. In the region 0.54 < λ < 0.80 the curves do not saturate up to T = 10 8 MCs, neither reach a scaling region. They belong to the inactive phase, but it is very difficult to distinguish them from other (logarithmic) decay forms.
We can see large jumps at the transition points in all cases, suggesting a discontinuous transition above the GP. It is very hard to locate the exact location of the transition points as stability disappears very slowly. This suggests that at the critical point logarithmic decay occurs like in case of the disordered DP [22].
We have also tried to start from other initial conditions than the full and single seed ones. As the inset of Fig. 11 shows we can see different saturation values for ρ(0) < 0.1, that means we have multi-stability in case of low initial densities. This is the consequence of the fact that for low ρ(0)-s only parts of the graph can be activated. Even though the networks are simple connected  19(1) (inhibitory), d = 4.18(5) (graph dimension estimated for 5 < r < 10). and the lowest in-degree is k in i = 5, not all nodes have 2 incoming links from the same neighbor, which is necessary for the activation. These nodes cannot be activated by a neighbor if they are on the "border" of an active territory, thus the graphs are practically fragmented from the activity point of view. This provides a mechanism for the emergence of GP even in high topological dimension without breaking the hypothesis [47]. Furthermore, we can see the emergence of discontinuous transition with multi-stable states. But we don't have to abandon the bi-stability scenario, corresponding to ρ(0) > 0.1: "up" and single active pair: "down" states.

VI. CONCLUSIONS
In conclusion we provided numerical evidence that strong heterogeneity effects in networks, coming from the modular structure can result in GP even if the topological dimension is high, where mean-field scaling would be expected. This is the consequence of fragmented activity propagation caused by the modular topology and the threshold. We can define effective dimensions of the these graphs by running seed simulations with λ = 1, ν = 0 and measuring ρ(t) ∼ r d ef f . For this compact growth ρ(t) ∼ N (r), so d ef f provides an estimate for the dimensionality, similarly to the BFS algorithm. The inset of Fig. 8 shows that an initial scaling can be fitted for the  excitatory case with: ρ(t) ∼ t 1.84 (3) , while for inhibitory: ρ(t) ∼ t 1. 19(1) . Thus these effective, activity dimensions are less than d c , much less than the topological dimension obtained by the BFS, which is also shown on the graph as the function of r.
Furthermore, the threshold type models allow for the possibility to observe hybrid phase transitions, where order parameter discontinuity and multi-stability can coexist with dynamical scaling in a GP, thus they can model brain criticality as well as up/down states. External ac- tivation can then push the model among the multi-stable states if it is poised near the transition point. The investigated K = 2 discrete threshold model results can obviously be extended for higher K values and we expect to find similar behavior in continuous, integrate and fire type models on modular networks. Conversely, by duplicating the links we get back effecively the contact process without RR effects and GP. The effects of inhibition and refractive states have also been studied and emergence of oscillatory states have been shown. Our model results are applicable to a wide range of phenomena, like power-grids, crack and fracture dynamics, contagion.