Global hierarchy vs. local structure: spurious self-feedback in scale-free networks

Networks with fat-tailed degree distributions are omnipresent across many scientific disciplines. Such systems are characterized by so-called hubs, specific nodes with high numbers of connections to other nodes. By this property, they are expected to be key to the collective network behavior, e.g., in Ising models on such complex topologies. This applies in particular to the transition into a globally ordered network state, which thereby proceeds in a hierarchical fashion, and with a non-trivial local structure. Standard mean-field theory of Ising models on scale-free networks underrates the presence of the hubs, while nevertheless providing remarkably reliable estimates for the onset of global order. Here, we expose that a spurious self-feedback effect, inherent to mean-field theory, underlies this apparent paradox. More specifically, we demonstrate that higher order interaction effects precisely cancel the self-feedback on the hubs, and we expose the importance of hubs for the distinct onset of local versus global order in the network. Due to the generic nature of our arguments, we expect the mechanism that we uncover for the archetypal case of Ising networks of the Barab\'asi-Albert type to be also relevant for other systems with a strongly hierarchical underlying network structure.


I. INTRODUCTION
Hierarchical networks are found ubiquitously across many areas of physics, as well as in social interaction clusters, biological systems, and medicine [1][2][3][4][5][6]. While the full universality of scale-free or power-law behavior in the underlying connectivity structure is still under debate [7,8], many real-world networks clearly exhibit strong heterogeneities. Such networks are often also characterized by the presence of hubs, i.e. nodes of high degree, which strongly influence the network's properties. Highly connected hubs are also expected to be important for the collective behavior in networks of node-based agents that interact through the network's topology.
In order to examine the impact that hubs have on both the global behavior as well as the local properties in such systems, we here examine a specific statistical mechanics model on a hierarchical network. For this purpose, we consider the nodes of the network to consist of binary units that interact across a scale-free network structure. In particular, we examine the Ising model on Barabási-Albert (BA) networks [9]. While in this model all interactions are set to the same strength, the number of connections of a given node i, also called the node's degree k i , varies strongly across the network, defining thereby a hierarchy on the network topology. We investigate how this hierarchy influences the emergence of order in especially the presence of hubs whose degree may even scale with overall size of the network.
Early studies of Barabási-Albert-Ising (BAI) networks have been performed in particular using Monte Carlo simulations [10,11] and mean-field theory [12]. These studies demonstrated that for finite network sizes (in terms of the number of nodes N ) a strong alignment of the Ising degrees of freedom emerges below a specific temperature, resembling the onset of ferromagnetic order in conventional lattice Ising models. In the conventional case, the ferromagnetically ordered phase emerges below a finite transition temperature out of the paramagnetic high-temperature phase in the thermodynamic (large-N ) limit. For the BAI model, however, Monte Carlo simulations indicated that the effective transition temperature T T instead grows logarithmically with the network size T T ∝ log(N ). In view of this numerical finding, Bianconi [12] then presented a mean-field theory of BAI models, describing the network state in terms of a single global order parameter [12]. This approach indeed reproduces the observed logarithmic scaling of the effective transition temperature. A linearization of the mean-field self-consistency equations near the transition temperature furthermore shows that the average magnetization m i of a given node i increases proportional to its degree This implies a rather simple local structure, in which the magnetization is determined solely by the degree. Other studies that analyzed the emergence of order in the Ising model on scale-free networks employed recursion methods for tree-like networks [13] or the replica trick [14]. All previous studies reproduced the logarithmic scaling of the transition temperature with the network size, while they characterized the magnetic order in terms of the degree of the nodes, as in the mean-field approach.
Here, we demonstrate that the mean-field theory of Ref. [12], even though it predicts the scaling of the global ordering transition, does not accurately describe the local structure in the collective network behavior. In particu-lar, as one of our main findings, we report results from refined Monte Carlo simulations, which exhibit that the hubs enforce a noticeable stronger local magnetization on their direct neighbors as compared to a typical node. The value of a node's magnetization m i is thus far from being determined by its degree k i alone. This behavior stems from the presence of a strong local alignment field h u , where u denotes one of the hubs, that acts on the nearest neighbors of each hub. In fact, our simulations expose the presence of an additional shell-like structure in the magnetization profile, which is not captured in the above-mentioned mean-field description. In view of its ignorance with respect to the actual local structure, while still accurately capturing the global onset of order in terms of a single global order parameter, we denote the mean-field theory of Ref. [12] as a global mean-field theory in the following.
Alternatively, one can perform a mean-field calculation on the level of individual Ising variables, including the actual local structure. However, for this analysis we find that the local mean-field theory severely overestimates the role of hubs. In particular, a hub's local alignment field h u induces a strong magnetization on its nearest neighbor nodes, which, in turn raises the magnetization of the hub itself. The hub thus effectively feels its own field. Due to this self-feedback, the local meanfield approach predicts a much higher transition temperature than actually observed, scaling proportionally to N 1 4 ≫ log(N ). Moreover, the magnetization of these ordered states are strongly localized around the hubs, in contradiction to the simple proportionality relation (1). The ordered states are therefore also highly sensitive to the presence of individual hubs in the network. This makes them fragile in terms of the stability to small perturbations in the network topology, such as the random removal of nodes. This sensitivity also appears to be in conflict with the collective nature of the onset of global order. The contrast between the global and the local mean-field theory ultimately exposes an inherent inconsistency of the degree-resolved approach to heterogeneous systems.
As we report below, this problem can be overcome upon expanding beyond mean-field theory and analyzing the BAI using a self-consistent Thouless-Anderson-Palmer (TAP) approach [15][16][17], which takes second order interaction effects into account. While the TAP approach has been applied to various systems in the past, including, e.g., spin glasses, we are not aware of of any previous application of the TAP approach to scale-free hierarchical networks.
Within the TAP framework, we identify an explicit cancellation of the self-feedback on the hubs on the microscopic level due to the fluctuations that are not captured by the mean-field approach. Moreover, this cancellation is found to be independent of the specific network structure, and is indeed reminiscent of the well-known cavity argument [18]: The local alignment field h i at a given node of the network needs to be calculated in the absence of this node. In consequence, the local fields caused by the hubs emerge only when the system orders globally, and are much weaker than predicted by the local mean-field theory. Upon taking this effect into account, the TAP approach yields the correct scaling of the transition temperature T T ∼ log(N ), while at the same time it maintains the full information on the network connectivity and thus explains the hierarchical structure of the local magnetization. The theoretical framework that we put forward also allows us to obtain accurate predictions of various thermodynamic quantities that compare well to numerical data obtained by Monte Carlo simulations.
Monte Carlo simulations of heterogeneous networks based on local update schemes are impeded by the freezing of the hub's magnetic states. We find that cluster update schemes do not efficiently alleviate this problem due to the high connectivity of the hubs, and thus still suffer from large autocorrelation times. In our simulations, we thus used an improved version of parallel-tempering Monte Carlo instead [19] that allows us to perform controlled simulation by adequately sampling over the full configuration space. In particular, this approach renders an accurate picture of the behavior of hubs and their surrounding nodes throughout the entire relevant temperature range.
The remainder of this paper is organized as follows: In Sec. II we introduce the BAI model. Sec. III defines the observables studied here and describes how they can be measured from finite-size Monte Carlo simulations. Then, in Sec. IV we expose the inherent inconsistency of the mean-field theory, and resolve it in Sec. V, by showing that the self-feedback effect is canceled by fluctuations. We demonstrate how an effective coupling to a global ordering field approximates the network state well, which can be described in terms of degree-resolved magnetizations. Further local information is then included to refine this approximation. We compare observations from finite size Monte Carlo and TAP theory in Sec. VI. Finally, Sec. VII provides a summary and further discussion of our findings, and also presents an outlook on how we expect them to generalize to other models.

II. HIERARCHICAL BAI MODEL
The statistical mechanics model that we consider in the following is defined on a BA network [9]. Quite generally, BA networks are specific hierarchical undirected networks, which can be generated ("grown") using an iterative, stochastic algorithm (see App. A for details), starting from m 0 initial nodes, one consecutively adds new nodes, where each new node is connected to m 0 other nodes of the network. During the growth process of a BA network, nodes that already have a large degree are most likely to form a new connection to the newly added node -a principle known as preferential attachment. The final network of size N is then specified by its N ×N adjacency matrix A, where for nodes i, j ∈ {1, ...N } It was shown in Ref. [9] that in the large-N limit these networks feature a scale-free degree statistics, i.e., the probability that a randomly chosen node has a specific degree k is given by Furthermore, in the same limit, the BA network exhibits linear degree-degree correlations [12], where the probability that two nodes i and j are connected is given as in terms of their degrees k i and k j , respectively. By construction (see App. A) the average degree across all nodes of a realization of a BA network is The largest degree, averaged over multiple realizations of finite BA networks of the same size, is given by As a simple statistical physics model of how agents interact on strongly hierarchical networks, we now consider an Ising model on the BA network: to each node i ∈ 1, . . . , N a local binary degree of freedom (spin) x i ∈ {−1, 1} is assigned, and the probability for a given configuration x = (x 1 , ..., x N ) is determined by the Boltzmann-factor p(x) ∝ exp (−βH(x)) in terms of the configuration's energy H(x) = − J 2 x·A x. The ferromagnetic coupling J in H favors the alignment of connected spins into either direction, and β = T −1 is the inverse temperature (k B = 1), measured in relation to the elementary energy scale J, which we therefore fix to J = 1 in the following: In addition, an external magnetic field h is included in the Hamiltonian H, which will be set to zero for our further analysis. However, the inclusion of such a term will turn out convenient for the formulation of the TAP theory (where h will be considered an infinitesimal source term).
The physical properties of the BAI model are finally defined in terms of statistical averages over all spin configurations, O = x O(x) p(x), of appropriate observables O, specified further below. Since performing the summation over all 2 N spin configurations is unfeasible for large N , here we use parallel-tempering Monte Carlo simulations to calculate these statistical averages (see. App. D for details regarding the employed simulation scheme). In order to compare finite-size Monte-Carlo simulations to results obtained from the self-consistent mean-field and TAP approaches, we consider appropriately defined Monte Carlo observables, as discussed in the following section.

III. MONTE CARLO OBSERVABLES ON FINITE NETWORKS
When studying magnetic systems that undergo orderdisorder phase transitions, one is typically interested in the thermodynamic limit, i.e., one considers the limit of infinite system sizes, as only in this limit the spontaneous symmetry breaking associated with the onset of order can emerge (see, e.g, the discussion in [20], Sec. 2). In finite systems, one can nevertheless detect the emergence of magnetic order in terms of an effective transition temperature that approaches the true ordering transition temperature in the thermodynamic limit. Such effective transition temperatures can be defined based on various thermodynamic quantities, such as the peak position of the specific heat or the magnetic susceptibility. A detailed discussion of this approach for the Ising model on a regular lattice geometry can be found, e.g., in Ref. [21].
For the BAI, the effective transition temperature scales logarithmically with the network size [10,11]. In the thermodynamic limit the system thus resides in the ordered state at all finite temperatures. One thus faces an interesting dichotomy: The concept of a phase transition only acquires meaning in the thermodynamic limit, but it is precisely the infinite network that does not exhibit this phenomenon, due to the diverging effective transition temperature. On the other hand, despite this dichotomy, one can take a more pragmatic view and try to describe a large but finite system as well as possible. This should be of relevance to most applications of the model.
More generally, one must proceed with care when comparing results from mean-field theory and the TAP approach to Monte Carlo simulations of finite systems. In particular, from ergodic Monte Carlo simulations, one cannot obtain a finite value of the mean magnetization that characterizes the magnetic state of the mean-field and TAP approach. Instead, x i = 0 vanishes exactly for a sufficiently long run, due to the Z 2 -symmetry of the Ising model Hamiltonian, which for a finite system also implies the exact vanishing of the zero-field magnetization that enters the explicit form of the zero-field susceptibil- For the Ising model on regular lattices with a finite transition temperature T T in the thermodynamic limit, this quantity converges to the zero-field susceptibility in the thermodynamic limit only for temperatures T > T T [21]. For temperatures below the transition temperature, the mean value of the absolute magnetization, which we denote by can instead be used to probe a unimodal symmetry broken state and thus also allows us to compare to the magnetic solutions of mean-field and TAP theory. In terms of M MC , we also consider the estimator which is appropriate to calculate the susceptibility within the symmetry broken regime [21]. One can also express M MC in terms of the orientation of the individual spins with respect to the overall magnetization, since where m i,MC quantifies the alignment of individual spins with the overall magnetization, In addition to these magnetic properties, we also consider the energy and the specific heat of the BAI system. Based on the quantities introduced above, the Monte Carlo simulations allow us to benchmark the accuracy of the TAP approach, as well as to expose the severe limitations of earlier mean-field theories. In the next section, we start by revisiting the meanfield approach.

IV. THE CONUNDRUM: MEAN-FIELD THEORY
Within the mean-field approximation [12], the exact equation for the average value of the magnetization m i = x i at node i = 1, ..., N is replaced by the self-consistency equation The trivial solution m i = 0, i = 1, ..., N exists for all values of β. Here, we are interested in non-trivial solutions to (16), for which i |m i | > 0. Such solutions feature a non-zero value of (at least one of) the variables m i , and always come in pairs, since (16) is symmetric under the simultaneous inversion of the sign of all m i . These non-trivial solutions are found for values of β larger than a particular value β T , which defines the mean-field estimate for the effective transition temperature T T = 1/β T for the onset of the magnetic order.
In the following, we will analyze the mean-field equations (16) in two different ways: First, within the local mean-field theory approach, we continue to treat all spins on the microscopic level in terms of the individual average magnetization m i . Second, within the global mean-field theory approach, we consider only a single global order parameter, denoted S, to be defined further below. We will derive for S a self-consistency equation on the global network level. Its solution with a finite value of S thus indicates the presence of magnetic order in the network.
We will show below that (i) the two approaches yield inconsistent results, (ii) only the latter is in agreement the with effective transition temperature scaling obtained from Monte Carlo simulations, and (iii) it nevertheless fails to account for the local shell-like structure present in the magnetic state of the BAI model.
Local mean-field theory. On a finite network, nontrivial solutions of the local mean-field self-consistency equations (16) emerge in a continuous manner, such that i |m i | → 0 continuously upon tuning β → β T from above. We can thus linearize (16) near β T to obtain where β local MF denotes the value of β T within the local mean-field theory approach. Finding the transition temperature therefore reduces to the calculation of the largest eigenvalue λ A,max of the adjacency matrix Here, only the largest eigenvalue of A yields a valid solution, since the linearized equation (17) applies only at the transition temperature, below which a solution with finite i |m i | > 0 emerges. Since A is real and symmetric, its largest eigenvalue is equal to the matrix norm ||A||. In Ref. [22], the scaling of the largest eigenvalue was found to be λ A,max ∝ √ m 0 N 1/4 , for which the corresponding eigenvector is strongly localized at the node of highest degree. It is easy to see that indeed the matrix norm ||A|| scales as N 1/4 : To this end, consider a vector v of unit length that takes on the value v u = 1/ √ 2 for a specific node u, while v i = A iu / √ 2k u for all other nodes, i = u. This vector is therefore non-zero only on node u and its nearest neighbors. Then where we used that i A 2 ui = k u equals the degree of node u. Now choosing u to be the node with the largest degree k max ∝ √ N [9], we obtain λ A,max ∝ N 1/4 . In conclusion, within the local mean-field theory, a nonvanishing solution of (17) exists below a temperature T local MF , that scales as for large N . This result differs significantly from the transition temperature to the ferromagnetic state at T T ∝ log(N ), predicted by the global mean-field theory with a global order parameter [12], which will be revisited next, and it also contradicts the results from Monte Carlo simulations [10,11]. Global mean-field theory. To demonstrate that the global mean-field theory approach with a single global order parameter [12] yields a different scaling than (19), we briefly review its derivation here: The probability of two nodes to be connected is expressed by their respective degrees (3). One then proceeds by replacing the adjacency matrix A ij in (17) by the corresponding probability p c (k i , k j ). Within this approximation, nodes of the same degree hence obey the same mean-field equations, so that the value of the local magnetization depends only on the degree of the node, m i = m(k i ). One can therefore replace a summation N i=1 over all nodes by a summation over the degrees k N p(k). Applied to (17) one finds that where we introduced the global order parameter Using (4) this global order parameter obeys the equation from which the transition temperature of the global mean-field theory approach follows as Comparing (19) and (22) for large networks N ≫ 1, we obtain This shows an inherent inconsistency of the mean-field approach: At the lower temperature T global MF ≪ T local MF the local mean-field theory predicts that the system resides already well within the regime for which a non-trivial solution exists. Therefore, the linearization (17), i.e., the starting point to derive (22), has been employed outside its regime of validity.
To illustrate the previous point, we solve (16) numerically for a fixed random realization of A by a fixpoint relaxation; a positive solution (i.e., m i ≥ 0 for all i) is selected upon starting the relaxation from the fully polarized state (with m i = 1 for all i). Throughout this work, we analyze the same single realization of a BA network per system size. We find that this is sufficient because the characteristics of these networks vary only weakly across realizations for large system sizes (see App. C). We find these solutions to be typically strongly localized at the nodes of highest degree. We compare several characteristic observables obtained from the local mean-field theory with the results from Monte Carlo simulations. We find that the average magnetization M and energy E, while showing a similar overall behavior at low temperatures, exhibit different high-temperature behavior in Fig. 1. In particular, Fig. 1(b) demonstrates that nonzero solutions of (16) indeed emerge at higher temperatures than (22). The local mean-field theory thus overestimates the tendency of the spins to order and it predicts the transition to a disordered state at a higher temperatures than the value obtained from the global mean-field theory, which matches better to the Monte Carlo data.
The origin of this discrepancy appears to be the use of p c (k i , k j ), which eliminates the local structure of the network contained in A. One could argue that valid solutions to (16) can still be found if one excludes strongly localized solutions, which is the effect of replacing A by p c in the global mean-field theory. Indeed, we observe that below the saturation regime (i.e. for m i,MC ≪ 1) the average magnetization m(k) of a node with a given degree k from the Monte Carlo simulations, fulfills m(k) ∝ k cf. Fig. 2(a). Therefore, there is a strong degree-dependent hierarchical structure in the resulting magnetization profile as predicted by global mean-field theory. On the other hand, we find that the local magnetization at nodes of low degree is not characterized by the degree alone: In Fig. 2(b) we compare the average degree-wise magnetization (23) of all nodes which have a given degree k to the average magnetization of the nearest neighbors of a hub u, given by We find from this comparison that the hubs enforce a stronger ordering on their nearest neighbor nodes than predicted by the degree alone, m NNu (k) > m(k). Therefore, the local structure cannot be entirely eliminated when describing the ordered network state, in contrast to the assumption entering the derivation of the global mean-field theory. A consistent mean-field theory can therefore not predict (22) without further assumptions made in the derivation, leading to the ignorance of the local network structure. We will resolve this contradiction in the following by showing that the transition temperature (22) emerges naturally from the microscopic TAP equations, which simultaneously also predict the local structure of magnetization.

V. SELF-FEEDBACK AND TAP EQUATIONS
The TAP approach [15][16][17] improves upon the meanfield theory (16) in the sense that it adds second order interaction effects. For this purpose, we start from the free energy F , obtained from the sum over all spin configurations as One can then compute the average local magnetizations m i from derivatives of the free energy with respect to h i , where one sets h = 0 in the end. However, it is more convenient to define the Legendre-Fenchel transform G(m, β) of the free energy thereby obtaining a thermodynamic potential which is a function of the mean local magnetizations m i . It fulfills the equation of state In App. B, we demonstrate how to expand G as a function of the nearest-neighbor couplings, thus in β. To second order, one obtains where (i) the term in the first line is the Shannon entropy of a set of independent binary variables, (ii) the sum in the second line, proportional to β, takes the form of the inner energy in mean-field approximation, and (iii) the term in the third line, proportional to β 2 , is known as the TAP or Onsager correction term [17].

TAP self-consistency equations
containing terms of linear and quadratic order in β. For a given adjacency matrix A, we solve this equation numerically and compare the resulting averages to Monte Carlo simulations. We find that the average magnetization M and inner energy (obtained using E = N −1 ∂(βG)/∂β) agree rather well with the numerical values (Fig. 1). In particular, the deviations from Monte Carlo are overall smaller for TAP than for the local mean-field theory.
To identify the transition temperature based on the TAP approach, we examine the linearization of the TAP self-consistence equation (29), obtaining where k i = j A ij is the degree of vertex i. The transition temperature is thus obtained in terms of the leading eigenvalue λ B(β),max of the β-dependent matrix B(β), with from the condition that In Fig. 3(a), we show that the numerical evaluation of (32) reproduces the prediction (22) from global meanfield theory, namely the logarithmic scaling of the transition temperature. In contrast to the global mean-field theory, however, the full connectivity structure is taken into account in the TAP approach. Furthermore, we can see that the linearization (30) is consistent with the solution of the full TAP equations (29). Indeed, the projection of the magnetization m onto the leading eigenvector v B(β),max of B(β), at the transition temperature approaches unity, as seen in Fig. 3(b).
We will now explain the logarithmic scaling of the transition temperature in local TAP theory by examining the role of the additional diagonal term −βδ ij k i in (31). For given values of m i , the presence of node i affects the network like a heterogeneous external field which couples most strongly to the nearest neighbors of i. We thus expand m j around m i = 0: keeping only terms up to linear order in m i , m j , and β. We then insert this expression into (30) to obtain up to quadratic order in β, consistent with the high-temperature expansion employed in (29). Using j A ji = k i , this simplifies to which is similar to the linearized mean-field equation (17), but is consistent with the notion of global order: the field at a node i has no local contribution from the value of m i , as it is calculated in the absence of i. This cancellation has been observed in previous studies [15], and can be anticipated from its relation to the fluctuationdissipation theorem and cavity methods, which we discuss in App. E. Equation (36) implies that the resulting eigenvector of B(β) cannot be localized strongly at one node of high degree, because the corresponding component cannot be influenced by its own presence. The TAP term −β 2 k i m i in (35) cancels the positive feedback of node i back to itself via any of its k i nearest neighbors j. As a result, (36) justifies the exclusion of strongly localized solutions by the replacement of A ij by p c (k i , k j ), which is used in the global mean-field theory. By analogous calculations, one then obtains the same transition temperature as in global mean-field theory, i.e., where the magnetization of each node depends on its degree as The description of the transition behavior in terms of the degree resolved quantities in this global TAP theory is therefore identical to the global mean-field theory. Numerically, we observe that the transition temperature obtained from this global TAP approach is consistent with the emergence of the non-trivial solution on the local level, namely the temperature at which (32) is fulfilled, as shown in Fig. 3(a). We show in Fig. 4c that the entries in the leading eigenvector of B(β TAP ), when averaged over nodes of the same degree, agree with the expected linear behavior in (38). In summary, the TAP term in (29) resolves the inconsistency between the local and the global approach that is present in mean-field theory. This allows us to describe the magnetization beyond the degree-resolved approximation. For the nearest neighbors of a node i, (34) explicitly shows the influence of node i on its neighbors, which we call the local bias. Thus, the local information has not been lost completely: the presence of node i is encoded in the fields of its nearest neighbors. This is the reason why nodes of low degree show strong variations around the prediction (38). Their magnetization reflects a more complicated local structure. This is seen explicitly when considering the nodes according to their respective distance d l,u to a hub u, as shown in Fig. 4(a) for the example of the hub u of largest degree. The local bias due to the strong alignment of the hub with the overall magnetization affects its nearest, next-nearest, and further neighbors, forming thereby an onion-like structure, where in each shell the bias from the hub becomes weaker. This structure is also visible in the eigenvector of B(β) at β = β TAP , as shown in Fig. 4(c): Nodes of low degree that are not in the vicinity of any hub are much more weakly magnetized than those that are nearest neighbors to a hub. Nodes of high degree connect to a representative collection of nodes on the network and therefore do not deviate much from the expectation value m(k), as seen in Fig. 4c. As a consequence of the hierarchical network structure, we observe that the transition marks the point below which hubs essentially become permanently "frozen", i.e. they almost perfectly align with the overall magnetization of the whole system, m u ≈ 1 for k u ≫ k (see Fig. 4(b)).
The cancellation of the self-feedback (36) also extends to lower temperatures. Using the same expansion method, one finds that up to quadratic terms in the inverse temperature β, the self-consistency equation reads To understand the degree-resolved magnetization, we insert m i ≈ m(k i ) into (39), making use of the global theory also below the transition temperature. This leads to the approximate result for the degree-resolved average magnetization from the global mean-field theory [12][13][14], This approximation may not be good for small k i , but nodes of larger degree contribute more strongly to the global order parameter S, for which m i ≈ m(k i ) holds quite precisely. The resulting equation for the global order parameter can be solved numerically for arbitrary network sizes. From this, we obtain the total magnetization which compares well to the Monte Carlo data (Fig. 4(b)). At the transition, obtained from either TAP or global mean-field theory, we find that the prediction m i ∝ k i averaged over all vertices of the same degree holds for small degrees (see Fig. 2(a)), albeit at a slightly downward shifted effective transition temperature (cf. Sec. VI for a discussion on the exact position of this crossover). For large degrees, however, the average behavior in the Monte Carlo data seems to deviate from this rule. This can be understood from what is essentially a combinatorial argument. For finite systems, the modulus of the magnetization is never exactly zero, even in the disordered case, but rather converges to a finite value. This finite value only converges towards zero with increasing number of spins over which the average is performed. Even though we somewhat alleviate this effect by measuring with respect to the net magnetization, the finite size of the network still affects the ordering of the nodes of larger degrees (see Fig. 2(b)): firstly, they tend to align to the net magnetization, due to the large energetic cost involved in flipping their orientation, so that in the limit of large k i , equation (12) effectively reduces to the modulus again. Secondly, by virtue of having a high degree, these nodes tend to be few. Keeping these two points in mind, one can understand the deviation from the m i ∝ k i scaling for large k i by this finite magnetization converging to a plateau as the magnetization eventually has to saturate to unity. One can also understand the degreedependent finite values of the local magnetization in the limit of high temperatures as seen in Fig. 4. The local structure however, remains visible across all temperature scales. Most prominently, the magnetization at the nearest neighbors of the hubs is enhanced as compared to the average value in Fig. 4(b). Generalizing equation (34), we can also calculate the average magnetization of nearest neighbors of a hub u. The effective field is the sum over all neighbors, therefore for a node of degree k, we find one contribution from m(k u ), and the remaining k − 1 connections couple to S, so the effective field reads In the linear regime, where m(k u ) = β TAP k u S and m NN,u (k) = h NN,u (k) we find that this equation well describes the entries for the nearest neighbours of the hub u in the leading eigenvector of B(β TAP ) in Fig. 4c. Making use of p c and averaging over all k u neighbors of u yields which predicts an elevated magnetization at those nodes as compared to the average value ( Fig. 3(b)). We thus find that the global approach, used here within the TAP approach, can be extended to account for the local differences in the ordered state of the system in a consistent manner.

VI. EFFECTIVE MAGNETIC TRANSITION ON FINITE NETWORKS
We further probe the TAP approach by calculating the specific heat C, and the susceptibility χ. Here, the susceptibility measures the response to a global, homogeneous external field h i = h for all i = 1, ..., N . In TAP and mean-field theory, the susceptibility is obtained as the inverse of the Hessian of G (see App. B), and the specific heat is computed from the numerical derivative of E. Both quantities are shown in Fig. 5.
Both in mean-field and in TAP approximation the susceptibility exhibits a pronounced peak at the point where the non-trivial solutions to (16) or (29), respectively, come into existence ( Fig. 5(b)). This is readily understood from considering the manifold of possible solutions to the equation of state (27): Above their respective transition temperature, both the mean-field and TAP equations allow only for a single trivial solution, for which the magnetization vanishes on all nodes, such that the expectation value of the absolute magnetization is exactly zero (see Fig. 1). Below the transition temperature, the equations feature a pair of solutions that are transformed into one another by the Z 2 -symmetry. This is also the reason why within this approximation, the susceptibility diverges even for finite system sizes (see Fig. 5) right at the temperature for which the transition from a unique solution to a triplet of solutions takes place: At the transition, G must have a saddle point, therefore the inverse of its Hessian, i.e., the susceptibility, diverges.
Such a sharp peak is however absent in the Monte Carlo data, as befits the exact solution on a finite system. Indeed, there cannot be any non-analytical behavior in the free energy on a finite system at finite temperatures,  (10). Yellow dots: Solutions of the local mean-field equation (16). Violet dots: Solutions of the TAP equation (29). Violet curve: Global mean-field (22) or, equivalently, TAP prediction (32). Black triangles: Leading eigenvalue of the adjacency matrix λA,max. Orange curve: Fit of λA,max to a N 1/4 + b to illustrate the N 1/4 scaling. and thus such sharp features cannot exist in the physical observables. Nevertheless, there appears a broad peak in the estimator (10) of the susceptibility for the symmetry broken regime. In all cases, we take the position of the peak in the susceptibility as an estimator for the effective transition temperature -this is the point at which the system is most sensitive to small perturbations.
To further compare the susceptibility from Monte Carlo to the TAP solution, we must account for the distinction between the symmetric and the symmetrybroken regime that we already mentioned in Sec. III. In the latter case, the second term in (8) provides a nonvanishing contribution, whereas in the former case, i.e., above the transition temperature, the magnetic susceptibility reduces to the second moment of the magnetization, because M = 0 in the non-symmetry broken regime. Based on this distinction, we indeed find good agreement between the TAP and the MC susceptibility at both small and large temperatures, as can be seen in Fig. 5. While the agreement in the high-temperature regime is expected, as the TAP equations are an asymptotic expansion in β, the good agreement in the low-temperature regime is pleasantly surprising. Only in the vicinity of the effective transition temperature do we observe notable differences between the TAP and Monte Carlo results. In fact, this concerns the value of the effective transition temperature T T itself, as TAP and Monte Carlo differ by a constant offset regarding the position of the maximum of the susceptibility. While this offset is still visible up to large networks with 10 5 nodes and does apparently not reduce in absolute magnitude (cf. Fig. 6), it nevertheless becomes negligible for large network sizes N , since in any case the value of T T diverges logarithmically with N . In Fig. 6, we show that in the mean-field and TAP approximation, we find that the transition temperature coincides with the expected result from the linear analysis, (18) and (32), respectively. Monte Carlo simulations confirm the logarithmic scaling (22) obtained from global mean-field theory or from TAP theory (37) up to a constant shift. This is consistent with previous work [10,11].
The specific heat C shows a distinct behavior from the susceptibility, with a maximum that is not related to the position of the effective transition temperature. Indeed, an analysis of its position for different system sizes confirms that the position of the maximum in C as well as the value of the maximum is size-independent and already explained reasonably well within mean-field theory. In particular, we find the position of the maximum to linearly shift to larger temperatures upon increasing the network parameter m 0 . This can be anticipated already within mean-field theory: Solving the linearized self-consistency equation in the continuum limit one obtains T max,C = 2m 0 , which exhibits no N -dependence (even though the predicted value is quantitatively off). Overall, we observe a good agreement between the Monte Carlo data for the specific heat and the TAP results. We note that [13] arrives at a similar expression for the specific heat. Physically, the maximum in the specific heat C shows that a substantial amount of entropy is released upon the full ferromagnetic alignment of the spin from a large number of nodes well below the initial onset of the ferromagnetic order, which, as discussed above, is an effect that is driven by the much lower number of highlyconnected hubs in the system.

VII. DISCUSSION
We investigated the onset of magnetic order in a system of Ising spins that interact on an undirected Barabási-Albert network. These networks are characterized by heavy-tailed degree distributions and thus possess hubs, nodes with a large degree of connectivity to other sites of the network. The transition from a disordered to an ordered state can be understood within mean-field theory of a global order parameter [12]. Monte Carlo simulations confirm that an effective magnetic transition occurs at an effective transition temperature that scales with log(N ), in line with mean-field theory. The global mean-field theory, however, neglects the local structure of the network that extends beyond the distribution of the nodes' degree. This results in two shortcomings: First, nodes with the same degree are predicted to have the same magnetization; Monte Carlo simulations, however, show this assumption to be violated, in particular for the neighboring nodes of the hubs. Second, we find that the mean-field theory for the global order parameter is inconsistent with the underlying mean-field theory for the magnetization of individual spins. This inconsistency is disturbing, because the former theory is derived from the latter. Here, we resolved these shortcomings upon including the leading systematic perturbative correction in the interaction strength to the mean-field approximation. This correction cancels the spurious, indirect selffeedback from one node onto itself, mediated by its direct neighbors, which is still present in mean-field theory. We identify this self-feedback as the cause of the inconsistency between the two mean-field approaches; the corrected mean-field (TAP) approach is indeed consistent between its local and its global formulation.
This improved theoretical understanding sheds light into the mechanism driving the transition into the heterogeneous system state. In particular, the theory qualitatively reshapes the role that hubs and their direct neighbors play for the onset of order in two ways: Hubs enforce a stronger alignment of their nearest neighbors as compared to the mean-field prediction, which accounts for only their neighbours' degrees. At the same time, local mean-field theory overestimates the importance of hubs for the build-up of order in the first place. The reason is a spurious self-feedback; if it was present, it would endow individual hubs to self-stabilize their magnetization and thus drive the transition already at temperatures T ∝ N 1 4 ≫ log(N ). The cancellation of the self-feedback by the perturbative corrections thus explains why hubs are less effective in driving global order than may be naively expected. Likewise, nodes in direct vicinity of hubs show stronger local order than predicted by their degree alone. The reason for this effect are the local fields that are stronger than expected by global meanfield theory.
The TAP equations take self-feedback -mediated by nearest neighbors -into account. However, such feedback can also come from loops in the connectivity structure, due to the overnext and over-overnext nearest neighbours and so on in a systematic high-temperature expansion [16,23,24]. In principle, an infinite number of corrections can be taken into account. It may be worthwhile to explore even higher order corrections to find if this self-consistent theory may explain the constant downward shift between the Monte Carlo and TAP results for the effective transition temperature. However, we expect these corrections due to higher order terms to be small, given (i) the small prefactor β and (ii) the good overall agreement between TAP theory and the Monte Carlo simulations.
The finite size of the system is taken into account by solving the TAP equations with respect to the full adja-cency matrix. It should be noted that in the limit of large N , the transition temperature shifts to infinite temperatures, so that the system is always in an ordered state in the thermodynamic limit. This defies the notion of regular phase transitions that would normally only acquire meaning in the thermodynamic limit.
To our knowledge, this work constitutes a successful effort to also account for the local structure in finite size networks, while global quantities such as the transition temperature, and the influence of network parameters, such as the clustering coefficient and average connectivity, have been investigated quite thoroughly in the past [12][13][14]25]. In comparison to [12][13][14], our method allows us to compute averages using the full connectivity structure, whereby local effects such as the bias fields become visible. The cancellation of self-feedback by perturbative corrections is reminiscent of the cavity method [18]; here, too, the field sensed by a spin needs to be computed in the absence of this very spin. The cavity method is prominent in the analysis of spin glasses.
Irrespective of the particular network architecture, we expect the cancellation of self-feedback to hold in Ising models, enabling one to find the transition temperature by means of a linear problem. We expect the distinction between local structure (for example, the immediate neighborhood of a specific node) and global hierarchy to be beneficial to the analysis of similar problems. In particular, we expect that qualitatively similar mechanisms on the local and on the global scale as described here shape the emergence of order in general heterogeneous systems.

VIII. ACKNOWLEDGMENTS
and proceed by expanding G in J to second order. Here, for J = 0 , i.e., the non-interaction case, is simply the G J=0 entropy of independent binary variables with mean values m i , The first derivative provides the expectation value of the interaction energy, which, when evaluated for independent variables, and fixed values of m i , gives To obtain the second derivative, one has to evaluate where the derivative of ∂ J h i can be found using the inverse of (27) and inserting G ≈ G J=0 + ∂ J G J=0 J. Summing all contributions and finally setting J = 1 again, one obtains (28) in the main text. For the susceptibility, we make use of the fact that ∇ h F and ∇ m G are (up to a sign) inverse functions of each other: After having determined m i for all i, using (29), we can therefore obtain the susceptibility χ ij = dmi dhj = − d 2 F dhj dh k by matrix inversion of the Hessian of G. However, the average response to a global field (i.e., h i = h for all i) is given by the sum over all entries One can verify by insertion that in order to determine the average susceptibility, one needs to simply solve for χ and finally perform the summation over all entries, Appendix C: Self-averaging on BA networks Here we show that global properties, such as the magnetization M and the energy E do not depend significantly on the specific realization of the BA networks. To this end, we calculate within the TAP approach the variances σ O of such an observable O, with where · R denotes the average over independent realizations of the BA network with given values of N and m 0 . The results are shown in Fig. 7. With increasing system size, neither the magnetization nor the energy show a significant variance, and furthermore, the variances decrease with increasing system size. When treating the Ising model on a network with hierarchical structure, such as a BA network, several considerations are required in order to choose an appropriate simulation method. When performing ordinary Metropolis Monte Carlo simulations, one quickly runs into the problem of "freezing hubs". The local update step is increasingly unlikely to flip nodes of high connectivity, due to the massive change of energy that is involved in such a single spin flip. The problem is akin to the emergence of domain walls in a regular Ising model on a lattice, although it quickly gets more severe as the degree of the hubs grows with increasing system size. With the hub frozen out, Metropolis updates are no longer able to sample the complete phase space, autocorrelations exhibit long time scales, and the measurement of physical observables is prone to large statistical errors or the effective breakdown of ergodicity.
To circumvent this problem in the dynamics of the simulation, the results obtained by Monte Carlo sampling reported here use a slightly modified version of the parallel-tempering method on the basis of the work of [28], as well as feedback-optimized methods first developed by [19], in combination with regular Metropolis update steps. After initializing a given BA network with a random initial distribution of the Ising spins, one creates M replicas of the given grid and evolves them separately at different temperatures, according to a local stochastic process, i.e. the Metropolis-Hastings algorithm. Then, periodically after a fixed amount of these local update steps, one checks for valid swaps between neighbors of replicas in temperature space, according to an appropriate pairwise transition probability p ij , which is determined by the state of the two replicas. If such an update step is accepted, the algorithm swaps the spin realizations of the two replicas, thus effectively allowing each replica to move through the full temperature space. Initially designed for the simulation of spin glasses, this sampling method allows us to overcome large energy barriers that would normally make the sampling difficult. More specifically, in the case of a BA network, the most prominent of these energy barriers is of course the flip of a "hub". Employing parallel tempering allows us to reduce autocorrelation times and recover an ergodic sampling within each single Monte Carlo simulation. A global update step constructed in this way, i.e. the swapping of two replicas between different temperatures, must of course fulfill the detailed balance condition of the overall Markov chain, and thus the swap probabilities are chosen to be p ij = min(1, e (Ei−Ej)(βi−βj ) ).
This of course raises the question on how to distribute the M different temperatures in [T 0 , T M ] used in a paralleltempering simulation. There exists a multitude of schemes to tackle this problem in the literature, each pursuing different objectives with regards to the simulation dynamics. The interpretation of replicas moving through temperature space allows for an adequate physical picture in terms of the replicas' diffusivity along the temperature axis. In order to optimize the gain in reduced autocorrelations from the parallel tempering method, one thus tries to maximize the number of full "round-trips" that a replica undertakes. In order to make this feasible, one tags each replica moving in temperature space as "up" when it reaches the minimum temperature and with "down" when it reaches the maximum temperature. One can then increment a temperature-wise histogram after every parallel tempering update step, by counting the number of up-walking and down-walking replicas at a given temperature. We keep track of this distribution by defining f (T i ) = n up n down + n up (T i ).
In [19] it is shown, that, in order to guarantee a maximum number of round-trips, this quantity is supposed to be a linearly descending series. Since f is monotonous, we can define an inverse g of f , such that g(f (T i )) = T i Feeding a linearly descending series to this f will create a new set of temperatures, which is then used in the next preliminary parallel tempering simulation, until the temperature grid converge and an optimal distribution of the simulation temperatures is reached. However, the originally proposed method ran into problems when applied to large BA networks. The shifting of temperatures in the self-optimization proved to be too severe, which results in the algorithm getting trapped between ever newly created diffusivity bottlenecks. As an extension of the original method, we thus define where ω ∈ [0, 1]. This addition effectively smooths out the rearranging of temperatures and guarantees a converging optimization. Usually ω = 0.75 was chosen and a few temperatures were manually assigned after the optimization in order to obtain a better resolution around physically interesting points. At the end of the procedure, we obtain a Monte Carlo algorithm that effectively samples the full phase space and produces more precise data using less computational resources.

Appendix E: Fluctuation-dissipation theorem
In mean-field theory, the presence of the spin at node i may be alternatively viewed as an inhomogeneous external field of strength h ext,j = A ji m i at node j. The response to this field on the spin at node j is given by χ j A ji m i , where χ j is the the susceptibility at node j, given by Up to factors of β, this is equal to the variance in accord with the fluctuation-dissipation theorem [20]. When sampling from an equilibrium distribution, the fluctuation-dissipation theorem applies, which states the equivalence between the fluctuation (E2) and the linear response (E1). In fact, both values are calculated in the same way, namely (up to factors of β) from the second derivative of the free energy F with respect to the external field h. The self-feedback of the response field χ j A ji m i at node j to the moment at node i is given by Summing over all neighbor nodes j, this gives precisely the TAP term in (29), but with opposite sign. For non-equilibrium systems (the non-equilibrium kinetic Ising model or directed networks of binary units), this equivalence between linear response and fluctuations is lost [29], as this argument does not produce the correct time arguments of the mean fields and the replacement A ij = A ji is no longer possible. This was also noted in [30]. Therefore, the cancellation of the spurious self-feedback present in the mean-field approximation by the TAP correction term cannot be expected outside thermal equilibrium.