Nonlinear walkers and efficient exploration of congested networks

Random walks are the simplest way to explore or search a graph, and have revealed a very useful tool to investigate and characterize the structural properties of complex networks from the real world, e.g. they have been used to identify the modules of a given network, its most central nodes and paths, or to determine the typical times to reach a target. Although various types of random walks whose motion is node biased have been proposed, which are still amenable to analytical solution, most if not all of them rely on the assumption of linearity and independence of the walkers. We introduce a novel class of nonlinear stochastic processes describing a system of interacting random walkers moving over networks with finite node capacities. The transition probabilities are modulated by nonlinear functions of the available space at the destination node, with a bias parameter that allows to tune the tendency of the walkers to avoid nodes occupied by other walkers. Firstly, we derive the master equation governing the dynamics of the system, and we determine an analytical expression for the occupation probability of the walkers at equilibrium in the most general case, and under different level of network congestions. Then, we study different type of synthetic and real-world networks, presenting numerical and analytical results for the entropy rate, a proxy for the network exploration capacities of the walkers.We find that, for each level of the nonlinear bias, there is an optimal crowding that maximises the entropy rate in a given network topology. The analysis suggests that a large fraction of real-world networks are organised in such a way as to favour exploration under congested conditions. Our work provides a general and versatile framework to model nonlinear stochastic processes whose transition probabilities vary in time depending on the current state of the system.


I. INTRODUCTION
Random walks are basic stochastic processes, which bear universal interest in light of their widespread and cross-disciplinary usage.Since the pioneering work by Pearson and Rayleigh, back in 1905 [1,2], the number of studies invoking the notion of random walker has grown rapidly, to eventually cover a broad spectrum of applications, from physics to engineering, via biology and economics.
Random walks have been thoroughly studied on regular lattices [3] and, more recently, on graphs displaying complex topologies [4][5][6][7].In the simplest possible scenario, the walker moves, with a uniform probability, from a given node i to one of its neighbours j.Alternatively, when the dynamics takes place on a weighted graph, one can gauge the probability of performing the move with the weight of the link (i, j) [8,9].Various other classes of random walkers are however possible on complex networks [7].The walk can be for instance biased on the topological properties of the nodes of the network, such as the node degree or the betweenness.In [10], the probability for a walker to perform a move is modulated by a power law of the degree of the target node.Tuning the scaling exponent enables one to steer the dynamics towards the hubs or favour, at variance, the motion towards low-degree nodes.Furthermore, when the nodes are also characterised by endogenous state variables, mirroring congestion or tagging local deficiencies, these can be considered as a feedback to modify the motion of individual agents [11].Metapopulation models of random walkers which integrate random relocation moves with local interactions depending on the node occupation probabilities have also been proposed in [12] and employed to extract information on the architecture of the underlying network.Mutual interference, as stemming from the competition for available spatial resources, is unavoidably present when many walkers are moving at the same time across the nodes of a given network [13].In Ref. [14] a model of transport on networks which accounts for the finite carrying capacity of the nodes has been proposed.In particular, it has been shown that the equilibrium density (stationary distribution) of crowded walkers saturates for large enough values of the connectivity, while conventional non-interacting agents have a stationary distribution which depends linearly on the nodes degree.
In this work we introduce and study a novel and general class of nonlinear Markov chains with transition probabilities that change in time depending on the current state of the system.These describe the motion of interacting random walkers whose probability to jump to arXiv:2006.05146v1 [cond-mat.stat-mech]9 Jun 2020 a node of a network is a nonlinear function of the number of walkers currently at the node.Such class of nonlinear random walkers provides a versatile, but at the same time analytically treatable, framework to study the dynamics of active agents that modulate their motion depending on the level of perceived congestion on the network.As a special case, we will study walkers whose probability to move from node i to a neighbour j scales as a power law of the occupation density of node j, with an exponent σ ≥ 0 that measures the anti-social behaviour of the walkers, i.e. their tendency to avoid nodes already occupied by other walkers.Under this framework we will prove that, for any given network and each selected value of σ, there is always an optimal value of the network crowding (the total load on the network), that maximises the entropy rate, i.e. facilitates the exploration of the network.We will also show that, in many real-world networks, the maximal value of the entropy rate is larger than that in randomised networks with the same degree distributions.

II. THE STOCHASTIC PROCESS
Consider a set of interacting agents (walkers) moving on an undirected network with N nodes, each endowed with a finite carrying capacity.For the sake of simplicity, we assume that all the nodes have the same carrying capacity, i.e. each of them can simultaneously host a maximum number of agents equal to M .The architecture of the network is described in terms of the binary adjacency matrix A = {a ij }, with a ij = 1 if there is a link connecting nodes i and j, while a ij = 0 otherwise.At each time t, the state of the system (our set of walkers) is specified by the vector m = (m 1 , . . ., m N ), where 0 ≤ m i ≤ M is the number of agents that belong to node i, at time t.The total number of walkers in the network is fixed in time and is a tunable parameter of the model.We can control it by introducing the average node crowding β = 1/N N i=1 m i /M .By definition, β ∈ (0, 1] quantifies the average node congestion, with β → 0 corresponding to the idealised diluted setting.Hence, we can tune the total number of walkers in the network, βM N , by independently changing M and β.Agents perform a biased random walk hopping between neighbouring nodes, provided there is enough space at the arriving destination.Differently from Refs.[10], the motion of the agents is not biased on the topological properties of the underlying graph but on the positions of the other agents in the network.More specifically, the bias results in two distinct contributions, respectively representing the willingness to leave a node i, and the attractiveness of the target node j.The first component is a function, f (x i ), of the density x i = m i /M on node i.The second term is made to depend on the available space 1 − x j = (M − m j )/M at node j, as g(x j ) ≡ ĝ(1 − x j ).As a natural constraint, we require that f (x) vanishes at zero, i.e. f (0) = 0, since no hops can take place from an empty node.Further, we assume that f (•) is a non-decreasing nonlinear function of x, a choice that amounts to modelling antisocial reactions of the walkers to enhanced crowded conditions, i.e. their tendency to avoid nodes already occupied by other walkers.Observe that the standard unconstrained random walk is eventually recovered when setting f (x) = x and g(x) = 1, for all x.The finite carrying capacity signifies that no transition towards node j can take place, if x j = 1, namely if the arrival node is fully packed.We hence require the self-consistent condition g(1) ≡ ĝ(0) = 0. Any possible choice of f (x) and g(x) fulfilling the above prescription is in principle possible.Notice that the linear model studied in [14] can be obtained as a particular case of our model if we fix f (x) = x and g(x) = 1 − x.On the other hand, adopting nonlinear functions for g(x), enables one to reveal a large plethora of interesting dynamical features, which reflect different modalities of active reaction to perceived crowding conditions, encompassing social/antisocial attitudes.
The evolution of our system of nonlinear interacting random walks is ruled by the master equation: where P (m, t) denotes the probability to find the system in the state m at time t, T (m |m) is the transition probability from state m to state m , and the sum is restricted to states m compatible with m [15].Because the transitions involve pairs (i, j) connected by a link, i.e. such that a ij = 1 and only increments and decrements by one unity are allowed, we get m = (. . ., m i ± 1, . . ., m j ∓ 1, . . .).The transition probabilities read: where k i = j a ij is the degree of node i.To make the notation compact, in the above expression we solely highlight the state components which are modified by the occurring transition [16][17][18][19][20].The calculation is however exact: all components are accounted for, and no approximation is involved (see Appendix A).
A straightforward manipulation yields (see Appendix A and [14,21]) the following equation for the time evolution of the mean-field node density ρ i (t) = lim M →∞ m i /M : where ∆ ij = a ij /k j − δ ij is the random walk Laplacian and the nonlinear operator L i (ρ) is defined by the rightmost equality.Notice that the above mean field equation has been obtained by neglecting terms which are 1/M smaller than the others.This is an approximation at M finite, but holds exactly in the limit M → ∞, i.e., when 1/M corrections vanish [16].From Eq. ( 1) it is immediate to conclude that the mass, namely the total number of walkers, is an invariant of the dynamics.The quantity N i=1 ρ i (t)/N is hence conserved and equals to the average node congestion β.

III. EQUILIBRIUM DISTRIBUTION
The stationary solution of Eq. ( 1) can be computed, for any choice of the nonlinear functions f and g (see Appendix B).We study here the case in which we set f (x) = x and g(x) = (1 − x) σ , with σ ≥ 0. Modulating the exponent σ, means selecting different exploration strategies of the walkers.More specifically, the larger σ the more the walkers will try to avoid densely populated nodes.In other terms, the value of σ quantifies the level of anti-social behaviour of the walkers.Notice that the diluted limit of non-interacting walkers is recovered by letting σ → 0 (and also β → 0).For the case at hand, the stationary solution ρ * i should match the following implicit equation: where c σ is a normalisation factor which depends on the selected σ.Recalling the definition of β yields c σ = βN/ j k j (1 − ρ * j ) σ , a condition which should complement Eq. ( 2) for a self-consistent determination of the stationary equilibrium.To interpret the above asymptotic solution we will draw a comparison with that obtained when assuming linear transition rates, σ = 1 When σ < 1, agents accumulate on the nodes characterised by a large degree, by consequently depleting those displaying modest connectivity (see Fig. 1).At variance, when σ > 1, hubs are progressively emptied and the walkers tend to preferentially fill peripheral nodes with respect to the linear case.It is instructive to compute the critical degree k crit where such inversion takes place for a generic σ with respect to the reference case σ = 1.A direct computation (see Appendix B 1) returns: In short, for all k i > k crit , we have ρ * i | σ<1 > ρ * i | σ=1 , while the opposite inequality holds true if k i < k crit .The sign of the inequalities reverse when σ > 1 (see Fig. 1).

IV. EXPLORATION UNDER CONGESTED CONDITIONS
The entropy rate of a random walk on a complex network characterises the walkers ability to explore the network, resulting in a non trivial indicator where topology and dynamical rules are mutually entangled [10,22,23].We will hence evaluate the entropy rate of the process under study to quantify the performance of the walkers in exploring a given network under different level of congestion.The entropy rate h of a stationary Markov chain with transition matrix Π = {π ij } and stationary distribution w * = {w * i } can be written as In the present case one gets: The entropy rates depends on the dynamics of the walkers, via the stationary probability ρ * i , the nonlinearity exponent σ, and the congestion parameter β, but also on the structure of the underlying network, via its adjacency matrix A = {a ij }.The entropy rate in Eq. ( 4) (normalised to the system size N ) can be rewritten in the Heterogeneous Mean Field (HMF) approximation, by dividing the nodes in different degree classes, considering the asymptotic densities of nodes with the same degree and performing sums over degree classes (see Appendix C) [10,24].Fig. 2 a) and c) show the entropy rate per node, h/N , versus β, for synthetic networks with the same average degree k and heterogeneous or homogeneous degree distributions respectively.Symbols refer to a direct (and exact) computation through Eq. ( 4).Solid lines are instead the results in the HMF approximation.We notice that, for any value of σ, there is an associated value of the crowding parameter β opt which maximises the entropy rate h/N .An adequate and network-dependent amount of congestion seems therefore necessary to favour the network explorability for any given level of anti-social behaviour (as measured by the value of σ).The maximum entropy h opt = h(β opt ) increases by decreasing σ, namely when the antisocial behaviour of the walkers is reduced.A trivial global optimum is eventually obtained when the constraint of a finite carrying capacity is completely re-moved.Notice also that the entropy rate approaches zero when β → 1, namely under extremely crowded conditions, i.e. when the agents are practically stuck in their positions.Interestingly, both β opt and the value of h opt depend on the topology of the network.As an example, when σ = 0.5, β opt ∼ 0.68 and h opt ∼ 0.48 × N for Erdős-Rényi random graphs, while β opt ∼ 0.64 and h opt ∼ 0.34 × N for scale-free networks.Complementary insights can be obtained by looking at the iso-level lines of h/N in the plane (β, σ) reported in Fig. 2 b) and d).In order to maintain the same level of explorability, the walkers need to adjust the value of the dynamic bias σ, depending on the traffic load β in the network.Intriguingly enough, σ is a non-monotonic function of β on iso-h curves.For small values of β, the walkers have to strengthen their antisocial behaviour (i.e. to increase σ) to keep the same value of h.Above a critical value of the average node crowding β, the walkers need instead to weaken their antisocial bias (i.e. to decrease σ).
FIG. 2. Entropy rate and iso-explorability on synthetic networks.The asymptotic entropy rate per node h/N is shown as a function of the average node congestion β and for different choices of the parameter σ.Panel a) refers to scale-free networks with N = 1000, γ = 2.5 and average degree k = 6.9.Panel c) is obtained for Erdős-Rényi networks, with N = 1000 and k = 6.9.Symbols refer to the exact computation performed from Eq. (4).Lines are the analytical predictions obtained in the heterogeneous mean field approximation.In panels b) (relative to SF networks) and c) (for the case of Erdős-Rényi graphs) the iso-level lines h/N are depicted in the reference plane (β, σ).A constant level of explorability is obtained by modulating σ as a nonlinear function of β.
Further, we have analysed how the average node degree of a network, impacts the entropy rate of the walkers.To this end, we build different Erdős-Rényi networks with the same number of nodes but different average node degrees.Fig. 3 shows the entropy rate per node as a func-tion of β and its maximum as a function of k , for three values of the nonlinear bias parameter σ (0.5 top panels, 1.0 middle panels and 2.0 bottom panels).Increasing the network connectivity, yields a global enhancement of the entropy rate and of its associated maximum.The larger the connectivity, in fact, the richer the variety of routes available to the motion of the walkers.As a further point, we stress that random architectures return lower entropy values at peak, as compared to lattices, a counter-intuitive conclusion that is made quantitative in the Appendix B 2, where we also derive closed analytical formulae for the entropy rate on k-regular lattices.
FIG. 3. Entropy rate versus k .The asymptotic entropy rate h/N per node and its maximum value are reported as a function of the network load (left panels) and the average nodes degree (right panels), for several values of the nonlinear bias parameter σ.Erdős-Rényi networks with N = 100 nodes have been used.Lines in the left panels are the analytical predictions.
Finally, we studied the properties of our model of nonlinear random walkers on several networks taken from the real world.We computed the entropy rate as a function of the crowding parameter, determining in each case the optimal values β opt and h opt , for several values of the nonlinear bias σ. Results are compared to those obtained on randomized versions of the networks.Two different types of randomization have been adopted: the first one preserves the degree of each node, while the second one maintains the network average degree only.As an example Fig. 4 shows the results obtained for: (a) a snapshot of the social network of Facebook [25], and (b) for the air transportation network among the 500 largest US airports [5,26].First, we confirm the non-monotonic be-haviour of the entropy rate: this latter vanishes for small and large values of β, and exhibits a maximum at an optimal value of the crowding parameter β opt .In addition to this, we notice that, for intermediate and large values of β, the entropy rate of the walkers on both these two real-world networks is larger than that on the randomised versions of the networks preserving the degree distribution.In Appendix D and Table I we report on the results obtained for a large collection of real networks.Although some of them can also exhibit smaller values of entropy rate than their randomised versions, we have found that all the networks analysed, which describe urban street patterns, achieve a better explorability.[25] and (b) the transportation network of the 500 largest US airports [5,26].Filled symbols refer to the average entropy rate obtained for an ensemble of 50 randomizations which preserve the node degrees of the two real networks.

V. CONCLUSIONS AND OUTLOOK
Summing up we have here discussed a general approach to the modelling of biased random walks, under crowded conditions.The formulation of the problem is not limited to the specific framework analysed here (see Appendix E for a generalisation in which also function f is a power law) and paves the way to devising novel algorithms for an efficient transport on networks, even in more complex adaptive settings where the dynamics of the walkers is coevolving with the underlying network [27].
Appendix A: From the master equation to the deterministic density evolution.
The goal of this section is derive the mean field equations for the densities, namely Eqs.(3) in the main body of the paper, from the Master Equation: where P (m, t) denotes the probability to find the system at time t in the state m = (m 1 , . . ., m N ).Recall that the above sum is restricted to states m compatible with m.Because at a given time only a walker can hop from a given node to one of its neighbours, the states m take the form m j = (. . ., m i ± 1, . . ., m j ∓ 1, . . .), for all j with a ij = 1.
Let us introduce the average number of agents in node i at time t, m i (t) ≡ m m i P (m, t), and the densities ρ i (t) = lim M →∞ m i (t) /M .Then, by taking the time-derivative of m i , recalling (A1) and accounting for the subsets of compatible states we get: or equivalently Consider now the transition probabilities.These latter are expressed in terms two functions, f node and g as: then we eventually get: By introducing the rescale time τ = t/M and performing the limit M → ∞ (which in turn amounts to neglecting correlations, i.e. f (•) = f ( • ), similarly for g) yields: By introducing the random walk Laplacian ∆ ij = a ij /k j − δ ij and making use of the symmetry of the adjacency matrix, we obtain the sought equation for the time evolution of the density ρ i : Appendix B: Asymptotic solution We now set to calculate the asymptotic density ρ * i as displayed on each node of the network.To do this end we equate to 0 the right hand side of Eq. (A2), and rewrite the ensuing equation as follows: that is for all i, (ψ 1 (i), . . ., ψ N (i)) T should be the eigenvector of the random walk Laplace matrix ∆ associated with the null eigenvalue.In other words, for some constant µ(i): Observe that ψ i (i) = 0 for all i and thus k i µ(i) = 0, which implies µ(i) = 0. Indeed, k i = 0 for all i since the network is connected.In conclusion, the asymptotic solution ρ * i must satisfy: Reordering the terms one gets: The above condition is met, for any i and j, only if the terms on the right and left hand-side equate to a constant c (namely if they do not bear a reflex of the associated index): For generic f and ĝ the previous equation can exhibit multiple solutions.To rule out such possibility, and eventually focus on the interesting setting where just one solution is allowed for, we can assume: (i) f to be non-decreasing function, vanishing at x = 0; (ii) ĝ to be non-increasing function, vanishing at x = 1.In such a way, by continuity, the curves f (ρ) and ckĝ(1 − ρ) intersect only once, for any choice of c > 0 and k > 0.
In Fig. 1 (main text) we have shown the non trivial behaviour of the stationary solution ρ * i as a function of σ and the node degree k i responsible for the interesting phenomenon of accumulation / depletion of hubs and leaves with respect to the case σ = 1.For any given σ > 0 there exists a unique critical values for the node degree, k crit , where such inversion takes place which indirectly defines "large" versus "small" degrees.
To compute such critical value we need to impose the equality among the stationary solution ρ * | σ , for σ = 1, and the same quantity for σ = 1, ρ * | σ=1 , both associated to a node with degree k.From Eq. ( 4) (main text) with σ = 1 we obtain and substituting this value again in (4) we get from which we straightforward obtain which gives the Eq. ( 5) (main text).

The case of k-regular networks
The asymptotic solution Eq. (B1) simplifies in the case of k-regular networks, for which i.e. k i = k for all i; in this case indeed, the dependence on the node index i disappears and thus all the nodes will display the same asymptotic density.The total mass conservation allows to determine the latter as independently of the nonlinear functions f and g.These latter are instead used in determining the normalising constant c entering in Eq. (B1) Given the exact asymptotic solution one can explicitly compute the entropy rate given by Eq. ( 4) (in the main text with the choice f (x) = x and g(x) = (1 − x) σ or the following Eq.(C1)).Indeed the sum over the index j allows to simplify j a ij with the degree k i at the denominator, while the second sum returns the factor N , being the remaining part independent from i.One gets therefore: where we have used that g(x) = ĝ(1 − x).
Assuming f (x) = x and g(x) = (1 − x) σ one can compute the value of β which maximises h for a fixed σ, namely β opt .Moreover one can calculate the parameter σ opt which returns the maximum of h for a fixed β.To this end one needs to perform the partial derivative, ∂ β h, respectively ∂ σ h, and equating these latter to 0. In this way one can can draw an interesting conclusion on σ opt ; indeed one can obtain and thus if k < eβ one gets σ opt > 0 while on the contrary one obtains σ opt < 0. The first constraint can be realised only with k = 2, that is for a 1D ring where each node is connected to its two neighbours, one on the left and one on the right, and for β sufficiently large, i.e. β > 2/e ∼ 0.736.These facts can explain why in the case of the Erdős-Rényi and scale-free networks one always found σ opt < 0 and thus h is a decreasing function of σ (see Fig. 7).
The computation for β opt follows the same reasoning.One can in particular obtain an implicit equation for the optimal value of β for a generic function g(x): .
For the particular choice g(x) = (1 − x) σ we obtain: We conclude this section by observing that more regular topologies can be associated to larger entropy rates and thus to a stronger ergodic behaviour.In particular in Fig. 5 we compare the maximum of the entropy rate achieved for a k-regular 1D lattice against the same quantity computed for an Erdős-Rényi network with the same average degree and the same number of nodes.We can observe that for all the values of the average degree, the entropy rate is alway larger in the case of the regular lattice than for the random network.
Remark 1 (The k-Cayley trees).A similar analysis can be performed in the case of k-Cayley trees, where each node has degree k (also called coordination number), but the leaves that by definition have degree 1.This implies that there will be two values for the asymptotic density, one associated to the leaves, ρ * out , and one for the remaining nodes, i.e. the inner ones, ρ * inn , determined by: The constraint on the conservation of the total mass and the observation that in the limit of infinitely large Cayley tree, i.e. for a diverging number of shells, the number of inner nodes divided by the number of leaves converges to 1/(k − 2), provide a third relation: From Eqs. (B5) and (B6) one can determine the three variables ρ * inn , ρ * out and c, and then again the entropy rate h.Let us observe that in this limiting case the average degree of the Cayley tree converges to 2 and thus we cannot fairly compare its entropy rate with the one obtained for the k-regular 1D lattice or the Erdős-Rényi network with the same average degree.
3. Analytical approximation for the asymptotic solution, when f (x) = x and g(x) = (1 − x) σ Assuming f (x) = x and g(x) = (1 − x) σ , σ > 0, for 0 ≤ x < 1 and 0 otherwise, the asymptotic solution for the density Eq. (B1) is implicitly given by Entropy and topology regularity.We compare, as a function of the network connectivity, the maximum of the entropy rate per node computed for a k-regular 1D-lattice and an Erdős-Rényi network with the same average degree and the same number of nodes (N = 50).Results show that the regular topology always exhibits the highest maximum.For large network connectivities the two computed quantities converge to a shared value (indeed both networks converge to the same complete network).
In the following we shall write ρ * i (σ) to stress the dependence on the parameter σ.For σ = 1 the solution to the latter problem takes the form [14] Let us introduce y i = 1 − ρ * i and rewrite the equation for the implicit solution as where for a sake of clarity we dropped the index i and we introduced κ = k i c.One can thus look for a series expansion of y(σ) in terms of (σ − 1), that should converge in a neighbourhood of σ = 1: Inserting this power series into Eq.(B8), recalling that and equating terms corresponding to the same powers of (σ − 1) on the left and the right hand sides of Eq. (B8), we can express y n as a function of the terms y l , 0 ≤ l ≤ n − 1.This recursive (infinite) system of equations can be explicitly solved.The first few terms are given by We can thus write, for σ ∼ 1, the following approximate solution: From the explicit form of the coefficients ρ * i,n one can analyse the dependence of the asymptotic density on the nodes degree and on the normalising parameter c, that, we recall, is a function of the crowding amount β, for fixed σ.Indeed we observe that for ck i 1 the zeroth order correction is of the order of the unity, ρ * i,0 → 1, while the high order corrections, n ≥ 1, do satisfy ρ * i,n = O((log ck i ) n /ck i ).Thus, they are negligible provided at least one among k i and c is sufficiently large.The former condition implies that i is a hub, the latter amounts to operate under crowded conditions, namely β → 1, which in turn implies c → ∞.On the other hand, ck i 1 (the network is connected and thus k i = 0 for all i) yields ρ * i,n = O((ck i ) n+1 ); hence, in very diluted conditions, β → 0, high degree nodes can exhibit a very low density.
To check the accuracy of the approximation we quantify the discrepancy between the approximate formula Eq. (B9), up to a given order m, ρ (m) i (σ), and the exact numerical solution of Eq. (B7), ρ * i (σ), both for the same fixed value of σ.The error is specifically defined as: .
Results reported in Fig. 6 testify on the accuracy of the proposed approximation for σ close to 1; observe that for over a significant window in σ, the error stays bounded to a few percents.The actual error depends also on the crowding parameter β and on the network topology.The top panels of Fig. 6 refer to a weakly crowded environment, β = 0.2, while bottom ones are obtained when considering a more pronounced degree of imposed crowding, β = 0.8.One can observe that the error deteriorates, as β increases.To test the impact of the topology of the underlying network, we created 10 Erdős-Rényi networks made by N = 100 nodes and assuming a probability for the existence of a link p = 0.2.For each network, we computed δ 7 (σ).The solid line in the right panels of Fig. 6 displays the average of the computed errors, while the boundaries of the grey shadow are set at one standard deviation by the mean.A similar behaviour (data not shown) is obtained when employing different schemes of network generation (as adopted in the main body of the paper).
Appendix C: Entropy rate and the Heterogeneous Mean Field hypothesis.
The aim of this section is provide additional information on the application of the approximate Heterogeneous Mean Field hypothesis (HMF).Working under this assumption, we will characterise the entropy rate and then derive a simplified formula which holds when correlations among nodes degree can be neglected.
Consider again the entropy rate given by: where a ij is the adjacency matrix of the underlying network, (k i ) 1≤1≤N the nodes degree and ρ * i the stationary probability.The first step consists in reorganising the sums as follows Then we invoke the Heterogeneous Mean Field hypothesis, namely we aggregate together nodes which share the same connectivity.Instead of summing on the node's index, we perform the sum on the degree [10].Let thus denote by P (k) the probability for a generic node to have degree k and let P (k |k) the conditional probability that a generic node with degree k is connected to a node with degree k , then : where ρ * k is the density of the nodes that share connectivity k.Assuming an uncorrelated network, P (k |k) = k P (k )/ k , we get: and using the equilibrium definition, g (1 − ρ * k ) = f (ρ * k )/(ck ), we eventually get: and after some straightforward computations Recalling that f (x) = x (for the case analysed in the main body of the paper) we get: In the main text we have shown that the entropy rate per node is a non-monotonic function of the crowding parameter β and thus the existence of an optimal value, β opt , i.e. a value for which h attains its maximum.The latter depends on the nonlinearity bias σ.In Fig. 7 we report results of some dedicated simulations proving that the optimal value of the crowding parameter is a decreasing function of the nonlinear bias σ, in the case of uncorrelated scale free networks, for several values of γ.A qualitatively similar result holds true also for other network topologies (data not shown).Optimal value for the crowding coefficient β opt as a function of σ.We report β opt (σ) as numerically obtained by means of the HMF hypothesis (curves), under the assumption P (k) ∼ 1/k γ (γ = 3 dashed line, γ = 2.5 dash-dotted line and γ = 3.5 solid line) and compare it with the corresponding quantity estimated for scale free networks which implement an identical scaling exponent (γ = 2.5 triangles, γ = 3.0 circles and γ = 3.5 squares).Each symbol is the average over 10 different networks realisations.
To study the impact of the assortativity on the entropy rate, we build 400 random assortative and disassortative synthetic networks made of N = 1000 nodes, each one characterised by its assortativity coefficient, r ∈ [−1, 1].For each network we computed the entropy rate using Eq. ( 4) (in the main body of the paper), or equivalently Eq. (C2), that is taking into account the possible correlations among nodes degree; then we applied a degree-preserving randomisation process to the network, namely we build a null model by rewiring links without changing the nodes degree.Eventually we computed the entropy rate by using the nodes degree distribution P (k) common to both networks, via Eq.(C2).This amounts in turn to neglect the degree correlations.Let us observe that in this way both networks exhibit same asymptotic nodes densities which depend only on the node degree for fixed β and σ, see Eq. (4) (in the main body of the paper).Hence, any possible differences as stemming from the usage of Eq. (C2) and Eq.(4) (in the main body of the paper) should be traced back to nodes degree correlations [34].To measure the discrepancies as originated by the two aforementioned formulas, we define ∆h = max β |h HMF − h uncorr HMF |.The results reported in Fig. 8 show a dependence of the latter on r; ∆h vanishes for r → 0 coherently with the fact that, for non-assortative networks h HMF and h uncorr HMF should eventually coincide.Then, ∆h is maximal for |r| → 1 and in the worst case scenario, the difference is bounded by a few percents.We also stress that this behaviour does not depend on the value of the nonlinear bias σ imposed (data not shown).

Appendix D: Real and synthetic Networks
The synthetic uncorrelated scale-free networks used in our analyses have been created by means of the configuration model [5], i.e. by drawing a set of positive integer numbers according to the distribution ∼ 1/k γ , and then by using the latter as the degree sequence.In the Erdős-Rényi networks with N nodes, each of the N (N − 1)/2 edges is created independently with probability p ER .
To study the behaviour of our model of nonlinear random walkers on real-world networks we have selected various networks from different domains and with a different number of nodes, links and level of assortativity.The networks considered and the results obtained are summarized in Table I.In particular, we have evaluated the maximal entropy rate h opt for each network, and we have compared this value to that obtained in two different types of null models.The quantity ∆h opt rew (resp.∆h opt rnd ) denotes the difference between the value of h opt and that of a null model (both normalised to the system size) that consists in randomising the original network under the assumption of preserving We then compute the deviation between the entropy rate hHMF and the analogous quantity obtained when degree correlations are silenced using a null model, h uncorr HMF .We can observe a slight dependence of ∆h = max β |h/N − hHMF| on r, which in the worst case scenario reaches a few percents.its node degrees (resp.the average degree) in the randomisation.In formulae: Let us observe that, by definition, networks in the first type of null model have the same degree sequence as the original real-world network.The asymptotic distribution in Eq. (B1) depends, for a fixed load β, only on the degree sequence of the network.We can hence conclude that both the real network and the rewired ones exhibit the same asymptotic density of the walkers ρ * i for all i = 1, . . ., N .From the expression of the entropy rate in Eq. (C1) we can thus obtain that h null,rew differs from h only for the following contribution coming from differences in the adjacency matrices: where a ij is the adjacency matrix of the original network while a (null,rew) ij the one obtained after the degree-preserving randomisation.Differences between the two matrices are limited by be constraint imposed by fixing the degree sequence, and so are the differences between the entropy rates.From the values of ∆h opt rew we observe that in general real-world networks, with the exception of Internet at the autonomous systems level and some social networks perform better in terms of explorability especially in the case of walkers with large values of σ.
On the other hand, using a null model that only preserves the average degree of the original network, we will obtain larger variations of the entropy rates because now the asymptotic density will also differ (see Fig. 9).Randomised networks obtained in this way in general exhibit larger maximal entropy rates, with the notable exception of urban street patterns (see Table I and Fig. 10) that have not only a positive value of ∆h opt rew , but also always a positive value of ∆h opt rnd .In these systems, in fact, randomised networks with the same average degree performs worse in term of explorability, namely their entropy rate is always smaller than the one of the original network.Based on this it is tempting to speculate that road network have been assembled so as to optimise their structure for transport under congested conditions: any randomised version, that disrupt the local organisation of crossroads, will perform worse.
Our results imply that the entropy rate per node, for the synthetic networks, the real ones or a randomised version of these latter, exhibits the same functional behaviour; h vanishes for very small and very large values of β and then achieves a single maximum at an intermediate value of the parameter.This value, β opt , corresponds thus to an optimal network crowding (the total load on the network) that facilitates the exploration of the network, being a maximum of the entropy rate.In Table II we report, for three choices of the nonlinear parameter σ, the values of the optimal crowding computed for the real networks presented in Table I, together with the corresponding differences with respect to the same quantities computed for the null model networks obtained by randomising the original network under the assumption of preserving its node degrees (resp.the average degree).In formulae: FIG. 9. Comparison of the entropy rates for the different null models.We report in the main panel the entropy rate for the real network "C.Elegans frontal" [30] (circles), the one computed using the HMF assumption (black line) and the one for the randomised network preserving the average degree (shaded grey areas corresponding to the interval [min h/N, max h/N ] for the 50 replicas).In the panels a-b-c we zoom in the vicinity of the maximum of the entropy rate to appreciate the similarity with the entropy rate obtained for the degree-preserving null model.FIG. 10.Entropy rates for the road networks.We report the entropy rate for the road network of London [5,31] (circles) together with the same quantity computed using the randomised network preserving the average degree (shaded grey areas corresponding to the interval [min h/N, max h/N ] for the 50 replicas).The solid black line denotes the entropy rate computed using the HMF assumption, that also coincide with the entropy rate obtained using the randomised model that preserves the degree distribution (not visible at this scale).
To conclude, let us consider the role of the degree-degree correlations on the computation of the entropy rate of nonlinear random walkers.In few cases, e.g. the three Internet Autonomous Systems networks and the US Airplane network the randomisation process is not able to completely destroy the degree correlations and thus the entropy rate for the null model will deviated from the same quantity computed in the HMF approximation (see Fig. 11 top panels for the case of the US Airplane network).On the other hand, once the randomisation is able to wash out the degree-degree correlations, we obtain a satisfying agreement between the null model entropy rate and the HMF approximation.We recall in fact that the HMF approximation implies neglecting degree-degree correlations (see Fig. 11 bottom panels for the case of the Facebook network).FIG.11.The role of degree-degree correlations.Results reported for the US Airplane network [5,26] (top panels) and Facebook [25] (bottom panels) indicate that, when the randomisation (preserving the degree distribution) is not able to completely wash out the degree-degree correlations, then the entropy rate of the randomised network (grey shaded area in panels a and d) differs from the one computed in the HMF approximation (black lines in panels a and d).In both cases, the circles denote the entropy rate for the original network.We report for the original network (panels b and e ) and the randomised one (panels c and f) a measure of the presence of degree-degree correlations, namely the average degree knn of neighbours of nodes of degree k as a function of k.The similarity of the scatter plots in panels b and c suggest the presence of degree-degree correlations in the randomised version of the US airplane network.Comparison of panels e and f suggests instead that correlations have been destroyed in the case of Facebook.
Appendix E: Different choices of the bias functions f and g The aim of this section is to introduce and study some more general classes of biases functions f and g.The simplest straightforward generalisation is to consider f (x) = x α and g(x) = (1 − x) σ for 0 ≤ x < 1 and 0 otherwise , for some real positive α and σ.Under such assumption the asymptotic solution given by Eq. ( 4) (in the main body of the paper) rewrites: Observe that the function on the left hand side equals 0 for ρ * i = 0 and monotonically diverge to +∞ for ρ * i → 1 − .Hence, for any c σ k i , there exists one and only one value for ρ * i satisfying the equality.Stated differently looking for the asymptotic solution corresponds to finding the intersection of the two curves y = x α and y = λ(1 − x) σ (see Fig. 12 for two generic cases), and by continuity such intersection is alway unique.
T a I J Y z q z D X 7 A e P 8 C e M e c 9 Q = = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " T a I J Y z q z D X 7 A e P 8 C e M e c 9 Q = = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " T a I J Y z q z D X 7 A e P 8 C e M e c 9 Q = = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " A S s z w g o L f X y u e T k 7 h q 6 N z F 4 H T i 1 n F 6 + Y B a P q + W S X c Z m 0 T Q r V s l K S a l i H 9 n Y 0 k q K A p q j 3 s u / d

TABLE II.
Nonlinear random walkers on real-world networks (II).For three different choices of the nonlinear bias, namely σ = 0.5, 1 and 2, we report the optimal value of the crowding parameter, β opt , for the original network and and in the two null models we used (degree distribution and average degree preserving) ∆β opt rew and ∆β opt rnd .50 different network realisations have been considered to evaluate averages and standard deviations for each null model.

FIG. 1 .
FIG. 1.The stationary solution.(a) The stationary distribution of the walkers at a node of degree k is determined as the intersection between the line ρ/(cσk) and the curve (1 − ρ) σ .The normalisation constant cσ depends on σ, the level of crowding in the network.(b) ρ * is plotted versus k.When σ < 1 agents cluster on nodes with a large degree, while for σ > 1 hubs are progressively depleted (with respect to the case σ = 1).The solutions obtained for σ = 1 intersect the curve relative to σ = 1 at k crit .
FIG.5.Entropy and topology regularity.We compare, as a function of the network connectivity, the maximum of the entropy rate per node computed for a k-regular 1D-lattice and an Erdős-Rényi network with the same average degree and the same number of nodes (N = 50).Results show that the regular topology always exhibits the highest maximum.For large network connectivities the two computed quantities converge to a shared value (indeed both networks converge to the same complete network).

FIG. 6 .
FIG.6.Analytical approximation for the asymptotic solution.Left panels (a & c): We compare the approximate formula up to the m-th order (m = 3 dotted line, m = 5 dashed line and m = 7 solid line) with the exact solution of Eq. (B7), as obtained via numerical methods.Right panels (b & d): for a fixed approximation order, m = 7, we compute the average error δ7(σ) over 10 realisations of the underlying network.The boundaries of the grey shadow are at one standard deviation from the mean.Top panels refer to β = 0.2 while bottom ones to β = 0.8.The underlying network is generated according to the Erdős-Rényi recipe, with N = 100 nodes and probability for the existence of a link p = 0.2.
FIG. 7.Optimal value for the crowding coefficient β opt as a function of σ.We report β opt (σ) as numerically obtained by means of the HMF hypothesis (curves), under the assumption P (k) ∼ 1/k γ (γ = 3 dashed line, γ = 2.5 dash-dotted line and γ = 3.5 solid line) and compare it with the corresponding quantity estimated for scale free networks which implement an identical scaling exponent (γ = 2.5 triangles, γ = 3.0 circles and γ = 3.5 squares).Each symbol is the average over 10 different networks realisations.

3 FIG. 8 .
FIG.8.Entropy rate for assortative/disassortative synthetic networks.We built several random assortative and disassortative synthetic networks, made of N = 1000 nodes, each one characterised by its assortativity coefficient, r ∈ [−1, 1].We then compute the deviation between the entropy rate hHMF and the analogous quantity obtained when degree correlations are silenced using a null model, h uncorr HMF .We can observe a slight dependence of ∆h = max β |h/N − hHMF| on r, which in the worst case scenario reaches a few percents.
r / h x l 9 x 4 0 I R l 7 r y b 8 x o B R U 9 E D i c c y 4 3 9 7 i x 4 E o j 9 G Z l x s Y n J q e y 0 7 m Z 2 b n 5 h f l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " 4 h 6 n f z8 O I y z m g Y u Y G e o k q W P v g 2 E = " > A A A C G X i c d V D L S g M x F M 3 U V 6 2 v q k s 3 w a J U 0 J I U s S 2 o F N 2 4 r G C 1 0 B l L J p N p Q z M P k 4 x Y i r / hx l 9 x 4 0 I R l 7 r y b 8 x o B R U 9 E D i c c y 4 3 9 7 i x 4 E o j 9 G Z l x s Y n J q e y 0 7 m Z 2 b n 5 h f l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " 4 h 6 n f z8 O I y z m g Y u Y G e o k q W P v g 2 E = " > A A A C G X i c d V D L S g M x F M 3 U V 6 2 v q k s 3 w a J U 0 J I U s S 2 o F N 2 4 r G C 1 0 B l L J p N p Q z M P k 4 x Y i r / hx l 9 x 4 0 I R l 7 r y b 8 x o B R U 9 E D i c c y 4 3 9 7 i x 4 E o j 9 G Z l x s Y n J q e y 0 7 m Z 2 b n 5 h f + T d O 2 y y 0 9 c D A 4 Z x z m X t P k A q u j e t + O 6 W N z a 3 t nf J u Z W / / 4 P C o e n z S 0 U m m K G v T R C S q F 6 B m g k v W N t w I 1 k s V w z g Q r B t M 7 u Z + 9 4 k p z R P 5 a K Y p 8 2 M c S R 5 x i s Z K 3 Y G w 0 R C H 1 Z p b d x c g 6 8 Q r S A 0 K t I b V r 0 G Y 0 C x m 0 l C B W v c 9 N z V + j s p w K t i s M s g 0 S 5 F O c M T 6 l k q M m f b z x b o z c m G V k E S J s k 8 a s l B / T + Q Y a z 2 N A 5 u M 0 Y z 1 q j c X / / P 6 m Y l u / J z L N D N M 0 u V H U S a I S c j 8 d h J y x a g R U 0 u Q K m 5 3 J X S M C q m x D V V s C d 7 q y e u k c 1 X 3 L H + 4 r j V v i z r K c A b n c A k e N K A J 9 9 C C N l C Y w D O 8 w p u T O i / O u / O x j J a c Y u Y U /s D 5 / A E 8 x 4 9 9 < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " L h a 9 Q B + T d O 2 y y 0 9 c D A 4 Z x z m X t P k A q u j e t + O 6 W N z a 3 t nf J u Z W / / 4 P C o e n z S 0 U m m K G v T R C S q F 6 B m g k v W N t w I 1 k s V w z g Q r B t M 7 u Z + 9 4 k p z R P 5 a K Y p 8 2 M c S R 5 x i s Z K 3 Y G w 0 R C H 1 Z p b d x c g 6 8 Q r S A 0 K t I b V r 0 G Y 0 C x m 0 l C B W v c 9 N z V + j s p w K t i s M s g 0 S 5 F O c M T 6 l k q M m f b z x b o z c m G V k E S J s k 8 a s l B / T + Q Y a z 2 N A 5 u M 0 Y z 1 q j c X / / P 6 m Y l u / J z L N D N M 0 u V H U S a I S c j 8 d h J y x a g R U 0 u Q K m 5 3 J X S M C q m x D V V s C d 7 q y e u k c 1 X 3 L H + 4 r j V v i z r K c A b n c A k e N K A J 9 9 C C N l C Y w D O 8 w p u T O i / O u / O x j J a c Y u Y U /s D 5 / A E 8 x 4 9 9 < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " L h a 9 Q B + T d O 2 y y 0 9 c D A 4 Z x z m X t P k A q u j e t + O 6 W N z a 3 t nf J u Z W / / 4 P C o e n z S 0 U m m K G v T R C S q F 6 B m g k v W N t w I 1 k s V w z g Q r B t M 7 u Z + 9 4 k p z R P 5 a K Y p 8 2 M c S R 5 x i s Z K 3 Y G w 0 R C H 1 Z p b d x c g 6 8 Q r S A 0 K t I b V r 0 G Y 0 C x m 0 l C B W v c 9 N z V + j s p w K t i s M s g 0 S 5 F O c M T 6 l k q M m f b z x b o z c m G V k E S J s k 8 a s l B / T + Q Y a z 2 N A 5 u M 0 Y z 1 q j c X / / P 6 m Y l u / J z L N D N M 0 u V H U S a I S c j 8 d h J y x a g R U 0 u Q K m 5 3 J X S M C q m x D V V s C d 7 q y e u k c 1 X 3 L H + 4 r j V v i z r K c A b n c A k e N K A J 9 9 C C N l C Y w D O 8 w p u T O i / O u / O x j J a c Y u Y U /s D 5 / A E 8 x 4 9 9 < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " L h a 9 Q B y o o k D T H p 4 y 5 t a e p j j 0 p 7 O L 5 r B P e 0 4 s J O I P T z F R y r 3 y e G 2 J N y 4 D k 6 6 W H y o o k D T H p 4 y 5 t a e p j j 0 p 7 O L 5 r B P e 0 4 s J O I P T z F R y r 3 y e G 2 J N y 4 D k 6 6 W H y o o k D T H p 4 y 5 t a e p j j 0 p 7 O L 5 r B P e 0 4 s J O I P T z F R y r 3 y e G 2 J N y 4 D k 6 6 W H y o o k D T H p 4 y 5 t a e p j j 0 p 7 O L 5 r B P e 0 4 s J O I P T z F R y r 3 y e G 2 J N y 4 D k 6 6 W H