Localization of eigenvector centrality in networks with a cut vertex

We show that eigenvector centrality exhibits localization phenomena on networks that can be easily partitioned by the removal of a vertex cut set, the most extreme example being networks with a cut vertex. Three distinct types of localization are identified in these structures. One is related to the well-established hub node localization phenomenon and the other two are introduced and characterized here. We gain insights into these problems by deriving the relationship between eigenvector centrality and Katz centrality. This leads to an interpretation of the principal eigenvector as an approximation to more robust centrality measures which exist in the full span of an eigenbasis of the adjacency matrix.


I. INTRODUCTION
Cataloging individual nodes and the connections between them forms the underlying data in many areas of science and technology, such as the World Wide Web, social networks, biochemical pathways, transportation networks and power grids [1][2][3]. The underlying concept of a graph or network is the same across disciplines, so it is not surprising that the same issues emerge. A common problem is how to identify which nodes are most significant. This is valuable if we wish to identify the most important pages on the web, to find the most influential people from social media analysis, or to target resources at controlling an epidemic on a network of contacts.
Measures of node importance are often termed 'centrality' [4][2][3]. Degree centrality is the most obvious measure of the relative importance of nodes and refers to how many nearest neighbours a given node has. In general, the structure of a network can be represented by an adjacency matrix A such that element A_ij = 1 if node j is connected towards node i and A_ij = 0 otherwise. For an n by n adjacency matrix A representing an undirected network, degree centrality is given by

c = A1,   (1)

where 1 is a column vector of ones of length n; on a directed network, this represents the in-degree. One of the main deficiencies of degree centrality is that a simple tally of the number of neighbours does not account for whether those neighbours are themselves important. Generally it is reasonable to suppose that nodes with high centrality should confer a higher centrality onto their neighbours than lower centrality nodes. A standard method for resolving this problem is eigenvector centrality [5], which relates to the eigenvalue equation for A:

µu = Au.   (2)

Comparing with (1), the eigenvalue equation has the required form; instead of summing over the number of neighbouring nodes with equal weight, we have a weighted sum where each neighbour contributes centrality in proportion to its own centrality u. For this equation to have a solution, it is of course required that µ is an eigenvalue of A and that u is its corresponding eigenvector.
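As a concrete illustration of (1) and (2), the following minimal Python sketch (using only NumPy; the small example graph is our own and not taken from the text) computes both measures for an undirected network:

```python
import numpy as np

# A small undirected example graph (a triangle with a pendant node);
# this particular graph is illustrative only.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Degree centrality, eq. (1): c = A1.
c = A @ np.ones(A.shape[0])

# Eigenvector centrality, eq. (2): the eigenvector of the largest
# eigenvalue, which Perron-Frobenius guarantees can be taken non-negative.
eigvals, eigvecs = np.linalg.eigh(A)
u = eigvecs[:, np.argmax(eigvals)]
u = np.abs(u)  # fix the global sign so the Perron vector is non-negative

print(c)           # degrees of the four nodes
print(u / u.max()) # eigenvector centrality, normalized to a maximum of 1
```

Note that the highest-degree node also receives the highest eigenvector centrality here; the two measures need not agree in general.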
From the Perron-Frobenius theorem, the principal eigenvalue has a non-negative eigenvector in its eigenspace. Consequently this is the solution of (2) which is used to define eigenvector centrality. To set the scene for what follows, we note that while non-negativity may be a desirable attribute for a centrality measure, and while we should expect the principal eigenvector to contain more information than the other eigenvectors, this is not sufficient grounds to neglect the other eigenvectors. Essential ranking information could exist in the direction of the other eigenvectors, and so we cannot guarantee that eigenvector centrality will always give a sensible ranking of node importance.
Problems with eigenvector centrality known as localization have been observed, whereby the centrality is localized on just a few nodes in the network. This is particularly apparent when networks have highly connected hub nodes [6][7][8][9][10], but also occurs when networks have high modularity [11].
Here we derive the general form of eigenvector centrality in the presence of hub nodes and cut-vertices. This enables us to make a detailed analysis of eigenvector centrality on these networks and leads to the identification of three distinct types of localization.

II. EIGENVECTOR CENTRALITY ON NETWORKS WITH CUT-VERTICES
We consider a network with adjacency matrix A and a cut-vertex such that its removal results in m disconnected components (or partitions) with adjacency matrices P_i of size p_i × p_i for i ∈ (1, 2, . . ., m). The adjacency matrix A has the form

A = ( P_1   0     · · ·  0     a_1
      0     P_2   · · ·  0     a_2
      ⋮     ⋮     ⋱      ⋮     ⋮
      0     0     · · ·  P_m   a_m
      b_1   b_2   · · ·  b_m   0  ).   (3)

Here, the notation 0_{p×q} denotes the p by q zero matrix, the column vector a_i describes connections from the cut-vertex to partition P_i and the row vector b_i describes connections from partition P_i to the cut-vertex. For simplicity we shall suppose that A and the partitions P_i are irreducible (strongly connected).

arXiv:1809.00810v1 [physics.soc-ph] 4 Sep 2018
The form of eigenvector centrality for this network can be obtained by adapting the method in Martin et al. [9], which considered m = 1. Suppose that the principal eigenvalue is µ and that the corresponding eigenvector is u = (x_1, x_2, . . ., x_m, v)^T, where x_i are column vectors of length p_i and v is a scalar. Substituting this and (3) into the eigenvalue equation (2) gives

µx_i = P_i x_i + a_i v   for i ∈ (1, 2, . . ., m),

µv = Σ_{i=1}^m b_i x_i.   (4)

Solving this for x_i gives

x_i = (v/µ) M_i a_i,   (5)

and by substituting these values into (4) we obtain

µ² = Σ_{i=1}^m b_i M_i a_i,

where

M_i = (I − P_i/µ)^{−1}.   (6)
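The closed form (5) is straightforward to check numerically. The sketch below (our own construction, not from the text: two triangles joined through a cut-vertex) builds the block matrix (3), computes the principal eigenvector, and verifies that the partition components satisfy (5):

```python
import numpy as np

# Two triangle partitions joined through a single cut vertex.
P1 = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
P2 = P1.copy()
a1 = np.array([1.0, 0.0, 0.0])   # cut vertex -> first node of P1
a2 = np.array([1.0, 0.0, 0.0])   # cut vertex -> first node of P2

n = 7
A = np.zeros((n, n))
A[:3, :3] = P1
A[3:6, 3:6] = P2
A[:3, 6] = a1                    # undirected network, so b_i = a_i^T
A[6, :3] = a1
A[3:6, 6] = a2
A[6, 3:6] = a2

vals, vecs = np.linalg.eigh(A)   # ascending eigenvalues
mu = vals[-1]
u = np.abs(vecs[:, -1])          # Perron vector, made non-negative
x1, v = u[:3], u[6]

# Eq. (5): x_i = (v/mu) M_i a_i  with  M_i = (I - P_i/mu)^(-1).
M1 = np.linalg.inv(np.eye(3) - P1/mu)
x1_pred = (v/mu) * M1 @ a1
print(np.allclose(x1, x1_pred))
```

The same check applies to x_2 by symmetry; the agreement is exact because (5) is an identity for the principal eigenvector, not an approximation.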

III. LOCALIZATION OF EIGENVECTOR CENTRALITY
Since A is irreducible and P_i is a subgraph of A, µ dominates the eigenvalues of P_i ([12], pp. 83-84). Consequently, for a class of networks whose spectral radius µ(n) scales with size such that lim_{n→∞} ν_i(n)/µ(n) = 0, where ν_i(n) is the spectral radius of partition P_i, it follows from (5) and (6) that the centrality of the nodes in P_i tends towards a_i and becomes uninformative. In the m = 1 case, this can lead to hub node localization, where an unreasonable focusing of centrality on the hub node and its immediate neighbours occurs. This has been observed on several networks [6][7][8][10] and established as a phase transition on a class of undirected random graphs [9].
Here we note that, in addition, the form (5) is also problematic because its values for all nodes in subgraph P_i are directly dependent on the nodes, defined by a_i, that the cut-vertex connects towards. This suggests that the choice of connecting nodes could have a non-local impact on the entire subgraph.
To investigate this, let us suppose that A is undirected, which enables us to assume an orthonormal eigenbasis for P_i given by the vectors (w_i^1, w_i^2, . . ., w_i^{p_i}) with corresponding eigenvalues λ_i^1, λ_i^2, . . ., λ_i^{p_i}. We can write the vector a_i in this basis:

a_i = Σ_{j=1}^{p_i} g_i^j w_i^j,

with coordinates given by the projection of a_i onto the relevant basis vectors: g_i^j = a_i · w_i^j for j ∈ (1, 2, . . ., p_i). Additionally, decomposing the inverse matrix (6) as a power series in P_i/µ,

M_i = Σ_{k=0}^∞ (P_i/µ)^k,

now lets us write

x_i = (v/µ) Σ_{j=1}^{p_i} g_i^j (1 − λ_i^j/µ)^{−1} w_i^j.   (7)
To keep the notation simple, let us denote the leading eigenvector of P_i by w_i and its associated eigenvalue by λ_i. We can now make a leading eigenvector approximation to capture the main characteristics of (5) in most circumstances:

x_i ≈ (v/µ)(a_i · w_i)(1 − λ_i/µ)^{−1} w_i.

From this, the ratio of the average centrality in subgraph P_1 to the average centrality in subgraph P_2 is approximated by

ρ = (p_2 ‖w_1‖_1)/(p_1 ‖w_2‖_1) × (a_1 · w_1)/(a_2 · w_2) × (µ − λ_2)/(µ − λ_1).   (8)

The first factor can be shown to be bounded between √p_2/p_1 and p_2/√p_1 by making use of the bounds ‖x‖_2 ≤ ‖x‖_1 ≤ √p ‖x‖_2 provided by the l_2-norm on the l_1-norm for a vector x of dimension p. For subgraphs of similar size and type, its value is typically close to 1.
To investigate the second factor, it is informative to consider the situation where the two partitions P_1 and P_2 are isomorphic, so that w_1 = w_2, p_1 = p_2 and λ_1 = λ_2, and we are left with just the second factor: ρ = (a_1 · w_1)/(a_2 · w_1). In addition to describing the ratio between the average centralities of the partitions, in this particular case it also gives the ratio between corresponding nodes. An example of such a network is shown in Fig. 1, where the classic karate club network of Zachary [13] has been duplicated and then linked by an additional connecting node of degree two (the cut-vertex). The only difference between the duplicated subgraphs is that the cut-vertex connects to node 2 in the left subgraph and to node 13 in the right subgraph.

FIG. 1. The classic karate club network [13] is duplicated. Node 2 in the left network is connected to node 13 in the network on the right via an additional node (CV) which is a cut-vertex. The size of the nodes increases with their eigenvector centrality and the colour changes from white through to green with increasing eigenvector centrality.
Identifying the left subgraph with P_1 and the right subgraph with P_2, we obtain ρ = 3.157, rounded to four significant figures. Here, ρ reduces to the ratio of the eigenvector centrality of node 2 to the eigenvector centrality of node 13 when computed on the original (single) karate club network. When computing the thirty-four individual ratios of node i in the left subgraph to node i in the right subgraph for i ∈ (1, 2, . . ., 34), we find that, excluding the corresponding pairs of nodes 2 and 13, this has mean 3.157 rounded to four significant figures, with standard deviation 0.015. Corresponding nodes 2 have ratio 3.236 and corresponding nodes 13 have ratio 2.502, reflecting differences due to the connection of these nodes to the cut-vertex, which would be described by including the other terms in (7). The small node-specific variation described by the standard deviation also relates to the contributions from the other terms in (7), which are neglected in (8). These terms are small because the values µ = 6.738 and λ_1 = λ_2 = 6.726 cause the terms with the largest eigenvalue to be far more significant.
For most applications, we would expect a useful centrality measure to provide more or less the same centrality values to the corresponding nodes in the two subgraphs, except for some deviation near to the connecting nodes. However, a network-wide impact of the choice of connecting nodes is observed, whereby the centralities in the left subgraph are significantly greater than those in the right subgraph. By changing the choice of connecting nodes, a non-local network-wide impact on every node occurs as the value of the ratio ρ changes.
When subgraphs P_1 and P_2 are different, there are contributions from all factors in (8). The last factor depends on the principal eigenvalues of subgraphs P_1 and P_2. We should expect some dependence, but this term can be very sensitive to whether λ_1 or λ_2 is closest to µ, and consequently ρ can be very large or small because of this. Additionally, as in the previous example, we also have dependence on the second factor describing the non-local impact of the choice of connecting nodes. Fig. 2 illustrates two different Erdős-Rényi random graphs of the same order and similar density joined together. Here most of the centrality is in the top-right subgraph, demonstrating localization behaviour. The average eigenvector centrality of nodes in the upper subgraph is found to be 7.301 times greater than that of nodes in the lower subgraph. This is captured by (8), which gives ρ = 7.312. The first factor in (8) has value 1.024. The second has value 0.3917, which partly reflects the fact that the lower graph has five connections to the cut-vertex and the upper one has three, leading to a greater share of the eigenvector centrality of the lower subgraph being directly connected to the cut-vertex than of the upper one. The third factor is 18.23, illustrating sensitivity to the principal eigenvalues, which have values µ = 7.190, λ_1 = 7.171 and λ_2 = 6.856.
Since the form of the approximation (8) reduces to just the second factor for the network in Fig. 1, we conclude that increasing the number of links between the cut-vertex and the left subgraph will cause the centrality of nodes in P_1 to increase with respect to their counterparts in P_2. The effect of doing this is illustrated in Fig. 3a for both the actual ratio and the approximation (8).
In the general case where P_1 and P_2 are not isomorphic, increasing the number of links to P_1 from the cut-vertex will still cause the second factor in (8) to increase. However, the value of µ will also increase ([12], pp. 69-70), which brings in a changing contribution from the third factor in (8). This factor will decrease if λ_1 > λ_2 and will approach 1 from above; otherwise, aside from the equality case, it increases and approaches 1 from below. In the former case, it is therefore possible that ρ will initially decrease with increasing links, prior to it increasing again. This is illustrated in Fig. 3b by adding connections to the upper subgraph in Fig. 2.
As µ gets large with respect to any given subgraph P_i, the centrality of this subgraph approaches the uninformative distribution a_i. In the case of Fig. 3b, we are far from this limit, since the principal eigenvalue only increases to µ = 11.39 when all fifty nodes in the upper subgraph are connected, which is less than twice the principal eigenvalue of either subgraph.

FIG. 2. A network formed by connecting two different fifty-node Erdős-Rényi random graphs together with a cut-vertex. The bottom graph has average degree 5.84 and the top has average degree 6.32. The cut-vertex connects to five nodes in the bottom subgraph and to three nodes in the top subgraph. The size of nodes increases with their eigenvector centrality and their colour also changes from white through to green.
The original (single) karate club network also has a cut-vertex at node 1, and so for completeness we can consider this. Removal of node 1 fragments the network into three parts: nodes (5, 6, 7, 11, 17), node 12, and the remaining 27 nodes. We can determine eigenvector centrality by (5) with m = 3. Identifying partition P_3 with node 12, we get M_3 a_3 = 1. If we identify partition P_1 with nodes (5, 6, 7, 11, 17) and their internal connections and P_2 with the larger partition, then the ratio of the average eigenvector centrality of nodes in P_1 to the average in P_2 is 0.4266. The value from (8) is ρ = 0.4612. Here the first factor is 2.708, the second is 1.272 and the third is 0.1339. There is nothing obviously problematic with the value of this ratio, but it lacks some credibility given our previous observations.

IV. INTERPRETATIONS OF EIGENVECTOR CENTRALITY
For the types of network we considered, our analysis identified three types of localization.
FIG. 3. a) For the network in Fig. 1, the right subgraph remains connected via node 13, whereas the left subgraph is increasingly connected to the cut-vertex, starting with node 1, then nodes 1 and 2, and continuing until all 34 nodes are connected. The impact on the ratio of the average eigenvector centrality (blue circles), as well as on the approximation ρ (red crosses), is shown. b) For the network in Fig. 2, the connectivity of the lower subgraph to the cut-vertex remains as before, but the connectivity of the upper subgraph increases by sequentially connecting a new node chosen uniformly at random from the remaining unconnected nodes, except that the first three nodes are chosen to be the same as in Fig. 2. The ratio of the average eigenvector centrality (blue circles) and the approximation ρ (red crosses) is shown, as well as the contributions of the second (black solid line) and third (black dashed line) factors in (8).
• Type 1: If a class of network scales with the number of nodes n such that its principal eigenvalue is µ(n), and if one of the partitions P_i of a cut-vertex has principal eigenvalue ν_i(n) with lim_{n→∞} ν_i(n)/µ(n) = 0, then it follows from (5) and (6) that the proportion of eigenvector centrality allocated to P_i vanishes, except for those nodes in the partition connected directly from the cut-vertex. In some circumstances this leads to an unreasonable focusing of centrality (also see the discussion at the beginning of Section III).
• Type 2: The second factor in (8) describes a non-local impact, across the entirety of a partition P_i, of the choice of nodes connecting it to the cut-vertex. This is particularly apparent when the connectivity is low (e.g. Fig. 1).
• Type 3: If the principal eigenvalue of a partition is close to the principal eigenvalue of the full network, then the third factor of (8) can make the centrality of this subgraph unreasonably high (e.g. Fig. 2 and Fig. 3b).
Localization caused by the presence of high-centrality hub nodes can be qualitatively explained, on undirected random graphs with a vanishingly small density of short loops, in terms of the eigenvalue equation [9]. It emerges from the process where a high-centrality hub node passes centrality to its neighbours, but this is then reflected back to the hub via the bidirectional links. This type of backtracking can be avoided by using a modified 'nonbacktracking' version of eigenvector centrality [9] based on the Hashimoto or nonbacktracking matrix [14][15][16].
If we apply the nonbacktracking variant of eigenvector centrality to the double karate club network in Fig. 1, then the problem of the non-local influence of the choice of connecting node is resolved and the two subgraphs gain similar centralities, suggesting that this resolves at least some Type 2 localization problems in addition to resolving at least some Type 1 problems. However, for general networks with a cut-vertex where the partitions have different principal eigenvalues, localization problems can remain. For example, the average nonbacktracking centralities in the upper graph of Fig. 2 are 5.725 times greater than those in the lower graph. This suggests that the localization problems associated with the third factor in (8) remain. To understand this, it would be valuable to determine whether something similar to (5) could be derived for the nonbacktracking algorithm.
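Such experiments can be reproduced with a direct computation of nonbacktracking centrality. The sketch below is an illustrative implementation, not the authors' code: it builds the Hashimoto matrix explicitly for a small test graph of our own choosing (two triangles sharing a node) and cross-checks its spectral radius against a standard 2n × 2n linearization of the Ihara-Bass quadratic, whose leading eigenvector's first n components are used as node-level nonbacktracking centralities [9]:

```python
import numpy as np

# Test graph: a "bowtie", two triangles sharing node 2 (our own choice).
edges = [(0,1), (0,2), (1,2), (2,3), (2,4), (3,4)]
n = 5
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
D = np.diag(A.sum(axis=1))

# Explicit Hashimoto matrix over the 2m directed edges ("darts").
darts = [(i, j) for i, j in edges] + [(j, i) for i, j in edges]
m2 = len(darts)
B = np.zeros((m2, m2))
for e, (u, v) in enumerate(darts):
    for f, (x_, y_) in enumerate(darts):
        if x_ == v and y_ != u:   # f continues e without backtracking
            B[e, f] = 1.0
bvals, _ = np.linalg.eig(B)
mu_B = bvals.real[np.argmax(bvals.real)]

# 2n x 2n linearization [[A, I-D],[I, 0]] of the Ihara-Bass quadratic
# mu^2 v = mu A v - (D - I) v; its spectrum lies inside that of B.
M = np.block([[A, np.eye(n) - D],
              [np.eye(n), np.zeros((n, n))]])
mvals, mvecs = np.linalg.eig(M)
k = np.argmax(mvals.real)
mu_M = mvals[k].real
x = np.abs(mvecs[:n, k].real)     # node nonbacktracking centralities

print(mu_B, mu_M)                  # the two spectral radii should agree
print(x / x.sum())
```

Working with the 2n × 2n form rather than the 2m × 2m Hashimoto matrix is the usual practical choice, since m can be much larger than n on dense networks.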
Other eigenvector-related centrality measures such as Katz centrality [17] and PageRank [18] are much more robust than eigenvector centrality and provide an alternative resolution to localization problems [9]. Katz centrality is defined for a general adjacency matrix A by

x_i = Σ_j M_ij,   (9)

where

M = (I − aA)^{−1}

and where a is a parameter that we are free to choose within the range 0 < a < 1/µ [17][19]. It is clear that there is a close relationship to (6), which we shall investigate further.
The matrix M can be written as a power series in aA:

M = Σ_{n=0}^∞ a^n A^n.

The element (A^n)_ij of the matrix A^n is the number of paths of length n from node j to node i. In the original interpretation of this series by Katz, it is supposed that the influence of a node j on node i via a path between them reduces with the length of the path according to a^n, where a is the 'attenuation' on each link, so that shorter paths contribute more. According to Katz, the interpretation of element M_ij of matrix M is the influence that node j has on node i due to all possible paths between j and i. By performing the sum over j in (9), we determine the total influence of all nodes on node i.
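The two viewpoints above, the closed form (9) and the path series, give the same answer, as the following sketch confirms (the example graph and the choice a = 0.5/µ are ours, for illustration):

```python
import numpy as np

# Katz centrality computed two ways: from M = (I - aA)^(-1) applied to
# the all-ones vector, and from a truncated version of the path series.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
n = A.shape[0]
mu = np.max(np.linalg.eigvalsh(A))
a = 0.5 / mu                 # any 0 < a < 1/mu guarantees convergence

# Eq. (9): x_i = sum_j M_ij; solve the linear system rather than invert.
x = np.linalg.solve(np.eye(n) - a*A, np.ones(n))

# The same from the series M = sum_k (aA)^k, truncated at k = 50.
x_series = np.ones(n)
term = np.ones(n)
for _ in range(50):
    term = a * A @ term      # next term (aA)^k 1 of the path series
    x_series += term

print(np.allclose(x, x_series))
```

Since aµ = 0.5 here, the truncated series converges geometrically and fifty terms are far more than enough.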
Comparison of (9) with (5) highlights that if the cut-vertex connects to every node in a partition, then eigenvector centrality becomes Katz centrality of that partition with parameter a = 1/µ. For example, in Figs. 3a and 3b, when all nodes of one subgraph are connected to the cut-vertex, then that subgraph is described by Katz centrality.
Following similar arguments to the derivation of (7), Katz centrality can be written in terms of an eigenbasis {u_1, u_2, . . ., u_n} (with corresponding eigenvalues µ_1, µ_2, . . ., µ_n) of A:

x = Σ_{i=1}^n f_i (1 − aµ_i)^{−1} u_i,   (10)

where f_i are the coordinates of the vector 1 in this basis. We can assume that µ_1 = µ is the principal eigenvalue of A.
Katz centrality is therefore a vector in the full span of the eigenbasis of A. The level of contribution of each eigenvector depends on the parameter a. As the parameter gets close to 1/µ from below, the first term dominates and, after appropriate normalization, we obtain the convergence of eigenvector and Katz centrality [20][21]. As a result, the same localization problems emerge. Moreover, as we approach the eigenvector centrality limit, the attenuation only weakly suppresses long paths, and the contributions of successive terms in the power series for M converge to each other in size. With endlessly repeated cycles and infinite path lengths, the original process that Katz envisaged [17] loses its meaning.
For lower values of a, the attenuation automatically reduces the impact of long paths. At the same time, we gain contributions from the other eigenvectors, and so in this sense Katz centrality can be viewed as a mechanism for assimilating information from all of the eigenvectors of A, where a is a tuning parameter that determines the relative magnitude of those contributions. Motivated by this, one useful way of defining the attenuation is a = 1/(µ + µ_2), where µ_2 is the second-largest positive eigenvalue, if it exists. This is bounded by 0.5/µ < a < 1/µ, which is consistent with the value a = 0.85/µ used in the related PageRank algorithm [18].
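The attenuation choice a = 1/(µ + µ_2) can be sketched as follows (the example graph and the fallback value for graphs lacking a second positive eigenvalue are our own choices, for illustration):

```python
import numpy as np

# Triangle 0-1-2 with a tail 2-3-4; an illustrative undirected graph.
A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)

vals = np.sort(np.linalg.eigvalsh(A))[::-1]   # descending eigenvalues
mu = vals[0]

# mu2: the second-largest positive eigenvalue, if one exists.
pos = vals[(vals > 0) & ~np.isclose(vals, mu)]
a = 1.0/(mu + pos[0]) if pos.size else 0.85/mu  # fallback is our choice

# Katz centrality with this attenuation.
x = np.linalg.solve(np.eye(A.shape[0]) - a*A, np.ones(A.shape[0]))
print(0.5/mu < a < 1.0/mu)
print(x)
```

Since 0 < µ_2 < µ, the resulting a always lies strictly inside the stated bounds, keeping the series convergent while retaining contributions from the subleading eigenvectors.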
Conversely, eigenvector centrality can be viewed as the leading contribution to more robust measures in the span of the eigenvectors, such as Katz centrality, which are based on underpinning processes with clear centrality-like interpretations [17][19]. Indeed, we have already argued that there is no sufficient reason why the principal eigenvector of the adjacency matrix on its own should be a reliable centrality measure.

V. ACKNOWLEDGMENTS
The author acknowledges support from the EPSRC, grant No. EP/N014499/1.