Discrete-time quantum walk on complex networks for community detection

We define the discrete-time quantum walk on complex networks and utilize it for community detection. We numerically show that the quantum walk with the Fourier coin is localized in the community to which the initial node belongs. Meanwhile, the quantum walk with the Grover coin tends to be localized around the initial node, not over a community. The probability of the classical random walk on the same network converges to the uniform distribution with a relaxation time that is generally unknown a priori. We thus claim that the time average of the probability of the Fourier-coin quantum walk on complex networks reveals the community structure more explicitly than that of the Grover-coin quantum walk and a snapshot of the classical random walk. We first demonstrate our method of community detection for a prototypical three-community network, producing the correct grouping. We then apply our method to two real-world networks, namely Zachary's karate club and the US airport network. We successfully reveal the community structure: the two communities of the instructor and the administrator in the former and those of the major airline companies in the latter.


A. Quantum walk
The quantum walk has been studied in various areas of physics. The quantum walk is divided into two types: the discrete-time quantum walk [1] and the continuous-time quantum walk [2]. The time evolution of the latter is expressed by a Hamiltonian obeying the Schrödinger equation. In the present paper, we focus on the former.
The discrete-time quantum walk is a quantum counterpart of the discrete-time classical random walk. In the classical random walk in one dimension, for example, a particle hops to the left or right stochastically, generating a probability distribution, whereas the quantum walk is described instead in terms of the probability amplitude of a quantum superposition of the left-mover and the right-mover [1].
The quantum walk generally has the following two properties: it spreads linearly on a flat space and localizes at particular spots [3]. To be more specific, however, quantum walks with different inner states and different coin operators behave differently. The probability distribution of the three-state quantum walk in one dimension, for example, has three peaks: one that moves linearly to the left, one that moves linearly to the right, and one that localizes at the initial node [4]. That of the two-state quantum walk in one dimension, on the other hand, has only two peaks that spread linearly to the left and right, without any peak that localizes [5]. In the present paper, we focus on the two-state walk, using the Fourier coin and the Grover coin [6,7]. The walks with these coin operators are called the Fourier walk and the Grover walk, respectively. We will demonstrate that the two walks behave differently.
The quantum walk has been applied to quantum computers, search problems and so on [8][9][10]. Many researchers consider that quantum-mechanical computers may solve problems more efficiently than classical computers. The quantum walk has already been implemented in the laboratory [11].
There have been several studies on the quantum walk on networks, mostly on regular ones [9,12]. The shift operator and the coin operator have been defined in conformity to the structure of networks. The quantum walk on networks plays an important role in search problems. In general, it takes classical algorithms O(N) steps to identify the target record from an unsorted database of N records, while it takes quantum-mechanical systems only O(√N) steps [8].

B. Complex networks
Many systems including social networks and biological networks have been found to have distinctive features drastically different from random graphs, and hence are collectively called complex networks [13][14][15][16]. Representative examples include acquaintance networks [17], the World Wide Web [18], corporate transaction networks [19], neural networks [20], food webs [21] and metabolic networks [22].
The distinctive features of complex networks that are often quoted include the scale-free property and the small-world effect, although there are so-called complex networks that do not have these features. The former feature means that the histogram of the degrees of the nodes (the degree being the number of links attached to a node) follows a power law [24]; in other words, there are a large number of nodes with low degrees and a small number of nodes with high degrees in a self-similar way. The latter feature means that the average distance between a randomly chosen pair of nodes in a complex network is surprisingly shorter than that in a random network [25].
These features may indicate that many complex networks have a hierarchical structure; see Fig. 1, for example. When we depict the structure as a tree, which is called a dendrogram in the social sciences [23] (see Fig. 1 (b), for example), the leaves correspond to the nodes and the branches to the links. Nodes in higher levels of the dendrogram can have more links to nodes in lower levels in a self-similar way. A node in one branch of the dendrogram can be connected to a node in a different branch by a short path through nodes in higher levels.
In a hierarchical complex network, we should be able to find communities in various levels. The community is a subset of nodes within the network such that connections among the nodes of the community are denser than those among the other nodes [23]. As the hierarchy in Fig. 1 suggests, a node at a high level of the dendrogram is likely to be at the center of each community typically with many links, which we call a hub. It is therefore of great importance for detecting the features of complex networks to identify communities.
There are several algorithms for community detection [13,23,26,27,28]. The conventional method is hierarchical clustering [13,23,26,29]. In this method, one calculates a weight W_{i,j} for every pair of nodes in the network. The weight shows how closely connected the nodes are. Starting from the nodes with no links between them, one adds links between pairs in the order of their weights. The nodes are classified into communities, and the communities are grouped into larger communities. Many different weights have been proposed for this algorithm. A weight that takes into account paths longer than the shortest ones was considered in Ref. [30]. Another method is called the divisive algorithm [13]. Starting from the whole network, one cuts the links. The network is divided into smaller subnetworks, which are identified as communities. Another approach uses an algorithm based on the modularity [26,27,31,32]. The modularity is a property of a network together with a division of the network into communities. If there are many links within the communities and only a few links between the communities, the division is good.
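As a concrete illustration of the last point, the following minimal sketch evaluates the modularity of two divisions of a small graph. It assumes the standard Newman-Girvan form Q = Σ_c [L_c/m − (d_c/2m)²], where m is the number of links, L_c the number of links inside community c, and d_c the sum of the degrees in c; the function name and the toy graph are hypothetical and are not taken from the references.

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman-Girvan modularity of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected link listed once.
    community: dict mapping node -> community label.
    """
    m = len(edges)
    internal = defaultdict(int)    # links with both ends in community c
    degree_sum = defaultdict(int)  # sum of the degrees of the nodes in c
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)

# Hypothetical toy graph: two triangles joined by a single link (2-3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
good = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}  # one triangle each
bad = {0: "a", 1: "b", 2: "a", 3: "b", 4: "a", 5: "b"}   # mixes the triangles
q_good = modularity(edges, good)
q_bad = modularity(edges, bad)
```

The division that follows the two triangles yields a positive modularity, while the division that mixes them yields a negative one.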
There have been several studies on community detection that used the discrete-time classical random walks [28,33]. These approaches are based on the consideration that random walks on the networks tend to get trapped within the communities [28]. One computes the frequency in which each node is visited by a random walker, and explores the possible partitions by using deterministic algorithms [33].
We here utilize the discrete-time quantum walk instead for community detection. The infinite-time average of the transition probability, normalized by the number of links, of the Fourier-coin quantum walk on a complex network shows localization in a community, and thereby reveals the community structure. The Grover-coin quantum walk, in contrast, tends to be localized around the initial node, presumably due to the localized eigenstates of the time-evolution unitary with degenerate ±1 eigenvalues. For the classical random walk on the same network, the probability converges to a flat distribution as time passes. Although the community structure partially emerges before the convergence, it is generally a priori unknown which time step of the walk is best for the community detection. We thus claim that the Fourier-coin quantum walk on complex networks reveals the community structure more explicitly than the Grover-coin quantum walk and the classical random walk.

II. QUANTUM WALK ON COMPLEX NETWORKS
We first describe our definition of the quantum walk on complex networks. It requires a node-dependent coin operator because each node generally has a different number of links.
We define the quantum state on a complex network (see Fig. 2, for example) in the form
$$|\psi(t)\rangle = \sum_{i=1}^{N} \sum_{n=1}^{k_i} \psi_{i\to j_n}(t)\, |i\to j_n\rangle,$$
where $N$ is the total number of nodes, the state $|i\to j\rangle$ resides on the node $i$ and is about to hop to the adjacent node $j$ on a link connecting $i$ and $j$, while $k_i$ is the number of links attached to the node $i$. The total Hilbert space $H = H_1 \oplus H_2 \oplus \cdots \oplus H_N$ consists of the Hilbert space of each node $H_i$, which is spanned by $(|i\to j_1\rangle, |i\to j_2\rangle, \cdots, |i\to j_{k_i}\rangle)$. The dimensionality of the total Hilbert space is therefore given by
$$D = \sum_{i=1}^{N} k_i,$$
which is the total number of links under double counting. We normalize the state $|\psi(t)\rangle$ as in
$$\langle\psi(t)|\psi(t)\rangle = \sum_{i=1}^{N} \sum_{n=1}^{k_i} |\psi_{i\to j_n}(t)|^2 = 1.$$
We can write the probability of the existence on a node $i$ at time $t$ as
$$p_i(t) = \sum_{n=1}^{k_i} |\langle i\to j_n|\psi(t)\rangle|^2.$$
The time evolution of the state $|\psi(t)\rangle$ is given by
$$|\psi(t+1)\rangle = U|\psi(t)\rangle,$$
where the unitary operator $U$ is the product of a shift operator $S$ and a coin operator $C$:
$$U = SC.$$
We define the shift operator $S: H \to H$ by
$$S|i\to j\rangle = |j\to i\rangle. \tag{8}$$
The choice of this shift operator may appear to be atypical compared to the one defined for the one-dimensional lattice, but it is necessary because of the existence of dangling bonds. When the node $j$ is at the end of a dangling bond, as the bottom one $i\to j_3$ in Fig. 2, Eq. (8) is the only possible choice. Indeed, it has been used for searching a marked vertex on a specific graph called the Cayley tree [9]. We can also easily prove that the shift operator of Eq. (8), if defined on a one-dimensional lattice, can be mapped to the standard shift operator by introducing an extra factor to the coin operator; see Appendix A.
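The explicit form of the Fourier coin is not written out in this part of the text. As a point of reference only, and as an assumption rather than a quotation of the definition used here, a common convention takes the coin operator to be block-diagonal over the nodes, with each block a discrete Fourier transform of size $k_i$:

```latex
C = \bigoplus_{i=1}^{N} C_i, \qquad
\left( C_i^{\mathrm{F}} \right)_{mn}
  = \frac{1}{\sqrt{k_i}} \exp\!\left[ \frac{2\pi \mathrm{i}\,(m-1)(n-1)}{k_i} \right],
\qquad m, n = 1, 2, \cdots, k_i ,
```

where $\mathrm{i}$ denotes the imaginary unit and the indices $m$ and $n$ label the states $|i\to j_m\rangle$ and $|i\to j_n\rangle$ on the node $i$. The ordering of the links on each node is a matter of convention. Each block is unitary, so that $U = SC$ is unitary as well.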
Below we also consider the quantum walk with an alternative coin operator, namely the Grover coin [6], which is given on the node $i$ by the $k_i \times k_i$ matrix with the elements
$$\left( C_i \right)_{mn} = \frac{2}{k_i} - \delta_{mn}.$$
This is called the Grover matrix, being related to Grover's search algorithm [35]. There are many studies on the Grover-coin quantum walk (Grover walk). The periodicity of the Grover walk on some finite graphs has been clarified [36]. We will show that the Fourier coin works much better than the Grover coin for the purpose of community detection.
We prepare the initial state for the quantum walk as a state in which a specific state on a specific node $i_{\mathrm{start}}$, namely $|i_{\mathrm{start}} \to j\rangle$, has the element unity and the others have elements zero. In Sec. III, we take the average over the adjacent nodes $j$, as will be seen in Eq. (13) below.
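The construction above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: the Fourier coin is taken to be the discrete-Fourier-transform matrix on each node, the Grover coin is the matrix with elements 2/k − δ_mn, the order of the neighbors in each list fixes the otherwise arbitrary ordering of the links, and the toy network and all function names are hypothetical.

```python
import cmath
import math

def fourier_coin(k):
    """k x k discrete-Fourier-transform matrix (assumed Fourier-coin convention)."""
    return [[cmath.exp(2j * math.pi * m * n / k) / math.sqrt(k)
             for n in range(k)] for m in range(k)]

def grover_coin(k):
    """k x k Grover matrix with the elements 2/k - delta_mn."""
    return [[2.0 / k - (1.0 if m == n else 0.0) for n in range(k)]
            for m in range(k)]

def step(state, neighbors, coin):
    """Apply U = S C once to a state stored as {(i, j): amplitude of |i -> j>}."""
    new_state = {}
    for i, nbrs in neighbors.items():
        c = coin(len(nbrs))
        for m, j in enumerate(nbrs):
            # Coin: mix the amplitudes of the states |i -> j_n> on the node i.
            amp = sum(c[m][n] * state.get((i, jn), 0.0)
                      for n, jn in enumerate(nbrs))
            # Shift: S |i -> j> = |j -> i>.
            new_state[(j, i)] = amp
    return new_state

def node_probability(state, i):
    """Probability of finding the walker on the node i."""
    return sum(abs(a) ** 2 for (node, _), a in state.items() if node == i)

# Hypothetical toy network: a triangle 0-1-2 with a dangling node 3 attached to 2.
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
D = sum(len(nbrs) for nbrs in neighbors.values())  # Hilbert-space dimension

state = {(0, 1): 1.0}  # initial state |i_start -> j> with i_start = 0, j = 1
for _ in range(5):
    state = step(state, neighbors, fourier_coin)
total = sum(node_probability(state, i) for i in neighbors)
```

Because both coins are unitary and the shift is a permutation of the basis states, the total probability stays equal to unity at every step. On the dangling node 3 both coins reduce to the number 1, so the walker is simply sent back along the dangling bond.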

A. Infinite-time average
We numerically show hereafter that the probability of the Fourier walk becomes higher on hubs as time passes, whichever node we choose as the initial one, i_start. We can thus detect hubs of complex networks, although the threshold to detect them is an open question. We will also show that the state of the Fourier walk on complex networks is localized in the community of the initial node, and thereby reveals the community structure. For the quantum walk on the one-dimensional finite lattice, the probability distribution after a long period of time has been proved to be stationary and uniform when the quantum walk behaves symmetrically [37]. For the quantum walk on complex networks, on the other hand, we here show that the infinite-time average of the normalized transition probability, calculated from the eigenvectors, shows localization.
Let us calculate the infinite-time average of the transition probability by expanding the unitary operator $U = SC$ in terms of its eigenstates:
$$U = \sum_{\mu=1}^{D} e^{i\theta_\mu} |\mu\rangle\langle\mu|,$$
where $|\mu\rangle$ is the eigenvector and $e^{i\theta_\mu}$ is its eigenvalue with a real argument $\theta_\mu$. The transition probability that the quantum walk starting from a node $i$ reaches a node $l$ is given by
$$p(i\to l; t) = \frac{1}{k_i} \sum_{j} \sum_{m} \left| \langle l\to m| U^t |i\to j\rangle \right|^2, \tag{13}$$
where $|i\to j\rangle$ is the initial state and $|l\to m\rangle$ is the state at the step $t$. The factor $1/k_i$ is to average over the direction $j$ of the initial state. We also took the summation over the direction $m$ of the final state. The infinite-time average of the transition probability is given by
$$p(i\to l) = \lim_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} p(i\to l; t) = \frac{1}{k_i} \sum_{j} \sum_{m} \sum_{\mu=1}^{D} \left| \langle l\to m|\mu\rangle \right|^2 \left| \langle \mu|i\to j\rangle \right|^2, \tag{15}$$
where we assumed
$$\lim_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} e^{i(\theta_\mu - \theta_\nu) t} = \delta_{\mu\nu},$$
which is valid if the eigenvalues are non-degenerate and distributed almost randomly over the unit circle. In this case, the quantum walk on the network is a superposition of oscillations with various frequencies, and hence the infinite-time average makes sense.
In order to check the validity of the formulation, we show in Fig. 3 (b)-(c) the eigenvalue distributions of the time-evolution unitary matrix $U$ for the Fourier walk and the Grover walk on a prototypical three-community network given in Fig. 3 (a), for which $N = 21$ and $D = 78$. In both cases, the 78 eigenvalues are distributed on the unit circle in the complex plane. The eigenvalues of the Fourier walk are non-degenerate, while almost half of the eigenvalues of the Grover walk are degenerate at either $+1$ or $-1$. (Precisely, the degeneracies are 20 and 18 for the eigenvalues $+1$ and $-1$, respectively.) The histogram in Fig. 3 (d) shows more clearly that the eigenvalues of the Fourier walk are distributed much more evenly over the unit circle than the eigenvalues of the Grover walk. We thus realize that the Fourier walk is more suitable for the formulation (15) than the Grover walk.
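The passage from the transition probability to its infinite-time average can be spelled out as follows. This is a sketch under the non-degeneracy assumption stated above; the shorthand $a_\mu = \langle l\to m|\mu\rangle\langle\mu|i\to j\rangle$ is introduced here only for brevity.

```latex
\begin{aligned}
\langle l\to m| U^t |i\to j\rangle
  &= \sum_{\mu} e^{i\theta_\mu t}\, a_\mu , \\
\left| \langle l\to m| U^t |i\to j\rangle \right|^2
  &= \sum_{\mu,\nu} e^{i(\theta_\mu - \theta_\nu) t}\, a_\mu a_\nu^{*} , \\
\lim_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1}
\left| \langle l\to m| U^t |i\to j\rangle \right|^2
  &= \sum_{\mu} |a_\mu|^2
   = \sum_{\mu} \left| \langle l\to m|\mu\rangle \right|^2
                 \left| \langle \mu|i\to j\rangle \right|^2 .
\end{aligned}
```

The cross terms with $\mu \neq \nu$ oscillate in time and average to zero only when $\theta_\mu \neq \theta_\nu$. For degenerate eigenvalues they survive, which is why the heavily degenerate spectrum of the Grover walk fits this formulation less well.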
It has been proven for the Grover walk that the eigenvectors belonging to the degenerate eigenvalues ±1 are localized on loops of graphs [38]; indeed the degree of the degeneracy is completely determined by the topology of the graph (see Appendix B for tutorial examples). On regular graphs, this degeneracy would lead to a proof of the localization on the initial node after linear combination of the eigenvectors on the loops [38]. We will numerically show below that the Grover walk on a graph is also localized around the initial node of the walk.
Figure 4 (a) shows the infinite-time average of the probability of the Fourier walk on the three-community network in Fig. 3 (a), computed according to Eq. (15) based on the numerical diagonalization of U. The vertical axis shows the initial node i, the horizontal axis shows the target node l, and each square color-codes the magnitude of the time-averaged probability p(i → l); note that the probabilities are roughly proportional to the number of links, and those of the hubs (the nodes 1, 13, and 21) are the highest. We can thus identify hubs clearly from the infinite-time average of the probability.
Based on the observation in Fig. 4 (a), we define the normalized probability $P(i\to l; t)$ of each node by dividing the probability $p(i\to l; t)$ by the number of links of the target node $l$:
$$P(i\to l; t) = \frac{p(i\to l; t)}{k_l}. \tag{17}$$
The infinite-time average of the normalized probability is given by
$$P(i\to l) = \frac{p(i\to l)}{k_l} = \frac{1}{k_i k_l} \sum_{j} \sum_{m} \sum_{\mu} \left| \langle l\to m|\mu\rangle \right|^2 \left| \langle \mu|i\to j\rangle \right|^2.$$
This infinite-time average then becomes symmetric with respect to the exchange of $l$ and $i$, as in $P(i\to l) = P(l\to i)$. Figure 4 (b) color-codes the infinite-time average of the normalized probability $P(i\to l)$ calculated from the eigenvectors of the Fourier walk for the three-community network in Fig. 3 (a). The normalized transition probability between the initial node and the other nodes in the same community is high, which reveals the community structure. Figure 4 (c)-(e) shows the same quantity as in Fig. 4 (b), but only for the cases in which the walk starts from the hubs (the nodes $i = 1$, 13, and 21, which are indicated by red arrows in Fig. 4 (c)-(e)). In order to detect the community structure quantitatively, we here tentatively define the threshold for the detection of a community to be $q = 1/D$, where $D = 78$, which is indeed the stationary probability of the classical random walk on the network normalized by the number of links $k_l$.
We thereby define a community as follows:
(i): We first define a hub $i$ as a node with the largest degree $k_i$.
(ii): If the normalized probability starting from a hub $i$ to a node $l$ is greater than the threshold $q$, namely if
$$P(i\to l) > q,$$
the node $l$ is a member of the community of the hub $i$.
This algorithm clearly reveals the three communities of the three-community network in Fig. 3 (a). For instance, if the Fourier walk starts from the hub 1 as in Fig. 4 (b), the probability of the nodes 2, 3, 4, 5, 6, 7, which belong to the same community, is higher than the threshold $q$. We can thus successfully identify which community each node belongs to.
In order to justify the algorithm from a different perspective, we show that the Fourier walk on the network is localized in the community to which the initial node belongs. Let us evaluate the localization of the eigenvectors using the inverse participation ratio (IPR) [39,40]. The IPR of an eigenvector $|\mu\rangle$ is given by
$$\mathrm{IPR}_\mu = \sum_{l=1}^{N} \left[ p_\mu(l) \right]^2,$$
where the probability $p_\mu(l)$ is
$$p_\mu(l) = \sum_{m} \left| \langle l\to m|\mu\rangle \right|^2$$
with the normalization $\sum_{l=1}^{N} p_\mu(l) = 1$ for all $\mu$. If the eigenvector is sharply localized to one node, the IPR is close to unity. If the eigenvector is delocalized, the IPR is as small as $1/N$, which is $1/21 \approx 0.0476$ in the present case of the three-community network in Fig. 3 (a). If the eigenvector were localized uniformly in one of the communities of the network as in
$$p_\mu(l) = \frac{1}{7}$$
for $l = 1, 2, \cdots, 7$, the IPR would be exactly $1/7 \approx 0.14$.
Figure 5 (a) shows the IPR of each eigenvector for the Fourier walk on the three-community network. We find that all states have an IPR higher than $1/21 \approx 0.04762$ (the thinner horizontal line in Fig. 5 (a)) and several eigenvectors are localized more strongly than $\mathrm{IPR} = 1/7 \approx 0.1429$ (the thicker horizontal line in Fig. 5 (a)). The naive average of the IPR over all eigenstates is about $0.1151 \approx 1/8.7$, which is not far from $1/7$. Figure 5 (b), on the other hand, shows the normalized probability of the Fourier walk on the three-community network. The probability distribution for each eigenvector shows the localization, often over a community.
For the probability distribution for the eigenstate number 1 (for which the IPR is about 0.150), for instance, the probability for the node 1 and the nodes in the same community (from l = 1 to l = 7) is visibly higher than that of the other nodes; in other words, this eigenvector is localized in the first community. Similarly, the probability distribution of the eigenstate number 46 (for which the IPR is about 0.137) is localized in the second community, and that of the eigenstate number 11 (for which the IPR is about 0.135) is localized in the third one. The localization of the quantum walk may be related to the Anderson localization. In the standard sense, the Anderson localization is the property of quantum particles in random media [41,42]. There are several studies on the Anderson localization of the discrete-time quantum walk on lattices with randomness [43,44]. The quantum walk on the complex network may be similar to the quantum particle in random media because of the inhomogeneity of the network, and hence may experience the Anderson localization.
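The three reference values of the inverse participation ratio quoted above (unity, 1/7 and 1/21) can be checked with a short sketch. The function names and the simplified basis with a single direction per node are hypothetical and serve only to illustrate the definition; nodes are numbered from 0 here.

```python
def node_probabilities(eigvec, basis, n_nodes):
    """p_mu(l): weight of an eigenvector on the node l, summed over directions.

    eigvec: list of complex amplitudes, one per basis state.
    basis: list of (l, m) pairs in the same order, with nodes 0..n_nodes-1.
    """
    p = [0.0] * n_nodes
    for amp, (l, _) in zip(eigvec, basis):
        p[l] += abs(amp) ** 2
    total = sum(p)
    return [x / total for x in p]  # enforce sum_l p_mu(l) = 1

def ipr(p):
    """Inverse participation ratio: sum over the nodes of p_mu(l)^2."""
    return sum(x * x for x in p)

n = 21
basis = [(l, 0) for l in range(n)]            # hypothetical: one direction per node
on_one_node = [1.0] + [0.0] * (n - 1)         # sharply localized on a single node
on_community = [1.0] * 7 + [0.0] * (n - 7)    # uniform over a 7-node community
everywhere = [1.0] * n                        # fully delocalized

ipr_node = ipr(node_probabilities(on_one_node, basis, n))
ipr_community = ipr(node_probabilities(on_community, basis, n))
ipr_uniform = ipr(node_probabilities(everywhere, basis, n))
```

For an actual eigenvector of U, the same functions apply once the basis list enumerates all D directed states (l, m).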

B. Finite-time calculation
We next present our finite-time results of the quantum walk on the same three-community network in Fig. 3 (a). We applied the unitary matrix U to the initial state |i → j⟩ up to 100 times and averaged the resulting probability over j.
Figure 4 (f) shows the time average of the normalized probability P(i → l; t) over 100 steps from t = 1 through t = 100. It is almost the same as Fig. 4 (b), also revealing the community structure. The fact that the finite-time average is almost equal to the infinite-time average is presumably thanks to the property of the quantum walk that the front of the probability spreads linearly. This implies that we can apply the present method to complex networks whose time-evolution unitary matrix U is too large to diagonalize, by computing a finite-time average instead of the infinite-time average. Figure 4 (g) shows, on the other hand, the same time average of the normalized probability P(i → l; t) but for the Grover walk. We can see that the diagonal elements are much larger than the other elements. In other words, the Grover walk mostly stays at the initial node through the 100 steps, implying strong localization at each node.
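The finite-time procedure described above can be sketched without any diagonalization. The following is a minimal illustration, assuming a discrete-Fourier-transform coin on each node and a hypothetical toy network of two four-node cliques; the function names are illustrative, and no particular outcome of the community assignment is claimed for this toy example.

```python
import cmath
import math

def fourier_step(state, neighbors):
    """One step of U = S C with a DFT coin on every node (assumed convention)."""
    new_state = {}
    for i, nbrs in neighbors.items():
        k = len(nbrs)
        for m, j in enumerate(nbrs):
            amp = sum(cmath.exp(2j * math.pi * m * n / k) * state.get((i, jn), 0.0)
                      for n, jn in enumerate(nbrs)) / math.sqrt(k)
            new_state[(j, i)] = amp  # shift: |i -> j> becomes |j -> i>
    return new_state

def time_averaged_normalized_probability(neighbors, i_start, steps=100):
    """Average of P(i -> l; t) = p(i -> l; t) / k_l over t = 1, ..., steps."""
    avg = {l: 0.0 for l in neighbors}
    k_start = len(neighbors[i_start])
    for j in neighbors[i_start]:  # average over the initial direction j
        state = {(i_start, j): 1.0}
        for _ in range(steps):
            state = fourier_step(state, neighbors)
            for (l, m), amp in state.items():
                avg[l] += abs(amp) ** 2 / (len(neighbors[l]) * k_start * steps)
    return avg

# Hypothetical toy network: two 4-node cliques joined by one link (3-4).
clique_a, clique_b = [0, 1, 2, 3], [4, 5, 6, 7]
neighbors = {v: [u for u in c if u != v] for c in (clique_a, clique_b) for v in c}
neighbors[3].append(4)
neighbors[4].append(3)

P_bar = time_averaged_normalized_probability(neighbors, i_start=0)
D = sum(len(nbrs) for nbrs in neighbors.values())
q = 1.0 / D  # tentative threshold used in the text
```

Comparing each P_bar[l] with q then gives the candidate members of the community of the initial node. The cost of each step grows with the number of directed links rather than with the cube of the matrix size, which is what makes the finite-time average attractive for large networks.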
We may relate this phenomenon to findings for the Grover walk on regular lattices [38,45,46]. As we mentioned in Sec. III A, it has been proven that the eigenvectors with the degenerate eigenvalues ±1 of the unitary matrix of the Grover walk on regular lattices can be decomposed into states localized on loops, which leads to a proof of the localization of the walk on the initial node. This may also be the case in the present three-community network. Indeed, the localization numerically demonstrated in Fig. 1 (a) of Ref. [45] resembles the diagonal concentration in Fig. 4 (g). The states 41, 44, 47, 49 and so on have small elements in a community but are mostly localized on one or a couple of nodes.
Figure 6 shows the first few steps in the time evolution P(i → l; t) of the Fourier and Grover walks that started from various initial nodes i. When a walk starts from the node 1 of the three-community network in Fig. 3 (a), the Fourier walk (Fig. 6 (a)) spreads over the first community in a couple of steps and stays so afterwards. On the other hand, the Grover walk (Fig. 6 (b)), although it has higher probabilities over the first community than over the rest, shows some oscillation in time and has even higher probability at the initial node 1 from time to time. When a walk starts from the node 7, the Fourier walk (Fig. 6 (c)) again spreads over the first community in a few steps and stays so afterwards. The Grover walk (Fig. 6 (d)), however, has high probabilities on the nodes 7, 6, and 2.
In short, the time evolution of the Fourier walk tends to get localized over a community whichever node it starts from, while that of the Grover walk tends to get localized on a couple of nodes around the initial one. Therefore, the Fourier walk reveals the community structure more clearly than the Grover walk.
Finally, we compare the probability of the quantum walk to that of the classical random walk on the same network. The probability of the classical random walk eventually relaxes to the flat distribution, which is equal for all nodes, and hence the infinite-time average of the probability does not reveal the community structure. For community detection we would have to choose a specific time step, which is unknown a priori. We thus claim that using the time average of the probability of the quantum walk is a more tractable way of community detection than trying to find a specific time step of the classical random walk.
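The relaxation of the classical random walk can be illustrated with a short sketch on a hypothetical toy network of two triangles joined by one link. The probability normalized by the degree, p_l(t)/k_l, approaches the flat value 1/D, whereas an early snapshot still separates the two triangles.

```python
def random_walk_step(prob, neighbors):
    """One step of the classical random walk: hop to a uniformly chosen neighbor."""
    new_prob = {l: 0.0 for l in neighbors}
    for i, nbrs in neighbors.items():
        share = prob[i] / len(nbrs)
        for j in nbrs:
            new_prob[j] += share
    return new_prob

# Hypothetical toy network: two triangles joined by one link (2-3).
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
             3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
D = sum(len(nbrs) for nbrs in neighbors.values())  # links under double counting

prob = {l: 0.0 for l in neighbors}
prob[0] = 1.0  # the walker starts from the node 0
snapshots = {}
for t in range(1, 201):
    prob = random_walk_step(prob, neighbors)
    if t in (2, 200):
        # store the probability normalized by the number of links k_l
        snapshots[t] = {l: prob[l] / len(neighbors[l]) for l in neighbors}
```

At t = 2 the normalized probability exceeds 1/D on the first triangle and falls below it on the second, while at t = 200 it is flat. Which intermediate time step is informative depends on the network, which is the difficulty pointed out above.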

A. Zachary's karate-club network
Let us apply the above algorithm of community detection to Zachary's karate-club network [17] (Fig. 7 (a)), which is a friendship network in a karate club in a university in the USA. The club split into two communities, one clustered around the instructor (node 1) and the other around the administrator (node 34). In Zachary's psychological experiment, each member of the club named his/her friends and the community to which he/she belonged.
In this sense, this is a rare case of a complex network for which the 'correct' answer of the community detection is known, although the correctness can be disputed; see the last paragraph of the present section. We will show that our method 'correctly' identifies the two communities.
Figure 7 (b) indicates that the eigenvalues for the Fourier walk, which we computed by the numerical diagonalization of the 156 × 156 matrix, are distributed quite evenly on the unit circle. We confirmed that there is no degeneracy within the numerical double precision. This guarantees that the computation of the infinite-time average given in Sec. III A is valid for the karate-club network too. Figure 7 (c) shows the infinite-time average of the probability in Eq. (15), which we computed from the numerical diagonalization. The probabilities of the nodes 1 and 34 are higher than those of any other nodes. We can clearly identify the nodes 1 and 34 as the hubs in this figure.
Figure 7 (d)-(e) shows the infinite-time average of the normalized probability for the cases in which the walk starts from the hubs (the nodes 1 and 34, which are indicated by red arrows in Fig. 7 (d)-(e)). Let us again tentatively define the threshold to be q = 1/D, where D = 156. The nodes whose probabilities are higher than the threshold q belong to the community in which the initial node is the hub. For instance, if the Fourier walk starts from the hub 1, the probability of the node 2, which belongs to the same community, is higher than the threshold q. We can thus detect which community each node belongs to. Figure 7 (f)-(g) shows the time evolution of the normalized probabilities (17) of the Fourier walk that starts from the nodes 1 and 34. Here the nodes in the first community are gathered to the left and those in the second one to the right. We can clearly see that the walk spreads over the respective community in the first couple of steps in the time evolution.
Comments are in order here; the classification of the nodes 3 and 20 (highlighted by the dotted arrow in Fig. 7 (a)) is quite marginal. First, for the node 20, the normalized probabilities P(1 → 20) and P(34 → 20) are both greater than the threshold q = 1/D. Nonetheless, the former, P(1 → 20) ≈ 0.007062, is markedly greater than the threshold q ≈ 0.006410, while the latter, P(34 → 20) ≈ 0.006451, is only marginally greater. A slight increase of the threshold q would exclude the possibility of classifying the node 20 into the community of the hub 34. We thereby conclude that the node 20 should belong to the community of the node 1. Second, for the node 3, the normalized probability P(1 → 3) is only slightly greater than the threshold, although P(34 → 3) is less than the threshold.
There are indeed several views of the grouping for Zachary's network. One study [47] divided the nodes into three groups: the first is a group of the node 1, the second is a group of the node 34, and the third is a neutral group of the nodes 9, 10, 20, 28, and 29. It is therefore reasonable that the node 20 has a marginal value in Fig. 7 (d). In another study [13], the algorithm classified the node 3 into the group of the node 34. This is consistent with our result that the node 3 has a marginal value in Fig. 7 (c). After all, the grouping according to the second set of answers of Zachary's experiment is based on the personal views of each subject, and hence is not the only possible answer, although it remains a quite plausible one.

B. USA airport network
We next apply our method to the domestic airport network in the USA in 1997 [48,50] (Fig. 8 (a)). The original data is a weighted network [48], but we treat it as an unweighted network. Each node of the airport network corresponds to an airport in the USA. Two nodes are connected by a link if there is a flight connection between the two airports. The total number of nodes of the airport network is N = 332 and the total number of links under double counting is D = 4252. We computed the infinite-time average (15) of the probability of the Fourier walk by the numerical diagonalization of the 4252 × 4252 matrix. Figure 8 (b) shows that the eigenvalues of the time-evolution unitary matrix are distributed almost evenly on the unit circle of the complex plane. This validates the usage of Eq. (16).
The community structure of this airport network is a priori unknown, unlike the prototypical three-community network and Zachary's karate-club network. Based on the successful results above, we here use the following algorithm for community detection:
(i): We order the nodes according to the number of links k_i, and regard the nodes from the top of the list as candidates for hubs.
(ii): Starting from the node i with the highest degree, which is the first candidate for the hub, we classify the nodes l whose normalized probability (19) is higher than a threshold q, as in P(i → l) > q, into the community of the hub i.
(iii): We carry out (ii) repeatedly, ignoring the nodes that have been classified, until all of the nodes are classified into communities. If the node with the highest degree at the moment (e.g. the node 2 in Fig. 8 (c)) has already been classified into a community (e.g. the green broken circle in Fig. 8 (c)), we assume that the hub and the members of the group of the hub (e.g. the orange broken circle in Fig. 8 (c)) belong to the community into which the hub has been classified (e.g. the solid circle in Fig. 8 (c)).
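The three steps above can be sketched as follows. This is one possible reading of the procedure rather than a definitive implementation: the function name is hypothetical, ties in the degree are broken by the node index, a candidate hub is always placed in its own group, and the matrix P below consists of made-up numbers chosen only to exercise the merging rule in step (iii).

```python
def detect_communities(P, degree, q):
    """Assign a community label to every node.

    P[i][l]: time-averaged normalized probability from the node i to the node l.
    degree[i]: number of links k_i of the node i.
    q: threshold for membership.
    """
    n = len(degree)
    label = [None] * n
    n_communities = 0
    # (i) candidates for hubs, highest degree first
    for hub in sorted(range(n), key=lambda i: -degree[i]):
        if all(c is not None for c in label):
            break
        # (ii) unclassified nodes above the threshold form the group of this hub
        group = [l for l in range(n) if label[l] is None and P[hub][l] > q]
        if label[hub] is None:
            label[hub] = n_communities  # the hub opens a new community
            n_communities += 1
        # (iii) if the hub was already classified, its group joins that community
        for l in group:
            label[l] = label[hub]
    return label

# Made-up symmetric matrix for six nodes (illustrative values only).
P = [[0.30, 0.05, 0.20, 0.20, 0.05, 0.05],
     [0.05, 0.30, 0.05, 0.05, 0.20, 0.05],
     [0.20, 0.05, 0.30, 0.05, 0.05, 0.15],
     [0.20, 0.05, 0.05, 0.30, 0.05, 0.05],
     [0.05, 0.20, 0.05, 0.05, 0.30, 0.05],
     [0.05, 0.05, 0.15, 0.05, 0.05, 0.30]]
degree = [4, 3, 2, 2, 2, 1]
labels = detect_communities(P, degree, q=0.1)
```

In this example the node 5 is not above the threshold for the first hub (node 0), but it is above the threshold for the node 2, which already belongs to the community of the node 0; the node 5 therefore joins that community, as in the two-hub situation of step (iii).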
When we use the threshold q = 1/D ≈ 0.0002351834 as in the two cases above, we classify all the nodes into two communities, one with 260 nodes and the other with 72. As we can see in Table I (a), most of the major airports are classified into the first community, while the second community contains mostly minor airports with a few exceptions.
Changing the value of the threshold q reveals the hierarchical structure of the communities as the dendrogram in Fig. 1 implies; see Fig. 8 (d). We find three communities when we use the threshold q in the range 0.0002354434 ≤ q ≤ 0.0002354734 (see Fig. 8 (e)). The orange circles show the airports that belong to the first community (147 nodes), the green squares the second (151 nodes), and the blue stars the third (34 nodes). Comparing the top airports in Table I (a) and (b), we see that many major airports in the first community in Table I (a) are distributed to the second and third communities in Table I (b). The top airport of each community is the hub airport of one of the present-day three major airline companies: Chicago O'Hare for the United, Dallas/Fort Worth for the American, and Atlanta for Delta. We therefore claim that each of the three communities indicates the subnetwork of an airline company. Nonetheless, except for the hub airports, we see mixtures of various airlines. Note that TWA, US Airways, and America West have been merged into the American, while Northwest has been merged into Delta and Continental into the United. We may observe that these mergers were strategically reasonable in the sense that Delta and the American each merged with companies that appear in communities different from those of their own hub airports. It would be interesting to analyze the airport network after the mergers (including the one of TWA into the American), but it is out of the scope of the present paper.
Fig. 8, panels (c)-(f). (c) A schematic illustration of a community that has two hubs. The node 1 is the first hub and the node 2 is classified as a member of its group (green broken circle). The node 2 is concurrently the second hub. We classify the hub 2 and its community (orange broken circle) into the community of the hub 1, ending up with a larger community (green solid circle). (d) The number of communities depending on the threshold q. The horizontal axis shows the threshold q that we set and the vertical axis shows the number of communities that we obtain. (e) The result of the community detection of the airport network. We classify the nodes into three communities when we use the threshold q ≈ 0.0002354734. The orange circles show the airports that belong to the first community (147 nodes), the green squares the second one (151 nodes), and the blue stars the third one (34 nodes). (f) The result of the community detection of the airport network. We have five communities when we use the threshold q ≈ 0.0002355834. The olive hexagons show the airports that belong to the fourth community and the red ellipses the fifth.
Our algorithm of community detection can further find the hierarchy of the airline companies. By increasing the threshold further to q = 0.0002355834, we find five communities (see Fig. 8 (f)), with 109, 111, 51, 44, and 17 nodes, respectively. The fourth community (olive hexagons) splits off exactly from the second community in Fig. 8 (e), while the fifth community (red ellipses) splits off mostly from the first. We can easily see that the fourth community corresponds to Alaska Airlines [49], which is indeed a partner company of American Airlines, a major company of the second community in Fig. 8 (e).
In contrast, a previous study [50] divided the airports into two communities that geographically correspond to the east and the west, the latter including the midwest. Our algorithm excels in finding a different structure since it starts from finding the hubs.

V. CONCLUSION
In the present paper, we defined the discrete-time quantum walk on complex networks and utilized it for community detection. We numerically showed that the Fourier walk is localized in a community to which the initial node belongs. We calculated the infinite-time average of the transition probability by the use of the eigenvectors. We confirmed that the eigenvectors of the Fourier walk tend to be localized in a community, while those of the Grover walk tend to be localized in some specific nodes.
We found that the infinite-time average reveals the community structure better if the eigenvalues of the unitary matrix are non-degenerate, and hence the Fourier walk is more suitable for community detection than the Grover walk. The transition probability becomes higher in proportion to the number of links, and thereby we can detect the hubs. Next, we normalized the probability of each node by dividing it by the number of links. The normalized probability in the initial node and the other nodes in the same community is high, which reveals the community structure. Meanwhile, the probability of the classical random walk on the same network eventually converges to the flat distribution. We thus claim that the time average of the probability of the Fourier walk on complex networks reveals the community structure more explicitly than that of the classical random walk.
Finally, we applied the method to real-world networks. For Zachary's karate-club network, we confirmed that our method reveals its community structure correctly. Most nodes of the network are classified clearly, while two nodes are marginally identified. This result is consistent with other studies. For the airport network in the USA, we confirmed that our method reveals a community structure that corresponds to the three major airline companies in the USA. By adjusting the threshold, our algorithm successfully reveals the hierarchical structure of the communities as the dendrogram in Fig. 1 implies.
We argued that the strong localization of the Grover walk is presumably due to many eigenstates of the degenerate eigenvalues ±1, which were mathematically proven to be localized on loops [38]; hence we are almost certain that the Grover walk is not suitable for community detection. On the other hand, we numerically showed that the Fourier walk works for community detection, but we are yet to find any mathematical reason why it does. We have not tried other types of quantum walks either. These issues are beyond the scope of the present paper and should be pursued in future studies.
Let us finally add a remark on a possible extension of the present algorithm. We have defined our quantum walk ignoring the weight and the direction of the links of the networks. We can introduce integer weights by giving each link multiple connections. In order for a directed network to accommodate a quantum walk, the network cannot have any dead ends of directed links [12]. We may be able to apply our algorithm to a directed network as long as this condition is satisfied.

the following matrix: under the following ordering of the bases The time-evolution unitary operator is therefore given by with a coin operator with a 2 × 2 unitary matrix $C_x$; for example, a Fourier coin We now define a new coin operator with an additional factor $P_x$ inserted to the left of the original coin operator $C_x$: where the new factor flips the direction of the right and left movers. In the new time-evolution operator we find We can see that the operator S works as the standard shift operator, which shifts the right mover to the right keeping it as a right mover and shifts the left mover to the left keeping it as a left mover. Therefore, the time-evolution operator is the atypical shift operator (8) multiplied by a slightly jammed coin operator, for example, for Eq. (A8), but at the same time, is the standard shift operator S multiplied by the standard coin operator C. In this sense, the atypical shift operator (8) is quite similar to the standard shift operator.
Appendix B: Eigenvectors of the Grover walk on graphs for the eigenvalues ±1
We here present tutorial examples of the eigenvectors of the Grover walk on graphs for the degenerate eigenvalues ±1. The following is based on private discussions with H. Obuse [51] and E. Segawa [52].
Let us first note that the Grover coin $C^{G}_{i}$ in Eq. (11) always has an eigenvalue −1 for an eigenvector with only two nonzero elements. We can straightforwardly confirm it by applying the Grover coin to the vector $(1, -1, 0, 0, \cdots)^{T}$:

$$C^{G}_{i}\,(1, -1, 0, 0, \cdots)^{T} = -(1, -1, 0, 0, \cdots)^{T}. \qquad \text{(B1)}$$

We first show [51] that the vector depicted in Fig. 9 (a) is an eigenvector of the Grover walk with the eigenvalue +1. Here the (blue) arrow with the sign +1 indicates that the vector has an element +1 for the basis state that sits at the node to which the arrow is attached and is about to hop to the next node. The (red) arrow with the sign −1 indicates an element −1 for the corresponding basis state. The other elements are all zero. In other words, this vector is strictly localized on a triangular loop.
Application of the Grover coin to the vector changes it to the one depicted in Fig. 9 (b) because of the operation in Eq. (B1). Further application of the shift operator in Eq. (8) changes it back to the original one as in Fig. 9 (c). We have thereby confirmed that the vector in Fig. 9 (a) is an eigenvector of the Grover walk with the eigenvalue +1.
We can confirm in the same way that the vector depicted in Fig. 9 (d), localized strictly on a square loop, is also an eigenvector of the Grover walk with the eigenvalue +1. On a square loop, we can also find that another vector, depicted in Fig. 9 (g), is an eigenvector of the Grover walk, but with the eigenvalue −1.
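The loop arguments above can be checked numerically. The following is a minimal sketch, not the authors' code, under several assumptions: the shift operator of Eq. (8) is taken to be the flip-flop shift that maps the state "at node i, about to hop to j" to the state "at node j, about to hop to i"; the Grover coin of Eq. (11) is taken to be (2/k) J − I at a node with k links; the example graph (a square 0-1-2-3 and a triangle 2-3-4 sharing the link 2-3) and the alternating sign pattern used for the eigenvalue −1 are illustrative choices that may differ in detail from Fig. 9. The names `grover_walk` and `loop_vector` are hypothetical helpers.

```python
import numpy as np

def grover_walk(edges):
    """Build U = S C for an undirected simple graph (assumed conventions).

    Basis states are arcs (i, j): the walker sits at node i and is about
    to hop to node j. Returns the matrix U and the arc -> index map.
    """
    arcs = [(a, b) for a, b in edges] + [(b, a) for a, b in edges]
    index = {arc: n for n, arc in enumerate(arcs)}
    neighbors = {}
    for i, j in arcs:
        neighbors.setdefault(i, []).append(j)
    U = np.zeros((len(arcs), len(arcs)))
    for (i, j), col in index.items():
        k = len(neighbors[i])
        for l in neighbors[i]:
            # Grover coin at node i sends (i, j) to (i, l) with weight
            # 2/k - delta_{lj}; the flip-flop shift then sends (i, l) to (l, i).
            U[index[(l, i)], col] = 2.0 / k - (1.0 if l == j else 0.0)
    return U, index

def loop_vector(cycle, index, alternating=False):
    """Vector with +s on the forward arc and -s on the backward arc at each
    node of the cycle; s alternates in sign along the loop if requested."""
    v = np.zeros(len(index))
    m = len(cycle)
    for pos, i in enumerate(cycle):
        s = -1.0 if (alternating and pos % 2 == 1) else 1.0
        v[index[(i, cycle[(pos + 1) % m])]] = s
        v[index[(i, cycle[(pos - 1) % m])]] = -s
    return v

# Assumed example graph: square 0-1-2-3 and triangle 2-3-4 sharing link 2-3.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 0), (2, 4), (3, 4)]
U, index = grover_walk(EDGES)

triangle = loop_vector([2, 3, 4], index)
square = loop_vector([0, 1, 2, 3], index)
square_alt = loop_vector([0, 1, 2, 3], index, alternating=True)

print(np.allclose(U @ triangle, triangle))       # eigenvalue +1 on the triangle
print(np.allclose(U @ square, square))           # eigenvalue +1 on the square
print(np.allclose(U @ square_alt, -square_alt))  # eigenvalue -1 on the square
```

Because each loop vector has exactly one +s and one −s element at every node it visits, the coin merely flips its sign there regardless of how many other links the node has, which is why the check also works at the degree-3 nodes 2 and 3.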
We can thus easily guess that the Grover walk must have high degeneracies for the eigenvalues ±1.
Indeed, it has been proven [38,52] for the Grover walk on a graph G that the degeneracy of the eigenvalue +1 is $b_1(G) + 1$, whereas the degeneracy of the eigenvalue −1 is $b_1(G) + 1$ if the graph G is bipartite and $b_1(G) - 1$ if not, where $b_1(G) = |E| - |V| + 1$ is the Betti number of the graph G with |E| and |V| denoting the numbers of edges (links) and vertices (nodes), respectively.
We can easily confirm this, e.g., for the graph in Fig. 10, which is a combination of a square and a triangle with the Betti number $b_1(G) = 6 - 5 + 1 = 2$. According to the theorem, the degeneracies of the eigenvalues ±1 are $b_1(G) + 1 = 3$ and $b_1(G) - 1 = 1$, respectively, because the graph is not bipartite. Indeed, the vectors depicted in Fig. 10 (a)-(c) have the eigenvalue +1, while the vector in Fig. 10 (d) has the eigenvalue −1.
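The counting in the theorem can be compared with a direct diagonalization. The sketch below is an illustration under stated assumptions rather than the authors' implementation: it assumes a connected simple graph, the flip-flop form of the shift operator in Eq. (8), and the Grover coin (2/k) J − I; the edge list `SQUARE_TRIANGLE` is an assumed reading of the graph in Fig. 10 (a square and a triangle sharing one link), and all function names are hypothetical.

```python
import numpy as np
from collections import deque

def is_bipartite(edges):
    """Two-color a connected graph by breadth-first search."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    start = next(iter(adj))
    color = {start: 0}
    queue = deque([start])
    while queue:
        i = queue.popleft()
        for j in adj[i]:
            if j not in color:
                color[j] = 1 - color[i]
                queue.append(j)
            elif color[j] == color[i]:
                return False
    return True

def predicted_degeneracies(edges):
    """Degeneracies of +1 and -1 from b_1(G) = |E| - |V| + 1."""
    num_nodes = len({i for edge in edges for i in edge})
    b1 = len(edges) - num_nodes + 1
    return b1 + 1, (b1 + 1 if is_bipartite(edges) else b1 - 1)

def numerical_degeneracies(edges, tol=1e-6):
    """Count eigenvalues of U = S C within tol of +1 and of -1."""
    arcs = [(a, b) for a, b in edges] + [(b, a) for a, b in edges]
    index = {arc: n for n, arc in enumerate(arcs)}
    neighbors = {}
    for i, j in arcs:
        neighbors.setdefault(i, []).append(j)
    U = np.zeros((len(arcs), len(arcs)))
    for (i, j), col in index.items():
        k = len(neighbors[i])
        for l in neighbors[i]:
            # Grover coin at node i followed by the (assumed) flip-flop shift.
            U[index[(l, i)], col] = 2.0 / k - (1.0 if l == j else 0.0)
    eigenvalues = np.linalg.eigvals(U)
    plus = int(np.sum(np.abs(eigenvalues - 1.0) < tol))
    minus = int(np.sum(np.abs(eigenvalues + 1.0) < tol))
    return plus, minus

# Assumed reading of Fig. 10: square 0-1-2-3 and triangle 2-3-4 sharing 2-3.
SQUARE_TRIANGLE = [(0, 1), (1, 2), (2, 3), (3, 0), (2, 4), (3, 4)]
# A bare square loop, which is bipartite.
SQUARE = [(0, 1), (1, 2), (2, 3), (3, 0)]

for graph in (SQUARE_TRIANGLE, SQUARE):
    print(predicted_degeneracies(graph), numerical_degeneracies(graph))
```

For the bare square, every node has two links, so the Grover coin reduces to a swap and the walk simply circulates around the loop in each direction; the eigenvalues are then the fourth roots of unity, each appearing twice, in agreement with $b_1(G) + 1 = 2$ for both signs in the bipartite case.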
Note that there is always an extended eigenvector with the eigenvalue +1, namely the one with the same element in all bases, as exemplified in Fig. 10 (c). Its overlap with the initial state of the Grover walk, however, should be of the order of $1/\sqrt{D}$ because of the normalization of the eigenvector, and hence we can ignore its contribution for large networks.
For the three-community network in Fig. 3 (a), because the Betti number is given by $b_1(G) = 39 - 21 + 1 = 19$, the degeneracies of the eigenvalues ±1 are 20 and 18, respectively. For Zachary's karate club in Fig. 7 (a), they are 46 and 44, and for the airport transport network in Fig. 8 (a), they are 1796 and 1794. Except for the one extended eigenvector, they are all localized on loops at least in one set of linear combinations of degenerate eigenvectors.