Network Reconstruction Based on Evolutionary-Game Data via Compressive Sensing

Evolutionary games model a common type of interaction in a variety of complex, networked, natural and social systems. Given such a system, uncovering the interacting structure of the underlying network is key to understanding its collective dynamics. Based on compressive sensing, we develop an efficient approach to reconstructing complex networks under game-based interactions from small amounts of data. The method is validated by using a variety of model networks and by conducting an actual experiment to reconstruct a social network. While most existing methods in this area assume oscillator networks that generate continuous-time data, our work successfully demonstrates that the extremely challenging problem of reverse engineering of complex networks can be addressed even when the underlying dynamical processes are governed by realistic, evolutionary-game types of interactions in discrete time.

In many fields of science and engineering, one encounters the situation where the system of interest is composed of networked elements, called nodes, but the pattern of the node-to-node interaction, or the network topology, is totally unknown. It is desirable and of significant interest to uncover the network topology based on time series of certain observable quantities extracted from experiments or observations. Examples of potential applications abound: reconstruction of gene-regulatory networks based on expression data in systems biology [1][2][3][4], extraction of various functional networks in the human brain from activation data in neuroscience [5][6][7][8], and uncovering organizational networks based on discrete data or information in social science and homeland defense. In the past few years, the problem of network reconstruction has received growing attention [9][10][11][12][13][14][15][16]. Most existing works were based, however, on networks of oscillators whose dynamics are mathematically described by coupled, continuous differential equations. In particular, either some knowledge about the dynamical evolution of the underlying networked system is needed [9][10][11] or long, oscillatory signals in continuous time are required [12][13][14][15][16]. The advantage of availing oneself of continuous-time data is lost for networks in the social, economic, and even biological sciences, where node-to-node interactions are governed by evolutionary-game types of dynamics [17][18][19][20][21]. In addition to being discrete, the available data may be sporadic and the amount may be small. To our knowledge, the problem of reconstructing the full topology of a network based on discrete and ''rare'' data remains outstanding [22].
In this paper, we articulate a general method to uncover network topology from evolutionary-game data based on compressive sensing, a recently developed paradigm for sparse-signal reconstruction [23][24][25][26][27][28] with broad applications ranging from image compression/reconstruction to the analysis of large-scale sensor-network data. Although convex optimization, of which compressive sensing is one type, has been used to reconstruct coupled-oscillator networks [9][10][11], we shall show the advantages of compressive sensing, such as its small data requirement, in solving the general inverse problem of network reconstruction based on either continuous [29,30] or discrete data. We propose a mathematical framework to convert the problem of uncovering network topology into that of sparse-signal reconstruction. In a typical game, agents use different strategies in order to gain the maximum payoff. Generally, the strategies can be divided into two types: cooperation and defection. We will show that, even when the available information about each agent's strategy and payoff is limited, our compressive-sensing-based method can yield precise knowledge of the node-to-node interaction pattern in a highly efficient manner. We validate our method by (1) extensive numerical computations using model complex networks and evolutionary games, and (2) an actual social experiment in which participants forming a friendship network play a typical game to generate short sequences of strategy and payoff data. The high prediction accuracy achieved and the unique requirement of an extremely small data set make our method particularly suitable for potential applications that reveal ''hidden'' networks embedded in various social, economic, and biological systems.
In an evolutionary game, at any time, a player can choose one of two strategies (S): cooperation (C) or defection (D), which can be expressed as S(C) = (1, 0)^T and S(D) = (0, 1)^T, where T stands for ''transpose.'' The payoffs of the two players in a game are determined by their strategies and the payoff matrix of the specific game. For example, for the prisoner's-dilemma game (PDG) [31] and the snowdrift game (SG) [32], the payoff matrices are

$$P_{\rm PDG} = \begin{pmatrix} 1 & 0 \\ b & 0 \end{pmatrix}, \qquad P_{\rm SG} = \begin{pmatrix} 1 & 1-r \\ 1+r & 0 \end{pmatrix},$$

where b (1 < b < 2) and r (0 < r < 1) are parameters characterizing the temptation to defect. When a defector encounters a cooperator, the defector gains payoff b in the PDG and payoff 1 + r in the SG, but the cooperator gains the ''sucker'' payoff 0 in the PDG and payoff 1 − r in the SG. At each time step, all agents play the game with their neighbors and gain payoffs. For agent i, the payoff is

$$G_i = \sum_{j \in \Gamma_i} S_i^T \cdot P \cdot S_j,$$

where S_i and S_j denote the strategies of agents i and j at the time and the sum is over the neighbor-connection set Γ_i of i. After obtaining its payoff, an agent updates its strategy according to its own and its neighbors' payoffs, attempting to maximize its payoff at the next round. Possible mathematical rules to capture an agent's decision-making process include the best-take-over rule [31], the Fermi equation [33], and payoff-difference-determined updating probability [34]. To be concrete, we use the Fermi rule, defined as follows, in our simulations of evolutionary-game dynamics and generate time series accordingly.
After a player i randomly chooses a neighbor j, i adopts j's strategy S_j with the probability [33]

$$W(S_i \leftarrow S_j) = \frac{1}{1 + \exp[(G_i - G_j)/\kappa]},$$

where κ characterizes the stochastic uncertainties in the game dynamics. For example, κ = 0 corresponds to absolute rationality, where the probability is 0 if G_j < G_i and 1 if G_i < G_j, and κ → ∞ corresponds to completely random decision making. The probability W thus characterizes the bounded rationality of agents in society and the natural selection based on relative fitness in evolution. The goal of compressive sensing is to reconstruct a vector X ∈ R^N from linear measurements Y about X in the form

$$Y = \Phi \cdot X,$$

where Y ∈ R^M and Φ is an M × N matrix. The striking feature of compressive sensing is that the number of measurements is much less than the number of components of the unknown vector, i.e., M ≪ N. Accurate reconstruction can be achieved by solving the following convex-optimization problem:

$$\min \|X\|_1 \quad \text{subject to} \quad Y = \Phi \cdot X,$$

where $\|X\|_1 = \sum_{i=1}^{N} |X_i|$ is the L_1 norm of vector X. Solutions to the convex-optimization problem are available [23][24][25][26][27][28]. Convex optimization based on the L_1 norm has been used for solving network-reconstruction problems in oscillator networks [9][10][11]. Here, we shall show that the compressive-sensing approach provides a solution to network-reconstruction problems (other than oscillator networks) based on a small amount of data from evolutionary games.
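As an illustration (our own sketch, not the authors' code; function names are ours), the Fermi update probability and its two limiting behaviors can be written as:

```python
import math

def fermi_prob(G_i, G_j, kappa):
    """Probability that player i adopts neighbor j's strategy,
    W = 1 / (1 + exp[(G_i - G_j) / kappa])."""
    return 1.0 / (1.0 + math.exp((G_i - G_j) / kappa))

# Small kappa: (almost) absolute rationality.
assert fermi_prob(1.0, 2.0, 0.01) > 0.999   # G_i < G_j: adopt almost surely
assert fermi_prob(2.0, 1.0, 0.01) < 0.001   # G_i > G_j: almost never adopt

# Large kappa: decisions become essentially random (W -> 1/2).
assert abs(fermi_prob(1.0, 2.0, 1e6) - 0.5) < 1e-3
```

The assertions simply check the two limits described in the text: rational imitation of better-performing neighbors for small κ, and coin-flip behavior for large κ.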
The key to solving the network-reconstruction problem lies in the relationship between the agents' payoffs and strategies. The interactions among agents in the network can be characterized by an N × N adjacency matrix A with elements a_{ij} = 1 if agents i and j are connected, and a_{ij} = 0 otherwise. The payoff of agent x can be expressed by

$$G_x(t) = \sum_{i=1, i \neq x}^{N} a_{xi} \, S_x^T(t) \cdot P \cdot S_i(t),$$

where a_{xi} represents a possible connection between agent x and its neighbor i; the term a_{xi} S_x^T(t) · P · S_i(t) stands for the possible payoff of agent x from the game with i (if there is no connection between x and i, the payoff is zero because a_{xi} = 0); and t = 1, ⋯, m, where m is the number of rounds that all agents play the game with their neighbors. This relation provides us with a base to construct the vector G_x and matrix Φ_x in a proper compressive-sensing framework, G_x = Φ_x · A_x, to obtain a solution of the neighbor-connection vector A_x of agent x. In particular, we write

$$G_x = [G_x(t_1), G_x(t_2), \cdots, G_x(t_m)]^T, \qquad A_x = [a_{x1}, \cdots, a_{x,x-1}, a_{x,x+1}, \cdots, a_{xN}]^T,$$

and

$$\Phi_x = \begin{pmatrix} F_{x1}(t_1) & \cdots & F_{x,x-1}(t_1) & F_{x,x+1}(t_1) & \cdots & F_{xN}(t_1) \\ \vdots & & & & & \vdots \\ F_{x1}(t_m) & \cdots & F_{x,x-1}(t_m) & F_{x,x+1}(t_m) & \cdots & F_{xN}(t_m) \end{pmatrix},$$

where F_{xy}(t_i) = S_x^T(t_i) · P · S_y(t_i). The sparsity of A_x makes the compressive-sensing framework applicable. The vector G_x can be obtained directly from the payoff data. Since S_x^T(t_i) and S_y(t_i) in F_{xy}(t_i) come from data and P is known, the matrix Φ_x can be calculated from the strategy data. The vector A_x can thus be predicted based solely on the time series. Note that the self-interaction term a_{xx} is not included in the vector A_x and that the self-interaction column [F_{xx}(t_1), ⋯, F_{xx}(t_m)]^T is excluded from the matrix Φ_x. In a similar fashion, the neighbor-connection vectors of all other agents can be predicted, yielding the network adjacency matrix A = (A_1, A_2, ⋯, A_N).
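To make the framework concrete, here is a minimal, self-contained sketch (our own illustration, not the authors' implementation) for a toy setup of N = 6 agents. The L_1 minimization is replaced by an exhaustive search over binary link vectors — feasible only at this tiny scale — but it exhibits the key point: m = 4 < N − 1 = 5 measurements suffice to pin down a sparse A_x.

```python
from itertools import product

# Strategy vectors and the PDG payoff matrix (rows/cols ordered C, D).
C, D = (1, 0), (0, 1)
b = 1.2
P = [[1.0, 0.0], [b, 0.0]]

def F(s_x, s_y):
    """F_xy(t) = S_x^T . P . S_y, the potential payoff of x against y."""
    return sum(s_x[i] * P[i][j] * s_y[j] for i in range(2) for j in range(2))

# Toy ground truth: agent x has 5 potential neighbors; a_true is its
# neighbor-connection vector A_x (self-interaction column excluded).
a_true = (0, 1, 1, 0, 1)

# Four observed rounds (m = 4 < 5 unknowns). Each entry gives x's
# strategy and the other agents' strategies in that round.
rounds = [
    (C, [C, C, D, D, D]),
    (C, [D, C, C, D, D]),
    (C, [D, D, C, C, D]),
    (C, [D, D, D, C, C]),
]
Phi = [[F(s_x, s_y) for s_y in others] for s_x, others in rounds]
G = [sum(a * f for a, f in zip(a_true, row)) for row in Phi]  # payoff data

# Stand-in for the L1 minimization: among all binary link vectors
# consistent with G = Phi . A_x, keep the sparsest one.
A_x = min((a for a in product((0, 1), repeat=5)
           if all(abs(sum(ai * fi for ai, fi in zip(a, row)) - g) < 1e-9
                  for row, g in zip(Phi, G))),
          key=sum)
assert A_x == a_true
```

In practice one would use a proper L_1 solver (e.g., basis pursuit via linear programming) rather than enumeration, and the reconstruction also works for real-valued (weighted) link vectors.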
We first use model complex networks to demonstrate our method by implementing the PDG and SG on three types of complex networks: random [35], small-world [36], and scale-free [37]. Time series of strategies and payoffs are recorded during the system's evolution toward the steady state; they are used for uncovering the topology of the interaction network. To quantify the performance of our method in terms of the amount of required measurements for different game types and network structures, we introduce the success rates for existent links (SREL) and nonexistent links (SRNL). If the predicted value of an element of the adjacency matrix A is close to 1, the corresponding link is deemed to exist. If the value is close to zero, the prediction is that there is no link. In practice, we assign a small threshold, e.g., 0.1, so that the range for existent links is 1 ± 0.1 and the range for nonexistent links is 0 ± 0.1. Any value outside the two intervals is regarded as a failure of prediction. For a single player, SREL is defined as the ratio of the number of successfully predicted neighbor-connection links to the number of actual neighbors, and SRNL is defined similarly. We then average over all nodes to obtain the values of SREL and SRNL for the entire network. The reason for treating the success rates for existent and nonexistent links separately lies in the sparsity of the underlying complex network, where the number of nonexistent links is usually much larger than the number of existent links. The choice of the threshold does not affect the values of the success rates, insofar as it is neither too close to 1 nor too close to zero.
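The two per-node success-rate measures with the 0.1 threshold described above can be sketched as follows (our own helper function, for illustration):

```python
def success_rates(a_true, a_pred, tol=0.1):
    """SREL / SRNL for one node: the fraction of existent (a = 1) and
    nonexistent (a = 0) links whose predicted values fall within
    1 +/- tol and 0 +/- tol, respectively."""
    exist = [p for t, p in zip(a_true, a_pred) if t == 1]
    nonexist = [p for t, p in zip(a_true, a_pred) if t == 0]
    srel = sum(abs(p - 1.0) <= tol for p in exist) / len(exist)
    srnl = sum(abs(p) <= tol for p in nonexist) / len(nonexist)
    return srel, srnl

# One true link recovered well, one poorly; the nonexistent link is clean:
assert success_rates([1, 0, 1], [0.95, 0.02, 0.70]) == (0.5, 1.0)
```

Network-wide SREL and SRNL would then be obtained by averaging these per-node values over all agents, as in the text.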
The success rates of prediction for two types of games and three types of network topologies are shown in Fig. 1. The length of the time series is represented by the number of measurements collected during the temporal evolution normalized by the number N of agents; e.g., a value of 1 means that the number of used measurements equals N. For all combinations of game dynamics and network topologies examined, a perfect success rate can be achieved with an extremely small amount of data. For example, for random and small-world networks, the length of data required for achieving a 100% success rate is between 0.3 and 0.4. This value is slightly larger (about 0.5) for scale-free networks, due to the presence of hubs whose connections are much denser than those of most nodes, although their neighbor-connection vectors are still sparse. Figure 1 thus demonstrates that our method is both accurate and efficient. The requirement of an exceptionally small amount of data is particularly important for situations where only rare information is available. From this standpoint, evolutionary games are suitable to simulate such situations, as meaningful data can be collected only during the transient phase before the system reaches its steady state, and game dynamics are typically fast to converge, so that the transients are short. In addition, the robustness of our method has been tested in situations where the time series are contaminated by noise. For example, we have studied the case where random noise of amplitude up to 30% of b (where b is the parameter characterizing the temptation to defect) is added to the payoffs of the PDG. When the amount of used data exceeds 0.4, the success rate approaches 100% for random networks. Similar performance has been achieved for small-world and scale-free networks. The immunity of our method to random noise is not surprising, as compressive sensing represents an optimization scheme that is fundamentally resilient to noise. In contrast, another type of noise, noise in the strategy-updating process, plays a positive role in network reconstruction, because this kind of noise can increase the relaxation time toward one of the absorbing states (all C or all D), thereby providing more information for successful reconstruction.

FIG. 1. Success rates of inferring three types of networks: random, small-world, and scale-free, with PDG and SG dynamics. The network size N is 100. Each data point is obtained by averaging over 10 network realizations. For each realization, measurements are randomly picked from a time series of the temporal evolution. The error bars denote the standard deviations. The payoff parameters for the PDG and the SG are b = 1.2 and r = 0.7, respectively. We have also systematically tested other values of b and r and obtained similar success rates. The average node degrees of all used networks are fixed to 6 and the noise parameter is κ = 0.1.
To measure the efficiency of our method in reconstructing the network structure for different network sizes N, we systematically investigate the dependence on N of the minimum amount of data required for a successful reconstruction. Without loss of generality, we define the data threshold T_d as the minimum amount of data at which the success rates SREL and SRNL both reach 99% (cf. Fig. 1). As shown in Fig. 2, as the network size increases, relatively less data is required to identify the links in the network precisely. The reason is that, for a network with complex topology, as its size is increased, the sparsity condition can be satisfied more readily, rendering compressive sensing more efficient. We have also examined other types of networks, such as random and small-world networks. The results are qualitatively the same as those for the scale-free networks. This observation suggests that our method is particularly efficient for large, sparse networks based on rare measurable information, an efficiency facilitated by the characteristics of the compressive-sensing approach.
Our method can be generalized straightforwardly to weighted networks with inhomogeneous node-to-node interactions. Using weights to characterize the various interaction strengths, we define the weighted adjacency matrix W with elements w_{ij} > 0 if agents i and j are connected and w_{ij} = 0 otherwise. In the context of evolutionary games on networks, the weight w_{ij} characterizes the situation of aggregate investment. In particular, for both players, more investment in general will lead to more payoff. Given the link weights, the weighted payoff G_i^w of an arbitrary individual is given by

$$G_i^w = \sum_{j \in \Gamma_i} w_{ij} \, S_i^T \cdot P \cdot S_j,$$

where Γ_i denotes the neighbor set of i. Under evolutionary-game dynamics, the weighted-network structure is taken into account by the weighted payoff G_i^w. To uncover such a network from data, we need the weighted payoff vector G_x^w, the matrix Φ_x, and the weighted neighbor-connection vector W_x of an arbitrary individual x. The vectors G_x^w and W_x are given by

$$G_x^w = [G_x^w(t_1), \cdots, G_x^w(t_m)]^T, \qquad W_x = [w_{x1}, \cdots, w_{x,x-1}, w_{x,x+1}, \cdots, w_{xN}]^T.$$

Similar to unweighted networks, we have

$$G_x^w = \Phi_x \cdot W_x,$$

where W_x can be predicted from the strategy and payoff data. The prediction accuracy can be conveniently characterized by prediction errors, defined separately for link weights and for nonexistent links with zero weight. In particular, the relative error of a link weight is defined as the ratio of the absolute difference between the predicted weight and the true weight to the true weight; the average over all link weights is the prediction error E_w. A relative error cannot be defined for a zero-weight (nonexistent) link, so for those we use the absolute error E_z. Figure 3 shows the prediction errors for PDG dynamics on a scale-free network with random link weights chosen uniformly from the interval [1.0, 6.0]. We observe that the prediction errors decrease quickly as the number of measurements is increased. As the relative data size exceeds about 0.4, the two types of prediction errors approach essentially zero, indicating that all link weights have been successfully predicted without failure or redundancy, despite the randomness of the weights. We
have also examined random and small-world networks and observed that, to achieve the same level of accuracy, the data requirement can be somewhat relaxed compared with scale-free networks. We next present an example of uncovering a real social network. In the experiment, 22 participants from Arizona State University played the PDG together iteratively and, in each round, each player was allowed to change his or her strategy to optimize the payoff. The payoff parameter was set (arbitrarily) to b = 1.2. The player who had the highest normalized payoff (original payoff divided by the number of neighbors) summed over time was the winner and was rewarded. During the experiment, each player was allowed to communicate only with his or her direct neighbors for strategy updating. Prior to the experiment, there was a social tie (link) between two players if they had already been acquainted with each other; otherwise, there was no link. Among the 22 players, two withdrew before the experiment was completed, so they were treated as isolated nodes. The network structure is illustrated in Fig. 4(a). It exhibits typical features of social networks, such as a high density of triangles and a core of 4 players (nodes 5, 11, 13, and 16) that is fully connected within itself and has more links than the other nodes in the network. The core essentially consists of the players who were responsible for recruiting the other players to participate in the experiment. Each of the 20 players who completed the experiment played 31 rounds of the game, and each recorded his or her own strategy and payoff at each time step; these records formed the available database for prediction. The data used for each prediction run were randomly picked from this database. The preexisting friendship ties among the participants tend to favor cooperation and preclude the system from being trapped in the social dilemma for a small number of rounds. Over a long run, however, a full-defection state may occur. In this sense, the recorded data were taken during the transient dynamical phase and are thus suitable for network reconstruction. The results are shown in Fig. 4(b). We see that the social network can be successfully uncovered, despite the complicated decision-making process of each individual during the experiment. Compared to the simulation results, a larger data set (with a relative size of about 0.6) is needed for a perfect prediction of social ties. This can be attributed to the smaller size and denser connections of the social network relative to the model networks.
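Returning briefly to the weighted case, the two error measures E_w (relative, over true links) and E_z (absolute, over zero-weight links) introduced earlier can be sketched as follows (an illustration of the definitions, not the authors' code):

```python
def prediction_errors(w_true, w_pred):
    """E_w: mean relative error over true (nonzero-weight) links;
    E_z: mean absolute error over zero-weight (nonexistent) links."""
    links = [(t, p) for t, p in zip(w_true, w_pred) if t > 0]
    zeros = [p for t, p in zip(w_true, w_pred) if t == 0]
    E_w = sum(abs(p - t) / t for t, p in links) / len(links)
    E_z = sum(abs(p) for p in zeros) / len(zeros)
    return E_w, E_z

E_w, E_z = prediction_errors([2.0, 0.0, 4.0], [2.2, 0.1, 4.0])
assert abs(E_w - 0.05) < 1e-9   # (|2.2 - 2|/2 + 0/4) / 2 = 0.05
assert abs(E_z - 0.1) < 1e-9
```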
An interesting phenomenon is that the winner, picked in terms of the normalized payoff, had only two neighbors, in contrast to the players with the largest node degree, whose normalized payoffs are approximately at the average level, as shown in Fig. 4(c). In addition, the payoffs of players with smaller node degrees are highly nonuniform, while those of players with higher node degrees show smaller differences. This suggests that players with high degree may not act as leaders, due to their low average normalized payoffs. This experimental finding is in striking agreement with numerical predictions in the literature about the relationship between individuals' normalized payoffs and their node degrees [38][39][40]. We also observe from the experimental data that a typical player with a large number of neighbors failed to stimulate the neighbors to follow his or her strategies. This observation suggests that hubs may not be as influential as expected in social networks. However, this finding should not be interpreted as a counterexample to the leader's role in evolutionary games [34,41], since a network based on friendship may violate the absolute-selfishness assumption for players, who tend to be reciprocal with each other.

FIG. 3. Prediction errors E_w in link weights and E_z in nonexistent links for the PDG on weighted scale-free networks. The network size is 100, and the weights follow a uniform distribution ranging from 1.0 to 6.0. Each value of the prediction error is obtained using 10 independent network realizations. Other parameters are the same as for Fig. 1.

In summary, we proposed a general method based on compressive sensing to uncover interaction networks from evolutionary-game data. The method was validated for complex networks of different topologies and a real social network. For all cases considered, as the number of data points exceeds a low critical value depending on the sparsity of the underlying network, the prediction errors approach zero rapidly, with or without noise in the data. To our knowledge, no previous method can match ours in terms of accuracy and efficiency when only a small set of discrete data is available.
Our method, besides being fully applicable to complex networks governed by evolutionary-game-type interactions, can be applied in other contexts where the dynamical processes are discrete in time and the amount of available data is small. For example, inferring gene-regulatory networks from sparse experimental data is a problem of paramount importance in systems biology. For such an application, Eq. (6) should be replaced by the Hill equation, which models generic interactions among genes. Through an expansion using base functions specifically suited for gene-regulatory interactions, a compressive-sensing framework, mathematically represented by Eq. (4), may be established. The underlying reverse-engineering problem can then be solved. A challenge that must be overcome is to represent the Hill function by an appropriate mathematical expansion so that the form of compressive sensing can be met.
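As a hint of what such an expansion would involve, a standard activating Hill function (our illustration; the half-saturation constant K and Hill coefficient n are generic parameters, not values from this work) looks like:

```python
def hill(x, K, n):
    """Activating Hill function x^n / (K^n + x^n): the regulatory
    influence saturates at 1 and equals 1/2 at x = K."""
    return x ** n / (K ** n + x ** n)

assert hill(1.0, 1.0, 2) == 0.5    # half-maximal activation at x = K
assert hill(10.0, 1.0, 2) > 0.99   # saturation for x >> K
```

To fit the compressive-sensing form Y = Φ·X, one would expand such nonlinear terms in a suitable function basis and treat the (sparse) expansion coefficients as the unknown vector, which is exactly the open challenge the text describes.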

FIG. 2. Threshold data T_d of SREL and SRNL for a successful reconstruction as a function of network size N for the PDG and SG on scale-free networks, where T_d is defined as the amount of data, normalized by network size, that enables a 99% success rate. Each data point is obtained from 10 independent realizations, and the error bars represent standard deviations. The average node degrees of all used networks are fixed to 6. The parameters in the PDG and the SG are b = 1.2, r = 0.7, and κ = 0.1.

FIG. 4. (a) Structure of the experimental social network. (b) Success rates of uncovering the network topology and (c) normalized payoff of each player as a function of node degree. The sizes of the red nodes in (a) denote their node degrees, and the two light gray nodes (for players who did not complete the experiment) are isolated without any interactions with other nodes. The 10 independent realizations used in calculating the average success rates were randomly chosen from the database of 31 rounds of games.