Identifying time dependence in network growth

Identifying power-law scaling in real networks, indicative of preferential attachment, has proved controversial. Critics argue that measuring the temporal evolution of a network directly is better than measuring the degree distribution when looking for preferential attachment. However, many of the established methods do not account for any potential time dependence in the attachment kernels of growing networks, or they assume that node degree is the key observable determining network evolution. In this paper, we argue that these assumptions may lead to misleading conclusions about the evolution of growing networks. We illustrate this by introducing a simple adaptation of the Barabási-Albert model, the “k2 model,” where new nodes attach to nodes in the existing network in proportion to the number of nodes one or two steps from the target node. The k2 model results in time dependent degree distributions and attachment kernels, despite initially appearing to grow as linear preferential attachment, and without the need to include explicit time dependence in key network parameters (such as the average out-degree). We show that similar effects are seen in several real world networks where constant network growth rules do not describe their evolution. This implies that measurements of specific degree distributions in real networks are likely to change over time.


I. INTRODUCTION
The study of complex networks has expanded rapidly over the past 20 years. Many real systems have been analyzed using networks with great success, showing many nontrivial properties [1]. Model networks have been defined to understand the origin and development of these properties from elementary principles. For instance, the Watts-Strogatz model generates networks with short average path lengths but high clustering coefficients, explaining the small world phenomenon [2]. Similarly, the Barabási-Albert (BA) model, an undirected version of the Price model [3], demonstrates that scale free (power-law) degree distributions in real networks can arise from a combination of growth and preferential attachment [4]. These models have given significant insight into the structure of real networks. However, real systems almost never reflect the exact details of a model.
One of the most common features to study in a real network is the degree distribution [5]. The degree k of a node in a network is the number of direct connections a node has to other nodes in the network. The degree distribution P(k) is the probability distribution of the degree across all the nodes in the network. The degree distribution is said to be scale free if it follows a power law. Applying statistically rigorous techniques for identifying power-law distributions to a large set of real world networks, a recent study found that true scale free networks are rare, representing only about 4% of all networks [5]. These results are in line with a number of similar criticisms of the scale free paradigm [16][17][18][19][20][21][22]. However, despite broad support for these criticisms, many others in the networks community are still strong believers that most complex networks exhibit preferential attachment [23][24][25][26]. Among these individuals, many have taken issue with the methods used to process the data in Ref. [5] and/or the strictness of the scale free definition [14][25][26][27]. Arguing that scale free networks are only well defined in the infinite system size limit, proponents of looser definitions suggest that scale free networks are in fact not rare at all [26]. However, it can be argued that such a loosening will naturally result in a larger number of positive identifications and that using weakened criteria for scale freeness defeats the aim of using a statistically rigorous approach. Clearly, the issue of which approach is best when analyzing network degree distributions is yet to be fully resolved.
There is a third camp who argue that "knowledge of whether or not a distribution is heavy-tailed is far more important than whether it can be fit using a power-law" [20]. However, such an approach must be applied with care, since whether it is adequate depends on the context. For instance, in the case of epidemic spreading, two networks may both be fat-tailed with similar degree distributions, yet exhibit very different epidemic mixing patterns due to differences in network assortativity [28,29].
What all these approaches have in common is that they analyze the degree distribution of a network at a fixed point in time. If such an analysis is to give insight into the mechanistic origin and evolution of a network, it would be prudent to ask whether the degree distribution is representative of the network throughout its evolution or only for a brief period of time. Without an answer to this question, inferring the past and future evolution of a network based on the current form of its degree distribution may give misleading results.
A prominent example of a theoretical network model where the observed degree distribution appears to change over time is superlinear preferential attachment, where new nodes attach to existing nodes proportionally to their degree to a power greater than 1 [30]. In the long time limit, a gelation phenomenon is observed where almost all nodes connect to a single hub node forming a starlike network. However, Krapivsky and Krioukov [31] showed that superlinear attachment has significant pre-asymptotic regimes where the degree distribution appears to be approximately scale free.
Given the difficulty of directly identifying preferential attachment from static degree distributions, proponents of the scale free paradigm have argued that preferential attachment can be identified directly from dynamical network data (if available) [25]. Numerous approaches have been introduced over the years, using a variety of different assumptions [32][33][34][35][36][37][38]. Most commonly, methods assume that the preferential attachment kernel follows a functional form, $\Pi(k) \propto k^{\gamma}$, and primarily focus on estimating the exponent $\gamma$; such methods naturally assume that the preferential attachment kernel of a network is time independent.
As an alternative approach, nonparametric methods have been proposed that do not assume a functional form. The first of these methods, by Jeong et al. [33], infers the form of the attachment kernel by constructing a histogram of the degree of nodes to which new edges attach over a short observation window. However, there is no clear guide as to how to choose the start of the observation window and how long it must be: too short and the result is very noisy, too long and the result is subject to bias [38]. The method by Newman [32] avoids this problem by constructing multiple histograms over different observation windows and computing the attachment kernel as a weighted average over the different histograms. Although this method avoids the issue of how to choose the observation window, it appears to underestimate the attachment kernel at large degrees [39], an issue since corrected by Pham et al. [38].
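The single-window histogram idea can be illustrated with a minimal sketch. This is a simplified illustration in the spirit of the histogram approach described above, not the procedure of Ref. [33] or the corrected method of Ref. [38]; the function name `window_kernel`, its arguments, and the choice to freeze degrees at the start of the window are assumptions made for the example.

```python
from collections import Counter


def window_kernel(degrees_at_start, attachment_targets):
    """Estimate a relative attachment kernel from one observation window.

    degrees_at_start: dict mapping node -> degree at the start of the window.
    attachment_targets: existing nodes that received a new edge in the window.
    Degrees are frozen at the window start, which is why long windows are biased.
    """
    nodes_per_degree = Counter(degrees_at_start.values())
    hits_per_degree = Counter(
        degrees_at_start[v] for v in attachment_targets if v in degrees_at_start
    )
    # New edges received per node of degree k during the window.
    rate = {k: hits_per_degree[k] / nodes_per_degree[k] for k in hits_per_degree}
    if not rate:
        return {}
    k_ref = min(rate)  # normalise by the smallest observed degree
    return {k: r / rate[k_ref] for k, r in sorted(rate.items())}


# Toy window: two nodes of degree 1, one of degree 2, one of degree 4.
estimate = window_kernel({"a": 1, "b": 1, "c": 2, "d": 4}, ["a", "c", "d", "d"])
print(estimate)  # {1: 1.0, 2: 2.0, 4: 4.0}
```

With few attachment events per degree the estimate is dominated by counting noise, which is the short-window problem noted above.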
For networks in which the attachment kernel is time independent, the corrected Newman method proposed in Ref. [38] gives an excellent fit to data. However, it is still not clearly established whether the assumption of time independence is valid for real networks, and in some cases (such as citation networks) it is known to be false [40]. Similarly, the probability of attaching to a node may be a function of a variable other than the degree. However, how to correctly identify which feature of a node determines its attractiveness is not clear.
It is often argued that accurately calculating the attachment kernel of a growing network is important because it can help to predict the future evolution of a network [37]. For instance, in the case of nonlinear preferential attachment, where the attachment kernel is given by $\Pi(k) \propto k^{\gamma}$ with positive constant $\gamma$, it is known that for $0 < \gamma < 1$, the limiting degree distribution is a stretched exponential, whereas for $\gamma > 1$, the degree distribution displays a gelation phenomenon where a single dominant hub connects to almost all other nodes in the network [30]. In between, $\gamma = 1$ corresponds to traditional linear preferential attachment where the degree distribution displays power-law scaling. Hence, if we can estimate the value of $\gamma$ for the attachment kernel of a real network, this can be used to predict its future evolution.
Predictions regarding the future evolution of networks, explanations of the historical development of networks, and investigations into whether preferential attachment underlies the evolution of networks, based on measured attachment kernels, are widespread in the literature. These include studies on citation networks [41,42], protein networks [43], the bitcoin network [44], common words in the English language [45], social dynamics in online games [46], actor networks [33], and more.
The majority of these studies make three assumptions: (1) that the degree of a node is the key feature determining a node's attractiveness, (2) that the attachment kernel can be approximated by $\Pi(k) \propto k^{\gamma}$, and (3) that the measured attachment kernel is either time independent, or that the time dependence is largely unimportant. For instance, looking at four different periods in the evolution of the American Physical Society (APS) citation network, and using the node degree (citation count) as the key variable of interest, Sheridan and Onodera found that the exponent $\gamma$ ranges from 0.94 to 1.06 [42]. The authors assert that this implies that the attachment probabilities in the APS citation network are at least approximately time independent. However, as noted, $\gamma < 1$ would imply that the APS citation network's degree distribution approaches a stretched exponential, whereas $\gamma > 1$ would result in a gelation effect. Since both $\gamma < 1$ and $\gamma > 1$ were observed from the data, what does this imply for the future evolution of the network?
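The tension in this example can be made concrete with a minimal sketch that maps a fitted exponent to the asymptotic regime quoted above for nonlinear preferential attachment. The function name `asymptotic_regime` is hypothetical, and the exact comparison with 1 ignores the uncertainty that any real estimate of the exponent carries.

```python
def asymptotic_regime(gamma):
    """Long-time degree distribution for a kernel proportional to k**gamma."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    if gamma < 1:
        return "stretched exponential"
    if gamma == 1:
        return "power law"
    return "gelation"


# The two ends of the exponent range reported for the APS citation network
# fall on opposite sides of the linear case.
for gamma in (0.94, 1.06):
    print(gamma, asymptotic_regime(gamma))
```

Taken at face value, the two ends of the reported range therefore point to qualitatively different futures for the same network.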
The aim of this paper is to illustrate the risks of assuming time independence in the rules governing the evolution of growing networks, and the risk of assuming that the node degree determines node attractiveness. We will do this by introducing the "k2 model," a simple variant of the Barabási-Albert model where new nodes attach to existing nodes proportionally not to the number of direct neighbors a node has but to the number of nodes within distance two of the target node. This simple rule is rooted in the idea that well connected neighbors are preferable to poorly connected neighbors. The rule puts a particular focus on the role of nearest neighbor correlations in network growth. Such mechanisms of mutual benefit may be relevant to collaboration [47] or citation [48] networks. The mechanism may also have indirect relevance to node copying processes [49][50][51]. Similar ideas have been explored in Refs. [52][53][54][55].
Although this simple rule has no explicit time dependence (i.e., time dependence is not built in by including a time dependent parameter, e.g., the average out-degree), the correlations that form between neighboring nodes result in an implicit time dependence in the attachment kernel. Consequently, the resulting network does not demonstrate any of the simple scaling observed in traditional network models. This is despite an extended initial transient phase during which the network appears to grow according to linear preferential attachment. We support this argument with an analytical treatment demonstrating that assumptions of simple scaling in the k2 model are not robust.
The arguments we illustrate with the k2 model are highly relevant to real networks. By calculating the ratio of network attachment kernels over different time periods, we show that over short timescales, assumptions of time independence for real networks are relatively well justified. However, over longer time periods, the attachment kernels calculated show clear time dependence, displaying a diversity of patterns. While the overall effect may be small in some cases (such as for the Flickr friendship network or the English Wikipedia hyperlink network [56]), we argue that, at a minimum, practitioners should test the degree of time dependence in their data before making predictions about the future or past development of a network.

A. Model definition
The k2 model is defined as a simple, undirected network. The model is initialized with a small connected network of $m_0$ nodes. At each time step, a new node is created with $m \le m_0$ new edges. Each of the $m$ edges connects the new node to a target node in the existing network. Each target node $i$ is chosen with probability proportional to the number of nodes which are one or two steps away from it, $k^{(2)}_i$. We refer to $k^{(2)}_i$ as the second degree of node $i$; see Appendix A for a formal definition. The attachment probability is identical to that of the BA model, with the exception that the BA model attaches proportionally to the number of nodes one step away from the target node $i$, $k^{(1)}_i$. We refer to $k^{(1)}_i$ as the first degree, or just the degree, of node $i$. Computationally, we prevent multiple edges being formed between two nodes by selecting $m$ unique target nodes.
For clarity, whenever notation is presented with a subscript $i$ or $j$, for instance $k^{(1)}_i$ or $k^{(2)}_j$, the focus is on the value of that variable for the particular node $i$ or $j$. When the subscript is omitted, for instance $k^{(1)}$ or $k^{(2)}$, the focus is on all nodes with the same specific value of the variable in question. We use $k$ and $k^{(1)}$ interchangeably where appropriate.
Figure 1 illustrates the motivation for the k2 model. In the BA model, a node's importance is proportional to the number of nodes connected to it, i.e., the first degree. However, there is no consideration for whether these connected nodes are important or not. A node with three isolated neighbors is considered equally important to a node neighboring three hubs. This is in conflict with many real world scenarios, for instance in academic collaboration networks, where it is known that junior researchers working under top scientists are those most likely to be successful and reach tenure in their careers [47].
In the k2 model, this effect is accounted for, allowing nodes to benefit from connecting to hub nodes and giving them the opportunity to become hubs themselves.
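The growth rule can be sketched in a few lines of code. This is a minimal, unoptimized illustration rather than the authors' implementation: the names `second_degree` and `grow_k2` are hypothetical, the seed network is assumed to be a complete graph on $m + 1$ nodes, and the $m$ unique targets are obtained by redrawing duplicates, which is only one possible way to enforce uniqueness.

```python
import random


def second_degree(adj, i):
    """Count distinct nodes one or two steps from node i (i itself excluded)."""
    reach = set(adj[i])
    for a in adj[i]:
        reach |= adj[a]
    reach.discard(i)
    return len(reach)


def grow_k2(n_nodes, m=1, seed=0):
    """Grow a k2-model network; returns {node: set of neighbours}."""
    rng = random.Random(seed)
    m0 = m + 1  # assumed seed network: complete graph on m + 1 nodes
    adj = {i: {j for j in range(m0) if j != i} for i in range(m0)}
    for new in range(m0, n_nodes):
        nodes = list(adj)
        # Attachment weight of each existing node is its second degree.
        weights = [second_degree(adj, i) for i in nodes]
        targets = set()
        while len(targets) < m:  # redraw duplicates to get m unique targets
            targets.add(rng.choices(nodes, weights=weights)[0])
        adj[new] = set(targets)
        for tgt in targets:
            adj[tgt].add(new)
    return adj


tree = grow_k2(200, m=1, seed=1)
hub = max(tree, key=lambda v: len(tree[v]))
print(len(tree[hub]), second_degree(tree, hub))
```

Recomputing every second degree at each step makes this sketch quadratic in the network size, so it is only suitable for small networks; a practical simulation would update second degrees incrementally.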
The principle of weighting neighbor importance reflects the role of friends-of-friends in social network theory [57,58] and is the foundation for widely used network centrality measures built on self-consistent equations, such as Katz centrality [59,60] or PageRank [60,61].
Mathematically, we define the attachment kernel as the function specifying the probability of attaching to a specific node in the network. In the BA model, $\Pi^{(\mathrm{BA})} \propto k^{(1)}$, whereas in the k2 model, $\Pi^{(k2)} \propto k^{(2)}$. In the case of the k2 model, we can write the normalized form of the attachment kernel as
$$\Pi^{(k2)}_i = \frac{k^{(2)}_i}{\sum_j k^{(2)}_j} \approx \frac{\sum_{\alpha=1}^{k^{(1)}_i} k^{(1)}_{i\alpha}}{\sum_j \sum_{\beta=1}^{k^{(1)}_j} k^{(1)}_{j\beta}}, \tag{1}$$
where for $m = 1$, the approximation is an equality. For $m > 1$ the approximation holds as long as the number of nonunique second degree neighbors is small, see Appendix A. By splitting the numerator of the attachment kernel into the contribution of the first degree neighbors to node $i$, $k^{(1)}_i$, and the contribution of the next-nearest neighbors, $k^{(2)}_i - k^{(1)}_i$, Eq. (1) can be rewritten as
$$\Pi^{(k2)}_i \approx \frac{k^{(1)}_i + \sum_{\alpha=1}^{k^{(1)}_i} \bigl(k^{(1)}_{i\alpha} - 1\bigr)}{\sum_j \sum_{\beta=1}^{k^{(1)}_j} k^{(1)}_{j\beta}}, \tag{2}$$
which is a function of the first neighbor degree only, where we have used
$$k^{(2)}_i \approx \sum_{\alpha=1}^{k^{(1)}_i} k^{(1)}_{i\alpha}. \tag{3}$$
Here, $\alpha$ labels the $k^{(1)}_i$ unique first neighbors of node $i$, and $k^{(1)}_{i\alpha}$ is the first degree of node $\alpha$, connected to node $i$. In Eq. (2), the first term indicates the contribution to the attachment kernel from the direct neighbors of node $i$, and the second term indicates the contribution from next-nearest neighbors to node $i$.
Conceptually, we can think of the k2 model as involving two separate networks. In the observed network, each node represents an agent, and an edge between two nodes represents a direct, first degree relationship between the two nodes. However, new nodes do not connect to a target node according to the node's direct connections but rather according to the number of nodes within distance two of the target. These nodes are within the sphere of influence of the target node.
Hence, we define the influence network, in which an edge between any two nodes signifies that the nodes are within each other's sphere of influence, i.e., two connected nodes are neighbors, or next-nearest neighbors in the observed network, see Fig. 2.
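The relation between the two networks can be illustrated with a short sketch that builds the influence network from an observed adjacency structure. The function name `influence_network` and the five-node example tree are assumptions made for illustration only.

```python
def influence_network(adj):
    """Link every pair of nodes that are neighbours or next-nearest neighbours."""
    influence = {}
    for i, neighbours in adj.items():
        reach = set(neighbours)
        for a in neighbours:
            reach |= adj[a]
        reach.discard(i)  # a node is not in its own sphere of influence
        influence[i] = reach
    return influence


# Observed network: the path 0-1-2-3 with an extra leaf (node 4) on node 1.
observed = {0: {1}, 1: {0, 2, 4}, 2: {1, 3}, 3: {2}, 4: {1}}
influence = influence_network(observed)

# The degree of a node in the influence network is its second degree.
second_degrees = {i: len(nb) for i, nb in influence.items()}
print(second_degrees)  # {0: 3, 1: 4, 2: 4, 3: 2, 4: 3}
```

For this tree the second degree of each node equals the sum of the first degrees of its neighbors, and the second degrees sum to 16, which is twice the number of edges in the influence network.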
The influence network has similarities to the node copying mechanism studied in Refs. [49,50], based on earlier models in Ref. [51], with relevance to social network formation [57,62], citation networks [48,63], evolution [64], and protein interaction networks [65,66]. Although the k2 model is not designed to model such systems explicitly, it may be useful for understanding the role of neighbor-neighbor correlations in the growth of such networks.
In the influence network, new nodes connect to a target node proportionally to the node's degree, $k^{(2)}$. The new node then copies a fraction of the nodes attached to the initial target node and forms additional edges to these copied neighbors. The copied neighbors correspond to those which are directly connected to the target node in the observed network. In the node copying model, new nodes select a target node at random and then copy a fraction of the target node's neighbors. As opposed to the k2 model, the copied neighbors are selected at random with probability $p$. In this respect, the node copying model where the original target node is chosen preferentially could represent a mean-field version of the k2 model, where we neglect correlations between neighboring nodes.

B. Measuring the time dependence of preferential attachment
To understand how the attachment kernel of a network changes over time, it is helpful to consider relative attachment probabilities as opposed to absolute attachment probabilities. In general we can write an arbitrary attachment kernel, which is a function of the node degree only, as
$$\Pi(k_i; t) = \frac{f(k_i(t))}{\sum_{j=1}^{N(t)} f(k_j(t))}, \tag{4}$$
with an arbitrary preference function $f$. The summation is over all nodes in the network at time $t$. The function $f$ is time independent; however, as the network grows and more nodes are added, $N(t)$ in the denominator changes, and hence the denominator is time dependent. Note that $f(k_i(t))$ for a specific node $i$ is time dependent, since the degree of a specific node evolves over time. We define the relative attachment kernel as
$$\phi_t(k, k') = \frac{\Pi(k; t)}{\Pi(k'; t)} = \frac{f(k)}{f(k')}. \tag{5}$$
As opposed to $\Pi(k; t)$, the relative attachment kernel has no dependence on the network as a whole, but rather is a function of the degrees $k$ and $k'$ only. As a result, we can express the time independence of the attachment kernel as
$$\phi_t(k, k') = \phi_s(k, k') \quad \text{for all times } t \text{ and } s. \tag{6}$$
For convenience, in the following we will consider $\phi_t(k, 1)$, i.e., the attachment probability of connecting to a node with degree $k$ relative to a node with degree $k' = 1$. By definition, $\phi_t(k, k) = 1$. Consider the relative attachment kernel calculated at time $t$, written as $\phi_t(k, 1)$, and at time $s$, $\phi_s(k, 1)$. If Eq. (6) holds, then $\phi_t(k, 1) = \phi_s(k, 1)$. For a real network, it is likely that there will be small deviations from this ideal case. Hence, we can plot the ratio $\phi_t(k, 1)/\phi_s(k, 1)$ against degree $k$ to gauge the extent of the time dependence across a specific time interval. This ratio is only well defined for networks which contain nodes with degree $k$ at both times $t$ and $s$.
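The ratio test described above can be sketched as follows, assuming the attachment kernel at each time has already been estimated and stored as a mapping from degree to a positive attachment probability. The function names `relative_kernel` and `kernel_ratio` and the toy numbers are hypothetical.

```python
def relative_kernel(kernel, k_ref=1):
    """Divide a kernel {degree: probability} by its value at degree k_ref."""
    ref = kernel[k_ref]
    return {k: value / ref for k, value in kernel.items()}


def kernel_ratio(kernel_t, kernel_s, k_ref=1):
    """Ratio of relative kernels at two times, over degrees present at both."""
    phi_t = relative_kernel(kernel_t, k_ref)
    phi_s = relative_kernel(kernel_s, k_ref)
    common = sorted(set(phi_t) & set(phi_s))
    return {k: phi_t[k] / phi_s[k] for k in common}


# Toy kernels: the later kernel has an excess at k = 4, and degree 8 exists
# only at the later time, so it is dropped from the ratio.
kernel_s = {1: 0.010, 2: 0.020, 4: 0.040}
kernel_t = {1: 0.001, 2: 0.002, 4: 0.006, 8: 0.020}
print(kernel_ratio(kernel_t, kernel_s))
```

A ratio close to one at every common degree is consistent with a time independent kernel over the interval, while systematic departures from one, such as the value near 1.5 at $k = 4$ in this toy example, indicate time dependence.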
The BA model is a simple case where Eq. (6) should hold, with $\phi_t(k, 1) = k$ for all $t$. Likewise, for nonlinear preferential attachment, $\phi_t(k, 1) \propto k^{\gamma}$ with positive constant $\gamma$. In the case of the k2 model, Eq. (6) does not hold, due to the second term in Eq. (2). The preference function in the k2 model is not a function of the degree of a node, but of the second degree, $k^{(2)}$. Clearly the second degree is related to the first degree and, when analyzing the k2 model, one could mistakenly believe that the node degree, $k^{(1)}$, is the quantity determining network growth. However, although this appears approximately true at first, the relation between the average first and second degree changes over time. In other words, although the attachment kernel is not explicitly time dependent (e.g., we have not included an explicit aging mechanism), the local network structure, which determines a node's second degree, is time dependent.
This point cannot be overstated; while the k2 model clearly breaks the assumptions outlined above, it does so in a way that, without prior knowledge of the model rules, is wholly nonobvious. As we will outline, if sufficient care is not taken, these assumptions risk producing misleading or incorrect predictions about a network's past or future evolution.

A. Simulation results
In the following, we will focus on analyzing the attachment kernel and the degree distribution of the k2 model, using the BA model as a comparison. Each simulation is initialized with a complete graph of $m_0 = m + 1$ nodes. Figure 3 shows the degree distribution and the true relative attachment kernel for the k2 model with $m = 1$. Both subfigures are averaged over 100 simulations. Early in the network development, there are only small differences between the behavior of the k2 and BA models. However, as the network grows, significant differences emerge. The initial BA-like transient phase lasts longer and follows the BA model even more closely for $m > 1$, see Appendix B.
Over short timescales, the network growth appears largely indistinguishable from linear preferential attachment. However, over longer timescales, the attachment kernel shows clear deviations from this simple scaling, with a plateau region at moderate degree preceding a superlinear tail. The anomalous scaling observed is most clearly seen for nodes with moderate degree in the range k ≈ 10 to k ≈ 300. This region suggests that there may be multiple timescales of interest at play in the evolution of the k2 model.
As noted, the BA model incorporates a rich-get-richer mechanism but does not account for any mechanisms of mutual benefit between nodes; new nodes added to the network receive no benefit from attaching to a hub node as opposed to any other less important node. Conversely, in the k2 model, when a node $i$, added to the network at time $t_i$, attaches to a hub node, the new node's initial attractiveness is given by [Eq. (7)], which is completely determined by the first degree of the targeted hub. This has the counterintuitive effect that the tail of the attachment kernel appears to show superlinear preferential attachment, implying gelation, but that the change in the number of new edges in the influence network is dominated by new nodes with degree $k^{(1)} = 1$. As a consequence, the k2 model appears to show a gelationlike phenomenon to communities rather than hubs, resulting in the plateaus shown in Fig. 3.
The true relative attachment kernel in Fig. 3 is typically not accessible for a real network. To illustrate the risks this may pose, let us assume the k2 model is a real network and fit the relative attachment kernel, on the assumption that $\phi_t(k, 1) \propto k^{\gamma}$ for a positive exponent $\gamma$. From Fig. 3(b) we may deduce that for $t = 10^3$, the k2 model has an approximately linear (or possibly slightly sublinear) attachment kernel, whereas at $t = 10^6$, the attachment kernel is highly nonlinear but clearly grows faster with $k$ than the simple prediction from linear preferential attachment. If we were to use these results to infer the future scaling of the network, the data at $t = 10^3$ would suggest that the network might approach a stretched exponential degree distribution, whereas from the data for $t = 10^6$, we might paradoxically infer the network is approaching a gelation state. In the case of the k2 model this approach is misleading, but for other networks this approach may be a good first approximation. However, what is clear is that simply calculating the attachment kernel of a network at one point in time is not sufficient to determine the form of the attachment kernel in the past or future. Likewise, since the degree distribution is determined by the underlying dynamical process growing the network, we cannot accurately know how the degree distribution will evolve in time.
In the case of a real network, we can only estimate the relative attachment kernel by observing the degree of the nodes to which new nodes attach. To simulate this real-network scenario, we apply the corrected Newman method to a single simulation of the k2 model, as shown in Fig. 4. Figure 4(a) shows the calculated relative attachment kernel for the k2 model at times $t = 10^5$ and $t = 10^6$. As in Fig. 3(b), nodes with moderate degree, $k \approx 30$, show an excess in the relative attachment kernel. In Fig. 4(b), deviations in the relative attachment kernel are shown explicitly by taking the ratio of the relative attachment kernels at $t = 10^5$ and $t = 10^6$. For very small degree nodes, the ratio is approximately one, indicating that the attachment kernel is time independent at these degrees. Above $k = 10$, the ratio clearly deviates from one, indicating that the relative attachment kernel is time dependent. For visual clarity, Figs. 4(c) and 4(d) show the equivalent of (a) and (b) but for the cumulative sum of the relative attachment kernel, defined as [Eq. (8)].
It is important to note that the estimated attachment kernel using Newman's method is not fully consistent with the true attachment kernel; the magnitude of the excess in the attachment probabilities is much smaller using Newman's method than the true excess shown in Fig. 3(b). This is because Newman's method constructs the attachment kernel by collating multiple histograms from different times in the network evolution. The consequence is that the estimated form of the attachment kernel at $t = 10^6$ is more consistent with the true attachment kernel earlier in the evolution of the k2 model, rather than the current value of the attachment kernel.
To verify that the deviations in the relative attachment kernel are due to the evolution of the k2 model and not numerical errors, we repeat the analysis shown in Fig. 4 for the BA model where the relative attachment kernel is expected to be time independent.
Figure 5(a) shows that the relative attachment kernels are effectively indistinguishable at different times in the network evolution. This is confirmed by Fig. 5(b) where the ratio of the relative attachment kernels is approximately one for all nodes with degree k < 100. Noise in the tail obscures the ratio for k > 100.
Overall, Fig. 5 indicates that using the corrected Newman method is effective, to an extent, at estimating the relative attachment kernel of a network and testing whether it exhibits time dependence. This suggests that the deviations in the relative attachment kernel observed in Fig. 4 are due to the structural properties of the k2 model and not due to limitations in the method used to estimate the relative attachment kernel. Hence, we can deduce that the relative attachment kernel for the k2 model is time dependent.
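The cumulative relative attachment kernel used in Figs. 4(c) and 4(d) can be sketched as a running sum over increasing degree. The function name `cumulative_kernel` is hypothetical, and the sketch assumes the sum starts at the smallest observed degree and simply skips degrees absent from the estimate.

```python
from itertools import accumulate


def cumulative_kernel(phi):
    """Running sum of a relative kernel {degree: value} over increasing degree."""
    degrees = sorted(phi)
    totals = accumulate(phi[k] for k in degrees)
    return dict(zip(degrees, totals))


# For a linear relative kernel the cumulative sum grows quadratically.
print(cumulative_kernel({1: 1.0, 2: 2.0, 3: 3.0}))  # {1: 1.0, 2: 3.0, 3: 6.0}
```

Summing smooths the noise in the sparsely populated tail, which is why the cumulative form is easier to read in a plot than the raw kernel.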

B. Mathematical results
Given the complexity of the k2 model, exact analytical results are hard to derive. However, using simple arguments, we can demonstrate the inconsistencies that arise from assuming the k2 model follows a simple form of nonlinear preferential attachment.
From the definition of the k2 model, we can make a continuum approximation and write the evolution of the degree, $k^{(1)}_i(t)$, of a given node $i$ as
$$\frac{\mathrm{d}k^{(1)}_i(t)}{\mathrm{d}t} = \frac{m\,k^{(2)}_i(t)}{\sum_j k^{(2)}_j(t)} \tag{9a}$$
$$\approx \frac{m \sum_{\alpha} k^{(1)}_{i\alpha}(t)}{\sum_j \sum_{\beta} k^{(1)}_{j\beta}(t)}, \tag{9b}$$
where node $i$ is added to the network at time $t_i$ ($\le t$). The second degree, $k^{(2)}_i(t)$, is defined according to Eq. (3), the summation is over all nodes in the network, and the approximation is an equality if $m = 1$, see Appendix A. We can write the evolution of the second degree as [Eq. (10)], where $\xi_i(t)$ is defined in [Eq. (11)]; see Appendix A for a derivation. Here $i_{\alpha(\beta)}$ represents the node $\beta$ connected to node $i_{\alpha}$. Equation (11) represents the effect of non-neighboring nodes on node $i$. Since the first degree of a node can only grow over time, $\xi_i(t)$ is a positive semidefinite, monotonically increasing function with respect to time $t$, that is,
$$\xi_i(t) \ge 0, \tag{12a}$$
$$\frac{\mathrm{d}\xi_i(t)}{\mathrm{d}t} \ge 0. \tag{12b}$$
Our aim in the following is to write $\xi_i(t)$ as a function of the first degree, $k^{(1)}$, only. To do so we rearrange Eq. (10) to make $\xi_i(t)$ the subject and substitute in Eq. (9a) and Eq. (9b), giving [Eq. (13)], which is a function of the first degree only. Here we note that the summations in Eq. (13) correspond to the denominator in Eq. (9b). This is the sum over $k^{(2)}_i(t)$ for each node in the network, and hence corresponds to twice the total number of edges in the influence network, which we label as
$$\sum_j k^{(2)}_j(t) = 2E^{(2)}(t). \tag{14}$$
Figure 3 shows the degree distribution and the relative attachment kernel for the k2 model obtained from simulations. As a thought experiment, let us suppose that these simulations are not for a theoretical network model but that the data represents a real world network. For the network at small times in its evolution, the degree distribution and the attachment kernel are closely approximated by the BA model.
For preferential attachment (linear or nonlinear), it is known that, on average, the degree of a given node $i$ evolves in time as a power function given by
$$k^{(1)}_i(t) \propto \left(\frac{t}{t_i}\right)^{\delta}, \tag{15}$$
where $t_i$ is the time at which node $i$ was added to the network, and $\delta = 1/2$ for linear preferential attachment [60]. In the case of sub-(super)linear preferential attachment, $\delta < 1/2$ ($\delta > 1/2$). Let us assume Eq. (15) holds and test whether this simple scaling is consistent with the mathematical form of the k2 model. First, we substitute Eq. (14) into Eq. (13), giving [Eq. (16)]. In the case of the k2 model where one node is added to the network at each time step, we initialize our network such that $t_j = j$ and note that the number of nodes in the network at time $t$ is given by $N(t) = m_0 + t \approx t$ for large $t$.
Using this initialization, we now calculate the value of $E^{(2)}(t)$ by approximating the sum as an integral and substituting in Eq. (15), giving [Eq. (17)]. There are three cases for the different possible values of $\delta$:
Case (i) $2\delta < 1$. Corresponding to sublinear preferential attachment, this scenario is likely to be irrelevant for the k2 model since the influence network cannot grow slower than the original network in the BA model. In this case we expect to find linear growth in the number of edges in the influence network,
$$E^{(2)}(t) \propto t. \tag{18}$$
Here $E^{(2)}(t)$ is dominated by the youngest nodes (created at the largest times $t_i$) as the older nodes grow too slowly.
Case (ii) $2\delta = 1$. Corresponding to linear preferential attachment, this is the case for the BA model,
$$E^{(2)}(t) \propto t \ln t. \tag{19}$$
Case (iii) $2\delta > 1$. This corresponds to superlinear preferential attachment, where there is some enhancement over linear preferential attachment. For the k2 model, this scenario is plausible since we know that for any given node $k^{(2)}_i(t) \ge k^{(1)}_i(t)$. In this case we find
$$E^{(2)}(t) \propto t^{2\delta}, \tag{20}$$
where the growth in the number of edges in the influence network is dominated by the oldest nodes in the network.
Let us assume case (iii) is valid for the k2 model. Substituting Eq. (20) into Eq. (16) we find [Eq. (21)], where in the final line we have grouped all the constants for each term into a single positive prefactor, $a_1$ to $a_4$.
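As a check on the three cases for $E^{(2)}(t)$ listed above, the integral can be evaluated explicitly. The following is a sketch under stated assumptions, not a reproduction of the original derivation: the second degree of a node is taken to be the sum of its neighbors' first degrees (exact for $m = 1$), so that summing the second degree over all nodes gives the sum of squared first degrees; the prefactor in Eq. (15) is taken to be $m$; and the lower limit of the integral is set to $t_j = 1$.

```latex
\begin{align*}
2E^{(2)}(t) = \sum_j k^{(2)}_j(t)
  &\approx \sum_j \bigl[k^{(1)}_j(t)\bigr]^2
   \approx m^2 \int_1^t \left(\frac{t}{t_j}\right)^{2\delta} \mathrm{d}t_j \\
  &= \begin{cases}
       \dfrac{m^2}{1 - 2\delta}\,\bigl(t - t^{2\delta}\bigr) \sim t,
         & 2\delta < 1, \\[2ex]
       m^2\, t \ln t,
         & 2\delta = 1, \\[1ex]
       \dfrac{m^2}{2\delta - 1}\,\bigl(t^{2\delta} - t\bigr) \sim t^{2\delta},
         & 2\delta > 1.
     \end{cases}
\end{align*}
```

Under these assumptions the youngest nodes (large $t_j$) control the integral when $2\delta < 1$ and the oldest nodes (small $t_j$) control it when $2\delta > 1$, matching the statements in cases (i) and (iii).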
Recall that the k2 model requires that $\xi_i(t)$ is a positive semidefinite, monotonically increasing function, and note that Eq. (21) is only valid for $\delta > 1/2$. As $t \to \infty$, the first term of Eq. (21) will dominate the second if $5\delta - 2 \ge \delta$, i.e., $\delta \ge 1/2$. Likewise, the third term will dominate the fourth if $3\delta - 1 \ge 2\delta$, i.e., $\delta \ge 1$. Hence, as $t \to \infty$, the first term is the dominant positive term and the fourth term is the dominant negative term. To ensure $\xi_i(t) \ge 0$ for all $t > t_i$, the first term must grow at least as fast as the fourth term, giving $5\delta - 2 \ge 2\delta$, corresponding to $\delta \ge 2/3$.
Returning to Eq. (10) and substituting in Eq. (15) and Eq. (20), we can also write [Eq. (22)]. We have established that to satisfy Eq. (12a), the leading term of $\xi_i(t)$ must scale as $t^{5\delta - 2}$, and $\delta \ge 2/3$. However, as a consequence of the rules of the k2 model, at time $t > t_i$, node $i$ can gain no more than $m$ new edges in the influence network in any given time step (i.e., $k^{(2)}_i(t + 1) - k^{(2)}_i(t) \le m$). Hence, strictly for $t > t_i$, we require
$$\frac{\mathrm{d}k^{(2)}_i(t)}{\mathrm{d}t} \le m, \tag{23}$$
which is only satisfied if the denominator of Eq. (22) grows at least as fast as the numerator of Eq. (22). This requires $t^{2\delta} \ge t^{5\delta - 2}$ as $t \to \infty$. Hence, $2\delta \ge 5\delta - 2$, giving $\delta \le 2/3$. Combining the conditions in Eq. (12a) and Eq. (23), we find that a power function of the form given in Eq. (15) can only satisfy the requirements of the k2 model if $\delta = 2/3$.
To test the validity of our argument, we simulate the growth in the number of edges for the k2 influence network. This is shown in Fig. 6 for $m = 1$ and 3. The figure shows that, at large $t$, the number of edges in the influence network scales as approximately $t^{4/3}$ corresponding to $\delta = 2/3$, in agreement with our prediction.
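A scaling exponent of this kind can be read off from edge counts with a log-log fit. The sketch below is a minimal illustration using ordinary least squares on logarithms; the function name `loglog_slope` is hypothetical, and the input is synthetic data generated to follow $t^{4/3}$ exactly, not output from the simulations reported in Fig. 6.

```python
import math


def loglog_slope(ts, values):
    """Least-squares slope of log(values) against log(ts)."""
    xs = [math.log(t) for t in ts]
    ys = [math.log(v) for v in values]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic edge counts following t**(4/3); the prefactor 3.0 is arbitrary.
ts = [10 ** p for p in range(3, 7)]
edges = [3.0 * t ** (4 / 3) for t in ts]
slope = loglog_slope(ts, edges)
delta = slope / 2  # uses the case (iii) scaling of the edge count as t**(2*delta)
print(round(slope, 3), round(delta, 3))
```

On real or simulated data the fit should be restricted to large $t$, since the early BA-like transient would otherwise bias the slope.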
However, further analysis appears to contradict this conclusion. Firstly, we can simulate the k2 model and track the degree of specific nodes over time; see Fig. 7. The data have been averaged over 10^4 simulations, with the shaded regions indicating the standard deviation; only with a very large sample size can the average evolution of node i be observed. In most simulations a node hardly grows at all, whereas in a few simulations nodes grow very quickly.
For a transient period after being added to the network, the average degree evolution of a node appears to scale as t^{1/2}, which is the expected scaling for linear preferential attachment. This appears to contradict the δ = 2/3 scaling identified previously, although we note that the integral in Eq. (17) is dominated by the oldest nodes in the network for δ > 1/2, which do appear to grow faster than t^{1/2} towards the end of the simulation.
For newer nodes, after a transient period, the degree evolution appears to deviate from t^{1/2} scaling, but the scaling appears to transition to δ < 1/2 rather than δ > 1/2. The time over which this transition takes place increases with the time at which nodes are added to the network.
This suggests that the true functional form for the degree evolution in the k2 model involves two competing terms, the first scaling as t^{2/3}, which is suppressed by t_i, and the second scaling as t^{1/2}, which is suppressed by t. We hypothesize, but at this stage cannot prove, that this implies two scaling regimes: For fixed t_i and t → ∞, the scaling of the degree evolution is dominated by a t^{2/3} term to ensure that E^{(2)}(t) ∝ t^{4/3} as t → ∞. For t_i → ∞ and t = t_i + Δ where Δ ≪ t_i, the degree evolution of node i is dominated by a t^{1/2} term. Competing regimes of this type are not seen in standard nonlinear preferential attachment.
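One way to look for competing scaling regimes in a measured degree trajectory is to track the local exponent d ln k / d ln t rather than fitting a single power law. The sketch below uses a finite-difference estimate; the function name is hypothetical, and the trajectory is a toy mixture of two power laws chosen only for illustration, not the functional form of the k2 model, which remains undetermined.

```python
import math


def local_exponents(times, degrees):
    """Finite-difference estimate of d ln k / d ln t between successive samples."""
    pairs = list(zip(times, degrees))
    return [
        math.log(k1 / k0) / math.log(t1 / t0)
        for (t0, k0), (t1, k1) in zip(pairs, pairs[1:])
    ]


# Toy trajectory (an assumption for illustration only): a t^(1/2) term plus a
# smaller t^(2/3) term, sampled at log-spaced times from 10 to 10^6.
times = [10 ** (i / 4) for i in range(4, 25)]
toy = [t ** 0.5 + 0.1 * t ** (2 / 3) for t in times]
for t, e in zip(times[1:], local_exponents(times, toy)):
    print(f"t = {t:12.1f}   local exponent = {e:.3f}")
```

A drifting local exponent signals that no single power law describes the trajectory over the sampled range.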
It is interesting to consider the origin of the t^{1/2} scaling. Our results are inconclusive; however, if we let t_i → ∞ and set t = t_i + Δ with Δ ≪ t_i, a Taylor expansion of Eq. (21) gives an expression with positive constants b_1 and b_3, revealing that δ ≥ 1/2, rather than δ ≥ 2/3, is sufficient for ensuring that ξ_i(t = t_i) ≥ 0 as t_i → ∞.
The mathematical argument presented here does not prove the limiting behavior of the k2 model. However, the result does indicate that the inclusion of simple nearest neighbor correlations in network growth can affect the scaling of key observables. Although the network initially appears to grow as linear preferential attachment, this simple scaling breaks down as the network grows. In the case of real networks this may happen at an early stage in the evolution of a network. However, as illustrated by the k2 model, the transient time during which the model appears to grow according to linear preferential attachment may be significant: it is not uncommon to analyze real networks with 10^4 to 10^5 nodes, yet in the case of the k2 model, particularly for m > 1 (see Appendix B), the network is still in this transient period. In cases like the k2 model with complex growth rules, oversimplified assumptions derived from nonlinear preferential attachment do not reflect reality.
FIG. 8. The ratio of the cumulative relative attachment kernels for six real world networks: (a) Facebook friendships [67], (b) Youtube followers [56], (c) the APS citation network [68], (d) hep-ph arXiv collaborations [69], (e) the Flickr network [56], and (f) hyperlinks on English Wikipedia [56]. For all networks the ratio is shown for early, t/s = 0.1, and late, t/s = 0.9, in the evolution of the recorded network relative to the endpoint s; time is measured in the net number of edges added to the network.

C. Application to real world networks
Figure 8 shows the ratio of the cumulative relative attachment kernels for six real world growing networks: (a) a regional friendship network on Facebook [67], (b) the Youtube follower network [56], (c) the APS citation network [68], (d) the hep-ph arXiv collaboration network [37], (e) the Flickr follower network [56], and (f) the hyperlink network for English Wikipedia [56]. For all networks, the cumulative relative attachment kernel is calculated at two time points early and late in the network's evolution relative to the endpoint of the dataset. In some cases the full evolution of the network is not known and is accounted for with a large initial graph at t = 0.
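The comparison of attachment kernels in two time windows, as described for Fig. 8, can be sketched as follows. This is a rough illustration under several assumptions that are not taken from the paper: each event adds one new node with one edge, the seed graph is a single edge, events are weighted in a Newman-style fashion by the inverse fraction of nodes holding the target's degree, and the two cumulative kernels are normalized at the smallest shared degree before taking their ratio. All function names are hypothetical.

```python
import random
from collections import Counter


def windowed_kernel(targets, start, stop):
    """Kernel histogram for attachment events with index in [start, stop).

    targets[t] is the existing node that new node t + 2 attaches to.
    Assumption: the seed graph is nodes 0 and 1 joined by one edge.
    """
    degree = [1, 1]
    n_k = Counter({1: 2})   # number of nodes currently holding each degree
    weight = Counter()
    for t, target in enumerate(targets):
        k = degree[target]
        if start <= t < stop:
            # Weight by the inverse fraction of nodes with degree k.
            weight[k] += len(degree) / n_k[k]
        # Update the network state after recording the event.
        n_k[k] -= 1
        n_k[k + 1] += 1
        n_k[1] += 1
        degree[target] += 1
        degree.append(1)
    return dict(weight)


def cumulative(weight):
    """Running sum of the kernel histogram over increasing degree."""
    total, out = 0.0, {}
    for k in sorted(weight):
        total += weight[k]
        out[k] = total
    return out


def ratio_of_cumulative(early, late):
    """Late over early cumulative kernel, each normalized at the smallest shared degree."""
    shared = sorted(set(early) & set(late))
    k0 = shared[0]
    return {k: (late[k] / late[k0]) / (early[k] / early[k0]) for k in shared}


# Toy data (uniform random attachment), used only to exercise the functions.
rng = random.Random(0)
targets = [rng.randrange(t + 2) for t in range(5000)]
early = cumulative(windowed_kernel(targets, 0, 500))
late = cumulative(windowed_kernel(targets, 4000, 4500))
print(ratio_of_cumulative(early, late))
```

Because the estimator itself assumes a time-independent kernel within each window, short windows are preferable when the aim is to detect drift between windows.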
Two details are clear in Fig. 8. Firstly, on short timescales the relative attachment kernels are approximately time independent, with only small deviations observed. However, over longer time periods, the ratio of the relative attachment kernels is not constant indicating time dependence.
There is significant diversity in the changes observed to the ratio of the relative attachment kernels. In Figs. 8(a), 8(e), and 8(f), the ratio is (to a good approximation) monotonically increasing. This implies that if we were to approximate the attachment kernel of these networks using nonlinear preferential attachment, Π(k) ∝ k^γ, the exponent γ will appear to have reduced over time (the network growth is becoming more sublinear). Conversely, in Fig. 8(b) we see the opposite effect where the ratio is approximately monotonically decreasing, implying an increase in the exponent γ (the network growth is becoming more superlinear). In the cases of Figs. 8(a), 8(b), and 8(e), the form of nonlinear attachment (i.e., sublinear, γ < 1, or superlinear, γ > 1) does not change. However, in the case of the Wikipedia hyperlink network in Fig. 8(f), the relative attachment kernel early in the network's evolution appeared superlinear, whereas by its endpoint, the attachment kernel was measured to be sublinear. Here we reiterate our previous criticism: If the attachment kernel of a network is meant to predict its future evolution, how do we reconcile that measurements across some time windows result in one prediction, while other time windows result in a different, wholly incompatible prediction?
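To make the notion of an apparent change in γ concrete, the exponent of a measured kernel can be estimated by an ordinary least-squares fit in log-log space and compared between time windows. The sketch below is a simple illustration with a hypothetical function name and synthetic kernels; it is not the estimation procedure used for Fig. 8, and a log-log fit is only one of several possible estimators.

```python
import math


def fit_exponent(degrees, kernel):
    """Least-squares slope of ln(kernel) against ln(degree)."""
    xs = [math.log(k) for k in degrees]
    ys = [math.log(a) for a in kernel]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic kernels (not measured data): superlinear early, sublinear late.
degrees = list(range(1, 101))
early_kernel = [k ** 1.1 for k in degrees]
late_kernel = [k ** 0.9 for k in degrees]
print(fit_exponent(degrees, early_kernel), fit_exponent(degrees, late_kernel))
```

If the fitted exponent differs between windows by more than its uncertainty, a single time-independent γ is not an adequate description of the growth.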
The story in Figs. 8(c) and 8(d) is more complex. In both cases, the ratios of the relative attachment kernels initially appear to decrease below 1, before increasing at moderate degree and exceeding a ratio of 1 at large degree. This behavior is qualitatively very similar to the dynamics observed in Fig. 4 for the k2 model. In these cases, approximating the change in the attachment kernel as a change in the nonlinear preferential attachment exponent γ is not easy, since the data for small degrees may imply an increase in γ whereas the data for large degrees may imply a decrease in γ. In such cases, simple assumptions of nonlinear preferential attachment are not sufficient to draw reliable conclusions about the future evolution or past origin of a network.
A number of mechanisms may be responsible for the appearance of time dependence in the relative attachment kernel. Generally, the simplest explanations for time dependence in network growth relate to changes in network parameters over time. In most network growth models (including the k2 model) these parameters are assumed to be time independent for simplicity. One example of such a parameter is the out-degree of each new node added to a network, m.
In the BA model, the limiting degree distribution is given by p(k) = 2m(m + 1) / [k(k + 1)(k + 2)], where we note that this solution is valid for sufficiently large graphs given any initial network at time t_0. As a result, if we let m → m(t), it is plausible that we will observe transient behavior during which the form of the degree distribution may change over time. The same argument holds for measurements of the network attachment kernel. Such time dependence in m is not hypothetical and has been demonstrated by Leskovec et al. [40] for a number of different growing networks including citation networks, patent networks, and affiliation networks.
Some network models consider growth in the average out-degree of nodes over time; see, for instance, Ref. [70]. However, despite showing complex time dependent scaling in the time evolution of individual nodes (the analytical form for the degree distribution is not solvable), the authors argue that the limiting degree evolution implies power-law scaling in the degree distribution for large graphs. This may be true, but as the k2 model illustrates, in some cases the transient phase of a network may be so long that, for all practical purposes, the limiting degree distribution is not necessarily observed during the lifetime of a real world network. Slow convergence to a limiting degree distribution has been noted previously for node copying models in Ref. [65].
Another simple parameter which may affect the time dependence of either the degree distribution or attachment kernel of a network is the exponent for nonlinear preferential attachment. Consider letting γ → γ(t) for Π(k) ∝ k^γ. It is already known that γ affects the limiting degree distribution of nonlinear preferential attachment [30], and that for γ > 1, the degree distribution appears scale free for a transient period [31]. Hence, any variation in γ is likely to add to the complexity of the time dependence observed in network growth observables; see Ref. [71] for a detailed discussion.
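The dependence of the BA limiting degree distribution on m can be made explicit with a short numerical check. The sketch assumes the standard closed form p(k) = 2m(m + 1) / [k(k + 1)(k + 2)] for k ≥ m; the function names are hypothetical, and the tail formula follows from a telescoping sum.

```python
def ba_limiting_pk(k, m):
    """Standard BA limiting degree distribution, nonzero only for k >= m."""
    if k < m:
        return 0.0
    return 2.0 * m * (m + 1) / (k * (k + 1) * (k + 2))


def ba_tail(k_min, m):
    """Closed-form sum of p(k) over k >= k_min (valid for k_min >= m)."""
    return m * (m + 1) / (k_min * (k_min + 1))


for m in (1, 3):
    norm = sum(ba_limiting_pk(k, m) for k in range(m, 200001))
    print(f"m = {m}: normalization = {norm:.6f}, P(k >= 10) = {ba_tail(10, m):.4f}")
```

The power-law tail exponent is the same for every m, but the prefactor and lower cutoff are not, so a network grown with a drifting m(t) mixes distributions with different weights in the tail while it relaxes.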
So far, we have provided only two examples of parameters whose time dependence may alter the transient behavior of a network's degree distribution or attachment kernel. However, we would argue that any solvable network model where a constant parameter appears in the analytical form for the degree distribution is likely to exhibit time dependent transients if that constant becomes time dependent.
In cases like the k2 model, where no individual parameter has been set to be time dependent, the reasons for complex scaling in the observables of network growth are less easily explained and may not be easy to elucidate from data. In the k2 model, the origin lies in the implicit time dependence of the local network structure which results in superlinear scaling in the influence network, associated with gelation to important communities. However, any network growth model where the attachment kernel is determined by an observable which implicitly changes over time as the structure of the network changes is likely to exhibit similar time dependence. For instance, network growth based on attaching to nodes according to their betweenness is likely to exhibit a time dependent network attachment kernel, as investigated in Ref. [54].
We note that it was our original intention to perform the analysis in Fig. 8 with statistical rigor. However, this task has proven difficult given that (1) our data break many of the assumptions underlying common statistical tests, and (2) it is not yet fully understood how techniques for estimating attachment kernels, like Newman's method, are affected or biased by time dependence, since by construction these techniques assume the attachment kernel is time independent.
We highlight the need to tackle these problems with better statistical network analysis in future work.

IV. DISCUSSION & CONCLUSION
The study of complex networks has come to dominate complexity science in the 21st century and is likely to become more prominent in a hyperconnected world. Not only have complex networks become influential in physics and mathematics, but their trans-disciplinary appeal has led to their use across almost all areas of science and academia, from archaeology [72] to neuroscience [73], economics [74] to epidemics [75], and many more.
A key feature of network science is the study of how networks emerge and evolve over time, and numerous models and techniques have been developed to explore this problem [4,25,30,32,33,38,48,60,63,70,76-80]. In almost all cases, these models and techniques have their limitations and are only applicable to the real world under a number of key constraints. Despite this, the spread of network science has been so extensive that many of these approaches are being used without a robust understanding of their underlying assumptions. In this paper we have discussed two such assumptions: (1) that the rules underlying network growth do not depend on time, and (2) that the degree of nodes in a network is the key observable determining network evolution.
A vast number of research papers discuss network growth and attempt to infer its underlying mechanisms, often using simple network models to inform their analysis. However, the models most frequently discussed in popular network science textbooks, for instance Refs. [25,60], almost always assume that underlying growth rules are fixed in time. As a result, it is not particularly surprising that most papers inferring network growth mechanisms also make these assumptions. In many contexts such assumptions are sensible and essential, allowing for analytically tractable calculations which may otherwise be impossible. However, in some real world scenarios such approaches may not be suitable. A selection of papers which do consider the implication of these assumptions includes Refs. [52,71,81] in the context of preferential attachment models, Refs. [55,82] in organizational networks, Ref. [54] in social networks, and others [49,50,53].
In this paper, we have tried to highlight how very simple network growth rules can break both the time independence of the network degree distribution and the time independence of the node-node attachment probability. We have done this by introducing the k2 model, a simple variant of the Barabási-Albert model where the attractiveness of a node is correlated to the attractiveness of a node's neighbors. Even though such a network growth rule does not contain an explicit time dependence, the formation of clusters means that a node's attractiveness is implicitly time dependent through its dependence on its local environment. This mechanism is relevant for real-world networks involving mutual benefit where a node gains an advantage from being connected to an influential neighbor, such as in collaboration networks [47], or citation networks [48], or indirectly in systems with neighbor-neighbor interactions and copying processes [22,49,50,57,62-66].
The k2 model shows that for small networks, the degree distribution appears approximately power-law, and the attachment kernel is approximately linear, both of which are consistent with preferential attachment. However, after a lengthy transient period, both the degree distribution and attachment kernel show significant deviations from the simple scaling predicted for preferential attachment. These deviations grow over time, showing strong time dependence. We support these findings with an approximate analytical treatment showing that assuming simple scaling forms for the evolution of individual nodes in the k2 model results in mathematical inconsistencies, suggesting that several scaling regimes interact and change over time.
The k2 model is an idealized network growth model; it does not reflect real-world networks, even if the underlying mechanism has explanatory value. However, changes in the degree distribution and the attachment kernel can also be seen in real data of varying origins. In six networks for which dynamic network data is available (three social networks, one hyperlink network, one collaboration network, and one citation network), we have found that these networks are approximately time independent on short timescales but show significant time dependency, and diversity in that dependency, over longer timescales. In some cases, this time dependency may have simple origins such as node aging, changes in the average out-degree over time, or changes in the exponent for preferential attachment. However, in other cases, time dependence may arise implicitly, resulting in complex scaling.
In the context of the wider debate on "scale free" networks, it is worth considering the following. If the generative mechanisms underlying network growth are not constant in time, is it plausible that the functional form for network degree distributions will be constant in time? In many cases the change over time may be very small. However, if we apply a strict definition of "scale freeness," small changes in the attachment kernel may be sufficient to induce changes in the most-likely functional form for the degree distribution as predicted using the current state of the art measures [5]. If this is in fact the case, this may explain why only 4% of real world networks have been identified as scale free [5].
To conclude, as long as network science techniques are being applied to the real world by experts and nonexperts alike, it is essential that we understand the limitations of simple models and consider their underlying assumptions. Here, we have shown how very simple, sociologically meaningful changes to network growth models can profoundly affect both the time dependence of network growth and the assumption that node degree determines network evolution. While the k2 model serves an illustrative purpose, the ideas drawn from its evolution apply to real networks, which show diverse time dependence over extended durations. While this appears to be a disappointing conclusion, we note that over short time periods network growth does appear to be approximately time independent. In many cases the origin of the time dependence may have a simple explanation, which, if accounted for in prediction models, may avoid excessive errors in forecasting the evolution of networks and the dynamics taking place on those networks. However, knowing the impact of these assumptions is only possible if simple steps are taken to check their validity. It is our hope that this paper will encourage more people to do so.
It is possible to generalize the form of attachment shown in Eq. (2) by including a coefficient to the second term that adjusts the total weighting of next-nearest neighbors. This can be written as where we require 0. If = 0, the k2 model reduces to the BA model, (k2) Eq. (A1) reduces to the k2 model, Eq. (2). In this paper, to illustrate concerns about time invariance in the scaling of attachment kernels and degree distributions, we will only focus on the = 1, δ = 1 case shown in Eq. (2). Note that the general case presented in Eq. (A1) is very closely related to the 2 levels model proposed by Dangalchev [52].
However, the 2 levels model double counts the first degree neighbors of node i, = 1, δ = 0, and in the analysis of the model, Dangalchev only looked at very small networks in which issues concerning the time invariance of the attachment kernel and degree distributions cannot be seen.

Formal definition of the k2 model
We can define k_i^{(ℓ)} as the number of unique nodes which are ℓ or fewer steps from the target node i, excluding node i itself. Let N_i^{(ℓ)}(t) be the set of nodes which are at distance ℓ from node i in the network G(t) at time t, that is, after all nodes and edges for that time step have been added, so that G(t) has m_0 + t ≈ t nodes. The distance between nodes i and j is defined as the minimum number of edges which need to be crossed in order to form a continuous path from node i to node j. Then we define q_i^{(ℓ)}(t) = |N_i^{(ℓ)}(t)| and k_i^{(ℓ)}(t) = Σ_{n=1}^{ℓ} q_i^{(n)}(t), where k_i^{(1)}(t) = q_i^{(1)}(t). In this paper we do not consider attachment kernels proportional to k_i^{(ℓ)}(t) for ℓ > 2. However, it is interesting to note that if the attachment kernel were proportional to k_i^{(ℓ)}(t) with ℓ ≥ D(t), where D(t) is the network diameter, this attachment kernel would be equivalent to random attachment until the growing network has diameter D(t) > ℓ.
FIG. 9. The ratio of the sum over the second degree of each node in the network to the sum over the first degree squared for each node in the network, see Eq. (A7). The ratio equals one for m = 1 and converges to one for m > 1.
Therefore, we obtain Eq. (A6), Σ_{j=1}^{N} k_j^{(2)}(t) = Σ_{j=1}^{N} [k_j^{(1)}(t)]^2. For m > 1, we can test the validity of Eq. (A6). Figure 9 plots the ratio of the two sums against time for m = 1, 3, defined as S_2(t)/S_1(t) with S_2(t) = Σ_j k_j^{(2)}(t) and S_1(t) = Σ_j [k_j^{(1)}(t)]^2. The figure has been averaged over 100 simulations of the k2 model. Figure 9 indicates that for m = 1, S_2(t)/S_1(t) = 1 for all t, as expected. For m > 1, there is a noticeable difference between S_2(t) and S_1(t) at very small times in the network's evolution. This is to be expected since when the network is small, the probability of acquiring nonunique second degree neighbors is small but not negligible. As the network evolves, the ratio S_2(t)/S_1(t) quickly converges to 1, with S_2(t)/S_1(t) > 0.9 by t = 10^3. This indicates that Eq. (A7) is a good approximation even for m > 1.
FIG. 10. The (a) degree distribution and (b) relative attachment kernel for the k2 model over time with m = 3. The dashed lines show the expected scaling for the BA model. Early in the growth of the k2 model, the evolution of the network is largely indistinguishable from the BA model. As the k2 model grows, both the degree distribution and relative attachment kernel deviate significantly from the simple scaling predicted by the BA model.
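The comparison of the two sums can be reproduced on small graphs by brute force. The sketch below is illustrative only and makes assumptions not stated in the text: the seed graph is a complete graph on m + 1 nodes, each new node picks m distinct targets with probability proportional to their second degree, and all function names are hypothetical. Second degrees are recomputed from scratch at each step, so it is only practical for a few hundred nodes.

```python
import random


def second_degree(adj, i):
    """Number of unique nodes one or two steps from node i, excluding i."""
    reach = set(adj[i])
    for j in adj[i]:
        reach |= adj[j]
    reach.discard(i)
    return len(reach)


def grow_k2(steps, m, seed=0):
    """Grow a k2-model graph by brute force; return a list of adjacency sets."""
    rng = random.Random(seed)
    # Assumption: the seed graph is a complete graph on m + 1 nodes.
    adj = [set(range(m + 1)) - {i} for i in range(m + 1)]
    for _ in range(steps):
        weights = [second_degree(adj, i) for i in range(len(adj))]
        targets = set()
        # Assumption: the m targets of a new node are distinct.
        while len(targets) < m:
            targets.add(rng.choices(range(len(adj)), weights=weights)[0])
        new = len(adj)
        adj.append(set(targets))
        for j in targets:
            adj[j].add(new)
    return adj


def s2_over_s1(adj):
    """Ratio of the summed second degree to the summed squared first degree."""
    s2 = sum(second_degree(adj, i) for i in range(len(adj)))
    s1 = sum(len(nb) ** 2 for nb in adj)
    return s2 / s1


for m in (1, 3):
    print(m, s2_over_s1(grow_k2(200, m)))
```

For m = 1 the graph is a tree and the ratio is exactly one. In general, every pair of nodes at distance two is joined by at least one two-step path, while two-step paths that close triangles or duplicate an existing pair add to S_1 but not to S_2, so the ratio cannot exceed one.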
Figure 10 shows the degree distribution and relative attachment kernel of the k2 model for m = 3, the counterparts of the results shown for m = 1 previously. Initially, the degree distribution appears qualitatively similar to the power-law scaling expected from linear preferential attachment. This is associated with an approximately linear relative attachment kernel. However, as the network evolves, clear deviations from the simple scaling form predicted by the BA model appear in both the degree distribution and the relative attachment kernel. Note that the time for these deviations to become significant increases as m is increased. The magnitude of the deviations shown for m = 1 exceeds that for m = 3.

Degree evolution for m = 3
Figure 11 shows the evolution of individual nodes added at time t_i in the k2 model for m = 3. The figure is consistent with the previous result shown for m = 1. Note in particular that the magnitude of the standard deviation is significantly smaller than for m = 1. This suggests that the results for m = 3 better reflect the true underlying degree scaling in the k2 model than the result for m = 1. It is especially clear how closely nodes added at large t_i follow the t^{1/2} scaling predicted by the BA model during the initial phase after the node is added to the network.
Two additional details are worth highlighting: (1) After the initial transient phase during which nodes scale approximately with t^{1/2}, the scaling deviates from δ = 1/2 scaling to δ < 1/2, but the magnitude of the change is much smaller than for m = 1. This result is of particular interest since extended transient times and smaller deviations from δ = 1/2 scaling may explain why the transient periods for the degree distribution and relative attachment kernel shown in Fig. 10 are longer, and why these observables follow the BA model more closely, than the equivalents for m = 1. (2) For t_i = 10, it appears that shortly after entering the δ < 1/2 phase, the exponent increases again and appears to approach δ > 1/2, although the effect is very small. Longer simulations are required to clearly elucidate the scaling behavior of individual nodes, but these simulations are computationally challenging in the current framework.