Feedback through graph motifs relates structure and function in complex networks

How does the connectivity of a network system combine with the behavior of its individual components to determine its collective function? We approach this question by relating the internal network feedback to the statistical prevalence of connectivity motifs, a set of surprisingly simple and local statistics on the network topology. The resulting motif description provides a reduced order model of the network input-output dynamics and it relates the overall network function to feedback control theory. For example, this new formulation dramatically simplifies the classic Erdos-Renyi graph, reducing the overall graph behavior to a simple proportional feedback wrapped around the dynamics of a single node. Higher-order motifs systematically provide further layers and types of feedback to regulate the network response. Thus, the local connectivity shapes temporal and spectral processing by the network as a whole, and we show how this enables robust, yet tunable, functionality such as extending the time constant with which networks remember past signals. Rich connections between structure and function are developed for networks with random input and output weights, and those composed from nodes with distinct dynamics and connectivity. These simple statistical descriptions provide a powerful theoretical framework to understand the functionality of real-world network systems, as we illustrate for the mouse brain connectome.

High-dimensional networked dynamical systems are ubiquitous in the engineering, physical and biological sciences, including the power grid, nervous systems, communications and social networks. These systems are characterized by a large connectivity graph that determines how the system operates as a whole [1][2][3][4]. However, the connectivity is typically so complex that the structure-function relationship is obscured. It is infeasible that every individual connection in a network is masterfully planned, or even necessary for functionality. One alternative is that some statistical features of connectivity drive the underlying network function, motivating significant interest in studying specific connectivity patterns, or network motifs, that occur at higher than chance rates [5,6].
Here, we develop a concrete and precise theory relating the statistics of connectivity motifs to function, in the context of the collective input-output response, or signal processing property, of network systems, where a signal arrives at some nodes, resonates through the network, and is read out (Fig. 1A). We find that the network response takes the form of a feedback hierarchy, where each term in the hierarchy is determined by the statistics of a specific network motif. These statistics are the motif cumulants, which identify the unique contributions of each motif, over and and above any of its sub-structures. As a consequence, the hierarchy can be truncated, giving a very efficient reduced order model for the network as a whole. This yields a powerful simplification that applies to a wide range of networks under intensive study. For example, we show that the classical Erdős-Rényi [7] graph is equivalent to proportional feedback on a single unit by directly relating motif cumulants to the network transfer function. In a general network, a hierarchy of additional feedback loops occurs, each corresponding to an individual motif cumulant.
Taken together, our results show how to infer simple engineering principles that underlie a complex network's response to inputs, and how new responses can be designed by tuning the connection statistics. An analogy may be drawn to the use of negative feedback to tightly regulate system response despite low-fidelity components [8]. This synthesizes two disciplines: (i) the statistical theory of networks [5,9,10] to isolate the impact of network motifs and (ii) control theory [11] to describe the network response via an equivalent feedback circuit.
There has been significant interest and effort designing robust distributed control of networked systems [12][13][14], including multi-agent control for the internet [15,16] and the electric grid [17]. Collective motion [18,19] has also been explored, such as animals flocking or fish schooling, and concensus [20]. However, there is relatively little work that relates network structure to function in the context of internal feedback and control theory [21].
To focus on the effect of statistical connectivity features, we prescribe linear time-invariant (LTI) dynamics at each node in the network. Large networked dynamical systems with LTI node dynamics are widely used to analyze a broad range of applications. Consensus and cooperation in networked multi-agent systems [22,23] are often investigated with linear dynamics, which are relevant for linear agreement protocols. LTI systems have also been studied in the context of fault detection and isolation [24]. Linear electrical networks describe many electrical circuits, and nonlinear electrical networks may often be linearized around an operating condition. The same is true in neuroscience, where LTI networks have been successfully applied. Examples include the analysis of "memory traces" that retain input signals, in the context of both cortical microcircuits and whole brain networks [25][26][27]. Importantly, close cousins to LTI systems accurately describe stochastic spiking behavior in many neural networks (as well as other systems with punctate systems such as social and financial networks [28]). These are the linearly interacting point process systems, or Hawkes models [29][30][31][32]; our theory directly applies to these systems as well.
The most extensive theory of control is developed for linear systems or the linearization of a nonlinear system about a fixed point or periodic orbit [11,33]. Optimal control of networked LTI systems with unreliable connections is useful for internet or wireless networks [34]; other applications include rate control in communications networks. Input localization has also been investigated for LTI networks [35]. Linear control may be relevant even when the dynamics are strongly nonlinear, such as in turbulent fluid dynamics; stabilizing control regulates the system towards the equilibrium solution where linearization is increasingly valid [36]. Many nonlinear systems may also be analyzed via the Koopman [37] or Perron-Frobenius operators [38], providing a linear networked dynamical system on a discretized phase space [39]. In fact, networked dynamical systems are beginning to be applied to nonlinear fluid systems [39][40][41]. Gain scheduled control is also used to generalize linear control theory to nonlinear systems [42,43]. Importantly, despite the wealth of knowledge about LTI feedback systems and graph topology, the powerful network descriptions in this work have not been explored previously.
Signal processing through a network. Fig. 1A shows the network system under consideration. A timedependent, scalar input signal u(t) is fed to all the nodes in the network, an assumption later modified and extended to cases where nodes receive weighted inputs and/or when only a subset of nodes receive the input. The nodes are connected via a directed connectivity matrix W . The output y(t) is a uniform sum of all nodal outputs x i (t), so that y(t) = 1 N i x i (t); this will also be extended. The behavior of the network can then be characterized by the signal transfer function between the input u(t) and output y(t). For systems comprised of linear time-invariant (LTI) units, the input-output property of the network can be expressed explicitly with the connectivity matrix W ; the same is true of linearly interacting, stochastic point process models (i.e., multivariate Hawkes processes), for which our methods also apply. The response of a LTI component to an arbitrary input is given by convolution with a temporal filter h(t), i.e. the impulse response. For simplicity, we start with identical dynamics across all nodes. A direct solution (see Supplementary Materials) gives the network transfer function Similarly, a readout y(t) is formed by summing activities across nodes with a weight vector C. Each node in the network has Linear Time-Invariant (LTI) dynamics. Thus, the relationship between y(t) and u(t) is completely described by a network transfer function G(s). (b) Examples of connectivity motifs; in the simplest cases we study, G(s) is determined by the prevalence of chain motifs, given by the motif cumulants j (see text).

Controller Schematic
We restrict to study the case when the dynamics each node is described as a linear time-invariant (LTI) system ; the dynamics of a network system with nonlinear nodes is important but beyond the scope of this chapter (see discussion). A LTI system responds linearly to superimposed inputs and responses, and does so in a manner independent of the time at which the inputs arrive. The dynamics of a LTI system can be completely described by a temporal filter h(t), so that its response to any input is simply given by a convolution with the filter. The temporal filter is also called the impulse response function, and we will use this notation interchangeably throughout this chapter. The impulse response terminology derives from the fact that one can recover h(t) as the node output if a delta function input is given at time 0. Similarly, a readout y(t) is formed by summing activities across nodes with a weight vector C. Each node in the network has Linear Time-Invariant (LTI) dynamics. Thus, the relationship between y(t) and u(t) is completely described by a network transfer function G(s). (b) Examples of connectivity motifs; in the simplest cases we study, G(s) is determined by the prevalence of chain motifs, given by the motif cumulants j (see text).
We restrict to study the case when the dynamics each node is described as a linear time-invariant (LTI) system ; the dynamics of a network system with nonlinear nodes is important but beyond the scope of this chapter (see discussion). A LTI system responds linearly to superimposed inputs and responses, and does so in a manner independent of the time at which the inputs arrive. The dynamics of a LTI system can be completely described by a temporal filter h(t), so that its response to any input is simply given by a convolution with the filter. The temporal filter is also called the impulse response function, and we will use this notation interchangeably throughout this chapter. The impulse response terminology derives from the fact that one can recover h(t) as the node output if a delta function input is given at time 0.

Chains Motifs
FIG. 1. A. Recurrent input-output network with N nodes. An input signal u(t) enters the network according to a weight vector B. An output signal y(t) is formed by summing activities across nodes with a weight vector C. Each node has linear dynamics, and thus y(t) and u(t) are related by a network transfer function G(s). B. Examples of connectivity motifs; in simple cases, G(s) is determined by the prevalence of chain motifs, given by the motif cumulants κj.
Here I is the identity matrix, e = 1 √ N (1, . . . , 1) T . We are interested in how the structure of network connectivity W affects the network transfer function G(s). Insights developed in [10,44] allow us to establish an elegant relationship between G(s) and highly simplified statistical features of W . These features are the connectivity motifs (Fig. 1B), quantified via network statistics we refer to as motif cumulants (see Supplementary Materials). For a connectivity matrix W , motif cumulants quantify the empirical frequency of "pure" motif structures of a given size, κ n for chain of length n, over and above the frequency expected from smaller motifs that form its building blocks.
The smallest network building block is a single link; thus positive motif cumulants of size two indicate overrepresentation of two-link connection patterns compared with what would be seen from random individual connections; similarly for negative motif cumulants and an under-representation of two-link patterns. The same relationship holds for larger motif cumulants, which remove all redundancy from smaller motifs that compose them. Importantly, motif cumulants of order n, containing n connections, can be computed from sampling of the connectivity among up to n + 1 nodes in the network; thus, they are local features of network connectivity, which can be key to quantifying them experimentally [6,45].
Our central result provides a formula for the network transfer function G(s) in terms of motif cumulants. This is powerful, because it characterizes network function by the statistics of a limited number of motifs, which can be accessed empirically. Moreover, it shows that network dynamics are regulated by a feedback hierarchy in terms of the motifs -a fact that produces a new characterization of Erdős-Rényi and other random graph models: Complex networks may be organized by their motif cumulants κj; the first two cumulants are shown. We show two example networks with different motifs. Bar graphs show values of κ1, κ2, κ3, κ4 for each network. B. The motif content determines the strength of each pathway in the feedback hierarchy shown; this relationship is indicated by the green "slider" arrows. Importantly, Erdős-Rényi networks have nonzero values for only the first motif cumulant. Thus, they are completely described by the single feedback loop shown in grey; the same is true for other broad network classes (see text). C. Network transfer functions G(s) for different networks, indicated by matched-color dots in A; in the first column, networks have differing values of κ1; in the second, differing values of κ2. Here, the node filter is hexp. Dashed lines are approximations by keeping leading terms (1 term in the left column and 3 terms for the right, see text); some are indistinguishable from the solid corresponding to actual filters with all terms. Theorem 1.
provided the connection strength is sufficiently small (see Supplementary Materials) so that the series above converges.
This is an exact expression without specific assumptions on the structure of W . Moreover, it also applies to networks with variable connection weights. The motif cumulants are therefore interpreted as statistics for motifs, where each "count" of occurrence is weighted by the product of the strengths of connections it contains. Theorem 1 provides a major simplification of the relationship between network connectivity W and the network transfer function. First, note that only chain motifs appear in Eq. (2). This shows they are the only independent connectivity features that affect G(s). Other types of motifs and connectivity features may indirectly modify G(s), but their effect is fully quantified in terms of their impact on chain motif cumulants.
Moreover, the representation is highly efficient. Because motif cumulants remove all redundancy of their sub-components, they often decay quickly with motif size n [44]. Consequently, keeping only the first few terms of the infinite sum in Eq. (2) provides a good approximation of G(s). This has important practical consequences, as the global network dynamics can then be explained in terms of a few measurable connection statistics.
Motif statistics and the internal feedback hierachy. Theorem 1 shows that each successive motif cumulant contributes dynamics that are equivalent to a successive feedback loop in engineering control (Fig. 2B). As a special case, we obtain a simple and intuitive controltheoretic characterization of the widely studied Erdős-Rényi (ER) random network [7]. In a ER graph each connection is independently chosen to be present with probability p. We show that, as a consequence, only one term will remain in the expression for the network transfer function G(s) for large networks ( Fig. 2A  This simple expression has a clear interpretation of wrapping proportional feedback around a single node, a well studied architecture in control theory [46]. A broader class of networks also have the same chain motif cumulants as an Erdős-Rényi network: that is, κ j = 0 for all j ≥ 2. Thus, for the setup above, these networks again produce filters that are equivalent to proportional feedback around a single node. This holds for any network for which the in-degree or out-degree of each node is identical, such as regular networks, "small world" networks [1], and any other network that has connectivity with a rotational/translational symmetry (see Supplemental Material).
For a general network, higher order motif cumulants κ n may remain significant even if the network is large ( Fig. 2A lower inset). Each motif cumulant gives rise to a unique feedback pathway, which combine to yield the ladder-structured control diagram shown in Fig. 2B. We emphasize that our usage of motif cumulants is essen-tial: by removing redundancy due to shorter component paths, each motif cumulant corresponds to a unique feedback link, instead of appearing at multiple links.
Motif cumulants provide feedback knobs, as in Fig. 2B, to manipulate the input-output properties of a network. To demonstrate this, we generate sparse networks, where only a fraction of all possible connections are nonzero and all connections have the same strength, with motif statistics that lie in different locations of the plane of κ 1 and κ 2 ( Fig. 2A; see Supplemental Material). We fix all other parameters such as the coupling strength (which scales W by a constant) so that the only difference is the graphical structure of W . For concreteness, we set the node filter h(t) to be either an exponential filter h exp (t) (Fig. 2C), or a decaying-oscillatory filter h cos (t) (supplemental Fig. S2).
We investigate the change in the network transfer function for various κ 1 and κ 2 in the frequency domain, using the Bode diagram (Fig. 2C, first two rows) [46], and in the time domain, using the impulse response (bottom row). Increasing κ 1 while κ n≥2 ≈ 0, or equivalently increasing the connection probability in a ER graph, we observe a change in the network transfer function G(s) from a low-pass filter towards an integrator (i.e. G(s) = 1/s). In the time domain, the network impulse response has a slower decay, indicating an increased network time constant.
Next, we change the connectivity W along the κ 2 direction, while fixing κ 1 . This is equivalent to changing the frequency of two-link chain motifs, while keeping the number of connections the same. Fig. 2C shows that this structural change achieves achieve similar input-output dynamics adding more connections to a ER graph. Moreover, including a higher order motif cumulant, κ 2 , introduces new effects in G(s) not present with κ 1 alone. For example, an enhanced frequency of two-link chains (positive κ 2 ) in networks with h exp (t) nodes introduces additional timescales in the network impulse response, which is no longer described by a single exponential ( Fig. 2C third row insets on log magnitude plots). More dramatic effects of κ 2 can be achieved for dense weighted networks, or for sparse networks (see Supplementary Materials).
To confirm the accuracy of our motif cumulant approach, which requires only highly local information about network connectivity, we have plotted the network transfer functions calculated by truncating Eq. (2) in Fig. 2C. We only keep leading motif cumulant terms (see caption) in the expression. The resulting G(s) (dashed lines) closely match calculations using the full connectivity matrix W (solid lines).
We can understand the observed qualitative impact of motif cumulants by analyzing Eq. (2). In some cases, different behavior can be predicted by changes of κ n independent of the form of the node filter h(s). This relies upon classic tools from control theory, where the transfer function is characterized by its poles and zeros in the complex plane (see Supplementary Materials).
Time constant of the network response. A key property of a network's response to an input stimulus is the timescale over which past inputs are retained. Next, we develop an explicit link between the motif statistics of a network with the time constant of its transfer function G(s). This reveals how the prevalence of localized connection structures can modulate the response timescale of a network as a whole.
We quantify the timescale of a filter via the "frequencycutoff" time constant (see Supplementary Materials). This time constant can be precisely linked to motif cumulants using the resumming formula Eq. (2): Theorem 2. Consider a network with a node filter h(s) that decreases asymptotically as 1/s g (g > 0) for large s, with a time constant τ . Then the time constant of the network transfer function G(s) is (f (τ g )) 1 g , where f (·) is a function with the same form as formula Eq. (2), that is, This formula explains a phenomenon shown in Fig. 2C: both κ 1 and κ 2 can have large impacts on timescale. Over range of different networks, this timescale can vary over several orders of magnitude, while still being well predicted by a few low order cumulants characterizing local network connectivity.
Generalizations and a real-world application. Intriguingly, the same results -or highly tractable modifications -also hold in networks with random input and output weights. Instead of feeding input uniformly to every node, we consider weights B i , so that node i receives B i u(t). Similarly, we form the output as a weighted sum y(t) = 1 N i C i y i (t). First, consider the case when the entries B i , C j , for all i, j = 1, . . . , N are independent and identically distributed (i.i.d.). The result of Theorem 1 holds in the limit of large networks, with a scale factor of mean(B i ) × mean(C j ) (Thm. 13.1 in Supplemental Materials). Moreover, for any network size, there is a computable bound on the confidence interval into which a given transfer function will fall (Fig. 3A).
Next, consider the case when B i and C i are correlated for each i = 1, . . . , N while B i and C j are still independent for i = j. Such a correlation structure can be motivated in the context of neuroscience, where basic plasticity mechanisms might lead more-active cells to both receive stronger inputs and more strongly influence cells downstream. Moreover, in many networks nodes and their branches have different physical sizes; if connections are formed randomly over space, a large cell would then be simultaneously more likely to receive the input and to project its output. The average network transfer function for this case is described by a similar motif (1) ...
in (2) out (2) (1) in The links with motif cumulants in Fig. 11 illustrates the role of cycle and chain motifs in shaping the transfer function. Note the same chain motifs now have di↵erent impact from the diagram in Fig. 3. . Figure 12: Functionally equivalent diagram of E {G} in the case of correlated input output weights with ✓ = 0 (see text). The network transfer function produced with a graph W is the same as that produced with this flow diagram. Each chain and cycle motif cumulant gives rise to a feedback link in the flow diagram. The number on certain links indicates a constant filter of that strength (related to the numerator in Eq.(17)).
We demonstrate the possible impacts from cycle motifs on the network transfer function with the following numerical example. We generate a few W (of same size) with various level of  c 2 , spanning from -0.85 to 0.9 while other motif cumulants are essentially 0 (more details in Appendix I). Because ⇢ and 2 are simply multiplied constants in Eq. (17), we set them as ⇢ 2 = 1/N and E {G(s)} for each W in di↵erent colors in Fig. 13. The most significant impact happens with negative  c 2 . For the exponential filter, positive  c 2 tends to increase the time constant (decrease the cut-o↵ frequency in the Bode plot), the opposite is true for negative  c 2 . The resumming formula (17) can be used to calculate E {G(s)} only based on the value of  c 2 . The approximations are actuate for relative small coupling strengths and the e↵ect of  c 2 appears to be similar but smaller in scale (data not shown) than what seen in Fig. 13 where a strong coupling strength is used for illustration purpose. However, at large coupling strength, the spectral radius condition guaranteeing the convergence of the motif cumulants series in Eq. (17) stops to be satisfied. Extending

Multi-population
A C D B Figure 15: Network transfer functions for the connectivity of thalamic and cortical areas in the mouse brain. Here, we use an exponential filter h exp (s) for all nodes (↵ = 1/100). We consider the network transfer function G(s) when a input is fed uniformly to the 11 thalamic areas and a output is formed by reading out uniformly from all areas. A coupling strength is chosen at 90% of the level of the maximum value that keeps the system stable. A. We plot the magnitude of chain motif cumulants | n  n | against the order n (y-axis is in log-scale but tick mark values are labeled in linear scale). The sign of  n is indicated with blue (+) or red (-) markers. The motivation and value of the constant is given in Appendix I. Note that these cumulants are computed by treating all nodes as belonging to a single population; a more complex but more accurate approach would be to consider subpopulation cumulants, as in Section 7 and Appendix IX. B. The G(s) calculated based on original W is plotted with red solid lines. We also calculate the transfer function that results from an Erdős-Rényi shu✏ed W (blue dash lines, see text), or equivalently keeping only first order motif cumulants in Eq. (22). The numerical value of the time constant and an analytical value based on Theorem 5.1 is also shown. C. As we include more motif cumulant terms in Eq. (22), the time constant of the approximate network transfer function G(s) (blue dots and line) approaches that calculated from the original connectivity W (black dashed horizontal line).

A B
FIG. 4. A. Mouse brain mesoscale connectivity between 213 brain regions, organized into 8 anatomical groups; red connections originate from one of 11 primary thalamic nuclei (those receiving the stimulus in our model); remaining connections are shaded proportional to connection strength, from [47]. B. Red line indicates the network impulse response, which exhibits multiple timescales. A sequence of blue lines depict successive (improving) approximations to this response computed by considering additional motif cumulants. Inset: When the network is modified by either performing a node-degree preserving shuffle (100 samples plotted in lightblue), or by setting every connection to a strength equal to the mean of the original log-normal weight distribution (reddashed line), the long-time memory capacity of the network is diminished.
cumulant expression. This expression again corresponds to a feedback diagram (Fig. 3B), but now involving additionally the cycle motif cumulants κ c n (see Supplementary Materials). These cycle motifs play distinct roles in shaping the transfer function, and modulate the impact of chain motifs (compare with Fig. 2B).
As is of increasing interest in biology and other fields, many networks consist of different node (or cell) types, each having a distinct pattern of connectivity and dynamics. Our theory generalizes to capture such cases with multiple node types. Here, the network transfer function is still given by a feedback hierarchy, but now with more branches (e.g., Fig. 3D for 2 node types). Importantly, motif cumulants can still be defined to reflect node type identities in addition to connection structure.
We applied this result to recent data on the connectiv-ity of the mouse brain (Fig. 4A). The resulting network is complex, with significant motif cumulants of many orders (Supplemental Fig. S11). Moreover, this structure extends the time over which the network remembers input signals, and introduces multiple timescales into this process (Fig. 4B, note nonlinear decay on log plot of impulse response (inset)).
Discussion. Above, we have highlighted a set of of relationships between network structure and feedback pathways for signal filtering that follow from the simplest definition of motif cumulants. We have also derived novel, alternative definitions of motif cumulants that more directly account for degree heterogeneity among nodes (see Supplementary Materials). These also lead to descriptions of network response in terms of a feedback diagram. While the resulting pathways are slightly more complicated, they can also be more efficient, giving more accurate descriptions of the network response based on lower order statistics of its connectivity. Moreover, there are cases when the original feedback description does not converge, but our alternate motif cumulants do give a convergent description. This said, there are fundamental advances that remain for future work. The dynamics we describe here assume linear input output properties at each node. While this is sufficient for describing some network functions [25][26][27], extending the motif cumulant-based approach to nonlinear nodes -perhaps beginning with quadratic and moving to higher-order filters -is an exciting possibility.
A major contemporary challenge is to link the connectivity structure of networks to their dynamic function. Here, we have shown one way that this link can be made. Highly localized features of connectivity predict the global input-output dynamics of a network through a tractable, feedback architecture. Our approach using motif cumulants and network transfer functions is complementary to the richly developed spectral graph theory [48]: indeed, the eigenvalues of a graph and motifs can be concretely related [49].
The result of the present work, at the interface of net-work science and systems engineering, has clear practical and scientific implications. For one, localized connectivity, quantified through motifs, is relatively easy to sample and compare among systems [5,6,45]. Even more exciting, in neural systems this localized connectivity is under the control of learning and plasticity mechanisms [50,51]. Thus, our work may open the door to new analyses of the natural learning and adaption of network function.

Signal input and output in complex recurrent networks: setup
For convenience, here we restate the steps and notation for the general case of weighted input/output. We consider a network system consisting of N recurrently connected nodes (Fig. 1B), which are connected via a directed connectivity matrix W . A scalar-valued, time-dependent input signal u(t), is fed to the network according to a weight vector B. That is, each node i receives an external input of B i u(t). We gather an output y(t) by a linear combination of unit outputs x i (t) according to the vector C, so that y(t) = i C i x i (t). The signal processing function of the network can then be characterized as the relationship between the input u(t) and output y(t). The dynamics of each node is completely described by a temporal filter h(t).
A similar set of equations can also be applied to describe linearly interacting point process models (Hawkes process [Hawkes, 1971a,b]). In particular, this model has been used to model the dynamics of spiking neurons . Let λ i (t) be the (stochastic) instantaneous firing rate of neuron i. Spike trains (point process) S i (t) are generated by an inhomogeneous Poisson process according to the rate λ i (t). We assume that λ i (t) is always or for most of the time positive, so that it serves as a legitimate Poisson rate. Neurons interact with each other through the spike trains filtered by a nodal kernel h(t) (S1) Here λ 0 is a constant baseline firing rate. In absence of signal u(t), the steady state firing rateλ i satisfiesλ where h 0 = ∞ 0 h(t)dt. Now we consider how the average firing rate changes over time when there is a time dependent signal u(t). Let ∆λ i (t) = E {λ i (t)} −λ i , where the expectation is taken across trials while fixing the signal u(t). For example, This is the same set of equations for a network of linear time invariant (LTI) nodes.

Matrix expression of the network transfer function Eq. (1)
The dynamics at each LTI node can be written as As a result of having LTI nodes, it is easy to verify that the entire network as a whole is also a LTI system. In fact, we can derive the filter of the network, denoted thereafter as G(t), explicitly. This is accomplished via the Laplace transform , which allows us to rewrite the convolution conveniently as multiplication. Combining outputs of nodes together as a vector x(t) = (x 1 (t), . . . , x N (t)) T , we have the following equation in matrix form: Here we overload the notation of h(·) as both the temporal filter h(t) and the Laplace transform h(s), and similarly for other variables and throughout the manuscript. The Laplace transform of a temporal filter offers an equivalent description of the LTI system in the frequency domain. Solving the system of Eq. (S5) gives the network transfer function Here I is the identity matrix. To obtain Eq. (1) in the main text, simply let B, C = e = 1 √ N (1, . . . , 1) T .

Formulae of network motif cumulants
Motif cumulants quantify the frequency of "pure" motif structures of a given size, over and above the frequency expected from smaller motifs that form its building blocks. A positive motif cumulant indicates an over-representation of a certain motif, whereas a negative motif cumulant indicates an under-representation. Motif cumulants are intimately related to simpler network statistics, which we refer to as motif moments. Motif moments are defined by simply counting the number of occurrences of a motif in network defined by W , normalized by the number in a complete (i.e., completely connected) graph. For example, the motif moment of length n chains (n consecutive connections among nodes , the motif cumulant of W for length n chains, κ n , can be (recursively) defined via Here C(n) is the set of all compositions (ordered partitions) of n.
Besides the combinatorial definition, motif cumulants can also be calculated using a matrix expression , We emphasize that Formula (2) is an exact expression without specific assumptions on the structure of W . Moreover, Theorem 1 and the definition of motif cumulants Eq. (S7) also apply to networks with non-uniform connection weights that vary from one link to another. The motif cumulants are therefore interpreted as statistics for motifs, where each "count" of occurrence is weighted by the product of the strength of connections it contains. A proof of this theorem is given in Section 5.

Network generation methods
We use two classes of random networks in our numerical examples, as described in detail below. The first is the class of "sparse" networks with the majority of entries in the connection matrix W being 0; these are generated according to the Erdős-Rényi and SONET (second order network) models. The second class is dense networks, with most entries in W being non-zero and taking continuous values; these networks are generated via Gaussian random matrices. The Erdős-Rényi network is generated by simply drawing each connection independently as a Bernoulli random variable with connection probability p. We next give the details by which the other networks are generated.

Generating sparse networks with the SONET graph model
We use the SONET model of random graphs, together with code provided by the authors of [Zhao et al., 2011], to generate sparse networks with different motif statistics. As an extension of the Erdős-Rényi model, the algorithm generates a W with binary entries, with a given connection probability and approximately specified second order motif cumulants (for converging, chain, diverging and reciprocal connection motifs).

Generating Gaussian networks with chain, converging, and diverging motifs
We first consider a network with Gaussian distributed entries, W ij = a i + b j + c ij , where a i , b j and c ij are all Gaussian variables with zero means. Furthermore, assume that all of these variables are independent, except for pairs (a i , b i ), i = 1, · · · , N . Then, it is easy to verify that cov(W ij , W jk ) = cov(a j , b j ), cov(W ij , W ik ) = var(a i ) and cov(W ik , W jk ) = var(b k ). By considering the corresponding indices one sees that these covariances correspond to the excess probability, or cumulants, of length-two chain, converging and diverging motifs for By adjusting the variance and covariance of a i , b i , we can therefore achieve various values of motif cumulants. One can show that the resulting motif cumulants must satisfy the following inequality constraints Here σ 2 is the variance of the entries of W (except for entries on the diagonal).

Generating networks with different cycle motif cumulants κ c 2
We achieve various values for the cycle motif cumulant κ c 2 via another model of Gaussian random matrices, with an adjustable level of symmetry in the matrix entries. First, we point out that κ c 2 is directly related to the correlation coefficient ρ reci of entries in the connection matrix that correspond to reciprocal connections, such as W ij and W ij . In particular, it can be shown that for large networks generated with Gaussian entries (assuming no correlations except for between reciprocal entries), κ c 2 ≈ σ 2 ρ reci , where σ 2 is the variance of W ij . The argument is as follows. Below, we assume that the network size is large and replace the sum of large number of (nearly) independent variables by its expected value. One can show that Finally, we can readily construct Gaussian random matrices with arbitrary levels of ρ reci , while keeping all other correlations among entries of W equal to 0. To do this, we generate a W matrix as a weighted sum of a symmetric or anti-symmetric Gaussian matrix and an independent Gaussian matrix, with special treatment for the diagonal entries (to keep their variance the same as for other entries). This method allows to one achieve all possible range of ρ reci ([−1, 1]).

Connection strength and stability condition for the network system
For convenience, we describe the connection matrix in the main text up to a positive constant a that determines the overall magnitude of the connection strength. For example, we may refer to W as an Erdős-Rényi network with connection probability p, but the actual connection matrix is 1 N W . This constant is not written explicitly, but is assumed to be absorbed into W .
The constants in numerical examples are often chosen based on the largest possible connection strength that will keep the network system stable. This largest value is determined by W and h(s), and can be efficiently computed using a semi-analytic method that we describe next. The exact stability condition for any LTI system x(s) = G(s)u(s) is that there is no pole on the right-half-plane of complex s values. For our model, G(s) = (I − ah(s)W ) −1 , and the condition on the poles can be translated into a condition based on the eigenvalues of W and on a region in the complex plane defined by h(s). The poles of G(s) satisfies where λ i is the eigenvalue of W . If we define a region in the complex plane then the stability condition is equivalent to requiring that the point cloud of eigenvalues of W scaled by a does not fall in to Ω.
For the h exp (s) and h cos (s) functions we use (Section 17), this region Ω can be determined In particular the boundary has a singular and right-most point at 2α. These characterizations make it easy to calculate the critical a for stability.

Proof of the motif cumulant expression for the network transfer function
We first restate Theorem 1 with a precise condition of "sufficiently small" connection strength.
Theorem 5.1. A network transfer function G(s) described by Eq. (1) can be written as provided the connection strength is sufficiently small so that the series above converges (the condition for this being |h(s)| ρ(ΘW Θ) < 1, Θ = I − ee T ).
The proof will essentially follow the same approach used in the proof in Section 5 of ; the version here is somewhat simpler.
Proof. Starting from Eq. (1), we expand the matrix inverse as a power series. (S14) Substituting for µ n with the decomposition Eq. (S7) gives Switching the order of the summations by first enumerating the number of components t in the ordered partition of n, we have Note that the sum over n 1 , · · · , n t and the product summand can be factorized, which yields an identical factor for every n i : Finally, summing the geometric series yields Eq. (2),

Proof of the limit of motif cumulants in large Erdős-Rényi networks
In the main text, we claimed the following result.
Proof. By definition κ 1 = 1 For n ≥ 2, using the matrix expression of motif cumulants given in , 8 Here e = (1, · · · , 1) T / √ N , and W F = i,j W 2 ij is the Frobenius norm. First, we show a bound on the second factor W F /N in (S15). By the law of large numbers, for any positive δ, is satisfied with probability approaching 1 as N → ∞; here, we used that W 2 ij = W ij since entries of the connection matrix are 0 or 1. Choosing a fixed value of δ such as δ = p, we have with probability approaching 1 as N → 1.
To finish the proof, we will use the following result to bound the first factor in (S15).
Lemma 6.2. For some absolute constant C, the probability that the following inequality holds approaches 1, as Proof. We split the norm into two terms, The bound of the first term is a typical result in random matrix theory, for which we will rely heavily on the reference [Vershynin, 2010]. Note that W − pN ee T has Bernoulli distributed i.i.d. entries. According to Lemma 5.24 in [Vershynin, 2010], the rows of this matrix are independent sub-gaussian isotropic random vectors (Definitions 5.22 and 5.19 in [Vershynin, 2010]) with sub-gaussian norm bounded by some absolute constant K. We can then apply Theorem 5.39 in [Vershynin, 2010] about matrices with sub-gaussian rows to W − pN ee T , with for example t = √ N . The theorem shows is satisfied with probability of at least 1 − 2 exp(−cN ). Here C and c are constants only depending on the sub-gaussian norm bound K and thus are also absolute constants. As N → ∞, Eq. (S20) holds with at least probability 1 − 2 exp(−cN ) → 1. For the second term in (S19), let µ i = 1/N j W ij and the vector ∆µ = (µ 1 − p, µ 2 − p, . . . , µ N − p) T . Then the relevant matrix can be written as which is a rank-1 matrix. Therefore its two norm is 9 The norm of the vector is a sum of i.i.d variables (µ i − p) 2 . It is straightforward to calculate the mean and variance of this norm, and we will then use the Chebyshev inequality to give a bound. For ease of notation, let W j = W ij (since i is fixed) and ∆W j = W j − p. We have and Here m 4 = p − 4p 2 + 6p 3 − 3p 4 is a constant.
Applying the Chebyshev inequality to the variable ∆µ 2 2 , which has mean p(1 − p) and variance less than This shows that the probability so that this probability goes to 1 as N → ∞. Finally, combining the results from (S20) and (S26) with Eq. (S19) and using the "union bound" (P (A ∪ B) ≤ P (A) + P (B)), we have with probability approaching 1 as N → ∞, for some absolute constant C.
Using Eq. (S16) and (S18) (along with a choice of the constant δ), we see that the inequality holds with probability approaching 1, as N → ∞, and for some positive constant C (independent of n and N ). Finally, as C 1 √ N n−1 → 0 as N → ∞, the above indicates that κ n≥2 → 0 in probability.
7 Vanishing chain motif cumulants for "rotationally invariant" networks and others with uniform in-or out-degrees Networks with uniform in-degrees (that is, with the weighted sum of all incoming connections being the same for each node) or with uniform out-degrees are frequently described in neuroscience, physics, and in network science overall. Here, we will show that such networks have an interesting property referred to in the main text: their chain motif cumulants κ n = 0 for all n ≥ 2, as for the case of Erdős-Rényi networks in the large N limit. Thus, using Theorem 1, the network transfer function G(s) for such networks is equivalent to proportional feedback on a single node. One way that networks with uniform degrees arise is through rotationally invariant connectivity structures. We describe this case first, and then show how the results generalize. Rotationally invariant networks are characterized by defining a circular space variable x and having the connectivity between two nodes depend on their spatial distance.
Let's consider a network with rotationally invariant connection strengths W ij = w(j − i) and periodic boundary conditions w(i) = w(i + N ). The average connection weight is w = 1/N N k=1 w(k). The second order chain moment is: Thus a rotationally invariant network has the same second order chain motif moment as a uniform network with W ij ≡w ∀ i, j.
The n-th order (n ≥ 2) chain moment is: Using the decomposition relation between µ n and κ n (S7), we conclude that κ n≥2 = 0 for rotationally invariant networks. The essential part of the proof was (S36), in which the sum corresponding to the end of the chain was be factored out and summed, yielding the same value Nw for each i n−1 . This reduces the length of the chain by 1. In general, this step is possible as long as all the nodes in the network have the same (weighted) in-degree; this is the case of uniform in-degree.
Note that we can perform a similar reduction at the beginning of the product in (S36) instead of at the end, via Therefore, the same conclusion will follow if the network has uniform out-degree instead. Networks having uniform in-or out-degree include both regular networks , and rewired regular networks. Importantly, the latter is a popular model for small world networks, which have uniform out-degrees in a very common formulation ].

Proportional feedback and κ 1
Consider adding a global feedback around a general network (i.e., not restricted to the Erdős-Rényi case), by sending a proportion of the network output, N ∆κ 1 y(t), back to combine with its input u(t).
One can directly verify that the new network transfer function for a network with such additional "global" proportional feedback is  in (2) out (2) (1) in f how statistical connectivity motifs give rise to function in comon two disciplines of research: (i) novel advances in the statistical ntify statistical network motifs efficiently, avoiding redundancy [10] to describe the input-output behavior of networks with sigprove that the signal transfer properties of large recurrent nety by the statistical prevalence of chain motifs of different lengths function h(s): (1) Figure S1: Erdős-Rényi network transfer function is equivalent to proportional feedback on a single node.
This shows that the effect of such a global term, at the level of motif cumulants, is simply to shift κ 1 while keeping all κ n≥2 the same.
To develop the corresponding stability condition, we imagine that there is an additional, N + 1 st node that functions as the global feedback. The node will have a constant filter and receives input from, and provides output to, all nodes in the original network. The stability of this N + 1 node system is determined by studying for what s ∈ C, the following matrix is invertible Using the Schur complement, this matrix is invertible if and only if I − h(W + ∆κ 1 N ee T ) is invertible. The latter matrix corresponds to a N node network with a connectivity matrix formed by adding ∆κ 1 to all entries of W . This shows that, at the level of stability conditions, global proportional feedback is also equivalent to shifting all W ij by a constant. The above argument shows that the system with a such a feedback is equivalent to another network with κ 1 shifted to κ 1 + ∆κ 1 and all other κ n unchanged. It is also equivalent to a network in which ∆κ 1 has been added to each W ij . The equivalence is both in the sense of producing the same network transfer G(s), and in terms of the conditions for the system to be stable. We will use these facts below in Section 9.
9 Network examples with stronger impact of two-link chain motifs κ 2

Additional node filter h cos (t)
In the main text Fig. 2C, we set the filter for each node to be the exponential h exp (t), in order to demonstrate the effect of motif cumulants κ n . The same changes in connectivity will result in different effects on the network transfer function if a different node filter is used.
To explore this, we show results in Fig. S2 for the case where the node filter is the decayingoscillatory filter h cos (t) = e −αt cos(νt)H(t) (see Section 17). Such a filter captures oscillatory dynamics at each node, and has been used, for example, to describe neurons in the nearperiodic firing regime Pernice et al. [2012]. In Fig. S2, increasing the connection probability κ 1 (left column) enhances the resonant peak that is present in h cos (t). This is reflected in the time domain by larger amplitude oscillations. On the second column, we hold the connection probability κ 1 fixed while increasing the chain motif cumulant κ 2 . Interestingly, this achieves a similar effect on the network: increased oscillations in the impulse response.
...   (Fig. 2), but show stronger effects. To achieve this, we aim to vary the motif cumulant κ 2 while leaving the other motif cumulants as close to zero as possible. One way to do this is to generate Gaussian networks as described in Section 3. In this case, κ 1 = 0 as the entries have zero mean; and higher order cumulants κ n≥3 = 0 thanks to basic properties of Gaussian random variables. The numerical effect of κ 2 for these Gaussian networks is similar to the examples discussed below (data not shown). For example, in Fig. 3A in the main text, we use Gaussian networks with κ 2 < 0. Note that Gaussian networks are densely connected, in the sense that nodes are all to all connected with continuously distributed connection strengths. We discuss in below how to achieve similar effects for sparsely connected networks where the connection strength is either 0 or a fixed constant. Unlike the Gaussian network case, the analysis based on motif cumulants for these sparse networks requires a revised theory described in Section 9.3.
We accomplish this in two steps. First, to set κ n≥3 close to zero, we use the SONET (second order network) model of random graphs [Zhao et al., 2011]. This algorithm generalizes the Erdős-Rényi graph by allowing non-zero second order motif cumulants and produces a sparse binary W . Next, to set κ 1 = 0, we apply a global feedback of −N κ 1 y(t). As explained in Section 8, this will set the effective κ 1 = 0 while keeping all other κ n the same, and is equivalent to adding a constant of −κ 1 to all W ij . We note that our procedure of isolating κ 2 simplifies the situation conceptually and helps achieve a larger numerical effect on network transfer functions. However, the procedure is not strictly necessary for the qualitative effects or analysis described below. Figures S3 and S4 show the impact of varying κ 2 . First, the middle columns shows the case with all κ n very close to 0: here, we use an Erdős-Rényi network and apply global feedback to shift κ 1 to 0 (Section 8). We therefore recover the original node filter h(s). Next, the right columns show that increasing κ 2 to positive values has a roughly similar effect on network transfer functions as increasing κ 1 , for both of the filters h exp (s) and h cos (s). Intriguingly, however, the left columns show that networks with negative κ 2 -fewer chain motifs -produce qualitative changes in the network transfer functions. For exponential node filters h exp (s), a frequency peak is generated the Bode plot for response magnitudes. For the decaying-oscillatory filter h cos (s), the original resonant peak splits into two peaks. As κ 2 becomes more negative still, the peak seen for h exp (s) increases its magnitude, and the twin peaks for h cos (s) become more separated (data not shown). The effects of chain motif cumulants are also reflected in the impulse response functions. In particular, note the emergence of a negative response window for h exp (s) node filters, and irregular-looking oscillations in the impulse response for h cos (s). Amplitude Figure S3: Effect of the length-2 chain motif cumulant κ 2 on shaping the network transfer function G(s), for networks with node filter h exp (s). The networks W 's are generated from the SONET random graph model (Section 3.1). Network size is N = 1000, connection probability κ 1 = 0.1, and κ 2 /κ 2 1 = −0.6, 0, 0.1 respectively for the three columns from left to right. Higher order motif cumulants κ n≥3 are small (see Section 17). To emphasize the effect of κ 2 , a global (negative) feedback is applied to all three networks to shift κ 1 to 0 without changing higher order motif cumulants. The blue solid lines are calculated by directly solving the system Eq. (1) using entire connectivity matrix W . The red dashed lines are calculated using only the first two degree-corrected motif cumulants κ BC Amplitude Figure S4: Same as Fig. S3, but for networks with node filter h cos (s).

Approximations of G(s) using motif cumulants in sparse networks
In Fig. 2 in the main text, and many other cases, we have shown that truncating the motif cumulant expression in Eq. (2) after terms for small (low-order) cumulants results in a good approximation of the network transfer function G(s). Such approximations also work well for Gaussian networks. However, for the sparse network examples in Fig. S3 and S4 in which we demonstrated strong effects of 2-length chain motifs (κ 2 ) effects by shifting κ 1 to 0, the situation is more complicated.
Here the SONET networks have an effective κ 1 = 0 through applying a global feedback (Section 8), and κ 2 is the most significant with κ n≥3 being small. However, a naive truncation of Eq. (2) after second order motif terms will generate an approximation of G(s) with significant error (data not shown). This is because we have chosen a very strong coupling strength near the point at which one of the simulated networks would become unstable (the allowed strength is larger than the case when κ 1 = 0); this leads to increased contributions from higher order motifs. For example, for κ 2 > 0, a truncation keeping motifs up to 5th order agrees very well with the actual G(s) (data not shown). However, for the case of κ 2 < 0, keeping more terms corresponding to higher-order motif cumulants in Eq. (2) will not improve the approximation and can even make it worse! The problem is that the condition in Theorem 1 about the spectral radius is not satisfied, and the infinite series of motif cumulants in the denominator of Eq. (2) does not converge.
To solve this problem, we introduce a modified version of motif cumulants, denoted by κ BC n . These terms automatically account for the different levels of influence that each node in the network exerts (e.g. quantified by their in-and out-degrees), but without requiring direct knowledge of this degree information a priori. By incorporating the associated heterogeneity, the series in terms of the κ BC n can converge even in cases where the corresponding κ n series diverges. We make use of the κ BC n to obtain approximations of the network transfer function G(s) by drawing on two important facts. First, G(s) can be expressed in terms of κ BC n in a manner similar to Theorem 1 of the main text (Eq. (2)) but with a different formula. Secondly, and importantly, the κ BC n can be expressed in terms of the original κ n through algebraic combinations. This is an extremely useful property, as it implies that the κ BC n are defined via local network subgraphs, as for the κ n (here involving n + 2, vs. n, connections; see Section 10).
The modified motif cumulants κ BC n , while slightly more complex than the original κ n , can produce highly accurate approximations for G(s) when the corresponding series is truncated to use only κ BC 1 and κ BC 2 -even when this approach fails for the original κ n . This is illustrated via the red dashed lines in Fig. S3 and S4. Details on definitions, motivation, and derivations of the associated formulae for κ BC n are given in Section 10.
10 Degree-corrected motif cumulants κ BC n 10.1 Motif cumulant resumming for arbitrary weights B, C We will point out the following fact without further detail, as its careful exposition will be the subject of future work. It is possible to derive a resumming formula for Eq. (1) analogous to Eq. (2) for arbitrary weights B, C. Let N BC = C T B, Θ BC := I − 1 N BC BC T . Then the network transfer function defined in Eq. (1) can be rewritten as (S41) Here we use notation G BC (s) instead of G(s) to signify the use of weights B, C. Unless stated otherwise, G(s) below means the uniform weight case where B = C = e. The "motif moments" µ BC n and "motif cumulants" κ BC n are defined as Moreover, the same decomposition relation Eq. (S7) also holds for µ BC n and κ BC n .
10.2 Relating G BC (s), κ BC n and G(s), κ n A special case of the generalized resumming formula Eq. (S41) is to choose B, C as in-and out-degrees of W , that is, Here l = (1, · · · , 1) T . This choice of B, C is motivated by a few observations and heuristic arguments about eliminating the dominant eigenvalue of W in ΘW Θ, based on the degree vector approximating the Perron-Frobenius vector for W . This is partially discussed in , and again we will save exposition of further details for future work.
To signify this special choice of B, C, we will use the notation G deg (s), µ deg n , κ deg n and N deg as a special case for G BC (s), µ BC n , κ BC n and N BC . For general W , the network transfer function G deg (s) given by Eq. (S41) will be different from G(s), the latter being based on uniform input and output weights. Nonetheless, these are related: Therefore The next step is to write G deg (s) in terms of the original κ n . Given Eq. (S41), this boils down to expressing κ deg n in terms of κ n . The basic relationship is through the motif moments µ deg n and µ n , Eq. (S45) has an intuitive explanation: µ deg n is probability of length n chains when each count is weighted by the the out-degree of the "sending" node times the in-degree of the "receiving" node.
Using Eq. (S45), we can derive the needed relationship among motif cumulants. For example, Using these expressions for κ deg n together with Eq. (S41) and (S44), we can express G(s) in terms of κ n . Note that this expression will appear quite different from that of Eq. (2). However, the two are equivalent, in the sense that if one expands as an infinite series in powers of h(s), the coefficients containing κ n should be the same. This can be shown by noting that G(s) is an analytic function of h(s) and the two expressions both exist and agree for h(s) of sufficiently small magnitude.
The real difference (and advantage) of Eq. (S44) can be understood as a re-ordering of the terms in an infinite series. The re-ordered series may have an different convergence region (in terms of h(s)), and differ in value for finite truncations. This latter property is what the leads to the improved approximations of the network transfer function based on the statistics of small motifs alone.

Feedback diagram representation of network transfer function using degree-corrected motif cumulants
Similarly to the case for the standard motif cumulants (Fig. 2B) we can represent the network transfer function using a feedback diagram in terms of the degree-corrected motif cumulants.
To do this, we consider Eq. (S44) (plugging in Eq. (S41)), and note that this is equivalent to the diagram (Fig. S5). 11 Poles of G(s) and κ n Intriguingly, we can understand and predict the qualitative impact of motif cumulants by analyzing Eq. (2). In some cases, the conclusions are even independent of the form of the node filter h(s). This relies upon classic tools from control theory, where the properties of a transfer function are linked to its poles on the complex plane. These poles are complex values of s where the transfer function fails to exist, typically at locations where the denominator becomes zero.
Given the form of Eq. (1), the poles of G(s) can be determined by studying the roots of the denominator. Assuming that the first m of κ n are significant while κ n>m ≈ 0, the set of poles of G(s) is Here h −1 (·) is the (complex) inverse function of h(s). For the two filters h exp (s), h cos (s) we use (Section 17), h −1 (s) can be well characterized analytically. Moreover, if we idealize the networks above so that only κ 1 and/or κ 2 are nonzero, P (z) is a first or second order polynomial, which is readily solvable. Taking these facts together, one can explain the qualitative changes seen in the network transfer function G(s) when κ 1 and κ 2 are varied. This includes the slower temporal decay seen with positive κ 1 in the case of h exp (s), the split of the resonant peak with negative κ 2 in the case of h cos (s), and so on. Details of the analysis are given below. The number of poles of G(s) is determined by the number of roots of P (z) and #h −1 (z). Here #h −1 (z) is the number of preimages of z under the mapping h(s). It can be shown that #h −1 exp (z) ≡ 1 for all z ∈ C and #h −1 cos (z) ≡ 2 for all z ∈ C\{− 1 2ν , 1 2ν }. Therefore, when using these node transfer functions, for most cases, the number of poles for the network transfer will be determined by the number of roots of P (z). This number is the degree m by the fundamental theorem of algebra. In particular, we conclude that the numbers of poles of G(s) are 1 for h exp and 2 for h cos in the case of κ 1 = 0, κ n≥2 = 0; and are 2 for h exp and 4 for h cos in the case of κ 2 = 0, κ n≥3 = 0. This difference in the number of poles as the network connectivity changes reflects the qualitative changes that we see in the network transfer functions G(s) (Fig. S3 and S4).
To make this more precise, we study the location of the poles. For the exponential filter h exp , there is one (simple) pole −α on the real line. The decaying-oscillatory filter h cos has a pair of complex conjugate poles −α ± νi. In general, a pair of complex conjugate poles will give rise to a peak in the frequency domain and real poles correspond to exponential decay. As we have seen in these two examples, the magnitude of the real part of the poles determines the speed of decay in time domain, or the width of the frequency peak. A last fact that we will use is the symmetry of h(s), that is h(s) = h(s).
For the case of κ 1 = 0, κ n≥2 = 0, P (z) always has one real root 1 N κ 1 . By the symmetry under conjugacy, this real root is mapped to one real pole under h −1 exp (z) and to a pair of conjugate poles under h −1 cos (z). As κ 1 increases (decreases), the real parts of those poles, N κ 1 − α and N κ 1 /2 − α respectively for the two filters, increase (decrease) and result in the changes of time constant (h exp (s)) or peak width (h cos (s)).
For the other case where κ 2 = 0, κ 1 = κ n≥3 = 0, P (z) can either have two real roots or a pair of complex conjugate roots, depending on the sign of κ 2 . When κ 2 > 0, each of the real roots is similarly mapped under h −1 (z) as in the previous case. The poles corresponding to one of the two real roots dominates the effect on the time constant or peak width. When κ 2 < 0, the complex root pair is mapped to two complex poles under h −1 exp (z), which generate the observed oscillation in the impulse response. Under h −1 cos (z), each complex root is mapped to two non-conjugate complex poles with different imaginary parts, while the image of the other root is exactly the conjugate of these poles. All together, the complex root pair of P (z) are mapped into four poles, occurring as two conjugate pairs. The fact that these pole pairs have different imaginary parts explains the observation of two distinct frequency peaks.
The arguments above show how the roots of P (z) determine qualitative properties of the network transfer function G(s). This is interesting, because P (z) is determined completely by the network's motif statistics, independent of the nodal dynamics h(s). For example, in Fig. S3 and S4, as κ 2 becomes negative, the roots of P (z) switch from two real ones into a complex pair. Correspondingly, G(s) seems to undergo a type of "bifurcation" in the case of both h(s) functions. We suggest that although the specific changes in G(s) depend on details h(s), the onset of the transition is often determined by alone P (z), which is also a form of "generating function" for motif cumulants.
12 Motif cumulants and the time constant of system response

Definition of the time constant
One definition of the time constant for a general filter is based on the notion of cut-off frequency. Specifically, in the Bode magnitude plot, we draw an asymptotic line at the level of "baseline" gain (i.e., the magnitude at very low frequency), and another asymptotic line that follows the magnitude plot at high frequency. For physical systems, the Bode magnitude will always eventually decay and roll-off at high frequencies. The x-coordinate of the intersection of the two asymptotes defines the cut-off frequency. Intuitively, this is the frequency where the transition between a sustained response vs. a strongly damped response occurs. The time constant can in turn be defined as the reciprocal of the cut-off frequency.
There are a variety of other characteristic frequencies and timescales that may be defined in control theory . The above definition is consistent with the simplest notion of a time constant in terms of the the temporal decay of a filter. Taking the exponential filter h exp (t) = e −αt , t ≥ 0 as an example, it is easy to verify that the cut-off frequency is α, and so the time constant is 1/α. This is consistent with the time constant that defines the dynamics of the exponential filter.
The cut-off time constant can be precisely linked to motif cumulants using the resumming formula Eq. (2), as stated in Theorem 12.1.

Proof of the formula of time constant Theorem 12.1
We first restate Theorem 2 in the main text.
Theorem 12.1. Consider a network with a node filter h(s) that decreases asymptotically as 1/s g (g > 0) for large s, with a time constant τ . Then the time constant of the network transfer function G(s) is (f (τ g )) 1 g , where f (·) is a function with the same form as formula Eq. (2), that is, Proof. Without loss of generality, assume that For large s, h(s) ≈ 1 s g . That is, the asymptotic line in Bode plot is y = −20gx, x = log 10 s. The low frequency asymptote is y = 20 log 10 h(0). The two lines intersect at x = −g −1 log 10 h(0), or s = (h(0)) − 1 g . This corresponds to a time constant τ = (h(0)) 1 g , or h(0) = τ g .
Note that from Eq. (2), G(0) = f (h(0)), and at large s, G(s) ≈ 1 s g . A similar calculation of the intersection then gives the time constant of G(s) to be Remark 1. For the special case when g = 1, the above definition of time constant is equivalent to the value of the integral of the impulse response function ∞ 0 h(t)dt.

Approximating time constant with motif cumulants
To test the utility of Theorem 12.1, we numerically computed the time constant for a large set of SONET networks with diverse values of κ 2 , with κ 1 and coupling strength are set to be the same for all networks. When compared with the time constant computed using Theorem 12.1 and only using κ 1 and κ 2 , with higher order terms truncated, we see a very good agreement (Fig. S6, Left). We also observe a broad range of time constants spanning several orders of magnitude; this range is directly linked to the motif content of the underlying networks. The motif based approximation can be further improved by keeping more cumulant terms in Theorem 12.1 (Fig. S6, Right).

Convergence of G(s) in large networks
Here we consider the case when input and output weights B i , C j , for all i, j = 1, · · · , N are independent and identically distributed (i.i.d.) variables with mean θ and variance σ 2 (Fig. S7). For an arbitrary set of weights B, C, the matrix formula of G(s), Eq. (S6), still applies. However, the expression Eq. (2), in terms of motif cumulants, no longer holds directly. Here we first compute the expectation of G(s) across realizations of the input and output weights, and then, for large networks, prove the convergence of realizations of G(s) to this expectation. First, we note that G(s) is also a random variable for each s due to the stochasticity of B, C. Moreover, it is easy to calculate its expectation.
Here tr(·) is the trace. If we choose the mean input and output weight θ = 1 √ N , then we recover Eq. (1) with B = C = e for E [G(s)]. We can then apply Theorem 1 to express E [G(s)] with motif cumulants. As a result, we have the same explicit relationship as before for how motif structure impacts the network transfer function, when that transfer is defined as the average across randomly chosen sets of input and output weights.

Proof of the convergence of G(s) in the case of independent random input output weights
For large networks, the Eq.(1) not only holds for E [G(s)] but is also close to each random G(s). Specifically, under a norm condition for W , we prove the following convergence property.
Theorem 13.1. Let κ n be the motif cumulants of a sequence of W , whose size N → ∞.
Assume that each κ n has a limit κ ∞ n as N → ∞. Additionally, we assume a bound on the norm of W , for some fixed positive constant δ. Let G(s) be the (random) transfer function for networks with connection matrix 1 N W , and random i.i.d. input/output weights B, C with mean θ = 1 √ N and variance σ 2 = σ 2 0 N (σ 0 is a constant). Then, we have the following convergence as N → ∞: Remark 2. In the main text, we described a setup where the input weights are B i and the output weights are 1 N C j , where B i and C j are i.i.d. random variables with arbitrary means. It is straightforward to apply the result of Theorem 13.1 to this case by introducing a constant factor of Proof. First, we will show that the convergence κ n → κ ∞ n along with (S49) implies the convergence of E [G(s)] → G ∞ (s). Note that Therefore, by (S49), This inequality can be used with the matrix expression for κ n , Eq. (S8), to show that |h n (s)κ n | ≤ (1−δ) n . This geometric bound (in n) guarantees that ∞ n=1 |h n (s)κ n | is bounded independent of N and W , and ∞ n=1 h n (s)κ n converges absolutely to ∞ n=1 h n (s)κ ∞ n < ∞ (by dominant convergence). The above leads to Furthermore, the convergence is uniform in s as we show below. For any > 0, there exists an integer n 1 such that (1−δ) n 1 +1 δ < 4 . For each of 1 ≤ n ≤ n 1 , since κ n → κ ∞ n , there exists an integer N n such that for any N > N n , 27 This shows that Using the following inequality and composing the limit of Eq. (S55) with function (1 − x) −1 , we conclude the uniform convergence of E [G(s)] to G ∞ (s) in s.
We will now use the Chebyshev's inequality to show the convergence of G(s) to E {G(s)} based on calculating the variance of G(s).
HereP is the entry-wise complex conjugate and P * =P T . In the fourth equality, we have used the property that B, C are independent.
Given the norm condition on W , we have which is a constant bound independent of N and W . Using this, e T P P * e ≤ e T 2 P 2 P * 2 e 2 ≤ 1 δ 2 , similarly e T P P * e ≤ 1 δ 2 .
Together, we have Chebyshev's inequality then ensures the convergence of G(s) in probability.

Bound on the confidence interval of |G(s)| with random input and output weights
As discussed above, for random, independently distributed input and output weights, G(s) is random across realizations of these weights. We calculate an upper bound of the confidence interval of |G(s)| under the assumption that G(s) for each s is Gaussian distributed.
The assumption is intuitively justified as G(s) is a linear sum of a large number of random variables, B i , C j , when W and s are fixed. This assumption appears to hold well in our numerical simulations for large networks (data not shown). Note that the G(s) in general are complex variables. Therefore, G(s) will be Gaussian distributed in 2D under our assumption (Fig. S8). As G(s) can have correlated components which may not be aligned with E {G(s)} (see Fig. S8), calculating the exact confidence interval that G(s) lies in for a certain probability could be complicated. Instead we derive a simple bound.
First, the triangle inequality gives Let Z = G(s) − E {G(s)} be the 2D random vector, which can be decomposed into independent real Gaussian components X, Y such that |Z| 2 = X 2 + Y 2 . For 0 < p < 1, let is the inverse cumulative distribution function for a standard Gaussian variable. Therefore, we have Here σ 2 X and σ 2 Y are the variances of X, Y . Since X and Y are independent, Under this event, 14 Correlated input and output weights recruit cycle motifs

Expectation of network transfer function and cycle motif cumulants
Here we consider the case when the input and output weight vectors B, C are random but correlated (Fig. S9). Specifically, we take B i and C i to be correlated for each i = 1, · · · , N while B i and C j are still independent for i = j.
The appearance of the second term above reflects the correlation between input and output weights. If θ and σ have the the same scaling with respect to N , the first term in Eq. (S66) will dominate for large N . However, if the weights are balanced with both positive and negative values, giving a mean weight θ = 0, E [G(s)] will contain only the second term ρσ 2 tr((I − h(s)W ) −1 ). Given the distinct form of this "trace" term, compared with Eq. (1), one would naturally expect that it will introduce an impact on G(s) from different motifs than those that have entered above. We demonstrate this next, focussing on the trace term by setting θ = 0 unless stated otherwise.
Using the matrix Taylor series expansion, we can relate the trace term to connectivity motifs. First, we have Note that N −n tr(W n ) is the frequency with which an n-cycle of connections occurs in a network in which the entries of W take values in {0, 1}. A 2-cycle is simply a pair of reciprocal connections (Fig. 1B). In general, a n-cycle is a loop identified with indices connected as i 1 → i 2 → · · · → i n → i 1 . Similar to the chain motifs considered earlier, we may define the motif moments for n-cycles as µ c n = N −n tr(W n ) , n ≥ 1.
By generalizing the decomposition between motif moments and cumulants (Eq. (S7)), we should expect that The formula is explained by enumerating all possible decompositions of a n-cycle. We do this in two steps. First, we break the cycle at any one of the n nodes, yielding a single chain. Next, we break the resulting chain into smaller chain motifs, which correspond to ordered partitions as before. Since each decomposition which ends up with t ≥ 1 components can be acquired by first breaking the cycle at any of the t locations, it is redundantly counted t times in the procedure above. This, together with the n locations of the first break, explains the n/t factor in Eq. (S69). Finally, the only exception to the procedure is the cycle itself without any breaks, which is the last term κ c n . Naturally, we recursively define κ c n as the cumulant for cycle motifs using Eq. (S69). One can also prove that this definition has an equivalent matrix expression κ c n = N −n tr((ΘW ) n ), Θ = I − ee T .

A formula for the contribution of cycle motif cumulants to the network transfer function
We now derive a relationship between cycle motifs and the network transfer function (S67) at the level of motif cumulants. This is achieved by first proving a general expression involving cycle and chain motif cumulants ((S71)), and then using a resumming technique building on that used in Theorem 1.
Using the decomposition Eq. (S69), we can write Eq. (S67) or Eq. (S66) (at θ = 0) in terms of motif cumulants. The first few terms are Furthermore, a similar collection of κ n and κ c n terms as for Eq. (2) can be performed to obtain a resummed formula for this trace term. The result is: provided the connection strength is sufficiently small so that the series above converges (the condition for this being |h(s)| ρ(ΘW Θ) < 1, Θ = I − ee T ).
Proof. We start by expanding (I − hW ) −1 into a matrix power series and then substitute tr(W n ) in for µ c n , using the definition Eq. (S68). Next, we use decomposition Eq. (S69) to split the terms of µ c * into κ c * ( * stands for an arbitrary length index). This gives The essential step is to resum is the last term in the above expression. We introduce a (formal) series in complex variable z, The original series can be obtained by setting z = N h once the series is summed to a closed expression. Formally, or for z in the radius of convergence of the series, consider the indefinite integral of f (z)/z, In the last equality, we treat ∞ n=1 z n κ n as a variable and assume its magnitude is less than 1 (or operate formally). Finally, The original trace term for G(s) is thus 33

Numerical examples: the impact of cycle motifs on the network transfer function
We demonstrate possible impacts of cycle motifs on the network transfer function G(s) with the following numerical example. We generate a few connection matrices W (of same size) with various levels of the cycle cumulant κ c 2 , ranging from -0.85 σ 2 w to 0.9 σ 2 w (within the possible range of [−σ 2 w , σ 2 w ], see Section 3.3; here σ 2 w = 0.09 is the variance of the entries in W ), while other motif cumulants are close to 0.
We set ρ and σ 2 so that the product ρσ 2 = 1/N (see Eq. (S71)). In Fig. S10, we compute E {G(s)} via Eq. (S66) and plot the result for each W in a different color. The most significant impact of cycle motifs on the network transfer function happens with negative κ c 2 . Overall, for the exponential node filter, positive κ c 2 tends to increase the time constant; the opposite is true for negative κ c 2 . Beyond numerical calculations of the network transfer function via Eq. (S66), the resumming formula (S71) can be used to calculate E {G(s)} based only on the value of κ c 2 . We find that this approach is accurate for relatively small coupling strengths, and the effect of κ c 2 appears to be similar but smaller in scale (data not shown) than that shown in Fig. S10. However, at large coupling strengths, the spectral radius condition guaranteeing the convergence of the underlying motif cumulant series in Eq. (S71) is no longer satisfied. Extending our theory to capture such cases of strong coupling is left for future work.

34
Here we describe how to generalize our theory to allow for dynamics (h(s)) and connectivity (κ n ) factors to be node-type specific.
Consider a network consisting of k populations of nodes, with population type indexed by α. Nodes in each population α are assumed to have the same population specific filter h α (s). Ordering individual nodes in blocks according to their population index, the input-output equation for x(s) (Eq. (S5)) can be expressed in matrix form by introducing the diagonal matrix D h = diag(h 1 , · · · , h 1 , h 2 , · · · , h 2 , · · · , h k , · · · , h k ): For simplicity we restrict to cases where the signal input and output weights are uniform over the indices corresponding to each population. The resulting network transfer function is a linear combination of terms we denote by G αβ (s). These are the network transfer functions achieved by (uniformly) feeding input the u(s) into nodes of the population β and reading the response out from nodes in population α (Fig. 3C). These transfer functions G αβ (s) form a matrix; by abuse of notation we refer to this again by G(s). To derive a formula for it, first let U be the block matrix Here N α is the size of population α. We can then write G(s) as Note that D h and U "commute" due to their matching block structure, that is For simplicity, we will use the notation D h to represent D h whenever the meaning is clear from the dimensions of matrices. Similarly as for Theorem 1, we can rewrite the multipopulation network transfer functionG(s) in terms of motif cumulants. The result looks almost identical, where we replace scalar quantities with k × k matrices corresponding to the k populations.
whereκ n is the motif cumulant (k × k matrix) of D h W for length n chains, defined via the recursive relations with (matrix) motif moments , Here C(n) is the set of all compositions ( ordered partitions) of n, and the diagonal matrix D f = diag(N 1 /N, · · · , N k /N ).
Here we have combined the h α filters with W to define the motif statistics for the effective coupling matrixW = D h W . This is indicated via the˜over µ and κ.
In the theorem above, we needed to introduce population specific motif moments (μ n ) αβ and cumulants (κ n ) αβ . The meaning of (μ n ) αβ is the frequency or probability of n-length chains with start and end nodes in populations β and α respectively. The decomposition relation Eq. (S81) is a matrix version of Eq. (S7), and is formally identical if the multiplication of two matrix objects A, B is implemented as AD f B . The insertion of D f here provides the proper weights for averaging between populations with different sizes.
We can also expressκ n in terms of motif cumulants of the original connectivity matrix W as opposed to the filter-weighted matrix D h W . This gives an interpretation which is closer to the single population case.
We now explain this relationship, and give a explicit calculation of the terms for the case of two populations. Let κ αβ 1 be the motif cumulant of W for length 1 chains starting in population β and ending in population α. Similarly, let κ αβγ 2 be the motif cumulants of W for length 2 chains with the three nodes in population γ, β, α respectively, and so on for higher order motifs.
By enumerating the population identity of nodes in chain motifs, it is easy to show that The above formulae motivate definingh α = Nα N h α , α = 1, 2, and Dh = D f D h = diag(h 1 ,h 2 ) to simplify the expressions. Using this notation, we can, for example, rewrite Eq. (S82) and (S83) as (S85) 36 Plugging Eq. (S84-S85) and analogous expressions at higher orders into Eq. (S79) gives a series expression of G(s) in terms of population specific motif cumulants such as κ αβγ 2 . To interpret Theorem 15.1, we again construct a feedback diagram. In Fig. 3D in the main text we show this diagram for the case of two populations, based on the above calculations. The feedback diagram consists of two (infinite) perfect binary trees, whose roots have an in and out node and a link between the two, carrying filtersh α = Nα N h α . The rest of the trees grow from the out nodes. Each left branch has a filterh 1 and each right branch has a filter h 2 . At every node of the binary trees (i.e. all nodes except for the two in nodes), there are two links connecting to nodes in (1) and in (2) respectively. The strength of such feedback links are determined by population specific chain motif cumulants as N n κ αpathβ n . Here "path" in the superscript is the sequence with 1 or 2 denoting left or right branches traveled along the path from the root to this node (starting from the end of the sequence); β is the index of the tree that the node belongs to, and α is the index of the in node that the link connects to. Sending input into one of the two in nodes and reading it out from one of the two out nodes gives the corresponding entry in the 2 × 2 matrix G(s).

Time constant formula for multi-population networks
Similar to the case for single population networks, we can relate the cut-off time constant to motif cumulants for networks with multiple node populations. This is: Theorem 15.2. Assume that the node filter for a population h α (s) decreases asymptotically as 1/s gα for large s and g α > 0, with a time constant τ α . We form a "scalar" transfer function based on the matrix of the network transfer functions G(s) via a linear combination using vectorsB andĈ:Ĝ (s) =Ĉ T G(s)B.

(S86)
Then the time constant ofĜ(s) is (Ĉ T f (D τ )B) 1 g , where f (D τ ) is a matrix acquired by replacing the diagonal matrix D h in Eq. S79 (and inside the definition ofκ n ) by another diagonal matrix D τ , D τ = diag{τ g 1 1 , . . . , τ g k k }, (S87) The degree of asymptotic decay ofĜ(s), g, is determined by both the vectorsB,Ĉ and the motif cumulant structure of W . In the special case when g α ≡ g 0 are all equal, and C T D fB = 0, we have g = g 0 .
To determine g in the special case when the g α = g 0 are all equal, we can expand Eq. (S79) in terms of powers of h α , G(s) =Ĉ T G(s)B =Ĉ T (I + N D fκ1 + · · · )D f D hB . (S91) Using the assumption thatĈ T D fB = 0, the first term in Eq. (S91) The order of s, as s → ∞, for all other terms in Eq. (S91) is at most −2g 0 . Therefore, G(s) decays as 1/s g 0 . Combining with Eq. (S89), we conclude that the time scale ofĜ(s) is 16 Additional figures for the application to mouse brain connectivity To further explain the findings of a long time constant and multiple timescales for the response of networks derived from mouse brain connectivity (Fig. 4), we provide two additional figures here. The first demonstrates the magnitude of the motif cumulants in this network, and the second illustrates the controls we performed to isolate the impact of network structure on network response. Fig. S11 quantifies the presence of chain motif cumulants of a range of orders in the mouse connectivity network, and shows that these higher-order cumulants contribute contribute significantly to increasing the timescale of the network response (so that its responses display a long "memory" of past inputs). As demonstrated in Fig. 4(B of the main text, these long timescales are lost when the network connectivity is randomly shuffled: the inset compares the impulse response of the intact network (red line) to that of 100 randomized networks. These random networks are constructed by shuffling the rows and columns of their adjacency matrices, with the degree-preserving shuffling procedure illustrated in Fig. S12. This procedure will result in a random graph with an in-degree and out-degree distribution identical to the original mesoscale connectivity network, but with the sources and targets of each node redrawn independently. Figure S11: Contributions of motif cumulants to the network time constant, for the mouse connectivity network. The motif cumulants of the original, unshuffled network do not decay rapidly with order (size). This results in a long timescales of the network transfer function (inset). This timescales is diminished when the network structure is shuffled (see main text). Row-shu e Column-shu e Figure S12: Degree distribution preserving shuffling of network connectivity. The figure illustrates a graph randomization scheme. This produces a random network by transforming the adjacency matrix for a network via a random row permutation followed by a random column permutation. This transformation will preserve the number of nodes and edges, as well as the particular distribution of node in/out degrees. This can be seen by noting the row-sum and column-sum (i.e. the in-degree and out-degree of a node) of the original (left) matrix is the same as the final (right) matrix. However, the values in each row and column will be in a different order, resulting in independent in-and out-degree distributions. Thus, any remaining network structure is due to unequal weights and degrees, but not due to any special (i.e. nonrandom) configuration of connections beyond this. The order of shuffling rows first and then columns (as depicted here) is arbitrary, and can be reversed.

Parameters used in numerical examples
In the numerical examples, we set the node filter h(s) (for all nodes in the network) to be one of two forms: an exponential filter, or a decaying-oscillatory filter. Specifically, we take: h exp (t) = e −αt H(t) (or h exp (s) = 1 s + α ) (S93) and h cos (t) = e −αt cos(νt)H(t) (or h cos (s) = s + α (s + α) 2 + ν 2 ) ; (S94) here, H(t) is the Heaviside function, and the Laplace transforms are given in parentheses.
When not stated otherwise, we set the parameters for h(s) filters in Eq. (S93) and (S94) to be α = 0.2 and ν = 2π/7 with units of rad/s. We choose these values only for purpose of concreteness and plotting: our results do not rely on these particular values, or on the units of these parameters. Fig. 2

Parameters in
In Fig. 3A, we demonstrate the convergence described in Theorem 13.1 with numerical examples. We generate two networks of size N = 100 and N = 1000 with Gaussian random W (Section 3.2) and node filter h exp (s). The two networks are designed to have the same κ n , so that, when their entries are scaled by 1/N , the corresponding G(s) functions are identical (with uniform weights B, C = e). Specifically the entries of W have zero mean and variance 0.09, are chosen to be correlated, so that κ 2 = −0.6 × 10 −2 for both of the matrices. All other κ n are approximately 0 by construction. The G(s) under uniform weights are the red curves in Fig. 3A, which are also the average E {G(s)}. For each W , we compute 100 realizations of randomly chosen input and output weights by drawing B, C as i.i.d. Gaussian variables with mean θ = 1 √ N and variance σ 2 = 0.8 N (the scaling with N allows comparison across different network size). The resulting 100 realizations of G(s) are plotted as blue traces. We see that of these realizations cluster around E [G(s)] more tightly as the network size increases. The gray areas are representing the 90% confidence interval derived in Section 13.2. We emphasize that such convergence is strong in the sense it holds on a trial-to-trial basis for all frequencies s = iω, as long as the network size N is large. Fig. 4 and Fig. S11 Here we use the connectivity between 213 cortical and thalamic areas in the mouse brain . The connection matrix in this dataset describes the density of axon projections from one area to another (Fig. 4A). We build a simple dynamic model by assuming that the node dynamics are identically determined by the exponential node filter h exp (s) (α = 1/100, Section 17). We consider the network transfer function G(s) when a input is fed uniformly to the 11 thalamic areas and a output is formed by reading out uniformly from all areas.

Parameters used in
In Fig. 4B, The G(s) calculated based on original W is plotted with red lines. A sequence of blue lines depict successive (improving) approximations to this response computed by considering additional motif cumulants, that is keeping more terms ofκ n in Eq. (S79). In the inset, the light blue curves are 100 samples produced by a node-degree preserving shuffle as explained in Fig. S12. The effect is equivalent to setting every connection to a strength equal to the mean of the original log-normal weight distribution (red-dashed line), by keeping only theκ 1 term in Eq. (S79). A coupling strength is chosen at 90% of the level of the maximum value that keeps the system stable.
In Fig. S11, we plot the magnitude of chain motif cumulants |γ n κ n | against the order n.
Here a constant γ raised to proper power is inserted to set the scale for comparing motif cumulants across orders. The value of γ is 90% of the maximum value which satisfies the consistency requirement that κ n γ n decays to 0 as n → ∞. Note that these cumulants are computed by treating all nodes as belonging to a single population; a more complex but more accurate approach would be to consider subpopulation cumulants, as in Section 15.