Non-normality and non-monotonic dynamics in complex reaction networks

Complex chemical reaction networks, which underlie many industrial and biological processes, often exhibit non-monotonic changes in chemical species concentrations, typically described using nonlinear models. Such non-monotonic dynamics are in principle possible even in linear models if the matrices defining the models are non-normal, as characterized by a necessarily non-orthogonal set of eigenvectors. However, the extent to which non-normality is responsible for non-monotonic behavior remains an open question. Here, using a master equation to model the reaction dynamics, we derive a general condition for observing non-monotonic dynamics of individual species, establishing that non-normality promotes non-monotonicity but is not a requirement for it. In contrast, we show that non-normality is a requirement for non-monotonic dynamics to be observed in the R\'enyi entropy. Using hydrogen combustion as an example application, we demonstrate that non-monotonic dynamics under experimental conditions are supported by a linear chain of connected components, in contrast with the dominance of a single giant component observed in typical random reaction networks. The exact linearity of the master equation enables development of rigorous theory and simulations for dynamical networks of unprecedented size (approaching $10^5$ dynamical variables, even for a network of only 20 reactions and involving less than 100 atoms). Our conclusions are expected to hold for other combustion processes, and the general theory we develop is applicable to all chemical reaction networks, including biological ones.

Matrix non-normality is perhaps best known for its role in a counter-intuitive form of nonlinear instability [1]. Even when a fixed point is linearly stable in a nonlinear system described by ordinary differential equations, if the corresponding Jacobian matrix is non-normal, a small but finite perturbation can transiently grow beyond the validity of the linear approximation and enter into the nonlinear regime, preventing the perturbation from decaying to zero. The discovery of this phenomenon has led to a thorough study of the spectral properties of nonnormal matrices in the context of transient dynamics [2]; it has also inspired recent works on implications of nonnormality for network and spatiotemporal dynamics [3][4][5][6][7][8][9][10][11][12][13][14][15][16][17]. Given the common perception that linear dynamics are fully understood, the possibility of such transient growth offers interesting alternative interpretations for behavior usually attributed to nonlinearity, such as ignition dynamics in combustion and temporary activation of biochemical signals.
In network systems, however, even at the level of linear dynamics, fundamental questions remain open concerning such transient growth-or, more generally, nonmonotonic dynamics. How prevalent is non-normality in dynamical networks and how often does it lead to nonmonotonic dynamics? While non-normality is known to be widespread among matrices encoding network structures [12], the question is open for matrices representing dynamical interactions, which have direct implications on non-monotonic dynamics. Since non-monotonicity could be observed for one variable but not for others within the same system, how can we determine from the initial conditions whether a given variable will show non-monotonic behavior? Beyond the known tendency of non-normality to be correlated with local and global directionality of the network [5-7, 12, 17], what other connectivity structures have implications for non-normality and/or nonmonotonic dynamics?
In this article, we address these questions using the chemical master equation [18,19], which describes the dynamics of a chemical reaction network in terms of a time-evolving probability distribution of the network's state. Master equations play an important role in statistical physics [20,21] and have been used to model physical and chemical processes in various contexts (e.g., Refs. [22,23]), including processes on networks (e.g., Refs. [24][25][26]). The chemical master equation has the advantage of being exactly linear and amenable to rigorous analysis, while still reflecting the dynamics of species concentrations, which are most often alternatively modeled by nonlinear reaction rate equations in the limit of large number of molecules. This connection between linear and nonlinear models is possible because, when the number of molecules is large, the nonlinear dynamics in the finite-dimensional space of species concentrations can be lifted to linear dynamics in an infinite-dimensional space (which is akin to how the linear, infinite-dimensional Koopman operator can be used to study nonlinear, but finite-dimensional, dynamical systems [27]). Thus, while the individual interactions between the species concentration dynamics are inherently nonlinear, the interactions between the probability dynamics of different states in the chemical master equation are strictly linear. A key to our approach, particularly for reactions involving a finite number of molecules, is to represent these linear , while the bottom one shows the projection of these trajectories onto a one-dimensional subspace (black solid line in the top panel) parallel to a given vector u (purple arrow) as a function of time. In the top panel, the dashed lines indicate the two eigenvectors of the system, and the red color intensity at each point encodes the rate at which the projection of the trajectory starting from that point initially moves away from zero (i.e., the derivative of the projection at t = 0, multiplied by the sign of the projection). The vector v shown in purple is orthogonal to a boundary line of the red region (the line on which the derivative of the projection is zero at t = 0). (a) Normal system,ẋ = −3x − 2y,ẏ = −2x − 3y, whose eigenvectors are orthogonal. The dynamics are projected onto a line with a 55 • slope (left column) and a vertical line (right column). The system exhibits weakly non-monotonic trajectories in both cases. (b) Non-normal system,ẋ = 3x − 4y,ẏ = 8x − 9y, whose eigenvectors are far from being orthogonal. The trajectories are projected on to the same lines as in (a). In contrast to the normal system in (a), this non-normal system exhibits a much more pronounced transient growth. While the two representative projection angles are used here to illustrate the contrast between the two systems, the entire range of possible angles are shown in an animated version of this figure provided as Supplemental Material (click here for the video). The animation shows that the ranges of initial conditions and projection angles leading to non-monotonicity is wider for the non-normal system than for the normal system.
interactions by a directed weighted network of state-tostate transitions.
Here, we study what the structure of such a network can tell us about the chemical process it encodes. In particular, we focus on the consequences of non-normality, strongly connected components, and reaction irreversibility underlying the local directionality of network links, examining their implications for non-monotonic dynamics. As a representative example of real complex chemical reactions, we consider hydrogen combustion, for which a dataset is available on the experimentally determined reactions, species, and rate constants [28][29][30][31]. In the master-equation representation, the network grows rapidly in size with the number of atoms, reaching tens of thousands of nodes for fewer than a hundred atoms. To address the computational challenges of constructing the network and solving the master equation, we developed a thresholding technique that substantially reduces the network size without significantly affecting the accuracy of system trajectory calculations. This technique is incorporated in our open-source toolbox for automatically constructing a network from a given reaction dataset [32]. Equipped with this toolbox, we use hydrogen combustion as a representative example to demonstrate that combustion networks are indeed non-normal, with some eigenvectors having small angles between them (and hence far from being orthogonal), leading to non-monotonic behavior in the number of molecules of intermediate chemical species.
More generally, we derive a rigorous, geometrically interpretable condition on the initial probability distribution under which the time evolution of a given set of chemical species is non-monotonic. In particular, the condition indicates that the dynamical non-monotonicity depends both on the initial distribution and the selected set of species. The condition also shows that nonnormality is not strictly required for non-monotonicity. Indeed, for a generic system, be it normal or non-normal,

FIG. 2.
Network representations of the H2/O2 combustion process. (a) Conventional bipartite graph representation, in which the nodes representing reactions (black dots) have incoming links (blue arrows) from the nodes representing the reactant species and outgoing links (red arrows) to those representing the product species. (b) Master-equation representation, in which each node represents a state defined by the number of molecules of all species and a weighted link the rate of transition between two states. The network shown encompasses the 11,129 states accessible from the initial state (orange node on the left) with 20 H2, 10 O2, 1 OH, and 200 Ar molecules. The green nodes on the right indicate the states with strictly positive probability of occupancy in the steady-state distribution of the chemical master equation, Eq. (1). All the other states are indicated by black dots, and the (reversible) transitions between the nodes are represented by links. The inset shows a zoom-up of the initial state and its neighboring states, which are labeled with the changes in species composition relative to the initial state. The link strengths in the inset are proportional to the transition rates Wij. (c) Transition rate matrix W for Eq. (1). The states are indexed by the order of discovery in the state enumeration algorithm based on depth-first search, starting from the initial state i = 1.
there is always a projection of the solution space for the master equation that leads to non-monotonicity for some initial distribution. Thus, the key question is whether the dynamics of a given set of species are determined by such a projection. Non-normal systems, however, distinguish themselves from their normal counterparts in that nonmonotonicity is more prevalent and more pronounced (as illustrated in Fig. 1 using a simple two-dimensional system). In addition, we establish that the relation between non-normality and non-monotonicity can be expressed in terms of the Rényi entropy [33], in analogy with the known relation between (the violation of) the molecular chaos assumption in the H-theorem and nonmonotonicity of the Shannon entropy [34]. Furthermore, capturing the local link directionality and decomposing the network into connected components, we reveal the global directionality of the network in the form of a directed acyclic graph (DAG) linking these connected components. Starting from an initial state most natural for the combustion process, the system traverses a linear chain of irreversible steps within the DAG structure. By comparing with multiple classes of random networks, we conclude that the existence of this linear-chain structure must be attributed to non-random nature of the real combustion networks.

I. MASTER-EQUATION FORMULATION
Given a set of N s chemical species and a set of N r reactions involving them, we consider the dynamics of reactions in a mixture of these species. The state of the system is defined by the numbers of molecules of individual species. For a given state i, we denote by m ik the number of molecules of species k in that state. For a given reaction n, we denote by ξ nk and ζ nk the numbers of molecules of species k that are reactants and products, respectively. Under this reaction, state i with m ik molecules of each species k transitions to a state with m ik − ξ nk + ζ nk molecules of each species k. This allows us to represent the system as a network of states (nodes) connected by the state-to-state transitions induced by reactions (directed links). This representation is entirely different from (but related to) to the conventional representation of the species-reactions relation by a directed bipartite graph. For the combustion example . . , 16 H2 initial states (with the numbers of O2 molecules and Ar atoms varying proportionally). Each node represents an SCC, a set of states connected bidirectionally by paths of state transitions. The SCCs are identified in the network constructed by enumerating states without the -thresholding and then removing all transitions whose rates are much smaller than the rates of the corresponding opposite transitions (see Sec. V for further details). The node size is proportional to the number of states in the corresponding SCC. Each directed link indicates that a transition is possible from a state in one SCC to a state in the other SCC. The red star symbol indicates the SCC containing only the initial state. The few color-coded nodes connected by red links are the SCCs of the network obtained when applying the -thresholding, with the node color encoding the time at which the maximum is observed for the probability that the system's state is in the corresponding SCC. These SCCs are observed to form a directed linear chain in each case (when excluding SCCs with maximum probability < 10 −3 ).
we describe in detail below, the bipartite representation is shown in Fig. 2(a) and compared to our representation in Fig. 2(b). Assuming that the molecules are well-mixed within a fixed volume, the dynamics can be described probabilistically by the chemical master equation [19], which determines the time evolution of P i = P i (t), the probability that the system is in state i at time t: where W ij ≥ 0 is the rate of transition from state j to state i given by κ nj is a constant characterizing the rate at which reaction n occurs when the system is in state j, the notation δ ij is used for the Kronecker delta, and j (n) is the state to which the system transitions from state j under reaction n. Since the rate of a reaction depends on the numbers of molecules of the reactants but not on those of the products, W ij involves ξ nk but not ζ nk . Equation (2) implies that the transition rates W ij are independent of the probabilities P j , making Eq. (1) strictly linear. The system trajectory starting from a specific state, which we label with i = 1, can be determined by solving Eq. (1) with P 1 (0) = 1 and P i (0) = 0 for all i > 1. The dynamics can thus be regarded as a Markov process over the network of states driven by transition rates W ij , which can be regarded as the weight of the link from node j to i. As a specific example of chemical reaction dynamics, we consider combustion in a gas composed of hydrogen, oxygen, and a neutral buffer of argon, under a constantvolume, constant-energy condition. The argon buffer is included to absorb most of the energy released by the reactions, so that the evolving temperature of the gas remains within the acceptable range for the kinetic model (to be described below) while conserving energy. In addition to the reactants and products of the overall combustion reaction, 2 H 2 + O 2 −−→ 2 H 2 O, various intermediate, short-lived, radical species are created during this process (H, O, OH, HO 2 , and H 2 O 2 ). From a given initial state of this H 2 /O 2 combustion process with specified numbers of molecules of each species, the number of states that are accessible through a sequence of reactions is finite, and we denote this number by N . This is because the number of each atomic species in the gas is conserved during each reaction event. Assuming that hydrodynamic effects are negligible (i.e., the gas is wellmixed, and the kinetic energy released by reactions thermalizes quickly), each accessible state i has well-defined temperature T i and pressure p i (which can vary with i due to the energy released or absorbed by the reactions).
Because the temperature and pressure should not fall unrealistically low, the number of accessible states is further limited. To map out the network of states, we implemented an algorithm based on a depth-first search to identify all states accessible from a given initial state through the network of possible transitions. Our implementation [32] uses Cantera [35] and can be applied to any reaction network data in the Cantera or ChemKin format [36]. The number of accessible states can be very large even for a gas with a modest number of atoms. To see this, we consider the initial state with 2m molecules of H 2 , m molecules of O 2 , a single molecule of OH (included to start the chain reaction process), and 20m atoms of Ar at 1 atm and 1,000 K. We refer to this as the 2m H 2 initial state and label it with i = 1 throughout the article. For this initial condition, we additionally impose a minimum temperature of 200 K and a minimum pressure of 0.01 atm for any state to be included. Note that the 2m H 2 initial state is stoichiometrically balanced, so that all hydrogen in the mixture will be consumed during combustion. The number of states accessible from the 6 H 2 initial state (i.e., m = 3) is just N = 347, and the corresponding network has a modest complexity (Fig. 3, the first network at the top). We note that this number is significantly smaller than the number of all possible states with the same numbers of H, O, and Ar atoms (842 states, as determined by direct enumeration) due to the accessibility requirement and the thermodynamic constraints mentioned above. The number of accessible states quickly grows as we scale up the number of molecules proportionally (Fig. 3), reaching N = 21,717 for the 16 H 2 initial state, with just 6m + 2 = 50 atoms (counting only O and H, since Ar serves only as an energy buffer and does not actively participate in the reactions). In fact, the number of states appears to grow with the number of atoms slightly faster than a power law with exponent 4.3 ( Fig. 4(a), open circles) [37].
To compute the transition rates W ij , we need the con-stants κ nj in Eq. (2), which in this case is given by where o n ≡ k ξ nk is the molecularity, N A is Avogadro's number, V is the system volume, and κ n (T j , p j ) are the rate constants defining the kinetic model (which for most reactions has a temperature dependence of the modified Arrhenius form, but may also involve efficiency coefficients for a third-body reaction and a pressure dependence of a falloff reaction). Since the transition rates W ij span multiple orders of magnitude, the flow of probability under Eq. (1) is typically limited to a small portion of the network. This observation can be used upfront to reduce the size of the network, and thus the computational burden, without significantly affecting the dynamics. For this purpose, we apply thresholding in our state enumeration algorithm; if there is a transition from state j to a new state i, then state i is added only if the rate of that transition is not negligible compared to the rate of the opposite transition, i.e., if W ij > W ji , where is a (small) threshold [38]. Thus, if there are multiple states j from which the system transitions to state i, we add state i if and only if at least one of these transitions carries nonnegligible flow of probability into state i. Once a full list of states is obtained, both W ij and W ji are computed for each pair of enumerated states i and j. Unless otherwise noted, we employ = 10 −3 in the remainder of the paper, which is small enough for accurate computation of system trajectories (see reference comment [39] for details). For example, for the 20 H 2 initial state, this -thresholding reduces the size of the network by more than 82% (N = 63,835 → 11,129). This reduced 20 H 2 network and the corresponding 11,106-dimensional transition rate matrix W ≡ (W ij ) 1≤i,j≤N are visualized in Fig. 2(b) and (c), respectively. The -thresholding consistently yields significant reduction in N and in the complexity of the network, as seen by comparing the open and filled circles in Fig. 4(a). Furthermore, the percentage reduction in N achieved by this procedure grows with the number of atoms in the system, as shown in Fig. 4(b). With this reduction technique, we were able to consider network sizes as large as N = 74,421, corresponding to the 30 H 2 initial state with 92 atoms (counting O and H).

II. SPECTRAL ANALYSIS
The linearity of Eq. (1) implies that all information about the dynamics is encoded in the eigenvalues and eigenvectors of the matrix W. We first note that the structure of the matrix guarantees the sum over each column to be zero, i.e., i W ij = 0 for all j. Using this and applying the Gershgorin Circle Theorem [40], it follows that zero is an eigenvalue of W (which can be degenerate), and that all the other eigenvalues have strictly negative real parts. Moreover, it can be shown that each right eigenvector associated with the zero eigenvalue can be normalized so that its components are all non-negative and sum to unity (see Appendix). Each such normalized eigenvector, or a convex combination of multiple such vectors, is thus a steady-state probability distribution for Eq. (1).
If the network of states is strongly connected, or equivalently if the matrix W is irreducible, the Perron-Frobenius Theorem [40] implies that the zero eigenvalue is actually non-degenerate and that the components of its right eigenvector are all strictly positive, leading to a unique steady-state distribution with P i > 0 for all i. This holds true for the H 2 /O 2 combustion process considered here, since all reactions in that process are reversible, making the networks of states undirected and thus strongly connected. Note, however, that many of the reactions have tiny reverse reaction rate, which leads to many states with low probability P i , as we will show below.
The temporal evolution of the system under Eq. (1) can be decomposed into eigenmodes. Assuming that the network is strongly connected, we denote by χ 0i the normalized right eigenvector associated with the zero eigenvalue, i.e., the unique steady-state distribution. For the remaining N − 1 eigenvalues, we use λ n to denote the nth eigenvalue (in an arbitrary order) and χ ni to denote the ith component of the corresponding right eigenvector normalized to unit length in 2-norm. Given the initial probability distribution P i (0), the deviation from the steady-state distribution, Q i (t) ≡ P i (t) − χ 0i , can be expressed in terms of the (generally non-orthogonal) projection onto the eigenbasis, (4) and η ni is the ith component of the left eigenvector associated with λ n . Here, the left eigenvectors are normalized so as to satisfy i η ni χ n i = δ nn , where we recall that δ nn denotes the Kronecker delta. In this eigendecomposition, all terms decay monotonically in time because λ n < 0 for all n > 0, implying that P i (t) → χ 0i as t → ∞, i.e., the system converges to the steady-state distribution. The contributions of these eigenmodes are quantified by the mode strength coefficients α n , with their decay characterized by the timescale parameters The multi-scale nature of the H 2 /O 2 combustion process is apparent in the distributions of α n and τ n shown in Fig. 5(a), spanning many orders of magnitude. On the one hand, the vast majority of the eigenmodes decay very fast as τ n is tiny and thus contribute very little to the overall dynamics. Indeed, the probability P i (t) computed using only the 150 modes highlighted in blue in Fig. 5(a) stays within 3 × 10 −4 of that computed using all eigenmodes for all i and all t, while P i (t) computed using only the 1,000 eigenmodes with largest τ n stays within 2 × 10 −6 of that computed using all eigenmodes  Fig. 2(b). (a) Projection αn onto eigenvectors vs. timescale τn associated with the corresponding eigenvalues. Highlighted in blue are the 150 eigenvalues with largest αn among the 1,000 eigenvalues with the largest τn. Histograms for αn and τn are shown on the right and the top of the panel, respectively. In each case, the histogram is shown separately for the blue and gray eigenvalues. (b) Illustration of non-monotonic dynamics due to a pair of nearly parallel eigenvectors χn and χ n (green arrows). When the eigenvectors have large coefficients (αn, α n 1) and associated with very different timescales (τn τ n ), the sum of the corresponding terms in Eq. (4) (blue arrows) initially grows but eventually shrinks to zero. (c) Angle θ nn between the nth and n th eigenvectors among those corresponding to the blue dots in (a), sorted by descending magnitude of their projection coefficients αn and α n . (d) Non-monotonic decay of i Q 2 i for the network in Fig. 2(b). For reference, we also plot the expected number of molecules of reactants (H2 and O2) and product (H2O) as functions of time, showing the progression of the combustion process. The red shading indicates the interval of the ignition event (defined in the text).
(code for efficient trajectory computation available in our online repository [32]). On the other hand, there are several modes with slow timescales and significant α n . These features are shared with sloppy models [41,42], which have modes with strength and timescale varying over many orders of magnitude but can describe the process accurately with only a handful of these modes.
However, if the matrix W is non-normal (i.e., if the matrix normality condition k W ik W jk = k W ki W kj , ∀i, j is violated), the monotonic decay of individual eigenmodes does not tell the whole story. For a normal W, the right eigenvectors are all orthogonal to each other. It then follows from the eigen-decomposition in Eq. (4) that Q i (t) decays monotonically in 2-norm, i.e., i Q 2 i → 0 as t → ∞. For a non-normal W, the eigenvectors are not necessarily orthogonal to each other and can be nearly parallel (or even completely parallel, which is equivalent to having a degenerate eigenvector). Consider a system whose nth and n th eigenvectors are nearly parallel, and suppose that the coeffi-cients α n and α n have opposite signs with large magnitudes, and that the timescales τ n and τ n are very different. Then, the sum of the corresponding terms in Eq. (4) will exhibit highly non-monotonic dynamics, initially growing to a large size before eventually shrinking to zero, as illustrated in Fig. 5(b). For the H 2 /O 2 combustion network of Fig. 2, the matrix W is highly non-normal, with many eigenvectors having small angles between them (Fig. 5(c)), and we indeed observe i Q 2 i to behave non-monotonically (Fig. 5(d), blue curve). The non-monotonic dynamics emerge before the ignition event, which starts at t ≈ 2.2 × 10 −5 sec and ends at t ≈ 1.9 × 10 −2 sec (Fig. 5(d), red shading). Here, the ignition event is defined as the interval in which between 5% and 95% of the total temperature change is observed. The non-monotonicity continues during the conversion of H 2 and O 2 to H 2 O, until about half of the fuel is consumed at t ≈ 10 −3 sec.

III. NON-MONOTONIC ENTROPY DYNAMICS
The non-monotonic decay described above for nonnormal W is closely related to the so-called Rényi entropy H 2 ≡ − log( i P 2 i ) [33]. Like the Shannon entropy H 1 ≡ − i P i log(P i ), the Rényi entropy quantifies the spread of the probability distribution and can thus be regarded as a measure of the uncertainty about the state of the system. Just as Boltzmann's H-theorem guarantees monotonic evolution of the Shannon entropy when a molecular chaos assumption is satisfied (which would entail W ij = W ji in this setting) [34], the Rényi entropy is monotonic when the normality condition k W ik W jk = k W ki W kj is satisfied, since the orthogonality of the eigenvectors guarantees monotonic decay [43]. To illustrate this, we first note that the following formula can be derived: where S ≡ (S ij ) 1≤i,j≤N , S ij ≡ 1 2 (W ij + W ji ) is the symmetric part of W. This equation establishes a connection between information theory and dynamical systems theory: the rate of change of the Rényi entropy is directly proportional to the Rayleigh quotient of S, which is of interest in non-normal growth analysis [2,3]. This connection reflects the fact that the Rényi entropy is a function of the 2-norm of the probability distribution P i . In particular, Eq. (5) implies that the minimum possible dH 2 /dt is determined by the largest eigenvalue of S, which is non-positive and known as the reactivity (up to the factor of −2). If W is normal, it is known that the reactivity equals the largest eigenvalue of W, which is zero. This implies that dH 2 /dt ≥ 0, i.e., the Rényi entropy can only increase during the system's evolution towards the steady-state distribution. If W is non-normal, the reactivity can be strictly positive, meaning that the Rényi entropy decreases. The leading eigenvector of S gives the distribution maximizing the rate of decrease.
For the H 2 /O 2 combustion network of Fig. 2, the reactivity is indeed strictly positive ( Fig. 6(a), larger purple dot), indicating the existence of a probability distribution for which the system exhibits the maximum possible dH 2 /dt ( Fig. 6(b), smaller purple dots). Such a distribution can be constructed by properly normalizing the corresponding eigenvector of S, which is guaranteed to be possible by the Peron-Frobenius theorem. The system evolution starting from that distribution indeed exhibits the predicted rate of entropy decrease, dH 2 /dt = −1.1 × 10 8 sec −1 (Fig. 6(c), purple dot), reflecting the quick focusing of the probability initially spread over many states onto a small number of states. This distribution, however, is not the only one with dH 2 /dt < 0, as there are many other strictly positive eigenvalues for S ( Fig. 6(a)), from which distributions sharing the same property can be constructed. The decreasing H 2 (albeit at a much slower rate than the maximum) is also observed during the system evolution start-   Fig. 2(b). (a) Spectra of the non-normal matrix W and its symmetric part S. While the leading eigenvalue of W is zero (larger orange dot), the leading eigenvalue of S is strictly positive (larger purple dot), indicating that the Rényi entropy H2 can decrease, with the fastest rate |dH2/dt| determined by that eigenvalue. (b) Steady-state distribution (orange) and the distribution with the maximum |dH2/dt| (purple), constructed by normalizing the leading eigenvectors of W and S, respectively. Also shown is the single-state distribution corresponding to the initial condition used to construct the network (green). (c) Evolution of H2 starting from the three distributions in (b). The green trajectory achieves its maximum |dH2/dt| at t = 9.6 × 10 −7 sec (green dot), while the purple trajectory (shifted forward in time to facilitate comparison) initially exhibit two orders of magnitude faster decrease in H2 (purple dot). The red shading indicates the same interval of ignition event shown in Fig. 5(d).
ing from the 20 H 2 initial state, with several periods of dH 2 /dt < 0 over the non-monotonic trajectory (Fig. 6(c), green curve). In particular, the fastest decrease occurs at t = 9.6 × 10 −7 sec, before the ignition event, indicated by the red shading in Fig. 6(c). This suggests the existence of bottlenecks in the network of states, where non-normality directs the flow of probability to accumulate on a small number of states during the pre-ignition dynamics and sets up the stage for the explosive ignition dynamics.

IV. CONDITION FOR NON-MONOTONIC SPECIES DYNAMICS
We now derive an analytical, geometrically interpretable condition for individual species to exhibit non- monotonic dynamics. Given a subset of species X , let X = X(t) denote the expected fraction of molecules that are of the species in X at time t, relative to its value for the steady-state distribution. This quantity can be expressed as X(t) = i M i Q i , where M i ≡ k∈X m ik / k m ik is the normalized counts of molecules of the species in X for state i. Using this formula, the time evolution of the species can be calculated from the solutions of Eq. (1). For example, the evolution of the radicals for the H 2 /O 2 combustion process computed this way is shown for different initial number of molecules (and thus different N ) in Fig. 7(a). As N increases, the trajectory appears to approach the continuum limit determined by Cantera. Thus, our results suggest that the sharp peaks of radical concentrations associated with ignition observed in the continuum limit have origin in non-monotonic dynamics promoted by the non-normality of the transition rate matrix W in the chemical master equation.
Since the probabilities P i are all strictly positive and sum to unity, the trajectory of the corresponding point under Eq. (1) is limited to the set {(P 1 , . . . , P N ) | P i ≥ 0, ∀i and i P i = 1}, which is an (N − 1)-simplex (a generalization of a triangle). This simplex is illustrated in Fig. 7(b) for N = 3, in which case it is a triangle. Each vertex of the simplex corresponds to the distribution with all the probability concentrated on a single state. Each face (or edge for N = 3) corresponds to distributions with the probability spread over multiple states. The interior corresponds to distributions with non-zero probability for all states. Trajectories determined by Eq. (1) travel within this simplex and eventually approach the point corresponding to the steady-state distribution χ 0i . If the matrix W is normal, this point would be at the center of the simplex, since the vector of all ones is a right eigenvector of W (in addition to being a left eigenvector), which implies that the steady-state distribution is uniform. However, if W is non-normal, the point could be anywhere in the simplex and can be close to the boundaries of the simplex, since χ 0i can be non-uniform. Indeed, for the H 2 /O 2 combustion network of Fig. 2, it lies very close to a 24-dimensional hyper-edge, with a 2norm distance ≈ 1.4 × 10 −5 (≈ 0.01% of the distance from the center of the 11,129-dimensional simplex to the hyper-edge). This reflects the property of the steadystate distribution that it is highly localized: more than 99.99% of the probability is concentrated on the corresponding 24 states (mostly composed of H 2 O, with just a few molecules of H 2 , O 2 , and other radicals).
To derive the condition for non-monotonic dynamics, we first note that X converges to zero as t → ∞, since X is defined relative to the steady-state distribution. Thus, the condition for X to initially move away from zero before converging to zero is that X and its derivative dX/dt, given by dX/dt = ij M i W ij Q j , have the same sign. To rewrite this condition in a geometrically interpretable form, we define vectors u and v as those with the ith The vector u can be interpreted as the projection of the vector m ≡ (M i ) onto the simplex. Likewise, v is the projection of the vector m ≡ (M k ), M k ≡ j M j W jk , onto the simplex. Using these vectors, we can rewrite X and dX/dt as X = i u i Q i and dX/dt = i v i Q i . The simplex can be divided into "quadrants" by two hyperplanes: one orthogonal to u (given by i u i Q i = 0) and the other orthogonal to v (given by i v i Q i = 0). Both hyperplanes contain the point Q i = 0, which corresponds to the steady-state distribution χ 0i . Then, a geometric condition for exhibiting non-monotonic dynamics after time t is that the point Q i (t) lies in either the quad- These quadrants are shaded in gray in Fig. 7(b). The vectors u and v, as well as the quadrants of non-monotonicity, can also be defined and are shown in Fig. 1 for normal and non-normal twodimensional systems. (We note that u and v for these systems do not require the projection by O ij , since their states are not constrained to a simplex.) The condition just derived shows that, for both nor-  In each panel, the network is visualized as in Fig. 3, except that the initial probability distribution is fully concentrated on a randomly chosen state (red star symbol). During system evolution, probability branches out and flows downward along various paths (red arrows; showing only those with probabilities < 10 −3 ), eventually converging to the SCC at the bottom (green circle), which supports the steady-state distribution. Note that the four randomly chosen initial states are different from the 8 H2 initial state used to construct the network.
mal and non-normal W, there are initial distributions leading to non-monotonic behavior of X. This means that, even though the 2-norm i Q 2 i converges to zero monotonically for normal W (as mentioned earlier), the convergence of the projection X = i u i Q i may be nonmonotonic. We thus conclude that non-normality of W is not a requirement for non-monotonicity of X. However, non-normal W is different from normal W in that the steady-state distribution tends to lie near the boundaries of the simplex. This implies that the quadrants of nonmonotonicity occupy larger fraction of the simplex than for normal W, indicating that non-normal W is more likely to exhibit non-monotonic dynamics. Moreover, the non-monotonicity tends to be more pronounced for nonnormal W, since the further away the initial point is from the hyperplane orthogonal to v, the larger is the rate at which X moves away from zero (recalling that dX/dt = i v i Q i is the distance from the hyperplane). For our H 2 /O 2 combustion example, we indeed observe a sharp transient increase in the expected number of radical molecules, as we saw in Fig. 7(a).

V. CONNECTED COMPONENT ANALYSIS
In general, the network of states representing the system (1) is directed and can be decomposed into strongly connected components (SCC), where an SCC is defined as a subset of nodes in which a directed path exists between every node pair in both directions. This allows for a coarse-grained representation of the network by a (different) network of SCCs. In this representation, which we refer to as the SCC network, each SCC is regarded as a node, and a directed link is drawn from one SCC to another if, in the original network, there is a directed link from some node in the first SCC to a node in the second SCC. It follows from a well-known fact from graph theory that the SCC network must be a DAG, i.e., it does not contain any closed loop.
For the H 2 /O 2 combustion, the whole network is actually strongly connected (since all the reactions are modeled as a reversible one) and thus has just one SCC. However, a non-trivial DAG structure emerges when we take into account the link weights W ij , many of which are negligibly small compared to the weights W ji of the reciprocal links. This is because the reaction giving rise to a link is often nearly irreversible under the given condition in the sense that its rate in one direction is orders of magnitude larger than in the other direction. To capture this link directionality, we remove all transitions from state j to state i satisfying W ij < µW ji for a given small constant µ > 0. We note that this µ-thresholding prunes links representing negligibly rare transitions, while the -thresholding introduced above prunes nodes representing rarely visited states (together with the links involving those nodes). Once the µ-thresholding is applied, we decompose the resulting network into SCCs, revealing a DAG structure. Throughout this article, we use µ = 2×10 −3 . With this value, we find that the µ-thresholding removes much fewer links than the -thresholding (4.4% compared to 86.3% of all links in the 20 H 2 network constructed with no -or µ-thresholding) and does not significantly affect the dynamics (with maximum error in P i (t) less than 5 × 10 −3 for all i and all t). Because the SCC network has a DAG structure, one can arrange the SCCs in horizontal layers in such a way that the directions of all links, and thus flows of probability under Eq. (1), are downwards. This layered SCC arrangement is used in Fig. 3 for the 6 H 2 , 8 H 2 , . . . , 16 H 2 networks constructed without the -thresholding but with µ-thresholding. The DAG structure dictates the downward flow of probability from an arbitrary initial distribution to the final steadystate distribution contained in the bottom SCC, as shown in Fig. 8 for randomly chosen initial state in the 8 H 2 network. Code for computing the SCCs, as well as data files describing individual states, is available in our online repository [32].
We observe that the initial condition used to construct the network tends to belong to an SCC located toward the bottom of the layered DAG structure (the red stars in Fig. 3). If we focus now on the SCCs that are accessible from that initial state, we find that these core SCCs form a linear chain leading to the SCC containing the steadystate distribution (color-coded circles in Fig. 3, excluding SCCs with P i < 10 −3 for all t). The same linear chain is obtained also if we take the network constructed with the -thresholding and then remove insignificant reverse transitions using the µ-thresholding. While the expected number of the reactants H 2 and O 2 decay monotonically and the product H 2 O increase monotonically along the chain, the numbers of radical molecules change nonmonotonically, as shown in Fig. 9 for the 20 H 2 network. While this coarse-grained view of the network summarizes the dynamics with a simple linear chain, complex network structures exist both within and between the individual SCCs in the chain, as shown in Fig. 10. This full network is the same as that in Fig. 2(b), except that it is visualized using the linear chain structure in Fig. 9.
The DAG structure of the SCC network has a direct implication for the steady-state distribution χ 0i . Indeed, we can show that P i = χ 0i is strictly positive only for nodes i belonging to the SCCs without any outgoing links, which can always be placed at the bottom of the layered arrangement (see Appendix for a rigorous proof). Since the number and the sizes of such SCCs are often small, this implies that the steady-state distribution is  Fig. 2(b). For a given SCC, we show the expected number of each chemical species in a state belonging to that SCC at the time of maximum probability (encoded as the node color). We observe that, while the numbers of reactant molecules (H2 and O2) monotonically decrease and the number of product molecules (H2O) monotonically increases along the linear chain, some of the radicals (H and O) have peaks for intermediate SCCs.
localized. For the networks in Fig. 3, the distribution is highly localized, with all probability concentrated in the green SCC at the bottom of each network (and further localized within that SCC, as illustrated in Fig. 10 for the 20 H 2 network). Moreover, we observe a similar localization even when the µ-thresholding is not applied and the whole network forms a single SCC, as we saw in the previous section, reflecting the fact that the dynamics are essentially unaltered by the thresholding.

VI. RANDOM NETWORKS
To interpret the linear chain structure shown in Figs. 9 and 10 for the (µ-thresholded) 20 H 2 network, we now compare its topological properties to those of generic directed random networks. We first consider fully random networks with the same numbers of nodes N = 11,129 and the same number of directed links M = 150,077. Such a random network consists of a single giant SCC containing all N nodes with very high probability (96.8%, estimated from 1,000 realizations), and the average size of the giant SCC is ≈ 11,128.97 ± 0.18. The giant SCC appears because the average degree d ≡ M/N ≈ 13.5 is far above the percolation threshold of d = 1 [44,45]. In 1,000 realizations, any node not in the giant SCC either had outgoing links only (0.01 ± 0.12 nodes on average) or incoming links only (0.02 ± 0.13 nodes). Thus, the linear-chain structure of the SCC network (the color-  Fig. 9 (reproduced here at the bottom). The size of each node i is proportional to the maximum of Pi(t) over t, and the color coding for the nodes is the same as in Fig. 9. Nodes with negligibly small probability (Pi(t) < 10 −8 for all t) are excluded to avoid overcrowding.
coded SCCs in Fig. 9) is not typical of a network with the same numbers of nodes and links. One possible reason for this is that the in-and out-degree distributions of the combustion network are different from the Poisson distributions expected for the random counterpart ( Fig. 11(a)). However, even if we consider random networks with the same expected in-and out-degree distributions [46] as the combustion network, the network still typically consists almost entirely of the giant SCC (having 11,067.96 ± 6.01 nodes on average, when estimated from 100 realizations). Moreover, the probability that a directed link is reciprocated is relatively high (≈ 0.36) in the combustion network, and this is far from typical for the random networks. Indeed, for either of the two random models, the probability that a directed link is reciprocated is p 2 /[2p(1 − p)] ≈ 0.00061, where p = M/[N (N − 1)] is the link probability. These observations clearly indicate that the structure of the combustion network comes from additional constraints that the network must satisfy in order to represent Eq. (1).
To see if the linear-chain structure is typical among the networks representing Eq. (1) (and not just for the H 2 /O 2 combustion process), we now consider the class of random chemical reaction networks defined for a given number N r of randomly chosen reactions involving a given set of N s species. For simplicity, we consider only bi-species, bi-molecular reactions of the form S 1 + S 2 → S 3 + S 4 in which the four species involved are distinct and chosen randomly from the N s species. For each reaction chosen, we include the reverse reaction with probability R (which can be tuned to reproduce the probability of reverse transition observed for the H 2 /O 2 combustion network). For a given set of reactions and given the total number of molecules N m (which is conserved by any bi-species, bi-molecular reaction), we construct the corresponding matrix W through Eq. (2), which in this case  Fig. 11(b)). Similarly to the purely random networks considered above, this class of random networks also undergoes a percolation transition close to average degree = 1 (Fig. 11(c), blue curve). The number of nodes with outgoing (14.58 ± 3.93) or incoming (33.62 ± 5.12) links only constitutes a tiny fraction of the network (though significantly larger than for fully random networks). In particular, the nodes with only incoming links, which individually form single-node SCCs, corresponds to a tiny fraction of the states on which the steady-state probability concentrates. This fraction becomes even smaller as the average degree increases (Fig. 11(c), orange curve).

VII. CONCLUSIONS
The theory developed here reveals two distinct mechanisms through which non-monotonic dynamics emerge from non-normality of the network. First, the nonnormality moves the reactivity (the largest eigenvalue of S) away from zero, enabling the decrease of the Rényi entropy and inducing non-monotonic entropy dynamics. Second, non-normality allows the steady-state distribu- tion χ 0i to be non-uniform and often highly localized (as we observed for the hydrogen combustion example in Sec. IV and attributed to the DAG structure of the SCC network in Sec. V), placing it off-center and close to a hyper-edge of the high-dimensional simplex of probability distributions. This tends to widen the quadrants of growing deviation in the non-monotonicity condition we derived and thus increases the likelihood of observing an initial swing of the species concentrations away from their steady-state values. These mechanisms for non-monotonic dynamics are intimately related to the fact that many chemical reactions, particularly in combustion processes, are irreversible or nearly irreversible. In our formulation, reaction irreversibility translates to local directionality of the corresponding links in the network. Since non-normality requires the network to be directed [5-7, 12, 17], some of the reactions need to be nearly irreversible for the two mechanisms mentioned above to induce non-monotonic dynamics. By removing the rare reverse transitions associated with nearly irreversible reactions, we reveal global directionality in the hydrogen combustion networks in the form of a linear chain of strongly connected components. This directionality underlies the extreme localization of steady-state probability that promotes non-monotonic dynamics through the second mechanism above. These two mechanisms for non-monotonicity requires non-normality, which in turn requires local link directionality. However, our non-monotonicity condition clarifies that normal networks, even those with undirected links, can exhibit non-monotonic dynamics (but typically less pronounced than for non-normal networks), depending on the initial distributions of states and the choice of the observed quantity. The observed quantities we considered here are similar to a weighted 1-norm and thus fundamentally different from the 2-norm of the devia-tion from the steady-state distribution, which can exhibit non-monotonicity only for non-normal networks.
The structure found in random networks is different (albeit still globally directional) and characterized by the dominance of a single giant connected component and even more extreme localization of the steady-state probability. This holds true also for a class of random reaction networks that respects the network structural constraints imposed by the chemical master equation. Thus, the emergence of a linear chain in hydrogen combustion networks must be due to other factors not captured by the random selection of reactions. One possibility is that, when the product molecules are thermodynamically more stable than the reactants, this bias constrains the connected component structure. While a model of random reaction networks accounting for this effect could explain the type of global directionality observed in hydrogen combustion networks, different factors could lead to different global structures in other types of chemical reaction networks, such as biological ones [47]. Many intracellular biochemical reaction networks that exhibit transient dynamics, such as signaling networks, are expected to have a high level of directionality. The nonmonotonicity condition we derived here and our SCCbased analysis of the network's global directionality are applicable beyond those networks we explicitly considered and lay a foundation for future research on nonnormality and non-monotonic dynamics in complex reaction networks in general. Our approach, which focuses on transient dynamics far from equilibrium and can be extended to externally driven systems using time-varying master equations, may extend to other network systems and also contribute to the ongoing development of nonequilibrium statistical mechanics. 1-0383. APPENDIX We prove that any right eigenvector of W associated with the (possible degenerate) zero eigenvalue can be chosen so that its components are all non-negative and sum to unity. We also prove that the components are zero except for those corresponding to the nodes in the SCCs that have no outgoing links. The latter implies that, for the steady-state distribution, all states outside these SCCs have P i > 0.
Let c denote the number of SCCs with no outgoing links and c the number of the other SCCs. We first note that the DAG structure can be used to re-index the nodes and make W a block lower-triangular matrix, with each diagonal block corresponding to an SCC. The matrix W then takes the following form: where an empty space indicates a zero block, and a star symbol indicates that the block can have non-zero components. The c blocks corresponding to the SCCs without outgoing links necessarily appear as the last ones, denoted by W 1 , . . . , W c , which form a block diagonal submatrix in W because there cannot be links connecting these SCCs by definition. Since the sum of components along each column of these blocks is zero, each block W i has a zero eigenvalue along with the associated right eigenvectors. Since each SCC is by definition strongly connected, and hence each W i is irreducible, it follows from the Peron-Frobenius Theorem that the zero eigenvalue is not degenerate and that the corresponding eigenvector can be chosen to have only non-negative components. We choose such an eigenvector, and we further normalize it, so the sum of the component equals one.
Extending these vectors to the size of the whole matrix W (setting all added components to zero), we obtain right eigenvectors associated with the zero eigenvalue of W.
To see that the c-dimensional subspace spanned by these eigenvectors covers the entire eigenspace associated with the zero eigenvalue, consider a column vector of W intersecting with the diagonal block W i , i = 1, . . . , c . The sum of the off-diagonal components of W i that appear in this vector is strictly less than the diagonal component, since the sum of all the vector components is zero and all the off-diagonal components are nonnegative. This implies that the Gershgorin circle corresponding to this column of W i is to the left of the origin of the complex plane and at a finite distance away from the origin. Since this applies to all the columns of W i for any i = 1, . . . , c , zero cannot be an eigenvalue of any of W 1 , . . . , W c . Thus, all repetitions of the zero eigenvalue of W must come from the zero eigenvalues of W 1 , . . . , W c , and the c-dimensional subspace constructed above is indeed the eigenspace of W corresponding to eigenvalue zero. We note that the eigenvectors of W spanning the eigenspace were chosen to have components that are all non-negative and are zero outside the SCCs with no outgoing links. Thus, any eigenvector of W associated with the zero eigenvalue shares the same property and can thus be normalized so that its components sum to unity, giving the steady-state distribution. Consequently, we have P i > 0 in the corresponding steady-state distribution only for the nodes in the SCCs with no outgoing links.