Quantum Entanglement Growth Under Random Unitary Dynamics

Characterizing how entanglement grows with time in a many-body system, for example after a quantum quench, is a key problem in non-equilibrium quantum physics. We study this problem for the case of random unitary dynamics, representing either Hamiltonian evolution with time--dependent noise or evolution by a random quantum circuit. Our results reveal a universal structure behind noisy entanglement growth, and also provide simple new heuristics for the `entanglement tsunami' in Hamiltonian systems without noise. In 1D, we show that noise causes the entanglement entropy across a cut to grow according to the celebrated Kardar--Parisi--Zhang (KPZ) equation. The mean entanglement grows linearly in time, while fluctuations grow like $(\text{time})^{1/3}$ and are spatially correlated over a distance $\propto (\text{time})^{2/3}$. We derive KPZ universal behaviour in three complementary ways, by mapping random entanglement growth to: (i) a stochastic model of a growing surface; (ii) a `minimal cut' picture, reminiscent of the Ryu--Takayanagi formula in holography; and (iii) a hydrodynamic problem involving the dynamical spreading of operators. We demonstrate KPZ universality in 1D numerically using simulations of random unitary circuits. Importantly, the leading order time dependence of the entropy is deterministic even in the presence of noise, allowing us to propose a simple `minimal cut' picture for the entanglement growth of generic Hamiltonians, even without noise, in arbitrary dimensionality. We clarify the meaning of the `velocity' of entanglement growth in the 1D `entanglement tsunami'. We show that in higher dimensions, noisy entanglement evolution maps to the well-studied problem of pinning of a membrane or domain wall by disorder.


I. INTRODUCTION
The language of quantum entanglement ties together condensed matter physics, quantum information, and highenergy theory. The von Neumann entanglement entropy is known to encode universal properties of quantum ground states and has led to new perpectives on the AdS=CFT correspondence. But the dynamics of the entanglement are far less understood. The entanglement entropy is a highly nonlocal quantity, with very different dynamics to energy or charge or other local densities. Traditional many-body tools therefore do not provide much intuition about how entanglement spreads with time, for example, after a quantum quench (a sudden change to the Hamiltonian). We need to develop simple heuristic pictures, and simple long-wavelength descriptions, for entanglement dynamics.
For integrable 1D systems and rational CFTs, there is an appealing heuristic picture for entanglement growth following a quench in terms of spreading quasiparticles [1,3]. However, this picture does not apply to general interacting systems [3,4,10,44].
In this paper, we propose new heuristic pictures for entanglement growth in generic, nonintegrable systems, both in 1D and in higher dimensions. We arrive at these pictures by studying "minimally structured" models for quantum dynamics: dynamics that are spatially local, and unitary, but random both in time and space ("noisy"). Concretely, we focus on quantum circuit dynamics with randomly chosen quantum gates. Entanglement growth in these systems exhibits a remarkable universal structure in its own right, related to paradigmatic problems in classical statistical mechanics. But in addition, random circuits provide a theoretical laboratory that allows us to derive scaling pictures for entanglement growth and the so-called "entanglement tsunami" [8], which, we conjecture, generalize to quenches in many-body systems without noise. For example, we propose a simple "minimal membrane" picture that can be used to derive scaling forms for the growth of the entanglement. We also argue that generically there is a well-defined "entanglement speed" v E , but this is generically smaller than the "butterfly speed" v B governing operator growth, and we give a physical explanation for this phenomenon.
We show that noisy entanglement growth allows a longwavelength description with an emergent universal structure. Physically, the class of noisy dynamics includes closed, many-body systems whose Hamiltonian HðtÞ contains fluctuating noise terms, and also quantum circuits in which qubits are acted on by randomly chosen unitary gates. In this setting, we pin down both the leading-order deterministic behavior of the entanglement and the subleading fluctuations associated with noise. We argue that fluctuations and spatial correlations in the entanglement entropy are characterized by universal scaling exponents, expected to be independent of the details of the microscopic model.
For noisy systems in one spatial dimension, we argue that the critical exponents for entanglement growth are those of the Kardar-Parisi-Zhang (KPZ) equation, originally introduced to describe the stochastic growth of a surface with time t [45]. In the simplest setting, we find that the height of this surface at a point x in space is simply the von Neumann entanglement entropy Sðx; tÞ for a bipartition which splits the system in two at x. The average entanglement grows linearly in time, while fluctuations are characterized by nontrivial exponents. We support this identification with analytical arguments and numerical results for discrete time quantum evolution (unitary circuits).
A remarkable feature of the KPZ universality class is that it also embraces two classical problems that at first sight are very different from surface growth [45,46]. These connections lead us to powerful heuristic pictures for entanglement growth, in both 1D and higher dimensions. The KPZ universality class embraces the statistical mechanics of a directed polymer in a disordered potential landscape [47] and 1D hydrodynamics with noise (the noisy Burgers equation [48]). These problems, together with surface growth, are sometimes known as the "KPZ triumvirate" [49]. They are summarized in Fig. 1. We show that entanglement growth can usefully be related to all three of the classical problems in in Fig. 1.
In the quantum setting, the directed polymer is related to the "minimal cut," a curve in space-time that bisects the unitary circuit representing the time evolution. This picture is more general than the surface growth picture, as it allows one to consider the entropy for any bipartition of the system. It also allows us to generalize from 1D to higher dimensions. The picture is reminiscent of the Ryu-Takayanagi prescription for calculating the entanglement entropy of conformal field theories in the AdS=CFT correspondence, which makes use of a minimal surface in the bulk space [50], and analogous results for certain tensor network states [51][52][53]. Here, however, the cut lives in space-time rather than in space, and in a noisy system its shape is random rather than deterministic. (For a different use of the idea of a minimal cut in space-time, see Ref. [10].) In d þ 1 spacetime dimensions the minimal cut becomes a d-dimensional membrane pinned by disorder. This picture allows us to obtain approximate critical exponents for noisy entanglement growth in any number of dimensions.
This picture also leads us to a conjecture for entanglement growth in systems without noise, both in 1D and higher dimensions, as we discuss below. According to this conjecture, the calculation of the entanglement in higher dimensions reduces to a deterministic elastic problem for the minimal membrane in space-time. In 1D, it results in particularly simple universal scaling functions, which agree with scaling forms in holographic 1 þ 1D CFTs [8,10,44], and which we suggest are universal for generic, nonintegrable, translationally invariant 1D systems.
The third member of the triumvirate in Fig. 1 is a noisy hydrodynamic equation describing the diffusion of interacting (classical) particles in 1D. We show that this can be related to the spreading of quantum operators under the unitary evolution, giving a detailed treatment of the special case of stabilizer circuits. Note that while the minimal cut picture generalizes to higher dimensions, the KPZ and hydrodynamic pictures are special to 1D.

A. Organization of the paper
We propose that noisy dynamics are a useful toy model for quantum quenches in generic (nonintegrable, nonconformally-invariant) systems, even without noise. The logic of our approach is to pin down the universal behavior of noisy systems (Secs. II-VI), to establish simple heuristics capturing this behavior (Secs. III and IV), and then to extend these heuristic pictures to dynamics without noise (Secs. V and VIII).
The detailed physics of the entanglement fluctuations (including KPZ exponents) certainly relies on noise. However, the coarser features of the dynamics are in fact deterministic. These include the leading-order time dependence of the entanglement entropy and mutual information when the length and time scales are large. We conjecture that this leading-order behavior, as captured by the directed polymer and hydrodynamic pictures, carries over to Hamiltonian dynamics without noise. On the basis of this we address (Sec. V) features of entanglement growth that have previously been unclear. We argue that in generic 1D systems the entanglement growth rate can be interpreted as a well-defined speed v E , but that this speed is generically smaller than another characteristic speed, which is the speed v B at which quantum operators spread out under the dynamics (the butterfly speed). Section V also addresses universal scaling forms for the entanglement entropy in 1D. In Sec. VIII, we discuss the geometry dependence of the dynamical entanglement in higher-dimensional systems: we argue that there is again a scaling picture in terms of a minimal surface, but that more nonuniversal parameters enter into the time dependence than in 1+1D.

II. SURFACE GROWTH IN 1D
We begin by studying entanglement growth under random unitary dynamics in one dimension. After summarizing the KPZ universal behavior, we derive this behavior analytically in a solvable model, using a mapping to a classical surface growth problem. In the following sections we provide alternative derivations of this universal behavior by relating the minimal cut bound on the entanglement to the classical problem of a directed polymer in a random environment, and by relating the spreading of quantum operators to a 1D hydrodynamics problem.
Consider a chain of quantum spins with local Hilbert space dimension q (for example, spin-1=2's with q ¼ 2). We take open boundary conditions and label the bonds of the lattice by x ¼ 1; …; L. We consider only unitary dynamics, so the full density matrix ρ ¼ jΨihΨj represents a pure state. For now we consider the entanglement across a single cut at position x; we generalize to other geometries later in the paper. The reduced density matrix ρ x is defined by splitting the chain into two halves at x and tracing out the left-hand side (Fig. 2). The nth Rényi entropy for a cut at x is defined as Logarithms are taken base q. In the limit n → 1, the Rényi entropy becomes the von Neumann entropy: A basic constraint on the von Neumann entropy is that neighboring bonds can differ by at most one (this follows from subadditivity of the von Neumann entropy): In this section, we focus on the growth of the bipartite entropies Sðx; tÞ with time, starting from a state with low entanglement. [Here Sðx; tÞ, without a subscript, can denote any of the Rényi entropies with n > 0.] For simplicity, we take the initial state to be a product state, but we expect the same long-time behavior for any initial state with area-law entanglement. (The setup with area-law entanglement in the initial state is analogous to a quantum quench that starts in the ground state of a noncritical Hamiltonian. We briefly consider initial states with non-area-law entanglement in Sec. IX.) We argue that for noisy unitary dynamics, the universal properties of Sðx; tÞ are those of the Kardar-Parisi-Zhang equation: This equation was introduced to describe the stochastic growth of a 1D surface or interface with height profile SðxÞ [45]. It captures an important universality class which has found a wealth of applications in classical nonequilibrium physics, and its scaling properties have been verified in high-precision experiments [54,55] how the average growth rate depends on the slope; the negative sign is natural here, as we discuss in Sec. II A 3 [and implies that B in Eq. (6) below is positive]. KPZ scaling is characterized by an exponent β governing the size of fluctuations in the interface height, an exponent α governing spatial correlations, and a dynamical exponent z determining the rate of growth of the correlation length (z ¼ α=β by a scaling relation). These are known exactly [45]: In our context the height of the surface is the bipartite entanglement Sðx; tÞ. This is a random quantity that depends on the realization of the noise in the quantum dynamics. The mean height (entanglement) grows linearly in time, with a universal subleading correction: Angle brackets denote an average over noise. The linearity of the leading t dependence is expected from rigorous bounds for various 1 þ 1D random circuits [39][40][41]. Linear growth is also generic for quenches in translationally invariant 1D systems [3,21]. The fluctuations grow as We refer to w as the width of the surface. The ratio C=B is universal (the constants v E and B are not). The KPZ fluctuations are non-Gaussian: remarkably, their universal probability distribution has been determined analytically [56][57][58][59][60][61][62][63][64][65][66]. The correlation length governing spatial correlations in the fluctuations grows with time as and the equal time correlation function has the scaling form On length scales 1 ≪ r ≪ ξðtÞ, the surface profile SðxÞ resembles the trace of a 1D random walk: this is consistent with the exponent α ¼ 1=2. On scales r ≫ ξðtÞ, the fluctuations in Sðx; tÞ and Sðx þ r; tÞ are essentially uncorrelated. At short times the entanglement growth is affected by initial conditions, while on very long time scales, of the order of the system size, the entanglement saturates. Equations (6)-(9) apply prior to this saturation. In a finite system the asymptotic hSðxÞi profile is that of a pyramid, with a maximum at height x ¼ L=2, whose height is L=2, minus an Oð1Þ correction [67,68]. This profile is reached at a time with bonds closer to the boundary saturating sooner (Secs. III and V). Saturation is also captured in the surface growth description, once we note that there are Dirichlet boundary conditions on the entropy: Sð0; tÞ ¼ SðL þ 1; tÞ ¼ 0. Note that the scaling described in Eqs. (6) and (8) implies the existence of two distinct diverging length scales during entanglement growth. The fact that hSðx; tÞi is of order t implies that spins are entangled over distances of order t. In fact, we show in Sec. V that v E t is a sharply defined length scale. But prior to saturation, the relevant length scale for spatial variations in Sðx; tÞ is parametrically smaller than v E t; namely, ξðtÞ ∼ t 2=3 .
Before deriving KPZ for entanglement, let us briefly consider the status of this equation. At first sight, we might try to justify this description of Sðx; tÞ simply on grounds of symmetry and coarse graining. If we were describing classical surface growth, we would appeal to translational symmetry in the growth direction (S → S þ const) in order to restrict the allowed terms, and would note that the righthand side includes the lowest-order terms in ∂ x and ∂ x S. But for entanglement we cannot rely on this simple reasoning. First, the transformation S → S þ const is not a symmetry (or even a well-defined transformation) of the quantum system. More importantly, it is not clear a priori that we can write a stochastic differential equation for Sðx; tÞ alone, since the full quantum state contains vastly more information than Sðx; tÞ. Despite these differences from simple surface growth, we show below that the above equation does capture the universal aspects of the entanglement dynamics.
In the next section, we exhibit a solvable quantum model that maps to a classical surface growth problem that is manifestly in the KPZ universality class. Then in the two following sections, we give heuristic arguments for more general systems by making connections with the other members of the KPZ triumvirate. Together with the results for the solvable model, these arguments suggest that KPZ exponents should be generic for entanglement growth in any quantum system whose dynamics involves timedependent randomness. In Sec. VI, we perform numerical checks on KPZ universality for quantum dynamics in discrete time.

A. Solvable 1D model
We now focus on a specific quantum circuit model for the dynamics of a spin chain with strong noise. We take random unitaries to act on pairs of adjacent spins (i.e., on bonds) at random locations and at random times, as illustrated in Fig. 3. For simplicity, we discretize time and apply one unitary per time step. (Dynamics in continuous time, with unitaries applied to the links at a fixed rate in a Poissonian fashion, are equivalent.) We choose the initial state to be a product state, with S n ðxÞ ¼ 0 for all n and x. We choose the unitaries from the uniform (Haar) probability distribution on the unitary group for a pair of spins, Uðq 2 Þ. This model is solvable in the limit of large local Hilbert space dimension q.

Dynamics of Hartley entropy
A useful starting point is to consider the n → 0 limit of the Rényi entropy, S 0 . This is known as the Hartley entropy, and quantifies (the logarithm of) the number of nonzero eigenvalues of the reduced density matrix. Equivalently, the Hartley entropy determines the minimal necessary value of the local bond dimension in an exact matrix product representation [31,69] of the state: Like the von Neumann entropy, the Hartley entropy of neighboring bonds can differ by at most one: Recall that logarithms are base q. For the present, we keep q finite.
For the random dynamics we describe above (Sec. II A), the Hartley entropy obeys an extremely simple dynamical rule. In a given time step, a unitary is applied at a random bond, say, at x. Applying this unitary may change the Hartley entropy across the bond x; the entropy remains unchanged for all other bonds. The rule for the change in S 0 ðxÞ is that, with probability one, it increases to the maximal value allowed by the general constraint Eq. (12): This "maximal growth" of S 0 occurs with probability one when all unitaries are chosen randomly. Fine-tuned unitaries (e.g., the identity) may give a smaller value, but these choices are measure zero with respect to the Haar distribution.
We present a rigorous proof of Eq. (13) in Appendix A. The appendix also gives a heuristic parameter-counting argument that suggests the same result, but as we explain, the more rigorous argument is necessary as the heuristic argument can be misleading. We note that Ref. [70] observed that the growth in bond dimension, when a unitary is applied to a matrix product state, is upper bounded by the right-hand side of Eq. (13) and used this to obtain an upper bound on bond dimension growth during a quantum computation.
For the random dynamics we are considering, the dynamical rule in Eq. (13) leads to a simple but nontrivial stochastic process. Before discussing its properties, we use Eq. (13) as a starting point to show that, in the limit of large Hilbert space dimension, the von Neumann entropy (and in fact all the higher Rényi entropies) obeys the same dynamical rule. This requires an explicit calculation in the limit q → ∞. The von Neumann entropy is of more interest than S 0 , since the latter behaves pathologically in many circumstances. [This is because it simply counts up all the (nonzero) eigenvalues in the spectrum of ρ x , regardless of how small they are. For example, Hamiltonian dynamics in continuous time-as opposed to unitary circuits like the above-will generally give an infinite growth rate for S 0 , in contrast to the finite growth rate for S vN and the higher Rényi entropies.]

Limit of large Hilbert space dimension
The present quantum circuit dynamics lead to a solvable model in the limit of large local Hilbert space dimension, q → ∞. In this limit, all the Rényi entropies obey the dynamical rule in Eq. (13).
To show this, we consider the reduced density matrix for a cut at x, where x is the bond to which we are applying the unitary in a given time step. We may write ρ x ðt þ 1Þ in terms of ρ x−1 ðtÞ and the applied unitary matrix. Averaging Trρ 2 x over the choice of this unitary, we then obtain: See Appendix B for details. In terms of the second Rényi entropy S 2 , this is The general constraint S 2 ≤ S 0 allows us to write with Δ ≥ 0. We now use Eqs. (13) and (14) to show that Δ is infinitesimal at large q. Rewriting Eq. (14) in terms of Δ, and substituting Eq. (13), immediately shows hq Δðx;tþ1Þ i Haar < q Δðx−1;tÞ þ q Δðxþ1;tÞ : ð16Þ For a simple bound (for a large system, this bound on hq Δ max i is far from the tightest possible since we have not exploited the large size of the system), define Δ max ðtÞ to be the maximal value of Δðx; tÞ in the entire system. The equation above implies We may iterate this by averaging over successively earlier unitaries: This shows that as q → ∞ at fixed time t, the probability distribution for Δ concentrates on Δ ¼ 0, so that S 2 and S 0 become equal. This implies that the entanglement spectrum is flat, so, in fact, all the Rényi entropies obey Eq. (13) for the application of a unitary across bond x.

Properties of the solvable model
The dynamical rule we arrive at for the bipartite von Neumann and Rényi entropies at large q, defines a stochastic surface growth model in which Sðx; tÞ is always an integer-valued height profile (Fig. 4). The remaining randomness is in the choice of which bond is updated in a given time step. At each time step, a bond x is chosen at random, and the height SðxÞ is increased to the maximal value allowed by the neighbors. Figure 5 gives examples of local configurations before and after the central bond is updated.
This model is almost identical to standard models for surface growth [71,72]. It is in the KPZ universality class (it is straightforward to simulate the model and confirm the expected KPZ exponents), and some nonuniversal properties can also be determined exactly (see below). Note that the boundary conditions S ¼ 0 on the right and the left, and the restriction jSðx þ 1Þ − SðxÞj ≤ 1, imply that the entanglement eventually saturates in the expected pyramid profile.
When we move to the continuum (KPZ) description of the interface [Eq. (4)], the nonlinear λ term appears with a negative sign, meaning that entanglement growth is slower when the coarse-grained surface has a nonzero slope. This is natural given the microscopic dynamics: if the slope is maximal in some region, local dynamics cannot increase the entropy there.

Entanglement speed in the solvable model
In the present model the difference in height between two adjacent bonds is either ΔS ¼ AE1 or ΔS ¼ 0. At early stages of the evolution both possibilities occur. However, one may argue that the "flat points," where ΔS ¼ 0, become rarer and rarer at late times. [Flat points can disappear by "pair annihilation" (Fig. 5, top left), and can diffuse left or right (Fig. 5, top right), but cannot be created. As a result, their density decreases with time.] At late times the model therefore becomes equivalent to the well-known "single step" surface growth model [71], in which ΔS ¼ AE1 only. An appealing feature of this model is that, for a certain choice of boundary conditions (BCs), the late-time probability distribution of the growing interface can be determined exactly [71]. [The solvable case corresponds to choosing periodic BCs in the classical problem. (These BCs are useful for understanding the classical model, but they do not have an interpretation in terms of entanglement.) In this setting, the mean height grows indefinitely, but the probability distribution for the height fluctuations reaches a well-defined steady state.] This shows that on scales smaller than the correlation length (and prior to saturation), the interface looks like a 1D random walk with uncorrelated ΔS ¼ AE1 steps. This confirms the expected KPZ exponent α ¼ 1=2. It also allows the mean growth rate of the surface to be calculated [71]: the mean height increase in a given time step can be calculated by averaging over the four possible initial configurations for a bond and its two neighbors. After rescaling time so that one unit of time corresponds to an average of one unitary per bond, this gives an entanglement growth rate [Eq.
As we discuss below (Secs. IV and V), one can also associate a speed v B with the growth of quantum operators under the random dynamics; in the present large-q model, this speed is (The result v B ¼ 1 arises because in the largeq limit, the growth of a typical operator is limited only by the structure of the circuit. In Ref. [73], we give explicit derivations of v B in random circuits for arbitrary q.) It is interesting to note that here v E < v B , contrary to 1D CFTs (where v E ¼ v B [1]) and contrary to previous conjectures about generic systems [23]. In Sec. IV, we give an appealing intuitive picture for why v E can be smaller than v B for the case of Clifford circuits. The mapping to surface growth gives us a clean derivation of universal entanglement dynamics in a solvable model. However, this surface growth picture is restricted to the entropy for a single cut (as opposed to the entropy of a region with multiple end points) and to one dimension. It will be useful to find a more general language which extends the above results. To do this, we now make a connection with the second member of the KPZ triumvirate (Fig. 2), the statistical mechanics of a polymer in a random environment.

III. DIRECTED POLYMERS AND MINIMAL CUT
In this section, we relate the dynamics of Sðx; tÞ to the geometry of a minimal cut through the quantum circuit which prepares the state (Fig. 6). This provides an alternative perspective on the exact result Eq. (19) for the solvable model, and also a useful heuristic picture for noisy quantum dynamics in general. This line of thinking reproduces KPZ behavior in 1D. Importantly, it also allows us to generalize to higher dimensions and to more complex geometries.
Our starting point is the minimal cut bound for tensor networks. This very general bound has been related to the Ryu-Takayanagi formula for entanglement in holographic conformal field theories [50][51][52][53]74], and has also been applied to unitary networks as a heuristic picture for entanglement growth [10].
Consider again a random quantum circuit in 1 þ 1D, and a curve like that in Fig. 6, which bisects the circuit and divides the physical degrees of freedom into two at position x. Any such curve gives an upper bound on the entanglement: all the Rényi entropies satisfy SðxÞ ≤ S cut , where S cut is the number of "legs" that the curve passes through. (This relies only on linear algebra: the rank of the reduced density matrix ρ x is at most q S cut .) [The cut divides the tensor network into two parts connected by S cut bonds. One part contains the physical legs for subsystem A and the other part contains those for the complement. Regarding the two parts of the circuit as composite tensors L and R gives a representation of the state as jψðtÞi ¼ where jai A and jbiĀ are basis states in A andĀ, respectively. This implies that the Schmidt rank for a bipartition into A andĀ is at most q S cut , so that S 0 , which is the logarithm of the Schmidt rank, is at most S cut . In turn, S n ≤ S 0 for any n ≥ 0.] The best bound of this type is given by the minimal cut, which passes through the smallest number of legs. We denote the corresponding estimate for the entropy S min −cut ðxÞ. If the geometry of the circuit is random, S min −cut ðxÞ and the corresponding curve are also random. Finding S min −cut ðxÞ amounts to an optimization problem in a classical disordered system.
In the solvable large-q model, S min −cut ðxÞ in fact gives the von Neumann entropy exactly. This follows straightforwardly from the results of the previous section (see below). In a typical microscopic model, on the other hand, S min −cut is only a bound on the true entropy. Nevertheless, we conjecture that the following picture based on the minimal cut is generally valid as a coarse-grained picture: i.e., that it correctly captures the universal properties of the entanglement dynamics in noisy systems. This conjecture is equivalent to the applicability of the KPZ description to generic noisy systems; further evidence for the latter is in Secs. IV and VI.
The problem of finding the minimal curve is a version of a well-studied problem in classical statistical mechanics, known as the directed polymer in a random environment (DPRE) [47,75]. Here, the "polymer" is the curve which bisects the circuit, and its energy EðxÞ is equal to S cut ðxÞ, the number of legs it bisects. The spatial coordinate of the polymer's upper end point is fixed at x, while the lower end point is free. Finding S min −cut ðxÞ is equivalent to finding the minimal value of the polymer's energy. This corresponds to the polymer problem at zero temperature; however, the universal behavior of the DPRE is the same at zero and at nonzero temperature. (For any finite temperature, the DPRE flows under renormalization to a zero-temperature fixed point at which temperature is an irrelevant perturbation.) DPRE models with short-range-correlated disorder are in the same universality class as the KPZ equation [45]. Let Eðx; tÞ be the minimal energy of the polymer in a sample of height t. We may increase t by adding an additional layer to the top of the sample. Eðx; t þ δtÞ can then be expressed recursively in terms of Eðy; tÞ for the various possible values of y. In the continuum limit, this leads to an equation for Eðx; tÞ which is precisely the KPZ equation (see Refs. [45,76] for more details of the mapping between DPRE and KPZ). The KPZ exponents we give in Sec. II may therefore be applied to the energy of the polymer. The exponent z ¼ 3=2 also determines the length scale for transverse fluctuations of the polymer on large length and time scales: The best such bound is given by the minimal cut (note that the cut shown here is not the minimal one). Finding the minimal cut in a random network is akin to finding the lowestenergy state of a polymer in a random potential landscape.
Since in our case the minimal E is simply S min −cut , we find that the latter executes KPZ growth. In light of the previous section, this is not surprising. In fact, in our solvable model, S min −cut is exactly equal to the true entanglement entropy (in the large-q limit). This follows from the fact that the recursive construction of Eðx; tÞ described above (on the lattice, rather than in the continuum) precisely matches the large-q dynamics of Eq. (19). Examples of nonunitary tensor networks in which the minimal cut bound becomes exact are also known [52], including a large-bonddimension limit similar to that we discuss here [53].
The utility of the DPRE picture is that it is far more generalizable than the surface growth picture, which is restricted to the entropy across a single cut in 1D. As we note above, the value of S min −cut in a given microscopic model is typically not equal to any of the physical entropies S n with n > 0. Nevertheless, we conjecture that the DPRE and KPZ pictures are valid universal descriptions for all noisy models, so long as they are not fine-tuned or nonlocal. This includes noisy Hamiltonian dynamics in continuous time (we discuss this case further in Sec. IX). If we restrict to the leading-order deterministic behavior, we can also make conjectures about Hamiltonian systems without noise.

A. Scaling form for entanglement saturation
At leading order in time, the growth of the height Sðx; tÞ is deterministic: fluctuations are a subleading effect when t is large. Similarly, Eq. (22) shows that the coarse-grained minimal cut is essentially vertical (prior to saturation of the entropy): the length scale for its transverse fluctuations is negligible in comparison with t. These pictures therefore have well-defined and simple deterministic limits. They lead directly to deterministic scaling forms for the leadingorder behavior of the entanglement, which we discuss in more detail in Sec. V. Here, we consider the simplest case, the saturation of the entanglement entropy Sðx; tÞ across a single cut (or for a single interval). We reproduce a simple scaling function known from other contexts [8,10,21].
The definition of the entanglement growth rate implies that the energy E of such a vertical cut is v E t to leading order. The entanglement in a finite system grows at this rate until time t saturation ¼ x=v E , when it reaches its saturation value S ¼ x. (Here, we are neglecting subleading terms and assuming x < L − x.) After this time a vertical cut is no longer favorable: instead, the minimal cut exits the circuit via the left-hand side. Its shape is no longer unique, but it can be taken to be horizontal, and it has energy E ¼ x. This picture corresponds to a simple scaling form (again, neglecting subleading terms): with fðuÞ ¼ u for u < 1 For a finite interval of length l in an infinite system there is a crossover between a configuration with two vertical cuts and one with a single horizontal cut, giving instead SðtÞ ¼ 2v E tfðl=2v E tÞ. These scaling forms are our first confirmation that v E is really a speed, as well as a growth rate for the entanglement. We give an independent derivation of this fact for Clifford circuits in the following section, and test the above scaling form numerically in Sec. VI B. We discuss the interpretation of v E further in Sec. V.
Note that fluctuations have dropped out of Eq. (23) as a result of considering only the leading-order behavior of Sðx; tÞ. These scaling forms agree with the results for holographic CFTs [8] and with a heuristic application of the minimal cut formula to a regular tensor network [10]. Here, we see them emerging from a simple and well-defined coarse-grained picture, suggesting that they are universal for all generic 1D systems, including, for example, translationally invariant but nonintegrable spin chains. [Reference [77] includes numerical tests of scaling forms derived from the directed polymer picture in deterministic systems, including extensions to inhomogeneous systems (a chain with a weak link).] It is also worth noting that Eq. (24) is capable of distinguishing generic systems from (nonrelativistic) integrable systems. In the latter case the quasiparticle picture applies and yields different profiles for SðtÞ [3,19]. For relativistic systems in which the quasiparticle picture holds (rational CFTs), all quasiparticles travel at the same speed, and as a result Eq. (24) does apply [1,3] (however, the entanglement of more complex regions will differ between the quasiparticle picture, on the one hand, and the results from holographic systems and the minimal cut picture, on the other hand [3,4,44]).
In Secs. V and VIII we propose that the above picture in terms of a coarse-grained minimal cut is the simplest way to understand the basic features of the entanglement tsunami for generic many-body systems (with or without noise) in both 1D and higher dimensions.

IV. HYDRODYNAMICS OF OPERATOR SPREADING
An alternative way to think about the quantum dynamics is in terms of the evolution of local operators O i . For example, a Pauli operator initially acting on a single spin (e.g., O i ≡ Y i ; we denote the Pauli matrices by X, Y, Z) will evolve with time into an operator O i ðtÞ which acts on many spins. Operators typically grow ballistically [38], in the sense that the number of spins in the support of O i ðtÞ grows linearly with t. In this section, we relate the growth of the bipartite entanglement to the spreading of operators. We focus on the special case of unitary evolution with Clifford circuits (defined below), but we expect the basic outcomes to hold for more general unitary dynamics. We find that the entanglement growth rate is not given by the rate at which a single operator grows, but is instead determined by collective dynamics involving many operators. Remarkably, in 1D these collective dynamics have a long wavelength hydrodynamic description.
This hydrodynamic description turns out to be the noisy Burgers equation, which is related to the KPZ equation by a simple change of variable and is the final member of the KPZ triumvirate shown in Fig. 1. In the present case the hydrodynamic mode is the density of certain fictitious "particles," shown in blue in Fig. 7. The quantum state is defined by a set of operators (Sec. IVA) which spread out over time, and the particles are markers which show how far these operators have spread. We derive their coarse-grained dynamics in Sec. IV B after introducing the necessary operator language.
In generic many-body systems (with local interactions) this process of operator growth is characterized by a speed known as the butterfly speed v B . This speed defines an effective light cone within which the commutator between the spreading operator OðtÞ and a typical local operator is appreciable. The quantity v B is a characteristic speed for the spreading of quantum information in a given model, and can be extracted from appropriate correlation functions. In deterministic systems (time-independent evolution) v B can depend on temperature, and typically does not saturate the well-known Lieb-Robinson bound [78,79]. Generic noisy systems equilibrate to infinite temperature, so in the present models there is no notion of temperature dependence-v B is a constant defined entirely by the dynamics.
The scaling forms we discuss in the previous section show that in 1D there is a well-defined speed v E associated with entanglement spreading. The following picture gives a physical interpretation of this speed, in terms of a certain set of growing operators. However, it also shows that in general the speed v E is smaller than the speed v B . This is perhaps surprising: in 1D CFTs the two speeds are equal, and it has been conjectured that they are equal in general [23]. (Note that we already encountered a solvable model with

A. Stabilizer operators
It is convenient to use the language of "stabilizer" operators to describe the entanglement dynamics. We may define the initial state jΨ 0 i by specifying L stabilizers under which it is invariant (in this section, we take the number of sites to be L). These operators, denoted O i (i ¼ 1; …; L), satisfy For example, if the spins are initially polarized in the y direction, we may take In the following, we focus on evolution of the initial state with unitary gates in the Clifford group [80]. Such gates have recently been used in toy models for many-body localization [29]. Entanglement generation in non-random Clifford circuits has also been studied [43]. The defining feature of Clifford unitaries is that they have a simple action on Pauli operators: single-spin Pauli operators are mapped to products of Pauli operators.
Any product of Pauli matrices can be written as a product of X and Z matrices, so to follow the dynamics of a given stabilizer O i ðtÞ, we need only keep track of which X i and Z i operators appear in this product. Furthermore, the overall sign of the stabilizer O i ðtÞ does not affect the entanglement properties of a system undergoing Clifford evolution, so we do not keep track of it. By writing O i ðtÞ as we may specify any stabilizer by a binary vector ⃗v with 2L components: For example, the first component of the vector v 1 ¼ 1 if X 1 appears in the product, and v 1 ¼ 0 otherwise. The binary vector corresponding to a stabilizer where the locations of the nonzero elements correspond to X i and Z i . We consider the dynamics in two stages. First, we consider the evolution of a single operator. Then, we generalize this to understand the dynamics of the state.
How does a single stabilizer O i ðtÞ evolve? Applying a one-or two-site Clifford unitary to O i ðtÞ corresponds to applying simple local updates to the string ⃗v. Although the precise details of these updates are not crucial, we now give some explicit examples of gates that we encounter again in the numerical simulations. FIG. 7. Spreading of stabilizer operators defining the quantum state (Sec. IV). Each blue particle marks the right end point of some stabilizer (the rightmost spin on which it acts). Blue particles hop predominantly to the right. Whenever a particle enters the right-hand region (A) the entanglement S A increases by one bit. The particle density is described by the noisy Burgers equation, which maps to KPZ. A "hole" (empty circle) marks the left-hand end point of some stabilizer.
As single-site examples, consider the Hadamard and phase gates. The Hadamard is a rotation on the Bloch sphere [the rotation is by π around the (1,0,1) axis] that exchanges the X and Z axes, so applying a Hadamard to site i updates the string by v ix ↔ v iz . The phase gate is a rotation around the Z axis which maps X i to Y i ¼ iX i Z i : This means that an additional Z i is generated whenever X i is present in the string, or equivalently v iz → v iz þ v ix ðmod 2Þ. For a two-site example, consider the left and right controlled-NOT (CNOT) gates acting on the leftmost spins in the chain. In the Z basis, the action of these operators is to flip the "target" spin if and only if the "control: spin is down: Conjugating the Pauli matrices by CNOT ðLÞ yields: We see that the operator X 2 is added to the string if X 1 is present (and similarly for Z 1 and Z 2 ). Applying CNOT ðLÞ therefore updates ⃗v by CNOT ðRÞ acts similarly with the roles of the spins reversed. It is simple to argue that random application of such operations causes the region of space in which ⃗v is nonzero to grow ballistically. This corresponds to the operator spreading itself over a region of average size 2v B t, where v B is the operator spreading (butterfly) velocity for this system [79]. (For the present system, this velocity is also the analogue of the Lieb-Robinson velocity.) The value of v B depends on the precise choice of dynamics, but it is the same for all initial operators so long as the dynamics (the probability distribution on gates) is not fine-tuned. Further, one may argue that the interior of the region where the string ⃗v is nonzero is "structureless." Within the interior, ⃗v rapidly "equilibrates" to become a completely random binary string. [Consider the late-time dynamics of an operator, or equivalently its string ⃗v, in an L-site system. Random application of Clifford gates gives random dynamics to ⃗v. It is easy to see that the flat probability distribution on ⃗v is invariant under the dynamics, regardless of the probabilities with which the gates are applied. By standard properties of Markov processes, this is the unique asymptotic distribution to which the system tends, so long as the choice of Clifford gates is not fine-tuned to make the process nonergodic. (If the gate set includes each gate and its inverse with the same probability, detailed balance is also obeyed, but this is not necessary.) We expect ⃗v to equilibrate locally to this structureless state on an Oð1Þ time scale, and similarly for the internal structure of operators smaller than L.] Now consider the dynamics of a quantum state. Once the sign information in Eq. (26) is dropped, the relevant information in the state jΨðtÞi is contained in binary vectors ⃗v 1 ; …; ⃗v L corresponding to the L stabilizers. We may package this information in a 2L × L matrix: Each column corresponds to a stabilizer, and each row to a spin operator X i or Z i . The dynamical updates correspond to row operations (with arithmetic modulo two) on this matrix. For example, a Hadamard gate exchanges the rows corresponding to X i and Z i . A crucial point is that there is a large gauge freedom in this definition of the state. This gauge freedom arises because we can redefine stabilizers by multiplying them together. For example, if a state is stabilized by fX 1 ; Z 2 g, then it is also stabilized by fX 1 Z 2 ; Z 2 g, and vice versa. This freedom to redefine the stabilizers corresponds to the freedom to make column operations on Ψ, or equivalently the freedom to add the vectors ⃗v i modulo two. Note that by making such a "gauge transformation" we may be able to reduce the size of one of the stabilizers, giving a more compact representation of the state.
The final fact we need is an expression for the entropy S A ðtÞ of a region A in terms of the stabilizers. Heuristically, this is given by the number of stabilizers that have spread into region A from outside. More precisely, define I A as the number of stabilizers that are independent when restricted to region A [81]. (Independence of the stabilizers corresponds to linear independence of the vectors ⃗v i , with arithmetic modulo two, once they are truncated to region A.) The entropy is equal to [82,83] where jAj is the number of sites in A. See Appendix C for a simple derivation of Eq. (33). For Clifford dynamics all Rényi entropies are equal, so we omit the Rényi index on S.
The maximal value of I A is 2jAj, so S A is bounded by jAj as expected. This formula has a simple interpretation. In the initial product state we may take one stabilizer to be localized at each site, so I A ¼ jAj and the entanglement is zero. As time goes on, stabilizers that were initially localized outside of A grow and enter A. Each time a new independent operator appears in A, the entanglement S A ðtÞ increases by one bit. The linear independence requirement in the definition of I A is crucial, as it leads to effective interactions between the stabilizers, which we discuss in the following section.
From now on, we take A to consist of the spins to the right of the bond x, and revert to the notation S A ¼ S x used in the rest of the text for the entanglement across a cut at x.

B. Coarse-grained operator dynamics
Each stabilizer O i ðtÞ (labeled i ¼ 1; …; L) has a left and a right end point l i and r i , marking the extremal spins included in the stabilizer. We view l i and r i as the positions of two fictitious particles of type l and r, represented in white and blue, respectively, in Figs. 7 and 8. There are L of each type of particle in total.
In the initial product state, O i ð0Þ is a single Pauli operator on site i, say, Y i . This means that each site has one l particle and one r particle (since l i ¼ r i ¼ i), as shown in Fig. 8 (left). As time increases, the r particles will typically move to the right and the l particles to the left.
The nature of this motion depends on how we define the stabilizers. At first sight, the obvious choice is to define O i ðtÞ as the unitary time evolution of the initial stabilizer, O i ðtÞ ¼ UðtÞY i U † ðtÞ. But in fact, it is useful to exploit the gauge freedom in the choice of stabilizers to impose a different "canonical" form. One result of this is that the stabilizers effectively grow more slowly than the butterfly velocity v B (discussed in the previous section) for the spreading of an operator considered in isolation.
Let ρ l ðiÞ and ρ r ðiÞ be the number of particles of each type at site i. The constraint that we impose is To see that we can impose this constraint, consider the situation ρ l ðiÞ ¼ 3, so that there are three stabilizers that start at i. The initial element of each string can be either X, Y, or Z. If ρ l ðiÞ ¼ 3, it is impossible for all three initial elements to be independent. We can then redefine one of the stabilizers, by multiplying it by one or both of the others, in such a way that its length decreases by one. (By choosing the longer stabilizer we avoid adding length at the right-hand side.) Making reductions of this kind wherever possible guarantees that ρ l ðiÞ ≤ 2, and also that if ρ l ¼ 2, the initial elements of the two stabilizers are distinct.
(And similarly for ρ r .) With this convention it also follows that ρ l ðiÞ þ ρ r ðiÞ ≤ 2: otherwise, the operators involved could not commute, which they must (the initial stabilizers commute, and this is preserved by the unitary dynamics and the redefinitions of the stabilizers). [Consider the case where ρ r ðiÞ ¼ 1: for example, let the corresponding stabilizer read O ¼ …X i . Any stabilizer contributing to ρ l ðiÞ must be of the form X i … in order to commute with O. By the rule imposed in the text, this means that ρ l ðiÞ ≤ 1.] Since there are a total of 2L particles which all have to live somewhere, we have Eq. (34). With this convention, the dynamics of the bipartite entropy SðxÞ is simply related to the hopping dynamics of the particles. By Eq. (34) it suffices to consider only the r particles: an l particle is just an r "hole." We write the density ρ r of r particles as ρ. See Fig. 7 for a typical configuration in a subregion of the system.
The utility of the canonical form Eq. (34) is that the independence requirement becomes trivial. One can easily check that all the operators that have spread into A (the region to the right of x) are independent. (Consider the stabilizers that act in region A, i.e., the stabilizers with r i > x. We may argue by contradiction that they remain independent after truncation to subsystem A. If not, this means there is some product of the truncated stabilizers that equals one. Let the rightmost spin appearing in any of these stabilizers be j. But by our convention for "clipping" the stabilizers, it is impossible for the Pauli matrices acting on spin j to cancel out when they are multiplied together. Therefore, the operators must, in fact, be independent.) Therefore, to find SðxÞ we need only count the number of r particles to the right of the cut and subtract the number of sites: To reiterate, the entanglement increases by one every time an r particle drifts rightward across bond x (and decreases by one if it drifts across in the other direction). Now consider the dynamics of the particles. Microscopically, a dynamical time step involves (1) application of a unitary gate and (2) potentially a "clipping" of stabilizers to enforce the canonical form. Effectively, the particles perform biased diffusion, with the restriction that more than two particles cannot share a site: This constraint leads to "traffic jam" phenomena familiar from the so-called asymmetric exclusion process [84], and to the same continuum description. Our essential approximation is to neglect the detailed internal structure of the stabilizers and to treat the dynamics of the end points as effectively Markovian. We expect this to be valid at long length and time scales for the reason we mention in the previous section: the internal structure of the operators is essentially featureless and characterized by finite time scales. We now move to a continuum description. The coarsegrained density obeys a continuity equation: with J the particle current. Further, there is a symmetry under spatial reflections, which exchange left and right end points (ρ l ↔ ρ r ). Writing where Δρ is the deviation from the mean density, the reflection symmetry is To obtain a long wavelength description, we write the current as a power series in Δρ and ∂ x . Keeping the lowestorder terms that respect the symmetry, These terms have a transparent physical meaning. The drift constant c > 0 reflects the fact that the average motion is to the right (i.e., operators grow over time). The ν term is simple diffusion. The noise η reflects the randomness in the dynamics. Most importantly, the nonlinear λ term is the effect of the constraint Eq. (34). It reflects the fact that the current is maximal when the density is close to one. The current evidently vanishes when ρ ¼ 0, since there are no particles, but also when ρ ¼ 2 (the particles cannot move if the density is everywhere maximal). Therefore, we expect λ > 0. From the above formulas, the density obeys known as the noisy Burgers equation [84]. The entanglement S ¼ R x ðρ − 1Þ obeys ∂ t S ¼ J, leading to the KPZ equation: The sign of λ is in agreement with that obtained from the surface growth picture in Sec. II and from the directed polymer picture in Sec. III. While we focus here on dynamics of a restricted type (Clifford), this derivation of KPZ for entanglement provides independent support for the arguments in the previous sections.
In the language of the particles, the initial state corresponds to uniform density ρ ¼ 1. Saturation of the entanglement corresponds (neglecting fluctuations) to all of the r particles accumulating on the right-hand side and all of the l particles on the left (Fig. 8), i.e., to a step function density.
As an aside, it is interesting to consider fluctuations in SðxÞ at late times, i.e., long after the saturation of hSðxÞi.
Let us revert to our previous notation, where the system has L þ 1 sites and bonds are labeled x ¼ 1; …; L. Without loss of generality we take x ≤ L=2. When fluctuations are neglected, the region to the left of x is empty of r particles, and the entropy is maximal, S max ðxÞ ¼ x. Fluctuations will reduce the average. But in order for SðxÞ to fluctuate downward, a blue r particle must diffuse leftward from the right half of the system in order to enter the region to the left of x, as in Fig. 9. This is a fluctuation by a distance ∼ðL=2 − xÞ. Such fluctuations are exponentially rare events, because they fight against the net rightward drift for the r particles. Thus, when L=2 − x is large, we expect Our coarse-grained picture does not determine the numerical constants. The detailed nature of these exponentially small corrections will differ between Clifford circuits and more general unitary circuits. (For example, in the Clifford case S n is independent of n, while in general the corrections will depend on n [68].) Nevertheless, the functional form above agrees with the late-time result for generic gate sets, which is simply the mean entanglement in a fully randomized pure state [67,68]: jAj ¼ x and jĀj ¼ L − x þ 1 are the numbers of sites in A and its complement. For (generic) Clifford dynamics, the probability distribution of the entanglement at asymptotically late times will be that of a random stabilizer state. This has been calculated in Ref. [85].

V. ENTANGLEMENT TSUNAMI: SPEEDS AND SCALING FORMS
It is not a priori obvious that the rate v E governing entanglement growth can be viewed as a speed in generic systems (see Ref. [79] for a recent discussion), although this is known to be the case in holographic CFTs [8]. Our results in the directed polymer picture and in the operator spreading picture suggest that v E is indeed a well-defined speed in generic systems. (We see in the previous section that there is a simple visual interpretation of this speed in x FIG. 9. Fluctuations at late times, after saturation of hSðxÞi, in the Clifford case. When x ≪ L=2 it requires a rare fluctuation (fighting against the net drift) to remove a particle from region A, leading to an exponentially small S max ðxÞ − hSðxÞi. the stabilizer formalism.) However, this speed is in general smaller than the speed v B which governs the spreading of an operator considered in isolation: "thermalization is slower than operator spreading." In the stabilizer context the difference between v E and v B arises because in enforcing Eq. (34) we "clip" the stabilizers, reducing their rate of growth. We believe the phenomenon of v E being smaller than v B to be general (see also the result in Sec. II A 4) and relevant also to nonnoisy dynamics. This picture is contrary to that of. e.g., Ref. [23], where the operator spreading velocity is assumed to determine the entanglement growth rate. In the presence of noise, one may also argue that a picture of independently spreading operators underestimates the exponent governing the growth of fluctuations. [Considering the unitary evolution of a single operator in isolation, its right end point executes a biased random walk, traveling an average distance v B t with fluctuations Oðt 1=2 Þ. If we were to neglect the independence requirement in Eq. (33), then the entanglement would be estimated (incorrectly) as the number of independently spreading operators which have reached A. The mean of this quantity is v B t and the fluctuations are of order t 1=4 . This is related to the difference between the KPZ universality class of surface growth, which is generic, and the Edwards-Wilkinson universality class, which applies when the strength of interactions is fine-tuned to zero [45].] The language of a tsunami is often used in discussing entanglement spreading, so it is nice to see that-at least in 1D-entanglement spreading can be related to a hydrodynamic problem. (The motivation for the tsunami terminology is the idea that for a region A, the entanglement S A is dominated by a subregion close to the boundary which grows ballistically, like the advancing front of a tsunami.) In higher dimensions the boundary of an operator has a more complicated geometry, so the hydrodynamic correspondence we describe above does not generalize.
In order to understand the entanglement tsunami better, we now return briefly to the directed-polymer-in-arandom-medium picture developed for noisy systems in Sec. III.

A. Scaling forms for the entanglement tsunami
When all length and time scales are large, fluctuations in the entanglement are subleading. Neglecting them is equivalent to saying that the coarse-grained minimal cut (prior to saturation) is a straight vertical line. This deterministic picture generalizes to the entanglement or mutual information of arbitrary regions, and also to higher dimensions (Sec. VIII). We conjecture that these pictures are valid for the long-time behavior of entanglement quite generally. The setup relevant to us in the non-noisy case is a quench, in which the initial state is a ground state of one Hamiltonian, and a different Hamiltonian is used for the evolution.
In the 1D case, the deterministic scaling form for the entanglement (of an arbitrary region) which results from the leading-order directed polymer picture is rather simple, and is not new-it agrees with holographic results [8,44], and as noted in Ref. [10], can also be obtained from a more microscopic minimal cut picture in which the geometry of the minimal cut is highly nonunique. We propose that coarse graining fixes the geometry of the minimal cut. The derivation of these scaling forms from a simple coarsegrained picture suggests that they are universal in nonintegrable, translationally invariant systems. (These scaling forms are generally not the same as those obtained from the quasiparticle picture for rational CFTs [3,4].) Our derivation also opens the door to generalizations to higher dimensions (Sec. VIII B) and to 1D systems with quenched disorder [77].
We now consider some examples of the scaling of the mutual information. This will help clarify the operational meaning of the speed v E .
To calculate the entanglement S A of a region A, we must take a cut, or multiple cuts, with end points on the boundary points of A at the top of the space-time slice. These cuts can either be vertical, in which case they cost an energy'v E t (to use the language of Sec. III), or they can connect two end points, in which case we take them to be horizontal and to have an energy equal to their length. The entanglement S A ðtÞ is given by minimizing the energy of the cut configuration. It is a continuous piecewise linear function, with slope discontinuities when the geometry of the minimal cut configuration changes. To generalize the conjecture to systems without noise, we must allow for the fact that the asymptotic value of the entanglement depends on the energy density of the initial state. We therefore replace the entanglement S in the formulas with S=s eq , where s eq is the equilibrium entropy density corresponding to the initial energy density [2,8]. This ensures that the entanglement entropy of an l-sized region matches the equilibrium thermal entropy when v E t ≫ l=2, as required for thermalization. Heuristically, s eq defines the density of "active" degrees of freedom at a given temperature [79].
To clarify the meaning of v E , consider the mutual information between two semi-infinite regions that are separated by a distance l (Fig. 10). With the labeling of the regions as in the figure, this is given by FIG. 10. Infinite chain with regions A, B, C marked. B is of length l while A and C are semi-infinite. The mutual information between A and C is nonzero so long as l < 2v E t: correlations exist over distances up to 2v E t, not v E t.
We have S A ¼ S C ¼ v E t for all times, since the appropriate minimal cuts are vertical. If l > 2v E t, S B is given by two vertical cuts, so I AC vanishes. When l < 2v E t, S B is instead dominated by a horizontal cut, so that The entanglement tsunami is sometimes taken to mean that at time t, a "boundary layer" of width v E t inside a given region is entangled with the exterior. If this region were maximally entangled with the exterior, this would reproduce the correct value of the entanglement across a cut (S ¼ v E t). However, this picture is not correct: the result for the mutual information shows that correlations exist over distances up to 2v E t, not v E t. So although v E is a speed, it should not be thought of as the speed at which the boundary of the entangled region moves.
Although the rule for calculating the entanglement is almost trivial, the consequences are not always intuitively obvious. First consider the case where the regions A and C above are finite rather than infinite (and embedded in an infinite chain); see Fig. 11. When the length d of the regions A and C exceeds their separation l, the time dependence of the mutual information is as shown in Fig. 11. [The sequence of minimal cut configurations required for calculating S B in this case is (a), (b), (c) shown in Fig. 12.] By contrast, when the separation l exceeds the length d, the mutual information is always zero (or more precisely, exponentially small).
[For a simpler example of exponentially small values of the mutual information, consider hI AC i at infinite times in a finite system. If the system contains L qubits and A ∪ C contains N qubits, the mutual information is exponentially small whenever N < L=2, and given by Eq. (44)  Finally, consider the effect of a boundary. Take a semiinfinite chain with regions A, B, C adjacent to the boundary as in Fig. 13 (C is semi-infinite). Consider the mutual information between B and C, I BC ¼ S B þ S C − S A . We must distinguish the case l A < l B =2 from the case l A > l B =2. (In the former case, the first event is that the minimal cut at the boundary of A goes from being vertical to being horizontal; in the latter, the first event is that the two vertical cuts at the boundary of B are replaced by a horizontal one.) The resulting expressions for I BC are plotted in Fig. 13.

VI. NUMERICAL EVIDENCE FOR KPZ GROWTH
We now give numerical evidence that noisy entanglement growth in 1D is in the KPZ universality class. We study the time evolution of spin-1=2 chains with open boundary conditions, taking the initial state to be a product state with all spins pointing in the same direction (either the positive y or z direction) and keeping track of the entanglement entropy across each bond during the evolution. The discrete time evolution is a circuit of one-and two-site unitaries. Figure 14 shows the structure of a single time step: two layers of two-site unitaries are applied, one layer on odd and one on even bonds, together with singlesite unitaries. Each unitary is chosen independently and randomly (from a certain set specified below). We use the symbol R to denote a generic single-site unitary and U to denote a two-site unitary.
We consider three kinds of dynamics, distinguished by the choice of unitaries. To begin with, we study "Clifford evolution" in which the unitaries are restricted to the set of  so-called Clifford gates (Sec. IV). Clifford evolution can be simulated efficiently (in polynomial time) using the stabilizer representation we discuss in Sec. IV. This allows us to access very long times and to pin down KPZ exponents accurately. Next, we study more general dynamics for which polynomial-time classical simulation is impossible, giving evidence that KPZ behavior holds more generally. The two types of non-Clifford dynamics we study here are referred to as the phase evolution and the universal evolution: we give details below. For these dynamics we use a matrix product representation of the state implemented via ITensor [86].
The fingerprints of KPZ behavior that we search for are the two independent critical exponents β and α (Sec. II). We extract β both from the fluctuations in the von Neumann entropy and from the corrections to the mean value [Eqs. (6) and (7)], and we extract α from the spatial correlations in the entanglement at distances shorter than the correlation length ξðtÞ [Eq. (9)]. For Clifford circuits, we also touch on the entanglement probability distribution.

A. Clifford evolution
Clifford circuits, or "stabilizer circuits," are a special class of quantum circuits that play an important role in quantum information theory. As shown by Gottesman and Knill, they can be simulated efficiently, even when the entanglement entropy grows rapidly, by representing the quantum state in terms of stabilizers [87]: see Sec. IV.
The time evolution operator for a Clifford circuit belongs to the Clifford group, a subgroup of the unitary group on the full Hilbert space. This group may be generated by a small set of local Clifford gates: the two-site controlled NOT gates [Eq. (31)] and the single-site Hadamard and phase gates R H and R P [Eqs. (29) and (30)]. For circuits built from these gates, time evolving the state on L spins up to a time t takes a computational time of order Lt, and measuring the entanglement across a given bond in the final state takes a time of order L 3 at most. This is in sharp contrast to the exponential scaling that is inevitable for more general circuits.
In all our simulations, each two-site unitary U in the circuit is chosen with equal probability from three possibilities: the two types of CNOT gate [Eq. (31)] and the identity matrix: U ∈ f1; CNOT ðLÞ ; CNOT ðRÞ g: ð46Þ In this section, we discuss the simplest Clifford dynamics, which includes only these gates, and no one-site unitaries (R ¼ 1). When the initial state is polarized in the ydirection, this set of gates is sufficient to give nontrivial entanglement evolution, with universal properties that turn out to be the same as those for more generic gate sets. We also study the "full" Clifford dynamics in which all the Clifford generators are used, choosing the single-site unitaries randomly from the three options R ∈ f1; R H ; R P g: ð47Þ Results for this case are similar and are given in Appendix. D.
To begin with, Fig. 15 shows the evolution of the bipartite von Neumann entropy SðxÞ (in units of log 2) for a single realization of the noise (i.e., a particular random circuit) in a system of L ¼ 459 sites. The curves show successively later times. Note that the entropy saturates more rapidly closer to the boundary, because the maximum entanglement across a bond is proportional to its distance from the boundary. At very late times Sðx; tÞ saturates to a pyramidlike profile representing close-to-maximal entanglement. Our interest is in the stochastic growth prior to saturation, which we show is KPZ-like. All observables in  340, 690, 1024, 1365, 1707, 2048, and 4096), in the Clifford evolution. This shows that the state evolves from a product state to a near-maximally-entangled one. Prior to saturation the entanglement displays KPZ-like stochastic growth. Sðx; tÞ is in units of log 2. the following are measured far from the boundary, in order to avoid finite-size effects associated with saturation. Figure 16 shows successive snapshots for a subregion of a larger system of L ¼ 1025 bonds (times t ¼ 170, 340, 512, 682, from bottom to top). The maximal slope that can appear is 1, in accord with Eq. (3). Note the gradual roughening of the surface and the growing correlation length. Figure 17 shows the height and width of the growing surface, for very long times. These quantities are averaged over at least 10 5 realizations. In each realization only the entanglement across the center bond is used (therefore all data points are uncorrelated) and the system size is L ¼ at, where a is chosen to avoid finite-size effects (see Sec VI B). We obtain estimates β h and β w of the exponent β by fitting the data to the expected forms [cf. Eqs. (6) and (7)]: Here, η (with η < β w ) captures subleading corrections. We find Both estimates of β are in excellent agreement with the KPZ value β ¼ 1=3. The solid lines in Fig. 17 show the fits (the fit parameters are in Table I). The dashed lines show the slopes corresponding to the expected asymptotic power laws, hðtÞ ∼ t and wðtÞ ∼ t 1=3 . The analysis in Sec. IV implies that v E is a well-defined velocity, and v E t is a sharply defined length scale characterizing the range of entanglement in the state. We may confirm this by measuring this length scale directly; see the section below.
Note the small value of the subleading exponent η obtained from the fit. This implies that finite time corrections are  reduced if we plot the numerical derivative dw=d log t rather than w itself (both quantities scale as t 1=3 at long times). This is done in Fig. 18. The data fit well to the t 1=3 law even at short times. This will be useful for the more general dynamics where long times are not available.
To complete the check of the two independent KPZ exponents, Fig. 19 shows the spatial correlator GðrÞ defined in Eq. (9), as a function of separation r, for three successive times. For small r, the correlation grows like a r α with α ≃ 1=2, in agreement with the KPZ prediction for this exponent. For distances r ≫ ξðtÞ, the correlator saturates to a value proportional to wðtÞ. The figure gives an idea of the size of the correlation length ξðtÞ for these times.

B. Numerics on speeds and scaling forms
We argue in Sec. V that in addition to determining the entanglement growth rate, v E can also be viewed as a speed. This is the speed of the fictitious particles in Sec. IV. Operationally, the simplest manifestation of this speed is in the saturation behavior of the entanglement. The analytical arguments imply that to leading order (at large t and l) the entanglement across a cut at position l (l ≤ L=2) has the simple scaling form given above in Eq. (24): This gives simply S A ¼ v E t for t < l=v E , and S A ¼ l for t > l=v E . This means that there is no influence of the boundary at times t < l=v E . (See also the numerical results in Refs. [39,40], indicating sharp saturation in circuits where interactions between any pair of spins are allowed.) In Fig. 20, we test this result numerically for the Clifford evolution. We set l ¼ L=2 and plot as a function of L, for several values of the time (t ¼ 2 9 , 2 10 , 2 11 , and 2 12 ). Here, v E ¼ 0.1 is taken from the fits to Fig. 17. According to Eq. (51), this plot should converge for large t to a plot of fðuÞ against u. The results are in excellent agreement with the scaling form, confirming, for the case of Clifford circuits, that v E is a meaningful velocity. It is also interesting to compare the entanglement velocity v E with the butterfly velocity v B . We obtain v B from the average spatial extent W of a growing Pauli string (see Sec. IVA) under the unitary Clifford dynamics at time t, as v B ¼ W=2t. Remarkably, we find that v E ¼ v B =2 within  numerical precision for both the CNOT-only Clifford dynamics and the "full" Clifford dynamics defined above. This is shown in Fig. 21, where we plot W starting versus time for the two protocols. The initial Pauli strings we consider in this simulation are single-site Y operators. [The CNOT dynamics is not ergodic on the space of Pauli strings (unlike the full Clifford dynamics). Nevertheless, any operator grows in size at the same rate v B .] We compare W with 4 times the average entanglement entropy, 4SðtÞ. The two curves lie on top of each other, consistent with v E =v B ¼ 1=2.
We also find the same ratio for v E =v B in the exactly solvable large-q model (Sec. II A 2). However, it is possible to construct non-fine-tuned random circuits, involving Haarrandom unitaries at finite q, in which the ratio is less than 1=2 [73], so this value is not universal. A natural question is whether it is generic for random Clifford circuits.

C. Universal and phase evolution
The phase and universal dynamics take us outside the Clifford realm, and cannot be simulated efficiently on a classical computer (in polynomial time). We give evidence that the correspondence with KPZ continues to hold in this more generic situation. However, our results are not as conclusive as in the Clifford evolution as we do not have access to such long times.
The simulations are performed on spin-1=2 chains of length L ¼ 500 bonds (501 spins) using the ITensor package [86]. The two types of dynamics are defined as follows. [The two-site unitaries are always chosen from the set in Eq. (46); the initial state is taken polarized in the y direction.] Phase evolution.-Each single-site unitary is chosen randomly and uniformly from the set of eightfold rotations about the z axis in spin space: R ¼ exp ðπinσ z =8Þ, with n ∈ 1; …; 8.
Universal evolution.-This set of gates, unlike the others, is "universal" in the quantum information sense (any unitary acting on the full Hilbert space of the spin chain can be approximated, arbitrarily closely, by a product of gates from this set). The single-site gates include the eightfold rotations mentioned above, together with the Hadamard gate R H [. (29)]. R H is applied with probability 1=2 and the rotations with probability 1=16 each. Figure 22 shows the height and width hðtÞ and wðtÞ for the two protocols (averaged over 380 realizations for the phase evolution and 200 realizations for the universal evolution, and over bonds x with 20 < x < 480). The figure shows fits to the forms in Eq. (49) with β h and β w fixed to the KPZ value and η fixed to zero (fit parameters are in Table I). The fits with Eq. (49) are consistent with the data. It is not possible to extract precise estimates for β from the slope of the log-log plot of wðtÞ, although for the phase evolution the slope at late times is in reasonable agreement with the expected KPZ value, shown by the dashed gray trend line.
For an alternative attack on β, we plot the numerical derivative dwðtÞ=d ln t. Recall that in the Clifford case the slope of this quantity (when plotted against time on a loglog plot) has smaller finite-size corrections than the slope for wðtÞ itself. The corresponding plot is shown in Fig. 23, for times up to t ¼ 25 (averaging over more than 5000 realizations). The dashed gray lines are the t 1=3 trend lines. Results for both types of dynamics are in good agreement with the expected slope β ¼ 1=3.
Next, we examine the spatial correlator Eq. (9) in Fig. 24. For both types of dynamics, the behavior for r ≪ ξðtÞ agrees well with the KPZ exponent value α ¼ 1=2 at the largest available time.
The very long times accessible in the Clifford simulation allow us to establish KPZ exponents with high accuracy there. For the more generic dynamical rules we cannot reach the same level of precision, but, nevertheless, the KPZ exponent values are compatible with the data.
Next, we briefly discuss a fine-tuned situation in which entanglement dynamics are not KPZ-like: namely, when the system is made up of free particles. Then, in Sec. VIII, we move to higher dimensions.

VII. FREE FERMIONS ARE NONGENERIC
The growth of entanglement in systems of free particles is highly nongeneric. In the presence of noise, the entanglement of a system of free particles on the lattice grows only as S ∼ ffiffi t p , in contrast to the behavior S ∼ t of generic systems. The case of spatially homogeneous noise has been discussed recently [88]. The basic point is the same when the noise varies in space: the fact that the single-particle wave functions spread diffusively in the presence of noise implies that the entanglement cannot be larger than Oð ffiffi t p Þ [88]. As a concrete example, consider a short-range hopping Hamiltonian for free fermions, with noisy matrix elements H ij ðtÞ. For simplicity, take the initial state to consist of particles localized at sites i ∈ S for some set S; for example, we could take S to consist of all the even-numbered sites: Under the evolution, each creation operator evolves into a superposition of creation operators, where ψ ðiÞ ðj; tÞ is the solution of the time-dependent Schrödinger equation for a particle initially localized at i. In the absence of noise, ψ ðjÞ spreads ballistically, but in the presence of noise. it spreads only diffusively. The fact that each creation operator is spread out over only Oð ffiffi t p Þ sites after a time t immediately implies that the mean entanglement is at most of order ffiffi t p . (See also Ref. [88].) Note, however, that this argument does not tell us how large the fluctuations are. (Random unitary evolution of a single wave packet is discussed in Ref. [89]. However, we must consider the full many-body wave function, since the formalism of Ref. [90] for the free-fermion density matrix shows that the initially occupied orbitals do not simply contribute additively to the entanglement.) We have confirmed numerically that hSi ∝ ffiffi t p for a noisy 1D hopping model, using the formalism of Ref. [90] to construct the reduced density matrix. This is much slower than the linear-in-time growth of generic interacting models. The ffiffi t p scaling should apply for free fermions in any number of dimensions. In 1D it also applies to certain noisy spin models via the Jordan-Wigner transformation: for example, the transverse field XY model: However, any generic perturbation to the spin chain spoils the free-fermion correspondence. We then expect the generic KPZ behavior to reassert itself.

VIII. HIGHER DIMENSIONS
We discuss several ways of thinking about entanglement growth in 1D. One of these, the directed polymer picture, generalizes naturally to higher dimensions: the polymer is simply replaced by a d-dimensional membrane embedded in (d þ 1)-dimensional space-time. As in 1D, we think of this membrane as a coarse-grained version of a minimal cut bisecting a unitary circuit. The membrane is subject to pinning by "disorder" in space-time arising from the dynamical noise. See Fig. 25 for the two-dimensional case.
We can explore two kinds of questions using this picture. First, we can examine universal properties that are specific to the noisy scenario: as in 1D, fluctuations are governed by universal exponents. Second, we can calculate leading-order properties of SðtÞ that do not involve fluctuations and that are, therefore, likely to be valid even in the absence of noise, i.e., for dynamics with a time-independent Hamiltonian. In higher dimensions the behavior of SðtÞ has nontrivial dependence on the geometry of the region for which we calculate the entanglement. We suggest the "minimal membrane in space-time" as a simple and general heuristic for such calculations. Below, we discuss the case of a spherical region (Sec. VIII B) and contrast our results with an alternative simple conjecture. For other toy models for entanglement spreading, see Refs. [10,23].
Denoting the region for which we wish to calculate the entropy by A, and its boundary by ∂A, the membrane lives in a space-time slice of temporal thickness t, and terminates at ∂A on the upper boundary of this time slice; see Fig. 25. For simple shapes and for times shorter than the saturation time, the membrane also has a boundary on the lower slice, as shown in Figs. 25 and 26. In this section, we focus on entanglement growth prior to saturation.

A. Universal fluctuations of SðtÞ in noisy systems
Consider the entanglement SðtÞ for a region A whose boundary ∂A has length or area j∂Aj. In the d ¼ 2 case, shown in Fig. 25, j∂Aj is the length of the spatial boundary. Neglecting fluctuations, the "world volume" of the minimal membrane scales as j∂Aj × t. This gives the leading scaling of the membrane's energy and, hence, of the entanglement. As in 1D, subleading terms encode universal data. We now consider these terms.
The pinning of a membrane or domain wall by disorder is well studied [47,[91][92][93][94] (a brief summary is in Appendix E). Translating standard results into the language of the entanglement in a d-dimensional noisy quantum system, we find that in both d ¼ 1 and d ¼ 2 there is a unique dynamical phase with nontrivial critical exponents. The same is true for continuum systems [more precisely, for systems with continuous (statistical) spatial translational symmetry] in d ¼ 3. However, if a lattice is present, two stable phases (and thus a dynamical phase transition) are possible in d ¼ 3; one with nontrivial exponents and one with trivial ones. In the trivial phase, the membrane is "smooth" and is pinned by the lattice. In the nontrivial phases, the membrane is instead pinned by disorder in a "rough" configuration. We discuss the nontrivial phases (which are the only ones possible in d < 3 and for continuum systems in d ¼ 3).
Generally, fluctuations have a weaker effect in higher dimensions than in 1D. For simplicity, take a quantum system which is infinite in one direction and of size L in the other d − 1 directions, and consider the entanglement for a cut perpendicular to the infinite direction. Since A and its complement are both infinite, SðtÞ will grow indefinitely for this geometry. However, there are two regimes, t ≲ L and t ≫ L (here, we drop a dimensionful prefactor). For times t ≲ L (see Appendix E for details): where the exponent θ is defined below. This reproduces the 1D result with θ ¼ β.  are suppressed with respect to the mean by a factor of j∂Aj 1=2 : distant regions of the boundary give rise to essentially independent fluctuations which add up incoherently. In the opposite regime t ≫ L, the temporal dimension of the membrane is much larger than its spatial dimensions, so there is a crossover to the 1D directed polymer problem. However, the exponent of the higherdimensional problem appears in the universal L dependence of the growth rate: (The higher corrections will include the t 1=3 term associated with the 1D universality class.) Numerically, the exponent is θ ¼ 0.84ð3Þ in d ¼ 2 and θ ¼ 1.45ð4Þ in d ¼ 3 [94]. The subleading exponent in Eq. (57) is negative for d > 1, so this correction may be hard to observe numerically.
B. Minimal membrane picture for dynamics without noise In higher dimensions we can ask how SðtÞ depends on the geometry of region A when this geometry is nontrivial. Interestingly, the membrane picture makes predictions about this that do not involve the noise-induced fluctuations, and that are likely also to be valid for Hamiltonian dynamics without noise (with the replacement S → S=s eq we discuss in Sec. V).
As an instructive special case, take A to be a disk-shaped region of radius R in d ¼ 2. (A ball in higher dimensions is precisely analogous.) We assume continuous rotational symmetry, at least on average. At short times, the leading scaling of the entanglement is SðtÞ ≃ 2πv E Rt, since the world-sheet area of the membrane is approximately 2πR × t. However, there are corrections to this arising from the curvature of ∂A.
We consider the limit of large t and large R with a fixed ratio t=R. In this regime, the effects of fluctuations may be neglected, and instead the energetics of the membrane are determined by deterministic elastic effects. We write the energy of the membrane as where d 2 S is the membrane's area element and E is its "energy" density. We get SðtÞ by minimizing E with appropriate boundary conditions. Next, we Taylor expand E in terms of local properties of the membrane. For a flat "vertical" membrane (i.e., with normal perpendicular to the t axis), E ¼ v E . In general, however, E depends on the angle φ by which the surface locally deviates from verticality, as well as, for example, the local curvatures κ s and κ t in the spatial and temporal directions. Using rotational symmetry to parametrize the membrane by the radius rðt 0 Þ, However, this simplifies in the limit of interest. We first send t; R → ∞ with t=R fixed. In this limit _ rðt 0 Þ remains finite, but the curvature terms become negligible (see, for example, the explicit solution below), so we can write E ¼ Eð_ rÞ. Now we make the second approximation that t=R is small, meaning that we can keep only the Oð_ r 2 Þ correction.
The boundary condition at the top of the space-time slice is rðtÞ ¼ R. We consider times prior to saturation, so the membrane also has a free boundary on t ¼ 0. In the relevant limit, its energy is Minimal energy requires the boundary condition _ rð0Þ ¼ 0. When t=R is small, we may expand in 1=R. This gives rðt 0 Þ ≃ R − ðt 2 − t 02 Þ=ð4aRÞ, as illustrated in Fig. 26. The corresponding entropy is This calculation generalizes trivially to higher dimensions, where the correction is of the same order. Corrections due to fluctuations come in with negative powers of t, and are negligible in the limit we are discussing. Note that the first correction in the brackets in Eq. (63) is of order ðt=RÞ 2 , and not of order t=R. This result differs from what one might naively have expected if one guessed that at time t an annulus of widthṽ × t inside the disk is entangled with the outside, whereṽ is a tsunami velocity. This picture gives an entropy proportional to the area of the annulus, leading to a negative correction of order t=R. The difference between Eqs. (63) and (64) also indicates that a picture in terms of independently spreading operators is misleading, in agreement with what we find in 1D. It is interesting to note that in the regime where t=R is of order 1, the full _ r dependence of Eð_ rÞ plays a role. This suggests that an infinite number of nonuniversal parameters enter the expression for SðtÞ in this regime, and that there is no general, universal scaling form for the entanglement of a sphere in d > 1. However, we do expect saturation to remain discontinuous, as in 1D [Eq. (24)], occurring via a transition between an optimal membrane configuration that reaches the bottom of the space-time slice and one (with E ¼ πR 2 ) that does not.

IX. OUTLOOK
Quantum quenches generate complex, highly entangled states whose dynamics cannot usually be tracked explicitly. For this reason, analytical approaches to quenches have typically relied on additional structure in the quantum dynamics: for example, integrability, or absence of interactions, or conformal invariance. This paper instead studies dynamics that are as unstructured as possible. We propose that noisy dynamics are a useful toy model for quantum quenches in generic (nonintegrable, non-conformallyinvariant) systems.
Many of our results are, of course, specific to noisy dynamics: in particular, the emergence of KPZ behavior at long wavelengths in 1D, and the detailed pictures for entanglement growth afforded by the "KPZ triumvirate." But we suggest that some of our heuristic pictures apply to non-noisy entanglement growth as well (with the replacement S → S=s eq mentioned above). We propose a general directed polymer picture or minimal membrane picture for the scaling of the entanglement and mutual information (Secs. V, VIII B) and we use the operator spreading picture to clarify the meaning of the "entanglement velocity" and its distinction from the operator spreading velocity (Sec. V). "Thermalization is slower than operator spreading" in generic 1D systems (i.e., in general ,v E is smaller than v B ): by contrast, this is not true in 1+1D CFTs [2], or in certain toy models [23]. It would be interesting to make more detailed comparisons with holographic models [8].
Many interesting questions remain. First, within the realm of noisy systems, an analytical treatment for the regime with weak noise would be desirable, i.e., for dynamics of the form where H 0 is a time-independent many-body Hamiltonian, H 1 ðtÞ represents noise, and λ is small. Our conjecture is that KPZ exponents apply for any nonzero value of λ [unless HðtÞ is fine-tuned]-i.e., that there is no universal distinction between continuous time dynamics and quantum circuits.
(Note that there is no distinction between these two cases at the level of conservation laws: once noise is added, energy is not conserved even in the continuous time case.) However, our derivations and numerics correspond, roughly speaking, to the large λ regime. Perhaps the opposite regime could be addressed using a more explicit renormalization group (RG) treatment, although it is not obvious how to set this up. Such a RG treatment might also shed light on the nature of the entanglement spectrum or, equivalently, the dependence of S n ðtÞ on the index n. While we believe that all the Rényi entropies execute KPZ growth in the presence of noise, we have not pinned down the n dependence of the various constants. The solvable models suggest that the leadingorder behavior may be independent of n at large times. What is the appropriate scaling form for the spectrum? Limited time scales prevent us from addressing this numerically (except for Clifford circuits, where all the S n are trivially equal). (The entanglement spectrum is one window on the structure of the quantum states generated by the random dynamics. We can also ask in what ways these states differ from ground states of random Hamiltonians, when the amount of entanglement is similar.) As a more speculative question in the domain of noisy dynamics, we may ask whether there exist timeindependent Hamiltonians that show KPZ entanglement fluctuations, despite the absence of explicit noise, in some dynamical regimes. We emphasize that this seems unlikely on asymptotically long time scales for a generic system (since local reduced density matrices and observables will eventually thermalize), but it may hold on intermediate time scales in certain systems in which some degrees of freedom act effectively as chaotic classical variables and provide effective noise.
In the text, we discuss only initial states with area-law entanglement. A natural extension is to initial states with, for example, submaximal volume-law entanglement. The natural expectation, say, in 1D, is that the directed polymer picture extends to this case if we glue the unitary circuit to a tensor network representation of the initial state. Then the entropy Sðx; tÞ would include a fluctuating part with KPZ exponents together with a contribution from the initial state. Another natural direction to explore is the role of conserved quantities.
Turning to higher dimensions, it would also be useful to test the higher-dimensional membrane pictures of Sec. VIII, perhaps exploiting Clifford circuits to reduce the numerical difficulty of higher-dimensional dynamics.
There are also many further questions regarding deterministic systems for which the tools we introduce here may give insight. For example, the forthcoming Ref. [77] will discuss entanglement growth in disordered or inhomogeneous spin chains from the point of view of surface growth, while Ref. [73] will give results for the spreading of quantum operators. Consider a one-dimensional quantum spin chain of local Hilbert space dimension q, prepared initially in a product state, and apply a sequence of random unitaries that couple two neighboring spins. The location of the local unitary at a given time step is arbitrary. In the following we fix the location of the unitary, but take it to be Haar random.
This formula can be interpreted in matrix-product-state language. If d x is the minimal value of the local bond dimension required for an exact matrix-product-state representation of the state, then S 0 ðxÞ ¼ log d x . A heuristic parameter-counting argument for the local bond dimension, given in Appendix A 3, suggests Eq. (A1).
However, a more rigorous proof is necessary as such heuristic arguments can fail. In particular, one might naively conjecture a stronger statement: namely, that for any state at time t, if the unitary at bond x is Haar random, then Eq. (A1) is true with probability 1. This conjecture is false; a counterexample is given in Appendix A 2. We now give a proof of Eq. (A1).

Proof of Eq. (A1)
Our genericity proof consists of two parts. First, we show that given locations of unitaries, there exist certain unitaries such that at each time step Eq. (A1) is true. Second, we show the negation of Eq. (A1), happens if and only if a system of polynomial equations in the entries of the unitaries is satisfied. (The inequality ">" never holds, as we note in the main text.) By the first part of the proof, the zero locus of these polynomial equations does not cover the entire set of unitaries. Therefore, it is only a submanifold of strictly smaller dimension, which implies it has measure zero. For the first part, it is sufficient to consider only three types of local unitaries: the identity I, the swap gate W, and a unitary E with the property that it turns a pair of unentangled polarized spins, j11i, into q −1=2 P q i¼1 jiii, a maximally entangled state. Without loss of generality we may take the initial product state to be the polarized state j…1111…i.
We show that using these three types of unitaries at the given locations, one can construct a state whose entanglement entropy is given by Eq. (A1). Since Eq. (A1) defines the entropy inductively, we only have to show it inductively, too.
At t ¼ 0, all the spins are unentangled, so we can simply choose E for every designated location. Clearly, Eq. (A1) is satisfied. At later times, if we do not apply E except on an unentangled pair of spins, then a spin can be either unentangled or maximally entangled with a single other spin. Therefore, at time t > 0, the spin s L that is immediately left to the bond x can be (i) unentangled, (ii) entangled with a spin to the left of s L , (iii) entangled with the spin s R that is immediately to the right of the bond y, or (iv) entangled with a spin to the right of s R .
These are exclusive possibilities, and similarly s R has four options. Enumerating all 16 cases, which in fact reduces to seven different cases excluding invalid ones and those related by reflection, one easily checks that there is always a choice among I, W, E that makes Eq. (A1) true. Let us treat three exemplary cases here. If s L and s R are entangled at time t, then Sðx − 1; tÞ ¼ Sðx þ 1; tÞ and Sðx; tÞ ¼ Sðx − 1; tÞ þ 1, so one chooses the identity I. If s L is entangled with a spin on the left of s L and s R is entangled with a spin on the right of s R , then Sðx − 1; tÞ ¼ Sðx þ 1; tÞ ¼ 1 þ Sðx; tÞ. One chooses the swap W to obtain Sðx; t þ 1Þ ¼ Sðx; tÞ þ 2. If s L and s R are both unentangled, then one applies the entangling unitary E to obtain Sðx;t þ 1Þ − 1 ¼ Sðx;tÞ ¼ Sðx − 1;tÞ ¼ Sðx þ 1;tÞ.
For the second part, recall that for any bipartite state the number of nonzero Schmidt coefficients is equal to the number of nonzero singular values of the matrix M, which is nothing but the rank of M. For any positive integer r, the rank of M is smaller than r if and only if every r × r submatrix has determinant zero; i.e., all r × r minors vanish. Thus, a bipartite state jψi having Hartley entropy (log of rank of M) strictly smaller than log r is expressed by a system of polynomial equations on the coefficients of jψi. If jψi is given by U t …U 2 U 1 j0i, where j0i is a fixed product state, then the coefficients are some polynomials of the entries of the unitaries U i , and hence the equations that expresses vanishing determinants are polynomial equations in the entries of the unitaries. Our claim Eq. (A1) completely determines the Hartley entropy based on the location of unitaries, and therefore the spatial configuration of the unitaries tells us which minors we should check. Namely, the size r of the minors we turn into the polynomial equations is given by (the exponential of) the right-hand side of Eq. (A2). In other words, given a spatial configuration of unitaries, the polynomial equations that express Eq. (A2) are determined. The polynomial equations are over tL variables, and the actual number of equations is much larger yet finite. We do not need explicit expressions for these polynomials, only the fact of their existence. These polynomials might a priori read 0 ¼ 0; i.e., they could be trivially satisfied. In that case, the solution to the polynomial equation would be the entire set of unitaries, and Eq. (A1) could never be satisfied. However, we just showed in the first part that this cannot happen because there exists a choice of unitaries for which Eq. (A1) is satisfied. This implies that the polynomial equations are nontrivial and define a measure zero subset of the entire set of unitaries. This completes the genericity proof.

Counterexample to the stronger conjecture
We show above that Eq. (A1) holds when all unitaries are chosen generically and the initial state is a product state. Naively one might make the stronger conjecture: that the update rule Eq. (A1) holds whenever a generic unitary U is applied to an arbitrary-possibly fine-tuned-state jΨi. We construct an explicit jΨi, which is a counterexample to this stronger conjecture.
Consider four degrees of freedom ABCD. The spins B and C have dimension 2 each, and A and D have dimension 3 each. (To conform with our consideration of spin chains, the subsystems A and D should be regarded as subspaces of two or more spin-1=2's.) The most general form of a quantum state on ABCD is We consider T abcd ¼ T 0 abd δ c;0 , i.e., C is in j0i, where (This does not give a normalized state, but we are only concerned about ranks.) The Hartley entropy for the cut A=BCD is simple to compute. As we remark in the previous section, it is the rank of the coefficient matrix. Interpreting this matrix as a linear map, the rank is the dimension of the image of the map from BCD to A. The image is precisely the linear span of columns of T 0 a0d and T 0 a1d . They have three linearly independent columns, implying that the Hartley entropy for A=BCD is log 2 3. Similarly, the rank of the coefficient matrix for ABC=D is the dimension of the linear span of the rows of T 0 a0d and T 0 a1d , which reads 3. That is, the Hartley entropy for ABC=D is log 2 3.
If Eq. (A1) were to be true for the generic choice of Haar random unitary on BC, then we should be able to find a unitary on BC such that We show this cannot hold. Applying the unitary U on BC the state, we obtain X b;c where U 0 and U 1 are 2 × 2 matrices. The coefficient matrix for the cut AB=CD is then whose rank should be 6 if S 0 ðAB=CDÞ ¼ log 6. Computing all the minors of the 6 × 6 matrix V for arbitrary matrices U 0 and U 1 , we find that all (5 × 5) minors vanish, implying that V has rank at most 4. Therefore, for this nongeneric initial state, S 0 ðx; t þ 1Þ ≠ min½S 0 ðx − 1; tÞ; S 0 ðx þ 1; tÞ þ 1: 3. Parameter-counting argument Consider a 1D state jΨi in a matrix product representation. Labeling the states of the qubits (spins) by σ; σ 0 ; … running from 1 to q, Since the state is not translationally invariant, we allow the bond dimension d x to vary from bond to bond (a x ¼ 1; …; d x ). In an efficient representation, d x is equal to the rank of the reduced density matrix for a cut at x: We ask how S 0 ðxÞ changes when we apply a unitary U to the two spins, σ and σ 0 , either side of bond x. This effects the change (repeated indices are summed): To update the matrix product representation, we must find new matricesÃ andÃ 0 which satisfỹ A σ a x−1 ;a xÃ 0σ 0 a x ;a xþ1 ¼ U σσ 0 ;ττ 0 A τ a x−1 ;a x A 0τ 0 a x ;a xþ1 : ðA13Þ In order to solve this equation forÃ andÃ 0 , it will generally be necessary to increase the bond dimension at x to a new value d 0 x . Naively, the necessary value of d 0 x will generically be determined by equating the number of independent equations in Eq. (A13) with the number of degrees of freedom inÃ andÃ 0 . (However, the previous section shows that this expectation can fail for certain choices of A and A 0 .) The number of equations is q 2 d x−1 d xþ1 , since this is the number of possible values for the external indices in Eq. (A13). The numbers of degrees of freedom inÃ and A 0 are qd x−1 d 0 x and qd xþ1 d 0 x , respectively. However, d 0 x 2 of these are redundant, because the state is unchanged by the transformationÃ σ →Ã σ M,Ã 0σ 0 → M −1Ã0σ 0 , with M an arbitrary d 0 x × d 0 x matrix. Equating the number of equations with the number of independent degrees of freedom gives Choosing the smallest solution, NAHUM, RUHMAN, VIJAY, and HAAH PHYS. REV. X 7, 031016 (2017) This agrees with Eq. (A1) since S 0 ðxÞ ¼ log d x .
APPENDIX B: HAAR AVERAGE FOR Trρ 2 x Let ρ x ðtÞ be the reduced density matrix for a cut at x, obtained by tracing out the spins to the left of the cut. Each index on this matrix labels a configuration of the spins to the right of the cut. Let us temporarily label these spins 1; 2; …, and let the spin immediately to the left of the cut be denoted 0. The indices on the reduced density matrices are then ρ x−1 ðtÞ σ 0 ;σ 1 ;σ 2 ;… μ 0 ;μ 1 ;μ 2 ;… ; ρ x ðtÞ σ 1 ;σ 2 ;… μ 1 ;μ 2 ;… ; ρ xþ1 ðtÞ σ 2 ;… μ 2 ;… : ðB1Þ In the following we assume that repeated indices are summed. After applying a unitary on bond x, ρ x ðt þ 1Þ σ 1 ;σ 2 ;… μ 1 ;μ 2 ;… ¼ U τσ 1 ;σ 0 The index contractions give the result in the text:

APPENDIX C: ENTANGLEMENT ENTROPY OF STABILIZER STATES
A stabilizer state is a state of an n-qubit system defined by a complete set fg 1 ; …; g n g of commuting tensor products of Pauli matrices through equations The group generated by fg 1 ; …; g n g is naturally called a stabilizer group, and denoted by G [87,95]. A trivial example is the all-spin-up state, defined as for all i ¼ 1; …; n. The condition that jψi is nonzero and unique is equivalent to the condition that the operator is a projector of rank 1 [96,97]. Since jψi is in the image of this projector, we see Since this is a normalized pure density matrix, its trace is equal to 1. But a Pauli matrix has the property that it is traceless. Therefore, only the identity element on the right has nonzero trace: From this expression, it is straightforward to obtain expressions for reduced density matrices. Suppose the n-qubit system is partitioned into two complementary regions A and B. Tracing out B, we have Tr B ðgÞ is nonzero if and only if the tensor component corresponding to B is identity, in which case where gj A denotes the tensor components of g corresponding to A. The set of all gj A such that Tr B ðgÞ ≠ 0 can be regarded as a subgroup of G, which we denote by G A . The formula for ρ A now reads It is immediate that ρ A is proportional to a projector since it is a sum over a group. It follows that the rank of ρ A is equal to 2 jAj =jG A j. In particular, the (Rényi or von Neumann) entropy of ρ A with base-2 logarithm is The subgroup G A has period 2, and therefore log 2 jG A j is an integer, which is equal to the number of independent stabilizers supported only on A. This expression for the entanglement entropy has also appeared in Refs. [82,83]. Now, regard the stabilizer group G as a binary vector space V by ignoring the overall phase (sign) factors. Let Π A be the truncation map retaining the components corresponding to the region A, and similarly let Π B be the truncation map for B ¼Ā. It is routine to check that V decomposes as V A ⊕ V B ⊕ V 0 for some subspace V 0 ⊆V, where V A and V B are the spans of stabilizers supported only on A and B, respectively. Both the truncation maps are injective on V 0 . It follows that This completes the proof of Eq. (33).

APPENDIX D: NUMERICS FOR FULL CLIFFORD EVOLUTION
In Sec. VI A, we present numerical results for random unitary evolution using only the CNOT gates Eq. (31). Here, we present similar analysis using the full set of generators for the Clifford group, showing that the additional gates do not modify the universal behavior. The additional single-site gates are the Hadamard and phase gates defined in Eqs. (29) and (30). respectively. (The Hadamard gate corresponds to swapping the X and Z vectors, while the phase gate corresponds to adding the X vector to the Z vector.) The von Neumann entropy in units of log 2 and the corresponding width averaged over ∼2 × 10 5 realizations (except for the last data point, where ∼2 × 10 4 realizations are used for the average) are plotted in Fig. 27. The fit to the KPZ universal form Eq. (49) gives β h ¼ 0.2 AE 0.15 and β w ¼ 0.3 AE 0.04. We also obtain v E ¼ 0.194 AE 0.001, B ¼ 0.4 AE 0.2, C ¼ 0.4 AE 0.1, D ¼ 0.4 AE 0.6, and η ¼ −0.4 AE 0.8. These results are consistent with the KPZ universality and with the data presented in Fig. 17.

APPENDIX E: DETAILS OF STATISTICS OF MEMBRANES
The exponents governing the membrane problem are traditionally denoted θ and ζ, and are related by 2ζ − θ ¼ 2 − d [47]. Consider a patch of the membrane with linear dimensions scaling as l. This includes both its temporal dimension and its internal spatial dimensions: after a rescaling of time, the membrane is statistically isotropic on large scales. The mean "energy" of this patch of membrane scales as l d þ const × l θ , with fluctuations of order l θ . The length scale for wandering of the membrane in the transverse direction is of order l ζ . The numerical results we quote in the main text are in good agreement with an epsilon expansion about d ¼ 4, which gives ζ ≃ 0.208ð4 − dÞ [93] (see also Ref. [98]). The scaling forms for the entanglement we discuss in the text are easily found by regarding the membrane as made up of patches of appropriate linear size: size t for Eqs. (57) and (58) and size L for Eq. (59).
Note that the geometry of the membrane, including the transverse length scale (which is Δx ∼ t ζ for the regime t ≲ L), determines the dimensions of the space-time region around ∂A for which the final entanglement is sensitive to small changes in HðtÞ, i.e., in the history of the noise.

APPENDIX F: ENTANGLEMENT PROBABILITY DISTRIBUTION
As we mention in the main text, a remarkable recent advance in KPZ theory has been the derivation of the full universal probability distribution for the height of the surface at fixed position and fixed large time [56][57][58][59][60][61][62][63][64][65][66]; see Refs. [99][100][101] for reviews. In our case, this height corresponds to the entanglement S across a cut in a system undergoing noisy unitary dynamics. One may separate out the nonuniversal growth rate v E , and the nonuniversal constant D governing the scale of fluctuations, by writing The rescaled random variable χ is then expected to have a universal probability distribution PðχÞ at late times. This probability distribution depends on the initial condition for the surface. For a surface that is initially flat, PðχÞ is the Tracy-Widom distribution with β ¼ 1. This is the case relevant to our setup, where Sðx; t ¼ 0Þ ¼ 0. (In the directed polymer interpretation, this corresponds to a setup where the x-coordinate of the upper end point of the polymer is fixed but that of the lower end point is free; again, this is the setup relevant to our minimal cut picture.) Other initial conditions for a growing surface can give different universal forms for PðχÞ-for example, the socalled "narrow wedge" initial condition gives the Tracy-Widom distribution with β ¼ 2.
(The latter distribution is likely to be relevant to noisy growth of entanglement between two subsystems that are initially unentangled with each other, but separately highly entangled.) [The generalization of the directed polymer picture to entangled initial states (Sec. IX) indicates that the lower end point of the polymer is then no longer free, and instead favors x ¼ 0.] In Fig. 28, we fit numerical data for the probability distribution of S to the expected Tracy-Widom form and, for comparison, to a Gaussian distribution. The data are for the "full" Clifford dynamics (defined in Sec. VI A) at time t ¼ 2048. Each fit involves two parameters, corresponding to the mean and the variance. The Tracy-Widom distribution used is the theoretically expected one with β ¼ 1, but, in fact, the present data do not allow us to discriminate between TW β¼1 and TW β¼2 . The Tracy-Widom distribution is a much better fit to the data than the Gaussian, as quantified in the caption of Fig. 28. This is further confirmation of KPZ universality in the Clifford case. A more detailed investigation of the probability distribution is beyond the scope of this paper, in view of finite-time effects at the accessible time scales. A fit to TW with β ¼ 2 is not shown, but is indistinguishable at the scale of the figure. Bottom: Best fit to the Gaussian. Clearly, the Tracy-Widom distribution fits the data better than Gaussian, as the latter shows systematic deviation. The 1 − R 2 values for the fits are 2.1 × 10 −4 for TW β¼1 , 2.0 × 10 −4 for TW β¼2 , and 1.6 × 10 −3 for Gaussian.