The entanglement membrane in chaotic many-body systems

In certain analytically-tractable quantum chaotic systems, the calculation of out-of-time-order correlation functions, entanglement entropies after a quench, and other related dynamical observables, reduces to an effective theory of an ``entanglement membrane'' in spacetime. These tractable systems involve an average over random local unitaries defining the dynamical evolution. We show here how to make sense of this membrane in more realistic models, which do not involve an average over random unitaries. Our approach relies on introducing effective pairing degrees of freedom in spacetime, inspired by the structure emerging in random unitary circuits. We apply this approach to some translationally invariant Floquet spin chains studied in the literature. We show that a consistent line tension may be defined for the entanglement membrane, and that there are qualitative differences in this tension between generic models and ``dual-unitary'' circuits. These results allow scaling pictures for out-of-time-order correlators and for entanglement to be taken over from random circuits to non-random Floquet models. We also provide an efficient numerical algorithm for determining the entanglement line tension in 1+1D.

One way to motivate this membrane, which is relevant to the approach we take here, is via the multi-layer structure of the quantum circuit (or the multi-sheet structure of the path integral, in a continuum language) representing the observables in question. Algebraically, conventional correlation functions, such as an expectation value O(t) following a quench, involve a single copy of the unitary time evolution operator U (t), and a single copy of its Hermitian conjugate U † (t). Formally, we can represent them in terms of matrix elements of the operator U (t) ⊗ U * (t). But the quantities mentioned above (for example the N th Rényi entropy, obtained by tracing the N th power of the reduced density matrix) require matrix elements of the multiply "replicated" operator U (N ) ≡ U ⊗ U * · · · U ⊗ U * (1) with N > 1 copies of U , and N copies of U * . In a path integral language, we require multiple forward and backward paths [20,26]. The structure arising from this is most easily understood in random unitary circuits, which are chaotic models built from random unitary gates [6-8, 11, 14]. Averaging U (N ) in Eq. 1 over the random unitaries introduces a degree of freedom σ(x, t), at each location in spacetime, that labels a pairing between the set of forward replicas and the set of backward replicas [14]. Formally this pairing is a permutation, σ ∈ S N , and quantities like the Rényi entropies and the out-of-time ordered correlator (OTOC) map to effective partition functions for σ(x, t). In this setting, the membrane can be understood as a domain wall between different values of the pairing field σ(x, t) [7,11,14].
The basic quantity characterizing the membrane is its coarse-grained "line tension" E(v) [18]. This line tension determines the system's butterfly velocity or operator spreading speed and its entanglement growth rate. It can be thought of loosely as associating an entanglement cost with a given curve in spacetime (in general this tension depends on N ). See Fig. 1 for a schematic picture. We expect that the membrane description continues to make sense in systems more general and more realistic than random circuits, for the heuristic reason described in the  next paragraph. In support of this, there is numerical evidence that a consistent membrane tension can be defined in a generic spin chain [18], and in a dual-unitary circuit [34]. There are analytic computations in random Floquet circuits in the large Hilbert space dimension (large q) limit [11], showing that the domain wall structure makes sense in that limit. Finally, there is an analytical derivation of the form of the membrane tension in holographic systems [35], at leading order in the number of degrees of freedom; this also begins to make a concrete connection between the entanglement membrane, and the Ryu-Takayanagi/Hubeny-Rangamani-Takayanagi geometrical pictures for entanglement in AdS space [36][37][38]. (Lattice toy models for the AdS-CFT correspondence, using random tensor networks, also exhibit domain walls between permutation degrees of freedom [49? ].) But at present there is no formalism for explaining how the pairing degrees of freedom emerge when there is no average over randomness, or how to compute the properties of the membrane in a generic system without randomness and away from any large-N -like limit. As a more general point, random circuits and related stochastic models have shed light on various aspects of chaotic dynamics [6][7][8][9][10][11][12][13][14][15][16][17], and a natural question is how to generalize these calculations to non-random systems. Our aim here is to provide a formalism for doing this.
Heuristically, the paired configurations mentioned above can be thought of as saddle points of the path integral for the replicated evolution operator (1). This path integral contains multiple forward and backward paths. In a continuum language, these are weighted by e −iS and e +iS respectively, where S is the action; in the discrete setting, e −iS is replaced by a product of matrix elements of local unitaries. By pairing each forward path with a backward path we can cancel the phases in the exponent, so we might guess that paired configurations predominate. In the cases of interest to us, however, the boundary conditions force the pairing to differ in different regions of spacetime. This leads to saddle-point solutions with a nontrivial membrane structure.
In this work we show explicitly, for a large class of systems without conservation laws, how the pairing field σ(x, t) and the entanglement membrane emerge, making the above heuristic picture precise. We emphasize that we do not require either a limit of large local Hilbert space dimension, or any kind of randomness. To demonstrate the utility of our approach, we compute the entanglement line tension (for N = 2, the case relevant to the OTOC and to the second Rényi entropy) for a variety of translationally-invariant Floquet spin-1/2 chains.
Formally, our approach is to work in the replicated Hilbert space (acted on by Eq. 1) and to insert, in each time slice, a local resolution of identity: this includes states |σ associated with pairings, but also states in the complement of the Hilbert space, which we denote by ⊥. Traces of U (N ) , yielding for example correlation functions or Rényi entropies, can then be written as partition functions for an effective "spin" s(x, t) which can take the t σ σ ⊥ cluster (a) Figure 2. Thick membrane (schematic). Except in the special case of Haar-averaged unitaries, the membrane (a domain wall between permutations σ and σ ) is thickened by clusters where the effective spin takes the value "⊥". In the multilayer tensor network for U (N ) , this represents the propagation of states orthogonal to permutation states. (For N > 2 the domain wall is also decorated by other permutations, distinct from σ or σ [11,14].) values either s = σ (a permutation) or s =⊥. Including these non-pairing states allows the structures that appear in the random circuit to be generalized to models without any random average. The Haar-averaged random circuit can be thought of as a special case: there, the contributions from configurations with ⊥ vanish after averaging, and the membrane is a domain wall between permutations [7,14]. In, say, a translationally-invariant system, the domain wall structure becomes more complex, with the appearance of the spin value s =⊥ along the domain wall. (In fact, this is also true in a particular realization of the random circuit, before we average over the random unitaries.) This leads to a thickened domain wall with a more complex structure. However we argue that in chaotic systems, the domain wall's width remains of order 1 in terms of microscopic lengthscales. Therefore, on large scales, the membrane is still well-defined, but with a nontrivial, renormalized membrane tension function. We illustrate this situation in Fig. 2.
Our initial introduction of the pairing field is exact but formal, because the microscopic weight for a configuration s(x, t) is in general complicated. To make progress we first argue that configurations with large unpaired regions (large regions of ⊥) are exponentially suppressed. Then we show how to resum an infinite number of configurations, or "Feynman diagrams", exactly, to give a simpler description of the entanglement membrane, in terms of a reduced set of parameters.
Before tackling translation-invariant systems, we analyze the case of a fixed realization of a random unitary circuit. This illustrates the basic mechanism that makes the approach possible -suppression of ⊥ states by phase cancellation -in a tractable setting. Since the disorder realization is fixed, we cannot introduce the pairing degrees of freedom by Haar-averaging. Instead we apply the new method described here. Working at large but finite Hilbert local space dimension, we show that the membrane remains well-defined, and is thickened slightly by occasional appearances of ⊥ states. This leads to a membrane that inhabits a disordered "potential" in spacetime. This picture reproduces our previous results on Kardar-Parisi-Zhang (KPZ) fluctuations in the Rényi entropy S 2 between different realizations of the random circuit [14]. But previously these results required the replica trick, which we are able to dispense with here.
Next we turn to Floquet dynamics of spin-1 2 chains, without any kind of randomness. In order to control the calculations, we introduce a systematic expansion in the maximal temporal extent of connected ⊥ clusters in the effective spin model. A priori, this expansion does not have a small parameter. However we conjecture (based on the tractable case of the random circuit) that it is a convergent expansion in chaotic Floquet systems.
Using this observation, we construct an efficient numerical scheme for computing the membrane line tension E(v) using the expansion above: we truncate the maximum size of a connected ⊥ cluster (but we do not require that these clusters are dilute), and we examine convergence as a function of the order at which we truncate.
The numerical results show good convergence for a wide range of chaotic circuits, including prototypical spin models for quantum chaotic systems, such as Floquet Ising models with longitudinal and transverse fields (kicked Ising models) [21,39,40].
The line tension E(v) satisfies some general constraints [18] which provide a check on our results. Recall that dt E(ẋ(t)) is proportional to the "free energy", in the scaling limit, of a membrane whose shape is parameterized by x(t) (Fig. 1). E(v) is a convex function of velocity, which is tangential to the line E = ±v at the point (±v B , v B ), where v B is the quantum butterfly velocity (for parity symmetric systems), see the schematic picture in Fig. 1(b). For the chaotic systems we study, our numerical approximations to E(v) indeed appear to converge a form satisfying the constraints above.
We also investigate a special case of the kicked Ising model that has maximal entanglement growth [34] and the property of "dual unitarity" which, remarkably, allows a range of exact computations [41][42][43][44]. We argue that the entanglement membrane has special properties in dual-unitary models: its equation of motion becomes hyperbolic, rather than being parabolic (diffusive) as it is in generic circuits. We consider time evolution by a unitary circuit made of two-site gates, with the structure in Fig. 3. We will use U (t) or simply U to denote the full many-body unitary represented by the entire circuit, and the lower-case u to stand for a local two-site gate.

CONTENTS
At this point the local gates u are left arbitrary. In some sections we will restrict to Floquet models with both space and time translation invariance, but this is not necessary. The circuit geometry guarantees that there is no propagation of information outside a strict lightcone with speed v = 1; the actual butterfly speed in these circuits is in general strictly smaller than 1.
Conventional time ordered correlation functions such as Tr O(x n , t n ) . . . O(x 1 , t 1 ) involve a 'two-layer' quantum circuit made of U and U * , in the sense that they can be obtained from the contraction of U ⊗ U * by inserting operators in the intermediate time-steps. In contrast, the quantities of interest to us here require traces of the 2N -layer unitary circuit where N > 1. For example, the out-of-time order correlation function requires N = 2. The the Rényi entanglement entropy for subsystem A requires N = n to write the n-th power of the timeevolved reduced density matrix. Taking N > 1 leads to new structure. One consequence is that it multiplies the number of conservation laws, since formally each 'copy' of the system undergoes independent unitary evolution. This will not concern us here, however, since we study models without conservation laws. The additional structure we examine is instead associated with pairings between spacetime histories in the 'worlds' associated with different layers [14].
The multi-layer circuit has N layers of U and of U * , and can be thought of as describing evolution of a 'replicated', multi-copy system. For a heuristic picture, we may think in terms of Feynman histories of this replicated system in our chosen local basis. The amplitude for a given Feynman history is given by a product of matrix elements of the local unitaries. Feynman histories that are 'paired', so that the N histories occuring in the U replicas are pairwise equal to the N histories occuring in the U * replicas, avoid phase cancellation [14]. (This pairing can happen in N ! different ways.) We will argue that in quantum chaotic systems, the observables mentioned above are dominated by Feynman histories that have a well-defined pairing structure after coarse-graining. We will show how a corresponding local 'pairing field' σ(x, t) can be defined by coarse-graining over microscopic length and time scales.  To motivate the structures we will consider, we first  recall what happens when the local unitaries are averaged  over the Haar ensemble instead of being fixed. For a given two-site gate u, acting on sites i and i + 1, the replicated gate u (N ) is a unitary operator on 2N copies (N layers of u, N layers of u * ) of the two-site Hilbert space, which has dimension d = q 2 . Its average (which is no longer unitary) is the operator where Wg is the unitary Weingarten function [45][46][47].
Here τ and σ are elements of the permutation group S N . These label a pairing of the u layers, labelled 1, 2, . . . , N , with the u * layers, labelled1,2, . . . ,N . For example σ corresponds to pairing 1 with σ(1), etc. At a given spatial site, the state |σ i is a product of maximally entangled states between paired copies of the site, and the states above are for a pair of spatial sites: |σσ = |σ i ⊗ |σ i+1 . For example, for N = 2, there are two possible permutations: the identity I and the transposition (12). In our local basis the corresponding states at a site are a 1 , a 1 , a 2 , a 2 |I = δ a1a1 δ a2a2 , a 1 , a 1 , a 2 , a 2 |(12) = δ a1a2 δ a2a1 (6) (note that we do not normalize these states).
In Ref. 14 we showed how, starting with the above expression, the expression for U (N ) could be reduced to a lattice magnet, with local interactions, for the permutations σ (the permutations τ were integrated out in this mapping). At the end of this mapping, there is a single σ spin associated with each local block in the circuit. Now let us consider a general circuit with the brickwork geometry in Fig. 3. The basic idea will be to separate out the pairing states, |σ , in the multi-copy Hilbert space for a pair of sites, from states in orthogonal complement of this space. To simplify notation we will often write |σ for the two-site state |σσ .
Eq. 5 shows that in the Haar-averaged case contributions from the orthogonal complement vanish identically. More generally they do not vanish. However we will argue that the permutation states dominate in a certain sense, and that states in the orthogonal complement can be taken into account via a systematic procedure.
In fact, the expression for the Haar average of a local unitary in Eq. 5 can be regarded simply as the projection operator onto the subspace of permutation states. By the invariance of the Haar measure under (say) left multiplications, showing that this is a projector. The state |σ satisfies u (N ) |σ = |σ for any choice of two-site unitary, so after Haar averaging we also have u (N ) This shows that the Haar averaged object, which is a Hermitian operator on the replicated space, is equal to P , the orthogonal projector onto the subspace spanned by permutation states [48].
To separate out contributions from permutation states for a general circuit, we will insert the resolution of the identity (on two physical sites) immediately before each two-site gate in the circuit. Here P ⊥ is the projector onto the subspace complementary to that spanned by pairing states. This insertion is shown in (10) below as a bar below each gate. We will view the quantity to be computed, for example the purity, as a partition function which we denote Z. (In the purity example, the second Rényi entropy is the free energy associated with this partition function.) After using 9, this partition function is a sum of terms with all possible choices of projection operator inserted below each gate. Graphically, where the sum is over all possible assignments of , ⊥ to the projection operators, which are represented by horizontal bars. We leave the boundary conditions at the initial and final times unspecified at this point, since they depend on the observable to be computed. Note that the operator P can "absorb" an arbitrary two-site gate. This can be seen from the invariance of the Haar measure. For any 2-site unitary u, As a result, the term in the partition function where all the insertions are P is identical to the expression where all gates are replaced with Haar-averaged gates. Therefore the decomposition in Eq. 10 gives us a wellunderstood starting point, which we can go beyond by taking into account domains of ⊥ insertions. Roughly speaking, this approach will be useful if these domains do not form a connected cluster 'percolating' from the top to the bottom of the spacetime slab, but instead form finite clusters. Their effect can then be taken into account as a renormalization of the interactions for the permutation degrees of freedom that are present in the random circuit.
To make these permutations explicit we decompose P using Eq. 5. In the random circuit it was useful to perform the sum over the τ variables (cf. Eq. 5) explicitly, leaving a partition sum solely for the σ variables. Algebraically, that corresponds to splitting P into projection operators P σ , which are given by grouping together terms in Eq. 5 with a given σ: Using the properties of the Weingarten function, one can check that the states appearing in the brackets form the dual basis of |σσ , i.e. (σσ) * |τ τ = δ τ,σ , so that the non-Hermitian operators P σ = |(σσ) * σσ| are projectors satisfying (App. A): Here we assume that q is generic, to avoid divergences in the Weingarten function, which can arise for small values of q when N is sufficiently large (for q = 2, such divergences can arise for N > 4). This is not a fundamental obstacle, as we could always define a "spin" with a larger local Hilbert space dimension by grouping sites. Our resolution of the identity, inserted below each block in the multi-layer circuit, is now refined to Let us represent the possibilities graphically as: Importantly, the blocks on the left, labelled by permutations, are independent of the choice of local unitary gate, because P σ again "absorbs" any unitary. For any gate u, we have u (N ) |σ = |σ , so However the ⊥ blocks, which represent the tensor u (N ) P ⊥ , depend on the local unitaries defining the dynamics.
We have now written Z as a sum over spins s, associated with each block in the circuit, which take values in Throughout this paper, we will use x, t to label the bond between two lattice sites. The gates are located on half of the bonds. Note that "⊥" is simply a choice of label for one of the values that s can take; we could also have denoted this state by "0" or anything else. At each location, s runs over N ! + 1 different values.
The weight of a spin configuration {s} in this partition sum is given by contracting the network. In general, this weight is not simply a product of local terms. However we will argue that for the boundary conditions of interest, and for generic models, there is an effective notion of locality. We first describe some basic properties of the weights (Sec. II B), and then specify to the case of interest, where the boundary conditions induce a domain wall structure in the spin configuration (Sec. II C).

B. General properties of the spin model
First consider the sub-partition-function without any P ⊥ insertions (i.e. with s =⊥ everywhere), which we denote Z 0⊥ .
Z 0⊥ is a statistical model whose "spins" σ are permutations [14]. The spins have local three-body interactions on "down-pointing triangles" (and no interactions on uppointing triangles). To be specific, each down-pointing triangle such as has an associated weight From Eq. 13, the interaction is (see App. A) This spin model was described in Ref. 14 for the random circuit. In that context, the spins σ arose from the Haar average over the physical random unitary in Eq. 5. The three-body interaction appears after integrating out the "τ " spins. Here the sum over the τ spins appeared at the state level in defining the projection operator in Eq. 13. The simplest case is N = 2 [7]. There are then two possible permutations, I and (12), which we denote + and −. The weights are symmetric under exchange of + and − and under spatial reflection. They are + + − = 0, with which follows from Eq. 13 with the explicit expression and symmetrically for P − . Note the vanishing of the first weight, which amounts to a hard constraint on the spin configurations. The spin configuration {s} is in fact highly constrained for any N : the underlying unitarity means that J vanishes for many configurations of the triangle. In other words, many terms in the sum over s ∈ S N defining Z 0⊥ are zero (see Ref. [17] and App. B for a further simplification). For example, as a generalization of the first equality in Eq. 23 above, This enforces a light cone structure. For example, for the purity calculation which we discuss below in Sec. II C, the boundary spins on the top are s x,t = I for x < 0 and s x,t = (12) for x ≥ 0. This means that the spin configuration is only nontrivial within a backwards lightcone emanating from the entanglement cut at the top boundary [i.e. we must have s x <x−t+t ,t <t = I and s x ≥x+t−t ,t <t = (12)]. This light-cone structure is preserved when we include configurations with s =⊥. If the two spins at the top of a triangle are the same permutation σ b , the lower spin cannot be ⊥, because by definition the projector onto ⊥ is orthogonal to |σ b σ b : So again in the purity example above ⊥ can only be inserted in the backward lightcone emanating from the entanglement cut. Let us discuss the locality of the spin interactions. We have seen that in the absence of ⊥ the Boltzmann weight factorizes into three-body terms representing interactions between three adjacent spins. Once we have large clusters of ⊥ this is no longer true. However, the weight does factorize into a product of separate weights for each cluster of ⊥ (together with the local weights mentioned above): there is no interaction between disconnected clusters of ⊥. This is because surrounding a cluster of ⊥ with permutation states dictates a definite way of contracting up the ⊥ blocks, yielding a c-number. (We will give some explicit formulas below.) Therefore, if the partition function is dominated by configurations in which clusters of ⊥ have a finite typical size, locality will be regained after coarse-graining beyond this scale.
Finally, we note that the model we have defined has a global symmetry, with the symmetry group This is a consequence of the fact that Eq. 2 involves multiple copies of the same unitary [14,49]. The importance of S N × S N symmetry has been emphasized in random tensor networks [49], where permutations labelling pairings also appear [? ]. Graphically, the symmetry arises from the the possibility of permuting the layers of the original unitary circuit without changing Eq. 2. We can permute the U layers among themselves with a permutation g L ∈ S N , and we can also separately permute the U * layers among themselves with another permutation g R . This gives the S N × S N subgroup of G, which acts on the spins via The Z 2 subgroup of G arises from the fact that the multilayer circuit is invariant if we exchange all the U layers with all the U * layers, and also complex conjugate the circuit. The resulting symmetry acts by For both types of operation, the spin state ⊥ is invariant because these exchanges of layers preserve the and ⊥ subspaces of the two-site Hilbert space. The simplest case is N = 2, when G reduces to an Ising-like Z 2 symmetry relating + = I and − = (12). In this case the symmetry in Eq. 30 becomes trivial, since σ = σ −1 .
In the sub-partition function Z 0⊥ the weights J are invariant under G because the Weingarten function and the overlaps in Eq. 13 depend only on the cycle structure of products of the form στ −1 . This cycle structure is invariant under the above operations.
The G symmetry is a symmetry of the bulk interactions for the spin model. It will however be strongly broken by the boundary conditions we require.

C. Domain wall structure
As a useful illustrative example, let us consider the purity of a region A, e −S2(A) = tr(ρ 2 A ). We take A be + − a semi-infinite half-system to the left of the origin in an infinite chain. This quantity requires N = 2. If we take the physical system to start off in the state |ψ , and if we denote the replicated state in the four-copy Hilbert space by |ψ (2) , then where we have labelled the two permutation states at a site, I and (12), by + and −. For simplicity we take the initial state to be a product state, |ψ = ⊗ x |e x . Again let us first consider the sub-partition-function Z 0⊥ , which is equivalent to that for Haar-random unitaries treated in Ref. [7]. The spins have fixed boundary conditions at the top, enforcing a domain wall between + and −, and free boundary conditions at the bottom, because +|e (2) = −|e (2) = 1. As a result of the fact that J(+, +; −) = J(−, −; +) = 0 (the first diagram in Eq. 23) the only nonvanishing configurations in Z 0⊥ contain a single directed domain wall emanating from the entanglement cut, separating an infinite domain of + on the left and an infinite domain of − on the right. This is illustrated in Fig. 4, top. Formally where K, defined in Eq. 24, is the weight for a single step in the directed walk. Now we consider including the terms in the partition function with s =⊥, in order to address models without a Haar average. We have shown that the spins outside the backward light cone are equal and fixed by the final-time boundary condition (a consequence of causality) and that ⊥ can only appear within the backward light cone. Hence, we can define a 'thick' domain wall separating the connected infinite domain of + on the left from the connected infinite domain of − on the right. In principle, this thick domain wall could fill the entire backward lightcone: for example the ⊥ blocks could form a large percolating cluster of O(t 2 ) size. If such configurations are dominant, then the representation of the partition function in Eq. 19 is not useful.
However, we will argue that this does not happen for a typical chaotic choice of the circuit U , because of a phase cancellation effect which suppresses large ⊥ clusters. Instead the typical thickness of the domain wall remains of order 1 in t as t → ∞, so that in a typical configuration the domain wall has the schematic structure in shown in Fig. 4. The domain wall has a nontrivial structure made up of clusters that can contain ⊥, + and −; these clusters are separated by the places in spacetime where the domain wall becomes "thin", which we will take to mean "of minimal width". After coarse graining we recover a membrane picture similar to that in the random circuit, and we can define a renormalized line tension for this membrane.
We will give analytical and numerical evidence for this picture in Secs. III, IV. In some cases, there may be a small parameter which makes the analytical calculation of renormalized line tension simple. We will argue in Sec. III that this is the case for typical choices of u at large local Hilbert space dimension q. But even if there is no small parameter, the renormalized line tension can be well defined. We will see below how to extract it from simulations.
We now rewrite the partition function of the domain wall in a way which will be useful when the width of the thick domain wall does not grow with t. 1 As noted above, we then expect that a typical configuration has order t locations where the domain wall is thin. We define a directed path that is made of steps connecting these locations: i.e. steps of length t ≥ 1 connecting the "pinch points" in Fig. 4(b). The definition of one of these "irreducible steps" is that domain wall is thin at the beginning and end of an irreducible step, but nowhere in between.
The weight for a given domain wall configuration can then be written as a product of weights W (∆x, ∆t) for each step of extent ∆t in the time direction and ∆x in the space direction. Factorization into such a product follows from the locality property mentioned in the previous subsection (in the paragraph following Eq. 27).
To simplify the formulas, we restrict for now to systems with space and time translation symmetries such that all the local gates are identical, but the generalization to other cases is direct (Sec. III).
Define a partition function Z(x, y; t) with a modified bottom boundary condition, such that there is a domain wall between + and − at position x at the top boundary, and at y at the bottom boundary. This can be enforced using the dual states |± * defined in Eq. 14: (x labels bonds of the lattice as in Eq. 19.) Let the number of timesteps where the domain wall is thin be N + 1, and let their spacetime coordinates be: The steps are of time duration and spatial extent For our choice of lattice geometry, ∆x + ∆t is necessarily even. Defining the weight for an irreducible step to be W (∆x, ∆t), the total partition function is given by summing over paths of all possible lengths N : Because of translational invariance Z only depends on one spatial argument, so we will also write Z(x, t) ≡ Z(x, 0; t).
The weight for a step of duration 1 is The simplest case is for Z 0⊥ , when these are the only steps allowed: The ⊥ insertion produces non-zero values for W (x, t > 1), which may be either positive or negative. The presence of negative steps is not necessarily an obstacle to defining a coarse-grained line tension. If positive steps predominate then these negative weights will disappear under coarse-graining. However we will suggest that in some fine-tuned cases negative weights are important (Sec. V D). The rewriting in Eq. 37 is exact for the boundary conditions we have chosen, but this rewriting will only be useful if two conditions are satisfied. First, that the irreducible step weights decay sufficiently fast for large ∆t: at a minimum we require that the ratio W (x, t)/Z(x, t) tends to zero at large t. Second we require that the original partition function of interest, where the lower boundary condition is free, is simply related to Z(x, y; t).
To be more precise, we should distinguish between different kinds of usefulness. First, we conjecture that the above is useful for generic chaotic models for deriving a coarse-grained picture that is in the right universality class. Second we will argue that for some strongly chaotic models the above representation is also practically useful for numerical determination of quantities like the line tension E(v) and the butterfly speed v B . Third, in a more restricted class of models with a large parameter, the above can be used to obtain these quantities analytically.
We will describe the numerical algorithm that follows naturally from the above representation in Sec. IV. Its starting point is to use the recursive form of Eq. 37 to compute W (x, t): This expression follows simply from splitting the partition function Z(x, t) for the directed path into the term W (x, t) from the single-step path with N = 1, and from paths with N > 1 whose final irreducible step has weight W (x − y, t − t ). Since Z(x, t) can be computed numerically by taking appropriate traces of U (t) (N ) , we can compute the irreducible step weights W (x, t) by starting with the weight for t = 1 in Eq. 38 and employing Eq. 40 recursively to get the irreducible weights for larger t. The weights can then be used to extract E(v) and v B .
In practise we will be limited to t ≤ t max for some t max , so the usefulness of the algorithm will depend on how rapidly the weights W get smaller at larger t. We apply the algorithm to some non-random but chaotic Floquet models in Sec. V, with encouraging results. The results support the universal picture described in this section, with a domain wall that is not microscopically thin but has an order 1 width.
The approach can be modified in many ways which do not change the basic structure but which might improve the practical usefulness of the algorithm for a given choice of model. For example, we can insert projection operators on single sites instead of on double sites, yielding a slightly modified spin model. This has the advantage that the weight of a step of duration ∆t = 1 is no longer independent of the gate u (which it is in the present setup, see Eq. 38). It is clear that at least for some choices of u this will give better approximations for small t max . Nevertheless here we will stick with the geometry above, since it is convenient for making contact with the random circuit case, and is sufficient to illustrate the basic ideas.

D. Weights for smallest nontrivial ⊥ cluster
In some limits it is a good approximation to consider only the smallest possible ⊥ cluster, namely a single isolated ⊥ block (Sec. III). We can easily write down the weights for such a minimal cluster. They depend on the singular value structure of the local unitary gate u where ⊥ is inserted.
Recall that as a consequence of unitarity (Eq. 27). So locally, there are four configurations for one isolated ⊥, up to + ↔ − symmetry: In fact only three of these are independent, as 2 and 3 are equal for any u, even if it is not reflection-symmetric. Explicit formulas are given below in Eq. 46.
For the translationally invariant case, the first three diagrams above give the irreducible domain wall weights W (x, t) defined in the previous subsection for t = 2: By contrast 4 does not contribute to W (x, 2), as it does not yield a thin domain wall at the bottom. This diagram will appear only in configurations contributing to W (x, t > 2). In the limit discussed in Sec. III, namely typical unitaries at large q, only 1 is required at leading nontrivial order in 1/q. In order to express the values of 1 to 4 , recall that we can regard the gate u as a quantum state for four qstate spins ("vectorisation" of the operator). In a tensor network language, this simply means that we regard all four of the legs sticking out of u as physical spin indices. Let us label these legs A, B, C, D as Regarding u as a state makes it clear that (after normalizing this state) we can define the entanglement between any subset of {A, B, C, D} and the complement following the usual prescription for states. These are referred to as operator entanglements [50][51][52][53][54], and have been studied in the context of quantifying the "entangling power" of unitary gates [50,55]. Unitarity implies that {A, B} is maximally entangled with {C, D}, but other entanglements depend on the gate u. For example, if u is close to the identity then {A, C} is weakly entangled with {B, D}.
For N = 2 we require the purities P = e −S2 P ≡ P AC , P ≡ P AD .
All others are equivalent (e.g. P BD = P AC ) or independent of u (e.g. P A = q −1 , P AB = q −2 ). In terms of these, the weights in Eq. 42 are (recall 3 = 2 ): See App. D for computations. We now describe an analytically tractable case where it is sufficient to consider dilute ⊥ insertions of the above form (Sec. III). Then in Sec. IV we consider Floquet models for spin-1/2s, where it is important to allow for larger clusters of ⊥.

III. A TRACTABLE LIMIT: LARGE LOCAL HILBERT SPACE DIMENSION
We now apply the above formalism to a typical realization of a random unitary circuit at large q. This is a useful test ground for our approach, because we can quantify the suppression of large clusters of ⊥ blocks analytically.
It is important here to distinguish between the Haar average of e −S2 or of U (t) (2) , and and a typical individual realization drawn from the random circuit ensemble. The latter represents a specific chaotic time evolution, with much more structure than is captured by simply averaging U (t) (2) over the ensemble.
If we average U (t) (2) , the weight of any configuration with a ⊥ is set to zero, leading to Z 0⊥ . But in a given realization of the circuit this is not the case, and the ⊥ insertions have an important effect. For example, they modify the growth rate of averaged entropy S 2 , because this quantity is not equal to the simpler "annealed average" − ln e −S2 .
Consider a particular spin configuration in the partition sum Z (Eq. 19). As noted in Sec. II B, each cluster of ⊥ blocks in the configuration contributes separately to the Bolzmann weight of the spin configuration. We denote the weight of a given cluster C by Ω C . It depends on the geometry of the cluster, the spins ± on its boundary, and the local unitaries in the spacetime region inside the cluster.
If we average Ω C over the random unitaries inside the cluster region, then we obtain Ω C = 0 because of cancellation between positive and negative values. This is just the statement that, after Haar averaging, Z becomes equal to Z 0⊥ , where no ⊥s appear.
However, we can ask what the magnitude of Ω C is for a typical realization of the circuit. We can quantify this by computing Ω 2 C , which is not zero. We show below that (at least for large q, where the calculation is controlled) Ω 2 C is exponentially suppressed in the temporal duration of the cluster, with the exponential suppression getting stronger and stronger as q is increased. Therefore the Haar circuit at large q is a setting where we can put the phase cancellation conjecture on a quantitative footing. We do this next (Sec. III A). The present approach to the random circuit also allows us to compute statistical fluctuations in the second Rényi entropy due to circuit randomness in Sec. III B. We recover the result that these are in the KPZ/DPRM universality class. Previously this result required the replica trick [14]. Here however we can make the mapping to a directed polymer at the level of an individual circuit, so the correspondence with known classical problems can be made without use of the replica trick (Sec. III B).

A. Exponential suppression of large ⊥ clusters
Let us consider an isolated cluster C of ⊥ blocks. For simplicity we consider a cluster with trivial topology like that shown in Fig. 5(a), but a general cluster with holes inside can be treated similarly. Eq. 41 shows that the weight of the spin configuration vanishes if any ⊥ block has two spins of the same sign (++ or −−) above it, so the ⊥ cluster must lie on a domain wall between + and −. We wish to compute the mean square weight of the cluster, Ω 2 C . Haar-averaging maps Ω 2 C to a partition sum for spins σ ∈ S 4 . The permutations are now in S 4 rather than S 2 because squaring the diagram doubles the number of layers in the replicated circuit. The spins outside the cluster are fixed, so simply provide a boundary condition for those inside.
This partition sum can be computed relatively simply when q is large. We summarize the main features here and give a detailed analysis in App. C. The domain wall labelling convention we use is described in Ref. [14]: a domain wall with a domain of σ L to its left, and a domain of σ R to its right, is labelled by the permutation σ −1 L σ R . If we Haar-average the circuit u (4) we obtain spins σ ∈ S 4 with the interactions J specified in Eq. 13. Here the interactions are modified, because we fixed the configurations of spins s ∈ {+, −, ⊥} before squaring and averaging. This involves insertions of P + , P − and P ⊥ that modify the triangle weights discussed in Sec. II B.
In the doubled problem there is a doubled (12)(34) domain wall incoming at the entanglement cut, as illustrated in Fig. 5. In a + domain the S 4 spins are fixed to I, and in a − domain they are fixed to (12) (34). Inside the ⊥ cluster the spins must be summed over. However, they are very restricted. Because of the modified triangle weights, spin values in the subgroup S 2 × S 2 ∈ S 4 are suppressed in the interior of the cluster: a triangle whose vertices are all equal and take one of the values I, (12), (34) or (12)(34) has weight zero. In fact we have where the symbol on the left denotes the weight for a triangle whose lower spin is associated with a ⊥ block.
In the large q limit, the leading term involves the incoming doubled (12)(34) (24)(13)]. This does not incur any "bulk" cost, because a triangle all of whose vertices are equal to (14)(23) has weight unity by Eq. 49. The two doubled domain walls along the two boundaries of the cluster incur a weight q −4 per time step. 2 Consequently the mean squared cluster weight is of order Ω 2 ∼ q −4t cluster or even smaller, where t cluster is the time duration of the cluster. (We have not included the cost of the "thin" domain wall sections above and below the cluster.) This indicates a typical magnitude for a ⊥ cluster of a given size in a typical large q circuit. Note that this weight is much smaller than the weight ∼ q −t cluster of a section of thin domain wall of the same time duration. Similar considerations apply to clusters with more complex topology, i.e. with holes inside. Therefore, any configuration with a ⊥ cluster lasting for t cluster steps has a weight that is suppressed by at least q −t cluster compared to the leading configurations in the same spacetime region.
It should be noted that "thick" domain walls are suppressed by a cost in the exponent that scales not with the area in spacetime, but instead with the length of the boundary of the region contained in the thick domain wall. This is reminiscent of a domain wall between two different spin states in the ordered phase of the Potts model with Q > 2 states, where bubbles of other spin states can appear on the domain wall [56]. Here this scaling is a necessary consequence of unitarity. As a result of it, the effect of boundary conditions can be subtle. For example, a boundary condition at t = 0 that favours ⊥ may be able to induce a large domain of ⊥, because a boundary free energy of order t can compensate the additional domain wall cost of order t in the bulk. This is not the case for the quantities considered here, however. Indeed, for entanglement growth starting from a product state, the t = 0 boundary condition favours ± over ⊥.
The results above strongly support the conjectures of the previous section. They also mean that, at large q, we can obtain the nontrivial fluctuations of the entanglement entropy in a random circuit by considering only single-⊥ clusters. We discuss this next, before moving on to deterministic models.

B. KPZ/DPRM scaling in the random circuit
Recall that e −S2 = Z with the appropriate (free) lower boundary condition. Averaging over Haar-random gates gives Z = Z 0⊥ (Sec. II). However, the averaged Rényi entanglement entropy S 2 cannot be obtained from − log(Z), because of fluctuations.
Let us first recall the replica approach used previously. This involves computing the replicated partition function, Z k , and extracting quantities like S 2 from the k → 0 limit. At large q there is a simple picture for the replicated partition function Z k . It describes k domain walls that interact via a mutual attraction. In the microscopic S 2k spin model, the attraction is due to an additional local spin configuration that is possible when two domain walls come into contact with each other [14].
This description can be identified with the replica description of a well-known classical problem, the directed polymer in a random medium [57,58]. In that context, the free energy of the polymer can be calculated by replicating the system k times and integrating out the disorder. For an appropriate choice of lattice model, the resulting k polymers will have exactly the same mutually attractive interactions as the domain walls here. Therefore, we have indirectly mapped the entanglement to the free energy of a single domain wall in a random potential that modifies the weights of steps. This leads to KPZ scaling [57,58] for the fluctuations of the entanglement.
In the present approach we consider a single domain wall, without replicas. However, to go beyond Z 0⊥ , we must "dress" the domain wall with insertions of ⊥. We By Eq. 41, these ⊥ insertions are restricted to lie on the domain wall. In the notation of Sec. II C, this gives us a nonzero weight for a step with ∆x = 0 and ∆t = 2. The notation indicates that this weight depends on the local gate u, unlike the deterministic weights W (±1, 1). Since all the us are independent, this yields an uncorrelated random potential for our "polymer" in spacetime. 3 In an equivalent stochastic differential equation representation for S 2 (x, t), which is the KPZ equation, W (0, 2; u) determines spatiotemporal noise. The above picture, with random weights assigned to vertical steps, agrees with the replica approach: we just need to check the strength of the noise matches.
The average W (0, 2; u) vanishes. Its variance is given by the squared average of 1 in Eq. 42 (App. E): This matches the result for the variance in Ref. [14]. We therefore confirm that a thin domain wall with dilute ⊥ insertions reproduces the result of the replica calculation.
We now apply the formalism of Sec. II to models that are invariant under both space and time translations. In the brickwork circuit geometry, such a model is defined by a single 2-site unitary u.
In Sec. II C we defined the "thick" domain wall that appears in the calculation of Z(x, t), and the reduction of this partition function to one for a directed path with irreducible step weights W (∆x, ∆t) that take into account successively larger clusters of ⊥. In Sec. IV A we describe how to obtain systematic approximations to E(v) = E 2 (v) and the butterfly velocity v B from these weights. In Sec. V we present data for several models.
The numerical method for obtaining W (∆x, ∆t) is described in App. E 1. We use exact diagonalization, which allows us to treat ∆t up to t max = 8. Together with the analytical results below, this yields a straightforward algorithm which can be applied to any circuit.
Recall that the line tension function is encoded in the asymptotics of the partition function Z(x, t) defined in Eq. 37: We use generating functions to determine E(v). Define generating functions for the partition function Z and for the irreducible weights W , We can relate these generating functions using Eq. 37, which yields a geometric sum for ζ: By truncating the sum in Eq. 56 at order b tmax we will obtain a polynomial approximation to ω(s, b). This approximation is physically motivated: it amounts to setting a maximum step length t max in the directed walk of Eq. 37. We expect this approximation to improve systematically with t max for chaotic models.
We then need the prescription for obtaining the asymptotics of Z, or in other words the line tension E(v), from the numerically accessible quantity ω(s, b). This can be done with the pole method. For a given s, let b 0 (s) be the smallest root (in absolute value) of the denominator ω(s, b) − 1 in Eq. 57. Then the desired relation is This relation is explained in Appendix F. The root b 0 appearing here has a physical meaning (although we will not need it here): if we write s eq Γ(s) = ln |b 0 (s)|, then s eq Γ(s), which is related to s eq E(v) by a Legendre transformation, Eq. 58, is the Rényi entropy growth rate in a state with ∂S/∂x = s [18].
As an example, consider the lowest-order approximation where we truncate W at t max = 1, corresponding to the partition function with s =⊥. We then need only At this order, ω(s, b) = 2Kb cosh s, giving the root The quantity ln b 0 (s) + vs in Eq. 58 is maximized by s = tanh −1 v. Plugging this in gives This is indeed the correct result for Z 0⊥ [14,18] (and also the leading order result in the random circuit at large q).
In the numerical algorithm we compute W (x, t) up to t = t max and neglect W for larger times. Given a constant s, we then solve numerically for the smallest zero b 0 (s) of then E(v s ) is given by s −1 eq (ln |b 0 (s)| + v s s). (We see this by setting the s-derivative of the argument in Eq. 58 to zero, and extracting the derivative of ln |b 0 (s)| from Eq. 63.) We iterate over s to construct the entire curve.
This approach also allows us to extract the butterfly speed v B , as the point 5 where E(v B ) = v B . The "entanglement speed" v E for the second Rényi entropy, for a quench from the product state, is simply E(0). 4 W here is real by definition. In principle the smallest root b 0 may be complex (in which case it has a complex conjugate partner b * 0 ). We have allowed for this above. However from the physical interpretation of the quantities above we expect that b 0 is real and positive at least for large tmax.

B. General parameterization
In our discussion of the numerical results, it will sometimes be useful to use the following parameterization of the two-site gate u, which is an arbitrary SU(4) matrix [42,61,62]: where The u i s scramble the individual sites and the two-site unitary u sym entangles the two sites. We will mostly restrict to the reflection-symmetric case such that u 1 = u 2 and u 3 = u 4 . Additionally, where v is any single site unitary. This amounts to a trivial redefinition of the circuit, and one may check that the conjugate operators v and v −1 all cancel out in the weights in the partition function. Exploiting this symmetry, we can take so that the general symmetric gate is parameterized only by the two vectors h and J.

V. APPLICATION TO FLOQUET MODELS
We are finally ready to apply the numerical scheme in Sec. IV A to a variety of Floquet models, described in the following subsections. For each model we show the sequence of approximations to E(v), indexed by t = 1, . . . , t max with t max = 8: see for example Fig. 7. These successive approximations take weights W for longer and longer steps into account.
The curve for t = 1 (in red) is independent of the gate defining the circuit: this lowest approximation matches the result of the random average computed in Eq. 62. It can be thought of as a baseline showing how far the Rényi entropy growth in a particular circuit deviates from the random average.
Our algorithm works if the curve converges sufficiently rapidly to the actual line tension function. In addition to showing the bare plots, we perform several other checks of convergence. In Sec. V C we will directly examine the  decay of W (x, t) and W (x, t)/Z(x, t) at large t. The assumption on the structure of the domain wall implies that the latter should be negligible at large t.
Additionally, we check that the line tension function E(v) satisfies several constraints. It is positive, convex, greater than or equal to |v|, and tangent to this line only at v L/R B . We therefore plot the boundary curve E = |v| with dotted lines. We also mark an estimate of v B from an independent numerical computation of the OTOC (the protocol is described in App. E 3). We show this estimate as a pair of black dots.
Convergence to a form consistent with these constraints is a nontrivial test, because they are not built into the formulas for finite t max (indeed we will see that they can be disobeyed by the small-t max approximations).

A. Generic symmetric and asymmetric gates
First, we consider the following one-parameter family of reflection-symmetric gates (see Eqs. [65][66][67][68]: The parameter x tunes the strength of the interaction between the two qubits. For small x, local scrambling will take a long time: we do not expect our algorithm to converge rapidly in that case, so we will choose x reasonably large. Otherwise, the numbers above are arbitrary and were chosen to ensure (1) that there is no fine-tuning in the sense of any of the coefficients J α vanishing, (2) that u i represents a rotation on the Bloch sphere by a significant angle (here π/5). Fig. 7(a) shows results for the gate with x = 0.8. The results are consistent with a relatively fast convergence of E(v). We see that this chosen gate has a smaller v B and a smaller v (2) E than the annealed average for the random circuit.
Note that the results are in striking agreement with the general constraints listed above: the asymptotic E(v) is a convex curve that touches the line E = v at a single point, which is consistent with (v B , v B ) as obtained from an independent calculation of v B .
Next, Fig. 7(a) shows an example of a generic gate without the reflection-symmetry constraint. We simply picked a random gate from the Haar distribution on U(2) and used it to build a translation-invariant circuit. The matrix elements of this gate are given explicitly in Appendix E23.
Again, the algorithm appears to be working. However the convergence is now to an asymmetric curve with E(v) = E(−v). The touching point on the left marks v R B and that on the right marks v L B .

B. Chaotic Ising models
Next we consider Floquet (or "kicked") Ising models [21,39,40] with the Floquet unitary with a longitudinal field to spoil integrability: This can be written in the brickwork circuit form (Fig. 3) with the gate where: We expect that for generic values of the parameters (τ, h x , h z ) this gate defines a chaotic model. In special limits, such as h z = 0 or h x = 0, the model becomes integrable: in those limits we do not expect our algorithm to succeed. Interestingly, this model also has a "self-dual" line in parameter space [34,[41][42][43]63]: where some quantities can be computed exactly. In particular, entanglement growth is maximal on this line. This is related to the fact that for this choice of parameters, the tensor u remains unitary if it is rotated so as to exchange the roles of space and time [41,42]. This has been referred to as "dual unitarity". Though this property is highly fine-tuned, the model is believed to remain chaotic for generic values of h z on this line [63]. The fact that v E = 1 for the dual-unitary models implies v B = 1 and that the line tension is flat as a function of velocity [18]: This is an interesting test case for our algorithm. First, the above form is very far from our perturbative starting point. Second, the flatness of E(v) implies that for dualunitary models the large-scale properties of the domain wall are rather different from the generic case, with negative signs in W playing an important role. We describe this in Sec. V D.
To begin with we check the algorithm for a presumably generic set of parameters within the kicked Ising class in Fig. 8(a): The algorithm appears to be working, and the curves converging to one that represents a gate more entangling than the random average. Kim and Huse studied similar values of the fields, 6 but with the smaller period τ = 0.8 [39]. However, the Kim-Huse values (τ, h x ) (0.8, 0.90) happen to be quite close to the dual-unitary values, 7 which are (τ, h x ) (0.785, 1)! This explains the strong entanglement growth in the Kim-Huse model. Our numerical results for this case (not shown) show a line tension function with a v B close to 1 and a larger E(v) than for τ = 1.
We now check our algorithm for the dual-unitary case, The value of h z is not important for the dual-unitarity condition, but we should have τ h z = 0 mod 2π to avoid the model being free. Results are shown in Fig. 8(b). The curves seem to converge to the expected flat line E(v) = 1. The touching points where E(v B ) = |v B | seem to converge to the expected value v B = 1 very fast. 6 More precisely hx = (  Therefore the domain wall structure appears to make sense even for dual-unitary gates, which represent a limiting case. However in this case negative signs in W are important: without these it is impossible to have a flat E(v), see Sec. V D.

C. Structure of W matrix
We now examine the temporal decay of the step weights W defined in Sec. II C. In order to examine the importance of long steps, we consider the normalized quantity W (x, t)/Z(x, t): this compares the weight of a single long step, with displacement x and duration t, with the total weight of all paths with the same two endpoints. This ratio should decay with t if asymptotically long steps can be neglected. (Individually, both W (x, t) and Z(x, t) decay exponentially with t.) A slight extension of the reasoning in Sec. III A shows that in a typical realization of a random circuit at large q, the typical value of this ratio (for a given starting point of the domain wall in spacetime) decays like q −t . Fig. 9 shows heat maps of W (x, t)/Z(x, t) for examples of translation-invariant Floquet models for spin-1/2s. Panel (a) is for the generic reflection-symmetric gate in Eq. 68, whose line tension function was shown in Fig. 7(a), and Panel (b) shows the dual-unitary gate of Eq. 78 and Fig. 8(b).
In both cases the weight is small at large t, though finite-time effects in this quantity seem to be large for the former gate at displacement x = 0. We also find decay of W/Z for the generic asymmetric gate discussed in Sec. V A and for the kicked Ising model with the generic parameters in Eq. 77 (data not shown).
The domain wall picture appears to be well-defined both for the generic models and for the dual-unitary model. However there are key differences between the two cases which we discuss in Sec. V D. Note that the weight at x = 0, t = 2 is positive for the gate in Panel (a) and negative for the dual-unitary gate in Panel (b). In the former case this extra positive step weight increases Z(0, t) compared to Z 0⊥ (0, t), leading to decreased v (2) E . In the latter case the negative step weight decreases Z(0, t) compared to Z 0⊥ (0, t), helping the dual-unitary gate to attain the fastest possible decay of Z(0, t), i.e. the most rapid possible entropy growth.
Let us also show an example where local gate is very weakly entangling, resulting in slower convergence of the algorithm. This is the reflection-symmetric gate defined in Eq. 69, but with the smaller interaction constant x = 0.5 (in Sec. V A we showed results for x = 0.8).
First we show the ratio W/Z in Fig. 10. We see that there are large values close to the edges of the lightcone. However, the apparent decay inside the lightcone suggests that the results may still converge, just more slowly than in the cases examined above.
The slow convergence in this case is related to the fact that the gate is weakly entangling. First we examine the function Γ(s) defined in Sec. IV A (whose physical meaning, for |s| < ln 2, is as an entanglement entropy growth  Fig. 11(a). Note that they are no longer concave functions over the whole range s. However, this effect gets less severe with increasing t. If we restrict to the concave region in performing the Legendre transform, we get a reasonable result for E(v): see Fig. 11(b). This is consistent with slow convergence to a function that satisfies the general constraints, with a very small entanglement growth rate.

D. Entanglement membrane and dual-unitarity
Microscopically, the step weights W (x, t) can be of either sign. But we conjecture that, for generic models, the minus signs do not change the universality class. Heuristically, we can think of the step weights becoming positive after coarse-graining, so that the membrane is effectively a classical directed path with diffusive wandering, at least This coarse-grained picture implies a strictly convex E(v), with a positive second derivative that is related to the diffusivity of the path. [This diffusivity can depend on the coarse-grained velocity v, which we condition on in order to define E(v).] In more detail, the scaling of Z(x, t) is: The power-law prefactor is universal and comes from the fact that both endpoints of the random path are fixed. For example, consider for concreteness the case where x t and the coarse-grained velocity is close to zero.
We may identify this with the probability for a random walker, with diffusion constant D = (2s eq E ) −1 , to be at position x at time t.
The dual-unitary model behaves differently. The membrane and its line tension function still appear to make sense, but its wandering is not diffusive. For example, we can contrast the scaling of Z(x, t) with the formula above. Our results are consistent with the large t scaling with a trivial x-dependence inside the lightcone (we neglect the even-odd effect at the lattice scale). Evidently, in the dual-unitary case, the membrane is not a classical random walk. To understand this, let us write Eq. 40 (in the limit of large t) as This is a discrete analogue of a linear partial differential equation in x and t.
We expect that in the generic case we have an equation of parabolic -diffusive -type, analogous to the continuum equation (this is schematic). However, by fine-tuning W , we can make the coefficient of the leading time-derivative term ∂ t Z vanish, so that the leading term is second order, giving a hyperbolic, wave-like equation instead. We conjecture that this is what generically happens in the dualunitary case. A toy model is to take only a single additional nonzero W element (beyond the ones present in Z 0⊥ ), which we parameterize in terms of a constant µ as:  Fig. 8(a). Z(x, t) is consistent with the solution of a parabolic equation (e.g. the decay at x = 0 is compatible with t −1/2 scaling, Eq. 80). (b) Dual-unitary model as in Fig. 8(b). Z(x, t) is approximately constant in the interior of the light cone and zero outside.
To simplify the notation, let us remove the factors of K by extracting a factor of K t : we define Z(x, t) = K t Z(x, t) and W (x, t) = K t W (x, t).
When µ < 1 we are in the parabolic class. However when µ = 1 (in fact the dual unitary model is numerically quite close to this simplified model) Eq. 82 becomes which is a discrete version of the wave equation, (∂ 2 t − ∂ 2 x ) Z = 0. This gives a Z(x, t) which is constant inside the lightcone and zero outside.
Numerically, we find that the structure in the specific dual unitary kicked Ising model that we studied above is similar: 2 t Z(x, t) is approximately constant within the lightcone, and zero outside. Fig. 12 contrasts the spacetime pattern of e seqv E t Z(x, t) for the generic kicked-Ising model and the dual-unitary kicked Ising model. The approximate constancy of Z(x, t) within the lightcone, which is related to the hyperbolic structure, is imposed by dual-unitarity. This fact can be seen by considering a slightly different partition function: Z op (x, y, t) = · · · + + − −|U (2) Here x and y label the positions of the domain walls between + and − on the top and bottom boundaries respectively. Fig. 13(a) shows an example of q 2L Z op (x, t). In contrast to the domain wall partition function Z(x, t) defined in Eq. 33, Z op (x, y, t) does not precisely fix the end point of the domain wall to y, but instead penalizes configurations whose end point deviates from y. However, in analogy to the random circuit case, we expect that (so long as |x − y| ≤ v B t, which in the dual-unitary circuit is a trivial restriction since v B = 1) this only leads to an O(1) boundary term in the free energy, so that Physically, the above partition function Z op simply computes the operator entanglement of the time evolution operator itself, for a certain partition of the "legs" of the time evolution operator. This operator entanglement is maximal in the dual unitary circuits, like various other measures of entanglement in these systems [41,44,63]. This is easily seen by manipulations similar to those in [41,42]. Specifically (setting y = 0) we show that Z op (x, t) = q −t for a dual-unitary circuit, so long as x is within the light cone of the origin.
We represent the multi-layer two-site unitary evolution matrix as a four-leg tensor: We have labelled the external legs of this tensor by A, B, C, D. As a unitary, the legs A, B correspond to Figure 13. The partition function Zop(x, t) for a dual-unitary circuit. (a) q 2L Zop(x = 1, t = 5). By unitarity (the first identity in Eq. 90) we can remove the unitaries outside the lightcone, giving q 2t Zop(x = 1, t = 5) as expression (b). Then Eq. 90 allows the remaining gates to be removed. the row index and C, D to the column index. We rotate the tensor by 90 • to obtain the dual tensor. Dualunitarity requires the dual tensor to be unitary (with B, C as row legs and A, D as column legs) as well. To contrast with the generic gate in Eq. 44, we have drawn the dual-unitary tensor as a (fourfold symmetric) square in Eq. 89. For N = 2, unitarity and dual-unitarity can be used to propagate ++ (or −−) states from the top to bottom, right to left, and vice versa. Graphically, we have (the same holds for arbitrary N with + replaced by any σ) The first of these relations (unitarity) allows us to remove the unitary gates outside the lightcones in Fig. 13(a). Keeping track of the q factor, q 2t Z op (x, t) equals the expression in Fig. 13(b). Next, dual-unitarity (the second relation) can be used to propagate the + states to the right, so that all the unitary gates are removed and Fig. 13(b) is reduced to +|− t = q t . Hence q 2t Z op (x, t) = q t , so that Z op (x, t) = e −seqt within the lightcone.

VI. OPERATOR SPREADING
We have focused on the second Rényi entropy as a test case, but the present formalism can be applied to almost any observable expressible in terms of U (N ) . One of the simplest is the out-of-time order correlation function, which requires N = 2, and which on large scales we expect to reduce to a stochastic growth process [7,8].
The OTOC can be written in terms of the same partition sum as for e −S2 , but with different boundary conditions. In the Haar-averaged circuit this reduces to a simple partition function for a pair of domain walls. Since we have now made sense of the domain wall in a generic, fixed circuit, this picture from the random circuit carries over. The modification is that the domain walls now have a nontrivial internal structure, including finite regions of ⊥, with an associated microscopic timescale t 0 . A typical configuration from the partition sum is illustrated in Fig. 14.
For a brief heuristic summary, we consider the OTOC in the form of a commutator squared of two traceless operators,C We take the expectation value at infinite temperature, and to simplify the boundary conditions we average over local unitary rotations of the local operators Carrying out manipulations similar to Sec. VI B of [7], we rewriteC(x, t) using the present + + + + + + + + + + + + + + + + + + + + + − O(x, t) (92) The operator insertion at the top initiates a domain of − in the bulk. Because of the − boundary condition at the bottom, this domain prefers to expand at rate ±v B (assuming reflection symmetry). 9 However, the operator insertion at the initial time cannot lie in the + region exterior to the − domain, because this would lead to a factor of +|− * = 0. If |x| < v B t thenC(x, t) is constant at leading order, C(x, t) = q −2 tr O 2 tr O 2 . But if |x| > v B , the lower operator insertion "stretches" one of the domain walls, forcing it to move at a speed greater than v B . This is illustrated for the left domain wall in Fig. 14. The additional free energy cost means that, outside the lightcone, for x/t > v B , the OTOC scales as [64] C In random models these results can also be obtained from a stochastic growth process for the operator cluster [7,8,13]. In this picture, µ(v) is the large deviation function describing the probability for the boundary of the operator cluster to propagate at speed v [65]. From the above discussion, we expect that in generic models this stochastic picture applies after coarse-graining to timescales greater than t 0 .
Forthcoming work using the memory matrix formalism will give a complementary view on the operator spreading problem [66].

VII. OUTLOOK
In random circuits, the membrane picture emerges in a simple way from the average over unitaries whenever we deal with dynamical quantities, such as Rényi entropies or OTOCs, that require multiple forward and backward evolution operators. Here we have shown how to make sense of this membrane in realistic systems that do not involve randomness. This opens the way to extending the various results that have been obtained in random circuits to a much broader class of more realistic models.
We separated the local multi-copy Hilbert space into "paired" states, and states in their orthogonal complement, ⊥. Dynamical quantities then became partition functions for an effective field s(x, t) ∈ {⊥} ∪ S N . The state ⊥ of the effective spin here is reminiscent of the state S = 0, representing a vacancy, in the Blume-Capel model (the classical Ising model with vacancies, where the spin takes the values S = 0, ±1). We have shown that in strongly chaotic Floquet systems the "vacancies" ⊥ dress the structure of domain walls between different pairing states, but leave the basic structure of the domain wall intact on large scales.
After using a typical realization of a random circuit as a test case, we applied this picture to Floquet systems, for spin-1/2, that have no small parameter at all and no randomness.
The method gave convincing results for the line tension function for N = 2 (which determines both the second Rényi entropy growth rate and the butterfly velocity) from calculations on relatively small lengthscales. A simple but important ingredient was a resummation of the weights in the spin model into a single function W (x, t), describing the amplitude for the membrane -which in 1+1D can be viewed as a directed path -to take a step of temporal extent t and spatial displacement x. Interestingly, there was a qualitative difference in the structure of W for generic circuits and for dual-unitary, maximally entangling circuits.
Many questions remain for the future. First, there are many kinds of extra structure that could be incorporated and which might be expected to have interesting effects.
We have focused on Floquet models without any conserved quantities. Any slow mode, for example a conserved U(1) charge, will interact with the dynamics of the membrane. (A single physical conserved charge gives rise to multiple conserved charges in the replicated Hilbert space.) Results on random circuits with conservation laws provide a starting point [9,10]. One cautionary note is that it has recently been shown that the dynamics of the higher Rényi entropies, for which the present method is suitable, is qualitatively different from the that of the von Neumann entropy [15,67].
In the present approach, the von Neumann entropy corresponds to a nontrivial replica-like limit, which it would be interesting to understand better even in the absence of conserved quantities.
Also interesting are continuous spatiotemporal, as opposed to internal, symmetries, for example Lorentz invariance. Such symmetries will constrain the effective field theory of the membrane, as well as giving new hydrodynamic modes [68,69].
There is considerable freedom in the precise way in which the effective spins are introduced. We chose to insert a resolution of the identity for pairs of sites, since this was convenient for making contact with the Haarrandom circuits studied previously. We can also insert the resolutions of identity for single sites. This does not introduce any fundamental differences, but is likely to improve the convergence of our numerical scheme (especially for weakly-entangling gates) since for the same t max we may define more independent W elements.
We may also consider dynamics which is not of circuit form. The most straightforward generalization is to Floquet systems that do not have a circuit representation. Effective spins can be introduced in the same way, again with an arbitrariness in how we choose to place them.
The basic idea of this paper is not restricted to 1+1D: the effective partition function can be introduced in the same way in any number of dimensions, and we may again argue for the coarse-grained picture in terms of a codimension-1 membrane dressed with ⊥ states. However, the additional computational cost means that in higher dimensions we are likely to need either special structure, or a small parameter, in order to obtain quantitative results.
For the observables studied here, the pairing between layers of the circuit is local in time. For some dynamical quantities, for example the spectral form factor, pairings that are non-local in time play a role [11, 33? ]. It would be interesting to extend the present approach to this setting.
Our method for defining the effective spin model might illuminate other interesting universal dynamical phenomena. First, we may consider how the description in terms of an entanglement membrane breaks down on approaching a many-body-localized phase. We hope to return to this elsewhere. There are other failure modes that could be investigated, for example the approach to a noninteracting or integrable point.
Within the more limited domain of random circuits, the effective spin model here may also give a way of thinking about entanglement in random Clifford circuits which is more generalizable (e.g. to higher dimensions) than the approach of mapping the evolution of stabilizers to an effective classical stochastic dynamics [6,70,71]. The effective spin model may also be extended to dynamics with measurement, or other types of interaction with an environment, for example to clarify the relationship between different universality classes of measurement-induced criticality [70][71][72][73][74][75][76][77][78][79][80][81][82].
For chaotic dynamics, further numerical tests and further development of the numerical scheme, extending the range of models to which it could be applied usefully, would also be worthwhile and might give new insights into the physics on the scale of the membrane thickness (which, in many natural situations where there is a small parameter at hand, may be much larger than the lattice spacing). The operator P σ defined in Eq. 13 is a projection operator onto the σ permutation state. Here we prove the projector property in Eq. 15. According to the definition of P σ in terms of the Weingarten function and permutation states in Eq. 13, we have Here each state is on two sites: |τ = |τ, τ if we write out the state on each site. Note that states associated with distinct permutations are not orthogonal.
(A4) Any four-layer unitary gate with the projector P σ attached at the bottom is equal to P σ (Eq. 18). In Eq. 17, we represent the projector as a tensor (box) labelled with a σ inside. Given the expression in Eq. 13, it can be decomposed as where the ket |σ * = τ Wg(τ −1 σ)|τ is the two-site dual state of |σ , and the bras at the bottom are just σ| σ| on the two sites.
This structure helps us to clarify the definition of the triangle weights J(σ b , σ c ; σ a ) as the result of contracting the red part of the following tensor: This weight is independent of N (proved algebraically in [14]). This N -independence holds whenever all the spins are in the permutation group S M , as conjectured in [14] and proved by Hunter-Jones in Ref. [17]. Here we review this result. Algebraically, the triangle is the coefficient in the following expansion: where the two arguments in the bras refer to the pair of spatial states. This is equivalent to the definition in Eq. B4, as we see by using Eq. A2 to extract a particular coefficient on the right hand side of Eq. B3. Now consider the case where σ b , σ c ∈ S M , where S M is the subgroup mentioned above. 10 The fact that the resulting weights are independent of N is not immediately apparent from Eq. B4, but can be seen from Eq. B3. When σ b , σ c ∈ S M , we have For example, we can see this by writing the projection operator P (N ) as the average of a stacked random unitary. The unitaries in the layers with index greater than M cancel, since the bra contracts them together in the same way on each site. This leaves the average of a stack acting only on the first 2M layers, which is equal to P (M ) acting on those layers. The identity acts on the remaining 2(N − M ) layers. This shows that in Eq. B5 the only nonvanishing terms on the right-hand-side have σ a ∈ S M as well. As a result Eq. B5 becomes equivalent to the same equation for the 2M -layer case (except that both sides are tensored with the "identity" bra for the remaining 2(N − M ) layers). This shows that the triangle weights are the same as in the 2M -layer case, i.e. independent of N .
Note however that if all of the spins are in S M ⊗ S N −M ⊂ S N , where the subgroups refer to the first 2M and the last 2(N −M ) layers respectively, the weight does not factorize in general. That is, if we have where σ b1 ∈ S M and σ b2 ∈ S N −M etc., the triangle σ a σ b σ c (B7) 10 By the symmetry in Eq. 28, the following also applies to other subsets of states related to this one by left and right multiplication with fixed elements of S N .
is in general not equal to The two are equal if, for example, σ b2 = σ c2 .

Appendix C: Suppression of large ⊥ clusters
We analyze the spin model that arises in computing the mean square weight Ω 2 C of a ⊥ cluster like that in Fig. 5. (The average is over Haar-random unitaries.) The spins take values in S 4 , because in total we need four U layers and four U * layers to write Ω 2 C . We will see that there is a distinction between spin values in the S 2 × S 2 subgroup generated by (12) and (34), and spin values outside this subgroup. At leading order, the former are forbidden inside the cluster.
We first look at the region outside the ⊥ region. Each block there is labelled either as + or −, indicating a local insertion of P ± before squaring. Recall that this projection operator "absorbs" the local unitary gate: equivalently, in the squared system a given + block is (recall that each P (2) + acts on two physical sites): and similarly for −. This defines an interaction triangle for the case where the lower spin in the triangle is from a plus block, in analogy to the triangle interaction defined in Sec. II B for the case without the P (2) + ⊗ P (2) + factor. However, because of the projection operator, the spin associated with the block (the lower spin of the triangle) is fixed to be I. Similarly, the spin associated with a − block is fixed to be σ = (12) (34). Therefore in computing Ω C , the spins outside the cluster are not free to fluctuate. Their role is to provide a boundary condition for the spins inside the cluster.
Explicitly, the + weights are (from Eq. B3 with P (4) ⊥ replaced by P + ; the − weights are related to the following by symmetry) Using the expression for the projector in Eq. 25, this + triangle is equal to Since |σ b τ | + |τ | ≥ |σ b |, in the large q expansion, the + triangle is at most of order and similarly the − triangle can be bounded by Here |σ b | and |σ b (12)(34)| can be understood as the numbers of elementary domain walls (i.e. number of transpositions) on the left edge of the + and − triangle respectively. And similarly the corresponding σ c terms are for the right edge. We thus interpret Eq. C4 and Eq. C5 as costing q −1 per out-going domain wall on the two edges.
Inside the cluster we have ⊥ blocks. These lead to a "⊥" interaction triangle defined by: where we have used the equivalence between the parallel projection and the Haar averaged stack of unitaries. The right-hand-side of Eq. C6 gives the ⊥ triangle weight as the difference of two terms. The first term is the ordinary triangle weight for S 4 . The second term vanishes unless σ a ∈ S 2 × S 2 . Depending on whether σ a ∈ S 2 ⊗ S 2 , the large q expansion yields σ a ∈ S 2 ⊗ S 2 (C7) For the first case, the ⊥ triangle weight is exactly equal to an ordinary triangle We again interpret the corresponding expansion (the first line) in Eq. C7 as a cost of q −1 for each outgoing elementary domain wall. For the case of σ a ∈ S 2 ⊗ S 2 , we can show using the large q expansion techniques in Ref. 14 that the term of order q −|σ −1 a σ b |−|σ −1 a σc| cancels, which therefore incurs an additional cost of q −2 . In particular, if σ b , σ c are also in S 2 × S 2 , then the ⊥ triangle vanishes exactly: With this in hand, we can estimate the cost of a ⊥ cluster in Fig. 5. Because of the vanishing weight in Eq. C9, each triangle inside ⊥ region must host at least one spin that is not in S 2 × S 2 . Therefore in each time slice that contains a ⊥ there are at least four elementary domain walls. By the estimates above, each costs at least 1 q per time step, regardless of whether it is inside the ⊥ region or on the boundary between the ⊥ and ± regions. A ⊥ cluster that persists for t steps is therefore suppressed by a factor of q −4t or smaller. This completes the argument in Sec. III.
The cost of q −4t is achieved by the following configuration. We set the spins in the ⊥ cluster to (13) (24). The triangle weights inside the ⊥ cluster are then 1, by Eq. C8. There are two elementary domain walls making up the composite domain wall (13)(24)  which is equal to the ordinary triangle and costs q −4 . Meanwhile on the boundary, the four domain walls also cost q −4 , according to Eq. C4 and Eq. C5. In total, the weight is q −4t for the squared average of the ⊥ cluster.
Appendix D: Minimal ⊥ cluster and operator purity In Eq. 42, we listed four non-trivial configurations for an isolated ⊥ surrounded by + and −. Here we compute their weights for a fixed unitary u at the location of the ⊥. We will see they are related to the operator purity of u.
It is more transparent if we represent each + and − block graphically as in Eq. A5: (D1) The weight on the left-hand-side is equal to the c-number given by contracting the colorful parts of the tensor on the right-hand-side. The 4-leg ⊥ tensor (red) is u (4) P ⊥ . We can rewrite it as u (4) − P = u (4) − P + − P − . It is contracted with states +| and −| on the top (blue) and the kets at the bottom from the green tensors: Combining these, we have where the ± symbols around the u (4) block in the second line indicate contraction with ±| (at the top) or |± (at the bottom). Similarly, The blocks on the second line, representing different contractions of u (4) , are the operator purities for different partitions. Two of them are actually independent of the unitary gate u: This is because |+ |+ and |− |− are invariant under the action of u (4) . The other two terms are proportional to the operator purity of vertical and diagonal partitions: Using these operator purities and simplifying, we obtain the expressions in Eq. 46. (E1) 2' = 3' by the symmetry under spatial reflection and +/− exchange. We thus evaluate the other three. The averages of these diagrams generate the S 4 model discussed in App. C. They can be expressed in terms of ⊥ and ± triangles: On the other hand, if σ a / ∈ S 2 × S 2 , the ⊥ triangle is equal to an ordinary triangle (Eq. C8). There will be at least four out-going domain walls in the ⊥ triangle, and we have Therefore, the largest terms in Eq. E1 are: w ⊥ (I, (12)(34)) ∼w ⊥ (I, (12)(34); (13)(24)) (E16) + w ⊥ (I, (12)(34); (14)(23)) (E17) In other words, 1' dominates and scales as Note that we may also write 1 n = Z u − Z u n . (E20) where we have used P ⊥ = 1 − P to split 1 into Z u −Z u , and We may then use the result for Z n u in Sec. VI A of [14].
1. Protocol to compute Z(x, t) and W (x, t) In the main text we defined the partition function Z(x, t) for a domain wall with fixed upper and lower endpoints (fixed by contracting with + and − states at the top boundary and + * and − * states at the bottom, see Eq. 33).
A naïve way to evaluate Z(x, t) numerically is by contracting U (2) with these boundary states as per its definition. However, it is numerically inefficient to store the full four-copy unitary U (2) . It is much easier to compute modified partition functions whose bottom boundary conditions only contain |± . We have discussed a subset of these modified partition functions, which we called Z op (x, t), in the context of dual-unitary circuits in Sec. 40. We can view the unitary circuit representing the single-copy time evolution operator U (t) as a state on 2L spins, and view all the legs with the + boundary condition as a subsystem. The modified partition function Z op is then proportional to the operator purity for this partition. By rewriting we find that Z(x, t) is a linear combination of these operator purities. Our numerical scheme then computes these operator purities and obtains Z(x, t) from a linear transformation. We can then recursively compute W (x, t) through Eq. 40. This scheme enables us to compute up to t max = 8.

The random gate
The matrix elements of the random gate used in Fig. 7 This gate is less entangling than the Haar average but nevertheless the scheme for extracting E(v) seems to work well.

Protocol to estimate vB
We use the Pauli weight [18,25,83] to estimate v B in the Floquet circuit. The Pauli weight is defined to be fraction the operator weight at a site on non-identity operators. To reduce the number of lengths affecting the data, we place σ x at the leftmost site 0 and probe its Pauli weight at the rightmost site d = L − 1: According to the operator spreading picture, the Pauli strings in the time-evolved operator σ x 0 (t) spread to the right with mean velocity v B . Before the Pauli strings reach the right boundary, i.e. when v B t d, the Pauli weight is close to zero. At long times when v B t d, σ x 0 (t) can be regarded as a random operator, and the Pauli weight is 3 4 . Fig. 15(a) shows the Pauli weight for the kicked Ising model with the parameters in Eq. 77. Due to the circuit structure, the Pauli weights at consecutive integer time steps are equal. Hence we place the data at half integer points, and interpolate the curve (with cubic spline) when the data is greater than zero. Following Ref. [18], we define t arrival to the time when the interpolated curve is at 3 8 . Fitting d with t arrival gives us an estimation of v B . Fig. 15(b) shows the results of fitting for kicked Ising gates and the random gate. Symmetric(x = 0.8) Fixed random Chaotic Ising Maximally chaotic Symmmetric(x = 0.5) Figure 16. The approximate entanglement velocity E(0) computed by truncating to times ≤ t, for t up to 8.

The approximate entanglement velocity
The second Rényi entropy after a quench corresponds to a free boundary condition at the bottom for the domain wall. Therefore the system will select out min v E(v) as the entanglement velocity. We extract the approximate entanglement velocity from our numerical data and show its dependence on the truncation time t in in Fig. 16.
All the curves start off for t = 1 at the purity velocity of the random circuit. Differences between the gates then drive the curves to their respective entanglement velocities, either above or below the value at t = 1.
Notably, the curve for the dual-unitary gate shows a clear trend towards the analytical result v E = 1. Curves for the Floquet Ising model, the generic reflection symmetric gate and the fixed random gate behave reasonably well, although up to t max = 8 the fluctuations do not decay fast enough to allow us to understand the nature of the convergence. so that −s eq Γ(s) and s eq E(v) form a standard Legendre transform pair. On the other hand t z(s, t)b t = ω(s, b) 1 − ω(s, b) . (F4) Assuming that the smallest root of ω(s, b) for a given s is b 0 (s), then for |b| |b 0 | t z(s, t)b t ∼ 1 Matching the expansions in b, s eq Γ(s) = ln |b 0 (s)|.
Finally, inverting the Legendre transform yields the expression (58) in the main text for E(v).
Generically b 0 (s) is a single root. But in the dualunitary case, b 0 is a double root for s = 0. In Sec. V D, we postulate that Z(x, t) for the dual-unitary model is approximately constant in space, and so z(s = 0, t) ∼ te −seqt . The factor of t in the front indicates a double root on the right hand side of Eq. F4 in the limit of large t max .
Appendix G: Triangle weights with two commutative domain walls For reference, here we list the exact expressions for J(I, (12)(34); σ a ) for all σ a and their order relative to the weight of two separated domain walls. We use the notation from Ref. 14, drawing one line for each elementary domain wall crossing an edge of the triangle. An elementary domain wall is a transposition.
There are in total 6 classes. Within each class the configurations are related by the combination of reflection symmetry about the center axis, conjugation by (12), (34) or both.