A Separation of Out-of-time-ordered Correlation and Entanglement

The out-of-time-ordered correlation (OTOC) and entanglement are two physically motivated and widely used probes of the "scrambling" of quantum information, a phenomenon that has drawn great interest recently in quantum gravity and many-body physics. We argue that the corresponding notions of scrambling can be fundamentally different, by proving an asymptotic separation between the time scales of the saturation of OTOC and that of entanglement entropy in a random quantum circuit model defined on graphs with a tight bottleneck, such as tree graphs. Our result counters the intuition that a random quantum circuit mixes in time proportional to the diameter of the underlying graph of interactions. It also provides a more rigorous justification for an argument in our previous work arXiv:1807.04363, that black holes may be slow information scramblers, which in turn relates to the black hole information problem. The bounds we obtained for OTOC are interesting in their own right in that they generalize previous studies of OTOC on lattices to the geometries on graphs in a rigorous and general fashion.


I. INTRODUCTION AND OVERVIEW
The "scrambling" of quantum information is a phenomenon of fundamental importance, deeply connected to many important research topics in physics, such as black holes [2-5] and many-body chaos [6,7]. In recent years, a great amount of research effort has been devoted to the detection and characterization of scrambling. The so-called out-of-time-ordered correlation (OTOC) [8] is a commonly used measure of quantum chaos and scrambling. A variant based on commutators (also known as the OTO commutator) is given by
$$C(t) = \frac{1}{2}\left\langle [O_1(x,0), O_2(y,t)]^\dagger\, [O_1(x,0), O_2(y,t)] \right\rangle, \qquad (1)$$
where $O_1(x,0)$ is an operator acting on site x, and $O_2(y,t)$ is a Heisenberg operator at time t that only acts on y at time 0, i.e. $O_2(y,t) = U^\dagger(t) O_2(y,0) U(t)$ where $U(t)$ is the unitary for the evolution from time 0 to t. The average is taken w.r.t. the thermal state at some temperature, which we take to be infinite in this work. Intuitively speaking, it characterizes parameters like sensitivity to initial conditions via the spread of local operators. Also notice that the scrambling phenomena exhibit a truly quantum nature: the state of the entire system remains pure during the unitary evolution (although it is effectively randomized), thus no information is really lost; the generation of global entanglement leads to the scrambling of initially localized quantum information, spreading and hiding it from observers that only have access to part of the system. This observation leads to another fundamental probe of a stronger form of scrambling, namely the entanglement between parts of the system [3,9-11] (which measures the same effect as the tripartite information [10] in the case of unitary dynamics; see [11] for more detailed discussions).
To understand and characterize the dynamical behaviors of scrambling systems, several explicit models have been proposed and investigated, such as the Sachdev-Ye-Kitaev (SYK) model [12,13]. Another leading approach is the random quantum circuit model [2,10,14-16], capturing the key kinematic feature of chaos that the evolution appears to be random, and the locality of physical interactions. In these well-studied physical scrambling models, the saturation of OTOC and that of entanglement are expected to occur at a similar time scale [10,14,15]. More generally, one could consider the dynamics of many small quantum systems (say qubits) connected according to some graph [17,18], with random unitary gates being applied to each edge. Suppose that we apply gates in a random order such that on average each edge has one gate applied to it per unit time. A natural conjecture here, which would be compatible with all previous results, is that the scrambling time for a constant degree graph is proportional to its diameter, i.e. the maximum distance between any two vertices. This would correspond to information traveling through the graph at a linear velocity and is assumed implicitly in previous works. However, no proof exists, outside of the special case of Euclidean lattices in a fixed number of dimensions. Even for Euclidean lattices in more than one dimension, this result was only recently proven [19].
Our main results are the following. The first one (Theorem 1 and Theorem 2) shows that for arbitrary graphs with sufficiently low degree, the OTOC saturation time scales linearly in the graph diameter. Here by low degree, we mean $d^2 \ge z$ where d is the dimension of the quantum system and z is the degree of the graph. On the other hand, we use bounds on entanglement growth to show that the time needed to establish substantial entanglement between parts of the system scales at least as the number of vertices and thus could be longer than the OTOC saturation time, for graphs with bottlenecks (see Corollary 1). Such graphs include e.g. binary trees, which we explicitly analyze in this paper, and discretizations of hyperbolic space around black holes, originally proposed by [1], which are expected to exhibit similar behaviors (as argued below). In other words, we have established an asymptotic separation between the time scales of OTOC and entanglement saturation. Refs. [20,21] studied scrambling on certain peculiar graphs via a Hamiltonian model, but the relations between OTOC and entanglement were not fully understood and the physical correspondences were not clear; here we rigorously prove the separation in a general setting and study the implications for physics.
Our results have the following major implications: i) Scrambling in non-Euclidean geometries. Existing work studied scrambling mostly on Euclidean lattices [14,15,22]. The general assumption is that after time t, a localized perturbation will affect everything within some ball of radius $v_{\rm butterfly}\, t$, where $v_{\rm butterfly}$ is known as the "butterfly velocity" which characterizes the speed of information spreading. However, this has not been proved and previous works only gave heuristic arguments for it that included uncontrolled approximations. For the random circuit models defined on general graphs, we find that if the local dimension is large relative to the graph degree then indeed there is a linear butterfly velocity. We also find apparent counter-examples which suggest that linear butterfly velocity no longer holds in high-degree graphs. Some of these examples are not rigorously analyzed but we present a heuristic argument suggesting that the scrambling time for some families of graphs should grow more rapidly or more slowly than the diameter of the graphs.
ii) Black hole information scrambling. Our results can be regarded as a more rigorous argument that fleshes out the idea of a recent paper by one of the authors [1], which concerns whether it is possible for the fast scrambling conjecture of black holes [3] to hold if one assumes that the causality structure of general relativity holds around a black hole, and if the medium by which the information is scrambled is Hawking radiation. In the model of [1], the space around the black hole is divided into cells, each of which contains a constant number of bits of Hawking radiation. It then gives arguments for why the Hawking radiation is not adequate for fast scrambling if the entanglement definition of scrambling is used. The cell structure around the black hole looks like a patch of a cellulation of hyperbolic geometry, where the cells on the event horizon are the boundary of this patch. The tree graph we shall consider captures a key feature of this geometry: the leaves lie on the event horizon, and the density of nodes decreases as one moves outwards radially. As the assumptions essentially suggest that information is processed via local interactions of the Hawking radiation, we may consider a random circuit defined on the underlying graph to be a toy model that captures key features of the black hole scrambling process. Our mathematical results then indicate that the scrambling time scales given by entanglement and OTOC are fundamentally different. Another way to interpret our model is that the information "wavefront" could reach the farthest side rather quickly since there exist short paths, but it takes a longer time, which scales with the number of degrees of freedom, to establish truly global entanglement. This is consistent with recent holographic calculations (see e.g. [23,24]), which suggest that the entanglement entropy grows roughly linearly after a quench in chaotic systems.
We would also like to remark upon the task of recovering quantum information falling into the black hole from Hawking radiation (commonly known as Hayden-Preskill decoding [2]), which plays central roles in recent studies of the black hole information problem. Yoshida and Kitaev recently proposed an explicit protocol [25] whose decoding fidelity is at least the order of 1/d 2 , where d A is the Hilbert space dimension of the input message. Here C(t) takes the form of Eq. (1) and considers O 1 and O 2 averaged over all Pauli operators on the infalling system and Hawking radiation respectively; see Sections 2-4 of [25] for details. By simple calculations one can see that our results imply a possible time window in which the decoding could be achieved with high fidelity without substantial entanglement when the infalling quantum state is sufficiently small compared to the black hole. However, it appears that adding a small number of qubits to a Schwarzschild black hole can only be done by photons whose wavelength is comparable to the size of the black hole. It does not seem surprising that the information carried by such photons can be extracted by a black hole quickly; when the information is absorbed by the black hole, it is already spread out over the entire black hole, and so does not need to migrate from a localized region to a state where it is delocalized on the black hole.
iii) Inequivalence of convergence to 2-designs in different measures. The speed of convergence of a random circuit to a 2-design (distributions that approximately agree with the Haar measure up to the first two moments, which have found many important applications as an efficient approximation to Haar randomness [26]) has been the subject of a large amount of research. In particular, [19,27-30] show that the speed of convergence depends on the graph of interactions, and suggest that it should be proportional to the diameter. Note that 2-designs are very powerful measures of convergence, in the sense that a distribution being close to a 2-design implies that the distribution has mixed with respect to not only OTOC but also von Neumann and Rényi-2 entanglement entropies [10,31], and other important signatures of information scrambling such as decoupling [32]. Our work provides several examples where a random circuit approximates the OTOC but not the entanglement properties of a 2-design, and therefore implies that a strong approximation of 2-designs (in terms of e.g. the frame operator [11]) may not be achieved in time proportional to the diameter.

II. MODELS AND NOTATION
Let G be a graph with V vertices and E edges. The model we study associates a d-dimensional Hilbert space with each vertex of G. Each edge has Haar-random unitary gates applied to the qudits on its endpoints according to a Poisson process with rate 1 (meaning that k unitaries are applied in time t with probability $t^k e^{-t}/k!$). The mixing time for OTOC (resp. entanglement), $\tau^{(x,y)}_{\rm OTOC}$ (resp. $\tau^{(A)}_{\rm ent}$), is defined to be the minimum amount of time needed for the OTOC between vertices x and y (resp. the entanglement entropy between A and the complement of A) to become at least a constant fraction of its equilibrium value. In this work we take the constant to be $1/(d^2+1)$.
Here we expect that qualitatively similar behavior will hold with $1/(d^2+1)$ replaced by any constant strictly between 0 and 1. We will study how $\tau_{\rm OTOC}$ and $\tau_{\rm ent}$ scale with parameters such as local dimension, degree, and number of vertices.
We study the pair (x, y) that has the largest $\tau^{(x,y)}_{\rm OTOC}$, and the set A that has the largest $\tau^{(A)}_{\rm ent}$, as they best characterize the OTOC and entanglement properties of G.
Instead of studying this model directly we can consider the process in which a random edge is picked every 1/E time units. This is because in our Poisson process model, each edge is equally likely to be picked. The number of unitaries applied within time t is of order Et (see Appendix A), so the two models above are equivalent up to a constant factor.
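As a rough numerical illustration of this equivalence (a sketch only, not part of the analysis), note that E independent rate-1 Poisson clocks merge into a single Poisson process of rate E whose events fall on uniformly random edges, so the number of gates applied by time t concentrates around Et. The function name, the parameter values, and the use of Python's `random` module are illustrative assumptions.

```python
import random


def count_gates_poisson(num_edges, t, rng):
    """Count gates applied up to time t when each edge carries a rate-1 Poisson clock.

    The merged process over all edges has rate num_edges, so we draw
    exponential waiting times with that rate and count arrivals up to time t.
    """
    elapsed = rng.expovariate(num_edges)
    count = 0
    while elapsed <= t:
        count += 1
        elapsed += rng.expovariate(num_edges)
    return count


rng = random.Random(0)          # fixed seed, chosen arbitrarily
num_edges, t = 1000, 10.0       # hypothetical values of E and t
gates = count_gates_poisson(num_edges, t, rng)
ratio = gates / (num_edges * t)  # expected to be close to 1
print(gates, ratio)
```

The fluctuations of the count are of relative size $1/\sqrt{Et}$, which is why the two models agree up to a constant factor once Et is large.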

III. OTOC
To analyze the saturation time of OTOC, we describe the process of operator spreading as a Markov chain. Consider an arbitrary Pauli operator $\sigma_p$ acting on n d-dimensional qudits, $p \in \{0, \dots, d^2-1\}^n$, and apply some unitary U to it. We expand the resulting operator in the Pauli basis and have
$$U^\dagger \sigma_p U = \sum_q \alpha_q \sigma_q.$$
The expected value of the cross term for $\alpha_q$ averaged over the distribution of U would be $\mathbb{E}_U[\alpha_q \alpha_{q'}^*]$. According to the construction of the random circuit, this is zero for $q \neq q'$ when U is the unitary in a single step. Therefore in each step the values of $\alpha_q \alpha_q^*$ undergo a linear transformation, and we can interpret them as a probability distribution because they are nonnegative and sum to 1.
If we start from a Pauli operator located at a single vertex x, on each vertex all non-identity Pauli operators will have the same probability as long as x has been touched at least once in the process. So we only care whether the operator on a vertex is identity (I) or non-identity (N). The norm of the commutator of the time-evolved operator with a Pauli operator P on some vertex y would be proportional to the probability of having a non-identity Pauli operator on that site, and the factor of proportionality would be $d^2/(d^2-1)$, which is just the commutator averaged over all non-identity Pauli operators. In summary, the object we will study is the OTOC between a Pauli operator on vertex y and the time-evolved Pauli operator on vertex x after T steps of the random circuit on graph G, which equals $\frac{d^2}{d^2-1}$ times the probability of having "N" on vertex y after T steps in the Markov chain $M_0$ defined below. The state space of $M_0$ is the set of all the configurations in which each vertex of G is assigned a label "N" or "I". The initial state of $M_0$ has "N" assigned to vertex x and "I" assigned to all other vertices. The update rule is that in each step a uniformly random edge is picked and the labels on the two corresponding vertices are updated. "II" remains "II", and otherwise they have a probability of $\frac{d^2-1}{d^4-1} = \frac{1}{d^2+1}$ of becoming "IN" or "NI" each, and $\frac{d^2-1}{d^2+1}$ of becoming "NN" [27].
Now we prove an upper bound for the OTOC saturation time. For illustration we present an outline of the proof here; the full proof can be found in Appendix C. Note that O(α), here and in the following, represents a quantity that scales asymptotically as α, i.e. $\ge c_1 \alpha$ and $\le c_2 \alpha$ for constants $0 < c_1 < c_2$.
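The update rule of the chain $M_0$ described above can be simulated directly. The following is a minimal sketch, assuming a graph given as an edge list and labels stored in a Python list; the function names, the example path graph, and the seed are hypothetical choices made only for illustration.

```python
import random


def step_m0(labels, edges, d, rng):
    """Apply one step of the I/N chain: pick a uniformly random edge and update its labels."""
    u, v = rng.choice(edges)
    if labels[u] == "I" and labels[v] == "I":
        return  # "II" remains "II"
    p_single = 1.0 / (d * d + 1)  # probability of "IN", and likewise of "NI"
    r = rng.random()
    if r < p_single:
        labels[u], labels[v] = "I", "N"
    elif r < 2 * p_single:
        labels[u], labels[v] = "N", "I"
    else:  # remaining probability (d^2 - 1) / (d^2 + 1)
        labels[u], labels[v] = "N", "N"


def steps_until_hit(edges, num_vertices, x, y, d, rng, max_steps=1_000_000):
    """Number of steps until vertex y first carries "N", starting from a single "N" on x."""
    labels = ["I"] * num_vertices
    labels[x] = "N"
    for step in range(1, max_steps + 1):
        step_m0(labels, edges, d, rng)
        if labels[y] == "N":
            return step
    return None  # not reached within max_steps


rng = random.Random(1)
path_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]  # path graph on 5 vertices
hit = steps_until_hit(path_edges, 5, x=0, y=4, d=2, rng=rng)
print(hit)
```

Since each step touches a single edge, the label "N" can advance by at most one edge per step, so the returned value is never smaller than the distance D(x, y).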
Theorem 1 (OTOC upper bound). Let G be a graph with V vertices and E edges, and suppose the degree of each vertex is at most $d^2$, where d is the Hilbert space dimension for each vertex. Then for any pair of vertices x and y, $\tau^{(x,y)}_{\rm OTOC} = O(D(x, y))$ with high probability, where D(x, y) is the distance between x and y. The probability of failure is exponentially small in D(x, y). As a consequence the perfect binary tree has $\tau^{(x,y)}_{\rm OTOC} = O(\log V)$, where x and y are the farthest pair of vertices.
Proof. As explained earlier, the OTOC saturation time corresponds to the number of steps needed for $M_0$ to have constant probability of having a label "N" on y. We will first prove Lemma 1, which states that with probability $1 - e^{-O(D(x,y))}$ the vertex y gets hit by a label "N" within order $E \cdot D(x, y)$ steps. As shown in Appendix A, this needs order D(x, y) time units with high probability. Then we will show in Lemma 2 that after this happens, the probability of having an "N" on y remains constant.
Lemma 1. Suppose that G is a graph with the degree of each vertex being at most $d^2$. For any pair of vertices x and y with distance D(x, y), the expected number of steps for y to be labeled "N" is of order $E \cdot D(x, y)$ in $M_0$ starting from x. Moreover, with high probability the vertex y gets labeled "N" in time of order $E \cdot D(x, y)$. The probability of failure is exponentially small in D(x, y).
Proof. (Sketch) We will first construct a Markov chain M which has the same initial state as $M_0$, and in each step the update rule of $M_0$ is applied, followed by changing all "N" into "I" except the one closest to vertex y. By a simple coupling argument the number of steps needed for y to get an "N" in $M_0$ is upper bounded by that in M. The distance between the vertex with label "N" and vertex y in the Markov chain M can be described by a biased random walk, from which we can obtain the desired bound. More details can be found in Appendix C.
Lemma 2. After a label "N" reaches the target vertex y, the probability for having an "N" on y will remain order one.
Proof. (Sketch) We again consider the modified chain which only keeps one label "N" after each step. We will show that vertex y has constant probability of having label "N" in the equilibrium distribution. This probability is monotonically non-increasing as a function of the number of steps, so the probability is order one in any step. More details could be found in Appendix D.
Theorem 1 states that the number of steps needed for OTOC saturation in a low-degree graph is at most of order E·D(x, y). However, we expect that in a graph with high degree, the number could be much larger. Some intuitions are given in Appendix E.
Besides this upper bound we also derive a lower bound for OTOC saturation.
Theorem 2 (OTOC lower bound). Let G be a graph with V vertices and E edges, and suppose the degree for each vertex is O(1). Then for any pair of vertices x and y, τ (x,y) OTOC is at least of order D(x, y) with high probability, where D(x, y) is the distance between x and y. The probability of failure is exponentially small in D(x, y).
The proof is presented in Appendix F.

IV. ENTANGLEMENT
Here we only need to consider the case where the evolution is unitary and the system is pure.
Entanglement entropy of a pure state $|\psi\rangle_{AB}$ is given by $E(|\psi\rangle) := S(\rho_A)$ where $\rho_A = \mathrm{Tr}_B[|\psi\rangle\langle\psi|]$ and S is the von Neumann entropy. Notice the following simple, general fact:
Lemma 3. Let $U_{AB}$ be a unitary operator acting on two d-dimensional systems AB. Then for any $|\psi\rangle_{AA'BB'}$ with ancilla systems A', B',
$$E(U_{AB}|\psi\rangle_{AA'BB'}) \le E(|\psi\rangle_{AA'BB'}) + 2\log d.$$
Proof. Adapted from the proof of Lemma 1 of [33]. Suppose Alice holds AA' and Bob holds BB'. In addition, they share two copies of the maximally entangled state $|\Phi_d\rangle = \frac{1}{\sqrt{d}}\sum_{i=1}^{d}|i\rangle|i\rangle$. Consider the following double teleportation protocol. Alice consumes a $|\Phi_d\rangle$ and classical communication to teleport A to Bob, who performs U locally and then consumes a $|\Phi_d\rangle$ and classical communication to teleport system A back to Alice. The protocol is LOCC, under which the entanglement entropy between Alice and Bob is monotonically nonincreasing. Therefore, by the additivity of S (and thus E) on tensor products,
$$E(U_{AB}|\psi\rangle_{AA'BB'}) \le E\left(|\psi\rangle_{AA'BB'} \otimes |\Phi_d\rangle^{\otimes 2}\right) = E(|\psi\rangle_{AA'BB'}) + 2\log d,$$
and so the claimed bound follows.
Note that the proof also applies to e.g. the Rényi-2 entropy, which is a variant of the entanglement entropy that can be more easily measured in experiments [34,35].
By Lemma 3, the entanglement entropy between the two trees increases by at most $2 \log d$ when a random unitary acts across the middle edge. This edge only has a probability of $1/E \sim 1/V$ of being selected in each step. So in order to reach the maximum entropy of order $V \log d$, we need at least order $V^2$ steps, or equivalently order V time. This is much larger than the OTOC time of order $\log V$.
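The counting in this argument can be written out explicitly. The sketch below is only bookkeeping under stated assumptions: the smaller side of the cut holds `n_small` qudits, the target is a fraction `fraction` of the maximal entropy `n_small * log d`, and each gate on the bottleneck edge adds at most $2 \log d$ (Lemma 3). All function and variable names are hypothetical.

```python
import math


def min_bottleneck_gates(n_small, d, fraction):
    """Fewest gates across the bottleneck edge needed to reach the target entropy."""
    target_entropy = fraction * n_small * math.log(d)
    gain_per_gate = 2 * math.log(d)  # Lemma 3: at most 2 log d per gate
    return math.ceil(target_entropy / gain_per_gate)


def mean_steps_for_gates(num_gates, num_edges):
    """Mean number of steps until the bottleneck edge has been picked num_gates times.

    Each step picks the bottleneck edge with probability 1 / num_edges.
    """
    return num_gates * num_edges


height = 9                      # perfect binary tree of height 9 (assumed example)
V = 2 ** (height + 1) - 1       # number of vertices
E = V - 1                       # a tree has V - 1 edges
n_small = (V - 1) // 2          # vertices in the left subtree
gates = min_bottleneck_gates(n_small, d=2, fraction=1 / 5)
steps = mean_steps_for_gates(gates, E)
time_units = steps / E          # E steps correspond to one unit of time
diameter = 2 * height
print(gates, steps, time_units, diameter)
```

For this example the bottleneck edge must fire at least 52 times, which already takes about 52 units of time on average, compared with a diameter of 18; the gap widens linearly in V versus logarithmically for larger trees.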
From Lemma 3 we can get the following result for a general graph. This also lower-bounds the time it takes for the random circuit to converge to 2-designs [10,31].

V. OTOC VS. ENTANGLEMENT IN DIFFERENT MODELS
According to our results, a simple graph that can give a separation of OTOC and entanglement saturation times is a perfect binary tree, as depicted in Fig. 1. Here we consider the OTOC of operators located on the pair of farthest vertices in the graph (a leaf vertex of left subtree and a leaf vertex of the right subtree), and the entanglement entropy between left and right subtrees. The cell structure roughly equivalent to the hyperbolic geometry in 3 dimensions, or indeed any constant number of dimensions, exhibits such a separation as well. These graphs are regarded as toy models of quantum information scrambling around black holes [1], as motivated in the introduction; see also Appendix B.
The behaviors of OTOC and entanglement on some other graphs are also studied. These include: i) The "dumbbell graph" consisting of two complete graphs connected by a bottleneck edge. A careful analysis could still show a separation between OTOC and entanglement; ii) High degree graphs. We demonstrate two examples in which the OTOC saturation time can be much longer or shorter than the bound we have on low degree graphs. See Appendices E, G for more details. Also note that there is no separation on Euclidean lattices. These results are summarized in Table I. Other than the Poisson process on each edge, different orders of choosing the edges have also been studied in Appendix G.
FIG. 1. We consider OTOC between local operators originally acting on two farthest vertices (a leaf of the left and right subtrees respectively, for example, O1 and O2 in the diagram), and entanglement between the left subtree (dashed circle) and the rest of the graph (the cut shown by the red double line).

VI. CONCLUSION
Random quantum circuits have widespread applications in quantum information, and are also very important models of scrambling and chaotic quantum systems in theoretical physics. There are several ways to characterize scrambling and randomness in quantum processes, among which the OTOC and entanglement are two important types of measures. This work aims to understand whether they are equivalent to each other as signatures of scrambling. To this end, we carefully analyze local random quantum circuits defined on, e.g., a binary tree, which exhibit the property that OTOC mixes rather fast since the light cone can quickly reach the far end (time of order ln V), while it takes a much longer time for entanglement between the left and right subtrees to grow (time at least of order V). We furthermore generalize the result to any bounded-degree graph with a tight bottleneck. That is, the generation of entanglement is slow, even if the graph has small diameter. Our result indicates that unitary t-designs can be much more expensive than previously thought: they require a random quantum circuit to have depth much larger than the diameter of the underlying graph. This result provides more rigorous evidence for arguments made in [1]: if we consider the model discussed in [1,3], then the scrambling of quantum information as seen by strong measures such as entanglement or decoupling can be much slower than previously thought. It would be interesting to explore further implications of this phenomenon, and more generally, quantum information processing, to the black hole information problem, many-body physics, and beyond.

Appendix A
Here we argue that the Poisson clock model and the simplified model in which a random edge is picked every 1/E time units are essentially equivalent, by the following fact:
Proof. The number of unitaries applied to each edge is a Poisson distribution with mean t, so the total number of unitaries λ is a Poisson distribution with mean Et. By [1],

Appendix B: Cellulation of hyperbolic geometry
Here we include the figure from [2] that roughly depicts the cellulation of hyperbolic geometry, representing the causality structure of a black hole in Schwarzschild coordinates.
* aram@mit.edu † linghang@mit.edu ‡ zliu1@perimeterinstitute.ca § mehraban@mit.edu ¶ shor@math.mit.edu
FIG. 1. ([2]) The cell structure roughly equivalent to the hyperbolic geometry, depicted in two dimensions. In [2], this represents the black hole cell structure in Schwarzschild coordinates, where each cell carries one qubit of Hawking/Unruh radiation.
The line element in this geometry is
$$ds^2 = -\left(1 - \frac{2M}{r}\right) dt^2 + \left(1 - \frac{2M}{r}\right)^{-1} dr^2 + r^2\, d\Omega^2.$$
The black hole has radius 2M, the photon sphere has radius 3M, and we are interested in the region $2M \le r \le 3M$.

Appendix C
In order to give an upper bound on the time scale that an "N" hits vertex y, we study a modified chain M in which the spreading of label "N" is slower than in $M_0$. Roughly speaking M only keeps a single label "N" that is closest to vertex y.
Definition 1 (Markov chain M ). Markov chain M has the same state space and initial state as M 0 . In each step the update rule for M 0 is applied, followed by setting all "N" labels into "I" except for the one closest to y (choose randomly if this is not unique).
In this way the vertex with label "N" in M is always labeled "N" in M 0 in the most natural way of coupling M 0 to M , and therefore after any number of steps the probability that vertex is labeled "N" in M 0 is lower bounded by the corresponding probability in M .
The Bernstein inequality is needed for the proof of our theorem. It states that for independent zero-mean random variables $X_1, \dots, X_n$, each with absolute value at most M,
$$\Pr\left[\sum_{i=1}^{n} X_i > t\right] \le \exp\left(-\frac{t^2/2}{\sum_{i=1}^{n} \mathbb{E}[X_i^2] + Mt/3}\right).$$
This could be generalized to the case with nonzero mean. Suppose $Y_1, \dots, Y_n$ have means $\mu_1, \dots, \mu_n$ and they satisfy
Proposition 2 (Restatement of Lemma 1). Suppose that G is a graph with the degree of each vertex being at most $d^2$. For any pair of vertices x and y with distance D(x, y), the expected number of steps for y to be labeled "N" is $O(E D(x, y))$ in $M_0$ starting from x. Moreover, with probability $1 - e^{-\Omega(D(x,y))}$ the vertex y gets labeled "N" in time $O(E D(x, y))$.
Proof. As mentioned, the upper bound for M defined in Definition 1 gives an upper bound for $M_0$. Let v be the vertex with label "N". As long as $v \neq y$, there will be at least one neighbor u that is one step closer to y, and other neighbors are at most one step further from y due to the triangle inequality. If the edge (u, v) is selected, there is a chance of $\frac{d^2}{d^2+1}$ that u obtains label "N". If the edge between u and other neighbors is selected, there is a chance of $\frac{1}{d^2+1}$ that the label on u becomes "I", and the distance between y and the closest label "N" becomes one step longer. Let the degree of u be $d_u$; then the distance between the label "N" and vertex y has probability at least $\frac{1}{E}\frac{d^2}{d^2+1}$ of decreasing by 1 and at most $\frac{d_u - 1}{E}\frac{1}{d^2+1}$ of increasing by 1, where the probabilities depend on the specific vertex. Since the degree of any vertex is at most $d^2$, the time needed for the distance to drop from D(x, y) to 0 is upper bounded by the time in the following biased random walk W. W has states $\{0, 1, \dots, d_{\max}\}$ where $d_{\max}$ is the maximum possible distance to y, and starting from vertex D(x, y) it has a fixed probability of $\frac{d^2}{E(d^2+1)}$ of decreasing by 1 and $\frac{d^2-1}{E(d^2+1)}$ of increasing by 1. Extension of this finite chain to an infinite one could only increase the hitting time of vertex 0, because the finiteness at vertex $d_{\max}$ prevents us from getting too far from vertex 0. The displacement of a random walk on an infinite chain (i.e. the difference of the final position and initial position) is the sum of the displacements of the individual steps, each of which has probability $\frac{d^2}{E(d^2+1)}$ of being −1, probability $\frac{d^2-1}{E(d^2+1)}$ of being +1, and otherwise is 0. The mean and variance of the displacement at each step are
$$\mu_0 = -\frac{1}{E(d^2+1)}, \qquad \sigma_0^2 = \frac{2d^2-1}{E(d^2+1)} - \mu_0^2.$$
The expected number of steps needed to reach vertex 0 in this random walk is $\frac{-D(x,y)}{\mu_0} = \Theta(E \cdot D(x, y))$. We can also use Eq. (C1) to bound the probability that the total displacement of T steps is larger than −D(x, y), where we set T to be twice the expected number of steps needed and t = D(x, y). M can be set to be 2. The denominator in the exponent will be $T(\sigma_0^2 + 2\mu_0^2) + \frac{2}{3} t = \Theta(D(x, y))$, so Eq. (C1) gives a probability of at most $e^{-\Theta(D(x,y))}$ for not reaching vertex 0.
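The hitting time of the biased walk W on the infinite chain can be checked numerically. The sketch below, under the step probabilities stated in the proof, computes the per-step mean and variance and compares the predicted mean hitting time $D(x,y)/|\mu_0|$ with a sample average; the function names, the chosen values of d, E and D(x, y), and the seed are hypothetical.

```python
import random


def walk_parameters(d, num_edges):
    """Step probabilities, mean and variance of one step of the biased walk W."""
    p_down = d * d / (num_edges * (d * d + 1))
    p_up = (d * d - 1) / (num_edges * (d * d + 1))
    mean = p_up - p_down
    variance = (p_up + p_down) - mean ** 2
    return p_down, p_up, mean, variance


def steps_to_reach_zero(start, d, num_edges, rng):
    """Run the walk on the infinite chain from `start` until it first hits 0."""
    p_down, p_up, _, _ = walk_parameters(d, num_edges)
    position, steps = start, 0
    while position > 0:
        r = rng.random()
        if r < p_down:
            position -= 1
        elif r < p_down + p_up:
            position += 1
        steps += 1  # with the remaining probability the walk stays put
    return steps


rng = random.Random(2)
d, num_edges, start = 2, 10, 5
p_down, p_up, mean, variance = walk_parameters(d, num_edges)
predicted = start / -mean  # equals num_edges * (d^2 + 1) * start
samples = [steps_to_reach_zero(start, d, num_edges, rng) for _ in range(200)]
average = sum(samples) / len(samples)
print(predicted, average)
```

The predicted value grows as $E \cdot D(x,y)$ for fixed d, matching the $\Theta(E \cdot D(x,y))$ scaling used in the proof.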
Appendix D: Proof of Lemma 2
Lemma 1. Consider a reversible Markov chain M with transition matrix P(x, y), x, y ∈ Ω. M starts deterministically from state $x_0 \in \Omega$. If all the eigenvalues of P are nonnegative, then the probability of being at $x_0$ will be monotonically nonincreasing as a function of the number of steps.
Proof. Let π(x) be the stationary distribution. The reversibility implies that $A(x, y) \equiv \sqrt{\frac{\pi(x)}{\pi(y)}}\, P(x, y)$ is a symmetric matrix, and therefore has orthonormal eigenvectors $f_k(x)$ with corresponding eigenvalues $\lambda_k$. The eigenvectors for P(x, y) will then be $g_k(x) = f_k(x)/\sqrt{\pi(x)}$. Now we want to expand the initial distribution $p_0(x) = \delta_{x,x_0}$ in terms of $g_k(x)$,
$$p_0(x) = \pi(x) \sum_k g_k(x_0)\, g_k(x),$$
which can be verified using the orthogonality of $f_k$. After t steps, the probability distribution would be
$$p_t(x) = \pi(x) \sum_k \lambda_k^t\, g_k(x_0)\, g_k(x),$$
and the probability for state $x_0$ would be
$$p_t(x_0) = \pi(x_0) \sum_k \lambda_k^t\, g_k(x_0)^2,$$
which is a monotonically non-increasing function of t given that the $\lambda_k$ are all nonnegative.
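The monotonicity statement can be checked on a small example. The sketch below uses a lazy simple random walk on a path, which is reversible and, because it stays put with probability 1/2, has only nonnegative eigenvalues; this choice of chain, the function names, and the parameters are assumptions made only for illustration.

```python
def lazy_path_walk(n):
    """Transition matrix of the lazy simple random walk on a path with n vertices."""
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        neighbors = [j for j in (i - 1, i + 1) if 0 <= j < n]
        P[i][i] = 0.5  # stay put with probability 1/2
        for j in neighbors:
            P[i][j] = 0.5 / len(neighbors)
    return P


def return_probabilities(P, x0, num_steps):
    """Probability of being at x0 after 0, 1, ..., num_steps steps, starting at x0."""
    n = len(P)
    dist = [0.0] * n
    dist[x0] = 1.0
    probs = [dist[x0]]
    for _ in range(num_steps):
        # Row vector times matrix: new_dist[j] = sum_i dist[i] * P[i][j].
        dist = [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]
        probs.append(dist[x0])
    return probs


probs = return_probabilities(lazy_path_walk(6), x0=2, num_steps=30)
is_non_increasing = all(a >= b - 1e-12 for a, b in zip(probs, probs[1:]))
print(is_non_increasing, probs[:4])
```

Without the laziness the walk on a path is periodic and has negative eigenvalues, and the return probability then oscillates, which is why the nonnegativity assumption is needed.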
Proposition 3 (Restatement of Lemma 2). After a label "N" reaches the target vertex y, the probability for having an "N" on y will remain Ω(1).
Proof. We again only keep track of the label "N" closest to y. When it is at y, there is a probability of $\frac{d_y}{E}\frac{1}{d^2+1}$ that an "I" is left on y and the closest "N" becomes one of the neighbors of y. Here, $d_y$ is the degree of y. Otherwise, suppose it is at a vertex u with degree $d_u$. There is at least one neighbor of u that is one step closer to y, and other neighbors are at most one step farther. This corresponds to a probability of 1  [3] we can get the probability for state 0 in the stationary distribution, which is which is Ω(1). Note that at large E, the probability of staying at the same state is larger than $\frac{1}{2}$, which means that all eigenvalues are positive. Also the random walk on a finite chain is reversible, so by Lemma 1, the probability of having "N" on y remains Ω(1).
Appendix E: OTOC for the high degree case

Examples with long OTOC time
In this section we describe two examples of a high-degree graph where the OTOC time is asymptotically larger than the graph diameter. One example is the star graph where this phenomenon was previously observed by Lucas [4]. A second example is a binary tree with high but constant degree. We summarize the parameters of these two examples in the following table.

Graph                                 Diameter     OTOC time          Source
n qubits in star graph                2            O(log n)           [4]
tree with large constant degree z     O(log n)     $n^{1-\delta}$     this section
In this section we will sketch proofs of both of the claimed OTOC times. First we give a simplified explanation of the O(log n) OTOC time for the star graph. (A more precise but also more complicated argument was given in [4].) In the star graph there are n − 1 vertices each connected to a central vertex. Consider the OTOC between Paulis on two of the non-central vertices, say 1 and 2. We begin with a single N on vertex 1 and I's everywhere else. Vertices 0 and 1 interact after expected time 1, and when they interact there is a 4/5 chance that vertex 0 ends up with an N. Thus the expected time for vertex 0 to turn to N is 5/4 and after this happens the expected number of non-central N's is 3/4. After this, vertex 0 rapidly interacts with the other n − 1 vertices. Each time there is a 3/5 chance that the other vertex will turn from I to N, a 1/5 chance that the other vertex will remain I (or change from N to I, but consider for simplicity the early stages when most vertices are I), and a 1/5 chance that the other vertex will become N and vertex 0 will revert to I. The number of N's created here before vertex 0 turns back to I is given by a geometric random variable with expectation 3, and the time this takes has expectation ≈ 4/n. For simplicity, we will ignore the fluctuations and assume that the process creates exactly three N's and takes time exactly 4/n. Now there are (in expectation) 3.75 N's in non-central vertices and I's elsewhere. Again there is a long wait until the central vertex turns back to N, but now the expected wait is only 5 4  = O(log n).
This is not quite rigorous as we have replaced random variables with their expectations in several places, but otherwise is strong evidence for the OTOC time of O(log n) for the star graph. A more rigorous argument is given in Section 5 of [4]. There it is also observed that dynamics based on random Hamiltonians give an asymptotically different OTOC time.
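The heuristic for the star graph can be explored with a direct simulation of the I/N dynamics. The following is a minimal sketch, assuming rate-1 Poisson clocks on the n − 1 edges and measuring the time until a chosen leaf first carries "N", which is only a proxy for the OTOC time; the function name, the parameters, and the seed are hypothetical.

```python
import random


def star_hitting_time(n, d, rng, source=1, target=2):
    """Time until leaf `target` first carries "N" on a star with centre 0 and leaves 1..n-1."""
    labels = ["I"] * n
    labels[source] = "N"
    p_single = 1.0 / (d * d + 1)
    elapsed = 0.0
    while labels[target] != "N":
        elapsed += rng.expovariate(n - 1)  # n - 1 edges, each firing at rate 1
        leaf = rng.randrange(1, n)         # the edge (0, leaf) fires
        if labels[0] == "I" and labels[leaf] == "I":
            continue                       # "II" remains "II"
        r = rng.random()
        if r < p_single:
            labels[0], labels[leaf] = "I", "N"
        elif r < 2 * p_single:
            labels[0], labels[leaf] = "N", "I"
        else:
            labels[0], labels[leaf] = "N", "N"
    return elapsed


rng = random.Random(3)
hit_time = star_hitting_time(n=50, d=2, rng=rng)
print(hit_time)
```

Averaging this quantity over many runs for increasing n is one way to examine the claimed logarithmic growth empirically, although a single run fluctuates considerably.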
We can obtain a stronger separation between diameter and OTOC time by considering a tree. Here we need the degree z only to be a sufficiently large constant and can obtain an OTOC time of $n^{1-\delta}$ with $\delta \to 0$ as $z \to \infty$. However the diameter is also asymptotically growing as O(log n), unlike the star graph where the diameter is only 2.
Consider an N at vertex v, with height h. Suppose that x(z − 1) of its children are in state N and (1 − x)(z − 1) are in state I for some x ∈ [0, 1]. We approximate the dynamics as follows.
• A child of v turns into N at a rate $(z-1)(1 - 1/(d^2+1)) \approx z$, again regardless of its current state. This means that $dx/dt$ has an expected contribution of $(1 - x)$.
• A child of v currently in state N turns to I, at rate $x(z-1)/(d^2+1) \approx xz/d^2$, yielding a contribution to $dx/dt$ of $-x/d^2$.
• v itself turns to I at a rate of $(z-1)/d^2 \approx z/d^2$. (At the same time, one of v's children turns to N, but we can neglect this.)
Overall we find that x evolves according to
$$\frac{dx}{dt} = (1 - x) - \frac{x}{d^2}, \qquad (E2)$$
which asymptotically approaches $(1 + 1/d^2)^{-1} \approx 1 - 1/d^2$. However, v will turn to I before this happens, typically after time $d^2/z$. During this time, the probability of v's parent turning to N is $\approx z/d^2$.
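The approach of x to its asymptotic value can be illustrated by integrating the two contributions listed above, $(1 - x)$ and $-x/d^2$, with a simple forward Euler scheme. This is a sketch under that assumption only; the function name, step size, and time horizon are arbitrary illustrative choices.

```python
def integrate_fraction(d, t_final=20.0, dt=0.01, x_start=0.0):
    """Forward Euler integration of dx/dt = (1 - x) - x / d^2 starting from x_start."""
    x = x_start
    for _ in range(int(round(t_final / dt))):
        x += dt * ((1.0 - x) - x / (d * d))
    return x


d = 2
x_final = integrate_fraction(d)
fixed_point = 1.0 / (1.0 + 1.0 / (d * d))  # (1 + 1/d^2)^(-1)
print(x_final, fixed_point)
```

For d = 2 the fixed point is 0.8, and the integrated value agrees with it to high accuracy because the relaxation rate $1 + 1/d^2$ is of order one in these units.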
Once v turns to I, its children can turn it back to N. This happens at rate ≈ xd. First, let us analyze more carefully the dynamics of x. If $h \le H - 2$ then the children of v will themselves undergo the same process. In particular, they will turn to I at rate $z/d^2$. This means Eq. (E2) is modified to become
Now x will saturate at $x = d^2/z$. Putting this together, the N's will "fall" to lower levels more quickly than they can climb or be replaced. By the time they have fallen to the base, the N's correspond to the leaves of a subtree S with branching factor $d^2$. Then they begin slowly climbing back up.
Let us start at level H. Fix a vertex v ∈ S at height H − 1. After time $1/d^2$, v will turn to N. Then it will fall back down again after time $d^2/z$, during which it will have created another $d^2$ N's at level H. Now there are $2d^2$ N's below