Measurement-induced phase transitions on dynamical quantum trees

Monitored many-body systems fall broadly into two dynamical phases, ``entangling'' or ``disentangling'', separated by a transition as a function of the rate at which measurements are made on the system. Producing an analytical theory of this measurement-induced transition is an outstanding challenge. Recent work made progress in the context of tree tensor networks, which can be related to all-to-all quantum circuit dynamics with forced (postselected) measurement outcomes. So far, however, there are no exact solutions for dynamics of spin-1/2 degrees of freedom (qubits) with ``real'' measurements, whose outcome probabilities are sampled according to the Born rule. Here we define dynamical processes for qubits, with real measurements, that have a tree-like spacetime interaction graph, either collapsing or expanding the system as a function of time. The former case yields an exactly solvable measurement transition. We explore these processes analytically and numerically, exploiting the recursive structure of the tree. We compare the case of ``real'' measurements with the case of ``forced'' measurements. Both cases show a transition at a nontrivial value of the measurement strength, with the real measurement case exhibiting a smaller entangling phase. Both exhibit exponential scaling of the entanglement near the transition, but they differ in the value of a critical exponent. An intriguing difference between the two cases is that the real measurement case lies at the boundary between two distinct types of critical scaling. On the basis of our results we propose a protocol for realizing a measurement phase transition experimentally via an expansion process.


I. INTRODUCTION
If the unitary time evolution of a many-body quantum system is punctuated by measurements, made at a finite rate per local degree of freedom, the resulting dynamics can fall into one of two broad classes known as the "entangling" and "disentangling" dynamical phases, or the "weak monitoring" and "strong monitoring" phases. Separating these two phases is a measurement-induced phase transition (MPT), which can be crossed by increasing the rate at which measurements are performed.
In the original identification of the MPT [1,2], the transition was associated with the scaling properties of the entanglement entropy of a pure state (volume-law versus area-law). Subsequent work made clear that there are alternative useful formulations of the MPT. In particular, one can associate the MPT with a sharp change in the rate at which an initially mixed state is transformed into a pure state by the monitored dynamics [7]. In the disentangling phase the characteristic timescale for purification is only order-one, while in the entangling phase purification takes a time that is exponential in the system size. This purification timescale is also the timescale up to which quantum information from the initial state is retained in the final state [6,20,22,24], and is therefore related to questions like "do I need to know the initial state in order to deduce the final state, or can I deduce it from the measurement outcomes alone?".
While numerical studies have characterized the MPT in a wide variety of settings (see Refs. [25,26] for reviews), so far its critical properties have largely evaded an exact analytical treatment (except in the limit of infinite local Hilbert space dimension, where the phase transition reduces to a problem of the classical geometry of the circuit [1,27]). There is an active effort to describe the MPT and other models with related mathematical structure using effective statistical mechanical lattice models [8,9,22,28-40] or effective "Landau-Ginzburg-like" field theories [22,28]. But so far there is no theory that can reproduce either the numerically observed critical measurement rate or the critical exponents of the MPT in a low-dimensional system of qubits.
This challenge has motivated simplifying the spatial structure of the problem. Recent work examined the MPT in all-to-all coupled circuits [7,22,23]. In these models unitary gates are applied (at random, with equal rates) between any pair of spins/qubits, and these unitaries are interspersed with a finite rate of single-spin projective measurements. Such models take the spin system to a mean-field-like limit in which all pairs of spins can interact. The hope is that this limit will enable an analytical approach that is not available in finite-dimensional systems.
Recent progress in this direction has made use of the fact that the quantum circuit for a large all-to-all-coupled system has a locally tree-like structure [22], leading to the conjecture that the MPT in the all-to-all circuit coincides with an entanglement transition in an associated ensemble of tree tensor networks. This correspondence enables exact results in the case where the measurement outcomes are "forced", as described below.
In order to characterize the entanglement transition on a tree tensor network, one can define an "order parameter" $Z_k$ (defined more precisely below) which quantifies the degree of entanglement between the root and the leaves of a tree with $k$ generations. In the weak monitoring phase, this order parameter remains nonzero when the tree's size diverges. Importantly, the tree structure allows one to solve for the behavior of $Z_k$ as $k \to \infty$ using a recursion relation, giving analytic results for the critical measurement rate and the critical vanishing of $Z_k$ near the transition [22]. These results generalize to entanglement transitions in a large class of tree tensor networks. These developments, as well as the results in [41,42], suggest that tree tensor networks [41,43-51] are a useful tool for developing a mean-field theory for measurement and entanglement transitions.
Despite this progress, we do not yet have an exact solution for an MPT for qubits with true measurements, even in the context of quantum trees. In Ref. 22, a reduction to a tractable tree tensor network was obtained only for a circuit where measurement outcomes were predetermined, instead of being sampled with Born's rule. Physically, the former situation corresponds to a protocol where one runs the dynamics many times and then discards all realizations except those for which the measurements produce a desired sequence of outcomes (e.g., all spins are measured in the ↑ state). This protocol was dubbed the "forced measurement" case, and the resulting transition is called the "forced-measurement-induced phase transition" (FMPT). A key question, then, is whether the approach of Ref. 22 can be extended to describe real measurements, for which the measurement outcomes are sampled with the nontrivial, state-dependent probability given by Born's rule. More generally, one can ask how the MPT and FMPT differ in various settings. Do they have different critical points and/or different critical exponents? It would be valuable to have a model in which exact results can be obtained for a version of the MPT.
In this paper we define a different kind of many-body dynamics, the "collapse process", which has a solvable phase transition for real measurements as well as for forced measurements. This solvability allows us to examine the difference between real and forced measurements. We also define a closely-related "expansion process". Cartoons of these two processes are shown in Fig. 1. A potentially useful feature of dynamical trees is that they exhibit measurement phase transitions that are experimentally accessible without the need for postselection [12,52-56] on measurement outcomes, because the efficient contractability of tree tensor networks can be exploited to classically process experimental measurement outcomes.
The dynamics we discuss are not conventional quantum circuit dynamics with a fixed number of qubits. Instead we define dynamical protocols whose spacetime interaction graph resembles a tree, with the number of active qubits either decreasing or increasing in time. In the "quantum collapse process" (Fig. 1, left) we begin with $2^k \gg 1$ qubits, and by measuring and discarding half of the qubits at each timestep, we are left with a single qubit at time $t = k$. Interactions and weak measurements also take place during each timestep. In the "expansion process" (Fig. 1, right) we start with a single qubit and (by recruiting further spins) attempt to scramble its state into a many-body state of $2^t$ qubits. In both cases, we ask whether a maximally mixed initial state is purified by the monitored dynamics (the strong monitoring phase) or not (the weak monitoring phase). The purity of the state is quantified by an entropy that varies between 0 and 1 bit (since the root of the tree is a single spin); this entropy also represents the entanglement entropy between the root and the leaves. The entanglement transition is induced by varying the strength of the weak measurements that occur in every timestep. We focus mainly on the collapse process, where we give exact results for both MPT and FMPT. (Our results imply exact results for the expansion process FMPT, but not the expansion process MPT, as we discuss in more detail below.)
FIG. 1. [...] and expansion processes (Right). Each node of the tree represents an interaction between two qubits, preceded/followed by weak measurements. Viewing the tree as a tensor network, the symbol $t$ denotes a local three-legged tensor and $T$ denotes the full network. (These tensors are defined in the text.)
Let us contrast the "dynamical" trees studied here with the trees studied in Ref. [22]. In the present work, the "generation" coordinate of the tree is simply proportional to the physical time coordinate $t$. By contrast, in Ref. [22] trees arose more indirectly, starting from circuit dynamics with a fixed number of qubits: in that case the "generation" coordinate of the tree was not equivalent to physical time.$^1$ Ultimately, in both cases we must deal with a random ensemble of tree tensor networks. However, the fact that we consider real measurements here means the probability distribution for this ensemble of trees has nontrivial correlations between different nodes.
Crucially, the collapse model we introduce maintains a unitary invariance property for the distribution of the output states, even for the case of real measurements: as a result of the Haar-randomness of the interaction unitaries, there is no preferred local basis. This invariance allows us to locate the transition analytically by studying a recursive map [22]. The invariance also constrains the universality class of the phase transition [22,42]. Interestingly, the critical scaling forms are similar to those describing branching random walks with a moving absorbing wall [57]: the reason is that both problems are closely connected to traveling wave equations [58].
We find that the MPT and FMPT have qualitatively similar critical behavior, with an exponential vanishing of the order parameter as the transition is approached from the weak-monitoring side. However they have different values of the critical measurement strength, with the MPT exhibiting a smaller entangling phase than the FMPT. This difference can be seen in Fig. 2, which shows the von Neumann entropy $S_{\rm vN}$ of the final state (i.e. the final, unmeasured spin at the apex of the tree in Fig. 1) for a collapse process in the limit of long time (large initial system size). In the strong monitoring phase, $S_{\rm vN} \to 0$ in the limit of long time, while in the weak monitoring phase $S_{\rm vN}$ remains finite. For both types of dynamics, $S_{\rm vN}$ is plotted against a parameter $\theta$ (defined below) that characterizes the strength of weak measurements within each node of the tree; $\theta \to \pi/4$ corresponds to the limit of no measurement while $\theta \to \pi/2$ is the limit of strong, projective measurements.
Interestingly, the MPT and FMPT reside in distinct universality classes, with the MPT (but not the FMPT) located at the boundary separating two different regimes for the phase transition. These two regimes can be thought of as "strong" versus "weak" randomness. This result suggests that there may be slightly different protocols for which the MPT and FMPT exhibit more radical differences in their critical scaling, for example power-law versus exponential scaling of the order parameter.
The structure of this paper is as follows. In Sec. II we define the dynamical tree models. In Sec. III we study the forced measurement case using both simulations and theoretical analysis. In Section IV we turn to the case of real measurements, and contrast it with the forced measurement case. In Sec. V we discuss the expansion process, and suggest a protocol for realizing the measurement phase transition on a tree experimentally. We conclude in Sec. VI with a summary and a discussion of potential future work.

II. DYNAMICAL TREE MODELS
The quantum dynamics we consider are of two kinds. In the collapse process we start with a large number of spins (qubits). In each timestep the number of "active" spins is halved, because, after the spins interact in pairs, half of them are projectively measured and discarded (Fig. 1). This process gives a tree structure in spacetime, with only a single spin remaining at the top of the tree, i.e. at the final time. The question we ask is whether the state of this final spin preserves quantum information from the initial state [7]. This question may be formalized by initializing the spins in the lowest layer of the tree in the maximally mixed state and asking whether the final spin is in a pure state, or whether it is in a mixed state with a nonzero entropy (which can be at most ln 2, since the final spin is a 2-state system). This entropy is a measure of the amount of surviving information, and is also an order parameter for the phase transition.
The expansion process is, loosely speaking, the inverse process. Initially the "system" consists of a single spin. Further spins (initialized in a definite state, |↑⟩) are recruited at each timestep so that at time $t$ the system consists of $2^t$ spins. Again we ask whether a maximally mixed initial state is purified or not.
An equivalent way to formulate this question about purification [12] in the expansion process is to start with a state in which the initial spin is maximally entangled with a "reference" spin. In this case the entropy of interest is also equal to the entanglement, at the final time, between the reference spin and the collection of $2^t$ system spins (which is again no larger than ln 2).
The detailed dynamics are specified below and include unitary interactions and weak measurements. Equivalent models could also be formulated using only projective measurements, at the cost of a more complex node interaction involving ancilla spins (a trivial way to see this equivalence is by using the physical interpretation of a weak measurement in terms of an ancilla$^2$).
FIG. 3. An illustration of the tree model studied in this paper (for the collapse process), drawn here with $k = 3$ generations. Each line represents the world line of one of the $2^k$ spins in the system. The red $K$ blocks (defined by Eq. 1) represent single-spin weak measurements. The blue $U$ blocks represent independent Haar-random, two-spin unitary operators. The red bras are associated with projective (strong) single-spin measurements. A single node of the tree (a circle in Fig. 1), comprising all three operations, is indicated by the orange dashed box.
From a formal point of view, the basic objects describing the evolution are tree tensor networks, built up (as shown in Fig. 3 for the collapse process) from: unitary gates $U$; Kraus operators $K$ (representing weak measurements, see below); and kets/bras associated with the spins that are recruited/discarded in each timestep. The set of allowed tree tensor networks is identical for both the collapse and expansion processes, just with the time coordinate reversed. In both cases the order parameter is a simple property of the singular value decomposition of the tree (for a decomposition separating the "trunk" from the "leaves"). However, the probability distribution on this set of tree tensor networks depends crucially on the physical interpretation of the dynamics, because applying Born's rule (the usual quantum mechanical probability for measurement outcomes) leads to correlations between the constituent elements of the tensor network. We now describe the dynamics more carefully.

A. The collapse process: basic "node"
The basic ingredient of the collapse process (i.e. a node of the tree) is a random operation that takes a state of two spins as the input and gives a state of a single spin as output. The node consists of the following sequence of three steps: (i) a weak measurement is applied independently to each of two spins along the lower bonds, (ii) a Haar-random, two-spin unitary operator $U$ is applied to the two spins, and (iii) one of the two spins undergoes a projective measurement.
$^2$ A weak measurement can be achieved by allowing the spin of interest to interact for a finite time with an ancilla degree of freedom, and then projectively measuring the ancilla [59].
The Haar-randomness of the two-site unitaries means that it does not matter whether we choose to make all the measurements in the $S^z$ basis, or whether we choose a new random basis for each measurement. These two measurement protocols are related by a "gauge transformation" which involves rotating the measurement bases and making compensating rotations of the 2-site unitaries (see Appendix A for more details).$^3$ For notational convenience later on, we will take each weak measurement to be in an independent random basis, but the strong measurements to be in the $S^z$ basis.
The unmeasured spin that remains after (i)-(iii) is then used as an input for another node of the tree, unless it is the final spin at the apex of the tree. The resulting tree is depicted in Fig. 3. The strength of the weak measurement in step (i) is the major parameter of the model.
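These ingredients may be formalised as follows. Each weak measurement is described using a Kraus operator
$$K_\sigma = u^\dagger \left( \sin\theta \, |\sigma\rangle\langle\sigma| + \cos\theta \, |\bar\sigma\rangle\langle\bar\sigma| \right) u, \tag{1}$$
where $\sigma = \uparrow$ or $\downarrow$ labels the measurement outcome, $\bar\sigma$ denotes the opposite outcome, $|\sigma\rangle$ denotes an $S^z$ eigenstate, and $u$ is a Haar-random unitary rotation that randomizes the measurement basis. (An independent random basis is chosen for each weak measurement, but to avoid clutter in our notation we will leave the dependence of $K_\sigma$ on $u$ implicit.) If a given spin has the initial $2 \times 2$ density matrix $\rho$, then after a weak measurement with outcome $\sigma$ the spin has the normalised density matrix
$$\rho \to \frac{K_\sigma \rho K_\sigma^\dagger}{\mathrm{tr}\left(K_\sigma \rho K_\sigma^\dagger\right)}. \tag{2}$$
If the weak measurement is a "true" measurement, the probability of the outcome $\sigma$ is equal to
$$p_\sigma = \mathrm{tr}\left(K_\sigma \rho K_\sigma^\dagger\right). \tag{3}$$
For a forced measurement, we fix the outcome to $\sigma = \uparrow$, without loss of generality. See the following subsection for a comment on the physical interpretation of forced measurements.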
When θ = π/4 the Kraus operator is proportional to the identity and no measurement occurs, while when θ = π/2 the Kraus operator is a projection operator, giving a conventional projective measurement. For both the MPT and the FMPT we will find that the critical value of θ lies in the interior of this interval, with θ = π/4 being in the entangling phase and θ = π/2 in the disentangling phase.
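The two limits of $\theta$ described above can be checked numerically. The following Python sketch assumes the parametrization $K_\sigma = u^\dagger(\sin\theta\,|\sigma\rangle\langle\sigma| + \cos\theta\,|\bar\sigma\rangle\langle\bar\sigma|)u$, which is one form consistent with the stated limits and with the completeness relation; the function names `haar_unitary` and `kraus` are illustrative and are not taken from the paper.

```python
import numpy as np

def haar_unitary(n, rng):
    """Sample an n x n Haar-random unitary via QR of a complex Gaussian matrix."""
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))  # fix column phases so the distribution is Haar

def kraus(theta, sigma, u):
    """Assumed form: K_sigma = u^dag (sin(theta)|s><s| + cos(theta)|sbar><sbar|) u.

    sigma = 0 stands for 'up' and sigma = 1 for 'down'.
    """
    weights = np.full(2, np.cos(theta))
    weights[sigma] = np.sin(theta)
    return u.conj().T @ np.diag(weights) @ u

rng = np.random.default_rng(0)
u = haar_unitary(2, rng)
rho = np.eye(2) / 2  # maximally mixed input spin

for theta in (np.pi / 4, 3 * np.pi / 8, np.pi / 2):
    K = [kraus(theta, s, u) for s in (0, 1)]
    completeness = sum(k.conj().T @ k for k in K)   # should equal the identity
    probs = [np.trace(k @ rho @ k.conj().T).real for k in K]  # Born probabilities
    print(f"theta={theta:.3f}  complete={np.allclose(completeness, np.eye(2))}  p={np.round(probs, 3)}")
```

Under this assumed form, $\theta = \pi/4$ gives $K_\sigma = 1/\sqrt{2}$ for both outcomes, so the state is unchanged, while $\theta = \pi/2$ gives a rank-one projector in the rotated basis.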
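Let the initial density matrices of the (uncorrelated) spins involved in the node be $\rho_1$ and $\rho_2$. Then the action of the weak measurements is
$$\rho_1 \otimes \rho_2 \to \rho' = \frac{\left(K^{(1)}_{\sigma_1} \rho_1 K^{(1)\dagger}_{\sigma_1}\right) \otimes \left(K^{(2)}_{\sigma_2} \rho_2 K^{(2)\dagger}_{\sigma_2}\right)}{\mathrm{tr}\left(K^{(1)}_{\sigma_1} \rho_1 K^{(1)\dagger}_{\sigma_1}\right) \mathrm{tr}\left(K^{(2)}_{\sigma_2} \rho_2 K^{(2)\dagger}_{\sigma_2}\right)}, \tag{4}$$
with the appropriate probabilities for the outcomes $\sigma_1$, $\sigma_2$. (We write the superscripts (1) and (2) for the Kraus operators because $K^{(1)}_\sigma \neq K^{(2)}_\sigma$ as a result of the independent choices of random measurement basis for each spin.) Next, in step (ii) we generate a $4 \times 4$ Haar-random unitary matrix $U$ and act on the combined state of the two spins so that the new $4 \times 4$ density matrix is
$$\rho'' = U \rho' U^\dagger. \tag{5}$$
Finally, in step (iii) we perform a projective measurement of one spin (taken here to be spin 2, without loss of generality) in the $S^z$ basis. For measurement outcome $\sigma'_2$ (which can again either be selected with Born's rule, or postselected to enforce $\sigma'_2 = \uparrow$) the corresponding projection operator is
$$P_{\sigma'_2} = 1 \otimes |\sigma'_2\rangle\langle\sigma'_2|, \tag{6}$$
and the final $2 \times 2$ density matrix for the first spin after all three steps is
$$\rho_f = \frac{\mathrm{tr}_2\left(P_{\sigma'_2} \rho'' P_{\sigma'_2}\right)}{\mathrm{tr}\left(P_{\sigma'_2} \rho'' P_{\sigma'_2}\right)}. \tag{7}$$
Here $\mathrm{tr}_2$ denotes the partial trace over the second spin, while $\mathrm{tr}$ denotes the trace over both spins. For real measurements, the joint probability $p(\sigma_1, \sigma_2, \sigma'_2)$, given the initial state and the random unitaries, can be written as the product of the probabilities for the earlier measurements and the conditional probability for the later one:
$$p(\sigma_1, \sigma_2, \sigma'_2) = \mathrm{tr}\left(K^{(1)}_{\sigma_1} \rho_1 K^{(1)\dagger}_{\sigma_1}\right) \mathrm{tr}\left(K^{(2)}_{\sigma_2} \rho_2 K^{(2)\dagger}_{\sigma_2}\right) \mathrm{tr}\left(P_{\sigma'_2} \rho'' P_{\sigma'_2}\right). \tag{8}$$
Anticipating the notation of the next section, the conditional probability can be written more conveniently in terms of a node tensor $t$,
$$p(\sigma_1, \sigma_2, \sigma'_2) = \mathrm{tr}\left[ t(\sigma_1, \sigma_2, \sigma'_2) \, (\rho_1 \otimes \rho_2) \, t(\sigma_1, \sigma_2, \sigma'_2)^\dagger \right]. \tag{9}$$
Similarly, the output density matrix is
$$\rho_f = \frac{t(\sigma_1, \sigma_2, \sigma'_2) \, (\rho_1 \otimes \rho_2) \, t(\sigma_1, \sigma_2, \sigma'_2)^\dagger}{\mathrm{tr}\left[ t(\sigma_1, \sigma_2, \sigma'_2) \, (\rho_1 \otimes \rho_2) \, t(\sigma_1, \sigma_2, \sigma'_2)^\dagger \right]}. \tag{10}$$

B. Collapse process as a tensor network
The full collapse process can be described with the tensor network $T$ which is shown in Fig. 3. This tensor network can be viewed as a $2 \times M$ rectangular matrix, where the number of rows, 2, is the Hilbert space dimension of the final remaining spin (the topmost bond of $T$) and the number of columns, $M = 2^{2^k}$, is the Hilbert space dimension of the $2^k$ initial spins (the lowermost bonds).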
T depends on the various random variables as follows.
First, there are the various random unitaries. We denote the complete set of all the unitaries appearing in $T$ by $\mathcal{U}$. For each node $r$ there is a two-site random unitary $U_r$, together with two single-site random unitaries, $u^{(1)}_r$ and $u^{(2)}_r$, that appear inside the Kraus operators and set the weak measurement bases. Second, $T$ depends on the lists, denoted $m_W$ and $m_S$, of measurement outcomes for the weak and strong measurements respectively. We will sometimes make the dependence on the measurement outcomes explicit, $T = T(m_W, m_S)$, but we will leave the dependence of $T$ on the various unitaries implicit. We will do the same for the local node tensors $t$ described below.
The tensor network $T$ is built by contracting together three-index tensors $(t_r)^a_{bc}$, where $r$ labels a trivalent node of the tensor network (from now on we will suppress this subscript) and $a$, $b$, $c$ are the bond indices. The node tensor is constructed as follows. First, we define
$$t(\sigma_1, \sigma_2) = U \left( K^{(1)}_{\sigma_1} \otimes K^{(2)}_{\sigma_2} \right), \tag{11}$$
which is an operator acting on two spins. We write its components as $t^{ab}_{cd}$, where the positions of the indices reflect their geometrical position in the tensor network: the indices $(a, b)$ are for the upper bonds of the tensor, and $(c, d)$ are for the lower bonds. Viewed as a matrix, the multi-index $(a, b)$ makes up the row index of $t$ and $(c, d)$ makes up the column index.
The strong measurement outcome fixes the value of the index $b$, so the desired three-index tensor, conditioned on all three measurement outcomes, is:
$$t^{a}_{cd}(\sigma_1, \sigma_2, \sigma'_2) = t^{a \sigma'_2}_{cd}(\sigma_1, \sigma_2). \tag{12}$$
If we take the $2^k$ spins that comprise the initial state to be in a maximally mixed state, so that the initial density matrix of the system is proportional to the identity matrix, $\rho = 1/M$, then the final state of the last spin can be written as (again we make the dependence on the measurement outcomes explicit):
$$\rho_f(m_W, m_S) = \frac{T(m_W, m_S) \, T(m_W, m_S)^\dagger}{\mathrm{tr}\left[ T(m_W, m_S) \, T(m_W, m_S)^\dagger \right]}. \tag{13}$$
The matrix multiplication in $T T^\dagger$ contracts all the bottom bonds of the tensor network $T$ (Fig. 3) with the corresponding bonds of $T^\dagger$, leaving a $2 \times 2$ matrix.
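The construction of the node tensor can be sketched numerically. The Python sketch below assembles $t(\sigma_1, \sigma_2, \sigma'_2)$ as a $2 \times 4$ matrix and evaluates the outcome probability and the output state of a single node. The Kraus parametrization, the index ordering (spin 1 as the leading Kronecker factor), and all function names are assumptions made for illustration.

```python
import itertools
import numpy as np

def haar_unitary(n, rng):
    """Sample an n x n Haar-random unitary via QR of a complex Gaussian matrix."""
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))

def kraus(theta, sigma, u):
    """Assumed weak-measurement Kraus operator; sigma = 0 (up) or 1 (down)."""
    weights = np.full(2, np.cos(theta))
    weights[sigma] = np.sin(theta)
    return u.conj().T @ np.diag(weights) @ u

def node_tensor(theta, s1, s2, s2p, U, u1, u2):
    """2 x 4 matrix t^a_{cd}: weak measurements, then U, then <s2p| on spin 2."""
    full = U @ np.kron(kraus(theta, s1, u1), kraus(theta, s2, u2))  # t^{ab}_{cd}
    return full.reshape(2, 2, 4)[:, s2p, :]  # fix the upper index b = s2p

def node_output(t, rho1, rho2):
    """Return (outcome probability, normalised output state) for inputs rho1, rho2."""
    out = t @ np.kron(rho1, rho2) @ t.conj().T
    p = np.trace(out).real
    return p, out / p

rng = np.random.default_rng(1)
theta = 0.3 * np.pi
U, u1, u2 = haar_unitary(4, rng), haar_unitary(2, rng), haar_unitary(2, rng)
rho1 = rho2 = np.eye(2) / 2  # maximally mixed inputs

total = 0.0
for s1, s2, s2p in itertools.product((0, 1), repeat=3):
    p, rho_f = node_output(node_tensor(theta, s1, s2, s2p, U, u1, u2), rho1, rho2)
    total += p
print("sum of Born probabilities over the 8 outcomes:", total)
```

For normalised inputs the eight outcome probabilities should sum to one, which is a quick consistency check on the index conventions.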
We now consider the probability distribution for the tensor network T for protocols involving either forced or true measurements.
In all cases, the unitaries in the set U are drawn independently from the Haar distribution. In the forced measurement case, this set of unitaries is sufficient to fix T entirely. Using the unitaries we have drawn, we can form the local node tensors t = t(↑, ↑, ↑).
Note that in this forced measurement case the local $t$ tensors for the nodes are independently random, i.e. they are uncorrelated between nodes.$^4$ In the true measurement case the local measurement outcomes, and therefore the local tensors $t = t(\sigma_1, \sigma_2, \sigma'_2)$, are correlated. For true measurements, the tree $T(m_W, m_S)$ is determined by both $\mathcal{U}$ and the measurement trajectory. The probability distribution for $T(m_W, m_S)$ is the product of the Haar distribution for $\mathcal{U}$ and the conditional probability (given $\mathcal{U}$) of obtaining the trajectory $(m_W, m_S)$. The probability of such a trajectory may be written as
$$p(m_W, m_S) = \frac{1}{M} \mathrm{tr}\left[ T(m_W, m_S) \, T(m_W, m_S)^\dagger \right]. \tag{14}$$
In practice, rather than using this expression for the full trajectory, it will be more convenient to think of the process as a Markov process, in which at each time step the density matrices are acted on in the manner described in the previous section.
We have distinguished the case where all measurements are true from the case where all measurements are forced. We will also mention a mixed case, where the weak measurements are true measurements, but the strong measurements are postselected to be ↑. In this case, the probability for a given tree is determined by conditioning the probability Eq. 14 on having $m_S = 0$, where $0$ is the list (↑, ↑, . . . , ↑) of length $2^k - 1$. This process gives$^5$
$$p_{\rm mixed}(m_W) = \frac{1}{2} \mathrm{tr}\left[ T(m_W, 0) \, T(m_W, 0)^\dagger \right] \tag{15}$$
(where the nontrivial probability distribution is only for $m_W$, since $m_S$ is fixed). As we will discuss, the distribution of weak measurement outcomes on the right-hand side of Eq. 15 is also precisely the one relevant to the expansion process with true (weak) measurements.
$^4$ Physically, we could sample the forced measurement trees using the following procedure. We first choose $\mathcal{U}$, then attempt to run the dynamics. If the dynamics produces a "wrong" measurement outcome, i.e. ↓, we discard that run and re-run the dynamics (in fact it is sufficient to re-run the relevant subtree) using the same $\mathcal{U}$. We repeat this process until we get an all-↑ measurement record for the given $\mathcal{U}$. Then we resample $\mathcal{U}$ and repeat. Note that this protocol does not bias the distribution of $\mathcal{U}$: all unitaries are Haar-distributed.
$^5$ In more detail: in terms of $p(m_W, m_S)$ in Eq. 14, this conditional probability is
$$p_{\rm mixed}(m_W) = \frac{p(m_W, 0)}{\sum_{m'_W} p(m'_W, 0)}.$$
Using (14) and repeatedly using the relation $\sum_\sigma K_\sigma K_\sigma^\dagger = 1$ for the Kraus operators, the denominator simplifies to $2/M = 2^{1-2^k}$.
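The normalisation of the trajectory probabilities can be verified by brute force on a small tree. The Python sketch below enumerates every measurement record for $k = 2$ generations (three nodes, $M = 16$), assuming $p(m_W, m_S) = \mathrm{tr}[T T^\dagger]/M$ and the Kraus parametrization $K_\sigma = u^\dagger(\sin\theta\,|\sigma\rangle\langle\sigma| + \cos\theta\,|\bar\sigma\rangle\langle\bar\sigma|)u$. The helper names and the ordering of outcomes in the lists are hypothetical choices made for this illustration.

```python
import itertools
import numpy as np

def haar_unitary(n, rng):
    """Sample an n x n Haar-random unitary via QR of a complex Gaussian matrix."""
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))

def kraus(theta, sigma, u):
    """Assumed weak-measurement Kraus operator; sigma = 0 (up) or 1 (down)."""
    weights = np.full(2, np.cos(theta))
    weights[sigma] = np.sin(theta)
    return u.conj().T @ np.diag(weights) @ u

def node_tensor(theta, s1, s2, s2p, U, u1, u2):
    """2 x 4 node tensor for weak outcomes (s1, s2) and strong outcome s2p."""
    full = U @ np.kron(kraus(theta, s1, u1), kraus(theta, s2, u2))
    return full.reshape(2, 2, 4)[:, s2p, :]

rng = np.random.default_rng(2)
theta = 0.3 * np.pi
Us = [haar_unitary(4, rng) for _ in range(3)]   # one two-spin unitary per node
us = [haar_unitary(2, rng) for _ in range(6)]   # one basis rotation per weak measurement
M = 16                                          # Hilbert space dimension of 2^k = 4 spins

def tree(mW, mS):
    """T for k = 2 as a 2 x 16 matrix: two bottom nodes feeding one top node."""
    left = node_tensor(theta, mW[0], mW[1], mS[0], Us[0], us[0], us[1])
    right = node_tensor(theta, mW[2], mW[3], mS[1], Us[1], us[2], us[3])
    top = node_tensor(theta, mW[4], mW[5], mS[2], Us[2], us[4], us[5])
    return top @ np.kron(left, right)

def prob(mW, mS):
    """Trajectory probability tr[T T^dag] / M for a maximally mixed initial state."""
    T = tree(mW, mS)
    return np.trace(T @ T.conj().T).real / M

weak_records = list(itertools.product((0, 1), repeat=6))
strong_records = list(itertools.product((0, 1), repeat=3))
total = sum(prob(w, s) for w in weak_records for s in strong_records)
all_up_strong = sum(prob(w, (0, 0, 0)) for w in weak_records)
print("sum over all records:", total)
print("sum with strong outcomes forced up:", all_up_strong, " 2/M =", 2 / M)
```

The second sum is the denominator that appears when conditioning on $m_S = 0$, so comparing it with $2/M$ checks the normalisation used for the mixed process.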

C. Characterizing the transition
We adopt the perspective mentioned in the introduction, and introduced in Ref. [7], of whether a maximally mixed initial state is purified by the dynamics. This is equivalent to studying the singular value decomposition of the tree tensor network/rectangular matrix $T$.
As above, the $2^k$ spins in the collapse process are initially in a maximally mixed state. We ask whether the density matrix $\rho_f$ of the unmeasured spin at the apex of the tree is in a pure state or a mixed state. In its eigenbasis, $\rho_f$ may be written
$$\rho_f = \begin{pmatrix} 1 - Z & 0 \\ 0 & Z \end{pmatrix}. \tag{16}$$
The value of $Z$, the smaller of the two eigenvalues of $\rho_f$, determines how mixed the state is. $Z = 0$ corresponds to a pure state, while $Z = 1/2$ describes a maximally mixed state. In the near-critical regime, where $Z$ is small, the Rényi entropies are approximately $S_{\rm vN} \equiv S_1 \simeq Z \ln Z^{-1}$, and $S_n \simeq \frac{n}{n-1} Z$ for $n > 1$.
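The small-$Z$ forms of the entropies can be compared with the exact expressions for a qubit with eigenvalues $(1 - Z, Z)$. The following Python sketch is only an illustration; the function name `renyi_entropy` is an assumption, and natural logarithms are used throughout.

```python
import numpy as np

def renyi_entropy(Z, n):
    """Renyi entropy S_n (natural log) of a qubit with eigenvalues (1 - Z, Z)."""
    p = np.array([1.0 - Z, Z])
    if n == 1:  # von Neumann limit
        p = p[p > 0]
        return float(-np.sum(p * np.log(p)))
    return float(np.log(np.sum(p ** n)) / (1 - n))

for Z in (1e-2, 1e-4, 1e-6):
    ratio_vn = renyi_entropy(Z, 1) / (Z * np.log(1 / Z))   # S_vN / (Z ln 1/Z)
    ratio_2 = renyi_entropy(Z, 2) / (2 * Z)                # S_2 / (n Z / (n - 1)) with n = 2
    print(f"Z={Z:.0e}  S_vN ratio={ratio_vn:.4f}  S_2 ratio={ratio_2:.6f}")
```

Expanding $-Z \ln Z - (1-Z)\ln(1-Z)$ shows that the von Neumann form carries a relative correction of order $1/\ln Z^{-1}$, so it converges only logarithmically, whereas the $n > 1$ forms are accurate up to corrections that vanish as a power of $Z$.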
We use the notation $Z_k$ to denote the smaller eigenvalue of a tree of $k$ generations. In the limit of large $k$, $Z_k$ yields an order parameter for the MPT or FMPT: the entangling phase is defined by the fact that the typical (or the average) value of $Z_k$ is nonzero in the limit $k \to \infty$. As discussed above, $Z$ is closely related to the entropies, so that having finite $Z$ in the limit $k \to \infty$ is equivalent to having a nonzero entropy in the limit of a large tree. We have already shown numerical data for the average entropy $\langle S_{\rm vN} \rangle$ in Fig. 2, illustrating the critical vanishing at the relevant critical value of $\theta$.
The evolution of the distribution of $Z_k$ as a function of $k$ can be described recursively, as detailed in the following sections. For an analytical treatment it will be convenient to focus on the typical value of $Z_k$ [58], defined by$^6$
$$Z^{\rm typ}_k \equiv \exp \langle \ln Z_k \rangle, \tag{17}$$
instead of the mean value $\langle Z_k \rangle$. This distinction is important because $Z_k$ may have a broad statistical distribution for trees of size $k \gg 1$, so that $\langle Z_k \rangle$ is dominated by rare samples and is much larger than the typical value. (Nevertheless, $\lim_{k\to\infty} Z^{\rm typ}_k$ and $\lim_{k\to\infty} \langle Z_k \rangle$ are both nonzero precisely in the entangling phase.)
To recap, the phenomenology of the transition is that in the entangling phase the typical value $Z^{\rm typ}_k$ tends to a positive constant in the limit $k \to \infty$, indicating that a nonzero entropy is retained in the final state, while in the disentangling phase, $Z^{\rm typ}_k \to 0$ as $k \to \infty$.
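The phenomenology recapped above can be explored with a direct Monte Carlo simulation of the collapse process, treating each node as one step of a Markov process on density matrices. The Python sketch below is a brute-force illustration and not the recursive analysis developed in the following sections. It assumes the Kraus parametrization $K_\sigma = u^\dagger(\sin\theta\,|\sigma\rangle\langle\sigma| + \cos\theta\,|\bar\sigma\rangle\langle\bar\sigma|)u$ and takes the typical value to be $\exp\langle \ln Z_k \rangle$; all function names, the tree depth, and the sample count are arbitrary illustrative choices.

```python
import numpy as np

def haar_unitary(n, rng):
    """Sample an n x n Haar-random unitary via QR of a complex Gaussian matrix."""
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))

def kraus(theta, sigma, u):
    """Assumed weak-measurement Kraus operator; sigma = 0 (up) or 1 (down)."""
    weights = np.full(2, np.cos(theta))
    weights[sigma] = np.sin(theta)
    return u.conj().T @ np.diag(weights) @ u

def weak_measure(rho, theta, rng, forced):
    """Weak measurement in a random basis; outcome forced to 'up' or Born-sampled."""
    u = haar_unitary(2, rng)
    sigma = 0
    if not forced:
        K_up = kraus(theta, 0, u)
        p_up = np.trace(K_up @ rho @ K_up.conj().T).real
        sigma = 0 if rng.random() < p_up else 1
    K = kraus(theta, sigma, u)
    out = K @ rho @ K.conj().T
    return out / np.trace(out).real

def node(rho1, rho2, theta, rng, forced):
    """One node: weak measurements, Haar-random U, projective measurement of spin 2."""
    rho = np.kron(weak_measure(rho1, theta, rng, forced),
                  weak_measure(rho2, theta, rng, forced))
    U = haar_unitary(4, rng)
    rho = (U @ rho @ U.conj().T).reshape(2, 2, 2, 2)   # indices (a, b, a', b')
    branches = [rho[:, s, :, s] for s in (0, 1)]       # <s|_2 rho |s>_2, unnormalised
    s = 0
    if not forced:
        s = 0 if rng.random() < np.trace(branches[0]).real else 1
    return branches[s] / np.trace(branches[s]).real

def final_state(k, theta, rng, forced):
    """State of the apex spin for a tree of k generations with maximally mixed leaves."""
    if k == 0:
        return np.eye(2, dtype=complex) / 2
    return node(final_state(k - 1, theta, rng, forced),
                final_state(k - 1, theta, rng, forced), theta, rng, forced)

def smaller_eigenvalue(rho):
    """Z: the smaller eigenvalue of a 2 x 2 density matrix, clipped at zero."""
    return max(float(np.linalg.eigvalsh(rho)[0]), 0.0)

def z_typ(k, theta, samples, rng, forced):
    """Monte Carlo estimate of exp(<ln Z_k>); tiny floor avoids log(0)."""
    logs = [np.log(max(smaller_eigenvalue(final_state(k, theta, rng, forced)), 1e-300))
            for _ in range(samples)]
    return float(np.exp(np.mean(logs)))

rng = np.random.default_rng(3)
for theta in (0.27 * np.pi, 0.45 * np.pi):
    print(f"theta/pi={theta / np.pi:.2f}  real: {z_typ(5, theta, 50, rng, False):.3e}"
          f"  forced: {z_typ(5, theta, 50, rng, True):.3e}")
```

The cost of one sample grows as $2^k$, so this approach is only practical for shallow trees; it is meant as a cross-check on small systems rather than a way to locate the critical point.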

D. Expansion process
The expansion process at time $t = k$ (Fig. 1, Right) is characterized by a tensor network that relates the state of a single initial spin to the state of all $2^k$ spins at the final time. (This final state includes the $2^k - 1$ spins that were recruited during the $k$ timesteps.) The relevant tensor networks are related to the ones in the collapse process, defined in Sec. II B, by reversing the direction of the arrow of time.
For the case of forced measurements, where all measurements are postselected to be ↑, the expansion process is precisely equivalent to the collapse process except for this time reversal. A node in the expansion process which "recruits" a spin in the state |↑⟩ can be viewed as the time reversal of a node in the collapse process that discards a spin after a forced measurement in the state |↑⟩. More precisely, the relevant ensemble of tensor networks is exactly the same for the two forced measurement processes (with the probability distribution given simply by the Haar distributions for the unitaries). Since the FMPT is a property of the ensemble of tensor networks, it follows that the expansion process has all the same critical properties as the collapse process in the FMPT case.
In the expansion process with true measurements, the weak measurement outcomes $m_W$ are sampled with Born's rule (there are no strong measurements in the expansion process). This process is not equivalent to the collapse process with only true measurements. Instead, writing the relevant probability $p_{\rm expansion}(m_W)$ in terms of $T$ shows that $p_{\rm expansion}(m_W)$ is equal to the probability $p_{\rm mixed}(m_W)$ of the "mixed" collapse process in Eq. 15, where the weak measurements are true but the strong ones are forced.
In our discussion of the MPT in Sec. IV we focus on the collapse process. We comment in Sec. V on an efficient experimental protocol for detecting the transition in an expansion process.

III. FORCED MEASUREMENTS
Before considering real measurements, in this section we examine the case of forced measurements, as these can be understood by a straightforward application of the approach in Ref. 22. We briefly recapitulate this approach, then give the resulting critical properties for the FMPT. As mentioned above, in the forced measurement case all measurement outcomes are fixed in advance to give the ↑ state in the corresponding measurement basis.
Below we discuss the collapse process FMPT. However the results below also apply for the expansion process FMPT, for the reason described in Sec. II D: the same ensemble of tensor networks describes both problems.

A. Linearized Recursion Relation
In order to identify a recursion relation for the probability distribution of the smaller eigenvalue $Z$ (see Eq. 16), we imagine the process of constructing a tree with $k + 1$ generations from two smaller subtrees, each with $k$ generations. We denote the probability distribution of the smaller eigenvalue for a tree of $k$ generations by $\mu_k(Z)$.
Suppose that the two subtrees have final states that are described by density matrices $\rho_{k,1}$ and $\rho_{k,2}$, respectively, with corresponding smaller eigenvalues $Z_{k,1}$ and $Z_{k,2}$. Since the two subtrees are statistically independent, each of these eigenvalues is drawn independently from the distribution $\mu_k$. Combining these trees into a larger tree with $k + 1$ generations via the "node" operation described in Sec. II A, with the appropriate probabilities for the unitaries and measurements, gives a new final density matrix $\rho_{k+1}$ whose smaller eigenvalue $Z_{k+1}$ is distributed according to $\mu_{k+1}$.
In considering this process of combining subtrees, it is in fact sufficient to take ρ k,1 and ρ k,2 to be diagonal and of the form in Eq. 16. This simplification arises because the unitary transformations required to achieve this diagonal form can be absorbed into the Haar-random unitaries involved in the node tensor, without changing the probability distribution of these operations. This is a crucial simplification from Haar randomness: it means we can deal with a recursion relation only for the eigenvalue Z, without having to keep track of how the associated eigenvectors of the density matrix evolve.
In a given instance, the new eigenvalue Z k+1 depends on the values of Z k,1 and Z k,2 , as well as on the variables in the node (U , and the random basis rotations in K (1) and K (2) ). For example, if Z k,1 = Z k,2 = 0, then we necessarily have Z k+1 = 0, since the action of the node always takes two independent pure states into a single pure state.
In order to understand the critical properties near the transition, it is sufficient to consider the case where $Z_k$ is very small compared to unity (either vanishing at large $k$ in the disentangling phase, or tending to a very small constant just inside the entangling phase). Much of the key information is contained in the leading-order (linear) expansion of $Z_{k+1}$ in terms of $Z_{k,1}$ and $Z_{k,2}$ (Eq. 18): $Z_{k+1} \simeq A_1 Z_{k,1} + A_2 Z_{k,2}$. For a given node, the coefficients $A_1$ and $A_2$ can be expressed in terms of the three-index tensor $t$ defined in Sec. II B, which makes up a local node of the tree tensor network. For the true measurement problem, this tensor depends on the measurement outcomes at the node, whereas for forced measurements the outcomes are fixed in advance. The tensor $t$ is therefore defined by the Haar-random 2-site interaction unitary and the two Haar-random 1-site unitaries appearing in the Kraus operators (cf. Eqs. 11, 12): In terms of $t$, Eq. 7 becomes where $t^a_{cd}$ is treated as a $2 \times 4$ rectangular matrix whose row index is $a$ and whose column index is $(c, d)$.
From this expression for $\rho_f$ we can write the expression for $\mathrm{tr}\,\rho_f^2$, and the latter can be expanded in $Z_{k+1}$ as $\mathrm{tr}\,\rho_f^2 \simeq 1 - 2Z_{k+1}$. Explicitly evaluating the left-hand side gives the random coefficients in (18) as:
In order to understand the evolution of $Z_k$ with increasing $k$, we define the generating function [58] $G_k(x) = \langle \exp(-e^{-x} Z_k) \rangle$. Here, $\langle \ldots \rangle$ indicates the average over $Z_k$ with distribution $\mu_k$. A heuristic way to think of $G_k(x)$ is as a smeared version of the cumulative probability distribution of $\ln Z_k$: for $x \gg \ln Z^{\mathrm{typ}}$, $G_k(x)$ has a plateau at unity, and for $x \ll \ln Z^{\mathrm{typ}}$ it has a plateau at zero, as illustrated in Fig. 4. As in Ref. [22], we first study the linearized recursion relation (Eq. 18), and we then take into account the nonlinear terms. We are interested in the distribution of $Z_k$ for large $k$, i.e. in the nature of the probability distribution $\mu_k$ after many recursive steps.
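The plateau structure of the generating function can be illustrated numerically. The sketch below assumes the form $G_k(x) = \langle \exp(-e^{-x} Z_k) \rangle$, which is the standard choice in the travelling-wave literature [58] and is consistent with the plateaus described above. The function name `generating_function` and the log-normal toy ensemble are illustrative assumptions, not part of the tree model.

```python
import numpy as np

def generating_function(z_samples, x):
    """Estimate G(x) = < exp(-exp(-x) * Z) > from samples of Z (assumed form)."""
    z = np.asarray(z_samples, dtype=float)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    # One row per x value, one column per sample; average over samples.
    return np.exp(-np.exp(-x)[:, None] * z[None, :]).mean(axis=1)

rng = np.random.default_rng(0)
# Toy ensemble (an assumption, not the tree model): ln Z is normal around -20.
z_toy = np.exp(rng.normal(loc=-20.0, scale=2.0, size=10_000))

xs = np.array([-60.0, -20.0, 20.0])   # far left of, near, and far right of ln Z_typ
g = generating_function(z_toy, xs)
print(g)  # close to 0 on the left, intermediate near ln Z_typ, close to 1 on the right
```

In this picture the front position tracks $\ln Z^{\mathrm{typ}}$, so following the front over successive generations is equivalent to following the typical value of $Z_k$.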
The linearized recursion for $Z$ (Eq. 18) implies a recursion relation for $G_k(x)$ that can be expressed through the coefficients $A_1$ and $A_2$ as (Eq. 26): $G_{k+1}(x) = \langle G_k(x - \ln A_1)\, G_k(x - \ln A_2) \rangle$, where the remaining average is over $A_1$ and $A_2$. Note that the probability distribution of $A_{1,2}$ depends on the measurement strength parameter $\theta$ of our model. It is convenient to view $k$ as the time and $x$ as a kind of position for a travelling wave [58], as suggested by Fig. 4. Then Eq. 25 describes a wave front that is located at $x = \ln Z^{\mathrm{typ}}_k$ and which propagates in "space" with a velocity $v_\theta$ that may be either positive or negative. The critical point $\theta_c$ corresponds precisely to the value for which $v_\theta$ vanishes [22].
In the disentangling phase, the wave front propagates to the left ($v_\theta < 0$) at large times $k$, meaning that the typical value of $Z_k$ decays exponentially, $Z^{\mathrm{typ}} \sim e^{-|v_\theta| k}$. In the entangling phase, on the other hand, the linearized treatment gives a traveling wave that propagates to the right ($v_\theta > 0$). Since a right-moving wave would correspond to $Z^{\mathrm{typ}}$ growing indefinitely, the nonlinear terms in Eq. 18 become important in the entangling phase: they halt the front at a finite value of $\ln Z^{\mathrm{typ}}_k$, as discussed below in Sec. III B. However, our first task is to understand the linearized problem, which is captured by Eq. 26: from this linearized recursion we can identify the location of the critical point.
The travelling wave ansatz is $G_k(x) = G^{(\lambda)}(x - v_\theta(\lambda)\, k)$. Here $\lambda \geq 0$ parameterizes a family of possible travelling wave solutions: it will be necessary to choose the correct solution that is appropriate to the initial conditions. The value of $\lambda$ is defined through the exponential decay of $G^{(\lambda)}$ at large argument (see Ref. 58 and references therein): $1 - G^{(\lambda)}(x) \sim e^{-\lambda x}$. Using this asymptotic expression and the recursion relation for $G_k(x)$ (Eq. 26) gives the following result for the wave front velocity (Eq. 29): $v_\theta(\lambda) = \frac{1}{\lambda} \ln \langle A_1^{\lambda} + A_2^{\lambda} \rangle$, where again the average is over $A_{1,2}$ as defined by Eqs. 23 and 24, or in other words over the random unitary operations that define the node tensor $t$. Note that $t$, and therefore $A_{1,2}$, depend on $\theta$ implicitly through the Kraus operators in Eq. 21 [we will sometimes write $A_{1,2}(\theta)$ to emphasize this dependence].
The value of the parameter $\lambda$ is determined in a standard way as for Fisher-Kolmogorov-Petrovsky-Piskunov (FKPP) waves [60]: so long as the initial condition decays sufficiently fast at large $x$ (which we confirm below), the solution $\lambda$ that is selected is the one for which the velocity $v_\theta(\lambda)$ is minimal. The "dispersion relation" $v_\theta(\lambda)$ is a convex function of $\lambda$, and for each fixed $\theta$ there is a value $\lambda = \lambda^*$ that gives the minimal velocity: $\partial_\lambda v_\theta(\lambda)|_{\lambda = \lambda^*} = 0$. If we plug in Eq. 29, we get $\frac{1}{\lambda^{*2}} \ln \langle A_1^{\lambda^*} + A_2^{\lambda^*} \rangle - \frac{1}{\lambda^*} \frac{\langle A_1^{\lambda^*} \ln A_1 + A_2^{\lambda^*} \ln A_2 \rangle}{\langle A_1^{\lambda^*} + A_2^{\lambda^*} \rangle} = 0$. Notice that the first term on the left-hand side of Eq. 31 is equal to $v_\theta / \lambda$, which vanishes at $\theta_c$. Therefore the two equations which hold at the critical point, Eqs. 31 and 32, reduce to $\langle A_1^{\lambda^*} \ln A_1 + A_2^{\lambda^*} \ln A_2 \rangle = 0$ (Eq. 33) and $\langle A_1^{\lambda^*} + A_2^{\lambda^*} \rangle = 1$ (Eq. 34), with $A_{1,2}$ evaluated at $\theta = \theta_c$. These two equations determine both unknowns, $\lambda^*$ and $\theta_c$, i.e. they determine the location of the critical point. Surprisingly, Eqs. (33) and (34) can be solved analytically. Equation 33 can be solved for $\lambda^*$ using only an invariance property of the node tensor distribution (we consider this solution first), while for Eq. (34) we must consider the particular structure of the node tensor.
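The velocity selection can be sketched numerically. The example assumes that the dispersion relation takes the standard form for a linear recursion of this type, $v(\lambda) = \lambda^{-1} \ln \langle A_1^{\lambda} + A_2^{\lambda} \rangle$. The log-normal coefficients are a toy stand-in for the node coefficients of Eqs. 23 and 24 (which are not reproduced here), chosen because the minimum is then known in closed form: $\lambda^* = \sqrt{2 \ln 2}/s$ and $v_{\min} = m + s\sqrt{2 \ln 2}$. The helper names `velocity` and `minimal_velocity` are hypothetical.

```python
import numpy as np

def velocity(lam, a1, a2):
    """v(lam) = (1/lam) * ln < A1**lam + A2**lam >, averaged over samples (assumed form)."""
    a1 = np.asarray(a1, dtype=float)
    a2 = np.asarray(a2, dtype=float)
    return np.log(np.mean(a1**lam + a2**lam)) / lam

def minimal_velocity(a1, a2, lams=np.linspace(0.05, 3.0, 600)):
    """Grid search for the lam that minimizes v(lam); returns (lam_star, v_min)."""
    vs = np.array([velocity(lam, a1, a2) for lam in lams])
    i = int(np.argmin(vs))
    return lams[i], vs[i]

rng = np.random.default_rng(1)
m, s = -1.5, 1.0   # toy parameters: ln A ~ Normal(m, s), NOT the tree model
a1 = np.exp(rng.normal(m, s, size=200_000))
a2 = np.exp(rng.normal(m, s, size=200_000))

lam_star, v_min = minimal_velocity(a1, a2)
print(lam_star, v_min)   # compare with sqrt(2 ln 2)/s and m + s*sqrt(2 ln 2)
```

For the actual model one would replace the toy samples by values of $A_1$ and $A_2$ computed from Haar-random node tensors at a given $\theta$, and then scan $\theta$ for the point where the minimal velocity changes sign.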
Since both the weak measurement bases and the unitary operator $U$ are chosen Haar-randomly, the $t$-tensor is statistically invariant under unitary rotations on any single index. For instance, under the mapping $t \to t' = ut$ of Eq. 35, the two tensors $t$ and $t'$ have equal probability. Given this condition,$^{7}$ the coefficients $A_1$ and $A_2$ satisfy the identity [22] (irrespective of the value of $\theta$) This may be shown by averaging over a single-leg unitary rotation like that in Eq. 35 (an analogous argument is discussed in App. B). Given this identity, Eq. 33 implies that $\lambda^* = 1/2$ at the critical point, and therefore that the critical point $\theta_c$ is the value of $\theta$ for which (Eq. 34) $\langle A_1(\theta_c)^{1/2} \rangle + \langle A_2(\theta_c)^{1/2} \rangle = 1$, where we have made the dependence of $A_i$ on $\theta$ explicit. In the present model the two terms in Eq. 37 are equal by symmetry (the two bottom legs of $t$ are equivalent on average). Equation 37, together with Eqs. 23 and 24, defines the critical measurement strength $\theta_c$. Equation 37 is easy to solve numerically, since it involves only a finite-dimensional integral over the random parameters of a single node, but in fact it can also be solved analytically.
7 Here we also use the fact that $A_i > 0$ with probability 1.
So far, the identities above relied only on the unitary invariance property discussed around Eq. 35, and did not depend on the more detailed structure of the node tensor $t$. For the present model, where $t$ is given by Eq. 12, averages like $\langle A^{\lambda} \rangle$ can be written as explicit functions of $\theta$ by sequentially integrating out the various Haar-random objects appearing in the node. This calculation is presented in Appendix C and gives Taking the limit $\lambda = 1/2$ in this formula, Eq. 37 becomes which has the solution We also obtained $\theta_c$ by evaluating the averages in Eq. 37 directly, by numerically integrating over the Haar-random elements in the node, and we obtained the compatible result $\theta_c = 1.4201 \pm 0.0001$.
Above we assumed that the travelling wave converged to the solution with $\lambda = \lambda^*$. For completeness, let us briefly recall the justification for this assumption.
The solution chosen at late times depends on which decays faster with $x$ at large $x$: the initial condition $1 - G_0(x)$, or the minimal-velocity wavefront, $1 - G^{(\lambda^*)}(x)$. For the initial condition, expanding the generating function at large $x$ gives $1 - G_0(x) \simeq \langle Z_0 \rangle e^{-x}$. Thus, the initial condition effectively corresponds to a value of $\lambda = 1$ [see Eq. 28]. The selected value of $\lambda$ at late times is the minimum between $\lambda^*$ and 1. Borrowing terminology from a closely related problem of directed polymers on disordered trees [58], we say that the system is in the glass class in cases where $\lambda$ converges to $\lambda^* < 1$, while cases where $\lambda^* > 1$ correspond to the paramagnetic class.$^{8}$ In the present case, the near-critical FMPT, we have $\lambda^* < 1$, as discussed above, so the solution with $\lambda^*$ is the relevant one (see Ref. [22] for further discussion).

B. Scaling behavior near the critical point
We now consider scaling behavior of Z k near θ c . We focus on two specific quantities: the size of the order parameter Z typ ∞ ≡ lim k→∞ Z typ k in the entangling phase as one closely approaches the transition (θ < θ c ); and the dependence of Z typ k on k exactly at the critical point, θ = θ c .
Once we leave the disentangled phase, the linear recursion relation discussed above is no longer sufficient: the nonlinear terms in Eq. 18 are essential to prevent Z from diverging. However, previous work [22] showed that ultimately the critical scaling properties can be expressed in terms of the velocity function v θ (λ) (Eq. 29).
Here we briefly list the main results. (We will review the argument for these scaling forms in Sec. IV and App. E.) Just on the entangling side of the critical point, Here $\lambda_\theta$ is the solution of the equation $v_\theta(\lambda) = 0$. Expanding Eq. 29 around the critical point, we find (Eq. 43) $\ln Z^{\mathrm{typ}}_{\infty} \simeq -C/\sqrt{\theta_c - \theta}$, where $C$ is a constant coefficient. Using Eq. 29 and Eq. 42, we can calculate the coefficient $C$, Numerically evaluating these averages gives $C = 2.903 \pm 0.001$. Taking appropriate derivatives of Eq. 38 (with respect to $\lambda$ to obtain the numerator of Eq. 44 and with respect to $\theta$ to obtain the denominator) gives:
Exactly at the critical point, $\theta = \theta_c$, the velocity $v_\theta$ vanishes and the value of $\ln Z^{\mathrm{typ}}_k$ evolves sub-ballistically with $k$. Reference [22] argued that (Eq. 46) $\ln Z^{\mathrm{typ}}_k \sim -k^{1/3}$.
Finally, we note that $\lambda^* = 1/2$ can be viewed as a critical exponent determining the breadth of the probability distribution of $Z_k$ near criticality [22]. We discuss this further in Sec. IV D.
8 ... has a glass transition as a function of the ratio of temperature to disorder strength [58]. In the paramagnetic class the polymer has an entropy that is extensive in the depth of the tree, while in the glass class the entropy per unit length vanishes. (We use the word "class" to avoid confusion with the entanglement "phases". See Ref. [22], Sec. IV.G. for further discussion in the present context.)
Equations 40, 43 and 46 are our main predictions for the critical properties of the FMPT. We note that analogs of Eqs. 43 and 46 appear for the survival probability of a branching random walk with a moving absorbing wall [57], which can be understood in terms of an FKPP equation with a boundary condition that breaks translation symmetry.

C. Numerical results
In order to check the above predictions for θ c and for the critical scaling, we perform numerical simulations of the collapse process.
A direct approach, constructing many realizations of large trees, would be prohibitively expensive for large $k$ because of the exponential growth of the number of nodes with $k$. We overcome this limitation by using the pool method described in Refs. [61–63]. Briefly, this method involves storing a large set ("pool") $\{Z_k^a\}_{a=1}^{N_{\mathrm{pool}}}$ of values of $Z_k$ corresponding to trees of finite size $k$. This pool is a way of approximately capturing the probability distribution $\mu_k$ of $Z_k$, via a large set of values drawn from this distribution, with larger values of the pool size $N_{\mathrm{pool}}$ giving a better approximation. A pool $\{Z_{k+1}\}$ corresponding to trees of size $k + 1$ can then be produced by randomly combining the values from the pool $\{Z_k\}$ using the process defined by Eq. 10. We note that, while in the limit of $Z_k \ll 1$ one can use the linearized recursion relation of Eq. 18, throughout this paper our numerical data is obtained using the full recursion defined by Eq. 10, and does not assume small $Z_k$. Our numerical averages for $\ln Z^{\mathrm{typ}}_k$ are taken over all elements of the pool. All our results in the main text are obtained with a pool size of $10^6$, and we show in Appendix D that the results we present are well-converged as a function of the pool size.
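The pool iteration itself is simple to express in code. The following is a minimal sketch of the bookkeeping only: `pool_step` draws random pairs from the current pool and hands them to a node map. The function `toy_combine` is a hypothetical stand-in for the full node recursion of Eq. 10 (which is not reproduced here); it uses a linear update with log-normal coefficients and an ad hoc cap at 1/2, the largest value a smaller eigenvalue can take. The starting value $Z_0 = 1/2$ (maximally mixed leaves) and the pool size are likewise assumptions made for illustration.

```python
import numpy as np

def pool_step(pool, combine, rng):
    """Build the generation-(k+1) pool by combining random pairs from the generation-k pool."""
    n = pool.size
    z1 = pool[rng.integers(0, n, size=n)]   # first subtree, drawn with replacement
    z2 = pool[rng.integers(0, n, size=n)]   # second subtree, drawn independently
    return combine(z1, z2, rng)

def toy_combine(z1, z2, rng):
    """Hypothetical stand-in for the node map of Eq. 10 (NOT the paper's map)."""
    a1 = np.exp(rng.normal(-1.5, 1.0, size=z1.size))
    a2 = np.exp(rng.normal(-1.5, 1.0, size=z2.size))
    # Linear growth plus an ad hoc cap at 1/2 to mimic the role of the nonlinear terms.
    return np.minimum(a1 * z1 + a2 * z2, 0.5)

rng = np.random.default_rng(2)
pool = np.full(10_000, 0.5)   # assumed initial condition: Z_0 = 1/2 for every leaf
ln_z_typ = []
for k in range(30):
    pool = pool_step(pool, toy_combine, rng)
    ln_z_typ.append(np.mean(np.log(pool)))   # ln Z_typ taken as the pool average of ln Z

print(ln_z_typ[0], ln_z_typ[-1])
```

With these toy parameters the front moves to the left, so the pool average of $\ln Z$ decreases steadily with $k$, as in a disentangling phase. Swapping in the true node map and scanning $\theta$ would be the way to reproduce the kind of data described in the text.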

IV. REAL MEASUREMENTS
The preceding section considered the quantum tree with forced measurements. We found an FMPT with properties that are consistent with those found in Ref. [22], although the details of the tree are slightly different. An interesting question, which is not directly considered in Ref. [22], is whether we can use a similar approach to also reveal the properties of the transition induced by real measurements. The key difference with the FMPT is that for real measurements the measurement outcomes $\sigma_1$, $\sigma_2$ and $\sigma_2'$ associated with a given node of the tree are no longer fixed, but are instead randomly chosen with probabilities determined by the Born rule (leading to additional quantities to average over in each node). Since the outcomes of different measurements can be correlated, the local three-index tensors $t = t(\sigma_1, \sigma_2, \sigma_2')$ that make up the network are no longer independently random objects: they now have nontrivial correlations not only within a node but also between different nodes.
We show in this section that averaging over the Born rule outcomes leads to different theoretical results for the MPT as compared to the FMPT. Again we find that a travelling wave picture can be developed. Notably, however, while the critical FMPT mapped to a travelling wave problem in the "glass" class (with $\lambda^* = 1/2$), the critical MPT maps to the case $\lambda^* = 1$, which is at the boundary of the glass class with the paramagnetic class. The MPT still allows for two distinct dynamical phases, entangling and disentangling, separated by a critical measurement strength $\theta = \theta_c$ that is distinct from that of the FMPT case. The scaling of $Z^{\mathrm{typ}}_k$ as a function of $k$ and $\theta$ is qualitatively similar to that in the FMPT, but there are universal differences between the problems that are reflected in the probability distribution of $Z_k$.

A. Linearized recursion relation
We wish to consider how the stochastic process defined by the node t transforms the probability distribution µ k (Z) into the probability distribution µ k+1 (Z).
For the recursive step, we consider (as in the previous section) a node "event" in which we start with known "input" density matrices $\rho_{k,1}$ and $\rho_{k,2}$ for the two spins. By randomly choosing the necessary Haar-random unitaries, and choosing the measurement outcomes with the Born-rule probabilities, we obtain a random "output" density matrix $\rho_{k+1}$.
As in the previous section, it is sufficient to consider the case where the input density matrices ρ k,1 and ρ k,2 have the diagonal form in Eq. 16. This simplification is possible because the probability distribution of the output density matrix, given the inputs, depends only on the eigenvalues of the inputs, and not on their eigenvectors. This irrelevance of the eigenvectors follows from the Haar-randomness of the unitaries in the node, which means that there is no "preferred basis" for the spins. (For a formal argument, see App. A.) Adopting this diagonal form, the input density matrices are evolved using the t-tensor that defines the operation of a single node, which was given in Sec. II B. Recalling these definitions, This tensor t = t(σ 1 , σ 2 , σ ′ 2 ) depends on the 2-site unitary and on the random bases for the measurements (this dependence will be left implicit), and also on the measurement outcomes (these will sometimes be shown as explicit arguments). As before, the output density matrix is given by The measurement outcomes are chosen probabilistically with the Born rule: for given initial states and unitaries, the probability of a set of outcomes (σ 1 , σ 2 , σ ′ 2 ) may be written as discussed in Sec. II A. Below we also write this probability as p(s), using s = (σ 1 , σ 2 , σ ′ 2 ) to denote the collection of measurement outcomes for the node. For notational simplicity we have suppressed the dependence of p(s) on the initial states and on the random unitaries.
Formally, the value of Z k+1 is a function of the initial states Z k,1 , Z k,2 and of the t-tensor (and through it the measurement outcomes): For a fixed set of measurement outcomes, the expansion of the function f in terms of the arguments Z k,1 , Z k,2 proceeds identically to the forced measurement case (Sec. III A), and thus Eqs. 23 and 24 for the coefficients A 1 and A 2 are still valid for small Z k,1 , Z k,2 : with where the coefficients A 1 = A 1 (σ 1 , σ 2 , σ ′ 2 ) and A 2 = A 2 (σ 1 , σ 2 , σ ′ 2 ) are now dependent on the measurement outcomes.
This dependence on the measurement outcomes means that, unlike the forced measurement case, the coefficients $A_i$ in Eq. 51 are not chosen independently of the $Z_{k,i}$, since the probability of given measurement outcomes is itself dependent on $Z_{k,1}$ and $Z_{k,2}$. Fortunately, however, we do not need to consider the full dependence. In particular, in the disentangling phase, where $Z_k$ becomes arbitrarily small at large $k$, we can approximate the measurement probabilities by their $Z \to 0$ limits. This approximation is sufficient to locate the transition.
Expanding Eq. 49 in $Z_{k,1}$, $Z_{k,2}$ and retaining only the zeroth-order term, and using the fact that $\rho_{1,2}$ are diagonal in our chosen basis and have the form in Eq. 16, we find Using the expressions for $t$ in Eqs. 11, 12, and the property of the Kraus operators one can check that the above probability $p$ is normalized, such that $\sum_{\sigma_1, \sigma_2, \sigma_2'} p(\sigma_1, \sigma_2, \sigma_2') = 1$. Using Eqs. 51-52, the generating function $G_k(x)$ has the recursion relation $G_{k+1}(x) = \langle \sum_s p(s)\, G_k(x - \ln A_1(s))\, G_k(x - \ln A_2(s)) \rangle$. Here we use $s$ to represent the set of all three measurement outcomes. The angle brackets are the average over the Haar-random unitary operations: the two-site random unitary $U$, and the 1-site random unitaries in $K^{(i)}$ which set the bases for the weak measurements. The argument $\theta$ is the strength of weak measurements.
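As before, at late times $G_k(x)$ converges to a traveling wave solution, parameterized by a variable $\lambda$ which must be determined. The solution for a given $\lambda$ has velocity $v_\theta(\lambda) = \frac{1}{\lambda} \ln \langle \sum_s p(s) [A_1(s,\theta)^{\lambda} + A_2(s,\theta)^{\lambda}] \rangle$. For a given $\theta$, there is a special value $\lambda = \lambda^*$ that gives the minimal velocity, $\partial_\lambda v_\theta(\lambda)|_{\lambda = \lambda^*} = 0$. The velocity selected is given by $v_\theta = v_\theta(\min(\lambda^*, 1))$, and the critical point $\theta_c$ is defined by the vanishing of this velocity. Recall the physical meaning of this velocity: in the disentangling phase, $v_\theta$ is negative, and $Z^{\mathrm{typ}}_k \sim e^{-|v_\theta| k}$. If $\lambda^* \leq 1$, the critical point is given by solving the following equations for the unknowns $\lambda^*$ and $\theta_c$: $v_{\theta_c}(\lambda^*) = 0$ and $\partial_\lambda v_{\theta_c}(\lambda)|_{\lambda = \lambda^*} = 0$. These equations may be rewritten as $\langle \sum_s p(s) [A_1^{\lambda^*} + A_2^{\lambda^*}] \rangle = 1$ (Eq. 60) and $\langle \sum_s p(s) [A_1^{\lambda^*} \ln A_1 + A_2^{\lambda^*} \ln A_2] \rangle = 0$ (Eq. 61), with $A_i = A_i(s, \theta_c)$. In Appendix B we show that the coefficients $A_1(s, \theta)$, $A_2(s, \theta)$ satisfy the striking identity for both $i = 1$ and $i = 2$. As a result, Eq. 61 is satisfied by $\lambda^* = 1$. Equation 60 then gives the value of $\theta_c$ (Eq. 63): $\sum_s \langle p(s) [A_1(s, \theta_c) + A_2(s, \theta_c)] \rangle = 1$. This equation may also be written (Eq. 64) $16\, \langle p(s) A_i(s, \theta_c) \rangle = 1$, since the average in Eq. 63 is independent of $s$ (which runs over 8 possible measurement sequences) and of $i$. Thus, at the critical point the wavefront solution has $\lambda^* = 1$, which means that the linear recursion relation is poised exactly at the boundary point between the glass and paramagnetic classes (which yield different critical scaling once nonlinearities are included [22]). Note that $\lambda^* = 1$ only at the critical point, and in general the value of $\lambda^*$ varies with $\theta$ near the critical point as $\lambda^* - 1 \propto \theta_c - \theta$.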
Numerically integrating the left-hand side of Eq. 63 allows us to estimate the critical measurement strength θ c . This process gives θ c = 1.100 ± 0.001.
The above procedure, and the critical point equation (63), applies for a larger class of tree models with measurements, in which the averages $\langle \ldots \rangle$ have the unitary invariance property discussed in Sec. III A. Surprisingly, the simple structure of the node tensor $t$ in the particular model under study allows the integral to be performed analytically. Any average of the form $\langle p^{\kappa} A_i^{\lambda} \rangle$ can be evaluated explicitly as a function of $\theta$ (App. C), in particular The critical point equation (64) then becomes $\frac{\ln \gamma}{\gamma^2 - 1/\gamma^2} = \frac{3}{16}$, which has the solution $\theta_c \simeq 1.10010302468401$.
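As a quick consistency check, the critical-point equation $\ln \gamma / (\gamma^2 - 1/\gamma^2) = 3/16$ can be solved by bisection. The relation between $\gamma$ and $\theta$ is not restated in this section, so the sketch assumes $\gamma = \tan\theta$; this assumption is made here only because it lands close to the quoted value $\theta_c \simeq 1.1001$. The function name `critical_gamma` is hypothetical.

```python
import math

def critical_gamma(target=3.0 / 16.0, lo=1.0 + 1e-9, hi=50.0, iterations=200):
    """Solve ln(g) / (g**2 - g**-2) = target for g > 1 by bisection."""
    def f(g):
        return math.log(g) / (g**2 - g**-2) - target
    # On g > 1 the left-hand side falls from about 1/4 (g -> 1) towards 0 (g -> infinity),
    # so f changes sign exactly once for 0 < target < 1/4.
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

gamma_c = critical_gamma()
theta_c = math.atan(gamma_c)   # ASSUMPTION: gamma = tan(theta)
print(gamma_c, theta_c)
```

Note that the equation is symmetric under $\gamma \to 1/\gamma$, so the root with $\gamma < 1$ would correspond to $\pi/2 - \theta_c$ under the same assumption; only the $\gamma > 1$ branch is searched here.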

B. Scaling behavior near the critical point
The primary qualitative difference between the real measurement and forced measurement critical points is that the real measurement case has $\lambda^* = 1$, while the forced measurement case has $\lambda^* = 1/2$. Recursion relations with $\lambda^* < 1$ and $\lambda^* > 1$ generally exhibit different order parameter scaling, so the scaling for the MPT in our model is a subtle question.$^{9}$ Below we give a heuristic argument fixing the leading scaling, leaving a more detailed and rigorous study of the recursion relations to the future.
Slightly inside the weak-measurement phase, we find that (Eq. 67) $\ln Z^{\mathrm{typ}}_{\infty} \simeq -C'/\sqrt{\theta_c - \theta}$, where the constant $C'$ is discussed below. Exactly at the critical point (Eq. 68), $\ln Z^{\mathrm{typ}}_k \sim -k^{1/3}$. These scaling forms are compared with simulation data in Sec. IV C. They are similar to those for forced measurements (Sec. III). On the other hand, $\lambda^*$ can be viewed as a critical exponent (governing the breadth of the critical probability distribution for $Z_k$) that distinguishes the real and forced measurement critical points. This difference is discussed in Sec. IV D.
Now we discuss the order parameter scaling in slightly more detail. (Some readers may prefer to skip the remainder of this subsection.) It is convenient to define which is nontrivial in the entangling phase. In App. E we give an exact linear equation for $H(x)$ that holds for $x \gg \ln Z^{\mathrm{typ}}$, i.e. far to the right of the front. This equation simplifies further when $x$ is also sufficiently large and negative. In this intermediate regime (loosely speaking, negative $x$ with $1 \ll |x| \ll |\ln Z^{\mathrm{typ}}|$), the analysis is similar to that in the previous section, where we may take $H(x) \sim e^{-\lambda x}$. For the critical scaling on the entangling side of the transition, however, we are looking for a solution that is stationary (independent of $k$), so the parameter $\lambda$ must be chosen such that the traveling wave velocity $v_\theta(\lambda)$ vanishes. The key point is that satisfying this condition requires a complex $\lambda$ [64]: (Here $c$ is a constant determined by Eq. 60.) Taking a real combination of the two complex solutions then gives where $\phi$ is an undetermined constant. By considering how this solution in the intermediate regime $1 \ll |x| \ll |\ln Z^{\mathrm{typ}}|$ matches onto solutions to the left and right of this interval we may fix the scaling of $Z^{\mathrm{typ}}$. Note that Eq. 71 gives a logarithmic "slope" that is close to $-1$ for most values of $x$: The exceptions are close to the zeroes of the tangent, for which the second term is no longer small. As $x$ approaches the left hand side of the regime where (72) is valid, i.e. as $x$ approaches $\ln Z^{\mathrm{typ}}$, we expect that the magnitude $|\partial_x \ln H|$ of the slope decreases towards zero, in order to match the plateau in $H$ for $x \ll \ln Z^{\mathrm{typ}}$. This condition suggests that the argument of the tan function should vanish at a value of $x$ close to $\ln Z^{\mathrm{typ}}$ [22]. Consequently, $\ln Z^{\mathrm{typ}} \simeq -C'/\sqrt{\theta_c - \theta}$ with $C' = \phi/c$. In principle, we should now use the matching condition on the other side of the intermediate regime (where $x$ approaches $-1$) to fix the numerical value of $\phi$. We suspect that the value $\phi = \pi/2$ is required, in order for Eq. 72 to be consistent with an expansion of the generating function in terms of moments of $Z_k$ (see App. E). However, confirming this would require a more careful analysis of corrections to Eq. 72. Finally, given Eq. 71, the form (68) for the $k$-dependence of $Z^{\mathrm{typ}}_k$ exactly at the critical point follows from the same heuristic argument as in the forced measurement case.$^{10}$

C. Numerical results
We can confirm our analytical results for the MPT numerically, using the same pool algorithm described in Sec. III C for the FMPT. The only difference for real measurements is that, in applying the recursion relation defined by Eq. 10, one must select the measurement outcomes with the Born-rule probabilities in Eq. 9. In the entangling phase, $\theta < \theta_c$, the value of $\ln Z^{\mathrm{typ}}_k$ is finite in the limit $k \to \infty$, so that an initially mixed state remains mixed regardless of the size of the tree. On the other hand, in the disentangling phase, $\theta > \theta_c$, the value of $\ln Z^{\mathrm{typ}}_k$ is $-\infty$ in the limit $k \to \infty$, which means that an initially mixed state is completely purified in the limit of a large tree. The red curve in the plot represents the critical value $\theta_c$ obtained by the linear recursion relation; one can see that this value is consistent with the numerical simulation. Figure 9 shows the scaling behavior of $\ln Z^{\mathrm{typ}}_{k \to \infty}$ as one approaches the transition from the entangling side. The red dashed line shows the predicted dependence given by Eq. 67. The predicted scaling of $\ln Z^{\mathrm{typ}}_k$ with $k$ at $\theta = \theta_c$ is also confirmed in Fig. 10, which shows $\ln Z^{\mathrm{typ}}_k \sim -k^{1/3}$.

D. Comparison: real vs. forced measurements
In closing this section we briefly comment on the differences between the forced measurement and real measurement cases. In both cases the system exhibits a phase transition between entangling and disentangling dynamical phases as a function of increasing measurement strength. However, the real measurement case has a smaller critical value of θ, which means that the entangling phase is smaller for real measurements than for forced measurements.
This difference between the two cases implies that real measurements are more strongly purifying than forced measurements for the present dynamics. To get some intuition for the difference between forced and real measurements in terms of purification, one can consider the  simple example of a single weak measurement on a single spin. Specifically, we compare the effect of a true measurement to the effect of a forced measurement in which the outcome is chosen independently of the state and with 50% probability for ↑ or ↓.
Let us visualize the spin's initial density matrix $\rho_{\mathrm{init}}$ as a point in the interior of the Bloch sphere (recall that pure states lie on the surface of the Bloch sphere, while mixed states lie in the interior). Without loss of generality, assume that this initial density matrix is closer to the north pole ($|{\uparrow}\rangle$) than to the south pole ($|{\downarrow}\rangle$). Now make a very weak [65] measurement ($\theta = \pi/4 + \epsilon$) in the $S_z$ basis. After the measurement, the density matrix is altered so that the corresponding point in the Bloch sphere is displaced very slightly. If the measurement outcome is ↑, the displacement takes the point closer to the north pole. Since the state is initially closer to the north pole, this displacement also brings it closer to the surface of the Bloch sphere, i.e. it is purifying. By contrast, if the measurement outcome is ↓, the density matrix gets closer to the south pole and becomes less pure.
[Figure caption fragment] ... shows the case of true measurements. The somewhat worse agreement for case (b) is likely due to a logarithmic prefactor that we have neglected in the relation between $Z^{\mathrm{typ}}$ and $\langle Z \rangle$.
Real measurements make the first outcome (the purifying one) more likely, while for forced measurements the two outcomes are equally likely. The result is that, after averaging over outcomes, the real measurement gives a mean increase in purity of order $\epsilon$, while the forced measurement gives a mean change that is higher-order in $\epsilon$ (we omit the detailed calculation).
The other prominent difference between the MPT and FMPT is the different value of the critical exponent $\lambda$ that characterizes the two transitions: $\lambda = 1/2$ for the FMPT, which is within the glass class, while $\lambda = 1$ for the MPT, which is at the boundary between the glass and paramagnetic classes. Ultimately this difference arises because the Born rule probability $p(s)$ depends on the set of measurement outcomes $s$. This dependence implies that the nodes of the tensor network $T$ are no longer independent. The measurement outcomes at a node, which go into determining the node tensor $t(\sigma_1, \sigma_2, \sigma_2')$, have probabilities that depend on the density matrices $\rho_1$, $\rho_2$ coming from the earlier part of the tree.$^{11}$ Nonetheless, the Haar-randomness of the 2-site unitaries means that the ensemble of density matrices $\rho_k$ for a tree of $k$ generations can be characterized only by a distribution of singular values $Z_k$. This allows us to treat the problem via a recurrence relation for $Z_k$.
While this difference in the value of $\lambda$ is not reflected in the critical scaling of $\ln Z^{\mathrm{typ}}_k$ near the transition (compare Eqs. 43 and 46 with Eqs. 67 and 68), it does have an observable consequence in terms of the distribution of values of $Z_k$ among different realizations of the tensor network. Specifically, the exponent $\lambda$ controls the probability density $P(\ln Z)\, d\ln Z$ for $\ln Z$ [22,58]. We consider the regime just inside the entangling phase, with finite asymptotic values $Z^{\mathrm{typ}} = Z^{\mathrm{typ}}_{k \to \infty}$ and $\langle Z \rangle = \langle Z_k \rangle_{k \to \infty}$ (similar considerations apply just on the other side of the transition, and at the critical point, for the probability distribution at large finite $k$). When $Z^{\mathrm{typ}}$ is very small, there is a wide range for which $1 \ll |\ln Z| \ll |\ln Z^{\mathrm{typ}}|$, and in this range $P(\ln Z) \sim (Z^{\mathrm{typ}}/Z)^{\lambda}$ when $\lambda \leq 1$ (neglecting a possible more slowly-varying prefactor). Consequently, when $Z^{\mathrm{typ}}$ is small the relationship between $\langle Z \rangle$ and $Z^{\mathrm{typ}}$ is $\langle Z \rangle \sim (Z^{\mathrm{typ}})^{\lambda}$ (again, neglecting a possible more slowly-varying prefactor). This relationship is confirmed by our simulation data in Fig. 11 for both the forced measurement and real measurement cases. Specifically, the two cases exhibit a different slope $\lambda$ when $\ln \langle Z_{k \to \infty} \rangle$ is plotted against $\ln Z^{\mathrm{typ}}_{k \to \infty}$. This plot provides an independent confirmation of the different value of $\lambda$ between the two cases.
11 In our treatment of the node we considered (without loss of generality) the case where $\rho_1$ and $\rho_2$ are diagonal. The correlations between $\rho_i$ and $t$ then imply that the resulting distribution for the node tensor $t$ is not invariant under the rotations $t^a_{bc} \to t^a_{b'c} u_{b'b}$ or $t^a_{bc} \to t^a_{bc'} u_{c'c}$. That is, $\rho_{1,2}$ have picked out preferred states on the bonds, and the distribution of $t$ is sensitive to these preferred states.
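The exponent can be read off from simulation output with a straight-line fit in log-log variables. The sketch below assumes that $Z^{\mathrm{typ}}$ is defined through the average of $\ln Z$ over a pool of samples, and that one has such a pool for each of several values of $\theta$ just inside the entangling phase. The helper names `typical_and_mean` and `fit_lambda` are hypothetical, and the synthetic numbers are placeholders rather than data from the model.

```python
import numpy as np

def typical_and_mean(z_samples):
    """Return (Z_typ, <Z>) for one pool, with Z_typ = exp(<ln Z>) (assumed definition)."""
    z = np.asarray(z_samples, dtype=float)
    return np.exp(np.mean(np.log(z))), np.mean(z)

def fit_lambda(z_typ_values, z_mean_values):
    """Slope of ln<Z> against ln Z_typ, i.e. the exponent in <Z> ~ (Z_typ)**lambda."""
    slope, _intercept = np.polyfit(np.log(z_typ_values), np.log(z_mean_values), 1)
    return slope

# Synthetic placeholder data obeying <Z> = 3 * Z_typ**0.5 exactly.
z_typ = np.array([1e-8, 1e-6, 1e-4])
z_mean = 3.0 * z_typ**0.5
print(fit_lambda(z_typ, z_mean))   # slope of the synthetic data
```

Since the text notes that a slowly varying prefactor may be present, a fitted slope from real data should be treated as an estimate whose quality depends on how small $Z^{\mathrm{typ}}$ is.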
FIG. 12. Schematic picture of the proposed expansion protocol. We use the ∪-shaped connection at the initial time to represent the initial GHZ state between one system spin and the reference spin. The world line of the reference spin is shown by the red line. A possible choice for the design of each node, following Fig. 3, is shown at the bottom.

V. THE EXPANSION PROCESS AND EXPERIMENTAL PROTOCOLS
Having discussed the collapse process in the previous sections, we now turn to the phase transition in the expansion process, and we suggest a schematic protocol for detecting it experimentally.
The critical properties of the expansion process with forced measurements have been solved in Sec. III, since they map to those of the collapse process FMPT (the same ensemble of tensor networks arises in both cases, with opposite choices for the arrow of time). We do not solve the expansion process with true measurements here. Unlike for forced measurements, reversing the arrow of time does not map the expansion process with true measurements to the collapse process with true measurements. Instead, reversing the arrow of time maps the expansion process with true (weak 12 ) measurements to a "mixed" version of the collapse process, in which the weak measurements are true measurements, but the projective measurements are forced. This equivalence can be shown using the formulas in Sec. II B.
Thus, formally the expansion process with true measurements is an intermediate case between the two cases that were solved analytically in the previous sections. It would be interesting to understand how this intermediate case differs in its critical properties from the other two cases. A naive guess is that it has qualitatively similar scaling properties, with some value of the critical exponent λ. We leave an investigation of such mixed dynamics to the future.
Instead, we comment on the implementation of the measurement phase transition in the expansion process on a quantum device. Such an experiment does not require any postselection, in contrast to usual versions of the MPT. See Refs. [12,52-56] for recent approaches to the statistical challenge associated with postselection. In particular, Refs. [55,56] discuss approaches involving "hybrid observables" that mix classical computations and quantum measurements. Our proposal below follows the spirit of these works, as it makes use of the efficient classical computability (given the measurement record) of certain expectation values. Here, the tree structure means that these classical computations are particularly simple. It would also be possible to study the collapse process experimentally: we comment on this in Sec. VI.
The discussion in the remainder of this section is not restricted to any particular design of the "node" interaction, so we will abstract away from the particular models we have considered so far, and imagine a more general version of the expansion process. For example, it is not necessary that the interaction unitaries be random: we could consider a protocol in which the only randomness comes from the measurement outcomes. 13 Extrapolating from numerical explorations of the collapse process, it is likely that clear evidence of the phase transition could be obtained from an experiment on a relatively small number of qubits.
We consider an expansion process that starts with the single initial "system" spin in a maximally entangled (GHZ) state with a "reference" spin, denoted R [12]. (Performing a partial trace over the reference spin produces the maximally mixed initial state of the system spin, as we assumed when we introduced the expansion process in the Introduction.) We run the expansion process for the system, recruiting additional spins at each time step. A cartoon of this dynamics is illustrated in Fig. 12, where the ∪-shaped connection at the bottom represents the initial entangled state of the "system" and "reference" spins. We will detect the MPT using measurements of the reference R.
At the end of the expansion process we have an entangled state |Ψ⟩ of R and the $2^t$ system spins. The Schmidt values for a bipartition of this state between R and the system are $\sqrt{1-Z}$ and $\sqrt{Z}$, where Z is the order parameter that we have focused on in previous sections. Tracing out the system, the reduced density matrix for the reference ρ R has eigenvalues 1 − Z and Z.

13 It is also possible to replace weak measurements with projective ones, at the cost of involving more spins in the node. (This equivalence is a consequence of the fact that a weak measurement can be effected using an additional ancilla spin that gets projectively measured.)
To begin, imagine an idealized experiment with no uncontrolled noise. In a given run of the experiment we obtain a measurement record m W . The final state ρ R of the reference will depend on m W . However the recursive structure of the tree means that a very straightforward processing of the experimental data m W allows us to deduce this final state ρ R . The computation simply iterates a recursion relation for 2×2 density matrices. This recursion has exactly the same structure as Eq. 22: we start at the leaves of the tree and iterate towards the trunk (in the present setting this process corresponds to iterating backwards in time). 14 Notably, the computational effort is proportional only to the number of nodes in the tree: it is sufficient to work with only 2 × 2 density matrices, rather than a many-body quantum state.
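The leaf-to-trunk recursion described above can be sketched as follows. This is a minimal illustration rather than the specific node of Fig. 3. It assumes that each node is summarized by a hypothetical 2×4 matrix `t` (already fixed by that node's unitary and recorded outcomes), that the update has the form ρ ∝ t (ρ_left ⊗ ρ_right) t† in the spirit of Eq. 22, and that every leaf starts maximally mixed. The function names are ours, and the result should be read as ρ R only up to a fixed basis convention.

```python
import numpy as np

def node_update(t, rho_left, rho_right):
    """One node: rho -> t (rho_left ⊗ rho_right) t†, normalized to unit trace."""
    rho = t @ np.kron(rho_left, rho_right) @ t.conj().T
    return rho / np.trace(rho).real

def root_state(layers):
    """layers[j] holds the 2**j node matrices (each 2x4) at depth j; layers[0] is the trunk."""
    rhos = [np.eye(2, dtype=complex) / 2] * (2 ** len(layers))  # maximally mixed leaves
    for nodes in reversed(layers):  # iterate from the leaves towards the trunk
        rhos = [node_update(t, rhos[2 * i], rhos[2 * i + 1]) for i, t in enumerate(nodes)]
    return rhos[0]

def order_parameter(rho):
    """Smaller eigenvalue Z of a 2x2 density matrix (its eigenvalues are Z and 1 - Z)."""
    return float(np.linalg.eigvalsh(rho)[0])

# Toy usage: a depth-2 tree with arbitrary (hypothetical) node matrices.
rng = np.random.default_rng(0)
layers = [[rng.normal(size=(2, 4)) + 1j * rng.normal(size=(2, 4)) for _ in range(2 ** j)]
          for j in range(2)]
rho_R = root_state(layers)
Z = order_parameter(rho_R)
```

Each node is visited once and only 2×2 matrices are stored, so the cost grows linearly with the number of nodes, as stated in the text.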
Let us write ρ R in the form $\rho_R = \frac{1}{2}\left(1 + \vec n\cdot\vec\sigma\right)$, where $\vec\sigma$ is the vector of Pauli matrices, and $|\vec n| = 1 - 2Z$. In the idealised experiment, ρ R can be computed analytically from the measurement record in each run, and one could in principle obtain observables like the order parameter ⟨Z⟩ directly from this computation. We will denote this estimate as ⟨Z⟩ S , since it makes use of measurement data for the system S alone (not for the reference spin). In reality, it is crucial to make a further measurement of the reference spin in each realization, in order to obtain independent confirmation of the result of the density matrix reconstruction. A conceptually simple way to do this is as follows.
In each run one has the recursively computed estimate of the polarization vector $\vec n$ and therefore of its normalized version $\hat n$. One can now make a Pauli measurement in the appropriately rotated frame, i.e. a measurement of $\hat n\cdot\vec\sigma$. This measurement yields an outcome τ = ±1, with the outcome τ = −1 having a probability Z. Therefore these measurements give us an estimate of the average value of the order parameter Z: $\langle Z\rangle_{SR} = \frac{1}{2}\left(1 - \langle\tau\rangle\right)$, where the averaging is over different runs of the experiment. The subscript on ⟨Z⟩ SR indicates that this estimate combines measurement data from both the system S and the reference R. ⟨Z⟩ SR should agree with the estimate ⟨Z⟩ S which uses measurement data from the system only.
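As a sanity check on this estimator, the following sketch simulates the final reference measurement classically. It assumes only what is stated above: in each run the outcome τ = −1 occurs with probability Z, so that ⟨τ⟩ = 1 − 2⟨Z⟩ and ⟨Z⟩ can be estimated as (1 − mean τ)/2. The per-run Z values are hypothetical placeholders for the recursively computed ones, and the function names are ours.

```python
import random

def sample_tau(z, rng):
    """Outcome of measuring n̂·σ on the reference: -1 with probability z, else +1."""
    return -1 if rng.random() < z else 1

def estimate_Z_SR(taus):
    """Estimate <Z> from the reference outcomes via <Z> = (1 - <tau>) / 2."""
    return 0.5 * (1.0 - sum(taus) / len(taus))

rng = random.Random(1)
z_runs = [rng.uniform(0.0, 0.5) for _ in range(20000)]  # hypothetical per-run Z values
taus = [sample_tau(z, rng) for z in z_runs]

Z_S = sum(z_runs) / len(z_runs)  # system-only estimate (from the computed Z values)
Z_SR = estimate_Z_SR(taus)       # estimate that also uses the reference outcomes
```

In an actual experiment `z_runs` would come from the classical recursion and `taus` from the device; agreement between the two estimates within statistical error is the consistency check described in the text.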
Equation (77) implies that the MPT for the expansion process corresponds to a phase transition in the predictability of the final measurement on the reference spin. In the disentangling/strong measurement phase, the reduced density matrix ρ R for the reference spin is a pure state in the limit k → ∞ of large tree depth. Consequently, the final measurement of the reference spin (the value of τ ) can be predicted with perfect accuracy by someone keeping track of the measurement record of the system spins (and using them to perform the recursive calculation of ρ R ). On the other hand, in the entangling/weak measurement phase the reference spin remains in a mixed state even as k → ∞, and consequently the final measurement on the reference spin cannot be predicted with perfect accuracy.
The protocol described above requires one to make a spin measurement in an arbitrary basis defined by $\hat n$, which might be challenging. However, even if measurements can only be made in a fixed basis, say σ z , the average value of the order parameter, ⟨Z⟩, can still be extracted. Specifically, where n z is the z-component of the vector ⃗ n that is calculated theoretically for each run, τ = ±1 is the σ z measurement outcome for the reference, and again the average is over runs. The value of ⟨Z⟩ calculated in this way using the experimental measurements of σ z should exhibit the critical scaling of the MPT.

VI. OUTLOOK
In this paper we have defined and explored the measurement-induced entanglement phase transition in dynamical quantum trees. We have adopted the perspective of purification [7], so that the entanglement transition is characterized by a single "order parameter" Z k that describes the purity of the final state after a k-step protocol of measurement and unitary evolution.
Our primary contribution is to extend the approach of Ref. [22] in order to provide analytical solutions for a transition (for the collapse process) not only for forced measurements, but also for real measurements. We have shown that a transition exists in both cases (the FMPT and the MPT), with distinct nontrivial values of the critical measurement strength θ c . The value of θ c is somewhat smaller in the real measurement case, which in the language of purification arises because real (weak) measurements are more purifying than forced ones. At the most basic level, the MPT and FMPT exhibit similar critical behavior near the transition, as captured by Eqs. 43-46 and Eqs. 67-68, but there are universal differences between the two types of transition.
The most striking difference between the MPT and FMPT is in the values of the critical exponent λ, which characterizes the breadth of the probability distribution for the order parameter Z (or equivalently for the entanglement) at the critical point. The MPT has the larger value λ = 1, as compared with λ = 1/2 for the FMPT. This difference implies a narrower distribution of Z k values in the MPT case for a given tree size k.
The models are analysed using an auxiliary recursion relation for Z. Interestingly, the value λ = 1 lies precisely at the boundary between two classes of phase transition for such recursion relations. The analysis here shows that the value λ = 1 is protected for a larger class of models, in which the node operations have a statistical invariance under unitary rotations. But it is possible that modifications to the quantum tree model that take it outside this class of models could push λ for the MPT to values larger than unity. In such a model, the MPT and FMPT would show more drastically different universal properties. (According to the arguments of Ref. [22], at λ > 1 the onset of the order parameter near the transition is as a power law, rather than exponential as in the model studied here.) Studying models in this broader class may therefore be a promising goal. These differences are also interesting in relation to possible field theories for the MPT.
One could similarly ask whether smaller values of the exponent λ are realizable when the statistical unitary invariance property of the current models is relaxed. This property enforces λ = 1 for the collapse process with real measurements and λ = 1/2 for the processes with forced measurements (at their respective critical points). Heuristically, smaller λ corresponds to stronger disorder and a broader distribution of entanglement. Do the Haar-invariant tree ensembles correspond to the strongest possible disorder, or can other critical models realize broader order parameter distributions?
It is worth pointing out that our model contains no randomness in the time or location of measurements (in the FMPT case there is no randomness in the measurement outcomes either). Nonetheless, a phase transition still exists as a function of the measurement strength. Forthcoming work shows that a completely uniform tree tensor network, with no randomness of any kind, can also show an entanglement transition [42]. The model studied here, and extensions of it, may provide insight into the role of different kinds (and strengths) of randomness at the MPT.
An outstanding task that we have not tackled here is to develop a recursive approach to the expansion process with real measurements. Finding an analytical solution to this problem would be interesting. Does this process correspond to a phase transition of the same general type as those studied here, with some value of the critical exponent λ?
In Sec. V we suggested a schematic experimental implementation of the MPT for the expansion process. It would also be possible to realize the collapse process experimentally. One general approach [55] is to compare the evolution of different initial states that are conditioned on the same measurement outcomes. In the collapse process, it is sufficient to compare the output single-spin state for distinct initial many-body states. In outline, one initial state is run experimentally and the other initial state is simulated classically, enforcing the same outcomes. The average distance between the outputs can then be estimated with an appropriate measurement, and can serve as an order parameter (in the strong monitoring phase, the tree tensor network has a single dominant singular value, and so projects almost all inputs onto the same output). As in the case of Clifford circuits [55], the initial state that is run experimentally can be a state that is not efficiently representable classically.
Separately, it would also be interesting to understand whether the expansion or collapse processes are related to any natural quantum information task. One could also explore quantum versions of classical message-broadcasting processes with a tree-like structure, with potential transmission errors on the bonds of the tree. These classical processes can show nontrivial phase transitions in the information communicated from the root to the leaves [66].
The results here were obtained using a recursion relation for a random variable Z k characterizing the entanglement (or purity) of a given tree. An open question is how to recover them using the formalism of the effective "lattice magnet" that can be obtained using the replica trick by integrating out all the random unitaries [8,9,22,28,30,41].
Finally, we note that we have only studied the entanglement between the root and leaves of the tree, but it would also be interesting to analyze the entanglement between subsets of the leaves. For example, one could consider the final state produced by an expansion process for which the initial state is pure. In this case one can think of the final state of the leaves as a 1D wavefunction analogous to a multi-scale entanglement renormalization ansatz (MERA) state. For such a state the entanglement entropy between subsets of the leaves also reflects the transition. In the weak monitoring phase, we expect a logarithmic scaling of the entanglement entropy with subsystem size [22,67].

ACKNOWLEDGMENTS
The authors are grateful to Bernard Derrida, Sthitadhi Roy, and Jonathan Ruhman for helpful discussions. Part of this work was performed at the Aspen Center for Physics, which is supported by National Science Foundation grant PHY-1607611.

In Sec. II A we stated that it did not matter whether the measurements were performed in the S z basis, or in randomized bases. In this Appendix we show this using the expressions in Sec. II B for the probability of a given tree. We consider the case of true measurements (the forced measurement case is even simpler) for the collapse process. The same logic applies to the expansion process.
Abusing notation, in this Appendix (only) we will denote a given tree by $T = T(U, u, v, \sigma)$ (A1). Here U denotes not a single unitary but the complete set of two-spin unitaries appearing in the tree, u denotes the set of single-spin unitaries appearing in the Kraus operators (which fix the measurement bases for the weak measurements), and v denotes a set of single-spin unitaries that fix the bases for the strong measurements. Finally, σ is the set of measurement outcomes. We will also use U, u, etc. for individual unitaries; this should be clear from context. First, note that, because of the structure of the complete tensor network for T, it is possible to absorb almost all of the u's and v's into a redefinition of the U's. For example, in the expression $K = u K_{\rm diag} u^\dagger$ for the Kraus operator on a bond, we can absorb the u into the U above the bond, and the $u^\dagger$ into the U below. Similarly each v can be absorbed into the U from the same node. In this process, a given two-site unitary $U_r$ (here r denotes a node) transforms as $U_r \to V_r U_r W_r$ (A2), where $V_r$ and $W_r$ are some unitaries that depend on the adjacent u's and the adjacent v. (The details do not matter.) The only $u^\dagger$'s that do not get absorbed in this process are the ones at the initial time from the lowest layer of K's. If we denote the product of these by X(u), then the above "gauge transformation" gives us the simple identity $T(U, u, v, \sigma) = T(V(u,v)\,U\,W(u,v), 1, 1, \sigma)\,X(u)$ (A3). Again we use a schematic notation where site indices are dropped.
The final factor, X(u), will drop out because (i) we take the initial state to be maximally mixed, and (ii) we consider observables F(T) that are invariant under a unitary transformation at the base of the tensor network. That is, regarding T as a 2 × M rectangular matrix (see Sec. II B), $F(TY) = F(T)$ (A4) for any M × M unitary Y. For example, any function of the order parameter Z has this property, since the singular values of T are unchanged by a unitary rotation. Now we wish to show that the averages of F are the same in two different ensembles: first, the ensemble where the single-site unitaries u and v are Haar-random, and second, the ensemble where they are all fixed to the identity. The latter corresponds to taking all measurements in the S z basis. (The case where the u's are random while the v's are fixed, or vice versa, can also be shown to be equivalent, in a similar way.)
First consider the case where the bases are randomized. Then expectation values have the schematic form The unitary averages involve the Haar measure. The σ average has to be the innermost one, and it is done with the Born rule probability, which is a function of T . We write this schematically as a function of T (see Sec. II B): Since we start with a maximally mixed state, the function H is expressed in terms of the trace of T T † , so it obeys property (A4). Therefore, using (A3), Now return to the average. Let us label the inner average by the probability function used for the average over σ: Inside the average, we use the identity (A3).
where T stands for T = T (V (u, v)U W (u, v), 1, 1, σ). We also use the identity for p.
. We note that for any fixed u and v, the distribution of U ′ is independent of u and v and is Haar.
Nothing inside the outer average depends on u, v so we can drop the average over these quantities: This is the average in the ensemble where all the measurements are in the S z basis. This establishes the claim. Let us also confirm a similar invariance which we invoked at the beginning of Sec. IV A. Consider a single node of the collapse process, with real measurements, with a definite initial state of the form ρ 1 ⊗ ρ 2 . We check that the probability distribution for the output density matrix, is left unchanged by a basis rotation for one of the initial spins, e.g.
This invariance can be seen by manipulations similar to the above. The node tensor t, defined in Sec. II B, depends on a two-site unitary U , on single-site unitaries u 1 , u 2 appearing in the Kraus operators, and on measurement outcomes s = (σ 1 , σ 2 , σ ′ 2 ). Schematically, ρ is a function of all these and the initial states. Let us write ρ = ρ(s, X), where X = (U, u 1 , u 2 , ρ 1 , ρ 2 ). (A15) Let us denote the probability of outcomes s, given the initial states and the unitaries, as p(s|X) (see Eq. 9 for the expression). Let us also define Then the structure of the tensor network for ρ gives the following invariances, As a result, the probability distribution of ρ, for given input states, is unchanged by the rotation ρ 1 → wρ 1 w † . To see this, consider an arbitrary expectation value involving ρ, with the initial state (wρ 1 w † ) ⊗ ρ 2 : Looking at Eq. A17 we see that w and w † can be absorbed into the Haar-random unitaries U and u 1 without any change to their distribution, so that F (ρ) is independent of w. That is, the probability distribution of the output ρ depends only on the eigenvalues of ρ 1 and not on its eigenvectors (and similarly for ρ 2 ).

and finally Since the average over measurements has been separated out and written explicitly in terms of p(s) = p(σ 1 , σ 2 , σ ′ 2 ), the angle brackets in Eq. B1 represent only the averages over the Haar-random unitary U and the random bases in the K operators. 15 Therefore, these averages are taken with a distribution that is invariant under unitary transformations on any index of the tensor t.
Eq. B1 means that the quantity we need to evaluate is not $\langle A_i^{\lambda}\rangle$ alone, as in the forced measurement case, but $\langle p(s) A_i^{\lambda}(s)\rangle$. Since $A_1$ and $A_2$ are statistically equivalent, it suffices to examine the average involving $A_1$. Using Eqs. B2-B4, This can be analyzed by a similar method to Ref. [22]. Let M be the matrix $M_{ab} \equiv t^a_{b1}$, in terms of which We consider the singular value decomposition of M, $M = w\,\mathrm{diag}(\eta_1,\eta_2)\,v$, where w and v are unitary, and, with probability 1, both singular values are greater than zero. Then Eq. B6 becomes
$\left\langle p(s) A_1^{\lambda}(s)\right\rangle = \left\langle \frac{\eta_1^{2\lambda}\,\eta_2^{2\lambda}}{\left(\eta_1^2 |v_{11}|^2 + \eta_2^2 |v_{21}|^2\right)^{2\lambda-1}} \right\rangle_{v,\eta}$. (B9)
Here the averages are over the singular values η and the unitary matrix v. Because of the invariance property of the t distribution mentioned above, the v average is simply a Haar average, which reduces to a uniform average of the complex vector $v_{a1}$ over the unit sphere, as in [22]. Performing the sphere average (see Eqs. C2, C2 below for the necessary formula) gives
$\left\langle p(s) A_1^{\lambda}(s)\right\rangle = \left\langle \frac{\eta_1^{2\lambda}\,\eta_2^{2\lambda}\left(\eta_1^{4-4\lambda} - \eta_2^{4-4\lambda}\right)}{(2-2\lambda)\left(\eta_1^2 - \eta_2^2\right)} \right\rangle_{\eta}$. (B10)

15 We could write these operators as e.g. $K^{(1)} = u^{(1)} K_z u^{(1)\dagger}$, where $K_z$ is non-random and diagonal in the z basis, and $u^{(1)}$ is a Haar random 2 × 2 unitary.
This expression is sufficient to obtain the desired identity, even without performing the η average explicitly. Differentiating with respect to λ and taking the limit λ → 1, we obtain ⟨p(s)A 1 ln A 1 ⟩ = 0, for any outcome. This proves Eq. 62, and the critical point is then given by Eq. 60.
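The vanishing of the λ-derivative can be checked explicitly. The sketch below assumes that the quantity being averaged has the form $\eta_1^{2\lambda}\eta_2^{2\lambda}/(\eta_1^2|v_{11}|^2+\eta_2^2|v_{21}|^2)^{2\lambda-1}$, and that $x = |v_{11}|^2$ is uniformly distributed on [0, 1] (the Haar average over a two-component unit vector). The shorthand $a = \eta_1^2$, $b = \eta_2^2$, $\epsilon = 2-2\lambda$ is ours.

```latex
\begin{align}
\int_0^1 \frac{(ab)^{\lambda}\,dx}{\left[a x + b(1-x)\right]^{2\lambda-1}}
  &= (ab)^{\lambda}\,\frac{a^{\epsilon}-b^{\epsilon}}{\epsilon\,(a-b)},
  \qquad \epsilon = 2-2\lambda, \\
(ab)^{\lambda} &= ab\left[1-\tfrac{\epsilon}{2}\ln(ab)+O(\epsilon^{2})\right], \\
\frac{a^{\epsilon}-b^{\epsilon}}{\epsilon}
  &= \ln\frac{a}{b}+\tfrac{\epsilon}{2}\left(\ln^{2}a-\ln^{2}b\right)+O(\epsilon^{2}), \\
\Rightarrow\quad
(ab)^{\lambda}\,\frac{a^{\epsilon}-b^{\epsilon}}{\epsilon\,(a-b)}
  &= \frac{ab}{a-b}\,\ln\frac{a}{b}+O(\epsilon^{2}).
\end{align}
```

The terms linear in ε cancel because $\ln(ab)\,\ln(a/b) = \ln^2 a - \ln^2 b$. The cancellation holds for every pair of singular values, so the derivative with respect to λ at λ = 1 vanishes before the η average is taken, consistent with the statement above.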
Because U is anyway Haar-random (and independent of $K^{(2)}$), the orientation of this vector does not matter. We are free to replace $y_{c'}$ with a vector that points in the (1, 0) direction, without changing the distribution of M. However, we must keep track of the norm of y. Let us call this norm R. Since the leftmost unitary on the RHS of (C7) does not change the norm, $R = \left\| \begin{pmatrix} \sin\theta & 0 \\ 0 & \cos\theta \end{pmatrix} \vec z \,\right\|$.

Appendix D: Finite pool-size effect
In this work we use the pool method to approximate the recursive evolution of the probability distribution of the density matrix in the quantum tree problem. For any fixed k, averages obtained by the pool method should converge slowly to the true values as the pool size N pool tends to infinity. Therefore an important question is whether the pool size is large enough.
To check whether the pool size in our study ($10^6$) is large enough, we choose θ close to the critical point and study the curves of $\ln Z^{\rm typ}_k$ with different pool sizes for both the forced and real measurement cases. The result is shown in Fig. 13. The relative error between the two curves for the largest pool sizes shown in the figure, $10^{4.5}$ and $10^5$, is smaller than 1%. This convinces us that the pool size of $10^6$ used in the main text is sufficient.
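For readers unfamiliar with the pool method, a minimal sketch is given below. It is an illustration under stated assumptions, not the code used for the figures. Each generation draws two random members of the previous pool, combines them at a freshly sampled node, and stores the result. The node here is a hypothetical forced-measurement toy (a 2×4 block of a Haar-random 4×4 unitary, with no weak measurement), the typical value is taken to be defined through the pool average of ln Z, and the function names and pool size are our choices.

```python
import numpy as np

def haar_unitary(n, rng):
    """Haar-random n x n unitary from the QR decomposition of a complex Gaussian matrix."""
    g = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, r = np.linalg.qr(g)
    d = np.diag(r)
    return q * (d / np.abs(d))  # fix column phases so the distribution is Haar

def next_pool(pool, rng):
    """One generation: pair random pool members at a freshly sampled toy node."""
    new_pool = np.empty_like(pool)
    for i in range(len(pool)):
        a, b = pool[rng.integers(len(pool), size=2)]
        t = haar_unitary(4, rng)[:2, :]  # toy node: project the first output spin onto outcome 0
        rho = t @ np.kron(a, b) @ t.conj().T
        new_pool[i] = rho / np.trace(rho).real
    return new_pool

def ln_Z_typ(pool):
    """Pool average of ln Z, with Z the smaller eigenvalue of each density matrix."""
    z = np.linalg.eigvalsh(pool)[:, 0]
    return float(np.mean(np.log(np.clip(z, 1e-300, None))))

rng = np.random.default_rng(0)
pool = np.tile(np.eye(2, dtype=complex) / 2, (500, 1, 1))  # maximally mixed start
for k in range(4):
    pool = next_pool(pool, rng)
```

Repeating the run with several pool sizes and comparing `ln_Z_typ(pool)` at fixed k is the kind of convergence check described in this Appendix.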
As discussed in the main text, a heuristic matching argument at the left-hand side of the range of x under discussion allows $\ln Z^{\rm typ}$ to be expressed in terms of $\phi/(c\sqrt{\theta_c-\theta})$. The key point is that the second term in (E9) should become non-negligible (and positive) as x approaches $\ln Z^{\rm typ}$, and this matching requires the argument of the tangent to vanish close to the front.
Next we consider the value of ϕ. In the case of forced measurements the value of ϕ can be fixed straightforwardly (if non-rigorously) by matching on the other side of the intermediate-x range. For forced measurements, a formula like Eq. E9 holds, but with the first term being −1/2 instead of −1. Therefore the slope in the intermediate regime is close to −1/2. On the other hand, it is easy to show (for example from the definition of the generating function) that for x ≫ 1 the slope must approach −1. Therefore, for forced measurements, we argue that ϕ = π, in order that, as x approaches 0 from the left, the slope should decrease (by an amount much larger than √ θ c − θ) to match onto the appropriate solution for x ≳ 0.
However for real measurements we have ∂ ln H ≃ −1 both in the intermediate regime and for x ≫ 1. Therefore we cannot use the same argument to fix ϕ.
Instead, it seems plausible that ϕ can be fixed to π/2 by more careful considerations along the following lines. We have (we consider the limit k → ∞) Differentiating to obtain an expression for the slope $\partial_x \ln H$ discussed above, and defining $\tilde Z = e^{-x} Z$, gives Both the numerator and denominator may in principle be expanded in moments of Z to all orders. The key point is that while the expansion of the denominator starts with the first moment, ⟨Z⟩, the expansion of the numerator starts with the second moment. Directly averaging the recursion relation for $Z_k$, in the stationary regime where the moments are independent of k, gives where B is a sum of terms involving higher moments (and higher powers of the first moment). Since 2⟨A⟩ − 1 is of order (θ c − θ), this equation strongly suggests that, close to the critical point, the higher moments are smaller than the first moment by a factor of order (θ c − θ): We have confirmed this numerically for n = 2. Applied to Eq. E11, this scaling implies that, in the limit where θ c − θ tends to zero, but for any fixed x, is an unknown function of x. (We have also checked this relation numerically, using the expression in Eq. E11.) Next we would like to compare Eq. E14 with Eq. E9. If in Eq. E9 we fix a value of x, and then take the limit of small θ c − θ, we obtain $1 + \partial_x \ln H \simeq c\sqrt{\theta_c-\theta}/\tan\phi$. At first glance, consistency with (E14) requires 1/tan(ϕ) to vanish, allowing us to fix ϕ = π/2. However, to justify this conclusion we would need to quantify the size of corrections to (E9) arising from the terms (involving $e^{x}$) that we neglected in Eq. E6, in order to check that the two approximations (E9) and (E14) have a nonvanishing region of overlap at small θ c − θ.