Prethermal Strong Zero Modes and Topological Qubits

We prove that quantum information encoded in some topological excitations, including certain Majorana zero modes, is protected in closed systems for a time scale exponentially long in system parameters. This protection holds even at infinite temperature. At lower temperatures the decay time becomes even longer, with a temperature dependence controlled by an effective gap that is parametrically larger than the actual energy gap of the system. This non-equilibrium dynamical phenomenon is a form of prethermalization, and occurs because of obstructions to the equilibriation of edge or defect degrees of freedom with the bulk. We analyze the ramifications for ordered and topological phases in one, two, and three dimensions, with examples including Majorana and parafermionic zero modes in interacting spin chains. Our results are based on a non-perturbative analysis valid in any dimension, and they are illustrated by numerical simulations in one dimension. We discuss the implications for experiments on quantum-dot chains tuned into a regime supporting end Majorana zero modes, and on trapped ion chains.


I. INTRODUCTION
Solid-state systems supporting non-Abelian anyons, such as Majorana zero modes (MZMs), are the focus of considerable research aimed at exploiting them for quantum information processing 1-4 . In the limit of zero temperature, the quantum information stored in a collection of non-Abelian anyons is protected, up to corrections exponentially small in the separation between anyons. At non-zero temperatures, however, thermally excited bulk quasiparticles can be absorbed or emitted by a zero mode, thereby corrupting the quantum information contained therein. Thus a separation-independent failure of protection is expected to increase with temperature as e −∆/T , where ∆ is an energy gap. These processes increase the width and reduce the height of the predicted 5-7 zero-bias peak that appears to have been observed in tunneling experiments [8][9][10][11][12][13][14] .
However, the decay of a zero mode and the quantum information encoded in it is a non-equilibrium dynamical process, and it is not clear if thermodynamic reasoning can describe it properly. In the absence of electron-electron interactions -for instance, in the Kitaev chain Hamiltonian 15 or the transversefield Ising chain, to which it is related by a Jordan-Wigner transformation -thermally excited quasiparticles do not affect the zero modes at all. This can be reconciled with the previous paragraph by noting that, in the absence of interactions, the coefficient in front of e −∆/T vanishes. While the absence of interactions is a fine-tuned special case, a similar conclusion holds in systems with strong disorder in which many-body localization 16 occurs. Here disorder-induced localization prevents bulk excitations from carrying quantum information away from a zero mode [17][18][19] .
Disorder, however, is not necessary to have zero modes in interacting systems. In at least one integrable system, the XYZ spin chain, exact edge zero modes survive the presence of interactions 20 . This edge "strong zero mode" is an operator that commutes with the Hamiltonian up to exponentially small corrections in the finite size of the system [21][22][23] . Moreover, as with the transverse-field Ising chain, thermally excited quasi-particles do not cause the edge degrees of freedom to equilibriate with the bulk. Rather, the edge spin coherence lasts forever in a semi-infinite chain, even at infinite temperature 24 . Even more strikingly, similar behavior was found in several non-integrable deformations of the Ising chain. Here the coherence time is not infinite, but extremely long-lived 24 .
The purpose of this paper is to show that such long-lived edge modes are a more general phenomenon, and to give a direct and rigorous method for understanding them. We demonstrate that "prethermalization," the exponentially slow approach to thermal equilibrium that occurs in some closed quantum systems [25][26][27][28][29][30] , can protect edge zero modes and, in fact, topological degrees of freedom in higher-dimensional systems as well. (Prethermalization can also occur in periodically driven systems [30][31][32][33][34][35][36][37][38] , but this is not our focus here.) Our analysis gives a clear meaning to the notion of an "almost" strong zero mode: it is an operator that commutes with the full Hamiltonian of a system up to corrections that are a nearly exponentially small function of a ratio of energy scales. Here by "nearly" exponentially small we mean with a logarithmic correction to the exponent, as set out in equation (3) below. We call such an operator a "prethermal strong zero mode". Its lifetime is bounded below by a nearly exponentially growing function of this ratio of energy scales because prethermalization delays equilibration of a prethermal strong zero mode until this late time.
By relating the protection of quantum information to prethermalization, we reveal the limits of such protection. We elucidate the nature of this protection in onedimensional, two-dimensional, and three-dimensional closed systems. However, solid-state devices are not closed systems and prethermalization in such devices is eventually superseded by thermalization driven by electron-phonon interactions. Thus, the applicability of these ideas to Majorana zero modes in semiconductor-superconductor devices depends on the particular device considered (since the prethermal limit is not accessible in all devices) and, even then, will only be in some temperature range over which the electron-phonon interaction does not dominate. We quantitatively analyze a quan-tum dot chain proposed in Refs. 39 and 40; although it has not been realized in experiments yet, it is a useful case study. We show that prethermalization can occur over a range of time scales. In this prethermal regime, arguments relying on thermal equilibrium are ultimately correct, but with a (nearly) exponentially small prefactor that reflects the slow thermalization of the system. Moreover, the naive energy gap ∆ is replaced with a much larger effective energy gap ∆ eff . This suggests that the T > 0 protection of quantum information may be optimized by entering the prethermal regime, in addition to -or even rather than -maximizing the energy gap. We demonstrate this tradeoff in explicit models.
The type of prethermalization we describe is not special to one-dimensional or topological systems. We describe explicitly how analogous phenomena occur in some two-and three-dimensional systems, and how prethermalization protects edge modes for long times in systems not topologically ordered. One particular example we describe in detail is the transverse-field Ising chain perturbed by integrabilitybreaking interactions. While this chain is related to the quantum-dot chain via a Jordan-Wigner transformation, the non-locality of the map means that topological order in the latter is simply ordinary ferromagnetic order in the former. Nonetheless, we show how prethermalization means the edge spin coherence lasts for very long times here as well under any perturbation preserving the Z 2 spin-flip symmetry. Namely, the spin coherence lasts for a time that is (nearly) exponentially large in terms of the couplings.
We begin with a conceptual overview in Section II. In section III, we explain the prethermal regime and the theorem due to Abanin et al. 30 that guarantees its existence for certain Hamiltonians. In the first part of Section IV, we show how this theorem can be used to provide a lower bound on the lifetime of edge zero modes, and apply it explicitly to a model of interacting Majorana fermions. In the remainder of Section IV, we discuss the lifetime of the Majorana zero modes at non-zero temperatures and present numerical simulations supporting our arguments. We then generalize this analysis in the following Section V, describing the conditions needed to observe prethermally protected zero modes, and giving details of several examples which illustrate these conditions. In section VI, we apply this general strategy to analyze systems in two and three dimensions. Section VII explores possible practical applications of our results, in quantum dot and ion chains. Finally, in Section VIII, we consider integrable systems, where the zero modes may survive much longer (possibly even infinitely longer) than the lower bound.

II. TOPOLOGICAL ZERO MODES AT FINITE TEMPERATURE
A common feature of systems exhibiting topological order at zero temperature is topological degeneracy, where there are several nearly degenerate ground states. Their energy splitting scales as e −L/ξ for some correlation length ξ, where L is the system size. Moreover, these degenerate states are locally indistinguishable. This means that, at zero temperature, quantum information can be stored in the degenerate ground state subspace in a topologically protected way. However, this topological protection usually does not extend to finite temperature. Here we will review, in a schematic way, the standard arguments for this. We then explain why, in an isolated system, prethermalization can improve the situation considerably, both in topologically and conventionally ordered systems. A rigorous argument will be given in later sections.
Our prototypical example is a one-dimensional topological superconductor, exemplified by the Kitaev chain 15 . In such systems with open boundary conditions, there is a pair of Majorana zero modes on the two ends of the chain, represented by Majorana operators γ and γ' [see Figure 1(a)]. At zero temperature, these can be used to encode quantum information in the qubit. The qubit can be decohered by randomly acting on it with the logical operators σ z = iγγ or σ x = γ. Since these are both non-local (the latter because it is a fermionic operator), the qubit is therefore immune to decoherence from any local noise process.
At finite temperature, on the other hand, there is a finite density of fermionic quasiparticles in the bulk of the system. If such a quasiparticle is near one of the Majorana zero modes (say the one corresponding to γ), then it can annihilate on it [ Figure 1(b)], which has the effect of acting on the encoded qubit with the logical operator σ x = γ. At finite temperature, such processes will happen continuously, and so the encoded qubit will quickly decohere. This is known as "quasiparticle poisoning".
How can quasiparticle poisoning be overcome? One way is by many-body localization (MBL) 18,[41][42][43][44][45][46][47][48][49] . This causes the quasi-particles to become immobile, and thus prevents them from moving onto the boundary and annihilating 49 , as shown in Figure 1(c). Going to an MBL phase, however, requires strong disorder, and it is not at all clear if such phases exist outside one dimension. Moreover, MBL systems may not be suitable for topological quantum computation with Majorana zero modes due to non-local rearrangements that occur when varying the Hamiltonian adiabatically 50,51 .
We show in this work that there is another way of avoiding quasiparticle poisoning. Our approach exploits the fact that if the number of quasiparticles in the bulk is conserved (in fact, it is sufficient for the number to be conserved modulo 2), then they cannot be annihilated on the boundaries [ Figure  1(d)]. Of course, outside integrable systems there is no reason why such a conservation law should hold exactly, but often parameters of the Hamiltonian can be tuned such that it holds in an approximate way, leading to a long decoherence time.
One might imagine that obtaining a long decoherence time in this way would require significant fine-tuning. Remarkably, this turns out not to be the case. We show that many Hamiltonians possess a significant parameter regime with an approximately conserved quantity that we identify with quasiparticle number. Terms violating the conservation law are exponentially small in parameters of the Hamiltonian. This is based on the mechanism of prethermalization, as discussed further in later sections.
Our approach is not necessarily limited to one-dimensional systems. In higher dimensions, decoherence of quantum With strong disorder, the chain can be made to be MBL, such that the localized quasiparticles are not able to annihilate on the boundaries. (d) In a suitable "prethermal" regime, the quasiparticles are mobile but are prevented from annihilating on the boundary by an approximate conservation law.
information is also generally related to processes involving quasi-particles. For example, in the 2D toric code defined on a torus, decoherence is caused by quasiparticles moving around a non-contractible loop on the torus. On the other hand, the 4D toric code is known to be immune from decoherence at low temperatures, precisely because this system has no topologically non-trivial particle-like excitations 52 . Unfortunately, no such system is known in dimension less than four.
We show that prethermalization can be used to suppress decoherence in an isolated system arising from creation or annihilation of quasiparticles. Thus it is no help in the case of the toric code on a torus, since quasiparticles can move around non-contractible loops without changing the total quasiparticle number. However, in the planar version of the toric code the decoherence mechanism involves quasiparticles annihilating on the edges. Prethermalization therefore is useful here. We analyze this and other higher-dimensional examples in more detail in Section VI.
III. PRETHERMAL REGIME A closed quantum system is said to be "prethermal" if, en route to thermalization, it is in an exponentially long-lived quasi-steady state. One cause of prethermalization is an approximate conservation law: over intermediate time scalesknown as the prethermal regime -the system maximizes its entropy, subject to the constraint that the conserved quantity takes a fixed value. Over sufficiently long time scales the entropy is maximized without any constraint.
A theorem due to Abanin, De Roeck, Huveneers and Ho (henceforth ADHH) 30 guarantees the existence of such a prethermal regime for Hamiltonians of the form whereN is a sum of finite-range commuting terms, such that N has integer eigenvalues, i.e. e 2πiN = 1. The proof in Ref.
30 assumes that each term inN acts only on a single site, but in Appendix A we show that this assumption can be relaxed. We define a parameter J 0 that is essentially the largest operator norm of any local term in Y ; more explicit forms are given in specific examples below. The theorem says that for J 0 /J sufficiently small, there exists a local unitary transformation U such that and c is a constant. Since we focus on the J J 0 limit, we often drop the 1 in the denominator of Eq. (3) for the sake of uncluttering the equations. As a result, the dynamics of the system conservesN until a time t * ∝ e cn * . A more precise statement of the ADHH theorem can be found in Ref. 30.
For our purposes, the essential point of the ADHH theorem is that a Hamiltonian (1) has an emergent approximate U (1) symmetry generated byN . The symmetry violations come from terms in the transformed Hamiltonian that are nearly exponentially small in the large-J limit. We apply the ADHH theorem to Hamiltonians with edge zero modes, and show that the approximate U (1) symmetry can protect these zero modes -even far from the ground state of the system, where we do not ordinarily expect topological protection. It also protects them in systems with no topological order whatsoever, thus providing a completely different mechanism for preserving quantum coherence distinct from topological considerations.
There is a nice heuristic picture for why a Hamiltonian of the form (1) has an approximate U (1) symmetry. The transformation to the form (2) means that it is very difficult for the terms inŶ to cause transitions between different eigenspaces ofN : in order to conserve energy, many excitations ofŶ must be created or annihilated. Such a process occurs slowly, so violation of the approximate U (1) symmetry takes a very long time.
However, in applying these ideas to real systems, it is important to keep in mind that the ADHH theorem requires thatŶ be a sum of local terms, each of which has bounded (indeed, small) norm. This condition cannot be satisfied in any real solid since phonons and photons do not have finitedimensional local Hilbert spaces. The energy associated with a transition between different eigenspaces ofN need not wait for many excitations ofŶ to be created or annihilated; it can, instead, be supplied or carried away by a phonon or photon. However, if the couplings of the electronic degrees of freedom to phonons and photons are sufficiently small, they may not play a role during the prethermal regime. We show that this is the case in semiconductor devices in Section VII A. We note also that the presence of gapless excitations does not automatically destroy all symmetry-protected edge modes 53 .

IV. PRETHERMAL PROTECTION OF A TOPOLOGICAL QUBIT
The importance of the ADHH theorem for the present work is that in many cases UN U † can be identified with an effective "quasi-particle number", and its approximate conservation can suppress the decoherence mechanisms of a topological qubit, as outlined in section II.
In this section we introduce the idea by describing a particular example in depth: a topological superconducting chain. We show that it is possible to construct a prethermal edge Majorana zero mode for arbitrary interactions, as long as the topological superconducting ordering term is the dominant coupling. We present evidence from numerical simulations that the prethermal regime persists over a surprisingly large range of couplings, including values of the dominant coupling that are not so very large. Later, in section VII A, we discuss the relevance of these ideas to quantum dot chains in a semiconductor-superconductor heterostructure, where the electron-phonon coupling cuts off prethermalization. As we will see, there are circumstances under which the effect of the electron-phonon coupling is weaker than electron-electron interactions, so that prethermalization acts to suppress the dominant MZM decay channel, leading to relatively long-lived MZMs. Although this first example is one-dimensional, the basic idea and the theorem on which it relies work in any dimension, as we discuss in later sections.

A. Prethermalization in the interacting Kitaev chain
The Kitaev chain 15 is a simplified model of a onedimensional topological superconductor of spinless fermions. Its Hamiltonian is The topological superconducting phase occurs for 2|t| > |µ| provided ∆ > 0.
Moreover, we will need to include interactions in our description. They can be and usually are dropped for the purposes of demonstrating the existence of the topological superconducting phase and its concomitant MZMs, but they are necessary for any discussion of non-zero temperature dynamics. With the addition of such terms, Eq. (4) takes the form We have only written the simplest interaction term explicitly (the V term) and denoted the rest implicitly with the ellipses.
In writing the Hamiltonian in this way, we have explicitly separated the presumed largest term, written on the first line, from the smaller terms, written on the second and third lines. This model can be written in terms of the Majorana fermion operators γ A j and γ B J defined by c j = (γ A j + iγ B J )/2. This rewriting givesĤ = −JN +Ŷ , wherê and J ≡ (t + ∆)/2. On the right-hand side of the second equal sign, we have written the Jordan-Wigner transformed representation of this operator according to the definitions Note that we have taken open boundary conditions, in order to focus on the physics of the edge. One important thing to note is that with open boundary conditions, the two Majorana fermions at the edges, γ A 1 and γ B L , do not appear inN . Each therefore commutes with it. We also havê where . . . denotes other third-neighbor and more-distant hopping and interaction terms. The transverse field h is the chemical potential of the topological superconductor (5) according to the identification h ≡ µ. The two four-Fermi terms we single out are the two simplest, and in spin language are h 2 σ x j σ x j+1 and J 2 σ z j σ z j+2 respectively; in terms of the original topological superconductor, h 2 ≡ V while J 2 includes slightly longer-ranged interactions and Cooper pair-hopping terms not explicitly included in Eq. (5). For simplicity, we take couplings inŶ to be spatially uniform, but this is not necessary for the approach to work; in fact adding disorder toŶ typically enhances the effects that we describe. We thus have included a J AB term in Eq. (8); because of the integereigenvalue restriction it cannot be absorbed into the JN term if it is disordered. We assume that the hopping and interaction terms have finite range but make no further assumptions.
We now apply the ADHH theorem to the perturbed Ising/Kitaev Hamiltonian −JN +Ŷ , with open boundary conditions and the operators defined by (6,8). The theorem is applicable for J/J 0 sufficiently large, where the energy scale J 0 is given in this case by The J AB term is included in the presence of disorder, where it is the deviation of J from its mean value. In the disordered case, J 0 is defined as the maximum possible value of the righthand side of Eq. (9) for any site in the system, stipulating that the interactions on the right-hand side of Eq. (9) touch that site. The number κ 0 is chosen so that this sum is finite. By choosing e κ0 as large as possible while satisfying this requirement, we can maximize the range of J over which the theorem applies. Then the theorem guarantees that there exists a local unitary transformation U such that where [N ,D] = 0 and n * is given by (3). Another way to say this is that the original HamiltonianĤ = −JN +Ŷ has an approximately conserved quantity U †N U, which is conserved until times t * = O(e cn * ).
To illustrate the significance of this approximately conserved quantity, let us consider the limit J 0 → 0, which im-pliesŶ = 0 so the Hamiltonian is simplyĤ = −JN . In that case the local unitary rotation U = , so the conserved quantity is simply U †N U =N . In this limit, excitations (henceforth, "quasiparticles") correspond to making one of the terms in Eq. (6) have eigenvalue −1, and have no dynamics. The conserved quantityN simply counts the number of quasiparticles, which is the eigenvalue of (L − 1 −N )/2. In fact, for our purposes, it will be sufficient to use the fact that the number of quasiparticles is conserved modulo 2 in this limit; in other words, here there is a conserved Z 2 charge We emphasize that this is not the same as the fermion parity F, which is also a symmetry of any local fermionic Hamiltonian; the latter is instead given by which differs from F in that the boundary Majorana operators γ A 1 and γ B L are also included. The charge F thus can be interpreted as the "fermion parity in the bulk".
The important result is that as perturbations are added, moving J 0 away from zero, there is a continuous deformation ofN , namely U †N U, which we continue to identify as the effective quasiparticle number, and is approximately conserved. By the intuitive discussion in Section II, this suggests that topological information stored in the Majorana zero modes should have a long decoherence time. We are now in a position to rigorously establish this. Indeed, a sufficient condition to allow for quantum information to be stored with infinite (respectively, very large) decoherence time is that there exist Majorana operators Ψ A and Ψ B which square to the identity, anti-commute with each other, and commute (respectively, almost commute) with the Hamiltonian; this means that they form an "edge strong MZM" in the language of Ref. 23. We will show that this is an implication of the ADHH result.
The presence of the strong MZMs has important ramifications. In the fermionic picture, both Ψ A and Ψ B toggle between all states in the sectors, even the highly excited ones. Their presence means that the full Hilbert space of the system can be decomposed into the tensor product of a two-state quantum system, i.e. a topological qubit, and a non-topological "bulk" Hilbert space, of dimension 2 L−1 , such that the Hamiltonian vanishes upon projection onto the topologial qubit, up to finite-size corrections ∼ e −L/ξ . Consequently, the topological qubit is protected at any temperature: regardless of how the dynamics of the system affects the projection of the state of the system into the bulk Hilbert space, the topological qubit is unaffected. In the spin language, this means that the autocorrelator of the boundary spin operator σ z 1 = γ A 1 for any temperature or initial state is non-vanishing up to exponentially long times of order (J/h) L . 24 The same goes for the other edge spin σ z L = Fγ B L . To see why ADHH implies a strong MZM, let us first return to the limit J 0 = 0 where the Hamiltonian is simply −JN . In this limit the Majorana operators γ A 1 and γ B L already form a strong MZM. Now turn on any or all of the other terms in Y , while keeping 0 < J 0 J. The ADHH theorem states that there exists a local unitary change of basis U which transforms the problem into one in whichN is conserved, up to nearly exponentially small corrections. The locality properties of U, along with the fact that [N ,D] = 0, require that the approximate transformed Hamiltonian −JN +D commutes with the edge Majorana fermions: To see this, observe that −JN +D commutes with both the fermion parity F and the "bulk fermion parity" F defined in Eq. (11). Therefore, it also commutes with their product F F = iγ A 1 γ L 1 . The fact that U is a local unitary ensures that the norm of any term coupling both γ A 1 and γ B L must be exponentially small in L. We assume that L is large enough that such terms can be ignored. It then follows that all terms must commute with γ A 1 and γ B 1 individually, proving (13).
Thus, in the presence of interactions we define The vanishing commutator (13) and the theorem (10) show that these commute with H up to order cn * < L: By construction Ψ l and Ψ r square to the identity operator and anticommute with F. We call such operators, satisfying Eq. (15), "prethermal strong Majorana zero modes". Ref. 24 found evidence for "almost" strong zero modes in the special case whereŶ contains only non-zero h, h 2 , J 2 . Here we have established that they are an example of a prethermal strong MZM, confirming the claims made there.
B. Temperature dependence of the lifetime Now suppose that the system is at temperature T . The dynamics of the prethermal strong MZM Ψ l will be visible in the retarded Green function This is not the Green function of Ψ l but rather that of the "bare" operator γ A 1 . However, there will be non-zero overlap between these two operators, so the Green function of the latter can be used to probe the former. In the limit of large system size we can ignore the interaction between Ψ l and Ψ r , so this Green function takes the form The "wavefunction renormalization" Z is a measure of the overlap between γ A 1 and the MZM operator Ψ l , as determined by the unitary transformation U. The decay rate Γ will be our primary figure of merit in judging prethermal strong zero modes. It is directly reflected in the width of a zero-bias peak observed in tunneling into the end of a Majorana chain (see Sec. VII A) and determines the error rate for topological qubits encoded in the MZMs Ψ l , Ψ r . The decay rate Γ is determined byÊ, the correction term in the transformed basis, and would vanish ifÊ were to vanish, with the decay time becoming infinite. From Eq. 15, which expresses the fact that Ê = O(e −cn * ), we see that Let us now examine the temperature dependence in more detail. After a local unitary rotation by U, the Hamiltonian can be shifted to [Eq.(2)] asĤ = −JM +D +Ê, wherê M = −N + c, with the constant c chosen so thatM has smallest eigenvalue 0. The termD commutes withM and can be jointly diagonalized with it, so the decay is attributed to resonant transitions between differentM sectors induced byÊ. These transitions can only happen between states with approximately the same energy with respect to JM +D. The corresponding energy bandwidth of the sector withM = m is bounded by where E 0 is the ground-state energy, and C is some dimensionless constant. Roughly, this is because a state in this sector differs from the ground state only in at most m spots, and only those spots can contribute to the energy difference from the ground state. For a rigorous proof, see Appendix B. States in theM = m sector can have the same energy as those in theM = m + 1 sector when E 0 +mJ +mCJ 0 ≥ E 0 +(m+1)J −(m+1)CJ 0 or, in other words, when m ≥ m min ∼ J/J 0 . A transition can only occur when the resonance condition is satisfied. In an excited state with a non-zero density of excitations, at least m min of them must be close to the MZM in order for a transition to occur. Hence, where ρ is the density of excited quasiparticles. We thus have shown that the decay time of a MZM is of Eq. (19) is determined by two factors. The important consequence is that decay of an MZM is suppressed by a high power of the quasiparticle density, in addition to the state-independent exponential suppression e −cn * .
At very low temperatures, the density of excited quasiparticles in superconductors is generally higher than the expected thermal (or pre-thermal) equilibrium value ρ ∼ e −∆/T . (The reasons for this lie outside the purview of our discussion; for a recent theoretical analysis, see Ref. 54. References 1-7 in this paper contain experimental measurements of the quasiparticle density.) At temperatures that are not too low, the quasiparticle density exhibits equilibrium behavior, and we have ρ ∼ e −∆/T . When this is the case, the decay rate is controlled by e −∆ eff /T , where we have defined the "effective gap" ∆ eff ∼ J(J/J 0 ). The effective gap is much larger than the actual gap, which is ∆ ∼ J, up to corrections of order J 0 . Therefore, for J J 0 and T ∆ eff the decay rate is of the form This shows that the finite lifetime of an MZM is exponentially large in n * ∼ (J/J 0 )/ln 3 (J/J 0 ) and also exponentially large in 1/T . Eq. 20 suggests that MZM qubits can be optimized by maximizing J/J 0 even at the cost of reducing ∆, since in any case ∆ eff ∆ when J/J 0 is large.

C. Numerical Results for Prethermal MZMs.
The preceding general arguments can be substantiated by computations in finite-size systems. For chains of length N = 8 − 14, we study prethermal strong MZMs all the way up to infinite temperature by exact diagonalization. As may be seen in the top panel of Fig. 2, the MZM survives to very long times at infinite temperature. The lifetimes are consistent with an exponential dependence on the ratio of scales J/J 0 until the lifetime becomes so long that finite-size effects become important. In a model in which the only terms inŶ are h and h 2 , we can write As may be seen from Fig. 2, the data collapses onto this form. Using time-evolving block decimation (TEBD), 55 we can study much larger systems, where finite-size effects will be less severe. This approach is applicable to low-energy initial states for which the entanglement does not grow too much, allowing accurate simulation with bond dimension χ = 100. Our results confirm that, at least for low-energy states, the Majorana lifetime remains large for larger system sizes (see Figure 3). Note that the decay time shown in Figure 3 should not be directly compared to Figure 2 because Figure 2 is at infinite temperature whereas Figure 3 is at very low temperature.

V. GENERAL CRITERIA AND EXAMPLES FOR PRETHERMALIZATION-PROTECTED TOPOLOGICAL DEGREES OF FREEDOM
The procedure described in preceding sections uses the ADHH theorem to find prethermal strong MZMs. In this section, we give a systematic approach to applying the theorem to determine when topological degrees of freedom are protected from thermal fluctuations by prethermalization. This approach involves two key observations: • The ADHH theorem guarantees the presence of a single long-lived local U(1) charge for J/J 0 large enough.
• In turn, this nearly conserved U(1) prevents local bulk excitations from violating the conservation of some topological charge, up to exponentially small terms.
More prosaically, the idea is that the nearly conserved quantity guarantees that a topological charge localized at edges or defects cannot be changed by the absorption or emission of small numbers of bulk excitations. These general criteria can be applied in any dimension, and so will give some examples in one, two, and three dimensions. We illustrate them through several one-dimensional examples in the remainder of this section and with higher-dimensional examples in the next.

A. Three-state Potts
It is instructive to analyze an example where the ADHH theorem, although applicable, does not guarantee any prethermal strong zero modes. As shown in Refs. 21 and 22, the three-state Potts chain does not have any sort of edge zero mode for finite J. The easy edge-spin-flip process that kills the putative zero mode can be seen easily in perturbation theory for J large. Here we rephrase this result in the more general setup of this paper.
The quantum-chain analog of the 3-state Potts model has a three-state quantum system on each of L sites. The basic operators σ and τ acting non-trivially on a single site generalize the Pauli matrices σ x and σ z . Instead of squaring to the identity and anticommuting, they obey where ω ≡ e 2πi/3 . Matrices satisfying this algebra are Here σ generalizes the Pauli matrix σ z to measure the value of a clock variable, while τ generalizes σ x to shifting the value. The operators σ j , τ j are defined analogously to the Ising case, where they act non-trivially on the jth site of the chain and trivially elsewhere. The three-state Potts chain is invariant under global S 3 permutations of the three states and has nearest-neighbor interactions. This fixes the Hamiltonian with open boundary conditions to be H = −JN P + Y P , wherê If desired, these operators can be rewritten in terms of Z 3 parafermionic operators akin to Majorana fermions 23 .
The operatorN P has integer eigenvalues, since σ † j σ j+1 + σ j σ † j+1 has eigenvalue 2 if the Z 3 spins at j and j + 1 are identical, and −1 if they are not. ThusN P is related to counting kinks, just as in Ising. However, an important difference with the Ising case is that here there are two types of kinks. We label the three states at each site as A, B, C, with the Z 3 symmetry cyclically permuting them. When the states AB (or BC or CA, their cyclic permutations) occur on sites j and j + 1 respectively, we call the configuration a kink, and when BA appears (or CB or AC), we call the configuration an antikink. Thus (2(L − 1) −N P )/3 counts the number of kinks plus the number of antikinks.
SinceN P is an integer, the ADHH theorem says there is an emergent U (1) symmetry in the prethermal regime, conserving the total number of quasiparticles. However, this symmetry does not prevent a zero mode from decaying, because a kink can scatter off the edge and turn into an antikink without changingN P . In perturbation theory this results from the easy-spin-flip process described in Ref. 22. 56 Namely, consider say the AB kink on sites 1 and 2. The Y P term allows the spin on site 1 to be shifted from A to C, and so converts the AB kink to the CB antikink. This process conservesN P while flipping the edge spin. Clearly there can be no long edge-spin relaxation time, and no edge strong zero mode, even in the prethermal regime.
This fact appears nicely in the language of the unitary transformations used in this paper. The point is that the transformation guarantees only that the resultingD commutes witĥ N . It says nothing directly about the edge spin, which is why in the Majorana case we needed to argue that no terms involving edge-spin flips could appear inD. Here they can. Such operators are easy to write out using the projectors satisfying P (r) j,j+1 and τ j P (r) j,j+1 = P (r−1) j,j+1 . SinceN P can be written as a sum over P (0) , it follows that N P , τ 1 P (2) 1,2 = 0 so there is no obstacle to including τ 1 P (2) 1,2 intoD. Indeed it does appear for (22) and for a generic Hamiltonian with the same dominant termN P . Since it shifts the edge spin, it rules out any strong zero mode. The Q-state Potts model with S Q permutation symmetry is the obvious generalization of this to any integer Q ≥ 2 and, except for the Ising case Q = 2, the same arguments apply: there is no edge strong zero mode for any finite J.

B. Z3 Parafermions
Despite the results for the three-state Potts model, it is still possible to have prethermal edge zero modes of Z 3 parafermions. One way of doing this is to follow Refs. 21 and 22 and to deform the dominant termN P from (22) tô For θ not a multiple of π/3, this explicitly breaks spatial parity and time-reversal symmetries. It also breaks the S 3 permutation symmetry to Z 3 , and thereby breaks the symmetry between kinks and antikinks.
To utilize the ADHH theorem, we need to chose θ so that N θ has integer eigenvalues. The simplest non-zero value, θ = π/3, is the Potts antiferromagnet. Here the edge zero mode does not exist for similar reasons as described above for the ferromagnet. Thus we choose θ = π/6, halfway between ferromagnet and antiferromagnet. In this case, the energy of an antikink is twice that that of a kink when h = 0, and L +N π/6 counts the number of kinks plus twice the number of antikinks. Turning on h and applying the theorem means that the resultingD does not conserve the number of kinks individually, but allows scattering processes that convert one antikink to two kinks. The nice fact is that no such simple process flips the edge spin. For example, an antikink near the edge scatters into two kinks via the process BAAAA . . . ⇒ BCAAA · · · that does not flip the edge spin. More formally, there is no local operator involving τ 1 insideD here, just as there is none involving σ x 1 in the Majorana case. To prove this, we utilise the identity 57 Hence, conservation ofN π/6 implies that any term that does not couple the two ends of the chain must commute with σ 1 and σ L individually. The physics for other values of θ seems akin to the J 2 large case discussed next. For certain values of θ,N θ can be rescaled to have integer eigenvalues, but then edge spinshifting terms can occur at some order 57 .

C. Interacting Majorana chains with multiple large couplings
We now return to the interacting Majorana chain but alloŵ Y to include terms that are not very small. We cannot apply the ADHH theorem to ensure a long decay-time for the zero mode. We may be tempted to fix this by moving the offending terms fromŶ toN . Naively, this fix would be valid as long as the extra terms commute with our putative zero mode, and the eigenvalues ofN remain integers. We show that similar considerations as for the Potts and clock models arise: the pre-thermalization theorem holds, but the resulting U (1) symmetry need not protect the zero mode.
As a simple example, returning to our original Ising model in (8), let us allow the next-nearest neighbor coupling J 2 to be of the same magnitude as J. In order to apply the ADHH theorem, we must then include it in the dominant term JN . This presents a potential problem, since the theorem requires integer eigenvalues ofN . However, if we take rational J 2 /J = p/q for coprime integers p and q, then still has integer eigenvalues. The Hamiltonian becomeŝ H nn = −(J/q)N nn +Ŷ nn .
We may now use the ADHH theorem to obtain an approximate conservation law forN nn . However,N nn no longer counts kink number; instead it is the sum of the number of broken nearest and next-nearest neighbor bonds, weighted appropriately by q and p. This means it is possible to flip the edge spin while conservingN nn by converting broken bonds of one type to the other.
Equivalently, terms will appear inD nn which do not commute with the MZM γ A 1 . For example if J = J 2 , then σ x 1 (σ z 2 − σ z 3 ) commutes withN nn but flips the edge spin, ruining the conservation of the MZM. These are the resonances described in detail in Ref. 24. They allow easy edge-spin flips by exchanging energy between different types of bonds. For example, consider the process which swaps between these two spin configurations, identical but for the edge spin: The energy contribution from the three spins on the left is 2(J + J 2 ), while on the right it is 2(2J), so when J = J 2 the edge-spin can be flipped for no energy cost. This easy spin flip is analogous to a kink scattering off the edge into an antikink in the Potts model, but there is an important difference between the two. Here, there is an energy cost associated with either of the domain walls in the right configuration in (27) moving. Thus once a kink has moved to the edge and transformed from the left configuration to the right via an edge-spin flip, it is trapped at the edge, as opposed to the Potts case, where the newly produced antikink is not confined. Thus the only way the kink can move away from the edge is to transform back into the left configuration, reversing the edge-spin flip. This implies that, at low energy densities where there are few kinks, we should expect the MZM to retain a long lifetime, despite the resonances. We have confirmed this via the TEBD. It should be noted that an exception to this survival at low energy densities occurs at the critical point J 2 = −J/2 which, for any non-vanishingŶ nn , becomes a paramagnetic regime between competing ferromagnetic and antiferromagnetic orders. This allows kinks to move freely from the edge in either configuration.
These results and the ADHH theorem show that it is possible for the MZM to survive for long times even for large J 2 /J, because the term in D with the edge-spin flip may occur only with some high power of 1/(Jq). In other words, the order in perturbation theory in which the resonance occurs may be some large value n r . The time to decay will then be order e cnr . For general p and q, we expect that n r will be roughly max(p, q), in accord with the analysis of Ref. 24. It is worth noting that at J = 0 but non-zero J 2 , the ensuing Hamiltonian model is equivalent to two copies of the Ising-Kitaev chain, and so n * → ∞ as J 2 /J → ∞ as well.

VI. TWO-AND THREE-DIMENSIONAL SYSTEMS
As we have emphasized throughout, the ADHH theorem applies in any dimension. In this section, we apply the general criteria developed in Section V to show how the resulting almost-conservation law can result in topological protection analogous to the one-dimensional examples we have analyzed.

A. Two dimensions
We study a perturbed toric code Hamiltonian 58 on a finite square lattice with sides of lengths L 1 and L 2 . The spins live on the links i, with Hamiltonian where A v = i∈N (v) σ z i and B p = i∈p σ x i for vertices v and plaquettes p. In the bulk, there are 4 spins on the links entering each vertex v and 4 spins on the links in each plaquette p. We put "rough" and "smooth" boundary conditions 59,60 on, respectively, the horizontal and vertical sides. At a rough edge, there are only three links around each edge plaquette; at a smooth edge, there are only three links entering each edge vertex. At the rough boundaries, B p is modified so that it is the product of the three σ x i operators around a rough boundary plaquette; the vertex terms are unchanged since there are still four links attached to each vertex (we include no term for the "dangling" vertices touching only one link). At the smooth boundaries, A v is modified so that it is the product of the three σ z i operators around each smooth boundary vertex; the nearby B p are unchanged since each plaquette still contains four links. The . . . represents all other possible local terms, which are assumed to be small, including a term ∝ ( v A v − p B p ) which would give electric and magnetic charges different energies.
Such a system has a doubly degenerate ground state, which can be used as a qubit. One basis for this qubit is given by the eigenstates of the electric charge (modulo 2) on a rough edge. To make this more precise, consider the unperturbed toric code Hamiltonian ((28) with all couplings other than u set to zero). Each term in the Hamiltonian is a projector plus a constant, and so ground states are annihilated by each individually. Now consider a path P of length P stretching along the lattice from one of the dangling vertices on one rough edge to a dangling vertex on the other edge labeled by consecutive links l 1 , l 2 . . . l P and the operator M P = P k=1 σ x l k . Then the eigenstates of the unperturbed toric code Hamiltonian can be grouped into eigenstates of M P with eigenvalues ±1. It is easy to check that this eigenvalue is independent of the choice of the path P. This eigenvalue is the magnetic charge on either smooth boundary, with which P is roughly parallel. Likewise, the smooth edge corresponds to a rough edge on the dual lattice, and so we can define a path P on the dual lattice stretching from one smooth edge to the other. The electric charge operator is then defined as E P = P k=1 σ x l k , and analogously to the magnetic charge, any eigenstate of the Hamiltonian can be grouped into eigenstates of E P with eigenvalue ±1. However, E P and M P anticommute, since P and P always intersect. There are thus two ground states, not four, and the operators acting on this qubit can be identified as Z ≡ E P and X ≡ M P . Now consider the perturbed model, in which the other couplings are allowed to be non-zero. If h x or h z is large, the system undergoes a zero-temperature phase transition to a trivial phase; see Refs. 61 and 62 and references therein. However, if h x , h z , . . . are not too large, then the system will remain in the zero-temperature topological phase, and the ground state will still be doubly degenerate, up to corrections that are exponentially small in min(L 1 , L 2 ). We can still associate the Z and X eigenstates with the eigenstates of electric and magnetic charge (modulo 2) on, respectively, the rough and smooth edges, but the Z and X Wilson lines, E P and M P , will need to be thickened since the ground state will have fluctuations in which virtual pairs of e particles straddle P, and similarly with m particles and P. The resulting thickened operators will not be washed out by such quantum fluctuations in the ground state because virtual pairs never get too widely separated before recombining, so long as the system is in the topological phase. (They have an opportunity to become more and more widely separated from each other as the system approaches the quantum phase transition.) However, conventional wisdom holds that this degeneracy is only a feature of the ground states and the excited states are not degenerate. Thermally excited pairs of quasiparticles can wander far from each other since they are real, not virtual, excitations. In particular, the naive expectation is that that the qubit -meaning the boundary electric and magnetic charges -will have a decay rate Γ of order (L 1 + L 2 )e −u/T at non-zero temperature. Suppose, however, that u u 0 , where u 0 is built analogously to (9) from h x , h z and all other terms lumped into the . . . remainder. The interesting question is then whether the prethermal conservation law implied by the ADHH theorem results in the qubit living longer than this naive expectation, analogous to the edge modes in 1D.
Because all A v and B p operators mutually commute, their sum has integer eigenvalues. The ADHH theorem thus implies thatN TC ≡ v A v + p B p is conserved, up to exponentially small corrections. This integer is zero in the ground state, and otherwise is simply the number of bulk excitations, not just their parity. To see if the approximate conservation of the number of bulk excitations protects the qubit, we need to define X and Z operators. The bulk Wilson line operators described above will not work because, even if the number of bulk excitations is conserved, a bulk excitation can still cross a Wilson line and so flip its charge. However, we can define boundary operators that measure the rough-boundary electric charge and the smooth-boundary magnetic charge by simply pushing P to either rough boundary and P to either smooth boundary: where U is the unitary given by ADHH; HB is the set of vertical links that belong to the horizontal rough boundary at the top of the rectangle (this choice is arbitrary; the bottom would work equally well); and V B is the set of vertical links that belong to the vertical smooth boundary at the left side of the rectangle. The transformed edge Wilson-line operators in (29) commute with the Hamiltonian, up to O(e −cn * ) corrections, except at the corners. Away from the corners, D contains only terms commuting with Z, because a non-commuting term would necessarily create or annihilate an e particle in the bulk, thereby violating the conservation ofN TC . Similarly, terms in D that do not commute with X would create an m particle. However, at the corners, D can contain terms that cause the rough edge to absorb an e particle and the smooth edge to emit an m particle or vice versa. One example is σ z 4 σ z 3 σ x 1 (1+B p2345 )(1+A v123 ), where the dangling link is labeled by 1, v 123 is the corner vertex connected to links 1, 2, 3 and p 2345 is the corner plaquette that overlaps it and contains links 2, 3, 4, 5. An error-causing process associated with this term is depicted in Fig. 4 We thus arrive at the interesting result that the only possible violations of the long-lived conservation law occur at the corners. Prethermalization therefore suppresses the lowtemperature error rate from Γ ∝ (L 1 + L 2 )e −u/T to Γ ∝ e −cn * (L 1 + L 2 )e −(u/u0)u/T + e −u/T for some constant c. It would be interesting to see how this argument generalizes to weakly perturbed Levin-Wen models 63 .

B. Three dimensions
There are four-dimensional topological phases with no point-like excitations, and they protect quantum information at non-zero temperatures below the phase transition into the   4. (a) Prethermalization does not protect against qubit errors at the boundaries between "rough" and "smooth" edges of the toric code on a surface with boundary. Here, an e-particle (blue dot) is absorbed by the rough boundary (top edge) while an m-particle (red cross) is emitted by the neighboring smooth boundary (right edge), as described in the text. (b) When a qubit is encoded in a loop that carries "Cheshire charge" in 3D, errors associated with a magnetic flux loop encircling (sequence of dotted red lines, whose growth and subsequent shrinkage are indicated by the green arrows) the Cheshire loop are suppressed by the size of the Cheshire loop since the magnetic flux loops have a line tension at temperatures below the bulk phase transition temperature. There is no such protection against the emission or absorption of a point-like electric charge (blue circles), even below the phase transition temperature into the topological phase. However, prethermalization can suppress such processes exponentially.
topological phase, as noted in Sec. II. In three dimensions, we are halfway there, since magnetic excitations are loop-like; their line tension prevents them from causing errors. However, there are still point-like excitations that will cause errors at non-zero temperature. Prethermalization can suppress them. It was recently shown 64 that Abelian topological phases in 3 + 1 dimensions have loop excitations carrying Cheshire charge. The simplest example is the 3D toric code, which has a Hamiltonian of the same form as Eq. (28) generalized to the cubic lattice, with one modification discussed below. There are 6 links attached to each vertex, so A v is a product of 6 σ z i operators while B p is still the product of 4 σ x i operators around each plaquette. There are loop excitations on which e particles condense in a manner analogous to the "rough" boundaries considered above in the case of the 2D toric code. To analyze such loop excitations, we can modify the Hamiltonian so that the ground state has one at a specified loop K, as was done in Ref. 64. We label the loop K, of length L, in terms of vertices v k and links l k such that the vertices v k+1 and v k are connected by the link l k for all k = 1 . . . L with v K+1 = v 1 . We now modify the Hamiltonian along this loop according to: This transverse field commutes with the B p terms, so eigenstates of the Hamiltonian are eigenstates of σ x l k for every link l k+1 on the loop. Thus they resonate between a state with two e particles at either end of the link and a state without e particles at the ends of the link. Alternatively, we can insert such a loop excitation in a manner that emphasizes the similarity with the "rough" edge in 2D: we remove all v k , l k ∈ K from the lattice. The Hamiltonian is unchanged at all of the remaining vertices and the plaquette operator is modified to a product of 3 spins for the plaquettes that previously contained l k ∈ K. In this alternative construction, it is again clear that the states of the system resonate between having zero and two e particles at either end of each link on K. The spectrum is degenerate in the limit of a large loop, provided that there are at least two such loops in the system, since the total charge on a loop can be either 1 or e, subject to the constraint that the total topological charge of the system is fixed. In the simplest case, in which there are two such loops in the system, states are doubly degenerate and the degenerate subspace at each energy forms a qubit spanned by states with charge 1 or e on both loops. This charge is locally unobservable, hence it is "Cheshire charge", which explains why the two states are degenerate in the limit of a large loop. A straightforward generalization of the arguments applied in two dimensions in the section VI A show that Cheshirecharge-carrying loops cannot emit or absorb an e particle in the analogous prethermal regime in 3D. Hence, the Z operator acting on the qubit commutes with the prethermal Hamil-tonianD. A dressed version of the X operator, meanwhile, is expected to be conserved in the low-temperature phase T < T c below the phase transition at which long flux loops unbind and proliferate 65 . Thus, the qubit is partially protected by the dynamics of the low-temperature phase (as was already known 52 ) and partially protected by prethermalization. Unlike in the 2D case discussed above, our 3D topological qubit has error rate Γ < e −cn * e −(u/u0)u/T , where u 0 is the appropriate energy scale derived from the couplings inŶ . The corner error processes that caused trouble in 2D are not present here because an electric charge cannot become a magnetic charge since one is point-like and the other loop-like, unlike in 2D, where both are point-like and necessarily had the same energy in the prethermal limit. In other words, in the prethermal regime, the 3D toric code is, up to exponentially small corrections, a self-correcting non-zero temperature quantum memory; the only other known examples are 4D topological phases 52 . Thus, in this case, prethermalization buys us an extra dimension, which might be rather difficult to otherwise acquire.

VII. EXPERIMENTAL REALIZATIONS
Although the primary focus of this paper is the conceptual advance in the theory of topological phases of matter that results when the theory of prethermalization is brought to bear on it, it is important to note that this advance may have implications for near-term devices and experiments. For illustrative purposes, we discuss two examples: Majorana zero modes in semiconductor-superconductor devices and Ising spins in a trapped ion chain. The former example allows us to introduce an important point, which is that prethermalization is generically terminated by a coupling to a heat bath; when the coupling is weak, this occurs at very late times, and the prethermal regime extends until then. Since the electron-phonon interaction is suppressed by the ratio of the electron mass to the ion mass (it must vanish in the limit that the latter approaches infinity), this can be weak, thus giving an example of this scenario. Meanwhile, the ion chain has interactions that are long-ranged, so it is not immediately obvious that our analysis applies. However, the calculations that we present do, in fact, indicate that prethermalization occurs in this case, as well. Although this system does not have a topological phase, the edge spins are governed by similar dynamics. We present a quantitative discussion of the experimental requirements for observing edge spins protected by prethermalization.

A. Quantum Dot Chains and the Fate of Prethermalization in the Presence of Electron-Phonon Interactions.
Refs. 39 and 40 have proposed devices in which quantum dots (possibly defined by gates acting on MZM-supporting nanowires) are the basic building blocks for super-lattices that have an effective low-energy description as models of Majorana fermions hopping on a lattice. They therefore realize literal versions of the Hamiltonian of Eqs. 6 and 8. Moreover, Ref. 40 gives a procedure for tuning a system of quantum dots to the limit in which the coupling J is much larger than all the other coupling constants from Eq. 8. In the standard notation for the Kitaev chain 15 this translates into making t + ∆ much larger than µ and t − ∆. In other words, Ref. 40 gives a procedure for tuning into the prethermal regime of large J/J 0 . These devices are a bit futuristic, so we are not, in this section, proposing immediate experimental tests of prethermalization-protected MZMs. Rather, our purpose here is to illustrate the considerations that must enter into any effort to exploit prethermalization in a solid.
As we will discuss momentarily, prethermalization leads us to predict extremely small decay rates for the MZMs at the ends of such a chain. However, these extremely small decay rates will not be observed since the electron-phonon coupling will end prethermalization before these exponentially small effects do. In other words, the electron-phonon coupling can violate the conservation of N before the time t * . But if the electron-phonon contribution to thermalization is smaller than the electron-electron contribution over some range of temperatures, device geometries, and device parameters, then prethermalization will still play an important role in extending the lifetime of a topological qubit. We will show that the naive MZM decay rate due to electron-electron interactions, which is what would be seen outside the prethermal regime, is Γ non-pt el-el ∼ n 2 qp · 0.6 GHz. Meanwhile, the decay rate due to electron-phonon interactions is Γ el-ph < n qp · 10 MHz, which is almost certainly a gross overestimate since it doesn't take into account the screening of piezo-electric interactions by the superconductor or the effect of the device geometry, which can be designed to have a phonon band gap. Since the former is quadratic in n qp while the latter is linear, we con-clude that, even for this overestimate of phonon effects, the electron-electron interactions are the dominant decay channel for n qp > 0.01, and prethermalization suppresses this dominant decay channel. If the bulk quasiparticle density is thermal (which may or may not be a valid assumption, for the reasons discussed in Ref. 54; and references 1-7 of that paper), this translates into a range of temperatures (t + ∆)/T < 4.6. Again, this range could be greatly increased, depending on the effects of device geometry on the phonon spectrum. In the rest of this subsection, we explain these estimates in more detail.
The initial motivation for considering such a system was that it may be possible to tune parts of the system more reliably into and out of the topological superconducting phase. The potential drawback of such systems is that the energy gap of the coupled-MZM system is significantly smaller than the energy gap of a single nanowire. For instance, Ref. 40 finds t ≈ 9µV and ∆ ≈ 6µV. One might consequently fear that much lower temperatures would be necessary in order to protect the end MZMs of such a quantum-dot chain.
Indeed, suppose that the hopping parameter were twice as large, in which case the system would be outside the prethermal regime, with (t+∆)/2 = 7.5µV and (t−∆)/2 = 1.5µV, and assume a nearest-neighbor repulsion V = 3µV. Then, a rough naive estimate for the decay rate due to electronelectron interactions would be Γ non-pt is the bandwidth of excited quasiparticles (so its inverse is the density-of-states). This gives Γ non-pt el-el ∼ n 2 qp · 0.6GHz. Thus, the zero mode will decay in a few nanoseconds unless the quasiparticle density is very low, which is unlikely to be the case since the gap is relatively small.
However, the results of this paper show that the small gap may not doom the MZM: prethermalization will protect quantum information until a time that is nearly exponentially long in the ratio between t + ∆ and some combination of the other couplings, such as t − ∆, V , etc., according to Eq. 20. Moreover, the temperature dependence has a characteristic energy scale n * (t + ∆), rather than (t + ∆) itself, so relatively small (t + ∆) is not as detrimental as one might fear. As a result of prethermalization, the decay rate due to electron-electron interactions should, instead, be Γ pt el-el ∼ (n qp ) mmin · e −cn * · V 2 /(t − ∆), as in Eq. (19). In other words, prethermalization suppresses the decay rate due to electron-electron interactions by a factor Γ pt el-el /Γ non-pt el-el ∼ (n qp ) mmin−2 e −cn * . For the values of t, ∆ given above, (t + ∆)/(t − ∆) ≈ 5 and assuming that all other couplings are smaller than t − ∆, we find n * ≈ m min ≈ 5. If we replace the O(1) constant c by 1, then we can use e −n * ≈ 6 × 10 −3 . Thus, we have Γ pt el-el ∼ n 5 qp · 3.6MHz. If we assume that the system is at a temperature T = 50mK, and that the quasiparticle density is given by the equilibrium value, then n qp ≈ e −(t+∆)/T ≈ 0.05 while (n qp ) 5 ≈ e −∆eff/T = e −mmin(t+∆)/T ≈ 3 × 10 −7 . (We are assuming that, in the initial state, the quasiparticle density is equal to its equilibrium value but its subsequent evolution is impeded by prethermalization, especially its equilibration with the edge modes.) Thus, the enhancement of (t + ∆) to ∆ eff = m min (t + ∆) is the larger effect at these temperatures, assuming that the system equilibrates rapidly with respect to D i.e. is in the prethermal state. The resulting decay rate is Γ pt el-el ∼ n 5 qp ·3.6MHz ∼ 1Hz. However, if the system is out-ofequilibrium, the factor of e −cn * may be more important since it will protect the zero modes even if there are non-equilibrium excitations in the bulk which would render the equilibrium estimate (n qp ) mmin e −∆eff/T moot.
We must compare the prethermal decay rate to the phononassisted decay rate Γ el-ph . A bulk fermionic excitation, which has energy E = t+∆, can be absorbed by a zero mode and its energy can be emitted as a phonon of momentum q = E/v, where v is the speed of sound. The Hamiltonian governing the electron-phonon interaction is: is the wavefunction of the zero-energy fermionic level of a single dot coupled to a superconductor 40 ; u j (x) is the displacement in the j-direction of the ion whose equilibrium position is x; and . The electronphonon coupling has two parts, the deformation potential D and the piezoelectric coupling h 14 . The piezo-electric potential satisfies q i w ij (q) = λ iM λ (q) λ q ) j where λ are the phonon polarizations, λ q are the corresponding polarization unit vectors; and M λ (q) depend on the direction of q but do not its overall scale. Hence, the decay rate for an MZM due to the deformation potential electron-phonon coupling is Here, ρ is the density of the solid and, as before, n qp is the probability of a bulk fermionic excitation on dot 1. In going to the second line, we have bounded Q(q) ≡ d 3 x e iq·x |ψ 1 (x)| 2 by |Q(q)| 2 < 1. The first two factors in the second line are the matrix element for such a process; the third factor in the second line is the density of states for the phonon, which is ∝ q 2 ; and the final factor is the probability for a bulk quasiparticle excitation to be near enough to the MZM for absorption to occur. (We emphasize that the relevant gap for phonon-assisted decay of an MZM is E = t + ∆, not ∆ eff , which is the relevant scale for electronelectron interactions in the pre-thermal regime.) The reverse process, in which a bulk quasiparticle excitation is emitted and a phonon is absorbed, has the same amplitude at lowtemperature. Prethermalization will occur if Γ el-ph < Γ el-el . For InAs we take the following values 66 : D = 5.1 eV; the speed of longitudinal sound waves is v l ≈ 4.7 km/s; the density is ρ ≈ 5.67 g/cm 3 . We take t + ∆ ≈ 15µV estimated in Ref. 40, as in our discussion of the prethermal decay rate due to electron-electron interactions. This gives Γ DP el-ph < n qp · 300kHz. For an equilibrium distribution of excited quasiparticles at T = 50mK, this gives Γ DP el-ph < 15kHz. Turning now to the piezoelectric coupling, we first note that, in the presence of strong coupling to superconducting leads, this effect may be suppressed by screening. However, if we neglect this screening effect and compute, as an upper bound, the decay rate due to an unscreened piezoelectric coupling, we find: Here, we have made the approximation of ignoring the difference between the longitudinal and transverse sound velocities v l ≈ 4.7km/s and v t ≈ 3.3km/s and simply set them both to v ≡ 4.2km/s. We have also made the simplification of replacing M λ (q) by an upper bound M λ (q) < 1.
Using h 14 = 3.5 × 10 6 V/cm, given in Ref. 67, we find Γ PE el-ph ∼ n qp · 10MHz. For an equilibrium distribution of excited quasiparticles at T = 50mK, this gives Γ PE el-ph ∼ 500kHz. We note that this estimate does not take into account the effect of phonons in the superconductor, although we expect this to be a smaller contribution to the decay rate since the electronic wavefunction is concentrated primarily in the semiconductor; it has also not taken into account the effect of the device geometry on the phonon spectrum at the wavelengths of interest. In this regard, we note that it may be possible to pattern a material in order to engineer a phonon band gap at the wavelength 2πk −1 = hv/E ≈ 1µm, potentially strongly suppressing the effect of phonons. Thus, even at relatively high temperatures, 1µs is a conservative estimate of the potential lifetime of a MZM in a quantum-dot chain in the prethermal regime, but the lifetime may be as long as 1 ms.
We briefly note that much of our discussion of quantumdot chains in Sec. VII A also applies to the model of Ref. 68, which uses short topological nanowires to construct a two-dimensional model of Ising anyons on the honeycomb lattice 69 . The Ising anyons emerge as low-energy excitations of a superlattice of Coulomb-blockaded islands containing, each containing two nanowires. Our results do not apply to the physics of the nanowires themselves, but instead to the effective model of the low-energy degrees of freedom, which has a pre-thermal regime in the limit that the bulk Majorana fermion operators have a flat band. The existence of such a prethermal regime would facilitate universal topological quantum computation using the strategy of Ref. 68, since it ameliorates the drawback of a reduced energy gap.

B. Trapped Atomic Chains
Another possible experimental realization would be a trapped ion or neutral atom chain governed by a perturbed transverse field Ising model 70,71 . Here, coupling to an external heat bath would be less of a concern, although the effective system size might be smaller than in the quantum-dot case.
For example, in Ref. 72, the authors use chains of up to 22 171 Yb + ions in linear radiofrequency (Paul) traps, encoding effective two-state systems in their 2 S 1/2 hyperfine ground states. Long-range spin-spin interactions are generated using laser-mediated spin-phonon interactions. In particular, using the beatnote between two overlapped laser beams to drive stimulated Raman transitions, they generate the effective Hamiltonian where interaction is long-ranged and antiferromagnetic: with J I > 0. For nearest-neighbor interactions in Ising, ferromagnet and antiferromagnet are unitarily equivalent. Here the distinction is important, because the ferromagnet has a phase with long-ranged order for α < 2 and non-zero temperatures less than some critical temperature T c , 73 while the antiferromagnet does not have such an ordered phase for T > 0, like the nearest-neighbor model 74 . Consequently, for initial states that are near the top of the spectrum here, the end and bulk spin lifetimes will be infinite since these are low-energy state of the ferromagnetic Hamiltonian −H.
In the setup of Ref. 72, the experimentally realizable range of α is 0.5 < α < 2, while J I /2π ≤ 1kHz, achieved by changing the trap voltages and the detuning of the beatnote from resonance. The B term is generated by driving further resonant stimulated Raman transitions out of phase with the beatnote. It can range from negligible to a maximum of B/2π = 10kHz. So it should certainly be feasible to enter the prethermal regime B J I . The autocorrelators of individual spins σ z j (t)σ z j (0) may be measured up to times of order 100/J I . Thus, it should be possible to observe the prethermal protection of the edge spin through the survival of its autocorrelator. In contrast, the bulk spins decay over experimentally accessible timescales. The one caveat is that, although the edge spin will be longlived for any system temperature or any energy initial state, for very high-energy initial states near the top of the spectrum (or for negative temperatures), the protection comes from the long-ranged ferromagnetic order when α < 2, rather than prethermalization. This case is easily distinguishable because the bulk spins will also be long-lived in such initial states.
Simulations using exact diagonalization at infinite temperature on similar system sizes confirm this picture, at least for α 1.25, see Fig. 5. The main theoretical concern for prethermalization is the long-ranged nature of the interaction, because for the ADHH theorem to hold we require small J 0 /J. In fact the ADHH theorem has not yet been proven for power-law decay interactions. In our discussion above,N consists of just the nearest-neighbor interaction magnitude J I . For α = 2, the shortest-possible range in the experiments, the next-largest term is a quarter of the size, so we might be justified in putting it and longer-range terms inŶ . However, for smaller α we should include at least the next-nearest-neighbor term inN as well. This case, including the possibility of resonances, is discussed in detail in Section V C below. Crucially, for there to be any chance for the ADHH theorem to hold,N must have integer eigenvalues, which is of course impossible to tune exactly experimentally for more than one coupling. However, for the system sizes L ≤ 22 which are experimentally accessible, we do not expect this to be of practical issue in observing the prethermal protection of the edge spin in contrast to the bulk, and indeed the exact diagonalization results support this.
Of course, once α becomes small, the arguments above based on locality break down completely. The two ends of the chain come into contact, so that terms which flip the edge spin may be immediately added toD, and the edge spin will no longer be protected, regardless of the applicability of the ADHH theorem. It is also worth noting that the smaller α, the less localised at the edge the zero mode is, and so the greater overlap it has with the bulk spins. This means that there may be some small but observable part of the bulk spin correlator which survives to long times, albeit exponentially suppressed in magnitude compared to the edge.

VIII. INTEGRABLE SYSTEMS
Turning now to more formal considerations, we emphasize that the number n * following from the ADHH theorem only defines a lower bound on how long a zero mode will live. Indeed, in a free-fermion system, the recursive procedure defined in Ref. 30 converges, so that n * → ∞ as L → ∞, as we now show explicitly. In fact, one might go so far as to say that the ADHH recursive procedure gives us an alternate formulation of Onsager's solution of the Ising model.
The strong MZMs and corresponding unitary operators in the free-fermion Ising/Kitaev chain can be found directly, with no need for the full-blown ADHH procedure. The Hamiltonian here is H = −JN +Ŷ from (6), and (8) with all couplings other than h and J set to zero. The strong MZMs at the left and right edges respectively are then given by 15 where the normalization N is chosen to make Ψ 2 = (Ψ ) 2 = 1. Each of these commutes with the Hamiltonian, up to terms of order (h/J) L . Thus n * for the edge modes is L throughout the ordered phase h < J. The unitary operator U e relates the strong edge MZM at arbitrary h to that at h = 0 (the latter commuting withN . We have and we write where U n is Hermitian and of order (h/J) n . If we insist that Ψ commute with the HamiltonianĤ up to order (h/J) n , as per Section III, then we may calculate U n by inverting the equation: For the Ising/Kitaev chain we may calculate this boundary unitary transform exactly: where sin θ = h/J. Of course the U e constructed in this manner is not the full unitary transform shown to exist by the ADHH theorem, and the correspondingly transformed Hamiltonian would not display the emergent U(1) symmetry. Nevertheless this 'boundary' unitary transform is sufficient to show conservation of the edge mode in the pre-thermal regime. Namely, if we combine the boundary unitaries from both ends, then the transformed Hamiltonian will conserve the Z 2 bulk fermion parity (U e U e ) † L−1 j=1 γ B j γ A j+1 U e U e = (U e U e ) † σ z 1 σ z L U e U e up order n * = L. As discussed in Section IV A, this approximate conservation law is all that is needed.
Breaking the integrability by including non-zero h 2 and/or J 2 in (8), we have calculated U e up to eleventh order using computer-aided algebra program. The resulting edge zero modes agree with those calculated explicitly in Ref. 24. For example, when the only perturbing terms inŶ are h and h 2 we find to second order that: Furthermore, this method of calculating the edge zero modes is preferable and more efficient than the method outlined in Ref. 24, because the edge zero modes are automatically normalised at each order by construction. It is also illuminating to implement the ADHH procedure directly and explicitly on the full bulk Hamiltonian. We show that for free-fermion Ising/Kitaev case the unitary transformation can be computed exactly. For simplicity, here we take L → ∞ so the Hamiltonian of the Ising/Kitaev chain is We assume that J is sufficiently large compared to h so that we can apply ADHH witĥ We rewrite the Hamiltonian in the form so that the last term is the "error" termÊ, which does not commute withN IK and commutes withN IK . This can be checked explicitly; the key observation is that if there is a single kink at either j − 1/2 or j + 1/2, flipping the spin at site j hops the kink, while with no kink or two kinks adjacent to j, flipping the spin creates or annihilates two kinks respectively. The former process con-servesN IK , while the latter does not, so the former is allowed inD IK . Acting with the operator (1 − σ z j−1 σ z j+1 )/2 annihilates any configuration with zero or two kinks adjacent to j, and gives the identity if there is a single one. Thus we arrive at the expression given above forD 0 .
At each step of the ADHH recursive procedure, new terms are included inD and the coefficient of the error term is reduced by a power of h/J. At the very first recursive step, the error term can be canceled by using the relation . For simplicity, we impose periodic boundary conditions and define Then, after the first step in the recursive procedure, the transformed Hamiltonian has the form given on the right-hand side of the following equation: Thus all terms in the transformed Hamiltonian that do not commute withN IK are at least of order h 2 /J. The terms of this order can be split into pieces that do commute withN IK , which then compriseD 1 , and those that do not, which comprise another error term. This procedure can be repeated, so that an order (h/J) 2 term can be added to U to yield an error term of order h(h/J) 2 .
The ADHH theorem guarantees that this procedure can be repeated, at least up to order n * . Not surprisingly, the procedure can be implemented to all orders in a free-fermion system and so n * and hence t * become infinite as L → ∞ when h < J in the Ising/Kitaev chain. This method was, in essence, how Onsager originally computed the free energy of the twodimensional Ising model! His original calculations 75 are manipulations of fermion bilinears, just as is required to find the unitary transformation U. We find where the fermion bilinears G n are defined in (42).
The key to deriving (43) is to utilize the Onsager algebra, as derived in the original paper 75 . This is simply the algebra of fermion bilinears, and a quick glance at the paper shows that Onsager defines them in terms of the conventional Jordan-Wigner expressions (given the clarity of his expression, one wonders why he missed defining individual fermions). The generators of the algebra are then given by the G n in (42), and The Hamiltonian of the Ising/Kitaev chain H IK = −JN + Y IK is then is written in terms of these generators asN IK = −A 0 andŶ IK = hA 1 . Onsager carefully works out the effect of periodic and antiperiodic boundary conditions, but we simplify matters here by taking L → ∞. It is then easy to work out the Onsager algebra A key property of the Onsager algebra is that when G n is commuted with any linear combination of the A m , the result remains linear in the A m . Moreover, because by definition G −n = −G n , a series of quantities preserving the U (1) symmetry is given by A m + A −m . This suggests then building the unitary transformation out of the G n , a task made even easier by the fact that they commute among themselves. Such a construction is done straightforwardly by Fourier-transforming the A m asÃ where k takes values in the Brillouin zone −π < k ≤ π.
Commuting any G n withÃ(k) is then diagonal in k: Because G † n = −G n , then is indeed unitary when u n is real. Then Finding the appropriate coefficients u n for the ADHH transformation then becomes easy by rewriting the Hamiltonian using the inverse Fourier transformation as Note then if we take u n = −(h/J) n /(2n) as in (43), then (48) gives Thus, by making this choice, we find This transformed Hamiltonian commutes withN IK , sincẽ A(k) +Ã(−k) does, and the function in the integrand of (49) is even in k. The unitary transformation in (48) thus indeed does the job required of the ADHH theorem. It also shows that n * → ∞ for L → ∞, since (49) shows that the error goes to zero in this limit. In an integrable model such as the XY Z model, there is no free-fermion representation, so the zero mode can interact with bulk excitations. Nonetheless, it is possible to show by rewriting the exact strong zero mode of Ref. 20 as a matrixproduct operator that n * is indeed infinite here as well 76 , at least for the edge mode. That tantalizingly suggests that n * is infinite for all integrable systems.

IX. DISCUSSION
We showed that prethermalization can extend topological protection into regimes where it might have been expected to fail. Perhaps the simplest experimental realization of the prethermal protection of edge modes, albeit in a system without topological order, is in a trapped ion or neutral atom chain governed by a perturbed transverse-field Ising model. In a solid state system, prethermalization is ultimately interrupted by the thermalization driven by the electron-phonon interaction. However, over intermediate time scales a chain of quantum dots tuned to a Kitaev-type Hamiltonian may exhibit prethermal strong zero modes that survive until the late time at which the electron-phonon interaction causes thermalization. We emphasize that prethermalization can occur in any dimension, and the long-lived zero modes can, as a result, occur in two or three dimensions. Our work is therefore relevant to the proposal of Ref. 68 for a universal topological quantum computer.
In a more theoretical direction, we note that the ADHH theorem applies to much more general systems than those with topological order. It provides a precise and rigorous approach to finding an approximately conserved charge. Integrable models are so because of the presence of non-trivial conserved charges, and the resemblance of the Onsager analysis of the Ising model to the results derived here hints at a more general connection between the ADHH procedure and integrability. For example, the U (1) quantum number conservation implied by the theorem is strongly reminiscent of the conservation of quasiparticle number in integrable field theories. Moreover, the general connection between integrability breaking and prethermalization suggests that the extremely long lifetime of the prethermal strong zero modes is a consequence of the structure of integrability still affecting the perturbed system, at least in one dimension. In higher dimensions, the role of integrability seemingly is being played by the requirement of integer eigenvalues ofN , but why this is so is somewhat mysterious. Clearly more research in both more formal and experimental directions is warranted.
Here we prove the claim we made about the energy bandwidth of theM = m sector under the Hamiltonian JM +D. The starting point is the observation that a consequence of Ref. 30 is a bound on the "local norm" ofD. Namely, where we define J 0 ≡ 1 κ 2 0 Ŷ κ0 and fix κ 0 such that J 0 < ∞. The dimensionless constant C is proportional to κ 2 0 . We define the local norm H κ of a Hamiltonian H = Γ H Γ (where the Γ are subsets of the lattice Λ, and H Γ is an operator supported on Γ), as We derived Eq. (B1) by applying the bound in the unnumbered equation just after Eqn. (4.10) of Ref. 30, invoked the fact that · 0 ≤ · κ for κ > 0, and then summed over n.
We will state our results in some degree of generality. Rather than considering any particular form ofM , we just assume that it can be written aŝ where the P x 's are commuting projectors, and X is some set to index the projectors. We assume that each projector P x is supported on a set B x ⊆ Λ (where Λ is the set of all sites in the lattice). We can label eigenstates ofM by their simultaneous eigenvalue under the projectors P x , which we refer to as a "syndrome". More precisely, a syndrome is a subset S ⊆ X whose corresponding projectors are not satisfied. Moreover, for each syndrome s we can construct a corresponding projector P s ≡ x∈S P x x / ∈s (1 − P x ) . We let P denote the projector corresponding to the trivial syndrome, P ≡ P ∅ (i.e. the projector onto the ground state subspace ofN ).
We also introduce the notion of a partial syndrome (Y, s Y ) where Y ⊆ X and s Y ⊆ Y. We say that (Y, s Y ) is the restriction of a syndrome s if s Y = Y ∩ s. A partial syndrome specifies the eigenvalue of only those projectors indexed by x ∈ Y. The projector corresponding to a partial syndrome is We now state the following condition under which we will prove our results.
Local-TQO. There exists a constant K such that, for any syndrome s, there exists a region R s , of size at most K|s|, such that B x ⊆ R s for all x ∈ s, and furthermore for any region Γ with Γ ∩ R s = ∅, we have where c(V Γ ) = Tr(P V Γ P )/Tr(P ), and A = X \ s, which implies that the restriction s A = s ∩ A = ∅.
Roughly, this is saying that there is a topologically protected degeneracy in the ground-state ofM (by applying it to the trivial syndrome), and moreover that an excited state with eigenvalue m is localized to a region of size at most Km, and looks like the ground state elsewhere. The condition "Local-TQO" is closely related to the conditions under which stability of the topological order in the ground state subspace ofM was proven in Refs. 77-79 (though the result we want to prove here is somewhat different). We observe that "Local-TQO" is indeed satisfied for theN of the Kitaev chain [Eq. (6)], with K = 2.
Now we can prove the following theorem: Theorem 1. If the condition "Local-TQO" is satisfied, then for any Hamiltonian V = Γ V Γ that commutes withM , the spectrum of JM + V in the eigenspace ofM with eigenvalue m lies within the interval c(V ) + m(J − K V 0 ) , c(V ) + m(J + K V 0 ) .

(B6)
Proof. Consider an operator V Γ supported on a set Γ ⊆ Λ. We want to consider the circumstances under which P s V Γ P t − δ s,t c(V Γ )P s can be nonzero for two syndromes s, t ∈ S such that |s| = |t|. Let us partition the set X into X Γ and X c Γ , where X Γ = {x ∈ X : B x ∩ Γ = ∅}. We consider the following cases: • s = t and s ∩ X c Γ = t ∩ X c Γ .
Without loss of generality, we can say that there exists x ∈ s ∩ X c Γ such that x / ∈ t. Then we note that P s P x = 1 and P x P t = 0. Furthermore, x ∈ X c Γ implies that [P x , V Γ ] = 0. Hence, we can write P s V Γ P t = P s P x V Γ P t = P s V Γ P x P t = 0.
• s = t, s ∩ X c Γ = t ∩ X c Γ and R s ∩ Γ = ∅. R s ∩ Γ = ∅ implies that for all x ∈ s, B x ∩ Γ = ∅ (since B x ⊆ R s ). This implies that s ⊆ X c Γ , and hence that s = s ∩ X c Γ = t ∩ X c Γ . Since |s| = |t| this implies that s = t which contradicts our assumption.
We decompose s into partial syndromes (A, s A ) and (B, s B ) where A = s and B = X \ A. Then we can write P s = Q A,s A Q B,s B . We observe that R s ∩ Γ = ∅ ensures that Q B,s B commutes with V Γ . Hence, we find that by Eq. (B5).
In conclusion, we find that P s V Γ P t − δ s,t c(V Γ )P s = 0 except when R s ∩ Γ = ∅ and R t ∩ Γ = ∅. Now consider a Hamiltonian V = Γ V Γ . Let P m = s∈S:|s|=E P s . (That is, P m is the projector onto the subspace with eigenvalue m underM ). Furthermore, let P Γ m = s∈S,|s|=m,Rs∩Γ =∅ P m . Then we see that The theorem immediately follows.