Resilience to time-correlated noise in quantum computation

Fault-tolerant quantum computation techniques rely on weakly correlated noise. Here I show that it is enough to assume weak spatial correlations: time correlations can take any form. In particular, single-shot error correction techniques exhibit a noise threshold for quantum memories under spatially local stochastic noise.


I. INTRODUCTION
Building a quantum computer amounts to attaining exquisite control of a large quantum system. In particular, it requires quantum systems that (i) are isolated from the rest of the universe and (ii) can be arbitrarily manipulated, all with any desired accuracy. A priori, quantum systems naturally displaying this unlikely combination of characteristics might exist [1]. Unfortunately, for the time being no such systems are known [2]. Instead, one has to deal with decoherent quantum systems and inaccurate control, i.e. with noise.
The great success of the pioneering works in fault-tolerant quantum computation [3-6] was to show that a small amount of noise does not represent an insurmountable difficulty: as long as noise is weak and weakly correlated in space and time, quantum computations of arbitrary accuracy and complexity can be carried out. The purpose of this paper is to drop one of the conditions imposed on noise, by allowing arbitrary correlations in time. The motivation is to extend the class of physical systems potentially useful for quantum computation.

A. Fabrication faults
Suppose that a quantum computer is built out of many interconnected pieces or 'nodes', each node holding a low-dimensional and noisy quantum system, each connection allowing the controlled (but noisy) interaction of the corresponding quantum systems. The system is designed to implement some fault-tolerant computation scheme, so that when all the nodes (and connections) work as expected, a certain degree of noise is tolerated.
This work originally grew out of an interest in the following question: what will happen if a fraction of the nodes fail permanently? Are there fault-tolerant schemes resilient to such 'fabrication faults'? The answer could, in principle, depend on important details, such as whether the fabrication faults are known or not, or how flexible the operation of the system is. However, the study of general time-correlated noise automatically encompasses all such possible situations: if a system can deal with such correlations, then it can also deal with unknown fabrication faults, with no need for alternative operations.
In fact, one can go far beyond fixed fabrication faults, and imagine faults that fluctuate slowly over time and introduce significant time-correlated noise. There is no guarantee that a conventional approach to fault-tolerant quantum computation could function within such conditions. Thus, demonstrating fault-tolerant techniques that are compatible with time-correlated noise opens the door to the use of physical systems exhibiting such problematic behavior.

B. Topological codes
Topological quantum error correction [7] is a popular approach to fault tolerance that emphasizes the locality of interactions in the above setting: nodes form a lattice of a certain spatial dimensionality, with connections between nearby nodes only. For practical reasons, it is desirable that the spatial dimensionality of the lattice is as low as possible, which puts the focus on 2D topological codes such as toric codes [5] and color codes [8].
It is not known whether 2D stabilizer codes are generally resilient to fabrication faults, even when the locations of the faults are known and it is possible to change the local operation of the physical qubits, but the problem is already under study [9]. On the other hand, it will be shown below that there exist 3D topological schemes, namely 3D gauge color codes [10], capable of dealing not only with (dilute enough) unknown fabrication faults, but also with noise exhibiting arbitrary correlations in time. It is possible to argue, see appendix A, that 2D codes are not resilient to noise with such correlations, even if they have a Markovian origin.

C. Single-shot error correction
In a fault-tolerant quantum computer information is protected by means of redundancy: instead of using all available degrees of freedom for storing information, some (or most) of them are used to absorb the damage inflicted by noise. This cannot work indefinitely, because the effect of noise piles up over time. Therefore a process known as error correction must be performed repeatedly to flush away the errors inflicted on the system, making room to subsequently absorb more errors.
Error correction consists of two stages. The first stage identifies the errors most likely suffered by the system, and the second one corrects them. It is critical that errors are identified correctly, because otherwise the correction step might end up being counterproductive. In particular, in the first stage a number of operators are measured, providing an 'error syndrome'. In the conventional approach, to guarantee that the error syndrome is not too noisy these measurements are repeated a number of times. This is, in particular, the approach taken for 2D topological codes.
In the presence of time-correlated noise, repeating measurements ceases to be viable and alternative approaches are needed. Single-shot error correction is such an alternative [11]. It relies on an unusual feature of some error-correcting schemes, for which errors occurring at the measuring stage that are localized in space give rise to errors induced by the correction stage that are similarly localized in space. This means that error correction can be performed directly in a single step (the 'single shot'), with no need to repeat measurements to avoid errors. In fact, in the case of 3D gauge color codes [10] much more is true, since single-shot error correction enables the performance of arbitrary elementary computational operations in constant time, irrespective of the system size. This is in stark contrast to most fault-tolerant settings, where some operations incur a time overhead proportional to the number of errors that the system can correct.
The purpose of this work is to show that single-shot error correction is compatible with time-correlated noise. In particular, this will be demonstrated for the error-correction schemes considered in [11]. More general approaches to single-shot error correction are conceivable, as discussed in appendix B.
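A toy calculation illustrates why repeating measurements stops helping under time-correlated noise. The sketch below is an illustrative assumption and not one of the schemes of [11]: a single binary measurement is repeated an odd number of times and decoded by majority vote. The function names and the two fault models (independent flips with probability q, versus a readout that is permanently broken with probability q) are hypothetical.

```python
from math import comb


def majority_failure_independent(q, rounds):
    """Failure probability of a majority vote over `rounds` readouts
    (rounds assumed odd) when each readout flips independently with
    probability q."""
    need = rounds // 2 + 1  # number of flipped readouts that fools the vote
    return sum(
        comb(rounds, k) * q**k * (1 - q) ** (rounds - k)
        for k in range(need, rounds + 1)
    )


def majority_failure_persistent(q, rounds):
    """Failure probability when the readout is broken in every round with
    probability q (a fully time-correlated fault). The vote is wrong exactly
    when the readout is broken, so the number of rounds is irrelevant."""
    return q


independent_3 = majority_failure_independent(0.1, 3)
independent_5 = majority_failure_independent(0.1, 5)
persistent_5 = majority_failure_persistent(0.1, 5)
```

With independent flips the failure probability shrinks as rounds are added, whereas with the persistent fault it stays at q regardless of the number of rounds.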

D. Spatially local stochastic noise
For concreteness, consider a quantum computation performed on a number of qubits. A computation is divided into time steps. Ideally, at each step a number of separate processes happen simultaneously, each involving only a bounded number of qubits. Each of these processes is called a location. In modeling fault-tolerant quantum computation, the resulting ideal process of computation is deformed to introduce noise. Although more sophisticated approaches exist [12,13], here noise will be modeled stochastically [4,6], as explained next. This approach is ideally suited to explore the effects of strongly time-correlated noise.
In the stochastic noise model it is assumed that a given set of locations fails with some probability, while the rest behave ideally. In particular, at a given time step the qubits that belong to faulty locations undergo a possibly unknown process, possibly involving the environment.
An essential assumption in fault-tolerant quantum computation is that noise is both weak and local: weakly correlated in space and time. In the stochastic model the locality of noise is implemented as follows: the probability that any given set of n locations fails, irrespective of what happens at other locations, is bounded by $p^n$ for some error rate p.
If nothing is known about time correlations, the only meaningful constraints on the distribution of faulty locations have to involve simultaneous locations, i.e. belonging to the same time step. Here a spatially local stochastic noise model will be adopted: the probability that any given set of n simultaneous locations fails is bounded by $p^n$ for some error rate p.
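The locality condition can be checked by brute force for a small, explicitly given distribution of faults within a single time step. The following is a minimal sketch under stated assumptions: the representation of a distribution as a dictionary from sets of failed locations to probabilities, and all function names, are hypothetical choices made for illustration.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = list(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def is_spatially_local(dist, locations, p, tol=1e-12):
    """Check that Pr[every location in S fails] <= p**|S| for all sets S.

    `dist` maps a frozenset of failed locations to its probability."""
    for s in subsets(locations):
        marginal = sum(prob for failed, prob in dist.items() if s <= failed)
        if marginal > p ** len(s) + tol:
            return False
    return True


def independent_noise(locations, q):
    """Each location fails independently with probability q."""
    locations = list(locations)
    return {
        failed: q ** len(failed) * (1 - q) ** (len(locations) - len(failed))
        for failed in subsets(locations)
    }


locations = range(4)
iid = independent_noise(locations, 0.1)
# Hypothetical "burst" model: all four locations fail together with
# probability 0.1, otherwise none fails.
burst = {frozenset(locations): 0.1, frozenset(): 0.9}
```

Independent noise with rate 0.1 satisfies the bound with p = 0.1. The burst model does not, although it does satisfy the bound for a larger rate such as p = 0.6: spatial correlations are allowed, but they cost in the effective error rate.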
Fabrication faults can also be naturally modeled in the stochastic noise model. A given set of fabrication faults can be related to a set of qubits: For every time step, all locations involving any of those qubits fail. If the fabrication faults are fixed, this does not yield a spatially local distribution of faults. However, the fabrication faults will typically be randomly distributed. In particular, if the fabrication faults follow a certain spatial distribution (such as a local one), the faulty locations will follow a closely related spatial distribution. This is how the study of fabrication faults fits into the study of stochastic noise with arbitrary time correlations. It suffices to focus on the latter, which is also much more general.
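The way fabrication faults fit the model can be made concrete with a small exact enumeration. This is a sketch under illustrative assumptions: each qubit is faulty independently with probability q, a faulty qubit fails at every time step, and a fault path is stored as a tuple with one set of faulty qubits per step. All names are hypothetical.

```python
from itertools import product


def fabrication_fault_paths(n_qubits, n_steps, q):
    """Distribution over fault paths for randomly placed, permanent faults.

    A path is a tuple whose entry t is the set of faulty qubits at step t;
    here every entry of a path is the same set."""
    paths = {}
    for pattern in product([0, 1], repeat=n_qubits):
        faulty = frozenset(i for i, bit in enumerate(pattern) if bit)
        prob = q ** len(faulty) * (1 - q) ** (n_qubits - len(faulty))
        paths[(faulty,) * n_steps] = prob
    return paths


def prob_all_fail(paths, events):
    """Probability that every (step, qubit) pair in `events` is faulty."""
    events = list(events)
    return sum(
        prob
        for path, prob in paths.items()
        if all(qubit in path[step] for step, qubit in events)
    )


paths = fabrication_fault_paths(n_qubits=3, n_steps=2, q=0.1)
same_step = prob_all_fail(paths, [(0, 0), (0, 1)])   # two qubits, one step
same_qubit = prob_all_fail(paths, [(0, 0), (1, 0)])  # one qubit, two steps
```

Two simultaneous faults occur with probability q squared, as spatial locality demands, while the same qubit failing at two different steps occurs with probability q: the noise is local in space and maximally correlated in time.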

E. Results
The focus of this paper is to show that the single-shot error-correction techniques introduced in [11] are compatible with spatially local stochastic noise. In particular, the corresponding quantum memories exhibit a noise threshold under local stochastic noise, in the following sense. In a quantum memory error correction is repeated time after time [7], with no actual computation. Any fault-tolerant scheme always involves a family of schemes requiring a varying number of qubits: if more accuracy is desired, a larger set of qubits is needed. Quantum memories have a threshold $p_0$ for a given family of schemes and a noise model with parameter p when, for any $p < p_0$, quantum information can be stored with arbitrary accuracy for arbitrarily long times, by choosing the right scheme in the family.
Even though only quantum memories will be discussed, there seems to be no difficulty in extending the results to general quantum computations, at least in the case of fault-tolerant schemes that use 3D gauge color codes [10]. The reason is that in the case of 3D gauge color codes the computation can be performed by alternating operations involving a single time step and single-shot error correction [11]. Moreover, the computations can be performed in a network of qubits with 3 spatial dimensions [14], so that, in principle, such a scheme could be realized with spatially local interactions. For a numerical study of error thresholds in gauge color codes, see [15].

II. ERROR CORRECTION
This section introduces some basic notions of quantum error correction and fault-tolerant quantum computation.

A. Operator quantum error correction
The purpose of quantum error-correction techniques [16] is to send quantum information as faithfully as possible through a noisy quantum channel E. This is achieved by choosing an encoding channel C and a decoding channel D such that the composed channel

$$D\,E\,C \tag{1}$$
has as little noise as possible. There is a price to be paid: the composed channel operates on a system with smaller dimensionality.
In operator quantum error correction [17] the Hilbert space H of the quantum system where the channel E operates is structured as follows:
$$H = H_0 \oplus H_0^{\perp}, \qquad H_0 = H_{\mathrm{logical}} \otimes H_{\mathrm{gauge}}. \tag{2}$$
The subspace $H_0$ is the code subspace. It is divided into two subsystems, logical and gauge. Quantum information is encoded in the logical subsystem, whereas the gauge subsystem is not to be protected and will typically be in a random state. Although gauge degrees of freedom are not particularly interesting at the level of purely ideal error correction, they are surprisingly useful in the context of fault-tolerant quantum computation, both within topological techniques [10,11,14,18-21] and elsewhere [22-25].
Given the above decomposition of H, the encoding channel C maps states on $H_{\mathrm{logical}}$ to states on H, and D does the converse. The encoding channel C maps a logical state ρ to the encoded state
$$C(\rho) = \rho \otimes \tau \tag{3}$$
on the code subspace $H_0$, with τ some (fixed) state of the gauge degrees of freedom. The decoding channel D takes the form
$$D = \mathrm{Tr}_{\mathrm{gauge}} \circ R, \tag{4}$$
where the error-recovery channel R, which operates on the system H and yields states with support on $H_0$, aims to undo the errors introduced by the channel E.
The linearity of error correction is rather useful. For a fixed logical subsystem (2) and a fixed decoding channel (4), there exists a subspace V of operators on H that characterizes correctable channels E, i.e. those for which DEC is the identity channel for any encoding C as in (3). In particular, a channel E is correctable if and only if its Kraus operators belong to V. Another useful fact is that not all encodings need to be checked: if DEC is the identity channel for the encoding C with a completely random gauge state, then the same is true for any other encoding [17].

B. Quantum memories
The purpose of fault-tolerant quantum computation techniques [16] is to perform quantum computations of arbitrary size and precision by using faulty devices. At the core of these techniques is error correction. In fact, during the whole computation the logical information remains encoded in a subsystem as in (2), in the following sense: if the ideal decoding operation D were to be applied at any step of the computation, the result would be very close to the intended logical state at that computational step. To stay within such a regime the errors that accumulate during the computation have to be flushed away regularly. This can be achieved with a fault-tolerant recovery operation, a noisy analogue of the ideal recovery operation R.
The purpose of a quantum memory is to preserve some (logical) quantum state, rather than to compute with it. A possible approach to this problem is to use a subset of the techniques of fault-tolerant quantum computation. In particular, it suffices to apply as regularly as needed the fault-tolerant recovery operation. For any given error-correcting code this only provides a limited memory lifetime, but typically codes come in families and the lifetime can be increased at the expense of using a larger system H.
The above encoding and decoding channels C, D come in handy in analyzing fault-tolerant protocols, and this is also true for quantum memories. Assume that the effect of the passage of time t in the quantum memory can be modeled as a quantum channel $E_t$. The usual approach to analyze the quality of the memory is to consider the composed channel
$$D\,E_t\,C. \tag{5}$$
In other words, one assumes perfect encoding and decoding in order to analyze the effective noise on the logical degrees of freedom.

III. STOCHASTIC NOISE
This section introduces a formalism that makes it possible to work comfortably with time-correlated stochastic noise.

A. Preamble
A quantum circuit on a set of qubits can be divided into time steps, each of which is itself further composed of locations: each location involves only a few qubits, which are either initialized, transformed or measured, with no two simultaneous locations involving the same qubits. The circuit represents a quantum channel, which is obtained by composing the channels represented by each time step, which in turn are obtained by tensoring the channels represented by the corresponding locations.
In a stochastic noise model any given quantum circuit is modeled as a quantum channel of the form
$$\sum_i p_i E_i, \tag{6}$$
where each $E_i$ is a channel corresponding to a certain set of faults in the circuit, or fault path, and $p_i$ is the probability assigned to that particular path. Each channel $E_i$ is a composition of channels $E_{i,t}$ representing the t-th time step. $E_{i,t}$ behaves as expected on the locations that are not faulty, and the rest of the qubits undergo some other process that might be unknown and path dependent (footnote 1). The probabilities $p_i$ might not be precisely known either: instead, they are assumed to satisfy some properties. Because of these uncertainties, one usually deals with a set of channels of the form (6), rather than a single channel, and has to extract properties that are common to all the channels in the set.
When studying the behavior of a large circuit it is useful to be able to deal with its parts separately. This section develops a formalism to do precisely this, in particular in the case of arbitrary and unknown time correlations on the fault probabilities.

B. Stochastic channels
A stochastic channel $(p_i, E_i)$ consists of a probability distribution $p_i$ over a finite collection of channels $E_i$, all of which share a given input quantum system and a given output quantum system. Given such a distribution there exists a quantum channel (6) obtained by applying $E_i$ with probability $p_i$. But given a quantum channel, in general there will be different distributions that can produce it in this way. Thus it is important not to confuse the stochastic channel $(p_i, E_i)$ with the corresponding channel (6). Besides, the stochastic channel can be identified with the expression (6) when the latter is regarded as a formal sum, and this is sometimes useful. It is convenient to identify any channel E with the stochastic channel where E is assigned probability 1. When considering distributions $p_{ij}$ over several variables (or similar objects), traces will be denoted with a compact notation:
A stochastic class A is a set of stochastic channels with shared input and output quantum systems. Their interest will become clear in the next section, which discusses noise models as maps taking quantum circuits to stochastic classes. It is convenient to identify any stochastic channel E with the singleton class {E}.
Footnote 1: The process on the qubits at faulty locations can incorporate an environment, which should vary in size over time: the input and output systems of $E_i$ are the same as for the noiseless circuit.

C. Correlated composition
Suppose that a given circuit is split in time slices $t = 1, \dots, n$, and for each time t there is a number of fault configurations that can happen. For a given fault path, the circuit maps to a channel where is the channel characterizing the time slice t given that the fault i happens. The effect of the circuit is characterized by some stochastic class A, with elements of the form Eventually the goal is to deal with situations where the only available information concerns the probabilities for simultaneous faults to happen. In the case of the above stochastic class A this information can be encoded in stochastic classes $A^{(t)}$, one per time slice t. Namely the elements of $A^{(t)}$ are stochastic channels of the form for any given E as in (10). If only the stochastic classes $A^{(t)}$ are known, in general A cannot be reconstructed. This lack of knowledge is desirable if we are only interested in properties arising from the $A^{(t)}$. In such a situation, it is convenient to consider the most general stochastic class $A'$ that would yield such $A^{(t)}$, i.e. the unique stochastic class $A'$ such that $A \subseteq A'$ for any A as above. This stochastic class $A'$ is denoted In other words, given the $A^{(t)}$, the expression (12) is defined as the set of stochastic channels E that can be put in the form (10) with (11) an element of $A^{(t)}$ for each t.
A priori, this definition introduces an n-ary operation on the A (t) with no further structure. As proven in appendix C 1, the corresponding binary operation ⋄ is associative and when iterated yields the n-ary operation. The binary operation ⋄ will be called correlated composition.
By contrast, consider the case of noise completely uncorrelated in time. Instead of (12), one ought to consider defined as the set of stochastic channels E that can be put in the form Notice that using stochastic channels is overkill in this case: if stochastic channels are regarded as ordinary channels, the composition (13) is just the set of all channels obtained by composing channels in the sets $A^{(t)}$.
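The difference between the two compositions can be seen on the fault probabilities alone. The sketch below assumes two time slices, each with the same per-slice distribution over a hypothetical pair of labels ("ok", "fault"), and represents a distribution over fault paths as a dictionary keyed by pairs of labels. Both joint distributions have the prescribed per-slice marginals, so both are admissible under correlated composition, but only the product distribution corresponds to noise uncorrelated in time.

```python
def marginals(joint):
    """Per-slice distributions obtained by tracing out the other slice."""
    first, second = {}, {}
    for (i, j), prob in joint.items():
        first[i] = first.get(i, 0.0) + prob
        second[j] = second.get(j, 0.0) + prob
    return first, second


def product_joint(first, second):
    """Joint distribution over fault paths for time-uncorrelated noise."""
    return {
        (i, j): pi * pj
        for i, pi in first.items()
        for j, pj in second.items()
    }


def close(d1, d2, tol=1e-12):
    """Compare two distributions entry by entry."""
    return all(
        abs(d1.get(k, 0.0) - d2.get(k, 0.0)) <= tol for k in set(d1) | set(d2)
    )


slice_dist = {"ok": 0.9, "fault": 0.1}
independent = product_joint(slice_dist, slice_dist)
# Fully correlated alternative: the second slice always repeats the first.
correlated = {("ok", "ok"): 0.9, ("fault", "fault"): 0.1}
```

Both joints reproduce `slice_dist` at each slice, yet a fault in both slices has probability 0.01 in the independent case and 0.1 in the correlated one.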

D. Spatially local noise
In section II the quantum system H used to encode quantum information had an entirely abstract nature. However, in most cases H will be composed of a number of physical subsystems, typically qubits. In fact, that error correction is possible at all depends on the noise having some sort of structure, and the most common assumption is that noise is spatially local: errors affecting a large number of subsystems are unlikely. Let supp E denote the set of qubits in which the channel E does not act trivially 2 . In terms of stochastic classes, spatially local noise can be modeled as follows.
How much can time correlations disrupt the locality of noise? A quantitative answer is provided in appendix F: As a comparison, for uncorrelated spatially local noise Although time-correlated spatially local noise behaves much worse than time-uncorrelated spatially local noise, in both cases locality is preserved when composing several time slices together.

E. Approximate behavior
Since stochastic channels are probability distributions over channels, the distance between them can be defined as the statistical distance. Namely, it is always possible to write two stochastic channels A and B in the form and their distance is In some cases this distance might be huge compared to, say, the diamond distance of the corresponding channels. However, since the aim is to deal with noise of unknown form, there is hardly any gain in considering a more sophisticated distance that accounts for the proximity of channels. Moreover, as in any approach to error correction that models noise stochastically, eventually the goal is to bound the probability that a logical error might occur, and the statistical distance does the job.
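As an illustration, the statistical distance can be computed once both stochastic channels are written as distributions over a shared set of channel labels. This is a minimal sketch: the normalization with a factor 1/2 (total variation distance) is an assumption, and the labels and function name are hypothetical.

```python
def statistical_distance(a, b):
    """Statistical distance between two distributions over channel labels.

    ASSUMPTION: the total variation convention (1/2) * sum_i |p_i - q_i|
    is used; a label missing from a distribution has probability zero."""
    labels = set(a) | set(b)
    return 0.5 * sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in labels)


noise_a = {"I": 0.9, "X": 0.1}
noise_b = {"I": 0.8, "X": 0.1, "Z": 0.1}
d_ab = statistical_distance(noise_a, noise_b)

# Two channels that are physically almost identical but carry different
# labels are maximally distant as stochastic channels.
d_labels = statistical_distance({"identity": 1.0}, {"tiny_rotation": 1.0})
```

The second comparison shows how the statistical distance can be large even when the corresponding channels are close in a distance that accounts for the proximity of channels.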

Definition 2 A class of stochastic channels
The following basic properties are proven in appendix C 2: Notice that the wedge ∧ represents a logical 'and'.

F. Error rate
As in section II, consider a pair C, D of encoding and decoding operations defined for some quantum system H. The error rate of a stochastic channel E that operates on H is defined to be To clarify the meaning of this figure of merit, let E = (p i , E i ). It can be regarded as a conventional channel satisfying, when p = fail (E), where E ′ is some channel. The error rate bounds the probability that logical information is damaged.
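A classical toy example may help fix ideas. The sketch below assumes that the error rate is the total probability of the fault configurations $E_i$ for which $D E_i C$ is not the identity, and uses a stand-in that is not taken from the text: an n-bit repetition code with majority decoding under independent bit flips of probability q. All names are hypothetical.

```python
from itertools import product


def majority_decode(bits):
    """Decode a repetition-code word to the majority bit."""
    return int(sum(bits) > len(bits) / 2)


def error_rate(q, n=3):
    """Total probability of the flip patterns that decoding fails to undo.

    ASSUMPTION: fail(E) is the sum of p_i over the patterns i that damage
    the logical bit. The encoded bit is 0, so the received word equals the
    flip pattern itself; the case of an encoded 1 is symmetric."""
    total = 0.0
    for flips in product([0, 1], repeat=n):
        prob = 1.0
        for flipped in flips:
            prob *= q if flipped else (1 - q)
        if majority_decode(flips) != 0:
            total += prob
    return total


rate = error_rate(0.1)
```

For q = 0.1 and three bits this gives 3 q^2 (1 - q) + q^3 = 0.028: with probability 1 - 0.028 the logical bit is untouched, and otherwise some unspecified process acts on it.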
The error rate of a stochastic class A is the worst possible error rate for an element of the class,
$$\mathrm{fail}(A) := \sup_{E \in A} \mathrm{fail}(E). \tag{28}$$
Observe that which using (23) gives

G. Modeling quantum memories
The quantum memory model studied here consists of two processes that alternate in time: noise accumulation and single-shot error recovery. Each process is described by a parametrized stochastic class: L λ represents the accumulation of noise in between error-correction steps, and R η represents the noisy error recovery steps. The stochastic classes L λ and R η will be described in section V. For the time being it suffices to know that the parameters λ and η have a similar meaning: they indicate the amount of (spatially local) noise afflicting each process, with the value zero indicating a noiseless process.
The figure of merit is the error rate where n is the number of iterations of the error accumulation and recovery steps. The goal is to show that there exist thresholds $\lambda_0, \eta_0 > 0$ such that for any
$$\lambda < \lambda_0, \qquad \eta < \eta_0, \tag{32}$$
the error rate (31) can be efficiently made arbitrarily small for any value of n, so that the quantum memory lifetime can be as long as desired.

H. Approach
In general, the strategy to bound the error rate (31) will involve an additional parametrized stochastic class $N_\tau$. Its purpose is to characterize the residual noise after error recovery, with $\tau \ge 0$ the amount of noise. In [11] the parameter τ is referred to as a 'temperature': error recovery can be regarded as cooling down the system by removing the entropy introduced by errors. Assume that for some λ, η, $\tau_i$ and $\delta_i$, The following series of relations are constructed by repeatedly applying (33-35) via (24): According to (23) these relations imply and thus, using (30,36) fail The problem is to find conditions under which (33-36) hold and $\delta_1$, $\delta_2$ and $\delta_3$ can be made arbitrarily small (in a resource-efficient manner). Such conditions will be described in section V C.

IV. STABILIZER FORMALISM
A complete account of stabilizer codes in the context of single-shot error correction can be found in [11]. The purpose of this section is to explain some basic aspects that will be required in the main text.

A. Stabilizer codes
A stabilizer code is defined on a given number of physical qubits [26]. It can be characterized with two sets of Pauli operators: 'check operators' (or stabilizers) and 'gauge operators'. They fix the code subspace and its subsystem structure, discussed in section II, as follows.
The check operators all commute, and the Hilbert space is the direct sum of subspaces $H_\sigma$ of the same dimension, each corresponding to a different set of eigenvalues of the check operators. Each check operator has two eigenvalues, and only one is compatible with the code subspace. The index σ can be identified with the 'error syndrome': the set of check operators with an eigenvalue incompatible with the code. It is useful to regard σ also as a binary vector: each entry corresponds to a check operator and has value 1 when the check operator belongs to σ. With this notation $H_0$ is the code subspace, and the binary addition $\sigma + \sigma'$ of any two syndromes is also a syndrome. The code subspace is composed of a logical and a gauge subsystem, each equivalent to a number of qubits. The decomposition is fixed by the gauge operators, which generate an algebra G: its elements map $H_0$ to itself, act trivially on the logical subsystem, and can arbitrarily transform the gauge subsystem [26]. In fact, a similar decomposition applies to all the subspaces $H_\sigma$.
Let a Pauli operation be a quantum channel $\rho \mapsto e \rho e^\dagger$, with e some Pauli operator. Ideal error correction, i.e. the error recovery operation in (4), takes the form where $P_\sigma$ denotes the map $\rho \mapsto h \rho h$, with h the projector onto $H_\sigma$, and $\omega_\sigma$ is a Pauli operation such that Ideal error correction is a two-step process: first the check operators are measured to recover an error syndrome σ, then a Pauli operator is applied to bring the system back to the code subspace. The choice of correction operations $\omega_\sigma$ will be considered to be part of the description of a stabilizer code. This fixes, in particular, the decoding channel D, which incorporates the recovery channel (41).
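The identification of syndromes with binary vectors can be sketched for a small example. The code below uses the three-qubit bit-flip code with check operators ZZI and IZZ as a stand-in (it has no gauge operators and is not one of the codes discussed in the text). Pauli operators are written as strings and phases are ignored; all function names are hypothetical.

```python
def to_bits(pauli):
    """Symplectic (x, z) bit vectors of a Pauli string such as 'XIZ'."""
    x = [1 if c in "XY" else 0 for c in pauli]
    z = [1 if c in "ZY" else 0 for c in pauli]
    return x, z


def anticommutes(a, b):
    """Return 1 if the Pauli strings a and b anticommute, 0 otherwise."""
    ax, az = to_bits(a)
    bx, bz = to_bits(b)
    return sum(x1 * z2 + z1 * x2 for x1, z1, x2, z2 in zip(ax, az, bx, bz)) % 2


def syndrome(error, checks):
    """Binary vector with a 1 for each check operator violated by `error`."""
    return tuple(anticommutes(error, check) for check in checks)


def multiply(a, b):
    """Product of two Pauli strings, up to a phase."""
    letters = {(0, 0): "I", (1, 0): "X", (0, 1): "Z", (1, 1): "Y"}
    ax, az = to_bits(a)
    bx, bz = to_bits(b)
    return "".join(
        letters[(x1 ^ x2, z1 ^ z2)] for x1, z1, x2, z2 in zip(ax, az, bx, bz)
    )


def add(s, t):
    """Binary (mod 2) addition of two syndromes."""
    return tuple((u + v) % 2 for u, v in zip(s, t))


CHECKS = ["ZZI", "IZZ"]
s1 = syndrome("XII", CHECKS)
s2 = syndrome("IXI", CHECKS)
s12 = syndrome(multiply("XII", "IXI"), CHECKS)
```

The syndrome of a product of Pauli errors is the binary sum of the individual syndromes, which is why the sum of two syndromes is again a syndrome.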

B. Processing of noisy syndromes
Fault-tolerant error correction, as opposed to the ideal error correction described above, might involve a series of measurements of check operators or, more generally, of gauge operators [26]. In either case, since each measurement has a binary outcome the measurement results can be described with a binary vector x. This binary string needs to be processed in a classical computer to produce an error syndrome σ = r(x), and then the correction operation ω σ is applied.
Given a syndrome σ, there will be a set of measurement outcomes x compatible with σ in the absence of errors, and then it is natural to impose that r(x) = σ. This constraint, in general, is not enough to fix the function r, as many outcomes will not be compatible with any syndrome in this way. There is a constraint on r, however, that often holds and will be relevant in the next section. If x is compatible with σ as stated, and if y is an arbitrary binary vector of the same length as x, then where addition is modulo 2. In particular, this is true for all the single-shot error-correction strategies discussed in [11]. A remark is in order. As far as preserving the logical state is concerned, there is no need to actually perform the correction operations ω σ at the end of each round of error correction: the stabilizer formalism makes it possible to simply keep track of them. In fact, the optimal (but computationally more expensive) strategy is always to put together the measurements from different rounds in order to infer the errors, rather than interpreting each round separately, for the simple reason that time-local strategies are instances of global strategies.

C. Ignoring the gauge operators
It is convenient to use a notation that ignores the gauge algebra G [11]. For reasons to be clarified below, stochastic channels of interest will mainly take the form where $E_{i\sigma}$ are Pauli operations and $G_{i\sigma}$ are quantum channels with Kraus operators in G. Any such stochastic channel will be denoted, omitting the $G_{i\sigma}$, This does not give rise to any ambiguity regarding the compositions ⋄ and • because ignoring the $G_{i\sigma}$ terms is consistent with channel composition [11]. The failure rate is not ambiguous either. In fact, for these stochastic channels it does not depend on the encoding C used. For fixed $p_i$ and $E_{i\sigma}$, the set of all channels of the form (44) will be denoted There are two main kinds of stochastic channels that will be of interest below.

Definition 3
The stochastic class P is the set of all Pauli channels, i.e. stochastic channels $(p_i, E_i)$ where $E_i$ are Pauli operations.

Definition 4
The stochastic class Q is the set of stochastic channels of the form where In section V A class P will be used to model noise occurring in between recovery steps, and class Q will be used to model noisy error recovery.

V. QUANTUM MEMORIES
This section describes the main result, the compatibility of single-shot error correction and spatially local stochastic noise.

A. Noise model
As stated in section III G, the quantum memory model is described by a pair of parametrized stochastic classes: $L_\lambda$ represents the accumulation of noise in between error-correction steps, and $R_\eta$ represents the noisy error recovery steps. This section provides their specific form, which aims to be as simple as possible.
Definition 5 Given $\lambda \ge 0$, $L_\lambda$ is the set of stochastic channels that can be put in the form so that The noisy steps are modeled as local Pauli noise, a standard phenomenological model that allows easy computations within the context of stabilizer codes. Because of linearity, see section II A, there is no loss of generality in considering local Pauli noise rather than general local noise, namely where the inequality accounts for the extra gauge operations allowed on the right-hand side (footnote 3). It is assumed that the waiting times in between error-correction rounds are independent of the code size, so that the parameter λ is fixed for the whole family of codes.
In order to model noisy measurements it is again enough to consider Pauli noise. Single-shot error correction with stabilizer codes involves (i) a finite-depth circuit to perform measurements (generally of gauge operators), which yield a binary vector x, (ii) classical processing to obtain the error syndrome σ = r(x), and (iii) application of the Pauli operation $\omega_\sigma$. In general noise will affect all aspects of the process, but from a qualitative point of view it suffices to consider one kind of fault alone: classical errors afflicting the outcome of the measurements. Indeed, local Pauli noise afflicting the physical qubits can be propagated forward without changing its local nature, and thus can be accounted for in the subsequent 'layer' of local noise.
Notice that the classical measurement outcomes can be treated as qubits for which phase-flip errors are immaterial, so that they might only undergo bit-flip errors: instead of the correct outcome x, the measurement yields a result x + y for some bit string y. Assuming that condition (43) holds, the error-correction step is represented by stochastic channels
$$(p_y, R_{r(y)})_g \in Q, \tag{53}$$
where $p_y$ is the probability that the bit-flips y happen. The measurement of gauge generators is not explicit in (53) due to the notation introduced in IV C. In an abuse of notation, identify any binary vector with the set of positions at which it has value 1. To model the locality of noise, the probabilities $p_y$ are subject to a condition with parameter η analogous to (15): Given $\eta \ge 0$, $R_\eta$ is the set of stochastic channels
$$(p_y, R_{r(y)})_g \in Q \tag{54}$$
such that for any binary vector x representing a bit-flip configuration,

B. Residual noise
This section adapts the results on the residual noise given in [11] to the notation and tools used here, particularly the correlated composition and the distance relations. It reviews both the conditions imposed to the residual noise and specific examples satisfying them. For a given family of codes of increasing size, the conditions on the residual noise are as follows: (i) there are real functions g 1 (x) and g 2 (x, y) with limit 0 at the origin and monotonically increasing (in the case of g 2 , on either of its parameters when the other is fixed), and (ii) for each code there are real functions f 1 (x), f 2 (x, y) and f 3 (x), and for each i = 1, 2, 3 there is a neighborhood of the origin within which f i goes to zero in the limit of large codes, and (iii) the following relations hold: Moreover, noise monotonically increases with τ , and disappears for τ = 0, Take, for example, the relation (57). Not only is it required that the residual noise N τ can approximate local noise L λ with arbitrary precision in the limit of large codes (for λ below a threshold). In addition, any 'temperature' τ > 0, no matter how low, can be achieved for some λ > 0.
The above conditions do not address the issue of resources, i.e. how fast the error bounds f i go to zero as a function of the number of qubits in the code. In the case of the fault-tolerant schemes discussed below, a precision f i < ε requires a polylogarithmic number of qubits in 1/ε, which is efficient.

Local noise
For some codes it will suffice to use local noise as the model for residual noise. In this case and thus a valid choice is, see appendix F, The only nontrivial condition comes from (59): the family of codes has to have a threshold for local noise. Notice that in [11] local residual noise was handled in a slightly different manner, see appendix J.

Local syndromes
For some codes it is not residual noise that has the local structure (15), but rather the residual distribution of error syndromes. In such cases the residual noise model of interest is Recall that σ represents a subset of check operators, and thus the definition strongly depends on the choice of check operators. Since the elements of N τ are composed of correction operations, in this case Appendix G discusses conditions under which all the required properties are met. As in [11], such conditions are satisfied by topological stabilizer codes that are related to self-correcting Hamiltonian systems, in the following sense. For any local stabilizer code one can always write down a local quantum Hamiltonian as a sum of energy penalties for check operators in the error syndrome. The ground state of such a system is the subspace of encoded states, and the other eigenstates can be chosen to have well-defined syndromes. The energy of such an eigenstate is the sum of the energy penalties for each of the check operators. For some codes a confinement mechanism gives rise to self-correction [7,27]: in the thermodynamic limit, in thermal equilibrium and below a threshold temperature, the system becomes a perfect quantum memory without the need of any external intervention. Definition (65) is in fact motivated by thermal equilibrium, where states have a probability that decreases exponentially with the energy.
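The thermal picture in the last sentences can be illustrated numerically. The following is only a toy sketch, not definition (65): it assumes that every syndrome is allowed (real codes usually impose constraints among check operators), that each syndrome labels a single state, and that probabilities follow a Boltzmann weight exp(-βE). The names `syndrome_energy`, `gibbs_distribution`, `penalties` and `beta` are hypothetical and introduced here for illustration.

```python
import math
from itertools import product

def syndrome_energy(syndrome, penalties):
    """Energy of an eigenstate: sum of the penalties of the violated checks."""
    return sum(p for bit, p in zip(syndrome, penalties) if bit)

def gibbs_distribution(penalties, beta):
    """Boltzmann probabilities over all syndromes at inverse temperature beta.

    Assumption: all 2^n syndromes are allowed and non-degenerate.
    """
    syndromes = list(product((0, 1), repeat=len(penalties)))
    weights = [math.exp(-beta * syndrome_energy(s, penalties)) for s in syndromes]
    z = sum(weights)  # partition function of the toy model
    return {s: w / z for s, w in zip(syndromes, weights)}

# Three check operators with unit energy penalty each.
dist = gibbs_distribution(penalties=[1.0, 1.0, 1.0], beta=2.0)
```

In this toy model each additional violated check multiplies the probability of a syndrome by exp(-β), so lowering the temperature (raising `beta`) suppresses excitations exponentially.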

C. Effective noise
This section gives an account of the last ingredients required to prove that single-shot error-correction strategies give rise to quantum memories of arbitrary lifetimes.
In order to obtain relations of the form (33) it is convenient to introduce the following function [11]. It maps stochastic classes R ⊆ Q to 'effective' sets of Pauli channels eff(R) ⊆ P.
Definition 7 For any R ⊆ Q, the stochastic class eff(R) contains the stochastic channels such that R contains an element of the form Its usefulness stems from the following lemma.
Lemma 8 For any stabilizer code, R ⊆ Q and E ⊆ P where

The proof is in appendix H. The same result holds if the correlated composition ⋄ is substituted with the uncorrelated composition •. Lemma 8 provides the key condition for the quantum memory to have an arbitrarily long lifetime. Assume that for a given family of codes the residual noise N τ satisfies all the conditions of section V B, and that, in addition, the following is true: (i) there is a real function g 4 (x) with limit 0 at the origin, that is monotonically increasing; (ii) for each code there is a real function f 4 (x), and there is a neighborhood of the origin within which f 4 goes to zero in the limit of large codes; (iii) the following relation holds: It is not difficult to check, see appendix I, that the relations (33-36) hold setting Moreover, by taking λ and η small enough the δ i go to zero in the limit of large codes, so that the memory can preserve quantum information for arbitrarily long times according to the inequality (30).

A single-shot error recovery strategy for a stabilizer code is composed of a collection of gauge operators to be measured (that can be measured with a finite-depth circuit), a syndrome recovery function r and a choice of Pauli correction operations ω σ , all such that the corresponding parametrized stochastic class R η fulfills the above conditions. Notice, in particular, that the residual noise after every fault-tolerant recovery step can be made arbitrarily small by reducing the noise in the recovery, a characteristic trait of single-shot error correction.
Two different kinds of single-shot error-correction strategies were introduced in [11]. The first strategy is based on 3D gauge color codes, which are a class of 3D topological codes with unique characteristics [10]. Thanks to the gauge structure, for gauge color codes the residual noise is local, as in (62). The second strategy is based on stabilizer codes exhibiting self-correcting properties, as discussed in section V B 2. For such codes the residual noise has the local syndrome structure N exc τ of (65). For more details, see appendix J.

VI. DISCUSSION
The purpose of this work is to show that fault-tolerant quantum computation is still possible in the presence of noise with arbitrary time correlations. In particular, quantum memories based on single-shot error-correcting codes exhibit a threshold for spatially local stochastic noise. A natural next step is to obtain a full-fledged threshold theorem for universal computation. Since quantum computation with 3D gauge color codes amounts to a series of transversal operations and single-shot error correction [11,14], there is no reason to expect obstacles in this regard.
The role of spatial dimensionality in quantum error correction is intriguing. It is known that fault-tolerant quantum computation is possible with local gates even for a single spatial dimension [4]. This suggests that spatial dimension plays no role, at least qualitatively. However, it is not known if single-shot error correction can be performed with fewer than three spatial dimensions. It might be the case that for two spatial dimensions and arbitrary time correlations, fault-tolerant quantum computation is not possible. If this were true, it would provide an interesting example of how spatial dimensionality can play a fundamental role in fault tolerance.
One of the defining features of single-shot error correction is that it only requires a finite-depth quantum circuit, assisted with non-local classical computation. In particular, no classical information needs to flow between different layers of single-shot error correction. A stronger form of locality is achieved by removing the ability to process classical information nonlocally, thus reducing quantum error correction to purely local circuitry. This seems to be incompatible with single-shot error correction: a single step of fully local error correction is unlikely to be able to produce arbitrarily low residual noise (in the limit where error correction is close to noiseless). However, fault tolerance can still be achieved [28]: it suffices for residual noise to stay within certain bounds. A natural question is whether such entirely local forms of error correction can deal with time-correlated noise (or at least with some particular forms of it).
Finally, it is interesting to ask, more generally, if there are other approaches to achieve resilience to time-correlated errors.

ACKNOWLEDGMENTS
I am grateful to Benjamin J. Brown and Michael J. Kastoryano for useful discussions, and to Aleksander Kubica for comments on an early version of the manuscript. I received support from the MINECO grant FIS2012-33152 and the CAM grant QUITEMAD+. This work was supported by the International Research Unit of Advanced Future Studies at Kyoto University.

Appendix A: Difficulties with 2D codes
This appendix is addressed to readers familiar with topological stabilizer codes. It aims to briefly discuss why noise with arbitrary time correlations cannot be handled by 2D topological stabilizer codes. These include the toric code [5] and 2D color codes [8]. The important feature of these codes is that they have string-like logical operators for which a localized set of faults can hide the syndromes at a given endpoint of a string-like error operator.
The outline of the argument is as follows. For any string-like logical operator, clearly there exists a history of faults that (i) implements the logical operator, producing a logical error, (ii) leaves no trace in the syndrome history, and (iii) only requires a finite number of faults per time step. Such a fault-path does not satisfy the spatial locality constraint, but it suffices to consider an ensemble of a finite number of such fault-paths, each starting at a different time. This limits the memory lifetime to be proportional to the length of the string, but in fact the argument can be improved by dropping condition (iii) above. Instead, with a bit of care one can allow for a finite density of faults per time step, getting a finite memory lifetime.
This argument does not work for 3D gauge color codes because it is not possible to hide a topological charge with measurement faults localized in the vicinity of the charge. Instead, the faults should be string-like and connect the charge to another one or to a boundary. This is the confinement mechanism [11].
Finally, it is worth noting that the above argument can be slightly modified so that the stochastic process governing the noise takes the form of a Markov chain. As stated, certain histories of faults unavoidably damage the memory and only require a few faults per time step. In the simplest case, the history amounts to moving a particle along a predetermined trajectory from one end of the system to another. To satisfy the Markov property, it suffices to start such a history with some probability at every time step, that is, to randomly create particles at the starting end. Indeed, (i) since the start of the history is random, it does not depend on what happened in the past, and (ii) once a history has started, every step of it depends only on what happened in the previous step (i.e., it depends on where the particle is). Moreover, the trajectories of particles starting at different times do not interfere with each other. Finally, to satisfy the spatial locality constraint it suffices to make the probability to start a history as small as necessary.
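The Markov construction just described can be mimicked with a purely classical toy simulation. The sketch below does not simulate any quantum code: it only tracks the positions of the fault 'particles' along a one-dimensional trajectory, and it assumes that a particle advances exactly one site per time step. The names `simulate_lifetime`, `length`, `start_prob` and `max_steps` are hypothetical.

```python
import random

def simulate_lifetime(length, start_prob, max_steps, rng):
    """Return the first time step at which a particle reaches the far end.

    Returns None if no particle arrives within max_steps. The list of
    particle positions is the state of the Markov chain.
    """
    positions = []  # current site of every particle in flight
    for t in range(1, max_steps + 1):
        # Each particle advances one site; this depends only on the previous step.
        positions = [x + 1 for x in positions]
        if any(x >= length for x in positions):
            return t  # a string-like logical error has been completed
        # Independently of the past, start a new history at the starting end.
        if rng.random() < start_prob:
            positions.append(0)
    return None

rng = random.Random(0)
# Mean lifetime over a few runs for a small start probability.
runs = [simulate_lifetime(20, 0.01, 10_000, rng) for _ in range(50)]
finished = [t for t in runs if t is not None]
```

In this toy model the number of faults per time step equals the number of particles in flight, which can be made small on average by lowering `start_prob`, while the lifetime remains finite for any `start_prob` > 0.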

Appendix B: Extrinsic single-shot error correction
As defined in [11], an approach to noisy error correction can be considered to be single-shot if it only involves a quantum-local operation: a finite depth quantum circuit involving ancilla qubits and aided with global classical computation (implementing feedback on measurement results). However, the single-shot error-correction techniques discussed in [11] follow a much stricter set of rules: the circuits are local with respect to the geometry of the topological codes used, and the density of ancilla qubits (within that geometry) is bounded by a constant. Such single-shot error-correction techniques can be regarded as intrinsic, as they strongly depend on special properties of the codes. It might be worth considering a wider class of extrinsic techniques that do not rely so strongly on specific codes, but still fit within the wider definition.
In Knill's approach to error correction [29] the measurement of check operators is performed via teleportation. The aim of Knill's technique is to reduce the exposure to noise of those physical qubits where quantum information is encoded. This requires the preparation of high quality states on the ancillas used for teleportation. But the teleportation of check operator measurements could also be used differently. For a local family of codes [5], where check operators can be measured with a finite depth circuit, any number n of noisy rounds of check operator measurements can be performed in constant time by performing n teleportations in parallel. This approach fits the wider definition of single-shot error correction, at the cost of using a large number of ancilla qubits: In the case of topological codes, it amounts to adding an extra spatial dimension to the qubit lattice. Notice that in the case of surface codes a similar alternative already exists in the form of 3D cluster state quantum computation [30].

Appendix C: Stochastic channels
This appendix provides the results required in section III.

Associativity
Definition 9 Given classes A and B, their correlated composition A ⋄ B is the set of stochastic channels E of the form For short, when writing (p ij , A i • B j ) ∈ A ⋄ B it is understood that p ij , A i and B j satisfy (C1) (notice the explicit composition symbol •).
The following result shows that ⋄ has the intended meaning.

Proposition 10
The binary operation ⋄ is associative. In particular, if and only if E takes the form Proof. First we show by induction that if and only if E takes the form (C3). The case n = 2 is true by definition. Given j ≥ 2, assume that the statement holds for n = j and let us show that it holds for n = j + 1. The if direction is trivial, and we omit it. Given E as in (C4), by definition and by the inductive assumption it takes the form gives, as desired, (C3), noting that To complete the proof it suffices to show that if and only if E takes the form (C3) with n = 3; however, the proof is analogous to the j = 2 step of the inductive argument and thus we omit it. A couple of comments on the notation are in order. If A and C are sets of channels and B is a stochastic class, then As a special case of the above, if A and C are channels and B a stochastic channel, the composition is a singleton stochastic class, but it will also be identified with its single element.

Approximation
The statements (22) and (23) follow respectively (and trivially) from the following distance axioms: As for (24), due to (23) it suffices to show that A ⋄ B ⊳ ε C ⋄ B and C ⋄ B ⊳ δ C ⋄ D. We show the former, and the latter is analogous. Let E = (p ij , A i • B j ) ∈ A ⋄ B, with p i := p i(j) = 0 without loss of generality. By assumption there exist C ∈ C, non-negative ε i , α k and channels C k such that, in formal sum notation, (C13) Recall here that $\epsilon_{(i)}$ denotes $\sum_i \epsilon_i$. Consider the formal sum It is easy to check that $p'_{(i)j} + q_{(k)j} = p_{(i)j}$, $p'_{i(j)} = p_i - \epsilon_i$, $q_{k(j)} = \alpha_k$, (C15) so that E ′ ∈ C ⋄ B ⊆ C ⋄ B. Moreover, as required,

This appendix compiles some properties and definitions related to Pauli operations required below.

Error syndrome
Given a stabilizer code, each Pauli operation E has a well-defined error syndrome: there is a unique σ such that For each Pauli operation E and such σ, define synd(E) := σ, Ē := ω σ .

Groups
Pauli operations form a group. For any pair of elements E, D, Given a stabilizer code, an important subgroup is the gauge group, which is generated by gauge operations. A Pauli operation L is a logical operation if it cannot be detected through error correction, i.e. if its syndrome is trivial: synd(L) = 0. (D6) Logical operations form a group. The gauge group is the subgroup of logical operations that do not affect the logical subsystem.
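As an illustration of the syndrome map and of the condition synd(L) = 0, here is a minimal sketch using Pauli strings and ignoring phases. It assumes, for concreteness, the four-qubit code with check operators XXXX and ZZZZ, which is not a code discussed in this paper; the helper names `anticommute`, `synd` and `is_logical` mirror the notation of the text but are otherwise hypothetical.

```python
def anticommute(p, q):
    """True if the Pauli strings p and q (e.g. 'XIZ') anticommute.

    Two Pauli strings anticommute when they carry different non-identity
    letters on an odd number of qubits.
    """
    clashes = sum(1 for a, b in zip(p, q) if a != 'I' and b != 'I' and a != b)
    return clashes % 2 == 1

def synd(error, checks):
    """Syndrome bit i is 1 when the error anticommutes with check operator i."""
    return tuple(int(anticommute(error, c)) for c in checks)

def is_logical(op, checks):
    """A Pauli operation is logical if its syndrome is trivial."""
    return not any(synd(op, checks))

# Assumed example: check operators of the four-qubit [[4,2,2]] code.
CHECKS = ['XXXX', 'ZZZZ']

single_x = synd('XIII', CHECKS)       # detected by the ZZZZ check only
pair_x = is_logical('XXII', CHECKS)   # commutes with both checks
```

With this choice of checks, a single-qubit X flips only the ZZZZ check, while XXII has trivial syndrome and is therefore a logical operation in the sense of (D6).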

Failure rates
If ideal error correction is applied after a single Pauli operation E hits an encoded state, the net result is a logical Pauli operation: Here, E is correctable if E Ē is in the gauge group. In particular and for a Pauli channel, and conversely It follows that $L_\lambda \diamond L_{\lambda'} \subseteq L_{2\max(\lambda,\lambda')^{1/2}}$, (F11) If X = P and F (A) = synd(A), then and conversely for any A ⊆ P, Using proposition 13, it follows that The following proposition can be used to bound the above failure rates (the case of • is analogous, and in any case can be found in [11]).

Proposition 15
If there exists an integer m and a set B of sets of check operators such that (i) each element of B contains m check operators, and (ii) for all syndromes σ, σ ′ , then $\mathrm{fail}(N^{\mathrm{exc}}_{\tau} \diamond N^{\mathrm{exc}}_{\tau'}) \le |B| \, (2\max(\tau,\tau')^{1/2})^m$.
Proof. Let $\epsilon = 2\max(\tau,\tau')^{1/2}$. Given its failure rate can be bounded using the same technique as in (F6):

Proof. Every element of R ⋄ E • P 0 takes the form T • P 0 for some where R ∈ R and E ∈ E. Consider the set It satisfies where (D3) was used in the computation of F T and (D10) in that of fail (F T ). From (D1, D2) we get Let F be the union of all F T of the form (H3) for some such T . Then which via proposition 13 yields the desired result. The proof for the uncorrelated composition R • E is entirely analogous; basically, it suffices to substitute p σi with q σ p i .