Low-depth random Clifford circuits for quantum coding against Pauli noise using a tensor-network decoder

Recent work [M. J. Gullans et al., Physical Review X, 11(3):031066 (2021)] has shown that quantum error correcting codes defined by random Clifford encoding circuits can achieve a non-zero encoding rate in correcting errors even if the random circuits on $n$ qubits, embedded in one spatial dimension (1D), have a logarithmic depth $d=\mathcal{O}(\log{n})$. However, this was demonstrated only for a simple erasure noise model. In this work, we discover that this desired property indeed holds for the conventional Pauli noise model. Specifically, we numerically demonstrate that the hashing bound, i.e., a rate known to be achieved with $d=\mathcal{O}(n)$-depth random encoding circuits, can be attained even when the circuit depth is restricted to $d=\mathcal{O}(\log n)$ in 1D for depolarizing noise of various strengths. This analysis is made possible with our development of a tensor-network maximum-likelihood decoding algorithm that works efficiently for $\log$-depth encoding circuits in 1D.

Protecting quantum information from noise and decoherence is a requirement for scalable quantum computation, which, in theory, can be done using quantum error-correcting codes.Yet despite rapid developments in experimental realizations of quantum error-correcting codes [1][2][3][4][5][6][7][8], there are many challenges that must be overcome before they can be used in practical applications.
One major challenge is to deal with the daunting overhead required for various error correction schemes.For instance, experimental realization of the surface code is being pursued by several groups due to its high threshold and 2D layout [9,10].However, its major drawback appears to be a low encoding rate, defined as the r := k/n where k and n are the numbers of encoded and physical qubits of a quantum error correcting code, respectively.For practical computations, thousands of physical qubits may be required for each encoded qubit [11].
As a result, a significant effort has been made to find error correction schemes with a higher encoding rate and lower overhead.A variety of schemes have been proposed based on quantum low-density parity-check (LDPC) codes [12][13][14][15][16][17][18][19][20] and concatenated quantum codes [21].However, the existing high-rate quantum LDPC codes are hard to implement in many architectures since these codes require long-range two-qubit gates.The high-rate concatenated code can suppress errors even if we can only use nearest-neighbor noisy two-qubit gates in 2D layout [21,22], but may still need a higher circuit depth than is available under current technologies to attain sufficient error suppression.
Alternative approaches have been proposed, based on random encoding.Random stabilizer codes are known to * andrew.darmawan@yukawa.kyoto-u.ac.jp achieve a non-zero rate with vanishing error probability in the limit of large n for certain types of noise [23][24][25].The rate achievable by random stabilizer codes against independently and ideally distributed (IID) Pauli noise N (ρ) = (1 − p I )ρ + p X XρX + p Y Y ρY + p Z ZρZ is known as the hashing bound r = 1 − H(p I , p X , p Y , p Z ), where X, Y, Z are Pauli matrices, p I , p X , p Y , p Z represent Pauli error probabilities, and H is the Shannon entropy [24].While the hashing bound is not always optimal, it is relatively high compared to known upper bounds on the optimal rate [26,27].It was shown by Brown and Fawzi [28,29] that asymptotically the same performance can be achieved by random Clifford encoding circuits even when the circuit depth is only O(log 3 n).
From a practical perspective, the above random codes have some shortcomings.In particular, an efficient decoding procedure for them is not known and they require all-to-all connectivity (which is not available in many physical architectures).A recent result of Gullans et al. [30] has shown that random encoding by Clifford circuits with logarithmic depth and 1D connectivity or sub-logarithmic depth in higher dimensions can achieve a non-zero rate against erasure noise.Similar performance was observed against erasure noise in an alternative construction of codes, based on constraint satisfaction algorithms [31].While restricting to the erasure noise model greatly simplifies the decoding of such codes, it involves the strong assumption that the locations of all physical errors are known, which is not the case in current quantum computing architectures.Hence, to be practically relevant, it is necessary to investigate the performance of such codes against more realistic noise models.
In this paper, we consider 1D random Clifford encoding circuits and demonstrate that, when the circuits have a logarithmic depth d = O(log n), the generated codes can be efficiently decoded and can achieve a rate close to the hashing bound for depolarizing noise of various strengths.
Our numerical thresholds are plotted alongside various analytical bounds in Fig. 1.We obtain our results by developing a tensor-network maximum-likelihood decoder for stochastic Pauli noise that has poly(n) running time when d = O(log n).The combination of a high threshold against stochastic noise, non-zero rate, the practicality of low depth in 1D, and efficient decoding shows that such codes are promising candidates for quantum memories in future implementations of fault-tolerant quantum computers.
Low-depth random encoding circuits-Here we briefly outline the definition of codes based on low-depth Clifford circuits.We start with a trivial quantum code with 1qubit logical operators and checks.We partition the n physical qubits into k logical qubits and n − k stabilizer qubits such that the logical qubits are evenly spaced among the physical qubits.For each stabilizer qubit, indexed by i ∈ {1, . . ., n−k}, we randomly associate a nontrivial single-qubit Pauli check operator g i ∈ {X, Y, Z}.The stabilizer of the code is the group G generated by the checks.The check operators trivially commute, and the code space is defined as the +1 eigenspace of all such check operators (or all elements of the stabilizer).This implies that the initial code space is a product state on the stabilizer qubits.The logical qubits, however, are not fixed by the checks, and for every logical qubit, indexed by j ∈ {1, . . ., k}, we associate a pair of distinct anticommuting single-qubit Pauli operators l x j , l z j , which we regard as the logical X and logical Z operators for that qubit.
Given a Clifford circuit U , we can produce a new stabilizer code by transforming the checks and logical operators as g i → U g i U † , l x j → U l x j U † and l z j → U l z j U † .We assume U to be noiseless.The circuit U is an encoding circuit that maps unencoded logical qubits to encoded ones.The specific Clifford circuits we consider are low-depth circuits in 1D, where two-qubit iSWAP gates are applied in parallel between neighboring pairs of qubits in an alternating brickwork pattern (see Fig. 1(a) of Ref. [32]).After each round of two-qubit gates, a uniformly random single-qubit Clifford gate is applied to every physical qubit.The depth d of the circuit is taken to be the number of two-qubit gate layers.These locality constraints imply the weight of each check is at most 2d.
For the TN decoder, which we define below, it is useful to consider open boundary conditions.To minimize the boundary effect, for a code with a given n and r, we add an additional 4d − 1/r + 1 stabilizer qubits to make sure all logical qubits are at least 2d physical qubits away from the boundary before applying the encoding circuit.This changes the rate of the code, however, given that we restrict to d = O(log n), this does not affect the asymptotic rate as n → ∞.In a slight abuse of notation, we use r to refer to the rate of the code before the boundary qubits are added and we define n phys := n + 4d − 1/r + 1 to be the total number of physical qubits (including boundary qubits) of the code.
Tensor-network decoding-To assess the performance of these codes, we consider the following scenario.First, the logical information is encoded into an error-correcting code using the encoding circuit U as described above.Next, every physical qubit suffers noise, which we assume is depolarizing noise.Finally, all of the checks are measured, and decoding is performed.We assume the measurements are noiseless.Decoding is a classical computation that takes the check measurement outcomes, called the syndrome, as input and outputs a correction to restore the encoded data.For stochastic Pauli noise, this decoding task (a classical computational problem) is hard in general [33]; however, we show that efficient nearoptimal decoding is possible for 1D random Clifford codes of depth d = O(log n).Note that, while this scenario is not fully realistic due to the assumptions of noiseless encoding circuits and measurements, these assumptions are conventionally used to quantify code performance in a code capacity setting [34].
Here we briefly outline how the decoding problem can be cast as a tensor network (TN) contraction.Full details are provided in Appendix A. Say that all the checks are measured, and the syndrome s is obtained, and let f be a Pauli operator that anti-commutes only with the flipped checks (such a Pauli operator can always be computed efficiently by row-reduction of the check matrix given s).Due to the fact that two Pauli errors have an identical effect on the code if and only if they differ by multiplication by an element of the stabilizer, the probability that f will correct the physical error e p is the probability that e p ∈ f G, which is given by where p(f e) denotes the probability of Pauli error f e.
For independent Pauli noise, each p(f e) is a product The coset probabilities for maximum-likelihood decoding for any stabilizer code can be expressed as the contraction of a two-dimensional TN.Each horizontal row corresponds to a physical qubit of the code, so the number of rows is always n phys .The network is constructed from generators of G (checks).In this illustration, nodes corresponding to different generators are distinguished by their color.As illustrated, the generators are sorted into columns and are arranged such that no two generators overlap in a given column.Non-trivial Pauli operators in a generator are replaced with the corresponding tensors in (a), and all tensors in a generator are connected by a vertical wire.This implies that all σu and σ d in the tensors corresponding to a given check must coincide to contribute a non-zero term to the contraction.The tensor p(j) is a vector of the four Pauli error probabilities on site j which are permuted according to the action of fL on site j.The 0-tensors on the left fix the left indices to 0, and the small black tensors have entries all equal to 1.By horizontally contracting the tensors in a given row with σ indices fixed, either pI , pX , pY or pZ on the jth qubit is obtained depending on what Pauli operators act on the qubit in fLe.Contracting the network corresponds summation over all σ indexes, and thereby all e ∈ G, which evaluates to p(fLG) as in Eq. ( 1).
of Pauli error probabilities.If p(f G) can be computed for any f , then optimal, maximum-likelihood decoding can be performed by computing this probability for all logically inequivalent corrections f L := f L consistent with the syndrome, where L are logical operators, and choosing f L with the largest probability among the set {p(f L G)} L .However, it is inefficient to compute the probabilities directly by evaluating the sum in Eq. ( 1) since the number of terms in the summation is |G| = 2 n−k .
In fact, we do not expect an efficient algorithm to exist for computing coset probabilities of codes in general, due to the #P-hardness of the problem [33].
Fortunately, for codes with local checks, there are examples where coset probabilities can be computed efficiently using TN methods.We present two such methods for evaluating coset probabilities that are efficient and nearly optimal when restricted to 1D codes with log-depth encoding circuits.One of them, which we describe in Appendix A 1, is based on the TN description of coset probabilities for the surface code in Ref. [35].
The results presented in this paper, however, have been computed using an alternative TN description of the coset probabilities, which we illustrate in Fig. 2. The caption sketches the TN and briefly explains how it works.See Appendix A 2 for the full details.In the case of codes with 1D encoding circuits of depth d, the shape of the TN is n phys × O(d).When we restrict to low depth d = O(log n), such a TN can be contracted exactly in poly(n) time.Note that this is in contrast to the problem of contracting an n × n square lattice TN, which is known to be #P-complete [36].
To contract the TN, we have used a simple exact contraction algorithm.However, its structure as a square lattice TN leaves potentially many alternative approximate or exact contraction schemes for speeding up the contraction [37].We expect that the TN description based on logic circuits is likely most suited to codes where the check weight is relatively large since the size of individual tensors does not grow with the check weight.
While each coset probability can be computed exactly and efficiently with this method, an approximation is used due to the fact that the total number of inequivalent cosets to be maximized over is 4 k .As described in more detail in Appendix A 4, we overcome this by decoding each logical qubit independently while marginalizing over other qubits.This requires only 4k coset probabilities to be computed and maximized over in total but could be suboptimal if logical failures are highly correlated (however, this does not seem to be the case, as shown Appendix B).
Numerical results-We have performed simulations using the TN decoder described above to study the properties of codes defined by random Clifford encoding circuits in 1D.In each run of the simulation, we randomly generate a code using a Clifford circuit of depth d = O(log n) as described above and sample a Pauli error e p according to the depolarizing noise model, which gives rise to a syndrome s.The decoder calculates a correction f L , using s as input.
We say that logical qubit j fails when f L e p anticommutes with at least one of the logical generators U l x j U † or U l z j U † , which can be easily checked.At least 2 × 10 5 runs of the simulation are performed for each data point.Taking the average over the stochastic Pauli noise as well as over the random Clifford encodings, we can estimate the probability of any given logical qubit failing.
In Fig. 3, we show the spatial distribution of logical errors for various system sizes.Although there is a clear boundary effect, we observe that the logical failure probability for qubits sufficiently far from the boundary is uniform across logical qubits and independent of system size if d is kept constant.We let p L denote the failure probability of a logical qubit in the bulk region.In Appendix B, we present numerical results that indicate, unlike the fully random codes, the spatial correlations of logical errors are short-ranged, as observed in Ref. [30] for erasure noise.
We have plotted p L as a function of the physical error probability for a variety of depths d and r = 1/5 in Fig. 4. The crossing point for curves of different d indicates a threshold in the code, below which p L decays exponentially in d.Evidence of this exponential decay, plots at other rates, as well as a list of computed thresholds, are included in Appendix C. The numerically obtained thresholds are plotted alongside the hashing bound in Fig. 1.
The performance of the code can be improved by slightly modifying the random Clifford circuits.As one instance, we propose a random 'greedy' code by choosing single qubit gates in the random circuit to maximize the weight of checks and logical generators in each layer, rather than uniformly at random.See Appendix D for the details.The results of the greedy construction are also provided in Fig. 4 alongside the standard construction.Despite differences in logical error rates, the threshold crossing points are very similar, the greedy construction appears to behave like the standard construction except with a greater effective depth.
For both constructions, these plots show that the thresholds are very close to the hashing bound for r 1/3.This is remarkable since these codes achieve the same threshold as a fully random code for these rates, despite being much more local and restricted.At higher rates, e.g.r = 1/2, a threshold is harder to discern from the data; however, we conjecture that using larger values of d than is currently accessible with our numerical method, would produce a threshold estimate close to the hashing bound, as with the lower rates.Note that the observed exponential rate of decay in d of p L below the threshold for r 1/3 implies that the probability of an error occurring on at least one logical qubit, which we refer to as p L , also tends to zero as n → ∞ provided that log 2 k = αd for a sufficiently small constant α > 0. In other words, d = O(log n) for fixed r can be sufficient for p L to tend to zero as n → ∞.Note that 1 − p L is equal to the entanglement fidelity of the k-qubit logical channel (which is proportional to the average channel fidelity [38]).In Fig. 5, we plot p L as a function of n phys with fixed rate r = 0.1 and p = 0.05 and various α.As can be seen, by varying α, we observe a trade-off between the rate of decay of the total logical error rate and the total number of encoded qubits k = rn.For practical error correction, it would likely be useful to tune the value of α as well as the rate r, according to the target number and error probability of logical qubits.
Discussion-In this work, we have studied quantum error correcting codes defined by 1D low-depth Clifford encoding circuits.We have shown that for the family of these codes with logarithmic depth d = O(log n), maximumlikelihood decoding of stochastic Pauli noise can be performed in poly(n) time using TN methods.We have also numerically shown that, for depolarising noise over a large range of noise strengths, the codes can achieve a rate close to the hashing bound if d = O(log n).Thus, 1D Clifford encoding circuits with depth O(log n) can generate quantum error correcting codes that have the same rate as random stabilizer codes and can be efficiently decoded.The high performance, non-zero rate as well as locality in 1D suggest that such codes could serve as practical quantum memories in future implementations of quantum computers.The results suggest a number of potential directions for future research.Firstly, it would be interesting to see whether codes defined by low-depth Clifford encoding circuits in two or higher dimensions have advantages over the 1D codes considered here.This was shown to be the case for the erasure noise model when the code is modified with a process called expurgation [30].Whether the same holds against Pauli noise remains unknown.
Another direction towards practical implementation is to consider the realistic case where syndrome extraction is itself prone to error.In this case, there is the added complexity that higher-weight stabilizer measurements are likely to be less reliable than low-weight ones.Furthermore, fault-tolerant methods for state preparation and logical gates must be developed.On potential route is via Knill's fault-tolerant error correction gadgets that work for any stabilizer code even with a non-constant-weight stabilizer like ours [39,40].
The above directions for future research may require generalizations to the TN decoding methods described in this work.For this, one could consider leveraging more advanced TN contraction methods or adapting non-TN decoding methods that have been applied to other types of codes.
Finally, while we believe that these numerical results strongly suggest that the rate of these codes is close to the hashing bound, they do not constitute mathematical proof.It would be interesting to see whether the proofs supporting, or contradicting these suggestions can be made by, for instance, strengthening bounds in Ref. [29].
Here we provide full details of the TN decoder which we have used to study codes produced by low-depth random Clifford circuits in 1D.We show that the TN decoder runs in polynomial time for codes with 1D log-depth encoding circuits.
Say an n-qubit Pauli error e p occurs on a state in the code space, and all of the checks are measured, yielding the syndrome outcomes s = s 1 , s 2 , . . ., s n−k where s i ∈ {−1, 1} is the outcome of the measuring check g i .Let f be any product of Pauli operators that is consistent with that syndrome, in that it anti-commutes with the checks that returned a −1 outcome and commutes with the other checks.The physical error e p is one such error, but it is not known to the experimenter.One can find an operator consistent with any syndrome by performing row-reduction on the check matrix, the rows of which are binary vectors representing the check operators.
The operator f applied to the code will correct e p if e p ∈ f G.If not, then f e p = L for some non-trivial logical operator L, and f Lg for any g ∈ G will correct the error.Maximum-likelihood decoding finds the correction that is most likely to correct the error.To do this, we determine the L for which p(f LG) is maximized, i.e., the coset f LG that e p most likely belongs to.For any L, we henceforth combine the f and L into a single Pauli error f L .
The probability of an error belonging to the coset f L G is simply the sum of the probabilities of every error in that coset i.e. p(f L G) = e∈G p(f L e).By the definition of the generating set, every element of G is of the form e(σ) = n−k i=1 g σi i , with σ = σ 1 , σ 2 , . . .σ n−k , σ i ∈ {0, 1} for each i representing a particular check configuration.
For independent Pauli noise, we can write the probability of an error f L e(σ) as a product of single-qubit Pauli error probabilities p(f L e(σ)) Z } for each σ and where p (j) Q is the probability of Pauli error Q occurring on qubit j.Note that, if the noise is identical for each qubit j, p (j) P (P = I, X, Y, Z) does not depend on j.Since this discussion will mainly focus on the computation of a single coset (with fixed L), we will henceforth drop the dependence of A (j) σ (L) on L from our notation.While A (j) σ for a particular site j depends on the check configuration σ, it is clear that it only depends on the bits σ i for which g i acts non-trivially on qubit j.Thus, each term A (j) σ can be written as a tensor of rank r j , where r j corresponds to the number of generators that act non-trivially on j.For a general stabilizer code, we replace A (j) σ with A (j) σ(j) , where σ(j) is the list of all check bits σ i for which g i acts non-trivially on j.
To obtain the coset probability, we sum over all indices Note that the expression on the right-hand side has a similar form to a TN contraction.Each A (j) σ(j) is a tensor of rank r j , and the sum of the product structure is nearly identical to tensor contraction.One small difference is that each index σ i can appear in more than two tensors (they essentially correspond to hyperedges of the TN, compared to usual graph edges, that only connect two nodes/tensors).We can convert this into a typical TN in the following ways.

TN description of coset probabilities, based on
Ref. [35] One way to turn the summation in Eq. (A1) into a TN, as done in Ref. [35], involves adding a δ tensor for each check, with only two non-zero entries specified by δ σ1,σ2,...,σn = 1 if The TN is defined by connecting indices of A tensors to the appropriate check tensors δ.
The TN then has the same form as the Tanner graph of the code, as illustrated in Fig. 6.This is a bipartite graph with two types of nodes, which we refer to as qubit nodes and check nodes.Each qubit node corresponds to a physical qubit and each check node corresponds to a FIG. 6.(a) Coset probabilities represented as a TN following the construction in Ref. [35].Each black circle node corresponds to a check and each green square node corresponds to a qubit.An edge is drawn between a check node and a qubit node if and only if the check acts non-trivially on that qubit.The check nodes correspond to δ tensors and the green nodes correspond to A σ(j) tensors (defined in the text).(b) δ tensor can be split and combined with the qubit tensors, resulting in the blue square tensors.(c) This is done for every δ tensor in (a), resulting in a 1D TN.For 1D codes with depth O(d) encoding circuits, the maximum number of edges connecting a pair of neighboring tensors is O(d), resulting in a maximum matrix size of 2 code check.An edge is added between a qubit node and a check node if and only if the check acts non-trivially on the corresponding qubit.This graph is mapped to a TN by placing a δ tensor at every check node and an A tensor at every qubit node.
This TN description has proven useful for the surface code and other planar codes [41] since it can be contracted efficiently using established TN methods.However, for families of codes whose check weight grows with n, like the random codes studied in this paper, one encounters the problem that the size of the tensors grows exponentially with the weight of the checks.Furthermore, the resulting TN is not planar and so the methods of Ref. [41] cannot be applied directly.
Nevertheless, for 1D Clifford encoding circuits, the TN can be converted into a 1D TN as shown in Fig. 6.The 1D TN consists of a product of (sparse) matrices of size 2 O(d) × 2 O(d) .When d = O(log n), this TN can be contracted exactly in polynomial time in n.
We have implemented this decoder and confirmed that it produces the correct coset probabilities.We did not use this TN description to obtain the results in this paper, since our implementation of it was slower than the method described in the following section.We note, however, that this description could prove useful if optimized e.g. by exploiting the sparse structure of the matrices.For the remainder of this section, we focus on the following alternative TN of coset probabilities.

Alternative TN description of coset probabilities
Here we describe an alternative way to efficiently represent the coset probabilities p(f L G) for any stabilizer code as a two-dimensional network of small tensors.We have used this method to produce the results presented in this paper.The network is illustrated in Fig. 2 in the main text.
Most of the tensors in the network are from the set {T X , T Y , T Z }, which are defined in Fig. 2(a) in terms of classical logical circuits.The tensors have an entry 1 for each index assignment corresponding to a valid execution of the circuit, and the remaining entries are zero.The tensors are precisely defined as follows: for P = X, Y, Z, and As we will see below, in the computation of the probability p(f L e) (e ∈ G), the indexes σ u and σ d are used to characterize what generators g i ∈ G are contained in e, and the indexes i X , i Z , j X , j Z are used for recording the changes of the Pauli operators when each such g i is applied.Following the graphical representation of the tensor as shown in Fig. 2(a), in the following, we may refer to (i X , i Z ), (j X , j Z ), σ u , and σ d at left, right, top, and bottom indexes of the tensor, respectively.
A tensor p(j) , which is dependent on f L (although not explicitly in our notation), is associated with the Pauli error probabilities p I , p X , p Y , p Z occurring at the jth qubit.This tensor has two left-pointing binary indexes (i X , i Z ) and has entries from {p I , p X , p Y , p Z }.In the case of f L = I, the entries of the tensors p(j) (i X ,i Z ) equal p I , p X , p Y , p Z for (i X , i Z ) = (0, 0), (1, 0), (1, 1), (0, 1), respectively.For non-trivial f L , one simply flips input bits of p(j) according to the action of f L on site j, so if f We now explain how to determine the layout of the 2D network.The number of rows in the network is always n phys , but the number of columns depends on the details of the checks, which we will explain later.Given a single check, we construct part of the network corresponding to that check as follows.When the check acts as a nonidentity Pauli operator P on the jth qubit, we place T P tensor in the jth row.We do this for each qubit that the check acts non-trivially on and connect the σ u/d indices of all the T P tensors by a vertical wire.The σ u of the top tensor and σ d of the bottom tensor are each connected to a single index δ tensor, which has two entries both equal to one and is indicated by a small black circle in Fig. 2(b).We call this connected set of tensors a check subnetwork.
The set of all check subnetworks, where each subnetwork corresponds to a given check, is then sorted into columns so that no pair of subnetworks in a given column overlap on a row.The right index of every T P tensor is connected to the left index of the tensor on its right.The left index of the left-most tensor T P in each row is fixed to 0. The right index of the right-most tensor T P in the jth row is connected to the index of p(j) .See Fig. 2(b) for a specific example of the TN.
The number of columns depends on the choice of checks and can be n − k when there is a qubit shared by all the checks.However, in the case of 1D encoding circuits of depth d, the number of columns can be O(d) as the checks can only act non-trivially on at most 2d neighboring qubits.
In the 2D TN, each horizontal row is associated with a single physical qubit, and every horizontal wire actually contains two single-bit wires, which we can represent as a single tensor edge of dimension 4. In the jth row, the i X bit corresponds to an X error on the jth qubit, and the other i Z corresponds to a Z error on the jth qubit.By contracting with the probability tensor, a probability factor of p (j) Y or p (j) Z , which does not depend on j if the noise is identical for all qubits, will appear, depending on bit values of the wire.Note that the probability tensor is permuted according to f L , as we have explained.
In contracting the TN, it is important to notice that the set of T P tensors in the network constitutes a large logic circuit and that only valid computations of the classical logic circuit will be summed over (invalid computations evaluate to zero).In valid computations of the classical circuit, the bits σ u/d along all vertical wires in the tensors associated with single check qubits must all be 0 or all be 1.If σ (i) u/d = 1 for a given check i, then the bits on the horizontal wires matching that check will be flipped; otherwise, they will not be affected.Fixing the vertical check indices σ

Contracting the network
The TN can be contracted in various ways.The task is, essentially, to contract an L × W sized square-lattice TN, where L and W are the length and width, respectively, of the network.To perform the contraction exactly, we first contract the first row into a single tensor with O(W ) indices.We then contract the remaining tensors one by one with this tensor, starting with the first tensor in the second row, then moving down the network along rows then down columns until all tensors are contracted.Assuming L ≥ W , the maximum memory cost of this procedure is O(2 W ), and the time cost is O(W L2 W ). For the low-depth random circuits we consider, L = n phys and W = O(d) = O(log n phys ), and so both the time cost and memory cost are polynomial in the block size.
One could try to improve this scaling further by, e.g., using approximate contraction strategies.The approximate boundary matrix product state (MPS) method, described in Ref. [42] and used in other decoders [35,41,43,44], would reduce the time and memory costs to polynomial in W if a truncated bond dimension χ is kept constant in N .We have tested this method, but it appears that the quality of the approximation varies considerably among the codes sampled.A direction of future research could be to find methods to contract the network more efficiently; however, for this work, we obtain our results using the exact contraction method, which is sufficiently fast for our purposes.

Decoding logical qubits independently by marginalization
The coset probability p(f L G) can be computed for any L using this method.One problem when encoding a large number of qubits k is that the number of inequivalent logical operators and cosets grows as 4 k .Therefore finding the most likely coset by computing the probability of every coset takes exponential time in the number of encoded qubits.We overcome this issue by decoding each logical qubit independently.
To decode a single logical qubit j, we calculate the probabilities of the cosets for f L ∈ {f, f U l x j U † , f U l z j U † , f U l x j l z j U † } and marginalize over the other logical qubits, where l p j are the unencoded logical Pauli operators for the jth logical qubit, and U is the encoding circuit.This marginalization is achieved simply by adding the logical generators U l x i U † , U l z i U † for all i different from j to the list of checks when we construct the TN.In the case of the one-dimensional 1D encoding circuit with depth d, adding the logical operators to the TN adds an additional O(rd) columns to the network and therefore does not change the scaling of the width of the network with d.However, the cost of marginalization on exact TN contraction does increase with r, which explains why our simulations can only deal with d = 7 with r ≥ 1/3, while d = 8 for r ≤ 1/5.as we will observe in the performance improvement in the 'greedy' method.
The first simple modification is to ensure that the initial checks are either single-qubit X or Y before applying the first layer of iSWAP gates.This ensures that the weight of all checks is increased to two by the iSWAP gates (the weight of a single qubit Z would not be changed).After this step, the distance between the first and last sites on which the check acts non-trivially is guaranteed to be increased maximally by two for every layer of iSWAP.
Note that, when a two-qubit gate is applied to a Pauli operator acting on 2 qubits, the weight of the operator may increase, decrease or stay the same.Therefore, rather than choosing single-qubit Clifford gates uniformly at random, as in the standard construction, we choose the single-qubit Clifford gate that maximizes the increase in total check weight (sum of the weights of all the checks) when passed through the next iSWAP gate.When multiple gates produce the same increase in weight, one among these is chosen uniformly at random.
It is harder to avoid low-weight logical operators, since any product of logical generators and stabilizers is also a logical operator, and therefore it is harder to check that the overall weight of logical operators increases or decreases with the application of a two-qubit gate.However, as a simple heuristic, we also add the logical generators to the list of checks whose weight is maximized by greedily choosing single-qubit gates.This appears to further slightly improve performance.

6 FIG. 3 .
FIG.3.Error probability of a logical qubit vs. logical qubit index divided by k (corresponding to its relative position on the chain) for r = 0.5 with d = 6 and p = 0.02.The boundary effect can clearly be seen.The error rate stabilizes to a system-size independent value at a certain distance from the boundary.

8 FIG. 4 .
FIG.4.Bulk logical error probability p L versus physical error probability p for r = 1/5 using a fixed system size of n = 50, excluding O(d) added boundary qubits using both standard and greedy random encoding circuits.A clear crossing point can be observed, which is very close to the threshold error probability implied by the hashing bound p = 0.139 indicated by the dashed grey line.Computed thresholds and threshold plots for other rates are included in Appendix C.

FIG. 5 .
FIG.5.Probability of at least one logical qubit failing vs n phys when d = α −1 log 2 k, for various α.The error rate decays to zero with n phys , and by varying α we can increase the number of encoded qubits at the cost of a slower rate of decay.

L
= X, one flips only the i X bit in the definition above, and if f (j) L = Z, one flips i Z and for f (j) L = Y one flips both bits.
and summing over the remaining indices, we can pull out a product of Pauli probabilities from the p tensors, which is equal to p(f L e(σ)), with σ = σ (1) u/d , σ (2) u/d , . . ., σ (n) u/d .Finally, in contracting the TN by summing over the check indices, we sum over all possible configurations σ of the check bits and obtain the coset probability p(f L G) = σ p(f L e(σ)).

FIG. 8 .
FIG.8.Bulk logical error probability p L vs. physical error probability p for various encoding circuit depths d for r = 1/10, 1/3 and 1/2 in (a), (b) and (c) respectively.The crossing point of the different curves on each plot corresponds to the threshold.For r ≤ 1/3 the threshold is close to that given by the hashing bound, while a clear crossing point is not discernible for r = 1/2 for the values of d simulated.The system sizes used for (a), (b), and (c) are n = 50, n = 54, and n = 50 respectively.

1 FIG. 9 .
FIG. 9. Bulk logical error probability p L vs. depth for r = 1/10 and r = 1/3 in (a) and (b) respectively, for various error probabilities using the greedy code generator.The dashed lines are obtained by linear regression.A straight-line relationship indicates exponential decay in d.