Decoding Across the Quantum LDPC Code Landscape

We show that belief propagation combined with ordered statistics post-processing is a general decoder for quantum low density parity check codes constructed from the hypergraph product. To this end, we run numerical simulations of the decoder applied to three families of hypergraph product code: topological codes, fixed-rate random codes and a new class of codes that we call semi-topological codes. Our new code families share properties of both topological and random hypergraph product codes, with a construction that allows for a finely-controlled trade-off between code threshold and stabilizer locality. Our results indicate thresholds across all three families of hypergraph product code, and provide evidence of exponential suppression in the low error regime. For the Toric code, we observe a threshold in the range $9.9\pm0.2\%$. This result improves upon previous quantum decoders based on belief propagation, and approaches the performance of the minimum weight perfect matching algorithm. We expect semi-topological codes to have the same threshold as Toric codes, as they are identical in the bulk, and we present numerical evidence supporting this observation.


I. INTRODUCTION
Any scalable computer architecture must be robust against hardware imperfections. In quantum computing, where qubits are realised as fragile quantum two-level systems, fault tolerance necessitates active error correction [1-5]. A quantum error correction code specifies an encoding in which quantum data is distributed across a larger space of qubits to create a logical qubit state. Errors are detected on the logical state via a series of non-destructive stabilizer measurements (quantum parity checks) yielding an error syndrome. This syndrome information is processed by a decoding algorithm to determine the best recovery operation to return the encoded quantum information to its uncorrupted state. All three stages of the error correction cycle (syndrome measurement, decoding and recovery) must be performed within a short time frame before the qubits irreversibly decohere. Performing the decoding in real-time is a computationally intensive inference problem, with realistic resource estimates showing a need for terabytes of syndrome information to be processed per second [6]. As such, efficient decoding algorithms are necessary to allow quantum error correction to be performed whilst maintaining realistic demands on classical co-processors [7].
Low density parity check (LDPC) codes are a ubiquitous classical error correction protocol [8], finding use, for example, in the recent 5G communication standard [9]. The specific advantage of LDPC codes is that they can be decoded using an algorithm from probabilistic graph theory known as iterative belief propagation (BP) [10]. The BP algorithm exploits the structure of the error correction code to solve the decoding inference problem in time linear in the code block length [11]. For certain LDPC codes, BP decoding enables error correction at close to the Shannon capacity, the theoretical upper bound on the rate of information transfer along a noisy channel [12,13].
Quantum LDPC (QLDPC) codes can be constructed from classical LDPC codes using the hypergraph product framework due to Tillich and Zemor [14]. The hypergraph product translates the parity check sequences of a classical parent code into a set of commuting stabilizers that define a quantum code. The most commonly studied hypergraph product codes fall into one of two types: topological QLDPC codes and random or expander QLDPC codes.
In contrast to classical LDPC codes, there is no established decoder that works generally for all QLDPC codes. For purely 2D topological codes, the minimum weight perfect matching algorithm achieves a threshold [15] that is close to the theoretical maximum value derived from statistical mechanics arguments [16]. For random QLDPC codes with the expansion property [17-19], the small set-flip (SSF) decoder has a theoretically proven threshold [18] that has been verified numerically [20]. Furthermore, in a recent study by Grospellier et al. [21], it was shown that the performance of the SSF decoder can be improved by combining it with the classical BP algorithm. This two-stage BP+SSF decoder exhibits a higher code threshold, in addition to being applicable to a wider range of random QLDPC codes than the SSF decoder alone.
In this paper, we consider another two-stage quantum decoder, first proposed by Panteleev and Kalachev [22], that combines BP with a post-processing technique known as ordered statistics decoding (OSD) [23,24]. Panteleev and Kalachev demonstrated that for many random QLDPC codes, the BP+OSD method improves decoding performance by several orders of magnitude over the BP algorithm alone. In this work, we expand on the results of Panteleev and Kalachev to provide further evidence that the BP+OSD decoder is a general decoder for all QLDPC codes that can be constructed from the hypergraph product. To this end, we first propose a new class of semi-topological codes which share properties of both topological and random QLDPC codes. We use this new class of codes to define a spectrum of QLDPC codes, and run numerical simulations to show that the BP+OSD decoder applies generally across it.
Topological QLDPC codes have stabilizers that can be locally embedded in some D-dimensional space [25]. The simplest example is the surface code, obtained by taking the hypergraph product of the classical repetition code. The stabilizers of the surface code are local, meaning they can be implemented via nearest-neighbour interactions on a 2D array of qubits [25,26]. With regard to experimental implementation, this is highly beneficial, as many qubit technologies are limited in terms of connectivity between qubits [27-29]. Another practical advantage of the surface code is that it has a high threshold [15,16,26]. The disadvantage of the surface code, and topological codes in general, is that they have poor encoding rates. The surface code, for example, encodes only a single qubit per logical block, meaning its encoding rate tends to zero as the code distance is increased.
Random QLDPC codes are constructed by taking the hypergraph product of high-performance classical LDPC codes. The strength of random QLDPC codes over topological codes is that they can have considerably higher encoding rates that do not tend to zero with increasing block length [17-19]. The trade-off is that random QLDPC codes have non-local stabilizer checks, typically requiring interactions between arbitrary qubit pairs. Quantum computers based on ion traps [30-33], photonic qubits [34,35] or nitrogen vacancy centres [36] promise connectivity beyond nearest-neighbours. However, such prototype devices do not yet meet the connectivity requirements of high-rate random QLDPC codes. Another disadvantage of random QLDPC codes is that they appear to have lower thresholds than their topological counterparts [20-22, 37, 38].
The new class of semi-topological codes we propose in this work allows for interpolation between local topological codes and non-local random QLDPC codes. The construction of semi-topological codes begins by modifying a classical parent code via a process called edge augmentation. This involves replacing code edges with chain segments that are similar in form to repetition codes. The semi-topological code is then obtained from the augmented parent code via the hypergraph product, which maps each of the chain segments to a surface code-like patch. A semi-topological code can therefore be thought of as a set of surface code patches connected to one another at their boundaries via a small number of long-range interactions. The locality of a semi-topological code can be finely controlled by varying the degree to which the parent code is augmented. The ability to control connectivity makes semi-topological codes promising candidates for networked surface code architectures [39].
In its unmodified form, the BP algorithm is ineffective for decoding QLDPC codes due to degenerate quantum errors. Quantum degeneracy is a uniquely quantum effect, and arises in situations where quantum superposition permits multiple equivalent solutions to the decoding problem. Panteleev and Kalachev [22] have shown that for random QLDPC codes, the problem of quantum degeneracy can be resolved by decoding using BP in conjunction with OSD post-processing. The OSD method is called when BP fails, and uses matrix inversion to resolve ambiguities in the decoding due to quantum degeneracy.
In this work, we show that in addition to random QLDPC codes, BP+OSD enables high-performance decoding of both topological QLDPC codes and our new class of semi-topological codes. To this end, we first run numerical simulations of BP+OSD on the Toric code with increasing code distances. Our results indicate a threshold in the region $9.9 \pm 0.2\%$, in addition to showing evidence of exponential suppression in the low error regime. This BP+OSD threshold improves upon previous BP-based decoders for the Toric code [38], and is close to the value of 10.3% achieved by state-of-the-art decoders for the Toric code based on the minimum-weight perfect matching algorithm [15,40,41]. We perform further numerical simulations of BP+OSD applied to a family of semi-topological codes, as well as a family of finite-rate random QLDPC codes. For large block sizes, the BP+OSD threshold obtained for the semi-topological codes approaches the value obtained for the Toric code, reflecting the fact that the majority of stabilizer checks are 2D local.
This paper is structured as follows. In section II, we first review essential concepts in classical coding theory, before introducing the edge-augmentation procedure. Section III covers the basics of quantum stabilizer codes, and explains how they can be represented as binary linear codes. In section IV, we describe how QLDPC codes are obtained from classical LDPC codes via the hypergraph product, giving explicit examples of the construction of topological and random QLDPC codes. Following this, we explain how semi-topological codes are constructed by taking the hypergraph product of augmented parent codes. In section V we describe the workings of the BP+OSD decoder. In section VI, we describe the 'combination sweep' strategy as a greedy search method for finding higher order solutions to BP+OSD. Following this, we present the results of our numerical simulations of the BP+OSD decoder for topological QLDPC codes, semi-topological codes and random QLDPC codes. Finally, in section VII we summarise and discuss directions for future work.

II. CLASSICAL CODING
A classical binary linear code is specified by an $m \times n$ parity check matrix $H$, with codewords $c$ satisfying $H \cdot c \bmod 2 = 0$ (see footnote 1). By the rank-nullity theorem, a parity check matrix permits $k = n - \mathrm{rank}(H)$ linearly independent codewords. If a codeword is subject to an error $e$, the parity check matrix yields an $m$-bit syndrome $s = H \cdot (c + e) = H \cdot e$. The syndrome will be non-zero for all non-zero errors of Hamming weight $|e| < d$, where $d$ is the code distance. In general, classical codes are labelled with the $[n, k, d]$ notation, where $n$ is the codeword length, $k$ is the number of encoded bits and $d$ is the code distance. The code rate is given by the ratio $R = k/n$.
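As an illustration of the syndrome equation and the rank-nullity count, the following is a minimal sketch in Python with NumPy. The [7, 4, 3] Hamming code, the helper name `gf2_rank` and the particular codeword and error are illustrative assumptions, not objects taken from the simulations in this paper.

```python
import numpy as np

def gf2_rank(M):
    """Rank of a binary matrix over GF(2), by Gaussian elimination."""
    M = np.array(M, dtype=np.uint8) % 2
    rank = 0
    for col in range(M.shape[1]):
        if rank == M.shape[0]:
            break
        hits = np.nonzero(M[rank:, col])[0]
        if hits.size == 0:
            continue                          # no pivot in this column
        pivot = rank + hits[0]
        M[[rank, pivot]] = M[[pivot, rank]]   # move the pivot row into place
        others = np.nonzero(M[:, col])[0]
        others = others[others != rank]
        M[others] ^= M[rank]                  # clear the column in all other rows
        rank += 1
    return rank

# Parity check matrix of the [7, 4, 3] Hamming code (illustrative choice).
H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]], dtype=np.uint8)
n = H.shape[1]
k = n - gf2_rank(H)                           # rank-nullity: k = n - rank(H)

c = np.array([1, 1, 1, 0, 0, 0, 0], dtype=np.uint8)  # a codeword, H.c = 0
e = np.array([0, 0, 0, 0, 1, 0, 0], dtype=np.uint8)  # a single bit-flip

syndrome_of_word = H @ (c ^ e) % 2            # s = H.(c + e)
syndrome_of_error = H @ e % 2                 # s = H.e
print(k, syndrome_of_word, syndrome_of_error)
```

The two syndromes coincide because the codeword contributes nothing to $H \cdot (c + e)$, which is why the decoder only ever needs to reason about the error.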
Factor graphs - The factor graph of an $[n, k, d]$ classical code is a bipartite graph $G = (V, U, \Lambda)$ with an adjacency matrix given by the code's parity check matrix $H$ [42]. For an $m \times n$ parity check matrix $H$, the two sets of nodes in $G$ are defined as follows: 1) Data nodes $V = \{v_j \mid j = 1, ..., n\}$ corresponding to the columns of $H$ and taking the bit-values of the error $e$; 2) Parity nodes $U = \{u_i \mid i = 1, ..., m\}$ corresponding to rows of $H$ and taking the bit-values of the syndrome $s = H \cdot e$. A graph edge $\lambda_{ij} \in \Lambda$ is drawn between a pair of nodes $\{v_j, u_i\}$ if $H_{ij} = 1$. Factor graphs serve as a useful visualisation of the parity check matrix with applications in code design and decoding [11,43]. Diagrammatically, factor graphs are drawn with circles representing data nodes, squares representing parity nodes and solid lines representing the edges. Figure 1 shows factor graphs for two instances of the three-bit repetition code.
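The correspondence between a parity check matrix and its factor graph can be sketched as follows. This is a minimal illustration: the function name `factor_graph` is hypothetical, nodes are indexed from zero rather than one, and the open-chain three-bit repetition code is assumed as the example.

```python
import numpy as np

def factor_graph(H):
    """Return the edge list and neighbour lists of the factor graph of H.

    An edge (i, j) joins parity node u_i and data node v_j whenever H[i, j] = 1.
    """
    H = np.asarray(H)
    edges = [(int(i), int(j)) for i, j in zip(*np.nonzero(H))]
    parity_nbrs = {i: [j for (a, j) in edges if a == i] for i in range(H.shape[0])}
    data_nbrs = {j: [i for (i, b) in edges if b == j] for j in range(H.shape[1])}
    return edges, parity_nbrs, data_nbrs

# Three-bit repetition code with two parity checks (assumed example).
H_rep = np.array([[1, 1, 0],
                  [0, 1, 1]])
edges, parity_nbrs, data_nbrs = factor_graph(H_rep)
print(edges)
print(parity_nbrs)
print(data_nbrs)
```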
Low density parity check (LDPC) codes - A family of $(l, q)$-LDPC codes is defined as a set of codes whose parity check matrices have column and row weights upper bounded by $l$ and $q$ respectively. As first demonstrated by Gallager [8], it is possible to construct an $(l, q)$-LDPC code by randomly generating a parity check matrix with the desired column and row weights. An alternative to random LDPC code search is to employ graphical constructions in which an LDPC code family is obtained by systematically modifying the factor graph of a base code.
Edge augmented LDPC codes - We now introduce 'edge augmentation' as a graphical method for creating an LDPC code family from the starting point of any 'parent' factor graph $G = (V, U, \Lambda)$. In section IV, we show how semi-topological codes are created by taking the hypergraph product of such augmented codes. Focusing first on a single edge $\lambda_{ij}$ connecting nodes $\{v_j, u_i\}$ in the parent code, the edge augmentation operation involves the addition of a 'graph chain segment' $G_g = \{V_g, U_g, \Lambda_g\}$ containing $g$ data nodes $V_g = \{v^g_j \mid j = 1, ..., g\}$ and $g$ parity nodes $U_g = \{u^g_i \mid i = 1, ..., g\}$. The adjacency matrix $H_g$ of the graph chain segment has dimensions $g \times g$. Its general form is obtained by taking a size-$g$ identity matrix and adding a '1' to the right of each of the first $g - 1$ entries in the diagonal. As an example, the adjacency matrix of a graph chain segment with $g = 4$ is given by

$$H_4 = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

Following addition of the graph chain segment to the parent graph $G$, the updated factor graph $G'$ is written

$$G' = \left(V \cup V_g,\; U \cup U_g,\; (\Lambda \setminus \{\lambda_{ij}\}) \cup \Lambda_g \cup \Lambda_w\right),$$

where $\Lambda \setminus \{\lambda_{ij}\}$ is the original parent edge set minus the edge that has been augmented. Two additional edges $\Lambda_w = \{\lambda^g_{1j}, \lambda^g_{ig}\}$ are added to connect the nodes $\{v_j, u^g_1\}$ and $\{v^g_g, u_i\}$. These edges 'weld' the graph chain segment to the parent nodes $\{v_j, u_i\}$.

Footnote 1: From this point on, we assume all arithmetic is performed modulo 2.
A $g$-augmented factor graph $G^{\star g} = (V^{\star g}, U^{\star g}, \Lambda^{\star g})$ is obtained by edge-augmenting each edge in a parent graph $G = (V, U, \Lambda)$ with a length-$g$ graph chain segment. If $G$ corresponds to an $[n, k, d]$ code with parity check matrix $H$, then the $g$-augmented graph $G^{\star g}$ corresponds to an $[n + g|\Lambda|, k, d']$ code with parity check matrix $H^{\star g}$, where $|\Lambda|$ is the number of edges in the parent graph. The augmented code distance $d'$ depends upon the structure of $H$, but is lower-bounded by $d' \geq (1 + g\mu)d$, where $\mu$ is the minimum degree over all data nodes $V$ (for the proof of this lower bound see Appendix 1). If the parent graph $G$ is an $(l, q)$-LDPC code with $l, q \geq 2$, then the $g$-augmented graph $G^{\star g}$ will also be an $(l, q)$-LDPC code. A family of LDPC codes with increasing code distance can be obtained by augmenting a parent code with increasing values of the augmentation parameter $g$.
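The augmentation of every edge can be sketched numerically as follows. This is a minimal sketch under stated assumptions rather than the construction code used for this paper: the function names `augment` and `gf2_rank` are hypothetical, and each chain is laid out as the path $v_j, u^g_1, v^g_1, ..., u^g_g, v^g_g, u_i$, so that every chain check has weight two and the parent bit value is copied along the segment. With rows indexing chain parity nodes, this orientation corresponds to the transpose of the chain matrix $H_g$ as described in the text; the welding edges $\{v_j, u^g_1\}$ and $\{v^g_g, u_i\}$ are as stated there.

```python
import numpy as np

def gf2_rank(M):
    """Rank of a binary matrix over GF(2), by Gaussian elimination."""
    M = np.array(M, dtype=np.uint8) % 2
    rank = 0
    for col in range(M.shape[1]):
        if rank == M.shape[0]:
            break
        hits = np.nonzero(M[rank:, col])[0]
        if hits.size == 0:
            continue
        pivot = rank + hits[0]
        M[[rank, pivot]] = M[[pivot, rank]]
        others = np.nonzero(M[:, col])[0]
        others = others[others != rank]
        M[others] ^= M[rank]
        rank += 1
    return rank

def augment(H, g):
    """Replace every edge of the factor graph of H by a length-g chain segment."""
    H = np.array(H, dtype=np.uint8)
    m, n = H.shape
    edges = list(zip(*np.nonzero(H)))          # (i, j) pairs with H[i, j] = 1
    Ha = np.zeros((m + g * len(edges), n + g * len(edges)), dtype=np.uint8)
    for idx, (i, j) in enumerate(edges):
        row0 = m + g * idx                     # first chain parity node u^g_1
        col0 = n + g * idx                     # first chain data node v^g_1
        prev = j                               # the chain starts at parent bit v_j
        for a in range(g):
            Ha[row0 + a, prev] = 1             # chain check touches previous bit
            Ha[row0 + a, col0 + a] = 1         # ... and the next chain bit
            prev = col0 + a
        Ha[i, prev] = 1                        # weld the chain end to parent check u_i
    return Ha

# Parent code: three-bit repetition code on a ring (assumed example), |Lambda| = 6.
H = np.array([[1, 1, 0],
              [0, 1, 1],
              [1, 0, 1]], dtype=np.uint8)
Ha = augment(H, g=2)
k_parent = H.shape[1] - gf2_rank(H)
k_augmented = Ha.shape[1] - gf2_rank(Ha)
print(Ha.shape, k_parent, k_augmented)
```

For this parent code the augmented block length is $n + g|\Lambda| = 3 + 2 \cdot 6 = 15$, the number of encoded bits is unchanged, and all row and column weights remain at most two.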

III. QUANTUM CODING
Quantum error correction - Quantum bits (qubits) are susceptible to a continuum of errors corresponding to rotations about the Bloch sphere. Fortunately, due to an effect known as the digitization of the error, quantum errors can be modelled in terms of the random occurrence of a discrete set of Pauli operators $\{\mathbb{1}, X, Y, Z\}$ (see footnote 2) on each qubit [44]. An $[[n, k, d]]$ quantum error correction code $Q$ is a mapping $|\psi\rangle \to |\psi\rangle_L$ from a $k$-qubit quantum state $|\psi\rangle$ to an entangled $n$-qubit codeword (logical) state $|\psi\rangle_L$. The quantum codewords $|\psi\rangle_L \in Q$ satisfy the condition $S_j |\psi\rangle_L = (+1) |\psi\rangle_L$ for all $S_j \in S$, where $S$ is a group of mutually commuting Pauli operators known as the code's stabilizer [45]. Pauli errors $E$ of Hamming weight less than the code distance, $|E| < d$, will result in at least one stabilizer $S_k$ projecting onto the negative eigenspace, $S_k E |\psi\rangle_L = (-1) E |\psi\rangle_L$.

The Pauli group has a convenient binary representation in which each operator is mapped to a length-2 vector: $\mathbb{1} \to (0, 0)$, $X \to (1, 0)$, $Z \to (0, 1)$ and $Y \to (1, 1)$. In general, the binary representation of an $n$-qubit Pauli operator $K$ will be a length-$2n$ vector of the form $k = (x, z)$, where $x$ and $z$ both have length $n$ and represent the positions of X- and Z-Pauli components respectively. As an example, the binary representation of the three-qubit Pauli operator $K = X_1 Z_3$ is $k = (100, 001)$. The binary representation provides a useful setting from which to repurpose existing classical coding techniques for quantum error correction.
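The binary representation can be sketched in a few lines of Python. The function names `pauli_to_binary` and `commute` are hypothetical, and the commutation test uses the standard symplectic product $x_a \cdot z_b + z_a \cdot x_b \bmod 2$, which is not written out in the text above and is included here as an assumption about how the representation is typically used.

```python
def pauli_to_binary(pauli):
    """Map a Pauli string such as 'XIZ' to its binary vectors (x, z)."""
    table = {"I": (0, 0), "X": (1, 0), "Z": (0, 1), "Y": (1, 1)}
    x = [table[p][0] for p in pauli]
    z = [table[p][1] for p in pauli]
    return x, z

def commute(a, b):
    """True if Pauli strings a and b commute (symplectic product is 0 mod 2)."""
    xa, za = pauli_to_binary(a)
    xb, zb = pauli_to_binary(b)
    overlap = sum(p * q for p, q in zip(xa, zb)) + sum(p * q for p, q in zip(za, xb))
    return overlap % 2 == 0

# K = X_1 Z_3 on three qubits, written as the string 'XIZ'.
x, z = pauli_to_binary("XIZ")
print(x, z)
print(commute("XX", "ZZ"), commute("XI", "ZI"))
```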
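A quantum parity check matrix is defined as a matrix in which each row corresponds to a code stabilizer in its binary representation. Calderbank, Shor and Steane (CSS) codes [46-48] are a subset of quantum codes with parity check matrices of the form $H_{\rm CSS} = \begin{pmatrix} H_X & 0 \\ 0 & H_Z \end{pmatrix}$, where $H_Z \cdot H_X^T = 0$ due to the requirement that the stabilizers commute. For a CSS code subject to a Pauli error $E \to e_Q = (x, z)$, the syndrome is given by $s_Q = (s_x, s_z) = (H_Z \cdot x, H_X \cdot z)$. From the above, it can be seen that the working of a CSS code can be thought of in terms of two classical codes, $C(H_Z)$ and $C(H_X)$, designed to detect bit-flips (X-errors) and phase-flips (Z-errors) respectively.

Footnote 2: The Pauli operators are defined as follows: $\mathbb{1} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, $X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$, $Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$.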
Hypergraph product codes - The hypergraph product, first proposed by Tillich and Zemor [14], is a method for converting classical code pairs $\{C_{H_1}, C_{H_2}\}$ to a quantum CSS code HGP($C_{H_1}$, $C_{H_2}$). In the below, we describe the special case of the symmetric hypergraph product HGP($C_H$) for which $C_{H_2} = C_{H_1}$.
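For a classical code $C_H$ with code parameters $[n, k, d]$ and an $m \times n$ parity check matrix $H$, the symmetric product HGP($C_H$) is a CSS code with

$$H_X = \left(H \otimes \mathbb{1}_n \mid \mathbb{1}_m \otimes H^T\right), \qquad H_Z = \left(\mathbb{1}_n \otimes H \mid H^T \otimes \mathbb{1}_m\right), \tag{4}$$

where $H^T$ is the transpose parity check matrix describing a 'transpose' code $C_{H^T}$ with parameters $[m, k^T, d^T]$. Here, $k^T$ is the number of bits encoded by the transpose code whilst $d^T$ is the distance of the transpose code. The quantum code parameters of HGP($C_H$) are $[[n^2 + m^2,\; k^2 + (k^T)^2,\; \min(d, d^T)]]$. The specific advantage of the hypergraph product construction is that it allows any classical code to be converted to a quantum code: the commutativity constraint $H_Z \cdot H_X^T = 0$ is satisfied for all binary parity check matrices $H$.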
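The symmetric hypergraph product can be checked numerically with a short sketch. The ordering of the Kronecker factors below follows one common convention for the hypergraph product and should be read as an assumption; the function names `hgp` and `gf2_rank` are hypothetical, and the ring repetition code is an assumed example whose product is a distance-3 Toric code.

```python
import numpy as np

def gf2_rank(M):
    """Rank of a binary matrix over GF(2), by Gaussian elimination."""
    M = np.array(M, dtype=np.uint8) % 2
    rank = 0
    for col in range(M.shape[1]):
        if rank == M.shape[0]:
            break
        hits = np.nonzero(M[rank:, col])[0]
        if hits.size == 0:
            continue
        pivot = rank + hits[0]
        M[[rank, pivot]] = M[[pivot, rank]]
        others = np.nonzero(M[:, col])[0]
        others = others[others != rank]
        M[others] ^= M[rank]
        rank += 1
    return rank

def hgp(H):
    """Symmetric hypergraph product of an m x n binary matrix H."""
    H = np.asarray(H, dtype=np.uint8)
    m, n = H.shape
    I_n = np.eye(n, dtype=np.uint8)
    I_m = np.eye(m, dtype=np.uint8)
    Hx = np.hstack([np.kron(H, I_n), np.kron(I_m, H.T)])
    Hz = np.hstack([np.kron(I_n, H), np.kron(H.T, I_m)])
    return Hx, Hz

# Ring repetition code with n = m = 3 and k = k^T = 1 (assumed example).
H = np.array([[1, 1, 0],
              [0, 1, 1],
              [1, 0, 1]], dtype=np.uint8)
Hx, Hz = hgp(H)

n_q = Hx.shape[1]                                # n^2 + m^2 physical qubits
k_q = n_q - gf2_rank(Hx) - gf2_rank(Hz)          # number of logical qubits
commutes = not (Hz @ Hx.T % 2).any()             # H_Z . H_X^T = 0 mod 2
print(n_q, k_q, commutes)
```

For this parent code the sketch gives 18 physical qubits and 2 logical qubits, in agreement with $n^2 + m^2$ and $k^2 + (k^T)^2$, and every stabilizer has weight four.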
IV. QUANTUM LDPC (QLDPC) CODES
An $(l_Q, q_Q)$-QLDPC code family is defined as a set of CSS codes whose quantum parity check matrices $H_{\rm CSS}$ have column and row weights upper bounded by $l_Q$ and $q_Q$ respectively [49]. The hypergraph product preserves the sparsity of the original classical code [14]. From the structure of equation (4), we see that the hypergraph product of an $(l, q)$-LDPC code with parity check matrix $H$ results in an $(l_Q, q_Q)$-QLDPC code with quantum parity check matrix $H_Q$, where $l_Q = \max(l, q)$ and $q_Q = l + q$. The hypergraph product of a classical LDPC code family is therefore a quantum LDPC (QLDPC) code family.
Two important classes of hypergraph product codes are: 1) topological QLDPC codes, such as the surface and Toric codes, constructed by taking the hypergraph product of repetition codes; 2) random QLDPC codes constructed by taking the hypergraph product of randomly generated classical LDPC codes. When random codes generate a factor graph with the expansion property, these are known as 'quantum expander codes' [17-19]. In this section, we propose a new class of semi-topological codes constructed by taking the hypergraph product of augmented LDPC code families. Semi-topological codes are designed to share properties of both random and topological QLDPC codes.
Topological QLDPC codes - Topological codes such as the surface code are considered leading candidates for experiment due to their high threshold [16] and the fact that they are local: all code stabilizers can be measured via interactions between nearest neighbour qubits [26]. Another advantage of the topological codes is that they have parity check matrices that are (4, 4)-QLDPC, meaning each stabilizer measurement involves at most four qubits. From a hardware perspective, this is beneficial, as each parity check operation involves error-prone multi-qubit operations. The shortcoming of topological codes is that they scale poorly in terms of rate: $R = k/n \to 0$ as $d$ is increased.
Random QLDPC codes - Random QLDPC codes are constructed from the hypergraph product of randomly generated classical LDPC codes [37]. The advantage of random QLDPC codes, over topological codes, is that they can encode more qubits per logical block. Table I lists members of a (4, 7)-QLDPC code family constructed by taking the hypergraph product of a family of randomly generated (3, 4)-LDPC codes. The (3, 4)-LDPC classical code family was obtained using the Mackay-Neal method, which ensures the randomly generated parity check matrix has no length-four cycles [13]. The resultant (4, 7)-QLDPC hypergraph product codes are finite-rate, with $R = k/n = 0.04$ as the distance is increased. The disadvantage of random QLDPC codes is that they are highly non-local, requiring arbitrary qubit-qubit interconnectivity to perform stabilizer checks. Furthermore, the stabilizers typically involve more qubits than those of topological codes. The family of codes shown in Table I, for example, are (4, 7)-QLDPC with stabilizer checks of mean weight $w = 7.0$. This is higher than the mean check weight of $w = 4.0$ for the (4, 4)-QLDPC Toric codes.
Semi-topological codes - Semi-topological codes are constructed by taking the hypergraph product of augmented LDPC codes. Table II shows the code parameters of a family of semi-topological codes constructed from (2, 3)-LDPC augmented codes of the type illustrated in Figure 2. For an augmented code $C_H^{\star g}$, each augmented edge can be thought of as a section of a repetition code. The hypergraph product HGP($C_H^{\star g}$) therefore maps each augmented edge to a section of code that resembles a surface code. In these regions, the code stabilizers will be local. As the distance of the augmented code is increased, the resultant semi-topological code contains larger surface code patches and becomes more local in nature. This convergence to surface code-like structure is shown by the check-weight parameter $w$ in Table II, which tends to 4.0 with increasing code distance as the local surface code-like patches begin to dominate. We term this new family 'semi-topological codes', as they encode more logical qubits than the topological codes whilst requiring fewer long range interactions than random QLDPC codes.

V. BELIEF PROPAGATION DECODING
In the classical setting, the role of the decoder is to determine the most likely error-string $e$ satisfying the syndrome equation $H \cdot e = s$. In practice, this decoding problem amounts to finding a minimum weight (MW) estimate of the error, $e_{MW} \to \mathrm{argmax}_e P(e|s)$. For a uniformly distributed random noise model, the MW estimate can be computed bit-wise by calculating the marginals

$$P_1(e_i) = \sum_{\sim e_i} P(e_1, ..., e_i = 1, ..., e_n \mid s),$$

where $\sum_{\sim e_i}$ denotes a summation over all bits $e_j$ except $e_i$. The marginal $P_1(e_i)$ is referred to as a soft-decision for the bit $e_i$. The final decoding estimate (hard-decision) is then made for each bit according to $e_i = 1$ if $P_1(e_i) \geq 1/2$, and $e_i = 0$ otherwise.

Belief propagation (BP) is an efficient marginalisation algorithm and the backbone of many high-performance classical decoders [50]. The essential intuition underpinning BP is that (for certain codes) the probability distribution $P(e|s)$ can be factorised in a way that reduces the number of repeat summations in the computation of the marginals. The specific form of this factorisation is deduced from the structure of the code's factor graph.
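For very small codes the marginals can be computed by direct enumeration, which makes explicit the quantity that BP is designed to obtain efficiently. The following is a minimal brute-force sketch, assuming independent bit-flips with probability $p$; the function name `exact_marginals` and the three-bit example are illustrative, and the cost grows as $2^n$ so the approach is not usable beyond toy examples.

```python
import itertools
import numpy as np

def exact_marginals(H, s, p):
    """Soft decisions P_1(e_i) by summing over every error consistent with s."""
    H = np.asarray(H)
    s = np.asarray(s)
    n = H.shape[1]
    weighted = np.zeros(n)
    total = 0.0
    for bits in itertools.product([0, 1], repeat=n):
        e = np.array(bits)
        if np.array_equal(H @ e % 2, s):
            weight = int(e.sum())
            prob = p**weight * (1 - p) ** (n - weight)   # independent bit-flips
            total += prob
            weighted += prob * e
    return weighted / total

# Three-bit repetition code; syndrome of a bit-flip on the first bit.
H = np.array([[1, 1, 0],
              [0, 1, 1]])
s = np.array([1, 0])
p1 = exact_marginals(H, s, p=0.1)
hard_decision = (p1 >= 0.5).astype(int)
print(p1, hard_decision)
```

Only two errors are consistent with this syndrome, (1, 0, 0) and (0, 1, 1), with relative probabilities 0.081 and 0.009, which gives soft decisions (0.9, 0.1, 0.1).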
The BP algorithm computes exact marginals when applied to codes with tree-like factor graphs. For factor graphs with loops, the BP decoder outputs approximate marginals. However, it has been shown [13] that good decoding performance is nonetheless possible provided the factor graph is sufficiently loop-free.
The BP decoder takes a parity check matrix $H$ and a syndrome $s$ as input. The algorithm iteratively updates a soft-decision vector $P_1(e)$ by passing sets of 'beliefs' between the nodes of the factor graph. At each iteration, a BP estimate $e_{BP}$ is obtained via a hard decision on $P_1(e)$. If the BP estimate satisfies the syndrome equation, $H \cdot e_{BP} = s$, the BP decoder is said to have 'converged' and the BP algorithm is terminated. The BP decoder fails if convergence does not occur within a number of iterations equal to the block length of the code. A more detailed description of BP can be found in Appendix C.
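The iteration described above can be sketched as a dense, unoptimised min-sum decoder working with log-likelihood ratios. This is an illustrative sketch rather than the implementation used for the simulations in this paper: the function name `min_sum_bp`, the flooding message schedule, the absence of any scaling factor and the fixed magnitude used for weight-one checks are all assumptions.

```python
import numpy as np

def min_sum_bp(H, s, p, max_iter=None):
    """Syndrome-based min-sum BP. Returns (estimate, soft decisions P_1, converged)."""
    H = np.asarray(H, dtype=int)
    s = np.asarray(s, dtype=int)
    m, n = H.shape
    max_iter = n if max_iter is None else max_iter   # block-length cut-off
    prior = np.log((1 - p) / p)        # channel log-likelihood ratio log(P_0 / P_1)
    rows = [np.nonzero(H[i])[0] for i in range(m)]
    v2c = H * prior                    # data-to-parity messages
    c2v = np.zeros((m, n))             # parity-to-data messages
    llr = np.full(n, prior)
    e = np.zeros(n, dtype=int)
    for _ in range(max_iter):
        for i, nbrs in enumerate(rows):
            for j in nbrs:
                others = v2c[i, nbrs[nbrs != j]]     # messages from the other bits
                sign = (-1.0) ** s[i] * np.prod(np.sign(others))
                # A weight-one check fixes its bit: use a large magnitude (assumed value).
                mag = np.min(np.abs(others)) if others.size else 50.0
                c2v[i, j] = sign * mag
        llr = prior + c2v.sum(axis=0)                # updated beliefs
        e = (llr < 0).astype(int)                    # hard decision
        if np.array_equal(H @ e % 2, s):
            return e, 1 / (1 + np.exp(llr)), True    # converged
        v2c = H * (llr - c2v)                        # exclude each check's own message
    return e, 1 / (1 + np.exp(llr)), False

# Three-bit repetition code (tree-like factor graph); bit-flip on the first bit.
H = np.array([[1, 1, 0],
              [0, 1, 1]])
s = np.array([1, 0])
e_bp, p1, converged = min_sum_bp(H, s, p=0.1)
print(e_bp, converged)
```

The soft decisions returned alongside the estimate are the quantities that a post-processing stage would use to rank the bits when BP does not converge.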
BP decoding of quantum codes - For a CSS code subject to a Pauli error $E \to e_Q = (x, z)$, the quantum syndrome is given by $s_Q = (s_x, s_z) = (H_Z \cdot x, H_X \cdot z)$. Assuming a Pauli-noise model with uncorrelated X- and Z-errors, the CSS code can be decoded independently as two classical codes with syndrome equations $s_x = H_Z \cdot x$ and $s_z = H_X \cdot z$. Unfortunately, the unmodified BP algorithm cannot be used to directly decode CSS codes owing (in part) to an effect known as quantum degeneracy. Quantum degeneracy arises due to the fact that there can be multiple minimum-weight solutions to the quantum decoding problem. In classical coding the goal is to estimate the exact error configuration that occurred, $e_{MW} = e$. In contrast, for quantum coding, it is sufficient to find any recovery operation $r_Q$ that is equivalent to the error up to a stabiliser, $r_Q + e_Q \in \mathrm{rowspace}(H_{\rm CSS})$. For BP decoding, quantum degeneracy becomes problematic when there are multiple minimum-weight solutions satisfying the syndrome equation. As an example, consider a bit-error decoding problem $s_x = H_Z \cdot x$ that has two minimum-weight solutions $x_1$ and $x_2$. As the degenerate solutions have equal Hamming weight, $|x_1| = |x_2|$, the BP decoder assigns high probability to both. This situation is referred to as a split-belief [15], and leads to a BP output of the form $x_{BP} = x_1 + x_2$. In this case, $H_Z \cdot x_{BP} = s_x + s_x = 0 \neq s_x$. The BP decoder therefore fails to converge when there are split beliefs of this type.
Ordered statistics decoding - Many attempts have been made to modify or supplement the BP algorithm to solve the problem of quantum degeneracy. The most successful approach to date involves applying a post-processing algorithm known as the ordered statistics decoder (OSD). Originally designed as a method for reducing error floors in classical LDPC codes by Fossorier and Lin [23], OSD was first applied in the quantum setting by Panteleev and Kalachev [22] and shown to be a surprisingly effective decoder of random QLDPC codes. In this paper, we show that OSD also performs well for the Toric codes and our new class of semi-topological codes. We also provide the first open-source demonstration of the algorithm [51]. Note that in the below, for notational simplicity, we describe OSD post-processing as applied to a classical decoding problem $s = H \cdot e$. The procedure we outline applies equally to decoding the $H_X$ and $H_Z$ components of a CSS code.
As parity check matrices do not have full column-rank, it is not possible to solve the syndrome equation by matrix inversion, $H^{-1} \cdot s = e$. However, for any parity check matrix it is possible to find a subset of columns, specified by the indices $[S]$, that are linearly independent. These columns form a basis and can be used to define a sub-matrix $H_{[S]}$ with full column-rank, formed by selecting the columns $[S]$ of the original parity check matrix $H$. As this sub-matrix has full column-rank, it can be inverted to give a solution to the syndrome equation, $H_{[S]}^{-1} \cdot s = e_{[S]}$. Each choice of the basis $[S]$ corresponds to a unique solution $e_{[S]}$, eliminating any potential ambiguity due to quantum degeneracy. It is possible to select $[S]$ as a random basis set, but this approach is unlikely to result in a good (low-weight) solution for $e_{[S]}$. The idea behind the OSD post-processing algorithm is that the soft-decisions from BP are used to select a basis-set $[S]$ containing bits that have a high probability of having been flipped.
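The OSD-0 algorithm - In a BP+OSD decoder, the OSD post-processing step is called when the BP algorithm fails to converge within a number of iterations equal to the block length of the code. The simplest manifestation of the OSD decoder is known as OSD-0, the steps of which are as follows:
1. Use the BP soft-decision vector $P_1(e)$ to obtain a ranked list of bit-indices $[O_{BP}]$ ordered (left-to-right) from most-to-least likely of being flipped.
2. Order the columns of the parity check matrix according to the ranking $[O_{BP}]$, giving $H_{[O_{BP}]}$.
3. Select the first rank($H$) linearly independent columns of $H_{[O_{BP}]}$ to form the basis $[S]$ and the full column-rank sub-matrix $H_{[S]}$. The remaining columns, indexed by $[T]$, form the sub-matrix $H_{[T]}$.
4. Compute the solution on the basis bits, $e_{[S]} = H_{[S]}^{-1} \cdot s$, and set the remaining bits to zero, $e_{[T]} = 0$. The OSD-0 solution $e_{[S,T]} = (e_{[S]}, 0)$ is then mapped back to the original bit-ordering.

Higher order OSD - In higher-order OSD, we consider solutions for which $e_{[T]} \neq 0$. The first step involves computing the OSD-0 solution $e_{[S]}$ on the basis bits as described above. Following this, for a given choice of $e_{[T]}$, the higher order OSD solution across all bits is given by

$$e_{[S,T]} = \left(e_{[S]} + H_{[S]}^{-1} H_{[T]} e_{[T]},\; e_{[T]}\right).$$

The length of the $e_{[T]}$ vector is equal to $k' = n - \mathrm{rank}(H)$, meaning there are $2^{k'}$ distinct configurations: as a result, searching over all configurations soon becomes intractable for large codes. However, the BP soft-decision vector $P_1(e)$ can be used to rank the bits in $e_{[T]}$. Good solutions can then be discovered by implementing a weighted greedy search routine which prioritises the more probable configurations of $e_{[T]}$ according to the soft-decisions $P_1(e)$.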
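The OSD-0 procedure can be sketched with a single Gaussian elimination over GF(2). This is a minimal illustration under stated assumptions, not the open-source implementation referred to above: the function name `osd0` is hypothetical, the inversion of $H_{[S]}$ is performed implicitly by row-reducing the reordered matrix together with the syndrome, and the syndrome is assumed to be consistent with some error.

```python
import numpy as np

def osd0(H, s, p1):
    """OSD-0: solve H.e = s on the most-likely linearly independent columns."""
    H = np.asarray(H, dtype=np.uint8) % 2
    s = np.asarray(s, dtype=np.uint8) % 2
    m, n = H.shape
    # Rank the bits from most to least likely of being flipped.
    order = np.argsort(-np.asarray(p1, dtype=float), kind="stable")
    A = np.column_stack([H[:, order], s])      # reordered matrix with syndrome appended
    pivots = []                                 # the basis [S], in the reordered indexing
    for col in range(n):
        r = len(pivots)
        if r == m:
            break
        hits = np.nonzero(A[r:, col])[0]
        if hits.size == 0:
            continue                            # column depends on earlier columns
        A[[r, r + hits[0]]] = A[[r + hits[0], r]]
        others = np.nonzero(A[:, col])[0]
        others = others[others != r]
        A[others] ^= A[r]                       # eliminate the column elsewhere
        pivots.append(col)
    e_ordered = np.zeros(n, dtype=np.uint8)
    e_ordered[pivots] = A[:len(pivots), n]      # basis bits; remaining bits stay zero
    e = np.zeros(n, dtype=np.uint8)
    e[order] = e_ordered                        # map back to the original bit-ordering
    return e

H = np.array([[1, 1, 0],
              [0, 1, 1]], dtype=np.uint8)
s = np.array([1, 0], dtype=np.uint8)

e_a = osd0(H, s, p1=[0.9, 0.1, 0.1])   # soft decisions favour the first bit
e_b = osd0(H, s, p1=[0.1, 0.6, 0.6])   # soft decisions favour the last two bits
print(e_a, e_b)
```

Both outputs satisfy the syndrome equation, but they differ in weight: the quality of the OSD-0 solution depends on how well the soft decisions rank the bits, which motivates the higher-order search.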
Greedy search strategies for higher order OSD - For the numerical simulations in this paper, we implement a greedy search method we refer to as the 'combination sweep strategy', a variant of the method originally proposed in [23]. The steps of the combination sweep strategy are as follows:

We label our decoders using the combination sweep greedy search algorithm as BP+OSD-CS. For all the simulations in this work, we set the combination sweep search depth parameter to $\lambda = 60$. Note that in [22], Panteleev and Kalachev used a different greedy search method that involved testing all $2^\lambda$ configurations of the first $\lambda$ bits in $e_{[T]}$. For a fixed number of search terms, the combination sweep search algorithm provides a modest improvement in decoding performance over this exhaustive approach. For more details, see Appendix B.
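A greedy search of this kind can be sketched on top of the higher-order OSD formula. The sketch below is an assumption-laden illustration, not the authors' exact procedure: it assumes that the sweep tests the OSD-0 solution, then every weight-one configuration of $e_{[T]}$, then every weight-two configuration within the first $\lambda$ ranked bits of $e_{[T]}$, and keeps the lowest-weight candidate. The function names `rref_gf2` and `osd_cs` are hypothetical.

```python
import itertools
import numpy as np

def rref_gf2(A, n_pivot_cols):
    """Row-reduce A over GF(2), pivoting only within the first n_pivot_cols columns."""
    A = np.array(A, dtype=np.uint8) % 2
    pivots = []
    for col in range(n_pivot_cols):
        r = len(pivots)
        if r == A.shape[0]:
            break
        hits = np.nonzero(A[r:, col])[0]
        if hits.size == 0:
            continue
        A[[r, r + hits[0]]] = A[[r + hits[0], r]]
        others = np.nonzero(A[:, col])[0]
        others = others[others != r]
        A[others] ^= A[r]
        pivots.append(col)
    return A, pivots

def osd_cs(H, s, p1, depth=60):
    """Higher-order OSD with an assumed weight-one / weight-two sweep over e_T."""
    H = np.asarray(H, dtype=np.uint8)
    s = np.asarray(s, dtype=np.uint8)
    n = H.shape[1]
    order = np.argsort(-np.asarray(p1, dtype=float), kind="stable")
    A, S = rref_gf2(np.column_stack([H[:, order], s]), n)
    T = [c for c in range(n) if c not in S]    # remainder bits, still in ranked order
    base = A[:len(S), n]                       # H_S^{-1} . s   (the OSD-0 basis bits)
    M = A[:len(S)][:, T]                       # H_S^{-1} . H_T
    trials = [()]                                              # OSD-0: e_T = 0
    trials += [(a,) for a in range(len(T))]                    # weight-one sweep
    trials += list(itertools.combinations(range(min(depth, len(T))), 2))
    best = None
    for trial in trials:
        e_T = np.zeros(len(T), dtype=np.uint8)
        for t in trial:
            e_T[t] = 1
        e_S = base ^ (M @ e_T % 2)             # e_S + H_S^{-1} H_T e_T
        candidate = np.zeros(n, dtype=np.uint8)
        candidate[S] = e_S
        candidate[T] = e_T
        if best is None or candidate.sum() < best.sum():
            best = candidate                   # keep the lowest-weight solution
    e = np.zeros(n, dtype=np.uint8)
    e[order] = best                            # map back to the original bit-ordering
    return e

H = np.array([[1, 1, 0],
              [0, 1, 1]], dtype=np.uint8)
s = np.array([1, 0], dtype=np.uint8)
e_cs = osd_cs(H, s, p1=[0.1, 0.6, 0.6])
print(e_cs)
```

In this toy case the soft decisions are deliberately misleading, so that the OSD-0 solution has weight two; a single weight-one trial on the remainder bit recovers the weight-one solution.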

VI. NUMERICAL SIMULATIONS
[Figure 4 legend: logical-error rate $p_L$ for the [[145, 5, 6]], [[421, 5, 10]], [[841, 5, 14]] and [[1405, 5, 18]] codes, decoded with BP and with BP+OSD-CS.]

Simulation methodology for BP+OSD decoding - For the numerical simulations of the BP+OSD decoder in this work, we sample errors from the phenomenological uncorrelated X-Z noise model. As the quantum error correction codes we consider are constructed from a symmetric hypergraph product, the respective decoding problems for X- and Z-type errors are equivalent. As such, it suffices to simulate a single error species to assess decoding performance. Here, we sample X-errors and solve the decoding problem $s_x = H_Z \cdot x$. Pseudocode for the specific implementation of BP we use for the numerical simulations in this paper can be found in Appendix C. The simulation chain we implement for each BP+OSD decoding cycle is described below:
1. An error $x$ is randomly sampled from a binary symmetric channel with bit error rate $p$. The syndrome is then calculated as $s_x = H_Z \cdot x$.
2. The BP decoder is called with $H_Z$ and $s_x$ as inputs.
The output of the BP decoder is a candidate solution $x_{BP}$ along with its respective soft-decision vector $P_1(x)$. If $H_Z \cdot x_{BP} = s_x$, then the BP decoder has converged and the simulation jumps directly to step 5. If $H_Z \cdot x_{BP} \neq s_x$, then the OSD post-processing routine (steps 3-4) is called. For our decoding simulations we use the 'min-sum' variant of the BP algorithm as described in [52].
3. The OSD-0 post-processing method, as described above, is used to obtain a solution of the form $e_{[S,T]} = (e_{[S]}, e_{[T]} = 0)$.
4. A greedy algorithm is run to search for higher-order OSD solutions that improve upon OSD-0. For this work, we adopt the combination sweep strategy with the search depth parameter set to $\lambda = 60$. However, in general, the specific form of the greedy search routine can be tailored according to parameters such as the physical error rate or code structure. The lowest weight OSD solution, $\min|e_{[S,T]}|$, is mapped to the original bit-ordering and chosen as the BP+OSD candidate solution $e_{OSD}$.
5. After applying the recovery provided by the decoder, the 'residual' error is given by $x_R = x + x_{OSD}$ (or, in the case where BP converged, $x_R = x + x_{BP}$). The decoding cycle is counted as a success if $x_R$ is not an X-type logical operator of the code. By definition, an X-type logical operator will anticommute with its corresponding Z-type logical operator. Checking for decoding success therefore involves verifying that $L_Z \cdot x_R = 0$, where $L_Z$ is a matrix in which each row represents a Z-type logical operator.

Next, we discuss our threshold estimates for three code families across the QLDPC code spectrum, with an overview presented in Table III.

Topological QLDPC codes - Figure 3 shows a Toric code threshold plot comparing the BP decoder against the BP+OSD-CS decoder. The logical error rate $p_L$ is plotted against the physical error rate $p$ for code distances $d = \{9, 11, 13, 15\}$. Due to quantum degeneracy, the BP decoder alone (dashed lines) does not exhibit a threshold: increasing the code distance $d$ increases the logical error rate $p_L$ for all values of the bit-error rate $p$. In contrast, the BP+OSD-CS decoder (solid lines) shows crossings that indicate a threshold in the region $9.9 \pm 0.2\%$. Furthermore, by inspection of the sub-threshold regime, we see evidence of exponential suppression in the logical error rate with decreasing physical error rate. The corresponding threshold (not plotted) for the BP+OSD-0 decoder is $9.2 \pm 0.2\%$. Performing the combination sweep for higher-order OSD solutions therefore results in a quantifiable improvement in decoding performance.
Semi-topological codes - Figure 4 shows the threshold plot for a family of semi-topological codes constructed from augmented (2, 3)-LDPC codes. The parameters for this code family are listed in Table II. The logical error rates $p_L$ (for both BP and BP+OSD-CS) are plotted against the physical error rate p for code distances d = {6, 10, 14, 18}. As with the Toric codes, the BP decoder alone does not yield a threshold. For the BP+OSD-CS decoder, however, a crossing is clearly visible, suggesting a threshold in the range 9.7 ± 0.2%. Within margin of error, the semi-topological code threshold aligns with the threshold for the Toric code using the same decoder. This is the expected behaviour, reflecting the fact that semi-topological codes become structurally similar to Toric codes (more local) as their distance is increased. Discrepancies in the crossing locations can be attributed to finite size effects.
[Figure legend: logical error rate $p_L$ for the [[400, 16, 6]], [[625, 25, 8]] and [[900, 36, 10]] codes, decoded with BP and with BP+OSD-CS.]
Random QLDPC codes - Figure 5 shows the results of numerical simulations of the BP+OSD decoder applied to the finite-rate family of random QLDPC codes summarised in Table I. The code distances considered are d = {6, 8, 10}. In contrast to the Toric and semi-topological codes, the BP decoder alone (before any OSD post-processing) shows a crossing, pointing to a threshold in the range 6.5 ± 0.1%. The existence of this threshold for the BP decoder can be attributed to the fact that random QLDPC codes are less structured than Toric and semi-topological codes; the repeating patterns present in stabilizer checks of topological codes lead to high densities of degenerate errors that cause BP to fail. The full BP+OSD-CS decoder applied to the random QLDPC family results in a threshold in the range 7.1 ± 0.1%. Whilst this threshold value is only a modest improvement over BP, the real benefit of the OSD post-processing for random QLDPC codes becomes apparent in the low-error regime; at p = 0.1, for example, the logical error rate $p_L$ for the BP+OSD-CS decoder is approximately an order of magnitude less than that for BP.

VII. SUMMARY
Quantum LDPC codes have traditionally been studied as local topological codes or non-local random codes. In this paper we introduce semi-topological codes as a means of interpolating on the local to non-local QLDPC spectrum. Previously, the practicality of QLDPC codes has been hindered by the lack of a general purpose decoder: designing a new family of QLDPC codes would necessitate the development of a special-purpose decoding strategy [19,20]. In this paper, we provide further evidence that the recently proposed BP+OSD decoder [22] applies to all QLDPC codes constructed via the hypergraph product, including our new family of semi-topological codes.
The methods for constructing semi-topological codes proposed in this paper allow the locality of QLDPC codes to be balanced against other factors such as code rate. The existence of a general purpose BP+OSD decoder for QLDPC codes grants quantum computer architects more freedom in the design of fault tolerant quantum computers; modifications to the structure of a QLDPC code can be made according to the demands of a given device, without compromising the practicality of their decoding.
All of the simulations in this work were run under the assumption that the syndrome measurements are noiseless. In reality, syndrome extraction is performed using ancilla qubits with imperfect readout. In work currently in preparation, we study the performance of the BP+OSD decoder for higher dimensional hypergraph product codes with the single-shot property [53][54][55], designed with in-built protection against syndrome noise.
Since our semi-topological codes contain local patches of surface code, it would be useful to determine whether other QLDPC codes can be modified to contain such patches. For instance, Panteleev and Kalachev [22] constructed a [[1270, 28, d]] code (with unknown d) that was especially competitive with surface codes. However, this code was constructed using a generalised hypergraph product and it is unclear whether an analog of our edge-augmentation process can be applied to this more general code family.
QuantERA ERA-NET Cofund in Quantum Technologies implemented within the European Union's Horizon 2020 Programme. EC is additionally supported by the Engineering and Physical Sciences Research Council (EP/M024261/1). DW was supported by a research grant from Huawei. We thank Armanda Quintavalle for related discussions and comments throughout the project. The authors are grateful for the use of the following open source software packages: Software for LDPC codes [56]; Scipy [57]; Numpy [58]; Matplotlib [59].

SOFTWARE
The code for the BP+OSD decoder used for the simulations ...
... nodes $V$ and the augmented data nodes $V_g$. We let $A$ denote a subset of data qubits $A \subseteq V \cup V_g$ that corresponds to a codeword of the classical code, which is the case if and only if every check node in $G^{\star g}$ has an even number of graph neighbours in the set $A$. Furthermore, because the augmented data nodes $V_g$ are all degree two, for each graph chain segment either all the data nodes are in $A$ or none of them are. Moreover, for every parent data node $a \in A \cap V$, it follows that every whole graph chain segment welded to $a$ must be in $A$. Using $\#\mathrm{chains}(A)$ to denote the number of graph chain segments present in $A$, we have that
$|A| = |A \cap V| + g \cdot \#\mathrm{chains}(A)$. (A1)
Recall that each graph chain segment welds to one parent data node and one parent check node. Therefore, we can count the number of graph chain segments as follows
$\#\mathrm{chains}(A) = \sum_{a \in A \cap V} \deg(a)$.
In the graph $G^{\star g}$, the parent data nodes have the same degree as they did in the original graph $G$ and by assumption this is lower bounded by $\mu$. Therefore, we have $\#\mathrm{chains}(A) \geq \mu |A \cap V|$ and so
$|A| \geq (1 + g\mu)|A \cap V|$. (A2)
Next, we observe that $A$ can only be a codeword with respect to graph $G^{\star g}$ if $A \cap V$ is a codeword with respect to graph $G$, which entails that
$|A \cap V| \geq d$. (A3)
Let us break this observation down into steps. Assume to the contrary that $|A \cap V| < d$, so that with respect to graph $G$ there is a parent check node $c \in U$ such that it has an odd number of neighbours in $A \cap V$. Furthermore, there will be an odd number of edges connecting $c$ to $A \cap V$ in graph $G$. Each one of these edges maps to a graph chain segment in $A$ in the augmented graph, each of which welds to check node $c$. Therefore, check node $c$ also has an odd number of neighbours with respect to graph $G^{\star g}$. This is impossible when $A$ is a codeword in graph $G^{\star g}$, so we must have that Eq. (A3) holds. Combining Eq. (A2) and Eq. (A3) gives $|A| \geq (1 + g\mu)d$ for any codeword $A$ in $G^{\star g}$, so this gives a lower bound on $d'$.
ratios for probabilities and the incorporation of variable scaling to prevent runaway values. Belief Propagation calculates marginal probabilities over graphical probabilistic models, a form of statistical inference, and is widely applied to the decoding of classical error-correcting codes. In the quantum domain the decoding task differs slightly in that, rather than trying to infer the original codeword from the received message, we are given the syndrome indicating whether a given "parity check" failed and must infer a recovery operator; we must also cope with quantum degeneracy. Despite these differences, the task of quantum error correction can be reformulated as a classical syndrome-based decoding problem. Unfortunately, syndrome-based decoding is not common in the classical decoding literature and there are few good references on the topic.
A more significant difference when applying BP to quantum codes is that all CSS and non-CSS QLDPC codes have factor graphs of girth four; originally BP was designed to work on acyclic graphs, but these factor graphs contain short cycles. Whilst this violates the invariants of the algorithm and hence the proof of its correctness, empirically BP has been found to perform surprisingly well on cyclic graphs. Although it may sometimes fail to converge to a feasible solution, we can detect this by checking that its output satisfies the syndrome equation.

Formulating QEC Decoding for BP
A QEC factor graph has data nodes representing each bit in the error string, which we denote $v_j$. It has one "check" or "parity" node for each syndrome measurement, which we denote $u_i$. The graph is described by the parity check matrix $H$ (whether it concerns X or Z errors alone, or both, is immaterial; the methodology is the same). A one at position $(i, j)$ in $H$ indicates that parity node $u_i$ has an edge directly connecting it to data node $v_j$.
There are two forms of prior information we must incorporate into the graph: the error rate of the channel, $p$, and the syndrome $s$. The error rate is incorporated as a hidden input to the data nodes. The syndrome measurement is implicitly present in the graph via calculations made at the parity nodes.
BP is conceptualised as a message passing algorithm. We denote a message from a parity node to a data node as $m_{u_i \to v_j}$ and from a data node to a parity node as $m_{v_j \to u_i}$. As we possess only the syndrome, and not the received codeword, the factor graph for QEC is slightly different from the standard graph found in classical decoding - but it is indeed equivalent to the (rarely discussed) syndrome-based classical decoding.
Our task is as follows: given the syndrome s and the structure represented by the factor graph, what is the most likely value of each bit in the error string?

Algorithm Description
Algorithm 1 Pseudocode for Belief Propagation using log likelihood ratios, the min-sum product algorithm, and a scaling factor. Log likelihood ratios and the min-sum algorithm (the use of w in Line 13) make the computation more efficient and avoid the numerical instability of other implementations.
for iter ← 1 to max do
...
$w \leftarrow \min_{v'_j \in V(u_i) \setminus v_j} \{|m_{v'_j \to u_i}|\}$
13: $m_{u_i \to v_j} = (-1)^{s_i} \, \alpha \left( \prod_{v'_j \in V(u_i) \setminus v_j} \mathrm{sign}(m_{v'_j \to u_i}) \right) w$
...
return False, $e_{BP}$, $P_1$
The pseudocode for our implementation is given in Algorithm 1, and consists of four sequential steps:
1. Initialisation. Messages are sent from data nodes to parity nodes giving the a priori probability of that bit in the error string being a one, i.e. the LLR (log likelihood ratio) of the channel error rate $p$, which we denote $p_l$ in its log likelihood form:
$p_l = \log((1 - p)/p)$ (C1)

2. Parity nodes to data nodes
Messages are sent from parity nodes to data nodes containing the marginal probability of an error at the destination data node. However, we implement several optimisations that somewhat complicate the calculation of this message. Denoting the neighbouring data nodes of a given parity node $u_i$ as $V(u_i)$, the messages sent are:
$m_{u_i \to v_j} = (-1)^{s_i} \, \alpha \left( \prod_{v'_j \in V(u_i) \setminus v_j} \mathrm{sign}(m_{v'_j \to u_i}) \right) \min_{v'_j \in V(u_i) \setminus v_j} |m_{v'_j \to u_i}|$
The set minus notation in the subscripts indicates that this is a marginal distribution, i.e. we consider only the probabilities from other data nodes when calculating the marginal for this bit. The sign function and the leading factor $(-1)^{s_i}$ are used to incorporate the syndrome, with $s_i$ being the $i$-th bit of the syndrome. In other words: "consider all configurations of connected error bits, and increase the probability of the implied value for this bit compatible with the observed syndrome." The first factor is an XOR operation that establishes the sign of this probability, i.e. whether u i is implied to be a one or a zero, based on the decision represented by the messages sent by other data bits. The second factor describes the magnitude of the probability, and is based on the notion that the 'cheapest' way that this value of u i could be incorrect is if one of the other bits was flipped. For a full explanation, see [11].
We also include $\alpha$, a scaling factor as outlined in [52]. Its value is set according to the current iteration iter, where the first iteration is numbered iter = 1.

3. Data nodes to parity nodes
Next, messages are sent from data nodes to parity nodes giving the probability ratio for that bit in the error string, calculated by summing the inbound marginals and taking into account the error rate for the channel, omitting normalisation for efficiency:
$m_{v_j \to u_i} = p_l + \sum_{u'_i \in U(v_j) \setminus u_i} m_{u'_i \to v_j}$
where we denote the neighbouring parity nodes of a given data node $v_j$ as $U(v_j)$.
4. Termination check. If the factor graph is a tree, we can always terminate after a single iteration of the algorithm. If it is cyclic (as in QEC), then we will terminate on success or else when a given number of iterations are complete. We first calculate a "hard decision" of the most likely error string, by selecting the most likely configuration via the bitwise marginals we have calculated:
$P_1(v_j) = p_l + \sum_{u_i \in U(v_j)} m_{u_i \to v_j}$
We then select the most likely error string $\tilde{E}$ given these bitwise probabilities, and calculate the expected syndrome:
$\tilde{s} = H \cdot \tilde{E}$
We terminate if $\tilde{s}$ matches the measured syndrome $s$, or if we have reached a preset maximum number of iterations (often equal to the block length). Otherwise, we return to Step 2, sending 'parity nodes to data nodes' messages.
The outputs of BP are both the soft and hard decisions; the former are used by OSD if BP has failed to converge, i.e. the hard decision does not satisfy the syndrome equation.The soft decision is a bit-wise estimate of the probability an error occurred, which OSD uses to bias its search for an error string.
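The four steps above can be sketched in Python as follows. This is a minimal dense implementation under stated assumptions rather than the code used for our simulations: the function name `bp_min_sum`, the edge-list data layout, and the default schedule `alpha = 1 - 2**(-iter)` are all illustrative choices, and the hard decision sets a bit to one when its log likelihood ratio is non-positive, which follows from the sign convention of Eq. (C1). Every parity node is assumed to have at least two neighbours.

```python
import numpy as np

def bp_min_sum(H, s, p, max_iter=None, alpha_fn=lambda it: 1.0 - 2.0 ** (-it)):
    """Syndrome-based min-sum BP. Returns (converged, hard decision, soft decision).

    alpha_fn is an ASSUMED scaling schedule, used here only for illustration.
    """
    H = np.asarray(H, dtype=int)
    s = np.asarray(s, dtype=int)
    m, n = H.shape
    max_iter = n if max_iter is None else max_iter
    p_l = np.log((1 - p) / p)                 # channel LLR, Eq. (C1)

    rows, cols = np.nonzero(H)                # one entry per edge (u_i, v_j)
    d2p = np.full(len(rows), p_l)             # step 1: data -> parity messages
    p2d = np.zeros(len(rows))                 # parity -> data messages

    for it in range(1, max_iter + 1):
        alpha = alpha_fn(it)
        # Step 2: parity -> data, using all *other* incoming messages.
        for i in range(m):
            edges = np.flatnonzero(rows == i)
            for e in edges:
                others = d2p[edges[edges != e]]
                sign = np.prod(np.sign(others))
                w = np.min(np.abs(others))
                p2d[e] = (-1) ** int(s[i]) * alpha * sign * w
        # Soft decision: channel LLR plus all inbound parity messages.
        soft = np.full(n, p_l)
        np.add.at(soft, cols, p2d)
        # Step 3: data -> parity excludes the message received on that edge.
        d2p = soft[cols] - p2d
        # Step 4: hard decision and syndrome check.
        e_bp = (soft <= 0).astype(int)
        if np.array_equal(H @ e_bp % 2, s):
            return True, e_bp, soft
    return False, e_bp, soft

# Toy usage on a three-bit repetition code (tree-like factor graph).
H_toy = np.array([[1, 1, 0],
                  [0, 1, 1]])
converged, e_bp, soft = bp_min_sum(H_toy, np.array([1, 0]), p=0.1)
print(converged, e_bp)
```

The returned triple mirrors the outputs described above: a convergence flag, the hard decision, and the soft decision that would be passed on to OSD.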
II. LOW DENSITY PARITY CHECK CODES
Classical error correction - A classical error correction code $C_H$ describes a redundant encoding $b \to c$ from a $k$-bit data string $b$ to an $n$-bit codeword $c$ (where $n > k$). The codewords $c \in C_H$ are defined as the nullspace vectors of an $m \times n$ binary parity check matrix

Figure 2 illustrates the first three levels of a (2, 3)-LDPC code family starting from a [3, 2, 2] parent code with parity check matrix $H = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}$. The factor graph of the parent code $G$ is shown in Figure 2a.
Figure 2b shows the g-augmented graph $G^{\star 1}$ with g = 1 and code parameters [9, 2, 6]. Here, each edge in the parent graph $G$ has been augmented with a length-1 graph chain segment, the nodes of which are coloured red. Figure 2c is the g-augmented graph $G^{\star 2}$ corresponding to a code with parameters [15, 2, 10].
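The code parameters quoted for Figure 2 can be checked numerically with a short Python sketch. The construction in `augment` reflects one reading of the edge-augmentation procedure (each edge is replaced by a chain of g new check nodes and g new data nodes, all of degree two); the function names and the brute-force distance search are assumptions made for illustration and are only practical for very small codes.

```python
import itertools
import numpy as np

def augment(H, g):
    """Replace every edge (check i, data j) of the factor graph of H by a
    chain  j - c_1 - v_1 - ... - c_g - v_g - i  (assumed construction)."""
    m, n = H.shape
    edges = [(i, j) for i in range(m) for j in range(n) if H[i, j]]
    Hg = np.zeros((m + g * len(edges), n + g * len(edges)), dtype=np.uint8)
    next_check, next_data = m, n
    for i, j in edges:
        prev = j                            # start at the parent data node
        for _ in range(g):
            Hg[next_check, prev] = 1        # new degree-two check node ...
            Hg[next_check, next_data] = 1   # ... linking to a new data node
            prev = next_data
            next_check += 1
            next_data += 1
        Hg[i, prev] = 1                     # chain end welds to the parent check
    return Hg

def code_parameters(H):
    """Brute-force [n, k, d] of the classical code with parity check matrix H."""
    n = H.shape[1]
    n_codewords, d = 0, n + 1
    for bits in itertools.product((0, 1), repeat=n):
        v = np.array(bits, dtype=np.uint8)
        if not (H @ v % 2).any():           # v lies in the nullspace of H
            n_codewords += 1
            weight = int(v.sum())
            if 0 < weight < d:
                d = weight
    k = n_codewords.bit_length() - 1        # number of codewords is 2^k
    return n, k, d

H_parent = np.array([[1, 1, 1],
                     [1, 1, 1]], dtype=np.uint8)
for g in (0, 1, 2):
    print(g, code_parameters(augment(H_parent, g)))
# 0 (3, 2, 2)
# 1 (9, 2, 6)
# 2 (15, 2, 10)
```

For this parent code every data node has degree two, so the resulting distances 6 and 10 coincide with the values (1 + 2g)d for d = 2.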

3. Reorder the columns of the parity check matrix, $H \to H_{[O_{BP}]}$, according to the ranking $[O_{BP}]$. Select the first rank($H$) linearly independent columns of $H_{[O_{BP}]}$ as the most-probable basis-set $[S]$.
4. Calculate the OSD-0 solution on the basis-bits by matrix inversion, $e_{[S]} = H_{[S]}^{-1} \cdot s$.
5. The OSD-0 solution across all bits is given by $e_{[S,T]} = (e_{[S]}, e_{[T]}) = (e_{[S]}, 0)$, where we define the remainder-set $[T]$ as the bits which are not in the basis-set, $[T] \notin [S]$. The OSD-0 solution will always satisfy the syndrome equation $H_{[S,T]} \cdot e_{[S,T]} = s$.
6. Map the OSD-0 solution to the original bit ordering, $e_{[S,T]} \to e_{OSD-0}$.
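The OSD-0 steps listed above can be sketched in Python using Gaussian elimination over GF(2). This is an illustrative sketch rather than the released decoder: the function name `osd0` is hypothetical, bits are ranked by ascending log likelihood ratio (an assumption, so that the bits most likely to be in error come first), and the syndrome is assumed to lie in the column space of H.

```python
import numpy as np

def osd0(H, s, soft):
    """OSD-0 solution for syndrome s, given BP soft decisions (LLRs)."""
    H = np.asarray(H, dtype=np.uint8) % 2
    s = np.asarray(s, dtype=np.uint8) % 2
    m, n = H.shape

    # Rank the bits and reorder the columns of H accordingly.
    order = np.argsort(soft, kind="stable")
    A = np.concatenate([H[:, order], s[:, None]], axis=1)  # [H_ordered | s]

    # Row-reduce; pivot columns are the first linearly independent
    # columns in the ranked order, i.e. the basis-set [S].
    pivots, r = [], 0
    for c in range(n):
        if r == m:
            break
        candidates = np.flatnonzero(A[r:, c]) + r
        if candidates.size == 0:
            continue                       # column depends on earlier ones
        A[[r, candidates[0]]] = A[[candidates[0], r]]
        for rr in np.flatnonzero(A[:, c]):
            if rr != r:
                A[rr] ^= A[r]              # eliminate column c elsewhere
        pivots.append(c)
        r += 1

    # Basis bits take the reduced syndrome; remainder bits [T] are zero.
    e_ordered = np.zeros(n, dtype=np.uint8)
    e_ordered[pivots] = A[:len(pivots), n]

    # Map back to the original bit ordering.
    e = np.zeros(n, dtype=np.uint8)
    e[order] = e_ordered
    return e

H_toy = np.array([[1, 1, 0],
                  [0, 1, 1]])
s_toy = np.array([1, 0])
print(osd0(H_toy, s_toy, soft=np.array([-0.3, 2.2, 3.0])))
print(osd0(H_toy, s_toy, soft=np.array([3.0, 2.2, -0.3])))
```

Both calls return a vector satisfying the syndrome equation; which one is returned depends only on the ranking supplied by the soft decisions.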

Note that the above solution satisfies the syndrome relation $H_{[S,T]} \cdot e_{[S,T]} = s$ for all possible configurations of $e_{[T]}$. A higher order OSD routine involves searching over different values of $e_{[T]}$ to find the OSD solution with the lowest Hamming weight $\min(|e_{[S,T]}|)$. The length of the $e_{[T]}$

FIG. 3. Toric code threshold plot comparing the BP decoder (dashed lines) versus the BP+OSD-CS decoder (solid lines). The logical error rate $p_L$ is plotted against the physical error rate p for code distances d = {9, 11, 13, 15}. For this simulation, the search depth parameter for the greedy search 'combination sweep strategy' is set to λ = 60.

FIG. 4. Threshold plot for the semi-topological codes constructed from a family of augmented codes (see Table II for the code parameters). The logical error rate $p_L$ is plotted against the physical error rate p for code distances d = {6, 10, 14, 18}. The search depth parameter for the greedy search combination sweep strategy is set to λ = 60.

FIG. 5. Threshold plots for the family of constant rate QLDPC codes listed in Table I. The logical error rate $p_L$ is plotted against the physical error rate p for code distances d = {6, 8, 10}. The search depth parameter for the greedy search combination sweep strategy is set to λ = 60.

FIG. 6. Comparison of the BP+OSD-E and BP+OSD-CS methods when applied to the distance d = 15 Toric code. The λ value for BP+OSD-E is set to λ = 12, leading to a total of 4096 inputs to the encoding operator defined in equation (8). For BP+OSD-CS, the λ value is set to λ = 86, leading to 3881 inputs to the encoding operator.
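The input counts quoted in the caption of Fig. 6 can be reproduced with simple arithmetic. The sketch below rests on assumptions that are not spelled out in this section: that OSD-E enumerates every configuration of the first λ remainder bits, that combination sweep tries every weight-one configuration of the remainder bits together with every weight-two configuration within the first λ of them, and that the distance-d Toric code has n = 2d² bits with a check matrix of rank d² − 1. The function names are hypothetical.

```python
from math import comb

def osd_e_inputs(lam):
    """Exhaustive search: every configuration of the first lam remainder bits."""
    return 2 ** lam

def osd_cs_inputs(n_remainder, lam):
    """Combination sweep (assumed form): all weight-one configurations of the
    remainder bits plus all weight-two configurations in the first lam bits."""
    return n_remainder + comb(lam, 2)

d = 15
n = 2 * d * d             # assumed block length of the distance-15 Toric code
rank_H = d * d - 1        # assumed rank of the check matrix
n_T = n - rank_H          # size of the remainder-set [T]

print(osd_e_inputs(12))        # 4096
print(osd_cs_inputs(n_T, 86))  # 3881
```

Under these assumptions the two searches in Fig. 6 process a comparable number of candidate configurations, which is what makes the comparison between them meaningful.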

TABLE III. Observed thresholds for numerical simulations of the BP+OSD decoder applied to Toric, semi-topological and random QLDPC codes.