Character randomized benchmarking for non-multiplicity-free groups with applications to subspace, leakage, and matchgate randomized benchmarking

Randomized benchmarking (RB) is a powerful method for determining the error rate of experimental quantum gates. Traditional RB, however, is restricted to gatesets, such as the Clifford group, that form a unitary 2-design. The recently introduced character RB can benchmark more general gates using techniques from representation theory; up to now, however, this method has only been applied to "multiplicity-free" groups, a mathematical restriction on these groups. In this paper, we extend the original character RB derivation to explicitly treat non-multiplicity-free groups, and derive several applications. First, we derive a rigorous version of the recently introduced subspace RB, which seeks to characterize a set of one- and two-qubit gates that are symmetric under SWAP. Second, we develop a new leakage RB protocol that applies to more general groups of gates. Finally, we derive a scalable RB protocol for the matchgate group, a group that, like the Clifford group, is non-universal but becomes universal with the addition of a single gate. This provides one of the few examples of a scalable non-Clifford RB protocol. In all three cases, compared to existing theories, our method requires similar resources, but either provides a more accurate estimate of gate fidelity, or applies to a more general group of gates. In conclusion, we discuss the potential, and challenges, of using non-multiplicity-free character RB to develop new classes of scalable RB protocols and methods of characterizing specific gates.


I. INTRODUCTION
Advances in accurate and scalable methods for characterizing the performance of quantum gates are critical for the realization of large-scale reliable quantum computers. Quantum process tomography can, in theory, completely characterize an unknown quantum channel [1][2][3][4], but requires resources that scale exponentially in the number of qubits [4]. In addition, any tomographic approach will also include the effect of state preparation and measurement (SPAM) errors, which may be of the same order as the gate error that is being characterized.
Randomized benchmarking (RB) [5][6][7][8] provides a method to scalably characterize gates that form a group G with the additional mathematical property of being a "unitary 2-design" [9], most frequently the Clifford group [10][11][12]. Rather than completely characterizing a noise channel, RB determines the average fidelity, a standard measure of gate quality that can be related to other common measures such as entanglement and process fidelity [13,14] and used to bound the gate error rate [15]. RB works by experimentally measuring the overall fidelity of a random circuit as a function of the number of applied gates U ∈ G and fitting this to an exponential decay. The parameters of the decay then determine the average fidelity of a single gate. Unlike tomographic methods, RB provides an estimate for the average fidelity that is independent of SPAM errors.
Standard RB, however, is limited to groups that form a unitary 2-design and whose elements can be efficiently compiled (i.e. decomposed) into elementary gates. This limitation prevents standard RB from characterizing any set of quantum gates that is large enough to be universal for quantum computation [11,12], and also prevents standard RB from characterizing smaller subgroups of 2-designs. There are ongoing efforts to extend RB to a larger class of gates. Interleaved RB was proposed to characterize individual Clifford group elements [16] as well as the T-gates needed for universal quantum computation [17], but these methods are specific to the gates considered and only produce bounds on the fidelity. Ref. [18] developed a method to extract the fidelity of the dihedral group on one qubit, which is not a unitary 2-design and includes the T gate, while [19] proposed a method of extending dihedral RB to an arbitrary number of qubits. Refs. [20,21] extended this work by deriving decay formulas for the fidelity of random circuits of arbitrary groups, but these formulas involved fitting sums of multiple exponentials, and the decay parameters could not be related to the average fidelity. Ref. [22] introduced character RB to address these limitations, providing a method that only requires fitting a single exponential decay and directly predicts the average fidelity. However, this was only explored for "multiplicity-free" groups, a mathematical limitation on the group's representations (see below).
In this work, we provide a generalized derivation of character RB that applies to arbitrary groups. As in [22] but unlike [20,21], this method allows us to directly predict the average fidelity of the gates in G. For non-multiplicity-free groups, our method potentially requires fitting a sum of multiple exponentials rather than a single exponential; however, the number of exponentials is significantly reduced compared to [20,21]. Our primary motivation for this generalization is to improve the recently introduced subspace RB [23], designed to characterize gates that preserve a subspace of the full Hilbert space. Such gates can never form a 2-design, and are never multiplicity-free, necessitating a generalized RB procedure. The original work on subspace RB established decay formulas for the fidelity of certain random circuits but could only give loose bounds on the average fidelity of the gates; our method, in contrast, allows us to directly estimate the average fidelity using a similar number of experiments as the original subspace RB. As an additional application of our method, we present a new protocol for leakage RB [24][25][26], a benchmarking protocol designed to characterize qubits that can "leak" into a non-computational section of the Hilbert space, that reduces the assumptions on the benchmarking group compared to the original [26]. As a final application, we introduce a new scalable RB procedure for the matchgate group [27], a class of quantum circuits that, like the Clifford group, are efficiently simulatable [27][28][29][30] but are very close to universal [29][30][31][32][33][34][35]. This procedure necessarily requires the full non-multiplicity-free character RB, and represents, along with the dihedral group [19,22], one of the few non-Clifford groups that can be scalably benchmarked.
Non-multiplicity-free character RB is a general framework for benchmarking quantum groups. It provides a method for characterizing individual gates, described in Section IV, when the gates are components of operations that form a group. This powerful framework expands the family of groups that can be scalably benchmarked. Scalable benchmarking protocols are necessary to measure gate quality in large quantum processors, particularly to understand the effects of non-local errors such as crosstalk. While we provide one example of a scalable benchmarking protocol, for the matchgate group, we expect the framework of non-multiplicity-free character RB will lead researchers to develop further scalable examples. We discuss the potential of and some challenges for generating further examples. Benchmarking multiple overlapping groups (or subgroups of groups) may allow more accurate error characterization.
Our paper is organized as follows. Section II provides mathematical background on the Liouville representation and the definition of average fidelity. Section III outlines the full non-multiplicity-free RB protocol, and proves that it correctly estimates the average fidelity of the gates. The next sections consist of applications. Section IV demonstrates how our method can be used to rigorously estimate the fidelity of gate sets that preserve subspaces, such as those studied in [23]. Section V applies our framework to formulate a leakage RB protocol with fewer assumptions than the current state-of-the-art [26]. Section VI reviews the matchgate group, and describes how our method can be used to derive a scalable RB protocol for this group. We conclude in Section VII with discussion of possible extensions of our work, including some of the challenges. We relegate technical details to appendices, including Appendix B, which provides a self-contained and straightforward proof that generalizations of the Clifford group to qudits for d prime form a unitary 2-design, which may be of general interest.

II. MATHEMATICAL PRELIMINARIES
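In this paper, we use the Liouville representation of quantum channels. In the Liouville representation, given some fixed basis $\{|i\rangle\}$ of our Hilbert space $\mathcal{H}$, a density matrix $\rho = \sum_{ij}\rho_{ij}|i\rangle\langle j|$ is represented by a column vector $|\rho\rangle\rangle = \sum_{ij}\rho_{ij}|i\rangle\otimes|j\rangle$, where we use a double bracket $|\cdot\rangle\rangle$ to distinguish elements of $\mathcal{H}\otimes\mathcal{H}$ from elements of $\mathcal{H}$. In the case of a pure state $\rho = |\psi\rangle\langle\psi|$ we will also sometimes write $|\psi\rangle\rangle$ in place of $|\rho\rangle\rangle$. A quantum channel $\Lambda$ is represented by a matrix $\hat\Lambda$ acting on these vectors. In this representation, matrix multiplication corresponds to composition of channels, matrix-vector multiplication corresponds to applying a quantum channel, $\hat\Lambda|\rho\rangle\rangle = |\Lambda(\rho)\rangle\rangle$, and the inner product of two vectors corresponds to the Hilbert-Schmidt inner product of the corresponding density matrices, $\langle\langle\sigma|\rho\rangle\rangle = \mathrm{Tr}(\sigma^\dagger\rho)$.
In particular, if $M$ is a projector onto some measurement outcome, the overlap $\langle\langle M|\rho\rangle\rangle$ gives the probability of measuring $M$ from a state $\rho$. For a more detailed treatment of the Liouville representation, see [36].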
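The conventions above can be checked numerically. The following is a minimal sketch, assuming row-major vectorization (so that $|i\rangle\langle j|$ maps to $|i\rangle\otimes|j\rangle$) and using NumPy; the helper names `vec` and `liouville_unitary` are hypothetical and chosen only for illustration.

```python
import numpy as np

def vec(rho):
    """Row-major vectorization: sum_ij rho_ij |i><j|  ->  sum_ij rho_ij |i>|j>."""
    return np.asarray(rho, dtype=complex).reshape(-1)

def liouville_unitary(U):
    """Liouville matrix U (x) U^* of the map rho -> U rho U^dagger."""
    return np.kron(U, U.conj())

rng = np.random.default_rng(0)
d = 2

# Random density matrix (positive semidefinite, unit trace).
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = A @ A.conj().T
rho = rho / np.trace(rho)

# Random unitary from the QR decomposition of a complex Gaussian matrix.
U, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))

# Measurement projector onto |0>.
M = np.diag([1.0, 0.0]).astype(complex)

# Matrix-vector multiplication reproduces the action of the channel.
channel_ok = np.allclose(liouville_unitary(U) @ vec(rho), vec(U @ rho @ U.conj().T))

# <<M|rho>> equals the Hilbert-Schmidt inner product Tr(M^dagger rho).
overlap = np.vdot(vec(M), vec(rho))        # vdot conjugates its first argument
probability = np.trace(M.conj().T @ rho)
```

With this convention, `overlap` is the probability of obtaining the outcome $M$ on the state $\rho$.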
Given a unitary group $G$ acting on our Hilbert space $\mathcal{H}$, the natural action of $U \in G$ on density matrices is given by $U(\rho) = U\rho U^\dagger$. In the Liouville representation, such an operator is represented by $\hat U = U \otimes U^*$. The map $\phi : U \to U \otimes U^*$ forms a representation [37] of the group $G$ on $\mathcal{H}\otimes\mathcal{H}$ that we will refer to as the natural representation of $G$. We can also define the $G$-twirl of a quantum channel $\Lambda$ as
$$\hat\Lambda_G = \frac{1}{|G|}\sum_{U\in G}\hat U^\dagger\hat\Lambda\hat U, \tag{1}$$
where $|G|$ is the order of the group. As we will see, $\Lambda_G$ has properties similar to the original channel $\Lambda$, but it has a simpler structure that makes it more tractable to study.
If a noisy implementation of a gate $U$ results in applying the channel $(\Lambda\circ U)$, we would like to characterize how close the noise channel $\Lambda$ is to the identity. We will focus on one common measure of noise, the average fidelity $F_\Lambda$, given by
$$F_\Lambda = \int d\psi\,\langle\langle\psi|\hat\Lambda|\psi\rangle\rangle. \tag{2}$$
Here, $d\psi$ is the unitary-invariant Haar or Fubini-Study measure on $\mathcal{H}$. The integrand $\langle\langle\psi|\hat\Lambda|\psi\rangle\rangle$ is the probability of preserving a state $|\psi\rangle$ after the noise operator $\Lambda$ has been applied. The average fidelity is then simply the average of this probability over all possible input states.

III. THE GENERALIZED CHARACTER RANDOMIZED BENCHMARKING PROCEDURE
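Let $G$ be the unitary group on $\mathcal{H}$ that we wish to benchmark. Let $\phi : G \to L(\mathcal{H}\otimes\mathcal{H})$ be its natural representation, which decomposes into irreducible representations as $\phi \simeq a_1\phi_1 \oplus \cdots \oplus a_I\phi_I$, where $a_i \in \mathbb{Z}^+$ is the multiplicity of the irrep $\phi_i$. Let $\mathcal{H}\otimes\mathcal{H} \simeq \bigoplus_i \mathbb{C}^{a_i}\otimes\mathcal{H}_i$ be the corresponding decomposition of Hilbert space, such that each $\phi_i$ acts nontrivially only on a single copy of $\mathcal{H}_i$. We will make the standard RB assumption that the gate error $\Lambda$ associated with $U \in G$ is independent of $U$, although this can be relaxed [22,38,39].
Let $\bar G \subseteq G$ be a subgroup of our unitary group with natural representation $\bar\phi \simeq \bar a_1\bar\phi_1 \oplus \cdots \oplus \bar a_{\bar I}\bar\phi_{\bar I}$ and corresponding decomposition $\mathcal{H}\otimes\mathcal{H} \simeq \bigoplus_{\bar i}\mathbb{C}^{\bar a_{\bar i}}\otimes\bar{\mathcal{H}}_{\bar i}$. Note that there is not in general any relation between $\bar I$ and $I$, or the multiplicities $\bar a_{\bar i}$ and $a_i$. We choose $\bar G$ such that for every $i \in \{1, \dots, I\}$, there exists a corresponding $\bar i \in \{1, \dots, \bar I\}$ such that $\mathbb{C}^{\bar a_{\bar i}}\otimes\bar{\mathcal{H}}_{\bar i} \subseteq \mathbb{C}^{a_i}\otimes\mathcal{H}_i$. One may always choose $\bar G = G$, but we will see below that for this procedure to scale with the number of qubits we must choose $\bar G \subsetneq G$. We denote the character of the irrep $\bar\phi_{\bar i}$ by $\bar\chi_{\bar i}$.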
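Our RB procedure consists of the following steps:
1. For each $i \in \{1, \dots, I\}$, choose an initial state $|\rho_i\rangle\rangle$ and measurement projector $|M_i\rangle\rangle$ such that $|\langle\langle M_i|\hat P_{\bar i}|\rho_i\rangle\rangle|$ is as large as possible (see Section III C below), where $\hat P_{\bar i}$ is the projector onto $\bar{\mathcal{H}}_{\bar i}$.
2. For a given $N$, choose unitaries $U_0 \in \bar G$ and $U_1, \dots, U_N \in G$ randomly and uniformly (note elements can be repeated).
3. Compute the inverse $U_{N+1} = (U_N\cdots U_1)^\dagger$.
4. Prepare $|\rho_i\rangle\rangle$, apply the gates $(U_1U_0), U_2, \dots, U_{N+1}$ sequentially, where $(U_1U_0)$ is compiled as a single element of $G$, and measure $M_i$.
5. Repeat steps 2-4 many times, to estimate the character-weighted survival probability for each $i$,
$$S_i(N) = \frac{1}{|\bar G||G|^N}\sum_{U_0\in\bar G}\,\sum_{U_1,\dots,U_N\in G}\bar\chi^*_{\bar i}(U_0)\,\Pr\nolimits_{U_0,\dots,U_{N+1}}, \tag{3}$$
where $\Pr_{U_0,\dots,U_{N+1}}$ is the probability of measuring $|M_i\rangle\rangle$ after applying gates $U_0, \dots, U_{N+1}$ to $|\rho_i\rangle\rangle$, including the effect of gate and SPAM errors.
6. Repeat steps 2-5 for different values of $N$.
7. Fit each character-weighted survival probability to a function of the form
$$S_i(N) = \sum_{j=1}^{a_i}C_{i,j}\lambda_{i,j}^N, \tag{4}$$
where the $C_{i,j}$ and $\lambda_{i,j}$ are fitting parameters independent of $N$.
8. Estimate the average fidelity of the gate error $\Lambda$ as
$$F_\Lambda = \frac{\sum_{i=1}^{I}\dim(\mathcal{H}_i)\sum_{j=1}^{a_i}\lambda_{i,j} + d}{d^2+d}, \tag{5}$$
where $d := 2^n$ is the dimension of Hilbert space.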
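The final step of the procedure can be sketched in a few lines. This is a minimal illustration, assuming the fitted decay rates are combined as $F_\Lambda = (\sum_i \dim(\mathcal{H}_i)\sum_j\lambda_{i,j} + d)/(d^2+d)$, which follows from the trace relation derived in Section III B; the function name `average_fidelity` and its input format are hypothetical.

```python
def average_fidelity(decays, d):
    """Combine fitted decay rates into an average fidelity.

    decays: list of (dim_H_i, [lambda_i1, ..., lambda_ia_i]) pairs, one per irrep,
            where dim_H_i is the dimension of a single copy of the irrep.
    d:      dimension of the Hilbert space.
    """
    total_dim = sum(dim * len(lams) for dim, lams in decays)
    if total_dim != d * d:
        # The irreps (with multiplicity) must fill the d^2-dimensional operator space.
        raise ValueError("irrep dimensions must sum to d**2")
    trace = sum(dim * sum(lams) for dim, lams in decays)
    return (trace + d) / (d * d + d)

# Example: one qubit benchmarked with a unitary 2-design (multiplicity-free case),
# with a trivial irrep (dimension 1, lambda = 1) and a traceless irrep (dimension 3).
f = average_fidelity([(1, [1.0]), (3, [0.98])], d=2)
```

For this example the formula reduces to the familiar single-qubit result $(1+p)/2$ with $p = 0.98$.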
A similar RB procedure was first proposed in [22] for groups with all $a_i = 1$, the so-called multiplicity-free groups. In this case, each character-weighted survival probability becomes a single exponential decay. Character RB had been previously proposed for the multiplicity-free dihedral group on one qubit [18], and a related approach has been used to simplify standard RB [40]. The idea of including an initial gate $U_0$ and weighting by characters to isolate exponential decays has also been independently proposed in [41].
We note that if we omit the initial gate $U_0$ and the character-weighting $\bar\chi^*_{\bar i}(U_0)$, we get the method of [19][20][21]; in this case, we get a single survival probability $S(N)$ that is given by $S(N) = \sum_{i,j}C_{i,j}\lambda_{i,j}^N$. Determining the $\lambda_{i,j}$ then requires fitting all the parameters $C_{i,j}$ and $\lambda_{i,j}$ simultaneously, and quickly becomes infeasible for a modestly large number of parameters. We see that while both our method and the method of [19][20][21] involve simultaneously fitting multiple exponential decays, our method significantly reduces the number of parameters in each fit. For example, if $\phi \simeq 2\phi_1 \oplus \phi_2 \oplus \phi_3$, our method requires fitting three functions, corresponding to $\phi_1$, $\phi_2$, and $\phi_3$, where the first function is a sum of two exponential decays and the latter two functions are single exponential decays. In contrast, [19][20][21] require fitting a single function that is the sum of four exponential decays, one for each copy of each irrep. In addition, the method of [19][20][21] cannot determine $F_\Lambda$; this is because it is not possible to match the observed parameters $\{\lambda_{i,j}\}$ to their corresponding $\mathcal{H}_i$ in order to use Eq. 5.
The remainder of this section is devoted to deriving this procedure, for groups that are not necessarily multiplicity-free.

A. Deriving the decays
To derive the form of the character-weighted survival, Eq. 4, we will need two facts from representation theory.
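Fact 1 (Schur's Lemma). Let $\phi : G \to L(V)$ be a representation of a group $G$ on a vector space $V$, which decomposes into irreducible representations as $\phi \simeq a_1\phi_1 \oplus \cdots \oplus a_I\phi_I$, where $a_i \in \mathbb{Z}^+$ are positive integers. The corresponding decomposition of $V$ is $V \simeq \bigoplus_i \mathbb{C}^{a_i}\otimes V_i$. In terms of this decomposition, any linear map $\hat\eta \in L(V)$ satisfying $\hat\eta\phi(U) = \phi(U)\hat\eta$ for all $U \in G$ is of the form
$$\hat\eta = \bigoplus_i \hat Q_i \otimes 1_{V_i}, \tag{6}$$
where $\hat Q_i$ is some $a_i \times a_i$ matrix for each $i$.
Fact 2 (Projection formula). Let $\phi$ and $V$ be as above. Given an irrep $\phi_i : G \to L(V_i)$, define the character $\chi_i(U) = \mathrm{Tr}(\phi_i(U))$. Then we can write the projector onto $\mathbb{C}^{a_i}\otimes V_i$ as
$$\hat P_i = \frac{\dim(V_i)}{|G|}\sum_{U\in G}\chi_i^*(U)\,\phi(U). \tag{7}$$
For proofs of both facts, see [37].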
Given these results, we can prove the key property of G-twirls that allows us to compute the average fidelity.
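Theorem 1 (Form of G-twirls). If $G$ is any unitary group acting on $\mathcal{H}$, let $\phi \simeq a_1\phi_1 \oplus \cdots \oplus a_I\phi_I$ be the decomposition of the natural representation into irreps, and let $\mathcal{H}\otimes\mathcal{H} \simeq \bigoplus_i \mathbb{C}^{a_i}\otimes\mathcal{H}_i$ be the corresponding decomposition of $\mathcal{H}\otimes\mathcal{H}$. If $\Lambda$ is any quantum channel, the $G$-twirl of $\Lambda$ is of the form
$$\hat\Lambda_G = \bigoplus_i \hat Q_i \otimes 1_{\mathcal{H}_i}, \tag{8}$$
where $\hat Q_i$ is defined as in Fact 1.
Proof. We apply Eq. 1 to observe that
$$\hat U\hat\Lambda_G = \hat\Lambda_G\hat U$$
for any $U \in G$. We can then apply Fact 1.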
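The commutation property used in the proof can be illustrated numerically. The sketch below assumes the twirl is defined as $\hat\Lambda_G = |G|^{-1}\sum_U \hat U^\dagger\hat\Lambda\hat U$ and uses the single-qubit Paulis $\{1, X, Y, Z\}$ as the benchmarking set; these four matrices close only up to phases, but their Liouville images $U\otimes U^*$ form a group because the phases cancel. The amplitude-damping noise and the helper names are illustrative choices, not part of the protocol.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.diag([1, -1]).astype(complex)
paulis = [I2, X, Y, Z]

def liouville(U):
    """Liouville matrix of rho -> U rho U^dagger (row-major convention)."""
    return np.kron(U, U.conj())

def twirl(L, group):
    """Average of U^dagger L U over the Liouville images of the group elements."""
    return sum(liouville(U).conj().T @ L @ liouville(U) for U in group) / len(group)

# Example noise: amplitude damping with damping probability gamma.
gamma = 0.1
K0 = np.array([[1, 0], [0, np.sqrt(1 - gamma)]], dtype=complex)
K1 = np.array([[0, np.sqrt(gamma)], [0, 0]], dtype=complex)
L = np.kron(K0, K0.conj()) + np.kron(K1, K1.conj())

LG = twirl(L, paulis)

# The twirled channel commutes with every group element ...
commutes = all(np.allclose(LG @ liouville(U), liouville(U) @ LG) for U in paulis)
# ... and twirling leaves the trace (hence the average fidelity) unchanged.
same_trace = np.isclose(np.trace(LG), np.trace(L))
```

Because the twirled matrix commutes with the whole representation, Schur's Lemma fixes its block structure, which is the content of Theorem 1.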
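We are now ready to derive the formula for the character-weighted survival probability $S_i(N)$. This proof follows the logic of [22], adapted for non-multiplicity-free groups. Writing out Eq. 3 explicitly, including the effect of preparation and measurement errors $\Lambda_P$ and $\Lambda_M$, we have
$$S_i(N) = \frac{1}{|\bar G||G|^N}\sum_{U_0\in\bar G}\,\sum_{U_1,\dots,U_N\in G}\bar\chi^*_{\bar i}(U_0)\,\langle\langle M_i|\hat\Lambda_M\,\hat\Lambda\hat U_{N+1}\,\hat\Lambda\hat U_N\cdots\hat\Lambda\hat U_1\hat U_0\,\hat\Lambda_P|\rho_i\rangle\rangle.$$
The sum over $U_0$ gives the projection $|\bar G|\hat P_{\bar i}/\dim(\bar{\mathcal{H}}_{\bar i})$ according to Eq. 7. To do the sum over $U_1, \dots, U_N$, we can define new group elements $D_1, \dots, D_N$ by $D_i = U_i\cdots U_1$. In terms of the $D_i$, we then have $U_i = D_iD_{i-1}^\dagger$, with the convention that $D_{N+1} = 1$ (and $D_0 = 1$). Note that summing over $U_1, \dots, U_N$ is the same as summing over $D_1, \dots, D_N$. We therefore may write
$$S_i(N) = \frac{1}{\dim(\bar{\mathcal{H}}_{\bar i})\,|G|^N}\sum_{D_1,\dots,D_N\in G}\langle\langle M_i|\hat\Lambda_M\hat\Lambda\,(\hat D_N^\dagger\hat\Lambda\hat D_N)\cdots(\hat D_1^\dagger\hat\Lambda\hat D_1)\,\hat P_{\bar i}\hat\Lambda_P|\rho_i\rangle\rangle.$$
We can now easily perform the sum over the $D_i$, since each sum just gives a $G$-twirl according to Eq. 1. Performing this sum, and using Thm. 1, gives
$$S_i(N) = \frac{1}{\dim(\bar{\mathcal{H}}_{\bar i})}\langle\langle M_i|\hat\Lambda_M\hat\Lambda\,\hat\Lambda_G^N\,\hat P_{\bar i}\hat\Lambda_P|\rho_i\rangle\rangle = \frac{1}{\dim(\bar{\mathcal{H}}_{\bar i})}\langle\langle M_i|\hat\Lambda_M\hat\Lambda\,(\hat Q_i^N\otimes 1_{\mathcal{H}_i})\,\hat P_{\bar i}\hat\Lambda_P|\rho_i\rangle\rangle,$$
where in the last line, we used the fact that the range of $\hat P_{\bar i}$ is included in $\mathbb{C}^{a_i}\otimes\mathcal{H}_i$. We see that the effect of the character-weighting is to produce a projector that restricts our attention to a single $i$. If we diagonalize $\hat Q_i$ as $\hat Q_i = \sum_{j=1}^{a_i}|e_{i,j}\rangle\lambda_{i,j}\langle e_{i,j}|$ with $\langle e_{i,j}|e_{i,j'}\rangle = \delta_{j,j'}$, then $\hat Q_i^N = \sum_{j=1}^{a_i}|e_{i,j}\rangle\lambda_{i,j}^N\langle e_{i,j}|$, and we may write the final form of $S_i(N)$ as
$$S_i(N) = \sum_{j=1}^{a_i}\left[\frac{1}{\dim(\bar{\mathcal{H}}_{\bar i})}\langle\langle M_i|\hat\Lambda_M\hat\Lambda\,(|e_{i,j}\rangle\langle e_{i,j}|\otimes 1_{\mathcal{H}_i})\,\hat P_{\bar i}\hat\Lambda_P|\rho_i\rangle\rangle\right]\lambda_{i,j}^N,$$
which is precisely the form given in Eq. 4. Notice that the $\lambda_{i,j}$ depend only on the gate error $\Lambda$, and not the SPAM errors $\Lambda_P$, $\Lambda_M$, which are absorbed into the constant prefactor.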

B. Computing the fidelity
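Finally, we prove the fidelity can be estimated according to Eq. 5. This was first derived in [21], although we will adopt a simpler proof here using techniques introduced in [13,14]. The key realization is that both the fidelity and the trace of a channel are invariant under twirling by an arbitrary group: $F_\Lambda = F_{\Lambda_G}$ and $\mathrm{Tr}(\hat\Lambda) = \mathrm{Tr}(\hat\Lambda_G)$ (see Eq. 1). In particular, if we choose $G$ to be the full unitary group it is known that the full twirl of a channel is simply a depolarizing channel [13,14][42]:
$$\hat\Lambda_G = p\,1 + \frac{1-p}{d}\,|1\rangle\rangle\langle\langle 1|. \tag{9}$$
In terms of the parameter $p$, we can directly compute
$$F_\Lambda = p + \frac{1-p}{d}.$$
Similarly, we can also directly compute $\mathrm{Tr}(\hat\Lambda_G) = pd^2 + (1-p)$. Combining these equations gives
$$F_\Lambda = \frac{\mathrm{Tr}(\hat\Lambda) + d}{d^2+d}. \tag{10}$$
To complete the proof, we note that $\mathrm{Tr}(\hat\Lambda)$ can be written in terms of the matrices $\hat Q_i$ in Eq. 8 as
$$\mathrm{Tr}(\hat\Lambda) = \sum_{i}\dim(\mathcal{H}_i)\,\mathrm{Tr}(\hat Q_i) = \sum_{i}\dim(\mathcal{H}_i)\sum_{j=1}^{a_i}\lambda_{i,j},$$
which, combined with Eq. 10, gives Eq. 5 as desired.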

C. Scaling and Feasibility
We note that experimentally determining $S_i(N)$ requires Monte Carlo sampling of $U_0, U_1, \dots, U_N$. Each term in this sample is bounded by $\max_{U_0\in\bar G}|\bar\chi_{\bar i}(U_0)| = \dim(\bar{\mathcal{H}}_{\bar i})$. Therefore, the standard deviation of the samples is bounded by $\dim(\bar{\mathcal{H}}_{\bar i})$, and the sample mean has uncertainty bounded by $\dim(\bar{\mathcal{H}}_{\bar i})/\sqrt{\text{no. samples}}$. To determine the relative uncertainty, we consider $S_i(N) \approx \sum_{j=1}^{a_i}C_{i,j}$, which is given by
$$\sum_{j=1}^{a_i}C_{i,j} \approx \frac{\langle\langle M_i|\hat P_{\bar i}|\rho_i\rangle\rangle}{\dim(\bar{\mathcal{H}}_{\bar i})},$$
where we've approximated $\Lambda, \Lambda_M, \Lambda_P \approx 1$. The relative uncertainty in $S_i(N)$ is therefore bounded by
$$\frac{\dim(\bar{\mathcal{H}}_{\bar i})^2}{|\langle\langle M_i|\hat P_{\bar i}|\rho_i\rangle\rangle|\,\sqrt{\text{no. samples}}}.$$
We see that to efficiently benchmark a group $G$, we must have $I$, $a_i$, and $\dim(\bar{\mathcal{H}}_{\bar i})$ all small. $I$ must be small so that we only need to estimate a small number of character-weighted survival probabilities $S_i(N)$, $a_i$ must be small so that we may fit a function with a small number of parameters, and $\dim(\bar{\mathcal{H}}_{\bar i})$ must be small for our Monte Carlo estimation of $S_i(N)$ to converge quickly. Note that for any $G$ the natural representation satisfies $\sum_{i=1}^{I}a_i\dim(\mathcal{H}_i) = 4^n$, where $n$ is the number of qubits, so that choosing $\bar G = G$ will not suffice if the number of qubits is large. In particular, to scalably benchmark a group, we must choose $G$ so that the number of irreps $I$ grows slowly with $n$, the multiplicity $a_i$ of each irrep is bounded by a small constant, and $\bar G$ has corresponding irreps $\bar{\mathcal{H}}_{\bar i}$ whose dimension grows slowly with $n$. These scaling considerations are similar to those discussed in [22] for multiplicity-free RB, except in our case we allow $a_i$ to be bounded rather than strictly 1.
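The scaling argument can be turned into a rough sample-count estimate. The sketch below assumes that each sample is bounded in magnitude by the irrep dimension and that the signal is of order $|\langle\langle M_i|\hat P|\rho_i\rangle\rangle|$ divided by that dimension (its ideal, noise-free value), so the relative uncertainty is at most $\dim^2/(|\text{overlap}|\sqrt{\text{no. samples}})$. The function name and arguments are hypothetical.

```python
import math

def samples_needed(dim_irrep, overlap, rel_err):
    """Smallest sample count n with dim_irrep**2 / (|overlap| * sqrt(n)) <= rel_err.

    dim_irrep: dimension of the subgroup irrep used for the character weighting.
    overlap:   ideal overlap <<M|P|rho>> between measurement, projector, and state.
    rel_err:   target relative uncertainty of the survival probability.
    """
    if abs(overlap) == 0:
        raise ValueError("overlap must be nonzero")
    if rel_err <= 0:
        raise ValueError("rel_err must be positive")
    return math.ceil((dim_irrep ** 2 / (abs(overlap) * rel_err)) ** 2)

n_dim1 = samples_needed(dim_irrep=1, overlap=0.5, rel_err=0.01)
n_dim3 = samples_needed(dim_irrep=3, overlap=0.5, rel_err=0.01)
```

Under these assumptions the required number of samples grows as the fourth power of the irrep dimension, which is why one-dimensional subgroup irreps are preferred when they are available.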
Note that the optimal $|\rho_i\rangle\rangle$ with largest $|\langle\langle M_i|\hat P_{\bar i}|\rho_i\rangle\rangle|$ is necessarily a pure state, since any mixed state $|\rho_i\rangle\rangle = \sum_\gamma p_\gamma|\psi_\gamma\rangle\rangle$ has
$$|\langle\langle M_i|\hat P_{\bar i}|\rho_i\rangle\rangle| \le \sum_\gamma p_\gamma\,|\langle\langle M_i|\hat P_{\bar i}|\psi_\gamma\rangle\rangle| \le \max_\gamma|\langle\langle M_i|\hat P_{\bar i}|\psi_\gamma\rangle\rangle|.$$
Ref. [22] considered the case of mixed initial states, and included a protocol for sampling from a mixed state $|\rho_i\rangle\rangle = \sum_\gamma p_\gamma|\psi_\gamma\rangle\rangle$ provided one can efficiently prepare the states $\{|\psi_\gamma\rangle\rangle\}$. However, we see that it suffices to take the initial state to be one of the efficiently preparable $|\psi_\gamma\rangle\rangle$, which simplifies initial state preparation.
Our scaling estimates are based on the typical case; however, there are a few worst-case failure modes. First, the noise may have some symmetry that forces $\langle e_{i,j}|\hat P_{\bar i} \approx 0$ for some $(i, j)$. In this case, the corresponding $\lambda_{i,j}$ will not be accurately estimated by the fitting function. To remedy this, one may choose a set of projectors $\hat P_{\bar i,1}, \dots, \hat P_{\bar i,k}$ such that each $\langle e_{i,j}|$ has overlap with at least one $\hat P_{\bar i,\alpha}$. This requires at most $a_i$ projectors. We can then define
$$\hat P_{\bar i} = \sum_\alpha \hat P_{\bar i,\alpha}.$$
The modified character-weighted survival probability will require taking additional data to achieve the same relative uncertainty, since the corresponding $\dim(\bar{\mathcal{H}}_{\bar i}) = \sum_\alpha\dim(\bar{\mathcal{H}}_{\bar i,\alpha})$ will be larger, but is otherwise identical. Second, the fitting procedure may have difficulty fitting multiple exponential decays, especially if the decay rates are similar. In the case of similar decays, the fit might have numerous local minima; worse, the fitting function might simply set the coefficient of one of the decays to zero and the corresponding decay rate to some arbitrary value, and fit the curve using fewer exponential decays. This can be detected during the fitting procedure, and corrected by either taking more data to more closely constrain the fit or by simply fitting fewer exponential decays.

IV. APPLICATION: SUBSPACE RANDOMIZED BENCHMARKING
As an application of the general character RB method, we can improve on the recently introduced subspace randomized benchmarking method [23]. Subspace RB characterizes the error associated with a group of gates G that preserve a subspace of the Hilbert space. In [23], a benchmarking procedure is introduced that yields two decay parameters that are functions of the noise channel, but the procedure does not give an estimate for the average fidelity or other quantities with simple physical interpretations. The multiplicity-free character RB of [22] is not directly applicable to this situation, as we will see that any group that preserves subspaces necessarily decomposes into irreps with multiplicity. However, using our method we can easily characterize the average fidelity of such gates.
To simplify our discussion, we will focus on the particular case discussed in [23]. The system considered in [23] can implement arbitrary symmetric single-qubit gates $U_1 := U \otimes U$ as well as the two-qubit entangling gate $U_{ZZ} := \exp(-i\frac{\pi}{4}Z\otimes Z)$. The symmetric single-qubit gates have negligible error compared to the entangling gate, so the goal of the experiment is to characterize the fidelity of $U_{ZZ}$. This is accomplished by combining the elementary gates into elements of a benchmarking group $G$, using a fixed number of the relevant gate $U_{ZZ}$, and then designing an RB procedure to benchmark elements of $G$. It is straightforward to see that any $U \in G$ made up of products of $U_1$ and $U_{ZZ}$ operators preserves the triplet and singlet subspaces
$$\mathcal{H}_T = \mathrm{span}\left\{|00\rangle, \tfrac{|01\rangle+|10\rangle}{\sqrt2}, |11\rangle\right\}, \qquad \mathcal{H}_S = \mathrm{span}\left\{\tfrac{|01\rangle-|10\rangle}{\sqrt2}\right\}.$$
This implies that every gate $U \in G$ decomposes as $U = U_T \oplus U_S$, with $U_T$ and $U_S$ acting on the triplet and singlet spaces, respectively.
Our method differs from the original in several ways. Most notably, we combine the elementary gates into elements U ∈ G such that G forms a group. This requires a moderate increase in complexity of the combined gates; [23] combined their gates into unitaries involving three U ZZ gates, while our construction requires four. However, in return for this increased complexity, our method offers several advantages. Rather than estimate decay parameters with no clear physical interpretation, our method produces direct estimates of the average fidelity. In addition, the derivation of the form of the exponential decays in [23] required assumptions on the relative phases of U T and U S that could not actually be realized on their experimental platform. In contrast, our method yields rigorous decays thanks to the underlying group structure of G.
The original subspace RB can be extended to sets of gates $G$ that preserve some arbitrary splitting of $\mathcal{H}$ into subspaces $\mathcal{H} = \mathcal{H}_1 \oplus \mathcal{H}_2$, provided the set $G$ can be written as a direct sum of two sets of gates, acting on $\mathcal{H}_1$ and $\mathcal{H}_2$ respectively, that are both groups and unitary 2-designs [43] (see below for the definition of a 2-design). However, it is difficult to construct such a set in a way that is experimentally relevant; indeed, [23] could not do this for the simple case of two qubits, and we avoid attempting such a construction here. A more useful approach, which mirrors our approach below, is to construct an arbitrary group out of the elementary gates and perform character RB on whatever irreps result. This method can likely be used to benchmark other two-qubit gates that are symmetric under SWAP besides $U_{ZZ}$, and may also prove useful for gates that preserve other subspaces.

A. Constructing the benchmarking group
Ref. [23] constructed their benchmarking set $G$ using a generalization of the Clifford group [11,12] to a $d$-level system [44]. We will follow a similar procedure, modified to ensure $G$ forms a group. For a $d$-level system, analogues of the $X$ and $Z$ qubit operators are defined as [45]:
$$X|j\rangle = |j+1\rangle, \qquad Z|j\rangle = \omega^j|j\rangle,$$
where $\omega = e^{2\pi i/d}$ and addition is performed modulo $d$. These generalized $X$ and $Z$ operators are unitary, and the set $\{X^aZ^b : a, b \in \mathbb{Z}_d\}$ forms an orthogonal basis for the set of all $d \times d$ matrices. Note that for $d = 2$ we recover the usual Pauli matrices.
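The properties of the generalized Pauli operators can be verified directly for $d = 3$. The sketch below assumes the standard shift-and-clock definitions, $X|j\rangle = |j+1 \bmod d\rangle$ and $Z|j\rangle = \omega^j|j\rangle$ with $\omega = e^{2\pi i/d}$, and checks the commutation relation $ZX = \omega XZ$ and the Hilbert-Schmidt orthogonality of the operators $X^aZ^b$.

```python
import numpy as np
from numpy.linalg import matrix_power

d = 3
omega = np.exp(2j * np.pi / d)

# Shift operator: rolling the identity's rows down by one gives X|j> = |j+1 mod d>.
X = np.roll(np.eye(d), 1, axis=0).astype(complex)
# Clock operator: Z|j> = omega^j |j>.
Z = np.diag(omega ** np.arange(d))

# Commutation relation ZX = omega XZ.
commutation_ok = np.allclose(Z @ X, omega * X @ Z)

# The d^2 operators X^a Z^b are mutually orthogonal under Tr(P^dagger Q),
# each with squared norm d.
basis = [matrix_power(X, a) @ matrix_power(Z, b) for a in range(d) for b in range(d)]
gram = np.array([[np.trace(P.conj().T @ Q) for Q in basis] for P in basis])
orthogonal_ok = np.allclose(gram, d * np.eye(d * d))
```

Since there are $d^2$ mutually orthogonal operators, they span the full space of $d \times d$ matrices.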
Specializing to $d = 3$, define the generalized Pauli group
$$P = \{\omega^\eta X^aZ^b : \eta, a, b \in \mathbb{Z}_3\};$$
that $P$ is a group follows from the commutation relation $ZX = \omega XZ$. The generalized Clifford group is defined to be the set of all unitaries that stabilize $P$ [44]:
$$G_T = \{U : UPU^\dagger = P\}.$$
An element $U \in G_T$ is defined (up to a global phase) by its action on $X$ and $Z$. Defining $UXU^\dagger = \omega^{\eta_x}X^{a_x}Z^{b_x}$ and $UZU^\dagger = \omega^{\eta_z}X^{a_z}Z^{b_z}$, and noting that preserving the commutation relation $ZX = \omega XZ$ requires $a_xb_z - a_zb_x = 1 \pmod 3$, we find 24 allowed choices of $(a_x, b_x, a_z, b_z)$ and 9 choices of $(\eta_x, \eta_z)$, leading to a total of 216 elements of $G_T$. We can find the action of $U \in G_T$ on a general element $X^aZ^b$ by
$$UX^aZ^bU^\dagger = (UXU^\dagger)^a(UZU^\dagger)^b.$$
The action of $U$ on a general density matrix then follows by linearity. Our benchmarking group $G$ is constructed by combining the elementary symmetric gates to act as $G_T$ on the triplet subspace, where the three levels $|0\rangle, |1\rangle, |2\rangle$ correspond to the triplet basis $|00\rangle, \frac{|01\rangle+|10\rangle}{\sqrt2}, |11\rangle$. The most general composite gate is formed by alternately applying $U_1$ and $U_{ZZ}$ gates to our qubits. A straightforward calculation shows that if such a circuit applies an operator $U_T$ to the triplet subspace, its action on the singlet subspace is necessarily given by $(-1)^{n_z}\omega^\eta\det(U_T)^{1/3}$, where $n_z$ is the number of entangling $U_{ZZ}$ gates. By varying the single-qubit unitaries $U_1$, we find computationally that all elements of $G_T$ and all relative phases $\omega^\eta$ can be generated by circuits of exactly four $U_{ZZ}$ gates, as shown in Fig. 1 [46]. In total, then, the benchmarking group is given by
$$G = \{U_T \oplus \omega^\eta\det(U_T)^{1/3} : U_T \in G_T,\ \eta \in \mathbb{Z}_3\},$$
where the first summand acts on the triplet subspace and the second acts on the singlet subspace. Note that every group element contains exactly four entangling gates, so the average fidelity of $G$ gives a useful measure of the fidelity of the entangling gate.
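The count of 216 qutrit Clifford elements can be reproduced by enumeration. The sketch below assumes the shift-and-clock definitions of $X$ and $Z$ and counts the candidate images $X \to \omega^{\eta_x}X^{a_x}Z^{b_x}$, $Z \to \omega^{\eta_z}X^{a_z}Z^{b_z}$ that preserve $ZX = \omega XZ$. It checks numerically that this is equivalent to the condition $a_xb_z - a_zb_x = 1 \pmod 3$; that every such assignment is realized by a unitary is taken from the count quoted in the text rather than verified here.

```python
from itertools import product

import numpy as np
from numpy.linalg import matrix_power

d = 3
omega = np.exp(2j * np.pi / d)
X = np.roll(np.eye(d), 1, axis=0).astype(complex)   # X|j> = |j+1 mod d>
Z = np.diag(omega ** np.arange(d))                  # Z|j> = omega^j |j>

def pauli(a, b):
    """Generalized Pauli operator X^a Z^b."""
    return matrix_power(X, a) @ matrix_power(Z, b)

valid = []
condition_matches = True
for ax, bx, az, bz in product(range(d), repeat=4):
    Xp, Zp = pauli(ax, bx), pauli(az, bz)
    # Phases omega^eta drop out of the commutation relation, so they are omitted here.
    preserves = bool(np.allclose(Zp @ Xp, omega * Xp @ Zp))
    determinant_one = (ax * bz - az * bx) % d == 1
    condition_matches = condition_matches and (preserves == determinant_one)
    if preserves:
        valid.append((ax, bx, az, bz))

n_linear = len(valid)          # allowed exponent choices
n_total = n_linear * d * d     # times the free phases eta_x, eta_z
```

The enumeration gives 24 allowed exponent choices, and the two free phases contribute a further factor of 9.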

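[Table I: columns Subrep, Projector, $\chi_i(U_T \oplus U_S)$.]
The natural representation of $G$ decomposes into the subrepresentations $\mathcal{H}_{T0}$, $\mathcal{H}_{T\perp}$, $\mathcal{H}_{TS}$, $\mathcal{H}_{ST}$, and $\mathcal{H}_{S0}$, which are described in Table I. These are all clearly subrepresentations of the natural representation; for proof that they are in fact irreducible, we will use the concept of a unitary t-design [9].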
Let $S$ be a set of unitaries acting on a space $\mathcal{H}$. A balanced polynomial of degree $t$ is a polynomial in the matrix elements of $U$ and $U^*$ where each term in the polynomial has degree $d \le t$ in the elements of $U$ and degree $d$ in the elements of $U^*$. $S$ is a unitary t-design if for every balanced polynomial $p(U, U^*)$ of degree $t$, averaging $p(U, U^*)$ over $S$ is the same as averaging over all unitaries on $\mathcal{H}$ (weighted by the Haar measure):
$$\frac{1}{|S|}\sum_{U\in S}p(U, U^*) = \int dU\,p(U, U^*).$$
A classic example is the Clifford group, which forms a unitary 3-design [9,47,48].
The group $G_T$ forms a unitary 2-design [49] (see Appendix B for a proof). This allows us to prove the representations in Table I are irreducible, using the following fact:
Fact 3 (Schur normalization). Let $\chi$ be the character of a representation. The representation is irreducible iff
$$\frac{1}{|G|}\sum_{U\in G}|\chi(U)|^2 = 1.$$
For a proof, see [37].
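The irreducibility criterion can be illustrated on a small example. The sketch below is not the benchmarking group of this section; it uses the nine qutrit operators $X^aZ^b$ (assuming the shift-and-clock definitions), whose Liouville images $U \otimes U^*$ form a group. It evaluates $|G|^{-1}\sum_U|\chi(U)|^2$ for the full natural representation, with character $|\mathrm{Tr}\,U|^2$, and for the one-dimensional subrepresentation spanned by the operator $X$.

```python
import numpy as np
from numpy.linalg import matrix_power

d = 3
omega = np.exp(2j * np.pi / d)
X = np.roll(np.eye(d), 1, axis=0).astype(complex)   # X|j> = |j+1 mod d>
Z = np.diag(omega ** np.arange(d))                  # Z|j> = omega^j |j>
group = [matrix_power(X, a) @ matrix_power(Z, b) for a in range(d) for b in range(d)]

def character_norm(chi):
    """Group average of |chi(U)|^2; equals 1 iff the representation is irreducible."""
    return float(np.mean(np.abs(chi) ** 2))

# Natural representation U (x) U^*: character Tr(U) Tr(U)^* = |Tr U|^2.
chi_natural = np.array([abs(np.trace(U)) ** 2 for U in group])

# Subrepresentation spanned by |P>> with P = X: U P U^dagger = c P, so the
# character is the phase c = Tr(P^dagger U P U^dagger) / d.
P = X
chi_sub = np.array([np.trace(P.conj().T @ U @ P @ U.conj().T) / d for U in group])

norm_natural = character_norm(chi_natural)
norm_sub = character_norm(chi_sub)
```

The natural representation gives 9, the sum of squared multiplicities of nine distinct one-dimensional irreps, so it is reducible, while the one-dimensional subrepresentation gives 1 as the criterion requires.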
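The representations $\mathcal{H}_{T0}$ and $\mathcal{H}_{S0}$ are 1D, thus irreducible. For the representation $\mathcal{H}_{T\perp}$, we have
$$\frac{1}{|G|}\sum_{U\in G}|\chi_{T\perp}(U)|^2 = \frac{1}{|G_T|}\sum_{U_T\in G_T}|\chi_{T\perp}(U_T)|^2 = \int dU_T\,|\chi_{T\perp}(U_T)|^2 = 1,$$
where the second equality follows from the unitary 2-design property, and the third follows from the fact that $\mathcal{H}_{T\perp}$ is an irrep of the natural representation of the full unitary group on $\mathcal{H}_T$. Finally, for $\mathcal{H}_{TS}$ and $\mathcal{H}_{ST}$, whose characters have magnitude $|\mathrm{Tr}(U_T)|$ because $U_S$ is a phase, we have
$$\frac{1}{|G|}\sum_{U\in G}|\chi_{TS}(U)|^2 = \frac{1}{|G_T|}\sum_{U_T\in G_T}|\mathrm{Tr}(U_T)|^2 = \int dU_T\,|\mathrm{Tr}(U_T)|^2 = 1,$$
where the second equality follows from the unitary 2-design property and the third follows from the fact that the direct representation of the full unitary group on $\mathcal{H}_T$ is irreducible.
Note that $\mathcal{H}_{T0}$ and $\mathcal{H}_{S0}$ are two irreducible copies of the trivial representation, so that $G$ is necessarily non-multiplicity-free [50]. The remaining irreps are all unique, since they have different character functions.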

C. Benchmarking G
The form of the decay curves corresponding to each irrep is given by
$$S_0(N) = C_0\lambda_0^N + B, \quad S_{T\perp}(N) = C_{T\perp}\lambda_{T\perp}^N, \quad S_{TS}(N) = C_{TS}\lambda_{TS}^N, \quad S_{ST}(N) = C_{ST}\lambda_{ST}^N. \tag{11}$$
Note that from our general form Eq. 4 we would expect that $S_0(N)$ is the sum of two exponential terms, with each $\lambda_{0,j}$ corresponding to an eigenvalue of $\hat\Lambda_G$ restricted to $\mathcal{H}_0$. However, we know that for trace-preserving noise $\langle\langle 1|\hat\Lambda_G = \langle\langle 1|$, which implies that one of the eigenvalues is 1. We define two different subgroups $\bar G_1, \bar G_2 \subseteq G$ for our benchmarking procedure. We will use $\bar G_1$ to construct $S_0(N)$ and $S_{T\perp}(N)$, and $\bar G_2$ to construct $S_{TS}(N)$ and $S_{ST}(N)$. We define For $\bar G_1$, we can define the following character functions and their corresponding projectors: We also see that $\dim(\bar{\mathcal{H}}_{T\perp}) = 1$, so that $S_{T\perp}(N)$ will have the best possible relative error (see Section III C).
For $\bar G_2$, we can define the character functions and corresponding projectors We again see that $\hat P_{TS}$ projects into $\bar{\mathcal{H}}_{TS} \subseteq \mathcal{H}_{TS}$ with $\dim(\bar{\mathcal{H}}_{TS}) = 1$, so that $S_{TS}(N)$ will also have the best possible relative error.
As our initial states, we choose Here, we've restricted ourselves to initial states that are a mixture of Z-basis product states, for ease of preparation. As our measurement projectors, we choose Here, we've restricted our measurement projectors to correspond to Z measurements, for ease of measuring. With these choices, the $S_i(N)$ are approximately Note that $\lambda_{ST} = \lambda_{TS}^*$, so it is unnecessary to compute both $S_{TS}(N)$ and $S_{ST}(N)$. Note also that $\lambda_0$ and $\lambda_{T\perp}$ are both necessarily real, as are $C_0$ and $B$. The remaining parameters are complex. For convenience, we will rotate $S_{T\perp}(N)$ by $e^{i\pi/3}$ so that $S_{T\perp}(N)$ is approximately real.
We demonstrate our method by generating random error channels and simulating our RB procedure. To generate a random error channel $\Lambda$ on a $d$-dimensional Hilbert space, we generate a random unitary on a $(d^2+d)$-dimensional Hilbert space and trace out $d^2$ auxiliary degrees of freedom; to adjust the fidelity, we take a convex combination of the resulting channel with the identity channel. All channels generated by this method are guaranteed to be completely positive trace-preserving (CPTP), thus valid error channels, and every CPTP channel can be generated via this method [36]. For each error channel, we take data at 15 different values of $N$, and sample unitary operators at each value of $N$ until we have applied 150,000 unitary operators in total. For each string of unitary operators, we perform full state-vector simulation to apply the RB sequence of operators, and then generate a measurement outcome of 0 or 1 using the appropriate probability, and compute the character-weighted average. In Fig. 2, we show the exact value of $S_i(N)$, the data we take to estimate $S_i(N)$, and the fit to $S_i(N)$ according to Eq. 11 for a single random error channel $\Lambda$. From the fit data, we can estimate $F_\Lambda$ by applying Eq. 5:
$$F_\Lambda = \frac{5 + \lambda_0 + 8\lambda_{T\perp} + 3\lambda_{TS} + 3\lambda_{ST}}{20}.$$
Note that the imaginary parts of $\lambda_{TS}$ and $\lambda_{ST}$ always cancel to give a real $F_\Lambda$ as expected. We use this formula to estimate the fidelity of our randomly generated error channels, and compare our estimate to the true fidelity in Fig. 3.
FIG. 2. The predicted and measured character-weighted survival probability for a random error channel. The exact decay (green) is an exponential decay given by Eq. 11. We estimate $S_i(N)$ by applying random gates and measuring the final state (blue points). The data is fit to an appropriate function (orange) from which we estimate the fidelity.
FIG. 3. The exact and estimated fidelity for a selection of randomly generated error channels. Each estimate was based on data taken over 15 different lengths $N$. Each estimate was arrived at by applying a total of 150,000 benchmarking group elements. This is the same number of elements applied in the experiment described in [23]. The diagonal line denotes the points where the exact and estimated fidelities are equal. The data agree with the line with a reduced $\chi^2$ value of 0.9, indicating good agreement. Note that the error bars are derived from statistical uncertainty in the data, and vanish in the limit of an infinite number of data points.
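The channel-generation step can be sketched with a Stinespring-type construction. This is a minimal sketch under assumptions made here for illustration: the auxiliary system has dimension $d^2$ and starts in a fixed state, a Haar-random unitary acts on the joint system, and the auxiliary system is traced out. The helper names `haar_unitary` and `random_channel` and the mixing parameter `mix` are hypothetical, and the row-major Liouville convention $K \otimes K^*$ is used for the Kraus operators.

```python
import numpy as np

def haar_unitary(n, rng):
    """Haar-random n x n unitary via QR of a complex Gaussian matrix."""
    z = (rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    phases = np.diag(r) / np.abs(np.diag(r))
    return q * phases  # rescale each column by a phase to make the measure Haar

def random_channel(d, mix, rng):
    """Liouville matrix of (1 - mix) * identity + mix * (random CPTP channel)."""
    env = d * d
    U = haar_unitary(d * env, rng).reshape(d, env, d, env)
    # Kraus operators K_k = <k|_env U |0>_env, with the system index first.
    kraus = [U[:, k, :, 0] for k in range(env)]
    L = sum(np.kron(K, K.conj()) for K in kraus)
    return (1 - mix) * np.eye(d * d) + mix * L

def average_fidelity(L, d):
    """Average fidelity from the trace of the Liouville matrix."""
    return (np.trace(L).real + d) / (d * d + d)

rng = np.random.default_rng(1)
d = 2
L = random_channel(d, mix=0.05, rng=rng)

# Trace preservation: <<1| L = <<1|, where |1>> is the vectorized identity.
one = np.eye(d).reshape(-1)
trace_preserving = np.allclose(one @ L, one)
fidelity = average_fidelity(L, d)
```

Because the fidelity is linear in the channel, lowering `mix` moves the fidelity linearly toward 1, which gives direct control over the error strength.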
We see that the true fidelity and the estimated fidelity agree within the error bars set by the uncertainty of our fits. We can directly compare this with the original subspace RB method [23]. That method served to estimate only λ_0 and λ_T⊥ (t and r in their notation), and the authors could only form a measure of gate fidelity using these quantities. They defined a so-called "extended subfidelity" F̃_Λ, which they obtained by replacing λ_ST and λ_TS with the weighted average of the other eigenvalues. It is clear that if F_Λ → 1 then F̃_Λ → 1 as well, but the converse is not necessarily true. We can compare the approximate fidelity to the exact fidelity for the various noise sources explored in [23]. We consider intensity errors, which correspond to an overrotation e^{−iεZZ}; optical pumping errors, which cause amplitude damping on each qubit; inhomogeneous fields, which cause phase damping on each qubit; and SWAP errors, which interchange the qubits. The results are shown in Fig. 4. We see that while F_Λ ≈ F̃_Λ for most error sources, there exist worst-case errors, such as SWAP, that cannot be detected by F̃_Λ. This was also noted in [23] as a limitation of their method.
Our work also improves upon the original work in the mathematical assumptions needed to derive the benchmarking decays. Ref. [23] derived their decay formulas under the assumption that their benchmarking set was of the form {U_T ⊕ σφ_{U_T} : U_T ∈ G_T, σ = ±}, where φ_{U_T} is some uncontrolled phase that occurs on the singlet space and σ is a controllable phase between the singlet and triplet spaces. However, in practice they could not control σ using a constant number of U_ZZ gates. Instead, they implemented only {U_T ⊕ φ_{U_T} : U_T ∈ G_T} and assumed the form of the decay would not change. In our work, by contrast, we have rigorously derived decay formulas for a group of gates that can be directly compiled into elementary symmetric gates using a constant number of U_ZZ gates.
We note that our method does require one additional capability that was not required in the original work: in order to estimate S_TS(N), it is necessary to initialize and measure the |01⟩ state. This requires additional experimental overhead to individually address and measure each qubit at the beginning and end of the benchmarking procedure. However, such overhead only contributes to the SPAM errors Λ_P, Λ_M, and does not affect our estimates of the entangling error. In any case, our method to measure λ_0 and λ_T⊥ does not require individual addressing, and can be viewed as a mathematically rigorous method to extract these parameters with no additional experimental requirements.

V. APPLICATION: LEAKAGE RANDOMIZED BENCHMARKING
We may also use our generalized character RB to improve the leakage RB introduced in [26]. In leakage RB, as in subspace RB, one is given a group G that preserves the splitting of the Hilbert space into subspaces H = H_1 ⊕ H_2. In leakage RB, however, H_1 ⊕ H_2 does not represent the computational Hilbert space, and the goal is not to compute the average fidelity of the group operations. Instead, H_1 represents the computational space of a quantum system (e.g. the two lowest-level states that encode a qubit), while H_2 represents the leakage space outside the computational space. Leakage RB determines the average probability of "leaking" from H_1 to H_2 or "seeping" from H_2 to H_1. Noting that the probability of a state |ρ⟩⟩ being in subspace α = 1, 2 is given by ⟨⟨1_α|ρ⟩⟩, we define the leakage L and seepage S as the corresponding average transition probabilities (Eqs. 13 and 14). In addition, leakage RB determines the average fidelity F_Λ,1 restricted to the subspace H_1, which is the appropriate measure of gate quality, since all computations take place in H_1.

Leakage RB is relevant for any system in which qubits are encoded in a subspace of a larger Hilbert space, which includes superconducting qubits [51,52], quantum dots [53][54][55][56][57], and trapped ions [58][59][60]. The original leakage RB could only be applied to a group such that {U_1,a1 : a_1 ∈ A_1} and {U_2,a2 : a_2 ∈ A_2} form 2-designs on their respective subspaces [61]. This is a very stringent condition, as it requires being able to independently control the computational and leakage subspaces. In many experimental implementations such control is not realistic; an experimental implementation of a gate U_1,a on the computational subspace will naturally implement some U_2,a on the leakage subspace. It is therefore desirable to develop a leakage RB that can be applied to more general groups.
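To make the notions of leakage and seepage concrete, the following sketch evaluates them for a channel given by Kraus operators. It assumes the commonly used definitions L = Tr[1_2 Λ(1_1/d_1)] and S = Tr[1_1 Λ(1_2/d_2)], that is, the probability of leaving a subspace starting from its maximally mixed state; whether this matches the normalization of the definitions in this paper is an assumption. The function name and the three-level toy error are hypothetical.

```python
import numpy as np

def leakage_seepage(kraus, p1, p2):
    """Leakage and seepage of a channel, given projectors p1, p2 onto H_1 and H_2.

    Assumed definitions: L = Tr[p2 Lambda(p1/d1)], S = Tr[p1 Lambda(p2/d2)].
    """
    d1, d2 = np.trace(p1).real, np.trace(p2).real

    def apply(rho):
        return sum(k @ rho @ k.conj().T for k in kraus)

    leak = np.trace(p2 @ apply(p1 / d1)).real
    seep = np.trace(p1 @ apply(p2 / d2)).real
    return leak, seep

# Toy example: a qubit (levels 0, 1) plus one leakage level (level 2),
# with a coherent error that rotates level 1 into level 2 by angle theta.
theta = 0.3
c, s = np.cos(theta), np.sin(theta)
u = np.array([[1, 0, 0], [0, c, -s], [0, s, c]], dtype=complex)
p1 = np.diag([1.0, 1.0, 0.0])
p2 = np.diag([0.0, 0.0, 1.0])
print(leakage_seepage([u], p1, p2))
```

For this toy error only one of the two computational levels leaks, so the leakage is sin^2(θ)/2 while the seepage is sin^2(θ).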
Using our method, we can derive a leakage RB procedure that is more general than the one described in [26]. Let G be a group of unitary gates that preserve the subspaces of H, and let Λ be their shared error channel. To estimate L and S, we will require that the only trivial representations of G are |1_1⟩⟩ and |1_2⟩⟩, while to estimate F_Λ,1 we additionally require that the subrepresentation H_1⊥ ⊆ H_1 ⊗ H_1 orthogonal to |1_1⟩⟩ is an irrep of multiplicity 1.
If G is of the form G = {U_1,a ⊕ σU_2,a : a ∈ A, σ = ±1}, then the first condition is satisfied provided {U_1,a : a ∈ A} and {U_2,a : a ∈ A} are unitary 1-designs, while the second condition is satisfied provided these groups are unitary 2-designs with dimensions d_1 ≠ d_2 (see Appendix C for proofs). Note that our requirements are significantly weaker than those of the original leakage RB, as we are only assuming the ability to implement an independent phase on the leakage space. We outline our procedure for determining L, S, and F_Λ,1 for such groups G. Our procedure, like the original leakage RB, requires that SPAM errors do not mix the subspaces H_1 and H_2, or at least that such mixing is negligible compared to the gate errors. In our derivations we will assume Λ̂_M = Λ̂_P = 1, although the generalization to errors that act only within the subspaces is trivial.
Our modified leakage RB procedure consists of the following steps:

1. Choose an initial state |ρ⟩⟩ ∈ H_1 and measurement projector |M⟩⟩ = |1_1⟩⟩.

2. For a given N, choose unitaries U_0 ∈ G and U_1, ..., U_N ∈ G randomly and uniformly, and compute the final gate U_{N+1}.

3. Apply the gates U_0, ..., U_{N+1} to |ρ⟩⟩.

4. Perform a measurement of the observable M to determine if the state is still in H_1.

5. Repeat steps 2-4 many times to estimate the zeroth character-weighted survival probability S_0(N), the average of Pr_{U_0,...,U_{N+1}} over the sampled gates, where Pr_{U_0,...,U_{N+1}} is the probability of remaining in H_1 after applying gates U_0, ..., U_{N+1} to |ρ⟩⟩.

6. Repeat steps 2-5 for different values of N.

7. Fit the survival probability to a single exponential decay in N plus a constant offset (Eq. 17), where the coefficients A and B and the decay rate λ are independent of N.

8. Estimate L and S from the fit parameters using Eqs. 18 and 19.

FIG. 4. (a) The extended subfidelity F̃_Λ of [23] versus the exact fidelity F_Λ that we can estimate in our paper, for a selection of error channels of varying strengths: intensity errors, which correspond to an overrotation e^{−iεZZ}; optical pumping errors, which cause amplitude damping on each qubit; inhomogeneous fields, which cause phase damping on each qubit; and SWAP errors, which interchange the qubits. This plot corresponds to the exact value of both F_Λ and F̃_Λ that one estimates in an experiment. Note that while F̃_Λ agrees with F_Λ in the limit F_Λ → 1, in general the two do not agree, and there exist worst-case errors such as SWAP that F̃_Λ cannot detect. (b,c) Simulation of an experiment that estimates F_Λ versus F̃_Λ for a total of 300,000 unitaries, in the case of (b) intensity and (c) SWAP errors of varying strengths. These plots correspond to experiments that estimate the exact values shown in (a). We see that the difference between F_Λ and F̃_Λ can be discerned in a realistic experiment.
In the remainder of this section, we prove the correctness of this procedure and provide an example of such leakage RB.
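The fitting and estimation steps of the procedure can be illustrated with a short NumPy sketch. It is not the authors' analysis code: it assumes an idealized decay with no SPAM error in which the constant offset equals S/(L + S) and the decay rate equals λ = 1 − L − S, and it uses a simple scan over λ with a linear least-squares solve for the two coefficients. All function names are hypothetical.

```python
import numpy as np

def survival_model(ns, leak, seep):
    """Idealized population remaining in H_1 after N gates, starting in H_1 (assumed model)."""
    lam = 1 - leak - seep
    return seep / (leak + seep) + leak / (leak + seep) * lam ** ns

def fit_decay(ns, ys, grid=None):
    """Fit ys ~ offset + amplitude * lam**ns by scanning lam and solving for the rest."""
    grid = np.linspace(0.5, 0.9999, 5000) if grid is None else grid
    best = None
    for lam in grid:
        design = np.column_stack([np.ones_like(ns, dtype=float), lam ** ns])
        coef, *_ = np.linalg.lstsq(design, ys, rcond=None)
        resid = np.sum((design @ coef - ys) ** 2)
        if best is None or resid < best[0]:
            best = (resid, coef[0], coef[1], lam)
    return best[1], best[2], best[3]  # offset, amplitude, lam

def leakage_from_fit(offset, lam):
    """Invert offset = S/(L+S) and lam = 1 - L - S for L and S."""
    return (1 - offset) * (1 - lam), offset * (1 - lam)

ns = np.arange(0, 200, 10)
ys = survival_model(ns, leak=0.02, seep=0.01)
offset, amplitude, lam = fit_decay(ns, ys)
print(leakage_from_fit(offset, lam))
```

With real data the sampled survival probabilities are noisy, so a weighted nonlinear fit with uncertainty estimates would replace the plain scan used here.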
A. Deriving L and S

Written out explicitly, the zeroth character-weighted survival probability is given by Eq. 16, where P̂_0 is the projector onto the trivial irrep, and we have made the same substitutions as in Section III A to reduce the sum over {U_0, ..., U_N} to G-twirls and a projector. We know from Thm. 1 that Λ̂_G has the block-diagonal form Λ̂_G = ⊕_i Q̂_i ⊗ 1_i, where i indexes the irreps. Because Λ̂_G is multiplied by the projector P̂_0 in Eq. 16, we may ignore all terms except Q̂_0 ⊗ 1_0. In terms of the eigendecomposition of Q̂_0, we may write Q̂_0 ⊗ 1_0 = |e_0⟩⟩⟨⟨e_0| + λ|e_1⟩⟩⟨⟨e_1|, so that S_0(N) = ⟨⟨1_1|Λ̂|e_0⟩⟩⟨⟨e_0|ρ⟩⟩ + ⟨⟨1_1|Λ̂|e_1⟩⟩⟨⟨e_1|ρ⟩⟩ λ^N, where we have used the fact, noted in Section IV, that one eigenvalue of Q̂_0 is always 1. This justifies the fit Eq. 17.
So far, we have simply repeated the steps in Section III A with slight modifications. However, in order to estimate L and S we will need to explicitly determine the eigendecomposition of Q̂_0 ⊗ 1_0. We first note that the P̂_0 subspace is spanned by two orthonormal vectors, proportional to |1_1⟩⟩ and |1_2⟩⟩. Thus, in terms of these basis vectors, we may write Q̂_0 ⊗ 1_0 as a 2 × 2 matrix with entries Q_αβ. Noting that M_αβ = ⟨⟨1_α|Λ̂_G|1_β⟩⟩ = ⟨⟨1_α|Λ̂|1_β⟩⟩, we can use the definitions of L and S (Eqs. 13 and 14) to determine the constants Q_αβ. From the explicit form of Q_αβ, we can determine the eigendecomposition of Q̂_0 ⊗ 1_0 via straightforward algebra [23,26]. Putting this together, we can evaluate the zeroth character-weighted survival probability. We then have that B = S/(L + S), which can be combined with λ = 1 − L − S to immediately give Eqs. 18 and 19.
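As a cross-check on the eigenvalues, the following worked example uses a simplified classical picture that is our own assumption, not the paper's Liouville-space basis: the populations of H_1 and H_2 are propagated by a 2 × 2 column-stochastic matrix T built from L and S. Eigenvalues do not depend on the choice of basis, so the decay rate agrees with λ = 1 − L − S; the coefficients below ignore the SPAM-dependent prefactors that appear in S_0(N).

```latex
\begin{aligned}
T &= \begin{pmatrix} 1-L & S \\ L & 1-S \end{pmatrix}, \qquad
T \begin{pmatrix} S \\ L \end{pmatrix} = \begin{pmatrix} S \\ L \end{pmatrix}, \qquad
T \begin{pmatrix} 1 \\ -1 \end{pmatrix} = (1-L-S) \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \\
\begin{pmatrix} 1 \\ 0 \end{pmatrix} &= \frac{1}{L+S} \begin{pmatrix} S \\ L \end{pmatrix}
 + \frac{L}{L+S} \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \\
\left[ T^{N} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right]_{1}
 &= \frac{S}{L+S} + \frac{L}{L+S}\,(1-L-S)^{N}.
\end{aligned}
```

In this picture the constant term is the stationary population of H_1, and the single decaying term has rate 1 − L − S, consistent with one eigenvalue of Q̂_0 always being 1.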

B. Deriving F_Λ,1
To establish Eq. 20, we first prove Eq. 21, in which P̂_11 is the projector onto H_1 ⊗ H_1. We use a similar method as in our proof of Eq. 10. We first note that the restricted average fidelities of Λ̂ and P̂_11 Λ̂ P̂_11 := Λ̂_11 are equal. Λ̂_11 is an error channel restricted to the H_1 subspace. We can twirl Λ̂_11 by the full unitary group on H_1 to get a depolarizing channel with two parameters p and q. Note that we have p and q rather than p and (1 − p) as in Eq. 9; this is because Λ̂_11 is not necessarily trace-preserving. We can directly compute the fidelity of the twirled channel, F_{(Λ̂_11)_G} = p + q/d_1. Similarly, we can directly compute Tr (Λ̂_11)_G = p d_1^2 + q. Finally, we can directly compute p + q. Combining these three equations gives Eq. 21.
To estimate Tr(Λ̂ P̂_11), we can divide this trace into two pieces: one over the trivial irrep spanned by |1_1⟩⟩, and one of the form Tr(Λ̂ P̂_1⊥), where P̂_1⊥ is the projector onto H_1⊥. The latter trace is simply (d_1^2 − 1)λ_1⊥. Plugging this into Eq. 21 gives Eq. 20 as desired.
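The step of combining the three relations can be written out explicitly. In the sketch below, c := p + q is shorthand introduced here for the value of the third relation, and Tr abbreviates Tr (Λ̂_11)_G; both are notation of this example only. Since the explicit form of Eq. 21 is not reproduced above, the final line should be read as the algebraic consequence of the three stated relations rather than as a quotation of Eq. 21.

```latex
\begin{aligned}
F &= p + \frac{q}{d_1}, \qquad \mathrm{Tr} = p\,d_1^{2} + q, \qquad c := p + q, \\
p &= \frac{\mathrm{Tr} - c}{d_1^{2} - 1}, \qquad q = c - p, \\
F &= \frac{c}{d_1} + p\left(1 - \frac{1}{d_1}\right)
   = \frac{c}{d_1} + \frac{\mathrm{Tr} - c}{d_1 (d_1 + 1)}
   = \frac{\mathrm{Tr} + c\,d_1}{d_1 (d_1 + 1)}.
\end{aligned}
```

For a trace-preserving channel, c = 1 and the expression reduces to the familiar relation F = (Tr + d_1)/(d_1^2 + d_1) between the average fidelity and the trace of the superoperator.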

C. Example: Two-qubit logical encodings
As an example of our leakage RB, we consider an encoding of a single logical qubit into the S_z = 0 subspace of two physical qubits. This encoding is frequently used in quantum dot qubits [54][55][56]. The computational space H_1 is spanned by |01⟩ and |10⟩, and the leakage space H_2 is spanned by |00⟩ and |11⟩. Assume we implement single-qubit rotations on our computational space by operators R_X and R_Z, where implementing an X or Z rotation on the computational space naturally induces a specific rotation on the leakage space. We will take our benchmarking group to be the group generated by these two rotations, G = ⟨R_X, R_Z⟩. This group has a total of 16 elements. It cannot be written as a product of a group acting on H_1 and a group acting on H_2, so the usual leakage RB does not apply. However, an elementary calculation shows that the natural representation of this group contains exactly two trivial irreps, spanned by |1_1⟩⟩ and |1_2⟩⟩, and we can therefore use our procedure to estimate L and S.
We illustrate this method by generating random error channels and simulating the RB procedure. In Fig. 5, we show the exact value of S_0(N), the data we take to estimate S_0(N), and the fit to S_0(N) according to Eq. 17. In Fig. 6, we repeat the same fitting procedure for a set of randomly generated error channels, and estimate L and S using Eqs. 18 and 19. We see that the true values of L and S and our estimates of L and S agree within the error bars set by the uncertainty in our fits.
FIG. 5. The predicted and measured S_0(N) for a single randomly generated error channel. The actual decay (green) is an exponential decay given by Eq. 17. We estimate S_0(N) by applying random gates and measuring the final state (blue points). The data is fit to a function of the form of Eq. 17, from which we estimate L and S.
We cannot apply our method to find F Λ,1 because in this example H 2⊥ and H 1⊥ share an irrep. This reflects the overall difficulty in applying leakage RB to physically realistic circumstances. While this work provides the most widely applicable method for leakage RB currently available, more work is needed to develop a truly general procedure.

VI. APPLICATION: MATCHGATE RANDOMIZED BENCHMARKING
We will derive a benchmarking procedure that determines the average fidelity of circuits composed of matchgates using a number of experiments that scales polynomially in the number of qubits. Our method is the matchgate equivalent of traditional Clifford RB, which characterizes the average fidelity of circuits composed of Hadamard, phase, and CNOT gates, and also requires a number of experiments that scales polynomially in the number of qubits. However, we will see that benchmarking matchgate circuits requires the full machinery of non-multiplicity-free character RB.

A. The matchgate group
Consider a line of n qubits with nearest-neighbor connectivity. Let G be the matchgate group on n qubits, the group of all unitaries generated from nearest-neighbor matchgates. Naively, G could contain arbitrarily long circuits of matchgates. However, one can prove that every element of G can be realized using circuits of at most 4n^3 matchgates [29,30]. We will provide a simplified proof of this fact below.
Following [29,30], our primary tool to understand G will be the Jordan-Wigner transformation [64], which we use to define 2n Majorana operators {c_i}.

Claim 2. Any unitary operator U ∈ U(2^n) that acts on the Majorana operators as a proper rotation is in the matchgate group G. In particular, such a U can be decomposed into a product of at most 2n^3 nearest-neighbor matchgates.
These two claims together imply that the matchgate group is isomorphic to SO(2n), and that every element of the matchgate group can be efficiently implemented in a quantum circuit.
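The Majorana picture can be checked numerically on a few qubits. The sketch below assumes one common Jordan-Wigner convention, c_{2k−1} = Z⊗···⊗Z⊗X_k and c_{2k} = Z⊗···⊗Z⊗Y_k, which may differ from the paper's convention by ordering or signs, and uses 0-based list indices. It verifies the anticommutation relations and shows that the unitary cos(θ/2) + sin(θ/2) c_a c_b rotates c_a and c_b into each other. The function names are hypothetical.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def majoranas(n):
    """Return the 2n Majorana operators on n qubits (assumed Jordan-Wigner convention)."""
    ops = []
    for k in range(n):
        for pauli in (X, Y):
            factors = [Z] * k + [pauli] + [I2] * (n - k - 1)
            ops.append(reduce(np.kron, factors))
    return ops

def majorana_rotation(c, a, b, theta):
    """Unitary exp(theta/2 * c_a c_b) = cos(theta/2) + sin(theta/2) c_a c_b, for a != b."""
    j = c[a] @ c[b]
    return np.cos(theta / 2) * np.eye(j.shape[0]) + np.sin(theta / 2) * j

c = majoranas(3)
u = majorana_rotation(c, 1, 2, theta=0.7)
rotated = u @ c[1] @ u.conj().T
# Conjugation mixes only the two chosen Majorana operators.
print(np.allclose(rotated, np.cos(0.7) * c[1] - np.sin(0.7) * c[2]))
```

In this convention the product of the second and third Majorana operators is proportional to X⊗X on the first two qubits, so the rotation in the example is a nearest-neighbor two-qubit gate.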

Proof of claims
Proof of Claim 1. We provide a simplification of the proof in [30]. We prove that a nearest-neighbor matchgate acting on qubits k and k+1 acts as a rotation mixing c_{2k−1}, c_{2k}, c_{2k+1}, and c_{2k+2}, and that all such rotations are realized by matchgates. It then follows that all products of matchgates also act as rotations on the Majorana operators.
Without loss of generality, we can restrict ourselves to k = 1, so that the relevant Majorana operators are c_1, ..., c_4. We can write an infinitesimal matchgate U in terms of an operator M whose coefficients α_ab are real. One can directly check that conjugation by U acts on the Majorana operators as an infinitesimal rotation. We therefore see that infinitesimal matchgates generate the whole Lie algebra so(4) of real antisymmetric matrices. By exponentiating the infinitesimal matchgates, we generate the full set of matchgates; in this process, we generate the full group SO(4) as well.
We note that an arbitrary rotation between two Majorana operators can be implemented using matchgates. Thus, the above decomposition of R into fewer than 4n^3 two-Majorana rotations gives an explicit formula for the matchgates needed to construct R. We provide Python code to realize the Hoffman decomposition of R into elementary rotations, as well as the reduction of R to a matchgate circuit, at [67].
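To illustrate how a rotation R ∈ SO(2n) breaks up into two-index rotations, the following sketch performs a Givens-style reduction using rotations in adjacent coordinate planes. This is a generic construction chosen for brevity; it is not the code referenced in [67], it is not claimed to coincide with the Hoffman decomposition, and the function names are hypothetical. For an m × m rotation it produces m(m − 1)/2 planar rotations.

```python
import numpy as np

def givens(m, i, theta):
    """Rotation by theta in the (i, i+1) coordinate plane of R^m."""
    g = np.eye(m)
    c, s = np.cos(theta), np.sin(theta)
    g[i, i], g[i, i + 1] = c, s
    g[i + 1, i], g[i + 1, i + 1] = -s, c
    return g

def decompose_rotation(r):
    """Return (plane, angle) pairs whose planar rotations reduce r to the identity."""
    m = r.shape[0]
    work = r.astype(float)
    steps = []
    for col in range(m - 1):
        for row in range(m - 1, col, -1):
            # Choose the angle that zeroes work[row, col] using rows row-1 and row.
            theta = np.arctan2(work[row, col], work[row - 1, col])
            work = givens(m, row - 1, theta) @ work
            steps.append((row - 1, theta))
    return steps

def rebuild(m, steps):
    """Reassemble the rotation as the product of the inverse planar rotations."""
    r = np.eye(m)
    for plane, theta in steps:
        r = r @ givens(m, plane, theta).T
    return r

rng = np.random.default_rng(1)
q, _ = np.linalg.qr(rng.normal(size=(6, 6)))
if np.linalg.det(q) < 0:
    q[:, 0] *= -1  # force determinant +1
steps = decompose_rotation(q)
print(len(steps), np.allclose(rebuild(6, steps), q))
```

Each (plane, angle) pair corresponds to a rotation between two neighboring Majorana indices, which is the kind of elementary rotation that the text maps to matchgates.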

B. Irreps of the matchgate group
We want to understand how the natural representation of G decomposes into irreps. This is most convenient in the basis of polynomials of {c_m}. Note that c_m^2 = 1, so our polynomials are at most degree 1 in any given c_m and there are 4^n such polynomials. Explicitly, an orthonormal basis of H ⊗ H is given by the normalized monomials |m_1···m_i⟩⟩ ∝ c_{m_1}···c_{m_i} with m_1 < ··· < m_i. Define H_i := span{|m_1···m_i⟩⟩} to be the space spanned by degree-i basis elements, for each i = 0, ..., 2n. Then H_i ≅ ∧^i C^{2n}, the i-fold wedge product of C^{2n}. It is clear that Û preserves each H_i, so that each H_i is a subrepresentation. On H_1, Û acts as the rotation operator R associated to U. On general H_i, Û acts as the wedge product of the rotation operator.

Claim 3. The natural representation of the matchgate group decomposes into the irreps H_0 ⊕ H_1 ⊕ ··· ⊕ H_n,1 ⊕ H_n,2 ⊕ ··· ⊕ H_2n−1 ⊕ H_2n.
Proof. Define the Hodge star operator *.

Let Ḡ ⊂ G be the subgroup of the matchgate group generated by the rotations R ∈ SO(2n) with R diagonal. Such an R is always of the form R = diag{σ_1, ..., σ_2n} with σ_a = ±1 and σ_1σ_2···σ_2n = 1. The action on a state |m_1···m_i⟩⟩ ∈ H_i is given by Û|m_1···m_i⟩⟩ = σ_{m_1}···σ_{m_i}|m_1···m_i⟩⟩, and therefore the states |m_1···m_i⟩⟩ span the irreps of the natural representation of Ḡ. Because of the constraint σ_1σ_2···σ_2n = 1, each irrep has multiplicity 2, with the irrep spanned by |m_1···m_i⟩⟩ isomorphic to the irrep spanned by |m̄_1···m̄_{2n−i}⟩⟩, where {m̄_a} is the complement of {m_a}. For each i = 0, ..., n, we can define a character function and a corresponding projector. These projectors project onto the multiplicity-two irreps H_i ⊕ H_{2n−i} for i = 0, ..., (n − 1), and onto the two inequivalent irreps H_n,1 ⊕ H_n,2 for i = n.

As our initial state, for each i = 0, ..., n we choose a state |ρ_i⟩⟩ in which the kth qubit is in the + state of the X operator for i = 2k − 1. Provided we can prepare both X-basis and Z-basis single-qubit states, we can prepare |ρ_i⟩⟩. As our measurement projector, for each i = 0, ..., n we choose a projector such that i = 2k − 1 corresponds to a measurement of the kth qubit in the X basis, while i = 2k corresponds to a measurement of the product of the last k qubits in the Z basis.
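The multiplicity-two structure of the diagonal subgroup described above can be checked by brute force for small n. The sketch below enumerates all sign vectors (σ_1, ..., σ_2n) with product +1 and compares the one-dimensional characters carried by each index set and by its complement. It uses only the Python standard library; the function names are hypothetical, and index sets are 0-based.

```python
from itertools import combinations, product
from math import prod

def diagonal_subgroup(m):
    """All sign vectors of length m whose entries multiply to +1."""
    return [s for s in product((1, -1), repeat=m) if prod(s) == 1]

def character(subset, group):
    """Character values of the 1-dim rep spanned by the monomial with the given indices."""
    return tuple(prod(s[i] for i in subset) for s in group)

def character_multiplicities(m):
    """Map each distinct character to the list of index sets that carry it."""
    group = diagonal_subgroup(m)
    table = {}
    for size in range(m + 1):
        for subset in combinations(range(m), size):
            table.setdefault(character(subset, group), []).append(subset)
    return table

table = character_multiplicities(4)  # n = 2 qubits, 2n = 4 Majorana operators
for subsets in table.values():
    print(subsets)  # each character is shared by an index set and its complement
```

Every character appears exactly twice, once for an index set and once for its complement, which is the pairing of H_i with H_{2n−i} used in the text.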
With these choices, the relative uncertainty in the S_i(N) does not depend on the number of qubits. This is therefore a scalable method to benchmark the matchgate group.
The form of the decay is given by Eq. 22: for each i, S_i(N) is a sum of at most two exponential decays with coefficients C_i,1, C_i,2 and decay rates λ_i,1, λ_i,2.

FIG. 7. The predicted and measured character-weighted survival probability for a random error channel. The exact decay (green) is an exponential decay given by one of Eq. 22. We estimate S_i(N) by applying random gates and measuring the final state (blue points). The data is fit to an appropriate function (orange), from which we estimate the fidelity.

For each i, either λ_i,1, λ_i,2, C_i,1, C_i,2 ∈ R, or λ_i,1 = λ_i,2^* and C_i,1 = C_i,2^*, since S_i(N) is always real. For the case of i = n, we know that the former case holds when n is even and the latter when n is odd, by Claim 3. For 1 ≤ i < n, one should assume whichever case gives the best fit. Note that in all cases, we fit at most 4 real parameters.
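The two cases for the decay parameters can be made concrete with a few lines of standard-library Python. The sketch assumes a two-term decay C_1 λ_1^N + C_2 λ_2^N and shows that a complex-conjugate pair of parameters gives a real, oscillating decay 2|C||λ|^N cos(N arg λ + arg C), which still has four real parameters. The function names and the numerical values are hypothetical.

```python
import cmath
import math

def two_exponential_decay(n, c1, lam1, c2, lam2):
    """Evaluate C1 * lam1**n + C2 * lam2**n (complex in general)."""
    return c1 * lam1 ** n + c2 * lam2 ** n

def conjugate_pair_decay(n, c, lam):
    """Real form of the decay when (C2, lam2) is the complex conjugate of (C1, lam1)."""
    return 2 * abs(c) * abs(lam) ** n * math.cos(n * cmath.phase(lam) + cmath.phase(c))

c1 = 0.3 + 0.2j
lam1 = 0.95 * cmath.exp(0.1j)
for n in (0, 5, 10, 20):
    s = two_exponential_decay(n, c1, lam1, c1.conjugate(), lam1.conjugate())
    print(n, s.real, conjugate_pair_decay(n, c1, lam1))
```

When fitting data, the conjugate-pair case can therefore be parametrized directly by a magnitude, a phase, a decay rate, and an oscillation frequency, all of them real.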
As an example, we simulate a noisy implementation of the matchgate group on n = 3 qubits. In Fig. 7, we show the exact value of S_i(N), the data we take to estimate S_i(N), and the fit to S_i(N) according to Eq. 22 for a single random error channel Λ. In Fig. 8, we do the same fitting procedure for a set of randomly generated error channels, and estimate their fidelity. We see that the true fidelity and the estimated fidelity agree within the error bars set by the uncertainty of our fits.

VII. CONCLUSION AND DISCUSSIONS
In this work, we extended the recently introduced character RB of [22] to groups with multiplicity. Compared to earlier work on benchmarking arbitrary groups [20,21], our method allows us to accurately determine the fidelity and fit fewer exponentials to experimental data. The generalization to non-multiplicity-free groups was essential to deriving a rigorous version of subspace RB and a scalable RB protocol for the matchgate group. This generalization also allowed us to develop an improved leakage RB protocol.
While we derived the character RB procedure in more generality than [22], our generalization still requires groups of small multiplicity, since the multiplicity of the group determines the number of exponential decays in our fit function. Robustly fitting a sum of many exponential decays is challenging, especially when the decay rates are roughly equal. It is likely straightforward to benchmark groups in which the trivial irrep has multiplicity three, as the corresponding decay S_0(N) = A + Bλ_{0,1}^N + Cλ_{0,2}^N has only five real parameters. An irrep of multiplicity 3 with a real character function χ has a decay with six parameters, which may be feasible with sufficient data. A general irrep of multiplicity 3, however, requires fitting 9 real parameters, which is likely infeasible for realistic amounts of data. Higher-multiplicity irreps are correspondingly more difficult. All of the groups we considered in the examples in this paper decomposed into irreps with multiplicity at most 2.
All our applications involved a group that preserved some subspace of the Hilbert space. In the case of subspace RB, the group preserved the triplet and singlet subspaces; in the case of leakage RB, the computational and leakage subspaces; and in the case of matchgate RB, the even and odd parity subspaces. Any group that preserves subspaces necessarily has multiplicity, since there is always a copy of the trivial irrep in each subspace. It is an open question whether non-multiplicity-free character RB has useful applications to groups that do not preserve subspaces but nonetheless have multiplicity.
While our leakage RB necessitates the fewest assumptions to date, it is still too restrictive for many experimental implementations. Most notably, our RB requires the set of gates to be a group, which may be unrealistic; often, the gates will only form a group modulo rotations in the leakage space. In experimental implementations of leakage RB, this problem is usually simply ignored and an exponential decay is posited to exist with the usual relation to the leakage rate [52,57]. It is worth exploring whether the methods used here can be further extended to such sets of gates that are only groups in the computational subspace, modulo rotations in the leakage subspace, to provide a more rigorous foundation for leakage RB experiments.
There are two obvious directions for further applications of character RB, with or without multiplicity. First, character RB has the potential to drastically expand the family of groups that can be scalably benchmarked. This requires both finding a group G that can be efficiently compiled into elementary gates and whose multiplicity is bounded as the number of qubits n increases, and finding a subgroup Ḡ ⊆ G whose irreps have slowly growing dimension. As a simple example, the subgroups of the Clifford group considered in [20] likely have a scalable protocol based on character RB, with Ḡ given by the Pauli group. Increasing the number of groups that can be scalably benchmarked gives new ways of characterizing compiled gates, especially non-Clifford gates.
Second, character RB can be used to characterize specific elementary gates by combining these gates into a group, as we did in Section IV for subspace RB. This requires finding a group that can be implemented by combining a fixed number of the gate to be characterized with known high-fidelity gates. Constructing these groups is a non-trivial task, as we have seen in the case of the U_ZZ operator above. We leave the exploration of such applications to future work.
In this appendix, we extend the work of [22,38] on gate-dependent errors to the case of non-multiplicity-free character RB. Ref. [22] had previously generalized [38] to establish that multiplicity-free character RB is robust to gate-dependent errors. Here, we largely follow the same logic as [22,38], with appropriate modifications for the case of non-multiplicity-free groups. Our ultimate goal is the following theorem:

Theorem 2. Let G be a benchmarking group, and let χ_i be a character function for an irrep of the natural representation with multiplicity a_i. Assume each gate U ∈ G is realized as a noisy operator Ũ, but do not assume we can write Ũ = Λ̂Û for some U-independent noise channel Λ. Then the character-weighted survival probability S_i(N) is given by a sum of a_i exponential decays with decay rates λ_i,j, plus an error term ε_N satisfying |ε_N| < δ_1 δ_2^N, where δ_1 and δ_2 are both small for high-fidelity gates. Since we know that λ_i,j ≈ 1 for high-fidelity gates, ε_N is negligible compared to S_i(N) for moderately large N.
This theorem implies we may safely use the RB protocols even in the presence of gate-dependent errors, although we will see the interpretation of the estimated fidelity is slightly modified.
In what follows, we will use the notation E[·] for the average (1/|G|) Σ_{U∈G} (·) to make our equations cleaner. We will also denote the piece of Û acting on the (i, j)th subspace of H ⊗ H by φ_i,j(U). We first prove a technical lemma.
Lemma 1. There exist Hermiticity-preserving operators L̂ and R̂ such that Eqs. A1-A3 hold, where D = Σ_{i,j} λ_i,j P̂_i,j and λ_i,j is the largest-magnitude eigenvalue of the matrix operator E[Ũ ⊗ φ_i,j(U)^*].
Proof. We decompose L̂ as L̂ = Σ_{i=1}^{I} Σ_{j=1}^{a_i} L̂_i,j, where L̂_i,j acts only on the (i, j)th subspace (but has arbitrary range). Eq. A1 then becomes an eigenvector equation for each L̂_i,j. We can rearrange the matrix elements of L̂_i,j into a column vector vec(L̂_i,j), which gives a more explicit form of the eigenvector equation. This equation has a solution, since we picked λ_i,j to be an eigenvalue of E[Ũ ⊗ φ_i,j(U)^*]. Similarly, we can find a solution to Eq. A2 by expressing R̂ = Σ_{i=1}^{I} Σ_{j=1}^{a_i} R̂_i,j, where R̂_i,j has range restricted to the (i, j)th subspace (but has arbitrary domain).
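The vectorization step in the proof rests on a standard identity: with row-major flattening, vec(A X B) = (A ⊗ B^T) vec(X). The sketch below checks this identity numerically and uses it to solve a toy fixed-point equation of the schematic form E[A_U L B_U] = λ L as an ordinary eigenvalue problem. The matrices here are random stand-ins, not a noisy gate set, and the schematic form is our assumption about the structure of Eq. A1; the function names are hypothetical.

```python
import numpy as np

def vec(x):
    """Row-major flattening of a matrix into a vector."""
    return x.reshape(-1)

def sandwich_matrix(a, b):
    """Matrix M such that vec(a @ x @ b) = M @ vec(x) under row-major flattening."""
    return np.kron(a, b.T)

def averaged_eigenoperator(pairs, shape):
    """Largest-magnitude eigenvalue and eigen-operator L of the map L -> mean(a @ L @ b)."""
    m = sum(sandwich_matrix(a, b) for a, b in pairs) / len(pairs)
    vals, vecs = np.linalg.eig(m)
    k = np.argmax(np.abs(vals))
    return vals[k], vecs[:, k].reshape(shape)

rng = np.random.default_rng(2)
pairs = [(rng.normal(size=(3, 3)), rng.normal(size=(2, 2))) for _ in range(4)]
lam, op = averaged_eigenoperator(pairs, (3, 2))
lhs = sum(a @ op @ b for a, b in pairs) / len(pairs)
print(np.allclose(lhs, lam * op))
```

An eigenvector of the averaged Kronecker-product matrix, reshaped back into a matrix, therefore solves the operator equation, and it remains a solution after rescaling by any constant.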
Since we found the R̂_i,j and L̂_i,j by solving eigenvalue equations, they can be multiplied by arbitrary constants and still solve Eqs. A1 and A2. We use this freedom to satisfy Eq. A3. For each (i, j) we know the product R̂_i,j L̂_i,j acts only within the (i, j)th subspace. Conjugating by a unitary does not change this, since the unitaries do not mix irreps. By Theorem 1, the twirl of R̂_i,j L̂_i,j is thus proportional to P̂_i,j. By multiplying R̂_i,j by an appropriate constant we may have E[Û R̂_i,j L̂_i,j Û^†] = λ_i,j P̂_i,j, which establishes Eq. A3.

Proof of Theorem 2. We begin with our formula for the character-weighted survival probability. Ref. [38] demonstrated that Δ_U → 0 as Ũ → Û, so that these error terms become negligible for high-fidelity gates.
We still need to relate the measured decay parameters λ_i,j to a quality measure of the noisy gates {Ũ}. Without loss of generality, we may assume Ũ = L̂_U Û R̂, where R̂ is the operator in Lemma 1 and L̂_U is a gate-dependent operator. The operator R̂ L̂_U is then the gate-dependent analogue of the operator Λ̂. We define a gate-dependent average fidelity F, which reduces to F_Λ for gate-independent noise. Using Eq. 10 to compute the fidelity of a channel in terms of the trace, we can evaluate this trace by assuming R̂ is invertible and using Eq. A2. We therefore end up with the same formula for estimating F as Eq. 5 for F_Λ. If R̂ is not invertible, we can perturb R̂ by an arbitrarily small amount to make it invertible and the relationship still holds; thus, it holds for arbitrary R̂.
To prove G forms a unitary 2-design, we need to show (see Section IV B of the main text) that (1/|G|) Σ_{U∈G} p(U, U^*) = ∫ dU p(U, U^*) for any balanced polynomial p(U, U^*) of degree at most 2 in the elements of U and U^*. Any such p(U, U^*) can be written as a linear combination of terms of the form U A U^† B U C U^† and U D U^†, where A, B, C, D are matrices. We are thus reduced to proving Eqs. B2 and B3 for arbitrary matrices A, B, C, D.
In the following, we will make repeated use of an elementary identity of complex roots of unity (Fact 4).

We evaluate the LHS of Eq. B3 by using Eq. B1 for the conjugation of a general Pauli element. We note that η = h^T v + (···), where (···) denotes terms that do not depend on h. We see by Fact 4 that for fixed M the sum over h gives zero unless v = 0, while when v = 0 it is clear that LHS = 1. This proves Eq. B3.
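The way the sum over h eliminates all terms with v ≠ 0 can be checked directly. The sketch below assumes that Fact 4 is the standard orthogonality relation Σ_h ω^{h·v} = d^m δ_{v,0} for ω = e^{2πi/d} and h ranging over Z_d^m; since the statement of Fact 4 is not reproduced above, this identification is an assumption. The function name is hypothetical and only the standard library is used.

```python
import cmath
from itertools import product

def character_sum(v, d):
    """Sum of omega**(h . v) over all h in Z_d^m, with omega = exp(2*pi*i/d) and m = len(v)."""
    omega = cmath.exp(2j * cmath.pi / d)
    return sum(
        omega ** (sum(h_i * v_i for h_i, v_i in zip(h, v)) % d)
        for h in product(range(d), repeat=len(v))
    )

print(abs(character_sum((0, 0), 3)))  # v = 0: every term equals 1
print(abs(character_sum((1, 2), 3)))  # v != 0: the phases cancel
```

For v = 0 the sum equals the number of terms, d^m, and for any other v it vanishes up to floating-point error, which is the mechanism used to discard the h-dependent terms in the proof.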

Degree 2 polynomials
We now turn to Eq. B2. We prove this using methods from [9], which proved the case d = 2. The RHS of Eq. B2 takes a different value in each of four cases, and we need to evaluate the LHS of Eq. B2 in each of these cases. The first case can be evaluated directly. In the second case, we use Eq. B1 to simplify each summand in the LHS. Therefore, the average over the group G gives ω^{−v_A^T Q v_A} 1.
In the third case, we again simplify each summand using Eq. B1, but with an additional B in between. The average over h does not affect this sum, so we only need to consider the average over M. We evaluate the average by noting that if d is prime, the Clifford group sends every non-identity Pauli string to every other non-identity Pauli string uniformly. Thus, letting M run over all symplectic matrices makes M v_A run uniformly over all vectors in Z_d^{2n} \ {0}. The LHS then follows, where in the final step we use Fact 4.
In the last case, each summand carries a phase that depends on h, up to terms (···) that are independent of h. We can again apply Fact 4 to find that the sum over h gives zero. We have thus proved LHS = RHS for each of the four cases, which establishes Eq. B2.

Appendix C: Leakage RB irreps
Let G be a unitary group indexed by a ∈ A, G = {U_a,σ : a ∈ A, σ = ±1} = {U_1,a ⊕ σU_2,a : a ∈ A, σ = ±1}, where {U_1,a} and {U_2,a} are each unitary 1-designs on their respective subspaces. We claim that |1_1⟩⟩ and |1_2⟩⟩ are the only trivial irreps of the natural representation of G. Next, we prove that if {U_1,a} and {U_2,a} are in addition unitary 2-designs and d_1 ≠ d_2, then H_1⊥ is irreducible and multiplicity-free. To prove these statements, we need a standard result from representation theory.
Fact 5 (Schur orthonormality). If χ is the character of an arbitrary representation φ, and χ_i is the character of an irrep φ_i, the multiplicity a_i of φ_i is a_i = (1/|G|) Σ_{U∈G} χ_i(U)^* χ(U). For a proof, see [37].

We start with the trivial irreps. It is clear that both |1_1⟩⟩ and |1_2⟩⟩ are trivial irreps. In the case of the trivial irrep we have χ_0(U) = 1, so Fact 5 gives the multiplicity of the trivial irrep as an average of χ(U) over the group. In the corresponding computation for H_1⊥, the third equality follows from the unitary 2-design property, and the fourth follows from the fact that H_1⊥ is