Robust Extraction of Tomographic Information via Randomized Benchmarking

We describe how randomized benchmarking can be used to reconstruct the unital part of any trace-preserving quantum map, which in turn is sufficient for the full characterization of any unitary evolution, or more generally, any unital trace-preserving evolution. This approach inherits randomized benchmarking's robustness to preparation, measurement, and gate imperfections, therefore avoiding systematic errors caused by these imperfections. We also extend these techniques to efficiently estimate the average fidelity of a quantum map to unitary maps outside of the Clifford group. The unitaries we consider correspond to large circuits commonly used as building blocks to achieve scalable, universal, and fault-tolerant quantum computation. Hence, we can efficiently verify all such subcomponents of a circuit-based universal quantum computer. In addition, we rigorously bound the time and sampling complexities of randomized benchmarking procedures, proving that the required non-linear estimation problem can be solved efficiently.


I. INTRODUCTION
While quantum process tomography [1] is a conceptually simple approach to the characterization of quantum operations on states, its implementation suffers from a number of fundamental drawbacks. These obstacles range from its exponential scaling with the size of the system to its dependence on precise knowledge of state preparation and measurement. Precise knowledge about state preparation requires precise knowledge about operations and measurements, leading to a difficult nonlinear estimation problem [2][3][4][5]. Lack of precise knowledge about state preparation and measurement can also lead to significant systematic errors in the reconstructed operations [6]. Recently, randomized benchmarking (RB) protocols have been shown to lead to estimates of the average fidelity to Clifford group operations in a manner that is robust against imprecise knowledge about state preparation and measurement, and therefore largely free of some of the systematic errors that can affect standard tomographic reconstructions [7][8][9][10][11][12].
We describe a procedure that provides an almost complete description of any quantum map in a way that is robust against many errors that plague standard tomographic procedures. Specifically, we can estimate the unital part [13,14] of any trace-preserving map, which includes all parameters necessary to describe deterministic as well as random unitary evolution. Furthermore, we show that a related protocol can be used to efficiently estimate the average fidelity to unitary operations outside the Clifford group, again in a way that is accurate even in the presence of state preparation, measurement, and unitary control errors.
Both procedures use RB protocols as a tool, combined with several new results: We show that Clifford group maps span the unital subspace of quantum maps and that important unitaries outside the Clifford group can be expressed as linear combinations of a few Clifford group maps. These insights, combined with new error strategies and analysis, allow us to robustly characterize maps that were previously inaccessible.
Our error analysis rigorously proves that randomized benchmarking decays can be fit efficiently. We also prove new results on the average fidelity of composed maps, which is important for RB, but is also of significance to any procedure where direct access to a quantum map is limited.
This paper is organized as follows. In Sec. II, we give background on general properties of quantum operations. In Sec. III, we sketch the RB protocol and describe the information that can be extracted from such experiments. In Sec. IV, we describe how the information from RB experiments can be used to tomographically reconstruct the unital part of any experimental map even in the presence of imperfect randomizing operations. In Sec. V, we show that it is possible to efficiently bound the fidelity of any experimental map to a class of unitaries capable of universal quantum computation. Finally, in Sec. VI, we analyze error propagation in these protocols. This section includes new bounds on the effect of imperfect randomizing operations and rigorous bounds on the number of samples needed to achieve some desired error and confidence.

II. COMPLETELY POSITIVE TRACE-PRESERVING MAPS: NOTATION AND PROPERTIES
Throughout this paper, we will restrict the discussion to Hermiticity-preserving linear operations on quantum states-more specifically, linear operations on multiqubit states such that the Hilbert space dimension will always be d ¼ 2 n for n qubits. The physical operations within this class that are commonly considered are completely positive (CP) trace-preserving (TP) operations [15][16][17]. We refer to these operations on quantum systems as maps and will denote them by the calligraphic fonts A, B, etc. The composition of two maps will be denoted A∘B, meaning B acts on a state first and then A acts on it. Even when discussing unitary evolution, we will refer to the corresponding maps. The notable exceptions are the identity unitaryÎ and the unitaries in the multiqubit Pauli group P, which will be denotedP i -although the corresponding maps I and P i will also be used in some contexts. We will use the standard convention whereP 0 ¼Î. We use T to mean the map corresponding to the unitary e −i π 8Ẑ . A map E is TP iff trρ ¼ trEðρÞ for allρ, which in turn leads to the requirement that E † ðÎÞ ¼Î, where E † is the Heisenberg picture representation of E. Any linear map E can be written as which is known as the χ matrix representation of E. The map E is CP iff χ E is positive semidefinite, and the TP condition E † ðÎÞ ¼Î translates to It is often necessary to compute the representation of the composition of two maps. While such a calculation can be cumbersome in the χ representation, Liouville representations are more convenient for describing the action of composed maps on quantum states [18]. In the Liouville representation, an operatorρ is represented by a column vector jρii, and maps are represented by matrices acting on these vectors, such that the composition of maps corresponds to matrix multiplication. The most convenient choice of basis for these vectors and matrices depends on the application, but for our purposes, we will use the basis of Pauli operators and will call this the Pauli-Liouville representation (which appears to have no standard name in the literature, despite being widely used [19][20][21][22][23][24]). For a map E, the Pauli-Liouville representation is given by whereP i andP j are n-qubit Pauli operators. Hermiticity preservation implies that all matrix elements of E ðPLÞ are real. The kth entry in the vector jρii representing a density matrixρ corresponds to trρP k . This ensures that the Pauli-Liouville representation of any CPTP map can be written as [19,20] where ⃗ τ ε is a d 2 − 1-dimensional column vector, ⃗ 0 is the corresponding zero vector, and E is a ðd 2 − 1Þ × ðd 2 − 1Þ matrix.
We will quantify how distinct a map E is from a particular unitary map U by the average fidelityFðE; UÞ, which can be written as with integration taken over the unitarily invariant Fubini-Study measure [14]. This definition also impliesFðE; UÞ 1 FðE∘U † ; IÞ ¼FðU † ∘E; IÞ. The average fidelity is closely related to the trace overlap between E ðPLÞ and U ðPLÞ , as well as to χ E∘U † 00 , by the formulas [25,26] FðE; For simplicity and clarity, here and throughout the paper, we omit the superscripts from the Pauli-Liouville representation of superoperators whenever they occur within trace expressions, as these expressions already include superscripts indicating Hermitian conjugates.

III. RANDOMIZED BENCHMARKING OF CLIFFORD GROUP MAPS
RB [7][8][9][10][11][12] consists of a family of protocols to robustly estimate the average fidelityFðE; UÞ between an experimental quantum map E and an ideal unitary map U. In this context, robustness refers to the ability to estimateFðE; UÞ in a manner that is insensitive to imprecise or even biased knowledge about state preparation, measurement, and controlled unitary evolution. Such imperfections can lead to systematic errors, e.g., in fidelity estimates based on standard tomographic reconstruction protocols [5].
We now describe a framework that can be used to understand existing RB protocols but which allows us to highlight how our protocol differs from previous procedures. RB protocols consist of k repeated applications of E, each time preceded by independently chosen randomizing unitary maps D i , where 1 ≤ i ≤ k, and, after the last application of E, followed by a recovery map D kþ1 . The randomizing unitaries are chosen such that if (i) the sequence is applied to a fixed initial state jψi, (ii) E is identical to a certain unitary map U, and (iii) the randomizing maps D i are perfect, then the final state would be identical to the initial state. If the first k randomizing operations are chosen from the Haar measure over unitary maps [7,14] or from a set with the same first-and secondorder moments as the Haar measure [27], the fidelity between the initial and final states can be shown to decay exponentially with k at a rate that depends only onFðE; UÞ [7,9,10]. The RB literature typically assumes either (1) U ¼ I and E represents the errors from the randomizing operations, or (2) U is some other unitary map, and E is its potentially faulty implementation. However, we emphasize that our description is more general, and as we will demonstrate later, it allows us to reconstruct a major portion of arbitrary E, not just implementations of the randomizing operations.
In a realistic setting, one cannot assume that the initial state is pure and exactly known, that one knows what observable is measured exactly, or that the randomizing operations are applied noiselessly. However, these assumptions are not necessary for the RB protocol to work: the initial state can be any mixed stateρ 0 ≠ 1 dÎ , the measured observableM can be any observable where trρ 0M ≠ 1 d trM, and the rate of decay p of the measured expectation value is still related toFðE; UÞ in the same way. The randomizing operations need not be noiseless either [9,10], as long as the imperfect randomizing operations correspond to N ∘D i , with N representing some arbitrary CPTP error map (some of these restrictions may be relaxed, leading to more complex decays [9,10], and although our protocols generalize straightforwardly to such scenarios, we do not discuss them here for the sake of brevity). Under these more realistic assumptions, F k ðE; UÞ, the average of hMi over the choice of randomizing operations, for sequences of length k, is given by where A 0 and B 0 are constants that contain partial information about the preparation and measurement (including imperfections), and By estimating F k ðE; UÞ for different values of k, it is possible to isolate p (which contains the desired information about E) from A 0 and B 0 (which contain the undesired information about preparation and measurement), creating a protocol that is largely free of systematic errors caused by imprecise knowledge of state preparation and measurement [28]. Case (1) discussed above is the original scenario considered in the RB literature [7][8][9][10], where U ¼ I and E ¼ I, so the observed decay leads to a direct estimate of FðN ; IÞ, i.e., how well the randomization operations are implemented. Case (2) discussed above is the extension of RB to the extraction of information aboutFðE; UÞ, where E is one of the randomizing operations in the experiment and U is its unitary idealization. This is a recent development sometimes referred to as interleaved RB [11,12], but we do not make such a distinction in this paper. The previously known result in this case is thatFðE; UÞ can be bounded by experimentally estimatingFðE∘N ; UÞ andFðN ; IÞ, and in Sec. VI A, we provide more general bounds (with fewer assumptions) for the same purpose.
While the RB protocol is valid for any choice of randomizing operations discussed above, we emphasize that, in order to ensure that the protocols remain scalable in the number of qubits, U and D i are restricted to be unitary maps in the Clifford group, since this allows for scalable design of the randomizing sequences via the Gottesman-Knill theorem [29]. Moreover, although previous works have applied the RB protocols only to E very close to Clifford group maps, we emphasize that no restriction beyond E being CPTP needs to be imposed. The restricted applications of the RB protocols in previous work were partially due to the fact that the bounds used to isolatē FðE; UÞ are only useful when E is close to a Clifford group map. Since we are interested in extracting information about arbitrary E, here we consider tomographic reconstruction techniques that do not rely on these bounds. We also design efficient techniques for average-fidelity estimates that rely on new and improved general bounds onFðE; UÞ.
In summary, RB allows for efficient estimation of FðE∘N ; UÞ and efficient bounding ofFðE; UÞ for U in the Clifford group. These estimates can be obtained without relying on perfect information about preparation and measurement errors, thereby avoiding some of the systematic errors that may be present in standard tomographic protocols because of these imperfections.

A. RB sequence design
A compact way to describe how RB sequences are constructed refers back to the idea of twirling [27,[30][31][32][33]. Although this is not how this construction is typically described, we find it to be convenient and include it for completeness.
If E is an arbitrary quantum map, and S is a set of maps fC 0 ; …g, the average map is called the twirl of U † ∘E over S, where E i denotes the expectation value over uniformly random choices for C i ∈ S. If S is the Clifford group or any other unitary 2-design [27], then as before. A length k RB sequence consists of applying the twirled channel repeatedly to the same state k times, i.e., and E ⃗ i denotes the expectation value over uniformly random choices for C i l ∈ S for all l.
The RB protocol to estimateFðE; UÞ then consists of (i) choosing a sequence of C i l for 1 < l ≤ k, (ii) applying the alternating sequence of D i l and E, as prescribed in (3.6), to a fixed initial state, (iii) measuring the resulting state, and (iv) averaging over random choices for C i l to obtain F k . The F k can be fit against (3.1), yielding an estimate for p, even in the presence of imperfections. As we prove in Sec. VI, this estimate can be obtained efficiently in the number of qubits, desired accuracy, and confidence. Note that neither E nor U needs to be an element of the Clifford group. However, we will generally consider the case where E is not a Clifford group map, while U will be chosen to be a Clifford group map. Choosing U to be a Clifford group element makes the design of the experiments for n qubits efficient [10,29], while leaving E unconstrained affords us greater flexibility and has no impact on the design of the experiment.

IV. TOMOGRAPHIC RECONSTRUCTION FROM RB
As discussed above, RB can efficiently provide bounds on the fidelities of an arbitrary CPTP map E with any element of the Clifford group-in a manner that is robust against preparation and measurement errors, as well as imperfections in the twirling operations. Here, we demonstrate that the collection of such fidelities of a fixed E to a set of linearly independent Clifford group maps can be used to reconstruct a large portion of E. The advantage of this approach is that the robustness properties of the estimates obtained via RB carry over to this tomographic reconstruction.
Using the Liouville representation of quantum maps, it is clear that an estimate of the average fidelityFðE; UÞ leads to an estimate of trU † E, and thus, all information that can be extracted from these fidelities for a fixed E is contained in the projection of E onto the linear span of unitary maps. It is unnecessary to consider the span of arbitrary unitary maps, as the following result demonstrates (see Appendix A for the proof).
Lemma IV.1.-The linear span of unitary maps coincides with the linear span of Clifford group unitary maps. Moreover, the projection of a TP map to this linear span is a unital map.
Given a set of linearly independent vectors that span a subspace, and the inner product of an unknown vector with all elements of that set, it is a simple linear algebraic exercise to determine the projection of the unknown vector onto the subspace. Similarly, measuring the average fidelity of some TP map E to a Clifford group map C i is equivalent to measuring such an inner product-the matrix inner product trðEC † i Þ. Since Clifford maps span the unital subspace of quantum CPTP maps, measuring the inner product of E with a set of maximal linearly independent elements of the Clifford group is sufficient to reconstruct the projection of E onto the unital subspace. We call this projection the unital part of E and denote it by E 0 .
Since the unitality condition constrains only how the map acts on the identity component of a state, E 0 can be obtained by changing how E acts on that component. Defining Q to be the projector into the identity component of any operator and Q ⊥ to be the projection into the orthogonal complement (i.e., Q þ Q ⊥ ¼ I), one finds that which indicates that E and E 0 map traceless operators in the same way. The maps E and E 0 have Pauli-Liouville representations so we refer to ⃗ τ ε as the nonunital part of E. It is then clear that E 0 is described by ðd 2 − 1Þ 2 real parameters if E is TP, while E itself is described by ðd 2 − 1Þd 2 real parameters. The unital part of E contains the vast majority of the parameters needed to describe E-in fact, over 93% of the parameters for two qubits and over 99% of the parameters for four qubits.
As discussed, one limitation of RB is that in a realistic setting, it can only provide bounds forFðE; C i Þ (and therefore trEC † i ) because of the imperfections in the randomizing operations. Clearly, these bounds can only lead to a description of parameter-space regions compatible with E 0 as opposed to any point estimator, even in the absence of statistical fluctuations. Our approach to reconstruct E 0 is to avoid these bounds altogether and instead use the following result, which we prove in Appendix B.
Lemma IV.2.-If ðE∘N Þ 0 is the unital part of E∘N and N 0 is the unital part of N , and all these operations are trace preserving, then This allows us to reconstruct E 0 from the reconstructions of ðE∘N Þ 0 and N 0 . As both ðE∘N Þ 0 and N 0 are related directly to decay rates, we can create a point estimate of E 0 , without recourse to the bounds needed in standard RB to characterize E.
It should be noted that the only cases where ðN 0 Þ −1 does not exist are when N completely dephases some set of observables (i.e., maps them to something proportional to the identity). However, the experimental setting where tomographic reconstructions are interesting is precisely in the regime where N is far from depolarizing any observable, so ðN 0 Þ −1 is typically well defined [34]. The penalty, of course, is that the application of ðN 0 Þ −1 leads to greater statistical uncertainty in the estimate of E 0 thanks to the uncertainties in the reconstructions of N 0 and ðE∘N Þ 0 as well as uncertainty propagation due to multiplication by ðN 0 Þ −1 . However, larger experimental ensembles can be used to compensate for this, as is discussed in the section that follows.
Moreover, writing the imperfect randomizing operations as N ∘C i instead of C i ∘N Ã for some different map N Ã is merely a convention, and Lemma IV.2 can be trivially adjusted to such a different convention. In the physical regimes where RB estimates are expected to be valid, the choice of conventions is largely immaterial (see Appendix E for more details).
This result shows that the average fidelities with a spanning set of Clifford group unitary maps can lead not only to a point estimator of the unital part of any TP map but also to a point estimator of the average fidelity of E to any unitary map-i.e., information from multiple RB experiments can eliminate the need for the loose bounds on the average fidelity considered in Ref. [12]. This comes at the cost of efficiency, as the unital part of a map-like the complete map-contains an exponential number of parameters. However, for a small number of qubits, the overhead of reconstructing the unital part is small, and therefore, it is still advantageous to perform this cancellation to get better estimates of the error.

A. Example: Single-qubit maps
In order to reconstruct the unital part of a single-qubit map, one must first consider a set of linearly independent maps corresponding to unitaries in the Clifford group. As this group contains 24 elements, there are many different choices for a linearly independent set spanning the tendimensional unital subspace. One particular choice of unitaries leading to linearly independent maps iŝ In a noiseless setting, estimating the average fidelities between these Clifford maps and the map corresponding to the single-qubit Hadamard gate leads to the decays illustrated in Fig. 1. The corresponding p values are (4.10) Note, in particular, that some p values are negative, which simply indicates an oscillatory exponential-decay behavior. While these decay rates are much larger (i.e., the p values are much smaller) than those typically seen in previous RB protocols, we show in Sec. VI B that it is possible to efficiently estimate any decay rate to fixed accuracy, no matter the size.
If one considers a noisy setting, where N is not the identity, the decay rates are modified by N , but after reconstructing N 0 and ðE∘N Þ 0 separately, one is able to reconstruct E 0 . To see that errors in the estimate of N 0 will not create unmanageable errors in the estimate of E, consider how errors in the estimate of N 0 affect the estimate of ðN 0 Þ −1 . The relative error in the estimate of ðN 0 Þ −1 is given by [35] as long as where G 0 is the error in the estimate of N 0 , and κðN 0 Þ is the condition number for the matrix inversion of N 0 with respect to the matrix norm ∥ · ∥. The condition number of A is given by If we choose ∥ · ∥ to be the spectral norm, even when N 0 is the depolarizing map DðρÞ ¼ δρ þ ð1 − δÞˆI d , the condition number of N 0 is given by , one finds κðN 0 Þ ¼ 1 jγj . Thus, even for δ and γ polynomially close to 0, a polynomial increase in the number of statistical samples can be used to ensure an estimate of the inverse of N to any polynomial accuracy with high probability.

B. Beyond unital maps
What does the reconstruction of E 0 tell us about the E? We prove the following in Appendix C.
Lemma IV.3.-The unital part of a CPTP single-qubit map is always a CPTP map.
This means that the unital part of a single-qubit map imposes no lower bound on the magnitude of the nonunital part of that map-the nonunital part can always be set to 0.
For a single qubit, the unital part does impose stringent conditions on the maximum size of the nonunital part. Up to unitary rotations, any map can be written in the Pauli-Liouville representation as [20] 0 B B B @ 1 0 0 0 (4.14) where λ i and t i are real-valued parameters. The λ i , corresponding to the unital part, can be estimated using the techniques already described, but as Lemma IV.3 demonstrates, no useful lower bound on jt i j can be obtained. However, for the map to be positive, it is necessary that jt i j ≤ 1 − jλ i j [20], which gives upper bounds on the magnitudes of the nonunital parameters. The fact that, for single-qubit maps, E 0 is always CP can be turned around to say that statistically significant non-CP estimates of E 0 imply statistically significant non-CP estimates of E and may be used as witnesses of systematic errors in the experiments [6,36].
Lemma IV.3 fails in the case of multiple qubits, and it is not difficult to construct counterexamples. Numerical experiments indicate that CPTP maps chosen at random by drawing unitary dilations from the Haar distribution lead to non-CP unital parts with probability around 1. This implies that, while it may not be possible to test complete positivity of a general map by testing only its unital part, the reconstruction of the unital part of a multiqubit map yields lower bounds on the magnitudes of the nonunital parameters. Thus, while this result precludes the use of the unital part of a multiqubit map to test for systematic errors in experiments, it does provide more information about the nonunital parameters.

V. FIDELITY ESTIMATION BEYOND THE CLIFFORD GROUP
Previous RB results showed how to bound the average fidelity of Clifford operations [10,12]. While the maps in the Clifford group form an integral part of current approaches to scalable fault tolerance in quantum computers, universal quantum computation is only possible if operations outside the Clifford group are also considered. We would like to be able to not only efficiently verify the performance of Clifford gates but also verify the FIG. 1. RB decays used to estimate the fidelity between an ideal Hadamard gate andĈ 0 (green circles, with p ¼ 1 The decays corresponding to each of the remaining average fidelities coincide with one of these two representative decays. Note that these decays are much faster than decays previously estimated in RB, as they correspond to the average fidelities between very different maps. The data points are offset along the x axis for clarity. performance of universal circuits. However, there are strong indications that quantum computers are strictly more powerful than classical computers; for example, if classical computers could efficiently simulate certain classes of nonuniversal quantum circuits, it would imply a collapse of the polynomial hierarchy [37]; thus, it is considered highly unlikely. It is therefore extremely unlikely that classical computers can efficiently predict the behavior of a general polyðnÞ-depth quantum circuit [38], and without these predictions, it is not possible to check if a quantum computer is behaving as desired. For these fundamental reasons, we do not expect it to be possible to efficiently estimate the average fidelity to a general quantum map.
It is important to note, however, that it is possible to efficiently simulate some circuits that contain maps outside the Clifford group. In particular, Aaronson and Gottesman [39] have proven that circuits consisting of Clifford group maps and a logarithmic number of maps outside the Clifford group can be simulated efficiently. Despite efficient simulation being possible, these circuits can be thought of as discrete components that enable universality under composition, and thus the ability to verify their implementation is of great practical importance. We now show how our methods can be extended to allow for efficient estimation of the average fidelity of any experiment to such circuits.

A. Average fidelity to T
The canonical example of a map outside the Clifford group is the operation T ¼ e −i π 8Ẑ . This gate is commonly used in combination with Clifford group operations to form a gate set that is universal for quantum computation [40]. In this section, we show how to efficiently bound the average fidelity of a map E to U ¼ T .
In Sec. IV, we prove that Clifford maps span the space of unital maps. This implies that, in the Pauli-Liouville representation, any unitary map U ðPLÞ can be written as a linear combination of Clifford maps with β i ∈ R. By linearity, For an arbitrary unitary U, the number of nonzero β U i , which we denote by N U , can be as large as Oðd 2 Þ. However, T ðPLÞ can be written as a linear combination of three Clifford maps. The support of T ðPLÞ is given by the maps corresponding to the Clifford group unitaries,Î,Ẑ, and e −i π 4Ẑ , with the corresponding coefficients 1 2 , 1− ffiffi 2 p 2 , and 1 ffiffi 2 p . Thus, to estimateFðE∘N ; T Þ, one only needs to estimate three average fidelities to Clifford group maps (instead of the ten necessary for reconstruction of the unital part).
Suppose one estimates each fidelityFðE∘N ; C i Þ for all of the C i in the linear combination to within ϵ 0 with confidence 1 − δ 0 . In Sec. VI B, we show that this requires Oð N U ϵ 04 log 1 δ 0 Þ samples. From Eq. (5.3), it is clear that one can obtain an estimateF such that , so an estimate for the average fidelity to T can be obtained by the following procedure: (1) Perform RB with Oð 1 ϵ 4 log 1 δ Þ samples for each relevant fidelityFðE∘N ; C i Þ. This requires Oð 1 ϵ 4 log 1 δ Þ total samples and results in an estimateF such that PðjF −FðE∘N ; T Þj ≥ ϵÞ ≤ δ: (5.6) (2) Perform RB with Oð 1 ε 4 log 1 δ Þ samples to obtain an estimateF N ofFðN ; IÞ such that PðjF N −FðN ; IÞj ≥ εÞ ≤ δ: (5.7) (3) In Sec. VI A, we show how to bound the fidelity ofFðE; UÞ, given estimates ofFðE∘N ; UÞ and FðN ; IÞ. Apply the bounds of Sec. VI A for FðE∘N ; T Þ ¼F AE ϵ, and forFðN ; IÞ ¼F N AE ϵ, to obtain bounds onFðE; T Þ that are valid with probability of at least 1 − 2δ. This procedure trivially extends to bounding the fidelity of E to the case where T acts on a single qubit and the identity acts on n − 1 qubits. The sampling complexity remains the same, but the time complexity changes, as the classical preprocessing time needed to make a single average fidelity estimate scales as Oðn 4 Þ [12]. Similar arguments can be used to show that the sampling complexity of determining the average fidelity of E to any one-or two-qubit unitary acting on n qubits is constant, with the same classical preprocessing time complexity. In the next section, we will discuss more general operations acting on n qubits.

B. Average fidelity to more general unitaries
It is possible to efficiently bound the average fidelity of a map E to a unitary U when U is a composition of OðpolyðnÞÞ Clifford maps and OðlogðnÞÞ T maps on n qubits (i.e., maps that act as T on one qubit and as the identity on the remaining n − 1 qubits). Under these constraints, (i) U ðPLÞ can be efficiently decomposed into a linear combination of OðpolyðnÞÞ Clifford maps [i.e., N U ¼ OðpolyðnÞÞ]; (ii) the coefficients β U i in the linear combination satisfy P i jβ U i j ¼ OðpolyðnÞÞ. Following the argument of Sec. VA, the sampling com- Þ log N U δ 0 Þ, so together (i) and (ii) guarantee that the sampling complexity of bounding FðE; UÞ is OðpolyðnÞÞ. Since (i) guarantees that the decomposition is efficient, and the classical preprocessing time needed to make a single sample scales as Oðn 4 Þ, the time complexity is also OðpolyðnÞÞ.
We prove (i) by induction on t (the number of T maps in the circuit) and c (the number of Clifford maps in the circuit). We show that one can decompose U ðPLÞ into a linear combination of at most 3 t terms, where each Clifford map in the linear combination is written as a composition of at most t þ c Clifford maps. The base case is given by the following (a) t ¼ 1, c ¼ 0: U is a single T , and U ðPLÞ can be written as a linear combination of three Clifford maps. (b) t ¼ 0, c ¼ 1: U is a Clifford map, and so U ðPLÞ can be written as a linear combination of one Clifford map. For the inductive case, we assume that one has a unitary U that is a composition of t T maps and c Clifford maps. By inductive assumption, U ðPLÞ can be written as with M ≤ 3 t and N i ≤ t þ c. Now consider composing U with a Clifford C. Then and one obtains a linear combination of ≤ 3 t terms, each a composition of c þ t þ 1 Clifford maps. Likewise, if U is composed with T , then where the C T k are the three Clifford maps involved in the linear combination of T ðPLÞ , so one obtains a linear combination of 3 tþ1 elements, each a composition of t þ c þ 1 Clifford maps, as desired.
Therefore, if one has a unitary map U composed of OðpolyðnÞÞ Clifford maps and OðlogðnÞÞ T maps, one can write U ðPLÞ as a linear combination of OðpolyðnÞÞ Clifford maps, where each term in the linear combination is a composition of at most OðpolyðnÞÞ Clifford maps. A sequence of OðpolyðnÞÞ Clifford maps can be efficiently simplified into a single Clifford map using the Gottesman-Knill theorem [29]. The average fidelity estimate to U is obtained by estimating the average fidelities to these simplified Clifford maps.
To see that (ii) also holds, suppose one calculates a linear combination for U ðPLÞ based on the above construction. It is possible that different terms in the linear combination result in the same Clifford map, but for simplicity, we treat each term separately so that our estimate of the complexity is an upper bound. Then, if the circuit decomposition of U contains t T maps, so P i jβ U i j scales, at most, as OðpolyðnÞÞ for t ¼ Oðlog nÞ. These results demonstrate that robust estimates of the average fidelities to unitary maps outside the Clifford group can be obtained efficiently, scaling polynomially in the number of qubits.

VI. BOUNDING ERROR IN AVERAGE FIDELITY ESTIMATES
In this section, we bound sources of error that occur in RB procedures. There are two sources of uncertainty we consider. When trying to efficiently estimate the average fidelity E without inverting N , as we do in Sec. V, we lack direct access toFðE; UÞ and instead can only estimatē FðE∘N ; UÞ andFðN ; IÞ. This leads to an error on our estimate ofFðE; UÞ. We also consider statistical errors from the sampling of random variables and show that we can efficiently fit RB decays to any constant error. As a consequence, this allows us to efficiently bound the average fidelity to maps outside the Clifford group, as described in Sec. V. We address these two effects separately. These types of uncertainties can be found in many contexts, so we expect the analysis in Secs. VI A and VI B has broader applications.

A. Bounds on average fidelity of composed maps
In this section, we show how to boundFðE; UÞ when we have estimates ofFðE∘N ; UÞ andFðN ; IÞ.
In Appendix D, we prove This bound is valid for any maps A and B. There exist A and B that saturate the upper bound, but the lower bound is not tight, for reasons we discuss in Appendix D. Generally, this method gives better bounds when the operation E is close to U and when N is close to I (i.e., the imperfections in the randomizing operations are small). Because these lower and upper bounds-just like the bounds in Ref. [12]-are not close to each other, except in the regime where E is close to U, they are not useful for the type of tomographic reconstruction performed in Sec. IV, where an arbitrary map might be far from a Clifford map or from a map that is composed of Clifford maps and OðpolyðnÞÞ T maps.
Previous work on average-fidelity estimates based on RB have derived the bound [12] which is only valid whenFðA; IÞ ≥ 2FðB; IÞ − 1 or, in the fidelity estimation context, whenFðE; UÞ is close to 1 [41]. There is no way to directly verify from the experimental data that this requirement holds, but in order to compare the bounds in Ref. [12] with the bounds derived here, we use Eq. (6.1) to bound the region of validity of Eq. (6.2). As illustrated in Fig. 2, the bounds derived here are better whenFðA∘B; IÞ is close to 1 but are applicable to the entire range of parameters without additional assumptions about the maps involved.

B. Confidence bounds on fidelity estimates
Here, we show how to extract FðE; UÞ from the estimated points F k ðE; UÞ (the average fidelity of a length-k RB sequence-see Sec. III). We rigorously bound the error and sampling complexity of this nonlinear fit.
One can easily show that, using the Hoeffding bound, an estimateF k for F k ðE; UÞ can be obtained such that [9] PrðjF k − F k ðE; UÞj ≥ ε 0 Þ ≤ δ 0 ; with a number of samples Oð 1 ε 02 log 1 δ 0 Þ, which is independent of the number of qubits in the system. What we show here is that this allows for p [and thusFðE; UÞ] to be estimated with a number of samples that also scales well with some desired accuracy and confidence. In standard RB experiments, p is estimated by numerical fits to theF k with many different sequence lengths, but the dependence of the error on the number of samples per sequence length is difficult to analyze. Here, we take a different approach that leads to simple bounds on the accuracy and confidence.
Since F k ðE; UÞ ¼ A 0 p k þ B 0 , it is easy to see that and therefore, at least in principle, p can be estimated by using only sequences of length 1 and 2, along with a sequence long enough to ensure jA 0 p k j ≪ jB 0 j [42], with the corresponding expectation denoted by F ∞ . Assuming that eachF i is estimated with accuracy ε 0 and confidence 1 − δ 0 , and that 0 is not in the confidence interval for F 1 −F ∞ , it follows that the estimatep for p > 0 and A 0 > 0 satisfies with probability of at least 1 − 3δ 0 (similar expressions hold for the cases of negative p or A 0 , but for simplicity, we focus on the expressions for the positive case). If jA 0 j or jpj is small, these bounds diverge, so it is important to test the data to exclude these cases.
FIG. 2. Bounds on χ A 0;0 versus χ A∘B 0;0 , when χ B 0;0 is fixed at 0.995. Our bounds are shown as a solid blue line, while the bounds of Ref. [12] are shown as a red dashed line. The bounds of Ref. [12] are valid in the green shaded region, while our bounds are valid for all values of χ A∘B 0;0 .
Note that A 0 is independent of the sequences being used, so one can choose to estimate A 0 from a sequence with large p. Denoting the F i estimates for those sequences as F 0 i , and assuming that the confidence interval forF 0 with probability of at least 1 − 3δ 0 . From Eq. (6.4), and thus so if one desires an accuracy ε forp, whenever one can setp ¼ 0, thereby avoiding the divergent confidence intervals while still providing estimates with the desired accuracy. Similarly, from Eq. (6.4), it follows that Thus, choosing ϵ 0 ¼ 4ϵ 2 a, one can safely Taylor expand Eq. (6.6) to first order in ε to obtain with probability of at least 1 − δ ¼ 1 − 6δ 0 , as desired, using Oð 1 ε 4 log 6 δ Þ samples. This immediately gives an estimateF forFðE; UÞ that can be obtained as PrðjF k −FðE; UÞj ≥ εÞ ≤ δ; (6.15) with Oð 1 ε 4 log 1 δ Þ samples.

VII. SUMMARY AND OUTLOOK
We have demonstrated that, using information from multiple RB experiments, it is possible to reconstruct the unital part of any completely positive trace-preserving map in a way that is robust against preparation and measurement errors, thereby avoiding some forms of systematic errors that plague more traditional tomographic reconstruction protocols. The unital part of a map consists of the vast majority of the parameters of that map, including all parameters necessary to describe any deterministic unitary map, as well as any random unitary map, such as dephasing with respect to any eigenbasis.
We also presented a robust procedure for bounding the average fidelity to an arbitrary unitary and showed that this protocol is efficient for a large class of unitaries outside of the Clifford group. The overhead of the procedure depends on how the unitary is decomposed as a linear combination of Clifford group unitary maps, and we gave rigorous bounds on the number of samples needed to achieve some desired accuracy and confidence in the fidelity estimate.
The extension of these results to nonqubit systems remains an open problem. In addition, the characterization of the nonunital part of a map in a robust manner seems to present a larger challenge than the characterization of the unital part. New techniques are needed to access this important information. Note added in proof.-After the paper was accepted and while we were waiting for the publication proofs to appear, we became aware that a more general form of Lemma IV.1 was previously proven independently by Wim van Dam and Mark Howard in Ref. [43].

APPENDIX A: UNITAL MAPS AND THE LINEAR SPAN OF UNITARY MAPS
The Pauli-Liouville representation is particularly convenient when discussing the Clifford group of n-qubit unitary maps because, in this representation, such maps are monomial matrices [44,45]. In the particular case of qubits, E ðPLÞ ij ∈ fAE1; 0g for a unitary in the Clifford group. Given these facts, we can now straightforwardly prove the result about the linear span of Clifford group maps on n qubits. First, we need to prove a small result about Clifford group unitaries.
Proof.-This claim shows that there are no subsets of nonidentity multiqubit Pauli operators that do not mix under the action of the Clifford group.P i andP j can both be written as tensor products of single-qubit Pauli operators and identity operatorsÎ, where in the tensor product of each, there is at least one element that is notÎ. Using local Clifford group unitaries, one can take each nonidentity element in each tensor product to the single-qubit Pauli operatorX. We call these new Pauli operatorsP i 0 and P j 0 . Now the problem is equivalent to finding Clifford group unitaries that take one tensor product ofX andÎ to another.
Let CNOT k;l denote the controlled-NOT unitary with qubit k as a control and qubit l as a target. The CNOT is a well-known unitary in the Clifford group with the property that CNOT k;lX ðkÞ CNOT k;l ¼X ðkÞ ⊗X ðlÞ , where we usê X ðiÞ to denoteX acting on the ith qubit. In this way, one can increase or decrease the number ofX in the tensor product decomposition ofP i 0 using unitary maps, as long as there is at least oneX in the tensor product. This means that any tensor product ofÎ andX on n qubits can be mapped to any other tensor product ofÎ andX on n qubits through the use of CNOT unitaries-in particular, one can mapP i 0 toP j 0 . Now we can prove the intended result. Proof.-It suffices to show that any matrix element in the unital part of a map (in the Pauli-Liouville representation) can be written as a linear combination of Clifford group unitary maps.
The Pauli-Liouville representations of unitaries in the n-qubit Clifford group are monomial matrices with nonzero entries equal to AE1. For any given such unitaryĈ, one can construct 4 n orthogonal unitaries of the formP iĈ , with corresponding 4 n mutually orthogonal Pauli-Liouville representation matrices. Pauli operators are diagonal in the Pauli-Liouville representation, so for a fixedĈ, the Pauli-Liouville representations of allP iĈ have support in the same set of 4 n matrix elements as the Pauli-Liouville representation ofĈ; thus, the values of any of these matrix elements for any map E can be recovered by collecting the Hilbert-Schmidt inner products between E ðPLÞ and the Pauli-Liouville representation of the map for theP iĈ , i.e., trEðP i CÞ † . From Claim A.1, one can choose a Clifford group unitary that has support on any particular matrix element in the unital block; therefore, any unital matrix can be written as a linear combination of Clifford group unitary maps. Since Clifford group maps are unital, this concludes the proof.

APPENDIX B: RECONSTRUCTION OF THE UNITAL PART WITH IMPERFECT OPERATIONS
In the main body of this paper, we describe how RB allows for the reconstruction of the unital parts of E∘N and N , where E is some quantum operation one would like to characterize, and N is the error operation associated with each of the randomizing operations. We now prove the result that allows for the estimation of the unital part of E alone, given an estimate of the unital part of N .
Proof.-Any trace-preserving linear map A can be written in the Pauli-Liouville representation as (B1) where, as discussed previously, the unital part is The Pauli-Liouville representation of the composition of two trace-preserving linear maps A and B is given by the multiplication of the Pauli-Liouville representations, resulting in (B3) and thus Recall that the Pauli-Liouville representation of a singlequbit map E may be written as King and Ruskai [20] show that there exist unitary maps U and V such that To prove E 0 is CPTP, we first show thatȆ 0 (the projection ofȆ onto the unital block) is always CPTP, and then we prove that ifȆ 0 is CPTP, E 0 is CPTP.
Proof.-Ruskai et al. [46] prove that E is CP if and only if So, is a valid CP map. Then, is also a valid CPTP map because the composition of valid quantum maps is always a valid quantum map. However, by Eq. (4.3), W is equal to E 0 , so the unital part of a singlequbit map is always CPTP.

APPENDIX D: BOUNDS ON FIDELITY
Recall that for an operation E, the χ-matrix representation is Because of complete positivity constraints, χ matrix elements satisfy Composing two maps, their χ-matrix representations are A∘BðρÞ ¼ X m;n;k;j χ A m;n χ B k;jPmPkρPjPn : (D3) Let σ i ðmÞ be the index such thatP σ i ðmÞPm ¼P i . Then, using the fact that the absolute value is greater than the real or imaginary parts of a complex number, we obtain Setting i ¼ 0 gives the desired result. To see why the lower bound is, in general, not tight, consider Eq. (D4). For the lower bound on χ A∘B i;i , we take all of the terms of the form χ A σ i ðmÞ;σ i ðnÞ χ B m;n and replace them with −jχ A σ i ðmÞ;σ i ðnÞ jjχ B m;n j because many of these terms have unknown phases, which in the worst case, can have a value −1. However, when m ¼ n, because χ is positive semidefinite, we get terms of the form χ A σ i ðmÞ;σ i ðmÞ χ B m;m ¼ jχ A σ i ðmÞ;σ i ðmÞ jjχ B m;m j; thus, we are subtracting terms that should actually be added. However, there is no way to address this issue without obtaining more information about the χ matrix.

APPENDIX E: ORDERING OF ERROR MAPS
Throughout this paper, we chose to describe the noisy maps as the composition of the ideal map and some error map (applied in that order). In other words, the noisy implementation of the map C i is expressed as where N i is the error map and C i is the ideal Clifford map. This choice can be made without loss of generality, and it has no effect on experimental observations. In other words, we could instead express the implementation of the Clifford C i as where N Ã i is, in general, a different error map. The average fidelity of these noisy maps to C i is the same, or, equivalently, N i and N Ã i have the same average fidelity to the identity. However, in general, N i ≠ N Ã i . Other process metrics are immune to this problem because they take the error to be additive rather than multiplicative, and so there is no ordering choice to be made or imposed.
If all error maps are identical for either of the conventional choices (N i ¼ N or N Ã i ¼ N Ã ), then Eq. (3.1) holds, and small deviations from these cases lead to perturbative corrections that generalize the results in Refs. [9,10]. If the error maps are close to the identity, both perturbative models are likely to be valid, so N ≈ N Ã -the question of which convention is used becomes immaterial. However, if either N or N Ã is far from the identity, low-order perturbative expansions may not be valid for one of the conventions. Fits to individual RB decays cannot differentiate between these two cases, and the bounds used to isolate the error in E from N or N Ã do not depend on this conventional choice, thus, as long as Eq. (3.1) holds for some separation of the error and ideal channel.
A problem arises when one attempts to use Lemma IV.2, as, unless N ≈ N Ã , the choice of conventions becomes important. The physical regime where, e.g., N i ≈ N is precisely the regime where N i ≈ N Ã i ≈ I, and so this is not likely to be a problem in practice; within the accuracy of the perturbative expansions to Eq. (3.1), the inversion in Lemma IV.2 will be valid, as would a similar inversion taking the error map to occur before the ideal map.
In the more general formal cases where, e.g., N i ≈ N but the N Ã i are very different from each other, there appears to be no way to choose the appropriate convention from individual observations. It may simply be the case that the E 0 reconstruction via Lemma IV.2 using one convention is highly unphysical, while the other is physical, indicating which convention should be used. In the absence of this indication of systematic errors, however, one should report both reconstructions or simply choose the worst of the two.