Dynamically corrected gates suppress spatio-temporal error correlations as measured by randomized benchmarking

Quantum error correction provides a path to large-scale quantum computers, but is built on challenging assumptions about the characteristics of the underlying errors. In particular, the mathematical assumption of statistically independent errors in quantum logic operations is at odds with realistic environments where error sources may exhibit strong temporal and spatial correlations. We present experiments using trapped ions to demonstrate that the use of dynamically corrected gates (DCGs), generally considered for the reduction of error magnitudes, can also suppress error correlations in space and time throughout quantum circuits. We present a first-principles analysis of the manifestation of error correlations in randomized benchmarking, and validate this model through experiments performed using engineered errors. We find that standard DCGs can reduce error correlations by $\sim50\times$, while increasing the magnitude of uncorrelated errors by a factor scaling linearly with the extended DCG duration compared to a primitive gate. We then demonstrate that the correlation characteristics of intrinsic errors in our system are modified by use of DCGs, consistent with a picture in which DCGs whiten the effective error spectrum induced by external noise.


I. INTRODUCTION
Suppressing and correcting errors in quantum circuits is a critical challenge driving a substantial fraction of research in the quantum information science community. These efforts build on quantum error correction (QEC) and the theory of fault tolerance [1][2][3][4][5][6] as the fundamental developments that support the concept of large-scale quantum computation [7][8][9]. In combination, these theoretical constructs suggest that so long as the probability of error in each physical quantum information carrier can be reduced below a threshold value, a properly executed QEC protocol can detect and suppress logical errors to arbitrarily low levels, and hence enable arbitrarily large computations. Underlying this proposition is an assumption that errors are statistically independent, i.e., the emergence of a qubit error at a specific time is uncorrelated with errors arising in other qubits or at any other time in the computation. Error correlations that decay with distance between qubits (spatially) can induce simultaneous multi-qubit errors [10], and correlations that decay with circuit length (temporally) have been shown to produce more rapid accumulation of net circuit errors [11,12].
The practicality of the assumption of uncorrelated errors has long been questioned, as laboratory sources of noise commonly exhibit strong temporal correlations, captured through spectral measures exhibiting high weight at low frequencies. As such, coherent errors induced by low-frequency noise and miscalibrations have recently become a larger focus of research, with their detrimental effects on QEC implementations being examined [11,13][14][15] and first ideas targeting their suppression emerging [16,17]. Attempts to address these errors in the theory of quantum error correction are challenging and results to date suggest that revision of postulated fault-tolerant thresholds may be required [18,19] relative to more optimistic predictions that have recently emerged [20]. Indeed, when implicit assumptions that errors are both spatially and temporally uncorrelated are weakened, the value of a tolerable error threshold can change from some value $\epsilon$ to $\epsilon^2$, easily leading to order-of-magnitude decreases in the acceptable error rates [7].
* Contact: michael.biercuk@sydney.edu.au
The adverse effect of correlated errors on error correction procedures has been observed in the context of a repetition code both experimentally [21], where they were seen to effectively negate any advantage obtained from iterative error correction, and theoretically [13], where an increase in the logical failure rate was identified. Furthermore, while a recent full-scale numerical simulation has shown that coherent errors at the physical layer can, in fact, be overcome by topological error correcting codes [22], large numbers of physical qubits are required with error rates that are uniformly sub-threshold. The emerging message is that, while correlated errors do not invalidate the use of QEC, their presence can significantly increase the requisite overhead, and may reduce the tolerable magnitude of physical qubit errors.
In this manuscript, we demonstrate experimentally that using a low-level abstraction known as a dynamically corrected gate (DCG), we can suppress error correlations in addition to error magnitudes. Replacing "primitive" physical quantum gate operations with logically equivalent DCGs [23][24][25][26][27] forms a "virtual" layer wherein error characteristics can be modified ("virtualized") before the application of QEC [28,29]. We present a novel first-principles analysis of Clifford randomized benchmarking [30,31] in order to quantitatively model the impact of error correlations on simple experimental observables, building on concepts in [32]. Specifically, we identify that error correlations are manifested in the scaling of the distribution over sequence randomizations, at fixed sequence length, with measurement averaging. We validate this framework using randomized benchmarking experiments performed with a single trapped ytterbium ion. We then demonstrate that the replacement of the individual Clifford operations within each sequence with logically equivalent DCGs modifies the error correlation signatures such that they are experimentally consistent with the presence of uncorrelated errors. Single-qubit experiments performed under engineered noise with tunable correlation characteristics show consistent reduction in the correlated error component when switching from primitive to DCG sequences. We explain this behaviour using a framework that describes the action of DCGs at the operator level [27,33,34] as whitening the effective error spectrum experienced by each gate. Finally, we demonstrate that using DCGs in sequence construction reduces spatial error correlations between qubits, through simultaneous randomized benchmarking on five trapped ion qubits.
arXiv:1909.10727v1 [quant-ph] 24 Sep 2019
These results provide direct and strong evidence that the use of dynamically protected physical qubit operations in a layered architecture for quantum computing [29] can facilitate the successful application of existing QEC theory with only minimal revision on the path to fault-tolerant quantum computation.

II. IDENTIFYING SIGNATURES OF ERROR CORRELATIONS IN CIRCUITS
We begin by laying out the challenge of establishing clear quantitative metrics allowing the identification of error correlations in quantum circuits. As a first step we analyze how correlations in a physical noise process translate to correlations in the resultant unitary errors within a circuit of $j = 1, \ldots, J$ gates. In our model, any noisy operation $\tilde{U}_j$ within the circuit can be decomposed into the ideal operator $\hat{U}_j$ and an error operator $\tilde{\Lambda}_j$, such that $\tilde{U}_j = \tilde{\Lambda}_j \hat{U}_j$. Here, $\hat{U}_j \equiv \hat{U}(\vec{n}_j, \theta_j)$ rotates the state vector by angle $\theta_j$ around an arbitrary axis $\vec{n}_j$ on the Bloch sphere. Considering unitary semiclassical noise processes, the error component in each operation can be written as $\tilde{\Lambda}_j = \exp\{i \sum_{\alpha=1}^{\infty} [\vec{\epsilon}_j]_\alpha \cdot \hat{\boldsymbol{\sigma}}\}$, with $\hat{\boldsymbol{\sigma}}$ the vector of Pauli matrices, $\alpha$ an index denoting the Magnus expansion order [35], and $\vec{\epsilon}_j$ the error vector characterizing the strength and nature (affected quadrature) of the error [34][35][36][37]. A quantum circuit experiences correlated errors if the values of $\vec{\epsilon}_j$ across the circuit (in space or time) exhibit non-zero correlations.
Our approach to measuring error correlations is built on common quantum verification protocols employed to infer the average behavior of gate operations [11,30,38][39][40][41][42][43][44][45][46][47]. Restricting our analysis to the single-qubit case, error correlations between gates may occur in these protocols when physical noise processes exhibit strong correlations in time. We demonstrate this numerically by calculating the error vector $\vec{\epsilon}_j$ for each operation in a single-qubit randomized benchmarking sequence exposed to detuning ($\hat{\sigma}_z$) noise with a variable block-correlation length, $M_n$; this is defined to be the number of gates over which the noise strength is constant within the sequence. The sequence is assembled from the 24 Clifford operations comprising combinations of $\pi$ and $\pi/2$-rotations about the x, y and z-axes of the Bloch sphere, and an identity gate $\hat{I}$. Calculating the autocorrelation function of the error vector's magnitude throughout a sequence reveals strong correlations over a length of gates, $M_\epsilon$, which appears to scale linearly with the correlation length of the input noise process, $M_n$ (Fig. 1c). This behavior suggests a linear mapping from noise correlations to error correlations in conventional settings. As a prelude to future demonstrations in this manuscript, we note that if the individual Clifford gates are replaced by DCGs, this simple linear mapping from input noise correlations to output error correlations breaks down.
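The numerical demonstration described above can be illustrated with a much simpler toy model. The sketch below is not the calculation used for Fig. 1c: it assumes that the error magnitude of each gate is simply proportional to the absolute noise value during that gate (ignoring gate dependence), and the function names `block_correlated_noise` and `autocorrelation` are hypothetical helpers introduced here.

```python
import numpy as np

def block_correlated_noise(num_gates, block_len, rms, rng):
    """Zero-mean Gaussian noise held constant over blocks of `block_len` gates."""
    num_blocks = -(-num_gates // block_len)  # ceiling division
    block_values = rng.normal(0.0, rms, size=num_blocks)
    return np.repeat(block_values, block_len)[:num_gates]

def autocorrelation(x, max_lag):
    """Normalized sample autocorrelation of x for lags 0 .. max_lag - 1."""
    x = x - x.mean()
    norm = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - lag], x[lag:]) / norm
                     for lag in range(max_lag)])

rng = np.random.default_rng(0)
num_gates, max_lag, num_realizations = 1000, 100, 200
acf_by_block = {}
for block_len in (1, 10, 30):
    # Assumption: the error magnitude of each gate is proportional to |noise|.
    acfs = [autocorrelation(
                np.abs(block_correlated_noise(num_gates, block_len, 1.0, rng)),
                max_lag)
            for _ in range(num_realizations)]
    acf_by_block[block_len] = np.mean(acfs, axis=0)

for block_len, acf in acf_by_block.items():
    print(block_len, np.round(acf[[0, 5, 20, 60]], 2))
```

In this toy model the autocorrelation decays over roughly `block_len` gates, which mirrors the linear relation between $M_n$ and $M_\epsilon$ reported for primitive gates; it says nothing about the DCG case.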
In general, the primary limitation one faces in accessing information about M ε in a physical experiment is that using standard, projective measurements at the end of a circuit will limit the ability to probe correlations that arise throughout the circuit's execution. Most experimental quantum verification routines suffer from exactly this limitation, and primarily measure the average difference between a qubit state transformed under an imperfect operation and a predetermined target state at the end of the protocol (Fig. 1a). However, as we will illustrate in the following, there is additional useful information present in the outcomes of randomized benchmarking measurement routines that may be employed to extract novel insights about error correlations appearing during the sequence.
The key underlying concept is that in a randomized benchmarking sequence built up from many operations, the resultant net state transformation in the presence of noise, $\tilde{U}_{\mathrm{eff}}|\psi\rangle$ (Fig. 1b), is determined by an interplay of both the sensitivity of each individual operation to the noise [35] and the impact of the sequence structure on error accumulation [32,46,48]. Specifically, nominally equivalent randomized benchmarking sequences (constructed to perform the same net operation) exhibit variations in correlated-noise susceptibility that are analytically calculable and verifiable in experiments. We use this variability and the behavior under experimental averaging to extract a signature of error correlations within quantum sequences.
FIG. 1. Translation of noise correlations to error correlations in quantum circuits. a, A single operation applied to a qubit in the presence of noise, $\tilde{U}_j$, can be decomposed into an error operator $\tilde{\Lambda}_j$ and the target operation $\hat{U}_j$. Bloch spheres schematically illustrate the effect of an imperfect $\pi$-rotation about the x-axis acting on input state $|1\rangle$, with dark shading indicating an over-rotation error. b, Noise (red line) exhibiting non-zero temporal correlation of length $M_n = 3$, quantized in units of gate operations, acts on a quantum circuit composed of sequentially applied unitary operations. The resultant errors accumulate and lead to a noisy effective operator $\tilde{U}_{\mathrm{eff}}$, whose effect is determined through a projective measurement at the end of the circuit. c, Translation of correlations in a noise process to correlations in the magnitude of the circuit error vector, $\vec{\epsilon}_j$. The error vector for each gate of a randomly composed sequence of 1000 primitive gates under a noise process with noise correlation length $M_n$ is calculated and the autocorrelation function of the magnitude of the error vector, $\mathbb{E}[\|\vec{\epsilon}_{j_1}\|\,\|\vec{\epsilon}_{j_2}\|]$, is shown for the first 100 gates. d, e, Random walks for the extreme error correlation cases, d, $M_\epsilon = 1$ (uncorrelated) and e, $M_\epsilon = J$ (fully correlated). Final walk displacements of eight sequences, each with 1000 error realizations, are shown along with the full walk for a single sequence that is common between the two cases.

A. Random walk formalism for error accumulation
We present a first-principles analysis to directly link measurement outcomes for single-qubit randomized benchmarking sequences to the nature of the underlying error correlations quantified by $M_\epsilon$, expanding the formalism introduced in reference [32]. We consider randomized benchmarking sequences composed of $J$ single-qubit Clifford operations, $\prod_{j=1}^{J} \hat{C}_{\eta_j} = \hat{I}$, with the vector $\boldsymbol{\eta}$ containing labels for the 24 Clifford operations, $\eta_j \in \{1, 2, \ldots, 24\}$. A final gate is pre-calculated to yield a net identity operation for the sequence, such that in the absence of error the final qubit state will be the same as the initial state. Due to imperfections in the operations, the physically implemented gates $\tilde{C}_{\eta_j}$ differ from the ideal gates by an error map, $\tilde{C}_{\eta_j} = \tilde{\Lambda}_j \hat{C}_{\eta_j}$.
The accumulation of errors throughout a sequence can be represented by a sequence-dependent "random walk" in three-dimensional Pauli-error space; the net walk length can then be related to the final sequence error [32]. For a particular realization of the error $i$, this walk is captured by the vector
$$\vec{R}^{(i)}_{3D} = \sum_{j=1}^{J} \epsilon^{(i)}_j \, \vec{r}_{3D,j}, \qquad (1)$$
with gate error values $\epsilon^{(i)}_j \sim \mathcal{N}(0, \sigma^2)$ sampled from a zero-mean Gaussian distribution with rms value $\sigma$. It will be shown in Section II C that this leads to an average randomized benchmarking error per gate $\propto \sigma^2$. Here, the $\vec{r}_{3D,j}$ are unit-length vectors that define the sequence-specific random walk steps; they can be calculated deterministically for any randomized benchmarking sequence, irrespective of the strength or correlation characteristics of the gate errors. In a circumstance where the normalized error takes a consistent value $\epsilon^{(i)}_j \equiv 1$, the length of the $J$-step walk created by these steps is an intrinsic property of the sequence and will be shown to act as a proxy for its susceptibility to correlated errors. Examining individual randomized benchmarking sequences reveals the idiosyncratic nature of their walks; certain randomizations exhibit long walks, while others have walks that terminate near the origin, solely determined by the structure of the sequence and the form of the error channel. Accordingly, in the presence of correlated errors we expect a wide variance of outcomes, determined by the underlying structures of the randomly selected sequences. The general framework linking this Pauli walk to accumulated error was experimentally validated in [46].
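The walk construction above can be sketched numerically. In the sketch below the unit steps are not derived from an actual Clifford sequence; as a stand-in they are drawn at random from the six signed Pauli axes, which is an assumption made purely for illustration. The helper names `sample_walk_steps` and `net_walk` are hypothetical.

```python
import numpy as np

# Assumption: every unit step lies along one of the six signed Pauli axes.
SIGNED_AXES = np.vstack([np.eye(3), -np.eye(3)])

def sample_walk_steps(num_gates, rng):
    """Draw illustrative unit-length steps r_j (shape: num_gates x 3)."""
    return SIGNED_AXES[rng.integers(0, 6, size=num_gates)]

def net_walk(steps, errors):
    """Net displacement R = sum_j eps_j * r_j in Pauli-error space."""
    return errors @ steps

rng = np.random.default_rng(1)
J, sigma = 100, 0.01
steps = sample_walk_steps(J, rng)

# Intrinsic walk length of the sequence: all normalized errors set to 1.
intrinsic_length = np.linalg.norm(net_walk(steps, np.ones(J)))

# One realization of uncorrelated Gaussian gate errors.
R = net_walk(steps, rng.normal(0.0, sigma, size=J))
print(intrinsic_length, np.linalg.norm(R))
```

The quantity `intrinsic_length` plays the role of the sequence-specific walk length discussed above: it depends only on the steps, not on any error realization.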

B. Signatures of error correlations
We identify that the key measurable signature of error correlations arises in the process of experimental averaging over repetitions of a sequence, and hence over different realizations of the error. In order to understand this, we begin by examining how error correlations impact the random walk introduced above, and how the behavior of that walk changes with experimental averaging.
Gate errors induce the mapping $\vec{r}_{3D,j} \to \vec{R}$. To see the effect of correlations in the error process, we calculate the locus of walk termination points for eight different sequences and 1000 error realizations, shown in Fig. 1d,e. In the presence of errors whose magnitudes are constant across all gates in a given benchmarking sequence, the error $\epsilon^{(i)}_j \equiv \epsilon^{(i)}$ rescales all steps in the walk uniformly, such that all termination points for a given sequence fall on a line (Fig. 1e). The walk terminations for the same sequence are thus dominated by the underlying sequence structure ("rays" in Fig. 1e). By contrast, in the presence of uncorrelated errors where $\epsilon^{(i)}_j$ changes randomly for each step, the termination points appear randomly distributed in Pauli space for different realizations of the error (Fig. 1d).
These differences will manifest in an experiment that averages the experimental performance of a set of sequences over many different realizations of an error-inducing noise process. In the case of correlated errors, the preservation of sequence-structure dependence in the sequence error leads to a broad distribution of outcomes over different randomized benchmarking sequences. This breadth is maintained even when averaging experiments together over various realizations of the random but temporally correlated errors. In contrast, for uncorrelated errors, the random, formless distribution of walk termination points over the same set of sequences implies that averaging over experiments would result in a spread of outcomes that grows narrower as the experiment number increases, consistent with the central limit theorem. It is therefore in the distribution over measured results of noise-averaged, randomized benchmarking sequences that the signatures of error correlations between gates within a sequence will appear. In Sections II C and II D we will describe how this phenomenology can be accessed through a modified analysis of conventional randomized benchmarking experiments.
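The contrast described above can be reproduced in a toy simulation. The sketch below makes two assumptions that are not taken from the text: walk steps are drawn at random from the six signed Pauli axes, and the squared three-dimensional walk length is used as a proxy for the sequence error. The function name `variance_vs_averaging` is hypothetical, and the default values of `k`, `J`, and `n` are chosen only for illustration.

```python
import numpy as np

SIGNED_AXES = np.vstack([np.eye(3), -np.eye(3)])  # assumed step directions

def variance_vs_averaging(correlated, k=50, J=100, n=200, sigma=0.02, seed=0):
    """Variance over k sequences of the squared walk length, after averaging
    over the first 1..n error realizations (returns an array of length n)."""
    rng = np.random.default_rng(seed)
    steps = SIGNED_AXES[rng.integers(0, 6, size=(k, J))]        # (k, J, 3)
    if correlated:
        # One error value per realization, shared by all J gates.
        eps = np.repeat(rng.normal(0.0, sigma, size=(k, n, 1)), J, axis=2)
    else:
        # Independent error value for every gate.
        eps = rng.normal(0.0, sigma, size=(k, n, J))
    walks = np.einsum("knj,kjd->knd", eps, steps)                # (k, n, 3)
    sq_len = np.sum(walks**2, axis=-1)                           # (k, n)
    running_mean = np.cumsum(sq_len, axis=1) / np.arange(1, n + 1)
    return running_mean.var(axis=0, ddof=1)

v_corr = variance_vs_averaging(correlated=True)
v_unc = variance_vs_averaging(correlated=False)
print("correlated:   V(n=100)/V(n=200) =", v_corr[99] / v_corr[199])
print("uncorrelated: V(n=1)/V(n=200)   =", v_unc[0] / v_unc[199])
```

Because both calls use the same seed, the same set of steps is used in the two cases, so any difference in the scaling with `n` comes from the error correlations alone.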

C. Mapping to measurable quantities
We now link the random-walk framework to measurements commonly performed in the laboratory: a single projective measurement in the qubit basis. Such measurements are unaffected by rotations about the z-axis, i.e., they are phase invariant. Consequently, this type of projective measurement is insensitive to the component of the random walk oriented along the $\hat{\sigma}_z$-axis, and instead probes a two-dimensional projection of the walk onto the $\hat{\sigma}_x\hat{\sigma}_y$-plane of Pauli-error space [46]. Considering a measurement routine involving averaging a single sequence over $n$ realizations of the error, we may relate the two-dimensional walk length to the projective measurement results through Eq. (2), where $\langle \cdot \rangle_n$ is an average over $n$ instances of the error process, $P := 1 - \langle P(|1\rangle) \rangle_n$ is the measurable, noise-averaged sequence "survival probability" when the qubit is initialized in the state $|0\rangle$, $\sigma$ is the rms of the normally distributed errors, and $\vec{R}_{2D}$ denotes the random walk in the $\hat{\sigma}_x\hat{\sigma}_y$-plane of Pauli-error space. For simplicity, we will proceed by referring to $\vec{R}_{2D}$, and its individual steps $\vec{r}_{2D,j}$, simply as $\vec{R}$ and $\vec{r}_j$ respectively. We analyze in detail three distinct error correlation regimes for a unitary error channel with values $\epsilon^{(i)}_j \sim \mathcal{N}(0, \sigma^2)$: (i) $M_\epsilon = J$, identically correlated errors with fixed, constant magnitude over a sequence and rms value $\sigma_C$; (ii) $M_\epsilon = 1$, uncorrelated, normally distributed errors that change randomly between each gate in a sequence with rms value $\sigma_U$; and (iii) statistically independent, contemporaneous correlated and uncorrelated error processes such that the relative strengths $\sigma_C$ and $\sigma_U$ determine the effective error correlation length.
The expression for survival probability in Eq. (2) can be used to calculate the distribution of survival probabilities without modification for both regime (i) and (ii) simply by using the appropriately calculated random walks. In the limit of long sequences and many noise averages (large $J$ and $n$), the noise-averaged survival probability is Gamma distributed over different, nominally equivalent, sequence randomizations [46]; the shape and scale parameters of the distribution, $a$ and $b$ respectively, can be calculated from first principles using the particulars of the sequence, noise averaging, and error characteristics. For these two limiting cases of identically correlated errors over a sequence and uncorrelated errors changing randomly between gates, the respective survival probabilities are sampled from Gamma distributions shaped according to Eq. (3). From these expressions, the variance and expectation values of the distribution over sequence randomizations can be calculated. To leading order, both distributions exhibit the same mean value $\mathbb{E} = ab$, giving a randomized benchmarking average gate error of $\frac{2}{3}\sigma^2$. However, the distributions diverge in the second moment $\mathbb{V} = ab^2$.
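Two of the statements above lend themselves to a quick numerical sanity check. The sketch below (a) samples a Gamma distribution with arbitrary illustrative shape and scale values (not the values of Eq. (3)) and compares its sample moments with $ab$ and $ab^2$, and (b) estimates the mean squared length of the projected two-dimensional walk per gate, assuming steps drawn uniformly from the six signed Pauli axes. Under that assumption, and taking the sequence infidelity to be the mean squared projected walk length, the result is consistent with the factor $\frac{2}{3}$ quoted above.

```python
import numpy as np

rng = np.random.default_rng(2)

# (a) Moments of a Gamma distribution with shape a and scale b.
a, b = 2.0, 0.01                      # illustrative values, not from Eq. (3)
samples = rng.gamma(shape=a, scale=b, size=200_000)
gamma_mean, gamma_var = samples.mean(), samples.var()
print(gamma_mean, a * b)              # sample mean vs. a*b
print(gamma_var, a * b**2)            # sample variance vs. a*b^2

# (b) Mean squared xy-projected walk length per gate, in units of sigma^2.
# Assumption: steps are uniformly distributed over the six signed Pauli axes.
signed_axes = np.vstack([np.eye(3), -np.eye(3)])
num_walks, J, sigma = 5000, 100, 0.02
steps = signed_axes[rng.integers(0, 6, size=(num_walks, J))]   # (num_walks, J, 3)
eps = rng.normal(0.0, sigma, size=(num_walks, J))
walks_2d = np.einsum("wj,wjd->wd", eps, steps)[:, :2]          # drop z component
per_gate_ratio = np.mean(np.sum(walks_2d**2, axis=1)) / (J * sigma**2)
print(per_gate_ratio)                 # compare with 2/3
```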
We may now derive the properties of the distribution associated with regime (iii) by considering two independent walks: one induced by the correlated error component, $\vec{R}_C$, and one induced by the uncorrelated error component, $\vec{R}_U$. To begin, it is convenient to note that in the case of a correlated, fixed error process over a sequence, it is possible to factor out the constant error strength from the random walk for a particular realization of the error [32], $\vec{R}^{(i)}_C = \sum_{j=1}^{J} \epsilon^{(i)}_C \vec{r}_j = \epsilon^{(i)}_C \sum_{j=1}^{J} \vec{r}_j \equiv \epsilon^{(i)}_C \vec{V}$. We thus introduce $\vec{V}$ to describe the sequence-specific walk, defined by the steps $\vec{r}_j$ that remain invariant under different realizations of the error process (Fig. 1e). This separability is not achievable in the presence of uncorrelated errors due to the randomization of each step in the walk by the error process. The expression for survival probability can then be expanded in terms of these independent walks to second order in $\sigma_C$, $\sigma_U$; in this expansion the cross-term is identically zero using $\langle \epsilon^{(i)}_C \rangle_n = 0$. For all three correlation regimes, higher-order terms and cross-terms contribute to the second moment of the distribution and have been calculated analytically (Table I). These terms reduce to those calculated using the Gamma distributions in Eq. (3) in the limit of large $J$ and $n$, with J n. On inspection, we expect that in the presence of uncorrelated errors the variance will narrow with increasing $n$, while it will remain fixed in the presence of correlated errors. Such differences in scaling of a variance measure with averaging are reminiscent of the manifestation of noise correlations in other physical quantities, e.g., the Allan variance used in precision frequency metrology [49,50]. Our analysis therefore highlights that calculating the variance of measurements of randomized benchmarking survival probabilities for different sequences, and exploring how this variance changes with experimental averaging, can give insights into the underlying error correlations.
The functional dependence of the distribution variance with n will be employed throughout the remainder of this work as a key signature of error correlations in standard randomized benchmarking.
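As a worked step for regime (iii), the squared length of the combined walk can be written out explicitly. This is a sketch under assumptions made here for illustration: the correlated and uncorrelated components share the same steps $\vec{r}_j$, and the uncorrelated errors, denoted $\epsilon^{(i)}_{U,j}$ (a symbol introduced here), are independent, zero-mean, and have variance $\sigma_U^2$.

```latex
\begin{align}
\vec{R}^{(i)} &= \epsilon^{(i)}_C \vec{V} + \vec{R}^{(i)}_U,
\qquad \vec{V} = \sum_{j=1}^{J} \vec{r}_j,
\qquad \vec{R}^{(i)}_U = \sum_{j=1}^{J} \epsilon^{(i)}_{U,j}\, \vec{r}_j, \\
\|\vec{R}^{(i)}\|^2 &= \big(\epsilon^{(i)}_C\big)^2 \|\vec{V}\|^2
  + \|\vec{R}^{(i)}_U\|^2
  + 2\, \epsilon^{(i)}_C\, \vec{V} \cdot \vec{R}^{(i)}_U, \\
\mathbb{E}\big[\|\vec{R}^{(i)}\|^2\big] &= \sigma_C^2 \|\vec{V}\|^2
  + \sigma_U^2 \sum_{j=1}^{J} \|\vec{r}_j\|^2 .
\end{align}
```

The cross-term drops out of the expectation because $\epsilon^{(i)}_C$ is zero-mean and independent of $\vec{R}^{(i)}_U$. The first term retains the sequence dependence through $\|\vec{V}\|^2$, whereas for unit-length steps the second term is simply $J\sigma_U^2$ for every sequence.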
In the next section we demonstrate how the model can be updated to connect to realistic laboratory noise models.

D. Modelling realistic laboratory error models
Building on the general framework introduced above, we introduce new first-principles calculations connecting the theoretical model for gate error with actual, error-inducing noise in experiments. We determine the sequence walk in the presence of arbitrary, unitary error maps, incorporating the possibility of multi-axis and gate-dependent errors. This facilitates the analysis of experimental measurements performed subject to the most common noise sources encountered in the laboratory.
We consider two physically motivated noise processes that can occur throughout a randomized benchmarking sequence. First, frequency detuning noise, either on the qubit's resonant frequency or the frequency of the control field used to drive qubit gate operations, creates an off-resonance error between the qubit and control. Second, amplitude noise, which may arise from coupling-strength variations or drifts and miscalibrations in the control, results in an over- or under-rotation error of the qubit state vector. Both of these represent "concurrent" noise sources (i.e., applied simultaneously with the execution of a gate), which ultimately produce complex gate-dependent errors.
In general, depending on their underlying cause, both frequency detuning and amplitude noise processes may possess temporally correlated and uncorrelated components. Correlated noise sources include miscalibrations, magnetic field drifts, and temperature drifts in control systems, while uncorrelated noise often stems from electrical noise or local environmental sources, e.g., anomalous heating in ion traps [51] or two-level system (TLS) fluctuators in superconducting qubits [52,53].
To examine the impact of these physical noise processes on the behavior of the sequence survival-probability distributions, we proceed by explicitly calculating the translation between the physical noise strength, $\delta^{(i)}_j \sim \mathcal{N}(0, \rho^2)$, and the effective sequence errors at the core of our model, $\epsilon = \epsilon(\delta)$. In our notation, $\rho$ is used to denote the rms magnitude of the noise, distinguishing it from the rms magnitude of the error operator, $\sigma$. Our calculations incorporate the fact that single-axis noise (e.g., detuning) present during a non-commuting operation generally results in a multi-axis error process. Furthermore, physical implementations of Clifford operations typically employ variable gate durations, resulting in gate-dependent error operators.
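The statement that single-axis noise produces a multi-axis error can be checked directly for one gate. The sketch below is not the Appendix B formalism: it assumes a resonant drive about the x-axis with Hamiltonian proportional to $\hat{\sigma}_x + \delta\,\hat{\sigma}_z$, a fixed detuning $\delta = \Delta/\Omega$ during the gate, and a particular sign convention for rotations. The helper names `rotation` and `error_components` are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
PAULIS = [np.array([[0, 1], [1, 0]], dtype=complex),      # sigma_x
          np.array([[0, -1j], [1j, 0]], dtype=complex),   # sigma_y
          np.array([[1, 0], [0, -1]], dtype=complex)]     # sigma_z

def rotation(axis, angle):
    """exp(-i * angle/2 * n.sigma) for the unit vector n = axis/|axis|."""
    n = np.asarray(axis, dtype=float)
    n = n / np.linalg.norm(n)
    n_dot_sigma = sum(c * p for c, p in zip(n, PAULIS))
    return np.cos(angle / 2) * I2 - 1j * np.sin(angle / 2) * n_dot_sigma

def error_components(theta, delta):
    """Pauli components of Lambda = U_noisy @ U_ideal^dagger for an x-rotation
    by theta driven with fractional detuning delta = Delta/Omega."""
    ideal = rotation([1, 0, 0], theta)
    # Detuned drive: the rotation axis tilts towards z and the rate increases.
    noisy = rotation([1, 0, delta], theta * np.sqrt(1 + delta**2))
    lam = noisy @ ideal.conj().T
    # Lambda = cos|e| I + i sin|e| (e_hat . sigma), so Tr(sigma_k Lambda)/(2i)
    # returns the k-th component of sin|e| * e_hat.
    return np.array([(np.trace(p @ lam) / 2j).real for p in PAULIS])

cx, cy, cz = error_components(theta=np.pi / 2, delta=0.01)
print(cx, cy, cz)
```

For a $\pi/2$ rotation the y and z components are both first order in $\delta$, while the x component appears only at second order, which illustrates why a pure detuning offset cannot be treated as a single-axis error during a driven gate.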
In this setting, the error $\epsilon^{(i)}_j$ employed in Eq. (1) is replaced by the physical noise strength $\delta^{(i)}_j$. As a result, the previously unit-length steps $\vec{r}_{3D,j}$ now take more complex, but still analytically calculable, values due to the gate-dependence and multi-axis character of the errors induced by concurrent noise processes. For a particular noise process we calculate the associated random walk, which enables a mapping of the rms magnitude of the physical noise $\rho$ to an updated rms value of the error $\sigma$. Appendix B describes the formalism to calculate the noise-to-error translation in standard Clifford gates for an arbitrary, unitary error process. Table II summarizes the results which, when combined with the expressions from Table I, can be used to predict both the expectation and the variance of the distribution of survival probabilities over sequence randomization.
TABLE II. The translation from the rms value of a physical noise process, $\rho$, with correlation length $M_n$, to the rms value of the gate error, $\sigma$, used to calculate the first and second moments of noise-averaged sequence survival probabilities. The values $\rho_C$, $\rho_U$ represent the rms magnitudes of the correlated and uncorrelated noise processes respectively. Similarly, the terms $\vec{r}_{U,j}$, $\vec{r}_{C,j}$ represent the random walk steps for the different noise processes. Full details of the derivation of the relevant random walk step expectation values, $\mathbb{E}[\|\vec{r}_j\|^2]$, $\mathbb{E}[\|\vec{r}_j\|^4]$, and $\mathrm{Cov}[\|\vec{r}_{U,j}\|^2, \|\vec{r}_{C,j}\|^2]$, for the specific noise models employed in our verification experiments are presented in Appendix B 1.

III. EXPERIMENTAL IMPLEMENTATION
A. Randomized benchmarking on $^{171}$Yb$^{+}$ qubits
We perform experiments using a qubit encoded in the $^2S_{1/2}$ hyperfine ground states of a single laser-cooled $^{171}$Yb$^{+}$ ion confined in a linear Paul trap, with the computational basis states defined as $|0\rangle := |F = 0, m_F = 0\rangle$ and $|1\rangle := |F = 1, m_F = 0\rangle$. Laser cooling, state initialization to $|0\rangle$, and detection are performed using a laser at 369 nm that couples the $^2S_{1/2}\,|F = 1\rangle$ ground state to the first excited state $^2P_{1/2}\,|F = 0\rangle$. As the ion selectively fluoresces when it is projected to the upper, "bright" qubit state $|1\rangle$, one can distinguish between the two basis states by counting the number of emitted photons during the detection period. Single-ion qubit state detection is performed in a time-resolved manner [46,54] using an avalanche photodiode; multi-ion data employs an EMCCD camera and processing through a Random Forest classifier from the scikit-learn framework [55].
Qubit rotations are driven via a microwave field near 12.6 GHz generated by a Vector Signal Generator (VSG). Using an internal baseband generator, we program arbitrary rotations of the qubit via IQ modulation. Rotations about the z-axis are implemented as instantaneous, pre-calculated IQ frame shifts. Randomized benchmarking sequences composed from Clifford operations are preloaded into the VSG and mapped to the desired physical operations prior to the recording of each data set. The experiments in this manuscript are performed using $k$ sequences each comprising $J$ operations. The first $J - 1$ gates are randomly composed Clifford operations, $\hat{C}_{\eta_j}$, and the final operation, $\hat{C}_{\eta_J} = \big(\prod_{j=1}^{J-1} \hat{C}_{\eta_j}\big)^\dagger$, is selected such that the sequence implements the identity in the absence of error. A full list of the Clifford operations and their physical implementations can be found in the Supplementary Materials of reference [32]. Typical single-qubit randomized benchmarking experiments with primitive gates achieve a baseline result of $p_{RB} \approx 1.9 \times 10^{-5}$ in our system (Appendix A).
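The sequence construction described above can be sketched at the level of ideal unitaries. The sketch below does not use the physical gate implementations of reference [32]; instead it generates the 24 single-qubit Clifford operations (modulo global phase) from the Hadamard and phase gates, which is a choice made here for illustration. The labels are zero-based indices into the generated list, and the helper names `canonical`, `key`, `clifford_group`, and `rb_sequence` are hypothetical.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)   # Hadamard
S = np.array([[1, 0], [0, 1j]], dtype=complex)                # phase gate

def canonical(u):
    """Fix the global phase so the first non-zero entry is real and positive."""
    flat = u.ravel()
    pivot = flat[np.argmax(np.abs(flat) > 1e-9)]
    return u * (np.abs(pivot) / pivot)

def key(u):
    """Hashable label for a unitary, identical for matrices equal up to phase."""
    return tuple(np.round(canonical(u), 6).ravel())

def clifford_group():
    """Generate the single-qubit Cliffords (modulo phase) from H and S."""
    identity = np.eye(2, dtype=complex)
    found = {key(identity): identity}
    frontier = [identity]
    while frontier:
        next_frontier = []
        for u in frontier:
            for g in (H, S):
                v = canonical(g @ u)
                if key(v) not in found:
                    found[key(v)] = v
                    next_frontier.append(v)
        frontier = next_frontier
    return list(found.values())

def rb_sequence(J, cliffords, rng):
    """J - 1 random Cliffords followed by the gate that inverts their product."""
    gates = [cliffords[i] for i in rng.integers(0, len(cliffords), size=J - 1)]
    net = np.eye(2, dtype=complex)
    for g in gates:
        net = g @ net                 # later gates multiply from the left
    gates.append(net.conj().T)        # final gate returns the net identity
    return gates

cliffords = clifford_group()
sequence = rb_sequence(100, cliffords, np.random.default_rng(3))
print(len(cliffords), len(sequence))
```

Because the Clifford operations form a group, the inverting gate appended at the end is itself one of the generated operations (up to a global phase).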

B. Verifying error correlation signatures with engineered errors
The key signature of the presence of temporally correlated errors appears in the variance of the distribution over sequence survival probabilities and its scaling with experimental averaging; averaging reduces the variance in the case of uncorrelated errors, but has limited impact when errors exhibit strong temporal correlations.
We begin our experimental study by engineering experimental noise sources to test and verify the predictions of the theoretical model presented in Sec. II. We perform standard randomized benchmarking, but engineer detuning and control-amplitude noise with different user-defined bandwidths. All noise values are generated numerically, and are sampled from a zero-mean Gaussian distribution $\mathcal{N}(0, \rho^2)$ with rms strength $\rho$. Off-resonance errors are induced via fractional detuning noise present during the application of the randomized benchmarking sequence, $\delta = \Delta/\Omega$, set by the frequency detuning $\Delta$ between the qubit transition and the microwave source in units of the Rabi frequency, $\Omega$. Over-rotation errors are produced by amplitude noise in the microwave control field, effectively changing $\Omega$. Two limiting noise bandwidths are treated: maximally correlated noise, $M_n = J$, and uncorrelated noise, $M_n \leq 1$. For the detuning (control-amplitude) noise process, the correlated noise component is engineered using a constant offset in the VSG microwave frequency (amplitude) over the entire sequence, and the uncorrelated noise is applied via an external FM (AM) modulation input, and changes value every primitive $\pi/2$-time. The relevant random walk steps calculated for these noise processes and used in modelling our experimental measurements are found in Table III of Appendix B. Rather than simply calculating the randomized benchmarking decay rate $p_{RB}$, derived from fitting to the mean of the distribution over different values of $J$, we focus on analyzing our data to extract information that is otherwise generally discarded in averaging processes. In each individual measurement, the qubit is initialized in state $|0\rangle$ via optical pumping and one of $k = 50$ randomized benchmarking sequences with $J = 100$ gates is applied in the presence of engineered noise.
A final projective measurement in each experiment yields a discretized qubit state measurement, which is used to infer the probability of finding the qubit in state $|1\rangle$ by repeating the experiment $r = 220$ times under application of the same engineered noise realization (reducing quantum projection noise). The survival-probability measurement outcomes for each sequence are then averaged over a variable number, up to $n = 200$, of different realizations of noise possessing the same engineered correlations. This process is repeated for all $k = 50$ sequences, allowing us to calculate the distribution variance $\mathbb{V}$ [...] Table II. These theoretical predictions, which involve no free parameters, show good agreement with the data in the regimes studied.
These data clearly illustrate the differences in the distributions over the same set of randomized benchmarking sequences when subjected to noise with differing correlation properties. As shown in Ref. [32] and highlighted here in Table I, the distributions possess approximately the same mean value, despite the differing noise-correlation properties. The skew to high fidelities in the data taken using correlated noise is a manifestation of the randomized decoupling effects known to exist within some randomized benchmarking sequences [32]. More importantly, the behavior of the variance of the distributions under an increasing number of noise averages $n$ varies substantially. For small $n$ the distributions are similarly broad despite the differences in their shapes, but with further averaging the distribution measured under uncorrelated noise narrows while the variance of the distribution measured under correlated noise remains approximately constant (as discussed in Sec. II C).
To highlight the effect of noise correlations on the experimental averaging behavior, we plot the variance of the distribution over measured sequence survival probabilities, $V$, as a function of the number of noise averages $n$ (Fig. 2d). Potential unintended systematic bias in the scaling of the experimental data with $n$ is mitigated by random re-ordering of the measured outcomes prior to cumulative averaging, producing a collection of individual averaging trajectories. For correlated noise, $M_n = J$, the resulting trajectories are initially broadly distributed and fluctuate before converging with $n$ to a fixed, analytically calculable variance. By contrast, in the case of uncorrelated noise with $M_n \leq 1$, all trajectories show an approximate reduction in $V_k^{(n)} \propto 1/n$, commensurate with a continued narrowing of the distribution of outcomes over different sequences under averaging (Fig. 2a-c).
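The re-ordering and cumulative-averaging procedure described above can be sketched as follows. This is a minimal illustration rather than the analysis code used for Fig. 2d: the function name `variance_trajectories`, the array layout (one row per sequence, one column per noise realization), and the choice to apply the same shuffled order to every sequence are all assumptions made for the example.

```python
import numpy as np

def variance_trajectories(p, n_traj=100, seed=0):
    """Variance over sequences versus number of noise averages.

    p       : array of shape (k, n); p[s, i] is the measured survival
              probability of sequence s under noise realization i
              (hypothetical layout, assumed for this sketch).
    n_traj  : number of random re-orderings (averaging trajectories).
    Returns an array of shape (n_traj, n) whose column m-1 holds the
    variance across the k sequences after averaging m realizations.
    """
    rng = np.random.default_rng(seed)
    k, n = p.shape
    counts = np.arange(1, n + 1)
    out = np.empty((n_traj, n))
    for t in range(n_traj):
        # Shuffle the order of noise realizations before averaging
        order = rng.permutation(n)
        # Cumulative mean over the first m shuffled realizations
        running_mean = np.cumsum(p[:, order], axis=1) / counts
        # Sample variance across sequences at each averaging depth
        out[t] = running_mean.var(axis=0, ddof=1)
    return out
```

Every trajectory ends at the same value, because the full average over all realizations does not depend on their order; only the path taken at intermediate $n$ differs between re-orderings.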
Solid lines capturing key scaling behaviors observed in both data sets of Fig. 2d are derived from the expression for variance in Table I using the noise-to-error translations presented in Tables II and III, calculated for concurrent detuning noise with no free parameters. Overall, agreement with the measured experimental data is good across a wide parameter range and two orders of magnitude in $V_k^{(n)}$. For correlated noise, small deviations between the theoretical trace and measured mean scaling appear for low values of $n$. Numerical evidence attributes this to the limited sample size in terms of sequences, which does not always capture the rare, highly error-susceptible sequences that would lead to a larger variance. In the case of uncorrelated noise, there is an overall vertical shift between the theory and the data, which is fully compensated by adjusting the rms noise strength $\rho_U$ by $\sim 6\%$. Numerical simulations and analytic considerations attribute the need for this adjustment to the strong noise employed in these experiments, which violates the theoretical assumption $J\rho_U^2 \ll 1$, such that higher-order terms in the theory cannot be fully ignored.
The uncorrelated noise data begin to deviate from an exact $1/n$-scaling of $V_k^{(n)}$ at large numbers of noise averages. This behavior is captured by our theoretical model and varies in a predictable way with the applied noise bandwidth and sequence length $J$ (Appendix B 1); we have verified it is not due to fundamental measurement limits in our system or quantum projection noise, as discussed in Appendix D. We are able to attribute this "saturation" in variance scaling for uncorrelated noise to residual sequence dependence, even in the case of purely uncorrelated noise, and the fact that our projective measurement probes only a two-dimensional $\hat\sigma_x\hat\sigma_y$-plane in Pauli-error space. For example, one can imagine a sequence composed solely of $\hat{I}$ gates, which, due to an induced off-resonance error, will experience a net phase rotation that cannot be measured by single-axis projective measurements. Hence, no amount of averaging over different noise strength realizations will produce a survival probability that converges to the distribution mean, even in the case of uncorrelated noise.
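The contrast between the two noise cases, including the residual sequence dependence that limits the uncorrelated case, can be reproduced qualitatively with a toy random walk. The sketch below is a simplified stand-in for the model in this work, under loudly stated assumptions: each gate contributes a unit step along a randomly chosen principal axis of Pauli-error space, only the $xy$-projection of the walk is "measured", and the squared walk length is used as a proxy for the sequence error. The function name and all parameter values are hypothetical.

```python
import numpy as np

def variance_vs_averages(correlated, J=100, k=50, n=200, rho=0.05, seed=0):
    """Toy model: variance over k sequences versus number of noise averages.

    Assumptions (not taken from the experiment): unit steps along random
    principal axes, and squared xy-plane walk length as the error proxy.
    """
    rng = np.random.default_rng(seed)
    axes = np.vstack([np.eye(3), -np.eye(3)])        # six principal axes
    steps = axes[rng.integers(0, 6, size=(k, J))]    # shape (k, J, 3)
    steps_xy = steps[..., :2]                        # measurable projection
    if correlated:
        # One noise value per realization, constant over the whole sequence
        delta = np.repeat(rng.normal(0.0, rho, size=(k, n, 1)), J, axis=2)
    else:
        # Independent noise value for every gate
        delta = rng.normal(0.0, rho, size=(k, n, J))
    # Walk R = sum_j delta_j * r_j for each sequence and realization
    walk = np.einsum('knj,kjd->knd', delta, steps_xy)
    err = (walk ** 2).sum(axis=2)                    # shape (k, n)
    running_mean = np.cumsum(err, axis=1) / np.arange(1, n + 1)
    return running_mean.var(axis=0, ddof=1)

v_corr = variance_vs_averages(correlated=True)
v_unc = variance_vs_averages(correlated=False)
```

In this toy model the correlated case stays at a large variance because each sequence retains its own net walk length, whereas the uncorrelated case falls with averaging until only the sequence-to-sequence spread in the number of measurable steps remains.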
Overall we find that our theoretical models predict not only the full distribution of survival probabilities over randomized benchmarking sequences, but also the scaling of this distribution's variance with experimental averaging. The difference between the gray and red data in Fig. 2d, and the agreement of theory, thus constitute key experimental validations of the central theoretical contributions made in this manuscript.

IV. SUPPRESSING ERROR CORRELATIONS USING DYNAMICALLY CORRECTED GATES
In the next part of our study we explore the ability to modify error correlations within a sequence through deterministic replacement of each Clifford operation in a randomized benchmarking sequence with an error-suppressing dynamically corrected gate (DCG). Each DCG is implemented by replacing primitive physical rotations with composite pulses comprising multiple physical rotations [33], according to one of several prescriptions [25]. This approach abstracts the target state transformations away from the physical qubit manipulation in a manner that builds in error robustness via coherent averaging. In this way, these composite gates modify the error susceptibility of the target operations, and in particular change the relationship between an input correlated-noise process and output gate errors. We therefore refer to their action as "virtualizing" the Clifford operations, consistent with an abstraction above the physical-layer operations presented in [29].
The error-virtualization process is described quantitatively by calculating the error vector $\epsilon_j$ at the operator level and expressing it in the Fourier domain. In the limit of classical Gaussian dephasing noise, described in the Fourier domain as the spectrum $\beta_z(\omega)$, the leading-order Magnus term ($\alpha = 1$) in the $\hat\sigma_z$-quadrature may be written as Here, $G_z^{(1)}(\omega, T_j)$ is an analytically calculable filter-transfer function that describes the spectral characteristics of a gate active for duration $T_j$ [34]. The effective error spectrum experienced by the gate may therefore be represented by the spectral overlap of the filter-transfer function with the noise, written as $G_z^{(1)}(\omega, T_j) \times \beta_z(\omega) \to E(\omega, T_j)$. Fig. 3a demonstrates the mapping between input noise and the effective error spectrum schematically for an example $1/\omega$-noise spectrum and a primitive $\pi$-rotation about the $x$-axis. In this example, correlations in the noise are directly transferred to the correlations in the effective error spectrum [37] (cf. the direct $M_n$ to $M_\epsilon$ translation for primitive gates in Fig. 1c).
Replacement of the primitive gate with a logically equivalent DCG virtualizes the effective error spectrum for each operator through the process of noise filtering [27,33,34,37]. Fig. 3b illustrates this effect, where the DCG's reduced susceptibility to low frequency noise (captured through its filter-transfer function) results in a whitening of the effective error spectrum relative to β z (ω). In the current context, this whitening suggests that DCGs should not only reduce overall error magnitudes when the noise is dominated by low frequency contributions, but they should also suppress the signatures of error correlations between sequentially applied gates.
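As a worked illustration of the whitening argument, consider the example noise spectrum $\beta_z(\omega) \propto 1/\omega$ together with idealized low-frequency forms of the filter-transfer function. The linear-in-$\omega$ behavior of a first-order DCG filter is as stated for the gates in this work; the approximately flat low-frequency response of the primitive gate and the prefactors $A_p$, $A_D$, and $B$ are assumptions introduced only for this sketch.

```latex
\begin{align}
  \beta_z(\omega) &\approx \frac{B}{\omega}, \\
  E_{\mathrm{prim}}(\omega, T_j) &\approx A_p \cdot \frac{B}{\omega}
      \;\propto\; \omega^{-1}, \\
  E_{\mathrm{DCG}}(\omega, T_j) &\approx A_D\,\omega \cdot \frac{B}{\omega}
      = A_D B \;\propto\; \omega^{0}.
\end{align}
```

Under these assumptions the primitive gate inherits the low-frequency weight of the noise, while the DCG yields an effective error spectrum that is flat at low frequency, which is the sense in which the spectrum is whitened.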
The particular DCG constructions examined in this work are the "Compensation for Off-Resonance with a Pulse SEquence" (CORPSE) [56] and "Walsh Amplitude Modulated Filter" (WAMF) [57] gates, which suppress detuning errors, and the BB1 pulse family [58], which suppresses over-rotation errors. Specific details of DCG construction for the various operations employed here are presented in Appendix C.

FIG. 3. a, The filter-transfer function $G_z^{(1)}(\omega, T_j)$ and the noise spectrum (here $\beta_z(\omega) \propto 1/\omega$) combine to produce an effective error spectrum $E(\omega, T_j)$ for a single gate. b, The modified filter functions for first-order DCGs scale as $\omega$ at low frequencies, which results in a "whitening" of $E(\omega, T_j)$ relative to the input noise spectrum. c, d, Variance scaling with $n$ for primitive (gray) gates, and WAMF (orange), CORPSE (blue), and BB1 (green) DCGs, all subjected to noise with both correlated and uncorrelated components. For c, detuning noise is engineered with strength $\delta_C \sim \mathcal{N}(0, 2 \times 10^{-3})$, $\delta_U \sim \mathcal{N}(0, 5 \times 10^{-4})$, and for d, amplitude noise is engineered with strength $\delta_C \sim \mathcal{N}(0, 9 \times 10^{-4})$, $\delta_U \sim \mathcal{N}(0, 2 \times 10^{-4})$. Dotted lines are means of 1000 trajectories randomized over noise realizations, and solid lines for the DCGs are theoretical fits from Table I to the mean with the values of $\sigma_U^2$ and $\sigma_C^2$ allowed to vary. Black solid lines for primitive gates are derived from the same theory with no free parameters. As with Fig. 2, all data are measured for $k = 50$ sequences of length $J = 100$ with $n = 200$ noise realizations and $r = 220$ repetitions.

A. Modification of variance scaling with engineered errors using DCGs
We begin by performing a detailed, quantitative study of the measured signatures of error correlations through the application of engineered noise. We experimentally implement primitive, CORPSE, WAMF and BB1 gates, where the first two DCGs are designed to suppress errors arising from frequency detuning noise and the latter is designed to suppress errors arising from amplitude noise. Using the same set of randomly generated randomized benchmarking sequences as in Fig. 2, we now apply a mixed noise spectrum, simultaneously containing uncorrelated, rapidly varying noise ($M_n \leq 1$), and quasi-static offsets that are constant over a full sequence, giving a strongly correlated component ($M_n = J$). In addition to performing measurements with primitive gates, we also construct DCG sequences by deterministically replacing each Clifford with its logically equivalent DCG counterpart. The relations for the mixed noise spectrum provided in Tables I and II now permit a direct study of the impact of using DCGs on error correlations appearing within the randomized benchmarking sequences via the averaging behavior of $V_k^{(n)}$. Beginning with frequency detuning noise, both DCG implementations shown in Fig. 3c exhibit an initial variance scaling with noise averaging $V_k^{(n)} \propto 1/n$, reminiscent of the application of the purely uncorrelated noise process in Fig. 2d. The observed saturation in $V_k^{(n)}$ at large $n$ for the DCG data combines contributions due to both the analytically calculable component occurring in the presence of purely uncorrelated noise introduced above, and residual uncompensated error correlations. The general behavior observed for the DCG sequences is to be contrasted with that observed for the same sequences composed of primitive gates where, as in Fig. 2, the strong correlated noise component causes the variance to converge to a large constant value (gray).
Similar behavior is observed when considering the amplitude error quadrature. We demonstrate this through the application of engineered control-amplitude noise in Fig. 3d, where measurements on sequences composed of DCGs derived from the BB1 family exhibit a similar $V_k^{(n)} \propto 1/n$ averaging behavior. Again, this is contrasted with the behavior of sequences composed of primitive gates, where once more the variance saturates to a high constant value despite application of the same noise in both settings.

B. Quantitative analysis of error-correlation suppression
In order to calculate the change in error correlations realized in randomized benchmarking sequences composed of DCGs, we compare experimental measurements of V (n) k with the predictions of the model summarized in Table I. For the primitive gates, we explicitly translate the applied detuning noise strengths to an effective error strength using the noise-to-error relations in Table II; for this, we also use the expected random walk step expressions calculated and presented in Table III of Appendix B for detuning or amplitude noise with a π/2-bandwidth in the uncorrelated component. The solid, black lines in Figs. 3c,d are then derived using these calculated error strengths, with no free parameters. Agreement between experimental measurements and theoretical predictions for the primitive gate sequences is good, but we observe a small (∼20%) deviation that appears approximately constant over several orders of magnitude in n for both noise processes. Ongoing work is investigating the source of this discrepancy; possible sources include the unaccounted impact of higher-order terms due to the strength of the applied noise, and undersampling of the distribution over noise-averaged sequences.
To extract the relative correlated and uncorrelated error components after DCG application, we fit the data using the theoretical predictions for the scaling of $V_k^{(n)}$ shown in Table I, and use the strengths of the two error components $\sigma_U^2$ and $\sigma_C^2$ as free parameters. First, for all DCGs we observe a reduction in $\sigma_C^2$ coupled with an increase in $\sigma_U^2$. Specifically, $\sigma_C^2$ is reduced by a factor of $49\times$ for CORPSE, $6\times$ for WAMF, and $10\times$ for BB1, while all experience an increase in $\sigma_U^2$ by approximately 6 to $7\times$. The relative performance of the DCGs observed in our experiments is aligned with their documented strengths, as CORPSE is known to more efficiently cancel purely static detuning errors than WAMF [33,57], although improved calibration of the pulse-amplitude values used in WAMF gates is expected to improve the efficacy of correlated-error suppression.
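The idea of separating a component that averages away from one that does not can be illustrated with a deliberately simplified fit. The sketch below does not implement the Table I expression; it assumes a stand-in model $V(n) \approx A/n + C$, in which the hypothetical coefficient $A$ tracks the part of the variance that falls as $1/n$ and $C$ the saturation level. Neither coefficient should be read as $\sigma_U^2$ or $\sigma_C^2$ directly.

```python
import numpy as np

def fit_variance_scaling(n, V):
    """Least-squares fit of the stand-in model V(n) = A / n + C.

    n : averaging depths (1, 2, ..., n_max)
    V : mean variance over sequences at each depth
    Returns (A, C). The model is an assumption made for illustration,
    not the expression from Table I of this work.
    """
    n = np.asarray(n, dtype=float)
    V = np.asarray(V, dtype=float)
    # The model is linear in (A, C), so ordinary least squares suffices
    design = np.column_stack([1.0 / n, np.ones_like(n)])
    (A, C), *_ = np.linalg.lstsq(design, V, rcond=None)
    return A, C
```

Applied to mean trajectories for primitive and DCG sequences, the ratio of the fitted saturation levels gives a rough indication of how strongly the non-averaging component has been suppressed.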
The increase in $\sigma_U^2$ is approximately consistent with the increase in duration of the DCGs relative to the primitive gate implementations. Considering the high-pass-filtering nature of all DCGs illustrates why uncorrelated noise processes fluctuating rapidly on the scale of the individual DCGs are transmitted by their filters and lead to residual errors that may be amplified by the DCG structure. Overall, these measurements, in particular the scaling of $V_k^{(n)}$, are consistent with an interpretation that the action of the noise whitening in the filter-transfer-function framework transforms correlated noise into predominantly uncorrelated residual errors at the operator level.

C. Signatures of variable error-correlation lengths
To expand on the previous analyses, we experimentally demonstrate that the reduction in effective error correlation indeed resides at the virtual gate layer. Using the same sequences as before, and the same engineered $\rho_U$ and $\rho_C$ rms magnitudes for detuning noise, the length of the correlated noise component is now varied in terms of the number of gates at the virtual level, breaking it up into blocks of length $M_n$. The lab-frame durations of the noise blocks therefore now differ by a factor of $\sim 6$ between the primitive and the CORPSE gates (the average increase in the duration of the Clifford operations when using CORPSE).
In the case of sequences composed of primitive gates, the signature exhibited by the variance scaling under noise averaging in Fig. 4a gradually changes from indicating correlated errors (saturation at high variance) to purely uncorrelated errors ($1/n$-like scaling) as the block length is decreased, consistent with observations in Fig. 2 and Fig. 3. By contrast, the sequences composed of CORPSE gates in Fig. 4b retain their overall $1/n$-like scaling behavior for all correlated component block lengths, demonstrating that residual uncorrelated errors remain dominant. All traces in Fig. 4a,b have been normalized to the initial mean variance for each engineered noise case to highlight the change in the relative correlated and uncorrelated error components, rather than the net error strength.
As a witness of the suppression of error correlations, Fig. 4c shows the ratio of the initial mean variance $V_k^{(n=1)}$ to the final, fully noise-averaged variance $V_k^{(n=200)}$. This ratio scales approximately inversely with $M_n$ for primitive gates but remains nearly constant for CORPSE gates. Extrapolation of this ratio for CORPSE back towards small $M_n$ reveals a crossover with the primitive data that lies between $M_n \approx 1$ and 2. This shows that CORPSE gates can reduce the noise correlation length to an error correlation length commensurate with physical noise with $M_n \approx 1$ to 2. Because the noise correlation blocks were matched to the duration of the underlying Cliffords, whether through primitive or composite construction, these data highlight the efficacy of DCGs in virtualizing error characteristics for the logical gates implemented.

V. DCG'S IMPACT ON INTRINSIC ERRORS
After verifying the utility of the theoretical constructs we have introduced in this work, we now turn to characterizing the intrinsic errors limiting the performance of our system. In the trapped $^{171}\mathrm{Yb}^{+}$ ion experiment described in Section III, we achieve a single-qubit randomized benchmarking average error per gate (EPG) of $(1.89 \pm 0.12) \times 10^{-5}$ (Appendix A). Increasing the number of qubits to five and performing simultaneous randomized benchmarking using a global microwave control field reveals a monotonic increase in the EPG across the register, ranging from $(5.7 \pm 0.5) \times 10^{-5}$ to $(1.3 \pm 0.1) \times 10^{-4}$. As such, were we to run multi-ion algorithms that use global state manipulations, e.g., transversal gates in the 7-qubit Steane code [4], we would not see the net error rate scale linearly with respect to the initial single-qubit EPG. This non-linear scaling with increasing qubit numbers has been observed in many systems and is often due to cross-talk between qubits [59]. It is important to note that this experimental observation of inhomogeneous error rates also violates a common assumption on noise statistics made in studies of error correcting codes, namely that the noise is independent and identically distributed (iid).
In our case, the underlying cause of the observed error inhomogeneity is a sub-percent-level gradient in the amplitude of the microwave control field across the ion chain, caused by interference from metallic surfaces in the proximity of our in-vacuum antenna. We also observe a small magnetic-field gradient across the qubit chain, such that both amplitude and detuning noise are present simultaneously. Spatially correlated errors have recently been studied in reference [60], wherein it is noted that previous studies of multi-qubit errors tend to assume either spatially independent errors or identically spatially correlated errors, facilitating the use of a decoherence free subspace. Our situation, with a gradient of spatially correlated errors, falls between these two cases, but can still induce simultaneous multi-qubit errors that lower the efficacy of QEC.
To characterize the impact of DCGs on spatially correlated errors, we utilize simultaneous randomized benchmarking sequences of length J = 500 applied to all five qubits in the register, and again explore variance scaling with experimental averaging. We construct DCG sequences using BB1 gates to combat the dominant microwave-control-amplitude errors. Data collection proceeds by interleaving a single sequence implemented using either primitive or BB1 gates to ensure a fair comparison between the sequences in time, in the event that any systematic drifts occur.
We examine the scaling of $V_k^{(r)}$ with averaging over repetitions $r$, up to $r = 500$; because noise is native to the system, we make the substitution $n \equiv r$. The signature of the temporally correlated intrinsic errors is observed for all ions when using sequences of primitive gates in Fig. 5a (red). We observe a staggered saturation value for $V_k^{(r)}$ at $r = 500$, increasing with the spatial distance from qubit 1 (leftmost qubit in Fig. 5a inset), which is used to calibrate the gate operations. As expected, the qubit that is furthest from the calibration qubit suffers the worst randomized benchmarking performance and shows the highest saturation value in variance scaling. By contrast, the over-rotation error suppressing BB1 gates (blue) saturate at a value of variance over an order of magnitude lower than achieved by the primitive gates, and recover a $1/r$-like scaling for all qubits. We further find that the relationship between the physical positions of the qubits and the ordering of saturation variances has become scrambled. Using the analysis introduced above, we fit the mean variance trends with the expression in Table I, allowing the error strengths $\sigma_C^2$ and $\sigma_U^2$ to vary. We extract a reduction in the correlated error strength when using BB1 gates ranging from $\sim 5$ to $16\times$ for the five qubits.
To directly probe the action of DCGs in virtualizing the spatially correlated errors, we calculate the pairwise cross-correlation coefficient between the survival probabilities in each experimental realization (Fig. 5b). For primitive gates, all errors are highly correlated between qubits (cross-correlation coefficient ≥ 0.9 for all qubit pairs), whereas for the BB1 gates, a reduction of approximately 50% can be seen between all qubit pairs, further supporting the evidence that DCGs provide a suppression of error correlations in both time and space.
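A pairwise cross-correlation of this kind can be computed in a few lines. The sketch below assumes the Pearson correlation coefficient and a hypothetical array layout with one row of survival probabilities per qubit; the text does not specify which estimator was used, so this choice is an assumption.

```python
import numpy as np

def pairwise_cross_correlation(p):
    """Pearson correlation between qubits' survival probabilities.

    p : array of shape (n_qubits, n_realizations); p[q, i] is the
        survival probability of qubit q in experimental realization i
        (hypothetical layout, assumed for this sketch).
    Returns an (n_qubits, n_qubits) symmetric matrix with unit diagonal.
    """
    # np.corrcoef treats each row as one variable by default
    return np.corrcoef(np.asarray(p, dtype=float))
```

Off-diagonal entries near 1 indicate that the qubits' errors rise and fall together across realizations, while smaller values indicate weaker spatial correlation.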
Separate investigations not presented here using the multi-axis error suppressing DCG CinBB showed no additional benefit. This observation suggests that the off-resonance error created by the magnetic-field gradient was sufficiently small that it was dominated by other larger, but rapidly fluctuating, intrinsic error sources.

FIG. 5. Intrinsic errors in a five-qubit chain. a, Variance over noise-averaged sequence survival probabilities for five qubits using $k = 60$ sequences of length $J = 500$, averaged over repetitions $r$, up to $r = 500$. Each trajectory is produced by shuffling the order of repetitions used in the graph to avoid bias, dotted lines indicate the means of 1000 trajectory randomizations, and solid lines are fits where the correlated and uncorrelated error strengths were free to vary. The correlated error strengths, $\sigma_C^2$, are $\{1.2, 1.5, 1.9, 2.4, 2.7\} \times 10^{-4}$ from qubit 1 to 5 for the primitive gates, and $\{2.3, 2.5, 1.1, 2.2, 2.3\} \times 10^{-5}$ for the BB1 gates.

VI. OUTLOOK
The results we have presented suggest that the path to the practical implementation of QEC may be facilitated by transforming miscalibrations and common laboratory noise sources exhibiting slow drifts and low-weight noise spectra into effective error processes with dramatically reduced correlations at the virtual layer using DCGs. We believe this is important as the pursuit of functional quantum computers, even at the mesoscale, will clearly require major advances in the control and suppression of errors, as gate counts quickly exceed $10^{10}$ for even moderate problems requiring only $\sim 200$ qubits [61]. Combined with the observation that certain DCGs can mitigate spatial cross-talk in multi-qubit systems [62], we believe that our demonstration of the suppression of temporal and spatial error correlations within quantum circuits solidifies the central importance of dynamic error suppression techniques at the virtual level for practical quantum computing.

ACKNOWLEDGMENTS
The authors acknowledge S. Mavadia for assistance with data collection and simulations, and discussions with H. Ball

Appendix A

Using the experiment described in Section III, with a single trapped $^{171}\mathrm{Yb}^{+}$ ion and microwave gates, we achieve a single-qubit error per gate of $p_{RB} = (1.89 \pm 0.12) \times 10^{-5}$ measured using randomized benchmarking (Fig. 6). The fit to the mean survival probabilities used to extract the error per gate is given by where $P$ is the mean survival probability, $J$ is the number of gates in a randomized benchmarking sequence, and $\kappa$ is the value of our single-qubit State Preparation and Measurement error (SPAM), found to be $\kappa = (3.3 \pm 0.1) \times 10^{-3}$.

Appendix B: Physical noise to error strength translation
We verify the model presented in this manuscript by using primitive Clifford gates under engineered noise, where the strength and effect of the noise are known exactly, allowing for quantitative analysis. For this verification, we need to calculate the translation between the rms magnitude of the physical noise process, ρ, and that of the resulting error operators, σ. The noise is applied concurrent with the gate operations inducing multi-axis and gate-dependent errors for the different Clifford operations, whose lengths differ between π and π/2 rotations. Due to this introduced gate-dependence, an exactly constant noise process will not be directly translated to a constant error process with identical error vectors for every gate, and hence the translation for each gate needs to be considered explicitly.
The method to transform noise strength to error strength for noisy, primitive Clifford gates is initially presented here for a general noise process that is static over the duration of a single gate. Each of the single-qubit Clifford gates is made up of rotations on the Bloch sphere with the rotation axis and angle specified by the Clifford gate index, $\eta_j \in \{1, \ldots, 24\}$. If the $j$th gate in a sequence is affected by laboratory noise with value $\delta_j \sim \mathcal{N}(0, \rho^2)$, the resulting noisy gate can be decomposed into an error operator and the ideal gate, $\tilde{C}_{\eta_j} = \hat\Lambda_j \hat{C}_{\eta_j}$, with

$\hat\Lambda_j = \exp\left\{ i \sum_{\alpha=1}^{\infty} \delta_j^{\alpha} [\nu_{\eta_j}]_\alpha \cdot \hat\sigma \right\}$, (B1)

where $\hat\sigma$ is the vector of Pauli matrices. In the main text, this operator was introduced in terms of the error vector $\epsilon_j$ as $\hat\Lambda_j = \exp\{ i \sum_{\alpha=1}^{\infty} [\epsilon_j]_\alpha \cdot \hat\sigma \}$. We have now separated the error vector into two components for the Magnus expansion of order $\alpha$, $[\epsilon_j]_\alpha = \delta_j^{\alpha} [\nu_{\eta_j}]_\alpha$, to explicitly show the dependence on the physical noise strength $\delta_j$, which will change between different realizations of the noise, and the particular gate's susceptibility to the error channel, described by the term $\nu_{\eta_j}$. There will be 24 gate-specific error vector terms, $\nu_{\eta_j}$, corresponding to the 24 Clifford operations, which can be calculated explicitly for a given noise process. We now consider how these terms affect our ideal randomized benchmarking sequence.
Starting with the standard randomized benchmarking procedure, we compile a sequence of randomly composed single-qubit Clifford operations, $\prod_{j=1}^{J} \hat{C}_{\eta_j} = \hat{I}$, which are mathematically right-multiplied to the preceding operator such that they act sequentially on an initial state. Then, the complete noisy sequence is given by The survival probability for a qubit prepared in $|0\rangle$, averaged over $n$ noise instances, is calculated using To approximate the sequence, the method from [32] is employed: the first-order term of each error operator can be translated to a step in Pauli-error space, with the total random walk in three dimensions for a given noise instance $i$ given by The $j$th random walk step, $r_{3D,j}$, is calculated from the product of the preceding ideal gates modifying the first-order, gate-specific error for the $j$th operation in the sequence, $[\nu_{\eta_j}]_1 \cdot \hat\sigma$, To obtain the sequence survival probability that would be measured via a single-axis projective measurement, the relevant steps are then the projection of $r_{3D,j}$ in the two-dimensional $\hat\sigma_x\hat\sigma_y$-plane, $r_{2D,j} \equiv r_j$, of Pauli-error space. As with the original model, it can be shown that a sequence's survival probability is given by where $R$ is the two-dimensional random walk. From this expression, the expectation and variance of the distribution over noise-averaged sequence survival probabilities have been calculated for arbitrary step lengths; the results of this calculation are summarized in the noise-to-error translation in Table II of the main text. These expressions are based on the expected random walk steps induced by the 24 error maps, which are shown in Table III for a range of physical noise processes. We proceed here by showing an example derived for a concurrent detuning error.

Example for Concurrent Detuning Noise
The combined main text Tables I and II predict the form of the noise-averaged survival-probability distribution for different engineered noise processes, given the expected random walk steps in the $\hat\sigma_x\hat\sigma_y$-plane of Pauli-error space, $E[\|r_j\|^2]$, $E[\|r_j\|^4]$. As an explicit example, these quantities are calculated here for concurrently applied detuning noise, produced by an offset between the qubit frequency and the control field frequency, normalized to the Rabi frequency, $\delta = \Delta/\Omega$. An ideal rotation of angle $\theta$ about the $n$-axis of the Bloch sphere is modified by detuning noise as, From this, the eight physical error maps affecting the Clifford operations are calculated to be, more generally expressed for the $j$th operation in the sequence as $\hat\Lambda$ Only eight error maps are required to treat all 24 Clifford operations due to the error-free nature of $\hat\sigma_z$-rotations, which are generally implemented via instantaneous phase-changes on the control field. Following the definition of the Clifford operations given in [32], there is only one non-$\hat\sigma_z$-rotation per Clifford, which exactly corresponds to one of the eight error maps described in (B8). If $\hat\sigma_z$ operations were also affected by the noise, the procedure would follow similarly but all error maps would need to be calculated.

TABLE III. Expected random walk steps for different physical noise processes.
Concurrent detuning, 1 value per primitive ($\pi$ or $\pi/2$) gate
Over- and under-rotation, 1 value per primitive ($\pi$ or $\pi/2$) gate: $\pi^2/18$, $5\pi^4/576$, $29\pi^4/5184$
Over- and under-rotation, 1 value every primitive $\pi/2$ gate time: $\pi^2/36$, $5\pi^4/2304$, $29\pi^4/10368$
To find the expected random walk steps for this unitary error channel, recall from (B5) that the direction of the Pauli-error steps is determined by the preceding operations in the randomly composed sequence. As such, a given step will remain deterministic in its size, yet be performed along an arbitrary direction in Pauli-error space, determined by the preceding gates. Studying the error maps for concurrent detuning noise, we can write the gate-dependent steps as with $\hat{m}_1, \hat{m}_2 \in \pm\{\hat\sigma_x, \hat\sigma_y, \hat\sigma_z\}$. This implies that $\pi$-rotations about the $x$- and $y$-axes of the Bloch sphere produce a unit-length step in Pauli-error space that will be randomly oriented along one of the six principal axes. Similarly, $\pi/2$-rotations produce a $1/\sqrt{2}$-length step oriented at $45^\circ$ between two principal axes, $\hat{I}$ gates produce a $\pi/2$-length step along a principal axis, and rotations about the $z$-axis contribute no step due to their error-free nature.
The probability of producing a particular non-zero $r_j$ is shown in Table IV, based on the prevalence of different gates in the 24 Clifford gates and the likelihood of their projection into the $\hat\sigma_x\hat\sigma_y$-plane. Note that these steps are completely independent of the strength of the particular noise realization, $\delta_j^{(i)}$. The noise will eventually rescale each step length, but here we only consider the unscaled walk. For this particular noise type and bandwidth, it is not necessary to distinguish between $E[\|r_{C,j}\|^2]$ and $E[\|r_{U,j}\|^2]$, as both the correlated and uncorrelated error processes are static over the duration of an individual gate, and hence will result in the same expected average walk steps; it is only when increasing the bandwidth of the uncorrelated noise that they need be distinguished. Using Table IV one finds Using Tables I and II,  for both correlated and uncorrelated errors. This again illustrates the equivalence of the distribution mean, which is related to the parameter that standard randomized benchmarking analysis returns, for noise of the same strength despite vastly different correlation lengths. The difference between the correlated and uncorrelated processes becomes evident when looking at the variance over survival probabilities with increased noise averaging.
For uncorrelated errors, noting that in the limit $n \to \infty$, the variance scaling saturates at a value $\propto 1/J$ relative to the starting variance.
again tending towards a constant; however, this occurs at a significantly smaller number of noise averages than for uncorrelated noise and saturates at a much larger variance $\propto 1 + 1/J$ relative to the starting variance. Using the revised model, the noise-averaged survival-probability distributions under correlated noise remain Gamma distributed with an updated scale parameter. While this is yet to be shown explicitly for the uncorrelated case, the behavior is approximated in the limit of large $n$ and $J$, with $n < J$, by modifying the distribution in (3), to yield

$P_C \sim \Gamma\left(a = 1,\; b = \tfrac{2}{3} J\sigma^2 \left(\tfrac{1}{2} + \tfrac{\pi^2}{96}\right)\right)$, (B22)

$P_U \sim \Gamma\left(a = n,\; b = \tfrac{2}{3n} J\sigma^2 \left(\tfrac{1}{2} + \tfrac{\pi^2}{96}\right)\right)$. (B23)
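As a short consistency check on (B22) and (B23), the first two moments of the Gamma distributions can be written out. This assumes the standard shape-scale convention, in which $\Gamma(a, b)$ has mean $ab$ and variance $ab^2$; the shorthand $b_0$ is introduced here only for convenience.

```latex
\begin{align}
  b_0 &\equiv \tfrac{2}{3} J\sigma^2\left(\tfrac{1}{2} + \tfrac{\pi^2}{96}\right), \\
  \mathrm{E}[P_C] &= 1 \cdot b_0 = b_0, &
  \mathrm{Var}[P_C] &= 1 \cdot b_0^2 = b_0^2, \\
  \mathrm{E}[P_U] &= n \cdot \frac{b_0}{n} = b_0, &
  \mathrm{Var}[P_U] &= n \cdot \frac{b_0^2}{n^2} = \frac{b_0^2}{n}.
\end{align}
```

Under this convention the two distributions share the same mean, while the variance under uncorrelated noise falls as $1/n$ and the variance under correlated noise does not. The saturation at large $n$ discussed above lies outside the $n < J$ regime in which (B23) is stated to hold.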
The normalized Gamma distributions for correlated error processes shown by solid gray lines in the main text Fig. 2a-c were calculated from first principles using (B22) with no free parameters. The distributions for the uncorrelated error process in red were calculated from an altered version of (B23), which was modified for higher bandwidth noise that took multiple values of $\delta$ in a single error map. This made use of the relation such that the multiple values of $\delta$ could be expressed as from which point the previous method can be followed. The equivalence in (B25) occurs because $\delta_1$, $\delta_2$ are independent samples from a Gaussian distribution, meaning their combination is also Gaussian distributed.

Appendix C: DCG constructions employed in this work
Three error-suppressing DCGs are utilised in this work: CORPSE and WAMF gates, which suppress detuning errors, and BB1 gates, which suppress over-rotation errors. For each of these constructions, gates with target angle $\theta_t = \pi, \pi/2$ are created as multi-segment pulses described by the segments' rotation angles $\theta_i$, phase angles $\phi_i$, and Rabi frequencies $\Omega_i$ normalized to the maximum frequency $\Omega$. The constructions of the different gates are shown in Table V. To ensure that the error-suppressing aspects of the DCGs are maintained for all Clifford gates, the identity gate is implemented as a rotary spin echo by concatenating a $\pi$ rotation about the x-axis with its inverse $-\pi$ rotation. While this results in a net zero rotation, effectively identical to the simple wait time used for primitive $\hat{I}$ gates, it makes the identity operation first-order insensitive to detuning errors during its implementation. The physical motivation is that if a qubit remains idle at any point during a multiqubit circuit, it may be preferable to drive this type of rotary spin echo continuously so that the qubit does not accumulate phase errors during its idle period.
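As an illustration of how a multi-segment pulse suppresses over-rotation errors, the sketch below compares a primitive $\pi$ rotation with a BB1 sequence when every segment over-rotates by the same fractional amount $\epsilon$. It assumes the widely used textbook form of BB1 (a target rotation followed by $\pi_{\phi_1}$, $2\pi_{\phi_2}$, $\pi_{\phi_1}$ with $\phi_1 = \arccos(-\theta_t/4\pi)$ and $\phi_2 = 3\phi_1$), which may differ in segment ordering or phase convention from the entries of Table V. The function names and the value of $\epsilon$ are hypothetical.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def rotation(theta, phi):
    """Rotation by angle theta about the axis cos(phi) x + sin(phi) y."""
    axis = np.cos(phi) * SX + np.sin(phi) * SY
    return np.cos(theta / 2) * I2 - 1j * np.sin(theta / 2) * axis


def primitive_gate(theta, eps):
    """Single-segment rotation with fractional over-rotation eps."""
    return rotation(theta * (1 + eps), 0.0)


def bb1_gate(theta, eps):
    """Textbook BB1 sequence (assumed form), each segment over-rotated by eps."""
    phi1 = np.arccos(-theta / (4 * np.pi))
    phi2 = 3 * phi1
    segments = [(theta, 0.0), (np.pi, phi1), (2 * np.pi, phi2), (np.pi, phi1)]
    u = I2
    for angle, phase in segments:
        # Later segments act after earlier ones, so they multiply from the left.
        u = rotation(angle * (1 + eps), phase) @ u
    return u


def infidelity(u, theta):
    """1 - |Tr(U_target^dagger U)| / 2 for a target rotation about x."""
    target = rotation(theta, 0.0)
    return 1 - abs(np.trace(target.conj().T @ u)) / 2


if __name__ == "__main__":
    eps = 0.05  # illustrative 5% over-rotation
    print("primitive infidelity:", infidelity(primitive_gate(np.pi, eps), np.pi))
    print("BB1 infidelity:      ", infidelity(bb1_gate(np.pi, eps), np.pi))
```

The cost of the suppression is the longer sequence: in this assumed form the BB1 gate contains $4\pi$ of additional rotation relative to the primitive gate.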
Quantum projection noise (QPN) describes the intrinsic uncertainty in qubit measurements due to the binomial nature of measurement outcomes [63] and its scaling with the number of samples. The variance of a measurement due to QPN is $p(1-p)/r$, where $p$ is the true state projection onto the z-axis of the Bloch sphere and $r$ is the number of identical measurements performed. Our work studies variances over distributions of noise-averaged survival probabilities, and consequently it is necessary to demonstrate that we were not limited by QPN bounds. We consider the CORPSE data shown in Fig. 3c; to ensure that our results are not measurement artefacts from quantum projection noise, we average each combination of sequence and noise realization $r = 220$ times. At this number of repetitions, the largest possible projection-noise variance is $0.5(1-0.5)/220 \approx 1.1 \times 10^{-3}$. In addition to the worst-case QPN, we compare the variance-scaling results for the CORPSE DCG under simultaneously applied correlated and uncorrelated noise to the QPN given by the measured survival probabilities. Fig. 7 shows, in dark blue, the mean trajectory for the CORPSE variance scaling under the combined noise process presented in main text Fig. 3c. The dashed black line gives the worst-case QPN, and the two other sets of trajectories are calculated directly from the measured probabilities. For these, the QPN was calculated at each $n$ for 100 randomizations of noise realizations to reduce bias, and the 100 values are plotted. The lower set of trajectories is divided by $(n \times r)$ rather than just $r$. Our results lie well above this lower limit, suggesting that it is the most appropriate estimate of our QPN limit. Furthermore, we note that the saturation observed at large values of $n$ is not set by any static QPN bound limiting our measurements.
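The worst-case bound quoted above follows directly from the binomial variance $p(1-p)/r$, which is maximized at $p = 0.5$. A minimal sketch of that arithmetic is given below; the function name is hypothetical, and the value $n = 10$ in the second call is an arbitrary illustrative choice for the case where the denominator is $n \times r$.

```python
def qpn_variance(p, r):
    """Binomial (quantum projection noise) variance p(1 - p) / r.

    p: probability of the measured outcome, between 0 and 1.
    r: number of identical repetitions, a positive integer.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if r <= 0:
        raise ValueError("r must be positive")
    return p * (1.0 - p) / r


if __name__ == "__main__":
    r = 220
    worst_case = qpn_variance(0.5, r)
    print(f"worst-case QPN variance: {worst_case:.2e}")  # about 1.14e-03

    # Dividing by (n * r) instead of r lowers the bound by a factor of n.
    n = 10  # illustrative number of noise averages
    print(f"with n = {n} noise averages: {qpn_variance(0.5, n * r):.2e}")
```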