A Race Track Trapped-Ion Quantum Processor

We describe and benchmark a new quantum charge-coupled device (QCCD) trapped-ion quantum computer based on a linear trap with periodic boundary conditions, which resembles a race track. The new system successfully incorporates several technologies crucial to future scalability, including electrode broadcasting, multi-layer RF routing, and magneto-optical trap (MOT) loading, while maintaining, and in some cases exceeding, the gate fidelities of previous QCCD systems. The system is initially operated with 32 qubits, but future upgrades will allow for more. We benchmark the performance of primitive operations, including an average state preparation and measurement error of 1.6(1)$\times 10^{-3}$, an average single-qubit gate infidelity of $2.5(3)\times 10^{-5}$, and an average two-qubit gate infidelity of $1.84(5)\times 10^{-3}$. The system-level performance of the quantum processor is assessed with mirror benchmarking, linear cross-entropy benchmarking, a quantum volume measurement of $\mathrm{QV}=2^{16}$, and the creation of 32-qubit entanglement in a GHZ state. We also tested application benchmarks including Hamiltonian simulation, QAOA, error correction on a repetition code, and dynamics simulations using qubit reuse. We also discuss future upgrades to the new system aimed at adding more qubits and capabilities.


I. INTRODUCTION
Several technology platforms are viable candidates for large-scale quantum computation, including trapped ions [1], neutral atoms [2], and superconducting circuits [3]. However, existing demonstrations face scaling challenges to achieve the qubit numbers and fidelities necessary for fault-tolerant quantum computing. In addition, all platforms need refinement in reliability, power consumption, form factor, and cost. This concept, known as Rent's rule, has been discussed rigorously in terms of classical computing technologies and recently generalized to include quantum processors [4].
In this work, we characterize a trapped-ion quantum computer with a new trap design based on the QCCD architecture. The new machine, Quantinuum System Model H2, significantly increases the qubit number and decreases the physical resources per qubit, all while matching, and in some instances surpassing, the high circuit fidelity of our previous generation system [5].
The QCCD architecture was proposed as a scalable method for trapped-ion quantum computation [6,7]. Trapped-ion systems with a single trapping zone are limited in qubit number due to challenges in individually addressing single qubits within a large ion crystal, as well as motional mode crowding, which complicates achieving high-fidelity operations in a large crystal [8,9]. QCCD trapped-ion systems instead have multiple trapping zones allowing operations to always be done with a small number of ions, thereby facilitating low-crosstalk addressing and maintaining high fidelity [10]. Two-qubit gates between arbitrary pairs of qubits are enabled by ion transport during a quantum circuit, which brings pairs to be gated into the same trapping zone. Such dynamic rearrangement enables the execution of circuits with arbitrary connectivity without the overhead of logical SWAP gates typically incurred for platforms with fixed and limited connectivity [11]. This transport requires traps with a large number of programmable electrodes, which can be achieved using microfabricated surface traps [12-14].
Our first generation hardware, H∅ and H1 (based around the same linear trap design) [5,16], demonstrated many key components of the QCCD architecture and achieved high-fidelity gates with arbitrary two-qubit (2Q) couplings. Since the initial operation of the linear trap, qubit number N was increased fivefold, from 4 in its initial mode of operation [5] to 20 in its latest [16], while 2Q gate errors were decreased by roughly a factor of 5. By increasing both the transport speeds and the number of gate zones, the average time required to execute a layer of N 1Q gates and N/2 2Q gates on a random pairing of all N qubits was kept roughly constant as N increased.
This progress notwithstanding, linear geometries pose severe scaling challenges. The time to rearrange ions for arbitrary circuit connectivity scales poorly for the linear trap design as the number of qubits increases. The future of QCCD systems is likely in 2D traps that offer better scaling of rearrangement times and are also well suited to many error correcting codes [17,18]. However, 2D traps present new engineering challenges that are still under development, such as junction transport [19,20] and signal routing under the trap top metal layer. Many other aspects of QCCD scaling still in development include coupling multiple surface trap die [21,22], control of a sufficient number of electrodes, laser light generation and delivery [23,24], and detection [25]. Not all of these challenges will be met simultaneously, but rather advances will be inserted as they are available.
This report marks the first major trap design advancement in the H-series QCCD quantum computers. Specifically, the new trap (shown in Fig. 1) introduces: (1) RF tunnels so that RF voltage electrodes do not need to be connected on the top surface that defines the trapping potential (Sec. II A), (2) voltage broadcasting to multiple control electrodes, thereby reducing the number of independent voltage sources needed to control the device (Sec. II A), and (3) MOT loading of the trap to increase the ion loading rate, decreasing the initialization time [26] (Sec. II B). In addition to the upgraded system design, we also report on upgraded operations, including higher performance and more efficient gating primitives. We present detailed benchmarking of the system performance with component benchmarking in Sec. III, system-level benchmarking in Sec. IV, and algorithmic benchmarking in Sec. V. Similar to the H1 series, the configuration described in this report is only the first of the H2 series, and we expect to make significant qubit count and gate zone operation upgrades in the near future.

II. OVERVIEW OF THE HARDWARE

A. Trap design
As shown in Fig. 2, H2 has a race track geometry similar to traps fabricated by other groups [27-29]. Two concentric RF electrodes circumscribe the center region and are driven at ∼200 V and 42 MHz, creating an RF-null 70 µm from the surface where ions are trapped. RF tunnels are required for the concentric RF electrodes and allow for DC electrodes to tile the full trap perimeter shown in Fig. 2c. The trap has two rows of gate zones colored in blue in Fig. 2, four on the top (UG01-UG04) and four on the bottom (DG01-DG04). In this work we use both rows for ion rearrangement (physical swaps), but only the DG zones are used for quantum operations (gating, state preparation, and measurement). We plan to extend quantum operations to both rows in future work.
The "conveyor belt" region of the trap is colored green in Fig. 2d. In this region, voltage "broadcasting" is used to minimize DC control signals by tying multiple DC electrodes within the trap die to the same external signal. As shown in Fig. 2b, each conveyor belt region contains equally spaced and sized electrodes tied together in a repeating fashion ({a, b, c, a, b, c, ...}). This requires only three total voltage signals and can support 20 wells on each side (one for every three electrodes). Additional electrodes, called shim electrodes, are located outside of the RF electrodes and used to compensate micromotion and rotate the trap principal axes. The load hole, visible in the middle of the left-side conveyor belt region in Fig. 2d, is surrounded by electrodes with independent signals.
The gate zone electrode configuration is similar to that in Ref. [5] and the spacing between gate zones remains the same (750 µm). An additional improvement to the signal count was realized by reducing the number of electrodes in the auxiliary regions around the gating zones (light grey in Fig. 2d). As expected, the linear transport through the auxiliary zones is not degraded compared with H1.
In total, the trap has 376 electrodes connected to 268 independent voltage sources and 1 RF drive. Similar to H1, H2 uses a 280 pin ceramic pin grid array to connect the trap electrodes to the DC control signals. This is a reduction in the number of electrical feedthroughs per qubit in the system, which is an important metric as the number of qubits grows.

B. Ion loading and state preparation
H2 uses a 2D MOT as a source for neutral atoms instead of an effusive atomic oven [30]. The MOT is connected to the main vacuum chamber via a differential pumping tube. The MOT cools both $^{171}$Yb and $^{138}$Ba neutral atoms, which are directed toward the backside load hole in the main vacuum chamber. A fraction of the neutral atoms that pass through the load hole are ionized by the photo-ionization beams on the front side, and subsequently cooled after loading into the trap (see Fig. 2a). In the best case, we load one $^{171}$Yb$^+$ in ∼1.2 ms and one $^{138}$Ba$^+$ in ∼40 ms; however, algorithmic latency and validation procedures limit the time to load the full trap (32 $^{171}$Yb$^+$-$^{138}$Ba$^+$ [YB] pairs in a deterministic orientation) to about 3-4 minutes. Under normal operating conditions, we observe no impact on the behavior of the quantum processor with the MOT beam on. Once the trap is fully loaded, we detect individual loss events and replace the affected ion pairs, requiring only 10-15 seconds per lost pair.
The qubit subspace occupies the hyperfine approximate clock states of $^{171}$Yb$^+$ in the $^2S_{1/2}$ state, $|0\rangle \equiv |F=0, m_F=0\rangle$ and $|1\rangle \equiv |F=1, m_F=0\rangle$. The quantization axis is set by an externally applied magnetic field in the plane of the trap at 45° with respect to the long axis. After loading, qubits are prepared in the $|0\rangle$ state via optical pumping, similar to previous work [5,31]. State preparation is currently only possible in the DG zones, so we prepare 8 qubits at a time and prepare all 32 qubits in four rounds.

C. Quantum gates
Quantum gates are implemented in the DG gate zones by stimulated Raman processes described in Ref. [5] with a laser geometry shown in Fig. 2d-f. Single-qubit (1Q) gates use co-propagating beams (Fig. 2f) and 2Q gates use pairs of beams with $\Delta\vec{k}$ coupling to the axial mode of motion (Fig. 2e). 2Q gates are implemented with a phase-sensitive Mølmer-Sørensen (MS) gate [32,33] sandwiched between 1Q wrapper pulses (using the same laser beams as the MS interaction) to generate the parameterized gate $U_{ZZ}(\theta) = \exp(-i\theta ZZ/2)$ [33-36]. The value of $\theta$ is controlled by varying the detuning, duration, and Rabi rate of the MS interaction. Modeling supports an average gate infidelity that decreases roughly linearly with $\theta$ down to a finite offset of $\approx 5\times 10^{-4}$ as $\theta \to 0$, as shown in Fig. 3. The finite offset at zero angle exists because the wrapper pulses still occur with a delay between them, and some fraction of the 2Q laser light remains on at zero angle, leading to residual errors predominantly from laser phase noise and spontaneous emission.
The 2Q beams have the strictest requirements and consume a large portion of the total laser power budget, with the current configuration using four 2Q laser beam pairs to operate four gate zones.We note that adding one more pair of beams would enable the operation of four more gate zones on the other side of the trap, an upgrade we plan to explore in future work.

D. Measurement
Measurement operations are performed in the DG zones with resonant beams traveling perpendicular to the long axis of the trap using state-dependent resonance fluorescence shown in Fig. 2g. A photomultiplier tube array allows independent detection in all eight gate zones simultaneously, though we only implement measurement operations in the DG zones.
Similar to previous work [5,16], qubit measurement and reset may be performed in the middle of a quantum circuit while quantum information is preserved on other qubits.Mid-circuit measurement and reset (MCMR) causes a small crosstalk error that acts on neighboring qubits due to stray light from the measurement and reset beams (see Sec. III and Table II).For unmeasured ions in the gate zones, this error is mitigated by the micromotion hiding technique described in Ref. [15] and depicted in Fig. 2g.Ions in the conveyor belt regions suffer from a similar level of crosstalk errors as ions in the gate zones, although we do not attempt to apply the micromotion hiding technique to them.

E. Ion transport
Arbitrary qubit connectivity is achieved via physical ion transport. During 32-qubit operation, the ions can be grouped into four "batches" of 8, with the four batches occupying the DG zones, UG zones, and each of the two storage regions as shown in Fig. 2d. The fundamental transport operations are similar to those in Ref. [5] and include split/combine, linear shifts, and physical swaps. A special type of linear shift for H2 is the batch shift, which shuttles batches of ions collectively to different regions of the trap. This operation is comparatively slow and dominates the circuit time. The fraction of total circuit time taken up by transport varies from circuit to circuit but is 60% on average (see Table I).
During ion transport, we cool all $^{138}$Ba$^+$ ions with Doppler cooling "sheet beams", illustrated on top and bottom of Fig. 2d, which resemble sheets of laser light that cover the entire trap. These sheet beams have about a 25% variation in intensity between the center of the trap and the edges, which does not present any performance limitations.
A compiler generates a schedule of quantum gates and transport operations with the goal of minimizing the total transport time required to execute the circuit. The circuit is first decomposed into layers, which are built iteratively by looking ahead through the circuit and grouping together (into one layer) the largest possible set of 2Q gates subject to two constraints: (1) no ions participate in more than one gate in each layer, and (2) the time ordering of 2Q gates that share one or more qubits (or any time ordering enforced by an explicitly requested barrier) is respected. The circuit is then converted into a layered directed acyclic graph, and a modified Sugiyama algorithm [37] is applied to iteratively sort the qubits in each layer of the graph in an effort to minimize the overall transport time required to execute all layers. The periodic boundary conditions of the device are explicitly taken into account and the resulting transport operations are computed using a parallel bubble sort routine that allows qubits to move in both directions around the device.
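As an illustration of the layering step only, the following Python sketch greedily builds layers from an ordered list of 2Q gates under the two constraints above. It is a simplified stand-in rather than the H2 compiler: the function name and the tuple-based gate representation are assumptions made for this example, and barriers, the Sugiyama-style sorting, and the transport scheduling are all omitted.

```python
def layer_two_qubit_gates(gates):
    """Group an ordered list of 2Q gates, given as (qubit_a, qubit_b) tuples,
    into layers.

    A gate joins the current layer only if neither of its qubits has been
    touched by an earlier gate that is still pending or already in this layer.
    This enforces (1) at most one gate per qubit per layer and (2) the original
    time ordering of gates that share a qubit.
    """
    remaining = list(gates)
    layers = []
    while remaining:
        blocked = set()    # qubits that can no longer accept a gate in this layer
        layer, deferred = [], []
        for a, b in remaining:
            if a in blocked or b in blocked:
                deferred.append((a, b))   # must wait for a later layer
            else:
                layer.append((a, b))
            # Whether accepted or deferred, later gates on these qubits must wait.
            blocked.update((a, b))
        layers.append(layer)
        remaining = deferred
    return layers


if __name__ == "__main__":
    circuit = [(0, 1), (2, 3), (1, 2), (0, 3), (4, 5)]
    for index, layer in enumerate(layer_two_qubit_gates(circuit)):
        print(index, layer)
```

In this toy circuit the gate on qubits (4, 5) is pulled forward into the first layer because it shares no qubit with any earlier pending gate, which is the "looking ahead" behavior described in the text.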
After compilation, the layers of the circuit are then executed sequentially, with transport primitives used to arrange the ions so that qubits scheduled to be gated in a given layer are positioned next to each other. Once arranged, we perform gates on each batch of ions, starting with the qubits already in the DG zones. Ions are transported to the center of the gate zones for quantum operations. 1Q gates are performed after moving a single YB pair into the center of the gate zone with shift operations (Fig. 2f), while 2Q gates are performed with two YB pairs combined into a single four-ion crystal YBBY (Fig. 2e). Before 2Q gating operations, we apply resolved sideband cooling to the $^{138}$Ba$^+$ ions in the DG zones [5,38,39]. Ions in the UG zones are transported to the nearby auxiliary zones so that they are not addressed by the gating laser beams (see red circles in Fig. 2d). After the gates are applied, we perform batch shifts to move new batches of ions into the DG zones and repeat the gating procedure until the full layer is completed. A spatial phase tracking routine accounts for inhomogeneities in the magnetic field and spatially-dependent AC Zeeman shifts from the RF current [40], which lead to spatially varying qubit frequencies. The routine calculates extraneous phase shifts that each qubit accumulates throughout the quantum circuit and compensates for them by adjusting the phase of 1Q operations appropriately. Imperfections in the spatial phase tracking, due to temporal instabilities in the magnetic field environment and imperfections in the calibration routines that set the 1Q optical phases, lead to memory errors during the transport operations and sideband cooling time. Additional sources of memory error are the finite $T_1$ time of several minutes, transport failures, or background gas collisions leading to an unintentional qubit reorder (the last two are difficult to distinguish experimentally).
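The bookkeeping behind spatial phase tracking can be sketched in a few lines of Python. This is only a toy model under stated assumptions: each trap region is assigned a constant, calibrated qubit-frequency offset relative to a reference, the dwell times are known, and the sign convention of the correction is chosen arbitrarily. The function names and the numerical offsets are hypothetical and are not taken from the H2 control system.

```python
import math


def accumulated_phase(schedule, offset_hz):
    """Phase (radians, modulo 2*pi) accumulated by one qubit.

    schedule  : list of (region_name, dwell_time_seconds) visited in order
    offset_hz : dict mapping region_name to the qubit frequency offset from
                the reference frequency in that region (hypothetical values)
    """
    total = sum(2.0 * math.pi * offset_hz[region] * dwell
                for region, dwell in schedule)
    return total % (2.0 * math.pi)


def compensated_gate_phase(nominal_phase, schedule, offset_hz):
    """Shift the phase of the next 1Q operation to undo the accumulated phase.

    The sign of the correction is a convention assumed for this sketch.
    """
    correction = accumulated_phase(schedule, offset_hz)
    return (nominal_phase - correction) % (2.0 * math.pi)


if __name__ == "__main__":
    offsets = {"DG01": 0.25, "storage": 0.0}          # Hz, illustrative only
    path = [("storage", 0.010), ("DG01", 1.0)]        # seconds
    print(accumulated_phase(path, offsets))            # pi/2 for these numbers
```

In this picture, the memory errors described in the text correspond to the offsets drifting between calibrations or being calibrated imperfectly, so that the computed correction no longer matches the phase the qubit actually accumulated.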

F. Classical programming and CPU-QPU interactions
Quantum algorithm developers can write programs for H2 in different frameworks and languages so long as their programs compile to either OpenQASM 2.0 or QIR [41,42].Both representations contain real-time support for classical operations in the middle of the circuit, conditional expressions that rely on these classical calculations that are performed in real time, and elementary feed-forward operations conditioned on measurement results.
Many quantum computing applications call for interactions between classical and quantum processing units. Perhaps the most notable example is quantum error correction schemes in which syndrome measurement results are sent to a classical computer where a decoding algorithm is used to determine recovery operations and update quantum circuits in real time. As discussed in our previous work [16,18], we have demonstrated this capability using two different frameworks: (1) OpenQASM 2.0++, which allows for real-time decision making, and (2) a more capable classical compute environment, utilizing WebAssembly (Wasm) [43], that can execute complex calculations. Option (2) has significantly enhanced capabilities aimed at the development of hybrid quantum/classical algorithms and is crucial for applications like quantum error correction.

III. COMPONENT OPERATIONS AND BENCHMARKS
As our first level of benchmarking, we measure the errors from various component operations in the system. Quantum operations (e.g., gates and SPAM) dominate the error budget but are only performed in the DG zones, and therefore we measure performance with a subset of eight qubits (two per DG zone). Other errors that occur during a circuit, such as memory errors, are measured with an interleaved randomized benchmarking (RB) experiment performed simultaneously on all 32 qubits. Details of each component benchmarking experiment are given below:
• 1Q gate randomized benchmarking (1Q RB): We use the standard Clifford-twirl randomized benchmarking for measuring the error of 1Q gates [47] with a random final Pauli to fix the asymptote [48]. We report the average infidelity per 1Q Clifford.
• 2Q gate randomized benchmarking (2Q RB): Similar to 1Q RB, we use the Clifford-twirl technique [47] for measuring the error of 2Q gates. Each 2Q Clifford is constructed with zero to three $U_{ZZ}(\pi/2)$ gates and each sequence includes a random final Pauli to fix the asymptote [48]. We scale the 2Q Clifford average infidelity by the average number of $U_{ZZ}(\pi/2)$ gates per Clifford, which is 1.5, and report that as the average infidelity per 2Q gate. An example decay plot is shown in Fig. 4a.
• 2Q SU(4) gate randomized benchmarking (2Q SU(4) RB): We use the same general technique as 2Q RB, but instead of 2Q Cliffords we use unitaries randomly sampled from the Haar measure over SU(4) constructed with three parameterized $U_{ZZ}(\theta)$ gates, and for each sequence include a random final Pauli to fix the asymptote [48]. We report the average infidelity per SU(4) operation.
• 2Q parameterized gate randomized benchmarking: We use a direct randomized benchmarking procedure [49] to measure the average infidelity of the parameterized 2Q gate $U_{ZZ}(\theta)$ as a function of angle $\theta$. The details of the protocol are in App. A 3. A plot of the average infidelity versus angle is shown in Fig. 3.
• Measurement/reset crosstalk depumping: Measurement/reset crosstalk errors are estimated with bright-state depumping experiments [15] where a subset of qubits are prepared in $|1\rangle$ and other qubits are measured/reset repeatedly. The qubits in $|1\rangle$ decay due to crosstalk errors from the repeated process, and the decay rate scales with the average infidelity. Additional experimental details and data can be found in App. A 1. Results from these experiments are reported in both Table II (averaged over zones) and Table VI.
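To make the 2Q RB bookkeeping concrete, the sketch below fits a survival curve to the standard RB decay model and converts the decay parameter into an infidelity per 2Q gate. It assumes the textbook model $p(\ell) = A r^{\ell} + B$ with the asymptote $B$ held fixed (here $1/4$ for two qubits, as enforced by the random final Pauli) and a depolarizing relation between $r$ and the average infidelity; the function names and the synthetic data are illustrative and do not reproduce the analysis code behind Fig. 4a.

```python
import numpy as np


def fit_rb_decay(lengths, survival, asymptote):
    """Fit survival = A * r**length + asymptote with the asymptote held fixed.

    Uses a log-linear least-squares fit, so every survival value must exceed
    the asymptote. Returns (A, r).
    """
    y = np.log(np.asarray(survival, dtype=float) - asymptote)
    slope, intercept = np.polyfit(np.asarray(lengths, dtype=float), y, 1)
    return float(np.exp(intercept)), float(np.exp(slope))


def infidelity_per_clifford(r, n_qubits):
    """Average infidelity for a depolarizing decay parameter r in dimension d."""
    d = 2 ** n_qubits
    return (d - 1) / d * (1.0 - r)


def infidelity_per_2q_gate(r, gates_per_clifford=1.5):
    """Divide the 2Q Clifford infidelity by the mean number of 2Q gates (1.5)."""
    return infidelity_per_clifford(r, n_qubits=2) / gates_per_clifford


if __name__ == "__main__":
    lengths = np.array([2, 16, 64, 128])
    survival = 0.75 * 0.996 ** lengths + 0.25      # synthetic, noise-free data
    amplitude, r = fit_rb_decay(lengths, survival, asymptote=0.25)
    print(amplitude, r, infidelity_per_2q_gate(r))
```

A weighted nonlinear fit with bootstrap resampling would be the more careful choice for real data; the log-linear fit is used here only to keep the example dependency-free beyond NumPy.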
An example breakdown of circuit timing for 2Q RB and transport 1Q RB is shown in Table I.
For 1Q and 2Q RB, we also measured the rate of leakage errors per gate by applying a "leakage detection gadget" at the end of each circuit, as illustrated in Fig. 5. The leakage detection gadget uses an ancilla qubit to flag shots that had a leakage error, i.e., an error that moved population outside of the computational subspace. In our system, leakage is most likely due to the unavoidable spontaneous emission that occurs in gates driven by a stimulated Raman process. The leakage rate per gate $r_L$ is defined as the rate that population leaves the computational subspace (whether 1Q or 2Q). We can estimate $r_L$ by repeating the gate a varying number of times, applying the gadget to each gated qubit, and fitting the leakage detection rate as shown in Fig. 4b. Further details are given in App. A 2.
FIG. 5. Leakage detection gadget, adapted from Ref. [50]. The gadget uses an ancilla qubit 'a' to detect whether qubit 'q' has leaked. The ancilla is initially prepared in $|1\rangle$. If 'q' has leaked, the 2Q gates have no logical effect and 'a' is measured as $|1\rangle$. If 'q' has not leaked, then the gadget (within the barriers in the circuit diagram) acts as $X_a I_q$, and 'a' is measured as $|0\rangle$.

IV. SYSTEM-LEVEL BENCHMARKS
Benchmarks of component operations are a crucial fine-grained tool for estimating the contribution of various errors to quantum circuits.However, there are many potential ways in which they can mischaracterize device performance, for example when crosstalk or non-Markovian errors are present.Therefore, it is important to also benchmark performance on a variety of more complex, multi-qubit circuits, and to assess to what extent that performance can be understood from the measured performance of the constituent operations.Here we present results from four system-level benchmarks: (A) mirror benchmarking [51,52], (B) quantum volume (QV) [53,54], (C) linear cross-entropy measurements for random 2D circuits [55], and (D) creation/certification of N-partite entanglement in GHZ states.
In benchmarks (A-C), the random structure of the circuits justifies simple heuristic arguments relating the overall circuit performance to the component operation fidelities. In each case, we assume that all non-SPAM errors can be attributed to the 2Q gates themselves, and come in the form of a depolarizing channel (with uniform fidelity) attached to each 2Q gate. This approach accumulates all errors that happen to qubits between 2Q gates, primarily due to 1Q gate errors and memory errors, and lumps them in with the 2Q gate to form an effective "per 2Q gate" error rate, which we denote by $\epsilon_{2Q}^{\mathrm{eff}}$. To obtain a simple but reasonable estimate for $\epsilon_{2Q}^{\mathrm{eff}}$ based on the component benchmarks, we first determine the average angle $\theta$ of 2Q gates used in the system-level benchmark. The data and linear fit reported in Fig. 3 then allow us to estimate the 2Q gate contribution. We then add this 2Q gate contribution together with twice the error from Transport 1Q RB, giving a predicted effective error
$$\epsilon_{2Q}^{\mathrm{eff}} = 10^{-3}\left[2.9(2)\,\theta/\pi + 0.9(1)\right]. \qquad (1)$$
Using analyses described in the appendices, we also extract an inferred $\epsilon_{2Q}^{\mathrm{eff}}$ from the system-level benchmarking data presented below, and report the comparisons to the predicted values in Table III. The agreement is not perfect, nor is it expected to be, given that memory errors can be highly circuit dependent. For example, in QV circuits, multiple 2Q gates happen with very little delay in between, whereas the memory error per 2Q gate inferred from Transport 1Q RB assumes a single random reconfiguration of ions between every repeated gate on a given qubit, which likely contributes to the overestimate of $\epsilon_{2Q}^{\mathrm{eff}}$ reported in Table III. Nevertheless, the overall reasonable agreement between predicted and inferred values suggests that the results of large-scale circuits are generally well aligned with expectations based on the individual component benchmarks.
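As a worked example of Eq. (1), the short Python sketch below evaluates the predicted effective 2Q error for a given average gate angle. The central values and uncertainties are the ones quoted in Eq. (1); combining the two uncertainties in quadrature assumes they are uncorrelated, which is an assumption of this sketch rather than something stated in the text, and the function name is illustrative.

```python
import math


def predicted_effective_2q_error(theta,
                                 slope=2.9e-3, slope_err=0.2e-3,
                                 offset=0.9e-3, offset_err=0.1e-3):
    """Evaluate Eq. (1): slope * theta / pi + offset.

    Returns (value, uncertainty). The uncertainty adds the slope and offset
    errors in quadrature, i.e. it assumes they are uncorrelated.
    """
    x = theta / math.pi
    value = slope * x + offset
    uncertainty = math.hypot(slope_err * x, offset_err)
    return value, uncertainty


if __name__ == "__main__":
    # Fully entangling gate, theta = pi/2, as used in mirror benchmarking.
    value, err = predicted_effective_2q_error(math.pi / 2)
    print(f"{value:.2e} +/- {err:.1e}")
```

For $\theta = \pi/2$ this evaluates to $2.35 \times 10^{-3}$, i.e. the measured 2Q gate infidelity plus roughly $5 \times 10^{-4}$ of lumped 1Q and memory error.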

A. Mirror benchmarking
Circuit mirroring was introduced as a scalable way to benchmark arbitrary quantum circuits [51,56]. We perform a randomized circuit mirroring experiment that we refer to as mirror benchmarking (MB). As described in Ref. [52], MB circuits consist of layers of 1Q gates on all qubits and 2Q gates between random pairings of the qubits with full connectivity. The 1Q gates are Clifford gates sampled uniformly at random, and each 2Q gate is the native $U_{ZZ}(\pi/2)$ gate. The circuits are "mirrored", meaning that the inverse circuit is applied in the second half. A final random N-qubit Pauli is applied to randomize the ideal outcome for each circuit. The circuits also employ Pauli randomization on the 2Q gates so that the error channel per layer can be treated as stochastic Pauli [57]. The circuit-averaged probability of observing the ideal outcome as a function of the number of circuit layers will then decay exponentially. If the 2Q gate error channel is depolarizing, then the decay parameter as a function of the 2Q gate average fidelity is given by an analytic formula (Eq. C4 in Ref. [52]). Fitting experimentally measured decay curves to exponentials and inverting this formula provides an effective 2Q infidelity for the system that includes 1Q gates, 2Q gates, and the memory error for random permutations.
We performed MB experiments on H2 with N =20, 26, and 32 qubits.The decay plots are shown in Fig. 6, and the results are listed in

B. Quantum volume
Quantum volume is a system-level test designed to be comparable across gate-based quantum computers. The QV test is run with a collection of random circuits acting on N qubits. Each random circuit is generated by randomly pairing all qubits, applying random SU(4) unitaries to each pair, and repeating for N rounds. The performance is assessed with a heavy-output test that requires classical simulation of the quantum circuits. The test is passed when the probability of generating heavy outputs is greater than 2/3 with two-sigma confidence, which yields a measured value of $\mathrm{QV} = 2^N$ [53]. A totally decohered circuit returns heavy outputs half the time, so the QV test's threshold of 2/3 requires that the errors are small enough to be strongly distinguishable from a random distribution. Therefore, a QV of $2^N$ implies high performance on many circuits with more than N qubits and/or depth greater than N, as evidenced by several example algorithms run on all 32 qubits in Sec. V. QV has been measured on a variety of different systems [58] with the largest previously reported value of $\mathrm{QV} = 2^{15}$ from H1 [59].
We performed several QV measurements, with the highest measured value being $\mathrm{QV} = 2^{16}$. The $\mathrm{QV} = 2^{16}$ test data is shown in Fig. 7 and used 200 randomly generated circuits each run with 100 shots and using an average of 296 parameterized 2Q gates. The measured heavy-output probability is 68.2%, which clears the minimum threshold of 2/3 with greater than two-sigma confidence calculated by the semi-parametric bootstrap method outlined in Ref. [54].
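The heavy-output test itself is simple to state in code. The sketch below, written for a single circuit, marks as "heavy" every bitstring whose ideal probability exceeds the median ideal probability and then counts the fraction of measured shots that land on heavy bitstrings. The function name and the tiny four-outcome distribution are illustrative assumptions, and the semi-parametric bootstrap of Ref. [54] used for the confidence interval is not reproduced here.

```python
import numpy as np


def heavy_output_probability(ideal_probs, samples):
    """Fraction of measured outcomes that are heavy for one circuit.

    ideal_probs : ideal output probabilities, indexed by bitstring (as integer)
    samples     : measured outcomes, each an integer index into ideal_probs
    """
    ideal_probs = np.asarray(ideal_probs, dtype=float)
    heavy = ideal_probs > np.median(ideal_probs)     # boolean mask of heavy outputs
    return float(np.mean(heavy[np.asarray(samples, dtype=int)]))


if __name__ == "__main__":
    ideal = [0.1, 0.2, 0.3, 0.4]          # toy 2-qubit distribution, median 0.25
    shots = [0, 2, 3, 3]                  # three of four shots are heavy
    print(heavy_output_probability(ideal, shots))
```

Averaging this quantity over many random circuits gives the heavy-output probability that is compared against the 2/3 threshold; sampling uniformly at random from all bitstrings returns a value near 1/2, which is the fully decohered limit mentioned in the text.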

C. Random circuit sampling
A system-level benchmark of recent interest is the computational task of sampling the output distributions of random quantum circuits (RCS). Like QV, RCS is not a scalable benchmark as it requires classical computation time exponential in N. It was recently proven [60] that at fixed gate error RCS is not a scalable route to quantum supremacy at large N; however, it still tests the quantum computer's ability to faithfully execute circuits for which classical simulation methods are, at least in practice, extremely difficult given high enough gate fidelities. Also, it has been run on a variety of quantum computers in the context of quantum advantage demonstrations [55,61,62], making it useful from the standpoint of cross-platform comparisons.
We structure our circuits as if the qubits involved tile a two-dimensional grid with nearest-neighbor interactions [55], although we emphasize that this constraint is only imposed for fair comparison with prior art and is not a hardware constraint of H2. Future work may study whether random circuits built from randomly gating pairs of qubits with arbitrary connectivity achieve a greater degree of classical simulation difficulty at a reduced circuit depth. At each N, the grid dimensions are chosen to be as close to square as possible. In each layer of the circuit, a 1Q gate chosen randomly from $\{\sqrt{X}, \sqrt{Y}, \sqrt{W}\}$, with $W = (X+Y)/\sqrt{2}$, is applied to each qubit. The 1Q gate applied to a given qubit in one layer is omitted from the set of possible 1Q gates applied to the same qubit in the next layer. Subsequently, a 2Q gate is applied to pairs of qubits following a particular tiling pattern on the grid (see Ref. [55], Fig. 3). A final round of 1Q gates is applied to all qubits before measurement. We implement the exact same 2Q gate as in Ref. [55], $\mathrm{fSim}(\pi/2, \pi/6)$. Up to 1Q gates, this gate is equivalent to $U_{ZZ}(5\pi/12)$ combined with a SWAP gate. The SWAP gate is handled in software by relabeling and transporting qubits, so the $\mathrm{fSim}(\pi/2, \pi/6)$ gate is implemented on H2 with exactly one 2Q gate. In practice, we generate the circuits using the Sycamore gate defined in the pytket library [63]; pytket's compilation to Quantinuum hardware automatically rebases the circuits to use a $U_{ZZ}(5\pi/12)$ gate via the above identity with the addition of 1Q gates that are absorbed into the randomly chosen 1Q gates.
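The statement that $\mathrm{fSim}(\pi/2, \pi/6)$ costs exactly one native 2Q gate can be checked numerically. The NumPy sketch below assumes the fSim matrix convention of Ref. [55] and the definition $U_{ZZ}(\theta) = \exp(-i\theta ZZ/2)$; the particular choice of 1Q gates ($X$ conjugation on one qubit and equal $Z$ rotations on both) and the global phase were worked out for this example and are not necessarily the decomposition produced by pytket.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=complex)


def fsim(theta, phi):
    """fSim gate in the convention assumed here (that of Ref. [55])."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1, 0, 0, 0],
                     [0, c, -1j * s, 0],
                     [0, -1j * s, c, 0],
                     [0, 0, 0, np.exp(-1j * phi)]], dtype=complex)


def u_zz(theta):
    """U_ZZ(theta) = exp(-i * theta * ZZ / 2), diagonal in the computational basis."""
    return np.diag(np.exp(-1j * theta / 2 * np.array([1, -1, -1, 1])))


def rz(angle):
    """Single-qubit rotation exp(-i * angle * Z / 2)."""
    return np.diag([np.exp(-1j * angle / 2), np.exp(1j * angle / 2)])


def fsim_from_single_uzz():
    """Build fSim(pi/2, pi/6) from one U_ZZ(5*pi/12), a SWAP, and 1Q gates."""
    phi = np.pi / 6
    flip = np.kron(X, I2)
    core = flip @ u_zz(5 * np.pi / 12) @ flip        # equals U_ZZ(-5*pi/12)
    local = np.kron(rz(-phi / 2), rz(-phi / 2))      # 1Q Z rotations
    global_phase = np.exp(-1j * 7 * np.pi / 24)
    return global_phase * (SWAP @ local @ core)


if __name__ == "__main__":
    print(np.allclose(fsim_from_single_uzz(), fsim(np.pi / 2, np.pi / 6)))
```

Because the SWAP factor is realized by relabeling and transporting ions, only the single $U_{ZZ}(5\pi/12)$ in this decomposition consumes an entangling-gate operation.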
Since H2 currently supports 32 qubits, well within the classically feasible regime, we focus on the "classically verifiable" repeating EFGHEFGH gate tiling pattern from Ref. [55]. As developed in Ref. [64], we use linear cross-entropy benchmarking to quantify the success of the quantum computer in sampling from the true output distribution of each random circuit. This procedure computes a quantity called the linear cross-entropy benchmarking fidelity, $F_{\mathrm{XEB}}$. To match the parameters in Ref. [55], we explore random circuits of depth 14, and average the resulting $F_{\mathrm{XEB}}$ over 10 circuits at each fixed N, combining the uncertainties on each measurement of $F_{\mathrm{XEB}}$ by inverse-variance weighting. The measured results on H2 are displayed in Fig. 8. With future improvements to the number of qubits in H2, assuming comparable effective 2Q error $\epsilon_{2Q}^{\mathrm{eff}}$, we expect the cross-entropy benchmark results will pose serious challenges to classical simulations.
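The two numerical steps described above can be sketched as follows. The estimator $F_{\mathrm{XEB}} = 2^N \langle p_{\mathrm{ideal}}(x_i) \rangle - 1$ is the commonly used linear cross-entropy form and is assumed here to be the one intended; the inverse-variance weighting is the standard formula. Function names and the toy inputs are illustrative only.

```python
import numpy as np


def linear_xeb_fidelity(ideal_probs, samples):
    """Linear XEB estimate for one circuit: 2^N * mean(p_ideal(sample)) - 1.

    ideal_probs : ideal probabilities of all 2^N bitstrings (indexed as integers)
    samples     : measured bitstrings, as integer indices
    """
    ideal_probs = np.asarray(ideal_probs, dtype=float)
    dimension = ideal_probs.size                      # equals 2^N
    return float(dimension * np.mean(ideal_probs[np.asarray(samples, dtype=int)]) - 1.0)


def inverse_variance_mean(values, sigmas):
    """Combine per-circuit estimates, weighting each by 1 / sigma^2.

    Returns (weighted mean, uncertainty of the weighted mean).
    """
    values = np.asarray(values, dtype=float)
    weights = 1.0 / np.asarray(sigmas, dtype=float) ** 2
    mean = np.sum(weights * values) / np.sum(weights)
    return float(mean), float(1.0 / np.sqrt(np.sum(weights)))


if __name__ == "__main__":
    ideal = [0.5, 0.25, 0.125, 0.125]                 # toy 2-qubit distribution
    print(linear_xeb_fidelity(ideal, [0, 0, 1, 2]))   # 0.375
    print(inverse_variance_mean([0.5, 0.7], [0.1, 0.1]))
```

Sampling uniformly at random gives an estimate that averages to zero, while sampling from the ideal distribution of a deep random circuit gives a value close to one, which is why the quantity is read as a circuit fidelity.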

D. N-partite entanglement certification in GHZ states
The N-qubit GHZ state [65] is defined as
$$|\mathrm{GHZ}_N\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle^{\otimes N} + |1\rangle^{\otimes N}\right).$$
Producing GHZ states is a demanding test of qubit coherence, as they are maximally sensitive probes of global dephasing. Moreover, GHZ state fidelities have been widely measured and reported across a variety of quantum hardware [66-70], making this test helpful for assessing the performance of the H2 device in a broader context. We prepare GHZ states of N = 20, 26, and 32 using the log-depth circuit construction given in Ref. [71] and an N = 32 GHZ state using a constant-depth adaptive circuit construction [72]. The latter was submitted via OpenQASM 2.0++ and exemplifies how mid-circuit measurement and feed-forward can be used to create long-range-entangled states in constant depth [73-75]. Both circuit constructions are shown in Fig. 9.
We estimate the fidelity of the GHZ states using the method of Ref. [76]. The fidelity of a density matrix $\rho$ with respect to the GHZ state is
$$F = \frac{1}{2}\left[\langle 0|^{\otimes N}\rho\,|0\rangle^{\otimes N} + \langle 1|^{\otimes N}\rho\,|1\rangle^{\otimes N} + 2\,\mathrm{Re}\,\langle 0|^{\otimes N}\rho\,|1\rangle^{\otimes N}\right].$$
The first two terms are the populations in the all-zero and all-one states and are estimated by measuring all qubits in the computational basis. The third term is estimated using the fact that
$$2\,\mathrm{Re}\,\langle 0|^{\otimes N}\rho\,|1\rangle^{\otimes N} = \frac{1}{N}\sum_{k=1}^{N}(-1)^k\,\mathrm{Tr}\left(\rho M_k\right),$$
where the operators
$$M_k = \bigotimes_{j=1}^{N}\left(\cos\theta_k\,X_j + \sin\theta_k\,Y_j\right)$$
for $k \in \{1, \ldots, N\}$ correspond to the global parity of spin along the axis $\theta_k = k\pi/N$ on the equator of the Bloch sphere, and can be measured with only 1Q rotations. The complete fidelity estimation protocol requires N + 1 measurement bases. We ran one circuit with 50 shots for each of the N measurements of $M_k$, and N circuits with 50 shots for the population measurements. All the log-depth unitary preparation circuits across the various N were run in a random order. The results of the population and parity measurements are shown in Fig. 10, and the estimated state fidelities are listed in Table IV. For N = 32, we obtain fidelities of 0.82(1) and 0.74(1) (without correcting for SPAM errors) for the unitary and adaptive state preparation circuits, respectively. By comparison, a GHZ fidelity > 0.5 is sufficient to witness genuine multipartite entanglement [77]. The adaptive circuit contains more 2Q gates (46) and measurements (48), and therefore produces a lower fidelity than the unitary circuit, which contains 31 2Q gates and 32 measurements. Especially for systems with limited connectivity and appreciable memory errors, the constant-depth adaptive circuit should outperform the unitary preparation circuit at large enough N.
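The population-plus-parity estimator can be checked on small systems with exact linear algebra. The NumPy sketch below assumes the parity operators are products of $\cos\theta_k X + \sin\theta_k Y$ on every qubit with $\theta_k = k\pi/N$ and alternating signs $(-1)^k$ in the sum; it evaluates the estimator on exact density matrices rather than on sampled shots, and the function names are chosen for this example.

```python
import numpy as np
from functools import reduce

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)


def ghz_density_matrix(n):
    """Density matrix of the n-qubit GHZ state (|0...0> + |1...1>) / sqrt(2)."""
    psi = np.zeros(2 ** n, dtype=complex)
    psi[0] = psi[-1] = 1 / np.sqrt(2)
    return np.outer(psi, psi.conj())


def parity_operator(n, k):
    """Global parity along the equatorial axis theta_k = k * pi / n."""
    theta = k * np.pi / n
    single = np.cos(theta) * X + np.sin(theta) * Y
    return reduce(np.kron, [single] * n)


def ghz_fidelity(rho):
    """F = (populations + coherence) / 2, with the coherence from n parities."""
    n = rho.shape[0].bit_length() - 1                 # rho is 2^n x 2^n
    populations = rho[0, 0].real + rho[-1, -1].real
    coherence = sum((-1) ** k * np.trace(rho @ parity_operator(n, k)).real
                    for k in range(1, n + 1)) / n
    return 0.5 * (populations + coherence)


if __name__ == "__main__":
    n = 4
    print(ghz_fidelity(ghz_density_matrix(n)))        # ideal GHZ state
    print(ghz_fidelity(np.eye(2 ** n) / 2 ** n))      # maximally mixed state
```

The ideal state returns a fidelity of one, and the maximally mixed state returns $1/2^N$, as expected for the overlap of a pure state with a fully depolarized one. In an experiment each trace is replaced by a sampled parity expectation value, which is where the N + 1 measurement settings come from.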

V. APPLICATION BENCHMARKS
The system-level benchmarks of the previous section serve to verify quantum computer performance on a well-defined set of volumetric circuits. However, problems of practical interest tend to involve structured circuits with very specific demands on gate set and connectivity. A comprehensive survey of all such problems is beyond the scope of this work (and difficult to define), but a sampling of such applications is still helpful for evaluating the machine's capabilities with respect to plausible near-term use cases and the demands they impose. In this section, we present the results of four application benchmarks: (A) Hamiltonian simulation, (B) QAOA, (C) large-distance repetition codes, and (D) holographic quantum dynamics simulation. We chose these benchmarks in a complementary way, as each places a different emphasis on particular error sources. For example, Hamiltonian simulation is highly dependent on 2Q gate error, QAOA performance depends strongly on qubit connectivity, and repetition codes and the holographic quantum dynamics simulation require high-fidelity MCMR.

A. Hamiltonian simulation
Simulating the continuous time evolution of many-body quantum systems is an important and classically challenging problem for which quantum computers are well-suited [78][79][80]. To benchmark the performance of the H2 quantum computer on this task, we simulate the dynamics of an $L = 32$ site transverse-field Ising model (TFIM) in one spatial dimension, whose Hamiltonian combines a nearest-neighbor $Z_j Z_{j+1}$ coupling of strength $J$ with a transverse field of strength $h$ along $X$. Here and elsewhere in this section, site subscripts are taken mod($L$) to yield periodic boundary conditions. We simulate a quantum quench where the initial state is prepared in the ground state at $h/J = \infty$, that is, a product state fully polarized along $X$. The Hamiltonian is suddenly quenched to $h/J = 0.2$, and the state is then evolved up to $Jt = 7$ under the new Hamiltonian. We evaluate the dynamics of the expectation value of the Pauli $X$ operator averaged over all qubits, i.e., $X \equiv \frac{1}{L}\sum_{j=1}^{L} X_j$. We digitally simulate the dynamics using first-order Trotterization of the time-evolution operator [81] with $r$ Trotter steps, which approaches the true evolution as $r \to \infty$. The 1Q and 2Q gates in this decomposition are $X$ rotations and $U_{ZZ}(\theta)$, which are native on H2. The number of Trotter steps $r$ is chosen such that the errors on $X$ due to Trotterization are below 0.01, as determined by explicit calculations of the noiseless Trotterized dynamics [82] and comparison to exact results for the continuous-time evolutions [83]. This threshold ensures that Trotter errors are at or below the scale of the expected $\sim 1\%$ statistical fluctuation in the experiment (more details in App. C 1).
The results of our experiment, plotted in Fig. 11, show reasonably good agreement between our quantum simulation and the exact solution up to time $Jt = 7$, suggesting the quantum computer has small enough errors to coherently simulate quantum dynamics up to a nontrivial time (note that a completely depolarized state has $X = 0$). The data has not been post-processed or error-mitigated in any way. Figure 11 also compares the circuit implementations with and without using the parameterized-angle $U_{ZZ}(\theta)$ gate. Without parameterized-angle gates, every such gate has to be decomposed into two $U_{ZZ}(\pi/2)$ gates with additional 1Q rotations, resulting in a doubling of the number of 2Q gates, and the possibility of more than doubling the error per Trotterization step (see Fig. 21). The improvements to the simulation results when using parameterized-angle 2Q gates highlight their benefit for near-term applications of quantum computers to simulating many-body physics.
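To illustrate how first-order Trotterization converges to the continuous-time quench dynamics, the sketch below simulates a small chain with dense matrices. It is only an illustration under stated assumptions: the sign convention $H = -J\sum_j Z_j Z_{j+1} - h\sum_j X_j$, the initial state $|+\rangle^{\otimes L}$, the small size $L = 4$, and all function names are choices made here and are not taken from the experiment.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.diag([1.0, -1.0])

def op_on(site_ops, L):
    """Tensor product of the given single-site operators, identity elsewhere."""
    return reduce(np.kron, [site_ops.get(j, I2) for j in range(L)])

def tfim_terms(L, J, h):
    """ZZ and X parts of the assumed H = -J sum ZZ - h sum X (periodic chain)."""
    H_zz = -J * sum(op_on({j: Z, (j + 1) % L: Z}, L) for j in range(L))
    H_x = -h * sum(op_on({j: X}, L) for j in range(L))
    return H_zz, H_x

def expm_herm(H, t):
    """exp(-i H t) for Hermitian H via eigendecomposition."""
    w, v = np.linalg.eigh(H)
    return (v * np.exp(-1j * w * t)) @ v.conj().T

def mean_x(psi, L):
    """Expectation value of the site-averaged Pauli X."""
    X_avg = sum(op_on({j: X}, L) for j in range(L)) / L
    return (psi.conj() @ X_avg @ psi).real

def plus_state(L):
    return np.ones(2 ** L, dtype=complex) / np.sqrt(2 ** L)

def trotter_mean_x(L, J, h, t, r):
    """Site-averaged <X> after r first-order Trotter steps of total time t."""
    H_zz, H_x = tfim_terms(L, J, h)
    step = expm_herm(H_x, t / r) @ expm_herm(H_zz, t / r)
    psi = plus_state(L)
    for _ in range(r):
        psi = step @ psi
    return mean_x(psi, L)

def exact_mean_x(L, J, h, t):
    H_zz, H_x = tfim_terms(L, J, h)
    return mean_x(expm_herm(H_zz + H_x, t) @ plus_state(L), L)

L, J, h, t = 4, 1.0, 0.2, 2.0
exact = exact_mean_x(L, J, h, t)
errors = [abs(trotter_mean_x(L, J, h, t, r) - exact) for r in (2, 8, 32)]
```

Each `expm_herm(H_zz, t / r)` layer corresponds to one round of $U_{ZZ}(\theta)$ gates, which is why a native parameterized gate halves the 2Q gate count relative to a two-Clifford decomposition.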

B. QAOA
The quantum approximate optimization algorithm (QAOA) [84] is a near-term heuristic algorithm for solving combinatorial optimization problems of general interest in many industries. As in previous benchmarking studies [36,85,86], we focus on solving the Max-Cut problem restricted to the class of unweighted 3-regular graphs $G = (V, E)$. The standard QAOA circuit consists of alternating applications of a mixing unitary $U_B(\beta_n) = e^{-i\beta_n H_B}$ and a phase-splitting cost unitary $U_C(\gamma_n) = e^{-i\gamma_n H_C}$. The $2p$ parameters $\beta_n$ and $\gamma_n$ are found variationally, by searching with a classical optimization algorithm for the choice of parameters that extremizes the cost of the QAOA final state, $\langle H_C \rangle$. The initial state is taken to be $|\psi_0\rangle = |+\rangle^{\otimes N}$, the ground state of $H_B = \sum_i X_i$, while for the unweighted MaxCut problem the cost Hamiltonian is, up to constants, a sum of $Z_i Z_j$ terms over the edges $(i,j) \in E$, and therefore each term in the cost Hamiltonian comprising the cost unitary $U_C(\gamma_n)$ can be implemented with a single $U_{ZZ}(\theta)$ gate. For the classical optimization procedure, we use the derivative-free BOBYQA optimizer [87] as implemented in the Py-BOBYQA package [88]. The BOBYQA optimizer builds a local quadratic model of the objective function within a trust region whose size decreases with iterations of the optimizer. We set the optimizer convergence conditions to be met when the precision of the variational parameters reaches the same order as the measured 2Q gate errors, $1 \times 10^{-3}$.
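The alternating structure of the QAOA circuit can be made concrete with a small statevector sketch. This is an illustration only: it assumes the cost Hamiltonian $H_C = \sum_{(i,j)\in E} Z_i Z_j$ (so that lower energy means a larger cut), uses the complete graph on four vertices as a toy 3-regular instance, and replaces the BOBYQA optimizer with a coarse grid search. All function names are hypothetical.

```python
import numpy as np
from itertools import product

def zz_energies(n, edges):
    """Diagonal of the assumed cost Hamiltonian H_C = sum_{(i,j) in E} Z_i Z_j."""
    bits = np.array(list(product([0, 1], repeat=n)))
    spins = 1 - 2 * bits                      # bit 0 -> +1, bit 1 -> -1
    return sum(spins[:, i] * spins[:, j] for i, j in edges).astype(float)

def qaoa_energy(n, edges, gammas, betas):
    """<H_C> after alternating cost and mixer layers applied to |+>^n."""
    energies = zz_energies(n, edges)
    psi = np.full(2 ** n, 1 / np.sqrt(2 ** n), dtype=complex)
    for gamma, beta in zip(gammas, betas):
        psi = np.exp(-1j * gamma * energies) * psi        # U_C(gamma), diagonal
        rx = np.array([[np.cos(beta), -1j * np.sin(beta)],
                       [-1j * np.sin(beta), np.cos(beta)]])  # exp(-i beta X)
        psi = psi.reshape([2] * n)
        for q in range(n):                                 # U_B(beta), qubit by qubit
            psi = np.moveaxis(np.tensordot(rx, psi, axes=([1], [q])), 0, q)
        psi = psi.reshape(-1)
    return float(np.real(np.vdot(psi, energies * psi)))

def cut_size(energy, n_edges):
    """Cut size of a bitstring with ZZ energy `energy`: (|E| - energy) / 2."""
    return (n_edges - energy) / 2

# Toy instance: K4 is the smallest 3-regular graph.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
grid = np.linspace(0, np.pi, 25)
best = min(qaoa_energy(4, edges, [g], [b]) for g in grid for b in grid)
max_cut = cut_size(zz_energies(4, edges).min(), len(edges))
```

On hardware the energy is estimated from sampled bitstrings rather than from the full statevector, which is what makes the optimizer's tolerance to shot noise relevant.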
We study two separate experiments in this work. The first implements a larger-scale MaxCut QAOA problem ($N = 130$, $p = 1$) on 32 physical qubits using qubit-reuse compilation [36] and 100 shots per circuit. The second experiment solves an $N = 32$ MaxCut QAOA problem at $p = 2$ with 200 shots per circuit to demonstrate an improvement in solution quality compared to $p = 1$ with the more expressive and deeper ansatz. For plotting purposes, the energy was rescaled by a sign so that in all cases the optimum corresponds to the solution of minimum energy.
In Fig. 12 we display the results from the $N = 130$, $p = 1$ experiment. The optimizer shows convergence within the first ten circuits. Using the tensor network methods available in the Python library quimb [89] in conjunction with the global Bayesian optimizer in scikit-optimize [90], we also exactly evaluated the best average energy possible for any $p = 1$ circuit. The convergence of the blue data to the green line in Fig. 12 demonstrates that the optimization procedure succeeded in locating the optimal parameters and that H2 evaluated the circuits with sufficiently low noise to nearly saturate the best possible result. In App. C 2 we also display the optimization trace on the energy landscape, further confirming that the optimizer succeeded in locating the optimal parameters. To evaluate the performance of the algorithm in solving the combinatorial problem, we also compare the minimum value of the energy sampled in any given shot to the exact value of the max cut computed in gurobi [91]. As expected, since the circuit depth is only $p = 1$, the best cut value found on H2, 148, is substantially less than the exact value of 178. Nevertheless, this experiment represents substantial progress towards solving industry-scale combinatorial problems with QAOA on small quantum computers.
In Fig. 13 we show the results of the $N = 32$, $p = 2$ optimization procedure. Compared to the best average energies possible for any $p = 1$ or $p = 2$ circuits, the experimental data for $p = 2$ consistently performs better than the best possible $p = 1$ circuit and is close to saturating the ground state energy for $p = 2$ circuits. Furthermore, H2 succeeded in locating solutions with the best possible max cut of 42 for this graph.

C. Error correction: repetition code
Large quantum computations are widely thought to only be possible through quantum error correction (QEC). Therefore, in the context of fault-tolerant quantum computers, perhaps the most important quantum algorithm is not a particular targeted calculation, but rather the QEC algorithm being run in the background. Additionally, given the large resource overheads of QEC, the design requirements for large-scale quantum computers will likely be driven by the optimization of these codes' power and efficiency, highlighting the importance of closely related benchmarks.
Many QEC schemes are based on stabilizer codes that encode logical information into the joint subspace of many physical qubits, known as data qubits. Additional physical qubits, known as ancilla qubits, are used to make non-destructive syndrome measurements [94] which discretize errors into a manageable set of bit and phase errors, allowing for general QEC. Repetition codes are examples of stabilizer codes but can only correct a single type of error, typically either bit or phase flip errors. However, they make good benchmark algorithms since they possess all the components needed to implement a quantum code. Specifically, a distance $d$ repetition code can reliably correct up to $(d-1)/2$ errors. Corrections are determined by repeatedly measuring stabilizers of the code using MCMR, syndromes are decoded using algorithms similar to those used in quantum codes, and calculating logical fidelities is done in the usual way. Using all 32 qubits, we implement a $d = 31$ repetition code with 31 data qubits and one ancilla, maximizing the code distance that can be tested. This low-overhead implementation of the code is made possible by H2's qubit reuse capabilities and performing 30 unique stabilizer measurements serially.
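The statement that a distance $d$ repetition code corrects up to $(d-1)/2$ errors can be illustrated with a few lines of code. The sketch below uses a simple majority vote on the final data-qubit readout, which is a deliberately simplified stand-in for the minimum-weight perfect-matching decoder used in the experiment, and considers only a single noiseless round of bit-flip stabilizers. The function names are hypothetical.

```python
def max_correctable(d):
    """Number of bit flips a distance-d repetition code can correct."""
    return (d - 1) // 2

def syndrome(data_bits):
    """Parities of neighboring data qubits: the d - 1 stabilizer outcomes."""
    return [a ^ b for a, b in zip(data_bits, data_bits[1:])]

def majority_decode(data_bits):
    """Decode the logical bit by majority vote (a simplification, not MWPM)."""
    return int(sum(data_bits) > len(data_bits) / 2)

d = 31
codeword = [0] * d                      # logical zero

# Flip the maximum correctable number of qubits (15 for d = 31).
correctable = codeword.copy()
for q in range(max_correctable(d)):
    correctable[2 * q] ^= 1

# One additional flip pushes the error past the code's capability.
uncorrectable = correctable.copy()
uncorrectable[-1] ^= 1

recovered = majority_decode(correctable)       # logical value survives
failed = majority_decode(uncorrectable)        # logical error
```

For $d = 31$ the `syndrome` list has 30 entries, matching the 30 stabilizers that are measured serially with the single ancilla.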
The syndrome measurements are processed in real time using Wasm calls to the classical compute environment during the quantum circuit. At the end of the circuit, the data qubits are also measured and used to construct a final syndrome measurement. We use this last syndrome in addition to all previously recorded syndromes to decode the logical output state and calculate the logical fidelities. The decoding uses a minimum-weight perfect-matching algorithm [95,96], which is performed online at the end of the circuit as part of the control-system software execution of each shot (i.e., while the hybrid quantum/classical program is still being executed on the actual device). Since this is done after the logical qubit has been measured, the operation does not perform mid-circuit real-time decoding, making these experiments insensitive to memory error associated with the computation time of a correction. Real-time decoding operations are possible with Wasm and the advanced classical compute environment infrastructure, but they are unnecessary for repetition code memory experiments.
Experiments on both the $d = 31$ bit flip code and phase flip code were performed while varying the number of rounds of syndrome extraction, and recording all syndrome measurements, allowing us to process subsets of the code after the program completes. The subsets allow us to reconstruct logical fidelities for all odd-distance codes less than $d = 31$. These measurements are similar to Refs. [97,98], which use a fixed architecture and parallel syndrome measurements, making for a direct comparison of different distances. In contrast, our architecture offers a less direct comparison between different code distances, as syndrome measurements are done serially, but allows for larger distance codes with lower qubit overheads. We note that the Wasm decoder was only used to calculate the $d = 31$ fidelities. All other code distance fidelities were calculated by the same minimum-weight perfect-matching algorithm offline.
The experimental results in Fig. 14 show the larger distance codes achieve higher logical fidelities as expected, with the bit flip code producing a higher logical fidelity compared to the phase flip code for a given distance, consistent with a biased noise environment. These results demonstrate many of the necessary components for implementing scalable, real-time QEC, and show how the capabilities of the H2 system can help realize large distance stabilizer QEC codes, all of which will be the subject of future studies.

D. Holographic quantum dynamics simulation (HoloQUADS)
High-fidelity MCMR is crucial for quantum error correction, and can also help expand the reach of many near-term algorithms [36]. In particular, such techniques have been shown to enable the simulation of quantum dynamics from initially correlated states directly in the thermodynamic limit, with qubit number requirements set by the evolving entanglement entropy of the state rather than its physical size [99]. Based on work in Refs. [100,101], Ref. [102] recently proposed and demonstrated a benchmark for such methods by simulating exactly solvable dual-unitary circuit models applied to initial matrix-product states on H1-1. Here we use the additional resources of H2 to extend those results to longer evolution times, where the system contains more entanglement.
Following Ref. [102], we simulate time evolution under dual-unitary circuits [100,101], which are one-dimensional brick-work circuits (Fig. 15a) having generic properties of typical circuits (e.g., exhibiting quantum chaos and ballistic growth of entanglement [100]) and certain non-generic properties (e.g., their correlations spread at the maximal possible velocity [103], and are confined to the light-cone boundary rather than its interior), which allow quantities such as entanglement entropy and correlation functions to be analytically determined [100,101,104]. An initial matrix product state is prepared by applying gates between the physical qubits and an ancilla "bond" qubit (blue gates in Fig. 15a,b) and then time-evolving this state by the self-dual kicked Ising (SDKI) model [100,105] (green gates in Fig. 15a,b). After $t$ layers of SDKI gates are applied to $|\psi_0\rangle$, we measure the smoothed correlation functions, where $|\psi_t\rangle$ is the time-evolved state and $L$ is the system size, and in Fig. 15c we compare the results to exact theoretical calculations from Ref. [101]. We use H2's 32 qubits to simulate $t = 0, 8, 16, 24$ layers of time-evolution applied to a length $L = 128 + t = 128, 136, 144, 152$ matrix product state.
The experimental data show close agreement with ideal noiseless results, suggesting H2's mid-circuit measurement, mid-circuit reset, crosstalk, and memory errors are low enough for sizeable quantum dynamics simulations using HoloQUADS. We note that effects of errors can be highly circuit dependent. For the particular dual-unitary circuit studied in this benchmark, its maximal velocity behavior [103] causes only $\propto t$ Pauli errors along the edges of the causal cones of qubits $i$ and $j$ to affect the $\langle\psi_t|X_i X_j|\psi_t\rangle$ correlation function. For a generic circuit, we would expect any of the $\propto t^2$ Pauli errors in the causal cones to affect correlation functions, meaning dual-unitary circuits are less sensitive to errors than typical circuits.

VI. A SUMMARY OF THE RESULTS AND OUR OUTLOOK
The H2 quantum computer is a significant upgrade from our previous H1 system, maintaining or exceeding many previous fidelity metrics while operating on more qubits. The clearest manifestation of the robust scaling of our QCCD architecture is that the system-level benchmarks are consistent with the errors measured by the component benchmarks. We also benchmarked H2's performance on a variety of applications that are widely considered to be well-suited for near-term quantum computers, with the goal of assessing the feasibility of such algorithms given current hardware performance metrics. Our benchmarking results show that 2Q gates remain the dominant error source, although fidelities improved slightly in our new generation. However, the transport time between arbitrary circuit layers did increase, which translates to larger memory error. Future work will focus on reducing both of these error sources by improvements to laser systems, transport speed, and magnetic field stability.
The new system also demonstrates a number of key technological milestones on the path to scaling, including ion transport controlled via broadcast electrode signals, RF signals routed under the surface of the trap, and fast MOT-based loading.
These improvements are achieved in a system initially configured to operate with 32 qubits (but designed to accommodate more) and collectively bolster the case for the viability of the QCCD architecture as a route to large-scale trapped-ion quantum computing. The further development of the QCCD architecture will include truly two-dimensional trapping structures for fast ion sorting [20], as well as moving beyond free-space optical delivery.

This captures only the errors in the computational subspace and not leakage errors, which are measured with the leakage detection gadget. The reset and measurement crosstalk decay functions are fit to functions derived from error models of their respective operations in Ref. [15], in which $A_{M/R}$ are the SPAM fit parameters for each method and $r_{M/R}$ is the rate of measurement/reset crosstalk scattering. Each scattering rate is then converted to average infidelity, $\epsilon_M = 5r_M/6$ (A5).

For all component measurements a final combined estimate is obtained by performing the RB (or crosstalk) analysis on a combined dataset between all measured qubits, which is reported in Table II. For example, in 1Q RB the combined dataset is obtained by treating each qubit measurement as a single sequence randomization and performing the RB fitting averaged over every qubit's random sequences. This leads to an RB experiment with $8 \times 40$ random sequences for each length. For the crosstalk and SPAM measurements the combined dataset is obtained by adding all circuit output counts together. Zone-specific data for each component testing experiment is shown in Table VI. Decay plots for each component benchmark are shown in Figs. 4, 16, 17, 18, 19, and 20.

Leakage detection gadget
The leakage rate $r_L$ is defined as the rate that population leaves the computational subspace due to a process $\Lambda$, based on Ref. [109], with $\mathbb{1}_{L/C}$ denoting the identity operator on the leakage/computational subspace. The number of leakage detection events is fit to the model as shown in Figs. 4b and 16b. Gate errors in the leakage detection gadget can cause false-positive or false-negative detection events, but these only contribute to the parameter $A$, as they are independent of the sequence length $\ell$, similar to the SPAM parameter in RB.

2Q parameterized randomized benchmarking
To measure the average infidelity of $U_{ZZ}(\theta)$ as a function of $\theta$, we use direct RB. In standard RB, the unitaries comprising the RB sequence are sampled from a unitary 2-design, such as the Clifford group or SU($2^N$). In contrast, direct RB samples unitaries from a set of native gates that generate the group [49,110]. Under such circumstances the survival probability will still approximate an exponential decay with decay parameter linearly related to the average fidelity [111]. For nonzero $\theta$, the direct gate set generates SU(4). The inversion unitary is applied by decomposing the resulting SU(4) element into three $U_{ZZ}(\pi/2)$ 2Q gates using a standard decomposition [112]. A final random Pauli is applied to randomize the survival state. The decay curves are shown in Fig. 21, and the average fidelity is obtained by fitting to (A1).
The decay curves for nonzero $\theta$ are fit to the model $p(\ell) = A r^{\ell} + 1/4$, and the average infidelity is computed by Eq. (A2). For $\theta = 0$, the average infidelity is computed by the procedure described in App. A 1. The average infidelity versus $\theta$ is shown in Fig. 3.
In addition to positive values of $\theta \in \{\pi/8, \pi/4, 3\pi/8, \pi/2\}$, we also run a direct RB experiment with $\theta$ very close to 0 (specifically $2 \times 10^{-4}$), to measure the baseline error due to the MS wrapper pulses and memory error accumulated during the cooling pulses. However, for $\theta = 0$, the direct gate set reduces to SU(2)$\otimes$SU(2), which no longer generates a unitary 2-design, and the RB theory leading to a single exponential decay no longer applies. To estimate the fidelity in this case, we use the fact that the action of SU(2)$\otimes$SU(2) decomposes as a direct sum of 4 irreducible representations (irreps). (A good introduction to representation theory as it applies to RB is in Ref. [113].) The irreps are the span of the identity $II$, the spans of weight-1 Pauli operators on each qubit $\{IX, IY, IZ\}$ and $\{XI, YI, ZI\}$, and the span of the weight-2 Pauli operators. We let $\lambda \in \{II, IZ, ZI, ZZ\}$ label these irreps. If $E$ is the error channel for $U_{ZZ}(\theta \approx 0)$, then the twirl of $E$ over SU(2)$\otimes$SU(2) is a linear combination of projectors onto these four irreps:

$$\int dU\, \phi(U)^{\dagger}\, E\, \phi(U) = \sum_{\lambda} r_{\lambda} \Pi_{\lambda},$$

where $\phi$ is the superoperator representation of SU(2)$\otimes$SU(2), and $\Pi_{\lambda}$ is the projector onto the irrep $\lambda$. The survival probability at sequence length $\ell$ is then given by

$$p(\ell) = \sum_{\lambda} A_{\lambda} r_{\lambda}^{\ell}. \qquad \text{(A10)}$$

We use the fact that $r_{II} = 1$ for trace-preserving maps, and the randomization in the survival state to fix $A_{II} = 1/4$. To reduce the number of exponential decay curves in the fit, we assume qubit symmetry in the error channel, that is, $r_{IZ} = r_{ZI} = r_1$. Relabeling the SPAM parameters and defining $r_2 = r_{ZZ}$, the decay model is then given by

$$p(\ell) = A_1 r_1^{\ell} + A_2 r_2^{\ell} + \frac{1}{4}.$$

The entanglement (or process) fidelity is given by

$$F_e = \frac{1}{16}\left(1 + 6 r_1 + 9 r_2\right).$$

The average infidelity is related to the entanglement fidelity for any $d$-dimensional trace-preserving error [114],

$$\epsilon = \frac{d}{d+1}\left(1 - F_e\right). \qquad \text{(A13)}$$
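The conversion from the two fitted decay rates to an average infidelity can be written in a few lines. The sketch below assumes the entanglement fidelity is the dimension-weighted average of the irrep decay rates, with dimensions 1 ($II$), 3 + 3 ($IZ$, $ZI$), and 9 ($ZZ$), and uses the standard relation between entanglement fidelity and average infidelity for a $d$-dimensional system. The function names and the numerical inputs are hypothetical and are not fitted values from the experiment.

```python
def survival(length, a1, r1, a2, r2):
    """Two-exponential decay model with the identity irrep fixed to 1/4."""
    return a1 * r1 ** length + a2 * r2 ** length + 0.25

def entanglement_fidelity(r1, r2):
    """Dimension-weighted average of irrep decay rates for two qubits."""
    return (1.0 + 6.0 * r1 + 9.0 * r2) / 16.0

def average_infidelity(f_ent, dim=4):
    """Average infidelity from entanglement fidelity for a dim-level system."""
    return dim * (1.0 - f_ent) / (dim + 1.0)

# Hypothetical decay rates, equal on all non-identity irreps (depolarizing noise).
r = 0.996
eps = average_infidelity(entanglement_fidelity(r, r))
```

For equal decay rates the expression reduces to $\epsilon = 3(1 - r)/4$, the familiar single-exponential two-qubit RB conversion, which serves as a consistency check on the weights.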
Appendix B: Details of system-level benchmarks

Mirror benchmarking
Table VII lists the survival probabilities, decay parameter, and effective 2Q average infidelity $\epsilon_{2Q,\mathrm{eff}}$ for the MB experiment. The average survival probability as a function of sequence length $\ell$ is fit to the model

$$p(\ell) = A u^{\ell-1}. \qquad \text{(B1)}$$

Let $E$ be an $N$-qubit error channel. Let $\{P_i\}_i$ be the $N$-qubit Pauli operators with $P_0 = I$. The $i$-th Pauli fidelity of $E$ is defined as

$$f_i = \frac{1}{2^N}\mathrm{Tr}\left[P_i\, E(P_i)\right].$$

By applying Pauli randomization to the 2Q gates in the MB circuits, the error channel for each circuit layer can be assumed to be a stochastic Pauli channel [57]. Assuming a constant stochastic Pauli error channel $E$ per circuit layer, it was shown in Ref. [52] that the decay parameter $u$ is equal to the mean square of the non-identity Pauli fidelities:

$$u = \frac{1}{4^N - 1}\sum_{i \neq 0} f_i^2.$$

For a constant depolarizing error channel on each 2Q gate, $u$ is given by an analytic formula (Eq. (C4) in Ref. [52]). After best-fitting the experimental decay curves to obtain $u$, this formula is used to extract $\epsilon_{2Q,\mathrm{eff}}$.
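As a small illustration of the two steps above, the sketch below computes the decay parameter as the mean square of non-identity Pauli fidelities and recovers it from synthetic survival data. It assumes the decay model $p(\ell) = A u^{\ell-1}$ and a log-linear least-squares fit, which is a simplification chosen here and not necessarily the fitting procedure used for Table VII. The function names and numbers are hypothetical.

```python
import numpy as np

def decay_parameter(pauli_fidelities):
    """Mean square of the non-identity Pauli fidelities (index 0 is identity)."""
    f = np.asarray(pauli_fidelities, dtype=float)
    return float(np.mean(f[1:] ** 2))

def fit_decay(lengths, survival):
    """Fit p(l) = A * u**(l - 1) by linear regression on log p."""
    x = np.asarray(lengths, dtype=float) - 1.0
    slope, intercept = np.polyfit(x, np.log(survival), 1)
    return float(np.exp(intercept)), float(np.exp(slope))

# Two-qubit example: every non-identity Pauli fidelity equal to 0.99.
u_true = decay_parameter([1.0] + [0.99] * 15)

# Noise-free synthetic decay curve generated from the assumed model.
lengths = np.array([1, 2, 4, 8, 16])
data = 0.95 * u_true ** (lengths - 1)
a_fit, u_fit = fit_decay(lengths, data)
```

With real data the survival probabilities carry shot noise, so a weighted nonlinear fit would usually be preferred over the log-linear shortcut used here.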

Quantum volume
In addition to the QV $= 2^{16}$ dataset presented in the main text, we also ran several smaller QV tests. In Fig. 22 we show the next largest, QV $= 2^{15}$, test. This test was run with 100 random circuits, each with 50 shots and containing an average of 243 parameterized 2Q gates. The measured heavy output probability was 70.9%, above the threshold with over two-sigma confidence calculated from the semi-parametric bootstrap resampling method.
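The heavy output probability quoted above can be illustrated with a short sketch. It assumes the usual definition in which heavy outputs are the bitstrings whose ideal probability exceeds the median of the ideal distribution, and it omits the bootstrap confidence interval. The function names and the toy distribution are hypothetical.

```python
import numpy as np

def heavy_outputs(ideal_probs):
    """Indices of bitstrings whose ideal probability exceeds the median."""
    ideal_probs = np.asarray(ideal_probs, dtype=float)
    return set(np.flatnonzero(ideal_probs > np.median(ideal_probs)).tolist())

def heavy_output_probability(ideal_probs, samples):
    """Fraction of measured bitstrings (as integers) that are heavy."""
    heavy = heavy_outputs(ideal_probs)
    return sum(s in heavy for s in samples) / len(samples)

# Toy two-qubit example: ideal distribution and eight measured shots.
ideal = [0.4, 0.3, 0.2, 0.1]
shots = [0, 0, 1, 2, 3, 1, 0, 1]
hop = heavy_output_probability(ideal, shots)
passes_threshold = hop > 2 / 3
```

A full QV test averages this quantity over many random circuits and requires the lower edge of the confidence interval, not just the mean, to exceed 2/3.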
To infer an effective 2Q error from QV data, we first convert the measured heavy-output probability to a circuit fidelity based on Eq. 13 in Ref. [54]. We then scale this based on the SPAM error and average number of 2Q gates as shown in Eq. (B6).

Random circuit sampling
The definition of the linear cross-entropy benchmarking fidelity is

$$F_{\mathrm{XEB}} = 2^{N}\langle P(x_i)\rangle - 1, \qquad \text{(B4)}$$

where $P(x_i)$ is the probability of measuring the output bitstring $x_i$ in the ideal output distribution, and the expectation value is taken over the empirically measured bitstrings. The linear cross-entropy fidelity is a measure of the correlation between the empirical output distribution and the ideal output distribution. Consequently, this requires exact classical simulation of the random circuits, which is a major obstacle to scalability of the benchmark. The uncertainty on the linear cross-entropy fidelity for each circuit can be obtained from (B4) by combining the variance estimator for $P(x_i)$ with the standard uncertainty-on-the-mean formula, namely,

$$\mathrm{var}(F_{\mathrm{XEB}}) = 2^{2N}\,\frac{\mathrm{var}(P_i)}{N_{\mathrm{shots}}}. \qquad \text{(B5)}$$

In Fig. 8 we report a fit for the linear cross-entropy benchmarking fidelity on H2 as a function of $N$. This fit was obtained by the following procedure. At each fixed $N$, a representative random circuit was generated and compiled with pytket to obtain an expected number of 2Q operations. We note that the final number of 2Q $U_{ZZ}$ operations in each circuit is equal to the number of fSim($\pi/2$, $\pi/6$) gates in the original uncompiled circuit. The overall model for the linear cross-entropy fidelity is then expressed in terms of $F_{2Q}$ and $\epsilon_{\mathrm{SPAM}}$. Here $F_{2Q}$ represents the effective entanglement (or process) fidelity of two-qubit operations, while $\epsilon_{\mathrm{SPAM}}$ is the SPAM error as measured by component benchmarking. The conversion between entanglement fidelities and average infidelities as obtained via component benchmarking in Table VI is given in Eq. (A13).
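The estimator and its uncertainty can be sketched as follows, assuming the linear cross-entropy fidelity takes the form $F_{\mathrm{XEB}} = 2^N \langle P(x_i)\rangle - 1$ and that the variance of $P$ is estimated from the sampled bitstrings with the unbiased sample variance. The function name and the toy inputs are hypothetical.

```python
import numpy as np

def linear_xeb(ideal_probs, samples, n_qubits):
    """Return (F_XEB, var(F_XEB)) from ideal probabilities and measured bitstrings.

    `ideal_probs[x]` is the ideal probability of bitstring x (as an integer);
    `samples` holds one measured bitstring per shot.
    """
    p = np.asarray(ideal_probs, dtype=float)[np.asarray(samples)]
    fidelity = 2 ** n_qubits * p.mean() - 1
    variance = 2 ** (2 * n_qubits) * p.var(ddof=1) / len(p)
    return float(fidelity), float(variance)

# Uniform ideal distribution: every sample carries the same probability,
# so the estimator returns zero fidelity with zero spread.
f_uniform, v_uniform = linear_xeb([0.25] * 4, [0, 1, 2, 3], n_qubits=2)

# Peaked toy distribution sampled only on its support.
f_peaked, v_peaked = linear_xeb([0.5, 0.5, 0.0, 0.0], [0, 1, 0, 1], n_qubits=2)
```

The `ideal_probs` array is what requires exact classical simulation, which is the scalability bottleneck noted in the text.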
Appendix C: Details of application benchmarks

Trotter steps of Hamiltonian simulation experiment
Here we provide details on the Trotter steps used in the experiment. The Trotter steps $r$ are determined by relative convergence with tolerance 0.0025, i.e., we choose a cutoff $r$ such that for $r' \geq r$, neighboring steps are within the threshold $|\langle X\rangle_{r'+1} - \langle X\rangle_{r'}| \leq 0.0025$, where $\langle X\rangle_{r'}$ is the $X$ expectation value after $r'$ steps of propagation in a noiseless circuit. For this purpose, we compute each $\langle X\rangle_{r'}$ exactly via a discrete-time Jordan-Wigner transformation in the Heisenberg picture [82]. We checked that this 0.0025 relative error tolerance provides an absolute $\sim 1\%$ Trotter error tolerance in $|\langle X\rangle_r - \langle X\rangle|$ for the times we simulate, which is at the scale of the expected $\sim 1\%$ statistical fluctuation in the experiment. We chose these values because further improvements from lowering Trotter error would not be reliably observable even if the circuit were completely noiseless, though we did not choose them in a noise-aware fashion (further lowering of the number of Trotter steps used may well give further improvements given the presence of gate errors). The steps and the corresponding Trotter errors are shown in Fig. 23a,b. The difference between the experiment data and the exact value is shown in Fig. 23c.
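The relative-convergence rule can be sketched as a short routine. It takes a precomputed list of noiseless expectation values indexed by the number of Trotter steps and returns the smallest step count beyond which all neighboring differences stay within the tolerance. The function name and the example values are hypothetical and do not come from the experiment.

```python
def choose_trotter_steps(values, tol=0.0025):
    """Smallest r with |values[s + 1] - values[s]| <= tol for every s >= r.

    `values[s]` is the noiseless expectation value after s Trotter steps;
    index 0 is unused so that list indices match step counts. If even the
    last pair of values differs by more than `tol`, the largest available
    step count is returned.
    """
    r_max = len(values) - 1
    cutoff = r_max
    for r in range(r_max - 1, 0, -1):
        if abs(values[r + 1] - values[r]) > tol:
            break
        cutoff = r
    return cutoff

# Hypothetical convergence sequence for steps 1..6 (index 0 is a placeholder).
expectations = [None, 0.5, 0.6, 0.65, 0.651, 0.652, 0.6525]
r_chosen = choose_trotter_steps(expectations)
```

Scanning from the largest step count downward enforces the "for all larger step counts" condition, so an accidental small difference at low step counts is not mistaken for convergence.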

The QAOA optimization landscape
In Fig. 12 and Fig. 13 in the main text, uncertainties were computed on the expectation value of the energy $\langle H_C\rangle$ as evaluated on H2 (blue points). These uncertainties were computed by bootstrap resampling via the reverse-percentile method [115], and quantify the uncertainty due to shot noise, but not physical noise sources on the machine. We emphasize that the different data points in Fig. 12 and Fig. 13 are evaluated at different values of the parameters $\beta$ and $\gamma$.
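A minimal sketch of a reverse-percentile bootstrap interval for a shot-averaged energy is given below. The number of bootstrap resamples, the confidence level, the random seed, and the function name are all choices made here for illustration and are not taken from the analysis in the main text.

```python
import numpy as np

def reverse_percentile_interval(samples, n_boot=2000, alpha=0.05, seed=0):
    """Return (lower, estimate, upper) for the mean of per-shot energies."""
    samples = np.asarray(samples, dtype=float)
    rng = np.random.default_rng(seed)
    estimate = samples.mean()
    # Resample shots with replacement and recompute the mean each time.
    idx = rng.integers(0, len(samples), size=(n_boot, len(samples)))
    boot_means = samples[idx].mean(axis=1)
    lo_q, hi_q = np.quantile(boot_means, [alpha / 2, 1 - alpha / 2])
    # The reverse-percentile interval reflects the bootstrap quantiles
    # about the point estimate.
    return float(2 * estimate - hi_q), float(estimate), float(2 * estimate - lo_q)

# Hypothetical per-shot energies for a single circuit.
energies = np.arange(100) % 7
lower, mean_energy, upper = reverse_percentile_interval(energies)
```

Because only the recorded shots are resampled, the interval reflects shot noise alone, consistent with the caveat that physical noise sources are not captured.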
In Fig. 24 we display the full optimization trace on the energy landscape for the N = 130, p = 1 MaxCut QAOA instance described in the main text, further justifying that the classical optimizer successfully converged to the minimum value of the energy.

Details of HoloQUADS experiment
Here we provide additional details on the holographic quantum dynamics experiments performed on H2. We consider an initial matrix product state of the form H2 are used as in-parallel as possible. The leakage detection gadget (Fig. 5) is used to discard results where the bond qubit was measured to have leaked (measuring 2%, 2%, 5%, 7% bond qubit leakage for $t = 0, 8, 16, 24$).
We also use a circuit identity to construct each gate U with a single parameterized U ZZ (θ) gate and one physical SWAP [102].

FIG. 2. Overview of the H2 trap including upgrades in trap design and gating operations. (a) A 2D MOT produces a collimated beam of atoms, allowing for higher neutral atom density and faster loading than an effusive oven. (b) abc tiling of electrodes for conveyor belt transport. (c) RF tunnels to implement inner and outer RF electrodes. Ions are trapped 70 µm below the trap surface. (d) Colored top metal layer of H2 trap. Green curved zones are conveyor belt regions for ion storage. Bottom blue zones are DG01-DG04 (from left to right), used for quantum operations. Top blue zones are UG01-UG04 gate zones (from right to left), used for sorting but not quantum operations. Darker grey loops are RF electrodes. Yellow circles represent qubits that are gated while red circles represent qubits sitting in storage during gates (note that $^{138}$Ba$^+$ ions are omitted for simplicity). Yellow arrows indicate the Doppler sheet beam direction while blue arrows indicate the Doppler repump sheet beam direction. (e) Ion configuration and beam direction for 2Q gates. Large orange circles represent $^{171}$Yb$^+$ while smaller purple circles represent $^{138}$Ba$^+$. (f) Ion configuration and beam directions for 1Q gates on left $^{171}$Yb$^+$. (g) Ion configuration and beam directions for state preparation and measurement (SPAM) operations on left $^{171}$Yb$^+$ with micromotion hiding on right $^{171}$Yb$^+$ [15]. (h) Storage ion configuration in conveyor belt region.

FIG. 3. Average infidelity as a function of angle for the parameterized 2Q gate $U_{ZZ}(\theta)$. Each data point is obtained by fitting the decay curves shown in Fig. 21 to an exponential decay function. The infidelity at $\theta = 0$ is due to both the wrapper pulses and memory error incurred during the cooling pulses, which are still applied in the absence of an MS gate. The linear best-fit to the zone-averaged data is given by $\epsilon(\theta) = [2.9(2)\,\theta/\pi + 0.46(6)] \times 10^{-3}$.

FIG. 4. 2Q randomized benchmarking decay curves for each zone and for the combined average across all zones. (a) Standard RB decay curve. The average infidelity per 2Q gate is $1.83(5) \times 10^{-3}$ across all four gate zones. (b) Decay of fraction of shots without leakage on either qubit as identified by the leakage detection gadget, which gives a measured leakage rate per 2Q gate of $3.9(2) \times 10^{-4}$ across all four gate zones.

FIG. 7. QV $= 2^{16}$ quantum volume measurement on H2. The average and two-sigma confidence interval of the heavy-output probability are plotted as a function of the circuit index. Passing occurs when the green shaded region (two-sigma confidence interval from semi-parametric bootstrap method) is above the dashed grey line at 2/3, which we satisfy in a dataset with 200 randomly generated circuits.

FIG. 8. Linear cross-entropy benchmarking fidelity as measured on H2 for classically verifiable random circuits. Each data point displays the combined results from 10 circuits each executed with 100 shots. The details of the best-fit curve are described in App. B 3.

s 9 r k 3 VFIG. 9 .
FIG. 9. GHZ state preparation circuits for (a) log-depth unitary and (b) constant-depth adaptive preparation, shown here for $N = 8$ for simplicity.

FIG. 10. Populations and parities of $N = 20$, 26, and 32-qubit GHZ states constructed with a log-depth unitary protocol, and of a 32-qubit GHZ state produced with a constant-depth adaptive circuit. (a) Populations of $|0\rangle^{\otimes N}$ and $|1\rangle^{\otimes N}$. The ideal GHZ state has populations of 0.5 in these two states and zero in all other states. (b) Expectation values of the operator $M_k$ defined in Eq. 7, plotted versus angle $\theta_k = k\pi/N$. The ideal GHZ state has values of 1 and $-1$ for even and odd $k$, respectively. The dashed lines denote the averages.
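
As an illustrative check of the ideal values quoted in this caption, the sketch below builds a small GHZ state numerically and evaluates its populations and parities. Eq. 7 is not reproduced in this section, so the form $M_k = \bigotimes_j (\cos\theta_k X_j + \sin\theta_k Y_j)$ used here is an assumption (a common choice for GHZ parity measurements), as are the function names and the small $N = 4$ example size.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)


def ghz_state(n):
    """Return (|0...0> + |1...1>)/sqrt(2) as a vector of length 2**n."""
    psi = np.zeros(2**n, dtype=complex)
    psi[0] = psi[-1] = 1 / np.sqrt(2)
    return psi


def parity_operator(n, theta):
    """ASSUMED form of M_k: n-fold tensor product of cos(theta) X + sin(theta) Y."""
    single = np.cos(theta) * X + np.sin(theta) * Y
    op = np.eye(1, dtype=complex)
    for _ in range(n):
        op = np.kron(op, single)
    return op


def ghz_populations(n):
    """Populations of |0...0> and |1...1> in the ideal GHZ state."""
    probs = np.abs(ghz_state(n)) ** 2
    return float(probs[0]), float(probs[-1])


def parity_expectations(n):
    """Expectation values <M_k> at theta_k = k*pi/n for k = 0, ..., n-1."""
    psi = ghz_state(n)
    values = []
    for k in range(n):
        m_k = parity_operator(n, k * np.pi / n)
        values.append(float(np.real(np.vdot(psi, m_k @ psi))))
    return values
```

Under this assumed operator, the ideal state gives $\langle M_k \rangle = \cos(N\theta_k) = (-1)^k$, which is the alternating pattern described in panel (b).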

FIG. 11. The dynamics of $\langle X \rangle$ for a 32-qubit TFIM Hamiltonian simulation vs. evolution time. The orange data is obtained by directly implementing each ZZ rotation in every Trotter step using our native parameterized-angle $U_{ZZ}(\theta)$ gate with $\theta = 2Jt/r$. The green data is obtained by decomposing each ZZ rotation into two $U_{ZZ}(\pi/2)$ (Clifford) gates with some 1Q rotations. Each data point is the average of 100 shots of the associated Trotterized circuit for that time.

FIG. 12. Optimization trajectory of $N = 130$, $p = 1$ QAOA computed via qubit reuse on H2. The expectation value of the energy as measured experimentally at $p = 1$ (blue) converges well to the best possible exact value (green). Uncertainties on the measured value of $H_C$ are plotted but are smaller than the displayed point size (see App. C 2 for details). The best sample taken at each iteration (orange) is also displayed relative to the true max cut (purple).

FIG. 13. Optimization trajectory of $N = 32$, $p = 2$ QAOA on H2. The expectation value of the energy as measured experimentally at $p = 2$ (blue) surpasses the best possible exact value for a $p = 1$ circuit (green, dashed). The best sample taken at each iteration (orange) is also displayed relative to the true max cut (purple).

FIG. 14. The logical fidelities of the phase (square, dotted) and bit (circle, dashed) flip repetition code as a function of distance. As the number of syndrome extraction (SE) rounds increases, more noise is injected into the system, degrading the logical fidelity. For a given number of rounds of SE, as the code distance increases, so does the logical fidelity. All error bars are calculated using jackknife resampling [92], except for those where the sampling number was too low to calculate an error (i.e., 100% fidelities), in which case the statistical rule of three was used [93].
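
The rule of three mentioned in this caption can be sketched as follows. When zero failures are seen in $n$ shots, an approximate 95% upper bound on the failure rate is $3/n$; the exact bound follows from solving $(1 - p)^n = 0.05$. The function names are hypothetical, and this is only a generic illustration of the rule, not the analysis code used for the figure.

```python
def rule_of_three_upper_bound(n_shots):
    """Approximate 95% upper bound on the failure rate after 0 failures in n_shots."""
    if n_shots <= 0:
        raise ValueError("n_shots must be positive")
    return 3.0 / n_shots


def exact_upper_bound(n_shots, confidence=0.95):
    """Exact bound: the rate p at which P(0 failures in n_shots) = 1 - confidence."""
    if n_shots <= 0:
        raise ValueError("n_shots must be positive")
    return 1.0 - (1.0 - confidence) ** (1.0 / n_shots)
```

For example, with 100 shots and no observed logical errors, both functions bound the logical error rate at roughly 3%, so the reported fidelity would carry a one-sided uncertainty of about that size.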

FIG. 15. (a) A one-dimensional brickwork circuit of length $L = 12$ with $t = 4$ layers of gates applied to a quantum matrix product state of bond dimension $\chi = 2^{n_b} = 2$. (b) HoloQUADS re-uses qubits through MCMR to execute the same circuit with a minimal number of qubits. Here we use $N_{\max} = 9$ qubits, but $N_{\max}$ can be adjusted between $n_b + t + 2 = 7$ (maximally serial) and $n_b + L = 13$ (maximally parallel). (c) The experimentally measured (dots) correlation function $C^{xx}(r, t)$ for a dual-unitary circuit applied to a length $L = 128 + t$ solvable $\chi = 2$ quantum matrix product state compared to the exact thermodynamic limit results (solid lines), using $N_{\max} = 32$ qubits up to time $t = 24$. Error bars are standard deviations of the mean from four 100-shot experiments.
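
The qubit-count bounds quoted in panel (b) can be checked with a few lines of arithmetic. The sketch below reads the bond dimension as $\chi = 2^{n_b}$ with $n_b = 1$, which is an inference from the quoted values 7 and 13; the function names are hypothetical.

```python
def bond_dimension(n_b):
    """Bond dimension of a matrix product state carried by n_b bond qubits."""
    return 2 ** n_b


def min_qubits_serial(n_b, t):
    """Qubits needed in the maximally serial limit quoted in the caption."""
    return n_b + t + 2


def max_qubits_parallel(n_b, length):
    """Qubits needed in the maximally parallel limit quoted in the caption."""
    return n_b + length
```

With $n_b = 1$, $t = 4$, and $L = 12$ these return 2, 7, and 13, so the $N_{\max} = 9$ used in panel (b) sits between the two limits.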

FIG. 16. 1Q RB data with parameters given in Table V. (a) Decay of survival probability. (b) Decay of unleaked fraction of shots.

FIG. 22. Quantum volume measurement of $\mathrm{QV} = 2^{15}$ on H2. The average and two-sigma confidence interval of the heavy-output frequency are plotted as a function of the circuit index. The green shaded region shows the two-sigma confidence interval from the semi-parametric bootstrap method.

FIG. 23. (a) The number of Trotter steps used at each simulation time. (b) The absolute Trotter error $|\langle X \rangle_r - \langle X \rangle|$ at each time, where $\langle X \rangle_r$ is the expectation value using $r$ Trotter steps under a noiseless circuit and $\langle X \rangle$ is the exact value at that time. (c) The data value relative to the exact value, $|\langle X \rangle_{\mathrm{exact}} - \langle X \rangle_{\mathrm{data}}|$, at each simulation time. The data error bars are included to reflect the signal-to-noise ratio.
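
The absolute Trotter error in panel (b) can be illustrated on a system small enough to simulate exactly. The sketch below is not the 32-qubit experiment: it assumes an open chain with $H = J\sum_i Z_i Z_{i+1} + h\sum_i X_i$, a first-order Trotter splitting, the initial state $|0\rangle^{\otimes n}$, and the gate convention $U_{ZZ}(\theta) = e^{-i\theta ZZ/2}$ (under which one step of length $t/r$ gives $\theta = 2Jt/r$). All function names and parameter values are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def embed(op, site, n):
    """Place a single-qubit operator on `site` of an n-qubit register."""
    out = np.eye(1, dtype=complex)
    for q in range(n):
        out = np.kron(out, op if q == site else I2)
    return out


def tfim_terms(n, J, h):
    """Return the ZZ part and the X part of the ASSUMED TFIM Hamiltonian."""
    zz = sum(embed(Z, i, n) @ embed(Z, i + 1, n) for i in range(n - 1))
    xs = sum(embed(X, i, n) for i in range(n))
    return J * zz, h * xs


def evolve(H, psi, t):
    """Apply exp(-i H t) to psi using an eigendecomposition of Hermitian H."""
    w, v = np.linalg.eigh(H)
    return v @ (np.exp(-1j * w * t) * (v.conj().T @ psi))


def mean_x(psi, n):
    """Site-averaged expectation value of X."""
    xs = sum(embed(X, i, n) for i in range(n)) / n
    return float(np.real(np.vdot(psi, xs @ psi)))


def zz_angle(J, t, r):
    """Rotation angle per Trotter step under the assumed U_ZZ convention."""
    return 2 * J * t / r


def trotter_error(n, J, h, t, r):
    """|<X>_r - <X>| for r first-order Trotter steps versus exact evolution."""
    h_zz, h_x = tfim_terms(n, J, h)
    psi0 = np.zeros(2**n, dtype=complex)
    psi0[0] = 1.0
    psi = psi0.copy()
    for _ in range(r):
        psi = evolve(h_zz, psi, t / r)  # all ZZ rotations of one step
        psi = evolve(h_x, psi, t / r)   # all X rotations of one step
    exact = evolve(h_zz + h_x, psi0, t)
    return abs(mean_x(psi, n) - mean_x(exact, n))
```

Calling `trotter_error(3, 1.0, 1.0, 1.0, r)` for increasing `r` shows the error shrinking as more steps are used, which is the trade-off against circuit depth that panel (a) reflects.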

FIG. 24. The energy landscape of an $N = 130$, $p = 1$ MaxCut QAOA problem. The optimization trace (pink) starts near the maximum in the top left and eventually converges into the potential well in the top right. Individual points at which circuits were evaluated are marked with stars. Note that the landscape is periodic, so the trajectory wraps around the sides of this plot.
. The agreement is not per- Table VII. For $N = 32$, we find [52]s per circuit. The average survival probabilities are fit to the model $p(\ell) = A u^{\ell - 1}$. The parameter $u$ is used to obtain an effective 2Q gate average fidelity for a constant 2Q depolarizing error [52].
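
A fit to the decay model $p(\ell) = A u^{\ell - 1}$ can be sketched with a log-linear least-squares regression. The sequence-length symbol $\ell$ is assumed here because it is missing from the extracted text, the function name and the synthetic inputs are hypothetical, and the fitting procedure actually used, as well as the conversion from $u$ to a 2Q fidelity described in [52], are not reproduced.

```python
import math


def fit_decay(lengths, survival):
    """Fit p(l) = A * u**(l - 1) by linear regression on log p versus (l - 1).

    Returns the pair (A, u). Assumes all survival probabilities are positive.
    """
    xs = [length - 1 for length in lengths]
    ys = [math.log(p) for p in survival]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return math.exp(intercept), math.exp(slope)
```

With noisy experimental data, a weighted nonlinear fit would usually be preferable, since taking logarithms distorts the uncertainties at low survival probability.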

TABLE V. Parameters used for component benchmarking tests. *The transport 1Q RB test is done with 32 qubits in parallel and repeated twice.

TABLE VI. Component benchmarking results for the tests outlined above. All values are average infidelities in units of $10^{-4}$. For 1Q RB, 1Q leakage rate, measurement and reset crosstalk, and SPAM, the brackets show the average infidelity for each side of the gate zone.
Our direct RB circuits are constructed by repeatedly applying $U_{ZZ}(\theta)$ (for a fixed value of $\theta$) interleaved with Haar random SU(2) gates on each qubit. For $\theta > 0$, this

FIG. 21. Decay curves for direct RB of parameterized 2Q gates. The different sets of curves show data for $\theta$ ranging from 0 to $\pi/2$ in increments of $\pi/8$. Each experiment used sequence lengths $\ell = 4$, 50, and 100, with 10 random circuits per sequence length. The circuits were run in parallel across the 4 gate zones and all circuits were run in a random order. The dashed curves are for individual zones, while the solid curve is the average over all zones. The decay curves for