Using Cryogenic CMOS Control Electronics To Enable A Two–Qubit Cross–Resonance Gate



I. INTRODUCTION
Next generation quantum computers will undergo a paradigm shift whereupon multi-qubit devices will predominantly perform fault tolerant quantum circuits. This new era of quantum computing will require orders of magnitude more qubits than are currently being integrated in today's systems [1]. For large quantum computing systems composed of solid state quantum processors (superconducting qubits or quantum dots), cryogenic control electronics are considered a key enabling technology [2][3][4]. There has been significant development in CMOS electronics for quantum dot processors [5][6][7][8], primarily due to the increased I/O demands and the potential for integrating CMOS electronics with qubits at temperatures > 100 mK. More recently, cryogenic CMOS electronics for superconducting qubits have been developed and used for single qubit gate demonstrations [9][10][11].
Research in cryogenic CMOS control electronics has primarily focused on achieving the low-power analog requirements to operate within the thermal load limitations of a dilution refrigerator (DR). While power dissipation is an important specification for cryogenic control technologies, minimalistic controllers are not sufficient for practical qubit control. For a fully cryogenic control architecture to be useful, a classical processor capable of producing relevant pulse sequences will be required. This classical processor presents an additional source of power dissipation, and specialized digital architectures will be needed for cryogenic integration to achieve the required performance within the limited available power budget.
Another important consideration for cryogenic control technologies is the maximum thermal load of the different stages in a dilution refrigerator. For thermalization of the CMOS chip at the T = 4 K plate, the power limitation of a cryogen-free DR is set by the second stage of a cryo-cooler, and the maximum thermal load is determined by the number of cryo-coolers in the DR. Notably, the maximum load of cryo-coolers is temperature dependent; they yield more cooling power at higher temperatures [12]. If the second stage were allowed to operate at higher temperatures, then the DR would be able to support more cryogenic control channels. Provided that higher operating temperatures of the second stage do not impact the cooling of other temperature stages, this strategy would make the integration of cryogenic control technologies more feasible.
Cryo-controlled architectures such as the one shown in Fig. 1(a) may also yield systematic scaling advantages that would warrant using low power CMOS ASICs. Examples of such advantages include: lower communication latency [14], lower thermal noise floor [15], reduced dispersion of base-band signals [16], reduction in RF loss due to signal delivery [17], wire count reduction from room temperature to the T = 4 K stage, a reduction in power per channel, a reduction in cost per channel, and a reduction in total system size.
In this manuscript, we present measurement results on a transmon-based quantum processor (QP) [18,19] controlled via a custom cryo-CMOS application specific integrated circuit (ASIC). The ASIC was designed for cryogenic operation and is capable of generating pulse sequences for qubit characterization, verification, and validation (QCVV) experiments common for use with transmon qubits. Here the CMOS chip is a dual-channel semi-autonomous qubit state controller fabricated in 14 nm FinFET technology [10,11]. A single channel block diagram is shown in Fig. 1(c). The on-chip processor facilitates autonomy through its ability to play pre-defined sequences of qubit control waveforms, a necessary requirement for both quantum error correction (QEC) [20][21][22][23] and quantum error mitigation (QEM) workloads [24][25][26].

FIG. 1. [...] Here, the QP is composed of fixed frequency transmon qubits coupled together through a fixed frequency quantum bus and arranged in a heavy hexagonal lattice. The cryogenic control unit (CCU) is composed of a cryogenic central processing unit (CCPU), qubit waveform generators, readout waveform generators, and quantum state discriminators. A CCU containing these key elements would be capable of autonomous operation, yielding a shorter latency loop when running deterministic quantum circuits. A room temperature (RT) processor performs classical computations, orchestrates high level quantum operations, and interprets results of quantum algorithms [1]. RT support electronics are necessary to power, clock, and program active cryogenic electronics. The support electronics interface with a RT server which performs classical computations necessary to run quantum algorithms. (b) The X and Z stabilizer circuits for performing error correcting protocols on a heavy hexagonal lattice. Stabilizers represent the primary protocol for logical qubit maintenance and the CCU oversees these protocols, which include monitoring physical qubits, decoding errors on physical qubits [13], and generating conditional sequences of pulses. (c) An expanded block diagram of the qubit waveform generator (blue box in (a)), used for this manuscript.
In the experiments described in this work, emphasis was placed on understanding the demands of the classical processor (CP), which represents a near-term development challenge for cryogenic control technologies. For example, present day QPs are primarily used for physics learning and qubit characterization experiments [27][28][29], which can be difficult to support in a low-power processor.
An important qubit gate characterization experiment that presents challenges to limited memory control hardware is randomized benchmarking (RB) [30]. For RB experiments, a random sequence of Clifford gates followed by an inversion pulse must be stored in memory. Individual Cliffords correspond to waveforms with independent amplitudes, phases, and durations, while the sequence of Cliffords corresponds to a set of instructions. For complex experiments like RB, compressing large instruction sets into limited memory is challenging; however, not all experiments are as demanding. In a quantum computing system that primarily performs QEC protocols (Fig. 1(a,b)), the set of required pulse sequences is simple and repetitive, especially when compared to those needed for QCVV experiments [31]. Understanding the complexity and characteristics of the pulse sequences demanded by these experiments is important for developing optimal qubit control technologies, and could lead to special purpose instruction set architectures (ISA).
Here we report the use of cryo-CMOS to generate qubit control waveforms for a suite of characterization experiments including: $T_1$, $T_2$, $T_2^*$, Carr-Purcell-Meiboom-Gill sequences (CPMG), rotary echo, Hamiltonian tomography, and RB of single-qubit and two-qubit gates. The classical processor was characterized during qubit measurements, highlighting the efficiency of the ISA. Transmon calibration routines were performed in order to realize the aforementioned characterization experiments. These measurements serve as a promising demonstration of cryo-CMOS based control technology, with single-qubit and two-qubit error per gate (EPG) observed to be $\epsilon_{1Q} = 8 \times 10^{-4}$ and $\epsilon_{2Q} = 1.4 \times 10^{-2}$, respectively. We show through Lindblad master equation simulations that the observed error is set by control noise. The primary error source is a pulse-induced qubit Z-axis rotation that arises due to spurious spectral content observed in the CMOS controller output. Additionally, Gaussian-distributed pulse amplitude noise was observed on the RF pulses, but simulations showed that this noise did not significantly impact gate errors.

II. CRYO-CMOS CONTROL ELECTRONICS
An ideal quantum control architecture will have the capability to autonomously generate waveform sequences conditioned on the measurement of physical qubits [32]. A control unit able to satisfy this requirement will be composed of circuit blocks for qubit waveform generation, entanglement waveform generation, readout waveform generation, and qubit state discrimination. Furthermore, the unit will require a central processor that manages these blocks and conditionally asserts logic for running the appropriate quantum circuits (Fig. 1(a)). CMOS is an ideal technology for realizing such a control unit because of existing industrial fabrication capabilities and the ease of integrating the different circuit blocks required.
In a quantum computing system optimized to execute specific quantum algorithms, the fault tolerant operations are in principle deterministic based on the decoding of physical qubit measurements [13,33,34], implying full autonomy is achievable. If the above mentioned capabilities are performed autonomously and within the dilution refrigerator, this approach will yield a reduced latency control configuration (Fig. 1(a)), thereby increasing the achievable number of circuit layer operations per second (CLOPS) [35]. The primary latency concerns addressed here are a reduction in the round-trip transient time and the response time for conditional waveform generation.
The proposed autonomous control unit is composed of distinguishable circuit blocks, which can be developed either as stand-alone chips in a multi-chip configuration or as a single larger integrated circuit. The work detailed in this manuscript focuses on a demonstration with a stand-alone circuit block consisting of two distinct RF channels for qubit waveform generation. Here the same RF generator is used for both single qubit control and generating entanglement between qubit pairs; this approach leverages the inherent wiring advantage associated with the cross-resonance based architecture [19]. As shown in Fig. 1(c), this semi-autonomous qubit state controller consists of an analog block with a low power digital to analog converter (DAC), a special purpose classical processor, and a serial interface for communicating with RT electronics [10,11].

A. Analog Block
The qubit state controller was designed and fabricated in 14 nm FinFET technology, a technology choice which was desirable due to its high switching efficiency, large transistor on/off ratio, and lower threshold voltages that lead to reduced power dissipation [36]. The choice of a highly-scaled transistor technology also reduces the cost of adding a high degree of digital programmability. For example, the qubit controller can be configured for the following modes of conversion operation: double-sideband with suppressed carrier (DSB-SC), single-sideband direct conversion lower sideband (SSB-LSB), and single-sideband direct conversion upper sideband (SSB-USB). The reported experiments utilized SSB-LSB as the mode of operation, which provided additional filtering of the LO when it was placed higher in frequency than the transmons.
For experimental versatility, the qubit state controller's analog control block was made configurable utilizing over 200 bits of digital control. The analog control block is composed of two 10-bit DACs (in-phase and quadrature, or I/Q), two baseband filters (I/Q), a complex mixer to provide, e.g., SSB-LSB output, and a tunable output stage. The complex mixer receives quadrature clock signals for upconversion of the DAC's baseband signal to the qubit's |0⟩ to |1⟩ transition frequency $\omega_{01}$. The SSB mode was chosen as the primary mode of operation in order to reduce circuit complexity while also reducing analog power consumption. In order to maximize dynamic range while simultaneously minimizing noise generation, a fully differential current mode design was implemented. Notable advantages of the design include: current reuse among multiple functional blocks, high bandwidth interfaces between circuit elements, convenient implementation of the variable gain stages (using current scaling and current steering [?]), and low switching noise at the output [10,11].
The differential wiring extends from the DAC output to the output stage of the chip. The output stage consists of a balun which converts the differential signal to single-ended; the balun resonance frequency is tunable to support the range of desired SSB frequencies. Balun resonance and shape tuning are controlled using 4 bits of center frequency adjustment and 2 bits of quality factor adjustment. The output impedance is adjustable to match the fridge wiring that is connected to the balun output. A variable attenuator provides 20 dB of programmable attenuation for noise reduction, plus an additional 25 dB that can be switched in for blanking the AWG during readout. To satisfy dynamic range requirements, two variable gain stages were used in the analog control path. The first gain stage was placed between the baseband filter (BBF) and the SSB up-converter, while the second gain stage was placed at the output of the SSB up-converter. Both gain stages are unidirectional (to provide reverse isolation) and yield a total of 34 dB of gain control with an average step size close to 2 dB. The BBF bandwidth (as reflected in the 3 dB cutoff frequency) is configurable over a range of 100-800 MHz using 5 bandwidth control configuration bits. The DAC can be programmed to produce an IF offset within ∼400 MHz of the LO, and the in-band spurious tones at the output were suppressed to a spurious-free dynamic range of 40 dB out to 500 MHz [10,11]. A microwave source with a frequency between 8 and 12 GHz is delivered from room temperature to the cryo-CMOS chip, on which LO signals in a range of 4-6 GHz are generated with a 2:1 frequency divider. Leveraging the analog control path's differential current mode architecture, programmable DC currents are added to the output currents of the DACs in order to compensate differential offsets, which helps reduce LO leakage in the RF output.
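The frequency plan described in this subsection can be illustrated with a short sketch. It assumes that SSB-LSB operation places the RF output at the LO frequency minus the IF offset; the function name and the example qubit frequency are hypothetical, and the range checks only restate the numbers quoted in the text.

```python
def lsb_frequency_plan(f_qubit_ghz, f_if_mhz):
    """Return (f_lo_ghz, f_source_ghz) for lower-sideband upconversion.

    Assumption: in SSB-LSB mode the RF output sits at f_LO - f_IF, so the LO
    is placed above the qubit frequency. The on-chip 2:1 divider means the
    room temperature source runs at twice the LO frequency.
    """
    if not 0.0 <= f_if_mhz <= 400.0:
        raise ValueError("IF offset is expected within ~400 MHz of the LO")
    f_lo = f_qubit_ghz + f_if_mhz / 1000.0
    if not 4.0 <= f_lo <= 6.0:
        raise ValueError("LO falls outside the 4-6 GHz range quoted in the text")
    return f_lo, 2.0 * f_lo


# Hypothetical example: a 5.0 GHz transmon driven with a 200 MHz IF offset.
f_lo, f_src = lsb_frequency_plan(5.0, 200.0)
print(f"LO = {f_lo:.3f} GHz, RT source = {f_src:.3f} GHz")
```

Placing the LO above the qubit in this way is what lets the lower sideband land on the transmon while the LO itself sits further from it.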

B. Digital Block
The digital architecture features a processor implementing 32-bit fixed-point instructions for programming flexibility, including special instructions for waveform generation and phase rotations, and is designed to minimize power consumption for cryogenic operation. The processor's instruction set architecture (ISA) defines 32 general purpose instructions (8 branch/flow control, 10 data movement, and 14 arithmetic) to enable trigger-controlled loops, subroutines, and computation, as well as 5 special instructions for the generation of waveforms and digital output signals. The processor core uses three SRAM banks with 32 KB dedicated for instructions, 20 KB dedicated for waveforms, and 32 KB dedicated for data, respectively. To minimize power consumption, the processor has a fast clock domain operating at the sampling frequency $f_s$ of the DACs for providing waveform data to the DACs, and a slow clock domain operating at $f_{CLK} = f_s/16$ for program control. The microarchitecture implements instruction fetch, decode, branch resolution, and scalar arithmetic execution within one slow clock cycle.
Waveform data is stored as an envelope modulated by intermediate frequency I/Q sinusoids with an initial phase of zero. This approach reduces the waveform memory footprint and avoids the power overhead associated with the sine/cosine evaluations of a numerically controlled oscillator (NCO) [37,38]. The Compute Waveform Coefficients (CWC) instruction prepares the I/Q coefficients used by the Play Waveform (PW) instruction to set the phase and amplitude of the output waveform. These coefficients are calculated relative to the frame phase, which can be modified by special instructions such as Add Frame Phase (ADDFP) to effect a virtual Z-rotation of the qubit phase [39]. The coefficients are applied to the stored waveform data through 16-way single instruction, multiple data (SIMD) vector arithmetic logic in the slow clock domain. One PW instruction can play up to 4096 waveform samples. The samples are serialized into the fast clock domain and sent to the I/Q DACs in the analog section. The waveform retrieval and processing functions progress independently of the program flow and control facilities of the processor.
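A minimal sketch of how a pair of coefficients can set the phase and amplitude of a zero-phase stored waveform is given below. The function names are hypothetical stand-ins for the CWC and PW instructions, the sign convention for the frame phase is an assumption, and floating point is used here in place of the chip's fixed-point arithmetic.

```python
import math


def compute_coefficients(amplitude, phase, frame_phase=0.0):
    """Hypothetical stand-in for CWC: coefficients relative to the frame phase."""
    total = phase + frame_phase  # assumed convention: phases simply add
    return amplitude * math.cos(total), amplitude * math.sin(total)


def play(stored_i, stored_q, coeffs):
    """Hypothetical stand-in for PW: apply (c, s) to zero-phase I/Q samples.

    With stored I = E*cos(w*n) and Q = E*sin(w*n), the outputs become
    A*E*cos(w*n - phi) and A*E*sin(w*n - phi), so no sine/cosine is
    evaluated per sample.
    """
    c, s = coeffs
    out_i = [c * i + s * q for i, q in zip(stored_i, stored_q)]
    out_q = [c * q - s * i for i, q in zip(stored_i, stored_q)]
    return out_i, out_q


# Zero-phase stored waveform with a flat envelope (hypothetical values).
w_if = 2 * math.pi * 0.05  # IF in radians per sample
stored_i = [math.cos(w_if * n) for n in range(32)]
stored_q = [math.sin(w_if * n) for n in range(32)]

frame_phase = 0.0
frame_phase += math.pi / 2  # analogue of ADDFP: a virtual Z rotation
out_i, out_q = play(stored_i, stored_q, compute_coefficients(0.5, 0.0, frame_phase))
```

Only two multiplies and one add per output sample are needed, which is the kind of operation that maps naturally onto the vector arithmetic described above.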

III. CMOS PROCESSOR FOR QUBIT CONTROL
Even with a specialized ISA, some experiments were challenging to accommodate with this low power processor, primarily due to its limited memory. Note that all experiments performed with the cryoCMOS processor were originally developed using room temperature electronics that offer significantly more memory and higher performance than the custom processor design. This discrepancy created issues for the porting of experiments to the low power processor. Details regarding the issues encountered are illustrated in Fig. 2 and are further discussed in Section IV. Here Fig. 2 shows the memory usage of the processor for the different experiments performed. Many experiments required no change from routines developed with room-temperature electronics, but in some cases additional effort was required in order to fit pulse sequences into processor memory. The default approach to reducing memory demands was to decrease the point density by customizing parameters, while ensuring enough data points were collected to extract accurate fit results. In cases when the default approach was not sufficient, it was necessary to rewrite experiment routines or substitute new pulse types. One experiment for which new pulse definitions were required is Hamiltonian tomography [40], and details of how this experiment was made to work are reviewed in Section IV.
Relative to their room temperature implementations, most experiments required either no change or only custom parameters to execute successfully. RB is an example of an experiment made to work through parameter adjustment: in this case, the instruction memory was the limitation to be overcome. The strategy used to address this challenge was first to reduce the number of Clifford sequences, and then to use logarithmic spacing between the different sequence lengths, both of which helped to reduce the number of instructions required. However, as shown in Fig. 6(a) and Fig. 7(a), the last data point in an RB experiment consumes the most memory, implying simple point reduction methods will not scale as error rates improve. Lower error rates will require more Clifford gates for the exponential decay to converge, which will require more instruction memory. As shown in Fig. 2, the instruction memory requirements for an RB experiment are predicted to increase from 32 KB to 256 KB when the 1Q error and 2Q error are reduced from 0.1% to 0.01% and 1% to 0.1%, respectively.
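The effect of logarithmic spacing on instruction memory can be sketched as follows. This is a rough model and not the accounting used for Fig. 2: it assumes every Clifford costs a fixed average number of instructions, that each random sequence is stored fully unrolled, and that each 32-bit instruction occupies 4 bytes. All function names and parameter values are hypothetical.

```python
def log_spaced_lengths(max_length, n_points):
    """Unique, increasing Clifford sequence lengths, log-spaced from 1 to max_length."""
    if n_points < 2:
        raise ValueError("need at least two points")
    raw = [round(max_length ** (k / (n_points - 1))) for k in range(n_points)]
    return sorted(set(raw))


def instruction_bytes(lengths, n_seeds, instr_per_clifford=2.0, bytes_per_instr=4):
    """Rough instruction-memory estimate for an unrolled RB program (assumed model)."""
    total_cliffords = n_seeds * sum(lengths)
    return int(total_cliffords * instr_per_clifford * bytes_per_instr)


# Hypothetical example: 8 log-spaced lengths up to 500 Cliffords, 5 random seeds.
lengths = log_spaced_lengths(500, 8)
share_of_longest = lengths[-1] / sum(lengths)
print(lengths, instruction_bytes(lengths, n_seeds=5), f"{share_of_longest:.0%}")
```

In this model the longest sequence alone accounts for more than half of the total, which mirrors the observation that the last RB data point dominates memory use and that thinning the shorter points gives diminishing returns.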
To accommodate longer sequences, ASICs may require more effective instruction memory, a more specialized ISA, or a custom compiler designed specifically for the ISA [41]. Alternatively, new measurement techniques with lower memory overhead could also be adapted to extract error rates, e.g.: using simple characterization measurements along with error models to infer errors [42][43][44][45], interpreting the error through an under-sampled RB experiment [46], or performing experiments like quantum process tomography [47,48] that place lesser demands on memory. It is also worth noting that longer sequences may not be needed in future quantum computers. Experiments like RB are useful for performance evaluation during hardware and software development, but are not the intended use case for quantum computers. Experiments such as QEC [20][21][22][23] and QEM [24][25][26] will be the primary use cases for useful applications of quantum computers.

IV. MEMORY REDUCTION IN A LOW POWER PROCESSOR
The on-chip processor supports 32 KB of SRAM for instructions and 20 KB of SRAM for waveforms. For simplicity these memory banks have a single purpose designation: waveform memory is used to store waveform data, and instruction memory is used to store programs that play sequences of qubit control pulses. Resource bottlenecks were observed when waveforms and/or sequences became too long. When this happened, memory reduction techniques were necessary. One example technique was to reduce waveform memory by partitioning waveforms.
A key example is the Hamiltonian tomography experiment from Fig. 5(d), which required many long square-Gaussian waveforms of different lengths that did not fit in waveform memory. To overcome this problem, the flat section of the square-Gaussian waveform was partitioned into small equal-sized segments, and the waveform was constructed by stepping through instruction memory to add segments together, as shown in Fig. 3. This technique made a fundamental physics experiment feasible with limited waveform memory, but at the cost of more instruction memory. The width of the segment was used as a sweep parameter in order to optimize the trade-space between waveform size and the number of instructions. This waveform memory reduction technique was also applied to other 2Q calibrations for consistency. In these cases the width of the smallest segment was chosen to align with the fastest 2Q CR pulse. As the pulse width was increased, the number of instructions increased linearly with the number of segments, as was observed in Fig. 2 for 2Q calibrations marked with the "Slow" prefix. In this specific case the smallest segment was chosen to be 71.1 ns and the slowest CR pulse used was 711.1 ns, resulting in a ≈10x increase in instruction memory usage. It is conceivable that the processor's arithmetic and branch facilities could be exploited to reduce the 10x increase, but this is not trivial because of phase alignment requirements between adjacent segments. As mentioned in Section II B, the processor supports phase manipulation instructions, so in theory this can be overcome, but in practice it is difficult to implement correctly and accurately.
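The trade-off between waveform memory and instruction memory can be made concrete with a small sketch. It assumes a single stored segment is replayed back-to-back with one play instruction per repetition; the function name and the returned fields are hypothetical, and the rise and fall sections are ignored for simplicity.

```python
def partition_pulse(total_ns, segment_ns):
    """Estimate the cost of building a long pulse from one repeated segment.

    Assumed model: waveform memory holds a single segment, and each
    repetition costs one play instruction, so instruction count grows
    linearly with pulse length while stored samples stay constant.
    """
    n_segments = max(1, round(total_ns / segment_ns))
    return {
        "segments": n_segments,
        "play_instructions": n_segments,
        "stored_fraction": 1.0 / n_segments,  # relative to storing the full pulse
    }


# Numbers quoted in the text: 71.1 ns segments, 711.1 ns slowest CR pulse.
plan = partition_pulse(711.1, 71.1)
print(plan)
```

With the quoted values the pulse is built from ten segments, consistent with the ≈10x increase in instruction memory reported for the "Slow" calibrations.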
Another challenge associated with this simplistic ISA model is resource starvation, because there are often no free clock cycles to allow execution of arithmetic or branch instructions. For example, in these experiments the shortest 1Q gate instrumented in Fig. 8(e) is just 28.44 ns wide, which corresponds to two processor clock cycles. In order to generate the shortest pulse, two instructions are required: one to compute waveform coefficients (CWC), and another to play a waveform (PW). Shorter gates are desirable because they are shown to reduce errors, but a dependency exists between the processor clock frequency and the 1Q gate length. The processor clock frequency sets a lower bound on the 1Q gate length. Increasing the processor clock frequency would speed up the execution of the CWC and PW instructions, but would result in more power dissipation.

Table caption: The reported powers are extracted from the on-chip supply voltage measured by a sense line, and the current sourced by the power supply. Power dissipation due to fridge wiring is calculated using cryogenic material models [a].
[a] Cryogenic properties of commonly used metals, https://trc.nist.gov/cryogenics/materials/materialproperties.htm
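The clock rates implied by the shortest gate can be worked out directly. The sketch below combines the statement that a 28.44 ns gate spans two processor clock cycles with the relation $f_{CLK} = f_s/16$ from Section II B; the resulting frequencies are inferred here and are not stated explicitly in the text, and the function name is hypothetical.

```python
def clocks_from_shortest_gate(gate_ns, cycles_per_gate=2, fast_to_slow_ratio=16):
    """Infer (cycle_ns, f_clk_mhz, f_s_mhz) from the shortest gate duration.

    Assumes the gate occupies exactly `cycles_per_gate` slow clock cycles
    (one for CWC, one for PW) and that the DAC sample clock is
    `fast_to_slow_ratio` times the slow clock.
    """
    cycle_ns = gate_ns / cycles_per_gate
    f_clk_mhz = 1e3 / cycle_ns
    f_s_mhz = fast_to_slow_ratio * f_clk_mhz
    return cycle_ns, f_clk_mhz, f_s_mhz


cycle_ns, f_clk_mhz, f_s_mhz = clocks_from_shortest_gate(28.44)
print(f"cycle = {cycle_ns:.2f} ns, f_CLK = {f_clk_mhz:.1f} MHz, f_s = {f_s_mhz:.0f} MHz")
```

Under these assumptions the slow clock is roughly 70 MHz and the sample clock is a little over 1.1 GHz, so halving the minimum gate length would require doubling both.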
The difficulty in optimizing instruction sequences on the ISA is compounded by the layered structure of the existing software stack, in which high level pulse definitions are generated in a manner that supports a variety of room temperature control hardware. Since off-the-shelf AWGs do not provide quantum specific instructions, such as frame phase adjustments or parametric looping, the existing pulse generation software did not support the efficiency yielded by a quantum specific processor. Ideally a special purpose compiler would be used to take in high level pulse definitions and convert them into instructions for special purpose processors like the one presented in this manuscript [41].

V. EXPERIMENTAL SETUP
The experimental setup in a closed-cycle dilution refrigerator, detailed in Fig. 4(a), consists primarily of a cryoCMOS AWG payload (labeled CP) mounted to the 4 K plate, and a qubit payload (labeled QP) mounted to the mixing chamber (MXC) plate at 10 mK. The CP is composed of an Au plated Cu machined mount that mechanically and thermally anchors a printed circuit board (PCB). The PCB houses the cryoCMOS AWG chip, decoupling capacitors, and routed traces that connect the various pins of the chip to the wiring connectors. The cryoCMOS AWG chip, shown in Fig. 4(b), is bump bonded to a laminate with NPO ceramic decoupling capacitors and is inserted into a pogo-pin socket on the PCB, shown in Fig. 4(c). The cryoCMOS chip is thermally anchored to the 4 K stage via a Cu block on the lid of the socket and a Cu strap that is connected directly to the Cu back-plane of the mount. DC power supplies, reference currents, clock and local oscillator (LO) sources, and an FPGA, all used to power and control the chip, are located at room temperature (RT) near the cryostat. The outputs of the two cryoCMOS AWG channels, set to the resonant frequencies of the qubits (≈5 GHz), are sent via coaxial cables down to the MXC plate. At the MXC plate the signals pass through a total of 22 dB of cold attenuation, a Mini Circuits VLF 5500 low-pass filter, and a ferrite cryogenic isolator before connecting to the qubits.
The QP consists of a pair of transmon qubits that are connected by an LC cancellation bus that helps reduce the effect of constant ZZ-type errors. The transmons are operated and measured in transmission, each with its own designated Purcell filter and readout resonator. The readout pulses, supplied by RT mixer-based signal generators, are set to the frequency of the respective qubit readout resonators (≈7 GHz) and driven into the fridge. The readout control signals are combined at the MXC plate with the cryoCMOS AWG signals into a single line using a directional coupler. After passing through the QP, the output signals from each of the two qubits are sent through another cryogenic isolator, a HEMT amplifier, and a RT amplifier, before being sent to an ADC to be digitized. The maximum power level of the control signals from the cryoCMOS AWG that can be delivered to the qubits (after passing through the 22 dB of cold attenuation between them) is ≈ -40 dBm. The net gain in the readout path is ≈16 dB, defined to be the total amplification of the HEMT and RT amplifiers, minus the losses associated with the coaxial cables. The readout chain did not include a quantum limited amplifier (such as a Josephson-based traveling-wave parametric amplifier), but if desired one could be added to improve the signal-to-noise ratio of the output signal. Even without the use of such amplifiers, state discrimination measurements showed that fidelities of ≈97% were achieved.
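As a quick check of the power levels quoted above, the sketch below converts dBm to watts and backs out the level upstream of the cold attenuators. It counts only the 22 dB of attenuation and ignores cable, filter, isolator, and coupler losses, which is an assumption; the function names are hypothetical.

```python
def dbm_to_watts(p_dbm):
    """Convert a power level in dBm to watts."""
    return 10 ** (p_dbm / 10.0) / 1000.0


def level_before_attenuation(p_after_dbm, attenuation_db):
    """Power level upstream of an attenuator, given the level downstream."""
    return p_after_dbm + attenuation_db


p_qubit_dbm = -40.0  # maximum level at the qubits, from the text
p_upstream_dbm = level_before_attenuation(p_qubit_dbm, 22.0)
print(f"{dbm_to_watts(p_qubit_dbm):.1e} W at the qubits, "
      f"{p_upstream_dbm:.0f} dBm before the cold attenuators")
```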
Proper synchronization, timing, and triggering between the cryoCMOS AWGs and the readout control electronics is critical to performing all of the qubit calibrations and measurements. All the supplies (Clock, LO, and those for the readout electronics) are connected to a common 10 MHz clock reference. When an experiment is to be performed, the waveform data and the instruction sequences are first loaded onto the cryoCMOS AWG via the FPGA and the serial communication interface, then a signal is relayed back to the FPGA to confirm that the program was successfully loaded into memory. At the beginning of each cryoCMOS AWG program there is a WAIT instruction. A trigger signal, marking the start of the experiment, is sent from the readout electronics at RT to the cryoCMOS AWG, satisfying the WAIT condition and commencing the programmed pulse sequence. At the end of a sequence, when a readout pulse is being played, the cryoCMOS AWG uses its on-chip programmable attenuation to blank the output of the AWG signal by ≈45 dB. By adjusting the pulse timing within the sequence itself and by using a buffer delay on the RT readout electronics, it was ensured that the readout pulses begin soon after the control pulses end, ≈10 ns after the blanker feature on the cryoCMOS AWG is engaged. The coordination and timing between the control pulses, the blanking window, and the readout pulses were confirmed using a high-speed oscilloscope. If necessary for a given experiment, the cryoCMOS AWGs can output their own marker pulses to trigger one another, or to indicate that a sequence is complete. At the end of a sequence the processor loops back to the beginning of the program and waits for another trigger. Through this procedure the cryoCMOS AWG can initialize the qubits into any desired Clifford state, kick off a pulse sequence (for calibration, QEC, or a specific qubit experiment), and ensure that the sequence is followed by well-aligned readout pulses to complete the measurement.

VI. QUBIT CALIBRATIONS
Performing calibration routines with room temperature control electronics is common practice for transmon-based devices, and detailed discussions can be found in references [49,50]. Executing these routines using a novel, low power cryo-CMOS ASIC represents an important demonstration of functionality. As shown in Fig. 5, experiments to optimize the amplitude, frequency, phase, and width of the control pulses were performed, from which a set of optimized parameters was found and stored in waveform memory. The successful execution of these calibrations is necessary to facilitate the high fidelity two-qubit echoed CR gate demonstrated in this manuscript.

FIG. 5. [...] The oscillations on the target are fit to a Hamiltonian, which can be used to extract device parameters and provide information about the optimal pulse width and the extent of errors such as IY. By computing the Bloch vector $|\vec{R}|$, one can extract the optimal CR pulse length at the curve's first minimum. The qubit-qubit coupling was extracted to be J = 2.7 MHz. (e) CR amplitude calibration, optimally at the first maximum. (f) CR phase calibration, optimally at the first maximum.

A. Single Qubit Waveforms
The DACs produce in-phase and quadrature signals of the form $V_I(t) = \Omega(t) \cos(\omega_{SSB} t - \phi)$ and $V_Q(t) = \Omega(t) \sin(\omega_{SSB} t - \phi)$, respectively, where $\omega_{SSB}$ is the single-sideband frequency and $\phi$ is the phase. For single-qubit gates, the signals are shaped with a Gaussian envelope $\Omega_G(t)$, where $\Omega_0$ is the amplitude, the width of the pulse is defined by $t_g$, and $\sigma$ is the standard deviation. The functional form of $\Omega_G$ is chosen to enforce that the pulse starts and ends with zero amplitude [39,51]. The pulse amplitudes for $X(\pi)$ and $X(\pi/2)$ were calibrated by driving the qubit at the |0⟩ to |1⟩ transition frequency. During the Gaussian pulse, the higher energy levels of the transmon experience a Stark shift, resulting in a drive-induced phase shift about the Z-axis of the Bloch sphere. This phase shift is amplitude dependent, and is mitigated with a derivative removal via adiabatic gate (DRAG) calibration [39,51,52,53]. The DRAG pulse is implemented by applying $\Omega_G(t)$ and the scaled derivative $\beta \dot{\Omega}_G(t)$ to the in-phase and quadrature channels, respectively. As shown in Fig. 5(c), the DRAG pulse is repeated N times while sweeping the DRAG scale parameter $\beta$; the collective minima correspond to the optimal parameter value. Subsequent single qubit calibrations consisted of error amplification measurements for fine tuning all pulse parameters [54,55].

B. Two Qubit Waveforms
In order to drive the cross-resonance interaction, a square-Gaussian waveform is applied to the control qubit at the target qubit's resonant frequency [56]. The pulse shape is defined by a Gaussian rise and fall of length $\tau_r$ and standard deviation $\sigma_r$, a flat-top of length $\tau_p - 2\tau_r$ and amplitude $\Omega_0$, and a total pulse length of $\tau_p$. The interactions between control and target qubits can be driven by sweeping the pulse width for a fixed amplitude, or conversely by sweeping the amplitude for a fixed pulse width. In Fig. 5(d), full-state Hamiltonian tomography is performed on the target qubit. This calibration is a method to identify the coherent error terms relevant to the microwave drive [40].
The calibration is performed by sweeping the width of the CR waveform applied to the control, then applying an $X_\pi$, $Y_\pi$, or $Z_\pi$ pulse in order to project the state of the target onto the X, Y, or Z axes of the Bloch sphere. This process is carried out for the control in |0⟩ and |1⟩. The data is then fit to a block-diagonal Hamiltonian and the six interaction terms IX, IY, IZ, ZX, ZY, and ZZ are parameterized. The quantity $\|R\|$ in Fig. 5(d) is the two-norm distance between Bloch vectors of the target for the two states of the control. When $\|R\| = 0$, the two qubits are maximally entangled, which indicates an optimal pulse width $\tau_p$.
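The envelopes and the entanglement metric described in this section can be sketched as follows. The exact expressions are not reproduced here, so the forms below are assumptions: a "lifted" Gaussian that is shifted and rescaled to reach zero at both ends, a square-Gaussian assembled from two halves of that shape around a flat top, and $\|R\|$ taken as half the norm of the summed target Bloch vectors so that it vanishes when the two vectors are antiparallel. All function names are hypothetical.

```python
import math


def lifted_gaussian(t, t_g, sigma, omega0=1.0):
    """Gaussian centred at t_g/2, shifted and rescaled to be 0 at t=0 and t=t_g."""
    edge = math.exp(-((t_g / 2) ** 2) / (2 * sigma ** 2))
    core = math.exp(-((t - t_g / 2) ** 2) / (2 * sigma ** 2))
    return omega0 * (core - edge) / (1 - edge)


def square_gaussian(t, tau_p, tau_r, sigma_r, omega0=1.0):
    """Flat top of length tau_p - 2*tau_r with lifted-Gaussian rise and fall."""
    if t < 0 or t > tau_p:
        return 0.0
    if t < tau_r:  # rising edge: first half of a lifted Gaussian of width 2*tau_r
        return lifted_gaussian(t, 2 * tau_r, sigma_r, omega0)
    if t > tau_p - tau_r:  # falling edge: second half of the same shape
        return lifted_gaussian(t - (tau_p - 2 * tau_r), 2 * tau_r, sigma_r, omega0)
    return omega0


def bloch_r(target_ctrl0, target_ctrl1):
    """Assumed form of ||R||: half the norm of the summed target Bloch vectors."""
    return 0.5 * math.sqrt(sum((a + b) ** 2 for a, b in zip(target_ctrl0, target_ctrl1)))


# Antiparallel target states for control |0> and |1> give ||R|| = 0.
print(bloch_r((0.0, 0.0, 1.0), (0.0, 0.0, -1.0)))
```

In this form the CR pulse width would be chosen at the first minimum of `bloch_r` as the width is swept, matching the description of Fig. 5(d).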

VII. BENCHMARKING A CROSS-RESONANCE GATE
The cryo-CMOS chip was specially designed to generate waveforms for transmon qubits in a cross-resonance (CR) based architecture, as shown in Fig. 1(a). A detailed description of the experimental setup with the CMOS chip thermalized to the T = 4K stage in a dilution refrigerator is provided in Section V. In a cross-resonance based qubit device, entanglement between neighboring qubits is generated via the CR interaction [19,40,54,57–60]. The CR interaction gives rise to a ZX term in the Hamiltonian, which describes a Rabi oscillation on the target qubit that depends on the state of the control qubit. The entanglement is generated by applying an RF drive to the control qubit at the target qubit's |0⟩ to |1⟩ transition frequency. The always-on coupling in CR devices gives rise to parasitic terms in the Hamiltonian such as ZZ, ZI, and IZ. These undesirable terms can be mitigated in hardware through cancellation buses [60], and digitally through echoed gate sequences [40,59]. Here an echoed gate was realized on a device with a cancellation bus. This gate sequence used Gaussian waveforms for single-qubit rotations and square-Gaussian waveforms for generating entanglement.

A. Single-Qubit Randomized Benchmarking
Single-qubit RB experiments were performed both individually and simultaneously on the control and target qubits and characterized as a function of the gate length t_g. For each t_g, the waveforms were calibrated and measurements of T1 and T2 were interleaved between RB experiments. The results for individual RB experiments are displayed in Fig. 8(e). The data show that the error decreases as the gate length t_g is shortened. However, the observed errors are not solely explained by decoherence. For example, the errors do not track with the first-order single-qubit error model ϵ_1Q [45,61]. This discrepancy implies that the control electronics are contributing to the excess error measured in the RB experiments. Potential control-related error sources include spurious spectral content, as shown in Fig. 10, and pulse amplitude noise, detailed in Section VIII. The spurious spectral content and quasi-static amplitude noise give rise to over/under rotation errors that vary quadratically with the noise source amplitude.
During simultaneous operation, each control channel plays unique random Clifford sequences, which gives rise to coherent quantum cross-talk errors of the form ϵ_1Q^ZZ = (1/6)(2π ZZ t_g)² that contribute to the total observed error [43]. The increase in simultaneous error observed in Fig. 8(e) is not explained by error analysis that assumes only coherent quantum cross-talk, a result which implies an external noise source. It is suspected that classical cross-talk on the CMOS chip arises during simultaneous waveform generation, which gives rise to the observed error.
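As a worked illustration of the coherent cross-talk expression ϵ_1Q^ZZ = (1/6)(2π ZZ t_g)², the short sketch below evaluates it for a given ZZ rate and gate length. The 50 kHz ZZ value in the usage note is a hypothetical placeholder, not a measured parameter of this device; the 42.67 ns gate length is the nominal single-qubit pulse width quoted in Fig. 2.

```python
import math

def zz_crosstalk_error(zz_hz, t_gate_s):
    """Coherent ZZ cross-talk error per gate: (1/6) * (2*pi*ZZ*t_g)**2."""
    return (2.0 * math.pi * zz_hz * t_gate_s) ** 2 / 6.0

# Hypothetical ZZ rate of 50 kHz with the nominal 42.67 ns single-qubit gate.
example_error = zz_crosstalk_error(50e3, 42.67e-9)
```

For these example inputs the error is roughly 3 × 10⁻⁵, and because the expression is quadratic in t_g, doubling the gate length quadruples the contribution.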

B. Two-Qubit Randomized Benchmarking
Two-qubit gates are more difficult in practice than single-qubit gates. When compared to single-qubit gates, two-qubit gates are longer and there are more ways in which errors can arise; furthermore, the coupled qubits are more sensitive than single qubits to error sources. For example, in the decoherence error model ϵ_2Q [45,61], the error coefficients are larger and there are more terms that factor into the error (see Section X). In addition to decoherence, there are other potential sources of error, including phase noise [62], amplitude noise, pulse-induced decay [63,64], spurious spectral tones, coherent quantum cross-talk [65], microwave cross-talk [66,67], and leakage outside of the computational basis [39,68]. Additionally, the two-qubit waveform generation is more complex, requiring on average 17.51 instructions per Clifford (IPC) (Fig. 7), compared to 1.71 IPC for single-qubit gates (Fig. 6).
The echoed two-qubit cross-resonance gate was calibrated and benchmarked for different two-qubit gate lengths, as shown in Fig. 8(d). The pulses driving single-qubit rotations were held constant, and measurements of T1 and T2 were interleaved with RB experiments. The error rates were simulated using a two-qubit model Hamiltonian parameterized with experimental data, and the measured RB data do not track with decoherence or coherent quantum cross-talk.
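For readers less familiar with how RB decay curves map onto error rates and processor load, the sketch below uses the textbook relation between the exponential decay parameter p and the error per Clifford, r = (d − 1)(1 − p)/d with d = 2^n. This is the standard RB expression and an assumption on our part about the analysis, not a description of the fitting code used for these experiments. The instruction estimate multiplies the sequence length by the instructions-per-Clifford (IPC) slopes quoted above; the function names are hypothetical.

```python
def rb_survival(m, p, a=0.5, b=0.5):
    """Standard RB decay model: survival probability after m Cliffords."""
    return a * p ** m + b

def error_per_clifford(p, n_qubits):
    """Error per Clifford from the decay parameter p, with d = 2**n_qubits."""
    d = 2 ** n_qubits
    return (d - 1) / d * (1.0 - p)

def decay_from_epc(epc, n_qubits):
    """Inverse of error_per_clifford."""
    d = 2 ** n_qubits
    return 1.0 - epc * d / (d - 1)

def estimated_instructions(n_cliffords, ipc):
    """Instruction count from the linear IPC slope (1.71 for 1Q, 17.51 for 2Q)."""
    return ipc * n_cliffords
```

Because lower error rates require longer Clifford sequences before the decay is resolved, the instruction estimate grows as gate performance improves.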
The simulations suggest the error is due to an always-on Z-rotation combined with an amplitude dependent Z-error on the target qubit. The simulated Z-error is consistent with rotary echo experiments performed on the target qubit that are observed to have an average Z-rotation of 83.1 kHz, and modeled using Lindblad master equations (Fig. 10(a)). The target Z-rotation is further consistent with spectrum analyzer measurements of the CMOS chip output that reveal excess LO leakage when measured at T = 5K in a closed-cycle ⁴He cryostat. The presence of off-resonance spectral content will give rise to an AC Stark effect [69] which shifts the qubit frequency, resulting in a Z-rotation. As shown in Fig. 10(c), the spurious content was observed to be channel dependent, with the LO leakage on the control qubit's channel being the worse of the two. This random variation is in excess of what was expected in the chip's design phase, so the designed tuning range did not cover the distribution observed in hardware samples; the data show an example where there was sufficient range for one channel and insufficient range for the second.

VIII. AMPLITUDE NOISE
In Fig. 9, a measurement was performed to observe quasi-static amplitude noise. The measurement sequence consisted of repeated amplitude calibrations for X(π) followed by X(π/2). For each calibration, the DAC amplitude coefficient was stored, and the percent difference was plotted as a function of lab time. The data were binned and then fit to a normal distribution. Three different measurements were performed, each with a slightly different experimental configuration: Run #1 was performed with the nominal device configuration and gate length, Run #2 was performed with a shorter gate length after adding additional low-pass filtering to the supply voltages, and for Run #3 the on-chip attenuation was maxed out, requiring a long, low-amplitude Gaussian pulse to generate a π-rotation.
For each experimental run, the amplitude required to drive an X(π) and an X(π/2) changed due to different amounts of in-line attenuation. The relative fluctuations in the pulse amplitude were evaluated and shown to increase for larger pulse amplitudes. In Fig. 9, the distribution width σ was plotted versus the pulse amplitude, which yielded a 1/x dependency. Since the different experimental configurations did not deviate from the 1/x dependency, these results imply that the source of the noise is on-chip, and could be related to an increase in low-frequency noise at cryogenic temperatures [70].
Error analysis was performed using the 1/x fit of the noise as an input into the error model. We find that the amplitude noise is larger than decoherence errors, but is not significant enough to explain the observed 2Q gate error. For small low-frequency fluctuations the amplitude noise will result in either an over or under rotation during the gate. The error per gate from an over/under rotation can be fit to the form c_0 + c_1 ∆_amp + c_2 ∆_amp². The error due to quasi-static amplitude noise was simulated using modeling techniques described in Section X. The simulated error was fit to the quadratic error model, and the coefficients for different sample lengths are shown in Table II, resulting in a simple heuristic error model.
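The quadratic form of the heuristic model can be motivated with a single-qubit toy calculation, sketched below. It is not the two-qubit simulation used for Table II: it assumes a pure over/under rotation by θ·∆_amp, for which the average gate infidelity of a single qubit is (2/3) sin²(θ∆_amp/2) ≈ θ²∆_amp²/6. The 1/x coefficients are taken from the fit quoted in the Fig. 9 caption; treating the fit output as a fractional amplitude error is an assumption, and all function names are hypothetical.

```python
import math

def noise_sigma(amp_coeff, a=0.00041, b=0.0007):
    """1/x fit of the noise width versus amplitude coefficient, f(x) = a/x + b."""
    return a / amp_coeff + b

def overrotation_infidelity(theta, frac_err):
    """Single-qubit average gate infidelity for an over-rotation of theta*frac_err."""
    return (2.0 / 3.0) * math.sin(theta * frac_err / 2.0) ** 2

def quadratic_error_model(delta_amp, c0, c1, c2):
    """Heuristic model c0 + c1*delta + c2*delta**2 for the error per gate."""
    return c0 + c1 * delta_amp + c2 * delta_amp ** 2

# Toy coefficients for a pi rotation: only the quadratic term survives at small error.
C2_PI_PULSE = math.pi ** 2 / 6.0
```

In this toy picture c_0 and c_1 vanish and c_2 = θ²/6; the fitted coefficients of the full two-qubit model will generally differ.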

IX. ROTARY ECHO EXPERIMENT
A rotary echo experiment [71] was performed to measure driven decay, as shown in Fig. 10(b). The qubit was pulsed in the +X direction, followed by an idle gate, and then pulsed in the −X direction. The pulse width was 200 ns and the sequence was repeated N = 31 times. Perfectly symmetric and noiseless pulses will yield an exponential decay consistent with the qubit lifetime. Here a time dependent oscillation is observed and it is believed to be due to an AC Stark shift that arises during the pulse. The oscillations are fitted using a two-level qubit model where an additional Z-rotation occurs during the +X and −X rotations. This Z-rotation arises from spurious peaks, shown in Fig. 10(c), which Stark shift the qubit while it is being driven. The fit parameters in Table III were extracted from fitting to the data with three different buffer lengths: 14.2 ns, 28.4 ns, and 42.7 ns. The simulation consisted of solving the Lindblad master equation for a single qubit with a time dependent X-pulse and an additional time dependent Z-pulse. All pulses were square-Gaussian [56] shaped with a 16 ns sigma. The X rotation rate, Z rotation rate, T1, and T2 were all input variables used during the fit, and most variation was tied to changes in the Z rotation strength.
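The mechanism behind the oscillation can be illustrated with a much simpler model than the Lindblad simulation described above. The sketch below is a unitary-only toy: it assumes square pulses, no decoherence, an ideal idle period, and a Z-rotation that is present only while the drive is on. The 2.5 MHz X-rate is a hypothetical value chosen so that a 200 ns pulse is a π rotation, and the convention that rates are in Hz with H = π(f_x X + f_z Z) is our assumption.

```python
import math

def rotation(omega_x, omega_z, t):
    """2x2 unitary exp(-i*(omega_x*X + omega_z*Z)*t/2), angular rates in rad/s."""
    w = math.hypot(omega_x, omega_z)
    if w == 0.0:
        return [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
    c, s = math.cos(w * t / 2.0), math.sin(w * t / 2.0)
    nx, nz = omega_x / w, omega_z / w
    return [[c - 1j * s * nz, -1j * s * nx],
            [-1j * s * nx, c + 1j * s * nz]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rotary_echo_p0(n_reps, x_rate_hz, z_rate_hz, t_pulse_s):
    """P(|0>) after each +X/-X repetition, with a spurious Z-rotation during drive."""
    wx, wz = 2.0 * math.pi * x_rate_hz, 2.0 * math.pi * z_rate_hz
    cycle = matmul(rotation(-wx, wz, t_pulse_s), rotation(wx, wz, t_pulse_s))
    state = [1.0 + 0j, 0j]
    trace = []
    for _ in range(n_reps):
        state = [cycle[0][0] * state[0] + cycle[0][1] * state[1],
                 cycle[1][0] * state[0] + cycle[1][1] * state[1]]
        trace.append(abs(state[0]) ** 2)
    return trace
```

With no Z-rotation the +X and −X pulses cancel and the population stays in |0⟩; a Z-rate of order 83.1 kHz leaves a small residual rotation per repetition that accumulates into an oscillation over the N = 31 repetitions.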

X. MODELING CROSS-RESONANCE GATE ERRORS
To model the effect of spurious Z-errors on our qubits, we numerically calibrate the cross-resonance (CR) gate using a two-qubit model Hamiltonian, Eq. (3), which describes two transmon qubits with lowering operators a and b within a Duffing model with anharmonicities δ_a/b and coupling strength J. To generate the cross-resonance entangling interaction we drive the control qubit (a + a†) at the target qubit's frequency (ω_b). Ω_x(t) describes the pulse envelope, which for CR gates is the Gaussian square shape Ω_GS(t) described in Eq. (2). Our pulse is allowed to rise (and fall) over twice σ_GS, the Gaussian width, before (and after) a square pulse of duration τ_p − 4σ_GS. For each pulse width considered, we calibrate the amplitude of the pulse in (3) to minimize the total error of a U_ideal = e^(−iπZX/4) rotation (the native entangler produced by the CR drive). We do this by performing time-domain simulations of the Hamiltonian in (3) to estimate U_sim, then minimizing the two-qubit gate error of U_sim with respect to U_ideal.

TABLE III. Fit parameters extracted from the master equation simulation; the table columns are Buffer (ns), X-rate (MHz), Z-rate (kHz), T1 (µs), and T2 (µs). A Nelder-Mead optimization routine was performed on the data set from Fig. 10 to determine the best fit. The Z-rate was the free parameter in each fit. The X-rate, T1, and T2 were allowed small bounds in order to achieve an optimal fit. The optimal T1, T2 fit parameters are different than observed values reported in Table III. The deviations are attributed to TLS fluctuations between the time the rotary echo data was collected [? ?].
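The amplitude calibration loop of Section X can be sketched as follows. The explicit gate-error definition is not reproduced here, so the example assumes the common average-gate-fidelity expression F = (|Tr(U_ideal† U_sim)|² + d)/(d(d + 1)) with d = 4. The "simulation" is replaced by a toy in which the drive produces a pure ZX rotation whose angle is proportional to the amplitude; `angle_per_amp`, the grid search, and all function names are hypothetical stand-ins for the time-domain Hamiltonian simulation.

```python
import math

X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

ZX = kron(Z, X)

def zx_rotation(theta):
    """exp(-i*theta*ZX/2) = cos(theta/2)*I - i*sin(theta/2)*ZX, since (ZX)^2 = I."""
    c, s = math.cos(theta / 2.0), math.sin(theta / 2.0)
    return [[(c if r == col else 0.0) - 1j * s * ZX[r][col] for col in range(4)]
            for r in range(4)]

def gate_error(u_sim, u_ideal):
    """1 - F_avg with F_avg = (|Tr(U_ideal^dag U_sim)|^2 + d) / (d*(d+1)) (assumed)."""
    d = len(u_ideal)
    overlap = sum(u_ideal[r][c].conjugate() * u_sim[r][c]
                  for r in range(d) for c in range(d))
    return 1.0 - (abs(overlap) ** 2 + d) / (d * (d + 1))

def calibrate_amplitude(amps, angle_per_amp, u_ideal):
    """Grid search for the amplitude that minimizes the gate error (toy model)."""
    return min(amps, key=lambda a: gate_error(zx_rotation(angle_per_amp * a), u_ideal))

U_IDEAL = zx_rotation(math.pi / 2.0)  # e^{-i*pi*ZX/4}
```

In this toy model an angle offset δ from the ideal rotation gives an error of (4/5) sin²(δ/2), which again shows the quadratic sensitivity to small calibration errors.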

XI. OUTLOOK
The primary concern with realizing cryo-CMOS electronics for controlling qubits is the cooling limitation set by the dilution refrigerator. The CMOS ASIC used in these experiments is composed of two low-power RF analog front ends that dissipate 8.92 mW of power per channel, and a quantum-specific processor that dissipates 10.42 mW per channel (see Fig. 4). Assuming cooling powers in a standard dilution refrigerator, it is believed that up to 100 cryo-CMOS channels could be integrated into a quantum computing system. The outlook becomes more optimistic when taking into account circuit innovations and advances in cooling infrastructure.
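The channel-count estimate follows from simple power bookkeeping, sketched below using the per-channel figures quoted here and the 27.27 mW per-channel total (including wiring) from Fig. 4(d). Attributing the remainder to wiring assumes there are no other contributions to the total, and the cooling budget passed to `max_channels` is a free input rather than a specification of any particular refrigerator.

```python
import math

ANALOG_MW = 8.92               # RF analog front end, per channel
PROCESSOR_MW = 10.42           # quantum-specific processor, per channel
TOTAL_WITH_WIRING_MW = 27.27   # per-channel total including wiring, Fig. 4(d)

def chip_power_mw():
    """Active on-chip dissipation per channel."""
    return ANALOG_MW + PROCESSOR_MW

def wiring_power_mw():
    """Remainder attributed to wiring (assumes no other contributions)."""
    return TOTAL_WITH_WIRING_MW - chip_power_mw()

def required_cooling_w(n_channels, per_channel_mw=TOTAL_WITH_WIRING_MW):
    """Heat load in watts at the 4K stage for n_channels control channels."""
    return n_channels * per_channel_mw / 1000.0

def max_channels(cooling_budget_w, per_channel_mw=TOTAL_WITH_WIRING_MW):
    """Largest whole number of channels that fits within a given cooling budget."""
    return math.floor(cooling_budget_w * 1000.0 / per_channel_mw)
```

With these numbers, 100 channels correspond to roughly 2.7 W at the 4K stage.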
A key insight from this work is that the processor requirements for common calibrations and quantum information experiments will need to increase as the performance of the quantum processor improves, but increasing processor capabilities will lead to more power dissipation from the CMOS chip. To mitigate the demand for more processing power, new instruction set architectures may be required, or new low-overhead experiments for calibrating and characterizing qubits will be needed.
Operating CMOS control electronics within the dilution refrigerator has the potential for improved gate performance due to lower noise, reduced loss, and less dispersion; however, the technical nuances of cryogenic operation and system integration make it challenging to achieve this potential. For example, most foundry circuit models are not reliable below 50K, surface mount components do not meet spec at cryogenic temperatures, and the electrical path length between the support electronics and ASICs is long and lossy when compared to CMOS based servers. Consequently, this technology will take time to achieve its full potential.
This manuscript describes the first demonstration of two-qubit randomized benchmarking with a cryogenic CMOS controller, with an observed error per gate of ϵ_2Q = 1.4 × 10⁻². The leading source of error is shown to arise from the electronics; however, an advantage of custom CMOS electronics is that new circuits can be designed to mitigate error sources after they have been identified. The primary engineering challenge arises from being able to distinguish errors arising from device physics versus errors arising from control electronics. Identifying error sources is a non-trivial task, but these efforts are becoming simpler due to innovative approaches being developed in the field of QCVV.

XII. CONCLUSIONS
We have developed a low-power CMOS ASIC designed to operate at T = 4K that is able to generate sequences of RF waveforms for controlling, calibrating, and benchmarking a universal set of quantum gates between a pair of transmon qubits. The cryogenic control electronics were used to demonstrate high fidelity two-qubit cross-resonance gates. A two-qubit Hamiltonian model provides insight into the behavior of spurious Z-errors, which indicates that the control electronics noise has an amplitude dependence. Modeling and analysis suggest that the drive-dependent Z-rotation observed during rotary echo experiments and the LO leakage in the ASIC's output are connected, implying that spurious content from the CMOS chip is the primary source of gate error.
The CMOS processor was characterized across a wide variety of qubit experiments, demonstrating its viability for providing control pulses to next-generation quantum computers. Furthermore, these results highlight challenges with low-power cryogenic control electronics related to the instruction and memory requirements for standard qubit experiments. These results underscore the need for further innovation of digital architectures as gate error rates approach fault-tolerant thresholds.

FIG. 1. (a) A block diagram for a fault tolerant quantum computing architecture in which control signals for the quantum processor (QP) are digitally synthesized at the 4K stage of the dilution refrigerator. Here, the QP is composed of fixed frequency transmon qubits coupled together through a fixed frequency quantum bus and arranged in a heavy hexagonal lattice. The cryogenic control unit (CCU) is composed of a cryogenic central processing unit (CCPU), qubit waveform generators, readout waveform generators, and quantum state discriminators. A CCU containing these key elements would be capable of autonomous operation, yielding a shorter latency loop when running deterministic quantum circuits. A room temperature (RT) processor performs classical computations, orchestrates high level quantum operations, and interprets results of quantum algorithms [1]. RT support electronics are necessary to power, clock, and program active cryogenic electronics. The support electronics interface with a RT server which performs classical computations necessary to run quantum algorithms. (b) The X and Z stabilizer circuits for performing error correcting protocols on a heavy hexagonal lattice. Stabilizers represent the primary protocol for logical qubit maintenance, and the CCU oversees these protocols, which include monitoring physical qubits, decoding errors on physical qubits [13], and generating conditional sequences of pulses. (c) An expanded block diagram of the qubit waveform generator (blue box in (a)) used for this manuscript.

FIG. 2. Instruction and waveform memory requirements for each of the calibration sequences, QCVV experiments, and RB experiments, using nominal pulse widths of 42.67 ns for single qubit gates, 71.1 ns for the shortest CR pulse width, and 711.1 ns for the longest CR pulse width (marked "Slow" in the data labels). The X and Y axis units are memory size in bytes, on a logarithmic scale. The CryoCMOS memory limits of 32KB for instructions and 20KB for waveforms are denoted by the dotted square. '1Q' in the data labels identifies single qubit calibrations and RB experiments, '2Q' or '2Q Slow' identifies two qubit calibrations and RB experiments, and 'QCVV' identifies the characterization experiments (T1, T2, T2 star and CPMG). For calibrations, the last string identifies the type of calibration, encoded as [ R=Rough | F=Fine ] [ A=Amplitude | F=Frequency | D=Drag | P=Phase ]. For RBs, the last string specifies the errors per Clifford, and the corresponding marker identifies the memory size needed to reach P(1)=0.49 for the error rate. WS stands for wideband spectroscopy, SRF stands for super rough frequency calibration, and HTE for Hamiltonian Tomography with Echoed CR. Vast differences in the memory requirements are evident for these different experiments. For instance, CR amplitude, CR phase, spectroscopy and HTE require large waveform memory, while RB, CPMG, spectroscopy and HTE require very large instruction memory. The colors indicate the degree of difficulty to accommodate the CryoCMOS memory limitations. Experiments that fall in the 'No Change' zone (white) run without any modifications. 'Parameter Change' requires reducing and/or carefully recalculating the experiment's parameters such as number of steps and range (lower resolution / narrower range). 'Rewrite/Substitute' requires rewriting the experiment itself and/or substituting pulse types, to devise an equivalent experiment that fits better in memory by exploiting CryoCMOS core features. 2Q HTE is in this category. 'Hardware Assist' (purple) identifies experiments that may require changes to the current CryoCMOS hardware and/or instruction set. 1Q and 2Q RB with lower error rates fall in this category, requiring further innovation going forward.

FIG. 3. (a) The standard waveform used for cross-resonance Hamiltonian tomography experiments. (b) Illustration of the waveform partitioning used for memory reduction of pulses that were too long to be stored in waveform memory.

FIG. 4. (a) Block diagram of the experimental setup. The cryoCMOS payload (CP) is mounted to the T = 4K plate of a dilution refrigerator and drives control signals down to a pair of transmons in the qubit payload (QP) on the mixing chamber (MXC) plate at 10 mK to perform single-qubit and two-qubit cross-resonance gates. Between the CP and the QP there is 22 dB of cold attenuation, a Mini Circuits VLF 5500 low-pass filter, a ferrite isolator, and a directional coupler which combines the control and the readout signals together. The DC bias supplies, reference currents, clock frequencies, local oscillator (LO), readout electronics, and FPGA are all located at room temperature. (b) Micrograph of the CMOS chip highlighting the digital and analog sections of the two AWG channels. (c) The CMOS chip bonded to a laminate with NPO ceramic decoupling caps, which sits inside a pogo-pin socket. The chip is thermally anchored to T = 4K through a Cu backing plate on the lid of the socket. (d) Power dissipation of the cryo-CMOS chip measured while under active control, for each sub-component of the chip, and the passive heat load due to wiring from 50K to 4K. Including wiring, the total power dissipation per control channel is 27.27 mW. The reported powers are extracted from the on-chip supply voltage measured by a sense line, and the current sourced by the power supply. Power dissipation due to fridge wiring is calculated using cryogenic material models.

FIG. 5. Calibration data for single-qubit (1Q) and two-qubit (2Q) waveforms and the corresponding pulse sequences. Where relevant, dashed vertical lines indicate optimal values extracted from calibration. (a) 1Q Rabi measurement to tune the π-pulse amplitude, optimally at the maximum of the curve. (b) 1Q Ramsey measurement to tune the qubit frequency. Playing either an X(π/2) (blue) or Y(π/2) (purple) pulse as the second pulse in the sequence yields two curves with a relative phase of π/2. The period of the curve(s) determines how far the drive frequency is offset from the qubit's transition frequency. (c) Derivative Removal by Adiabatic Gate (DRAG) calibration to add a derivative-of-Gaussian quadrature component to the pulse shape. The π-pulses are repeated N times within a pulse sequence for different values of the DRAG parameter. Each sequence yields a curve with a different period. The optimal DRAG parameter is the collective minimum of the different curves. (d) Hamiltonian tomography measured as a function of the cross-resonance (CR) pulse width. The |Z| state of the target qubit is measured after projecting into ⟨X⟩, ⟨Y⟩, ⟨Z⟩. These measurements are performed with the control qubit prepared in either the |0⟩ or |1⟩ state. The oscillations on the target are fit to a Hamiltonian, which can be used to extract device parameters and provide information about the optimal pulse width and the extent of errors such as IY. By computing the Bloch vector |R⃗|, one can extract the optimal CR pulse length at the curve's first minimum. The qubit-qubit coupling was extracted to be J = 2.7 MHz. (e) CR amplitude calibration, optimally at the first maximum. (f) CR phase calibration, optimally at the first maximum.

FIG. 6. RB data for single-qubit experiments. (a) The number of instructions increases linearly with Clifford count, with 36.94% of the instructions coming from the last RB data point. A single-qubit RB experiment requires 1.71 instructions per Clifford, which is extracted from the slope of the plotted line. (b) Single-qubit RB data for Gaussian widths of 28.4 ns and 113.8 ns, respectively. The longer gate has more error, ϵ_1Q = 0.0032, but converges more quickly, requiring fewer instructions. The shorter gate has less error, ϵ_1Q = 0.0008, but requires longer Clifford sequences to measure and thus more instructions. The observed decay does not converge to 0.5, indicating leakage outside of the computational basis [39]. For example, leakage into the |2⟩ state gives rise to IQ counts that are different from |0⟩ and |1⟩. This yields a measurement result of the form V0·p0 + V1·p1 + V2·p2, which converges above 0.5 without proper binning of the higher excited states. We believe this effect to be caused by spurious spectral content such as LO leakage.

FIG. 7. RB data for two-qubit experiments. (a) Two-qubit RB is more demanding on the processor, requiring 17.51 instructions per Clifford, with 18.41% of the instructions coming from the last data point. For both RB experiments, a small fraction of the total number of instructions comes from pre-sequence calibration, buffering, and pulse idle times, and contributes to the IPC. (b) Two-qubit RB measurements for CR pulse widths of 71.1 ns and 213.3 ns, respectively. The shorter gate length has less error, ϵ_2Q = 0.014 compared to ϵ_2Q = 0.037, but requires more computational overhead in order to measure with precision using traditional RB. Error bars for RB experiments are averaged over ten rounds of the same random seed.

FIG. 8. Two-qubit RB data for different single-qubit and two-qubit gate lengths. All data and error bars are averaged over three different RB experiments performed at each gate length. (a-b) The random Clifford gate sequences are stored in instruction memory, and the memory demands are plotted as a percentage of fullness. Instruction memory usage fluctuates because random sequences may have slightly different lengths. (c-d) Pulse definitions for the 1Q and 2Q gates are stored in waveform memory, which is shown to increase linearly with gate length and does not increase the number of instructions. (e) Single-qubit RB measured on each qubit individually and both qubits simultaneously, while sweeping the Gaussian pulse width. The individual RB error is modeled with a Hamiltonian simulation assuming 83.1 kHz of Z-rotation, consistent with oscillations observed in the rotary echo experiment. For simultaneous RB, each qubit is measured and the average error is reported. The simultaneous error is believed to be due to an increase in quantum cross-talk from ZZ, and classical cross-talk from the CMOS chip. (f) Two-qubit RB of an echoed cross-resonance gate as a function of the width of the cross-resonance pulse. Faster gates are observed to have reduced error, consistent with a reduction in decoherence errors; however, simulations reveal qubit coherence is not the leading source of error. Using a parameterized two-qubit Hamiltonian, the additional error was modeled by assuming an amplitude dependent Z-error on the target qubit, and a constant Z-error on the control qubit.

FIG. 9. (a) Measurement sequence performed to extract quasi-static amplitude noise. The pulse amplitude calibrations were repeated consecutively, and the X(π/2) and X(π) calibrations were interleaved. The gate length t_g was the same for X(π/2) and X(π), and the amplitude was varied. For each calibration, the amplitude coefficient was observed to fluctuate, consistent with a normal noise distribution. (b),(c) The percent difference of the pulse amplitude coefficients plotted for X(π/2) and X(π), respectively. The time series of coefficients are binned, fit to a normal distribution, and projected onto the panel to the right. The noise was measured for three different experimental configurations: Run 1 was the standard experiment, Run 2 had additional filtering on the RT supply lines and a shorter t_g, and Run 3 had increased on-chip attenuation and a longer t_g. The pulse amplitude varied with t_g. (d) The widths of the normal distributions are plotted as a function of the amplitude coefficient. The relationship between the quasi-static noise and pulse amplitude is shown to follow a 1/x dependence of the form f(x) = 0.00041/x + 0.0007. This result indicates the SNR improves for larger DAC amplitudes. (e) Simulated gate error as a function of the quasi-static amplitude noise σ, for different CR pulse widths. The error is fitted to a quadratic heuristic model of the form c_0 + c_1 ∆_amp + c_2 ∆_amp². Coefficients are listed in Table II. (f) Measured 2Q gate error as a function of CR pulse width, along with the simulated error due to the observed amplitude noise. Modeling implies amplitude noise is not the leading source of error.

FIG. 10. (a) A pulse sequence for a rotary echo experiment. (b) Rotary echo measurement on the control qubit shows oscillatory behavior rather than the expected exponential decay. Data are fit to a master equation simulation that includes T1, T2, and a constant Z. (c) Measured amplitudes of the most prominent spurious peaks observed in a spectrum analyzer. The LO was set to 5 GHz, with a sideband frequency of 250 MHz. The cryogenic measurements were performed at 5K in a ⁴He closed-cycle cryostat prior to loading into a dilution refrigerator for qubit testing. The same bias conditions for the CMOS chip were used for qubit measurements. Channel 1 was connected to the control qubit, and channel 2 was connected to the target qubit. Here channel 2 is observed to have more spurious content, which is consistent with observed behavior on the control qubit. The additional spurious content Stark shifts the qubit, which shifts the energy levels and gives rise to an always-on Z error on the target qubit.

FIG. 11. (a) Measured and simulated two-qubit EPG as a function of the CR pulse width. Simulated error is shown for different amounts of Z-noise on the target qubit, while the Z-noise on the control qubit is held constant at the measured 85 kHz. The difference in slopes between measured error and simulated error implies a constant Z-noise model is not sufficient for describing the observed errors. A best fit is obtained by introducing an amplitude dependent Z-noise on the control qubit, with the Z-noise on the target qubit fixed to 145 kHz. We note that the amplitude dependent Z-noise on the control and the constant Z-noise on the target are not explained by the experimental data that was collected. (b) The simulated Z-noise as a function of the CR pulse amplitude. The observed EPG was best captured by assuming Z-noise that varies linearly with pulse amplitude.

TABLE II. Fit parameters for the heuristic gate error model that assumes quasi-static amplitude noise. The fitting routines were applied to numerical results from Hamiltonian simulations using measured device parameters. The quadratic behavior is consistent with "theta squared" errors that arise from over/under angle rotations.