Modeling Noisy Quantum Circuits Using Experimental Characterization

Noisy intermediate-scale quantum (NISQ) devices offer unique platforms to test and evaluate the behavior of non-fault-tolerant quantum computing. However, validating programs on NISQ devices is difficult due to fluctuations in the underlying noise sources and other non-reproducible behaviors that generate computational errors. Efficient and effective methods for modeling NISQ behaviors are necessary to debug these devices and develop programming techniques that mitigate against errors. We present a test-driven approach to characterizing NISQ programs that manages the complexity of noisy circuit modeling by decomposing an application-specific circuit into a series of bootstrapped experiments. By characterizing individual subcircuits, we generate a composite model for the original noisy quantum circuit as well as other related programs. We demonstrate this approach using a family of superconducting transmon devices running applications of GHZ-state preparation and the Bernstein-Vazirani algorithm. We measure the model accuracy using the total variation distance between predicted and experimental results, and we find that the composite model works well across multiple circuit instances. In addition, these characterizations are computationally efficient and offer a trade-off in model complexity that can be tailored to the desired predictive accuracy.


I. Introduction
Quantum computing is a promising approach to accelerate computational workflows by solving problems with greater accuracy or using fewer resources as compared to conventional methods [6,34,42,49]. Testing and evaluation of early applications on experimental quantum processing units (QPUs) is now possible using prototypes based on superconducting transmons [1,18,19,41] and trapped ions [17,24,36,46] among other technologies. Although these QPUs lack the fault-tolerant operations required for known computational speed ups, they offer the opportunity to understand the behaviors of noisy quantum computing [40].
Noisy, intermediate-scale quantum (NISQ) devices have enabled a wide range of early application demonstrations [1,15,23,25,30,35], but validating program performance in the presence of non-reproducible device behaviors remains a fundamental challenge. NISQ devices are characterized by noisy and erroneous operations, where gate characterizations often change in time and with the nature of the program being implemented [44,53]. The experimental characterization of individual gates has relied on high-fidelity physics models for the underlying devices, with common methods including quantum state tomography (QST) [28], quantum process tomography (QPT) [9,39], gate set tomography (GST) [5], and randomized benchmarking (RB) [21,26,33]. Physics-driven characterizations offer valuable insights into the underlying noise and errors that can inform the design of new devices and control pulses. However, translating from gate-level characterizations to circuit-level applications is typically resource intensive because these methods often scale exponentially with the size of the qubit register to be characterized [16].
As NISQ applications evolve toward deeper and wider quantum circuits, characterization methods must also extend to these larger scales. There is also a growing need for characterization techniques that can be executed swiftly and repeatedly to provide context-specific characterization data. Resource-intensive, physics-driven gate characterization techniques are not a scalable solution to characterizing devices and applications which are rapidly increasing in size and generally do not allow for a high level of dynamic tuning. Quantum circuit characterization methods may provide effective models of device behaviors that are efficient to generate and easy to interpret by a supporting programming environment, e.g., a compiler [7,14,48]. In particular, the validation of application behavior will require debugging methods and programming techniques that support mitigating computational errors in quantum circuits [11,20]. Effective models of noisy gates and circuits have already informed robust programming methods that lead to increased application performance [31,37,52], but a general method for composing noisy quantum circuit models is still needed.
Here, we introduce methods for generating effective models for noisy quantum circuits in NISQ devices derived from experimental characterization. Our approach is based on modeling application-specific circuits using a suite of characterization tests that build a representative set of noisy subcircuit models. We compose noisy subcircuit models to generate noise models for more complicated circuits at larger scales, and we test the fidelity of the resulting model against experimental data. We show how to iteratively adjust the composite model selected for a noisy application circuit by comparing performance of the predicted behavior against application observations using the total variation distance (TVD) [31]. The iterative and flexible nature of this modeling approach is demonstrated using applications based on GHZ-state preparation and the Bernstein-Vazirani algorithm for search. We develop model composition for the fixed-frequency superconducting transmon devices available from IBM, though we propose these techniques may extend to other NISQ devices as well.
This characterization method is a coarse-grained yet fast approach to characterization which scales linearly with the number of elements in the device, e.g. qubits and couplings. Furthermore, it allows for dynamic tuning of characterization data to every execution of a particular application and can be tailored to yield desired information, e.g. development of a noise model using depolarizing parameters or performance of an entangling gate creating an equal superposition. The tradeoff compared to physics-driven characterization techniques is less total information received, which in some cases may result in a lower accuracy in the final effective description of the device.
We present the steps in the modeling methodology in Sec. II, followed by a series of examples using the case of n-qubit GHZ states in Sec. III. In Sec. IV, we present results from experimental characterization for the GHZ state on NISQ QPUs and discuss the role of model selection for characterization accuracy. In Sec. V, we show the performance of our noise models composed from this characterization on the GHZ state experimental results. In Sec. VI, we apply these models to the case of the n-bit Bernstein-Vazirani algorithm, and we offer final conclusions in Sec. VII.

II. Model Selection Methodology
We begin by detailing the coarse-grain modeling methodology before providing specific examples of its implementation. Consider the input for noisy circuit modeling to be an idealized quantum circuit C that is expressed in the available instruction set architecture (ISA) for a given QPU [6]. While the gates defined by the ISA may not be directly implemented within the QPU, the representation used for the ideal circuit will define the operators available for gate characterization. The input circuit is decomposed into a set $S(C) = \{S_i\}$ of idealized subcircuits $S_i$ that each represent a subsection of the total area of circuit C. The area of C is defined by its width (register size) and depth (length of the operation sequence). The area of each subcircuit $S_i$ is defined by the selected subcircuit width taken from C and the longest depth of the selected gate sequence. For example, a circuit C composed of one- and two-qubit gates as shown in Fig. 1 may be decomposed into a set S of two-qubit subcircuits which have depth of two gates and width of two qubits. Circuit decomposition is not unique, and a given decomposition is selected based on tradeoffs in the cost of characterizing each subcircuit, prior knowledge of the suspected device noise and error processes, and any potential structure or symmetry in the circuit design. A complete characterization requires every gate and register element within the input circuit to be included in at least one subcircuit. In general, the selected subcircuits need not be disjoint. The ability to tune the decomposition enables coarse-graining of the noisy circuit model, which is formed by composing the results from subcircuit characterization.
Next we test each subcircuit to characterize the noise present within the coarse-grained area. Each test circuit specifies an idealized outcome based on the input state and gate sequence for the subcircuit instance. We select test circuits to be informative yet limited in both number and circuit dimensions in order to increase efficiency and improve scalability. To test a subcircuit $S_i$, we may select the full subcircuit $S_i$ provided the ideal outcome is known, but we may select additional test circuits to gain more information and refine our noise models. The set of test circuits $T = \{T_i\}$ is therefore at least as large as S and generally larger. For example, given a two-qubit subcircuit $S_i$ consisting of a one-qubit gate followed by a two-qubit gate, we may select two test circuits: the first consisting of the one-qubit gate and the second consisting of both gates.
The process for selecting test circuits $T(S) = \{T_i\}$ for each $S_i$ follows a set of guidelines detailed below.
2. Generate measurement subcircuit $T_{\mathrm{meas}}$ consisting of initialization of $|\psi_j\rangle$ and measurement in B for each $q_j$. If $|\psi_j\rangle$ is unknown or more tests are needed, select or add the computational basis states $|0\rangle$ and $|1\rangle$. Additional input states may include superposition states such as $|\psi\rangle = (|0\rangle + |1\rangle)/\sqrt{2}$ or randomly generated input states $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$.
3. Identify the set $g = \{g_k\}$ of the gates or gate compositions of G for which the expected outcomes may be calculated for a given input.
4. Select set $g'$ for testing. Elements of $g'$ are gates from g or compositions of gates from g which represent sequences of increasing depth from subcircuit $S_i$. The selection of $g'$ may be based on tradeoffs in the cost of characterization or informed by prior knowledge of expected noise processes or iterative refinement, similar to subcircuit selection.
5. For each element $g'_k \in g'$, generate a circuit $T_k(g'_k)$ which consists of initialization of $|\psi_j\rangle$, application of $g'_k$ to the $q_j$ identified from $S_i$, and measurement in B.
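The guidelines above can be sketched in a few lines of Python. This is only an illustrative sketch: the dictionary representation of a test circuit and the function name `build_test_circuits` are hypothetical, and choosing every prefix of the gate sequence as the tested set is one simple reading of "sequences of increasing depth", not necessarily the selection used in our experiments.

```python
def build_test_circuits(gate_sequence, input_states=("0",)):
    """Return test-circuit descriptions for one subcircuit.

    gate_sequence: ordered gate labels of the subcircuit, e.g. ["h", "cx"].
    input_states:  labels of the initial states to prepare (assumed basis states).
    """
    tests = []
    for state in input_states:
        # Measurement-only test: prepare the state and measure immediately.
        tests.append({"name": "meas", "init": state, "gates": []})
        # One test per prefix of the gate sequence, in order of increasing depth.
        for depth in range(1, len(gate_sequence) + 1):
            prefix = list(gate_sequence[:depth])
            tests.append({"name": "+".join(prefix), "init": state, "gates": prefix})
    return tests


# Hypothetical two-qubit subcircuit: a Hadamard followed by a cnot.
tests = build_test_circuits(["h", "cx"])
for t in tests:
    print(t["name"], t["init"], t["gates"])
```

Adding more entries to `input_states` corresponds to the refinement of adding initializations, at the cost of a proportionally larger test suite.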

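The set of test circuits is then $T = \{T_{\mathrm{meas}}, T_1, T_2, \ldots\}$, that is, the measurement subcircuit from step 2 together with the circuits $T_k$ generated in step 5.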
The implementation and execution of test circuits on a QPU generates a corresponding set of measurement observations. Each test circuit is executed multiple times to gather statistics from the distribution of results R i that characterize subcircuit T i . The i-th characterization is denoted as H i = (T i , R i ) and the set of all characterizations is given as H. The number of characterizations is fixed by the number of test circuits |T |, while the number of measurement observations acquired for each test circuit is set by the sampling parameter N s . Assuming the same sampling for all tests, then there are a total of N s |T | measurement observations, i.e., experiments, required for H.
The results of experimental characterization are used to formulate concise approximate models of the subcircuits' observed behaviors. We model each noisy subcircuit as the idealized subcircuit followed by a quantum channel that accounts for the noise [3]. Let the noisy subcircuit model $M_i = M(S_i, p_i)$ representing subcircuit $S_i$ depend on model parameters $p_i$. We estimate the channel parameters using the characterization $H_i$, where the method of parameter estimation will vary with the selected model. Parameter estimation may use either direct or optimization-based methods. For example, least-squares estimates may be used to estimate parameters from noisy measurement observations by minimizing the residual model error.
We quantify the error in the resulting models using the total variation distance (TVD) [31], which is defined as
$$d_{tv}(H_i, M_i) = \frac{1}{2} \sum_k \left| r^{(H_i)}(k) - r^{(M_i)}(k) \right|, \tag{1}$$
where $r^{(H_i)}(k)$ is the probability of the k-th outcome of the test circuit $T_i$ and $r^{(M_i)}(k)$ is the corresponding probability predicted by the noisy circuit model. The TVD vanishes as the predictions of the model become more accurate in reproducing the observed results and reaches a maximum of unity when the sets are completely disjoint. After estimating the model parameters $p = \{p_i\}$ for all subcircuits, the corresponding noisy circuit model $M(C, p)$ for the input circuit C is composed. The method of composition of the noisy subcircuit models is paired with the decomposition method to ensure a consistent representation of the original input circuit. In the examples below, we consider modeling methods based on independent noisy subcircuit models that permit separable composition-decomposition methods and defer discussion of non-separable models, e.g., context-dependent noise, to Sec. VII.
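As a concrete illustration of the total variation distance, the following minimal sketch computes it for two distributions stored as Python dictionaries. The function names and the counts are hypothetical and chosen only for illustration; outcomes missing from one distribution are treated as having zero probability.

```python
def total_variation_distance(observed, predicted):
    """TVD between two outcome distributions given as {outcome: probability}."""
    outcomes = set(observed) | set(predicted)
    return 0.5 * sum(
        abs(observed.get(k, 0.0) - predicted.get(k, 0.0)) for k in outcomes
    )


def counts_to_probabilities(counts):
    """Convert raw measurement counts into observed probabilities."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}


# Hypothetical counts for a two-qubit test circuit (8,192 shots in total).
experiment = counts_to_probabilities({"00": 3900, "01": 200, "10": 192, "11": 3900})
# Noiseless prediction for a Bell-state test circuit.
model = {"00": 0.5, "11": 0.5}

d = total_variation_distance(experiment, model)
print(d)  # 392/8192, roughly 0.0479
```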
Final selection of the noisy circuit model is then guided by the accuracy with which the composite model reproduces the performance of the circuit C on the QPU. For clarity, we define the actual executed circuit $A = (C, R_c)$ with $R_c$ the recorded results, and we measure the accuracy of the noisy circuit model as $d_{tv}(A, M)$. The desired TVD sets an upper bound on the threshold for model accuracy. If this user-defined threshold is not satisfied, selection of the noisy subcircuit models is revisited. This iteration may include refinement of the noisy subcircuit models to improve the accuracy of each $M_i$ or redefinition of the circuit composition-decomposition methods to manage the trade-offs in modeling complexity and accuracy. The former requires repeated post-processing analysis of the characterization H, whereas the latter requires additional characterization testing. In either case, model selection continues until the threshold has been met. Once the accuracy threshold has been satisfied, noisy circuit modeling is complete.
The noisy subcircuit models can then be tested for robustness in predicting the expected outcome from both the input circuit and other circuits executed on the characterized device. We again use TVD to measure the accuracy for selected models to characterize the behavior of other application circuits within the same QPU context.
We summarize the complete procedure as follows.

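1. Specify the idealized input circuit C in the instruction set of the QPU.
2. Decompose the circuit into set $S = \{S_i\}$ of subcircuits.
3. Select set of test circuits $T = \{T_i\}$ which define an input state and ideal outcome for each element in S.
4. Propose a noisy subcircuit model $M_i = M(S_i, p_i)$ for each subcircuit.
5. Implement and execute T on QPU to generate experimental characterizations $H_i = (T_i, R_i)$ using results $R_i$ returned from QPU.
6. Using set of characterizations $H = \{H_i\}$, fit noise parameters $p_i$ based on calculated expected probabilities for each $M_i$.
7. Compose the noisy circuit model $M(C, p)$ and compute $d_{tv}(A, M)$ against the results of the executed circuit.
8. If $d_{tv}$ is not at threshold, return to 2, apply refinements to 2, 3, and 4, and continue to 7 until threshold is met.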
For step 8, refinements to step 2 include additional elements selected from the set g, addition of compositions of elements in g such that the test components are larger, or addition of elements to the selected test set $g'$ not explicitly represented in G. Refinements to step 3 include additional initializations as test circuits. Refinements to step 4 include additional noise model parameters $p_i$ or different noise channels to define M.
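The refinement loop in the procedure above can be sketched as a search over candidate models of increasing complexity. This is a simplified illustration under stated assumptions: each candidate is represented by a hypothetical `predict` callable that returns the modeled outcome distribution, the candidates are fixed in advance rather than generated by new characterization runs, and all numbers are invented for the example.

```python
def tvd(p, q):
    """Total variation distance between two {outcome: probability} maps."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def select_model(candidates, observed, threshold):
    """Try candidate models in order and stop at the first that meets the threshold.

    candidates: list of (label, predict) pairs ordered from simplest to most
                refined; predict() returns {outcome: probability}.
    Returns (label, distance) for the accepted model, or for the best model
    tried if none meets the threshold.
    """
    best = None
    for label, predict in candidates:
        distance = tvd(observed, predict())
        if best is None or distance < best[1]:
            best = (label, distance)
        if distance <= threshold:
            return label, distance
    return best


# Hypothetical observed distribution and two candidate models.
observed = {"00": 0.46, "01": 0.04, "10": 0.04, "11": 0.46}
candidates = [
    ("noiseless", lambda: {"00": 0.5, "11": 0.5}),
    ("refined", lambda: {"00": 0.47, "01": 0.03, "10": 0.03, "11": 0.47}),
]
label, distance = select_model(candidates, observed, threshold=0.05)
print(label, distance)
```

In practice each refinement may require new test circuits or a re-fit of the parameters, which this sketch hides inside the `predict` callables.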

III. Application to GHZ States
We next illustrate the methodology of Sec. II using the example of a GHZ-state preparation and measurement circuit. We generate noisy quantum circuit models for this application for various circuit sizes executed on the IBM poughkeepsie QPU, which has a register and layout as shown in Fig. 2. All data for characterization tests and applications is collected in a single job sent to poughkeepsie, a process which typically required under 30 minutes of execution time after queuing. As the poughkeepsie device is periodically calibrated, our experimental demonstrations ensure that all data is collected within one calibration window to preserve the QPU context. The software implementation of our examples below as well as all experiment and simulation details such as subcircuits and noise models is available publicly [29].
We consider the example of preparing the n-qubit GHZ state
$$|\mathrm{GHZ}_n\rangle = \frac{1}{\sqrt{2}}\left(|0_1 0_2 \cdots 0_n\rangle + |1_1 1_2 \cdots 1_n\rangle\right),$$
where the subscript denotes the qubit and the schematic representation of the input circuit C is given in Fig. 3. The instruction set for this circuit is limited to the one-qubit Hadamard (H) and two-qubit controlled-NOT (cnot) unitaries along with the initialization and readout gates acting on a quantum register of size n. We study this example for a range of register sizes from n = 2 to 20 by composing a noisy circuit model that represents GHZ-state preparation on a QPU based on superconducting transmon technology [8,47]. This example demonstrates the unique features of superposition and entanglement using a circuit depth that is within the capabilities of the NISQ devices [13,55].
We decompose the GHZ-state preparation circuit from Fig. 3 into a set of subcircuits S based on the procedure detailed in Sec. II. In this example, we identify a series of overlapping 2-qubit subcircuits for coarse-graining the n-qubit state preparation. Spatial variability in the device noise motivates a decomposition based on each register element $q_i$. We extend these subcircuits to generate a corresponding set of test circuits T by identifying the set g of candidate gates and gate compositions and selecting from it the subset $g'$ used for testing. The expected outcomes of these particular test circuits are simple to calculate from the truth tables for each operator [38]. We examine the models using these test circuits.
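The decomposition into overlapping two-qubit subcircuits can be illustrated with a small sketch. It assumes, for illustration only, that the GHZ circuit applies a Hadamard to qubit 0 followed by a linear chain of cnot gates between neighboring qubits; the actual gate layout is the one given in Fig. 3 and depends on device connectivity. The gate-list representation and function names are hypothetical.

```python
def ghz_circuit(n):
    """Gate list for n-qubit GHZ preparation: H on qubit 0, then a cnot chain."""
    gates = [("h", (0,))]
    gates += [("cx", (q, q + 1)) for q in range(n - 1)]
    return gates


def two_qubit_subcircuits(gates):
    """Group gates into overlapping two-qubit subcircuits keyed by qubit pair."""
    subcircuits = {}
    for name, qubits in gates:
        if len(qubits) == 2:
            subcircuits.setdefault(qubits, []).append((name, qubits))
    # Attach each one-qubit gate to every pair containing its qubit. Placing
    # it first is a simplification that is valid here because the Hadamard
    # precedes all cnot gates in this circuit.
    for name, qubits in gates:
        if len(qubits) == 1:
            for pair in subcircuits:
                if qubits[0] in pair:
                    subcircuits[pair].insert(0, (name, qubits))
    return subcircuits


def ideal_ghz_distribution(n):
    """Ideal measurement outcomes: all zeros or all ones with equal probability."""
    return {"0" * n: 0.5, "1" * n: 0.5}


subs = two_qubit_subcircuits(ghz_circuit(4))
for pair, sub in subs.items():
    print(pair, sub)
```

Neighboring subcircuits share a register element, which is what makes the decomposition overlapping; every gate of the chain appears in at least one subcircuit.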

A. Noisy Measurement Model
We begin by characterizing the initialization and measurement test circuits, which are necessary for modeling noisy unitary gate behavior. The measurement process for each register element discriminates an analog signal to generate a classical bit [32], and errors in signal discrimination may lead to the wrong value. Characterization of measurement records the number and type of outcomes observed for each initial state. We characterize each register element with respect to both the 0 and 1 output states. The leading errors in the observed results occur when the j-th register element maps an expected output value to its complement, i.e., 0 → 1 and 1 → 0.
We model measurement of the j-th element as a binary process subject to errors which act on the post-measurement classical bit string, and we consider two models for the measurement error process: symmetric readout noise (SRO) and asymmetric readout noise (ARO). The SRO model is defined by a single parameter $p_{sro}$ that specifies the probability for a bit to flip, and we define a test circuit to characterize this process as measurement immediately after initialization to state $|0\rangle$. We directly estimate the value of $p_{sro}$ from the number of errors when preparing this computational basis state as $p_{sro} = r(1)$, where r(k) is the observed probability of k errors recorded. This model implicitly delegates initialization errors to the readout error model. The SRO model is developed by the test circuit $T = \{T_{\mathrm{meas}}(|0\rangle)\}$.
By contrast, the ARO model uses two parameters: $p_0$ for the probability of error in readout of $|0\rangle$ and $p_1$ as the probability of error in readout of $|1\rangle$. The ARO model therefore represents a refinement of both the noise model parameters $p_i$ and the test circuit suite T. We may estimate $p_0$ using the same test circuit above, but we must extend the characterization to preparation and measurement of $|1\rangle$ to estimate $p_1$. These additional test circuits will require inclusion of the single-qubit X gate, and we also add a test circuit for the XX operation of two successive X gates applied to a single qubit. The latter reproduces the initial state $|0\rangle$, enabling the error in readout of state $|1\rangle$ to be isolated from the error associated with the X gate. The ARO model is therefore defined by the test circuits $T = \{T_{\mathrm{meas}}(|0\rangle), T_X(|0\rangle), T_{XX}(|0\rangle)\}$. We model the test circuits for the ARO process using an isotropic depolarizing channel parameterized by $p_x$ to describe noise in the X gate, with the channel expressed in terms of the Pauli operators I, X, Y, and Z.
Characterization of the ARO model yields an overdetermined system of equations relating the four experimentally observed probabilities $r^{(X)}(0)$, $r^{(X)}(1)$, $r^{(XX)}(0)$, and $r^{(XX)}(1)$ to the parameters $p_0$, $p_1$, and $p_x$. Of these parameters, only the latter two are unknown since $p_0$ is determined by the same method outlined above for $p_{sro}$. Because the experimental observations are related through $r^{(X)}(0) + r^{(X)}(1) = 1$ and $r^{(XX)}(0) + r^{(XX)}(1) = 1$, we select a system of two equations for each register element based on the observed probabilities $r^{(\cdot)}(0)$.
This system of equations is solved using the SciPy function fsolve, a root finder for nonlinear systems based on a modification of Powell's hybrid method [54].
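The direct readout estimates in this subsection can be sketched as follows. This is a simplified illustration, not the full ARO procedure: it estimates $p_1$ directly from a prepared $|1\rangle$ and therefore folds any X-gate error into the readout parameter, whereas the ARO model described above separates the two using the XX test circuit. The function names and counts are hypothetical.

```python
def estimate_readout_errors(counts_prep0, counts_prep1):
    """Direct estimates of readout error for one register element.

    counts_prep0: {"0": n, "1": n} observed after preparing |0>.
    counts_prep1: the same after preparing |1> (here via an X gate whose own
                  error is NOT separated out in this simplified sketch).
    """
    shots0 = sum(counts_prep0.values())
    shots1 = sum(counts_prep1.values())
    p0 = counts_prep0.get("1", 0) / shots0  # |0> read out as 1
    p1 = counts_prep1.get("0", 0) / shots1  # |1> read out as 0
    return p0, p1


def apply_readout(ideal, p0, p1):
    """Map ideal single-qubit outcome probabilities to noisy ones."""
    r0, r1 = ideal.get("0", 0.0), ideal.get("1", 0.0)
    return {"0": (1 - p0) * r0 + p1 * r1, "1": p0 * r0 + (1 - p1) * r1}


# Hypothetical counts from 8,192 shots per test circuit.
p0, p1 = estimate_readout_errors({"0": 8028, "1": 164}, {"0": 410, "1": 7782})
p_sro = p0  # the SRO model uses the |0> estimate for both outcomes
noisy = apply_readout({"0": 0.5, "1": 0.5}, p0, p1)
print(p0, p1, noisy)
```

Setting `p1 = p0` in `apply_readout` recovers the symmetric model, which makes the difference between SRO and ARO a one-parameter refinement.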

B. Noisy Subcircuit Models
Test circuits for characterizing noisy subcircuits generate results that include measurement noise. We use the noisy measurement model above to account for these behaviors when modeling the results from test circuits. For the SRO and ARO models discussed above, this directly estimates the probabilities expected to be observed for each register. We use this procedure when discussing the characterization below.
We first characterize the subcircuit representing the Hadamard operation. The test circuit for a single Hadamard is defined with respect to the expected values for input states drawn from the computational basis, which yield a uniform superposition of binary results upon ideal measurement. We also use even-parity sequences of Hadamard gates as a second test to estimate noise in the subcircuit. These test circuits $T = \{T_H(|0\rangle), T_{HH}(|0\rangle), T_{4H}(|0\rangle), T_{6H}(|0\rangle), \ldots, T_{nH}(|0\rangle)\}$ are used to characterize the Hadamard gate to yield $M_H(T, p_H)$.
We define test circuits for the cnot operations that mirror the subcircuits used in the target application. For GHZ-state preparation, these are based on characterization of Bell-state preparation. The test circuit specification shown in Fig. 4 produces the idealized result of a uniform distribution over perfectly correlated binary values. These test circuits may be defined across all pairings of register elements as represented by Fig. 3. In particular, additional cnot test circuits may be added to the set $g'$ from the set g, and additional cnot test circuits for couplings not explicitly in G may be added as well. For convenience, we will denote the Bell-state preparation subcircuit as $U^{\mathrm{BS}}_{(j,k)} = U^{(\mathrm{cnot})}_{(j,k)} H_{(j)} |0_j, 0_k\rangle$. The noisy test circuits for Bell-state preparation are modeled by a pair of identical, independent depolarizing channels. Each channel, together defined as $DP_{j,k} = DP_j \otimes DP_k$, is parameterized by $p_{cnot}$, which represents the probability of a depolarizing error determined independently for each qubit in the two-qubit cnot gate. We therefore use the test circuit $T = \{T^{\mathrm{BS}}_{(j,k)}(|0_j, 0_k\rangle)\}$ to compose model $M_{cnot} = M(T, p_{cnot})$.

FIG. 4. The test circuit for characterizing the cnot operation acting on register elements $q_j$ and $q_k$. This test prepares the two-qubit Bell state as an instance of n = 2 in Fig. 3.
The probability of observing bits a and b is given by
$$r_{j,k}(ab) = \mathrm{Tr}\left[\Pi_{ab}\,\rho_{j,k}\right],$$
where $\rho_{j,k}$ denotes the state prepared on qubits j, k, the operator $\Pi_{ab}$ projects onto the state $|a, b\rangle$, and the resulting trace yields the probability of the ideal measurement. The probabilities expected from the noisy Bell-state subcircuit on qubits j, k with ideal measurement then follow from applying $DP_{j,k}$ to the ideal Bell state. Errors in readout transform these probabilities according to the noisy process, which may be either the SRO or ARO model. For example, the probability following readout $s_{j,k}(00)$ under the ARO channel is given by
$$s_{j,k}(00) = (1 - p_0^j)(1 - p_0^k)\, r_{j,k}(00) + (1 - p_0^j)\, p_1^k\, r_{j,k}(01) + p_1^j (1 - p_0^k)\, r_{j,k}(10) + p_1^j p_1^k\, r_{j,k}(11). \tag{10}$$
From the system of four equations generated by the readout probabilities $s_{j,k}(cd)$, we use the method of least squares to estimate $p_{cnot}$. We minimize the sum of the squared residuals,
$$\sum_{cd} \left[ s_{j,k}(cd) - h_{j,k}(cd) \right]^2, \tag{11}$$
where each residual is defined as the difference between the modeled probability $s_{j,k}(cd)$ and the experimentally observed probability $h_{j,k}(cd)$ for each state result cd. The value $h_{j,k}(cd)$ is estimated from the counts of state cd on qubits j, k measured during a total number of experiments $N_s$. The value returned for $p_{cnot}$ is found using the SciPy fsolve function and bounded between 0 and 1 [54].
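The least-squares fit described above can be illustrated with a self-contained sketch. Two assumptions are made loudly here. First, the depolarizing parameterization: each qubit is taken to be replaced by the maximally mixed state with probability p, which is one common convention and may differ from the parameterization used in our analysis. Second, the optimizer: a bounded ternary search stands in for the SciPy routine so that the example needs only the standard library. All function names and numbers are hypothetical.

```python
def bell_probabilities(p):
    """Ideal-measurement probabilities for the noisy Bell state.

    Assumed convention: with probability p each qubit is independently
    replaced by the maximally mixed state.
    """
    u = (1.0 - p) ** 2
    same, diff = 0.25 * (1.0 + u), 0.25 * (1.0 - u)
    return {"00": same, "01": diff, "10": diff, "11": same}


def apply_aro(r, pj, pk):
    """Apply asymmetric readout error; pj and pk are (p0, p1) for qubits j, k."""
    def weight(true_bit, read_bit, p):
        # p[true_bit] is the probability that true_bit is read as its complement.
        return p[true_bit] if true_bit != read_bit else 1.0 - p[true_bit]

    s = {}
    for c in (0, 1):
        for d in (0, 1):
            s[f"{c}{d}"] = sum(
                weight(a, c, pj) * weight(b, d, pk) * r[f"{a}{b}"]
                for a in (0, 1)
                for b in (0, 1)
            )
    return s


def residual(p, h, pj, pk):
    """Sum of squared differences between modeled and observed probabilities."""
    s = apply_aro(bell_probabilities(p), pj, pk)
    return sum((s[cd] - h[cd]) ** 2 for cd in s)


def fit_p_cnot(h, pj, pk, tol=1e-10):
    """Minimize the residual over p in [0, 1] by ternary search.

    The residual is a convex quadratic in (1 - p)**2, which is monotone on
    [0, 1], so the residual has a single minimum on the interval.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if residual(m1, h, pj, pk) < residual(m2, h, pj, pk):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


# Synthetic check: generate "observations" from a known parameter and recover it.
pj, pk = (0.02, 0.05), (0.03, 0.06)
h = apply_aro(bell_probabilities(0.08), pj, pk)
p_fit = fit_p_cnot(h, pj, pk)
print(p_fit)
```

The term-by-term structure of `apply_aro` for the outcome 00 matches Eq. (10); the other three outcomes follow by the same pattern.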

IV. Experimental Characterization
In this section, we report on the results of experimental characterization and noisy circuit modeling of GHZ-state preparation using a QPU based on superconducting transmon technology developed by IBM. The IBM poughkeepsie device has a register of 20 superconducting transmon elements that encode quantum information as a superposition of charge states [27]. Microwave pulses drive transitions between the possible charge configurations and induce single-qubit gates. Coupling between register elements uses a cross-resonance gate that drives a mutual transition between transmons and therefore only occurs between two spatially connected elements [8].
The layout of the 20-qubit register in poughkeepsie at the time of data collection is shown in Fig. 2. A common edge in the connectivity diagram specifies those register elements that may interact through the cross-resonance operation. Individual registers are measured through coupling to a readout resonator, which results in a state-dependent change in the resonator frequency. Amplification of the readout signal then enables discrimination of the state using a quantum non-demolition measurement [10,18].
Circuits are sent to the backend where they are translated into the appropriate ISA. The ISA for poughkeepsie consists of the gates U 1 , U 2 , U 3 , CX, and ID [12]. The U 1 , U 2 , and U 3 gates are unitary rotation operators, of which U 1 is a "virtual" gate performed in software and U 2 and U 3 are performed in hardware. The identity gate ID is used as a placeholder to create a timestep since it does not alter a quantum state. CX represents the cnot gate [2]. These instructions are implemented using low-level hardware operations. For instance, the CX operator is implemented in hardware using a sequence consisting of cross-resonance gates and single-qubit rotation gates [12,43,50].
The poughkeepsie QPU is accessed remotely using a client-server interface. We employ the Qiskit programming language to specify the input circuit and test circuits for the GHZ-state preparation application [22]. These Pythonic programs are transpiled to the specifications and constraints of the backend, including ISA, connectivity layout, and register size. Additional inputs to the transpiler may include optimization protocols for minimizing circuit operations or noise levels. The transpiled programs are executed remotely on the poughkeepsie device, which returns the corresponding measurements along with job metadata.
We use a shot count of 8,192 for all of the circuits executed on poughkeepsie. The shot count is the number of times each circuit is individually executed, and it generates the distribution of output states from the input circuit. Therefore each probability estimated by experiment is given by $r(k) = C(k)/N_s$, where C(k) is the number of events observed for each measurement and $N_s$ is the shot count of 8,192. These measurements are subject to error due to sampling variability in drawing from the QPU distribution. We restrict our sample size to a single experiment of 8,192 shots to avoid introducing effects from drift in the poughkeepsie QPU. We use the standard deviation of these measurements to report error and statistical fluctuations, which is given by $\sqrt{p(1 - p)/N_s}$, where p is the binomial distribution probability parameter measured from experiment.
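The probability estimate and its binomial standard deviation can be computed as in the following minimal sketch. The function name and the example count are hypothetical.

```python
import math


def estimate_with_error(count, shots):
    """Return the estimated probability and its binomial standard deviation."""
    p = count / shots
    sigma = math.sqrt(p * (1.0 - p) / shots)
    return p, sigma


# Hypothetical outcome observed 4,096 times in 8,192 shots.
p, sigma = estimate_with_error(4096, 8192)
print(p, sigma)
```

The standard deviation is largest at p = 0.5, where for 8,192 shots it is roughly 0.0055, so differences in probability much smaller than this cannot be resolved from a single experiment.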
We characterize measurement of all register elements in poughkeepsie and analyze the results using the SRO and ARO models. The results for direct estimation of the ARO model parameters $p_0$ and $p_1$ are shown in Fig. 5. The results for the SRO model correspond with $p_{sro} = p_0$. From these results, we observe a large spatial variability in readout error as well as asymmetry per register element. The readout of state $|1\rangle$ is almost always more error-prone than readout of state $|0\rangle$.
The results of estimating the parameter $p_x$ for the depolarizing noise model of each X gate are shown in Fig. 6. From these results, we see spatial variability in the recovered error parameter. We observe one case of a negative error rate for qubit 17 recovered from direct estimation using Eqs. 6 and 7. Because an estimated error rate of zero is within the experimental error, this is most likely due to statistical fluctuations. However, it could also be attributable to inconsistencies in the error behavior for the test circuits such that the model cannot estimate a feasible parameter based on the results, or to errors for this register that are not well described by a depolarizing channel such that a different model may yield a better solution. All other error rates are relatively small, and we have therefore not investigated model refinement for this case because of the negligible contribution to the noise.
We next characterize the Hadamard gate. We characterize error rates using test circuits generated from long sequences of Hadamards acting on a single element. We observe small error rates which correspond on average to 0.1% error per gate. We attempted to model the Hadamard noise using a depolarizing channel, but it did not lead to a better TVD than using a noiseless model for the gate.
We also characterized gate error models based on unitary rotation noise in X, Y , and Z for the Hadamard gate which represents coherent errors. These characterizations did not yield a smaller TVD than using a noiseless model. Our choice to restrict characterizations to computational basis measurements significantly limits the achievable accuracy or effectiveness of this model. In general, such characterizations are not capable of identifying arbitrary coherent noise and are limited, e.g. only X and Y noise have an observable effect in the Z measurement basis. Additional test circuits could address this limitation at the expense of increased experiment count. For our purposes, we concluded that error rates associated with the Hadamard operation were negligible as this noise was 100 times smaller than the next leading gate error.
We next characterize the Bell-state preparation circuits for each pair of possible interactions shown in Fig. 2. We select the depolarizing noise model because it is a well-understood model for quantum noise that captures several different fundamental aspects of quantum behavior. We do not expect the depolarizing model to be a perfect fit to experimental data but this model provides a useful method to understand noise levels in the system and how noise from different components interacts. We use least-squares error estimation to find the value of depolarizing parameter p cnot that best fits the results while accounting for readout error as in Eq. (10). This approach yields more consistent results than solving each equation in the system explicitly and using a selection process to determine the final p cnot value from among these solutions which are often highly varied. The estimated parameter values are shown in Fig. 7. The magnitude of the error bars for the parameter estimations highlights the relative magnitude of gate noise to readout noise.
We test the accuracy of the noisy subcircuit models with estimated parameters from experimental characterization. For these tests, we use explicit numerical simulation of the quantum state prepared by each noisy subcircuit model. We estimate the measurement outcomes for these modeled circuits using the simulated quantum state, and we compare these simulated observables with the corresponding experimental observations from the poughkeepsie device. The accuracy of the noisy subcircuit model is quantified using the total variation distance (TVD) defined in Eq. (1).
Our simulations of the quantum state use Aer, a numerical simulator bundled into the Qiskit software framework. Aer simulates both noiseless and noisy quantum circuits, taking as input the same Qiskit programs sent to the poughkeepsie device. We constrain the simulator to the statevector simulation method. Within Aer, we specify the noise models using the error rates and noise operators of the depolarizing and readout channels as defined in Sec. III. Aer models gate noise using error functions parameterized by these error rates, which create noisy descriptions of gates for simulation. When a noisy simulation is run, these functions sample errors and inject them as operations within the circuit. We tailor the simulations to match the developed noisy subcircuit models. Each test case acquired $N_s$ samples in order to mimic the finite statistics from experimental characterization. We generate a number of simulation samples, each of 8,192 shots, to create a sampling distribution. We report the standard deviation of this distribution, which represents the error due to sampling variability in the simulation.
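The sampling-distribution step can be illustrated without a full circuit simulator. The sketch below is not the Aer workflow: it replaces the statevector simulation with direct sampling from a fixed model distribution, and the distributions, repetition count, and seed are hypothetical choices made only for illustration.

```python
import random
import statistics

def tvd(p, q):
    """Half the L1 distance between two distributions given as dicts."""
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))

def sample_distribution(probs, shots, rng):
    """Draw `shots` outcomes from `probs` and return the empirical frequencies."""
    outcomes = list(probs)
    weights = [probs[k] for k in outcomes]
    counts = dict.fromkeys(outcomes, 0)
    for k in rng.choices(outcomes, weights=weights, k=shots):
        counts[k] += 1
    return {k: n / shots for k, n in counts.items()}

def tvd_sampling_stats(model, reference, shots=8192, repetitions=20, seed=0):
    """Mean and standard deviation of the TVD over repeated finite-shot samples."""
    rng = random.Random(seed)
    values = [tvd(sample_distribution(model, shots, rng), reference)
              for _ in range(repetitions)]
    return statistics.mean(values), statistics.stdev(values)

# Hypothetical model prediction and reference distribution.
model = {"00": 0.47, "01": 0.03, "10": 0.04, "11": 0.46}
reference = {"00": 0.48, "01": 0.02, "10": 0.03, "11": 0.47}
mean_tvd, std_tvd = tvd_sampling_stats(model, reference)
```

The standard deviation returned here plays the role of the reported sampling error: it reflects only shot noise in the simulated samples, not uncertainty in the fitted noise parameters.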
A comparison of accuracy for different noisy subcircuit models is shown in Fig. 8 for simulating the Bell state circuit on qubits 0 and 1 of the poughkeepsie device. We calculate the TVD between experiment and simulation for six different cases. Five of these are noise models: symmetric readout only (SRO), asymmetric readout only (ARO), cnot depolarizing error only (DP), symmetric readout with cnot error (SRO+DP), and asymmetric readout with cnot error (ARO+DP). The error rate parameters are optimized for each composite noise model; for example, the optimal depolarizing parameter in the SRO+DP case may not be the same value found for the ARO+DP case. The sixth case is a noiseless Bell state simulation, which serves as a baseline comparison.
The results in Fig. 8 show that the noisy circuit model yielding the smallest TVD is composed of the asymmetric readout channel and a cnot depolarizing channel (ARO+DP). Each noise model achieves a clear improvement in TVD, measured as a decrease from the noiseless case that lies outside the error bars, so we can be confident that each selected model captures some of the noise behavior present in the system. The comparison also illustrates which models provide the best descriptions of the noise. For example, in the 'DP' case we model a depolarizing channel for which the $p_{\mathrm{cnot}}$ parameter is calculated to account for all noise in the system. This model clearly improves the TVD and is therefore likely to be an effective description of the noise in the system. However, adding readout noise in the 'SRO+DP' and 'ARO+DP' cases yields a more accurate noise model, because these models achieve further improvements in TVD.
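Model selection by minimum TVD amounts to scoring each candidate prediction against the observed distribution and sorting. The sketch below assumes the standard TVD definition; the candidate names and all probabilities are invented for illustration and do not correspond to the values in Fig. 8.

```python
def tvd(p, q):
    """Half the L1 distance between two distributions given as dicts."""
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))

def rank_models(predictions, observed):
    """Return (model name, TVD) pairs sorted from best to worst fit."""
    scores = {name: tvd(dist, observed) for name, dist in predictions.items()}
    return sorted(scores.items(), key=lambda item: item[1])

# Hypothetical observed frequencies and candidate model predictions.
observed = {"00": 0.47, "01": 0.03, "10": 0.04, "11": 0.46}
predictions = {
    "noiseless": {"00": 0.5, "11": 0.5},
    "candidate_A": {"00": 0.465, "01": 0.035, "10": 0.035, "11": 0.465},
    "candidate_B": {"00": 0.472, "01": 0.031, "10": 0.039, "11": 0.458},
}
ranking = rank_models(predictions, observed)
best_model = ranking[0][0]
```

In practice each score would be compared against the sampling error bars before one model is preferred over another.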

V. Performance Testing Results
We now present the performance of the selected composite model on n-qubit GHZ-state preparation circuits. Using the estimated ARO and cnot error rates, we demonstrate iterations of this composite noise model that represent varying model complexity and experimental efficiency to achieve a particular accuracy. These iterations are shown in Fig. 9. The 2-qubit average case represents the performance of a noise model with only three parameters ($p_0$, $p_1$, and $p_{\mathrm{cnot}}$), which are taken as the average of the error rates for only qubits 0 and 1. This case uses the fewest quantum resources for characterization, requiring only 7 experiments. We also consider a case that uses these same three parameters averaged over the entire register, which retains the low model complexity of only three noise parameters but requires the full suite of experiments. Our most detailed model accounts for spatial variations in the error parameters and uses individualized readout error rates for each qubit and cnot error rates for each coupling. As with the Bell state example in Fig. 8, we show the noiseless case for context and comparison. Finally, we also show the sum of the minimum TVD achieved for noisy simulation of the Bell state across each qubit pair for which a cnot was applied in the GHZ preparation circuit.
Figure 9 demonstrates a significant improvement in model accuracy for GHZ state preparation using our composite noisy circuit model. The improvement is a 3-fold decrease in TVD as compared to the noiseless simulation. Our fully spatial model performs better than the coarser-grained models, such as the average two-qubit model, particularly for larger sizes of GHZ state preparation. We also examine the scaling of the error with respect to the area of the circuit. We normalize the computed TVD by the number of cnot gates in each GHZ preparation circuit, and we find that the per-cnot model accuracy is nearly constant across all GHZ circuit instances, as shown in Fig. 10. This trend would also hold when TVD is scaled by qubit count, since qubit count and cnot count are strongly linked in the GHZ example. Since the TVD increases at a rate commensurate with cnot count or qubit count, this may indicate that higher levels of entanglement or larger Hilbert spaces impact the predictability of noise in the device.

VI. Bernstein-Vazirani Application
We next test the performance of this noisy circuit model on a different application to evaluate its ability to capture fundamental characteristics of the device. We test the performance by modeling several quantum circuit instances of the Bernstein-Vazirani algorithm. This algorithm considers a black box function that is encoded by a secret binary string, which the Bernstein-Vazirani algorithm finds in one query [4]. Figure 11 shows an example of our circuit implementation of this algorithm using a three-bit string. We use a phase oracle qubit as the black box function encoded with the secret string. Upon measurement of the non-oracle qubits we obtain the secret binary string. We select the Bernstein-Vazirani algorithm because it is implemented using the same gate set we have characterized for the GHZ example.
Figure 11. Circuit implementation of the Bernstein-Vazirani algorithm for a three-bit secret string. Other secret strings are produced by changing the cnot gate sequence such that control qubits correspond to output bits of 1.
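The noiseless behavior of the algorithm can be verified with a short classical calculation. The sketch below assumes the textbook phase-oracle form, in which the oracle applies the phase (-1)^(s.x) to each basis state x between two layers of Hadamard gates; the function names are hypothetical, and no device noise is included.

```python
def bv_cnot_controls(secret):
    """Positions of input qubits that control a cnot onto the oracle qubit."""
    return [i for i, bit in enumerate(secret) if bit == "1"]

def bv_output_distribution(secret):
    """Noiseless output distribution over the input register after one query."""
    n = len(secret)
    s = int(secret, 2)
    dist = {}
    for y in range(2 ** n):
        # Amplitude of |y>: (1/2^n) * sum_x (-1)^((s XOR y) . x)
        total = sum((-1) ** bin((s ^ y) & x).count("1") for x in range(2 ** n))
        dist[format(y, f"0{n}b")] = (total / 2 ** n) ** 2
    return dist

# Every three-bit secret string is recovered with certainty in the noiseless case.
secrets = [format(k, "03b") for k in range(8)]
recovered = {s: max(bv_output_distribution(s), key=bv_output_distribution(s).get)
             for s in secrets}
cnot_counts = {s: len(bv_cnot_controls(s)) for s in secrets}
```

The number of cnot gates equals the number of 1 bits in the secret string, so strings with more 1 bits pass through more noisy two-qubit gates on a real device.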
Given the connectivity constraints of the poughkeepsie device, the longest bit string we can test without introducing SWAP operations is of length three. We choose qubits 6, 8, and 12 with oracle qubit 7 because this set has among the lowest error parameters. We execute the Bernstein-Vazirani algorithm for every possible encoding of the three-bit secret string and record the accuracy as the probability that the encoded string was observed. We collect these measurements during the same job used to characterize the device. Figure 12 plots the simulated accuracy of the circuit outcome using the fully spatial noise model alongside the experimental accuracy. Our model captures the decrease in experimentally observed accuracy across the various binary strings. The loss in accuracy scales with the number of 1 bits in the secret string for both the experiment and simulation. However, the accuracy predicted by simulation is consistently higher than the accuracy observed experimentally, indicating that a state-dependent noise source remains missing from this model.
Figure 12. Performance of Bernstein-Vazirani algorithm evaluated as the measured probability of the prepared secret string. Simulation is subject to noise defined by the fully spatial model.

VII. Conclusion
We have presented an approach to noisy quantum circuit modeling based on experimental characterization. Our approach relies on composing subcircuit models to satisfy a desired accuracy threshold, model complexity, and experimental efficiency, with accuracy quantified by the total variation distance. We have tested our ideas using the IBM poughkeepsie device, which enables evaluation of our characterization methods as well as the comparison of predicted performance for GHZ-state preparation and instances of the Bernstein-Vazirani algorithm. The initial example, GHZ-state preparation, examined model fidelity with respect to both the width and depth of an input circuit. Models for the readout and cnot subcircuits accounted for a majority of the model error. Our analysis of a second test circuit using instances of the Bernstein-Vazirani algorithm reveals additional sources of error not captured in the original GHZ circuit characterization. Because both tests depend on the same gates for state preparation, the appearance of new errors suggests a possible state-dependent noise model that warrants further investigation. While our demonstrations have focused on specific devices and input circuits, the methodology provides a robust and flexible framework by which to generate noisy quantum circuit models on any device.
A significant feature of this approach to noise model decomposition is the ability to iteratively adjust the models until sufficient accuracy is obtained. Improvements in accuracy may be obtained by changing the characterization circuits or the parameter estimation. The Bell-state and GHZ-state preparation examples demonstrate how this model adjustment may be performed by varying the experimental efficiency and the input to the model to change the accuracy of the final composite model. Our demonstrations have focused on the depolarizing channel for gate modeling, but circuit characterization can be directly extended to account for new noise models, components, applications, and algorithms. For example, in both the GHZ and Bernstein-Vazirani results, we observe an increase in TVD that scales with the number of cnot gates applied in the circuit. A more sophisticated cnot noise model may improve the accuracy of the final noise model. Coarse-graining may also introduce insensitivities to certain error types; for instance, measurement only in the computational basis creates insensitivity to Z error types. It will therefore likely be necessary to refine the test circuits to address more sophisticated models. Additionally, this methodology assumes separability between composition and decomposition, i.e., it assumes that the noise present in the decomposed subcircuits is not substantially different from that of the composed circuit and that any differences may be tuned away by refinement. If this assumption does not hold, there may be an upper limit to the achievable accuracy of noise modeling using subcircuit testing. Further model refinement and testing would be necessary to demonstrate this non-separability.
Our original motivation was to address the growing challenge of characterizing NISQ applications, for which efficient and scalable methods are necessary. We have shown how to construct a set of test circuits that scales with the area of the input circuit C and the underlying decomposition strategy. In the GHZ-state preparation example, the number of total experiments needed for full spatial characterization scales with the size of the register $q$ and the number of couplings $c$ according to $N_s(2q + 2c + 1)$. This resource requirement enables characterization to be run alongside the state preparation circuit when the job is sent to the QPU. This efficiency should help ensure noise characterization is performed within the same processor context as the sought-after circuit. We anticipate such real-time characterizations to be valuable for dynamic compiling and tuning of quantum programs [37,45,51].
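The resource count can be made concrete with a small worked example. The sketch below interprets 2q + 2c + 1 as the number of distinct characterization circuits and N_s as the number of shots per circuit; this reading is an assumption, chosen because it reproduces the 7 experiments quoted for the two-qubit case, and the function names are hypothetical.

```python
def characterization_circuits(q, c):
    """Number of distinct test circuits for q qubits and c couplings: 2q + 2c + 1."""
    return 2 * q + 2 * c + 1

def total_shots(q, c, n_s):
    """Total number of samples when each circuit is run n_s times."""
    return n_s * characterization_circuits(q, c)

# Two qubits joined by one coupling, as in the 2-qubit average case.
two_qubit_circuits = characterization_circuits(q=2, c=1)
two_qubit_shots = total_shots(q=2, c=1, n_s=8192)
```

Since the count grows linearly in both the register size and the number of couplings, the full spatial characterization remains small enough to submit in the same job as the application circuit.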
Our approach to characterization has relied on model selection using minimization of the total variation distance (TVD) between noisy simulation and experimental results. This demonstration used a small set of the possible models for characterizing the observed QPU behavior, and expanding the set of potential models is a possible direction for future work. There is a necessary balance, however, between the sophistication of the model and its utility for characterizing QPU behavior. While fine-grained quantum physical models are capable of capturing a more detailed picture of the dynamics present on small scales, the dawning of the NISQ era requires the addition of new techniques to our toolbox that take a higher-level and larger-scale approach. For scalable numerical analysis of quantum computational methods, it is essential that we develop coarse-grained, top-down approaches to capture the core behavior of QPUs.