Distinguishing parity-switching mechanisms in a superconducting qubit

Single-charge tunneling is a decoherence mechanism affecting superconducting qubits, yet the origin of excess quasiparticle excitations (QPs) responsible for this tunneling in superconducting devices is not fully understood. We measure the flux dependence of charge-parity (or simply, ``parity'') switching in an offset-charge-sensitive transmon qubit to identify the contributions of photon-assisted parity switching and QP generation to the overall parity-switching rate. The parity-switching rate exhibits a qubit-state-dependent peak in the flux dependence, indicating a cold distribution of excess QPs which are predominantly trapped in the low-gap film of the device. Moreover, we find that the photon-assisted process contributes significantly to both parity switching and the generation of excess QPs by fitting to a model that self-consistently incorporates photon-assisted parity switching as well as inter-film QP dynamics.

QP-induced decoherence in superconducting qubits is typically ascribed to tunneling of excess nonequilibrium QPs across a Josephson junction (JJ), as shown schematically by the purple arrows in Fig. 1(a). In this mechanism, when a QP tunnels across the JJ, it couples to the phase across the junction and can thereby cause transitions of the qubit state. The rate of QP tunneling occurs in proportion with the QP density in the superconductors, by which previous experiments have inferred QP densities (normalized to the Cooper pair density) in the range x QP = n QP /n CP ∼ 10 −9 −10 −5 [12,15,[18][19][20][27][28][29][30][31][32][33]. Depending on levels of other sources of decoherence, QP densities in this range may limit qubit performance.
These QP densities are many orders of magnitude higher than expected for devices at thermal equilibrium with the ∼30 mK base temperature of a dilution refrigerator, in which QP excitations are exponentially suppressed by the superconducting gap ∆ Al . This contrast points to the nonequilibrium nature of the QPs in the superconductors. Furthermore, attempting to explain measurements of QP-induced excitation and relaxation of the qubit by QP tunneling requires an assumption of their presence at high energies [19]. This is inconsistent with predictions that QPs relax to a distribution near the edge of the superconducting gap [11,34,35]. Despite the many observations of excess QPs across various types of qubits, a two-pronged question remains unanswered: how are nonequilibrium QPs generated and why do they appear to have a nonthermal distribution?
A possible mechanism to help answer these questions was proposed by Houzet et al. [36]. There, it was pointed out that a photon with sufficient energy to break a Cooper pair (≥2∆ Al ∼ 100 GHz) may be efficiently absorbed at the JJ, generating a pair of QPs. Just as QP tunneling results in a single charge being transferred across the JJ, this process likewise switches the parity of the number of electrons that have tunneled across the JJ (referred to simply as the "parity" for the remainder of the work) and may cause decoherence of the qubit state [ Fig. 1(a) orange arrows]. We label this process Photon-Assisted Parity Switching (PAPS), and it may contribute to QP-induced decoherence in two ways simultaneously: directly, by inducing qubit transitions during parity switches, and indirectly, as a generation mechanism of excess x QP . These additional QPs may then tunnel and induce decoherence, which we refer to as the NUmber-conserving Parity Switching (NUPS) mechanism because unlike PAPS, it conserves the number of QPs [13,14]. We distinguish PAPS from QP generation by ionizing radiation [33,37,38], which likewise generates QPs but does not directly cause a parity switch.
Additionally, PAPS causes qubit transitions imitating a nonthermal QP distribution when the typical photon energy is well above 2∆ Al [36]. For this reason, even in cases where the energy relaxation time of the qubit is limited by other decoherence sources, PAPS may be the primary cause of anomalous excitation of the qubit. The amount of high frequency radiation that reaches the qubit can be reduced with targeted filtering and shielding, which has been shown to improve qubit performance [29,32,39]. Even with these measures in place, recent experiments have shown that stray high frequency photons are indeed absorbed resonantly by spurious antenna modes, inducing parity switching [40][41][42]. Given that PAPS can be responsible for QP generation and the appearance of a nonthermal QP distribution, it is imperative to experimentally distinguish the contributions of this mechanism to parity switching and QP generation in superconducting qubits. However, it is difficult to discern the mechanism from a single measurement of the parity-switching rate or QP-limited energy relaxation time.
Here, we measure the parity-switching rate Γ in a flux (Φ)-tunable transmon sensitive to offset-charge. As we describe below, the dependence of the parity-switching rate on the applied flux can distinguish PAPS from NUPS. In the flux dependence of the parity-switching rate Γ(Φ), we observed a peak which can be explained by a difference of superconducting gaps between the two aluminum films of the device matching the qubit transition energy. This gap difference enhances the contrast in flux dependence between PAPS and NUPS and also helps to demonstrate the thermalization of QPs in the device. Using a new measurement protocol to extract the parity-switching rates conditioned on the initial state of the qubit, we find evidence that excess QPs relax to a low energy distribution and are primarily trapped in the low-gap film. We developed a model that quantitatively fits the measured Γ(Φ) with a self-consistent combination of PAPS and NUPS and derive two new insights. First, PAPS is responsible for a significant fraction of parity switching. Second, PAPS generates excess QPs at a rate on par with the sum of all other mechanisms, which we observe by measuring Γ(Φ) in the presence of a controllable photon source operated at several powers. Due to these effects, estimates of x QP obtained from measurements of Γ or QP-limited qubit relaxation that do not take into account PAPS or gap difference may be inaccurate. These results advance our understanding of QP dynamics in superconducting qubits and will inform approaches to mitigation of single-charge-tunneling decoherence.

II. EXPERIMENTAL DEVICE
A. Flux-tunable offset-charge-sensitive transmon Parity-switching rates were measured directly with an offset-charge-sensitive transmon [15,19,32]. A parity switching event, in which a single charge tunnels across the JJ, changes the parity p ∈ {e, o} of the electron number in each of its two electrodes. Such an event appears as a sudden jump by 1/2 in the reduced offset-  1. (a) Superconducting density of states of the JJ in the excitation picture, with a difference in the superconducting gaps δ∆ of the aluminum films. Two parity-switching mechanisms are illustrated. The first conserves QP number while the second generates two QPs. Number-conserving parity switching (NUPS, purple): a preexisting QP may tunnel while exciting, relaxing, or without exchanging energy with the qubit. Some cases of QP-qubit interactions may be suppressed by lack of available final states (red x). Photon-assisted parity switching (PAPS, orange): a photon with energy greater than ∆L + ∆H may be absorbed at the JJ, breaking a Cooper pair to generate two QPs with one tunneling across the JJ. (b) Cross-sectional cartoon of the device, formed by one thinner, higher gap film and one thicker, lower gap film. Arrows represent NUPS (purple) and PAPS (orange) tunneling across the JJ. Like the JJ, the pads of the device consist of a bilayer of both films, but have ∼ 10 6 × larger contact area. (c) Circuit diagram of a flux-tunable offset-charge-sensitive transmon. Fluctuations in the charge environment may induce offset charge ng = CgVg/2e. Both NUPS and PAPS across either JJ result in a switch of the parity which appears as a sudden jump in ng by 1/2. (d) Qubit ground (|0 ) and excited (|1 ) states separated into even and odd parity manifolds. For EJ/EC 30, f e q − f o q (ng = 0) 500 kHz. In addition to causing a parity switch, both NUPS and PAPS may relax or excite the qubit. charge n g = C g V g /2e (which is measured in units of 2e). The plasmon eigenstates i ∈ {0, 1} of the qubit can be separated into even and odd parity manifolds [ Fig. 1(d)], and in the offset-charge-sensitive transmon regime (E J /E C 30), this jump in n g can result in a substantial jump in the qubit frequency between two values f q ± δf q , where δf q (n g = 0) 500 kHz. The parity dependence of the qubit frequency has been used previously to measure the parity-switching rates Γ i→j correlated with qubit transitions from i to j, and these rates were inconsistent with parity switching solely due to a thermal distribution of the resident QPs [19].
Because the parity-switching rate depends on the density of states available for tunneling [ Fig. 1(a)], changes to the qubit frequency can affect PAPS and NUPS differently. Thus, we replace the JJ with a DC SQUID in order to tune in-situ the average frequency of the qubit f q (Φ) with applied flux [ Fig. 1(c)]. For both NUPS and PAPS, the flux-dependent rates of parity switching causing a qubit transition from i to j (Γ i→j (Φ)) depend on the single-charge-tunneling qubit matrix elements as well as factors accounting for the occupation and availability of QP states in the JJ films [17,35,36][App. A]. While applying flux changes the single-charge-tunneling matrix elements identically for NUPS and PAPS, the resulting change to the qubit frequency f q (Φ) affects PAPS and NUPS differently due to the unique constraints on the energies of the QPs involved in each process. As a result, the rates of parity switching by NUPS (Γ i→j N (Φ)) and by PAPS (Γ i→j P (Φ)) can have starkly different dependences on flux.
The offset-charge-sensitive SQUID transmon in this experiment was fabricated with E J1 /h = 2.465 GHz, E J2 /h = 8.045 GHz. The large asymmetry of the JJs enabled Γ to be measured at all values of flux by mapping the parity onto the state of the qubit [15,19]. With E C /h = 0.352 GHz, δf q varied from 0.7 − 14.5 MHz as the mean even-odd qubit frequency f q was tuned from

B. Gap differences in aluminum films
A key factor influencing the available density of states for parity switching is the difference in the superconducting gaps of the two superconductors on either side of the JJ tunneling barrier. The superconducting gap of thin-film aluminum increases with decreasing film thickness [43][44][45]. Previous works on Cooper pair transistors [3,45] and Cooper pair box qubits [27,28] have taken advantage of this effect to trap QPs in thicker aluminum films and reduce QP poisoning of the island. However, the effect of gap difference on parity switching in transmon qubits has not been previously reported.
The offset-charge-sensitive transmon measured here consists of two aluminum films of 20 nm and 30 nm, respectively [ Fig. 1(a,b), [46]]. For these thicknesses, the gaps are expected to differ by δ∆ := ∆ H − ∆ L ∼ 20 µeV ∼ 5 GHz × h [43,45]. This gap difference has two important consequences for parity switching in transmons. First, a gap difference changes the proportionality of Γ N to x QP , as certain QP-qubit interactions are suppressed or enhanced depending on the value of δ∆ relative to the qubit transition energy hf q [ Fig. 1 Here we show a signature of gap difference δ∆ in the flux-dependent Γ(Φ). As we explain below, this effect is specific to the NUPS mechanism and aids in the differentiation of PAPS from NUPS. The parity-switching rate was measured as a function of flux through the SQUID loop of the qubit using techniques developed and demonstrated in Refs. [15,19] [see App. B for additional detail]. Due to the fast repetition rate and the symmetrized parity-conditional π-pulses, the qubit spent approximately equal time in the ground and excited states during the experiment. As a result, the measured parity-switching rate Γ weighs the parity-switching rates when the qubit is in |0 and |1 approximately evenly: The flux dependence of Γ displays a peak near Φ/Φ 0 ≈ 0.325 (f q ≈ 4.12 GHz) on top of a broader increase from Fig. 2(a, upper)]. This broader increase is predicted to result from the flux dependence of the single-charge-tunneling matrix elements. Here, we focus on the intermediate peak, which may be understood as an enhancement of NUPS when the qubit energy matching the gap difference hf q ≈ δ∆. Under this condition, the rates of NUPS-induced relaxation (Γ 1→0 N ) and excitation (Γ 0→1 N ) are enhanced due to the divergences of the superconducting density of states on both sides of the JJ, since a QP at the gap edge on one side can tunnel to the gap edge on the other side by exchanging energy with the qubit. The implied gap difference δ∆ ≈ hf q ≈ 20 µeV is consistent with reported gap measurements for films of these nominal thicknesses [43,45,47].
This peak indicates that a considerable fraction of parity switching is due to NUPS, since PAPS is not enhanced in the same way by δ∆ = hf q , as we now explain. PAPS depends on the sum ∆ H + ∆ L , rather than the gaps individually, because the sum of the energies of two QPs generated by PAPS are determined by the absorbed photon energy hf P [ Fig. 1(a), orange arrows]. This is in contrast to NUPS, for which the final energy of the tunneling QP is constrained to be the approximately the same as the initial energy, and may only differ by the qubit energy [ Fig. 1(a), purple arrows]. The PAPS rate therefore does not exhibit a peak when hf q = δ∆ and shows only a smooth flux dependence due to the matrix elements [App. A].

B. Probing the QP distribution
The peak in Γ also provides insight into the energetic and spatial distributions of QPs in the device. The average energy of QPs in each film appears to be close to its respective gap, otherwise there would be not be a significant peak when δ∆ = hf q . This is consistent with pre- Flux dependence of parity-switching rate Γ ≈ 0.5(Γ 0 + Γ 1 ), measured by fitting power spectral densities of jump traces of parity (Γ 0 := Γ 0→0 + Γ 0→1 , Γ 1 := Γ 1→1 + Γ 1→0 ). Parity is measured by a Ramsey-like sequence which maps the parity onto the state of the qubit [15,19]. (Lower) Flux dependence of parity-switching rate when the qubit starts in the ground (blue) or excited (red) state, measured by the protocol described in (b). (b) Pulse sequence for measurement of Γ 0 , Γ 1 . Two measurements of the parity (p) are separated by a variable delay of length τ . For the duration of the delay, the qubit state is repeatedly measured (mq, 4 µs), followed by preparation of a superposition with polarization angle θ (ψ(θ)), with feedback block repetition time τF B = 5.376 µs. (c) The decay of the autocorrelation of the parity p(0)p(τ ) for polarization angles θ = 0 (green) and θ = π (pink) as a function of delay τ between parity measurements (Φ/Φ0 = 0.335). The parity autocorrelation function p(0)p(τ ) decays at rate 2Γ, which depends on the θ-dependent time the qubit spends in |0 or |1 .  Fig. 1(b). Away from the JJ, QPs may tunnel between the two films over a very large contact area. QPs are shown predominantly in the low-gap film, illustrating the QP trapping effect to which we attribute the rise in Γ 1 but not Γ 0 observed in the lower panel of (a).
dictions that QPs generated at energies above ∆ Al relax rapidly by emitting phonons toward a steady-state distribution with average energy close to the gap [11,34,35].
If the QPs are indeed efficiently relaxing to the low energies, it would also be natural to expect that they tend to reside in the low-gap film (i.e., they are trapped there [41,48,49]). The previously described measurement was not directly sensitive to this, so we developed a new experiment to probe the QP densities in each film. Note that Γ 1→0 N is enhanced by the presence of QPs in the low-gap film and requires the qubit to be in the excited state. In contrast, Γ 0→1 N requires QPs in the high-gap film of the JJ and the qubit initially in the ground state. Thus, by measuring Γ with different initial qubit states we can learn about the distribution of QPs in the two films.
In our new protocol, the decay of the parity autocorrelation function p(0)p(τ ) was measured with the sequence shown in Fig. 2(b), which controlled the time the qubit spent in the ground and excited states with active feedback. During the delay between parity measurements, the qubit undergoes repeated blocks of a qubit state measurement and a preparation into |ψ θ = cos θ 2 |0 + sin θ 2 |1 . In this way, the qubit was projected to the ground (excited) state with probability cos 2 θ 2 (sin 2 θ 2 ) by the ensuing measurement. With this procedure, . Ideally, polarization angle θ = 0 would keep the qubit in |0 , and result in measurement of Γ 0 := Γ 0→0 + Γ 0→1 ; θ = π would keep the qubit in |1 and result in measurement of Γ 1 := Γ 1→1 + Γ 1→0 ). In practice, T 1 (Φ) ≈ 20 − 70 µs limited the experimentally attainable polarizations. Instead, the qubit measurement record during the delay was used to estimate the fraction of the delay the qubit spent in each state [App. C]. Then, plotting Γ as a function of the average qubit state measurement m q during the feedback delay, we used a linear fit to extrapolate to m q = 0, 1 and infer the parity-switching rate conditioned on the qubit state |0  (a) Illustration of QP processes included in our model. Films are numbered 0-3 for reference in the text. Generation of QPs may occur by pair-breaking in the pads (red) at rate g other or by PAPS at the JJ (orange) at rate gP . QPs tunnel across the JJ at rates ΓP (orange) and ΓN (purple). QPs are eliminated by recombination at rate r or are removed from the tunneling population by trapping at rate s in each film (green). (b) Parity-switching rate Γ(Φ) from Φ/Φ0 = 0 − 0.5. The fit to the self-consistent model (Section IV A) is indicated in black, with ΓN (purple) and ΓP (orange) contributions shown explicitly. The peak near Φ/Φ0 ≈ 0.145 is explained by the δ∆ ≈ hfq condition illustrated in Fig. 2(e).
We repeated this measurement as a function of flux and found that only Γ 1 exhibits a clear peak at Φ/Φ 0 ≈ 0.325 [ Fig. 2(a), lower]. This suggests that the increased parityswitching rate when hf q ≈ δ∆ is due to QPs tunneling from the low-gap edge to the high-gap edge by relaxing the qubit [ Fig. 2(e), red]. From this, we deduce that generated QPs relax to a relatively cold steady-state distribution with an average energy from the low-gap edge ε − ∆ L δ∆. The thicker, lower gap aluminum film acts as a built-in QP trap [ Fig. 2 Therefore, the gap difference helps to reduce parity switching in two distinct ways. In the low-to-high-gap direction, tunneling is reduced because of the lack of available states in the high-gap film. In the high-to-low-gap direction, tunneling is reduced because the effective x QP in the high-gap film of the JJ is reduced due to trapping in the low-gap film on top of it. For a QP distribution thermalized at a phonon temperature T ph δ∆, the parity-switching rate would be exponentially suppressed while the qubit is in the ground state if hf q < δ∆. However, in Fig. 2(a, lower), it is clear that Γ 0 is always of the same order as Γ 1 , suggesting that PAPS is also contributing significantly to parity switching in this device.

IV. PAPS CONTRIBUTION TO PARITY SWITCHING
A. Self-consistent QP dynamics model In order to elucidate the contributions of both NUPS and PAPS, we developed a model that takes into account the gap difference as well as parity switching and QP generation by PAPS. With this model, we can take advantage of the different flux dependence of NUPS and PAPS to distinguish between the respective contributions from the fit to Γ(Φ). The total parity-switching rate is the sum of both parity-switching mechanisms: Here,n is the occupation of a high-frequency mode at f P that couples to the JJ and induces PAPS at a perphoton rate calculated following Ref. [36]. In this model, the relative increase of Γ P with Φ/Φ 0 varying from 0 to 0.5 is determined by f P , whilen independently scales the magnitude of Γ P . These are the two fit parameters for the PAPS contribution to Γ. There is an ambiguity in the physical interpretation of these quantities, as it is possible for combinations of modes with varying occupations to give the same effective Γ P [App. A]. The case for a narrow band of modes with dominant coupling to the JJ has been made previously [40,41] and is consistent with our data (discussed below). The NUPS rate depends on the QP densities in the JJ films 0 and 3 (x 0 , x 3 , "QP" is dropped for the x QP of specific films for notation simplicity). In an isolated superconducting film, the steady-state x QP can be determined by balancing QP generation (g) with trapping (s) and recombination (r), For our model, we extend this concept to all four films of the device (i.e. the high-and low-gap films on each side of the JJ), considering these dynamics in each film as well as tunneling between films. We separate generation of QPs into two types. Generation by PAPS Other pair-breaking that does not directly result in a parity switch is accounted for by g other , which is assumed to occur equally in each pad. Trapping at rate s may arise from vortices or gap inhomogeneities and is included as a fit parameter also assumed to be the same for all four films. The recombination rate r = 1/(120 ns) based on the literature [30] is also included in each film. As described earlier [Section III B], the qubit-state dependence of the peak in Γ(Φ) indicates that QPs thermalize and become trapped in the low-gap films of each pad (films 0 and 2). This is supported by predictions that QPs relax by emitting phonons rapidly relative to the other dynamics in the system [34,49]. Under this assumption, we express the QP distributions as Fermi distributions thermalized at T ph , f (ε, T ph , µ L(R) ) = 1/(e (ε−µ L(R) )/kBT ph + 1) with nonzero chemical potential µ L(R) accounting for excess QPs [27,50] on the left (right) side of the JJ. We take T ph ≈ 50 mK for the temperature of the device based on the reading of a thermometer mounted near the sample cavity [App. D]. In the presence of the gap difference, this thermalization will result in the x QP of the high-gap films being reduced from that of the low-gap films by a factor e −δ∆/kBT ph ≈ 0.008.
Because the QPs are assumed to reside predominantly in the low-gap films, we can ignore the effects of trapping and recombination in the high-gap films, which will have a negligible effect on the overall x QP on each side. We can thus approximate the dynamics of x QP in the low-gap films (x 0 and x 2 ): Here, we have included QP tunneling between films 0 and 3 by the rates γ 03 and γ 30 , which are the per-QP tunneling rates in each direction. These rates take into account T ph and δ∆ and are also flux-dependent. If the qubit state were in thermal equilibrium with the QPs, x 0 and x 2 would be the same, because tunneling in each direction across the JJ would be balanced. However, as discussed in Section III A, the qubit is frequently πpulsed throughout the parity-mapping sequence, resulting in nonthermal qubit population during measurement of Γ. The extra time the qubit spends in the excited state results in excess tunneling from film 0 to film 3 as compared to the reverse process, since Γ 10 N favors tunneling from low-gap to high-gap film due to the densities of states of the superconducting films. According to our model, the measurement therefore "pumps" QPs from film 0 to 3, which produces a steady-state x QP in films 2 and 3 that is larger than in same-gap films 0 and 1 [51]. This effect is predicted to be particularly strong when hf q ≈ δ∆.
Setting the left-hand side of Eqs. (3) and (4) equal to 0 results in coupled equations which can be solved to determine the steady-state x 0 and x 2 . These values will be flux-dependent due to the flux dependence of g P (Φ) and per-QP tunneling rates γ 03 (Φ), γ 30 (Φ). Finally, Γ N (Φ) can be calculated given the x QP of the JJ films (x 0 (Φ) and x 3 (Φ) = x 2 (Φ)e −δ∆/kBT ph ), and added to Γ P (Φ) to determine Γ(Φ) [Eq. (2), see App. E for additional detail on the model].
B. Distinguishing parity-switching mechanisms In Fig. 3(b), we show Γ(Φ) for the same device measured during an earlier cooldown, which shows a similar peak as seen in Fig. 2 but at a lower flux corresponding to a value of δ∆ that is 2.9 µeV higher. The cause of this shift is not known, but may be related to mechanisms causing JJ aging [52]. Fitting this data with the self-consistent model (black) yields f P = 112 ± 2 GHz, n = (1.9±0.2)×10 −3 , δ∆/h = 4860±5 MHz. The values of g other and s cannot be independently extracted from this fit, so we simply set g other = 0 (below this condition will be relaxed, as we will discuss). The QP densities in the low-and high-gap films that form the JJ (at Φ/Φ 0 = 0) are x 0 ≈ 6.2 × 10 −9 and x 3 ≈ 0.1 × 10 −9 , respectively, corresponding to on average 65 QPs in the low-gap film and 0.7 QPs in the high-gap film. We decompose the fit Γ into its Γ P (orange) and Γ N (purple) components and see that the peak at Φ/Φ 0 ≈ 0.145 is due to the effect of δ∆ = hf q on Γ N , while Γ P is insensitive to δ∆ and increases monotonically due to the matrix elements. Importantly, we find that 0.53 ≤ Γ P (Φ)/Γ(Φ) ≤ 0.83, indicating that both mechanisms contribute significantly to parity switching.
The ratio Γ 0→1 /Γ 1→0 ≈ 0.53 implies that QPs cause much more qubit excitation than would be expected by detailed balance at 50 mK, which would predict Γ 0→1 /Γ 1→0 ≈ 8 × 10 −3 . A previous observation of this type of anomalous excitation was interpreted as evidence of a nonthermal QP distribution of unclear origin [19]. Here, this apparently nonthermal ratio is interpreted as resulting from PAPS.

A. Varying photon incidence
While the fit in Fig. 3(b) determines the contribution of Γ P to Γ, it does not determine whether QPs responsible for the observed Γ N are generated solely by Γ P or if other generation mechanisms also contribute. The generation contributions g other cannot be determined from the data shown in Fig. 3 because Γ N depends on x 0 and x 3 , and the trapping rate s can compensate varying levels of g other to yield the same QP densities. To obtain an estimate of g other to compare to g P , we varied the photon incidence and observed the response in x QP .
To do this, we added a resistor which acted as a controllable source of additional photons for PAPS [ Fig. 4(a)]. A 2 cm length of manganin wire (≈1.4 Ω) was suspended inside the aluminum shield containing the copper cavity, though notably not in-line with the coaxial cable for the microwave reflection measurement. When current was passed through this "lamp" resistor, we observed that Γ(0) increased approximately linearly with the power dissipated by the lamp [ Fig. 4 We do not attribute this to an increase in temperature resulting in additional thermal QPs because the temper- (a) The manganin "lamp" resistor hangs in the aluminum can shielding the copper cavity containing the qubit. The coax lines include a high-frequency-absorbing Eccosorb CR-110 filter ("E") in-line just before the readout cavity, inside the can. Passing current through the lamp causes its temperature to rise, resulting in additional radiation that can leak into the cavity and induce PAPS. Photons may enter the electromagnetic environment of the qubit through SMA connectors or packaging seams. (b) Γ as a function of power dissipated by the manganin "lamp" resistor, measured at Φ/Φ0 = 0. Γ increases approximately linearly with power dissipated by the lamp (black dashed, see App. F for discussion). (c) Γ(Φ) measured at P lamp = {0, 1.4, 5.6, 12.6} µW. We perform a simultaneous fit to all four data sets allowing unique fP ,n, and shared s, g other , δ∆ (solid black). The contributions of PAPS (orange) and NUPS (purple) indicate that as the lamp power increases, both types of parity switching increase. (d) Simulated Γ(0) (blue, left axis) and x0 (lavender, right axis) as a function of ΓP , calculated using fit values from Fig. 4(c). Experimental values (squares) are marked for reference, and the PAPS level at which gP = g other is marked by the red dashed line. Reducing ΓP beyond the current background level (P lamp = 0) would reduce Γ and xQP, but the improvement is limited due to g other which is comparable to gP . ature measured by a RuOx thermometer installed at the bracket holding the copper 3D readout cavity increased by only a few mK. Instead, we attribute the increase of Γ to an increase in PAPS (including the added generation g P ), suggesting that the SMA connections to the cavity and Eccosorb filter inside the aluminum shield may allow photons to leak into the input line [53].
The parity-switching rate Γ(Φ) was measured at P lamp = 0 µW, 1.4 µW, 5.6 µW, 12.6 µW [ Fig. 4(c)]. Since the overall value of Γ increased by 25× from P lamp = 0 µW to 12.6 µW and the intermediate peak at Φ/Φ 0 ≈ 0.145 was still visible, it is clear that Γ N must have increased along with Γ P . This indicates additional x QP generated by PAPS and constrains s and g other [App. G]. We fit these curves simultaneously with the model described above, allowing uniquen and f P for each P lamp and assuming common s, g other , and δ∆ (fit values shown in Table I). The f P andn values in the table for nonzero lamp power correspond to the additional PAPS added by the lamp on top of the P lamp = 0 µW background.
From this fit, we observe that with the lamp off (P lamp = 0 µW), the ratio of generation contributions g P /g other = 0.37, i.e. the rate at which PAPS generates QPs is comparable to generation by other sources. The extracted trapping rate s = 11 s −1 is of the same order as the trapping rate estimated in [30]. Surprisingly, we find that the effective frequency of the additional PAPSgenerating photons from the lamp is f P ≈ 125 GHz for each power. If the coupling of high frequency photons to the qubit were broadband, we would expect this frequency to increase as the power of the resistor increased, due to the rising temperature of the emitting blackbody. Instead, the results may indicate that higher lamp power causes increased occupation of modes that are well-matched to antenna modes of the qubit, which have relatively high absorption efficiency [40][41][42]. De-  Fig. 4(c).

B. Implications for mitigating parity-switching decoherence
Having established the QP dynamics, we may extrapolate Γ under different levels of Γ P -inducing radiation. In Fig. 4(d), we simulate sweeping Γ P for the device measured in this work. The blue curve shows the calculated Γ(Φ = 0) with g other = 8 × 10 −8 x QP /s, s = 11 s −1 , and δ∆ = 4844 MHz from the fit in Fig. 4(c), while the lavender curve depicts the corresponding x 0 as a function of Γ P . The experimental values derived from the fits in Fig. 4(c) are marked, and the Γ P at which g P = g other is marked by the red dashed line. In the g P g other regime, decreasing Γ P efficiently lowers Γ. High-frequency absorbing filters in the RF lines have indeed been shown to significantly lower Γ [32]. The efficient increase of Γ with P lamp emphasizes the previously observed importance of light-tight shielding surrounding the device [29,54] to reduce the flux of such photons seen by SMA-connectors below the last in-line filter.
The lower bound due to the QPs generated by non-PAPS mechanisms is Γ ≈ 81 s −1 , suggesting that removing Γ P entirely would help reduce Γ only by a factor of 4 in this device, with x 0 reaching a plateau due to g other . To further decrease the parity-switching rate, QPs generated in the pads of the device would need to be addressed. One such possible source is ionizing radiation, which has been shown to cause correlated errors in qubits across a single substrate attributed to bursts of QPs being generated by the impact and then tunneling [37]. In this device, we observe sudden occurrences of rapid parity switching directly [App. H], which supports the interpretation of QPs being the mechanism for these errors. The frequency and amount of energy deposited by these bursts as well as the timescale for decay of the generated QP density will determine the extent to which these impacts contribute to g other , which is a subject for future investigation.
An alternative approach is to reduce the harmful impact of QP generation by preventing QPs from tunneling after they have been generated, via QP traps. These may be implemented by additional normal-metal or lower-gap superconductor traps [48,49], but a simpler solution may be to increase the gap difference between the aluminum films. For fixed number of QPs in the device, the tunneling rate decreases exponentially with the difference between the gaps as discussed above.
The fit to our model yields a limit on energy relaxation time due to QPs of T QP 1 = Γ 0→1 + Γ 1→0 −1 = 2.2 ms for this device, which is about one order of magnitude longer than T 1 of current state-of-the-art transmon qubits [39,54,55]. While parity-switching decoherence does not currently limit transmons, the even-tual reduction of dielectric loss [56,57] will motivate further mitigation of parity-switching. Fortunately, parityswitching rates well below the lower limit imposed by non-PAPS sources in this experiment have been measured [39,41,54,58], with Refs. [41,58] measuring Γ < 1 s −1 [59].

VI. CONCLUSIONS AND OUTLOOK
We have measured the flux dependence of parity switching to distinguish the contributions of Photon-Assisted Parity Switching (PAPS) and NUmberconserving Parity Switching (NUPS). In the flux dependence, we observed a peak which stems from NUPS in the presence of a difference between the superconducting gaps of the aluminum films of our device. The dependence of this peak in the flux dependence on the qubit state indicates that QPs relax into a low energy distribution in the low-gap aluminum film. We fit the flux dependence of parity switching with a model that takes into account QP dynamics between the films of the qubit and self-consistent generation of QPs by PAPS. From this fit, we conclude that parity switching in this device is consistent with comparable contributions of PAPS and NUPS. We also found that PAPS generated QPs at a rate similar to other processes that do not directly change the parity. This work shows that parity switching in transmon qubits cannot be understood in terms of solely NUPS defined by a single QP density. The roles of PAPS and gap difference must be considered to accurately determine QP densities from measurements of the parity-switching rate Γ or QP-limited energy-relaxation time.
This device may be modified in several ways in order to better elucidate certain aspects of QP dynamics and generation. For example, qubits with different metallization of the pads may result in different spatial distributions of QPs throughout the device. Also, changes to the qubit geometry can affect the coupling to PAPS-inducing radiation [40][41][42] and aid in the investigation of the spectrum of incident radiation. Our self-consistent model, which includes the effects of PAPS and gap difference, can be also be extended to include additional complexities. These include possible differences in the superconducting gap at the JJ vs. in the pads [41] or taking into account the full energy dependence of QP dynamics with numerical simulation. The framework introduced here will assist investigations of QP generation and impact in these devices going forward.
Acknowledgments We acknowledge helpful discussions with N. In this section, we derive Γ N and Γ P following Refs. [14,36], and additionally incorporate the fluxdependence of Γ for the SQUID device. The Hamiltonian for single-charge tunneling across the JJ and coupling to the phase degree of freedomφ of the qubit iŝ where t is the tunneling amplitude,ĉ l,s is the electron annihilation operator for the reservoir on the left side of the junction, and s =↑, ↓ denotes the spin of the electron. We apply the Bogoliubov transformation, which diagonalizes the BCS Hamiltonian of the superconductors in the leads.
The operatorγ l↓ (γ † l↓ ) is the operator for annihilation (creation) of a QP excitation on the left with spin down and u l , v l are the conventionally-defined BCS coherence factors, which depend on the QP energy ε l [50]. Applying this transformation to (5), we see that two mechanisms of single-charge tunneling may occur: The first term, withγ † rsγls , accounts for NUPS as it empties a QP state on one side and fills a state on the other.
The second term, withγ † rsγ † ls , generates QPs on both sides. In order to conserve energy, this process may only happen with the absorption of a photon of energy greater than ∆ L + ∆ H (the conjugate process annihilates QPs on both sides and emits a photon). Such radiation will couple to the qubit by imposing a time-dependent phase across the junction. Assuming the phase increments ϕ P induced by the electric field of the incident photon with frequency ω P are small, the single-charge-tunneling operators are transformed by linear expansion of the trigonometric functions in the field-induced phase increments: We now apply Fermi's Golden Rule to calculate the rates of parity switching accompanied by a qubit transition from plasmonic eigenstate i to j. Applying Fermi's Golden Rule to the first term of Eq. (7) gives the rate of NUPS accompanied by such a transition (Γ i→j N ). The second term of Eq. (7) gives the rate of PAPS (Γ i→j P ), assuming the presence of high-frequency photons and expanding the single-charge-tunneling operators as above (Eq. (8)). We also use the Ambagaokar-Baratoff relation to express t in terms of the Josephson energy of the junction E J and substitute the energy-dependent definitions of u and v above to find: Here, Γ i→j P is the total PAPS rate induced by photons with frequency ω P . To calculate it, we have multiplied the transition rate induced by a single photon at ω P by the average photon number in the moden. The coupling factor which determines the per-photon rate depends on the electric field amplitude and effective dipole length of the qubit. Following Ref. [36], we express this factor in terms of system parameters: geometric coupling rate g/2π = 331 MHz, readout resonator frequency ω r /2π = 9.126 GHz, and qubit frequency at Φ/Φ 0 = 0 ω q /2π = 5.0594 GHz. Here, we only account for PAPS transitions between the ground and first excited states of the qubit. Photon-assisted transitions to higher states may also occur and add to Γ P , but inclusion of these transitions would not qualitatively alter these results.
The so-called QP structure factors S ± = S rl ± + S lr ± include the BCS coherence factors, QP distribution functions, and superconducting density of states ν. Tunneling from left to right and right to left are summed to obtain the total rate. The structure factors can be expressed: Nonzero values of µ in f (ε, T ph , µ) = 1 e (ε−µ)/k B T ph +1 describe distributions of QPs which are thermalized at T ph but have excess number [27]. The total unassisted and photon-assisted parity-switching rates for a single junction device Γ N and Γ P are then calculated as the sums of the individual rates with qubit transition from i to j weighted by the qubit state probabilities ρ 0 , ρ 1 , with m ∈ {N, P }. In the SQUID transmon, single-charge tunneling across either JJ results in a parity switch. Therefore, to calculate the parity-switching rates in the SQUID transmon as a function of the flux Φ through the loop, we sum the respective rates across the JJs with E J1 (Γ m1 ) and with E J2 (Γ m2 ): with m ∈ {N, P }. The SQUID transmon Hamiltonian can be expressed in terms of both Josephson ener-giesĤ = 4E C (n − n g ) 2 − E J1 cos(φ − ϕ ext ) − E J2 cosφ, with the externally tunable flux ϕ ext = 2πΦ/Φ 0 . Equations (9) and (10) are modified to calculate the parityswitching rate for the individual junctions: E J → E J1 ,φ →φ − ϕ ext for Γ m1 ; E J → E J2 for Γ m2 . The proportionality factor for photon absorption in Eq. (10) also includes a factor of E J1 (2) E J1 +E J2 , for Γ m1 (2) . The flux dependence of Γ results from the tuning of the qubit frequency with flux (see ω ij in Eqs. (11), (12)) as well as the flux-dependent matrix elements. In Fig. 5, we plot these matrix elements as a function of flux. Green (purple) lines correspond to the matrix elements for single-charge tunneling across the JJ with E J1 ≈ 2.5 GHz (E J2 ≈ 8 GHz). The overall increase in Γ P (Φ) from Φ/Φ 0 = 0 to Φ/Φ 0 = 0.5 results primarily from the increase of the 0| sinφ −ϕext 2 |0 2 and 1| sinφ −ϕext 2 |1 2 matrix elements for single-charge tunneling across the lower-E J JJ. Note that the wavefunctions |i implicitly depend on ϕ ext , ensuring that the choice of assignment for ϕ ext does not affect the calculated rates.

Modeling ΓP assuming single photon frequency fP
For the purposes of this work, the true spectrum of the radiation inducing PAPS was not required. The total induced Γ P (Φ) for arbitrary spectra of PAPS-inducing photons above ∆ L + ∆ H can be approximated by an effective occupationn of a single mode at frequency f P : Γ P (Φ) = Γ P (Φ,n, f P ). Three examples of this are shown in Fig. 6. The relative flux-dependence of the photonassisted parity-switching rate, Γ P (Φ)/Γ P (0), depends on the absorbed photon frequency f P . Lower f P induce stronger relative increase with Φ, while f P cause a weaker relative increase, as shown for 110 GHz (black dashed) and 300 GHz (black dot-dash) in Fig. 6. This is due to smaller photon energies generating QPs closer to the gap, where interference between electron-like and holelike tunneling of QPs is stronger (see Eq. (12): lower f P results in larger S +P /S −P , which emphasizes the flux dependence of the matrix elements).
In Fig. 6, we calculate Γ P (Φ) assuming spectral densities of: a white spectrum with a cutoff frequency of 130 GHz (orange), a 1d blackbody at 1 K from 110 to 300 GHz (cyan) and a 3D blackbody at 1 K from 110 and 300 GHz (pink). We see that the relative fluxdependence Γ P (Φ)/Γ P (0) for each spectral density can be fit instead by PAPS induced by photons at a sin-FIG. 6. Calculated relative flux-dependence of Γ induced by various absorbed photon spectral densities. Thin lines correspond to ΓP calculated for individual photon frequencies as described in App. A, with lower photon energies resulting in larger increase in ΓP (Φ)/ΓP (0). Thick lines correspond to ΓP calculated for the following spectral densities: white spectrum with cutoff at 2.5∆ ≈ 130 GHz (orange), 1d blackbody at 1 K (cyan), 3D blackbody at 1 K (pink). Each of these spectral densities result in ΓP (Φ)/ΓP (0) that can be matched by PAPS induced by individual photon frequencies (thin colored lines).
gle frequency (solid curves). The absolute value of Γ P will depend on the unknown attenuation of the radiation as it couples into the cavity, along with the possibly photon-frequency-dependent coupling rate to the qubit. Accordingly, for simplicity in our model, Γ P (Φ)/Γ P (0) is described by a single effective frequency f P , and an average photon numbern which determines the magnitude of the Γ P contribution. Further experiments with different qubit geometries could be performed to elucidate the spectrum and coupling of radiation inducing PAPS, as done for 2D qubits in Ref. [41,42].

Measurement protocol
In order to obtain a single measurement of Γ, we took the power spectral density of a jump trace of parity. The jump trace was measured using the Ramsey sequence first demonstrated in Ref. [15], which maps the parity onto the qubit state. The sign of the second π/2 pulse in the measurement sequence was alternated between parity measurements such that the sequences enact π pulses which alternate being conditioned on the even or odd parity. While a measurement of Γ could be determined from a jump trace that is ≈ 2 s long with good signalto-noise ratio (e.g. Fig. 7(a), in which Γ = 341 ± 2 s −1 ), we observe that these measured values fluctuate in time [ Fig. 7(b)]. In order to average over these fluctuations, we measured Γ over approximately 20 minutes.
For the data shown in the main text, we measured 25 jump traces, each comprised of 2 × 10 6 measurements of the parity repeated every 10 µs. Each of these 20 s jump traces was chopped into 25 segments that were 400 ms long. We computed the PSD of each segment  7. (a) Example of a power spectral density of a 400 ms jump trace of parity. The power spectral density is fit by a Lorentzian with characteristic rate Γ, the parity-switching rate, modified by a frequency-independent term due to infidelity of the parity-mapping sequence. (b) Time series of Γ measurements (e.g. Fig. 7(a) and averaged five PSDs together to obtain one value of Γ [ Fig. 7(a)]. Finally, we fit the distribution of all Γ measurements with a Gaussian to determine the mean value of Γ [ Fig. 7(c)], as well as an estimate of the fluctuations from the σ of the distribution. The width of this distribution did not become narrower when more measurements were included, suggesting that the width was due to fluctuation of Γ rather than measurement uncertainty.
In between each jump trace, the difference between the even and odd parity qubit frequencies was checked by a Ramsey experiment. If the value of δf met two criteria, the next jump trace would be measured. First, the value of δf had to exceed a threshold value such that the delay during the parity-mapping sequence τ = 1/4δf would remain well below T 2 ∼ 2 µs. Measuring at n g with δf below this threshold would result in reduced fidelity of the measurement. Second, for fluxes at which the effective E J /E C 20, the dispersive shifts of the ground and excited states depended on n g [32]. In order to measure the parity by mapping onto the qubit state, the ground and excited states needed to have dispersive shifts such that the phase of the reflected measurement signal differed between |0 and |1 . Therefore, the value of δf found by the Ramsey measurement needed to correspond to a value where the ground and excited states were separable. This range of usable δf values was deter-mined by inspection at each flux prior to measurement of Γ. It was also verified that Γ did not itself depend on δf . If the value of δf did not meet these two criteria, the dc voltage on the readout pin of the 3D cavity was changed to induce a change in n g and δf was measured again.

Measured Γ over wider flux range
In Fig. 8, we show the measured parity-switching rate Γ as a function of flux from Φ/Φ 0 = −0.5 to 1.0 for the cooldown corresponding to the data in Figs. 3, 4. The intermediate peaks corresponding to hf q = δ∆ are also observed at Φ/Φ 0 = −0.145 and Φ/Φ 0 = 0.855 where this condition is also met. In order to measure Γ 0 and Γ 1 , the parity-switching rates conditioned on the qubit being in |0 and |1 , we would ideally measure Γ with the qubit spending nearly all of the measurement time in |0 and |1 , respectively. Measurement of Γ by fitting the power spectral density of a jump trace of parity does not allow this, because the individual parity measurements in the jump trace necessarily π-pulse the qubit conditioned on the parity. Instead, we measured Γ by the decay of the parity autocorrelation The observed mq deviates from the nominal value mq = sin 2 (θ/2) due to energy relaxation as well as the large detuning between the carrier frequency of the qubit pulse and the even and odd parity qubit frequencies. (b) Charge-parity-switching rate Γ as a function of the average qubit-state measurement during feedback between parity measurements [ Fig. 2(b)]. For each of the eight measurements of Γ (gray), a different state ψ(θ) is prepared after each qubit measurement mq. Triangles represent measurements for which during feedback, the qubit was π-pulsed if the qubit was in |1 , followed by rotation to ψ(θ). Squares represent measurements for which during feedback, the qubit was π-pulsed if the qubit was in |0 , followed by rotation to ψ(θ). Both sets of measurements are fit to a single line in order to determine Γ 0 (blue diamond) and Γ 1 (red diamond). function p(0)p(τ ) . In this protocol, we separated two measurements of the parity by a variable delay τ . During this delay, we repeatedly measured the qubit state and used active feedback to control the mixture of Γ 0 and Γ 1 being measured. The qubit energy relaxation time as a function of flux varied from 20 − 70 µs. Since this is only somewhat longer than the time for qubit measurement (4 µs), the qubit state often changed between repeated measurements. Therefore, due to these jumps, measuring Γ while feeding back to |0 or |1 still resulted in a mixture of Γ 0 and Γ 1 . While the exact timing of such jumps was unknown, we estimated the fraction of the delay the qubit spent in |0 or |1 based on the qubit measurement record during τ . We repeated this measurement for different polarization angles to obtain values of Γ with different mixtures of Γ 0 and Γ 1 , and we plotted these values of Γ vs. m q , the average qubit measurement during τ . Then, since Γ ≈ m q (Γ 1 − Γ 0 ) + Γ 0 , we extrapolated a linear fit of Γ( m q ) to obtain Γ 0 , Γ 1 .
To summarize, the measurement protocol at each flux point consisted of the following: (1) Perform a Ramsey experiment to determine δf for the parity measurements p.
(5) Repeat steps (2)-(4) for each θ. Steps (2)-(5) took ∼ 30 seconds, during which n g was typically stable. If it was found that n g changed between successive Ramsey measurements, the data between them was not used since the parity measurements were unreliable. The full protocol took ∼ 1 hour per flux point.
We now describe how θ was chosen and implemented in the pulse sequence. The extremal cases are simplest. In order to spend maximal time in |0 , we prepared ψ(θ = 0). The feedback protocol was to π-pulse the qubit if m q = 1, otherwise do no pulse. In order to spend maximal time in |1 (ψ(π)), the protocol was to π-pulse the qubit if m q = 0, otherwise do no pulse. For angles in between, due to the nature of the feedback implementation with the FPGA, only one dynamical angle could be used for rotation. As a result, the protocol required an additional π-pulse on either |0 or |1 . For example, to prepare ψ(π/4), the protocol could be: if m q = 0, rotate by π/4; if m q = 1, do a π-pulse and then rotate by π/4. Alternatively, the extra π-pulse could be enacted instead on |0 with 3π/4 rotations to also prepare ψ(π/4).
For the measurement in Fig. 2(d), eight polarization angles were used: θ = 0, π/6, π/3, π/2, with the extra πpulse on |1 [ Fig. 9, triangles] and θ = π/2, 2π/3, 5π/6, π with the extra π-pulse on |0 Fig. 9, squares]. Fig. 9(a) shows the average qubit state measurement m q for all eight instances of θ as a function of the expected m q = sin 2 (θ/2). We observe that the measured m q corresponding to angles in which the extra π-pulse was acted on |0 and those measured with the extra π-pulse on |1 deviate from the expected value. We attribute this to two effects. First, energy relaxation tends to shift m q toward the thermal value. Second, large detuning δf (up to ≈ 14 MHz) between the pulse carrier frequency and the even and odd parity qubit frequencies affects the value of m q when preparation of ψ(θ) is performed with the extra π-pulse. For example, δf ≈ 5 MHz for the data depicted in Fig. 9, and thus the qubit state evolves significantly during the time between the π-pulse and the nominal rotation to θ. Even with no intentionally added delay between these pulses, there is effectively time between the rotations due to the Gaussian shape of the pulses. Depending on the specific δf and effective evolution time, the true prepared state ψ may differ significantly from the nominally expected value ψ(θ).
However, calculating the functional dependence of m q on δf and the pulse lengths was not necessary for this work. The measurements of Γ 0 and Γ 1 relied on measurement of Γ with different mixtures of |0 and |1 and not specific values of m q . Fig. 9(b) shows that the parity-switching rate Γ can be fit by a single line for feedback protocols with the extra π-pulse on |0 or |1 , since ultimately the value of Γ depends on time spent in |0 and |1 which is estimated by m q given the measurement record. It was also checked that ψ(0) and ψ(2π) gave the same values of Γ 0 , Γ 1 within measurement uncertainty, showing that the drive power itself does not influence the measured value of Γ.
APPENDIX D: MEASUREMENT SETUP AND DEVICE IMAGES

Device images
The offset-charge-sensitive SQUID transmon is comprised of e-beam evaporated aluminum on a sapphire substrate. The fabrication process was the same as that described in detail in the Supplemental Material of Refs. [19,32].

Measurement setup
The qubit was mounted in a 3D Cu microwave readout cavity and measured in reflection (κ = 3.5 MHz) as diagrammed in Fig. 11 with a SNAIL parametric amplifier providing initial amplification to achieve single-shot qubit-state readout. An Eccosorb filter inside the aluminum and magnetic shields was included, which has been demonstrated to reduce the parity-switching rate [32]. A 2 cm section of manganin wire (≈ 1.4 Ω) was suspended in the aluminum can by superconducting leads and acted as an adjustable source of PAPS-inducing photons. A RuOx thermometer was mounted on the bracket holding the Cu readout cavity in order to monitor the temperature of the bracket as power was dissipated by the manganin wire. A dc voltage bias was added to the RF input line with a pair of bias tees, such that a dc voltage could be applied to the readout pin in the cavity in order to bias the offset charge. All lines entered the shields via a narrow slot opening and copper tape was used to make the slot as light-tight as possible.

APPENDIX E: FOUR-FILM MODEL FOR QP DYNAMICS
Here, we provide additional detail to Section IV A on the calculation of x 0 and x 3 , the QP densitites of the JJ Optical image of qubit fabricated on the same wafer as the experimental device, which is nominally identical. The end of the right pad is not in view, but is symmetric with the pad on the left. (lower) Scanning-electronmicroscope image of the SQUID loop of the optically imaged device, in which the difference of the areas of the two Josephson junctions are apparent. The qubit is fabricated using the bridge-free technique in a liftoff process as described in Refs. [19,32]. films that determine Γ N . We consider the QP dynamics in all four films of the device in order to determine these values, taking into account generation of QPs by PAPS and tunneling across the JJ self-consistently with Γ P and Γ N , respectively. We first consider x 0 and x 1 , the QP densities of the films on the left side of the JJ in Fig. 3(a). These QP densities may change by mechanisms of Eq. (2) (generation, trapping, and recombination), but also by tunneling between films of the device. QPs may tunnel between the low-and high-gap films 0 and 1 (t 01 , t 10 ) as well as across the JJ between films 0 and 3 (γ 03 x 0 , γ 30 x 3 ). The interfilm transport can be described by the coupled equations: Below, we describe our approach to these inter-film tunneling terms. While films 0 and 1 are separated by an AlO x layer, they share a large contact area in the pads, so QPs tunnel rapidly between the films. In Section III B, we showed evidence that QPs thermalize to distributions primarily located in the low-gap films on each side of the JJ. We therefore assume Fermi distributions for the QPs, f (ε, T ph , µ L ) = 1/(e (ε−µ L )/kBT ph + 1) with films 0 and 1 sharing the same nonzero chemical potential µ L to describe an excess QP number independent of the temperature T ph [27,50]. The QP densities in each film x 0 (x 1 ) are related to µ L through the definition of x QP = 2 ∞ 0 dε ν(ε, ∆ 0(1) )f (ε, T ph , µ L ). Due to the difference between the gaps, this thermalization results in x 1 ≈ x 0 e −η , where η = δ∆/k B T ph . In our model, we therefore replace the tunneling rates between films 0 and 1 (t 01 and t 10 ) with this assumption of rapid thermalization. We add Eqs. (14) and (15) and substitute x 1 = x 0 e −η . The same approach is applied to films 2 and 3, which likewise share a large contact area in the opposite pad, reducing the number of independent QP densities in the system from four to two. This results in two coupled equations for x 0 and x 2 : As described in Section IV A, if the qubit were in thermal equilibrium with the QPs, the films on opposite sides of the JJ would share the same chemical potential. However, the qubit state is frequently pulsed during measurement of Γ, and as a result the tunneling from films 0 to 3 and films 3 to 0 are not equal. The QP densities on the opposite side of the JJ (x 0 and x 2 ) differ in the steadystate during the measurement due to this imbalance.
The rates γ 03 and γ 30 are the per-QP tunneling rates in the 0 to 3 and 3 to 0 direction, respectively. These may be calculated using only the S lr or S rl terms of S ±N [Eq. (11)], and assuming nonthermal qubit-state weights ρ 0 = ρ 1 = 0.5 caused by the measurement sequence. Because Γ N,03 is approximately proportional to x 0 and independent of x 3 for small x 3 such as those existing in this experiment, the per-QP tunneling rate γ 03 := Γ N,03 /(x 0 N CP,0 ) can be calculated. Similarly, γ 30 := Γ N,30 /(x 3 N CP,2 ). The number of Cooper pairs in each film N CP,k = 2D(ε F )∆ k V k , with film number k ∈ {0, 1, 2, 3} depends on the superconducting gap and volume. The single spin density of states at the Fermi energy D(ε F ) = 0.72 × 10 29 µm −3 J −1 [60] and the volumes are approximately 100 × 700 × 0.03 µm 3 for the low-gap films and 100 × 700 × 0.02 µm 3 for the high-gap films.
These per-QP tunneling rates depend on T ph and δ∆, and have significant dependence on flux. The imbalance Γ N,03 /Γ N,30 > 1 is largest when hf q = δ∆, since QPs at the low-gap edge in film 0 can absorb an excitation from the qubit to tunnel to the high-gap edge in film 3. Based on our fit, this results in 55% more QPs in films 2 and 3 as compared to films 0 and 1 at the flux for which hf q = δ∆.
We separate QP generation into generation by PAPS (g P ) and generation by other mechanisms (g other ), which break Cooper pairs but do not simultaneously result in a parity switch. While PAPS generates QPs only in JJ films 0 and 3, mechanisms contributing to g other could generate QPs in all the films. Therefore, we have g 0 + g 1 = (g P + g other,0 ) + g other,1 = g P + g other , where we have defined g other := g other,0 + g other,1 as the total generation on one side of the JJ. Likewise, g 2 + g 3 = g other,2 + (g P + g other,3 ) = g P + g other , where we have made the reasonable assumption that non-PAPS pairbreaking occurs identically on opposite sides of the JJ since the two pads have the same geometries. Generation by PAPS g P := Γ P /N CP,0 is calculated self-consistently with Γ P and g other is a flux-independent fit parameter in our model.
In Section IV A, the additional approximation is made that since e −η ≈ 0.008 for T ph ≈ 50 mK and δ∆ ≈ 4.844 GHz ≈ 233 mK, the high-gap films can essentially be removed from the determination of x 0 and x 2 . Eqs. (3) and (4) are obtained by substituting the above expressions for generation into (16) and (17), and making the approximation e −η = 0. We observed empirically that Γ increased approximately linearly with P lamp , the power dissipated by the manganin resistor "lamp". The emission spectrum of the lamp depends on the temperature of the lamp, which is determined by a balance of the dissipated heat and the thermal conductivity to the nearest cold heat sink (likely the mixing chamber). We expect dissipation to be uniform in the manganin filament of the lamp and independent of temperature for the range of our experiment. On the other hand, the thermal conductivity of the leads which determines the temperature of the lamp filament is itself temperature-dependent and may vary along the length of the leads.
As described Section V A, the lamp was required to compare QP generation that occurs with parity switching due to PAPS (g P ) and QP generation that does not occur with parity switching (g other ). For the purpose of constraining g other , measurement at any increased Γ without increased device temperature was sufficient, since a second set of {Γ P (Φ), Γ N (Φ)} constrained the model and fixed the strength of the trapping rate.
Nonetheless, we attempted to understand the linear dependence of Γ on the power dissipated by the lamp P lamp with a simple model that takes into account temperaturedependence of the thermal conductivity of the NbTi/Cu leads. We assume that the heat flow in the wire is constant in the steady-state, but the thermal conductivity of the Cu in the leads depends on the position x from the mixing chamber due to the temperature gradient [ Fig. 12 The temperature of the lamp T lamp is the temperature at T (x = l), where l is the length from the mixing chamber where the NbTi/Cu leads are thermalized to the lamp. The thermal conductance per unit length κ(x) depends on temperature according to the Weidemann-Franz law, κ(x)∂x = c κ T (x). The proportionality factor c κ is related to the electrical resistivity ρ and the Lorenz number L, c κ = L/ρ, and we treat it as a fit parameter here. Substituting for κ and integrating, we find that The mixing chamber temperature is T MC ≈ 30 mK.
Next, we propose that Γ P is proportional to the integrated power P int radiated from the lamp that enters the cavity and causes PAPS. Treating the lamp as a 3D blackbody, we find that integrating the radiated power from 100 − 300 GHz yields a dependence on the temperature of the lamp (T lamp ) that is approximately quadratic for temperatures ∼ 1 − 5 K [ Fig. 12(b)]. The upper bound of this frequency range of photons which induce PAPS is determined by the constraint that the size of the qubit must be smaller than the wavelength of the photon. Higher frequency photons with smaller wavelengths are increasingly likely to be absorbed in the pads of the device due to the larger absorption area.
In Fig. 12(c), the measured Γ is shown as a function of P lamp . We fit the data with a simple model in which the lamp radiates the spectral density of a 3D blackbody at T lamp , and Γ is proportional to P int , the integrated Power radiated by a 3D blackbody at temperature T integrated from 100 to 300 GHz. We observe that from 1 K to 5 K, the dependence is approximately quadratic in temperature with an offset of 1 K. At higher temperatures, Pint ∝ T . (c) Measured Γ as a function of the power dissipated by the lamp (orange). The data is the same as Fig.  4(b) but here shown in log-log scale. We fit it with the model in which Γ ∝ Pint(T lamp ), where T lamp depends on P lamp according to Eq. (19). In blue, we fit assuming the lamp is at the mixing chamber temperature when no power is dissipated (TMC = 0.03 K). The model captures the linear behavior between ≈ 1−10 µW but not the low-power dependence of Γ. In green, in an attempt to fit Γ(P lamp < 1 µW), we allow T lamp at P lamp = 0 µW to vary as a fit parameter, which yields TMC = 1.3 K. This is an unreasonably high temperature, indicating that this model does not accurately capture the low power behavior. (d) The temperature of the lamp T lamp as a function of P lamp , calculated using the fit parameters of (b). In blue, T lamp (0) is assumed to be the mixing chamber temperature, while in green, T lamp (0) = 1.3 K. In both cases, power between 100 and 300 GHz radiated by the lamp: Γ = AP int (T lamp ) + B. In this model A is an unknown proportionality constant between Γ and P int which includes attenuation and coupling of radiation to the qubit which is assumed to be frequency-independent for simplicity. This scale factor also includes the additional Γ resulting from increased Γ N due to the QPs generated by the increased Γ P (when g P g other , Γ N is approximately proportional to Γ P ). The lamp-power-independent offset B accounts for background PAPS from other sources and NUPS from QPs existing at P lamp = 0 µW. Combining T lamp ∝ P 1/2 lamp with P int ∝ T 2 lamp , we expect a linear dependence of Γ P on P lamp for a range corresponding to T lamp ≈ 1 − 5 K. In the range of P lamp ≈ 1 − 12.6 µW, the data shows this linear dependence.
The blue line shows a fit with T MC = 0.03 K, which models the linear part of the data reasonably well but does not capture the low power behavior. The fit parameter c κ corresponds to an electrical resistivity of ρ = 2.9 × 10 −9 Ωm. Shown on log-log scale, it is clear that Γ(P lamp ) is not linear for P lamp 1 µW. The fit to this range can be improved by allowing T MC to vary (green), but yields T MC = 1.3 K, which is forty times hotter than the mixing chamber. Thus, while this model does capture the observed linear behavior for P lamp ≈ 1 -12.6 µW, the low P lamp behavior is not described by this simple model. This is reasonable given that the model assumed T lamp T MC and g p g other .

Effect of lamp on temperature
To confirm that the manganin lamp was increasing Γ by PAPS rather than by increasing the temperature of the device and generating additional thermal QPs, we mounted a ruthenium oxide (RuOx) thermometer on the bracket holding the copper cavity. The temperature measured by this thermometer is plotted for each measurement of Γ in Fig. 13(a). The difference between the temperature measured by this thermometer and the mixing chamber temperature T MC ≈ 30 mK attributed to the thermal resistance from the end of the bracket to the mezzanine to the mixing chamber. We observe that while the temperature increases by several mK as the power dissipated by the lamp increases, this increase is far too small to generate a significant number of thermal QPs. This was verified by performing a separate sweep in which we increased the temperature by a heater mounted on the mixing chamber plate [ Fig. 13(b), red]. We can see that when the bracket temperature measured by the RuOx increased due to the remote heater by similarly small amounts, there was minimal change in Γ. Only when the bracket RuOx reached approximately 150 mK did the increase in Γ match what was produced by 12.6 µW dissipated by the lamp. Fig. 13(b) shows that the increase in Γ due to power dissipated by the lamp cannot be explained by an increased temperature of the sample.
This dependence of Γ on the mixing chamber temperature was used to estimate the average gap of the two superconducting films∆ = (∆ L + ∆ H )/2, since thermal activation of QPs will be sensitive to this value. The black solid line shows a fit to a model in which there is temperature-independent PAPS and temperature-dependent NUPS due to the QP energy distribution changing and thermal QPs being generated at higher temperature. The only fit parameters used in this model are Γ P and∆. The fitted∆ is not sensitive to the assumed values for trapping and recombination rates that are fixed parameters in the model. In this case, we set g other = 0, since the data cannot distinguish between a temperature-independent PAPS rate vs. a temperatureindependent background of excess QPs. Repeating the  Fig. 4(c). The colors correspond to different lamp powers. The thermometer is mounted on the copper bracket supporting the 3D readout cavity in which the qubit is embedded. The temperature remains well below the temperature at which thermal QPs are generated for all values of Φ/Φ0, P lamp . (b) The measured parity-switching rate Γ as a function of temperature measured by the RuOx thermometer near the readout cavity. Two sweeps are shown. In yellow, the lamp power is swept [ Fig. 13(b)], and here we observe the bracket temperature does not change significantly as Γ increases by this mechanism. In red, the mixing chamber temperature is swept by increasing power dissipated by heaters on the mixing chamber plate. The temperature measured by the RuOx increases as the mixing chamber temperature rises, but Γ doesn't increase dramatically until approximately 125 mK. This shows that the slight temperature increases shown in (a) cannot be responsible for the large increase in Γ. The black line shows a fit to Γ vs. T , from which we may extract the average gap∆/h ≈ 51.8 GHz.
fit with g other as a fit parameter and Γ P = 0 yielded the same∆. In each case, we found∆/h = 51.8 GHz, which is consistent with reported superconducting gap measurements of thin-film aluminum [43,45].   Fig. 4(c) to the trapping rate s. In Fig. 14(a), we plot the pseudo-R 2 goodness-of-fit metric for fits to the full data set fixing s (blue, left axis). We use the definition R 2 = i [1 − (S res,i /S tot,i )]/4, i ∈ {0, 1, 2, 3}, where S res,i and S tot,i are the sum of squares difference between the data FIG. 14. (a) Pseudo-R 2 goodness-of-fit metric for fits to the full data set as a function of trapping rate s (blue, left axis). The value of g other obtained in each fit is also plotted for reference (red, right axis). Vertical black lines indicate values of s for which the fits are shown in (b). (b) Fits to the P lamp = 0 µW (left) and P lamp = 12.6 µW (right) data for the values of s marked by vertical lines in (a). The fits corresponding to lower values of R 2 (i.e. s = 2 s −1 , s = 100 s −1 ) appear to describe the intermediate peak at Φ/Φ0 = 0.145 less well for the P lamp = 12.6 µW as compared to the best fit (s = 11 s −1 ). and the model (S res,i ) and between the data and mean (S tot,i ) measured at the ith lamp power.
We observe a maximum of R 2 near s = 11 s −1 corresponding to the best fit. For s 6 s −1 , generation by PAPS is sufficient to generate the approximate x QP in the device, such that the contributions of g other found by the fits in this range are negligible. For s 6 s −1 , the best fit g other increases linearly with s in order to keep the x QP at the value which best fits the data for P lamp = 0 µW. Fig. 14(b) illustrates how values of s differing from the best fit value affect the model. We observe that for a lower value of s = 2 s −1 , the additional QPs generated by the enhanced Γ P with P lamp = 12.6 µW cause a level of Γ N that gives a peak at Φ/Φ 0 ≈ 0.145 which is too large compared to the data. For a higher value of s = 100 s −1 , the strong trapping rate suppresses the additional QPs generated by the enhanced Γ P with P lamp = 12.6 µW, causing a level of Γ N that gives a peak at Φ/Φ 0 ≈ 0.145 which is too small compared to the data.

APPENDIX H: DIRECT OBSERVATION OF QUASIPARTICLE BURSTS
One possible source of pair-breaking energy for QP generation (i.e., a contribution to g other ) is ionizing radiation. The energy cascade from the ionization of atoms in the superconducting films or substrate, described in [33,37,38,61], ultimately results in "bursts" of QPs. Experimentally, evidence for these bursts has thus far come in the form of sudden drops in T 1 in superconducting qubits or drops in the quality factor of superconducting resonators. In this device, we found evidence of rapid parity switching which further substantiates the hypothesis that these sudden drops in T 1 are due to QPs. By tuning the flux of our SQUID offsetcharge-sensitive transmon such that E J /E C 20, we were able to measure the parity directly via the paritydependent dispersive shift as described in Ref. [32]. With a quantum-limited SNAIL parametric amplifier [62], we were able to discern the parity with a single 4 µs measurement of the readout cavity.
In addition to generating high-energy phonons which break Cooper pairs and generate QPs, ionizing radiation impacts can also cause n g jumps due to the redistribution of charge in the substrate. With this direct detection technique, which tracks parity switching and charge jumps simultaneously, we were able to quantify how many of the observed burst events are correlated with an n g jump. In Fig. 15, we show two examples of bursts: one in which the n g is unchanged (a-d) and one in which n g changes visibly before and after the burst (e-h).
In Fig. 15(a), we show a 20 ms-long jump trace of qubit measurements, with the I quadrature of the complexvalued signal plotted. Fig. 15(b) shows the 2D histogram in the complex plane of the full 100 ms time series from which the data in (a) is drawn, which consists of 2 × 10 5 qubit measurements. The readout pulse length and integration time were 4 µs, with 1 µs delay between measurements. We can see the measurements primarily form two Gaussian distributions near Q = 0, which we inter-pret as the even and odd parity ground states (even and odd are assigned arbitrarily). The remaining population corresponds to higher energy states. At t < 0, we can see from the value of I that the qubit initially has even parity and then switches to odd parity at t ≈ −7 ms and back to even at t = −3 ms. The background parity-switching rate at this flux value was Γ ≈ 600 s −1 ≈ 1/1.7 ms during this cooldown. Fig. 15(c) zooms in on a 4 ms section around a burst event, which we label as occurring at t = 0. We observe that at t = 0, the value of I jumps to a value between the values corresponding to |0, e and |0, o . This is likely due to the parity switching much faster than the 5 µs repetition time of our measurement. We confirm that this is not a result of qubit excitation by checking the 2D histogram of points during this 4 ms segment, in which it is clear that smearing between |0, e and |0, o is much more prevalent than qubit excitation [ Fig. 15(d)]. Following this jump, the qubit state switches between the |0, e and |0, o states much faster than the 1.7 ms lifetime characteristic of the majority of the jump trace. The parity-switching rate appears to decay significantly by t = 10 ms, although it is not clear that it has returned to the consistent background rate. Fig. 15(e-h) are analogous to (a-d), except that at t = 0, there is a sudden change in the dispersive shifts of the qubit states signalling a jump in n g in addition to the onset of rapid parity-switching. In this data set, we search for bursts when the configuration of n g -dependent dispersive shifts provide maximal distinguishability of the even and odd parity states (the configuration of (a-d)). In the event shown in Fig. 15(e-h), the dispersive shift of the |0, o state changes at t = 0, changing from I/σ ≈ 0 to the optimal configuration in which I/σ ≈ 6. Impacts which cause n g jumps out of this optimal configuration into others will not be detected; however, we note that they will occur at the same rate as jumps into the optimal configuration.
We observed approximately 209 events in ≈ 5.6 hours of data. Of the 209 impacts, we find that 60 are correlated with an n g jump, which provides additional information about the location of the impacts. Adding in an equal number of undetected jumps out of the optimal configuration, we estimate an event rate of ≈ 1/75 s. This is consistent the impact rates reported in Refs. [37,61,63] after taking into account the area of the substrate (our sapphire substrate is 3 mm x 15 mm). The extent to which bursts contribute to the background excess QP density and parity-switching rate depends on the energy deposited by the ionizing radiation, the frequency of the events, and the timescale on which the density decays. Further work is necessary to estimate the energy deposited by these events and to understand the decay of the QP density after a burst, which may involve several timescales [38,61]. Rapid switching between parity states begins at t = 0, which we attribute to a "burst" of QPs generated by ionizing radiation. (b) 2D histogram of qubit-state measurements in the complex plane for the full 100 s time series from which the data in (a) is drawn, shown for state identification. Most of the counts form two Gaussian distributions with positive Q, which we attribute to the even and odd parity ground states, respectively. The counts at negative Q correspond to excited states of the qubit. Data is plotted in log scale for visibility of excited states, and dashed lines indicate 2σ. (c) Zoom in on (a) showing 4 ms around the onset of the burst. The value jumps to a value between |0, e and |0, o due to switching that is fast compared to the 1/(5 µs) measurement repetition rate. This is followed by rapid switching between the parity states which decays in frequency over time. (d) Same as (b), but for only the data in the window between the red lines shown in (c). The data is primarily in the ground states, showing that the data near I ≈ 0 at t = 0 is due to rapid switching rather than a jump to an excited state. (e-f) are analogous to (a-d), but show a burst correlated with a jump in ng due to the redistribution of charge in the substrate resulting from the ionizing impact. At t < 0, the |0, o state is located at I/σ ≈ 0, Q/σ ≈ 6, and parity switches occur between the even and odd parity ground states in this configuration. When the burst occurs at t = 0, the |0, o state jumps to the same configuration as in (a-d), indicating a jump in ng correlated with a burst.