Using the Autler-Townes and ac Stark effects to optically tune the frequency of indistinguishable single-photons from an on-demand source

We describe how a coherent optical drive that is near-resonant with the upper rungs of a three-level ladder system, in conjunction with a short pulse excitation, can be used to provide a frequency-tunable source of on-demand single photons. Using an intuitive master equation model, we identify two distinct regimes of device operation: (i) for a resonant drive, the source operates using the Autler-Townes effect, and (ii) for an off-resonant drive, the source exploits the ac Stark effect. The former regime allows for a large frequency tuning range but coherence suffers from timing jitter effects, while the latter allows for high indistinguishability and efficiency, but with a restricted tuning bandwidth due to high required drive strengths and detunings. We show how both these negative effects can be mitigated by using an optical cavity to increase the collection rate of the desired photons. We apply our general theory to semiconductor quantum dots, which have proven to be excellent single-photon sources, and find that scattering of acoustic phonons leads to excitation-induced dephasing and increased population of the higher energy level which limits the bandwidth of frequency tuning achievable while retaining high indistinguishability. Despite this, for realistic cavity and quantum dot parameters, indistinguishabilities of over $90\%$ are achievable for energy shifts of up to hundreds of $\mu$eV, and near-unity indistinguishabilities for energy shifts up to tens of $\mu$eV. Additionally, we clarify the often-overlooked differences between an idealized Hong-Ou-Mandel two-photon interference experiment and its usual implementation with an unbalanced Mach-Zehnder interferometer, pointing out the subtle differences in the single-photon visibility associated with these different setups.


I. INTRODUCTION
The single-photon source (SPS) as a resource for quantum information technology has in recent years exhibited great progress in experimentally achieved efficiency and quantum state purity, pushing the technology towards practical near-term applications. Recent advances have enabled single photons to be generated on-demand with efficiencies exceeding 50% [1][2][3] and near-unity quantum indistinguishability [4] and purity [5,6], facilitating advances in boson sampling [7] and quantum key distribution [8][9][10], and even approaching minimum fidelities required for efficient linear optical quantum computation [11,12].
For on-demand SPSs, these advances have largely been achieved using semiconductor quantum dots (QDs), where the dipole-active transition of an electron-hole pair (exciton) across the band gap in conjunction with the three-dimensional confinement afforded by the QD geometry provides an excellent quantum two-level system which, when inverted by excitation, emits a single photon radiatively. Additional challenges to implementation of SPSs which have seen recent progress are the desirable criteria of scalability [13][14][15][16], and frequency tuning [17], as QDs are typically grown such that energy levels are stochastic in nature, but many applications require many SPSs with degenerate frequencies. Effective methods for frequency tuning QD SPSs (with a large variance in attainable bandwidth between methods) include electrical tuning [18,19], strain tuning [20][21][22], quantum frequency conversion via optical nonlinearity [23], and multi-photon Raman transition processes utilizing multilevel systems [24][25][26]. This last all-optical tuning process typically involves two sequential laser pulses, and uses the biexciton (two exciton) state, which extends the two-level structure of the QD to a cascade-type ladder system, and as such is applicable to any ladder system involving three or more energy levels, not just QDs.
* cgustin@stanford.edu
In a similar manner, we have shown recently (and demonstrated experimentally using the QD biexciton-exciton cascade) how on-demand frequency-tunable single photons can be generated from such a ladder system with high efficiency, indistinguishability, and purity, by instead using a single pulse excitation in the presence of a cw laser dressing the exciton-biexciton transition [27]. Depending on whether the cw laser is resonant or detuned, this SPS then operates using either the Autler-Townes (AT) effect or ac Stark shift, respectively. Such an approach allows for the potential of all-optical frequency modulation of the emitter resonance [28], which has applications including creating high-dimensional entangled quantum states [29,30], and topological states [31]. This optical frequency tuning may potentially also improve the performance of entangled photon pair sources in QDs [32][33][34][35][36], where the small fine structure splitting of polarized excitons can degrade entanglement fidelity.
While the possibilities of frequency-tuning a SPS at the level of the coherent optical system dynamics are interesting, single photon emitters for practical quantum information technology applications have very stringent requirements on efficiency, single-photon indistinguishability, and purity. It is thus an important question from a theoretical perspective what role the cw dressing laser plays in these SPS figures of merit, and how the source can be designed to minimize any degradation it causes. The analysis required to answer such a question would supplement and extend previous theoretical work that has helped elucidate the limits of SPS figures of merit in undressed systems, including the role of the pulse, cavity, and electron-phonon scattering [1,4,5,37-40].
In this work, we address this question in detail by studying theoretically a four-level ladder system, which can be physically realized using the QD biexciton-exciton cascade, as we have done in Ref. [27]. We find that the primary effects of source figure-of-merit degradation come from undesirable spontaneous emission from the higher energy state, and, in the case of semiconductor QDs, electron-phonon scattering induced by the cw laser causing excitation-induced dephasing during the emission process and, usually, increased population of the higher energy (biexciton) state. However, we find that incorporating an optical cavity resonant with the lower energy (exciton) state can mitigate these effects by accelerating emission into the preferred cavity mode. The ac Stark regime offers better SPS efficiency and indistinguishability at the cost of much reduced bandwidth of achievable frequency-tuning. We note that in this work we refer to the states of the system as QD (bi)excitons, but all results are also presented for the case of no phonons, which is generally applicable to any quantum four-level system, and the principles apply equally to a three-level system, with modified population dynamics due to the reduced number of decay channels.
The layout of the rest of the paper is as follows: in Sec. II, we introduce our frequency-tunable SPS design and basic principles of operation, based on a four- (or three-) level quantum ladder system, of which the biexciton cascade in QDs is one physical realization. We present our quantum master equation (ME) model of the SPS, including, for the case of QDs, coupling to phonon reservoirs using the polaron master equation (PME) method.
Next, we define in Sec. III the figures of merit we use to quantify the SPS fidelity. In particular, we include a quantum optical derivation of the two-photon interference visibility used to extract the single-photon indistinguishability of the source in a Mach-Zehnder (MZ) interferometer simulating a Hong-Ou-Mandel (HOM) interference experiment. The resulting expression is well known [41], and can be expressed in terms of the interferometer properties, the single-photon purity, and the single-photon indistinguishability. However, most theoretical works on the subject to date assume an HOM interferometer with two distinct SPSs; the MZ setup used in experiments only utilizes one physical SPS, which gives differing photon statistics. As a result, different expressions for the single-photon indistinguishability have existed in the literature, a discrepancy which becomes particularly important when the purity is non-ideal, as recent work has highlighted [42]. For the sake of comparing results directly to experiment, we expect this derivation will be of use in bridging the gap between theoretical and experimental works in the literature.
In Sec. IV, we discuss our main results for the operation of the SPS in both AT and ac Stark regimes, and show how for the case of the QD SPS the phonon bath influences the performance of the device in both regimes and places limits on achievable figures of merit. We also show how an optical cavity can be used to significantly improve device performance by reducing timing jitter and phonon-related decoherence by means of selectively increasing the desired dipole transition rate. We then discuss aspects of the initial pulse excitation, including the effect of cw dressing on the source purity. Finally, in Sec. V we conclude.
We also include five Appendices: in Appendix A, we present a full analytical solution for the efficiency and indistinguishability for the case of resonant laser dressing in the absence of phonon effects. In Appendix B, we show how a unitary transformation to the dressed state basis and secular approximation can be used to remove fast-oscillating terms in the ME, which drastically improves computational efficiency for most numerical calculations. In Appendix C, we use a weak phonon coupling approximation, appropriate for the regimes studied in this work, to derive simple analytical expressions for the phonon interaction terms, and give an intuitive physical picture for the phonon processes. In Appendix D, we show how to extend the PME to include a time-dependent excitation pulse which we use to calculate the SPS purity. Lastly, we include in Appendix E a study of the cw error rate induced by the far off-resonant excitation of the system by the cw drive (otherwise neglected for most of our analysis), which can typically be mitigated by spectral filtering.

II. THEORETICAL MODEL OF A FREQUENCY-TUNABLE SPS
In this section, we present the main theoretical model we use to study the frequency-tunable SPS using the QD biexciton-exciton cascade, and describe its regimes of operation. In Sec. II A, we describe the four-level cascade model and present the ME of the system Hamiltonian under cw driving with radiative emission. In Secs. II B and II C we describe the AT and ac Stark regimes of operation, respectively, and in Sec. II D we describe how we model the electron-phonon interaction using the PME. Further detail and characterization of our SPS scheme, including emission spectra, can be found in Ref. [27].

A. Quantum ladder model
We model the quantum ladder cascade system for the practical physical realization of the semiconductor QD as a four-level system with ground $|G\rangle$, excitons $|X\rangle$ and $|Y\rangle$ (with orthogonal linear polarizations), and biexciton $|B\rangle$ states, with the $|B\rangle$-$|X\rangle$ transition dressed by a coherent drive with strength $\Omega_{\rm cw}$. The coherent laser also weakly couples the $|X\rangle$-$|G\rangle$ transition, which we assume to be far-detuned due to the biexciton binding energy.
The total Hamiltonian for this setup, neglecting for now phonon coupling and radiative emission, is given by Eq. (1) (letting $\hbar = 1$ throughout), where $\omega_i$ is the energy of the $i$th state and $i \in \{X, Y, B\}$. The undressed system (i.e., with $\Omega_{\rm cw} = 0$) gives rise to X-polarized fluorescence emission energies at $\omega_B - \omega_X$ and $\omega_X$. In Fig. 1(a), we show a schematic of this undriven system in this bare state basis.
The system is driven at (near-)resonant cw frequency $\omega_{\rm cw} = \omega_B - \omega_X - \delta$ with an X-polarized laser, such that $\delta$ is the laser detuning from the biexciton-X-exciton transition. By moving into an interaction picture, with the biexciton binding energy defined as $E_B = 2\omega_X - \omega_B$, and performing the rotating wave approximation, we obtain the time-independent system Hamiltonian of Eq. (2). Equation (2) contains a far-detuned drive coupling the $|X\rangle$-$|G\rangle$ transition via the $\sigma^X_x$ term, and as such can model weak cw excitation of the exciton from the ground state. We shall assume for proper device operation that $E_B \gg |\delta|, \Omega_{\rm cw}$ in all cases, such that this coupling can generally be neglected. However, we shall use Eq. (2) for the simulations in Appendix E where we model the cw error rate. Neglecting this term, we can instead move into a different rotating frame such that the system Hamiltonian $H_S$ takes the form of Eq. (3). This Hamiltonian has eigenenergies $E_\pm = \pm\eta/2$, where $\eta = \sqrt{\Omega_{\rm cw}^2 + \delta^2}$, with corresponding eigenstates $|\pm\rangle$ given in Eq. (4). The frequency splittings apparent in $E_\pm$ allow for frequency tuning of the source via radiative transitions between the dressed energy levels. Without yet considering phonon coupling, we can model spontaneous emission using the Lindblad ME of Eq. (5) for the reduced density operator of the system $\rho$, where we have included radiative decay from both excitons with rate $\gamma_X$, and from the biexciton with total rate $\gamma_B$ (i.e., we assume throughout that orthogonal polarization channels have equal decay rates).
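As a quick numerical check of the dressed-state energies quoted above, the following minimal sketch diagonalizes a two-level Hamiltonian in the $\{|X\rangle, |B\rangle\}$ subspace. The explicit matrix form, the sign conventions, and the function names `dressed_states` and `apply_h` are assumptions made for illustration (the source does not show them here); the form is chosen only so that its eigenvalues reproduce $E_\pm = \pm\eta/2$.

```python
import math

def dressed_states(omega_cw, delta):
    """Eigenpairs of an ASSUMED 2x2 Hamiltonian in the (|X>, |B>) basis,
    H = [[-delta/2, omega_cw/2], [omega_cw/2, +delta/2]], with hbar = 1.
    Requires omega_cw > 0."""
    eta = math.hypot(omega_cw, delta)             # eta = sqrt(Omega_cw^2 + delta^2)
    norm = math.sqrt(2.0 * eta * (eta + delta))   # normalization of both eigenvectors
    plus = (omega_cw / norm, (eta + delta) / norm)    # (X, B) amplitudes, energy +eta/2
    minus = ((eta + delta) / norm, -omega_cw / norm)  # (X, B) amplitudes, energy -eta/2
    return eta, plus, minus

def apply_h(omega_cw, delta, vec):
    """Apply the assumed Hamiltonian to a vector of (X, B) amplitudes."""
    x, b = vec
    return (-0.5 * delta * x + 0.5 * omega_cw * b,
            0.5 * omega_cw * x + 0.5 * delta * b)

# Example with hypothetical values (arbitrary energy units).
eta, plus, minus = dressed_states(omega_cw=1.0, delta=0.5)
print(eta, apply_h(1.0, 0.5, plus), apply_h(1.0, 0.5, minus))
```

Applying `apply_h` to each returned vector gives the same vector scaled by $\pm\eta/2$, which is the eigenvalue statement made in the text.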

B. Autler-Townes regime
For the case of no detuning (or small detuning relative to the drive strength, $|\delta| \ll \Omega_{\rm cw}$), and a drive strength which exceeds the decay rates of the system (i.e., $\Omega_{\rm cw} \gg \gamma_X$), the SPS operates in the AT regime, where the $|\pm\rangle$ eigenstates of the Hamiltonian in Eq. (3) become symmetric and antisymmetric superpositions of $|X\rangle$ and $|B\rangle$ states, with an AT energy splitting of $\Omega_{\rm cw}$. The emission spectrum from the $|X\rangle$-$|G\rangle$ transition then consists of two peaks with energies $\omega_X \pm \Omega_{\rm cw}/2$. These peaks have nearly equal spectral weight (area), and as such the efficiency of a device operating in the AT regime is at most $\sim 1/2$ if only one of the peaks is of interest. In Fig. 1(b), we show a schematic of the four-level system QD model operating in this regime, and the associated energy splittings.
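The peak positions in this regime can be illustrated with a short sketch. The function name `xg_peak_energies` and the numerical values are hypothetical; the extension to nonzero detuning, $\omega_X + (\delta \pm \eta)/2$, is an assumption that follows from placing the undressed $|X\rangle$ level at $-\delta/2$ in the rotating frame, and it reduces to $\omega_X \pm \Omega_{\rm cw}/2$ on resonance.

```python
import math

def xg_peak_energies(omega_x, omega_cw, delta=0.0):
    """Energies of the two |X>-|G> emission peaks under cw dressing.

    On resonance (delta = 0) this gives omega_x -/+ omega_cw / 2. The
    delta != 0 generalization is an assumption stated in the lead-in."""
    eta = math.hypot(omega_cw, delta)
    lower = omega_x + 0.5 * (delta - eta)
    upper = omega_x + 0.5 * (delta + eta)
    return lower, upper

# Hypothetical example: exciton at 1300 meV, drive strength 0.1 meV.
lower, upper = xg_peak_energies(1300.0, 0.1)
print(lower, upper, upper - lower)
```

On resonance the two peaks sit symmetrically about $\omega_X$ and are separated by exactly $\Omega_{\rm cw}$.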

C. ac Stark regime
FIG. 1. [...] (d) Example schematic of a phonon-assisted excitation process; shown here is the process where the excitation from the ground $|G\rangle$ to excited $|X\rangle$ state, which is detuned by $E_B + \delta$, is assisted by the annihilation of a phonon in the phonon bath with energy $\sim E_B + \delta$. Note for (b,c) we have not shown the $|Y\rangle$ state which is involved in another decay channel as seen in (a).
For larger detunings ($|\delta| \gg \Omega_{\rm cw}$), the $|\pm\rangle$ dressed eigenstates of the Hamiltonian in Eq. (3) become unequal superpositions of $|X\rangle$ and $|B\rangle$ states, and as such, a system initialized in the $|X\rangle$ state will tend to emit photons preferentially from the eigenstate which contains a higher amplitude of the $|X\rangle$ state; the emission spectrum will consist of a dominant peak from the transition from this dressed state to the ground state and a subdominant peak from the other dressed state transition to the ground state. As the detuning is increased even further relative to the drive strength, the subdominant peak becomes negligible, and the spectrum consists of a single Stark shifted peak. In this limit, one also requires the detuning and drive rates to greatly exceed the damping.
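To see how the balance between the two peaks changes with detuning, one can compute the squared overlap of each dressed state with $|X\rangle$. This is only a rough proxy for the relative spectral weights (the true weights also depend on the decay dynamics), and the closed forms below assume the same two-level Hamiltonian convention as the hypothetical matrix $[[-\delta/2, \Omega_{\rm cw}/2], [\Omega_{\rm cw}/2, \delta/2]]$; the function name `exciton_weights` is likewise illustrative.

```python
import math

def exciton_weights(omega_cw, delta):
    """Squared |X> amplitudes of the dressed states |+> and |->.

    Assumes H = [[-delta/2, omega_cw/2], [omega_cw/2, delta/2]] in the
    (|X>, |B>) basis, for which |<X|+>|^2 = (eta - delta) / (2 eta) and
    |<X|->|^2 = (eta + delta) / (2 eta)."""
    eta = math.hypot(omega_cw, delta)
    w_plus = 0.5 * (1.0 - delta / eta)
    w_minus = 0.5 * (1.0 + delta / eta)
    return w_plus, w_minus

# AT regime (resonant) versus ac Stark regime (detuning ten times the drive).
print(exciton_weights(1.0, 0.0))
print(exciton_weights(1.0, 10.0))
```

The resonant case returns equal weights, while a large detuning concentrates nearly all of the $|X\rangle$ character in a single dressed state, mirroring the single Stark shifted peak described above.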
We then define a parameter equal to the undressed resonance $\omega_X$ minus the frequency of the dominant peak in the spectrum, which quantifies the frequency shift achieved in the SPS: $\Delta_{\rm ac} = \mathrm{sgn}(\delta)\,(\eta - |\delta|)/2$. By construction, this quantity is positive (negative) when $\delta$ is positive (negative), corresponding to a red (blue) frequency shift. $\Delta_{\rm ac}$ can then be used to give the frequency shift of the dominant peak both in ac Stark shift and AT regimes, although $\Delta_{\rm ac}$ flips sign and thus changes discontinuously as $\delta \to 0$ as the dominant and subdominant peaks switch roles. Formally at $\delta = 0$, $\Delta_{\rm ac}$ is undefined, as both peaks with frequency shifts $\pm\Omega_{\rm cw}/2$ are equally prominent in the AT regime (without considering, e.g., electron-phonon coupling). We can also solve for the drive strength in terms of this frequency shift and laser detuning, $\Omega_{\rm cw} = 2\sqrt{\Delta_{\rm ac}(\Delta_{\rm ac} + \delta)}$. In the ac Stark regime, where $|\delta|/\Omega_{\rm cw} \gg 1$, $\Omega_{\rm cw} \approx 2\sqrt{\Delta_{\rm ac}\delta}$, and one also satisfies $\delta/\Delta_{\rm ac} \gg 1$. Thus, we shall use both criteria interchangeably to denote the ac Stark regime. Also in this limit, $\Delta_{\rm ac} \approx \Omega_{\rm cw}^2/(4\delta)$, which is the usual ac Stark shift encountered in perturbation theory.
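The relation between the frequency shift, the detuning, and the drive strength can be checked numerically. The sketch below assumes the shift takes the form $\mathrm{sgn}(\delta)(\eta - |\delta|)/2$, which is consistent with the sign convention, the $\pm\Omega_{\rm cw}/2$ limit at small detuning, and the perturbative limit $\Omega_{\rm cw}^2/(4\delta)$ stated in the text; the function names `stark_shift` and `drive_strength` are hypothetical.

```python
import math

def stark_shift(omega_cw, delta):
    """Frequency shift of the dominant peak; undefined at delta = 0."""
    if delta == 0:
        raise ValueError("shift is undefined on resonance (two equal peaks)")
    eta = math.hypot(omega_cw, delta)
    return math.copysign(0.5 * (eta - abs(delta)), delta)

def drive_strength(shift, delta):
    """Drive strength needed for a given shift; shift and delta share a sign."""
    return 2.0 * math.sqrt(shift * (shift + delta))

# Round trip, and comparison with the perturbative ac Stark shift.
shift = stark_shift(0.3, 1.2)
print(shift, drive_strength(shift, 1.2))
print(stark_shift(1.0, 50.0), 1.0 ** 2 / (4 * 50.0))
```

The round trip recovers the original drive strength, and for a detuning much larger than the drive the exact shift approaches the perturbative value.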
In Fig. 1(c), we show a schematic of the QD model operating in the ac Stark regime and the associated frequency shift $\Delta_{\rm ac}$ of the $|X\rangle$ state.
For the sake of this work, we do not explicitly define for what values of $\delta/\Delta_{\rm ac}$ or $\delta/\Omega_{\rm cw}$ the system enters either regime, but rather seek to understand the two regimes as limiting cases associated with these parameters.

D. Exciton-phonon coupling and the polaron master equation
It is well known that the coupling of excitons in semiconductor QDs to longitudinal acoustic (LA) phonon modes has important effects on their dynamics under optical driving, including excitation-induced dephasing, off-resonant feeding effects, Rabi frequency renormalization, and non-Markovian real phonon transitions which lead to the formation of a broad phonon sideband [43][44][45][46][47][48][49][50][51][52]. Using a spherical QD wavefunction model, the phonon coupling can be characterized by a super-Ohmic spectral function $J(\omega)$, where $\alpha$ is the phonon coupling constant and $\omega_b$ is a cutoff frequency which scales inversely with the size of the QD [53]. The Hamiltonian that couples excitons with phonons takes the form of the independent Boson model, which is exactly diagonalizable [54,55]. Employing a unitary "polaron" transform to a frame in which this interaction is diagonalized thus allows one to construct a perturbative expansion in the optical drive strength; this approach, under the Born-Markov approximation, yields the PME [47].
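For orientation, a super-Ohmic spectral function of this kind can be sketched numerically. The explicit form used below, $J(\omega) = \alpha\,\omega^3 \exp[-\omega^2/(2\omega_b^2)]$, is an assumption (a Gaussian-cutoff form commonly used for spherical QD models) and is not taken from this text; the helper names are hypothetical. The example values $\alpha = 0.04$ ps$^2$ and $\omega_b = 0.9$ meV are those of parameter set I.

```python
import math

HBAR_MEV_PS = 0.6582119569  # reduced Planck constant in meV * ps

def mev_to_rad_per_ps(energy_mev):
    """Convert an energy in meV to an angular frequency in rad/ps."""
    return energy_mev / HBAR_MEV_PS

def spectral_function(omega, alpha, omega_b):
    """ASSUMED super-Ohmic form J(w) = alpha * w^3 * exp(-w^2 / (2 w_b^2)).

    omega and omega_b are in rad/ps and alpha is in ps^2, so J is in 1/ps."""
    return alpha * omega ** 3 * math.exp(-omega ** 2 / (2.0 * omega_b ** 2))

omega_b = mev_to_rad_per_ps(0.9)
peak = math.sqrt(3.0) * omega_b  # maximum of this assumed form
print(omega_b, peak, spectral_function(peak, 0.04, omega_b))
```

For this assumed form the coupling is strongest near $\sqrt{3}\,\omega_b$, so drives or detunings on the scale of the cutoff frequency sample the phonon bath most efficiently.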
Assuming the different transitions in the cascade have equal dipole moments [56], phonon coupling is incorporated by adding to the ME in Eq. (5) the term $\mathcal{L}_{\rm PME}\rho$ given in Eq. (10), which is written in terms of system operators $X_m$ and a time-dependent complex phase term $\phi(\tau)$, with $\langle B\rangle = e^{-\phi(0)/2}$. We have absorbed a coherent attenuation factor $\langle B\rangle$ from the PME into our definition of $\Omega_{\rm cw}$ for easy comparison with the no-phonon case, as well as a polaron shift in exciton resonance frequencies.
Except for when we model the cw error rate, we can neglect the $\sigma^X_m$ term in $X_m$, which is consistent with using the approximate Hamiltonian in Eq. (3). The operators $X_m(-\tau) = U(\tau) X_m U^\dagger(\tau)$ are calculated using $U(\tau) = \exp[-iH_S\tau]$. Note this unitary transform can be simplified analytically when using Eq. (3) as $H_S$ [47,57], and we do this in Appendix B in the dressed state frame.
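The operator transformation $X(-\tau) = U(\tau) X U^\dagger(\tau)$ can be sketched for a two-level $H_S$ using the identity $H_S^2 = (\eta/2)^2$, which gives $U(\tau) = \cos(\eta\tau/2) - i\sin(\eta\tau/2)\,(2H_S/\eta)$. The matrix form of $H_S$ and the example operator (a $\sigma_x$-like coupling scaled by $\Omega_{\rm cw}/2$) are assumptions for illustration, not the operators of the source, and all function names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def propagator(omega_cw, delta, tau):
    """U(tau) = exp(-i H tau) for the ASSUMED two-level Hamiltonian
    H = [[-delta/2, omega_cw/2], [omega_cw/2, delta/2]] (hbar = 1)."""
    eta = math.hypot(omega_cw, delta)
    c, s = math.cos(0.5 * eta * tau), math.sin(0.5 * eta * tau)
    h = [[-0.5 * delta, 0.5 * omega_cw], [0.5 * omega_cw, 0.5 * delta]]
    return [[c * (i == j) - 1j * s * (2.0 / eta) * h[i][j] for j in range(2)]
            for i in range(2)]

def evolve(x, omega_cw, delta, tau):
    """Return U(tau) X U^dagger(tau)."""
    u = propagator(omega_cw, delta, tau)
    return matmul(matmul(u, x), dagger(u))

# Hypothetical operator: (omega_cw / 2) * sigma_x in the (|X>, |B>) basis.
x_op = [[0.0, 0.5], [0.5, 0.0]]
print(evolve(x_op, 1.0, 0.5, 0.7))
```

The transformed operator oscillates at the effective drive frequency $\eta$ and returns to itself after a time $2\pi/\eta$, which is the timescale against which the decay of the phonon correlation function is compared.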
It is worth noting that the $X_m$ terms in Eq. (10) lead to an overall scaling factor of $\sim \Omega_{\rm cw}^2$ in $\mathcal{L}_{\rm PME}\rho$, which dominates for small effective drive $\eta$ relative to $\omega_b$ and $k_B T$, although the full functional dependence of the phonon scattering on the drive strength will also depend on the interplay between the phonon function $\phi(\tau)$ and coherent dynamics induced by $H_S$ in the $X_m(-\tau)$ functions; this leads to rich and highly nonlinear features in the phonon decoherence rates as a function of $\Omega_{\rm cw}$ [57]. In Appendix C, we derive simplified expressions for the phonon coupling rates valid for strong driving and weak phonon coupling strengths.
Throughout our calculations, we shall use two different sets of phonon parameters, denoted I and II, such that $\alpha_{\rm I} = 0.04$ ps$^2$ and $\omega_{b,{\rm I}} = 0.9$ meV, while $\alpha_{\rm II} = 0.006$ ps$^2$ and $\omega_{b,{\rm II}} = 5.5$ meV. For the most part (a notable exception being the calculation of the cw error rate), set I corresponds to a "stronger" phonon coupling strength, and is similar to what has been extracted from measurements with InAs/GaAs QDs [49][50][51][58][59][60], while set II gives a (in most cases) weaker phonon coupling strength, and is more similar to numbers consistent with experimental results in some waveguide structures [61], including our own [27]. For our study of the source purity, we also use a set III with intermediate values $\alpha_{\rm III} = 0.025$ ps$^2$ and $\omega_{b,{\rm III}} = 2.5$ meV. In all cases, we use a phonon bath temperature of $T = 4$ K.
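As a rough illustration of how these parameter sets enter the attenuation factor $\langle B\rangle = e^{-\phi(0)/2}$, the sketch below evaluates $\phi(0)$ by numerical integration. It assumes the commonly used expressions $J(\omega) = \alpha\omega^3 e^{-\omega^2/(2\omega_b^2)}$ and $\phi(0) = \int_0^\infty d\omega\, [J(\omega)/\omega^2] \coth(\hbar\omega/2k_BT)$, neither of which is shown explicitly in this text; the function name `polaron_factor` and the integration settings are hypothetical.

```python
import math

HBAR_MEV_PS = 0.6582119569   # meV * ps
K_B_MEV_PER_K = 0.08617333262  # meV / K

def polaron_factor(alpha_ps2, omega_b_mev, temperature_k, n_steps=20000):
    """<B> = exp(-phi(0) / 2) under the ASSUMED forms given in the lead-in.

    Uses a midpoint rule on [0, 10 * omega_b], beyond which the Gaussian
    cutoff makes the integrand negligible."""
    omega_b = omega_b_mev / HBAR_MEV_PS                 # rad/ps
    kt = K_B_MEV_PER_K * temperature_k / HBAR_MEV_PS    # rad/ps
    step = 10.0 * omega_b / n_steps
    phi0 = 0.0
    for k in range(n_steps):
        w = (k + 0.5) * step
        coth = 1.0 / math.tanh(w / (2.0 * kt))
        # J(w) / w^2 = alpha * w * exp(-w^2 / (2 w_b^2))
        phi0 += alpha_ps2 * w * math.exp(-w * w / (2.0 * omega_b ** 2)) * coth * step
    return math.exp(-0.5 * phi0)

# Parameter sets I, II, and III from the text, all at T = 4 K.
for label, alpha, wb in [("I", 0.04, 0.9), ("II", 0.006, 5.5), ("III", 0.025, 2.5)]:
    print(label, polaron_factor(alpha, wb, 4.0))
```

In the zero-temperature limit the assumed integral reduces to $\phi(0) = \alpha\omega_b^2$, which provides a simple sanity check on the numerical result.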
One of the main consequences of phonon coupling is the formation of a broad phonon sideband, which arises from non-Markovian real phonon transitions concurrent with photon emission. Photons emitted into this sideband have poor indistinguishability [39,62], and as such this sideband is usually filtered out for HOM interference measurements, leaving only the zero-phonon-line (ZPL), which has much better coherence properties due to the fact that phonon dephasing of the ZPL for bulk phonons vanishes very rapidly at low temperatures [63] (and this is a higher-order process, not captured by our PME). The PME is in fact capable of capturing this non-Markovian effect by means of the exponential factor which arises upon transformation back to the lab frame from the polaron frame [39,54]. We can, for the sake of this work, approximate the filtering process that we assume to occur to remove this phonon sideband by simply neglecting this factor, and calculating all observable quantities directly in the polaron frame. In doing so, we miss an efficiency cut that arises from neglecting the sideband contribution to the emission. However, we can analytically approximate this contribution using the factor $\langle B\rangle$, and we quantify this efficiency reduction in Sec. IV.
In addition to this phonon sideband in the emission spectrum, it is also important in this work to consider the phonon sideband in the absorption spectrum, specifically the potential for phonon-assisted excitation of energy levels under detuned driving. An example of this process is shown in Fig. 1(d): in this example, we show that the far detuned excitation of the $|X\rangle$ state by the cw laser (due to the $\sigma^X_x$ term in the full Hamiltonian of Eq. (2)) can be assisted by the absorption of a phonon in the phonon bath with energy $E_B + \delta$; if the phonon spectral function $J(\omega)$ is appreciable over this frequency range, this process may become significant. In the case of Fig. 1(d), the process is suppressed at low temperatures due to the small thermal occupation of phonons in the bath (although it still plays a potentially significant role as we show in Sec. IV), but the corresponding process for a cw drive with frequency exceeding the energy transition of interest is highly significant even at low temperatures, as it involves phonon creation. The dynamics of the driven $|B\rangle$-$|X\rangle$ transition are also subject to similar considerations, and Appendix C gives simplified analytical rates and a schematic picture of these phonon processes.
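The temperature asymmetry between phonon absorption and phonon creation can be made concrete with the Bose-Einstein occupation $n(\omega) = 1/(e^{\hbar\omega/k_BT} - 1)$: absorption scales with $n$ and creation with $n + 1$, so their ratio is the Boltzmann factor $e^{-\hbar\omega/k_BT}$. The sketch below is illustrative only; the function names and the 2 meV phonon energy (standing in for $E_B + \delta$) are hypothetical values, not parameters from this work.

```python
import math

K_B_MEV_PER_K = 0.08617333262  # Boltzmann constant in meV / K

def bose_occupation(energy_mev, temperature_k):
    """Thermal occupation of a phonon mode of the given energy."""
    return 1.0 / math.expm1(energy_mev / (K_B_MEV_PER_K * temperature_k))

def absorption_to_emission_ratio(energy_mev, temperature_k):
    """Ratio n / (n + 1) of phonon absorption to phonon creation weights."""
    n = bose_occupation(energy_mev, temperature_k)
    return n / (n + 1.0)

# Hypothetical phonon energy of 2 meV at the bath temperature T = 4 K.
print(bose_occupation(2.0, 4.0), absorption_to_emission_ratio(2.0, 4.0))
```

At 4 K the occupation for meV-scale phonons is far below unity, which is why absorption-assisted excitation is suppressed while the creation-assisted process remains active.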
Finally, we note that the coherent (unitary) part of the phonon effects also leads to small frequency shifts in the emission spectrum; for the sake of this work we neglect these and focus on the nominal frequency splitting given by $\Delta_{\rm ac}$. If desired, the analytical simplifications in Appendices B and C can be used to calculate these small shifts explicitly.

III. SPS FIGURES OF MERIT AND TWO-PHOTON INTERFERENCE EXPERIMENTS
In experiment, the two-photon interference (TPI) visibility of the source is typically measured by simulating an HOM interferometry setup using an unbalanced MZ interferometer, excited with two photon pulses separated in time by $T_0$, and with overall repetition time of the laser $T_{\rm rep}$. In contrast to this setup, theoretical analyses of the single-photon indistinguishability (which is often conflated with, or defined to be equal to, the TPI visibility) often derive this parameter by assuming an HOM setup with two identical but distinct SPSs described by the same density operator [37,64,65]. While both approaches should lead to perfect TPI for perfectly indistinguishable single photons with unity purity (no multiphoton probability from a source excitation), the photon statistics of the two scenarios are different, leading to different normalizations. This can make direct comparison of experiment and theory difficult, particularly in the case of non-unity single-photon purity. In the work of Kiraz et al. [64], a HOM-type experiment was analyzed, but the authors omitted terms corresponding to the second order correlation function of the source field; as this correlation function separates into a product of photon flux expectation values at large delay times, this omission erroneously led to a definition of the TPI visibility which in fact corresponds to an MZ-type experiment, although only in the limit of an ideal interferometer and zero multiphoton emission probability per excitation.
Furthermore, some authors use the single-photon indistinguishability interchangeably with the "corrected" TPI visibility, after accounting for the finite multiphoton probability of the source, imbalance of the beam splitters, and deviation from perfect interference fringe contrast of the interferometer [66]. This is, however, potentially ambiguous, as the latter two effects are considerations arising from the experimental detection process, whereas the multiphoton probability of the source is a fundamental and physical limitation on the degree of TPI achievable with the source. Additionally, there exists an alternative definition of the TPI visibility which involves normalizing by a cross-polarized cross-coincidence histogram peak at zero delay, which gives a different value for the observed visibility for nonzero two-photon emission probability. The difference in photon statistics from HOM vs. MZ interferometry experiments was correctly pointed out by Fischer et al. [67], although the metric they propose to quantify the TPI visibility differs from those typically used in experimental works.
To help clarify the matter, and for the sake of one-to-one comparison of experiment and theory, we present in Sec. III A a quantum mechanical derivation based on field correlation functions of the TPI visibility for a real MZ interferometer (the expression for which is already well known from photon counting arguments [41,67,68]) in the spirit of previous studies of the corresponding quantity for an HOM interferometer [64,65,69]. We explicitly define the indistinguishability to be a measure of the first-order degree of coherence of the source, and $g^{(2)}[0]$ to be a measure of the second-order degree of coherence of the source (purity, or lack of multiphoton emission events). We define the raw TPI visibility to be what is measured in experiment, and the corrected TPI visibility to be what would be measured in an idealized experiment with perfect fringe contrast and balanced beam splitters; this latter metric is the most important single parameter to characterize the fidelity of the SPS as it encompasses both first and second order coherence of the source.
In Sec. III B, we then relate these experimentally observable quantities to the theoretical figures of merit for our frequency-tunable SPS, by means of conventional quantum optics input-output theory [70]. In particular, we show how the dressed state basis of Eq. (4) can be used to derive figures of merit for each sidepeak of the spectrum separately.

A. Derivation of TPI visibility
A simplified schematic of the experimental procedure for extracting the TPI visibility of photons emitted sequentially from a SPS using a MZ interferometer is shown in Fig. 2(a). The SPS, excited every $T_{\rm rep}$ with two pulse excitations separated in time by $T_0$ (assumed to be much greater than the relaxation time of the SPS, to ensure independent excitation events), emits photons into a decay channel mode. These photons then pass through two sequential beam splitters, where one of the transmission channels between the beam splitters is subject to a time delay $T_0$. The outputs of the second beam splitter then propagate to photodetectors from which a cross-correlation HOM coincidence signal can be constructed as a histogram of detection events; for perfect quantum single-photon interference, this signal vanishes at zero time delay [71]. We also show in Fig. 2(c) an experimental example of this cross-coincidence function taken from the data in Ref. [27]. Note that in this analysis we assume a purely pulsed SPS; any residual cw contribution which arises due to (for example) the small excitation of the ground-excited state transition from the far off-resonant cw laser is assumed to be filtered out of the emission spectrum, although it is possible to extend the analysis to also account for the cw background [42,72].
In experiment, the raw visibility of TPI is often defined as (sometimes without the factor of 2) [4,15,41] $V_{\rm raw} = (A_- + A_+ - 2A_0)/(A_- + A_+)$, where $A_0$ denotes the area of the peak in the cross-correlation coincidence histogram at $\tau = 0$, and $A_\pm$ denote the areas of the neighbouring peaks. To theoretically calculate this value, we include here a quantum optics derivation of the cross-correlation signal that is detected by an unbalanced MZ interferometer setup with delay $T_0$ as shown schematically in Fig. 2(a). We assume a signal mode with annihilation operator $s$, which we will relate to the system modes of the QD via standard input-output theory [73], as well as a vacuum mode with annihilation operator $v$. We assume that the timescale of decay for the system dynamics $T_{\rm lifetime} \sim \gamma_X^{-1}$ is much smaller than $T_0$.
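The raw visibility is straightforward to evaluate from integrated histogram peak areas. The sketch below assumes the normalization $V_{\rm raw} = (A_- + A_+ - 2A_0)/(A_- + A_+)$, i.e., the convention that includes the factor of 2 mentioned above; the function name `raw_visibility` and the example counts are hypothetical and are not data from this work.

```python
def raw_visibility(a_zero, a_minus, a_plus):
    """Raw TPI visibility from coincidence-histogram peak areas.

    a_zero is the area of the tau = 0 peak; a_minus and a_plus are the
    areas of its two neighbouring peaks. ASSUMES the factor-of-2
    convention: V = (A_- + A_+ - 2 A_0) / (A_- + A_+)."""
    side = a_minus + a_plus
    if side <= 0:
        raise ValueError("neighbouring peak areas must be positive")
    return (side - 2.0 * a_zero) / side

# Hypothetical integrated counts for the central and neighbouring peaks.
print(raw_visibility(a_zero=60.0, a_minus=980.0, a_plus=1020.0))
```

With this convention a vanishing central peak gives a visibility of one, while a central peak equal to the mean of its neighbours gives zero.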
Assuming for simplicity the two beam splitters to be identical and lossless, we can express the modes detected by the photodetectors as in Eqs. (13) and (14), where the reflectivity $R$ and transmissivity $T$ of the beam splitters satisfy $|T|^2 + |R|^2 = 1$ and $RT^* + R^*T = 0$, enforcing unitarity. The vacuum terms do not contribute to any normal-ordered expectation values and are dropped henceforth.
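The two beam splitter constraints can be verified for a concrete choice of coefficients. The sketch below uses one common convention, a real transmission amplitude and a purely imaginary reflection amplitude; this choice and the helper names `beam_splitter` and `is_unitary_pair` are assumptions for illustration, since only the two constraints themselves are fixed by the text.

```python
import math

def beam_splitter(transmittance):
    """Amplitudes (T, R) for a lossless beam splitter with |T|^2 = transmittance.

    ASSUMED convention: T real and R purely imaginary."""
    t = complex(math.sqrt(transmittance), 0.0)
    r = complex(0.0, math.sqrt(1.0 - transmittance))
    return t, r

def is_unitary_pair(t, r, tol=1e-12):
    """Check |T|^2 + |R|^2 = 1 and R T* + R* T = 0."""
    power = abs(t) ** 2 + abs(r) ** 2
    cross = r * t.conjugate() + r.conjugate() * t
    return abs(power - 1.0) < tol and abs(cross) < tol

# A balanced and a slightly unbalanced beam splitter both satisfy the constraints.
print(is_unitary_pair(*beam_splitter(0.5)), is_unitary_pair(*beam_splitter(0.45)))
```

The second condition fixes the relative phase between reflection and transmission at $\pm\pi/2$, so a pair of equal real amplitudes would fail the check even though it conserves power.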
For convenience, we define quantities associated with the source excited only with a single pulse excitation. These constitute the main single-photon figures of merit for the total spectrum emitted from the source (i.e., including both peaks of the split spectrum in the AT regime). These are (i) the number of photons emitted by the source, which we shall sometimes refer to as the "efficiency" (or "brightness", although note there are other sources of end-to-end efficiency degradation not captured by this metric); (ii) the normalized Hanbury-Brown-Twiss (HBT) $g^{(2)}[0]$ of Eq. (16), a measure related to the two-photon emission probability of the source, which can be measured by blocking one arm of the MZ interferometer and normalizing the $t$-integrated (time-averaged) peak in the cross-correlation signal around $\tau \sim 0$ to any other peak; and (iii) the single-photon indistinguishability, which is a measure of the first order degree of coherence of the SPS. The cross-correlation signal from the 'a' and 'b' photodetectors is proportional to the probability that if one detector detects a photon at time $t$, the other will detect a photon at time $t + \tau$, and is proportional to (assuming identical detectors) the correlation function $G^{(2)}_{\rm MZ}(t, \tau)$ of Eq. (18). The vanishing of $G^{(2)}_{\rm MZ}(t, \tau)$ around $\tau = 0$, for times around $t$ where one might classically expect a cross-correlation photon detection event due to two wavepackets hitting the (second) beam splitter at the same time and travelling down different channels, is, similar to the HOM setup, a hallmark of TPI and a signature of single-photon indistinguishability. However, the photon statistics of the MZ interferometer differ from those of an HOM interferometer, and the calculation of the TPI visibility is modified accordingly.
Note that we could instead define $G^{(2)}_{\rm MZ}(t, \tau)$ using only one of the two cross-correlation functions, which will, in the presence of an unbalanced beam splitter, change the various peak weights of the time-integrated cross-correlation function [41], but our metric of two-photon visibility is independent of this choice. Using Eqs. (13), (14), and (18), we can expand $G^{(2)}_{\rm MZ}(t, \tau)$ in terms of correlation functions of the source mode. In doing so, we have made note that since we have assumed that $T_0 \gg T_{\rm lifetime}$, the dynamics of operators separated in time by $T_0$ are uncorrelated. Thus we have dropped terms for which, using this property, the correlation function can be decomposed for any values of $t$, $\tau$ into a product of multiple expectation values involving a phase oscillation (i.e., unequal numbers of annihilation and creation operators within a correlation function). These terms are small but nonzero for finite $g^{(2)}[0]$; however, they are phase-sensitive and thus will average out to zero in an experimental setting where data is collected over a timescale longer than the coherence of the system [41,67].
In the TPI experiment, the MZ interferometer is fed two photons from the same source separated in time by $T_0$, and this process is repeated every $T_{\rm rep}$. For the following derivation, we shall assume that $T_{\rm rep} - 4T_0 \gg T_{\rm lifetime}$, such that each laser repetition is an independent event containing only two QD excitation events. However, as the figures of merit we derive only involve peaks at $\tau$ delays of zero and $\sim \pm T_0$, the final results are valid under the weaker condition $T_{\rm rep} - 3T_0 \gg T_{\rm lifetime}$. Under the assumption that $T_{\rm rep} \gg 4T_0$, for the purposes of a theoretical analysis we need only consider the function $G^{(2)}_{\rm MZ}(t, \tau)$ for a source $s(t)$ that is excited only at times $t = 0$ and $t = T_0$ (i.e., one laser pulse cycle). Thus, any correlation functions involving operators with time arguments that are not within $\sim T_{\rm lifetime}$ of $0$ or $T_0$ vanish. Furthermore, $G^{(2)}_{\rm MZ}(t, \tau)$ is nonzero only around times $t$ and delays $\tau$ of $\sim 0$, $\sim T_0$, and $\sim 2T_0$.
In the TPI experiment, photon detection events are integrated over time, and thus we will integrate $G^{(2)}_{\rm MZ}(t, \tau)$ over $t$. As $T_0$ is much longer than the relaxation time of the system, the system density operator is prepared in the same excited state by both QD pulse excitations, and thus certain correlation functions will be equal around time arguments $\sim 0$ and $\sim T_0$. Using this knowledge of the MZ cross-coincidence correlation function, we can simplify the time integration [Eq. (20)]. The function $\int_0^\infty dt\, G^{(2)}_{\rm MZ}(t, \tau)$ corresponds to the coincidence count histogram shown schematically in Fig. 2(c) and has a characteristic 5-peak structure (experimentally, peaks at larger delays also occur due to the laser repetition rate). With this function in mind, we define the TPI visibility by normalizing the peak around $\tau = 0$ to its neighbouring peaks [Eq. (21)], where the integration bounds of $\tau$ are chosen to capture the peak of interest alone (much larger than $T_{\rm lifetime}$ to capture the entire peak, but not so large as to integrate over neighbouring peaks).

FIG. 2. Schematics of (a) an unbalanced MZ simulating a HOM two-photon interferometry experiment with (c) sample cross-coincidence count data taken for an undressed QD in Ref. [27], and (b) an idealized HOM experiment with distinct but identical sources.
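The five-peak structure of the coincidence histogram can be illustrated with a purely classical path-counting sketch. The snippet below is not the quantum calculation of the text: it assumes perfectly distinguishable single photons (no two-photon interference and $g^{(2)}[0] = 0$), a balanced second beam splitter, and an arm delay of exactly $T_0$. The function name `mz_peak_weights` is hypothetical and used only for illustration.

```python
from collections import Counter
from itertools import product


def mz_peak_weights():
    """Count cross-coincidence events by delay tau, in units of T0.

    Illustrative assumptions: two distinguishable single photons emitted at
    t = 0 and t = T0, each taking the short (0) or long (+T0) arm with equal
    probability, and a balanced second beam splitter.
    """
    emission_times = (0, 1)  # photon 1 at t = 0, photon 2 at t = T0
    arm_delays = (0, 1)      # short arm, long arm (+T0)
    weights = Counter()
    for arm1, arm2 in product(arm_delays, repeat=2):
        t1 = emission_times[0] + arm1
        t2 = emission_times[1] + arm2
        # One photon reaches detector a and the other detector b; tau = t_b - t_a.
        weights[t2 - t1] += 1  # photon 1 -> a, photon 2 -> b
        weights[t1 - t2] += 1  # photon 1 -> b, photon 2 -> a
    return dict(sorted(weights.items()))


print(mz_peak_weights())  # {-2: 1, -1: 2, 0: 2, 1: 2, 2: 1}
```

In this distinguishable-photon limit the central peak carries the same weight as each of its nearest neighbours; two-photon interference reduces the weight at zero delay while leaving the neighbouring peaks as the reference.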
Using Eqs. (20) and (21), we localize the integration bounds for $t$ and $\tau$ to around $t, \tau \sim 0$. The expression for $V_{\rm raw}$ can then be greatly simplified by noting that any correlation functions containing operators $s(t)$, $s^\dagger(t)$ where $t$ is not in the vicinity of $\sim 0$ or $\sim T_0$ vanish, recalling that operators separated in time by $\sim T_0$ become uncorrelated, and finally noting that correlation functions evaluated around $T_0$ are equivalent to those evaluated around $0$.
As an example of how this works, consider the integration of $G^{(2)}_{\rm MZ}(t, \tau)$ around $\tau \approx 0$. We have three terms coming from Eq. (20), all given by Eq. (19) at different time arguments. Considering just the third term $G^{(2)}(t + 2T_0, \tau)$ as an example, this term has only one nonzero correlation function, physically corresponding to a multiphoton detection event from the output of the second pulse excitation of the source, having travelled down the longer arm of the MZ interferometer, and is given by the second term in Eq. (19), where $\mathcal{R} = |R|^2$ and $\mathcal{T} = |T|^2$.
Applying this procedure to all terms, we ultimately find the areas of the peaks, where $N_0$ is the number of laser pulse cycles over which data is collected, multiplied by an overall efficiency factor which contains, for example, both extraction and detector efficiencies and is assumed equal for both detectors. From this, we find the visibility, recovering known results [41,67], which contains a correction factor related to imperfections associated with the interferometry setup; we have added a factor of the interferometer fringe contrast $(1 - \epsilon)$ wherever first-order degree of coherence correlation functions appear, to account for optical surface imperfections reducing the interference visibility. Clearly, in the limit of no multiphoton emission ($g^{(2)}[0] = 0$), the visibility of TPI in a perfect MZ interferometer ($1 - \epsilon = 1$) with balanced beam splitters ($\mathcal{R} = \mathcal{T} = 1/2$) and the single-photon indistinguishability are equivalent. We can also find a corrected TPI visibility (which would occur in a perfect MZ interferometer with balanced beam splitters), for which a simplified form is appropriate in the usual high-purity case that $g^{(2)}[0] \ll 1$, and we can solve for the single-photon indistinguishability. It is useful to contrast this result with that obtained from a HOM interferometry experiment using two distinct SPSs with identical expectation values, as shown in Fig. 2(b). In this case, we can consider two sources with bosonic operators $s_1(t)$ and $s_2(t)$ incident upon a single beam splitter, and compute the cross-correlation function of the detectors. Following a derivation completely analogous to that of the MZ case, we can find the area of the peaks that appear around $\tau \sim 0$, which leads to the corresponding visibility. Equation (29) is a direct generalization of a result initially found by Hong, Ou, and Mandel [71] to allow for nonzero $g^{(2)}[0]$. In the case of an ideal interferometer, Eq. (31) reduces to a simple expression for $V^{\rm (HOM)}_{\rm raw}$. This definition has appeared in some theoretical works [37,57,[74][75][76], sometimes referred to therein as the indistinguishability.
Most notably, this definition leads to a TPI visibility of $\sim 1/2$ in the limit of distinguishable single photons, in contrast to the MZ setup. It is worth noting, however, that there also exist definitions of the visibility which differ from Eq. (12) by a factor of two, such that the visibility in the MZ setup also goes to $\sim 1/2$ for distinguishable photons [41]. Such a definition is still slightly different from that of the HOM setup in the case of an imperfect interferometer or nonzero $g^{(2)}[0]$.
One should also be aware that a different convention for the visibility is sometimes encountered, where the raw visibility is instead defined as $1 - A_0/A_{0,{\rm cross}}$, where $A_{0,{\rm cross}}$ is the zero-delay peak area obtained after rotating the polarization of one of the MZ arms prior to the second beam splitter. $A_{0,{\rm cross}}$ thus differs from $A_0$ by the indistinguishability term, which vanishes for cross-polarized photons. From Eq. (36), it can be seen that the resultant expression for the raw visibility in this case differs slightly from the definition used in this work for nonzero $g^{(2)}[0]$, which is important to keep in mind when comparing visibilities calculated using different methods.

B. Input-output relations and SPS figures of merit
In the previous subsection, we derived the brightness $N$, HBT visibility $g^{(2)}[0]$, indistinguishability $I$, and TPI visibility $V$ for photons emitted from a SPS in terms of the output channel operators $s$, $s^\dagger$. Using input-output theory, these can be related to the QD exciton operators for the desired $|X\rangle$-$|G\rangle$ transition in the Heisenberg picture as $s(t) = \sqrt{\gamma_X}\,\sigma_X^-(t)$ (neglecting vacuum input noise terms, which do not contribute to any expectation values) for the total emission spectrum, neglecting any extraction efficiency loss from photons emitted into undesired modes [70]. However, in the AT regime (and also, to some extent, in the ac Stark regime), two spectral components are emitted from the exciton transition, because dressing by the cw laser induces energy splittings. When the difference in frequency between these peaks is greater than their spectral widths, input-output theory can be applied to each of the transitions separately [70], and figures of merit can be expressed for each peak separately [27]. This is done using the eigenstates of Eq. (4).
For example, consider the total emitted photon number (brightness) [Eq. (32)], where $\rho_\pm(t) = \langle \pm | \rho(t) | \pm \rangle$ and $\rho_{+-}(t) = \langle + | \rho(t) | - \rangle$. In the last line, we dropped an integration over a coherence term, as in the limit of well-separated peaks (i.e., with center frequencies separated by $\gg \gamma_X$) the integrand is highly oscillatory and contributes negligibly; this is in the same spirit as the secular approximation made in Appendix B.
One can also define the indistinguishability for the sidepeaks using the dressed operators [Eq. (33)], which involves the first-order correlation function $g^{(1)}$ of the dressed operators, with $\sigma^-_\pm = |G\rangle\langle\pm|$, $\sigma^+_\pm = |\pm\rangle\langle G|$, and $\rho_\pm = \langle\pm|\rho|\pm\rangle$. While it is possible to define an HBT $g^{(2)}_\pm[0]$ for the sidepeaks, this quantity is likely strongly dependent on the filter width used to isolate the peak of interest (and thus cannot be unambiguously defined without reference to the filter width), as the short excitation pulse leads to broad two-photon emission tails in the spectrum. Since, for a pulse shorter than the cw dressing timescale, the first photon emitted during the pulse is centered around the undressed energy $\omega_X$, we can expect most of this emission to be filtered out of the isolated sidepeak, and we generally expect $g^{(2)}_\pm[0]$ to be much smaller than the total $g^{(2)}[0]$. In fact, this purity-enhancing effect has been predicted even with unshifted emission frequencies in the context of pulse excitation of QDs in cavities, which play the role of spectral filters [37].
In summary, the figures of merit we use to quantify the SPS are the emitted photon number (or efficiency/brightness) $N$, the HBT $g^{(2)}[0]$, and the single-photon indistinguishability $I$, given by Eqs. (15), (16), and (17), respectively (all with $s(t) = \sqrt{\gamma_X}\,\sigma_X^-(t)$). We also have the corresponding quantities defined for the sidepeaks of the dressed system, $N_\pm$ and $I_\pm$, given by Eqs. (32) and (33), respectively. In Appendix E, we define and study an additional figure of merit, the cw error rate $E_{\rm cw}$, which quantifies the proportion of photons emitted due to the weak off-resonant excitation of the ground-exciton transition by the dressing laser. This effect has not been taken into account in the calculation of the figures of merit in the main text, as we assume that this contribution, which is usually very small, can typically be removed from the emitted spectrum by spectral filtering.

IV. SPS OPERATION AND FIGURES OF MERIT FOR AUTLER-TOWNES AND AC STARK REGIMES
In this section, we analyze the operation and figures of merit of our SPS source in both AT and ac Stark regimes. In Sec. IV A, we derive approximate formulae for the emitted photon number and indistinguishability in the ac Stark regime by adiabatic elimination of the |B state, assuming the source to be excited in the |X excited state by a short pulse at time t = 0. In Sec. IV B, we assume the same initial condition, and calculate the evolution of the reduced density operator using the full ME Eq. (5) with the Hamiltonian (and associated rotating frame) of Eq. (3).
Throughout this section, unless otherwise stated, we let $\gamma_X = 1.32$ $\mu$eV as in Ref. [27]. For the case relevant to QDs where all radiative transitions have the same rate, such that the $|B\rangle$ state has half the lifetime of the $|X\rangle$ state, we have $\gamma_B = 2\gamma_X$, and the solution to the ME with initial condition $|X\rangle$ can be calculated analytically in the AT regime $\delta = 0$ (neglecting any phonon effects). This situation corresponds closely to the biexciton cascade realization of our SPS, where, for example, $\gamma_B/\gamma_X = 1.92$ in Ref. [27]. The solution is given in full in Appendix A, but in the well-dressed limit $\Omega_{\rm cw}/\gamma_X \gg 1$, the emitted photon number is $N = 1/2$ and the indistinguishability is a very poor $I = 11/21$, whereas for each sidepeak the emitted photon number is $N_\pm = 1/4$ and the indistinguishability is $I_\pm = 2/3$.
In Sec. IV C, we show how a cavity mode can be used to increase the spontaneous emission rate of the |X -|G transition, improving the SPS figures of merit at the cost of larger emission linewidths relative to the frequency shifts. We assume that the only significant effect of the cavity in the regimes studied is to change the ratio γ X /γ B , so the results of this section are also applicable to non-QD systems where the decay rates of each transition may be quite different.
In Sec. IV D, we discuss the efficiency loss that arises from the filtering of the phonon sideband, which can be captured using the polaron transform, and in Sec. IV E, we discuss the role of the pump pulse to initialize the system in the |X state at time t = 0, and what role a pulse with a nonzero duration plays in the HBT g (2) [0].
Well into the ac Stark and/or AT regime, we have a Hamiltonian which oscillates rapidly compared to the dissipation rates of the system (spontaneous emission and phonon decoherence rates). To simplify our numerical calculations by avoiding having to resolve these rapid oscillations, in Appendix B, we perform a secular approximation by moving into an interaction frame defined by the system Hamiltonian Eq. (3) and dropping these rapidly-oscillating terms. For all numerical calculations presented in this work, we have checked that the secular approximation gives the same results (i.e., visually indistinguishable on any plots) as the full ME presented in the main text as the driving rates are increased into regimes where the secular approximation is expected to asymptotically recover the full solution, thus ensuring the accuracy of our simulations.
Finally, we note that if the effective cw drive Rabi oscillation period $\sim \eta^{-1}$ is not much larger than the excitation pulse width, the QD will either experience Rabi oscillations between the $|B\rangle$ and $|X\rangle$ states during the pulse excitation (in the AT regime) or become off-resonant with the Stark-shifted $|X\rangle$ state during the pulse excitation (in the ac Stark regime). Such a process will strongly degrade the inversion efficiency of the pulse, and an initial condition of $|X\rangle$ will no longer be applicable. Better inversion efficiency could perhaps be achieved using different excitation techniques, such as a non-$\pi$ pulse, off-resonant phonon-assisted excitation [38,49,51,77], or adiabatic rapid passage [66,78]; however, we leave a full study of this to future work. For reference, assuming a Gaussian pulse with a full width at half maximum in intensity of 2 ps, proper inversion is achieved for $\eta/\gamma_X \ll 249$.
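As a quick numerical check of the scale quoted above, the following sketch converts a pulse width into the drive strength at which $\hbar/\eta$ equals that width. Reading the value 249 as exactly this crossover is our interpretation rather than a statement from the text, and the function name `eta_scale_over_gamma` is hypothetical.

```python
HBAR_UEV_PS = 658.2119569  # reduced Planck constant in micro-eV * ps


def eta_scale_over_gamma(pulse_fwhm_ps, gamma_x_uev=1.32):
    """Return eta/gamma_X at which hbar/eta equals the pulse FWHM.

    Assumption (ours): the Rabi timescale is compared directly with the
    intensity FWHM of the pulse, with gamma_X = 1.32 micro-eV as in the text.
    """
    return HBAR_UEV_PS / (pulse_fwhm_ps * gamma_x_uev)


print(round(eta_scale_over_gamma(2.0)))  # 249
```

Effective drive strengths well below this scale keep the cw Rabi period long compared with the 2 ps pulse, which is the condition for the initial state $|X\rangle$ to remain a good approximation.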

A. Adiabatic elimination under far off-resonant driving
For $|\delta| \gg \Omega_{\rm cw}$, the biexciton state remains largely unpopulated as it is driven far off resonance. In this regime, for the case of no phonon coupling, we can approximate the system as a two-level system under radiative decay via adiabatic elimination of the biexciton state. We consider the dynamics of the $\sigma_B^-$ operator under the Heisenberg-Langevin equation (neglecting noise fluctuation terms). As the dynamics of $\sigma_B^-$ are fast oscillating, we approximate $\dot{\sigma}_B^- \approx 0$ and solve for $\sigma_B^-$ in terms of a small parameter $A$ [Eq. (36)], where the approximation in the second line is justified on the grounds that, for the adiabatic elimination procedure to be valid, the detuning should greatly exceed the linewidths. We note that in the ac Stark regime $A \ll 1$, and thus Eq. (36) also determines the remaining $\sigma$ operators that appear in the ME. Substituting this result into the ME and expanding to second order in $A$, we find that the dynamics for a QD initialized in the $|X\rangle$ state can be described with a simple ME for an effective two-level system [Eq. (37)], where we have used that, to second order in $A$, $\Delta_{\rm ac} \approx \delta A^2$. Equation (37), when considering the radiative decay of the X exciton, is an ME for spontaneous emission with effective decay rate $\gamma_X + A^2\gamma_B/2$ and pure dephasing with rate $\gamma_{\rm eff} = A^2\gamma_B/2 \approx \frac{\gamma_B}{2}\frac{\Delta_{\rm ac}}{\delta}$. Note that this result can also be derived in a more rigorous manner by using the effective operator formalism, which utilizes a Feshbach projection and perturbation theory to separate slow and fast subspaces [79]. The single-photon indistinguishability can be easily calculated for this system, again with accuracy up to order $A^2$. Also to order $A^2$, we have $N \approx I$. For $\gamma_B = 2\gamma_X$, we have $I = 4\delta^2/(4\delta^2 + \Omega_{\rm cw}^2)$. We can express this in terms of the ac Stark shift $\Delta_{\rm ac}$, giving Eq. (39). In the absence of sources of additional dephasing or decoherence, Eq. (39) gives an estimate of the highest achievable indistinguishability for a given $\Delta_{\rm ac}$ in terms of the maximum detuning $\delta$ that can be introduced without exciting other unwanted energy levels, with $\Omega_{\rm cw} \approx 2\sqrt{\delta\Delta_{\rm ac}}$, or in terms of the maximum drive strength without introducing additional decoherence.
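The closed-form estimates of this subsection are simple enough to tabulate. The sketch below uses only relations stated in the text, namely $\Omega_{\rm cw} \approx 2\sqrt{\delta\Delta_{\rm ac}}$, $I = 4\delta^2/(4\delta^2 + \Omega_{\rm cw}^2)$ for $\gamma_B = 2\gamma_X$, and $\gamma_{\rm eff} \approx (\gamma_B/2)\Delta_{\rm ac}/\delta$, without phonons. The function names are hypothetical. Combining the first two relations gives $I \approx \delta/(\delta + \Delta_{\rm ac})$, which may differ in form from Eq. (39) but agrees with it to the stated order.

```python
import math


def drive_for_shift(delta, delta_ac):
    """Drive strength Omega_cw ~ 2*sqrt(delta * Delta_ac) for a target Stark shift."""
    return 2.0 * math.sqrt(delta * delta_ac)


def stark_indistinguishability(delta, omega_cw):
    """I = 4 delta^2 / (4 delta^2 + Omega_cw^2); assumes gamma_B = 2 gamma_X, no phonons."""
    return 4.0 * delta**2 / (4.0 * delta**2 + omega_cw**2)


def effective_dephasing(delta, delta_ac, gamma_b):
    """Pure dephasing rate gamma_eff ~ (gamma_B / 2) * Delta_ac / delta."""
    return 0.5 * gamma_b * delta_ac / delta


# Example in units of gamma_X: fixed shift Delta_ac = 5, increasing detuning.
# Only the larger ratios satisfy |delta| >> Omega_cw reasonably well.
delta_ac = 5.0
for ratio in (20, 100, 500):
    delta = ratio * delta_ac
    omega = drive_for_shift(delta, delta_ac)
    print(ratio, round(omega, 1), round(stark_indistinguishability(delta, omega), 4))
```

The loop makes the trade-off explicit: at fixed shift, a higher indistinguishability requires a larger detuning and therefore a larger drive strength.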

B. Numerical results for SPS efficiency and indistinguishability
In this subsection, we present the results of our numerical (and analytical) solutions of the ME for the single-photon emitted photon number and indistinguishability, without yet considering any cavity coupling (i.e., for $\gamma_X = \gamma_B/2$). For all plots here and in subsequent subsections, unless otherwise stated, we show results without any phonon coupling as solid lines, results for phonon parameter set I as dashed lines, and results for phonon parameter set II as dashed-dotted lines. Red (lower energy) sidepeaks are shown in red, blue (higher energy) sidepeaks are shown in blue, and the total spectrum results are shown in black. To calculate the two-time correlation functions that appear in the definitions of the indistinguishability and HBT $g^{(2)}[0]$, we use the quantum regression theorem [80]. In Fig. 3(a), we show the analytical solution of Appendix A, as well as the numerical solution with phonon coupling, for the SPS indistinguishability and emitted photon number in the AT regime with $\delta = 0$. As the AT regime requires weaker drive strengths than the ac Stark regime to achieve equivalent energy splittings, the influence of phonon scattering is quite weak here, only being perceptible for phonon parameter set I. In Fig. 3(b), we show the numerical solutions to the ME for the indistinguishability and emitted photon number for a device operating in the ac Stark regime with a fixed frequency shift of $\Delta_{\rm ac} = 5\gamma_X$, as well as the approximate solution derived under adiabatic elimination of the higher energy state $|B\rangle$ in Sec. IV A. Note that we only present results here for the red (lower-energy) sidepeak, but well into the ac Stark regime, where $\delta/\Delta_{\rm ac} \gg 1$, the spectral weight of the other sidepeak rapidly goes to zero, and the red sidepeak becomes nearly equal to the total spectrum, as we show explicitly later.
For $\delta/\Delta_{\rm ac} \gg 1$, the full numerical solution without phonon coupling asymptotically approaches the approximate solution obtained under adiabatic elimination. For phonon parameter set I, the effect of phonon coupling initially increases with increasing detuning, before ultimately decreasing again at very high detunings.
To understand this observation, it is useful to refer to the approximate simplification of the PME which is presented in Appendices B and C as the weak phonon coupling ME under the secular approximation. While we use the full PME (with the secular approximation) for all numerical calculations, the expressions derived in Appendix C allow for insight into the physics of the phonon interaction and its effect on the source figures of merit. We show that for the phonon parameters studied in this work, the dominant effect of the phonon interaction is to induce transitions from the $|+\rangle$ to the $|-\rangle$ state with rate $\Gamma_0[n_{\rm ph}(\eta, T) + 1]$, which corresponds to a phonon creation process, as well as transitions from the $|-\rangle$ to the $|+\rangle$ state with rate $\Gamma_0 n_{\rm ph}(\eta, T)$, which is a phonon absorption process, where $\Gamma_0$ is given by Eq. (C3), and $n_{\rm ph}(\omega, T) = [e^{\omega/(k_B T)} - 1]^{-1}$ is the thermal phonon occupation number. A schematic of these processes in the dressed state frame is shown in Fig. 11. For $\eta \ll k_B T$,$^{1}$ the phonon creation and absorption processes occur at similar rates, and the phonon coupling leads to an incoherent dephasing-like effect which scales as $\sim \eta^2$. At higher effective drive strengths $\eta \gg k_B T$, only phonon creation remains probable, and the $|+\rangle$ to $|-\rangle$ transition is driven with a rate proportional to $\sim \eta^3$.
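The two limits described above follow directly from the Bose-Einstein occupation. The sketch below evaluates $n_{\rm ph}(\eta, T)$ and the ratio of the phonon creation to absorption rates, $[n_{\rm ph} + 1]/n_{\rm ph} = \exp(\eta/k_B T)$. The common prefactor $\Gamma_0$ of Eq. (C3) cancels in the ratio and is not modelled; the function names and the sample values of $\eta$ are illustrative assumptions.

```python
import math

K_B_UEV_PER_K = 86.17333262  # Boltzmann constant in micro-eV / K


def n_ph(energy_uev, temperature_k):
    """Thermal (Bose-Einstein) phonon occupation at the given energy."""
    return 1.0 / math.expm1(energy_uev / (K_B_UEV_PER_K * temperature_k))


def creation_to_absorption_ratio(energy_uev, temperature_k):
    """(n_ph + 1) / n_ph, which equals exp(energy / (k_B T))."""
    n = n_ph(energy_uev, temperature_k)
    return (n + 1.0) / n


# Illustrative effective drive strengths eta (micro-eV) at T = 4 K.
for eta in (10.0, 345.0, 1500.0):
    print(eta, n_ph(eta, 4.0), creation_to_absorption_ratio(eta, 4.0))
```

For $\eta \ll k_B T$ the ratio is close to one (creation and absorption at similar rates), while for $\eta \gg k_B T$ the occupation becomes small and creation dominates.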
In light of this, the initial increase in the role of phonons can be understood as a consequence of the concurrent increase in the drive strength $\Omega_{\rm cw} \approx 2\sqrt{\Delta_{\rm ac}\delta}$, which, for small $\eta$ relative to $\omega_b$ and $k_B T$ (for $T = 4$ K, $k_B T = 345$ $\mu$eV $= 178\gamma_X$), leads to a roughly linear increase in the phonon decoherence rates given by Eq. (10) as a function of $\delta$ (see Appendix C) when holding $\Delta_{\rm ac}$ fixed. This manifests as an increased population of the higher lying biexciton state, which reduces the efficiency (as seen in the inset) and, via timing jitter, the indistinguishability. In fact, for phonon parameter set I, this increased decoherence is sufficient to outweigh the increased indistinguishability afforded by moving further into the ac Stark regime by increasing the detuning, leading to non-monotonic behavior of the indistinguishability as a function of $\delta/\Delta_{\rm ac}$, and an initial local maximum of the indistinguishability as the detuning is increased. In addition, this behavior allows us to deduce that excitation-induced dephasing also reduces the coherence of the emitted photons: over the range of detunings where the indistinguishability decreases, the emitted photon number continues to increase, suggesting that the reduction of coherence cannot be entirely attributed to timing jitter.
Moving into even higher detuning regimes, we see the indistinguishability and efficiency increase again for phonon parameter set I. This is due to the phonon absorption process, which takes states from $|-\rangle$ to $|+\rangle$, becoming improbable for $\eta \gg k_B T$, as the number of phonons in the bath with the required energy becomes small. It is worth noting that this effect is only present for rather large detunings (of order $\sim$meV) and drive strengths ($\sim$hundreds of $\mu$eV), where the $\pi$-pulse inversion is likely very inefficient. In Fig. 4, we show the formation of the local maximum for sufficiently large phonon coupling strengths by plotting the indistinguishability of the dominant red sidepeak as a function of detuning for a fixed frequency shift $\Delta_{\rm ac}$.
Holding the ac Stark shift $\Delta_{\rm ac}$ fixed and varying the detuning and drive strength to achieve this shift (as we have done in Fig. 3(b) and Fig. 4) is useful from a theoretical perspective, both to reveal the achievable figures of merit for a given Stark shift and to show the transition from the AT regime at $\delta = 0$ to the ac Stark regime at $\delta/\Delta_{\rm ac} \gg 1$. However, it corresponds less directly to what would be observed in an experiment, as the drive strength $\Omega_{\rm cw}$ must be varied simultaneously with the detuning to achieve a constant Stark shift. Thus, in Fig. 5 we plot the indistinguishability and emitted photon numbers for a fixed detuning, and instead vary the drive strength $\Omega_{\rm cw}$. Here, we can see that for a fixed detuning, the figures of merit degrade as the effective frequency shift grows; as we increase the drive strength, we move away from the ac Stark regime, with weak frequency shifts and good figures of merit (due to minimal excitation of higher energy states), towards the AT regime, where $\delta/\Delta_{\rm ac}$ is small and the energy splittings are larger, but with increased decoherence, mostly due to increased timing jitter.
In Fig. 6, we show more details on some aspects of the phonon bath interaction for a QD SPS. As visible in Fig. 3(b), for phonon parameter set I there occurs a local maximum of the indistinguishability as a function of the detuning in the ac Stark regime, for a given Stark shift $\Delta_{\rm ac}$. This local maximum occurs for modest drive strengths and detunings, and as such is important to consider in light of practical experimental considerations. In Fig. 6(a), we plot the indistinguishability of the red sidepeak as a function of the shift $\Delta_{\rm ac}$, where for each value of $\Delta_{\rm ac}$ we sweep the detuning $\delta$ to find the value $\delta_{\rm opt}/\Delta_{\rm ac}$ where this local maximum occurs, and plot this as well as the corresponding indistinguishability at this detuning value. We can clearly see here that for phonon parameter set I, the maximum achievable indistinguishability drops off rapidly as the splitting is increased. By frequency shifts of $\Delta_{\rm ac} \sim 20\gamma_X$, the optimal indistinguishability occurs rather close to the AT regime, with small detunings $\delta/\Delta_{\rm ac}$ and indistinguishabilities not much higher than the AT regime value (without phonons) of $I_- = 2/3$. In Fig. 6(b,c), we compare the performance of the device for positive (red) and negative (blue) dominant peak frequency shifts by plotting the source figures of merit as a function of detuning $\delta/|\Delta_{\rm ac}|$ for both positive and negative detunings. In the absence of phonon coupling (and, as with all of the calculations in this subsection, neglecting the far off-resonant $|X\rangle$-$|G\rangle$ drive term in the Hamiltonian), the device performance is, as expected, perfectly symmetric with respect to the sign of the detuning.
Upon introducing phonon coupling, however, the device operation becomes highly asymmetric with respect to the detuning (and thus the sign of the dominant peak frequency shift $\Delta_{\rm ac}$). In all cases, the figures of merit for the SPS are better for positive (red) frequency shifts $\Delta_{\rm ac}$ in the ac Stark regime. This is because at low temperatures there are few phonons present in the phonon bath (i.e., in the sense of a thermal distribution of bosons). For negative detunings, the laser frequency is larger than the transition frequency between the X exciton and biexciton states, and a resonant process can occur in which the difference in energy between the laser and the transition is absorbed by the creation of a real phonon in the bath with this energy, leading to phonon-assisted absorption. The corresponding process for positive detunings, in which the energy difference is made up by the annihilation of a phonon with energy nearly equal to the energy gap (Fig. 1(d)), is more strongly suppressed due to the small number of phonons present in the equilibrium bath in this energy range at $T = 4$ K. Thus, for positive detunings (red energy shifts), the population of the undesired higher energy biexciton state is smaller, and as such the efficiency is higher and the timing jitter is reduced.
In terms of the simplified model presented in Appendix C, the process associated with phonon emission becomes more dominant as $\eta$ is increased, which leads to increased transitions from the $|+\rangle$ state to the $|-\rangle$ state; for positive $\delta$, $|-\rangle$ is more $|X\rangle$-like, whereas for negative detunings it is more $|B\rangle$-like. For this reason, we mostly focus on positive detunings and frequency shifts throughout this work. As we show later in Appendix E, the usual case of positive biexciton binding energies can also lead one to favor positive detunings and frequency shifts.
Note we also see in Fig. 6(c), that the non-dominant peak weight (emitted photon number) goes very rapidly to zero as the detuning is increased, indicating that the device is operating in the ac Stark regime.
In Fig. 7, we plot the indistinguishability and emitted photon number for the dominant red sidepeak as a function of detuning, as in Fig. 3(b), but for larger frequency shifts of $\Delta_{\rm ac} = 20\gamma_X$ and $\Delta_{\rm ac} = 40\gamma_X$. Here the plots span a similar range of drive strengths $\Omega_{\rm cw}$, showing how larger drive strengths are required to reach the ac Stark regime for larger frequency shifts, which in turn increases phonon-related decoherence.

C. Use of a cavity to improve device performance via the Purcell effect

In our results thus far, we have assumed that the higher lying state $|B\rangle$ has a spontaneous emission rate $\gamma_B$ which is twice that of the desired state $|X\rangle$, such that $\gamma_B = 2\gamma_X$, which is close to the case in QDs where $|B\rangle$ corresponds to the biexciton. However, it is the simultaneous dipole radiation of the higher energy state $|B\rangle$ and the desired state $|X\rangle$ which leads to timing jitter and reduced coherence of the spontaneously emitted (and frequency shifted) photons. Thus, it is intuitively reasonable that, should we increase the radiation rate $\gamma_X$ relative to the decay rate $\gamma_B$, we should expect to see improved figures of merit for both the indistinguishability (for reasons of timing jitter mentioned above) and the emitted photon number (as emission into the X-polarized decay channel becomes accelerated relative to the Y-polarized one).
One way to increase the effective radiation rate $\gamma_X$ relative to $\gamma_B$ is to use the Purcell effect afforded by a cavity with an enhanced density of optical states (near-)resonant with the $|X\rangle$-$|G\rangle$ transition. To be concrete, we can consider a single-mode cavity with bosonic operators satisfying $[a, a^\dagger] = 1$ and QD-cavity coupling rate $g$, which couples to the $|X\rangle$ exciton with a Hamiltonian term $H_{\rm cav}$ [Eq. (40)] and has photon decay rate (spectral full width at half maximum) $\kappa$, which should satisfy $\kappa \ll E_B$ to ensure the biexciton transition is not also broadened. Then, in the bad cavity (weak coupling) limit $g/\kappa \ll 1$, the cavity mode can be adiabatically eliminated, and the result is that the spontaneous emission rate $\gamma_X$ is increased as $\gamma_X \to (1 + F_P)\gamma_X$, where $F_P$ is given by Eq. (41) and $\gamma_0$ is the bare $\gamma_X$ before any cavity enhancement.
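To get a feel for the cavity parameters involved, the sketch below evaluates the textbook resonant bad-cavity estimate $F_P = 4g^2/(\kappa\gamma_0)$. This standard form is an assumption on our part; Eq. (41) of the text may contain additional factors (for example, cavity-exciton detuning or phonon renormalization of $g$). The function names and the sample values of $g$ and $\kappa$ are illustrative only.

```python
def purcell_factor(g_uev, kappa_uev, gamma0_uev=1.32):
    """Resonant weak-coupling estimate F_P = 4 g^2 / (kappa * gamma_0) (assumed form)."""
    return 4.0 * g_uev**2 / (kappa_uev * gamma0_uev)


def enhanced_decay_rate(g_uev, kappa_uev, gamma0_uev=1.32):
    """Enhanced exciton decay rate (1 + F_P) * gamma_0, in micro-eV."""
    return (1.0 + purcell_factor(g_uev, kappa_uev, gamma0_uev)) * gamma0_uev


# Illustrative parameters: kappa of a few hundred micro-eV, g of tens of micro-eV.
for g, kappa in ((20.0, 300.0), (50.0, 300.0)):
    print(g / kappa, purcell_factor(g, kappa), enhanced_decay_rate(g, kappa))
```

Both sample points have $g/\kappa$ well below one, so they sit in the weak-coupling regime where adiabatic elimination of the cavity mode is expected to hold.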
In principle, to determine the quantitative influence of incorporating a cavity mode on the SPS figures of merit, the Hamiltonian $H_{\rm cav}$ in Eq. (40) should be included in the system Hamiltonian $H_S$, and the PME should be modified to reflect this change (as in, e.g., Refs. [37,38,48,56]). Output observables of the system should then be calculated in terms of the cavity operators $a$ and $a^\dagger$, as input-output theory tells one that the scattered fields of the reservoir in the Heisenberg picture (which are ultimately detected) differ from their input by $\sqrt{\kappa}\,a(t)$ [70,81]. This approach leads to a correct description of some of the subtleties involved with cavity coupling, including filtering of the output spectrum due to the finite cavity width $\kappa$, the emitted photon numbers in each respective channel (i.e., in the weak-coupling limit, the cavity mode collects a factor $F_P$ more photons than the background emission channels), and cavity-induced dephasing.
For simplicity, however, we shall for the results in this section assume that the influence of the cavity is solely to increase the exciton decay rate $\gamma_X$. This approach has, for one, the advantage of generality, as it can apply to any (e.g., atomic) system with decay rates which do not satisfy $\gamma_B = 2\gamma_X$, with or without any cavity coupling. Furthermore, in the weak coupling limit $g \ll \kappa$, which is the ideal regime of operation for SPSs [37], the phonon decoherence rates associated with the cavity-QD interaction become insignificant [37,39], and the dominant effect of the cavity on the QD dynamics is enhanced spontaneous emission. Thus, we simply use a variable $\gamma_X$ in our simulations with the model of Sec. II, with the understanding that Eq. (41) can be used to estimate the cavity parameters required to observe the corresponding figures of merit as a function of $\gamma_X$. Neglecting the background spontaneous emission channels, we can thus let $F_P \approx \gamma_X/\gamma_0$, where $\gamma_0 = 1.32$ $\mu$eV, and $F_P$ is given by Eq. (41).
In Fig. 8, we plot the figures of merit of the SPS operating in the AT regime with $\delta = 0$ as a function of the variable decay rate $\gamma_X$. We see in the case of QD SPSs that, provided the enhancement is given by a cavity with a sufficiently broad linewidth to capture the frequency splittings and operate in the weak-coupling regime, the source can emit with an efficiency near 1/2 (the best case scenario in the AT regime) and high ($> 90$-$95\%$) indistinguishability for AT splittings on the order of hundreds of $\mu$eV (but with highly broadened linewidths). In Fig. 9, we explore the ac Stark regime of operation with a variable decay rate $\gamma_X/\gamma_0$. Here, we see that as the decay rate is increased, the emitted photon number increases as losses due to Y-polarized emission channels are decreased, up to a point at which the emitted photon number begins to decrease again. This decline can be attributed to the laser detuning becoming smaller relative to the effective decay rate, which moves the operation regime towards the AT regime and thus slightly increases the population of the non-dominant peak at the cost of the dominant peak. Nonetheless, for large emission rates $\gamma_X/\gamma_0$, the emission rate exceeds the phonon scattering rates and the figures of merit remain quite high; for the case of QDs, we see that near-unity indistinguishability ($> 97\%$ for both phonon cases at $\Delta_{\rm ac} = 25\gamma_0$) is achievable for frequency shifts on the order of tens of $\mu$eV.
Interestingly, in Fig. 9(d), the emitted photon number $N_-$ actually becomes larger with phonon coupling. This can be understood in the context of the weak phonon coupling approximation of Appendix C. In particular, the ratio of the rate that takes states from $|+\rangle$ to $|-\rangle$ to the rate that takes states from $|-\rangle$ to $|+\rangle$ is simply $\exp[\eta/k_BT]$, a direct consequence of the thermal occupation distribution of phonons in the bath. Thus, for $\eta$ large compared to $k_BT$ (as in Fig. 9(d)) and positive detunings $\delta > 0$, we expect phonons to increase the proportion of photons emitted into the dominant peak. Indeed, for Fig. 9(d), we have $\eta/k_BT \approx 5$. In contrast, Fig. 9(c) has $\eta/k_BT \approx 1$, and this effect is not seen. The fact that the indistinguishability in Fig. 9(b) is not also improved relative to the no-phonon case indicates that the excitation-induced dephasing of the $|X\rangle$-$|G\rangle$ transition dominates over the reduced timing jitter afforded by this phonon-induced transition. We stress that the numbers presented here can be achieved with cavity and QD parameters which have been demonstrated in the literature with semiconductor microcavities [3,4,82–91]. The range of Purcell factors shown here (less than 40; cf. a value of 43 achieved with a photonic crystal cavity [87]) can be achieved (e.g., using dielectric micropillar resonators [3,4,82–84,92]) with a linewidth $\kappa$ of a few hundred $\mu$eV (corresponding to Q factors $\sim 10^3$–$10^4$) and a coupling $g$ on the order of (at most) tens of $\mu$eV. The phonon parameter sets we use reflect measured values as discussed in Sec. II.
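The thermal asymmetry invoked above can be made concrete with a short calculation. The sketch below evaluates the Bose-Einstein occupation at the dressed-state splitting and forms the ratio $(n_{\rm ph}+1)/n_{\rm ph} = \exp[\eta/k_BT]$ of downward to upward phonon-driven rates; the function names are illustrative only.

```python
import math

K_B_MEV_PER_K = 0.08617333  # Boltzmann constant in meV/K


def phonon_occupation(energy_mev, temperature_k):
    """Bose-Einstein occupation n_ph at the given energy and temperature."""
    return 1.0 / math.expm1(energy_mev / (K_B_MEV_PER_K * temperature_k))


def downward_to_upward_ratio(eta_mev, temperature_k):
    """Ratio of phonon emission (n_ph + 1) to phonon absorption (n_ph)."""
    n_ph = phonon_occupation(eta_mev, temperature_k)
    return (n_ph + 1.0) / n_ph


if __name__ == "__main__":
    temperature = 4.0  # K, as in the text
    k_t = K_B_MEV_PER_K * temperature
    for ratio in (1.0, 5.0):  # eta / k_B T for Fig. 9(c) and Fig. 9(d)
        eta = ratio * k_t
        print(f"eta/kT = {ratio:.0f}: eta = {eta:.3f} meV, "
              f"rate ratio = {downward_to_upward_ratio(eta, temperature):.1f}")
```

At $\eta/k_BT \approx 5$ the downward process is favored by roughly two orders of magnitude, whereas at $\eta/k_BT \approx 1$ the two directions differ by less than a factor of three.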

D. Efficiency loss due to phonon sideband coupling
As mentioned in Sec. II, the presence of the non-Markovian broad phonon sideband due to scattering with LA phonons leads to much lower photon indistinguishability if it is not filtered out of the detected spectrum (retaining only the ZPL, in this case the frequency-shifted peak(s)). Since we are assuming in this work that the sideband is removed by frequency filtering after emission, we would like to quantify this efficiency cut in terms of the phonon coupling parameter sets I and II we use for the presented simulations.
Without any cavity coupling (i.e., assuming a post-emission filtered ZPL), the fraction of total photons emitted that remain unfiltered is $\eta_{\rm eff} = B^2$, whereas with efficient cavity filtering, it is $\eta_{\rm eff,cav} = B^2F_P/(1 + B^2F_P)$ [39]. In Table I, we show these filter efficiencies for phonon parameter sets I, II, and III at $T = 4$ K. We can also find low-temperature analytical expressions for these efficiencies by noting that where $\tilde{T} = \pi k_BT/\omega_b$, and in the second line we have, for the term containing $n_{\rm ph}(\omega, T)$, expanded the exponential cutoff as a power series and evaluated the resulting Bose-Einstein integrals$^2$; at $T = 4$ K, retaining only terms up to order $\tilde{T}^4$ is an excellent approximation for $\omega_b \gtrsim 1.5$ meV, and qualitatively accurate for $\omega_{b,{\rm I}} = 0.9$ meV as well.
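The filter efficiencies and the low-temperature expansion described above can be checked numerically. The sketch below rests on two assumptions that are not taken from the equations of this section: the super-Ohmic spectral function $J(\omega) = \alpha\omega^3\exp[-\omega^2/(2\omega_b^2)]$ (consistent with the limiting rates quoted in Appendix C), and the standard polaron attenuation factor $B = \exp[-\frac{1}{2}\int_0^\infty d\omega\, J(\omega)\omega^{-2}\coth(\omega/2k_BT)]$. Under these assumptions, expanding the cutoff in the thermal term gives $B^2 \approx \exp[-\alpha\omega_b^2(1 + \tilde{T}^2/3 - \tilde{T}^4/15)]$; this series is worked out here for illustration and is not copied from the paper. All function names are hypothetical.

```python
import math

HBAR = 0.6582119569   # reduced Planck constant in meV·ps
K_B = 0.08617333      # Boltzmann constant in meV/K


def b_squared_numeric(alpha_ps2, omega_b_mev, temperature_k, n_steps=20000):
    """B^2 from direct midpoint-rule integration (requires T > 0)."""
    w_b = omega_b_mev / HBAR             # cutoff frequency in 1/ps
    k_t = K_B * temperature_k / HBAR     # thermal frequency in 1/ps
    d_w = 10.0 * w_b / n_steps           # integrate out to 10 cutoff widths
    total = 0.0
    for i in range(n_steps):
        w = (i + 0.5) * d_w              # midpoints avoid the w = 0 endpoint
        coth = 1.0 / math.tanh(w / (2.0 * k_t))
        total += alpha_ps2 * w * math.exp(-w**2 / (2.0 * w_b**2)) * coth * d_w
    return math.exp(-total)


def b_squared_low_t(alpha_ps2, omega_b_mev, temperature_k):
    """Assumed low-temperature series, truncated at order T-tilde^4."""
    w_b = omega_b_mev / HBAR
    t = math.pi * K_B * temperature_k / omega_b_mev
    return math.exp(-alpha_ps2 * w_b**2 * (1.0 + t**2 / 3.0 - t**4 / 15.0))


def cavity_filter_efficiency(b_squared, purcell):
    """eta_eff,cav = B^2 F_P / (1 + B^2 F_P), as given in the text."""
    return b_squared * purcell / (1.0 + b_squared * purcell)


if __name__ == "__main__":
    # alpha = 0.04 ps^2 and omega_b = 0.9 meV are the parameter set I values
    # quoted for Table I; the Purcell factor of 10 is a hypothetical choice.
    b2 = b_squared_numeric(0.04, 0.9, 4.0)
    print(f"eta_eff = {100 * b2:.1f} %")
    print(f"eta_eff,cav = {100 * cavity_filter_efficiency(b2, 10.0):.1f} %")
    print(f"series estimate of B^2 = {b_squared_low_t(0.04, 0.9, 4.0):.4f}")
```

Comparing the two functions at several cutoff frequencies gives a direct way to see where the truncated series stops being quantitatively reliable.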
Note that the efficiency cut has not already been taken into account in our simulations for emitted photon numbers $N_{(\pm)}$, and the efficiency given here must also ultimately be considered to yield the total SPS efficiency (in addition to other experimental considerations, e.g., output fiber coupling efficiencies).

E. Single photon purity $g^{(2)}[0]$ associated with pulse excitation

In our simulations presented in Secs. IV B and IV C, we have simply assumed the SPS to be initialized in the $|X\rangle$ state, and we have not explicitly modelled the pulse excitation. As we have also neglected the $\sigma_x^X$ term in $H_S$ that can give rise to (far off-resonant) excitation of the $|X\rangle$-$|G\rangle$ transition, we have not included in our model any mechanism for more than one photon to be emitted from the transition of interest to our SPS. Formally, then, all of the above simulations are for $g^{(2)}[0] = 0$.
To improve on this approximation, we include the pulse excitation directly in the system Hamiltonian, and use a time-dependent PME which incorporates the pulse-induced phonon scattering [37,38,57], described in Appendix D. However, the re-excitation probability (and corresponding two-photon emission probability) is not strongly affected by the dressing field so long as $\eta^{-1}$ is much larger than the pulse width in time, as the pulse occurs much quicker than the period of Rabi oscillations between the $|X\rangle$ and $|B\rangle$ states. Thus, the statistics of two-photon emission are very closely related to results known for the simple two-level system pulse excitation [5,27,37,93]. The primary modification to the $g^{(2)}[0]$ comes from the fact that in a two-photon emission event, the first photon will, for $\eta^{-1}$ much larger than the pulse duration, be emitted in the desired $|X\rangle$-$|G\rangle$ transition. The subsequent photon, however, is emitted with probability $N$ (i.e., the efficiency). Specifically, consider a Gaussian pulse with full width at half maximum in intensity $\tau_p$, and area in amplitude $\pi$ (i.e., after the coherent attenuation factor from phonons $B$ is applied, as with the rest of this paper). Then, for a short pulse that satisfies $\eta\tau_p \ll 1$ and $\gamma_X\tau_p \ll 1$, the results of Fischer et al. [93] give $g^{(2)}[0] \approx \eta_G\gamma_X\tau_p/N^2$ for the case of no cw dressing laser, where $\eta_G \approx 0.4376$ is a factor associated with the Gaussian pulse. For the reasons outlined previously, with cw dressing, this value is reduced by a factor of $N$ such that

$$g^{(2)}[0] \approx \frac{\eta_G\gamma_X\tau_p}{N}. \qquad (43)$$

Next, using Eq. (25), we find Eq. (44), where we neglect small terms of order $(\gamma_X\tau_p)^2$. Note that if we multiply the second term on the right-hand side of Eq. (44) by a factor of $1/N$, this equation also applies to the usual scheme of QD SPSs with resonant pulse excitation and no dressing, and is thus broadly useful in characterizing the relationship between indistinguishability and TPI visibility as measured in an MZ interferometry experiment. It is important to keep in mind, however, that the indistinguishability (first-order coherence) is also degraded due to pulse excitation in a manner which scales similarly to the $g^{(2)}[0]$ behavior [37,76].

TABLE I. Columns: $\alpha$ (ps$^2$), $\omega_b$ (meV), $B$, $\eta_{\rm eff}$ (%), $\eta_{\rm eff,cav}$ (%). Phonon parameter set I: $\alpha = 0.04$ ps$^2$, $\omega_b = 0.9$ meV.

In Fig. 10, we study the full $g^{(2)}[0]$ of the source, using the PME model of the excitation pulse described in Appendix D, as well as the approximate solution in Eq. (43). For ps-duration pulses, the approximate formula is very accurate, agreeing almost exactly with the full calculation (without phonons) for $\tau_p \approx 2$ ps, and deviating only slightly in the case of phonons. With phonon coupling, the $g^{(2)}[0]$ is uniformly larger than without, which is likely due to the reduction in inversion efficiency from pulsed-excitation-induced dephasing [37]. For long pulses in the ac Stark regime, the approximate solution overestimates the $g^{(2)}[0]$. This may be because in this scenario $\eta\tau_p \ll 1$ is no longer satisfied even for a short pulse, and the Rabi oscillations between the $|X\rangle$ and $|B\rangle$ states lead to a small reduction in the re-excitation probability and thus $g^{(2)}[0]$.

FIG. 10. HBT $g^{(2)}[0]$ as a function of pulse width for AT (green lines) and ac Stark (blue lines) regimes, without phonons (solid lines) and with phonons using parameter set III (dashed lines). Also plotted as stars is the semi-analytical approximate solution given in Eq. (43) with (red) and without (blue) phonons.
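To give a feeling for where a Gaussian pulse factor of this size can come from, the sketch below uses a simple re-excitation picture that is an assumption of this example, not a reproduction of the derivation in Ref. [93]: a photon emitted at time $t$ during a $\pi$ pulse (rate $\gamma_X$ times the excited-state population $\sin^2[A(t)/2]$) leaves the emitter in its ground state, and the remaining pulse area $\pi - A(t)$ re-excites it with probability $\sin^2[(\pi - A(t))/2]$. Taking $g^{(2)}[0] \approx 2P_2$ for near-unity emission then gives the coefficient of $\gamma_X\tau_p$. Function names and sample parameters are hypothetical.

```python
import math

HBAR_UEV_PS = 658.2119569  # reduced Planck constant in µeV·ps


def gaussian_pulse_factor(n_steps=4000, x_max=8.0):
    """Estimate the Gaussian pulse factor in the re-excitation picture."""
    # Amplitude envelope ~ exp(-t^2 / (2 sigma^2)); the intensity FWHM is
    # tau_p = 2 sqrt(ln 2) sigma, so sigma / tau_p is fixed.
    sigma_over_tau_p = 1.0 / (2.0 * math.sqrt(math.log(2.0)))
    d_x = 2.0 * x_max / n_steps
    integral = 0.0
    for i in range(n_steps):
        x = -x_max + (i + 0.5) * d_x  # x = t / sigma, midpoint rule
        # Accumulated pulse area A(t) for a total area of pi.
        area = math.pi * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
        # sin^2(A/2) * sin^2((pi - A)/2) = sin^2(A) / 4
        integral += 0.25 * math.sin(area) ** 2 * d_x
    two_photon_prob = sigma_over_tau_p * integral  # in units of gamma_X tau_p
    return 2.0 * two_photon_prob                   # g2[0] ~ 2 P_2


def g2_zero_dressed(gamma_x_uev, tau_p_ps, n_emitted, eta_g):
    """Approximate g2[0] with cw dressing: eta_G gamma_X tau_p / N."""
    return eta_g * (gamma_x_uev * tau_p_ps / HBAR_UEV_PS) / n_emitted


if __name__ == "__main__":
    eta_g = gaussian_pulse_factor()
    print(f"estimated Gaussian pulse factor: {eta_g:.4f}")
    # Hypothetical example: bare decay rate, 2 ps pulse, N = 0.5.
    print(f"g2[0] estimate: {g2_zero_dressed(1.32, 2.0, 0.5, eta_g):.2e}")
```

The linear scaling with $\gamma_X\tau_p$ is what makes short pulses and moderate decay rates favorable for purity in this picture.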
As such, we can conclude that to maximize the purity of this source and minimize $g^{(2)}[0]$, very short pulse excitation should be used to minimize re-excitation probability (but not so short as to excite unwanted energy levels). Additionally, the use of a high-Q factor cavity, which already is beneficial to the SPS figures of merit for reasons of collection efficiency and minimization of timing jitter (see discussion in Sec. IV C), also suppresses $g^{(2)}[0]$ strongly by means of a dynamical decoupling effect leading to a pulse-induced time-dependent Purcell factor, as found in Ref. [37]; this effect is not captured by our simple model which involves merely changing the relative decay rates.
There is, in addition to the contribution to $g^{(2)}[0]$ from the pulse excitation, a potential contribution which arises from the cw drive directly via the $\sigma_x^X$ term in Eq. (1). Generally we assume these photons can be filtered out of the collected spectrum; however, one can expect this contribution to the $g^{(2)}[0]$ to remain small relative to that of the pulse so long as $E_{\rm cw}$ (discussed in Appendix E) is sufficiently small. Refs. [42,72] contain information on how to incorporate a cw contribution to the $g^{(2)}[0]$ into the analysis if desired.

V. CONCLUSIONS
In conclusion, we have theoretically analyzed the important factors that affect the figures of merit of our experimentally realized SPS [27], which allows for optical frequency tuning of emitted single photons, using a polaron transform ME method to incorporate rigorously the effects of electron-phonon scattering. Our detailed study is also relevant to a wide range of semiconductor QD platforms.
In the AT regime, we have shown that (without any cavity coupling and equal dipole transition rates) large frequency shifts can be achieved, but at the cost of poor indistinguishability (∼66%) and efficiency (∼25% at best). In the ac Stark regime, with a detuned laser drive, we have shown that much higher indistinguishabilities can be achieved (>90% for frequency shifts of tens of µeV), but with the need for much higher cw drive strengths leading to increased phonon-related decoherence, including increased population of the higher energy state and thus increased timing jitter and reduced efficiency, as well as excitation-induced dephasing of the photon emission channel. This phonon-related degradation of the source figures of merit increases rapidly as the frequency shift ∆ ac is increased. For large enough phonon coupling rates, phonon scattering also leads to, for a fixed frequency shift ∆ ac , a local maximum in the indistinguishability as a function of detuning, whereas without phonon coupling the indistinguishability continues to increase as the detuning increases further according to the approximate relation derived by adiabatic elimination of the higher energy state I ≈ N ≈ δ/(δ + ∆ ac ). In this case, the maximum achievable indistinguishability is set by the presence of other energy levels neglected in this analysis, other sources of excitation-induced decoherence, or other experimental limitations.
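The trade-off between detuning, drive strength, and the no-phonon estimate $I \approx N \approx \delta/(\delta + \Delta_{\rm ac})$ can be tabulated with a few lines of code. The sketch assumes the usual dressed-state relations $\eta = \sqrt{\delta^2 + \Omega_{\rm cw}^2}$ and $\Delta_{\rm ac} = (\eta - \delta)/2$ for $\delta > 0$, which reduce to $\Omega_{\rm cw} \approx 2\sqrt{\Delta_{\rm ac}\delta}$ at large detuning; the function names and the sampled detunings are illustrative.

```python
import math

GAMMA_0 = 1.32  # bare exciton decay rate quoted in the text, in µeV


def drive_for_shift(delta, shift):
    """Drive strength giving a Stark shift `shift` at detuning `delta` > 0."""
    return 2.0 * math.sqrt(shift * (delta + shift))


def stark_shift(delta, omega_cw):
    """Shift (eta - delta) / 2 with eta = sqrt(delta^2 + omega_cw^2)."""
    return 0.5 * (math.hypot(delta, omega_cw) - delta)


def adiabatic_estimate(delta, shift):
    """No-phonon estimate I ~ N ~ delta / (delta + shift) from the text."""
    return delta / (delta + shift)


if __name__ == "__main__":
    shift = 25.0 * GAMMA_0  # fixed frequency shift in µeV
    for multiple in (2, 5, 10, 20):  # hypothetical detunings delta / shift
        delta = multiple * shift
        omega_cw = drive_for_shift(delta, shift)
        print(f"delta={delta:7.1f} µeV  omega_cw={omega_cw:7.1f} µeV  "
              f"I~N~{adiabatic_estimate(delta, shift):.3f}")
```

The table makes the central tension explicit: pushing the estimate towards unity at fixed shift requires the drive strength to grow roughly as the square root of the detuning, which is where the phonon-related penalties discussed above enter.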
In addition, we have shown how the low-temperature asymmetry associated with the phonon bath due to low phonon occupation probabilities leads to preferential operation of the SPS in the ac Stark regime with positive (red detuned) detunings (and thus red Stark shifts of the emitted photons), as this leads to reduced phonon-induced excitation of the higher energy state and thus reduced timing jitter and efficiency loss.
We have also elucidated how cavity coupling (or more generally, application of the model to systems with different transition rates between levels) can be used to improve the source figures of merit via the Purcell effect, at the cost of broader emission linewidths relative to the frequency shift. In this case, using the AT regime to generate indistinguishable photons becomes more practical, with >90% indistinguishability achievable with N ≈ 1/2 for frequency shifts up to hundreds of µeV for realistic QD and cavity parameters which have been widely achieved in the literature for dielectric resonators. The presence of selectively enhanced spontaneous emission rates also was shown to benefit the ac Stark regime, enabling near unity indistinguishabilities with high efficiency for Stark shifts up to tens of µeV.
Finally, we analyzed the multiphoton statistics of the source, including the $g^{(2)}[0]$ parameter, which was shown to follow trends closely related to those predicted by previous studies on simple undressed two-level system models [37,93].
We reiterate that while the results in this paper have been presented for the specific realization of the source with the semiconductor QD biexciton cascade four-level system model (which necessitates an analysis of LA phonon coupling), the principles of operation only require a quantum 3 (or more) level ladder system, and many of the results of our analysis for the case of no phonons should apply qualitatively to these systems as well.
Furthermore, we expect that the specific implementation of the frequency tuning mechanism we have illustrated in this work could be modified, expanded upon, or optimized further by the engineering or implementation of different energy levels or radiative decay rates. For example, the QD biexciton cascade also involves a $Y$-polarized exciton, which could be used to create Stark shifted photons of the $X$-cascade by instead dressing the $|Y\rangle$-$|G\rangle$ transition with an orthogonally polarized cw laser. This would remove the issue of the cw background by employing polarization filtering. In this case, the indistinguishability follows similar trends as in the case of biexciton-exciton dressing. For the sake of this work, however, we have restricted our detailed analysis to a cascade type dressing involving the biexciton state, as this is what we have reported in experiment.
Overall, our results indicate that the use of the AT and ac Stark effects to produce optical frequency shifts in a quantum ladder system can be effective in generating indistinguishable single photons with high efficiency. While the achievable figures of merit are ultimately limited by phonon effects in the case of the QD cascade system studied here, frequency shifts of up to tens to hundreds of $\mu$eV are achievable with realistic cavity and QD parameters which have regularly appeared in the literature, while maintaining high indistinguishabilities, efficiencies, and purities, and this analysis is consistent with experimental results we have reported in Ref. [27].

ACKNOWLEDGMENTS

L.D. and Stephen H. acknowledge financial support from the Alexander von Humboldt Foundation. We are also grateful for the support by the State of Bavaria, the Natural Sciences and Engineering Research Council of Canada, and the Canadian Foundation for Innovation. We would like to thank Christian Schneider for useful discussions and work in fabrication of the sample used to obtain the data in Fig. 2(c).

Appendix A

For $\gamma_B = 2\gamma_X$ and $\delta = 0$ (AT regime), neglecting phonons, the solution to the ME of Eq. (5) with the Hamiltonian (and rotating frame) of Eq. (3) can be expressed analytically. Considering the system to be in the $|X\rangle$ state at $t = 0$, the single-photon indistinguishability is, up to an integral, where and with and where $\cos\phi = \Omega/\Omega_{\rm cw}$, $\sin\phi = \gamma_X/(2\Omega_{\rm cw})$, and $\Omega$ = . For the density matrix elements, we obtain and where $a_\pm(t)$ = with $\Omega = \sqrt{\Omega_{\rm cw}^2 - \gamma_X^2/16}$. In the dressed state basis (using Eq. (4)), we have simply $\rho_+(t) = \rho_-(t) = e^{-\gamma_Xt}/2$, and so $N_\pm = 1/4$. The first-order two-time correlation function is (A10) Notably, in the well-dressed limit ($\gamma_X/\Omega_{\rm cw} \to 0$), we have $N = 1/2$ (as the dressed system has equal decay rates to both polarization channels), and $I = 11/21$.
We also consider the indistinguishability of a photon emitted from one of the sidepeaks, $I_\pm$. By symmetry, for $\delta = 0$ both sidepeaks give the same indistinguishability such that $I_+ = I_-$. The result for this $I_\pm$ is (assuming $\Omega$ is real, as these observables are only well-defined for dressing exceeding the damping) where with In this case, the indistinguishability $I_\pm$ tends to 2/3 in the well-dressed limit, which is the same as what one would find for an undressed system initialized in the $|B\rangle$ state.
Appendix B: Interaction frame secular approximation

In this appendix, we discuss the secular approximation which can be made when the emission spectrum peaks are well-separated, which removes the fast oscillations from the equations of motion and makes the numerical solution of the ME vastly more efficient. This approximation also produces an ME in Lindblad form.
We first consider the model given by the system Hamiltonian $H_S$ of Eq. (3) and the ME of Eq. (5) with the additional phonon term of Eq. (10). We then move into an interaction frame defined by $\tilde{\rho}(t) = U^\dagger(t)\rho(t)U(t)$, where for our four-level system model, The ME in this interaction frame then takes the form where $\tilde{\mathcal{L}}(t)\tilde{\rho}(t) = U^\dagger(t)\mathcal{L}\rho(t)U(t)$, in terms of the dressed state basis of Eq. (4) with dressed energies $E_\pm = \pm\eta/2$. Next, we note that applying the unitary transformation of Eq. (B1) to the radiative and phonon terms $\mathcal{L}_{\rm rad}\rho$ and $\mathcal{L}_{\rm PME}\rho$ will yield a sum over time-independent terms, as well as time-dependent terms which oscillate at frequencies given by $E_\pm$ and $E_+ - E_-$. If $\eta$ is much larger than the characteristic rates at which $\tilde{\rho}(t)$ evolves, then these time-dependent terms can be dropped, making a secular approximation (or post-trace rotating wave approximation), as they average out to give a negligible contribution to the interaction frame density operator evolution. Intuitively, we expect this situation to occur when the peaks of the system are separated by much more than a linewidth or any of the phonon rates. The characteristic rates at which $\tilde{\rho}(t)$ evolves are given by the coefficients of Eq. (B2) which we give below.
Making the secular approximation, we thus drop all of these rotating terms such that Eq. (B2) now has no explicit time dependence (i.e., except that coming from $\tilde{\rho}(t)$). With some work, it can be shown that the ME in the interaction frame with the secular approximation can then be written in Lindblad form as where the radiative contribution is where we have let $\sigma_{+-} = |+\rangle\langle-|$ to simplify the notation, and the non-unitary phonon contribution is We also have a unitary part of the phonon interaction, which is given by the Hamiltonian The complex phonon scattering rates are given by From Eqs. (B5) and (B6), we see that the influence of the exciton-phonon interaction can be seen in the dressed state basis to take the form of a pure dephasing-type term with rate ${\rm Re}\{\Gamma^0_x\}$, and phonon-driven transitions from $|+\rangle$ ($|-\rangle$) to $|-\rangle$ ($|+\rangle$) with rate ${\rm Re}\{\Gamma^+_m\}$ (${\rm Re}\{\Gamma^-_m\}$). Additionally, the Hamiltonian term $H_{\rm PME}$ gives a renormalization of the dressed state energies, which in the bare-state basis is equivalent to a small shift in the $|X\rangle$ and $|B\rangle$ state energies, as well as the drive term between them; our simulations (not shown) performed without this Hamiltonian term look very similar to the full calculations, indicating that $\mathcal{L}^S_{\rm PME}\tilde{\rho}(t)$ has the dominant influence on the SPS figures of merit.
It is easy to see that in the dressed-state frame, the observable figures of merit for the $\pm$ peaks are calculated the same as in the bare state frame, but with $\rho \to \tilde{\rho}$, and for the total emitted photon number, $N = N_+ + N_-$.
Appendix C: Weak phonon coupling PME

For common values of the phonon parameters $\alpha$, $\omega_b$, and the temperature $T$, including the parameter sets we study in this work, the phonon coupling function satisfies $|\phi(\tau)| \ll 1$. In this case, we can to a good approximation expand the phonon functions $G_m(\tau)$ that appear in to leading order in $\phi$, to find $G$ Under this weak phonon coupling approximation, we can simplify the PME in the dressed state basis under the secular approximation of Appendix B. The Hamiltonian part of the PME then becomes (using primes to indicate this weak phonon coupling approximation) where $\tilde{\Gamma}^\pm_y$ are given by Eq. (B7c), and the incoherent part of the PME becomes where $n_{\rm ph}(\omega, T) = [e^{\omega/(k_BT)} - 1]^{-1}$ is the thermal phonon occupation number, and While we do not use the weak phonon coupling approximation for our numerical simulations (although for the phonon parameter sets we use, it is expected to give good quantitative agreement with the full PME), it is nonetheless useful to gain analytical insight into the underlying physical processes and scaling behavior of the phonon decoherence rates. The Hamiltonian term $H_{\rm PME}$ gives a small renormalization of the dressed state energies, such that $E_\pm \to E_\pm + {\rm Im}\{\tilde{\Gamma}^\pm_y\}/2$, which gives a perturbation to the nominal splitting $\Delta_{\rm ac}$. As mentioned in the main text, we neglect this phonon drive-dependent frequency shift renormalization when we refer to the frequency shift $\Delta_{\rm ac}$, and we have checked in our simulations that its effect is small (typically 10% of $\Delta_{\rm ac}$).
FIG. 11. Schematic of phonon-assisted transitions between dressed states as given by the weak phonon coupling PME. In the AT regime (a), the phonon absorption processes with rate $\tilde{\Gamma}_0 n_{\rm ph}(\Omega_{\rm cw}, T)$ and the phonon emission processes with rate $\tilde{\Gamma}_0[n_{\rm ph}(\Omega_{\rm cw}, T) + 1]$ take the system to both $|X\rangle$ and $|B\rangle$ states, as the dressed states $|\pm\rangle$ are symmetric and antisymmetric combinations of these states, whereas in the ac Stark regime (b) the phonon absorption processes with rate $\tilde{\Gamma}_0 n_{\rm ph}(\eta, T)$ and the phonon emission processes with rate $\tilde{\Gamma}_0[n_{\rm ph}(\eta, T) + 1]$ take the system to higher and lower energy, respectively. In this schematic we let $n_{\rm ph}(\omega) \equiv n_{\rm ph}(\omega, T)$.
The non-unitary part of the PME (which leads to phonon-related decoherence) under the weak phonon and secular approximations is given by L S PMEρ and is shown schematically in Fig. 11.
In the AT regime, with $\delta = 0$, the phonon rate is simply $\tilde{\Gamma}_0 = \frac{\pi}{2}J(\Omega_{\rm cw})$, and we can furthermore neglect the exponential cutoff term in the phonon drive, as the regime where it becomes significant requires very high drive strengths and is difficult to realize in experiments. Then, if $\Omega_{\rm cw} \gg k_BT$ (very strong driving), the phonon dissipator simply takes the form of spontaneous emission from state $|+\rangle$ to state $|-\rangle$ with rate $\alpha\frac{\pi}{2}\Omega_{\rm cw}^3$. At lower drive strengths $\Omega_{\rm cw} \ll k_BT$, the phonon-induced transitions between $|+\rangle$ and $|-\rangle$ states occur with the same rate $\alpha\frac{\pi}{2}k_BT\,\Omega_{\rm cw}^2$, giving the expected $\Omega_{\rm cw}^2$ scaling. In the ac Stark regime with $|\delta| \gg \Omega_{\rm cw}$, we can find the leading order behavior (i.e., in $\delta/\Delta_{\rm ac}$) of the incoherent phonon rates by approximating $\eta \approx |\delta|$ and $\Omega_{\rm cw} \approx 2\sqrt{\Delta_{\rm ac}\delta}$. Then, for $\delta \gg k_BT$, the phonon dissipator again takes the form of spontaneous emission from $|+\rangle$ to $|-\rangle$ with approximate rate $2\pi\alpha\Delta_{\rm ac}^3(\delta/\Delta_{\rm ac})^2 e^{-\delta^2/(2\omega_b^2)}$. At lower detunings $\delta \ll k_BT$, again the phonon-induced transitions between $|+\rangle$ and $|-\rangle$ occur at the same approximate rate $2\pi\alpha k_BT\Delta_{\rm ac}^2(|\delta|/\Delta_{\rm ac})e^{-\delta^2/(2\omega_b^2)}$.
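The limiting rates quoted above can be evaluated numerically with a short sketch. It assumes the spectral function $J(\omega) = \alpha\omega^3\exp[-\omega^2/(2\omega_b^2)]$ and a general prefactor $\tilde{\Gamma}_0 = \frac{\pi}{2}J(\eta)(\Omega_{\rm cw}/\eta)^2$; the factor $(\Omega_{\rm cw}/\eta)^2$ is inferred here from the AT and ac Stark limits stated in the text and is not copied from the paper's equations. Function names and the sample parameters are hypothetical.

```python
import math

HBAR = 0.6582119569   # reduced Planck constant in meV·ps
K_B = 0.08617333      # Boltzmann constant in meV/K


def spectral_function(energy_mev, alpha_ps2, omega_b_mev, include_cutoff=True):
    """Assumed J(omega) = alpha omega^3 exp(-omega^2 / (2 omega_b^2)), in 1/ps."""
    omega = energy_mev / HBAR
    cutoff = 1.0
    if include_cutoff:
        cutoff = math.exp(-energy_mev**2 / (2.0 * omega_b_mev**2))
    return alpha_ps2 * omega**3 * cutoff


def dressed_state_rates(omega_cw_mev, delta_mev, alpha_ps2, omega_b_mev,
                        temperature_k, include_cutoff=True):
    """Return (absorption, emission) rates in 1/ps between dressed states."""
    eta = math.hypot(omega_cw_mev, delta_mev)  # dressed-state splitting
    gamma_0 = (0.5 * math.pi
               * spectral_function(eta, alpha_ps2, omega_b_mev, include_cutoff)
               * (omega_cw_mev / eta) ** 2)    # inferred mixing factor
    n_ph = 1.0 / math.expm1(eta / (K_B * temperature_k))
    return gamma_0 * n_ph, gamma_0 * (n_ph + 1.0)


if __name__ == "__main__":
    # Hypothetical sweep of AT-regime drive strengths at T = 4 K,
    # using the parameter set I values alpha = 0.04 ps^2, omega_b = 0.9 meV.
    for omega_cw in (0.01, 0.05, 0.2, 0.5):
        up, down = dressed_state_rates(omega_cw, 0.0, 0.04, 0.9, 4.0)
        print(f"omega_cw={omega_cw:.2f} meV: up={up:.3e} /ps, down={down:.3e} /ps")
```

Plotting the two returned rates against drive strength would show the crossover from the symmetric, thermally driven regime to one-way phonon emission.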

Appendix D: Polaron master equation with pulse drive
To extend the PME to include the excitation pulse, we add a term $\Omega_p(t)\cos(\omega_Xt)\sigma_x^X$ to the Hamiltonian in Eq. (1), where $\Omega_p(t)$ is the pulse amplitude satisfying $\int_{-\infty}^{\infty}dt\,\Omega_p(t) = \pi$, neglecting the coupling of the pulse to the $|B\rangle$-$|X\rangle$ transition, which is appropriate provided $E_B^{-1}$ is much smaller than the duration of the pulse. We also neglect the coupling of the cw drive to the $|X\rangle$-$|G\rangle$ transition. We then choose another interaction frame given by The PME superoperator is then , $X_m(t)]$ + H.c., (D2) and we have again absorbed a coherent attenuation factor $B$ from the PME into our definition of $\Omega_{\rm cw}$ and $\Omega_p(t)$ for easy comparison with the no-phonon case. The operators $X_m(t-\tau, t) = U(t, t-\tau)X_m(t-\tau)U^\dagger(t, t-\tau)$ are calculated using the "additional Markov" [37] approximation: $U(t, t-\tau) \approx \exp[-iH_S(t)\tau]$, and $X_m(t-\tau) \approx X_m(t)$ within the integrand; this approximation is valid for pulse widths much greater than $\omega_b^{-1}$.

Appendix E: Continuous wave excitation error rate

As mentioned in the main text, the presence of the $\sigma_x^X$ term in Eq. (2) leads to weak cw excitation of the $|X\rangle$ state, by means of the far off-resonant drive. As a result, in addition to the (pulse-wise) emitted photon number $N$, which is calculated in absence of this term, using instead Eq. (3) for the system Hamiltonian, there is a small, constant in time, photon emission flux. However, this contribution produces photons which are (for $E_B + \delta > 0$) blue Stark shifted by $\approx \Omega_{\rm cw}^2/[4(E_B + \delta)]$, which in many instances should be far from the desired peak of interest, and as such can be filtered out of the collected spectrum. Nonetheless, in this section, we consider the cw error rate assuming no emitted photons of the $|X\rangle$-$|G\rangle$ transition are filtered.
To quantify this cw error rate, we define a quantity $E_{\rm cw}$ to be the ratio of the average number of photons emitted in the absence of any pulse excitation over a duration equal to the laser repetition period $T_{\rm rep}$ to the number of photons emitted by the source with a pulse excitation.
To calculate $E_{\rm cw}$, we assume as a first approximation that the pulse initializes the system in the $|X\rangle$ state. We then can simulate, using the Hamiltonian of Eq. (2), the photons emitted $N_0$ over a duration $T_{\rm rep}$ starting from the initial condition $\rho_0$, which we choose to be the steady-state condition of the ME, and then divide this quantity by the number of photons emitted using the same Hamiltonian with instead initial condition $|X\rangle$, which we denote $N_X$. Then,

$$E_{\rm cw} = \frac{N_0}{N_X}. \qquad ({\rm E1})$$

For the case where we do not consider phonon interactions, as an additional approximation to this quantity, we can also note that so long as $\Omega_{\rm cw} \ll |E_B + \delta|$, the $|X\rangle$-$|G\rangle$ transition is very weakly driven by the cw laser (see Fig. 1(d)), and as such the steady-state population of the $|X\rangle$ state, and thus the photon flux due to the cw drive, will be very small. In this case, we can approximate the biexciton population (and associated coherences) to be negligible, and solve the ME in the absence of phonons analytically to find Eq. (E2), where $\rho^0_X = \Omega_{\rm cw}^2/(4(E_B + \delta)^2)$ is the steady-state population of the $|X\rangle$ state under this approximation, and in the second line we have noted that the difference between $N_X$, the emitted photon number calculated over a time duration $T_{\rm rep}$ with the Hamiltonian of Eq. (2) and the initial condition $\rho(t = 0) = |X\rangle\langle X|$, and the total emitted photon number $N$ calculated using the Hamiltonian of Eq. (3) and the same initial condition (which does not contain any cw excitation contributions) scales with $\rho^0_X$. Specifically, the final line of Eq. (E2) allows one to approximate the cw error rate in the absence of phonon couplings using only the emitted photon number $N$ calculated without considering the far off-resonant driving of the $|X\rangle$-$|G\rangle$ transition. When phonons are considered at a nonzero temperature, phonon-assisted transitions as shown schematically in Fig. 1(d) render this approximation inappropriate.
In Fig. 12, we plot the cw error rate E cw using both the full calculation N 0 /N X , as well as the approximate solution in the final line of Eq. (E2) for the case of no phonons, for both AT and ac Stark regimes. We use biexciton binding energy E B = 3.24 meV, as in our experiment in Ref. [27]. For both regimes, the approximate formula is an excellent approximation to the full solution. Also in both regimes, we find in contrast to the figures of merit studied in the main sections of this paper, the error rate is much worse for phonon parameter set II compared to parameter set I (which is close to the no-phonon case). This is due to the larger value of ω b,II giving a more appreciable value of the phonon spectral function at the detuning from the relevant transition J(E B + δ), meaning that the resonant process of absorption simultaneous with phonon absorption becomes more prominent, despite this process being mostly suppressed at low temperatures. We can see this clearly in Fig. 12(b), where for negative detunings, the error rate increases drastically for both phonon parameter sets, as the detuning E B +δ of the |X -|G transition becomes smaller and the spectral function becomes more and more appreciable even for the smaller phonon cutoff frequency of ω b,I . This is an additional reason to prefer positive detunings (and thus red frequency shifts) when operating in the ac Stark regime, for the typical case of positive binding energy E B .
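A back-of-the-envelope version of the no-phonon estimate can be sketched as follows. The steady-state population $\rho^0_X = \Omega_{\rm cw}^2/(4(E_B + \delta)^2)$ and the Stark shift $\Omega_{\rm cw}^2/[4(E_B + \delta)]$ are taken from the text; the step from population to error rate, $E_{\rm cw} \approx \gamma_X\rho^0_XT_{\rm rep}/N$, is an assumption of this sketch (a constant photon flux $\gamma_X\rho^0_X$ integrated over one repetition period) and may differ from the final line of Eq. (E2). The function names and the drive strength, repetition period, and efficiency in the example are hypothetical.

```python
HBAR_UEV_PS = 658.2119569  # reduced Planck constant in µeV·ps


def steady_state_x_population(omega_cw, e_b, delta):
    """rho_X^0 = Omega_cw^2 / (4 (E_B + delta)^2); energies in µeV."""
    return omega_cw**2 / (4.0 * (e_b + delta) ** 2)


def cw_stark_shift(omega_cw, e_b, delta):
    """Stark shift of the cw-excited photons, Omega_cw^2 / (4 (E_B + delta))."""
    return omega_cw**2 / (4.0 * (e_b + delta))


def cw_error_estimate(omega_cw, e_b, delta, gamma_x, t_rep_ps, n_pulse):
    """Assumed estimate: (gamma_X rho_X^0 T_rep / hbar) / N."""
    rho_x = steady_state_x_population(omega_cw, e_b, delta)
    flux_per_ps = gamma_x * rho_x / HBAR_UEV_PS
    return flux_per_ps * t_rep_ps / n_pulse


if __name__ == "__main__":
    e_b = 3240.0        # biexciton binding energy from the text, in µeV
    omega_cw = 100.0    # hypothetical drive strength in µeV
    t_rep = 12500.0     # hypothetical repetition period in ps
    print("rho_X^0 =", steady_state_x_population(omega_cw, e_b, 0.0))
    print("cw Stark shift (µeV) =", cw_stark_shift(omega_cw, e_b, 0.0))
    print("E_cw estimate =", cw_error_estimate(omega_cw, e_b, 0.0, 1.32, t_rep, 0.5))
```

Because the estimate is linear in the repetition period and quadratic in the drive strength, it gives a quick way to judge how much spectral filtering of the cw-excited photons is needed for a given operating point.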
We note, of course, that the cw error rate can be improved by using a shorter repetition period $T_{\rm rep}$; however, one has to keep in mind the relaxation rate of the SPS system, and if the repetition time were made too small, the derivation and equations presented in Sec. II on the HOM and HBT experimental procedures may need to be revised. In this manner, the introduction of a cavity provides another potential advantage through enhancement of the relaxation rate.