Error-detected state transfer and entanglement in a superconducting quantum network

Modular networks are a promising paradigm for increasingly complex quantum devices based on the ability to transfer qubits and generate entanglement between modules. These tasks require a low-loss, high-speed intermodule link that enables extensible network connectivity. Satisfying these demands simultaneously remains an outstanding goal for long-range optical quantum networks as well as modular superconducting processors within a single cryostat. We demonstrate communication and entanglement in a superconducting network with a microwave-actuated beamsplitter transformation between two bosonic qubits, which are housed in separate modules and joined by a demountable coaxial bus resonator. We transfer a qubit in a multi-photon encoding and track photon loss events to improve the fidelity, making it as high as in a single-photon encoding. Furthermore, generating entanglement with two-photon interference and postselection against loss errors produces a Bell state with success probability 79% and fidelity 0.94, halving the error obtained with a single photon. These capabilities demonstrate several promising methods for faithful operations between modules, including novel possibilities for resource-efficient direct gates.

* luke.burkhart@yale.edu † robert.schoelkopf@yale.edu

Quantum networks must address the problem of inefficiencies in signal coupling, transmission, routing, and receiving, both by minimizing losses in these stages and by mitigating the impact of loss on operation fidelities. Both optical fiber and microwave transmission line can be very efficient transmission channels, with loss over a meter of superconducting coaxial cable as low as $10^{-3}$ [8]. Circuit quantum electrodynamics (cQED) combines this low transmission loss with efficient switchable couplings between qubits and microwave photons [9-11], making it a promising candidate for modular quantum computing. However, robust and scalable networks require communication protocols with performance beyond even small losses. Schemes which condition on success with an ancillary herald are employed in both optical [4] and microwave networks [10,12], at the cost of reduced operation rate. The cQED toolkit also enables quantum nondemolition (QND) measurement of multi-photon bosonic qubits for deterministic correction of loss [13], though realization of this approach for communication has remained an outstanding challenge [14,15]. cQED networks that employ propagating photons [10,15-19] have been unable to leverage the intrinsic quality of superconducting cable due to reliance on lossy directional elements such as circulators. In contrast, intermodule links that forgo directional elements can have much lower loss, and support a standing-mode structure [20,21]. A single mode can be used as a quantum bus [22,23], potentially connecting several modules (Fig. 1a). This simple link provides features unavailable in a directional channel, such as interference of photons from different modules. When the bus connects two bosonic modes $\hat a_{1,2}$, this interference can give rise to an effective beamsplitter transformation:

$$\hat a_1 \to \cos\theta\,\hat a_1 + \sin\theta\,\hat a_2, \qquad \hat a_2 \to \cos\theta\,\hat a_2 - \sin\theta\,\hat a_1. \tag{1}$$
The beamsplitter is a powerful tool for manipulating and entangling propagating photons [24,25]. In this work we extend these applications to stationary modes in separate modules, using this interaction for two key networking tasks: qubit transfer and entanglement generation.
As a simple demonstration of the quantum network bus, we connect two modules with a superconducting coaxial cable of length l = 6.6 cm, though this technique extends to longer links (Supplementary Information). We choose the third harmonic (l = 3λ/2) of the cable as the bus, shown in Fig. 1b. Constructed without lossy microwave components or connectors, this mode has a relatively high quality factor (Q = 51,000, $\kappa_b/2\pi$ = 110 kHz). Each module contains a 3D cavity bosonic qubit coupled to the bus via a conversion transmon [26]. When the transmon is driven by two off-resonant pumps with frequency difference close to the difference between the cavity ($\hat a_j$) and bus ($\hat b$) frequencies, the Josephson nonlinearity enables parametric conversion [11,27] of the form

$$\hat H_j/\hbar = g\left(e^{i\Delta_j t}\,\hat a_j^\dagger \hat b + \mathrm{h.c.}\right).$$

Fig. 1. (a) Modules are connected via a standing mode of a transmission line or waveguide structure that serves as a quantum bus (blue). A conversion element (green switch) allows gated interaction (purple arrow) between each module and the bus.
(b) Circuit quantum electrodynamic implementation. The bus $\hat b$ is the third mode (l = 3λ/2) of a section of superconducting coaxial cable (l = 6.6 cm). Each module contains a 3D cavity $\hat a_j$, which is coupled to the bus mode via parametric conversion, enabled by the nonlinearity of a conversion transmon (green). Modules also contain an ancilla transmon and readout resonator (grey) for state preparation, error detection and tomography. (c) Simultaneous parametric conversion between each cavity and the bus enables effective interaction between the two cavity modes. Symmetric detuning of conversion in both modules controls the participation of the bus mode (vertical axis representing energy not to scale). (d) A sketch of the dynamics of the three-mode system for zero detuning (top) shows significant occupation of the bus halfway through transfer of an excitation from cavity 1 to cavity 2. For large detuning (bottom), the bus is eliminated from the dynamics, at the cost of much slower interaction (time axes not to scale). However, even at zero detuning, there is a time (final point) where the occupation of the bus returns to zero, effectively eliminating the bus for any input state.
The pump amplitudes control the conversion rate g, and their frequencies set the detuning $\Delta_j$ from the frequency-matching condition. An important consideration when utilizing this coupling is to ensure that no information is left in the bus at the end of the protocol. When conversion is activated in both modules simultaneously, the resulting three-mode system allows bidirectional population transfer in which the bus begins and ends in the vacuum state (Fig. 1c,d), satisfying this requirement. However, constant resonant conversion ($\Delta_1 = \Delta_2 = 0$) is not well suited for entanglement by partial transfer, because halfway through transfer the occupation of the bus is maximized, leaving it entangled with the cavities. There are solutions to this problem that involve modulating the coupling strengths in time, such as stimulated Raman adiabatic passage [28], but this approach does not allow the full range of interference effects utilized here (Supplementary Information). Instead, we choose to use equal, nonzero detuning in both modules ($\Delta_1 = \Delta_2 = \Delta$). The best-known case is the virtual Raman regime $\Delta \gg g$, where the bus occupation is suppressed, effectively eliminating it from the dynamics, resulting in the beamsplitter evolution of equation (1) between the cavities. This elimination suppresses infidelity due to loss in the bus, but lengthens the protocol significantly, an unfavorable trade-off when other sources of decoherence demand faster operations.
We address these concerns by working at intermediate detuning $\Delta \sim g$, where the beamsplitter transformation in equation (1) can be constructed with any mixing angle θ without slowing down the protocol. We take the conversion rate as fixed, since it is usually limited by experimental constraints, and use detuning as a control parameter. For a given detuning, the bus returns to vacuum and is eliminated periodically, with period $\tau_{\rm BS} = 2\pi/\sqrt{8g^2 + \Delta^2}$. Thus the evolution is a linear transformation on the two cavities of the form of equation (1), with beamsplitter angle

$$\theta = \frac{\pi}{2}\left(1 - \frac{\Delta}{\sqrt{8g^2 + \Delta^2}}\right)$$

(Supplementary Information). This operation, tunable from opaque to transparent, is equivalent to a physical beamsplitter for propagating photons as depicted in Fig. 2a, but we emphasize that here this evolution applies to two stationary modes, whose initial (final) states are represented by the input to (output from) the beamsplitter. By relying on parametric conversion and a high-quality bus, this scheme has the benefit compared to previous experiments [10,15-21] of not requiring precise frequency matching or tunability.
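The dependence of the beamsplitter on detuning can be tabulated with a few lines of Python. This is a minimal sketch, not part of the original analysis: it assumes the lossless three-mode model, the period $\tau_{\rm BS} = 2\pi/\sqrt{8g^2+\Delta^2}$ stated above, and the closed form $\theta = (\pi/2)(1 - \Delta/\sqrt{8g^2+\Delta^2})$; the function names are hypothetical.

```python
import math

def beamsplitter_time(g, delta):
    """First time the bus returns to vacuum: 2*pi / sqrt(8 g^2 + delta^2)."""
    return 2 * math.pi / math.sqrt(8 * g**2 + delta**2)

def mixing_angle(g, delta):
    """Assumed closed form: theta = (pi/2) * (1 - delta / sqrt(8 g^2 + delta^2))."""
    return 0.5 * math.pi * (1 - delta / math.sqrt(8 * g**2 + delta**2))

# Conversion rate quoted in the text, g/2pi = 560 kHz, expressed in rad/s.
g = 2 * math.pi * 560e3

working_points = [
    ("transparent (state transfer)", 0.0),
    ("50:50 (entanglement)", math.sqrt(8 / 3) * g),
]
for label, delta in working_points:
    tau_ns = beamsplitter_time(g, delta) * 1e9
    theta_deg = math.degrees(mixing_angle(g, delta))
    print(f"{label}: theta = {theta_deg:.1f} deg, tau_BS = {tau_ns:.0f} ns")
```

Under these assumptions the ideal times come out at roughly 630 ns and 550 ns, near the measured values of 624 ns and 520 ns reported below; the sketch ignores pulse shaping and loss.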
We apply this idea to eliminate the bus at two beamsplitter working points: fully transparent for state transfer, and 50:50 for entanglement generation. We prepare a single photon in cavity 1, turn on conversion in both modules, and measure the occupation of each cavity as a function of time, shown in Fig. 2b, for g/2π = 560 kHz, the value used throughout this work. At zero detuning (θ = 90°), we observe complete excitation transfer between cavities in $\tau_{\rm SWAP}$ = 624 ns, with energy transfer efficiency η = 84%. Furthermore, an intermediate detuning $\Delta = \sqrt{8/3}\,g$ realizes a 50:50 beamsplitter (θ = 45°), which can generate entanglement from separable input states. The time to eliminate the bus at this detuning is $\tau_{50:50}$ = 520 ns, demonstrating the advantage of this operating point over the much slower virtual Raman regime. The dynamics as a function of detuning in Fig. 2c demonstrate the continuous tunability of this operation.
Both operations are several hundred times shorter than the lifetime of either bosonic qubit ($\kappa_{1,2}^{-1}$ = 300, 450 µs), so the excitation decay during the beamsplitter is dominated by dissipation in the bus ($\kappa_b^{-1}$ = 1.5 µs). The transfer inefficiency due to loss is $1-\eta \approx \kappa_b/2g = 11\%$; thus, these operations can be improved in future with increased conversion rate or bus quality, which is partly limited by resistive loss at the interface with the modules (Supplementary Information). Additionally, excitation events of either conversion transmon due to thermal or pump-induced transitions [27] dispersively shift the bus and cavity frequencies, effectively turning off conversion and dephasing the cavity. These events are responsible for an additional 3-4% inefficiency, but could be mitigated with a different type of conversion element [29]. The value of g used here is a trade-off between speed, which reduces loss errors, and excitation errors induced by the pumps; an element which makes this trade-off more favorable can in future improve the beamsplitter.

Fig. 4. (d) Two-photon Bell state from Hong-Ou-Mandel interference. One photon is prepared in each cavity as inputs to a 50:50 beamsplitter. To detect loss events, parity is measured in each cavity simultaneously before tomography. Ideal state and rejected error states are shown in the beamsplitter output. (e) Conditional Wigner tomograms, postselected on even parity in both cavities. The phase shift in the X basis is a deterministic effect of the parity measurements and pumps, not calibrated out in this experiment. (f) Reconstructed Pauli expectation values for the two-photon Bell state, showing higher fidelity than in (c).
To characterize the bus as a communication link, we turn to the task of state transfer, an operation which can move qubits between modules for local processing or entanglement distribution. We demonstrate transfer of a single qubit from one module to another; however, the link is bidirectional and linear, allowing simultaneous two-way transfer for any input states. The simplest encoding is the Fock basis: presence or absence of a single photon. We prepare a known superposition $\alpha|0\rangle + \beta|1\rangle$ in cavity 1, apply the 90° beamsplitter, and perform Wigner tomography on the resulting state in cavity 2 (Fig. 3a). This protocol is performed for six input states, two of which are shown in Fig. 3b. The resulting mean state fidelity, reconstructed using maximum likelihood estimation (Supplementary Information), is $\bar F_{\rm Fock} = 0.92(1)$, the average fidelity with which an arbitrary quantum state is transferred.
This protocol yields significant improvement over previous experiments which relied on directional communication links [15,18], but the performance remains dominated by loss in the bus. We demonstrate suppression of this source of infidelity by leveraging strategies for correcting photon loss in stationary memories [5,13,14].
Because the cavities and the bus are all harmonic oscillators, we use an encoding which detects loss in any of these modes with a single measurement. We choose the cat code [5], superpositions of four coherent states with even photon number parity, and add a QND parity measurement using an ancilla transmon prior to Wigner tomography (Fig. 3c). The cat code employed here is chosen to contain on average $\bar n \approx 1.7$ photons, more than three times as many as the Fock encoding. This increase makes loss events more likely, and is the unavoidable overhead of an error-correctable encoding. Indeed, without error detection, the mean state fidelity is $\bar F_{\rm cat} = 0.80(1) < \bar F_{\rm Fock}$ (Supplementary Information). However, the parity measurement can detect loss events, mitigating the impact on fidelity and resulting in a net gain.
By adding the syndrome measurement before tomography, we overcome the overhead to reach the break-even point with respect to the Fock encoding. The measured Wigner functions sorted by parity outcome are shown in Fig. 3d. The dominant outcome is that the parity remains even ($p_{\rm even} = 84\%$), and the fidelity in this no-error case is $\bar F_{\rm even} = 0.93(1)$. Change in parity denotes a single loss error ($p_{\rm odd} = 16\%$), and we find the coherence to be preserved in the odd manifold, with fidelity $\bar F_{\rm odd} = 0.86(3)$. This error can be corrected in real time, or the state conditionally decoded [13] (Supplementary Information), but the odd manifold is also a valid logical encoding, so future operations may simply be updated without feedback correction; hence, the state is error-tracked. Averaged over all trials, the resulting deterministic fidelity is $\bar F_{\rm cat,tracked} = 0.92(2)$, which includes encoding and syndrome measurement errors of ∼3% (Supplementary Information). Even in the presence of code overhead and these imperfections, the cat code still reaches the break-even point at which the transfer fidelity is as high as in the Fock encoding. In future, use of a more robust parity measurement [30] and suppression of conversion transmon excitations, which are an uncorrectable error, should enable error-corrected state transfer beyond break-even.
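The error-tracked fidelity quoted above is consistent with a probability-weighted average of the two parity branches. The short check below is an illustrative sketch that uses only the central values from the text and ignores their uncertainties; the variable names are ours.

```python
# Parity outcome probabilities and conditional fidelities quoted in the text.
p_even, p_odd = 0.84, 0.16
f_even, f_odd = 0.93, 0.86

# Deterministic (error-tracked) fidelity: average over both parity outcomes.
f_tracked = p_even * f_even + p_odd * f_odd
print(f"error-tracked fidelity ~ {f_tracked:.3f}")
```

The result rounds to 0.92, matching the reported $\bar F_{\rm cat,tracked}$ within its stated uncertainty.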
To fulfill another key network requirement, we generate entangled states between modules, a critical resource for non-local gates [2,31,32] and stabilizer measurements [33]. The 50:50 beamsplitter in Fig. 2b can entangle the modules for a variety of input states. The simplest input is a single photon in one of the cavities ($|10\rangle$), ideally creating the odd-parity Bell state $(|10\rangle + |01\rangle)/\sqrt{2}$ (Fig. 4a). To characterize the entanglement, we perform logical two-qubit tomography by measuring conditional Wigner functions. We measure cavity 2 in one of three logical bases {X, Y, Z} (Supplementary Information), ideally projecting cavity 1 into one of the corresponding basis states depending on the measurement result. The Wigner function of cavity 1, measured simultaneously and conditioned on the logical measurement outcome, shows the correlations expected of an entangled state (Fig. 4b). From these data we reconstruct the logical two-qubit state (Supplementary Information), which has fidelity to the ideal Bell state $\bar F_{\rm Bell,01} = 0.88(1)$, not corrected for a 2% error in the logical measurement (Supplementary Information). The reconstructed joint Pauli expectation values (Fig. 4c) exhibit dominant two-qubit correlations, but visible single-qubit polarization indicates the infidelity results mostly from photon loss in the bus.
Finally, we implement a novel scheme for QND detection of loss during entanglement generation, which is probabilistic but results in higher fidelity when successful. We apply the 50:50 beamsplitter to the two-photon input state $|11\rangle$, ideally producing $(|20\rangle + |02\rangle)/\sqrt{2}$, an example of Hong-Ou-Mandel interference [34] that relies on frequency conversion to make photons indistinguishable. Single-photon loss results in odd photon number occupation in one of the cavities after the beamsplitter, and can be detected by measuring the parity of each cavity, as indicated in Fig. 4d. Since the ideal state has only even photon numbers, these measurements do not dephase the entangled state. We declare successful entanglement when both cavities have even parity (success probability 79%), and post-select the resulting data. The measured conditional Wigner functions in Fig. 4e exhibit the expected correlations in the two-photon manifold, and the reconstructed logical state in Fig. 4f shows reduced single-qubit polarization compared to the single-photon protocol. The Bell state fidelity, again uncorrected for tomography, is $\bar F_{\rm Bell,02} = 0.94(1)$, a two-fold reduction of errors from the single-photon case, with only a small failure rate, which can be improved by reducing loss in the bus. We emphasize that this simple protocol is only possible with bosonic qubits and a bidirectional bus which supports multiple indistinguishable photons.
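The Hong-Ou-Mandel cancellation can be made explicit by expanding the beamsplitter action on the creation operators. The sketch below is illustrative only: it assumes an ideal lossless beamsplitter with real coefficients, and it adopts one particular sign convention for the transformed creation operators, so the relative sign between the $|20\rangle$ and $|02\rangle$ amplitudes is convention dependent while the probabilities are not. The function name is hypothetical.

```python
import math

def two_photon_output(theta):
    """Amplitudes of |20>, |11>, |02> after a beamsplitter acts on |11>.

    Assumed convention: a1^dag -> c a1^dag - s a2^dag and
    a2^dag -> c a2^dag + s a1^dag, with c = cos(theta), s = sin(theta).
    Expanding the product on vacuum and using (a^dag)^2 |0> = sqrt(2) |2>:
    """
    c, s = math.cos(theta), math.sin(theta)
    return {
        "20": math.sqrt(2) * c * s,
        "11": c * c - s * s,   # vanishes at theta = 45 degrees
        "02": -math.sqrt(2) * c * s,
    }

amps = two_photon_output(math.pi / 4)
probs = {state: amp**2 for state, amp in amps.items()}
print(probs)
```

At θ = 45° the $|11\rangle$ amplitude cancels and the two photons are found together in either cavity with equal probability, which is why an odd parity in either cavity signals a loss event.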
This error-detected protocol also admits a simple and rapid way to increase the effective success probability. Because the state upon failure is known ($|01\rangle$ or $|10\rangle$), we can simply load another photon into the empty cavity and reapply the beamsplitter. For instance, with up to three entangling attempts, the success probability increases to 95%, with only a small decrease in the fidelity to $\bar F_{\rm Bell,02} = 0.91(1)$, with an average entangling time of 5.2 µs, twenty times faster than the cavity decoherence times (Supplementary Information). This ability to rapidly generate high-fidelity Bell states is an essential feature for teleportation schemes using resource entanglement.
We have demonstrated a high-quality quantum bus in a cQED network which enables a flexible beamsplitter operation between bosonic qubits in separate modules. We use the beamsplitter to transfer qubits and generate entanglement, and show how both protocols can be improved with multi-photon states and error-detection, reaching the break-even point for state transfer, and reducing the Bell state infidelity by half with a high success probability. The bus efficiency can be significantly enhanced with reasonable improvements in assembly, a higher-quality transmission line or waveguide link, or improved parametric conversion. This platform offers routes towards extensible networks by coupling several modules to a single bus, or arraying modules with multiple two-port buses.
The capabilities demonstrated here suggest several complementary approaches for operations between modules in superconducting quantum networks. The entanglement fidelity achieved completes the set of requirements for a viable gate over the network [31], while error-correctable state transfer enables shuttling logical qubits between modules to instead perform local gates. The low-loss beamsplitter also allows application of gates such as controlled-SWAP and exponential-SWAP [6,7] directly across the network without resource entanglement or shuttling. Finally, the conversion interaction employed here is a general tool for manipulating bosonic degrees of freedom in separate modules, which can be applied for boson sampling [25,35] and as a tunable hopping term for quantum simulation of bosons. This diverse set of possibilities in a single platform provides many directions for future research towards distributed quantum computing with superconducting networks.

Each module is a nominally identical device constructed from a solid piece of 99.99% pure aluminum, chemically etched as in Ref. [36] to improve surface quality. Modules consist of a central post cavity [26,36] with two orthogonal tunnels intersecting the cavity, similar to the sample in Ref. [37]. One tunnel houses a chip containing the ancilla transmon, readout resonator, and Purcell filter [26]. The other tunnel houses a separate chip with the conversion transmon. This tunnel is intersected by another, smaller tunnel, housing the end of the coaxial bus resonator (see Sec. I B). All chips are double-polished sapphire with aluminum films defined by a single electron-beam lithography step, and double-angle evaporation to form the Josephson junctions in a Dolan bridge process. Samples are thermally anchored to the base stage of a dilution refrigerator at approximately 20 mK, with a magnetic shield surrounding the modules and bus. Device Hamiltonian parameters and coherence times are listed in Table S2.
Module and chip design is shown in Fig. S1.

B. Coaxial bus resonator
Modules are connected by an l = 6.6 cm section of commercial NbTi coaxial cable (Coax Co. SC-086/50-NbTi-NbTi) with PTFE dielectric. The cable is a multi-mode resonator with free spectral range 1.9 GHz. The third harmonic (l = 3λ/2) is used as the bus in this experiment, as it has the largest dispersive coupling to the conversion transmon by virtue of being close in frequency. The final 7.5 mm of outer conductor and dielectric is removed from each end of the cable to expose the inner conductor. The cable is inserted in a tunnel in the module, with the exposed inner conductor near one capacitor pad of the conversion transmon. The outer conductor is clamped against the body of the aluminum module with a brass screw, with a small amount of bulk indium between screw and cable to provide a larger area of contact force. This mounting scheme places only two superconducting joints in the path of current flow for the bus mode, one at each end. The outer conductor is lightly sanded before assembly to remove some of the oxide layer. The quality of this interface is irreproducible, and likely limits the quality factor of the bus (see Sec. III B). Fig. S2 illustrates the assembled modules and coaxial cable.

C. Experimental wiring
Each module has a mostly separate and identical drive chain to address all modes and apply the off-resonant pumps to the conversion transmon (see Fig. S3). Measurement of the ancilla transmon is performed with dispersive readout in reflection off a port which is strongly coupled to the Purcell filter. Readout pulses are sourced at room temperature with a continuous-wave generator (RF) and fast microwave switch. Reflected readout signals are amplified at the base stage by two SNAIL parametric amplifiers (SPA) [38] with approximately 23 dB of gain, 20 MHz of instantaneous bandwidth, and noise visibility ratio of 7 dB. The SPAs are operated in phase-preserving mode by detuning the pump from twice the readout frequency by about 20 MHz. SPA pumps are gated by directly pulsing the source generators. Signals are further amplified at 4 K (40 dB) and room-temperature (30 dB), then down-converted to an intermediate frequency (50 MHz) by a separate local oscillator (LO). This signal, as well as a reference signal formed by mixing the RF and LO continuously, are amplified again (14 dB) and digitized by a pair of analog-to-digital converters (ADCs). The signal and reference are compared on each experimental shot, and the relative trajectory is integrated with an appropriate envelope and thresholded to discriminate ancilla states.
All other input signals are IQ modulated by 8 pairs of digital-to-analog converters (DACs), amplified and filtered at room temperature. DACs, ADCs, and digital channels are on four Innovative Integration X6-1000M cards, which have FPGAs loaded with custom logic. All control and measurement lines are additionally filtered at low temperature (see Fig. S3).

D. Pump scheme and phase locking
Each conversion transmon is driven by a pair of far-detuned pumps, applied approximately 100 and 1200 MHz above the transmon frequency (called pump X and pump Y, respectively). Mode and pump frequencies are listed in Table S2. All pump pulses have a constant duration of the length quoted in the text, plus a cosine-shaped rise and fall, each 48 ns in length. Since the overall phase of a transmitted state depends on the initial phase as well as the phase of each of the four pumps, three local oscillators are shared between the modules as indicated in Fig. S3. These local oscillators source the cavity drive, pump X, and pump Y for both modules. Ancilla and readout input signals are created by independent sources. Resonant drives for conversion transmons, used only for characterization experiments, are sourced by the same drive chain as pump X.
The use of shared LOs between control setups ensures the LO phases cancel out in the data. Likewise, the phases of the oscillators which control the single-sideband (SSB) modulation frequencies must also be locked. This is ensured by choosing SSB frequencies such that the parametric conversion frequency condition $\omega_{a1} + \omega_{X1} - \omega_{Y1} = \omega_{a2} + \omega_{X2} - \omega_{Y2}$ is met. This condition guarantees that a photon converted from cavity 1 into the bus and out into cavity 2 acquires the same phase on every experimental shot. Additionally, the phases of the oscillators which set these SSB frequencies are reset at the beginning of every experimental shot to remove long-term drifts.

E. Crosstalk between modules
While the modules used in this experiment are nominally identical for ease of manufacture, future realizations might benefit from intentional asymmetry between the modules. The near-degeneracy of the two cavity modes, detuned by only 8 MHz, causes a small amount of crosstalk between control pulses. We find that when we displace one cavity with a wide-band pulse, the other is displaced by 2-3% in amplitude. This may cause small errors in simultaneous control and tomography, which can be easily mitigated by introducing a small intentional detuning between the two cavities, for example by machining the posts to slightly different lengths.
The strong coupling of the conversion transmons to the bus, and their proximity in frequency space, leads to a strong dispersive coupling to one another. Due to the situation of the frequencies in the straddling regime [39], the dispersive shift between the two conversion transmons is +2π × 2 MHz. Since we do not use the conversion transmons as anything other than a mixing element, this shift plays no role, but it could be used to rapidly entangle the two modules.

A. Conversion Hamiltonian
The parametric conversion between cavity and bus results from a four-wave mixing process as in Ref. [11]. In the frame of the drives, the effective Hamiltonian describing the interaction between the bus and both cavities can be rewritten as a bilinear Hamiltonian $\hat H$ (equation (S1)). We take g to be real and equal for both modules, but in general its amplitude and phase are controlled by the amplitudes and relative phases of the off-resonant pumps. In particular, we must tune the amplitudes so that g is equal on both sides. The common detuning ∆ of the bus mode is controlled by detuning one of the pumps used for each conversion interaction away from the conversion resonance by ∆. This is the detuning in Fig. 2 of the main text.

B. Equations of motion and elimination of the bus
The lossless dynamics of this bilinear Hamiltonian may be readily found by solving the Heisenberg equations of motion for $\hat a_1$, $\hat a_2$, and $\hat b$. Since the equations of motion are linear, the solution for the field operators can be written in matrix form as

$$\begin{pmatrix}\hat a_1(t)\\ \hat a_2(t)\\ \hat b(t)\end{pmatrix} = M(t)\begin{pmatrix}\hat a_1(0)\\ \hat a_2(0)\\ \hat b(0)\end{pmatrix},$$

with matrix elements $M_{ij}(t)$ expressed in terms of the effective interaction rate $\Omega = \sqrt{g^2 + \Delta^2/8}$, defined in analogy with detuned vacuum Rabi oscillations. Population thus oscillates between the cavity and bus modes with frequency $2\sqrt{2}\,\Omega$, with the amplitude of oscillation proportional to $(g/\Omega)^2$. In the large detuning limit $\Delta \gg g$, this amplitude is suppressed, scaling as $(g/\Delta)^2$, a familiar result in the context of virtual Raman transitions.
This exact solution makes it clear that at time $t = \tau$ such that $\sqrt{2}\,\Omega\tau = k\pi$ (for integer k), $M_{13} = M_{31} = M_{23} = M_{32} = 0$. This means that $\hat a_1(\tau)$ and $\hat a_2(\tau)$ are decoupled from $\hat b(0)$, and $\hat b(\tau) = \hat b(0)$, up to an overall phase. This is precisely what we mean by the bus being eliminated. The first time this elimination occurs (k = 1) is

$$\tau_{\rm BS} = \frac{\pi}{\sqrt{2}\,\Omega},$$

which we refer to as the beamsplitter time. The solution for the field operators at the beamsplitter time is a beamsplitter transformation with mixing angle

$$\theta = \frac{\pi}{2}\left(1 - \frac{\Delta}{\sqrt{8g^2 + \Delta^2}}\right).$$

This general result establishes the two working cases used in the main text, explained in the next two subsections.
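The elimination of the bus can be checked numerically by exponentiating the linear equations of motion. The following is a minimal sketch rather than the authors' calculation: it assumes a Hamiltonian of the form $\hat H/\hbar = \Delta\,\hat b^\dagger\hat b + g(\hat a_1^\dagger\hat b + \hat a_2^\dagger\hat b + \mathrm{h.c.})$ with real, equal couplings, so the phases of individual matrix elements may differ from the convention of the text while their magnitudes should not. The function name `propagator` is hypothetical.

```python
import numpy as np

def propagator(g, delta, t):
    """Heisenberg-picture matrix M(t) in the basis (a1, a2, b).

    Assumes d/dt (a1, a2, b) = -i h (a1, a2, b) with the real symmetric
    coupling matrix h below (an assumed sign and frame convention).
    """
    h = np.array([[0.0, 0.0, g],
                  [0.0, 0.0, g],
                  [g,   g,   delta]])
    evals, evecs = np.linalg.eigh(h)
    # exp(-i h t) assembled from the eigendecomposition of h.
    return (evecs * np.exp(-1j * evals * t)) @ evecs.T

g = 1.0  # work in units of the conversion rate
for delta in (0.0, np.sqrt(8 / 3) * g, 3.0 * g):
    root = np.sqrt(8 * g**2 + delta**2)
    tau_bs = 2 * np.pi / root
    theta = 0.5 * np.pi * (1 - delta / root)
    m = propagator(g, delta, tau_bs)
    print(f"delta/g = {delta:.3f}: |M13| = {abs(m[0, 2]):.1e}, "
          f"|M11| = {abs(m[0, 0]):.4f}, cos(theta) = {np.cos(theta):.4f}")
```

In this model the cavity-bus elements vanish at $\tau_{\rm BS}$ for every detuning, and $|M_{11}|$ tracks $\cos\theta$, in line with the mixing angle given above.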

C. State transfer and efficiency
The first useful working point is used for state transfer. By choosing ∆ = 0, we have θ = π/2, which occurs at time $\tau_{\rm SWAP} = \pi/(\sqrt{2}\,g)$. This results in an evolution which swaps the states of the two cavities. Our choice of the phase of g in equation (S1) results in a true SWAP operation at these conditions. Any deviation in the phase of g would result in a cavity phase space rotation on one or both of the input states, which can be calibrated out in any encoding; this is not a logical qubit phase.
In this case, we can easily include the effect of damping in the bus by replacing $\Delta \to i\kappa_b/2$ in equation (S2). The detuned Rabi frequency Ω is then replaced by the loaded oscillation frequency $\tilde g = g\sqrt{1 - \kappa_b^2/(32g^2)}$. To compute the energy efficiency in the presence of loss, it is sufficient to consider the case of an input coherent state $|\alpha\rangle$ in cavity 1, and vacuum in cavity 2, which emulates the one-way state transfer demonstrated in the main text. In this semi-classical case, it is convenient to replace the field operators with their expectation values, e.g. $a_j(t) = \langle\hat a_j(t)\rangle$. The initial conditions are then $a_1(0) = \alpha$ and $a_2(0) = b(0) = 0$. As long as the oscillations are under-damped ($\kappa_b < \sqrt{32}\,g$, $\tilde g \in \mathbb{R}$), the bus will still periodically be eliminated, and the dynamics can be solved exactly. For the values used in this experiment, the loading of the oscillation frequency is a very small effect, and $\tilde g \approx g$ to a very good approximation.
The energy efficiency of the transfer, η, is the energy in mode $a_2$ at the end of the transfer relative to the initial energy in mode $a_1$. For the experimentally measured values of g and $\kappa_b$ quoted in the main text, this gives η = 0.89.
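The size of this efficiency can be estimated numerically from the damped linear equations of motion. The sketch below is an illustration under stated assumptions, not the original calculation: bus loss is modeled by an imaginary frequency $-i\kappa_b/2$ on the bus mode, the couplings are taken real and equal, cavity decay is neglected, and the rounded values g/2π = 560 kHz and $\kappa_b$/2π = 110 kHz are used. The function name is hypothetical.

```python
import numpy as np

def lossy_propagator(g, kappa_b, delta, t):
    """Amplitude propagator for (a1, a2, b) with energy decay rate kappa_b in the bus."""
    h_eff = np.array([[0.0, 0.0, g],
                      [0.0, 0.0, g],
                      [g,   g,   delta - 0.5j * kappa_b]], dtype=complex)
    # h_eff is non-Hermitian, so use a general eigendecomposition.
    evals, vecs = np.linalg.eig(h_eff)
    return vecs @ np.diag(np.exp(-1j * evals * t)) @ np.linalg.inv(vecs)

g = 2 * np.pi * 560e3        # conversion rate (rad/s), from the main text
kappa_b = 2 * np.pi * 110e3  # bus decay rate (rad/s), from the main text

# Loaded oscillation frequency and the corresponding swap time.
g_loaded = g * np.sqrt(1 - kappa_b**2 / (32 * g**2))
tau_swap = np.pi / (np.sqrt(2) * g_loaded)

m = lossy_propagator(g, kappa_b, 0.0, tau_swap)
eta = abs(m[1, 0]) ** 2       # energy arriving in cavity 2 from cavity 1
bus_left = abs(m[2, 0]) ** 2  # energy remaining in the bus at tau_swap
print(f"eta ~ {eta:.3f}, residual bus population ~ {bus_left:.1e}")
```

With these rounded inputs the model gives an efficiency of roughly 0.9 and a bus that is empty at the end of the transfer, close to the quoted η = 0.89; the small difference is within what the precision of the rounded parameters allows.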

D. Entanglement generation
The other regime utilized in this work is the 50:50 beamsplitter (θ = π/4), which is obtained at $\Delta = g\sqrt{8/3}$. At this detuning, $\Omega = g\sqrt{4/3}$, and the interaction time is $\tau_{50:50} = \pi\sqrt{3/8}/g = \sqrt{3/4}\,\tau_{\rm SWAP}$. The resulting operation is a balanced beamsplitter, which can produce maximally entangled final states for certain initial states. Loss in the bus will also introduce a finite efficiency in the θ = 45° beamsplitter. For a single-photon input state, this efficiency, evaluated to leading order in $\kappa_b/g$, corresponds to less loss than the resonant swap because the process is faster and populates the bus less.
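Since there is only one excitation, the state which results from loss is $|00\rangle$. The above inefficiency therefore results in a mixture of the ideal Bell state $|\Psi\rangle = (|10\rangle + |01\rangle)/\sqrt{2}$ and the vacuum state:

$$\rho = \eta\,|\Psi\rangle\langle\Psi| + (1-\eta)\,|00\rangle\langle 00|,$$

where η here denotes the efficiency of the 50:50 beamsplitter. The second term has zero fidelity to the ideal state, so

$$F = \langle\Psi|\rho|\Psi\rangle = \eta. \tag{S14}$$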
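To see how the two working points compare under bus loss, the probability that a single photon survives in the cavities can be evaluated numerically. This is a rough sketch under the same assumptions as a simple damped three-mode model (real, equal couplings, bus loss entered as $-i\kappa_b/2$, no cavity decay or transmon excitations), using the lossless interaction times and the ratio $\kappa_b/g$ = 110/560 from the main text. The helper name `cavity_survival` is hypothetical.

```python
import numpy as np

def cavity_survival(g, kappa_b, delta, t):
    """Probability that a photon starting in cavity 1 is in either cavity at time t."""
    h_eff = np.array([[0.0, 0.0, g],
                      [0.0, 0.0, g],
                      [g,   g,   delta - 0.5j * kappa_b]], dtype=complex)
    evals, vecs = np.linalg.eig(h_eff)
    m = vecs @ np.diag(np.exp(-1j * evals * t)) @ np.linalg.inv(vecs)
    return abs(m[0, 0]) ** 2 + abs(m[1, 0]) ** 2

g = 1.0                # units of the conversion rate
kappa_b = 110 / 560    # kappa_b / g from the quoted experimental values

# (detuning, lossless interaction time) for each working point.
cases = {
    "swap": (0.0, np.pi / np.sqrt(2)),
    "50:50": (np.sqrt(8 / 3), np.pi * np.sqrt(3 / 8)),
}
survival = {name: cavity_survival(g, kappa_b, d, t) for name, (d, t) in cases.items()}
for name, value in survival.items():
    print(f"{name}: photon survival ~ {value:.3f}")
```

In this idealized model the 50:50 operation loses fewer photons than the resonant swap, and by the argument above the survival probability bounds the single-photon Bell fidelity attainable with bus loss alone; the measured fidelity is lower because other error sources also contribute.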

E. Alternate entangling schemes
There are many other ways the conversion process used here can generate entanglement. For instance, modulating the conversion couplings in time can effect an entangling partial swap. One such approach is to prepare a single photon in cavity 1, then turn on conversion to the bus for only cavity 1 for time $t_{\rm half} = \pi/(4g)$ to "half-swap" a single photon into the bus. This creates a Bell pair between cavity 1 and the bus. We may then turn on conversion from the bus to cavity 2 for $t_{\rm full} = \pi/(2g)$, which fully swaps the bus occupation into cavity 2, resulting in a Bell state between the two cavities. A continuous version of this protocol involves simultaneous on-resonance conversion with unequal strength, such that $g_1 = (\sqrt{2} - 1)\,g_2$. Dynamics are qualitatively similar to the beamsplitter operation we use in this work, with the bus mode being occupied at intermediate times and returning to vacuum after time ∼π/g, resulting in the 50:50 beamsplitter relations. However, such protocols implement Hamiltonians that are not symmetric under the exchange of modes $a_1$ and $a_2$. As a consequence, should a photon loss event occur in the bus mode, the environment will gain information that projects the cavities into states that are not symmetric in $a_1$ and $a_2$, making it difficult to implement robust multi-photon entanglement schemes.
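The unequal-coupling variant can be verified with the same kind of linear-mode calculation. The sketch below assumes lossless, resonant conversion with real couplings $g_1 = (\sqrt{2}-1)g_2$ and evaluates the propagator at $t = \pi/\sqrt{g_1^2 + g_2^2}$, the time at which the bus returns to vacuum in this model; variable names are ours.

```python
import numpy as np

g2 = 1.0
g1 = (np.sqrt(2) - 1) * g2   # unequal couplings from the text

# Resonant coupling matrix in the basis (a1, a2, b); real symmetric by assumption.
h = np.array([[0.0, 0.0, g1],
              [0.0, 0.0, g2],
              [g1,  g2,  0.0]])
evals, evecs = np.linalg.eigh(h)

# The bright-mode/bus oscillation completes a half period at t = pi / sqrt(g1^2 + g2^2).
t = np.pi / np.hypot(g1, g2)
m = (evecs * np.exp(-1j * evals * t)) @ evecs.T

p_stay = abs(m[0, 0]) ** 2   # photon from cavity 1 found in cavity 1
p_move = abs(m[1, 0]) ** 2   # photon from cavity 1 found in cavity 2
p_bus = abs(m[2, 0]) ** 2    # photon from cavity 1 found in the bus
print(p_stay, p_move, p_bus)
```

A photon starting in cavity 1 is split evenly between the cavities with the bus empty, reproducing the 50:50 relations, but the coupling matrix is visibly not symmetric under exchange of the two cavities, which is the drawback discussed above.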
For protocols such as the error-detected Hong-Ou-Mandel entanglement scheme, we must make sure to engineer a "true beamsplitter" transformation. For this scheme, if photon loss occurs in the bus mode, the environment does not learn from which cavity this excitation originated, due to the indistinguishability implied by the symmetry of the interaction. In fact, even after a single photon loss event in the bus, the joint cavity state is ideally the single-photon Bell state |01 + |10 . It is only after the parity measurements that the state is projected into either |01 or |10 . Entanglement schemes that are insensitive to photon loss on the bus or otherwise rely on it will be the subject of future work. Another approach robust state transfer and entanglement generation is stimulated Raman by adiabatic passage [28], which modulates the coupling strengths in time in a way which suppresses the occupation of the bus at all times. However, since this modulation must be adiabatic with respect to the maximum coupling strength, such protocols are necessarily much slower than the ones used here, similar to the virtual Raman approach. Loss induced by non-adiabaticity will not have the indistinguishability properties of the beamsplitter.
Additionally, encoding-independent entangling gates between bosonic modes such as exponential-SWAP and Fredkin gates [7] may be constructed by sandwiching local operations with ancillae between 50:50 beamsplitter operations. The tools demonstrated in this work enable such operations between separable modules.

A. Quality factor and attenuation length
With quality factors in the tens of thousands, the hypothetical maximum state transfer efficiency from just using a section of superconducting coaxial cable is extremely high. To illustrate this point, we consider the 6.6 cm section of cable we use for this work, which has modes with Q ≈ 50 000. Regarding the cable as a Fabry-Perot cavity with uniform loss, this quality factor corresponds to an energy attenuation length of 300 m [8]. Over reasonable meter-scale lengths of cable within a cryostat, this corresponds to a single-pass loss approaching 10^−3. This fundamental limit is orders of magnitude smaller than the single-pass loss observed in circulator-based communication links, which are limited to around 0.1 [15,18,19].
In the current implementation, the achievable efficiency is limited by the speed of the protocol (see equation (S10)), as a transmitted photon essentially makes many passes through the bus. However, increasing the conversion strength would move us closer to this fundamental limit. Furthermore, as the bus is a 3D cavity, it is reasonable to believe that the quality factor can be improved by several orders of magnitude with improved materials and better seam quality.
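The relation between quality factor and attenuation length can be sketched as follows. For a uniformly lossy line the energy decay time is Q/ω, so the energy attenuation length is L = Q·v/(2πf). The mode frequency (5.5 GHz) and phase velocity (2 × 10^8 m/s) below are assumptions made for illustration; they are not values quoted in this section.

```python
import math

def attenuation_length(q, freq_hz, phase_velocity):
    """Energy attenuation length of a uniformly lossy line: L = Q * v / (2*pi*f)."""
    return q * phase_velocity / (2.0 * math.pi * freq_hz)

def single_pass_loss(length_m, l_att):
    """Fraction of energy lost in one pass over length_m."""
    return 1.0 - math.exp(-length_m / l_att)

Q = 50_000
FREQ_HZ = 5.5e9  # assumed mode frequency (hypothetical)
V = 2.0e8        # assumed phase velocity in m/s (hypothetical)

l_att = attenuation_length(Q, FREQ_HZ, V)
loss_1m = single_pass_loss(1.0, l_att)
print(f"attenuation length ~ {l_att:.0f} m, loss over 1 m ~ {loss_1m:.1e}")
```

Under these assumptions the attenuation length comes out at a few hundred meters and the loss over one meter at the 10^−3 level, in line with the figures quoted above.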

Measuring the bus ex situ
We have used a scheme similar to Ref. [8] to characterize the quality of the bus mode before integrating it with the modules. We couple to the modes of a section of cable in reflection and measure S_21 with a vector network analyzer. The cable is terminated at one end in a tunnel of 6061 aluminum alloy that functions as a waveguide below cutoff. At the other end, we capacitively couple to the cable via a coupling pin in a similar tunnel. We set the distance between the coupling pin and center conductor of the cable such that we are nearly critically coupled. The cable is secured to these blocks of aluminum in the same way it is attached to the modules, with an indium-tipped brass set screw. Typical quality factors are of order ∼50,000, although quality factors as high as ∼160,000 have been observed. Quality factors can differ by factors of 2 to 3 in the same cable when we re-set the mounting screws between cooldowns. As such, we suspect seam loss between the outer conductor of the cable and the aluminum package of the modules to be a limiting factor. Efforts are ongoing to improve the quality and reproducibility of the seam between the cable and modules. This screening process also allows us to measure the frequency and quality of potential bus resonators before assembling the full experimental hardware. We are also able to measure multiple modes of the same cable in this way. Fig. S4 shows such a measurement of four modes of the same cable used in the main text.

Measuring the bus in situ
When the bus is installed in the modules, we can measure its properties without any dedicated drive or measurement lines. Since the mode of interest has a dispersive shift to the conversion transmon, we can detect population in the bus by performing spectroscopy on the converter, much the way we measured storage mode population with the ancilla. The results of this spectroscopy with the cable driven to a coherent state are shown in Fig. S5a.
Given the number-resolved spectrum in Fig. S5a, the damping rate of the cable satisfies κ_b ≪ χ_bc. We can measure this rate directly by displacing the cable and driving the converter with a selective pulse at ω = ω_c. The height of this spectroscopic peak corresponds to the occupation of the |0⟩ state of the bus. This ring-down measurement, shown in Fig. S5b, reveals a bus lifetime of 1.6 µs, or κ_b/2π = 100 kHz.
We certify that the lifetime of the bus is not degraded in the presence of the conversion pumps. This is done by preparing a single excitation in one of the storage modes and turning on conversion on only one module, such that excitations swap into the bus and back into the storage cavity. By measuring the storage population after applying conversion for variable time t, we can extract both g and κ_b from the fit in Fig. S6. The fitted value κ_b/2π = 110 kHz shows minimal change in the presence of the conversion process. Likewise, fitting the data in Fig. 2b of the main text results in a similar value of κ_b. We also verify that the cable has no measurable thermal population by examining the storage population after a single swap between the storage and bus when we initialize vacuum.

Bus is displaced to |α = 2.5⟩, then after a variable delay, the conversion transmon is driven with a selective π pulse at ω = ω_c.

C. Dependence of scheme on cable length
The section of cable we use is fairly short, suitable for joining adjacent modules together. For certain applications, it may be beneficial to have meter-scale lengths of cable which can connect any two modules within a cryostat. With all other hardware kept fixed, some challenges arise when making the link much longer. Firstly, the coupling rate g would decrease due to a reduction in the cross-Kerr between the conversion transmon and the bus mode. This cross-Kerr is proportional to the energy participation of the bus mode in the junction of this transmon. From a simple volume argument, the energy density in the bus mode should scale as ∼1/l and thus so should the cross-Kerr. The parametric conversion strength g scales as the geometric mean of this cross-Kerr and the cross-Kerr between the storage cavity and the conversion transmon, so we expect g ∼ 1/√l. This reduction in g may be compensated by increasing the capacitive coupling between the conversion transmon and the end of the cable. In doing so, we may Purcell limit the transmon and storage cavity as well as increase crosstalk between modules. We have reason to believe we may already be Purcell limiting the conversion transmons, since their lifetimes as measured before the addition of the bus were a factor of a few longer.
The Purcell effect arises due to off-resonant static couplings to modes of the cable even in the absence of conversion drives. As we make the cable longer, the free spectral range (FSR) decreases as 1/l. Within a fixed frequency window, there are now more cable modes the conversion transmon or storage mode can couple to off-resonantly. This potential issue could be addressed with dedicated filter modes between the conversion transmon and the bus. This is similar to the approach used in Ref. [20], but in this case the filter modes need not be precisely frequency-matched. Improving the quality of the bus modes would also lessen the effect of the Purcell limit. Finally, it is worth emphasizing that even for a longer cable, we can still effectively model the dynamics by considering coupling to a single bus mode, and the protocols would be unchanged. Only when g becomes comparable to the FSR will we need to consider simultaneous coupling to multiple bus modes, as was explored in Ref. [21]. For a 1 m cable, the FSR is ∼100 MHz, much larger than the conversion strengths we can engineer at present.
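The length scalings in this subsection can be put into numbers with a short sketch. It assumes the cable behaves as a resonator with two reflecting ends, so that FSR = v/(2l), and uses a phase velocity of 2 × 10^8 m/s chosen here only because it reproduces the ∼100 MHz FSR quoted for a 1 m cable. The coupling is expressed relative to its value for the 6.6 cm cable, since no absolute value of g is assumed.

```python
import math

def free_spectral_range(length_m, phase_velocity=2.0e8):
    """FSR (Hz) of a cable resonator with two reflecting ends: v / (2 l)."""
    return phase_velocity / (2.0 * length_m)

def scaled_coupling(g_ref, l_ref, length_m):
    """Conversion rate under the g ~ 1/sqrt(l) scaling argued in the text."""
    return g_ref * math.sqrt(l_ref / length_m)

L_REF = 0.066  # cable length used in this work, in meters

fsr_short = free_spectral_range(L_REF)
fsr_1m = free_spectral_range(1.0)
g_ratio = scaled_coupling(1.0, L_REF, 1.0)  # g(1 m) / g(6.6 cm)

print(f"FSR(6.6 cm) ~ {fsr_short / 1e9:.2f} GHz, FSR(1 m) ~ {fsr_1m / 1e6:.0f} MHz")
print(f"g(1 m) / g(6.6 cm) ~ {g_ratio:.2f}")
```

Under these assumptions, going from 6.6 cm to 1 m reduces g to roughly a quarter of its value while the mode spacing shrinks by a factor of about fifteen.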

IV. EXPERIMENTAL AND ANALYTICAL TECHNIQUES
A. Measurement

Ancilla measurement
For all measurements which are not at the end of the experimental sequence, we use a 460 ns long square readout pulse calibrated to discriminate ancilla states |g⟩ and |e⟩. The acquisition window is 580 ns long. This measurement is used for system reset as well as parity assignment. Ancillae have assignment fidelities of 0.99 for |g⟩ and 0.98 for |e⟩, with the asymmetry due to relaxation events during the measurement.

Tomographic readout
Measurements used for tomography (the final measurements in an experimental run) are preceded by a pulse on the ancilla which inverts the population in |e⟩ and |f⟩. The readout pulse and acquisition (500 and 640 ns, respectively) are longer, and calibrated to distinguish states |g⟩ and |f⟩. This allows for a measurement which is much less sensitive to decay events of the ancilla [40,41], providing higher and more symmetric assignment fidelity, higher than 0.995 for all states. This improves the measurement contrast and reduces errors in the single-shot projective measurement in the entangled state tomography (Subsec. IV D).

Cavity and ancilla manipulation
Unless otherwise noted, all ancilla rotations are effected with 40 ns Gaussian pulses (σ = 10 ns). Cavity displacements are 40 ns Gaussian pulses (σ = 10 ns). All other manipulations of the cavity state are carried out with numerically-optimized control pulses (NOCP) on the cavity and ancilla using the GRAPE algorithm [42]. Pulse lengths are 500-1200 ns depending on the operation.

Parity measurement
Cavity photon number parity measurement is effected with a Ramsey-type sequence on the ancilla [43]. Two 90° ancilla rotations with an inter-pulse delay of π/χ_at = 416 (492) ns for module 1 (2) entangle the ancilla state with the photon number. Ancilla rotations for parity measurement are 24 ns Gaussian pulses (σ = 4 ns) to make pulses maximally unselective on photon number. Parity measurements used for error detection have the phase of the second rotation reversed to map even photon number (the most probable outcome) onto the ground state of the ancilla, to minimize the probability of errors during ancilla measurement.
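The parity mapping can be illustrated with an idealized two-level model of the ancilla. The sketch below assumes instantaneous rotations about the Y axis, no decoherence, and a purely dispersive phase n·χ·t on |e⟩ for n photons in the cavity; the rotation axis and the function names are choices made here for illustration. The dispersive shifts are inferred from the quoted delays via t = π/χ.

```python
import numpy as np

def ry(theta):
    """Rotation about Y by angle theta in the ancilla basis (|g>, |e>)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def parity_map_pe(n, delay_s, chi_hz, reverse_second=False):
    """Probability of finding the ancilla in |e> after the Ramsey sequence
    with n photons in the cavity (idealized model)."""
    phi = 2 * np.pi * chi_hz * delay_s * n
    wait = np.diag([1.0, np.exp(-1j * phi)])
    second = ry(-np.pi / 2) if reverse_second else ry(np.pi / 2)
    psi = second @ wait @ ry(np.pi / 2) @ np.array([1.0, 0.0], dtype=complex)
    return float(np.abs(psi[1]) ** 2)

DELAYS = {1: 416e-9, 2: 492e-9}                       # parity delays per module
chi_hz = {m: 1.0 / (2.0 * t) for m, t in DELAYS.items()}  # chi/2pi implied by t = pi/chi

for n in range(4):
    pe = parity_map_pe(n, DELAYS[1], chi_hz[1], reverse_second=True)
    print(n, round(pe, 6))
```

With the second rotation reversed, even photon numbers leave the ancilla in |g⟩ and odd photon numbers in |e⟩, matching the convention used for error detection.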

Cavity population measurement and normalization
Measurement of the occupation of the nth Fock state in the cavity is made by applying a spectrally narrow rotation on the ancilla, at frequency ω q +nχ, exciting the ancilla only for this number state. Selective pulse lengths are 1200 ns (σ = 400 ns) for module 1, 1920 ns (σ = 480 ns) for module 2. To normalize for errors in the pulse and readout, we measure this occupation for each relevant n, as well as a reference measurement of the ancilla state with no rotation, then subtract the reference and normalize so that the sum of all occupations is 1. This normalization procedure is applied to the data in Figure 2b,c in the main text.
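The normalization step described above amounts to a background subtraction followed by a rescaling. A minimal sketch, with made-up input numbers used purely for illustration:

```python
import numpy as np

def normalize_populations(raw, reference):
    """Subtract the no-rotation reference from each selective-pulse signal,
    then rescale so that the occupations sum to 1."""
    corrected = np.asarray(raw, dtype=float) - reference
    return corrected / corrected.sum()

raw_signal = [0.47, 0.41, 0.05]  # hypothetical ancilla excitation for n = 0, 1, 2
reference = 0.02                 # hypothetical signal with no rotation applied
occupations = normalize_populations(raw_signal, reference)
print(occupations)
```

This assumes that all relevant photon numbers are included in the sum, as stated in the text; population outside the measured set would be absorbed into the normalization.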

Conversion transmon measurement
The conversion transmons do not have their own readout resonators, and are measured indirectly through the cavities. This is used only for system reset and characterization measurements. The cavity can be used to measure the converter as in Ref. [37]. After the cavity is determined to be in its vacuum state (see Section IV B), it is displaced to a coherent state with amplitude α, typically ∼1.5. After a delay of ∼200 ns, the opposite displacement is applied. If the converter is in its ground state, the coherent state will not have moved during the delay, and will return to vacuum. If the converter was excited during this time, the cavity state will rotate by an angle χ_ac t, typically ∼70°. The reverse displacement brings this to another coherent state with a very small overlap with the vacuum state. By applying a π-pulse to the ancilla, selective on zero photons in the cavity, we obtain an excitation probability proportional to the converter excitation probability, with a fidelity of about 0.95. Importantly, this measurement is unlikely to incorrectly give the result corresponding to the converter in its ground state, so it is useful for verifying with high confidence that the converter is not excited.
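The size of the residual vacuum overlap can be estimated from the typical numbers quoted above. If the coherent state |α⟩ rotates by an angle φ before the reverse displacement, the final state is a coherent state of amplitude α(e^{iφ} − 1) up to a phase, so its overlap with vacuum is exp(−2|α|²(1 − cos φ)). The sketch below assumes the converter stays excited for the entire delay and ignores all other error sources.

```python
import math

def vacuum_overlap(alpha, angle_deg):
    """|<0| D(-alpha) R(phi) D(alpha) |0>|^2 = exp(-2 |alpha|^2 (1 - cos phi))."""
    phi = math.radians(angle_deg)
    return math.exp(-2.0 * alpha**2 * (1.0 - math.cos(phi)))

def chi_from_rotation(angle_deg, delay_s):
    """Dispersive shift chi/2pi (Hz) implied by a rotation angle over a delay."""
    return (angle_deg / 360.0) / delay_s

overlap = vacuum_overlap(1.5, 70.0)
chi_ac_hz = chi_from_rotation(70.0, 200e-9)
print(f"vacuum overlap ~ {overlap:.3f}, chi_ac/2pi ~ {chi_ac_hz / 1e6:.2f} MHz")
```

For α = 1.5 and φ = 70° the overlap is about 5%, which is of the same order as the infidelity implied by the quoted measurement fidelity of about 0.95.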

System reset by feedback
To ensure the modules begin in a known state, we use an active feedback cooling sequence that makes use of the ability of our control hardware to perform simultaneous and independent control branching when resetting the ancilla transmons.
The set of nested subroutines used at the beginning of every experimental sequence is shown in Fig. S7. The sequence begins by ensuring both ancillae are in their ground states, actively resetting as necessary. Then a π pulse, selective on n = 0 photons in the cavity, is applied to each ancilla, followed by measurement. Measurement of the ancilla in |e⟩ heralds an empty cavity. If both cavities are empty, we continue (see next paragraph). If not, we reset the ancillae to |g⟩, then actively empty the cavities by performing swaps with the bus (as in Fig. S6), one at a time, with a 10 µs delay after the swap to allow the state to decay in the bus. This is at least two orders of magnitude faster than waiting for the long-lived cavities to decay on their own. We then start the sequence over, beginning with ancilla reset, and repeating as necessary to ensure both cavities are in the vacuum state.
Once the cavities are confirmed empty, we use them to measure the state of the conversion transmons. This uses the Ramsey-style selective displacement described in Sec. IV A 6, which displaces the cavity if the transmon is not in its ground state. We then repeat the cavity measurement. If the cavities are again found in vacuum, we know the converters were in |g⟩. We then reset the ancillae one last time, and begin the experimental sequence. If either cavity is not in the vacuum state, this means its converter was not in |g⟩. We then empty both cavities and begin the entire sequence from the beginning. Since the time to empty the cavities is relatively long, we do not actively reset the converters, but simply allow them to decay during this time.
After successful completion of this cooling routine, we find the ancilla transmons and cavities with less than 1% probability each of being out of their respective ground states. The conversion transmons are difficult to measure to this degree of accuracy, since their measurement sequence is fairly long and involved. Given the length of this sequence, it is likely that they have re-thermalized to about the 1% level (each) by the time the cooling is successfully completed.
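A quick estimate shows why a 10 µs wait suffices to empty the bus after each swap. The sketch below uses the bus lifetime of 1.6 µs reported from the ring-down measurement and assumes simple exponential energy decay with no re-excitation.

```python
import math

BUS_LIFETIME_US = 1.6  # bus lifetime from the ring-down measurement
WAIT_US = 10.0         # delay after each swap during reset

# Fraction of the swapped energy still in the bus after the wait.
residual = math.exp(-WAIT_US / BUS_LIFETIME_US)

# Linewidth corresponding to this lifetime: kappa / 2pi = 1 / (2 pi T).
kappa_over_2pi_khz = 1e3 / (2.0 * math.pi * BUS_LIFETIME_US)

print(f"residual bus energy ~ {residual:.1e}")
print(f"kappa_b / 2pi ~ {kappa_over_2pi_khz:.0f} kHz")
```

The residual energy after the wait is about 0.2%, and the same lifetime corresponds to a linewidth of about 100 kHz.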

FIG. S7. System Reset. Logical control flow for system reset and verification. Each experimental sequence begins with a call to "Reset System," which follows the control flow until the conditions have been met to reach "Proceed." Individual subroutines loop until reaching "Return," at which point they return to the previous sequence.

State preparation
As discussed above, all nontrivial cavity states are prepared with NOCPs. For all operations used to prepare cavity states, the ancilla is meant to return to its ground state at the end of the pulse. Errors during the operation can result in occupation of the excited state, usually with probability 2-3% depending on the pulse. To detect these errors, we measure the state of the ancilla after application of the pulse for all experiments. If the ancilla is not measured to be in its ground state, we consider this a failure of the preparation, and reset the entire system before trying again. This measurement is responsible for the small, deterministic phase shift seen in the Wigner tomograms of the states as prepared, shown in Fig. 3b,c of the main text. While this makes the entire experiment probabilistic, we regard this as a part of the system initialization process, which is already nondeterministic. Operations not at the beginning of the experimental sequence are not error-detected in this way. With improved ancilla coherence and calibration, this step would not be necessary [42].

Logical state encoding
For each of the two logical encodings used in Figure 3 of the main text, the state is encoded in the cavity using NOCPs. First the state is prepared in the ancilla transmon with a phase- and amplitude-controlled rotation. Then the encoding pulse is applied, which maps the combined ancilla-cavity state |g⟩|0⟩ (|e⟩|0⟩) onto |g⟩|0_L⟩ (|g⟩|1_L⟩). As stated in Section IV B 2, we then confirm the ancilla has successfully returned to its ground state before proceeding. The mean fidelities of the encoded states are 0.99 for the Fock encoding and 0.98 for the cat encoding, obtained from Wigner tomography.

Measurement, symmetrization, normalization and reconstruction
The Wigner function measurement is carried out as in Ref. [43], for instance. The cavity is displaced by a variable amount β, and the average parity is measured using the parity measurement described in Sec. IV A 4. To symmetrize the measurement, we perform two distinct parity mapping sequences: one which maps even photon numbers to |g⟩ of the ancilla, and one which maps even to |e⟩. We take the difference of the two resulting datasets. This symmetrizes the Wigner function against biased readout errors and finite number-selectivity of the ancilla rotations.
The Wigner function of any physical cavity state should integrate to 1, even for a mixed state. Since our reconstruction routine assumes the data to be physical, we normalize the measured Wigner functions by a trapezoidal 2D integral over the entire dataset. This corrects for loss of contrast due to the parity mapping sequence and the ancilla measurement, an effect of 2-3%. This results in the data presented in the main text.
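The normalization by a 2D trapezoidal integral can be sketched as follows. The grid, the vacuum test state, and the 3% contrast loss are illustrative assumptions; the convention used is W(β) normalized so that its integral over the complex plane is 1, for which the vacuum state has W(β) = (2/π) exp(−2|β|²). Trapezoid weights are built by hand so that the sketch does not depend on a particular NumPy version.

```python
import numpy as np

def trapezoid_weights(axis):
    """Trapezoid-rule quadrature weights for a 1D grid (uniform or not)."""
    w = np.zeros_like(axis, dtype=float)
    d = np.diff(axis)
    w[:-1] += d / 2.0
    w[1:] += d / 2.0
    return w

def normalize_wigner(w_measured, re_beta, im_beta):
    """Rescale a sampled Wigner function so its 2D trapezoidal integral is 1.
    Returns the rescaled array and the integral of the input."""
    wx = trapezoid_weights(re_beta)
    wy = trapezoid_weights(im_beta)
    integral = wx @ w_measured @ wy
    return w_measured / integral, integral

x = np.linspace(-3.0, 3.0, 61)
X, Y = np.meshgrid(x, x, indexing="ij")
vacuum = (2.0 / np.pi) * np.exp(-2.0 * (X**2 + Y**2))
measured = 0.97 * vacuum  # hypothetical 3% loss of contrast

normalized, integral = normalize_wigner(measured, x, x)
print(f"integral before normalization ~ {integral:.3f}")
```

The integral of the input directly gives the measurement contrast, which is the 2-3% effect mentioned above.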
The cavity state ρ is reconstructed from the Wigner function using maximum likelihood estimation, the same routine used in Ref. [15]. The routine is a convex optimization over the space of physical cavity density matrices with dimension d = 8 for all data. Since the largest states measured have mean photon number n̄ ≤ 2, this Hilbert space is sufficiently large to capture all population. The physicality constraints are that ρ is positive semi-definite and Tr(ρ) = 1.

Fidelity error bars
The state fidelities quoted in the text are computed as the fidelity of the reconstructed state ρ to the ideal state ρ_ideal, F = (Tr √(√ρ_ideal ρ √ρ_ideal))². Errors in reconstruction due to noise and systematic effects, such as the dependence of the parity measurement contrast on mean photon number, contribute about 1% error on average, as estimated from simulating these imperfections on ideal data. This gives the error bar quoted for most of the mean state fidelities in the text. The systematic error for F̄_odd, which is reconstructed from Wigner functions taken after a measurement of odd parity, is larger due to the occurrence of ancilla decay errors during the first parity measurement, since odd parity is associated with a measurement of the ancilla in |e⟩. These errors result in distortion of the measured Wigner function due to dephasing caused by the dispersive shift to the ancilla. However, due to the low probability of this case, the overall error in F̄_cat,tracked is not as large.
The errors in the entangled state reconstruction, described in Sec. IV D, are similar, since this method relies mostly on density matrices reconstructed from Wigner functions.
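The fidelity used throughout, F = (Tr √(√ρ_ideal ρ √ρ_ideal))², can be evaluated with a few lines of linear algebra. The sketch below is a generic implementation, not the analysis code used for the data; the example state is a hypothetical mixture of a single-photon Bell state and vacuum with weight η = 0.9, for which the fidelity should equal η.

```python
import numpy as np

def psd_sqrt(m):
    """Matrix square root of a Hermitian positive semi-definite matrix."""
    evals, evecs = np.linalg.eigh(m)
    evals = np.clip(evals, 0.0, None)  # remove small negative rounding errors
    return (evecs * np.sqrt(evals)) @ evecs.conj().T

def fidelity(rho_ideal, rho):
    """F = (Tr sqrt(sqrt(rho_ideal) rho sqrt(rho_ideal)))^2."""
    s = psd_sqrt(rho_ideal)
    inner = psd_sqrt(s @ rho @ s)
    return float(np.real(np.trace(inner)) ** 2)

# Two-qubit basis ordering: |00>, |01>, |10>, |11>.
eta = 0.9  # hypothetical efficiency
bell = np.array([0, 1, 1, 0], dtype=complex) / np.sqrt(2)
vac = np.array([1, 0, 0, 0], dtype=complex)
rho_ideal = np.outer(bell, bell.conj())
rho = eta * rho_ideal + (1 - eta) * np.outer(vac, vac.conj())

F = fidelity(rho_ideal, rho)
print(f"F ~ {F:.4f}")
```

For a pure ideal state this expression reduces to the overlap ⟨ψ|ρ|ψ⟩, so a vacuum admixture lowers the fidelity in direct proportion to its weight.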

D. Entangled state tomography
To clearly illustrate the correlations between the two cavities, we perform Wigner tomography on cavity 1, postselected on a logical measurement in cavity 2 in the x, y, and z bases, as indicated in Fig. 4a,d of the main text. Since the Wigner function is a complete description of the state, these conditional Wigner tomograms provide enough information to reconstruct the full two-qubit state.

Logical basis measurement
The logical basis measurements for entanglement characterization are effected by decoding the cavity state onto the ancilla using NOCPs. These are the opposite of the encoding operations in Sec. IV B 3. The mapping is |g⟩|0_L⟩ (|g⟩|1_L⟩) to |g⟩|0⟩ (|e⟩|0⟩). We then measure the ancilla to effect a z basis measurement, or rotate the ancilla into the appropriate basis with a π/2 pulse around the Y (X) axis to measure in the x (y) basis.
To assess the fidelity of the decoding operations, we prepare six cardinal states in the ancilla (|±z⟩, |±x⟩, and |±y⟩), encode and immediately decode, then apply the rotation which should restore the ancilla to the ground state, and measure. We find on average a 3-4% error, depending on the encoding. We assume the infidelity of encoding and decoding is similar, and attribute half the average incorrect measurement result to the "decode, rotate, and measure" operation, which is the same operation that makes up the logical basis measurement. This yields the ∼2% tomographic error quoted in the main text.

Two-qubit state reconstruction
Conditional Wigner tomograms for the z and x bases are shown in Fig. 4b,e of the main text; the complete dataset is shown in Fig. S12. We reconstruct each Wigner function individually to produce the density matrix of cavity 1, conditioned on the measurement outcome in cavity 2. As before, the measured Wigner function is normalized before reconstruction, to correct for measurement contrast in the parity mapping and ancilla readout. The result is two conditional density matrices for each of the three bases. This way we can reconstruct the two-qubit state without having to apply a decode pulse on cavity 1 as well.
We use these conditional density matrices to reconstruct the logical two-qubit density matrix. This reconstruction uses a routine used in Ref. [31], adapted for our tomography scheme. The use of the |f⟩ level of the ancilla for enhanced and roughly symmetric measurement contrast obviates the need for additional measurements to symmetrize the resultant data (see Sec. IV A 2).
Each joint choice of bases for the two cavities is given by {k, l} ∈ {x, y, z}^⊗2, where k corresponds to the basis choice for cavity 1 and l for cavity 2. We refer to the measured probabilities of the logical measurements in cavity 2 as p_±l, and the conditional density matrices of cavity 1 as ρ_±l. The goal is to produce the expectation values p_{±k,±l} of the four projectors Π_{±k,±l} = |±k⟩_1 |±l⟩_2 ⟨±k|_1 ⟨±l|_2, which correspond to the probability of measuring the joint state to be in |±k⟩_1 |±l⟩_2. This joint probability is p_{±k,±l} = p_±l P(±k|±l), where P(±k|±l) is the conditional probability of measuring ±k in cavity 1 given the result ±l in cavity 2. This conditional probability is the expectation value of the single-cavity projector Π_±k = |±k⟩_1 ⟨±k|_1, given the result ±l in cavity 2.
To compute these conditional probabilities, we take the conditional density matrix ρ_±l and evaluate P(±k|±l) = ⟨Π_±k⟩_±l ≡ Tr(ρ_±l Π_±k), which is the squared overlap of the measured cavity state with the logical state |±k⟩_1 given outcome ±l. This is essentially the probability we would measure cavity 1 to be in |±k⟩_1 with an ideal projective measurement. It is important to note here that, since the cavity density matrix is of dimension larger than 2, leakage out of the logical subspace (here, {|0⟩, |1⟩}) results in P(+k|±l) + P(−k|±l) < 1. We will see in a moment that this results in a reconstructed logical two-qubit state with trace slightly less than 1. Since we assign a binary outcome to the logical measurement of cavity 2, p_+l + p_−l = 1 by construction. This means that leakage out of the logical space on cavity 2 is not directly observed. However, such leakage will contribute to infidelity. Since the decode operation cannot account for this leakage, the result is some arbitrary outcome of the ancilla measurement, which we assume to be uncorrelated with the result in cavity 1. Thus, while the decode-and-measure sequence will mask this leakage, it should convert it to infidelity in the form of a statistical mixture. Put another way, this local operation cannot increase the amount of entanglement, so it does not result in overestimation of the fidelity. The 3 × 3 × 4 = 36 computed joint probabilities p_{±k,±l} are fed into the MLE reconstruction routine, which is a convex optimization over the space of all physical two-qubit (2²-dimensional) density matrices. To ensure physicality, the resultant density matrix ρ_L is constrained to be Hermitian and positive semi-definite. In addition, Tr(ρ_L) ≤ 1 to account for the possibility of leakage out of the logical space as discussed above.
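The construction of the joint probabilities from conditional states can be illustrated on an ideal two-qubit example. The sketch below replaces the measured quantities with values computed from a known state: it treats cavity 1 as a two-level system (so there is no leakage), and it obtains p_±l and ρ_±l by projecting a single-photon Bell state rather than from tomography. Function and variable names are illustrative.

```python
import numpy as np
from itertools import product

S = 1 / np.sqrt(2)
# Single-qubit states (|+k>, |-k>) for k = x, y, z in the basis (|0>, |1>).
BASIS = {
    "x": (np.array([S, S]), np.array([S, -S])),
    "y": (np.array([S, 1j * S]), np.array([S, -1j * S])),
    "z": (np.array([1.0, 0.0]), np.array([0.0, 1.0])),
}

def projector(v):
    return np.outer(v, v.conj())

def conditional_state(rho12, v2):
    """Probability of outcome v2 on qubit 2 and the normalized state of
    qubit 1 conditioned on it (assumes the outcome has nonzero probability)."""
    op = np.kron(np.eye(2), v2.conj().reshape(1, 2))  # <v2| acting on qubit 2
    unnorm = op @ rho12 @ op.conj().T
    p = float(np.real(np.trace(unnorm)))
    return p, unnorm / p

def joint_probabilities(rho12):
    """p_{+-k,+-l} = p_{+-l} * Tr(rho_{+-l} Pi_{+-k}) for all basis pairs."""
    table = {}
    for k, l in product("xyz", repeat=2):
        for sl, v2 in zip("+-", BASIS[l]):
            p_l, rho1 = conditional_state(rho12, v2)
            for sk, v1 in zip("+-", BASIS[k]):
                cond = float(np.real(np.trace(rho1 @ projector(v1))))
                table[(sk + k, sl + l)] = p_l * cond
    return table

bell = np.array([0, 1, 1, 0], dtype=complex) / np.sqrt(2)  # (|01> + |10>)/sqrt(2)
table = joint_probabilities(np.outer(bell, bell.conj()))
print(len(table), table[("+z", "-z")], table[("+x", "+x")])
```

Nine basis pairs with four outcomes each give 36 joint probabilities; in the experiment the corresponding sums over outcomes can fall slightly below 1 because of leakage out of the logical subspace of cavity 1.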
This leakage is very small for the {|0⟩, |1⟩} encoding: the trace of ρ_L (the value of the II bar) is found to be 0.999, consistent with very small (< 10^−3) occupation of Fock states above n = 1 for the reconstructed cavity density matrices.
For the two-photon entangled state, there is a small but measurable amount of leakage outside of the {|0⟩, |2⟩} code space due to errors in the parity measurement and cavity decay during tomography. The value of the II bar (and hence the trace of ρ_L) is 0.991, consistent with a typical 1% occupation of the |1⟩ state in the measured Wigner functions of cavity 1. In fact, the occupation of the error state |1⟩ is found to be largest for states with large occupation of |2⟩, suggesting cavity decay errors are primarily responsible.

A. Definition
Our error-correctable encoding is the four-component cat code [5]. The codewords are superpositions of the four coherent states |±α⟩ and |±iα⟩ with definite photon number modulo 4 (|0_L⟩ contains n = 2, 6, ... and |1_L⟩ contains n = 0, 4, ...), each with a state-dependent normalization factor N_i. These states are orthogonal for all values of α.
A single photon loss event on the logical space spanned by these codewords takes a superposition of codewords into the error space, which is spanned by the odd-parity four-component cats. An odd parity outcome after the state transfer tells us we just need to relabel the codewords to those of the error space, and we will have mostly preserved the quantum information.
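The structure of the code and error spaces can be verified numerically in a truncated Fock basis. The explicit sign convention below, |0_L⟩ ∝ |α⟩ + |−α⟩ − |iα⟩ − |−iα⟩ and |1_L⟩ ∝ |α⟩ + |−α⟩ + |iα⟩ + |−iα⟩, is an assumption chosen so that |0_L⟩ → |2⟩ and |1_L⟩ → |0⟩ as α → 0, as stated in the following subsection; the truncation dimension and function names are also illustrative.

```python
import numpy as np
from math import factorial

def coherent(alpha, dim):
    """Coherent state |alpha> in a Fock basis truncated at dim levels."""
    n = np.arange(dim)
    norms = np.sqrt(np.array([factorial(int(k)) for k in n], dtype=float))
    return np.exp(-abs(alpha) ** 2 / 2) * np.power(complex(alpha), n) / norms

def cat_codewords(alpha, dim=40):
    """Normalized four-component cat codewords (assumed sign convention)."""
    a, b, c, d = (coherent(alpha * ph, dim) for ph in (1, -1, 1j, -1j))
    zero_l = a + b - c - d  # photon numbers 2 mod 4
    one_l = a + b + c + d   # photon numbers 0 mod 4
    return zero_l / np.linalg.norm(zero_l), one_l / np.linalg.norm(one_l)

def lowering(dim):
    """Photon annihilation operator in the truncated Fock basis."""
    return np.diag(np.sqrt(np.arange(1, dim)), k=1)

dim = 40
n = np.arange(dim)
zero_l, one_l = cat_codewords(1.3, dim)

# After one photon loss the codewords land in the odd-parity error space.
err0 = lowering(dim) @ zero_l  # photon numbers 1 mod 4
err1 = lowering(dim) @ one_l   # photon numbers 3 mod 4

print("overlap of codewords:", abs(np.vdot(zero_l, one_l)))
print("overlap of error states:", abs(np.vdot(err0, err1)))
```

Because the error states remain orthogonal, an odd parity outcome identifies the error without revealing which codeword was present, up to the photon-number imbalance discussed in the next subsection.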

B. Optimum cat size
The value of α for the cat code is something we can choose when we encode the initial states with NOCPs. We can see that in the limit α → 0, the codewords become |0_L⟩ = |2⟩, |1_L⟩ = |0⟩. This is not a good encoding for error detection since only the |0_L⟩ codeword can lose a photon. Upon measuring odd parity, the state is projected into |1⟩, destroying any initial superposition. We can detect errors, but cannot recover the information. For small but finite values of α, upon knowing that a photon was lost, the state will be polarized more towards the codeword that started with the larger number of photons, resulting in a loss of fidelity. Similarly, upon knowing that no photon was lost, the state is polarized more towards the codeword with fewer photons. This no-jump backaction results in a logical dephasing error.
This no-jump error is suppressed exponentially for large α, since |0_L⟩ and |1_L⟩ will contain the same number of photons on average (n̄ = |α|²). However, at larger α, the dominant error is two-photon loss. Since loss of two photons does not change the parity, this error is undetectable, and results in a logical bit-flip. As such, there is an optimum value of α to use for a given energy transfer efficiency η, as illustrated in Fig. S8. This trade-off has been explored theoretically in more detail in Ref. [44].
In this experiment, η = 0.84 yields an optimal starting α = 1.3. With only photon loss error, this puts a theoretical upper bound of 0.97(1) on the transfer fidelity. The measured value of 0.92 is lower due to additional experimental errors, mainly excitations of the conversion transmons and infidelity in the parity measurement and state preparation.

TABLE S1. State transfer error contributions. Error quoted due to bus loss for cat code is the remaining infidelity from second-order errors, assuming perfect error-tracking. "State preparation" error is fidelity of preparing a single photon for the inefficiency, and mean state fidelity of encoding for the infidelities. Infidelity due to parity measurement is roughly equal contributions of error in parity mapping and ancilla measurement, both of which can be suppressed by repeated fault-tolerant parity measurement [30].

C. Comparison to Binomial encoding
The optimal cat code basis is qualitatively very similar to the lowest order binomial encoding [14] with codewords |0_L⟩ = (|0⟩ + |4⟩)/√2, |1_L⟩ = |2⟩. For the experimental transfer efficiency η = 0.84, the cat code is predicted to give slightly better transfer fidelity, by a few percent, due to lower overhead. This is because n̄ ≈ 1.7 for the optimal cat code versus n̄ = 2 for the binomial code.

D. Post-transfer basis
After the transfer protocol, the information is encoded in a new logical basis. After reconstructing the density matrix of the 6 cardinal transferred states, we find the basis that maximizes the average transfer fidelity independently for each parity outcome. We optimize the choice of basis over the size of the cat, α and a deterministic phase shift that is different for the even and odd parity outcomes. For both parity outcomes, we find an optimal α = 1.2, close to what we expect from the no-jump backaction. Both these bases still have the same error-correctable properties as the original basis, namely further single photon loss events can be detected by measuring parity jumps.

VI. ERROR-CORRECTED STATE TRANSFER
To demonstrate the feed-forward tools needed to actively correct for photon loss during the state transfer, we extend the error-tracked state transfer with a real-time conditional decoding procedure. We apply a decoding NOCP on module 2 to transfer the state from cavity 2 to ancilla 2, ideally leaving cavity 2 in the vacuum state, analogous to the decoding used for the logical measurement used for entanglement tomography in Subsec. IV D. Overall, this amounts to transferring the qubit state from ancilla 1 to ancilla 2 with the information encoded in the |g⟩, |e⟩ states of ancilla 2 at the end of the protocol. Ancilla measurement for qubit tomography is the same as the measurement described in Sec. IV D 1.
Since the qubit is encoded in a different basis depending on the measured error syndrome outcome, we pre-load two decodings, one for even parity and one for odd. The controller branches on the parity measurement outcome to use the correct decoding operation, similar to Ref. [13]. The NOCPs for the Fock and cat encodings introduce a small additional infidelity estimated to be around 1-2% due to ancilla decoherence and pulse errors. Since the ancilla can be treated as a two-level system, leakage errors out of the cavity code space are converted to errors on the ancilla of another type, which in general depends on the details of the leakage and the NOCP. While the decoding process masks the form of these errors, it cannot in principle improve the fidelity beyond what was measured from direct Wigner tomography.
For both Fock and cat encodings, we find an ancilla-to-ancilla average state transfer fidelity of 0.91, indicating we are also at break-even for this extended version of our protocol. Qubit tomography and individual transfer fidelities are shown in Fig. S14 and Table S3. Alternatively, one may error correct the cavities by performing conditional NOCPs that map the appropriate post-transfer basis back to the original encoded basis.
The decoded state tomograms reveal essential features of error correction with the cat code. Since the codewords used do not have the same average photon number, there is a noticeable polarization error towards the state |0_L⟩ in the case of a single photon loss, evident in Fig. S14b. In other words, when we detect a single photon error, we learn something about the logical state: we were more likely to have started in state |0_L⟩ (contains n = 2, 6, ...), the state with larger average photon number. Similarly, there is also a smaller polarization towards state |1_L⟩ (contains n = 0, 4, ...) in the event of no photon loss. These opposite polarizations cancel out in the weighted deterministic state, resulting in a symmetric loss of contrast in the X and Y bases, which is a logical dephasing error. Also apparent in the deterministic data is a symmetric decrease in the Z polarization due to bit-flip errors from multi-photon loss events. Since we operate at the optimum point, these logical bit and phase flip errors are balanced, and the result is a uniformly-depolarizing error. A different choice of α can bias the logical error channel. This trade-off is explored more fully theoretically in Ref. [44].

VII. EFFECTIVELY DETERMINISTIC TWO-PHOTON ENTANGLEMENT
For the Hong-Ou-Mandel entanglement scheme, in the event we measure a parity outcome other than (even,even), we in principle still know the current state of both cavities. The parity outcomes in the event of single photon loss are (even,odd) and (odd,even), which occur with equal probability and project the cavities into the states |01⟩ and |10⟩. We may reload a photon in the empty cavity using the corresponding NOCP to rapidly re-prepare the initial state |11⟩ and try again to generate the desired Bell state within the same experimental realization. We can repeat this protocol indefinitely until we obtain the desired (even,even) parity outcome, effectively making this scheme deterministic. With this multi-round modification and keeping 100% of the data, we can reach an average Bell state fidelity of 0.88(1), comparable to the single photon Bell state generation scheme. If we impose a cutoff on the maximum number of rounds, we can boost this fidelity whilst maintaining a high success probability. This trade-off between fidelity and maximum allowed number of rounds is shown in Fig. S13. Whilst this scheme mitigates the effect of single photon loss errors in the bus, errors from undesired transmon excitation become increasingly prevalent at high round number and result in failure to reload photons or enact the beamsplitter, and in inaccuracies in the parity measurements. This is evidenced by the large number of rounds needed to reach failure probability near zero in Fig. S13. If the only error were photon loss in the bus, we would expect > 99% success probability within three rounds.
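The three-round estimate can be checked with a simple geometric model. The sketch below assumes each attempt succeeds independently with the single-round success probability of 79% quoted in the main text, and that every failure is a detected single photon loss followed by a perfect reset. Both are idealizations, and the function name is illustrative.

```python
def success_within(rounds, p_single=0.79):
    """Probability of at least one success in `rounds` independent attempts."""
    return 1.0 - (1.0 - p_single) ** rounds

for k in (1, 2, 3):
    print(k, success_within(k))
```

Under these assumptions the failure probability after three rounds is 0.21^3, just below 1%, consistent with the > 99% figure above.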
The control flow for this repeated entanglement scheme is shown in Fig. S9. The flow is broken into several blocks: "Initialize" (2,208 ns, performed only once), "Attempt" (2,364 ns, repeated for each attempt), "Reset" (774 ns on average, repeated for each attempt after the first one), and "Tomography" (2,280 ns, upon success). Each readout block includes an acquisition delay for internal controller and cabling delays (360 ns), an acquisition time (580 ns), and a delay for the controller state estimation to be ready for branching (220 ns). As explained in Subsec. IV A 2, the final tomography measurements have an additional ancilla e-f rotation (40 ns), a longer acquisition (640 ns), and no controller delay. The "Reset" length is not deterministic because of the differing lengths of the Fock state creation pulses and the differing number of decision branching steps (48 ns), so we quote the average. The primary parity measurement outcomes are "gg" (success), "eg," and "ge" (odd-even and even-odd, respectively), but there is a small (∼ 1%) probability to measure "ee" (odd-odd) due to measurement errors. In this case, we reset both ancillas and proceed as if the parity measurement were faithful, returning to the beamsplitter to try again. We could instead measure the parity again to confirm or reject these outcomes as failures, but since these events are rare, the flow taken in this case is not very important.
Taking into account these sequence lengths and the relative probabilities of the number of rounds to success (see Fig. S13), the average time to success when considering up to three rounds is 5205 ns, as quoted in the main text, not including tomography. The mean time to success for the fully-deterministic protocol (up to 147 rounds) is 6257 ns. This time is only slightly longer because the probability of success approaches one in only a small number of rounds.
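The timing bookkeeping can be sketched as follows. The block durations are those quoted for the control flow; the model (one "Initialize", one "Attempt" per round, and one average-length "Reset" before every round after the first) and the function names are our simplifying assumptions. The per-round success probabilities of Fig. S13 are not reproduced here, so `round_probs` is left as an input.

```python
T_INIT = 2208     # ns, "Initialize", performed once
T_ATTEMPT = 2364  # ns, "Attempt", every round
T_RESET = 774     # ns, average "Reset", every round after the first

def time_to_success(rounds):
    """Elapsed time (ns) if the scheme succeeds on the given round, excluding tomography."""
    return T_INIT + rounds * T_ATTEMPT + (rounds - 1) * T_RESET

def mean_time_to_success(round_probs):
    """Average time given P(success on round k), k = 1, 2, ..., renormalized over kept rounds."""
    total = sum(round_probs)
    return sum(p * time_to_success(k)
               for k, p in enumerate(round_probs, start=1)) / total

def implied_mean_rounds(mean_time_ns):
    """Invert the linear model: mean number of rounds corresponding to a mean time."""
    return 1 + (mean_time_ns - time_to_success(1)) / (T_ATTEMPT + T_RESET)

print(time_to_success(1), implied_mean_rounds(5205), implied_mean_rounds(6257))
```

In this model a first-round success takes 4572 ns, and the quoted mean times of 5205 ns and 6257 ns correspond to roughly 1.20 and 1.54 rounds on average.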

A. Full state transfer data
Wigner tomograms of all six states of the Fock encoding, measured as prepared in cavity 1 and received in cavity 2, are presented in Fig. S10. Wigner tomograms of all six states of the cat encoding, measured as prepared in cavity 1, received in cavity 2 without parity measurement, and sorted by parity after measurement, are presented in Fig. S11. Decoded tomograms are shown in Fig. S14. State fidelities for all datasets are presented in Table S3.

B. Full entanglement data
All six conditional Wigner tomograms for the single- and two-photon entangled states are presented in Fig. S12. Results of multi-round two-photon entanglement are shown in Fig. S13.

C. Sample parameters
Measured sample parameters for both modules can be found in Table S2.

TABLE S3. State transfer fidelities. Transfer fidelities of the reconstructed states shown in Figs. S10, S11, and S14. Fidelities obtained from Wigner function reconstruction and from decoding onto the ancilla are presented. For the cat encoding, fidelities for each parity outcome, as well as the probability of measuring odd parity, are provided. Fidelity is also given for the cat code without a syndrome measurement. Weighted fidelity is the deterministic average fidelity of that state, weighted by the probability of the parity outcome.