Photonic architecture for scalable quantum information processing in NV-diamond

Physics and information are intimately connected, and the ultimate information processing devices will be those that harness the principles of quantum mechanics. Many physical systems have been identified as candidates for quantum information processing, but none of them are immune from errors. The challenge remains to find a path from the experiments of today to a reliable and scalable quantum computer. Here, we develop an architecture based on a simple module comprising an optical cavity containing a single negatively-charged nitrogen vacancy centre in diamond. Modules are connected by photons propagating in a fiber-optical network and collectively used to generate a topological cluster state, a robust substrate for quantum information processing. In principle, all processes in the architecture can be deterministic, but current limitations lead to processes that are probabilistic but heralded. We find that the architecture enables large-scale quantum information processing with existing technology.


I. INTRODUCTION
Quantum computers promise to surpass even the fastest classical computers, but the task of building a quantum computer presents a significant challenge. Even if they are precisely engineered, quantum systems will inevitably suffer from decoherence and other errors. If these errors are sufficiently rare and not too strongly correlated, then they can be suppressed with quantum error correction [1]. The role of quantum computer architecture is to integrate quantum error correction with feasible experimental technology, to find a path to a reliable and scalable quantum computer. In this context, of the many physical systems identified as candidates for quantum information processing [2], the negatively-charged nitrogen vacancy (NV$^-$) centre in diamond [3][4][5] features a number of desirable properties [6][7][8][9][10]. For example, the NV$^-$ centre possesses both a nuclear spin and an electron spin: the nuclear spin can serve as a memory to store quantum information for relatively long times [11], and the electron spin can be coupled to a photon to serve as a flexible interface with other NV$^-$ centres [12]. The experimental feasibility of this system has been well established in recent years. Experiments have demonstrated individual electron and nuclear spin initialisation, manipulation, and measurement [13][14][15][16][17][18][19][20][21][22][23], long-lived nuclear memories [11], a coherent interface between an electron spin and an optical field [12], and optical cavities containing NV$^-$ centres [24][25][26]. State-dependent reflectivity has been demonstrated with atoms [27], though not yet with NV$^-$ centres. At the same time, new techniques for quantum error correction have lessened experimental requirements [28][29][30].
Here, we develop a quantum computer architecture based on a simple module comprising an optical cavity containing a single NV$^-$ centre in diamond. Modules are connected by photons propagating in a fiber-optical network. The cavities mediate interactions between the photons and the electron spins, enabling entanglement distribution and readout. The electron spins are coupled to nuclear spins, which constitute long-lived quantum memories where quantum information is stored and processed. Aside from modules connected by optical fibers, the other elements of the architecture are single-photon detection devices and classical control lines. These elements are laid out in a regular two-dimensional array, with sufficient connectivity between modules to enable topological cluster-state error correction [31][32][33]. This arrangement is independent of the size of the network. At a circuit level, we find the maximum tolerable error per elementary quantum gate to be approximately 0.73%. By analysing the architecture at the physical level, we also estimate how well each component of the module must operate for the system to meet this threshold and be truly scalable. The results of this analysis indicate that the architecture is consistent with present technology and might be achievable in the near future.

II. FUNDAMENTAL BUILDING BLOCKS
Our approach can be adapted to a variety of promising physical systems, such as ions, neutral atoms, and quantum dots [12,34-37], and for this reason, we begin with a general description of the fundamental module. However, to show that the module can form the basis of a truly scalable architecture, we focus on a concrete implementation using NV$^-$ centres.
We begin our description of the architecture with an entanglement scheme based on the state-dependent reflectivity of a module consisting of an emitter-cavity system [38,39], as depicted in Fig. 1. We can describe the emitter as a four-level system with transitions $|0\rangle \to |0_E\rangle$ and $|1\rangle \to |1_E\rangle$, with frequencies $\omega_0$ and $\omega_1 = \omega_0 + \delta$, respectively. The probability for a photon to be reflected by a module with cooperativity $C$ and the cavity tuned to the interrogation frequency $\omega_0$ is given by [40]
$$P_R = 1 - \frac{1 + 4C + (\delta/\gamma)^2}{1 + 4C + 4C^2 + (\delta/\gamma)^2}. \tag{1}$$
We have assumed a cavity with matched mirrors, in which case an impinging photon will be reflected by the module with high probability if the emitter is in the ground state $|0\rangle$ and the cooperativity is $C \gg 1$. In the case of large detuning, $(\delta/\gamma)^2 \gg C^2 \gg C \gg 1$, the cavity is effectively empty and the reflection probability approaches $P_R \to 0$. In the simplest variant of our entanglement scheme (Fig. 1), we place two such modules at the output ports of a 50:50 beamsplitter and prepare each emitter in an equal superposition of the ground states $|0\rangle$ and $|1\rangle$. A single photon is then sent onto the beamsplitter. If it is subsequently detected at the "dark" port of the beamsplitter, the emitters are projected onto the maximally entangled state with success probability $p = \eta^2/8$, where the collection efficiency $\eta^2$ includes the effects of inefficient sources and detectors and transmission losses. This probability may appear to be low; however, the generated entangled state has extremely high fidelity (>99%) and is robust to imperfections (see supplementary material). For instance, imbalance in the cavity reflection coefficients slightly reduces the success probability but does not degrade the fidelity of the resulting state. The low success probability can be overcome using a repeat-until-success approach to establish an entanglement link with high probability [41,42]. We will show in the following sections that the scheme not only exhibits high fidelity in the presence of physical imperfections, but also, unlike other approaches, does not involve any catastrophic errors.
FIG. 1. Schematic representation illustrating the module and the entanglement distribution scheme. The module contains an optical cavity with a four-level system. The entanglement distribution scheme is based on a Michelson interferometer where two modules are connected via an optical fiber. A single photon comes in from the right port and is conditionally reflected at each module depending on the state of the emitter. Erasing the path information at the beam splitter followed by detection at the dark port projects the system to the singlet Bell state.
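The two limits of Eq. (1) quoted above can be checked numerically. The following is a minimal sketch, not part of the original analysis; the function name and the parameter values ($C = 50$, $\delta/\gamma = 10^4$) are illustrative assumptions chosen only to show the resonant and far-detuned regimes.

```python
def reflection_probability(cooperativity: float, detuning_ratio: float) -> float:
    """Eq. (1): probability that a photon is reflected by the module.

    cooperativity  -- C
    detuning_ratio -- delta / gamma (0 when the emitter transition is resonant)
    """
    x = detuning_ratio ** 2
    numerator = 1.0 + 4.0 * cooperativity + x
    denominator = 1.0 + 4.0 * cooperativity + 4.0 * cooperativity ** 2 + x
    return 1.0 - numerator / denominator


if __name__ == "__main__":
    C = 50.0  # illustrative cooperativity (assumption)
    # Resonant emitter: P_R reduces to (2C / (1 + 2C))^2, close to 1 for C >> 1.
    print("resonant emitter:   ", reflection_probability(C, 0.0))
    # Far-detuned emitter with (delta/gamma)^2 >> C^2: P_R approaches 0.
    print("far-detuned emitter:", reflection_probability(C, 1.0e4))
```

Algebraically, Eq. (1) is equivalent to $P_R = 4C^2/[(1+2C)^2 + (\delta/\gamma)^2]$, which makes both limits explicit.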
In addition, the module enables (near) perfect non-demolition measurement of the qubit state. For an architecture for quantum computation, we require a second qubit in the cavity to act as a quantum memory. Ideally, the coupling between our four-level system and this memory qubit can be switched on and off as required. This allows the four-level system to be reused for entanglement creation, now with a third module. By repeating this process with additional modules we can generate a cluster state suitable for fault-tolerant quantum computation. In the following we will detail this architecture by describing a full implementation using single NV$^-$ centres in microcavities connected in a photonic network.

III. THE DIAMOND MODULE
Let us now turn our attention to a concrete implementation: a fiber-connected optical cavity containing a single NV$^-$ centre, the energy levels of which are depicted in Fig. 2a. The lowest three electron spin states, $|m_s = 0, \pm 1\rangle \equiv |0\rangle, |\pm 1\rangle$, form the spin-1 $^3A_2$ manifold, which has a zero-field splitting of 2.87 GHz. With an externally applied magnetic field $B \sim 20$ mT, our electron spin qubit levels $|0\rangle$ and $|+1\rangle$ are far detuned from the $|-1\rangle$ energy level and so form an excellent qubit. The isotope $^{15}$N will be utilised as a spin-1/2 nuclear memory. Next, the optical transitions between one of the $^3A_2$ magnetic sub-levels $|i\rangle$ and the $^3E$ levels $|M_i\rangle$ coupled to the cavity field can be represented by
$$\sum_{i=1\ldots6} g_{m_s,i} \left( a^\dagger |i\rangle\langle M_i| + a\,|M_i\rangle\langle i| \right),$$
where $g_{m_s,i}$ are the coupling constants between the transitions and the field, $a^\dagger$ ($a$) is the field's creation (annihilation) operator, and $M_i$ are the energy eigenstates, in order of ascending energy, within the $^3E$ manifold. At zero strain they are given by the basis states $\{M_{1\ldots6}\} = \{E_2, E_1, E_x, E_y, A_1, A_2\}$, neglecting a small mixture of the $E_{x,y}$ and $E_{1,2}$ states due to spin-spin interaction. The basis states $E_x$ and $E_y$ have electronic spin zero, while the others ($A_{1,2}$ and $E_{1,2}$) are equal superpositions of spin $\pm 1$ [12,44]. For our scheme, we apply an electric field in the x-direction ($E_x$) to lift the degeneracy of the spin-zero states in the excited-state manifold. This greatly reduces the sensitivity to rogue strain or electric field influences in the y-direction and thus makes the system more robust. $E_x$ can be adjusted at each site to bring different NV$^-$ centres to the same resonance frequency. We choose $|0_E\rangle \approx |E_x\rangle$ and $|1_E\rangle = |M_5\rangle \approx 0.98|A_1\rangle + 0.17|A_2\rangle$, where we have omitted negligible contributions from other basis states. For this setting, we find $\delta = 2\pi \times 2.71$ GHz, which is far greater than the homogeneous optical half-width of the chosen transitions, $\gamma = 2\pi \times 11$ MHz. We note that although the NV$^-$ is not a simple four-level system (Fig. 2a), all other allowed transitions are detuned even further from the excitation frequency $\omega$ and can be neglected. Thus we have the properties required for entanglement distribution based on state-selective reflection using the NV$^-$ centre electron spin states $|0\rangle$ and $|+1\rangle$.
FIG. 2. The NV$^-$ centre is shown as a definite example of the artificial atom used to realise the module. a) Its energy level structure for a low-temperature, low-strain sample [12,43]. A static magnetic field of approximately 20 mT is used to separate the $m_s = \pm 1$ levels. The NV$^-$ centre possesses both an electron spin and a $^{15}$N nuclear spin, which will be used to store and grow a cluster state for quantum information processing. b) How the storage of entanglement in the nuclear spins is achieved. The nuclear spin needs to be prepared in the superposition state $|n_+\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$ before the protocol starts. During this operation, the electron spin is in the polarised state $|0\rangle$, hence the hyperfine coupling is effectively turned off. When the electron spins rotate to $\frac{1}{\sqrt{2}}(|0\rangle + |+1\rangle)$ for the entanglement distribution scheme, the clock associated with the hyperfine coupling starts. A spin-echo sequence can be used to decouple the electron and nuclear spins where necessary; for instance, when the entanglement distribution fails, we need to decouple the electron and nuclear spins before re-initialising the electron spin and attempting the protocol again until success.

A. Quantum non-demolition detection
The conditional reflection of a photon from a module allows us to perform a quantum non-demolition (QND) measurement of the NV$^-$ state [45] (see supplementary material). The measurement sequence consists of a photon measurement followed by a qubit flip and a second photon measurement. For the photon measurement, a single photon is sent to a module, and will be reflected and detected if the NV$^-$ centre is in the state $|0\rangle$, and lost otherwise. The qubit flip is achieved by a microwave $\pi$-pulse. A photon detection would be expected with certainty for one of the photon measurements under ideal conditions, while the absence of a detection event would indicate leakage of the NV$^-$ centre from the qubit subspace to the $|-1\rangle$ state. The destructiveness of this measurement depends on the probability of exciting the NV$^-$ centre and the subsequent spin-flip probability. The measurement needs to undergo several repetitions to make up for finite photon collection efficiency, thereby increasing the spin-flip probability. Nonetheless, we find that it is possible to achieve a QND measurement error rate of 0.1% even for a finite collection efficiency of $\eta^2 = 0.3$, which is sufficient for fault-tolerant computation (see Section V). For unity detection efficiency, the QND measurement error rate reaches 0.01%, but cannot be reduced further in our current scheme due to the non-zero spin-flip probability for each measurement. This is due in part to off-resonant excitations of the NV$^-$ centre in the cavity, the probability of which increases with cooperativity. This leads to a working cooperativity of $C \sim 50$, which is realistically achievable with currently available microcavity technology.
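The decision logic of a single measurement round can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the loss model (independent rounds, each missing the photon with probability $1 - \eta^2$) is a simplifying assumption that ignores the spin-flip probability discussed above.

```python
def interpret_qnd(click_first: bool, click_second: bool) -> str:
    """Map the two photon-detection outcomes of one QND round to a verdict.

    Sequence assumed from the text: photon measurement, microwave pi-pulse
    (qubit flip), second photon measurement. A photon is reflected and
    detected only when the electron spin is in |0>.
    """
    if click_first and not click_second:
        return "0"             # spin was |0> before the flip
    if click_second and not click_first:
        return "+1"            # spin was |+1>, flipped to |0> by the pi-pulse
    if not click_first and not click_second:
        return "no-detection"  # photon loss, or leakage to |-1>
    return "inconsistent"      # two clicks are not expected in the ideal model


def no_detection_probability(collection_efficiency: float, rounds: int) -> float:
    """Chance that `rounds` independent rounds all miss the photon (loss only)."""
    return (1.0 - collection_efficiency) ** rounds


if __name__ == "__main__":
    print(interpret_qnd(True, False))
    print(interpret_qnd(False, True))
    # Assumed loss-only model with eta^2 = 0.3, as quoted in the text.
    print(no_detection_probability(0.3, 10))
```

Under this loss-only assumption, repeating the round drives the no-detection probability down geometrically, which is why several repetitions are needed at finite collection efficiency.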

B. Remote entanglement
We begin by initialising each electron spin to $|0\rangle$ followed by rotating to $\frac{1}{\sqrt{2}}(|0\rangle + |+1\rangle)$ using a polarised driving field in a few nanoseconds. A single-photon pulse is then sent onto the interferometer (Fig. 2b) and the dark port monitored. We repeat this procedure until the entanglement is heralded by the successful detection of a photon at the dark port. This is made possible by the good cycling properties of the NV$^-$ transition $|0\rangle \to |E_x\rangle$ [12]. We note furthermore that the de-ionization process NV$^-$ $\to$ NV$^0$, and the resulting dynamical spectral diffusion, is rendered impossible by using only one single-photon excitation in the interferometer at a time [12].
In the NV$^-$ implementation of our module, the nuclear spin is a long-lived quantum memory which, in our architecture, is designated to store one node of a cluster state [47]. Our scheme creates entanglement between the electrons of the two NV$^-$ centres. The transfer of the entanglement to the nuclear spin memories is done through the Ising component of the hyperfine coupling ($A_\parallel \sim 3.03$ MHz [46]), which is tuned by the external magnetic field of $B \sim 20$ mT to give a conditional phase on the state of the two spins. The amount of entanglement oscillates in time from zero to maximum. At a time $\tau$ corresponding to a $\pi$ point of the oscillation, the effective gate becomes a controlled-phase gate, while at the $2\pi$ point it gives the identity. The hyperfine coupling is always present but is effectively turned off while the electron spin is in the polarised state $|0\rangle$.
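The oscillation of the electron-nuclear entanglement can be illustrated with a small calculation. The sketch below assumes a pure Ising interaction $H = A\,S_z I_z$ acting on the electron qubit $\{|0\rangle, |+1\rangle\}$ and a spin-1/2 nucleus, starting from the product state $|+\rangle|+\rangle$; this model and the function name are assumptions made for illustration, and the exchange part of the coupling is ignored.

```python
import cmath
import math


def concurrence_after(theta: float) -> float:
    """Concurrence of the electron-nuclear state after a phase theta = A * t.

    Assumed model: H = A * S_z * I_z restricted to the electron qubit
    {|0>, |+1>} (S_z = 0, 1) and a nuclear spin-1/2 (I_z = +1/2, -1/2),
    starting from the product state |+>|+>.
    """
    # Amplitudes ordered as |0,up>, |0,down>, |+1,up>, |+1,down>.
    a = 0.5
    b = 0.5
    c = 0.5 * cmath.exp(-0.5j * theta)
    d = 0.5 * cmath.exp(+0.5j * theta)
    # For a pure two-qubit state the concurrence is 2 |a d - b c|.
    return 2.0 * abs(a * d - b * c)


if __name__ == "__main__":
    for label, theta in [("0", 0.0), ("pi", math.pi), ("2*pi", 2.0 * math.pi)]:
        print(f"theta = {label}: concurrence = {concurrence_after(theta):.3f}")
```

In this model the concurrence equals $|\sin(\theta/2)|$: it is maximal at the $\pi$ point, where the evolution acts as a controlled-phase gate up to local operations, and vanishes again at the $2\pi$ point.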
Putting this together, the complete nuclear spin entanglement protocol begins with both electron spins and both nuclear spins polarised in their ground states (achieved via the quantum non-demolition measurement). The electron spin is then rapidly rotated to the $|+\rangle$ state via a $\pi/4$ Y-rotation. An effective controlled-NOT operation is then performed between the nucleus and the electron via the hyperfine interaction, at which point the electron is again rotated by $\pi/4$ around the Y-axis and measured in the computational basis. This initialises the nuclear spin into the $|n_+\rangle$ state. We then rotate the electron back into the $|+\rangle$ state to attempt an electron-electron bond via the optical transitions. The hyperfine coupling turns on when the photonic entangling protocol is initiated by the electron spin rotation, but a spin-echo-like sequence can be used to disentangle the electron and nuclear spins at any time we require. If the gate has succeeded, we perform a $\pi/4$ Y-rotation on one of the two electron spins, and wait until the hyperfine interaction maximally entangles the electron and nuclear spins within each node. A $\pi/4$ Y-rotation is then performed on the electron spin of each module, followed by its measurement in the computational basis. This completes the transfer of the entangled link to the nuclear spins. If the entanglement distribution has failed, the protocol will be repeated until a success is heralded, as illustrated in Fig. 2b. We note that it is not necessary to reinitialise the nuclear spin prior to each attempt.

IV. SHARING ENTANGLED STATES BETWEEN THREE MODULES
The next step is to extend our cluster of two nuclear spin qubits to three (by adding one). We begin with an entangled pair stored in the nuclear spins of modules A and B as shown in Fig. 3. A new entanglement bond on the electron spins in modules B and C is created using the same repeat until success protocol, though only the nuclear spin in C will be initialised. Once the entanglement between the electronic qubits is created, the entanglement will be transferred to the nuclear spins using the hyperfine coupling described previously.
This time, the nuclear spin in module B is in use, carrying information established at the beginning of the protocol. Photon loss may feed back via the permanent hyperfine coupling, introducing catastrophic errors in the states stored in the nuclear spins in modules A and B. For the protocol to be useful, we should be able to preserve with high fidelity the existing entangled states stored in the nuclear spins of A and B, while using the electron spin in B to create new entanglement with module C. By introducing a time-sequenced entangling procedure we can avoid decoherence caused by photon loss. Furthermore, by using spin-echo-like sequences to decouple the electron spins from their surrounding environment we may extend their coherence time. The clock for the hyperfine coupling sequence starts when the photonic entangling protocol is initiated (that is, when the electron spin is rotated out of the polarised $|0\rangle$ state). If the entangling protocol fails, the system waits until the spin-echo sequence decouples the electron and nuclear spins. At this point the nuclear system recovers coherence and the information stored on the nuclear spin remains untouched until the protocol succeeds. This process is illustrated in Fig. 3. Once the new entangling bond is established, indicated by a heralding signal, we again wait until the spin echo decouples the electron and nuclear spins. We then perform a single $\pi/4$ Y-rotation on one of the two electron spins, and wait until the hyperfine interaction maximally entangles the electron and nuclear spins within each of the nodes. An X-basis measurement is performed on each electron (via a $\pi/4$ Y-rotation and computational basis readout) to transfer the new bond to the nuclear system.
FIG. 3. The repeat-until-success protocol is accurately time sequenced. This is required by the nature of the coupling, as entanglement between the electron spin and nuclear spin oscillates. Upon failure, we wait until the $2\pi$ point in the entangling cycle, where the nucleus and electron are decoupled. The nuclear spin is consequently protected from feedback errors through the hyperfine coupling by accurately timing the re-initialisation of the electron spins. When the distribution of entanglement between two electrons succeeds, the entanglement bond will be transferred to the nuclear spins by waiting until a $\pi$ point where the electron and nuclear spins are maximally entangled.
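The control flow of the time-sequenced procedure can be summarised in a short sketch. The function `establish_bond`, its `attempt` callback, and the `max_attempts` cut-off are hypothetical names introduced here for illustration; the step descriptions paraphrase the text and carry no timing information.

```python
from typing import Callable, List


def establish_bond(attempt: Callable[[], bool], max_attempts: int) -> List[str]:
    """Return the ordered list of control steps for one heralded bond.

    `attempt` is a hypothetical callback that fires one single-photon
    entangling attempt and returns True when the dark port clicks.
    """
    log: List[str] = []
    for _ in range(max_attempts):
        log.append("rotate electrons; start hyperfine clock")
        if attempt():
            log.append("heralded: wait for decoupling point")
            log.append("rotate one electron; wait for pi point")
            log.append("measure electrons in X basis; bond now on nuclei")
            return log
        log.append("failed: wait for 2*pi point, then re-initialise electrons")
    # Assumption: after max_attempts the bond is reported as missing.
    log.append("gave up: bond reported missing")
    return log


if __name__ == "__main__":
    outcomes = iter([False, False, True])  # illustrative heralding record
    for step in establish_bond(lambda: next(outcomes), max_attempts=5):
        print(step)
```

The point of the sketch is that a failed attempt never touches the nuclear spin before the decoupling point is reached, so the stored entanglement survives any number of retries.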
Repeating this with additional modules we can generate an arbitrary cluster state. We are particularly interested in generating the three-dimensional topological cluster state (illustrated in Fig. 4a) capable of supporting fault-tolerant quantum computation [31,32]. Topological models of error correction [48,49] exhibit relatively high tolerance to errors and are particularly well suited to architectures due to their simple underlying structure [10,50-53]. The topological cluster state is particularly useful in the context of our repeat-until-success protocol as it is inherently robust against missing bonds, which will be heralded. These missing bonds can be processed in the classical interpretation of measurement results, without any modification to the quantum circuit [33]. To prepare the topological cluster state, each physical qubit is entangled with its four nearest neighbours; hence a dagger-shaped cluster state, highlighted by the blue bonds in Fig. 4b, is the fundamental unit, independent of the size of the network. Four entangling steps are required to create this fundamental state with five modules.
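The four-neighbour connectivity can be made concrete by constructing the bond graph of the three-dimensional cluster state. The sketch below uses a common coordinate convention for this lattice (qubits on the faces and edges of a cubic lattice, with all-even and all-odd sites left empty) and periodic boundaries; the convention and the function names are assumptions, since the text does not fix a labelling.

```python
from itertools import product
from typing import Dict, List, Tuple

Site = Tuple[int, int, int]


def is_qubit(site: Site) -> bool:
    """A site holds a qubit unless its coordinates are all even or all odd."""
    odd = sum(c % 2 for c in site)
    return odd in (1, 2)


def cluster_neighbours(cells: int) -> Dict[Site, List[Site]]:
    """Bonds of a periodic lattice of cells x cells x cells unit cells.

    Use cells >= 2 so that the +1 and -1 neighbours are distinct sites.
    """
    size = 2 * cells
    graph: Dict[Site, List[Site]] = {}
    for site in product(range(size), repeat=3):
        if not is_qubit(site):
            continue
        nbrs: List[Site] = []
        for axis in range(3):
            for step in (-1, 1):
                other = list(site)
                other[axis] = (other[axis] + step) % size
                candidate = (other[0], other[1], other[2])
                if is_qubit(candidate):
                    nbrs.append(candidate)
        graph[site] = nbrs
    return graph


if __name__ == "__main__":
    graph = cluster_neighbours(2)
    degrees = {len(nbrs) for nbrs in graph.values()}
    print("qubits:", len(graph), "degrees:", degrees)
```

With periodic boundaries every qubit has exactly four bonds, so the five-qubit dagger (one centre plus its four neighbours) tiles the whole lattice regardless of its size.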

V. BENCHMARKING THE PHOTONIC ARCHITECTURE
To process quantum information with a three-dimensional topological cluster state, the state is consumed by measurements on physical qubits in sequential two-dimensional layers, where one axis is defined as the temporal axis. These measurements create and manipulate encoded qubits defined by defects [31,54]. As the computation proceeds by measuring one layer at a time, the whole topological cluster state does not need to be constructed initially. Only two successive layers need to be prepared and stored at any given time, allowing us to concentrate on only two physical layers of modules. The current state of the computer is teleported back and forth between these two layers, which are refreshed and recycled to generate the entire topological cluster. Taking the centre of each cell (in Fig. 4b), we initiate a sequence of gates to generate the dagger-shaped cluster state throughout the lattice, which generates one layer of the topological cluster state (see supplementary material). The two layers of the module network are flattened to a two-dimensional plane, as shown in Fig. 4c. This pattern repeats to an arbitrarily large cross section.
At a circuit level, we are interested in the threshold error rate, below which the architecture becomes fault tolerant [31]. The projective measurement of the nucleus (via the electron-nucleus hyperfine interaction) allows us to combine measurement and reinitialisation of the nuclear qubit in a single step. Therefore, the depth of the quantum circuit to prepare the topological cluster state is reduced from six steps to five. We find that this reduction increases the error threshold to 0.73% (see Fig. 5a). Given this threshold, our target error rate for the five relevant gates is ∼ 0.1%, as this is sufficiently far below the threshold to allow significant suppression of errors using a practical number of modules [54].
The target error rate does not tell us much until it is decomposed into each physical component. Each gate consists of several physical steps and involves several sources of errors. In our case, these sources are parameterised by the nuclear and electron spin decoherence times, electron measurement efficiency, electron rotation efficiency, and timing error. As described, the sequence to generate an entangling bond is probabilistic, and the protocol repeats until success. Given that we require bonds to succeed with probability $P = 99.9$%, if the success probability of a single attempt is $p_c$, the number of attempts we require is $s = \log(1 - P)/\log(1 - p_c)$. For $p_c = 0.0625$, $s = 107$. We consider the error rate for each gate to be the worst-case scenario, as heralded failure can be significantly higher than the error rate for unheralded errors [33].
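The attempt count follows directly from the formula above. The sketch below evaluates it as a real number; the function names are illustrative, and treating attempts as independent with a fixed $p_c$ is the same assumption the formula itself makes.

```python
import math


def attempts_required(target_success: float, single_attempt_success: float) -> float:
    """s = log(1 - P) / log(1 - p_c), returned as a real number."""
    return math.log(1.0 - target_success) / math.log(1.0 - single_attempt_success)


def success_after(attempts: int, single_attempt_success: float) -> float:
    """Probability that at least one of `attempts` independent tries succeeds."""
    return 1.0 - (1.0 - single_attempt_success) ** attempts


if __name__ == "__main__":
    p_c = 0.0625
    print("s for P = 99.9%:", attempts_required(0.999, p_c))
    print("s for P = 99.0%:", attempts_required(0.990, p_c))
    print("P after 107 attempts:", success_after(107, p_c))
```

For $p_c = 0.0625$ the formula evaluates to roughly 107.0 for $P = 99.9$% and roughly 71.4 for $P = 99.0$%, in line with the rounded values $s = 107$ and $s = 71$ quoted in the text.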
The required fidelity for each physical parameter is shown in Fig. 5b. Each curve is plotted assuming it is the only non-zero error, except for possible errors arising from the absorption of photons by the NV$^-$ node (see supplementary material). The green region in Fig. 5b is the target for each parameter for an operational computer (though parameters in the yellow region still lead to gates below the threshold). For the architecture to be fault tolerant, these errors need to be combined (see supplementary material). Electron and nuclear decoherence is already sufficiently low [11,55-57], while the other parameters still need improvement. However, it is important to note that the required improvements are less than one order of magnitude, and are not constrained by any currently known fundamental limitation of the NV$^-$ system itself.
Assuming that the threshold condition is met, performance is mostly dependent on the computational cycle time, which is limited by the time taken to establish all the electron-electron connections. For bond connections with $P = 99.9$%, the total time required to create a nuclear-nuclear bond is 3.5 µs, assuming $p_c = 6.25$%. This time could be reduced by lowering the required connection efficiency and exploiting the robustness of the topological code to missing bonds [33]. The quantum circuit takes five steps to construct each cross-sectional layer of the topological cluster state. Hence, a unit cell of the cluster is prepared every ∼30 µs. To implement an algorithm on the computer, we create pairs of defects in the cluster. The volume of cluster allocated to pairs of defects represents the degree of error correction, parametrised by the distance between defects, $d$. For a logical error rate $p_L \le 10^{-18}$, $d \ge 32$ is required [54]. Therefore, a logical cell requires $V = (5d/4)^3 = 40^3$ cluster cells. To perform a logical CNOT gate requires a cluster volume $2 \times 2$ in cross section and 2 logical cells in temporal depth. Hence, a logical CNOT gate takes 3.4 ms for $p_c = 6.25$% (a clock frequency of ∼295 Hz). This rate can be further improved by better optical efficiencies, but is ultimately limited by the hyperfine interaction of the NV$^-$ node used for nuclear spin operations. If we assume a deterministic electron-electron connection, a logical CNOT gate would take approximately 960 µs (∼1 kHz) as the system becomes rate limited by nuclear measurement (see supplementary material).
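The resource arithmetic in this paragraph is easy to reproduce. The following sketch uses only numbers quoted in the text ($d = 32$, gate times of 3.4 ms and 960 µs); the function names are illustrative.

```python
def logical_cell_volume(distance: int) -> float:
    """V = (5 d / 4)^3 cluster cells for defect separation d."""
    return (5.0 * distance / 4.0) ** 3


def clock_frequency_hz(gate_time_seconds: float) -> float:
    """Logical clock rate if one logical CNOT takes `gate_time_seconds`."""
    return 1.0 / gate_time_seconds


if __name__ == "__main__":
    print("cluster cells per logical cell:", logical_cell_volume(32))
    print("clock rate, heralded bonds (Hz):", clock_frequency_hz(3.4e-3))
    print("clock rate, deterministic bonds (Hz):", clock_frequency_hz(960e-6))
```

For $d = 32$ the side length is 40 cluster cells, so a logical CNOT spanning two logical cells in time covers 80 cluster cells of temporal depth.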

VI. DISCUSSION
As we have seen, a simple module can form the basis of a scalable quantum computer architecture. The architecture is naturally distributed, and hence is applicable to quantum communication [58]. Such a network may be local or global, with local networks connected by quantum communication channels. In this case, the distance between the modules may become orders of magnitude larger. The time delay due to the communication distance may be mitigated by the long-lived memory inside the module. With increased distance between modules, photon loss would increase, reducing the success probability of the entangling protocol. However, long-distance communication does not necessarily require $P = 99.9$%. Instead, with $P = 99.0$%, the number of attempts can be reduced to $s = 71$ for $p_c = 6.25$%.
FIG. 4. b) The physical unit cell is composed of two layers. The back layer contains eight connected qubits arranged in a square (orange), while the front layer has five qubits arranged in a cross (blue). The two layers are connected by controlled-phase gates (green). Measurement of the front layer of the cluster will teleport the current state of the computer to the back layer, at which point the physical qubits we just measured can be reconnected in accordance with the geometry of the cluster state, and the information can be teleported back again. In this way, the two physical layers execute the even and odd temporal steps of the computation, allowing an arbitrarily deep computation to be performed with a fixed number of physical qubits. c) A compact layout of modules on a two-dimensional plane.
We have found that the physical requirements of our architecture are broadly consistent with present technology. However, improvements are still required, in particular to the measurement efficiency. While technological developments might help to meet these requirements, the physical requirements may also be found to be less stringent with a more sophisticated adaptive error analysis.
FIG. 5. a) The logical error rate is plotted as a function of the physical error rate for various code sizes (distances $d$), where we have assumed that all gates and measurements are operating at the same error rate. Each point corresponds to at least $10^4$ trials. The value of the physical error rate at the intersection gives the threshold (in this case, approximately $7.3 \times 10^{-3}$). For physical error rates below this threshold, the logical error rate can be reduced arbitrarily by increasing the code distance. b) The required fidelity for each physical parameter. The dots on the lines show the current best accuracy reported, all of which already meet the required accuracy, 99.27%. For a realistic implementation, the gate fidelity should be above 99.9%, corresponding to the green coloured regime in the plot. Electronic and nuclear spin coherence times are already in this regime, and the remaining parameters may soon meet the desired accuracy given the rapid development of quantum control of such systems. The fidelity does not converge to unity due to imperfections in the NV$^-$ centre.
The dynamics of the NV$^-$ centre, consisting of the electron spin-1 $^3A_2$ manifold and the nuclear spin-1/2 system, can be described by the Hamiltonian $H = H_e + H_n + H_{e-n}$. The electron spin's ground state Hamiltonian is given by [59,60]
$$H_e = D S_z^2 + E\,(S_x^2 - S_y^2) + g_e \mu_B B S_z,$$
which represents a zero-field splitting ($D/2\pi = 2.87$ GHz), a strain-induced splitting ($E/2\pi \sim 1$-$10$ MHz), and a magnetic-field-induced splitting ($g_e \mu_B B$), where $\mu_B$ is the Bohr magneton and $g_e = 2.0$ is the g-factor.
In this Hamiltonian, $S_z$, $S_x$, $S_y$ are the usual spin-1 operators. With an externally applied magnetic field $B \sim 20$ mT, our electron spin qubit levels $|0\rangle$ and $|+1\rangle$ are far detuned from the $|-1\rangle$ energy level, supporting our electron spin qubit. The nuclear spin Hamiltonian $H_n = -g_n \mu_n B I_z$ represents a magnetic-field-induced splitting of the $^{15}$N nuclear spin, where $\mu_n$ is the nuclear magneton and $g_n = -0.566$ the nuclear g-factor. $I_z$ is the usual Pauli Z spin-1/2 operator. The hyperfine coupling between the electron and the nuclear spins is given by [46]
$$H_{e-n} = A_\parallel S_z I_z + \frac{A_\perp}{2}\left(S_+ I_- + S_- I_+\right),$$
where $S_\pm$ ($I_\pm$) are the electron spin (nuclear spin) raising and lowering operators respectively. This coupling includes an Ising part with coupling strength $A_\parallel/2\pi \sim 3.03$ MHz and an exchange part with coupling constant $A_\perp/2\pi \sim 3.65$ MHz [46]. With $B \sim 20$ mT the exchange coupling is far off-resonance ($\lambda$ being the frequency difference between the electron and nuclear spin levels), resulting only in a small dispersive phase shift. This results in a natural controlled-phase gate that operates on a time scale of ∼165 ns.
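The quoted time scale and the size of the Zeeman terms can be checked with a few lines of arithmetic. The sketch below assumes the controlled-phase gate is reached at the $\pi$ point of the Ising coupling, $t = \pi/A_\parallel$, and uses rounded values of the Bohr and nuclear magnetons divided by Planck's constant; the constant and function names are illustrative.

```python
import math

MU_B_OVER_H = 13.996e9  # Bohr magneton / h, in Hz per tesla (rounded)
MU_N_OVER_H = 7.6226e6  # nuclear magneton / h, in Hz per tesla (rounded)


def cz_gate_time(a_parallel_hz: float) -> float:
    """Time to reach the pi point, t = pi / A with A = 2 * pi * a_parallel_hz."""
    return math.pi / (2.0 * math.pi * a_parallel_hz)


def zeeman_shift_hz(g_factor: float, magneton_over_h: float, field_tesla: float) -> float:
    """Magnitude of the linear Zeeman shift |g| * mu * B / h, in Hz."""
    return abs(g_factor) * magneton_over_h * field_tesla


if __name__ == "__main__":
    field = 20e-3  # B ~ 20 mT, as in the text
    print("CZ gate time (ns):", cz_gate_time(3.03e6) * 1e9)
    print("electron Zeeman shift (MHz):", zeeman_shift_hz(2.0, MU_B_OVER_H, field) / 1e6)
    print("nuclear Zeeman shift (kHz):", zeeman_shift_hz(-0.566, MU_N_OVER_H, field) / 1e3)
```

The gate time evaluates to about 165 ns, matching the value in the text, while the electron and nuclear Zeeman shifts at 20 mT differ by more than three orders of magnitude, consistent with the exchange term being far off-resonance.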
An external microwave drive of amplitude $\Omega_0$ is used to perform the electron and nuclear spin rotations. The driving Hamiltonian depends on the frequency $\omega_d$, which is chosen appropriately to determine whether we drive the electron or the nuclear spin, and on $\phi$, an initial phase offset. By using a polarised field, electron spin rotations can be achieved with high fidelity in at most a few nanoseconds. The nuclear spin operations are much slower due to the weak gyromagnetic ratio but can be achieved (with high fidelity) in a few microseconds by using the hyperfine coupling to enhance the natural nuclear spin splitting. Next, the NV$^-$ centre also has a $^3E$ energy level manifold with optical transitions to the $^3A_2$ manifold. The optical transitions between one of the $^3A_2$ magnetic sub-levels and the $^3E$ levels coupled to the cavity field can be represented by
$$\sum_{i=1\ldots6} g_{m_s,i} \left( a^\dagger |i\rangle\langle M_i| + a\,|M_i\rangle\langle i| \right),$$
where $M_i$ are the energy eigenstates, in order of ascending energy, within the $^3E$ manifold. At zero strain and magnetic field, the $^3E$ manifold is represented by the basis states $\{M_{1\ldots6}\} = \{E_2, E_1, E_x, E_y, A_1, A_2\}$, neglecting a small mixture of the $E_{x,y}$ and $E_{1,2}$ states due to spin-spin interaction. The optical field of frequency $\omega$ can be described by $H_f = \omega a^\dagger a$, with $a^\dagger$ ($a$) being the field's creation (annihilation) operator. The cavity coupling rate for a given transition is given by $g_{m_s,i}$. The basis states $E_x$ and $E_y$ have electronic spin zero, while the others ($A_{1,2}$ and $E_{1,2}$) are equal superpositions of spin $\pm 1$ [12,44]. For our scheme, we apply an electric field of $E_x = 1$ GHz in the x-direction so as to lift the degeneracy of the spin-zero states in the excited-state manifold. This greatly reduces the sensitivity to rogue strain or electric field in the y-direction, making the system more robust.

Coherence Properties
It is critical to mention the coherence properties of our electron-nuclear spin system, as these can vary significantly. Here we are assuming a single $^{15}$NV$^-$ centre is created in an isotopically pure (99.9%+ $^{12}$C) diamond substrate [55] and that our module will operate at low temperature (4-20 K) rather than room temperature. In such a case it has been reported that $T_1$ of the electron spin is greater than 1 s, while $T_2^* \sim 90$ µs [56,57] with $T_2$ much longer [55]. The nuclear spin $T_1$ and $T_2$ are at least 0.2 s at present [11]. The limiting coherence parameter in this design is the $T_2^*$ of the electron spin during the 165 ns controlled-phase gate. However, with Gaussian decay having the form $\exp[-(2t/T_2^*)^2]$, the error associated with this is small in principle ($< 10^{-5}$) [55].
At the centre of our approach is a quantum module in which an NV$^-$ centre is embedded in an optical cavity (Fig. 6). The NV$^-$ centre is composed of a spin-one electronic spin and a spin-half $^{15}$N nuclear spin. Our module is an interface between the optical, microwave and radio frequency regimes, allowing information to be transferred between them. It works as follows: state-dependent reflection allows the creation of entanglement between an external optical field [38,39,61,62] and the electron spin, while the hyperfine interaction allows the transfer of the electron spin state to the long-lived nuclear spin. It also allows the nuclear spin to be measured via the electron spin, thus completing the interface. While this is conceptually simple, the details of the physical system lead to a number of complications which we will address in this supplementary material.
To understand exactly how this module operates we must examine the interactions between the three components of our hybrid system (optical field, electron spin, nuclear spin) as a whole. The overall system including couplings and driving fields can be described by the Hamiltonian where H f = ωa † a is the Hamiltonian for the optical field detuned from the cavity resonance frequency ω c by ∆ = ω c − ω with a (a † ) being the field annihilation (creation) operator.
The second term $H_e = D S_z^2 + E(S_x^2 - S_y^2) + g_e \mu_B B S_z$ represents a zero field splitting ($D/2\pi = 2.87$ GHz), a strain induced splitting ($E/2\pi < 10$ MHz) and a magnetic field induced splitting ($g_e \mu_B B$) for the NV$^-$ centre's electron spin [46]. In this spin-one system, $S_{x,y,z}$ represent the generalised Pauli X, Y, Z operators with $S_+$ ($S_-$) being the raising (lowering) operator. Further, $\mu_B$ is the Bohr magneton and $g_e = 2.0$ the g-factor. For an externally applied magnetic field of $B \sim 20$ mT, the $|0\rangle$ and $|+1\rangle$ levels are separated by approximately 3.43 GHz. The $|m_s = -1\rangle$ energy level is detuned approximately 1.1 GHz below the $|m_s = +1\rangle$ level and $\sim 2.3$ GHz above the $|m_s = 0\rangle$ level.
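The level splittings quoted for the electron spin can be checked numerically. The following is a minimal sketch, assuming zero strain ($E = 0$) by default, frequencies expressed in GHz, and a value of $\mu_B/h \approx 13.996$ GHz/T; the function name and defaults are illustrative and not part of the original analysis.

```python
import numpy as np

MU_B_OVER_H = 13.996  # Bohr magneton divided by Planck's constant, in GHz per tesla

def ground_state_levels(D=2.87, E=0.0, g_e=2.0, B=0.020):
    """Eigenvalues (GHz, ascending) of H_e = D Sz^2 + E (Sx^2 - Sy^2) + g_e mu_B B Sz.

    D and E are in GHz, B in tesla. E = 0 (no strain) is an assumption
    made for this illustration.
    """
    # Spin-1 operators in the basis (m_s = +1, 0, -1)
    Sz = np.diag([1.0, 0.0, -1.0])
    Sx = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]]) / np.sqrt(2)
    Sy = np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]]) / np.sqrt(2)
    H = D * Sz @ Sz + E * (Sx @ Sx - Sy @ Sy) + g_e * MU_B_OVER_H * B * Sz
    return np.linalg.eigvalsh(H)

# At 20 mT the ascending eigenvalues correspond to m_s = 0, -1, +1.
e_zero, e_minus, e_plus = ground_state_levels()
print(f"|0> to |+1>: {e_plus - e_zero:.2f} GHz")
print(f"|0> to |-1>: {e_minus - e_zero:.2f} GHz")
print(f"|-1> to |+1>: {e_plus - e_minus:.2f} GHz")
```

Under these assumptions the three splittings come out close to the 3.43 GHz, 2.3 GHz and 1.1 GHz quoted in the text.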
The third term $H_n = -g_n \mu_n B I_z$ represents a magnetic field induced splitting of the nuclear spin with $I_z$ being the Pauli Z spin-half operator. Here, $\mu_n$ is the nuclear magneton and $g_n = -0.566$ the nuclear g-factor. The computational basis states of the nuclear spin are $|\downarrow\rangle$ and $|\uparrow\rangle$.
The fourth term represents an electromagnetic field driving whose magnitude on the electron (nucleus) is determined by both the amplitude $\Omega_0$ of the applied field and the ratio $g_n \mu_n / g_e \mu_B$. The frequency $\omega_d$ is chosen appropriately to determine whether we drive the electron or nuclear spin, while $\phi$ specifies the phase.
The first of the coupling terms $H_{e-n} = A_\parallel S_z I_z + \ldots$ represents a hyperfine interaction between the electron and nuclear spin. This coupling contains both an Ising part with coupling strength $A_\parallel$ and an exchange part with coupling constant $A_\perp$. For a $^{15}$N nucleus, $A_\parallel/2\pi \sim 3.03$ MHz and $A_\perp/2\pi \sim 3.65$ MHz [46]. The second coupling term $H_{e-f}$ is between the optical field and the electronic spin. It is detailed in the methods section of the main text and will be discussed in the next several sections.
Before proceeding it is also useful to consider the coherence parameters of our NV − centre. With isotopically purified CVD diamond [56,57,63] we can expect electronic spin coherence times T * 2 of 90 µs and T 2 > 1.8 ms while the relaxation T 1 can be over 1 second when the sample operates in the 4-80 K regime [64]. The coherence times of nuclear spins have been shown to exceed 1 s [11]. We now explore in detail measurement and entanglement of two NV − centres.

Level structure
In this section we consider the main features of an NV$^-$ centre in a microcavity to ascertain how well the state of an NV$^-$ centre can be coupled to an external optical field and detected, and how two NV$^-$ centres can be entangled by detection. We apply a magnetic field $B_z = 20$ mT to separate the ground state levels $|+1\rangle$ and $|-1\rangle$. We aim to use resonant light tuned to the $|0\rangle \leftrightarrow |M_3\rangle \equiv (0.998|E_y\rangle + 0.07|E_1\rangle) \approx |E_y\rangle$ transition with almost pure x-polarisation. We apply an electric field of 1 GHz to lift the degeneracy between the $|E_x\rangle$ and $|E_y\rangle$ states, and also to increase the detuning between $|0\rangle \leftrightarrow |M_5\rangle$ and other transitions. The electric field has a negligible effect on the ground state triplet, leading to an amplitude mixing of the $|+1\rangle$ and $|-1\rangle$ levels on the order of $2 \times 10^{-5}$. The closest strongly allowed transition to $|0\rangle \leftrightarrow |M_3\rangle$ is the $|+1\rangle \leftrightarrow |M_5\rangle \equiv (0.98|A_1\rangle + 0.17|A_2\rangle)$ transition, with a detuning of $\delta_\omega = 2\pi \times 2.71$ GHz. Furthermore, this transition is almost purely circularly polarised. Assuming transform-limited linewidth, at low temperatures (2 K) the excited-state decay transitions have amplitude decay rates of $\gamma(M_3) = 2\pi \times 6$ MHz and $\gamma(M_5) = 2\pi \times 11$ MHz [65] so that in both cases $\delta_\omega \gg \gamma$. All other significantly allowed transitions are detuned even further and can be neglected.

Quantum non-demolition measurement of the electron spin state
We now consider an NV$^-$ centre placed at the antinode of a cavity resonant with the $|0\rangle \leftrightarrow |M_3\rangle$ transition. The natural entanglement we can generate between the electron spin and optical field allows us to perform a quantum non-demolition (QND) measurement [66] of the electron spin (and thus also its initialisation). We can use a single photon to probe the electron spin a number of times and from the measurement patterns (clicks or no clicks) determine with high probability the state of the electron spin, that is, whether it is in the $|0\rangle$ or $|+1\rangle$ state. In the following, we assume the cavity to have no losses other than the transmission through the mirrors, and a spatially perfectly mode-matched input beam. The core of the proposal is based on the different effects on light impinging on the cavity when the NV$^-$ centre is in the ground state $|0\rangle$ rather than in state $|+1\rangle$. The resonator is assumed to have a finesse $F$ and a $1/e^2$ mode intensity radius $w_C$, leading to a cooperativity of Here, $\sigma_E = 3\lambda^2/(2\pi)$ is the emitter scattering cross-section, while $\sigma_C = \pi w_C^2$ is the cavity mode area. $\eta_{BR}$ is the branching ratio of the transition in question, which is $\eta_{BR}(M_3 \leftrightarrow 0) = 4\%$ and $\eta_{BR}(M_5 \leftrightarrow \pm 1) = 2\%$.
The resonator amplitude decay rate depends on the resonator length $L$ with $\kappa = \pi c/(2LF)$. The reflection and transmission amplitudes for a photon with a linewidth $\gamma_{phot} \ll \kappa, \gamma$ being reflected or transmitted by a cavity containing an NV$^-$ centre are given by with $\Delta_C = (\omega_{laser} - \omega_{cavity})/\kappa$, $\Delta_E = (\omega_{laser} - \omega_{NV})/\gamma$, and where $A = (r_1 - r_2)/(1 - r_1 r_2)$ is the amplitude of the reflected light for an empty cavity on resonance, with amplitude reflectance coefficients $r_{1,2}$ for the input and output mirrors, respectively. Then the probabilities for reflection and transmission are By energy conservation, the incoherent scattering probability is $P_S(|0\rangle) = 1 - (P_R + P_T)$. For emitter, cavity and probe light tuned to resonance, the expressions reduce to P res.
We now need to maximise the difference in reflected signal caused by an NV$^-$ centre in the ground $|0\rangle$ state, which can be done in two ways:
• High-cooperativity implementation: In this approach, we minimise $A$ so that $A \approx 0$ and maximise $C$. Then the signal for the empty cavity results in $P_R \approx 0$ while the signal for a cavity containing an NV$^-$ centre in the ground $|0\rangle$ state tends to $P_R \approx 1$ for $C \gg 1$. In this limit, the emitter excitation decreases with cooperativity as $P_S \to 1/C$ [67]. However, there is a small off-resonant excitation of NV$^-$ centres in the $|+1\rangle$ ground state which remains even for large cooperativity, and limits the performance of the device.
Excitation of the NV$^-$ centre can be significantly decreased for either the $|0\rangle$ or $|+1\rangle$ ground states by using an appropriately polarised optical field. In principle, the excitation for one of these states can be entirely turned off. In our situation we select a polarised field to suppress excitation of the $|+1\rangle \leftrightarrow |M_5\rangle$ transition such that $P_S(|+1\rangle) \to 0$.
This approach also requires careful matching of mirror reflectivities.
• Low-cooperativity implementation: In this approach, we select a large negative value for $A$ ($A \approx -1$) and tune $C$ such that $2C + A = 0$. This can be arranged by choosing $r_2 \gg r_1$. This approach is both more flexible and more readily achievable as we only require an initial cooperativity of $C(|0\rangle \leftrightarrow |M_3\rangle) \geq 0.5$. The cooperativity can then be reduced to 0.5 by rotating the polarisation of the incoming photons away from the x-direction. Alternatively, the interaction strength between light and emitter can be decreased by detuning. Assuming a value of $A = -1$, a detuning of $\Delta_E = \Delta_C = \sqrt{2C - 1}$ leads to vanishing reflection probability. Conversely, the reflection probability approaches $A^2$ when the NV$^-$ centre is in the state $|+1\rangle$, as can be seen from Eqn. (B4), so detecting a photon projects the NV$^-$ centre onto this state. However, as this implementation is based on the conditional absorption of a photon, the performance is limited by spin-flip-inducing transitions. Experimentally, these have been observed to be on the order of 1%, which excludes this implementation for our purposes unless this issue can be addressed. It will nonetheless be suitable for initial demonstrations of the entangling mechanism. For our work we therefore focus on the high-cooperativity implementation.
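The two operating regimes can be illustrated numerically. The sketch below evaluates a reflection amplitude of the standard input-output form $A_r = 1 - (1 - A)/[1 - i\Delta_C + 2C/(1 - i\Delta_E)]$. This expression is an assumption made for illustration (the amplitudes of Eqn. (B4) are not reproduced here); it is chosen because it reduces to $(A + 2C)/(1 + 2C)$ on resonance and vanishes for $A = -1$, $\Delta_E = \Delta_C = \sqrt{2C - 1}$, in agreement with the conditions stated in the text. The function name and sign conventions are hypothetical, and an NV$^-$ centre in $|+1\rangle$ is crudely modelled as an uncoupled emitter ($C = 0$).

```python
import numpy as np

def reflection_amplitude(A, C, delta_E=0.0, delta_C=0.0):
    """Assumed single-sided input-output reflection amplitude.

    A        empty-cavity reflection amplitude on resonance
    C        cooperativity
    delta_E  emitter detuning in units of gamma
    delta_C  cavity detuning in units of kappa
    """
    emitter_term = 2.0 * C / (1.0 - 1j * delta_E)
    return 1.0 - (1.0 - A) / (1.0 - 1j * delta_C + emitter_term)

# High-cooperativity regime: A = 0, C = 50, everything on resonance.
p_bright = abs(reflection_amplitude(0.0, 50.0)) ** 2  # NV in |0>
p_dark = abs(reflection_amplitude(0.0, 0.0)) ** 2     # NV in |+1>, modelled as C = 0

# Low-cooperativity regime: A = -1 with the detuning quoted in the text.
C_low = 2.0
d = np.sqrt(2.0 * C_low - 1.0)
p_low = abs(reflection_amplitude(-1.0, C_low, d, d)) ** 2

print(p_bright, p_dark, p_low)
```

Within this assumed model the high-cooperativity setting gives a reflection probability close to one for $|0\rangle$ and zero for the uncoupled case, while the low-cooperativity setting gives vanishing reflection at the stated detuning.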
Measurement sequence and sources of error: We aim for near perfect contrast of the empty cavity and maximum reflectivity, that is, $A_r(|+1\rangle) \sim A \to 0$ and $A_r(|0\rangle) \to 1$. Our state detection is based on a measurement - spin rotation - measurement sequence where we assume that negligible errors occur during the spin rotation. Furthermore, we assume that this sequence will be repeated many times, as photon loss will be unavoidable in a realistic device. If the electronic spin is in the $|0\rangle$ state, the probability that our single-photon detector clicks at least once in $s$ attempts is Then, the probability of at least one click in $s$ attempts with no spin flips is where $P_{flip,0}$ and $P_{flip,+1}$ are the probabilities of a single measurement inducing a spin flip upon excitation when the NV$^-$ centre is in one of the qubit states, due to resonant and off-resonant excitation, respectively. Conversely, the probability for the detector never to click in $s$ attempts and for no spin flip to occur when the NV$^-$ centre is in the $|+1\rangle$ state is $P_{click,+1}(s) = \left(1 - |\eta A_r(|+1\rangle)|^2 - P_S(|+1\rangle) P_{flip,+1}\right)^s$ where $\eta^2$ is the single photon detection efficiency, including all losses along the channel. The error probability for our entire sequence is then One can immediately see the advantage of having $|A_r(|+1\rangle)|^2 \sim A = 0$. A detector click strongly indicates that the NV$^-$ centre is in the $|0\rangle$ state. We assume $P_{flip,0} = 0.003$ and $P_{flip,+1} = 0.35$ respectively (see Fig. 8), but we note that there is no current consensus on these values in the literature [68]. A key advantage of our measurement sequence is that it allows us to determine whether the NV$^-$ centre exits the qubit subspace into the state $|-1\rangle$. The scheme can easily be modified to use weak coherent pulses instead of single photons. We neglect this approach to avoid errors due to de-ionization of the NV$^-$ centre, which are possible for coherent states given their non-zero overlap with Fock states with $n > 1$. While the probability of this occurring may be small, it may be difficult to detect explicitly.
QND measurement performance: The QND measurement performance is depicted in Fig. 9 for both the low-cooperativity and high-cooperativity approaches. The low-cooperativity approach ($A \approx -1$) is limited by the spin-flip probability per measurement for the resonant excitation situation. The numerically optimised points for the case of negative $A$ are closely matched by choosing $\Delta_E = \Delta_C = \sqrt{2C - 1}$ (red lines in Fig. 9), while for $A = 0$, $\Delta_E = 0$ and $\Delta_C = C\gamma(M_5)/\delta_\omega$ (blue lines in Fig. 9). The error rate of a single QND measurement is on the order of 1%. For our scheme, we require an error rate of less than $7.3 \times 10^{-3}$ (see main text). To overcome this limitation, we consider the high-cooperativity approach where we perform multiple measurements, shown in the middle panel of Fig. 9. The error rate as a function of detection efficiency is shown in the lower panel of Fig. 9. We require optical efficiency (including detectors) of $\eta^2 \gtrsim 30\%$ to meet the requirements of our scheme (see main text). A useful working cooperativity is of the order of $C \sim 50$.
Before outlining how to generate remote entanglement, we will briefly discuss detection errors, namely photon loss and dark counts:
• Photon loss: This is the most common error, which can arise from a number of sources including absorption or scattering in the channel, coupling inefficiencies between the cavity and channel, and inefficient single-photon detection. This error simply decreases the probability that we successfully measure a photon at the detector. We can model this by a parameter $\eta^2$ in the range [0, 1], with $\eta^2 = 1$ being no loss. The probability of successfully measuring the photon is the ideal success probability multiplied by $\eta^2$.
• Dark counts: This error is where the detector clicks when no photon was incident. In principle, with current gated APDs, this dark count probability could be less than $10^{-5}$ per time window [69].

Entanglement
The creation of an entangled state between two remote electron states can be described in a straightforward manner given the previous discussion. Our scheme, which we depict in Fig. 10, is comparable to the protocol of Duan, Lukin, Cirac and Zoller [70]. We place two microcavities, each containing a single NV$^-$ centre, at the output ports of a 50:50 beamsplitter in a Michelson interferometer configuration. For simplicity, we set $A_{r,i}(|+1\rangle) = A_i$ and $A_{r,i}(|0\rangle) = A_{r,i}$ with $i = (a, b)$ the indices of the two cavities.
In the most general case, we start our sequence by first preparing the NV$^-$ centres in superpositions $\alpha_a|+1\rangle_a + \beta_a|0\rangle_a$ and $\alpha_b|+1\rangle_b + \beta_b|0\rangle_b$. A single photon then impinges on the beamsplitter, resulting in a path-entangled state being sent to the two cavities. The photon then interacts with the cavities and returns to the beamsplitter. A detection event in the dark port projects the NV$^-$ centres into the state Now non-zero values of $A_{r,i}$ and $1 - A_i$ will only lead to a decrease in the state amplitude, while differences between the two cavities will generally lead to a loss in fidelity. This can be seen from the first two terms in the state $\psi_d$. Assuming perfect state preparation with $\alpha_i = \beta_i = 1/\sqrt{2}$, $A_a = A_b = 0$ and $A_{r,a} = A_{r,b} = A_r$, our expression for $\psi_d$ simplifies to $(A_r/\sqrt{8})(|0,+1\rangle_{a,b} - |+1,0\rangle_{a,b})$. The probability of projecting the two NV$^-$ centres onto our desired entangled state is $p_c = \eta^2 A_r^2/8$. This probability may seem quite low; however, the Bell state is generated with extremely high fidelity, even with imperfect transmission and reflection coefficients. Instead of impacting the fidelity of the resulting singlet state, $A$ and $A_r$ impact the probability of detecting a single photon in the dark port.
No photon detection leaves the electron in an indeterminate state, as the photons could have been lost in the channel, scattered from the NV$^-$ centres, or lost due to imperfect coupling, inefficient detection, or through the unmonitored $b^{out}_{a,b}$ cavity ports. To address the low success probability, we repeat the process a number of times to establish a link with high probability [71].
So far we have assumed the transmission and reflection coefficients of the two cavities have been matched. This may not be the case in practice, and we will likely have $A_b \sim A_a$ as $A_a, A_b \sim 0$ but $A_{r,b} \neq A_{r,a}$. In this case, we can introduce a small loss element into the reflected path of the photon with the greater $A_{r,i}$ coefficient to effectively decrease its amplitude. Hence our resulting state is $\psi_d \propto \beta_a \alpha_b |0,+1\rangle_{a,b} - \alpha_a \beta_b |+1,0\rangle_{a,b}$, as required.

A little determinism: adding a 15 N nuclear spin
With electron-spin initialisation and readout, the ability to generate remote entanglement, and a microwave driving field to perform electron-spin rotations, we essentially have all the operations required for distributed quantum computation and communication, particularly via the preparation and measurement of cluster states [72]. However, unsuccessful attempts to introduce additional qubits to the cluster state may destroy entanglement that has already been established, significantly increasing the resource overhead for low success probabilities [73]. Adding a little determinism will decrease these requirements.
An NV − centre in diamond possesses an electron spin and also a nuclear spin from the 15 N atom. These couple naturally via the hyperfine interaction given by H e−n , which may allow us to add another qubit to the module. With a 20 mT field, the exchange interaction component is far off-resonance and so in an appropriate rotating frame we can write the effective interaction Hamiltonian as 2Λ with Λ = D + g e µ B B − g n µ n B being the detuning between the electron and nuclear spin levels. This interaction gives a fast and natural controlled-phase (CPHASE) gate, where the time to create a maximally entangled state is t max = π/A net ∼ 165 ns.
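The quoted gate time follows directly from $t_{max} = \pi/A_{net}$. The sketch below is a simple check, assuming that the off-resonant exchange correction is negligible so that $A_{net} \approx A_\parallel$ with $A_\parallel/2\pi = 3.03$ MHz; the function name is illustrative.

```python
import math

A_PARALLEL_OVER_2PI_HZ = 3.03e6  # Ising hyperfine coupling for 15N, from the text

def cphase_time(a_net_over_2pi_hz):
    """Time t_max = pi / A_net to reach a maximally entangled state, in seconds."""
    a_net = 2.0 * math.pi * a_net_over_2pi_hz  # angular frequency in rad/s
    return math.pi / a_net

# Assumption: A_net is approximated by A_parallel (dispersive shift neglected).
t_max = cphase_time(A_PARALLEL_OVER_2PI_HZ)
print(f"t_max  = {t_max * 1e9:.0f} ns")
print(f"period = {2 * t_max * 1e9:.0f} ns")
```

Under this approximation the result is consistent with the ∼ 165 ns gate time, and twice that value gives the full oscillation period of the hyperfine-induced entanglement.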
To transfer quantum information between these systems, we require single-qubit operations on both the electron and nuclear spins. The electron spin rotations can be achieved using a $\sigma_+$ polarised microwave driving field of the form $H^{rf}_{Driving} = \Omega_0 \left(e^{i\phi}|+1\rangle\langle 0| + e^{-i\phi}|0\rangle\langle +1|\right)$ (in our rotating frame). With $\phi = \pi/2$, a $-\pi/4$ Y-rotation transforms $|0\rangle \to \frac{1}{\sqrt{2}}[|0\rangle + |+1\rangle]$ (a Hadamard-like operation) in approximately 2 ns [74,75]. The nuclear spin rotation operation could similarly be achieved through driving the exchange part of the hyperfine coupling in 1 µs [74]. We hence have the operations required to construct gates that transfer the state of the electron spin to the nuclear spin and vice versa. These gates may also be used to initialise and measure the nuclear spin, via a projective measurement of the coupled electron-nuclear system. We will discuss the error channels in the electron-nuclear spin system once we have integrated all the elements.
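The action of the driving field on the electron qubit can be verified with a two-level calculation. The following is a minimal sketch, assuming the driving Hamiltonian $\Omega_0(e^{i\phi}|+1\rangle\langle 0| + e^{-i\phi}|0\rangle\langle +1|)$ restricted to the $\{|0\rangle, |+1\rangle\}$ subspace, with $\theta = \Omega_0 t$ the accumulated pulse area. The sign convention for the rotation angle and the function name are assumptions made for illustration.

```python
import numpy as np

def driving_unitary(theta, phi):
    """U = exp(-i * theta * G), G = e^{i phi}|+1><0| + e^{-i phi}|0><+1|.

    Basis ordering is (|0>, |+1>). Because G @ G is the identity, the
    exponential has the closed form cos(theta) * I - i * sin(theta) * G.
    """
    G = np.array([[0.0, np.exp(-1j * phi)],
                  [np.exp(1j * phi), 0.0]])
    return np.cos(theta) * np.eye(2) - 1j * np.sin(theta) * G

ket_zero = np.array([1.0, 0.0])
# phi = pi/2 makes G equal to Pauli Y; a pulse area of pi/4 gives the
# Hadamard-like operation described in the text.
psi = driving_unitary(np.pi / 4, np.pi / 2) @ ket_zero
print(np.round(psi, 4))
```

With these conventions the output state has equal real amplitudes on $|0\rangle$ and $|+1\rangle$, matching the equal superposition quoted above.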

Appendix C: A hybrid interface
Next, we combine the basic operations between the optical, electron-spin, and nuclear-spin components in a protocol for generating entanglement between two remote nuclear spins. Care is required to ensure that the operations work as intended. For instance, coupling between the electron and nuclear spin is always on, meaning that failed attempts at electron-electron coupling could cause errors on the nuclear spins.
We begin by preparing the electron (nuclear) spin in the $|0\rangle$ ($|n_+\rangle = \frac{1}{\sqrt{2}}(|\downarrow\rangle + |\uparrow\rangle)$) state. An accurate (sub-nanosecond) clock is started in the first module and the electron spin rotated to $|+\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$. Two independent operations occur at this time:
• First, as soon as the electron spin is rotated from $|0\rangle$ to $|+\rangle$, the hyperfine interaction begins coupling the electron spin with the nuclear spin, according The resulting entanglement is periodic and oscillates between separable and maximally entangled with period $2\pi/A_{net} \sim 330$ ns. The oscillation stops when the electron spin is returned to a polarised state. Alternatively we can use a spin-echo technique to disentangle the electron and nuclear spins at any time. We know that after a time $t$ the state $|+\rangle|n_+\rangle$ has evolved to $|\Psi(t)\rangle$. Performing a spin-echo pulse and waiting a further time $t$ evolves our combined state to $\frac{1}{\sqrt{2}}|+\rangle\left(|\downarrow\rangle + e^{iA_{net}t}|\uparrow\rangle\right)$. The electron and nuclear spins are disentangled with the electron spin returning to the original state and the nuclear spin evolving to $\frac{1}{\sqrt{2}}\left(|\downarrow\rangle + e^{iA_{net}t}|\uparrow\rangle\right)$.
• Second, a single photon is split on a 50:50 beamsplitter into two modes. The bottom mode in Fig. 10 interacts via dipole-induced transparency with the NV$^-$ centre in the first module, where it becomes entangled with the electron spin state. Both modes exciting the cavity are temporally multiplexed and transmitted over a fiber to the second module where the multiplexing is reversed. In the second module, the clock is started and the electron spin in the cavity is rotated to $|+\rangle$ where it interacts with the top mode from the original beamsplitter.
The two modes are recombined on the beamsplitter and the dark port of the interferometer is monitored.
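The spin-echo decoupling described in the first operation can be illustrated with a small state-vector simulation. This is a minimal sketch under stated assumptions: the hyperfine interaction is modelled as $A_{net}\,|1\rangle\langle 1| \otimes Z/2$ on the electron qubit and nuclear spin, $A_{net}$ is approximated by $2\pi \times 3.03$ MHz, the echo is an ideal instantaneous $\pi$ pulse, and the assignment of $Z = +1$ to the first nuclear basis state is a convention chosen here. All function names are illustrative.

```python
import numpy as np

A_NET = 2 * np.pi * 3.03e6          # rad/s; assumes A_net ~ A_parallel
PLUS = np.array([1.0, 1.0]) / np.sqrt(2)
X_ON_ELECTRON = np.kron(np.array([[0, 1], [1, 0]]), np.eye(2))

def evolve(state, t):
    """Apply exp(-i A_net t |1><1| (x) Z/2); the Hamiltonian is diagonal.

    Basis ordering is |electron, nuclear> = |0,a>, |0,b>, |1,a>, |1,b>,
    where a (b) is the nuclear state with Z = +1 (-1).
    """
    phases = np.exp(-1j * A_NET * t * np.array([0.0, 0.0, 0.5, -0.5]))
    return phases * state

def echo(state):
    """Ideal pi pulse on the electron spin, swapping |0> and |1>."""
    return X_ON_ELECTRON @ state

def nuclear_purity(state):
    """Purity of the nuclear spin after tracing out the electron."""
    psi = state.reshape(2, 2)        # rows: electron, columns: nuclear
    rho_n = psi.T @ psi.conj()
    return float(np.real(np.trace(rho_n @ rho_n)))

initial = np.kron(PLUS, PLUS)        # |+>|n_+>
t_max = np.pi / A_NET

entangled = evolve(initial, t_max)
refocused = evolve(echo(entangled), t_max)

print(nuclear_purity(entangled))     # maximally entangled: purity 1/2
print(nuclear_purity(refocused))     # after the echo: purity 1
```

In this model the nuclear spin is maximally mixed at $t_{max}$ without the echo and returns to a pure state after the echo sequence, consistent with the disentangling behaviour described in the text.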
Two possible outcomes, which we refer to as unsuccessful and successful, are distinguished by the measurement result:
• The unsuccessful case is where no photon is detected at the dark port, which occurs if a photon is detected at the bright port or not at all (it may have been lost in the cavity, during coupling, or in the channel, or the detector may not have detected it due to error). In this case, we are unsure of the exact state of the remote electron spins and must assume it is maximally mixed. Consequently (assuming that $A_{net}$ is identical for both NV$^-$ centres), the density matrix of the combined nuclear-electron system is $\rho = |00\rangle\langle 00|_e \, \rho_n + |01\rangle\langle 01|_e \, e^{-iA_{net}tZ_{n_2}} \rho_n e^{iA_{net}tZ_{n_2}} + |10\rangle\langle 10|_e \, e^{-iA_{net}tZ_{n_1}} \rho_n e^{iA_{net}tZ_{n_1}} + |11\rangle\langle 11|_e \, e^{-iA_{net}tZ_{n_1}Z_{n_2}} \rho_n e^{iA_{net}tZ_{n_1}Z_{n_2}}$, (C1) where e and n denote the electron and nuclear subsystems respectively. The hyperfine coupling combined with the fact that photon loss completely mixes the state of the electrons implies that either one or two phase errors can be back-propagated to each nucleus. However, the nuclear component of this mixed state "re-purifies" itself with the periodicity $A_{net}t = 2\pi m$ of the hyperfine coupling or via a spin-echo pulse (the spin-echo pulse is preferred as it is potentially much faster). After such a pulse the electron and nucleus become decoupled and the state of the nuclear qubits is simply This slight phase rotation $e^{iA_{net}t}$, where $t$ is when the spin-echo pulse is applied, can be corrected later.
• The successful case is where a photon is detected at the dark port and the remote electron spins are projected into a singlet state with a high fidelity, as discussed in Section IIB. A spin-echo pulse is also performed on each module to decouple the electron and nuclear spins.
At this point, the electron and nuclear spins are decoupled. What to do next depends on the measurement result:
• In the unsuccessful case, we measure the electron spin at the $2t$ time of the spin echo and initialise the electron spin into $|0\rangle$, which collapses the overall density matrix to one of the four terms in Eqn. (C1). Then we start the procedure again from where we clocked the first attempt. Although the electron spin states are completely mixed, this gate sequence allows the nuclear spins to avoid decoherence and be preserved for the next attempt. This can be repeated until success. Errors may propagate to the nuclear spins due to poor control of the time when electrons are reinitialised to the $|+\rangle$ state prior to each attempt.
• In the successful case, we perform a single-qubit $\pi/4$ Y-rotation on one of the two electron spins at the $2t$ time of the spin echo (the $\pi/4$ Y-rotation is an effective Hadamard gate necessary to convert the electron-electron singlet state into the appropriate two-qubit cluster state, $(|0+\rangle - |1-\rangle)/\sqrt{2}$). We then wait until the hyperfine interaction maximally entangles the electron and nuclear spins within the node (at a time $t = m\pi/A_{net}$). A second $\pi/4$ Y-rotation is performed on the electron spin of each module followed by measurement in the computational basis (an effective X-basis measurement).
Upon success, we have transferred newly established entanglement between the electron spins in two remote modules to the nuclear spins in those same modules (by effectively teleporting a CPHASE gate), which is where we are storing and processing our quantum information. Importantly, the protocol circumvents photon-loss induced decoherence via the hyperfine interaction on the nuclear spin.

Timescales
The timescales for the various processes in the protocol can be grouped into three categories: short (1-30 ns), medium (100 ns - 1 µs) and long (> 1 µs). Short timescales are associated with electron spin operations (initialisation, detection, and rotations), medium timescales are associated with hyperfine coupling operations (entanglement, nuclear spin initialisation, and measurement in the Z-basis), and long timescales are associated with nuclear spin rotations (via the hyperfine interaction [74]). Nuclear spin rotations are generally required only for initialisation, and the number of nuclear rotations is independent of the number of attempts to create an electron-electron bond. Similarly, measurement of the nuclear spin is only required for measurements that consume the pre-prepared entanglement. Transmission of a single photon between the modules is our last operation of interest, and its timescale depends on the task at hand. In quantum communication, remote modules may be separated by up to 40 km. It then takes approximately 0.4 ms to transmit a photon between modules and receive a classical return signal, and the duration of each attempt is determined by this timescale. By contrast, for modules separated by 1 m, the transmission time is ∼ 10 ns, which is shorter than the timescale associated with hyperfine coupling operations.
The overall duration of the protocol is determined by the product of the per-attempt duration and the number of attempts. The number of attempts is related to the probability of success of each attempt, which depends on the efficiency of the optical components. We define the total efficiency of the optical components, $p_o$, to be the combined efficiency of all factors that influence the success probability of the optical gate, besides the theoretical upper bound of 0.125. If $p_o = 0.5$, then for each attempt the probability of success is $0.125 \times 0.5 = 0.0625$. After approximately 107 attempts the probability of success is $P = 0.999$, which for our purpose is effectively deterministic.
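The attempt count quoted above follows from treating attempts as independent trials, so that the probability of at least one success in $n$ attempts is $1 - (1 - 0.125\,p_o)^n$. The sketch below is a minimal illustration of that arithmetic; the independence of attempts is the assumption, and the function names are illustrative.

```python
import math

def success_after(n_attempts, p_o, p_ideal=0.125):
    """Probability of at least one heralded success in n independent attempts."""
    p_single = p_ideal * p_o
    return 1.0 - (1.0 - p_single) ** n_attempts

def attempts_for(target, p_o, p_ideal=0.125):
    """Real-valued number of attempts at which the success probability equals target."""
    p_single = p_ideal * p_o
    return math.log(1.0 - target) / math.log(1.0 - p_single)

print(attempts_for(0.999, p_o=0.5))   # roughly 107 attempts
print(success_after(107, p_o=0.5))    # close to 0.999
```

The same functions can be evaluated for other optical efficiencies $p_o$ to see how quickly the required number of attempts grows as the per-attempt success probability falls.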

NV − module
Let us now return to the issue of errors in the module. Errors can be divided into two categories:
• Accumulation errors are those that depend on the number of attempts taken to establish entanglement between remote electron spins. These errors only affect the error rate of the nuclear-nuclear CPHASE gate, not the error rate of nuclear measurement and initialisation. To tolerate a low success probability (which necessitates a large number of attempts) these errors need to be heavily suppressed.
• Non-accumulation errors are those that are independent of the total number of attempts, and depend only on the final successful attempt to establish entanglement between remote electron spins.
In Sections III and IV we will break down errors into several parameters that determine the overall error rates of nuclear spin measurement and initialisation and of CPHASE gates between remote nuclear spins. These error rates will then allow us to determine the performance of the architecture.

Appendix D: Nuclear Spin measurement and initialisation
Both measurement and initialisation of the nuclear qubit are performed via projective collapse of the electron and the hyperfine interaction. As described in [74] we can generate multiple types of controlled operations (where the electron acts as the control) within the electron-nuclear system. The most basic is the natural hyperfine generated CPHASE gate. Combining this with Y rotations on the electron, we are able to perform an effective Z-basis measurement on the nucleus with a total time of approximately $2 \times 5 + 165 + 100 = 275$ ns, assuming single-qubit gates take less than 5 ns and single attempt initialisation and measurement of the electron takes 100 ns (see Fig. 11a). This measurement circuit also initialises the nucleus in a known state. Therefore, measurement and initialisation in this model are achieved with a combined gate.
Similarly, we can drive the hyperfine interaction to generate a controlled rotation around a different axis (rather than the Z axis). Two examples are a controlled-not (CNOT) operation and a controlled-Y operation, which can be used to measure the nucleus in the X basis and Y basis respectively (see Figs. 11b and 11c). Driving of the hyperfine interaction necessitates a longer time for these controlled operations (approximately 1 µs [74]) and these are therefore classified as long timescale operations. Errors associated with nuclear spin initialisation and measurement do not depend on the number of attempts to establish entanglement between remote electron spins and may occur only when nuclear spins are measured.
FIG. 11. Projective measurement of the nuclear spin, mediated by the electron-nuclear hyperfine interaction. The natural hyperfine interaction enables fast Z-basis measurements, while a driven hyperfine interaction enables X and Y-basis measurements [74]. (a) Measurement in the Z basis and initialisation in $|+\rangle$ consists of measurement via the natural hyperfine interaction and then a controlled-Y gate on the nuclear spin with the electron spin polarised in the $|1\rangle$ state, which effectively rotates the nuclear spin into the $|+\rangle$ state. (b) Measurement and initialisation in the X basis (≈ 1.1 µs). (c) The two types of Y-basis measurements required by our scheme are performed by driving the hyperfine interaction. After measurement, a controlled Z-rotation with a polarised electron spin will reinitialise the nuclear spin in the $|+\rangle$ state.
Gate times for the combined measurement and initialisation operation are approximately 1 µs. The error rate associated with measurement and initialisation in the Z and Y basis is higher than in the X basis as a second rotation is required to reinitialise the nuclear spin in the X-basis (to prepare cluster states, qubits should be initialised in |± ). As the natural CPHASE gate of the hyperfine interaction is much faster than the driven CNOT (or controlled-Y ) gate, the timescale associated with measurement and initialisation in the Z-basis is the same as in the X-basis.

Intrinsic decoherence in diamond
For both the nucleus and electron, intrinsic decoherence can be induced through spin relaxation (thermalisation) and through dephasing. For the nuclear spin we can model both processes as a Markovian process where the errors induced are approximately given by, where t is the length of time considered and T in are the decoherence times (i = 1 for relaxation and i = 2 for dephasing). Dephasing (from T * 2 processes) results in Z errors while relaxation (thermalisation) results in X and Y errors. We assume here that spin-echo techniques are being used on the electron spin to effectively decouple the electron and nuclear spins. If this is not the case, the coherence times of the nuclear spin will be much shorter.
For the electron spin, relaxation can be modelled again as a simple Markovian process p Generally the relaxation times are very long (seconds) compared to the electron-spin gate times (nanoseconds to microseconds) and can be neglected, leaving only Z errors as our intrinsic error. We use approximate expressions for $p_{n,e}(t)$ to simplify our estimates and to find an upper bound for our error probabilities. The master equation used for each process will give slightly different expressions for X, Y, and Z errors.
Control errors can be modelled by an error $\epsilon$, which is defined as the over- or under-rotation caused by imprecise control of the Hamiltonian of the electron spin. Over- or under-rotation simply produces an error of the same type as the rotation axis, whereas axis misalignment may cause an arbitrary error. We assume that this error affects the rotation angle and not the rotation axis. In either case, given a rotation error of $\epsilon$, the error induced is given by $\sin^2(\epsilon) \approx \epsilon^2$ for $\epsilon \ll 1$. The total error for electronic rotations (the combination of decoherence and control errors) is given by $p_e(t) \approx (1 - e^{-t/T_e}) + \epsilon^2$. A similar expression can be derived for $p_n$ for pure decoherence over time $t$, $p_n(t) = (1 - e^{-t/T_n})$.
Note that we do not include an $\epsilon^2$ term for the nuclear spin error. This is because all nuclear rotations are achieved by driving the hyperfine interaction, where the associated error is given by the coupled error terms $p_{2z}$ and $p_{2x}$. These intrinsic errors associated with the hyperfine control may introduce correlated errors. A more detailed analysis of these processes can be found in [74], and the total probability of error during these rotations will encapsulate both the systematic errors and the intrinsic decoherence on both the electron and nucleus over the relevant time scales. These can be modelled by a general two-qubit depolarising map with probabilities $p_{2z}$, $p_{2x}$ and $p_{2y}$. Each of these expressions can now be used to bound the error rate associated with nuclear measurement and initialisation, where $p_e$ is the electronic rotation error, $p_M$ is the electronic measurement (and initialisation) error, and $p_{2(x,y,z)}$ are the errors associated with the hyperfine coupling for natural ($z$) or driven ($x$, $y$) evolution. The timescale of each measurement is approximately 1 µs, and the errors in $p_{MZ}$ will be dominant.
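As a rough numerical illustration of the two ingredients above, the following Python sketch evaluates the decoherence bound $p_n(t) = 1 - e^{-t/T_n}$ and the control error $\sin^2(\epsilon)$, and compares each with its small-argument approximation. The values of `T_n` and `eps` are hypothetical placeholders chosen for illustration; they are not taken from the text.

```python
import math

def decoherence_error(t, T):
    """Upper-bound error probability 1 - exp(-t/T) for time t and decay time T (same units)."""
    return 1.0 - math.exp(-t / T)

def control_error(eps):
    """Error probability sin^2(eps) for an over- or under-rotation by eps radians."""
    return math.sin(eps) ** 2

# Hypothetical illustrative values (assumptions, not values from the paper).
T_n = 1.0e-2     # assumed nuclear decoherence time in seconds
eps = 0.01       # assumed rotation error in radians
t_gate = 1.0e-6  # ~1 microsecond measurement/initialisation time quoted in the text

p_n = decoherence_error(t_gate, T_n)
p_ctrl = control_error(eps)

# For t << T the bound is close to t/T; for eps << 1 the control error is close to eps^2.
print(f"p_n(t_gate) = {p_n:.4e}, linear approximation t/T_n = {t_gate / T_n:.4e}")
print(f"sin^2(eps)  = {p_ctrl:.4e}, quadratic approximation   = {eps ** 2:.4e}")
```

Both approximations slightly overestimate the exact expressions, which is consistent with using them as upper bounds.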

Appendix E: Electron-electron connection
Errors that accumulate as we attempt to establish entanglement between remote electron spins arise from three sources:
1. Hyperfine interaction timing errors. After an attempt to entangle two remote electron spins, the hyperfine interaction must be allowed to evolve (including spin-echo sequences) to the $2\pi$ point so that the electron and nuclear spins are disentangled prior to the next attempt. If there is an associated timing error, $\nu$, a Z error will propagate back to the nucleus with a probability of $\sin^2(\nu/165\,\mathrm{ns}) \approx (\nu/165\,\mathrm{ns})^2$. In Table I we give the required accuracy for a successful connection probability of $P = 0.99$ (accumulated nuclear spin error of 1%) and $P = 0.999$ (accumulated nuclear spin error of 0.1%) for various optical component efficiencies, $p_o$. The probability of the connection being successful using a single-sided cavity protocol is given by $p_c = 0.125\,p_o$.
2. Nuclear decoherence. As entanglement is established between remote electron spins over a series of attempts, decoherence will accumulate on the nuclear spins. Long nuclear decoherence times are required to accommodate the low success probability. We assume that the physical separation between NV$^-$ centres is short enough that the optical protocol can be confirmed to have succeeded or failed within the 165 ns required for the electron-nuclear hyperfine gate. An unsuccessful attempt takes approximately $2 \times 45 + 100 + 5 \sim 200$ ns (initialisation of the electron via measurement, rotation of the electron, and spin-echo to disentangle the electron and nuclear spins prior to the next attempt). Therefore, the nuclear decoherence will be $p_n(200\,\mathrm{ns}) = (1 - e^{-2.00\times10^{-7}/T_n})$ per attempt, with $T_n$ in seconds. For $s$ attempts, this becomes $p_n(s \times 200\,\mathrm{ns})$.
3. Excitation of the electronic system. When attempting to entangle remote electron spins, or when measuring and initialising the electronic spin via an optical photon, we may accidentally excite the electronic system. When this occurs, the attempt is automatically unsuccessful as the photon has been absorbed. With high probability, the excited system will relax to its original state with no error induced on the nuclear spin. However, due to level mixing in the upper manifold, there is a possibility of a series of non-spin-conserving transitions back to the ground state. As soon as the spin state of the electron changes, the timing control that we use to prevent errors back-propagating to the nucleus becomes unreliable. This error channel is active not only during every connection attempt, but also when measuring and initialising the electron spin.
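The first two error sources in the list above can be evaluated directly from the quoted expressions. The Python sketch below does so using the 165 ns hyperfine time, the 200 ns attempt duration, and $p_c = 0.125\,p_o$ from the text. The function names and the value of `T_n` used in the example are hypothetical and serve only as an illustration.

```python
import math

HYPERFINE_NS = 165.0  # hyperfine gate time quoted in the text
ATTEMPT_NS = 200.0    # approximate duration of one unsuccessful attempt

def timing_error(nu_ns):
    """Z-error probability sin^2(nu / 165 ns) for a timing error nu in nanoseconds."""
    return math.sin(nu_ns / HYPERFINE_NS) ** 2

def nuclear_decoherence(attempts, T_n_seconds):
    """Accumulated nuclear error p_n(s * 200 ns) = 1 - exp(-s * 200 ns / T_n)."""
    elapsed_seconds = attempts * ATTEMPT_NS * 1e-9
    return 1.0 - math.exp(-elapsed_seconds / T_n_seconds)

def connection_probability(p_o):
    """Success probability per attempt for the single-sided cavity protocol."""
    return 0.125 * p_o

# Example with assumed inputs: p_o = 0.5 and a hypothetical T_n of 10 ms.
p_c = connection_probability(0.5)
mean_attempts = 1.0 / p_c
print(f"p_c = {p_c}, mean number of attempts = {mean_attempts}")
print(f"timing error for nu = 1.65 ns: {timing_error(1.65):.3e}")
print(f"nuclear error after {mean_attempts:.0f} attempts: "
      f"{nuclear_decoherence(mean_attempts, 1.0e-2):.3e}")
```

The third source (excitation of the electronic system) is not included, since its branching ratios are only estimated in the discussion that follows.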
Experiments to precisely determine the relevant branching ratios for the decay of the electron have not been performed, but we can approximate these values using a theoretical model. Consider the basic level structure of the NV$^-$ centre shown in Fig. 8. The probability of a photon being absorbed by the NV$^-$ centre can be calculated using Eqn. (B4) with the parameters we assume for our system, where $P_R$ is the probability of reflection for each state and $P_S$ is the probability of absorption for each state.

The probability of error on the nuclear spin depends on the state of the electron spin, and the worst case is when the electron is in the $|0\rangle$ state (as the probability of excitation is higher). The probability of error on the nuclear spin also depends on the likelihood of an excitation causing a spin flip in the NV$^-$ centre. The general error mapping, in the worst case, is characterised by three probabilities: $P_0$ is the probability that no absorption takes place and the photon is lost through other mechanisms, $P_1$ is the probability that the NV$^-$ centre relaxes to the $|0\rangle$ state via a series of spin-0 levels, and $P_2$ is the probability that it relaxes to the $|+1\rangle$ state when initially in the $|0\rangle$ state.

When the system relaxes to the $|+1\rangle$ state, the probability of an error on the nuclear spin is related to exactly when the electron decays from the metastable state back to the $|+1\rangle$ state with respect to the 165 ns $\pi$ point of the hyperfine coupling. Reliable estimates for this decay are not experimentally available, so we will attempt to make a large overestimate. If this decay pathway occurs, we assume that a full Z error occurs on the nuclear spin. Each probability can be estimated from the probability of absorption, $P_S$, and the relative probabilities of each of the transitions, as in Eq. (E4). Therefore, $P_2$ is our estimate of the probability that an error occurs on the nuclear spin due to excitation of the electron spin.

This is likely to be a significant overestimate, as we have not accounted for the timing of the electron relaxation relative to the $\pi$ point of the hyperfine interaction. This estimate assumes a cooperativity of $C = 50$. Doubling this cooperativity halves the probability of error.

Appendix F: Topological cluster states
We will now outline how our protocol to establish entanglement between remote nuclear spins enables scalable quantum information processing. In particular, we will outline how to prepare cluster states appropriate for universal quantum computation and quantum communication. A common way to prepare cluster states involves two-qubit CPHASE gates between neighbouring qubits in some geometry [73], and our protocol is effectively a CPHASE gate between remote nuclear spins. Therefore, with a cluster state stored in the states of the nuclear spins, our protocol can be applied with additional modules to introduce additional qubits to the cluster. In this way, we can prepare an arbitrary cluster state by repeating the protocol as required with a sufficient number of modules.
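The CPHASE-based construction described above can be illustrated with a small state-vector sketch in pure Python. It prepares every qubit in $|+\rangle$, applies a CPHASE gate along each edge of a chosen graph, and checks that the result is stabilised by $K_a = X_a \prod_{b} Z_b$, where $b$ runs over the neighbours of $a$. This is a toy illustration with hypothetical function names; it does not model the optical protocol, errors, or the heralded nature of the gate.

```python
import math

def cluster_state(n, edges):
    """State vector for |+>^n followed by a CPHASE gate on every edge (a, b)."""
    amplitude = 1.0 / math.sqrt(2 ** n)
    state = [amplitude] * (2 ** n)
    for a, b in edges:
        for index in range(2 ** n):
            # CPHASE flips the sign of basis states where both qubits are 1.
            if (index >> a) & 1 and (index >> b) & 1:
                state[index] = -state[index]
    return state

def apply_stabiliser(state, a, neighbours):
    """Apply K_a = X_a * prod_{b in neighbours} Z_b to the state vector."""
    out = [0.0] * len(state)
    for index, amplitude in enumerate(state):
        sign = 1
        for b in neighbours:
            if (index >> b) & 1:
                sign = -sign                      # Z_b contributes -1 when bit b is set
        out[index ^ (1 << a)] = sign * amplitude  # X_a flips bit a
    return out

# Five qubits connected in a cross: qubit 0 in the centre, bonded to qubits 1-4.
edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
state = cluster_state(5, edges)
print(apply_stabiliser(state, 0, [1, 2, 3, 4]) == state)  # centre stabiliser
print(apply_stabiliser(state, 1, [0]) == state)           # arm stabiliser
```

Adding a further qubit to the cluster corresponds to extending `edges` and repeating the CPHASE step, which mirrors the repeated application of the protocol with additional modules.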

Topological cluster-state error correction
For scalable quantum information processing, some form of error correction will be essential. Of the many schemes for error correction, the two-dimensional surface code and the closely related scheme based on three-dimensional topological cluster states are the strictly local schemes with the highest tolerance to errors (above 0.5% per gate in both cases) [31][76][77][78][79]. In both cases, each qubit is only required to interact with its four nearest neighbours. Typically, the surface code is thought to be appropriate for matter-based qubits, while topological cluster-state error correction is thought to be appropriate for photonic qubits. Although our nuclear spin qubits are immobile, topological cluster-state error correction features a natural mechanism to tolerate missing bonds in the cluster state, which might arise in our scheme due to the strictly non-deterministic nature of the CPHASE gate. Missing bonds can be accommodated through a clever interpretation of the measurement results during computation, at the cost of a reduced tolerance to other errors [79]. This is not possible with the surface code [80]. As such, we will focus on topological cluster-state error correction, which requires us to prepare the three-dimensional topological cluster state illustrated in Fig. 12a.

In topological cluster-state error correction, two dimensions of the topological cluster state are reserved for the spatial distribution of protected logical qubits. The third dimension is identified with the temporal axis of the computation. As such, we are not required to prepare the entire topological cluster state before the computation can begin. Instead, only two adjacent layers of the topological cluster state are required at a given time. In Fig. 12b we illustrate the physical unit cell of the topological cluster state, comprising two layers. The back layer contains eight qubits connected in a square (orange), while the front layer contains five qubits connected in a cross (blue). The two layers are connected in the temporal direction (green). This pattern is repeated over the entire topological cluster state. Then, measurement of the front layer will teleport the current state of the computer to the back layer, at which point the front qubits can be reconnected in accordance with the geometry of the topological cluster state and the information can be teleported back again. In this way, the two physical layers function as even and odd layers in the temporal direction, allowing an arbitrarily deep computation to be performed with a fixed number of physical qubits.

Mapping to a two-dimensional geometry
Because we are using matter qubits, it may be useful for the array of NV$^-$ modules to be strictly two-dimensional. In Fig. 12c we illustrate the physical unit cell (comprising two layers in the temporal direction) projected onto a two-dimensional plane (where colour coding has been preserved). Each NV$^-$ module is no longer connected to only its nearest neighbours, and several next-nearest-neighbour connections are required. However, as these connections are optically mediated, this is compatible with our scheme. In principle, the array can be distributed, where neighbouring NV$^-$ modules are separated by an arbitrary distance (subject to photon loss and communication time) and the relevant integrated (or bulk) optics are positioned between connected modules.

Connection circuits
The circuit in Figs. 13 and 14 is used to prepare the topological cluster state, layer by layer, with the array of NV$^-$ modules. Creating an optimal five-step circuit is not possible given only two layers of modules. Instead, we use a six-step circuit, where NV$^-$ modules are idle for one step after measurement. In Fig. 13, the star notation denotes the subsequent six-step circuit that occurs at a later time (for example, 1* denotes step 7). Figure 14 illustrates the circuit to prepare a topological cluster state with cross section equal to 1×1 and arbitrary depth. As discussed in the main text, our calculation of the threshold assumes a five-step circuit. This is a reasonable approximation to the six-step circuit, as the error that accumulates while a module is idle is restricted to pure nuclear decoherence, which is negligible over the timescale of a successful electron-electron connection.

FIG. 13. Sequence of NV$^-$ node connections for a unit cell of the physical cluster. Each number represents the time step for bonding, while a number inside each node represents measurement/initialisation. This sequence is time optimal given the physical constraints on the system. The star notation denotes equivalent time steps in the circuit which occur at later physical times, i.e. 1* would occur at time step seven in real time, and after the measurement a node remains idle for one step in the cluster.
Simultaneous connections are grouped into a single step. In order to maintain synchronicity over the entire computer, all connections in a given step should be established before moving on to the next one. As the connections are probabilistic, this may require some modules to wait while other modules are still being connected. However, as nuclear decoherence times are orders of magnitude longer than the time required to attempt an electron-electron connection, this waiting period will not adversely affect the error performance of the computer, provided electronic errors propagating through the hyperfine interaction are handled carefully. This requirement determines the number of connection attempts, $g$, required for each module at a given success probability. The number of attempts to ensure a bond is established with probability $P$ is given by $g = \log(1 - P)/\log(1 - p_c)$, where $p_c$ is the probability that a given connection attempt is successful. Assuming that $p_c = 6.25\%$, $g = 107$ for $P = 99.9\%$. However, in the main text we assumed that $g$ is equal to the average number of connection attempts, given by $1/p_c$, which increases the rate of operation of the computer. In this case, we must ensure that the topological cluster state is synchronised over the entire computer. On average, each node will synchronise with its neighbours. In extreme cases, some modules will have to wait for $g$ attempts to be connected. Our estimates will consider both the synchronous and asynchronous modes of operation.

FIG. 14. Quantum circuit required for the creation of two layers of the Raussendorf cluster. Also shown are the circuits for nuclear initialisation and readout, utilising a QND measurement via the electron/nuclear hyperfine interaction. (Figure labels: Step 1 to Step 6; W: node idle.)
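The expression $g = \log(1 - P)/\log(1 - p_c)$ can be checked numerically. The short Python sketch below evaluates it for $p_c = 6.25\%$ at several target bond probabilities and compares the result with the average number of attempts $1/p_c$. The function name is hypothetical, and the result is left unrounded so that the rounding convention stays explicit.

```python
import math

def attempts_required(P, p_c):
    """Number of attempts g such that at least one of g attempts succeeds with probability P."""
    return math.log(1.0 - P) / math.log(1.0 - p_c)

p_c = 0.0625  # success probability per attempt assumed in the text

for P in (0.95, 0.99, 0.999):
    g = attempts_required(P, p_c)
    print(f"P = {P}: g = {g:.2f}")

# Average number of attempts, used for the asynchronous mode of operation.
print(f"1/p_c = {1.0 / p_c}")
```

For $P = 99.9\%$ this gives a value just above 107, in line with the figure quoted in the text; the average of 16 attempts is several times smaller, which is the source of the speed-up in asynchronous operation.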
Because failed connections are heralded, we can exploit the tolerance of the topological cluster state to missing bonds [79]. For example, we may reduce $P$ to 95% to reduce the number of attempts. However, as the proportion of missing bonds is increased, the threshold error rate for all other errors is reduced. In our calculations, we do not exploit this potential robustness, and detailed calculations determining the tradeoff between missing bonds and other error rates will be studied in further work.

[...] to the $|+1\rangle$ state occurs close to the $2\pi$ point of the hyperfine coupling. Reducing $P_2$ will be sufficient to reduce all error rates to below our target error rate.

Expected performance
Lastly, we estimate the performance of our architecture. The rate-limiting process is the connection of all electron-electron pairs in each step of the circuit to prepare the topological cluster state. As discussed, we can operate the architecture in a synchronous or asynchronous manner, and the mode of operation will affect the performance. Simplest is the synchronous mode, where 99.9% of all connections are established before moving to the next step (connections that are not established introduce errors, which can be corrected). In this case, approximately 107 attempts per step are required for $p_c = 6.25\%$. This leads to a time per step of $(200 \times 107 + 275)$ ns ≈ 22 µs. In asynchronous mode, we take the average number of attempts for connections to be established ($1/p_c$). This implicitly assumes that different parts of the NV$^-$ array may be at different temporal stages of the computation, but classical control will be used to keep track of the entire topological cluster state, which is generated at a constant rate on average. In this case ≈ 16 connection attempts are needed at $p_c = 6.25\%$, requiring a time of 3.5 µs. Initialisation and measurement take approximately 1 µs, so this is not the rate-limiting process. The quantum circuit illustrated in Fig. 14 takes six steps to construct a layer of the topological cluster state. Hence, a temporal layer of the topological cluster state is prepared every ≈ 132 µs in synchronous mode and every ≈ 21 µs in asynchronous mode, with a unit cell prepared every 264 µs and 42 µs, respectively.
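The timing arithmetic above can be reproduced with a few lines of Python. The sketch below uses the 200 ns attempt time, the additional 275 ns from the per-step estimate, and the six-step circuit quoted in the text; the function names are hypothetical. It works with unrounded step times, so the layer times come out slightly below the quoted values, which are obtained by first rounding the step time to 22 µs and 3.5 µs.

```python
ATTEMPT_NS = 200      # duration of one connection attempt, from the text
EXTRA_NS = 275        # additional time in the per-step estimate (200 * g + 275)
STEPS_PER_LAYER = 6   # six-step circuit of Fig. 14
LAYERS_PER_CELL = 2   # a unit cell comprises two temporal layers

def step_time_ns(attempts):
    """Time for one circuit step given the number of connection attempts."""
    return ATTEMPT_NS * attempts + EXTRA_NS

def layer_time_us(attempts):
    """Time to prepare one temporal layer, in microseconds."""
    return STEPS_PER_LAYER * step_time_ns(attempts) / 1000.0

for mode, attempts in (("synchronous", 107), ("asynchronous", 16)):
    step_us = step_time_ns(attempts) / 1000.0
    layer_us = layer_time_us(attempts)
    cell_us = LAYERS_PER_CELL * layer_us
    print(f"{mode}: step {step_us:.3f} us, layer {layer_us:.2f} us, unit cell {cell_us:.2f} us")
```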
To estimate the size of the topological cluster state and the speed of logical gate operations, we estimate the failure rate of a logical cell and the number of logical cells required for a logical gate [31]. The failure rate of a logical cell can be approximated as $p_L \approx C_1 (C_2\, p/p_{th})^{(d+1)/2}$, where $d$ is the distance of the topological code, $p$ is the physical error rate, $p_{th}$ is the threshold error rate (estimated to be approximately 0.73%), $C_1 \approx 0.13$, and $C_2 \approx 0.61$ [81]. We assume $p = 0.1\%$ is our average error rate for all gates, as the CPHASE gate has a slightly higher error and the measurement gates have slightly lower errors. For a large computation, we are likely to require $p_L \leq 10^{-18}$, implying $d \geq 32$. Then, a logical cell is a cube of unit cells measuring $5d/4 = 40$ cells in edge length. A logical qubit is defined as a cross section of the cluster, measuring 2×1 logical cells, requiring 80 × 40 unit cells. To perform a logical CNOT gate we require a cluster volume 2 × 2 in cross section, requiring 9841 physical qubits and 2 logical cells in temporal depth. Hence, the time for a logical CNOT is $2 \times 40 \times 264$ µs = 21.1 ms for the synchronous mode and 3.4 ms for the asynchronous mode.
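The logical-cell estimate and the CNOT timing can be evaluated in a few lines. The Python sketch below uses the constants quoted above ($C_1 \approx 0.13$, $C_2 \approx 0.61$, $p_{th} \approx 0.73\%$, $p = 0.1\%$) together with the unit-cell times of 264 µs and 42 µs; the function names are hypothetical and the code is only a check of the arithmetic, not a threshold simulation.

```python
def logical_failure_rate(d, p, p_th=0.0073, c1=0.13, c2=0.61):
    """Approximate failure rate of a logical cell: C1 * (C2 * p / p_th) ** ((d + 1) / 2)."""
    return c1 * (c2 * p / p_th) ** ((d + 1) / 2)

def logical_cell_edge(d):
    """Edge length of a logical cell in unit cells, 5d/4 (assumes d is a multiple of 4)."""
    return 5 * d // 4

def cnot_time_us(d, unit_cell_us, depth_in_logical_cells=2):
    """Time for a logical CNOT: temporal depth in logical cells times edge length times unit-cell time."""
    return depth_in_logical_cells * logical_cell_edge(d) * unit_cell_us

d = 32
p = 0.001
print(f"p_L(d = {d}) = {logical_failure_rate(d, p):.2e}")
print(f"logical cell edge = {logical_cell_edge(d)} unit cells")
print(f"CNOT, synchronous:  {cnot_time_us(d, 264) / 1000:.2f} ms")
print(f"CNOT, asynchronous: {cnot_time_us(d, 42) / 1000:.2f} ms")
```

At $d = 32$ the approximate failure rate falls below the $10^{-18}$ target, and the CNOT times reproduce the 21.1 ms and 3.4 ms figures after rounding.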