Quantum correlations between single telecom photons and a multimode on-demand solid state quantum memory

Quantum correlations between long lived quantum memories and telecom photons that can propagate with low loss in optical fibers are an essential resource for the realization of large scale quantum information networks. Significant progress has been realized in this direction with atomic and solid state systems. Here, we demonstrate quantum correlations between a telecom photon and a multimode on-demand solid state quantum memory. This is achieved by mapping a correlated single photon onto a spin collective excitation in a Pr$^{3+}$:Y$_2$SiO$_5$ crystal for a controllable time. The stored single photons are generated by cavity enhanced spontaneous parametric down conversion (SPDC) and heralded by their partner photons at telecom wavelength. These results represent the first demonstration of a multimode on-demand solid state quantum memory for external quantum states of light. They provide an important resource for quantum repeaters and pave the way for the implementation of quantum information networks with distant solid-state quantum nodes.


I. INTRODUCTION
Photonic quantum memories [1] are essential elements for quantum information networks [2], providing efficient and on-demand interfacing between single photons and stationary qubits, e.g. atomic gases [3-6], electronic spins in diamond [7,8] or phonons [9,10]. Besides featuring efficiencies and storage times that match or even surpass those of atomic gases [11-14], solid state photonic memories based on rare earth doped crystals are becoming increasingly important as they offer prospects for scalability and integrability [15-18]. Most protocols for long distance quantum communication require quantum memories connected to communication channels through fibers. One possible direction to achieve this goal is to use telecom quantum memories, e.g. based on erbium doped solids [19,20] or optomechanical systems [10]. However, the most efficient and long lived storage systems to date work at wavelengths far from the telecom window, leading to large loss in optical fibers. Possible solutions to overcome this problem include quantum frequency conversion [3,21-25] or non-degenerate photon pair sources to establish entanglement between quantum memories and telecom photons [15,26-29]. The latter approach has been demonstrated using the atomic frequency comb scheme [30] in rare earth doped single crystals or waveguides [15,27,28], but so far the storage of photonic entanglement has been performed only in the excited state, for short and pre-determined storage times.
Longer and programmable storage times can be obtained by transferring the optical atomic excitations to long lived spin collective excitations (spin waves) thanks to control laser pulses [30]. Recently, spin wave storage of weak coherent states at the single photon level [31,32], including qubit storage [31,33], has been demonstrated with rare-earth doped crystals. The generation and storage of continuous variable entanglement between a multimode solid state quantum memory and a light field have also been reported recently [34]. In this experiment, light-matter entanglement is created within the memory between spontaneously emitted light and spin waves, the matter part then being converted into a light field. This is a 'read-only' quantum memory [1] with the generated light fields being resonant and, for this demonstration, outside the telecommunication band. This motivates the need for a 'write-read' quantum memory that can store an externally prepared quantum optical state sharing a quantum correlation with a telecom band photon. To date, this has not been achieved with an on-demand spin wave solid state quantum memory. Its successful realization with single photons requires an efficient quantum light source matching the spectral properties of the quantum memory [35,36], and the quasi-suppression of the noise generated by the strong control pulses. These tasks are challenging due to the small spectral separation between the hyperfine states of the optically active ions (a few MHz in our system).
FIG. 1. (a) Experimental setup. [...] The idler photon is further filtered with a cavity (FCav) to select a single frequency mode, before being coupled into an optical fiber and detected with an InGaAs single photon detector (SPD). The signal photon is spectrally filtered with a band-pass filter (BPF) and an etalon, before being coupled into a single mode optical fiber and directed towards the quantum memory (QM), where it is stored using the AFC scheme. Control pulses (CPs) are used to transfer optical excitations to spin waves and back; they are sent at an angle and counter propagating with respect to the input photons. The retrieved single photon is spectrally filtered using a filter crystal (FC) before being detected by a silicon SPD. Temporal filtering is achieved with acousto-optic modulators, placed after the memory crystal, which are opened only when we expect the SW echo. This prevents additional hole burning in the filter crystal and prevents the SPD from being blinded by possible leakage of the control pulses. During the preparation of the memory crystal via optical pumping, the SPDs of both idler and signal arms are gated off. After each AFC preparation, the gates are opened and we detect the arrival time of both photons of the pair during a measurement time of about 100 ms, leading to a duty cycle of 0.14. (b) Hyperfine splitting of the first sublevels of the ground $^3$H$_4$ and the excited $^1$D$_2$ manifold of Pr$^{3+}$ in Y$_2$SiO$_5$. The arrows highlight the Λ system chosen for the storage.

Here, we demonstrate spin-wave (SW) storage with on-demand retrieval of heralded single photons in a Pr$^{3+}$:Y$_2$SiO$_5$ crystal using the full atomic frequency comb (AFC) scheme. This is achieved by generating pairs of non-degenerate single photons where one photon is resonant with the optical transition of Pr$^{3+}$:Y$_2$SiO$_5$ while the other photon is at telecom wavelength. The telecom photon is used to herald the presence of the other photon, which is stored as a SW in the crystal and retrieved on demand after a controllable time. We measure second order cross-correlation values between the heralding and the retrieved photons which exceed the classical bound fixed by the Cauchy-Schwarz inequality for storage times longer than 30 µs, effectively demonstrating quantum correlations between telecom photons and single spin waves in a solid. Moreover, we demonstrate that our memory can store spin waves in multiple independent temporal modes. Fig. 1 depicts the experimental setup (a) and the relevant energy level scheme of Pr$^{3+}$:Y$_2$SiO$_5$ (b), where the chosen Λ system is indicated by arrows. The atomic frequency comb is prepared, following the spectral hole burning procedure described in [37], at the frequency of the $1/2_g - 3/2_e$ transition, and the control pulses drive the coherence from the $3/2_e$ to the empty $3/2_g$ storage state.

II. EXPERIMENTAL DETAILS
The narrow-band spectral filtering of the noise resulting from the control pulses is accomplished with a second Pr$^{3+}$:Y$_2$SiO$_5$ crystal, where a narrow transparency window (around 5.5 MHz) is burned at the frequency of the AFC [31]. An example of an AFC prepared in the memory crystal, overlapped with the narrow spectral hole burned in the filter crystal, is reported in Fig. 2(a).
Our cavity-enhanced SPDC source produces ultranarrow photon pairs where one photon, the idler, is in the telecom E-band at 1436 nm, and the other is resonant with the Pr$^{3+}$ optical transition at 606 nm, specifically with the transition where the AFC is prepared [28,35]. The pump wavelength is 426.2 nm and the average power for the measurements presented in this paper is 3.3 ± 0.5 mW. The probability to obtain a single signal photon in front of the quantum memory conditioned on a detection in the idler SPD (i.e. the heralding efficiency) is $\eta_H$ = (20.9 ± 0.5) %. We also switch off the SPDC pump after the detection of the idler photons for 30 to 40 µs (depending on the experiment), thus interrupting the creation of photon pairs during the detection of the stored and retrieved photons. Further details about the experimental setup and the preparation of the atomic frequency comb can be found in the Appendix.
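As a consistency check on the quoted wavelengths, energy conservation in SPDC requires $1/\lambda_p = 1/\lambda_s + 1/\lambda_i$. The short sketch below evaluates this relation for the signal and idler wavelengths given above; the function name is ours and is only illustrative.

```python
def pump_wavelength_nm(signal_nm: float, idler_nm: float) -> float:
    """Pump wavelength implied by energy conservation: 1/lp = 1/ls + 1/li."""
    return 1.0 / (1.0 / signal_nm + 1.0 / idler_nm)


if __name__ == "__main__":
    # Signal at 606 nm and idler at 1436 nm, as quoted in the text.
    lp = pump_wavelength_nm(606.0, 1436.0)
    print(f"implied pump wavelength: {lp:.1f} nm")
```

The result agrees with the 426.2 nm pump to within the precision of the quoted wavelengths.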

III. HERALDED SINGLE PHOTON SPECTRUM
FIG. 2. (a) The light violet trace is the transparency window that we burn in the filter crystal. The diamonds are signal-idler coincidence rates taken after preparing a single 800 kHz broad transparency window in the memory crystal and moving its frequency across the input photons. The error is smaller than the data points. The dotted blue line is a simulation of a Lorentzian peak with FWHM = 2.8 MHz convolved with an 800 kHz-wide spectral hole. (b) Time histograms of the input photons (black trace), the AFC echo at τ = 7.3 µs (red trace), and the SW echo (blue trace) acquired over an integration time of 343 min. We construct the coincidence histogram by taking the detection of the idler photons as a start and the detection of a signal photon after the memory as stop. The coincidence count rate in the AFC echo is 4/min and in the SW echo ≈ 1/min. The control pulses, detected with a reference photodetector, are displayed as plain pulses and are separated by $T_s$ = 6 µs. They are modulated in amplitude and frequency with Gaussian and hyperbolic tangent waveforms, respectively, as described in the Appendix. The peak power is 21 mW. The dashed vertical lines indicate the integration window for the signal ($\Delta T_d$ = 320 ns), while the dashed horizontal line represents the noise floor. (c) Coincidence counts for the SW echo at $T = \tau + T_s$ = 13.3 µs normalized by the average noise level, along with its fit to a double exponential function, to account for the Lorentzian spectral shape of the SPDC photons.

To access the spectrum of the heralded photons to be stored, we compare two measurements. First we measure the temporal distribution of coincidences between the signal and idler photons of the source alone [36]. This correlation function has a width of 78 ns (FWHM), which is the correlation time of the photons. From this value we calculate a biphoton bandwidth of 2.8 MHz. In a second experiment we employ the Pr$^{3+}$:Y$_2$SiO$_5$ memory crystal as a tunable frequency filter [31,38,39]. We prepare an 800 kHz-wide spectral hole and record coincidence histograms while its central frequency is swept by about 10 MHz around the frequency of the signal photons. In this experiment the photons passing through the transparency window are directly steered to the APD for detection, bypassing the temporal and spectral filtering stages. The coincidence rate as a function of the hole position (blue diamonds overlapped with the AFC in Fig. 2(a)) gives a measure of the spectral distribution of the heralded single photons at 606 nm. The result of this measurement agrees with the spectrum extrapolated from the signal-idler coincidence histogram measured immediately after the SPDC source [36]. This is confirmed by the good overlap between the diamonds and the blue dotted line, which represents the convolution of a Lorentzian curve of width 2.8 MHz and the trace of the 800 kHz-wide spectral hole. We estimate the spectral overlap between the heralded single photons and the AFC to be about 70 %, which currently limits the AFC storage efficiency.
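The conversion from correlation time to bandwidth can be sketched as follows. We assume here, as an illustration rather than as the authors' stated procedure, that the coincidence histogram is a symmetric double exponential $\propto e^{-2\pi\Delta\nu|t|}$ whose FWHM is the correlation time $\tau_c$, so that the Lorentzian linewidth is $\Delta\nu = \ln 2/(\pi\tau_c)$. This assumed relation reproduces the values quoted in the text.

```python
import math


def lorentzian_fwhm_hz(tau_c_s: float) -> float:
    """Lorentzian FWHM (Hz) for a double-exponential correlation of FWHM tau_c.

    Assumes G(t) ~ exp(-2*pi*dnu*|t|), whose full width at half maximum
    is ln(2) / (pi * dnu).
    """
    return math.log(2.0) / (math.pi * tau_c_s)


if __name__ == "__main__":
    for tau_ns in (78.0, 89.0):
        dnu_mhz = lorentzian_fwhm_hz(tau_ns * 1e-9) / 1e6
        print(f"tau_c = {tau_ns:.0f} ns -> bandwidth = {dnu_mhz:.2f} MHz")
```

With $\tau_c$ = 78 ns this gives about 2.8 MHz, and with the 89 ns correlation time measured through the transparency window it gives about 2.5 MHz.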

IV. SPIN WAVE STORAGE OF HERALDED SINGLE PHOTONS
We first measure the coincidence histogram when the signal photons are sent through an 18 MHz-wide transparency window prepared in the memory crystal (input trace at 0 µs in Fig. 2(b)). The correlation time between the signal and idler photons is $\tau_c$ = (89 ± 4) ns, leading to a heralded photon linewidth of (2.5 ± 0.1) MHz [36]. The correlation between the two photons is inferred by measuring the normalized second order cross-correlation function $g^{(2)}_{s,i} = P_{s,i}/(P_s \cdot P_i)$, where $P_{s,i}$ is the probability to detect a coincidence between idler and signal photons in a time window $\Delta T_d$ = 320 ns, while $P_i$ and $P_s$ are the uncorrelated probabilities to detect each photon. We find a $g^{(2)}_{s,i}$ value of 96 ± 32. Then we prepare an AFC with periodicity $\Delta$ and store the single photon as a collective optical atomic excitation $|\psi_e\rangle = \frac{1}{\sqrt{N}} \sum_{j=1}^{N} e^{i \mathbf{x}_j \cdot \mathbf{k}_{in}} |g_1 \ldots e_j \ldots g_N\rangle$, where $\mathbf{k}_{in}$ is the wave vector of the incoming photon (see level scheme in Fig. 1(b)). We obtain, for a pre-programmed storage time $\tau = 1/\Delta$ = 7.3 µs, an efficiency $\eta_{AFC}$ = (11.0 ± 0.5) % and $g^{(2)}_{AFC,i}$ = 130 ± 31 (see Fig. 2(b)). This result represents an improvement in terms of $g^{(2)}_{AFC,i}$ of more than one order of magnitude compared to the state of the art in the same system [28]. The $g^{(2)}$ value increases after the AFC storage because the stored photons are transferred to a temporal mode free of noise [28]. After the retrieval, we measure $\tau_c$ = (147 ± 7) ns, larger than the value measured before storage. We attribute this temporal stretching to spectral mismatch between the input photons before the memory (FWHM = 2.8 MHz) and the atomic frequency comb (total width 4 MHz), as evidenced in Fig. 2(a).
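The normalized cross-correlation defined above can be estimated from raw counts as in the following minimal sketch. The counts in the usage example are hypothetical and are not taken from the experiment. The error estimate assumes Poissonian statistics dominated by the coincidence counts, which is a simplification.

```python
import math


def g2_cross(n_coinc: int, n_signal: int, n_idler: int, n_trials: int):
    """Estimate g2 = P_si / (P_s * P_i) from counts in n_trials time windows.

    Returns (g2, err), where err is a Poissonian estimate that only
    accounts for the fluctuations of the coincidence counts.
    """
    p_si = n_coinc / n_trials
    p_s = n_signal / n_trials
    p_i = n_idler / n_trials
    g2 = p_si / (p_s * p_i)
    err = g2 / math.sqrt(n_coinc)
    return g2, err


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the estimator.
    g2, err = g2_cross(n_coinc=100, n_signal=1000, n_idler=1000, n_trials=100_000)
    print(f"g2 = {g2:.1f} +/- {err:.1f}")
```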
We then perform spin wave storage experiments by sending pairs of strong control pulses after the detection of each heralding photon.
The first control pulse with wave vector $\mathbf{k}_C$ transfers the collective optical excitation $|\psi_e\rangle$ to a collective spin excitation (a spin wave), which can be written as $|\psi_{sw}\rangle = \frac{1}{\sqrt{N}} \sum_{j=1}^{N} e^{i \mathbf{x}_j \cdot (\mathbf{k}_{in} - \mathbf{k}_C)} |g_1 \ldots s_j \ldots g_N\rangle$. For a storage time in the ground state of $T_s$ = 6 µs (thus a total storage time $T = T_s + \tau$ = 13.3 µs), we detect the retrieved photons, i.e. the spin-wave echo (swe), with an efficiency $\eta_{sw}$ = (3.6 ± 0.2) %. $\eta_{sw}$ is inferred with a coincidence window of $\Delta T_d$ = 320 ns, containing 80 % of the signal and including the loss in the filter crystal. Fig. 2(c) shows a magnification of the SW echo mode, normalized by the average value of the noise measured outside the peak, leading directly to the signal-to-noise ratio (SNR) of the stored and retrieved photons. We observe a maximum SNR of around 5. This curve can also be used to infer $g^{(2)}_{swe,i}$. With our filtering strategy we reach a noise floor of (2.0 ± 0.1) × 10$^{-3}$ photons per storage trial at the memory crystal (horizontal dashed line in the SW echo temporal mode). The correlation time of $\tau_c$ = (200 ± 40) ns exceeds the one after storage in the excited state. This further increase after the SW storage is attributed to the limited chirp of the control pulses (see Appendix).

V. QUANTUM CORRELATION BETWEEN SINGLE TELECOM PHOTONS AND SINGLE SPIN WAVES
To investigate the non-classical nature of the photon correlation after the SW storage, we measure $g^{(2)}_{swe,i}(\Delta T_d)$ and compare it to the unconditional autocorrelations of the idler, $g^{(2)}_{i,i}(\Delta T_d)$, and of the retrieved signal, $g^{(2)}_{swe,swe}(\Delta T_d)$. To access these quantities we correlate photon detections from different unconditional storage iterations (see Fig. 3(a)). The time between two consecutive storage trials is 190 µs. In each trial (500 per comb preparation), we keep the gate of the idler SPD open for 4.5 µs before sending the control pulses. We find $g^{(2)}_{swe,i}$ = 6.1 ± 0.7. We note that our $g^{(2)}_{swe,i}$ is limited by the spin wave read-out efficiency, which we estimate to be approximately $\eta_R$ = 24 %, while the write efficiency is $\eta_W$ = 31 % (see Appendix for a detailed discussion). With this unconditional sequence, the measured noise floor ((1.3 ± 0.1) × 10$^{-3}$ photons per storage trial) is lower than with the conditional one (section IV). We attribute this result to the fact that the number of control pulse pairs per comb is larger in the unconditional sequence, which probably contributes to a further emptying of the spin storage state. The classical bound from the Cauchy-Schwarz inequality, $g^{(2)}_{swe,i} < \sqrt{g^{(2)}_{swe,swe} \cdot g^{(2)}_{i,i}}$, is indicated in Fig. 3(a) as a horizontal line. The measured unconditional autocorrelations are (also for $\Delta T_d$ = 320 ns) $g^{(2)}_{i,i}$ = 1.32 ± 0.04 [36] and $g^{(2)}_{swe,swe}$ = 1.0 ± 0.4 (see Appendix). The Cauchy-Schwarz parameter $R = (g^{(2)}_{swe,i})^2 / (g^{(2)}_{swe,swe} \cdot g^{(2)}_{i,i})$ = 28 ± 12 exceeds the classical limit of R = 1 by more than 2 standard deviations. The confidence level for violating the Cauchy-Schwarz inequality, i.e. for observing a non-classical correlation between the telecom heralding photon and the single spin wave stored in the crystal, is 98.8 %. If a larger coincidence window is considered, $\Delta T_d$ = 1 µs, the R value is reduced to 8.3 ± 2.3 due to a bigger contribution of noise in $g^{(2)}_{swe,i}$. On the other hand, due to a reduction of the statistical error, the confidence level for the demonstration of non-classical correlation rises to 99.92 % (see Appendix).
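The numbers above can be cross-checked with a short sketch. We take the Cauchy-Schwarz parameter as $R = (g^{(2)}_{swe,i})^2 / (g^{(2)}_{swe,swe} \cdot g^{(2)}_{i,i})$ and, as an assumption on our side, evaluate the confidence level as the one-sided Gaussian probability that R exceeds 1 given its quoted standard deviation. This assumed procedure is consistent with the quoted confidence levels, but the actual analysis is the one described in the Appendix.

```python
import math


def cauchy_schwarz_r(g_si: float, g_ss: float, g_ii: float) -> float:
    """Cauchy-Schwarz parameter R = g_si^2 / (g_ss * g_ii); classical fields obey R <= 1."""
    return g_si ** 2 / (g_ss * g_ii)


def violation_confidence(r: float, sigma_r: float) -> float:
    """One-sided Gaussian confidence that the true R is above 1 (assumed model)."""
    z = (r - 1.0) / sigma_r
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    r = cauchy_schwarz_r(g_si=6.1, g_ss=1.0, g_ii=1.32)
    print(f"R from the quoted correlations: {r:.1f}")
    # Quoted (R, sigma) pairs for coincidence windows of 320 ns and 1 us.
    for r_val, sigma in ((28.0, 12.0), (8.3, 2.3)):
        conf = 100.0 * violation_confidence(r_val, sigma)
        print(f"R = {r_val} +/- {sigma}: confidence = {conf:.2f} %")
```

The example shows why the wider window gives a higher confidence despite the smaller R: the distance from the classical limit grows from about 2.3 to about 3.2 standard deviations.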
FIG. 3. (a) The $g^{(2)}_{swe,i}(\Delta T_d = 320\,\text{ns})$ value for this measurement is 6.1 ± 0.7. The classical bound given by the Cauchy-Schwarz inequality is reported as a horizontal line. The error bars are calculated considering Poissonian statistics. (b) Scheme representing the analysis of the cross-correlation measurement to evidence the multimodality. Both idler and retrieved signal integration windows are divided into smaller intervals (640 ns) and the correlation is calculated between intervals separated by different storage times. (c) Cross-correlation value between idler and retrieved signal photons detected in small temporal modes separated by different storage times. For this analysis, we also set $\Delta T_d$ = 640 ns for better statistics. The $g^{(2)}_{swe,i}(\Delta T_d = 640\,\text{ns})$ exceeds the classical threshold (represented in the color code bar for $\Delta T_d$ = 640 ns) only for windows separated by the total storage time T = 13.3 µs. Violations outside these windows are not statistically significant due to a low number of counts. (d) Cross-correlation value between the idler photons and the retrieved signal photons (full circles) and the coincidence counts in the SW echo (empty squares) as a function of the detection window for the idler photons. Both the signal detection window and the coincidence window remain constant at 4.5 µs and 320 ns, respectively. The classical threshold is also reported as a horizontal line. The integration time for this measurement is 38.5 h. The error bars are calculated considering Poissonian statistics.

The main advantage of the full AFC protocol is the possibility to store multiple distinguishable temporal modes, while maintaining their coherence and quantum correlation [30]. This ability is crucial for applications in quantum information protocols, e.g. to enable temporally multiplexed quantum repeater protocols with high communication speed [26] and storage of time-bin qubits robust against decoherence in optical fibers. To test this aspect, we perform experiments with detection gates much longer than the photon duration. We divide both the idler and the retrieved signal detection windows, $\Delta t_i$ and $\Delta t_s$, respectively, into smaller temporal modes of width 640 ns, as sketched in Fig. 3(b). For this analysis we also consider $\Delta T_d$ = 640 ns in order to have better statistics. We then check that we have non-classical correlations between modes separated by the total storage time $T = T_s + \tau$ = 13.3 µs and classical correlations between modes separated by $T \neq$ 13.3 µs, as shown in Fig. 3(c). Contrary to other temporally multimode storage protocols [11,34,40], in the full AFC protocol the total storage time is maintained for the different temporal modes. To additionally demonstrate that the multimode capacity does not imply any increase of the noise, we compute $g^{(2)}_{swe,i}(\Delta T_d = 320\,\text{ns})$ for idler detection windows $\Delta t_i$ varying from 320 ns to 4.5 µs, as shown in Fig. 3(d) (full circles) together with the coincidence counts measured in the center peak for each window size (empty squares). As expected, the latter increase with increasing $\Delta t_i$, but the $g^{(2)}_{swe,i}$ value remains constant, within the error bars, and well above the classical bound over the whole range. Defining the number of temporal modes $N_m$ as $N_m = \Delta t_i/\Delta T_d$, we confirm non-classical storage of a maximum of 7 temporal modes with $\Delta T_d$ = 640 ns, a window that contains 94 % of the coincidence peak. However, considering $\Delta T_d$ = 320 ns, which still contains 80 % of the SW echo, a 4.5 µs-wide gate can accommodate up to 14 independent temporal modes.
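The mode counting $N_m = \Delta t_i/\Delta T_d$ is simple arithmetic and can be reproduced as follows. The helper name is ours, and counting only complete windows (integer division) is our reading of how the quoted mode numbers are obtained.

```python
def n_temporal_modes(gate_ns: int, mode_ns: int) -> int:
    """Number of complete temporal modes of width mode_ns inside a gate of gate_ns."""
    return gate_ns // mode_ns


if __name__ == "__main__":
    gate_ns = 4500  # idler detection gate of 4.5 us
    for mode_ns in (640, 320):
        print(f"mode width {mode_ns} ns -> {n_temporal_modes(gate_ns, mode_ns)} modes")
```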
FIG. 4. [...] $g^{(2)}_{swe,i}$(320 ns) value as a function of the storage time. The classical threshold is reported as a horizontal line. In both panels, the error bars are calculated considering Poissonian statistics and the circled data points refer to the measurement reported in Fig. 2.

Finally, to illustrate the ability to read out the stored spin wave on demand, we perform storage experiments at different SW storage times. For these measurements, we implement a semi-conditional storage sequence in order to obtain good statistics with shorter integration time. We wait for heralding photons and, after each detection, we send 15 pairs of control pulses (each pair being separated by 190 µs from the neighboring one), which we exploit to correlate the idler and the retrieved signal (see Appendix). The measured storage and retrieval efficiencies are reported in Fig. 4(a) together with a Gaussian fit, which accounts for the inhomogeneous broadening of the spin state. As a fitting parameter we obtain the spin inhomogeneous linewidth $\gamma_{inh}$ = (20 ± 3) kHz, in good agreement with that measured in different experiments on the same crystal [18,31]. This further confirms that the photons are stored as spin waves. The second order cross-correlation function for increasing SW storage times $T_s$ is shown in panel (b) of Fig. 4. The red dashed curve is calculated by considering the Gaussian fit of the signal decay (solid curve in panel (a)), normalized by the source heralding efficiency $\eta_H$ and the average noise floor of (1.9 ± 0.2) × 10$^{-3}$ photons per trial (the average being over the different $T_s$). We measure non-classical correlations between the idler and the retrieved signal photons up to a total storage time $T = \tau + T_s$ = 32.3 µs. Note that, while we measure the Cauchy-Schwarz parameter R with unconditional measurements at T = 13.3 µs, to assess the non-classicality for longer storage times we make the assumption that the retrieved signal autocorrelation does not increase for semi-conditional measurements at different storage times. This is a conservative estimate, since the $g^{(2)}_{swe,swe}$ value is mainly determined by the noise in the read-out, which we verify to be constant over the whole range of storage times investigated and not bunched (see Appendix). Under this hypothesis, the Cauchy-Schwarz inequality is violated at T = 32.3 µs with a confidence of 94 % for $\Delta T_d$ = 320 ns.
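To give a feeling for the Gaussian decay with spin storage time, the sketch below evaluates a commonly used model for inhomogeneous spin dephasing, $\eta(T_s)/\eta(0) = \exp[-(\pi\gamma_{inh}T_s)^2/(2\ln 2)]$, with $\gamma_{inh}$ the FWHM of a Gaussian spin distribution. This functional form is an assumption on our side; the text only states that a Gaussian fit is used.

```python
import math


def spin_wave_decay(t_s: float, gamma_inh_hz: float) -> float:
    """Relative efficiency after a spin storage time t_s (s).

    Assumed model: Gaussian spin inhomogeneous broadening of FWHM gamma_inh_hz,
    giving exp(-(pi * gamma * t)^2 / (2 ln 2)).
    """
    return math.exp(-((math.pi * gamma_inh_hz * t_s) ** 2) / (2.0 * math.log(2.0)))


if __name__ == "__main__":
    gamma = 20e3  # fitted spin inhomogeneous linewidth, 20 kHz
    for t_us in (6.0, 25.0):  # T_s for total storage times of 13.3 and 32.3 us
        rel = spin_wave_decay(t_us * 1e-6, gamma)
        print(f"T_s = {t_us:.0f} us -> relative efficiency = {rel:.2f}")
```

Under this assumed model the efficiency at $T_s$ = 25 µs is reduced to roughly one sixth of its zero-delay value, which illustrates why the cross-correlation approaches the classical bound at the longest storage times.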

VI. DISCUSSION AND CONCLUSION
The demonstrated quantum correlation between a telecom photon and a spin wave in a solid is an essential resource to generate entanglement between remote solid state quantum memories [26]. The measured value of $g^{(2)}_{swe,i}$ after spin wave storage is currently limited by the signal-to-noise ratio of the retrieved photon, which is in turn mostly limited by the low storage and retrieval efficiency. This could be greatly improved by using higher optical depth [11] with higher quality combs, or crystals in impedance matched cavities [41,42]. The storage time is currently limited by the spin inhomogeneous broadening and could be increased using spin-echo and dynamical decoupling techniques [32], with the prospect of achieving values up to one minute [13] in our crystal, while even longer storage times (of the order of hours) may be available in Eu$^{3+}$:Y$_2$SiO$_5$ [14]. Finally, our experiment could be extended to the storage of entangled qubits, e.g. using time-bin encoding [31].
In conclusion, we have reported the first demonstration of quantum storage of heralded single photons in an on-demand solid state quantum memory. We have shown that the non-classical correlations between the heralding and the stored photons are maintained after the retrieval, thus demonstrating non-classical correlations between single telecom photons and single collective spin excitations in a solid. Finally, we showed that the full atomic frequency comb protocol employed allows one to store a single photon in multiple independent temporal modes. These results represent a fundamental step towards the implementation of quantum communication networks where solid state quantum memories are interfaced with the current fiber networks operating in the telecom window [26].

FIG. 5. [...] While the idler photon is measured and used as herald, the signal photon is sent through a fiber to the setup where the storage protocol is performed. DFG stands for difference frequency generation, PPLN for periodically poled lithium niobate, DM for dichroic mirror, FCav for filter cavity, BPF for band pass filter, LPF for low pass filter; QM and FC are the crystals used as quantum memory and filter crystal, respectively, which are placed inside a cryostat. SPD labels the single photon detectors.

VII. APPENDIX
In the present Appendix we provide details about the experimental setup (section VIII), the preparation and characterization of the atomic frequency comb, including an analysis of the efficiency (section IX), and the sequences employed to reconstruct the second order cross- and autocorrelation functions (section X).

VIII. SETUP
The experimental setup, shown in Fig. 5, is composed of two main parts, the quantum light source and the quantum memory.

Source
The quantum light source is based on cavity-enhanced spontaneous parametric down conversion (SPDC) [35]. A pump at 426.2 nm, whose power is maintained in the range 3-5 mW, produces widely non-degenerate photon pairs when passing through a periodically poled lithium niobate (PPLN) crystal. One photon of the pair, the idler, is at telecom wavelength (telecom E-band), namely 1436 nm, while the other, the signal, is at 606 nm. To ensure that the signal photons are resonant with the quantum memory, the length of the SPDC cavity is locked to a reference beam. To guarantee simultaneous resonance of the idler photons, we measure the light generated by DFG (difference frequency generation) of the pump and the 606 nm reference beam to derive an error signal which we feed back to the pump frequency. A mechanical chopper placed before the SPDC cavity allows us to alternate the locking and measurement periods, so that the classical reference beams do not blind the detection of the single photons, the duty cycle being about 45 %. The double resonance of widely non-degenerate signal and idler allows for an efficient suppression of the redundant modes of the cavity due to a clustering effect [35,43]. At the cavity output the measured spectrum consists of four effective modes in the main cluster (separated by the cavity free spectral range, 423 MHz) [36]. To operate in the single-mode regime, the idler photons are sent through a home-made filtering cavity (linewidth 80 MHz, free spectral range 16.8 GHz), then coupled into a single mode fiber to the single photon detector (ID230, IDQuantique). This ensures a single-mode herald. The signal photons are filtered with an etalon, which suppresses the side clusters (44.5 GHz away from the main cluster), and a bandpass filter (centered at 600 nm, linewidth 10 nm, Semrock) before being sent to the quantum memory. The second-order cross-correlation function between signal and idler photons measured before the memory for a coincidence window of $\Delta T_d$ = 320 ns is $g^{(2)}_{s,i}$ = 57 ± 1 (pump power 3.4 mW).
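The mode selection performed by the idler filtering cavity can be illustrated with a rough estimate. The finesse follows directly from the quoted free spectral range and linewidth. For the suppression of the neighboring mode we assume, as a simplification, a Lorentzian transmission line centered on the selected mode and a mode spacing equal to the quoted 423 MHz.

```python
def finesse(fsr_hz: float, linewidth_hz: float) -> float:
    """Cavity finesse as the ratio of free spectral range to linewidth (FWHM)."""
    return fsr_hz / linewidth_hz


def lorentzian_transmission(detuning_hz: float, linewidth_hz: float) -> float:
    """Relative transmission of a Lorentzian line of FWHM linewidth_hz (assumed shape)."""
    return 1.0 / (1.0 + (2.0 * detuning_hz / linewidth_hz) ** 2)


if __name__ == "__main__":
    f = finesse(fsr_hz=16.8e9, linewidth_hz=80e6)
    t_next = lorentzian_transmission(detuning_hz=423e6, linewidth_hz=80e6)
    print(f"filter cavity finesse: {f:.0f}")
    print(f"relative transmission of the adjacent mode: {t_next:.4f}")
```

Under these assumptions the adjacent SPDC mode is transmitted at slightly below the 1 % level relative to the selected one.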

Quantum Memory setup
The laser that we use for 606 nm light is a Toptica DL SHG pro, stabilized with the Pound-Drever-Hall technique to a home-made Fabry-Perot cavity in vacuum [28]. From this laser we derive the reference beam for the source and all beams necessary to prepare and operate the memory. The single photons at 606 nm generated by the source pass, with a beam waist of 45 µm, through our quantum memory, a Pr$^{3+}$:Y$_2$SiO$_5$ crystal cooled down to ∼ 2.7 K in a cryostat (closed-cycle cryocooler, Oxford Instruments). The beam for memory preparation and control pulses has a waist of 175 µm, and is sent at a small angle with respect to the photons and counter propagating. In this way we can spatially filter out some of the noise generated by the control pulses. To protect the single photon detector (SPD) we add a temporal gate composed of two AOMs, one aligned in the +1 order and the other in the -1 order, to keep the frequency fixed. The narrow-band spectral filtering is performed using a second Pr$^{3+}$:Y$_2$SiO$_5$ crystal where we optically pump a 5.5 MHz spectral hole centered at the frequency of the single photons [31]. The two Pr$^{3+}$:Y$_2$SiO$_5$ crystals used as quantum memory and spectral filter are bulk samples (Scientific Materials), with an active-ion concentration of 0.05 % and lengths of 5 mm and 3 mm, respectively. The retrieved photons then pass through a band-pass filter (centered at 600 nm, linewidth 10 nm, Semrock). We further protect the SPD with a mechanical shutter which remains closed during the whole memory and filter preparation. The storage sequences are synchronized to the cycle of the cryostat (1.4 Hz) with a home-made synchronization circuit, with the aim of reducing the effects of mechanical vibrations. To detect the retrieved single photons at 606 nm we use a Laser Components SPD with 50 % detection efficiency and 10 Hz dark-count rate. For auto-correlation measurements (section X A 2) we add a second SPD (tau-SPAD, detection efficiency 45 %, dark count rate 15 Hz, PicoQuant).
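The timing figures quoted in the main text can be tied together with a small calculation. We assume here that one comb preparation and one measurement period take place per cryostat cycle, which is our reading of the synchronization described above. With a measurement time of about 100 ms per cycle this gives the duty cycle of 0.14 quoted in the caption of Fig. 1, and 500 trials spaced by 190 µs fill 95 ms of that period.

```python
def duty_cycle(measurement_time_s: float, cycle_rate_hz: float) -> float:
    """Fraction of each cryostat cycle spent measuring (one measurement period per cycle assumed)."""
    return measurement_time_s * cycle_rate_hz


if __name__ == "__main__":
    cryostat_rate_hz = 1.4
    # Measurement time quoted in the main text.
    print(f"duty cycle for 100 ms: {duty_cycle(0.100, cryostat_rate_hz):.2f}")
    # Time occupied by 500 unconditional trials separated by 190 us.
    trial_time_s = 500 * 190e-6
    print(f"500 trials x 190 us = {trial_time_s * 1e3:.0f} ms")
    print(f"duty cycle for the trial train: {duty_cycle(trial_time_s, cryostat_rate_hz):.3f}")
```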

A. AFC preparation
To reshape the absorption profile of the memory crystal (see Fig. 1 in the main text for the energy level scheme) into an atomic frequency comb we proceed as follows: we start by preparing a wide transparency window by sending strong laser pulses (the maximum power being 21 mW) and sweeping them by 16 MHz. This has the effect of emptying the $1/2_g$ and $3/2_g$ states of a portion of atoms, creating an 18 MHz-wide transparency window. We then send 4 MHz broad burn-back pulses resonant with the transition $5/2_g - 5/2_e$ to repump atoms back into the states $1/2_g$ and $3/2_g$. Afterwards we clean the spin-storage state, the $3/2_g$, using 5 MHz broad pulses resonant with the $3/2_g - 3/2_e$ transition. We want this state to be as clean as possible, in order to reduce the noise generated by the control pulses (CPs) during the spin wave storage. At this stage, we have a 4 MHz-wide single class absorption feature resonant with the transition $1/2_g - 3/2_e$, where we can prepare the atomic frequency comb (AFC).
Finally we create the comb structure using a procedure inspired by [37]: we simulate the desired comb (an example being shown in Fig. 6(a)), deciding the shape of the peaks (square), fixing the optical depth (OD) and the background optical depth ($d_0$), and calculating the best finesse for each particular AFC storage time ($\tau$). We perform the Fourier transform of the simulated comb and cut the principal part of it (50-150 µs, depending on the storage time), as evidenced in Fig. 6(b) by the shaded rectangle, to keep the duration of the preparation within the limit imposed by the cryostat cycle. This pulse is finally renormalized for the non-linear response of the double-pass AOM. We then optically pump the $1/2_g - 3/2_e$ transition with this pulse train 950 times (1100 times for $\tau$ > 9 µs), and each time we clean the $3/2_g$ spin state with a 10 µs long pulse at the frequency of the $3/2_g - 3/2_e$ transition and chirped by 4.8 MHz.
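The choice of the best finesse can be sketched with a simple grid search. We assume here a commonly used expression for the efficiency of a comb with square peaks, $\eta = (d/F)^2 e^{-d/F}\,\mathrm{sinc}^2(\pi/F)\,e^{-d_0}$, with $d$ the peak optical depth and $F$ the finesse. Both this model and the optical depth used in the example are illustrative assumptions and are not values taken from the experiment.

```python
import math


def afc_efficiency(od: float, finesse: float, d0: float = 0.0) -> float:
    """Assumed AFC echo efficiency for square comb peaks (forward retrieval)."""
    d_eff = od / finesse               # effective optical depth of the comb
    x = math.pi / finesse
    sinc = math.sin(x) / x             # dephasing factor for square peaks
    return d_eff ** 2 * math.exp(-d_eff) * sinc ** 2 * math.exp(-d0)


def best_finesse(od: float, d0: float = 0.0,
                 f_min: float = 1.5, f_max: float = 20.0, steps: int = 1851) -> float:
    """Finesse on a uniform grid that maximizes the assumed efficiency."""
    grid = [f_min + k * (f_max - f_min) / (steps - 1) for k in range(steps)]
    return max(grid, key=lambda f: afc_efficiency(od, f, d0))


if __name__ == "__main__":
    od = 5.0  # hypothetical peak optical depth, for illustration only
    f_opt = best_finesse(od)
    print(f"optimal finesse: {f_opt:.2f}, efficiency: {afc_efficiency(od, f_opt):.3f}")
```

The optimum reflects the usual trade-off: a low finesse absorbs more light but dephases the echo, while a high finesse preserves the echo at the cost of absorption.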
In order to reduce the noise generated by the CPs during the spin wave storage, after the preparation of the comb we send 100 CPs separated by 25 µs and, afterwards, another 50 with a separation of 100 µs. If these cleaning CPs are too close to each other the atoms are coherently driven between the ground and the excited state and some might remain in the former.

FIG. 7. (a) The light violet trace is the transparency window that we burn in the filter crystal. The diamonds are signal-idler coincidence rates taken after preparing a single 800 kHz broad transparency window in the memory crystal and moving its frequency across the input photons. The error is smaller than the data points. The dotted blue line is a simulation of a Lorentzian peak with FWHM = 2.8 MHz convolved with an 800 kHz-wide spectral hole. (b) Time histograms of the input photons (dark blue trace peaked at 0 µs) and the AFC echo (red trace, including the transmitted signal). The shaded rectangles mark the areas used to measure the uncorrelated noise in the calculation of the $g^{(2)}_{AFC,i}$ value. (c) AFC storage efficiency for different storage times, $\tau$. The filled square is $\tau$ = 7.3 µs, whose coincidence histogram is shown in panel (b).
In Fig. 7 we show a typical trace of an atomic frequency comb prepared in the memory crystal and the trace of the spectral hole prepared in the filter crystal (panel (a)). The actual shape of the comb peaks is closer to Gaussian than to square, due to power broadening. Also shown is the spectrum of the heralded single photons, reconstructed from signal-idler coincidence rates taken after preparing a single 800 kHz broad transparency window in the memory crystal and moving its frequency along the input photons. The dotted blue line is a simulation of a Lorentzian peak with FWHM = 2.8 MHz convoluted with an 800 kHz-wide spectral hole.
An example of coincidence histogram measured for τ = 7.3 µs and the AFC storage efficiency as a function of the storage time τ are shown in panels (b) and (c), respectively.

B. Control Pulses
The waveform that we use to generate our control pulses (CPs) is a Gaussian (blue dashed trace in Fig. 8) with full width at half maximum FWHM = 2.4 µs.
As the AOM that we use to modulate the amplitude and frequency of the pulses has a non-linear response, the output waveform looks more squarish (gray solid trace). The FWHM, though, remains almost the same. This shape increases the efficiency of the CPs, so that we can take advantage of having shorter waveforms. This is particularly important for the semi-conditional storage (section X B), in which we have to send the CPs as fast as possible after the heralding detection. In this case using longer waveforms means having the second CP closer to the echo, thus increasing the noise floor. These CPs are frequency chirped with a hyperbolic tangent (red dotted trace in Fig. 8), which makes the transfer more effective around the central frequency [44]. The transfer efficiency of the control pulses, $\eta_T$, is measured by preparing an AFC and sending the first control pulse before the rephasing of the atoms: calling $\eta_{AFC}$ the efficiency of the AFC storage and $\eta'_{AFC}$ the efficiency of the AFC rephasing after the control pulse, we have:

$$\eta_T = 1 - \frac{\eta'_{AFC}}{\eta_{AFC}}.$$

C. Noise generated by the control pulses
We use a semi-conditional sequence (explained in more detail in section X B) to measure the noise generated by the first control pulse alone and by both of them. Each time that we detect a heralding photon we send a CP; the temporal gate is then opened at the position of the AFC echo and closed before the second CP is sent. We then open it again and measure the noise generated by both pulses in the temporal mode of the spin wave echo (Fig. 9). The noise generated by the first control pulse is higher than the noise after both CPs. Specifically, the noise after the two CPs (red trace around 18 µs) is $(2.3 \pm 0.1) \times 10^{-3}$ photons per storage trial, i.e. 86% of the noise after the first CP (red trace around 7.3 µs), measured at the position of the AFC echo. If the $3/2_g$ spin state is not well cleaned, a fraction of the remaining atoms, depending on the transfer efficiency, is promoted to the excited state by the first control pulse, possibly giving rise to incoherent fluorescence. This excess population in the excited state might then be coherently transferred back to the ground state by the second control pulse, thus decreasing its contribution to the noise.

D. Comb analysis
The efficiency of the AFC echo can be predicted by analyzing the trace of the comb (Fig. 7(c)) according to the model described in ref. [30]. Assuming Gaussian peaks, we express the AFC internal efficiency with the following equation:

$$\eta^{int}_{AFC} = \tilde{d}^{\,2}\, e^{-\tilde{d}}\, e^{-7/F^2}\, e^{-d_0},$$

where $\Delta = 1/\tau$ is the distance between the peaks of the comb, $\gamma$ is their full width at half maximum, $F = \Delta/\gamma$ is the comb finesse, $\tilde{d} = OD/F$ is the effective optical depth, and $d_0$ is the absorption background due to imperfect optical pumping. The total efficiency that we measure is reduced by a factor $\eta_{BW} \approx 70\,\%$, due to the bandwidth mismatch between photons and comb (see section III of the main text). Assuming for the comb reported in Fig. 7 a finesse $F = 3.8 \pm 0.3$, average optical depth and linewidth of the peaks $OD = 3.5 \pm 0.2$ and $\gamma = (36 \pm 3)$ kHz, respectively, and a background $d_0 = 0.27 \pm 0.1$, the expected total efficiency is $\eta_{AFC} = (11.3 \pm 1.4)\,\%$, which agrees very well with the experimentally measured $\eta^{exp}_{AFC} = (11.0 \pm 0.5)\,\%$. With the same comb parameters, the expected internal efficiency, assuming no loss due to the bandwidth mismatch, would be $\eta^{int}_{AFC} = (16 \pm 2)\,\%$. The AFC efficiency can be separated into different contributions as follows:

$$\eta_{AFC} = \eta_{loss}\, \eta_{abs}\, \eta_{reph},$$

where $\eta_{abs}$ ($\eta_{reph}$) is the absorption (rephasing) efficiency of the comb. The factor $\eta_{loss}$ accounts for the loss due to absorption in the background. The absorption efficiency in the comb can be calculated as $\eta_{abs} = (1 - e^{-\tilde{d}})\,\eta_{BW} = 43\,\%$. The rephasing efficiency of the comb is thus $\eta_{reph} = 34\,\%$.
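As a rough numerical check of the efficiency budget above, the following sketch evaluates the comb model for the quoted parameters. It assumes the Gaussian-peak expression $\eta^{int}_{AFC} = \tilde{d}^{\,2} e^{-\tilde{d}} e^{-7/F^2} e^{-d_0}$ of the model in ref. [30] and takes $\eta_{loss} = e^{-d_0}$; the function name `afc_budget` and the dictionary keys are illustrative choices, not part of the original analysis.

```python
import math

def afc_budget(od, finesse, d0, eta_bw):
    """Efficiency budget of an AFC with Gaussian peaks (assumed model form)."""
    d_eff = od / finesse                         # effective optical depth, OD / F
    eta_loss = math.exp(-d0)                     # assumed: loss in the absorbing background
    eta_abs = (1.0 - math.exp(-d_eff)) * eta_bw  # absorption, incl. bandwidth mismatch
    # Internal echo efficiency, i.e. without the bandwidth mismatch factor.
    eta_int = d_eff**2 * math.exp(-d_eff) * math.exp(-7.0 / finesse**2) * eta_loss
    eta_total = eta_int * eta_bw
    # Whatever is not absorption or background loss is attributed to rephasing.
    eta_reph = eta_total / (eta_abs * eta_loss)
    return {
        "internal": eta_int,
        "total": eta_total,
        "absorption": eta_abs,
        "loss": eta_loss,
        "rephasing": eta_reph,
    }

# Central values quoted in the text for the comb of Fig. 7.
budget = afc_budget(od=3.5, finesse=3.8, d0=0.27, eta_bw=0.70)
for name, value in budget.items():
    print(f"{name}: {100 * value:.1f} %")
```

With the central values, the sketch gives an internal efficiency of about 16 % and a total efficiency of about 11 %, while the absorption and rephasing contributions come out near 42 % and 35 %. These agree with the quoted numbers within the stated uncertainties; the small differences are expected because $\eta_{BW}$ is only given approximately.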
The spin wave efficiency can be analogously expressed as

$$\eta_{sw} = \eta_{loss}\, \eta_{abs}\, \eta_T^2\, \eta_C\, \eta_{reph},$$

assuming that the transfer efficiency $\eta_T$ is the same for both control pulses and that the spin state exhibits a Gaussian inhomogeneous broadening, leading to a decoherence effect (shown in Fig. 3(a) of the main manuscript) quantified by $\eta_C = \exp\left[-\frac{(T_s\,\Gamma_{inhom})^2\,\pi^2}{2\log(2)}\right] = 87.3\,\%$ [31,45,46]. The expected value of the spin wave efficiency is $\eta_{sw} = (5.2 \pm 0.6)\,\%$. The small mismatch with the experimentally measured value, $\eta^{exp}_{sw} = (3.6 \pm 0.2)\,\%$, can be attributed to the fact that our chosen coincidence window, $\Delta T_d$, contains only about 80 % of the echo, and that the filter crystal might have residual background absorption due to imperfect optical pumping.
Given the control pulse efficiency $\eta_T = 72.5\,\%$ (see section IX B), we define a spin wave write and read-out efficiency as $\eta_W = \eta_{abs}\,\eta_T$ and $\eta_{RO} = \eta_T\,\eta_{reph}$, respectively (thus rewriting the total spin-wave efficiency as $\eta_{sw} = \eta_{loss}\,\eta_W\,\eta_C\,\eta_{RO}$), which we estimate to be 31 % and 24 %, respectively. Taking this into account and considering that the first control pulse gives a measured noise floor of $(2.3 \pm 0.1) \times 10^{-3}$, we infer non-classical correlation between the single telecom photons and the collective spin excitations during the storage, with a $g^{(2)}_{sw,i}$ of the order of 20.
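The arithmetic behind the write, read-out and total spin wave efficiencies can be sketched as follows. The inputs are the rounded central values quoted in the text; $\eta_{loss} = e^{-d_0}$ with $d_0 = 0.27$ is an assumption of this sketch, and the function name `spin_wave_budget` is purely illustrative.

```python
import math

def spin_wave_budget(eta_loss, eta_abs, eta_reph, eta_t, eta_c):
    """Return (eta_W, eta_RO, eta_sw) from the individual contributions."""
    eta_w = eta_abs * eta_t      # write: absorption followed by the first control pulse
    eta_ro = eta_t * eta_reph    # read-out: second control pulse followed by rephasing
    eta_sw = eta_loss * eta_w * eta_c * eta_ro
    return eta_w, eta_ro, eta_sw

eta_w, eta_ro, eta_sw = spin_wave_budget(
    eta_loss=math.exp(-0.27),  # assumed form of the background loss
    eta_abs=0.43,
    eta_reph=0.34,
    eta_t=0.725,
    eta_c=0.873,
)
print(f"write {100 * eta_w:.0f} %, read-out {100 * eta_ro:.0f} %, total {100 * eta_sw:.1f} %")
```

Because the inputs are already rounded, the total comes out close to 5.1 %, consistent with the quoted $(5.2 \pm 0.6)\,\%$.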

X. STORAGE SEQUENCES
After the preparation of the comb we run different sequences for the storage of single photons. In the first one, which we call unconditional, we attempt storage without knowing in advance whether a single photon is present at the quantum memory. In the second sequence, which we call semi-conditional, we condition the first storage trial on the detection of the heralding photon (idler), followed by further unconditioned trials.

A. Unconditional sequence
In this sequence, sketched in Fig. 10, after the preparation of the memory, we perform 500 storage trials, separated by 190 µs. We choose this time, longer than the relaxation time of the excited state, in order to reduce the noise accumulated in the spin wave echo mode over the multiple storage trials. Each storage trial consists of two transfer pulses (write and read), separated by $T_s = 6$ µs. The gate of the idler detector is opened for a time $\Delta t_i = 6$ µs and it closes when the write pulse reaches its maximum intensity. For this measurement, we assemble a Hanbury-Brown Twiss setup (fiber BS in

1. Cross-correlation
The cross-correlation between the retrieved signal and idler photons is defined as $g^{(2)}_{swe,i} = \frac{P_{swe,i}}{P_{swe} P_i}$, where $P_{swe,i}$ is the coincidence probability between the idler and the retrieved signal photons, while $P_{swe}$ ($P_i$) is the retrieved signal (idler) uncorrelated probability. We reconstruct the $g^{(2)}_{swe,i}$ value as the ratio between the coincidences detected at a storage time $\tau + T_s = (7.3 + 6)$ µs $= 13.3$ µs within the same storage trial and the average of the coincidences between signal and idler photons detected in the spin wave temporal mode of the 20 neighboring storage trials (a typical result being shown in Fig. 2(b) of the manuscript). In both cases the integration window is 320 ns. We also analyze the $g^{(2)}_{swe,i}$ value, along with the spin wave efficiency, as a function of the integration window. The results are shown in Fig. 11(c).

FIG. 11. Panel (a): The darker bar indicates the coincidence counts in a detection window $\Delta t_d = 320$ ns about 0 delay in the same storage trial. Panel (b): Second order autocorrelation histogram of the idler photons measured in CW configuration. The region marked with darker bars indicates the integration window $\Delta t_d = 320$ ns. Panel (c): Second order cross-correlation between idler and retrieved signal (circles) and spin-wave echo efficiency (squares) as a function of the integration window, $\Delta t_d$.
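The estimator described in this subsection, same-trial coincidences divided by the average over neighboring trials, can be sketched as below. The counts are made-up placeholders, not measured data, and the error bar assumes independent Poissonian counting statistics added in quadrature, which is an assumption of this sketch rather than the procedure stated in the text.

```python
import math

def cross_correlation(same_trial, neighbour_trials):
    """Estimate g2 as same-trial coincidences over the mean of neighbouring trials."""
    noise_total = sum(neighbour_trials)
    noise_mean = noise_total / len(neighbour_trials)
    g2 = same_trial / noise_mean
    # Assumed Poissonian errors on both counts, combined in quadrature.
    rel_err = math.sqrt(1.0 / same_trial + 1.0 / noise_total)
    return g2, g2 * rel_err

# Hypothetical counts within the 320 ns window: one value for the heralded
# trial and one value for each of the 20 neighbouring storage trials.
same = 60
neighbours = [10] * 20
g2, err = cross_correlation(same, neighbours)
print(f"g2 = {g2:.1f} +/- {err:.1f}")
```

Averaging over many neighboring trials mainly reduces the statistical uncertainty of the denominator, so the error bar of the ratio ends up dominated by the number of same-trial coincidences.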

2. Auto-correlation
The auto-correlation of the retrieved signal is defined as $g^{(2)}_{swe,swe} = \frac{P_{swe1,swe2}}{P_{swe1} P_{swe2}}$, where $P_{swe1,swe2}$ is the probability to have coincidences between the two output ports of the signal arm, while $P_{swe1}$ ($P_{swe2}$) is the uncorrelated probability to detect a photon in output 1 (2) of the Hanbury-Brown Twiss setup. We calculate the unconditional autocorrelation as the ratio of the coincidences between the signal detectors in the same storage trial and the coincidences in the previous and following 10 storage trials. As the signal can arrive anywhere within $\Delta t_s$ thanks to the multimodality of the protocol, we build the coincidence histogram between the signal photons at the two outputs of the fiber BS considering the whole detection window. Then, to calculate the auto-correlation $g^{(2)}_{swe,swe}$, we integrate in a window of width $\Delta T_d = 320$ ns around 0 µs delay, to be consistent with the cross-correlation measurement. The result is shown in Fig. 11(a) and the value that we extract is $g^{(2)}_{swe,swe} = 1.0 \pm 0.4$. The large error bar is due to the low statistics. This value is remarkably lower than expected for a thermal state produced by an SPDC process [47]. However, we measure the second order autocorrelation of the signal photons before the memory to be $g^{(2),\mathrm{mm}}_{swe,swe}(0) = 1.18 \pm 0.02$, for 3.9 effective spectral modes [36]. From this value, we can extrapolate the autocorrelation of the input photons for the single spectral mode case, which would be $g^{(2),\mathrm{sm}}_{swe,swe}(0) = 1.70 \pm 0.03$. After the memory, the signal autocorrelation will be affected by the noise produced by the storage protocol. We quantify this contribution with the unconditional signal-to-noise ratio, i.e. the ratio between the probability to detect a signal photon after the retrieval and the noise in unconditional measurements.
We plug this value, $SNR_{unc} = (P_{swe} - P_n)/P_n = 0.030 \pm 0.003$, into the theoretical model described in [28], which assumes that the noise is not bunched, and find that after the noisy storage protocol the expected signal autocorrelation is $g^{(2),\mathrm{sws}}_{swe,swe}(0) = 1.01 \pm 0.08$. Finally, the finite integration window also contributes to decreasing the autocorrelation [36] to $g^{(2),\mathrm{th}}_{swe,swe}(320\,\mathrm{ns}) = 1.0 \pm 0.1$, which agrees with the experimental value ($g^{(2)}_{swe,swe}(320\,\mathrm{ns}) = 1.0 \pm 0.4$). This also suggests that the noise is indeed not bunched.
For the idler photons we measure the unconditional autocorrelation in a CW configuration because no storage is involved [36]. A typical measurement for a pump power of 4.3 mW is shown in Fig. 11(b), from which we calculate a value of $g^{(2)}_{i,i}(320\,\mathrm{ns}) = 1.32 \pm 0.04$.

3. Cauchy-Schwarz inequality
To confirm the quantum correlation between the idler and the stored and retrieved signal photons, we calculate the Cauchy-Schwarz parameter $R = \frac{\left(g^{(2)}_{swe,i}\right)^2}{g^{(2)}_{swe,swe} \cdot g^{(2)}_{i,i}}$, which satisfies $R \le 1$ for classical fields. Using the values reported in sections X A 1 and X A 2 we calculate $R = 28 \pm 12$, which exceeds the classical benchmark by two standard deviations. The large error bar is due to the low statistics in the second order autocorrelation measurements of the retrieved signal photons. Considering a wider integration window, e.g. $\Delta T_d = 1$ µs, the uncertainty decreases due to better statistics. The auto- and cross-correlation values also decrease due to the contribution of uncorrelated noise. Nonetheless, the Cauchy-Schwarz inequality is still violated, as summarized in Table I. Note that, as demonstrated in the calculation of the expected value in section X A 2, the biggest contribution to the signal autocorrelation is given by the noise, and that this noise is not bunched. Thus, any modification of the measurement that increases the noise (e.g. the use of the semi-conditional sequence described in section X B) would only bring the autocorrelation closer to 1, thus lowering the classical threshold.
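To illustrate how the uncertainty on $R$ follows from its three ingredients, the sketch below evaluates $R = (g^{(2)}_{swe,i})^2 / (g^{(2)}_{swe,swe}\, g^{(2)}_{i,i})$ with first-order propagation of uncorrelated errors, which is an assumption of this sketch. The two autocorrelation values are those quoted above; the cross-correlation input ($6.0 \pm 0.5$) is a hypothetical placeholder, since the unconditional value is not quoted in this section.

```python
import math

def cauchy_schwarz(g_cross, g_auto_s, g_auto_i):
    """Return (R, sigma_R); each argument is a (value, uncertainty) pair."""
    (gc, dgc), (gs, dgs), (gi, dgi) = g_cross, g_auto_s, g_auto_i
    r = gc**2 / (gs * gi)
    # Relative errors add in quadrature; the squared term counts twice.
    rel = math.sqrt((2.0 * dgc / gc)**2 + (dgs / gs)**2 + (dgi / gi)**2)
    return r, r * rel

r, sigma = cauchy_schwarz(
    g_cross=(6.0, 0.5),    # hypothetical placeholder, not a measured value
    g_auto_s=(1.0, 0.4),   # retrieved signal autocorrelation (from the text)
    g_auto_i=(1.32, 0.04), # idler autocorrelation (from the text)
)
print(f"R = {r:.0f} +/- {sigma:.0f}, {(r - 1.0) / sigma:.1f} sigma above the classical bound")
```

With these inputs the 40 % relative uncertainty of the signal autocorrelation dominates the error budget, which is consistent with the remark above that the large error bar on $R$ comes from the low statistics of that measurement.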

B. Semi-conditional sequence
The drawback of the unconditional sequence is the low count rate, resulting in extremely long integration times. This is due to the fact that the measurement duty cycle is low in this configuration (∼ 2 % with respect to the measurement period). We thus set up a semi-conditional sequence which allows us to estimate the signal-idler cross-correlation, $g^{(2)}_{swe,i}$, in considerably shorter measurement times. In this sequence, sketched in Fig. 12, after the preparation of the memory, we open the idler gate and continuously (every 80 ns) check for heralding events. Each time that we detect an idler photon we close the gate of the telecom detector, switch off the SPDC pump (for 40 µs), and start the storage cycle. This consists of two identical control pulses, the write and the read, sent with a relative delay of $T_s$. After the second pulse, the signal gate is opened for a time $\Delta t_s$ (see Fig. 12) to detect the retrieved photon. We know that the retrieved photon will arrive at a time $\tau + T_s$ after the heralding. To estimate the noise, after the retrieval we send 15 pairs of CPs (unconditional) with a period of 190 µs. In this case the photon is coupled into a single mode fiber and measured with one SPD. Analogously to the unconditional case, we calculate the cross-correlation between the retrieved signal and idler photons as $g^{(2)}_{swe,i} = \frac{P_{swe,i}}{P_{swe} P_i}$. Again we consider the ratio of the coincidences between the heralds and the retrieved photons at $\tau + T_s$ in the same storage trial, and the average of the coincidences in the following 15 storage trials. The resulting coincidence histogram is reported in Fig. 13, to be compared with the unconditional coincidence histogram shown in Fig. 3(a) of the main manuscript. The $g^{(2)}_{swe,i}$ for the semi-conditional measurement is $5.0 \pm 0.3$.
Note that under the same conditions (pump power and storage time), the semi-conditional sequence provides a slightly lower $g^{(2)}_{swe,i}$ value, mainly due to the higher noise floor, $(2.0 \pm 0.1) \times 10^{-3}$, with respect to the unconditional sequence. This sequence has been employed to measure $g^{(2)}_{swe,i}$ for different $T_s$ (the results are shown in Fig. 4 of the main text).
We also repeated the same measurement while blocking the signal photons before the quantum memory with a beam block. In this way we can measure the cross-correlation of the noise, which is $g^{(2)}_{n,i}(320\,\mathrm{ns}) = 1.1 \pm 0.3$ ($g^{(2)}_{n,i}(3\,\mathrm{µs}) = 1.0 \pm 0.1$), as shown in Fig. 14.