Quantum Phase Estimation Algorithm with Gaussian Spin States

Quantum phase estimation (QPE) is one of the most important subroutines in quantum computing. In general applications, current QPE algorithms either suffer an exponential time overload or require a set of - notoriously quite fragile - GHZ states. These limitations have prevented so far the demonstration of QPE beyond proof-of-principles. Here we propose a new QPE algorithm that scales linearly with time and is implemented with a cascade of Gaussian spin states (GSS). GSS are renownedly resilient and have been created experimentally in a variety of platforms, from hundreds of ions up to millions of cold/ultracold neutral atoms. We show that our protocol achieves a QPE sensitivity overcoming previous bounds, including those obtained with GHZ states, and is robustly resistant to several sources of noise and decoherence. Our work paves the way toward realistic quantum advantage demonstrations of the QPE, as well as applications of atomic squeezed states for quantum computation.


INTRODUCTION
Quantum phase estimation (QPE) [1][2][3] is the building block of known quantum computing algorithms providing exponential speedup [4], including the computation of the eigenvalues of Hermitian operators [5], such as molecular spectra [6,7], number factoring [8][9][10], and quantum sampling [11]. All these applications require the calculation of an eigenvalue of a unitary matrix U |u = e −iθ |u , where |u is the corresponding eigenstate, which can be cast as the estimation of an unknown phase θ ∈ [−π, π). The information about θ is encoded into one or more ancilla qubits via multiple applications of a controlled-U gate [4]. The QPE problem plays a key role also in the alignment of spatial reference frames [12] and clock synchronisation [13], with further developments in atomic clocks [14], and worldwide networks [15].
These protocols have critical drawbacks that have precluded the experimental demonstration of quantum advantage. The sequential protocol i) considers 2 k applications of the controlled-U gate that require exponentially large implementation times. This approach has been demonstrated experimentally [18] using multiple passes of a single photon through a phase shifter and it has been more recently applied to eigenvalue estimation [19] and magnetic field sensing [20]. The parallel protocol ii) scales linearly with the implementation time but requires GHZ maximally entangled states containing exponentially large, 2 k , number of particles. These states are notoriously fragile, since the loss of even a single particle completely decoheres the state [25,26] and irremediably breaks-down the quantum algorithm. GHZ states are currently implemented with up to 20 particles [27,28]. Generally speaking, the current grand challenge in quantum technologies is to go beyond proof-of-principle demonstrations toward implementations overcoming "classical" bounds. In the contest of QPE, it is evident that present shortcomings call for a radically new approach that is, both, scalable with respect to temporal resources and exploits experimentally robust, easily accessible, quantum states.
Here we develop a new QPE algorithm that scales linearly with time and is implemented with Gaussian spinsqueezed states (GSS) [29][30][31]. The circuit diagram of the quantum algorithm is shown in Fig. 1 and is discussed in details below. The iterative algorithm is broken ito multiple steps, each implemented with GSS. GSS have a Gaussian particle statistical distribution and are way more robust to noise and decoherence than GHZ states. Indeed, GSS have been experimentally demonstrated in a variety of platforms [30], with a few hundreds ions [32], several thousands degenerate gases [33,34], up to millions of cold [35][36][37][38] atoms. The shortcoming of GSS -that has prevented so far their use in QPE -is that they provide high sensitivity only in a relatively small phase interval centred around the optimal phase value θ = 0 [30,31].
state preparation phase encoding phase feedback and readout output Figure 1: Circuit diagram of quantum phase estimation algorithm described in this work. (a) The K + 1 ancilla states prepared in |0 ⊗N k (k = 0, ..., K) are first rotated by applying collective Hadamard gate H ⊗N k (white box). The green box is a squeezing gate, s k being the squeezing parameter (see text and Methods). The yellow box is a controlled-U ⊗N k gate that provides phase encoding conditioned by the eigenstate |u of the register. Pink boxes are rotation phase gates Rz(−θ j est ) implementing a classical phase feedback (pink lines) based on the estimated θ (j) est at step j < k. The readout is obtained by applying a rotation gate R ⊗N x (blue box) followed by a measurement of the eigenstates ofĴz (grey box). The output of the algorithm, after K + 1 steps, is an estimate, θest,K = K k=0 θ (k) est , of the unknown phase θ. (b) Illustration of the kth step of the algorithm (bottom) and the corresponding Husimi representation of the quantum state on the N k -qubits Bloch sphere (top) for each operation. The output signal has a sinusoidal dependence on θ, from which the estimated θ (k) est is extracted and used to phase feedback the next (k + 1) step of the algorithm.
Our algorithm overcomes this limitation by a classical adaptive phase feedback. It progressively drives the unknown phase to the optimal sensitivity points of a cascade of GSS that are increasingly squeezed. An analytical optimization of the cascade, with respect of both the number of particles and the squeezing parameter of each state, demonstrates a phase sensitivity for the estimation of any arbitrary phase θ ∈ [−π, π). This result overcomes the sensitivity obtained with other QPE algorithms proposed so far in the literature [18,23,24], including those using GHZ states [14,15,[21][22][23]. Moreover, our protocol is implemented with a single measurement of a collective spin observable at each step and the phase is extracted with a practical estimation technique.

RESULTS
In the following, we present our QPE algorithm and discuss its performance in the ideal case and in presence of decoherence. Mathematical details are reported in the Methods and in the Supplementary Information (SI).
Gaussian spin states quantum phase estimation algorithm. The circuit representation of the K + 1 steps of the algorithm is shown in Fig. 1(a), while Fig. 1(b) illustrates a single step. The phase estimation uses N T particles that can access two internal or spatial modes. The ensemble of N T qubits is divided into K + 1 spin-polarized states |0 ⊗N k , one for each step k = 0, 1, ..., K of the algorithm, with N k 1 and K k=0 N k = N T . We also introduce collective spin operatorsĴ n = N k j=1σ (j) n /2, whereσ (j) n is the Pauli matrix of particle j along the axis n = x, y, z in the Bloch sphere. Each state |0 ⊗N k first goes through a collective Hadamard gate H ⊗N k , which prepares the coherent spin state |0 +|1 √ 2 ⊗N k . Except at the zeroth step k = 0, the state then goes through a spin-squeezing gate that generates the GSS |ψ k with squeezing parameter s 2 k = 4(∆Ĵ y ) 2 k /N k (see Methods). Notice that the spinsqueezing gate creates entanglement among the N k ancilla qubits [29,39]. This concludes the state preparation. It should be noticed that the spin-squeezing gate (represented in Fig. 1 by the green box), can implemented experimentally in a variety of ways and experimental systems, see Ref. [30] for a review.
The phase encoding is obtained from a controlled-U gate applied to each qubit. The gate gives a phase shift θ to the qubit in the state |1 , while leaving |0 unchanged [3][4][5] : more explicitly, U c |1 |u = e −iθ |1 |u and U c |0 |u = e −iθ |0 |u , where |u is an eigenstate of U which is stored in the register and e −iθ is the corresponding eigenvalue. Let us write the spin-squeezed state as , and c k (µ k ) are Gaussian amplitudes (see Methods). Overall, the controlled- θ |µ k z |u and is equivalent to a collective spin rotation of the state |ψ k by the angle θ around the z axis, namely U ⊗N k c |ψ k |u = e −iθĴz |ψ k |u . The algorithm requires, in total, N T controlled-U gates.
The readout consists, first, of a collective π/2 rotation around the x axis, R ⊗N k x = e −i π 2Ĵ x , and a final projection along the z axis (namely the measurement ofĴ z ) with possible result −N k /2 ≤ µ k ≤ N k /2. The single measurement leads to the estimate θ The key operation of the algorithm is the phase feedbacks (represented by pink lines and boxes in Fig. 1) : before readout, the state e −iθĴz |ψ k is sequentially est being the estimated value at the step j < k. This rotation places the GSS |ψ k close to its optimal sensitivity point, namely on the equator of the generalized Block sphere, see Fig. 1 est , estimates the unknown θ. Notice that the circuit described above leads to an estimate of θ within the inversion region [−π/2, π/2] of the arcsin function. The algorithm can be extended to the full range θ ∈ [−π, π) by a slight modification of only the zeroth step, see Methods and SI. Below, we analyze the sensitivity of the QPE for an arbitrary θ ∈ [−π, π).
Phase sensitivity. The sensitivity in the estimation of θ, ∆ 2 θ est,K , is quantified by the statistical variance of θ est,K − θ. As shown in the Methods, this is calculated using the recursive formula for k = 0, 1, ...K and initial condition ∆ 2 θ est,0 = 4/N 0 . Equation (2) assumes s 2 k N k 1. While this is only partially justified for large values of k, it is nevertheless in excellent agreement with full numerical results. Higher order terms are explicitly calculated in the SI. The optimization of Eq. (2) over s 1 , ..., s K and N 0 , ..., N K can be performed analytically (with s 0 = 1 and fixing the total number of qubits N T ) leading to and In particular, the first step of the protocol is implemented with a GSS containing N 1 = 4N 0 particles and, at each further step, the number of particles is increased by a factor 3 : and summing the geometric series . The protocol uses GSS that are more and more squeezed (namely, with s k decreasing with k) as the number of steps increases, while s 2 k N k saturates to the asymptotic value 27/2 for k 1, see SI. The analytically-optimized sensitivity is that very rapidly (in the number of steps K) approaches the Heisenberg scaling with respect to the total number of qubits and with a prefactor that converges to α K→∞ = 4. It is also worth noticing that the sensitivity is exponential in the number of steps K ∼ log 3 N T , while with standard QPE protocols K ∼ log 2 N T [1,2,18,23]. Already for K = 3 (that uses one coherent spin state and three spin-squeezed states with decreasing s k ) we obtain a sen- , which is very close to the Heisenberg limit.
In Fig. 2(a) we show the results of numerical Monte-Carlo simulations of our QPE algorithm where θ is chosen randomly in [−π, π) with a flat distribution. For further clarity, we report the QPE pseudo-code in the Methods section. The sensitivity ∆θ est,K approaches the Heisenberg scaling Eq. (1) as a function of the number of qubits (or, equivalently, the number of steps K + 1), independently of the initial N 0 . This is confirmed by N T ×∆θ est,K approaching the constant value 4 for large N T and different initial N 0 [colored symbols in Fig. 2(a)]. Numerical  simulations agree well with the analytical optimization (colored dashed lines) results.
In Fig. 2(b) we analyze the behaviour of the estimator, θ est,K . As shown by the insets of Fig. 2(b), the distribution of θ − θ est,K is -to a very good approximation -a Gaussian cantered in 0, namely the estimator is statistically unbiased (notice that the histograms are obtained for 10 4 independent repetition of the algorithm for random θ). Furthermore, the probability of making an error in the estimation of an arbitrary θ is where erf(x) is the error function, see Fig. 2(b). The error probability is thus exponentially small with N T .
Impact of decoherence. We now investigate the robustness of the protocol against noise and decoherence in realistic experimental implementations. We include a noise source in the GSS and assume ideal phase rotations (which are typically implemented on time scales much shorter than squeezed-state preparations). We first consider a collective dephasing along an arbitrary axis n in the Bloch sphere, which is described by the transformation This provides a stochastic rotation e −iφĴn with an angle φ distributed with probability P (φ), where |ψ k is the ideal GSS. Without loss of generality (upon a further proper rotation of the state |ψ k ) we consider depolarization along the n = y axis. The noise leaves unchanged the moments ofĴ y -in particular it does not affects the squeezing along the y axis -but it decreases the length of the collective spin, Ĵ x , while increasing the bending of the state in the sphere, namely Ĵ 2 x (see SI for details). Results of numerical simulation of our QPE algorithm are shown in Fig. 3(a) and (b). For a sufficiently large number of steps, the protocol reaches the Heisenberg scaling (for K 1) where the prefactor α deph ∞ is determined by the low lying Fourier components π −π dφ cos(2 λ φ)P (φ), with λ = 0, 1, see SI. Notice that the Heisenberg scaling (8) is recovered even when the width of P (φ) is of the order of 2π, highlighting the robustness of the QPE algorithm to this source of noise. This is in contrast with QPE protocols implemented with GHZ states, where there is no preferred rotation axis that guarantees robustness to dephasing noise.
The situation is different in presence of full depolarization (within the symmetric subspace of dimension N +1), described by the transformation where 0 ≤ E ≤ 1 is the depolarization parameter. This noise affects the spin moments along all directions and, in particular, it poses a limit to the smallest achievable squeezing parameter s 2 min = EN/3, see SI. Simulation of the QPE in presence of full depolarization are shown in Fig. 3(c) and (d). Similar results holds for different noise models which pose limitations to the maximum available squeezing, e.g. particles losses (see SI). The QPE protocol in presence of noise follows the ideal (noiseless) phase estimation sensitivity up to a number of steps k k for which s k ≥ s min . In particular, if E 1 such thatk 1, proaches the analytical prediction f the number of particles (or, equiof steps) and independently of the his is confirmed by N T ⇥ ✓ est apt value 4 for large N T , while chanlored symbols in Fig. 1(a)]. Numeagree with the analytical optimizalines) at each step of the protocol. king an error ✏ in the estimation of very good approximation, or function, see Fig. 2(b). The error ponentially small with N T . test the robustness of our protocol ence. We include noise after state axis -but it decreases the length of the collective spin, ⌦Ĵ x ↵ , and increases the bending of the state in the sphere, namely ⌦Ĵ 2 x ↵ . Notice that the initial coherent spin state |N 0 , 0i, being an eigenstate ofĴ z , is una↵ected by collective dephasing along the z axis. Results of simulation of our QPE algorithm in presence of z-dephasing are shown in Fig. 3(a) and (b). For a su ciently large number of steps, the protocol reaches the Heisenberg limit scaling where the prefactor ↵ (z) 1 is fully determined by low lying Fourier components R ⇡ ⇡ d cos(2 k )P ( ), with k = 0, 1 of the noise distribution P ( ). Notice that the Heisenberg scaling (8) is recovered even when width of P ( ) is of the order of 2⇡, highlighting the robustness of the QPE algorithm to this source of noise. This is in strong contrast with QPE protocols implemented with GHZ states, where there no a preferred rotation axis that guarantees robustness to dephasing noise. Numerical Monte-Carlo simulations for the QPE of an unknown phase in [ ⇡, ⇡] are in excellent agreement with expected results (see Methods).
The situation is di↵erent in presence of full depolarization (within the symmetric subspace of dimension N +1), namely with a prefactor is fully determined by the noise parameter E. Fig. 3(c)   we recover the Heisenberg scaling (1) in a finite range of total number of particles. The fact that depolarization noise is irrelevant up to a critical value of squeezing s min and number of particles Nk ∼ 4×3kN 0 is the basic physical reason behind the expected experimental robustness of our phase estimation algorithm. When k ≥k, it is more efficient to avoid phase feedback between different steps [which would lead to a saturation of ∆θ est × N T to the asymptotic value E/3, see dashed lines in Fig. 3(c)] and rather repeat the protocol with multiple copies of the squeezed state |ψk of Nk particles and squeezing parameter sk. Asymptotically in N T we thus reaches a sensitivity

Symbols in
with a prefactor β E ∼ E 1/4 (see SI). Symbols in Fig. 3(c) show results of numerical Mote-Carlo simulations, in excellent agreement with analytical calculations. In panel (d) we show β E as a function of E where the results of simulations (circles) are compared to the expected scaling with E.
We have so far illustrated the protocol with quantum states having fixed and known total number of particles. In some experimental implementations, for instance with ultracold atoms, it is possible to fix the number of particles only in average, provided several identical preparations of the sample, with a fluctuating, unknown, number of atoms in a single realisation. Our protocol is unaffected by these fluctuations : in the inset of 2(a) we plot ∆θ est,K as a function of the total average number of particlesN T , where the protocol at the kth step is implemented with states having shot-noise particle-number fluctuations (∆N k ) 2 = N k . We also emphasize that using the non-linear readout [40][41][42] (which has been demonstrated experimentally for atomic squeezed states [43]), our QPE protocol can be made robust against detection noise. To conclude, we notice that in some applications as in atomic clocks [14,15,44], it is necessary to maximize the sensitivity by implementing each round of the protocol with the maximum possible number of particles available, namely N k = N for all k [45].

DISCUSSION
To summarize, we have proposed a novel QPE algorithm that uses Gaussian spin states and reaches a sensitivity at the Heisenberg limit O(1/N T ), with respect to the total number of qubits N T or, equivalently, the total number of applications of a controlled-U gate. There are two main differences with respect to the standard QPE algorithms [1,2] using single ancilla qubits [18,24], including those based on the inverse quantum Fourier transform [3,4,6] : i) The number of controlled-U gates in the GSS algorithm of Fig. 1 scales linearly with N T rather than exponentially. This means that, if we take into account the time t U necessary to implement a single controlled-U gate, the GSS algorithm is exponentially faster. This property is also shared by QPE algorithms using GHZ states [22,23], that are however more fragile to noise. The short time required by the algorithm makes it well suited for the estimation of time-varying signals. More explicitly, if we indicate with α = dθ(t)/dt the local slope of the signal, we obtain the condition α O(t U N 2 T ) −1 for the Heisenberg limited estimation of the time-varying θ with the GSS algorithm, compared to more stringent condition α O(t U N T e N T ) −1 for the QPE algorithms using single ancilla qubits. ii) Differently from Kitaev's QPE protocol [1,2], the phase estimation at each step of the GSS algorithm is based on a single measurement. The knowledge about θ is progressively sharpened using GSS of higher number of particles and decreasing squeezing parameter. The key operation of the algorithm is the phase feedback that can be understood as an adaptive measurement [46] able to place the GSS around its most sensitive point. In particular, the analytical optimization provided by Eqs. (3) and (4) allow to fully pre-determine the number of particle and the strength of the squeezing gate at each step. Therefore, the adaptive measurement in the GSS algorithm does not require the numerical optimization of states, operations and/or control phases [23,24], nor the support of any classical memory to store the phase distribution [18].
Commonly to all QPE algorithms, our protocol uses controlled-U gates to map a quantity of interest to a phase to be estimated. The GSS algorithm thus shares all known applications of QPE [5,7,8], while using noiseresilient quantum states. The use of GSS, which are routinely created in labs, can open a novel route for experiments with cold and ultracold atoms toward applications in quantum computing, quantum computational chemistry and quantum simulation.

METHODS
Estimation method. The phase estimation protocol of the QPE algorithm follows the standard method of moment [30]. The output state is characterized by an average collective spin moment Ĵ out z that can be expressed as a function of the mean spin value Ĵ x of the state at the end of the state-preparation step (i.e. before phase imprinting), Ĵ out z = Ĵ x sin θ. From a single measurement ofĴ out z with result µ, we estimate θ as The sensitivity of this estimator is given by the statistical variance ∆ 2 θ est = N/2 µ=−N/2 P (µ|θ) θ est (µ) −θ est 2 , whereθ est = N/2 µ=−N/2 P (µ|θ)θ est (µ) is the statistical mean value and P (µ|θ) is the probability to obtain the result µ for a given θ. The statistical variance can be well approximated by the prediction of error propagation (see SI) where we have taken into account that Ĵ z = 0 and Ĵ zĴx = 0 for the GSS considered in the manuscript.
Phase estimation protocol. We now discuss the different steps of the phase estimation protocol. Further mathematical details are reported in the SI.
Zeroth step. It is implemented with a coherent spin state of N 0 particles without requiring a spin-squeezing gate. The state is rotated by e −i(θ/2)Ĵz where the factor 2 dividing the rotation angle θ in the phase encoding transformation guarantees that the behaviour of Ĵ out z is monotonic as a function of θ in the full phase interval [−π, π). The estimation method discussed above provides the estimator θ (0) est (µ 0 ) = 2 arcsin(2µ 0 /N 0 ), depending on the measurement results µ 0 , with a sensitivity ∆ 2 θ est,0 = 4/N 0 (notice that (∆Ĵ x ) 2 = 0 for the coherent spin stata). The factor 4 in the sensitivity above the standard quantum limit is a direct consequence of the factor 2 dividing the rotation angle.
kth step. The state preparation of the kth step (k = 1, ..., K) provides the spin-squeezed state |ψ k of N k particles and squeezing parameter s k , see Eq. (13). The state is transformed by the controlled-U ⊗N k gate and a series of k rotation gates est ) gate (which uses the estimated value θ (j) est obtained in the previous steps of the protocol, j < k). The overall rotation applied to the spin-squeezed state is e −iθ kĴz , where est is a stochastic variable with distri-7 bution P (θ k ) ∼ e −θ 2 1 /(2κ 2 k−1 ) , with κ 2 k−1 = ∆ 2 θ est,k−1 . A measurement ofĴ z after a final rotation R x (π/2) provides a result µ k and a corresponding estimate θ (k) est (µ k ), according to Eq. (11). The value θ (k) est (µ k ) is used to implement the adaptive phase rotation at the (k+1)th step.
Phase sensitivity. Assuming that the protocol is stopped after K steps, The phase θ is estimated by est is the overall phase rotation at the Kth step). The corresponding sensitivity is obtained by a statistical average obtained by integrating Eq. (14) (with the replacements s, N, θ → s K , N K , θ K , and approximating tan 2 θ K ≈ θ 2 K ) over the distribution P (θ K ) of θ K . We obtain ∆ 2 θ est,K = 2s 2 K + N K (1 − e −1/(s 2 K N K ) ) 2 ∆ 2 θ est,K−1 2N K e −1/(s 2 K N K ) (15) giving Eq. (2), to the leading order in s 2 k N k 1. The recursive relation, with initial condition ∆ 2 θ est,0 = 4/N 0 provides a set of ∆ 2 θ est,k with k = 1, ..., K. The optimization of Eq. (15) over s 1 , s 2 , ..., s K and N 0 , N 1 , ..., N K is as follows. First, we minimize Eq. (15) with respect to s K , that gives This equation is also understood as a recursive relation giving the squeezing parameters s 1 , s 2 , ..., s K as a function of N 0 , N 1 , ..., N K , with initial condition s 0 = 1 (for the coherent spin state). Using the optimal values s 1 , ..., s K , Eq. (2) becomes ∆ 2 θ est,K = 3 2 Pseudo-code of the algorithm. For further clarity, we report below the pseudo-code of the algorithm. θ = generate random ∈ [−π, π) input : N 0 or N T , K for k = 0, 1 . . . , K : if k = 0 θ k = θ/2 elseif k = 0 θ k = θ − Notice that, as initial conditions, we can either fix the total number of qubits N T or the number of qubits in zeroth-step state, N 0 . In the former case, the code starts with N 0 = N T /(2 × 3 K − 1), in the latter case, it uses total N T = (2 × 3 K − 1)N 0 qubits. Given N T qubits, one can further optimize the algorithm over the number of steps K, as shown in the SI.
For larger values of N k , we approximate P (µ k |θ k ) as a Gaussian distribution centered in Ĵ out z and width (∆Ĵ out z ) 2 , whereĴ out z can be expressed as a function of the spin moments of the states after state-preparation asĴ out z =Ĵ x sin θ k +Ĵ z cos θ k . We have checked the equivalence of the two approaches for small values of N k .