Photonic quantum metrology with variational quantum optical non-linearities

Photonic quantum metrology harnesses quantum states of light, such as NOON or Twin-Fock states, to measure unknown parameters beyond classical precision limits. Current protocols suffer from two severe limitations that preclude their scalability: the exponential decrease in fidelities (or probabilities) when generating states with large photon numbers due to gate errors, and the increased sensitivity of such states to noise. Here, we develop a deterministic protocol combining quantum optical non-linearities and variational quantum algorithms that provides a substantial improvement on both fronts. First, we show how the variational protocol can generate metrologically-relevant states with a small number of operations which does not significantly depend on photon-number, resulting in exponential improvements in fidelities when gate errors are considered. Second, we show that such states offer a better robustness to noise compared to other states in the literature. Since our protocol harnesses interactions already appearing in state-of-the-art setups, such as cavity QED, we expect that it will lead to more scalable photonic quantum metrology in the near future.


I. INTRODUCTION
Quantum metrology capitalizes on quantum resources to improve measurement precision beyond classical limits [1][2][3][4][5][6][7]. Classically, the estimation error of an unknown parameter φ using N probes is bounded by the standard quantum limit (SQL) ∆φ ≥ 1/√N. However, entangled probes can offer a quadratic improvement over the SQL, reaching the so-called Heisenberg limit (HL) ∆φ = 1/N. In the photonic scenario of phase estimation [8][9][10][11], quantum states of light such as NOON [12] or Twin-Fock states (TFS), i.e., the same Fock state at each arm of a Mach-Zehnder interferometer (MZI) [13], overcome the SQL (even reaching the HL in the case of NOON states). Proof-of-principle experiments have already shown the potential of this approach, but so far they have been restricted to at most five photons [14][15][16]. The underlying reason behind such low numbers is that photonic quantum metrology suffers from two limitations: i) State-of-the-art methods to generate metrologically-relevant states involve a number of operations [17][18][19] or an interaction time [20,21] that increases with the number of photons. This ultimately yields an exponential fidelity decrease with photon number when gate errors are considered. One way of improving fidelities is to use post-selection [14,15,22-29], but at the price of probabilities that become vanishingly small as the photon number grows. ii) The resource entangled states, such as NOON ones, suffer from an increased sensitivity to noise, e.g., decoherence and photon loss in the channel, spoiling their quantum advantage even if generated accurately. Thus, innovative ideas are required to scale photonic quantum metrology protocols beyond proof-of-principle realizations.
Recently, variational quantum algorithms (VQAs) [30,31] have emerged as a tool to make the best out of current quantum hardware, which is noisy and thus can perform only a limited number of coherent operations. The key idea of these hybrid algorithms is to use a classical optimizer to find the set of parameters of a parametrized quantum circuit (PQC) implemented on the hardware such that it minimizes a given cost function. Recent works on spin systems have shown how these VQAs can also be useful in the context of quantum metrology [32][33][34][35][36][37][38][39][40], e.g., by using the quantum Fisher information (QFI) [37][38][39] as cost function. However, in the photonic context this potential of variational approaches for quantum metrology has been scarcely explored, and limited to either linear systems [41,42] or PQCs with fixed non-linearities [43].
* alberto.munoz@iff.csic.es
In this work, we combine VQAs with state-of-the-art quantum optical non-linearities to design an algorithm that overcomes the limitations of current protocols. In particular, we consider two types of PQCs (the ansätze), each formed by two coupled cavity systems but featuring different types of non-linearities: the coupling to a two-level system that appears in cavity QED and a Kerr-type one. Our method employs the QFI as cost function to find the optimal parameters that transform unentangled coherent states into states that approximately saturate the HL. Importantly, we find that the number of operations required is independent of the photon number with both types of non-linearities, which guarantees that the fidelity of the generated states will not decrease with the photon number, unlike in existing protocols. For the two ansätze, we consider the impact of noise and show that the generated states feature a larger robustness than NOON and TFS. In a second step of the VQA, we consider photon number measurements and maximize the classical Fisher information (CFI) to find the optimal measurement within that scheme. Our variational approach can be applied following two different strategies: in situ [42,44,45], i.e., optimizing the PQC directly on the quantum hardware, or in silico, i.e., simulating the PQC on a classical computer and then running the quantum hardware with the optimal parameters [43,46]. Codes to reproduce the results of this manuscript are available in [47].

II. THE ALGORITHM
Let us initially restrict ourselves to the noiseless case. Our VQA can be divided into two steps: preparation and measurement, as sketched in Fig. 1(a) (more details can be found in Appendix A). In the preparation stage, a PQC described by a unitary operator U_P(θ) is applied to two cavity modes. The initial state of each cavity is a coherent state with mean photon number |α|² = N/2 [48], such that the mean total number of photons in both arms is N [49]. The resulting state |ψ_P(θ)⟩ is then sent through an MZI consisting of a symmetric beamsplitter followed by the encoding of a phase difference φ between the two modes and by another symmetric beamsplitter, resulting in a state |ψ_E(θ, φ)⟩. To optimize the preparation of probe states, one needs to maximize the QFI F_Q, thus setting the cost function to C_P(θ, φ) = −F_Q. This quantity is then fed to the classical optimizer, which in turn updates the parameters θ. The lower bound on the estimation error of φ is given by the quantum Cramér-Rao bound (∆φ)² ≥ F_Q^{-1} [11]. The closer F_Q^{-1} is to the HL, the larger the metrological potential obtained with the PQC.
To calculate the QFI, we use the approximation introduced in Ref. [38]; see Appendix A for more details. This requires evaluating the fidelity between the states |ψ_E(θ, φ)⟩ and |ψ_E(θ, φ + δ)⟩, where δ → 0 is a small phase difference. In principle, this demands a number of measurements as well as an amount of computation growing exponentially with the system size [50]. In Appendix C we review some of the most promising techniques to measure the QFI (or, equivalently, the fidelity between two quantum states) aimed at alleviating this problem. Besides, the small size of the platform considered in our manuscript (hosting only two photonic modes, which implies a Hilbert space dimension scaling quadratically with N) will further reduce the complexity of fidelity measurements.
Since the QFI is maximized irrespective of the measurement scheme, in the second step of the protocol we assess the best way to extract the information within a specific measurement type. In particular, we first apply a unitary U_M(µ) to the output state |ψ_E(θ_opt, φ)⟩ resulting from the previous optimization and then we consider a measurement in the photon-number basis. The role of the unitary, which we label as measurement PQC, is to enable the algorithm to find the best possible combination of modes before measuring. The optimal parameters µ_opt are found by maximizing the CFI F_C, which we use as the new cost function of our algorithm, C_M(θ_opt, φ, µ) = −F_C. To compute the CFI, we construct the density matrix ρ_M(θ_opt, φ, µ) = U_M(µ) |ψ_E(θ_opt, φ)⟩⟨ψ_E(θ_opt, φ)| U_M†(µ). With this expression and its derivative ∂ρ_M(θ_opt, φ, µ)/∂φ, we calculate the CFI (see Appendix A) and optimize it. An optimal measurement should give F_C = F_Q; such hierarchy is again summarized by the quantum Cramér-Rao bound (∆φ)² ≥ F_C^{-1} ≥ F_Q^{-1}.
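As a minimal illustration of how a CFI can be evaluated from photon-number statistics, the sketch below assumes the standard definition F_C = Σ_k (∂p_k/∂φ)² / p_k, where p_k are the outcome probabilities. It is not the manuscript's measurement PQC: the toy model (N independent photons, each detected in output 1 with probability cos²(φ/2)) and all function names are assumptions made here for illustration. For this uncorrelated input the CFI should come out close to N, i.e., the SQL.

```python
import math

def binomial_counts(phi, n_photons):
    """Toy model: probability of finding k of n_photons independent photons in output 1."""
    p = math.cos(phi / 2) ** 2
    return [math.comb(n_photons, k) * p**k * (1 - p) ** (n_photons - k)
            for k in range(n_photons + 1)]

def classical_fisher_information(prob_fn, phi, eps=1e-5):
    """F_C = sum_k (d p_k / d phi)^2 / p_k, using a central finite difference."""
    p0 = prob_fn(phi)
    p_plus = prob_fn(phi + eps)
    p_minus = prob_fn(phi - eps)
    f_c = 0.0
    for pk, pp, pm in zip(p0, p_plus, p_minus):
        if pk > 1e-15:  # skip outcomes with vanishing probability
            dpk = (pp - pm) / (2 * eps)
            f_c += dpk**2 / pk
    return f_c

n_photons = 10
phi = math.pi / 3  # same working point as quoted in the manuscript
f_c = classical_fisher_information(lambda x: binomial_counts(x, n_photons), phi)
print(f_c)  # close to n_photons for this uncorrelated toy input
```

In a variational setting, the same function would be evaluated on the photon-number distribution produced after the measurement unitary, and its negative would serve as the cost.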

III. PHYSICAL PLATFORMS & ANSÄTZE
A crucial element of VQAs is the chosen PQC, since it determines the solution space that the algorithm can explore. In our case, the PQC will be defined by the two architectures represented in Fig. 1(b-e). Their common ingredient is a pair of coupled single-mode cavities described by bosonic annihilation (creation) operators a_{1,2}^{(†)}, whose coupling Hamiltonian reads H_t = J(a_2† a_1 + a_1† a_2), where J is the tunneling rate. The difference lies in the source of non-linearity.
On the one hand (see Fig. 1(b, c)), we consider a non-linearity coming from the coupling to a two-level emitter, like in cavity QED setups [44,51-53]. This platform enables encoding three different variational parameters per layer, namely, J, the emitter-cavity detuning ∆, and the coupling strength g, each associated with its own Hamiltonian term. The emitter operator σ_i = |g⟩⟨e| takes the emitter i from its excited state |e⟩ to its ground state |g⟩. In this case, the unitary describing the PQC is a product of d layers, where d is the number of layers of the PQC; each layer applies the tunneling term together with the emitter gates U_e and U_int, and T_j^{(J,∆,g)} is the physical time during which each term is applied. The variational parameters are θ, µ = {J̃_1, ∆̃_1, g̃_1, ..., J̃_d, ∆̃_d, g̃_d}, each given by J̃_j = J_j T_j^{(J)}, ∆̃_j = ∆_j T_j^{(∆)}, and g̃_j = g_j T_j^{(g)}. The number of gates is N_gates = 3d.
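The explicit Hamiltonians and layer unitary of the emitters ansatz are not legible in this text. One plausible form, written here purely as an assumption, takes a Jaynes-Cummings-type detuning and interaction term for each cavity-emitter pair and orders the three gates within a layer as tunneling, detuning, interaction. The name U_t for the tunneling gate, the ordering, and the precise form of the detuning term are all assumptions and may differ from the authors' convention.

```latex
H_{\Delta}^{(i)} = \Delta\, \sigma_i^{\dagger}\sigma_i, \qquad
H_{g}^{(i)} = g\left(a_i^{\dagger}\sigma_i + \sigma_i^{\dagger} a_i\right), \\
U(\theta) = \prod_{j=1}^{d} U_{\mathrm{int}}(\tilde g_j)\, U_{e}(\tilde\Delta_j)\, U_{t}(\tilde J_j), \\
U_{t}(\tilde J_j) = e^{-i \tilde J_j \left(a_2^{\dagger}a_1 + a_1^{\dagger}a_2\right)}, \quad
U_{e}(\tilde\Delta_j) = e^{-i \tilde\Delta_j \sum_i \sigma_i^{\dagger}\sigma_i}, \quad
U_{\mathrm{int}}(\tilde g_j) = e^{-i \tilde g_j \sum_i \left(a_i^{\dagger}\sigma_i + \sigma_i^{\dagger} a_i\right)}
```

With three gates per layer, this form is consistent with the gate count N_gates = 3d quoted in the text.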
On the other hand (see Fig. 1(d, e)), we consider the Kerr-type Hamiltonian H_Kerr^{(i)} = (U/2) a_i† a_i (a_i† a_i − 1) arising, for example, from χ^(3) non-linearities in non-linear crystals [54]. With these interactions, the ansatz of the Kerr non-linear circuit is again a product of d layers, each combining the tunneling and Kerr terms, where θ, µ = {J̃_1, Ũ_1, ..., J̃_d, Ũ_d} are the variational parameters (two per layer), each given by J̃_j = J_j T_j^{(J)} and Ũ_j = U_j T_j^{(U)}. T_j^{(J,U)} is the physical time during which each term is applied. The number of gates is N_gates = 2d. Compared to earlier works [43], we use the Kerr non-linearity as a variational parameter to check if it can provide an advantage over fixed-U ansätze. A discussion on the physical realization of tunable optical non-linearities is included in Appendix D, both for the Kerr non-linearity and the photon-emitter interaction.
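To make the structure of the Kerr ansatz concrete, the following sketch applies d layers of a tunneling gate followed by a Kerr gate to two coherent states in a truncated Fock space. It is a minimal illustration under stated assumptions, not the authors' code: the ordering of the two gates within a layer, the per-mode truncation `nmax`, and all function names are choices made here. The dimensionless parameters J̃_j and Ũ_j enter directly as gate angles.

```python
import numpy as np

def two_mode_operators(nmax):
    """Annihilation operators a1, a2 on a Fock space truncated at nmax photons per mode."""
    a = np.diag(np.sqrt(np.arange(1, nmax + 1)), k=1)
    eye = np.eye(nmax + 1)
    return np.kron(a, eye), np.kron(eye, a)

def coherent_state(alpha, nmax):
    """Truncated and renormalized single-mode coherent state."""
    amps = np.zeros(nmax + 1, dtype=complex)
    amps[0] = 1.0
    for n in range(1, nmax + 1):
        amps[n] = amps[n - 1] * alpha / np.sqrt(n)
    return amps / np.linalg.norm(amps)

def hopping_unitary(a1, a2, angle):
    """exp(-i * angle * (a2^dag a1 + a1^dag a2)) via eigendecomposition."""
    h = a2.conj().T @ a1 + a1.conj().T @ a2
    w, v = np.linalg.eigh(h)
    return (v * np.exp(-1j * angle * w)) @ v.conj().T

def kerr_phases(a1, a2, u_tilde):
    """Diagonal of exp(-i * (u_tilde / 2) * sum_i n_i (n_i - 1))."""
    n1 = np.real(np.diag(a1.conj().T @ a1))
    n2 = np.real(np.diag(a2.conj().T @ a2))
    return np.exp(-0.5j * u_tilde * (n1 * (n1 - 1) + n2 * (n2 - 1)))

def kerr_ansatz_state(thetas, n_mean, nmax):
    """Apply layers (J_tilde, U_tilde) to |alpha>|alpha>; gate ordering is an assumption."""
    a1, a2 = two_mode_operators(nmax)
    alpha = np.sqrt(n_mean / 2)  # |alpha|^2 = N/2 photons per mode
    psi = np.kron(coherent_state(alpha, nmax), coherent_state(alpha, nmax))
    for j_tilde, u_tilde in thetas:
        psi = hopping_unitary(a1, a2, j_tilde) @ psi
        psi = kerr_phases(a1, a2, u_tilde) * psi
    return psi

# Two layers with arbitrary illustrative parameters.
psi = kerr_ansatz_state([(0.3, 0.1), (-0.2, 0.05)], n_mean=4, nmax=20)
a1, a2 = two_mode_operators(20)
n_total = a1.conj().T @ a1 + a2.conj().T @ a2
print(np.vdot(psi, psi).real, np.vdot(psi, n_total @ psi).real)
```

Both gates conserve the total photon number, so the printed mean photon number stays at the input value while the photon statistics are reshaped by the non-linearity.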
Ideally, we would like the VQA to use an ansatz with as few gates as possible. The reason is that, e.g., if we assume a constant error per gate ε, the overall fidelity of state generation after performing N_gates gates will be (1 − ε)^{N_gates}.
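The consequence of this scaling can be seen with a few numbers. The sketch below compares the fidelity (1 − ε)^{N_gates} for a fixed-depth variational circuit (N_gates = 3d or 2d) with that of a protocol whose gate count grows as N_gates ∼ N. The error per gate ε = 0.01 and the unit prefactor in N_gates ∼ N are illustrative assumptions, not values taken from the manuscript.

```python
def state_fidelity(eps, n_gates):
    """Overall fidelity for a constant error eps per gate: (1 - eps)**n_gates."""
    return (1.0 - eps) ** n_gates

eps = 0.01   # assumed error per gate (illustrative value)
depth = 5    # circuit depth d used in the manuscript's figures

for n_photons in (10, 30, 50):
    f_emitters = state_fidelity(eps, 3 * depth)  # emitters ansatz: N_gates = 3d
    f_kerr = state_fidelity(eps, 2 * depth)      # Kerr ansatz: N_gates = 2d
    f_linear = state_fidelity(eps, n_photons)    # N_gates ~ N, prefactor 1 assumed
    print(n_photons, round(f_emitters, 3), round(f_kerr, 3), round(f_linear, 3))
```

The first two columns do not change with the photon number, whereas the last one decays exponentially with N.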

IV. NOISELESS RESULTS
In Fig. 2 we show the convergence of our algorithm for the QFI with respect to the number of layers d (and gates, N_gates) of the preparation PQC for the emitters (panel a) and the Kerr non-linearity (panel b) ansätze. Both panels show F_Q^{-1} as d is varied for different mean photon numbers N ranging from N = 10 to N = 50. For both ansätze, convergence is rapidly obtained with only two layers, making our protocol extraordinarily efficient in terms of circuit depth. More importantly, the value of d at which convergence is attained does not depend on N, at least for the range of N studied. This is in stark contrast with state-of-the-art protocols [18,20] in which N_gates ∼ N, and thus the fidelity decays exponentially with the number of photons.
Once we guarantee the convergence of the preparation step, we study whether optimal probe states and optimal measurements can be obtained with our protocol. Our results are shown in Fig. 3. In panel (a) we plot the estimation error (∆φ)² as a function of the mean number of photons N in the VQA protocol for a circuit depth d = 5 [55]. The values of F_Q^{-1} obtained with both ansätze are smaller than (∆φ)² for TFS with the Fock state |N/2⟩ in each arm [12]. While the optimal states produced by the Kerr ansatz saturate the HL for small N and remain very close to it as N grows, those generated by the emitters ansatz start approaching the HL at N ≳ 20. A linear fit reveals that our results follow a scaling F_Q^{-1} ∼ 1/N^β very similar to that of the HL: β = 2.0 for the emitters ansatz and β = 1.95 for the Kerr one. More details on the nature of the states prepared by our VQA, as well as [...]. In the case of the emitters ansatz, the CFI closely follows the QFI, although complete saturation is not attained. In any case, both ansätze are able to prepare almost optimal measurements. We benchmarked these results against those of the Kerr ansatz featuring a fixed value of the non-linearity variational parameter Ũ = 2π, as in Ref. [43]. In this case, F_{Q,C}^{-1} tend to the classical 1/N scaling, signaling that the tunability of the non-linearity strength is a crucial factor to obtain a metrological advantage.
To implement our protocol in an experiment, one needs to verify that the optimal values of the variational parameters are within the reach of state-of-the-art optical platforms. In particular, very strong optical non-linearities may not be physically realizable. In Appendix G, we show that for our tunable non-linearity ansätze the maximum values of g̃ and Ũ required are of the order of 1. Since T_i^{(J,∆,g,U)} are limited by the coherence time κ^{-1}, g and U must be larger than the typical decoherence rates in the systems. Such coherence times are within the reach of certain cavity-QED platforms in both the microwave and optical regimes [56][57][58][59][60]. However, in Kerr optical cavities the current record is held by polaritonic systems with only U/κ ∼ 10^{-2} [61,62], while microwave resonators reach U/κ ∼ 10^2 [63][64][65]. This limitation motivated us to study the effect of restricted Kerr non-linearities in the optimization. For that, we introduce a bound Ũ ∈ [−Ũ_bound, Ũ_bound] in the range of parameters that the optimizer can explore and study the dependence of F_Q and F_C on Ũ_bound. The results are displayed in Fig. 3(b) for fixed N = 20, showing two distinct behaviors for Ũ_bound ≲ 10^{-3} and Ũ_bound ≳ 10^{-1} with a continuous crossover in between. In the former regime, which is the one realistically achievable with state-of-the-art optical platforms, the bound prevents the optimizer from exploring the region of the Hilbert space where the minimum of the cost function lies, and the resulting values of QFI and CFI are the ones that would be obtained using coherent states at the input of the Mach-Zehnder interferometer (see Appendix H). However, above a critical value of Ũ_bound, the optimizer finds a very similar solution to that obtained using the emitters ansatz. In Appendix I, we study the dependence of this critical Ũ_bound on the mean photon number, showing that it does not depend strongly on N for the photon numbers we can explore.

V. EFFECT OF NOISE
As a last step, we extend the previous study to a more realistic situation including noise in the quantum channels. Formally, we do it by constructing the density matrix ρ_E(θ, φ) = |ψ_E(θ, φ)⟩⟨ψ_E(θ, φ)| after the MZI and letting it experience a non-unitary evolution according to the Lindblad master equation
$$\dot{\rho}_E = \kappa \sum_i \left( L_i \rho_E L_i^{\dagger} - \frac{1}{2}\left\{ L_i^{\dagger} L_i, \rho_E \right\} \right),$$
where {L_i} is the set of jump operators describing the noise channel and κ is the loss rate, which for simplicity we assume to be equal for all channels. As a further simplification, we include noise of two types only on the photonic degrees of freedom: amplitude damping (i.e., photon loss) and phase damping (i.e., decoherence) [66]. For the former, the set of jump operators is {a_1, a_2}, resulting in a decay of the photonic population in both modes. For the latter, the set of jump operators is {a_1† a_1, a_2† a_2}. This in turn preserves the diagonal elements of the density matrix (i.e., the occupation probabilities) while producing a decay in its off-diagonal elements (i.e., erasing the coherences).
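The qualitative statements about the two channels can be made quantitative with closed-form results that follow from a Lindblad equation of the standard form with a single rate κ. The sketch below is a toy illustration rather than the manuscript's pipeline: it evaluates the decay of a Fock-basis coherence under phase damping, exp[−κ̃(m − n)²/2] per mode with κ̃ = κT, and the decay of the mean photon number under amplitude damping, exp(−κ̃). Function names are hypothetical.

```python
import math

def dephasing_factor(m, n, kappa_tilde):
    """Decay of <m|rho|n> for one mode under phase damping (jump operator a^dag a)."""
    return math.exp(-0.5 * kappa_tilde * (m - n) ** 2)

def noon_coherence(n_photons, kappa_tilde):
    """Remaining |N,0><0,N| coherence when both modes dephase independently."""
    mode1 = dephasing_factor(n_photons, 0, kappa_tilde)  # mode 1: N versus 0 photons
    mode2 = dephasing_factor(0, n_photons, kappa_tilde)  # mode 2: 0 versus N photons
    return mode1 * mode2

def mean_photons_after_loss(n_mean, kappa_tilde):
    """Mean photon number of one mode under amplitude damping (jump operator a)."""
    return n_mean * math.exp(-kappa_tilde)

for kappa_tilde in (1e-3, 1e-2, 1e-1):
    print(kappa_tilde, noon_coherence(10, kappa_tilde),
          mean_photons_after_loss(10, kappa_tilde))
```

The NOON coherence decays at a rate proportional to N², which illustrates why states with widely separated photon-number components are fragile under dephasing, while populations are untouched by this channel.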
Fig. 4 shows the results of our algorithm in the presence of both noise channels for a circuit depth d = 5 and fixed photon numbers N = 10 (panel a) and N = 16 (panel b) as a function of the dimensionless noise factor κ̃ = κT_κ, where T_κ is the typical noise timescale. For N = 16 only the results for the Kerr ansatz are shown, as the calculation for the emitters one does not reach such a value of N due to the numerical overhead introduced by the emitter degrees of freedom. We benchmark the optimal values obtained with our VQA against those given by coherent states with α = √(N/2), TFS |N/2⟩ ⊗ |N/2⟩, and NOON states (|N, 0⟩ + |0, N⟩)/√2 alone, without the preparation and measurement PQCs. In general, as κ̃ increases, the values of F_Q^{-1} obtained with both the emitters and the Kerr ansätze are lifted upwards, attaining values similar to those of NOON and TFS up to the value of κ̃ where the estimation errors of these states surpass that of the coherent ones. In contrast, for larger values of κ̃ the states generated by our VQA maintain a metrological advantage over coherent states. This improvement becomes larger with growing N, as can be seen in panel (b), which features a range of κ̃ (shaded in light grey) in which the variationally computed value of F_Q^{-1} is sizeably smaller than that of TFS, which are considered noise-robust states [11,68]. This places the states generated by our protocol amongst the most noise-resilient ones. As κ̃ grows such improvement diminishes, as is expected for all generation protocols. The dark grey shaded area represents the region beyond the asymptotic bound (for large N) in the presence of dephasing noise [67]. For κ̃ ≳ 10^{-1} the results of our optimal states are close to such a bound. As for the CFI, it turns out to be more susceptible to noise, and even for small values of κ̃ it deviates significantly from the corresponding QFI, reaching the CFI of coherent states for smaller κ̃. This implies that photon number measurements are not a good choice in a noisy situation.
FIG. 4. [...] for coherent states without measurement PQC. In the light grey shaded area of panel (b) the optimal states attain values of ∆φ which are sizeably smaller than those of TFS. The dark grey shaded area is beyond the asymptotic bound for dephasing noise [67]. In both panels, we employed the circuit depth d = 5.

VI. CONCLUSIONS & OUTLOOK
Summing up, using a variational approach, we propose a method to generate metrologically-relevant photonic states which offers an exponential advantage over standard deterministic protocols when gate errors are considered.By comparing the performance of both Kerr and emitter non-linearities, we predict that the emitter ansatz will perform better in platforms with limited Kerr non-linearities.We also showed that the tunable character of the non-linearity is essential to reach a Heisenberg scaling in the estimation error.Interestingly, our method is able to find states which provide a metrological advantage in the presence of moderate values of noise beyond other noise-resilient states considered in the literature, such as Twin-Fock states.In future works, we plan to extend our algorithm beyond the two-mode scenario in order to study multi-parameter estimation [34], as well as to apply it to other relevant problems in metrology such as electric field estimation [69].

ACKNOWLEDGMENTS
The authors acknowledge support from the Proyecto Sinérgico CAM 2020 Y2020/TCS-6545 (NanoQuCo-CM), the CSIC Research Platform on Quantum Technologies PTI-001 and from Spanish projects PID2021-127968NB-I00 and TED2021-130552B-C22 funded by MCIN/AEI/10.13039/501100011033/FEDER, UE and MCIN/AEI/10.13039/501100011033, respectively. AMH acknowledges support from Fundación General CSIC's ComFuturo program, which has received funding from the European Union's Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No. 101034263. AGT also acknowledges support from a 2022 Leonardo Grant for Researchers and Cultural Creators, and BBVA Foundation. The authors also acknowledge Centro de Supercomputación de Galicia (CESGA) who provided access to the supercomputer FinisTerrae for performing numerical simulations. The authors thank Martí Perarnau-Llobet, Juan José García-Ripoll, and Geza Giedke for insightful discussions.

Appendix A
In this Section we describe in more detail the VQA proposed in our work. In general, we envision a quantum system featuring two photonic modes. Each of them initially hosts a coherent state with a mean photon number |α|² = N/2,
$$|\alpha\rangle = e^{-|\alpha|^{2}/2} \sum_{n=0}^{\infty} \frac{\alpha^{n}}{\sqrt{n!}}\, |n\rangle.$$
In practice, the infinite sum above is truncated at N, ensuring that the maximum number of photons in the system is 2N. The initial state in the Kerr ansatz (which does not include emitters) therefore is |ψ_0⟩ = |α⟩_1 ⊗ |α⟩_2, where the subscripts 1, 2 refer to the photonic mode. In the emitters ansatz, we assume that the two-level emitters are initially in their ground state |g⟩, thus giving an initial state |ψ_0⟩ = |α⟩_1 ⊗ |α⟩_2 ⊗ |g⟩_1 ⊗ |g⟩_2. The subscripts in the state of the emitters refer to the photonic mode to which they are coupled.
In the preparation stage, a parametrized quantum circuit (PQC) described by a unitary U_P(θ) is applied to the initial state. As explained in the Main Text, the applied unitary depends on the ansatz. The resulting state is |ψ_P(θ)⟩ = U_P(θ) |ψ_0⟩.
Such a state is then sent through a Mach-Zehnder interferometer (MZI) consisting of a symmetric beamsplitter (described by a unitary U_BS = exp[−i(a_2† a_1 + a_1† a_2)π/4]) followed by the encoding of a phase difference φ between the two modes (given by a unitary U_E(φ)) and another symmetric beamsplitter, resulting in a state |ψ_E(θ, φ)⟩ = U_BS U_E(φ) U_BS |ψ_P(θ)⟩. A second copy of |ψ_P(θ)⟩ is also sent through the MZI, but in this case the encoded phase is φ + δ, where δ ≪ 1 is a small parameter. This allows us to calculate the QFI later on. Without loss of generality, in all our calculations we took φ = π/3 and δ = 10^{-2}.
In order to account for noise in the preparation and encoding stages, we need to construct the density matrices ρ_E(θ, φ) = |ψ_E(θ, φ)⟩⟨ψ_E(θ, φ)| and ρ_E(θ, φ + δ) = |ψ_E(θ, φ + δ)⟩⟨ψ_E(θ, φ + δ)|, which experience a non-unitary evolution according to a Lindblad master equation. We consider two noise channels: amplitude damping and phase damping, as explained in the Main Text, each featuring a loss rate κ. However, working with density matrices is computationally expensive since it requires squaring the dimension of the Hilbert space. Therefore, in practice, when we have κ > 0 we work with vectorized density matrices following the Choi-Jamiołkowski isomorphism [70], but in the noiseless case (κ = 0) we avoid it and deal directly with state vectors.
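In the noisy case the resulting state is given by
$$\rho_E(\theta,\phi,\kappa) = e^{\mathcal{L}_{\rm pd} T_\kappa}\, e^{\mathcal{L}_{\rm ad} T_\kappa}\, \rho_E(\theta,\phi)\, e^{\mathcal{L}_{\rm ad}^{\dagger} T_\kappa}\, e^{\mathcal{L}_{\rm pd}^{\dagger} T_\kappa},$$
where T_κ is a typical noise timescale which for simplicity we assume equal for both noise channels, and L_{ad,pd} are the Lindbladian super-operators for amplitude damping and phase damping given by
$$\mathcal{L}_{\rm ad,pd}\,\rho = \kappa \sum_{L_i \in \{L^{(\rm ad,pd)}\}} \left( L_i \rho L_i^{\dagger} - \frac{1}{2}\left\{ L_i^{\dagger} L_i, \rho \right\} \right).$$
The sum above is over the set of jump operators belonging to the two noise channels, given by {L^(ad)} = {a_1, a_2} and {L^(pd)} = {a_1† a_1, a_2† a_2}, where a_i is the annihilation operator acting on the photonic mode i.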
The QFI F_Q is then calculated following the approximate formula [38]
$$F_Q = \frac{8\left[1 - F(\phi, \phi+\delta)\right]}{\delta^{2}}, \tag{A3}$$
where F(φ, φ + δ) is the fidelity between the states in which the phases φ and φ + δ were encoded. This is calculated as
$$F(\phi,\phi+\delta) = \left| \langle \psi_E(\theta,\phi) | \psi_E(\theta,\phi+\delta) \rangle \right|$$
and
$$F(\phi,\phi+\delta) = \mathrm{Tr}\sqrt{ \sqrt{\rho_E(\theta,\phi,\kappa)}\; \rho_E(\theta,\phi+\delta,\kappa)\; \sqrt{\rho_E(\theta,\phi,\kappa)} }$$
in the noiseless and in the noisy case [71], respectively. Remember that Eq. (A3) is valid in the limit δ → 0. In a realistic implementation, this procedure can be carried out by considering two different state evolutions in the MZI: i) applying opposite phase shifts ±δ/2 in the two channels of the MZI; and ii) leaving the two channels unperturbed, without applying any phase shift. The resulting states are then used to calculate the QFI according to Eq. (A3). Therefore, δ plays the role of the parameter to be estimated. While in our calculations we considered a phase shift φ = π/3, that choice is arbitrary and considering φ = 0 does not change the results, because the value of the QFI does not depend on the choice of φ [37]. We give more information on how to experimentally estimate the QFI and the fidelity in Appendix C. Also, to make sure that our results are converged, we plotted the QFI attained by several states as a function of δ (see Fig. 5). From such results it is evident that δ = 10^{-2} is a safe choice.
FIG. 5. QFI F_Q normalized in units of its converged value max{F_Q} for coherent states, NOON states, TFS, and an optimal Kerr state with mean photon number N = 44 obtained with our VQA as a function of δ.
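The fidelity-based estimate of the QFI described above can be checked on simple benchmark states. The sketch below assumes the form F_Q ≈ 8(1 − F)/δ² with F = |⟨ψ(φ)|ψ(φ + δ)⟩|, and a phase-encoding unitary exp[−iφ(a_1† a_1 − a_2† a_2)/2], consistent with opposite shifts ±δ/2 in the two arms; both are assumptions, since the exact expressions are not legible in this text. Because this generator is diagonal in the Fock basis and the final beamsplitter does not change overlaps, only the photon-number populations at the encoding stage are needed. Function names are hypothetical.

```python
import cmath
import math

def qfi_from_fidelity(populations, delta=1e-2):
    """F_Q ~ 8 (1 - F) / delta^2 with F = |<psi(phi)|psi(phi + delta)>|.

    `populations` maps (n1, n2) -> probability at the phase-encoding stage,
    where the assumed generator (n1 - n2) / 2 is diagonal.
    """
    overlap = sum(p * cmath.exp(-0.5j * delta * (n1 - n2))
                  for (n1, n2), p in populations.items())
    return 8.0 * (1.0 - abs(overlap)) / delta**2

def noon_populations(n_photons):
    """Populations of (|N,0> + |0,N>)/sqrt(2) at the encoding stage."""
    return {(n_photons, 0): 0.5, (0, n_photons): 0.5}

def coherent_populations(n_mean, nmax):
    """Two coherent states with n_mean / 2 photons each, truncated at nmax per mode."""
    m = n_mean / 2
    poisson = [math.exp(-m) * m**n / math.factorial(n) for n in range(nmax + 1)]
    return {(n1, n2): poisson[n1] * poisson[n2]
            for n1 in range(nmax + 1) for n2 in range(nmax + 1)}

n = 10
qfi_noon = qfi_from_fidelity(noon_populations(n))
qfi_coherent = qfi_from_fidelity(coherent_populations(n, 60))
print(qfi_noon)      # close to N**2 (Heisenberg scaling)
print(qfi_coherent)  # close to N (standard quantum limit)
```

With δ = 10^{-2}, the finite-δ correction for these states is well below one percent, in line with the convergence check reported in Fig. 5.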
To maximize the QFI we choose a cost function C_P(θ, φ, κ) = −F_Q. This quantity is fed to the classical optimizer, which in turn updates the parameters θ. We initialize the preparation PQC close to the identity matrix (i.e., θ finite but close to 0), and employ COBYLA [72] as the classical optimizer, since it was the one giving the best results in a reasonable convergence time. When the optimization converges, we get the output state evaluated at the optimal parameters, ρ_E(θ_opt, φ, κ) in the noisy case and |ψ_E(θ_opt, φ)⟩ in the noiseless one.
To maximize the classical Fisher information (CFI), we set a cost function C_M(θ_opt, φ, κ, µ) = −F_C. This value is fed to the classical optimizer, which in turn returns the optimal measurement parameters µ_opt. As for the preparation PQC, the measurement PQC is initialized close to the identity matrix (i.e., µ finite but close to 0), and we employed COBYLA as the classical optimizer.
As one can see, the two optimizations for the QFI and the CFI are carried out separately in our VQA.
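The optimization loop can be sketched with SciPy's COBYLA implementation. The example below is a one-parameter toy problem, not the manuscript's PQC: it assumes a hypothetical state cos θ |N,0⟩ + sin θ |0,N⟩, whose QFI for the generator (n_1 − n_2)/2 is N² sin²(2θ), and uses −F_Q as the cost. The starting point is finite but close to zero, mirroring the initialization described above; the option values are illustrative choices.

```python
import math
from scipy.optimize import minimize

N_PHOTONS = 10

def toy_qfi(theta):
    """QFI of cos(theta)|N,0> + sin(theta)|0,N> for the generator (n1 - n2) / 2."""
    return (N_PHOTONS * math.sin(2.0 * theta[0])) ** 2

def cost(theta):
    """The optimizer minimizes, so the cost is minus the QFI."""
    return -toy_qfi(theta)

theta0 = [0.05]  # finite but close to zero, where the toy state is |N,0>
result = minimize(cost, theta0, method="COBYLA",
                  options={"rhobeg": 0.1, "maxiter": 500})
theta_opt = result.x[0]
print(theta_opt, toy_qfi(result.x))
```

In this toy landscape the cost is flat at exactly θ = 0, so a small finite starting value gives the derivative-free optimizer a direction in which to move; the optimum corresponds to the NOON state.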

Appendix B: Starting from Squeezed Coherent States
In this Section we perform a calculation analogous to that shown in the Main Text but employing squeezed coherent states |α, r⟩ [48] as initial states. Their expansion in the Fock basis, Eq. (B1), involves the quantity γ = α cosh r + α* sinh r and the Hermite polynomial H_n of degree n. The initial state is therefore |ψ_0⟩ = |α, r⟩_1 ⊗ |α, r⟩_2 for the Kerr ansatz and |ψ_0⟩ = |α, r⟩_1 ⊗ |α, r⟩_2 ⊗ |g⟩_1 ⊗ |g⟩_2 for the emitters one.
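The explicit expansion labeled Eq. (B1) is not legible in this text. For reference, the standard textbook Fock-basis expansion of a displaced squeezed vacuum with a real squeezing parameter (squeezing phase set to zero) is given below. It matches the definition of γ quoted above, but taking it as the authors' Eq. (B1) is an assumption, and conventions may differ.

```latex
|\alpha, r\rangle = \frac{1}{\sqrt{\cosh r}}
\exp\!\left[-\tfrac{1}{2}|\alpha|^{2} - \tfrac{1}{2}\alpha^{*2}\tanh r\right]
\sum_{n=0}^{\infty} \frac{\left(\tfrac{1}{2}\tanh r\right)^{n/2}}{\sqrt{n!}}\,
H_{n}\!\left[\frac{\gamma}{\sqrt{\sinh 2r}}\right] |n\rangle,
\qquad \gamma = \alpha\cosh r + \alpha^{*}\sinh r
```

For α = 0 only even photon numbers survive, and the expression reduces to the usual squeezed vacuum.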
We choose α = √(N/2) as in the coherent-state case, and r (the squeezing parameter) corresponding to 10 dB of squeezing, which is within the reach of state-of-the-art optical technology [73]. In practice, the infinite sum in Eq. (B1) is truncated at N, ensuring that the maximum number of photons in the system is 2N. Using squeezed coherent states instead of coherent ones as initial states could be of help since the algorithm already starts from non-classical states, making it easier to obtain a quantum advantage. Fig. 6(a) shows the estimation error (∆φ)² obtained from maximizing the QFI and the CFI using our VQA as a function of N, and for a circuit depth d = 5. As one can see, the difference between employing squeezed or coherent states (Fig. 3(a) of the Main Text) as initial states is negligible: in both cases, the optimizer is able to find almost identical values for the QFI and the CFI. This is true for both ansätze.
To further explore whether squeezed states provide any advantage over coherent ones, we also plot in Fig. 6(b) the values of F_{Q,C}^{-1} obtained by both ansätze as a function of a bound in the Kerr non-linearity variational parameter Ũ_bound. We fix the mean number of photons at N = 20. Again, we obtain a very similar behavior to that of coherent states (shown in Fig. 3(b) of the Main Text), with a threshold located at Ũ_bound = 10^{-2}: below it, the Kerr ansatz results lie above those obtained with the emitters one, whereas above it the results of both ansätze are very similar. Therefore, optical platforms only able to reach U/κ ∼ 10^{-2} at most [61] will suffer from the same expressibility problems employing either coherent or squeezed states. Overall, the results of Fig. 6(a,b) rule out the possibility of obtaining better results using squeezed coherent states.
Finally, to better understand what happens below the threshold in the Kerr non-linearity bound, we plot in Fig. 6(c) the values of F_{Q,C}^{-1} obtained with the Kerr ansatz with a bound Ũ_bound = 10^{-4} as a function of N. Here we also plot the results for the inverse of the QFI and the CFI using squeezed coherent states with α = √(N/2) and a squeezing factor of 10 dB, i.e., removing the preparation and measurement PQCs (or equivalently setting θ = µ = 0). The results are identical in both cases, showing that the bounded Kerr ansatz is not able to surpass the results of squeezed states alone, because the small value of Ũ_bound prevents the optimizer from exploring a larger region of the Hilbert space. In any case, these results are better than those obtained with coherent states alone, as shown in Appendix H, since the former are already non-classical. Note that, while squeezed states saturate the TFS scaling at small values of N, for N ≳ 20 they deviate and start following a classical 1/N scaling.

Appendix C: Measuring the QFI
In this Section, we provide more insight into the different techniques proposed to measure the QFI. In our calculations, we compute the QFI by means of Eq. (A3). This procedure consists of evaluating the evolution of the quantum state through the MZI subjected to two different phase differences: φ and φ + δ. Such an approximation is valid in the limit of small δ. In order to obtain the QFI, one must evaluate the fidelity between the states generated by the two different evolutions. We choose to calculate the QFI this way because the alternatives (which involve either the eigendecomposition of the density matrix or evaluating the symmetric logarithmic derivative [37,74]) are computationally more expensive. Besides, in our manuscript we are dealing with a two-mode photonic system (with each of the modes coupled to a two-level emitter in the case of the emitters ansatz) in which the dimension of the total Hilbert space scales quadratically with respect to the mean number of photons present in the system N. We therefore expect measurements of the QFI (or, equivalently, the fidelity) to be simpler than in the most general case, in which the complexity scales exponentially with N. Even in that case, there exist a number of strategies aimed at mitigating the cost of brute-force full quantum state tomography to access the spectrum of eigenvalues and eigenstates of the density matrices. Here we mention the most promising ones, although the list is not complete.
FIG. 6. Estimation error (∆φ)² in the noiseless scenario starting from squeezed coherent states with α = √(N/2) and a squeezing factor of 10 dB. Panel (a) shows our results as a function of the mean number of photons N. In panel (b), we address the effect of a bound in the Kerr non-linearity strength Ũ_bound for a mean number of photons N = 20. Panel (c) compares the results of the Kerr ansatz with Ũ_bound = 10^{-4} with those obtained removing the preparation and measurement PQCs. In the three panels, blue squares/red circles/green squares correspond to the results of the emitters ansatz/Kerr ansatz/squeezed states without preparation and measurement PQCs: filled (void) markers are the inverse of the QFI (CFI) F_{Q(C)}^{-1} in the three cases. Dashed-dotted/solid/dashed lines signal the SQL/TFS/HL scaling. All calculations were made employing PQCs with depth d = 5.
1. In the case of states that are well approximated by matrix product states (MPS), Ref. [50] proposed two schemes for efficient quantum state tomography which only require a number of local observables that is linear in the system size, as well as polynomial classical post-processing of the data. Efficient quantum state tomography can also be obtained by harnessing conditional generative adversarial neural networks [75], which leads to orders of magnitude fewer iterative steps than full quantum state tomography.
2. An alternative approach to measure the fidelity between two quantum states is given by randomized measurements [76].This strategy consists of repeatedly preparing and measuring a quantum state in a randomly chosen basis.A classical computer then processes the measurement outcome to estimate the desired property.In particular, a fidelity measurement will still feature an exponential complexity on the system size N, but better than with full tomography.Randomized measurements were also employed in [74] to construct a series of polynomial lower bounds that converge to the QFI.
3. An additional application of randomized measurements that further simplifies the measurement of several properties of the quantum system is classical shadows [77].In this approach, the number of measurements to be performed is independent of the system size.In particular, it has been applied to measure the fidelity of a state preparation process [78] leading to higher fidelities with a number of operations orders of magnitude smaller than with maximum-likelihood estimation (which is an incomplete tomography method).
4. One can also capitalize on the relation that exists between the quantum Fisher information and the variance of the operator generating the parameter encoding [79][80][81] to estimate the QFI thorough a tight lower bound [82].This can be applied in large systems while requiring few operator expectation values.
5. Further approaches to measure the fidelity include the SWAP test [83] and generalizing the quantum switch employed in [84] to measure entanglement entropy.
To sum up, the two-mode nature of our system, whose Hilbert space dimension grows quadratically with the mean photon number, combined with the growing number of techniques to experimentally estimate the fidelity (and thus the QFI) makes us confident about the possibility of implementing our scheme in experimental platforms.
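As an illustration of the finite-difference evaluation of the QFI described at the start of this Section, the following minimal sketch estimates the QFI of a pure two-mode state from the fidelity between the states encoded with φ and φ + δ. It is not the code used for the results of this work, and it rests on several assumptions: Eq. (A3) is not reproduced in this Section, so we assume the standard pure-state relation F_Q ≈ 8(1 − F)/δ^2, with F the modulus of the overlap; the phase is assumed to be encoded by the generator G = (n_1 − n_2)/2 without any beamsplitter; and all function names are hypothetical. The identity F_Q = 4 Var(G), valid for pure states, is used as a cross-check.

```python
from math import lgamma

import numpy as np


def phase_shift(amps, phi):
    """Apply exp(i*phi*G), G = (n1 - n2)/2, to Fock amplitudes amps[n1, n2]."""
    n1, n2 = np.indices(amps.shape)
    return amps * np.exp(1j * phi * (n1 - n2) / 2)


def qfi_from_fidelity(amps, phi=0.3, delta=1e-3):
    """Finite-difference estimate F_Q ~ 8 * (1 - F) / delta**2 for a pure state,
    with F the modulus of the overlap between the phi and phi + delta states.
    (Assumed form; Eq. (A3) of the paper is not reproduced here.)"""
    fid = abs(np.vdot(phase_shift(amps, phi), phase_shift(amps, phi + delta)))
    return 8 * (1 - fid) / delta**2


def qfi_from_variance(amps):
    """Pure-state identity F_Q = 4 * Var(G), used here as a cross-check."""
    n1, n2 = np.indices(amps.shape)
    g = (n1 - n2) / 2
    prob = np.abs(amps) ** 2
    return 4 * (np.sum(prob * g**2) - np.sum(prob * g) ** 2)


def noon_state(n_photons):
    """(|N,0> + |0,N>) / sqrt(2)."""
    amps = np.zeros((n_photons + 1, n_photons + 1), dtype=complex)
    amps[n_photons, 0] = amps[0, n_photons] = 1 / np.sqrt(2)
    return amps


def coherent_pair(mean_total, cutoff=60):
    """|alpha>|alpha> with |alpha|^2 = mean_total / 2, truncated at `cutoff`."""
    m = mean_total / 2
    log_c = [-m / 2 + 0.5 * n * np.log(m) - 0.5 * lgamma(n + 1)
             for n in range(cutoff + 1)]
    c = np.exp(np.array(log_c)).astype(complex)
    return np.outer(c, c)


for name, state in [("NOON, N=10", noon_state(10)),
                    ("coherent pair, N=10", coherent_pair(10))]:
    print(name, qfi_from_fidelity(state), qfi_from_variance(state))
```

Under these assumptions, the two estimates should agree for each state, approaching N^2 for the NOON state (HL) and N for the pair of coherent states (SQL).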

Appendix D: Physical Implementation of Tunable Optical Non-linearities
In this Section we discuss in more detail the physical implementation of the tunable optical non-linearities required by our protocol. The operations on which the two ansätze are based are given by the trotterization of the natural evolution of quantum states under their respective system's Hamiltonian. This is an example of analog quantum computing [85,86], which is a particularly feasible way to exploit state-of-the-art quantum hardware. Nevertheless, a critical aspect of our work is the access to tunable optical non-linearities. In the case of the emitters ansatz, tunable emitter-photon interactions can be implemented by addressing an atom featuring a so-called lambda transition with Raman lasers [87,88]. For the Kerr ansatz, tunable Kerr non-linearities have been implemented in superconducting circuits working in the microwave regime [89,90]. However, it is not so obvious how to achieve a tunable Kerr non-linearity at optical frequencies, although there is a recent proposal in the few-photon regime based on the coupling of an infrared resonator to intersubband quantum well transition dipoles [91].
However, even in the worst-case scenario in which such tunability cannot be realized, one can simulate the algorithm on a classical computer and then fabricate the desired setup in which fixed non-linearities accounting for the corresponding optimal values are applied to each mode. This is an example of the in silico approach that we mentioned in the Main Text. Thus, overall, we do not expect the requirement of tunable interactions to be a bottleneck for the implementation of our proposal.

Appendix E: Results for d = 2
In this Section we provide more information on the results for a circuit depth d = 2, the convergence of the QFI as a function of the number of layers d of the preparation PQC, and the reason behind choosing d = 5 in Figs. 3-4 of the Main Text.
We start by providing the analog of Fig. 3 of the Main Text but employing a circuit depth d = 2 for both the preparation and the measurement PQCs. The results, shown in panel (a) of Fig. 7, are quite similar to those obtained with d = 5. Even though the Kerr ansatz is able to produce states reaching the HL for N ≲ 20, above this value they start deviating upwards, giving slightly larger values of F_Q^{-1}, but still very close to the HL. As for the emitters ansatz, for N ≳ 10 the states saturate the TFS scaling. A linear fit reveals that the two ansätze follow F_Q^{-1} ∼ 1/N^β with β = 1.90 for the Kerr ansatz and β = 1.93 for the emitters one. Regarding the CFI, it closely follows the results for the QFI. For the emitters ansatz, this behavior is very similar to what we found for d = 5. However, for the Kerr ansatz, the agreement was even better for d = 5. In spite of this, with d = 2 our VQA is still able to generate states featuring a large metrological advantage, reaching the QFI value of TFS with the emitters ansatz, and even beating it and approaching the HL in the case of the Kerr ansatz. This supports our claim of a highly efficient method for the generation of metrologically-relevant quantum states.

Now we further clarify why we picked d = 5 for Figs. 3-4 in the Main Text. These figures are calculated differently from Fig. 2. In the latter figure, the total mean-photon number N was fixed, and then the data for d was computed using the optimal parameters for d − 1 as the new initial parameters. In Figs. 3-4, d was fixed and the data for N was computed using the optimal parameters for N − 1 as initial parameters. This implies that, for the same value of d, with the latter method more optimizations have been carried out for values of N > d, which leads to better results than in the former case. This is visible in panels (b-c) of Fig. 7, where we show the analog of Fig. 2 of the Main Text but carrying out the optimization using the latter method (i.e., fixing d and increasing N). The inverse of the QFI F_Q^{-1} is plotted as a function of the circuit depth d for several values of the mean-photon number N, for the emitters (panel b) and the Kerr (panel c) ansätze. In the two cases, even with d = 1 the VQA gives results which are close to the converged ones presented in Fig. 2 of the Main Text. As mentioned before, this is a consequence of the larger number of optimizations carried out with this method before reaching the value of N considered. For d > 1, F_Q^{-1} oscillates, with some values of d performing better than others. After an exploration of the results obtained with different values of the circuit depth in the range d = 1-6, we concluded that PQCs with d = 5 attained slightly better results for both ansätze. This is why we employed this value in Figs. 3-4 of the Main Text.

Appendix F: Properties of the Output States of the VQA
In this Section we explore the nature of the optimal states prepared by the VQA, and how close they are to NOON states and TFS. In Fig. 8(a) we calculated the fidelity (i.e., the complex modulus of the scalar product between two quantum states) of the states generated by the emitters and Kerr ansätze in the preparation stage of the VQA (with depth d = 5), after going through the first symmetric (50/50) beamsplitter of the MZI, with respect to TFS after passing through the same symmetric beamsplitter (TFS+50/50 BS) as well as with respect to NOON states. This is the correct comparison, as TFS are defined prior to entering the first beamsplitter of the MZI, while NOON states are directly sent through the phase encoding. As one can see, as N grows, the fidelities with respect to TFS+50/50 BS rapidly decay to zero. On the other hand, the fidelities with respect to NOON states are finite even for large values of N. This is true for both ansätze. In other words, the states generated by our VQA hold some similarity with NOON states even for mean-photon numbers N ≳ 40. The fact that they share a relatively low fidelity (F ≃ 0.2-0.25 for N ≃ 40) should not be disturbing, as metrological advantage can be obtained with a variety of different states. Besides, our VQA does not employ the fidelity with respect to a target state as cost function, but rather aims to maximize the QFI regardless of the particular state obtained.
In order to further dive into the nature of the optimal states produced by our VQA, an interesting benchmark is to address whether there is entanglement between the two photonic modes (similarly to what happens in a NOON state) or if both of them are uncorrelated (like in TFS). To explore this, we calculated the von Neumann entropy of entanglement (shown in Fig. 9(a)) as well as the purity of the reduced density matrix (see Fig. 9(b)) of the first photonic mode.
We start from the density matrix describing the two photonic modes of our system, ρ_phot. In the case of the emitters ansatz, this density matrix can be obtained by tracing out the emitters' degrees of freedom, i.e., ρ_phot = Tr_emit{ρ}, where ρ is the density matrix describing the total system of photons and emitters. At this point, and without loss of generality, we can trace out the second photonic mode, obtaining ρ_1 = Tr_2{ρ_phot}. The entropy of entanglement is given by [92]
$$S(\rho_1) = -\mathrm{Tr}\left(\rho_1 \log_2 \rho_1\right),$$
or equivalently by
$$S(\rho_1) = -\sum_{i=1}^{M} \lambda_i \log_2 \lambda_i,$$
where {λ_i} is the set of M eigenvalues of ρ_1. The results obtained with the states generated by the emitters and the Kerr ansätze are shown in Fig. 9(a) (again, taking such states after they have gone through the first 50/50 beamsplitter of the MZI). These are benchmarked against the values of the entropy of entanglement for TFS (which are separable and thus give S(ρ_1) = 0), NOON states (in which only two states are entangled, and thus S(ρ_1) = log_2(2) = 1), TFS going through a symmetric (50/50) beamsplitter (for which all possible combinations featuring an even number of photons are entangled), and the maximum possible value S(ρ_1) = log_2(N + 1) for N + 1 available quantum states. As the mean number of photons N increases, the values of S(ρ_1) obtained with the two ansätze increase following a logarithmic law, surpassing the entropy of entanglement of NOON states already at N = 4, although their values lie below those obtained by TFS passing through a symmetric beamsplitter. This indicates that both ansätze are generating entanglement between the two photonic modes. Therefore, tunneling between the two modes is a crucial resource for the ansätze. This also confirms that the optimal states of our formalism do not resemble TFS.
A similar metric addressing the entanglement between the two photonic modes is the purity of the first photonic mode, i.e.,
$$P(\rho_1) = \mathrm{Tr}\left(\rho_1^2\right).$$
Since a pure state always satisfies P = 1, if there is no entanglement between the two photonic modes we should obtain P(ρ_1) = 1. On the contrary, this value will be lower than 1 if entanglement is present. The results are shown in Fig. 9(b), and they go along the lines of Fig. 9(a): the purity of the states generated by the two ansätze (as before, taking such states after they have gone through the first 50/50 beamsplitter of the MZI) rapidly diminishes as N increases. The obtained values of P(ρ_1) are below 1, signaling that the resulting ρ_1 is a mixed state for both ansätze and that the two photonic modes are entangled. These results are benchmarked against the values of P(ρ_1) resulting from NOON states, TFS, and TFS going through a symmetric beamsplitter. As one can expect from Fig. 9(a), the optimal states found with our VQA provide values of the purity between those of NOON states and TFS passing through the beamsplitter.
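The two metrics above, the entropy of entanglement S(ρ_1) and the purity P(ρ_1) = Tr(ρ_1^2), can be evaluated for the reference states quoted in the text with a few lines of NumPy. The sketch below is only illustrative and is not the code used for Fig. 9: it assumes a pure two-mode state stored as a matrix of Fock amplitudes, and the function names are hypothetical. The beamsplitter and the variational states themselves are not modeled.

```python
import numpy as np


def reduced_density_matrix(amps):
    """rho_1 = Tr_2 |psi><psi| for a pure two-mode state with amplitudes amps[n1, n2]."""
    return amps @ amps.conj().T


def entanglement_entropy(rho1, tol=1e-12):
    """S(rho_1) = -sum_i lambda_i log2(lambda_i), skipping vanishing eigenvalues."""
    lam = np.linalg.eigvalsh(rho1)
    lam = lam[lam > tol]
    return float(-np.sum(lam * np.log2(lam)))


def purity(rho1):
    """P(rho_1) = Tr(rho_1^2)."""
    return float(np.real(np.trace(rho1 @ rho1)))


def fock_superposition(n_total, occupations):
    """Equal-weight superposition of |n1, n_total - n1> over the given n1 values."""
    amps = np.zeros((n_total + 1, n_total + 1), dtype=complex)
    for n1 in occupations:
        amps[n1, n_total - n1] = 1 / np.sqrt(len(occupations))
    return amps


N = 4
states = {
    "NOON": fock_superposition(N, [0, N]),
    "TFS": fock_superposition(N, [N // 2]),
    "maximally entangled": fock_superposition(N, list(range(N + 1))),
}
for name, amps in states.items():
    rho1 = reduced_density_matrix(amps)
    print(name, entanglement_entropy(rho1), purity(rho1))
```

For these reference states the sketch reproduces the benchmarks quoted above: S = 1 for the NOON state, S = 0 and P = 1 for the TFS, and S = log_2(N + 1) for an equal-weight superposition of all N + 1 configurations.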
Overall, the results presented in this Section confirm that the two ansätze generate states that are different from NOON states and TFS, and also different from each other.
Finally, in Fig. 10 we plotted the diagonal terms of the reduced density matrix of the first photonic mode, ρ_1^{(n,n)}, in the Fock state basis |n⟩, where n is the number of photons in that mode, after going through the preparation (a, c) and measurement (b, d) PQCs. We fixed the mean number of photons at N = 20. Panels (a, b) show the results for the emitters ansatz, while panels (c, d) display those obtained with the Kerr one. As one can see, the optimal probe states given by the preparation stage are already different from coherent states, which would feature a Poisson distribution. Moreover, the measurement PQC further modifies the shape of the states.
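For reference, the Poisson distribution that a coherent state would display in such a plot can be generated as follows. This is a minimal sketch, not taken from the original calculations: it assumes a single mode in a coherent state with mean photon number |α|^2 = N/2 = 10, and the function name is hypothetical.

```python
from math import lgamma

import numpy as np


def coherent_diagonal(mean_photons, n_max):
    """Diagonal elements rho^(n,n) = exp(-m) m^n / n! of a coherent state with mean m."""
    n = np.arange(n_max + 1)
    log_fact = np.array([lgamma(k + 1) for k in range(n_max + 1)])
    log_p = -mean_photons + n * np.log(mean_photons) - log_fact
    return n, np.exp(log_p)


# Assumed reference: one mode of the input pair with |alpha|^2 = N/2 = 10.
n, p = coherent_diagonal(mean_photons=10.0, n_max=60)
mean = np.sum(n * p)
variance = np.sum(n**2 * p) - mean**2
print(p.sum(), mean, variance)
```

The equality of mean and variance is the signature of Poissonian statistics against which the distributions of Fig. 10 can be compared.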
Regarding the effect of noise, we have checked that the states prepared by our VQA still feature similar properties for finite values of κ.

Appendix G: Optimal Parameters
In this Section we examine the optimal parameters θ_opt and µ_opt obtained by the classical optimizer in the noiseless case (κ = 0), for preparation and measurement PQCs with depth d = 5, and for N = 20 photons. The aim is to make sure that the parameters maximizing the QFI and the CFI can be attained in real experimental platforms. This is especially concerning for the variational parameters g and Ũ, which encode the interaction strength in the emitters and Kerr ansätze, respectively, and are of the order of g/κ and U/κ. Fig. 11(a,b,c) shows the values of J, ∆, and g obtained with the emitters ansatz. As one can see, the maximum value of the emitter-photon coupling required is |g|/κ ≃ 5. This can be realized in state-of-the-art cavity-QED experiments [56][57][58][59][60].
Regarding the Kerr ansatz, we have to distinguish between the behavior below and above the expressibility threshold shown in Fig. 3(b) of the Main Text. Therefore, we compare the results for a Kerr non-linearity bound Ũbound = 10^{-4} (shown in Fig. 11(d,e)) with those for a boundless non-linearity (shown in Fig. 11(f,g)). As we can see, in the first case the optimal values of |Ũ| given by the optimizer saturate |Ũbound| for all layers, signaling that the Kerr non-linearity bound forces the optimizer to stay in a restricted region of the Hilbert space where it cannot find the global minimum of the cost function, thereby limiting the expressibility of the ansatz.
However, once we eliminate the Kerr non-linearity bound, we see in Fig. 11(g) that the optimal parameters contain values of U/κ ∼ 1. In a real experiment, this would require microwave platforms, which are capable of reaching U/κ ∼ 10^2 [63][64][65]. However, going above the expressibility threshold may also be within the reach of systems working in the optical regime, as the current limit of U/κ ∼ 10^{-2} [61] lies just in the middle of the Ũbound threshold.
to the mean number of photons N. To do so, in Fig. 13 we plot the equivalent of Fig. 3

FIG. 1. (a) Overview of the variational optimization protocol. Two identical coherent states |α⟩ are the input of a variational quantum algorithm (VQA) aimed at finding the optimal state for the estimation of a phase φ. This consists of a quantum part including a parametrized quantum circuit (PQC) described by a unitary U_P, a Mach-Zehnder interferometer (MZI) encoding the phase difference between its two arms, and (optionally) a non-unitary evolution accounting for noise with decay rate κ described by a Lindbladian L acting on the density matrix of the system ρ_E. The classical part of the VQA is an optimizer that changes the parameters θ of the PQC in search of the minimum of the cost function C_P. The resulting optimal state |ψ_E⟩ (ρ_E in the noisy case) evaluated at the optimal parameters θ_opt is the input of the second part of the VQA, which employs the PQC to prepare optimal measurements as well as a classical optimizer that aims at minimizing the cost function C_M by varying the parameters µ of the second PQC. (b) A single layer of the emitters ansatz, consisting of a tunneling unitary U_t depending on the tunneling amplitude J, a detuning unitary U_e for each mode depending on the cavity-emitters detuning ∆, and an interaction unitary U_int for each mode depending on the light-matter coupling strength g. The upper and lower modes of the scheme correspond to the emitters, while the two central ones are the photonic modes. (c) A possible implementation of the emitters ansatz in a cavity-QED setup. (d) A single layer of the Kerr ansatz, consisting of a tunneling unitary U_t depending on the tunneling amplitude J and a Kerr unitary U_Kerr for each mode depending on the non-linearity strength U. (e) A possible implementation of the Kerr ansatz in a photonic setup.

FIG. 2. Scaling of the inverse of the QFI F_Q^{-1} as a function of the number of layers d (and gates, N_gates) of the PQC, for different values of the mean number of photons N. (a) Emitters ansatz. (b) Kerr ansatz.

FIG. 3. Estimation error (∆φ)^2 in the noiseless scenario. Panel (a) shows our results as a function of the mean number of photons N. In panel (b), we address the effect of a bound in the Kerr non-linearity strength Ũbound for a mean number of photons N = 20. In both panels, blue squares, red circles, and green triangles correspond to the results of the emitters ansatz, the Kerr ansatz with unrestricted Ũ, and the Kerr ansatz with fixed Ũ = 2π, respectively: filled (void) markers are the inverse of the QFI (CFI) F_{Q(C)}^{-1} in the two cases. Dashed-dotted/solid/dashed lines signal the SQL/TFS/HL scaling. In both panels, we employed a circuit depth d = 5.

FIG. 4. Estimation error (∆φ)^2 as a function of the dimensionless noise parameter κ for mean photon numbers N = 10 (panel a) and N = 16 (panel b). Blue squares (red circles) correspond to the results of the emitters (Kerr) ansatz: filled (void) markers are the inverse of the quantum (classical) Fisher information F_{Q(C)}^{-1} in the two cases. Dashed-dotted/solid/dashed lines signal the values of F_Q^{-1} obtained for coherent states/TFS/NOON states without preparation PQC. The dotted line corresponds to F_C^{-1} for coherent states without measurement PQC. In the light grey shaded area of panel (b) the optimal states attain values of ∆φ which are sizeably smaller than those of TFS. The dark grey shaded area is beyond the asymptotic bound for dephasing noise [67]. In both panels, we employed a circuit depth d = 5.
Appendix A: Description of the Variational Quantum Algorithm Appendix E: Results for d = 2 Appendix F: Properties of the Output States of the VQA

FIG. 7. Results obtained by employing preparation and measurement PQCs with depth d = 2. (a) Estimation error (∆φ)^2 in the noiseless scenario as a function of the mean number of photons N. Blue squares (red circles) correspond to the results of the emitters (Kerr) ansatz: filled (void) markers are the inverse of the QFI (CFI) F_{Q(C)}^{-1} in the two cases. Dashed-dotted/solid/dashed lines signal the SQL/TFS/HL scaling. (b-c) Scaling of the inverse of the QFI F_Q^{-1} as a function of the number of layers d of the PQC, for different values of N. Contrary to Fig. 2 of the Main Text, these results are calculated by fixing d and employing the optimal parameters for N − 1 as the initial parameters for N. (b) Emitters ansatz. (c) Kerr ansatz.

FIG. 8. Fidelity F of the optimal states generated by the preparation PQC after going through a 50/50 beamsplitter with respect to NOON states and TFS after passing through a 50/50 beamsplitter (TFS+50/50 BS), for both the emitters and the Kerr ansätze, as a function of the mean number of photons N. All calculations were made employing PQCs with depth d = 5.

FIG. 9. (a) Von Neumann entropy of entanglement of the reduced density matrix of the first photonic mode S(ρ_1) as a function of the mean number of photons N. The dashed-dotted line shows the maximum entropy log_2(N + 1) for a mode with N + 1 possible states. (b) Purity of the reduced density matrix of the first photonic mode P(ρ_1) as a function of the mean number of photons N. In both panels, blue squares (red circles) are the results of the emitters (Kerr) ansatz after passing through the first symmetric (50/50) beamsplitter of the MZI. The solid (dashed) lines represent the values for NOON states (TFS), while the dotted lines are obtained with TFS sent through a symmetric beamsplitter. All calculations were made employing PQCs with depth d = 5.

FIG. 10. Diagonal elements of the reduced density matrix of the first photonic mode ρ_1^{(n,n)} as a function of the number of photons in that mode n. The total number of photons is fixed at N = 20. (a) Optimal state maximizing the QFI prepared using the emitters ansatz. (b) Optimal state maximizing the CFI prepared using the emitters ansatz. (c) Optimal state maximizing the QFI prepared using the Kerr ansatz. (d) Optimal state maximizing the CFI prepared using the Kerr ansatz. All calculations were made employing PQCs with depth d = 5.

FIG. 12. Estimation error (∆φ)^2 as a function of the mean number of photons N for the Kerr ansatz with Ũbound = 10^{-4} and a circuit depth d = 5, as well as for two input coherent states with mean photon number |α|^2 = N/2 without the preparation and measurement PQCs.
(b) of the Main Text for several values of N. As one can see, the crossover between the two regimes takes place around Ũbound = 10^{-2} independently of N for both the QFI (panel a) and the CFI (panel b). A circuit depth d = 5 was employed throughout all these calculations.