Improved Quantum Magnetometry beyond the Standard Quantum Limit

Under ideal conditions, quantum metrology promises a precision gain over classical techniques scaling quadratically with the number of probe particles. At the same time, no-go results have shown that generic, uncorrelated noise limits the quantum advantage to a constant factor. In frequency estimation scenarios, however, there are exceptions to this rule and, in particular, it has been found that transversal dephasing does allow for a scaling quantum advantage. Yet, it has remained unclear whether such exemptions can be exploited in practical scenarios. Here, we argue that the transversal-noise model applies to the setting of recent magnetometry experiments and show that a scaling advantage can be maintained with one-axis-twisted spin-squeezed states and Ramsey-interferometry-like measurements. This is achieved by exploiting the geometry of the setup that, as we demonstrate, has a strong influence on the achievable quantum enhancement for experimentally feasible parameter settings. When, in addition to the dominant transversal noise, other sources of decoherence are present, the quantum advantage is asymptotically bounded by a constant, but this constant may be significantly improved by exploring the geometry.


INTRODUCTION
High-precision parameter estimation is fundamental throughout science. Quite generally, a number of probe particles are prepared, then subjected to an evolution which depends on the quantity of interest, and finally measured. From the measurement results an estimate is then extracted. When the particles are classically correlated and non-interacting, as a consequence of the central limit theorem, the mean squared error of the estimate decreases as 1/N, where N is the number of particles (probe size). This best scaling achievable with a classical probe is known as the standard quantum limit (SQL). Quantum metrology aims to improve estimation by exploiting quantum correlations in the probe.
In an ideal setting without noise, it is well known that quantum resources allow for a quadratic improvement in precision over the SQL [1], i.e. the mean squared error of the estimate after a sufficient number of experimental repetitions can scale as $1/N^2$, yielding the so-called Heisenberg limit. Realistic evolution, however, always involves noise of some form, and although quantum metrology has been demonstrated experimentally e.g. for atomic magnetometry [2][3][4][5][6], spectroscopy [7,8] and clocks [9,10], there is currently much effort to determine exactly when, and by how much, quantum resources allow estimation to be improved in the presence of decoherence [11][12][13][14][15][16][17][18][19][20][21][22][23][24][25][26][27]. It is known that for most types of uncorrelated noise (acting independently on each probe particle) the asymptotic scaling is constrained to be SQL-like [12][13][14][15][16][17][18]. Specifically, when estimating a parameter $\omega$, the mean squared error obeys $\Delta^2\omega \geq r/(\nu N)$, where $\nu$ is the number of repetitions and $r$ is a constant which depends on the evolution. If the evolution, which each probe particle undergoes, is independent of N, the scaling is constrained to be SQL-like. However, for frequency estimation this is not necessarily the case. In frequency estimation scenarios, such as those of atomic magnetometry [28,29], spectroscopy [30][31][32][33][34], and clocks [35][36][37][38][39][40], there are two relevant resources, the total number of probe particles N and the total time T available for the experiment. The experimenter is free to choose the interrogation time $t = T/\nu$ and, in particular, t may be adapted to N. In this case, the time over which unitary evolution and decoherence act is different for each N and thus the evolution is not independent of N. Schematically, the no-go results for noisy evolution in this case become
$$\Delta^2\omega\, T \geq \frac{c(t)}{N}, \tag{1}$$
with $c(t) = r(t)\, t$.
Thus, if for some optimal choice of t(N) the coefficient c decreases with N, although the no-go results may hold for any fixed evolution time, the bound does not imply SQL-like scaling [41]. In frequency estimation scenarios, for the asymptotic scaling to be super-classical, c must vanish as N → ∞, which is only possible if the evolution is such that decoherence can be neglected at short time scales, and the no-go theorems then do not apply [42]. This can happen for non-Markovian [20,21] evolutions, for which the effective decoherence strength vanishes as t → 0. It can also happen in the presence of dephasing directed along a direction perpendicular to the unitary evolution [19]. In this setting, error-correction techniques can improve the scaling [18,42], and given additional ancillary particles which do not sense the parameter, even Heisenberg scaling can be attained [22,23]. Without such additional resources, it was shown that a variance scaling of $1/N^{5/3}$ can be obtained by choosing an interrogation time $t \propto 1/N^{1/3}$ [19]. This latter result was based on numerical analysis of the quantum Fisher information (QFI) [43] and was shown to be saturable by Greenberger-Horne-Zeilinger (GHZ) states [44]. However, GHZ states of many particles are not easily generated in practice, and the Fisher information approach does not explicitly provide the required measurements. Thus, the question of whether the scaling is achievable in practically implementable metrology was left open.
In this paper, we argue that the transversal-noise model applies to atomic magnetometry, in particular the experimental setting of [2], and study the quantum advantage attainable with one-axis-twisted spin-squeezed states [45] and Ramsey-interferometry-like measurements [30][31][32], both of which are accessible with current experimental techniques. We explicitly show that the setup geometry plays an important role for the achievable quantum enhancement. A suboptimal choice leads to a constant factor of quantum enhancement, while super-classical precision scaling can be maintained for a more appropriate choice. We study the enhancement achievable with the numbers of the experiment [2], and demonstrate the advantage of modifying the geometry. We further consider the case of noise which is not perfectly transversal and find that, although the asymptotic precision scaling is then again SQL-like, the precision may be substantially enhanced by optimising the geometry. As the previous results [19] were based on numerics, we also provide an analytical proof of the scaling for GHZ states in the Appendix.

MODEL
We consider a scheme in which N two-level quantum systems are used to sense a frequency parameter $\omega$ in an experiment of total duration T, divided into rounds of interrogation time t. We keep in mind that this can correspond to atomic magnetometry, in which the particles represent atoms with a spin precessing in a magnetic field at a frequency proportional to the field strength. As in [19], we describe the noisy evolution by a master equation of Lindblad form
$$\frac{d\rho}{dt} = \mathcal{H}(\rho) + \mathcal{L}(\rho). \tag{2}$$
Here, $\mathcal{H}(\rho) = -i[\hat{H}, \rho]$ is the unitary part of the evolution which encodes the parameter dependence. The Hamiltonian is given by
$$\hat{H} = \frac{\omega}{2}\sum_{k=1}^{N}\hat\sigma_z^{(k)}, \tag{3}$$
where $\hat\sigma_z^{(k)}$ is a Pauli operator acting on the k'th particle (qubit). The Liouvillian $\mathcal{L}(\rho)$ describes the noise, which is uncorrelated on different qubits, so that $\mathcal{L} = \sum_k \mathcal{L}^{(k)}$, and for a single qubit we have
$$\mathcal{L}^{(k)}(\rho) = -\frac{\gamma}{2}\left[\rho - \alpha_x\hat\sigma_x^{(k)}\rho\,\hat\sigma_x^{(k)} - \alpha_y\hat\sigma_y^{(k)}\rho\,\hat\sigma_y^{(k)} - \alpha_z\hat\sigma_z^{(k)}\rho\,\hat\sigma_z^{(k)}\right], \tag{4}$$
where $\gamma$ is the overall noise strength and $\alpha_{x,y,z} \geq 0$ with $\alpha_x + \alpha_y + \alpha_z = 1$. For $\alpha_z = 1$, (4) describes dephasing along the direction of the unitary, while $\alpha_x = 1$ (or equivalently $\alpha_y = 1$) corresponds to the transversal-dephasing noise. For $\alpha_x = \alpha_y = \alpha_z = 1/3$, we have an isotropic depolarizing channel.
Under this model, interrogation-time optimisation leads to a quantum scaling advantage for transversal ($\alpha_x = 1$) but not for parallel ($\alpha_z = 1$) noise. This can be understood by looking at how the coefficient c in (1) behaves in the two cases. For short times, one can obtain bounds of the form (1) with coefficients $c_z$ for parallel and $c_x$ for transversal dephasing, given by (5) and (6) respectively [19]. From these we see that for parallel dephasing, interrogation-time optimisation cannot prevent asymptotic SQL-like scaling, because $c_z$ is bounded from below by the non-zero factor of $2\gamma$. However, for perpendicular noise $c_x \to 0$ as $t \to 0$, and hence in this case optimisation may allow for super-classical scaling. In [19], it was found that taking $t = (3/\gamma\omega^2 N)^{1/3}$ leads to
$$\Delta^2\omega\, T \geq \frac{3^{2/3}}{2}\,\frac{(\gamma\omega^2)^{1/3}}{N^{5/3}}, \tag{7}$$
and that this bound is achievable with GHZ input states.
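As a rough numerical illustration of the two regimes just described, the sketch below evaluates the interrogation time $t = (3/\gamma\omega^2 N)^{1/3}$ quoted from [19], the transversal-noise bound with the pre-factor $3^{2/3}/2$ and the $1/N^{5/3}$ scaling stated in this paper, and the parallel-dephasing floor $2\gamma/N$. The function names and the values of `gamma` and `omega` are illustrative assumptions, not taken from the paper.

```python
def t_opt(N, gamma, omega):
    """Interrogation time t = (3 / (gamma * omega^2 * N))^(1/3) quoted from [19]."""
    return (3.0 / (gamma * omega**2 * N)) ** (1.0 / 3.0)

def transversal_bound(N, gamma, omega):
    """Bound on Delta^2 omega * T for transversal noise, assuming the form
    (3^(2/3) / 2) * (gamma * omega^2)^(1/3) / N^(5/3)."""
    return 0.5 * 3.0 ** (2.0 / 3.0) * (gamma * omega**2) ** (1.0 / 3.0) / N ** (5.0 / 3.0)

def parallel_floor(N, gamma):
    """SQL-like floor 2 * gamma / N that follows from c_z >= 2 * gamma."""
    return 2.0 * gamma / N

if __name__ == "__main__":
    gamma, omega = 1.0, 0.1  # illustrative values only
    for N in (1e2, 1e4, 1e6):
        print(N, t_opt(N, gamma, omega),
              transversal_bound(N, gamma, omega), parallel_floor(N, gamma))
```

Increasing N by a factor of 8 halves the interrogation time and lowers the transversal bound by a factor of 32, whereas the parallel-dephasing floor only drops by a factor of 8.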
To see that the model is relevant in practice, we consider the atomic magnetometry experiment of [2] illustrated in Fig. 1. In this experiment, entanglement was demonstrated to enhance the sensitivity, but the precision scaling with N was not studied. The relevant magnetometer consists of a vapour of caesium atoms, which is subject to a strong external dc magnetic field B and used to sense a weak radio-frequency field $B_{rf}$ perpendicular to B (note that in [2] two separate ensembles were used; this is not important for the present argument). The atoms are optically pumped into an extreme magnetic sub-level and may be treated as effective two-level systems with an energy splitting determined by B.

FIG. 1. An ensemble of atoms is placed in a strong magnetic field B which induces a level splitting between the magnetic sub-levels. The atoms are used to sense a weak field $B_{rf}$ in the plane perpendicular to B, which rotates in this plane with a frequency matched to the Larmor precession induced by B. We consider two cases for the state preparation and read-out. Scenario (a) corresponds to the geometry of the experiment [2]. All atoms are initially pumped to an extreme magnetic sub-level m = F, creating a coherent spin state aligned with B. The state is then squeezed to make it more sensitive to the evolution induced by $B_{rf}$. In a frame rotating around B at the Larmor frequency, the state can be depicted as shown in the lower part. $B_{rf}$ then points along z and induces a rotation around the z-axis. The state is squeezed in y and $B_{rf}$ is estimated from a measurement of the collective-spin component $\hat{J}_y$. Scenario (b) is similar, but the state is initially perpendicular to B. In the rotating frame it is squeezed in x and $\hat{J}_x$ is measured. The dominant noise in both cases comes from individual atomic motion causing variations in the effective magnetic field and hence the energy splitting. This results in uncorrelated dephasing noise in the direction of B, whose impact on the collective spin is schematically illustrated by the inner prolate spheroids. Importantly, the noise preserves the spin along B but shrinks it in the perpendicular directions.

With $B \gg B_{rf}$ the dominant noise during evolution is due to the atomic motion resulting in each atom sensing a slightly different dc magnetic field, which leads to fluctuations of the individual energy splittings. This corresponds to a dephasing noise which acts on each atom independently and is characterised by the spin-decoherence time $T_2$ [46]. As the experiment is conducted at a time-scale much shorter than the ones of spontaneous emission and B-field fluctuations, other noise sources are suppressed. In particular, the spin-relaxation time $T_1$ can be taken infinite and collective noise can be neglected. The frequency of the weak field $B_{rf}$ is matched to the Larmor frequency of the strong field B, and it is then convenient to describe the system in a rotating frame. If $\rho$ is the state of a single atom in the non-rotating frame and B is directed along the x-axis (see Fig. 1), the state in the rotating frame reads $\rho_{RF} = e^{-i\hat{H}_B t}\rho\, e^{i\hat{H}_B t}$, where $\hat{H}_B = \kappa B\hat\sigma_x$ and $\kappa$ is the coupling strength to the magnetic field. In such a Larmor-precessing frame, the master equation for the evolution may be written as [29]
$$\frac{d\rho_{RF}}{dt} = -i\kappa B_{rf}\left[\hat\sigma_z, \rho_{RF}\right] - \frac{1}{T_2}\left(\rho_{RF} - \hat\sigma_x\rho_{RF}\hat\sigma_x\right), \tag{8}$$
where the first term can be understood as the effective free Hamiltonian in the rotating frame with the $B_{rf}$ field pointing along z. The dephasing noise is directed along B and parametrised by $T_2$. Since (8) is exactly of the form (2), it is clear that this experimental setting is captured by the previously stated model with $\omega = 2\kappa B_{rf}$ and transversal noise, $\alpha_x = 1$. We note that $B \gg B_{rf}$ is important for the noise to be transversal, which may imply that $\gamma$ is large relative to $\omega$. In particular this is the case in [2], as seen below.
In [2], super-classical precision was demonstrated by initially aligning the collective spin of the atomic ensemble along B and reducing fluctuations of its component in the direction perpendicular to both B and $B_{rf}$ via spin-squeezing (Fig. 1(a)). Below, we study such a geometry along with another setting, in which the collective spin is initially perpendicular to both B and $B_{rf}$ and its component along B is squeezed (Fig. 1(b)). In principle, scenario (b) can be obtained from (a) by applying a π/2 pulse to the atomic ensemble before the evolution. In both cases, $B_{rf}$ is estimated from a measurement of a component of the collective spin (in the rotating frame), read out e.g. via the scheme of [2], which resembles a standard Ramsey interferometry [30][31][32] measurement. We show that in (b), for one-axis-twisted spin-squeezed states, a super-classical scaling $1/N^{5/4}$ of the mean squared error can be maintained, thus demonstrating that a scaling quantum advantage is possible with feasible states and measurements. At the same time, we find that in (a) the quantum advantage is limited by a constant matching the constant bound for parallel dephasing [11]. As a consequence, for an atomic ensemble size and parameters matching the experiment [2], (b) may considerably outperform (a).
As an aside, we note that when the true value of the estimated parameter is zero, the bound (7) vanishes. This does not mean that the precision is unbounded, but indicates that the bound gives no information in such a limit. One may then speculate whether the scaling can be further improved if ω can be made arbitrarily small in an adaptive manner. We discuss this issue in the Appendix.

COMPUTING PRECISION FOR SPECIFIC STATES AND MEASUREMENTS
To obtain results for the precision achievable within the above scenarios, we make use of error propagation and apply it for adequate choices of squeezed states and collective-spin-observable measurements. Generally, when a parameter $\varphi$ is estimated based on the average of measuring an observable $\hat{O}$, and when the prior knowledge of $\varphi$ is sufficiently tight, fluctuations in the estimate can be linearly related to the fluctuations in $\hat{O}$. Thus, for a system in a state $\rho$, in such a local estimation regime the mean squared error of the estimate may be quantified as
$$\Delta^2\varphi = \frac{\Delta^2\hat{O}}{\left|\partial\langle\hat{O}\rangle/\partial\varphi\right|^2}, \tag{9}$$
where $\langle\hat{O}\rangle = \mathrm{Tr}[\hat{O}\rho]$ and $\Delta^2\hat{O} = \langle\hat{O}^2\rangle - \langle\hat{O}\rangle^2$. If the measurement is repeated, $\Delta^2\varphi$ will additionally decrease inverse-proportionally to the number of repetitions $\nu$, which also ensures the above local regime as $\nu \to \infty$ and thus that (9) always holds. Here, we are interested in frequency estimation over a total time T with a single-round duration t, such that $\nu = T/t$. We therefore write the overall mean squared error of the $\omega$ estimate as
$$\Delta^2\omega\, T = t\,\frac{\Delta^2\hat{O}(t)}{\left|\partial\langle\hat{O}(t)\rangle/\partial\omega\right|^2}. \tag{10}$$
The expectation values in (10) can be evaluated either by computing the expectation value of the static operator in the time-evolved state or of the time-evolved operator in the input state (analogously to the usual Schrödinger and Heisenberg pictures for unitary dynamics). Specifically, in terms of the Kraus representation of the evolution one has
$$\langle\hat{O}(t)\rangle = \mathrm{Tr}\Big[\hat{O}\sum_s K_s\rho_0 K_s^\dagger\Big] = \mathrm{Tr}\Big[\sum_s K_s^\dagger\hat{O}K_s\,\rho_0\Big], \tag{11}$$
where $\hat{O}$ is the time-independent observable, $\rho_0$ is the input state, and $K_s$ are the Kraus operators of the global channel. For independent channels acting on each qubit, $K_s = K_{s_1}\otimes\cdots\otimes K_{s_N}$, where the $K_{s_i}$ are the Kraus operators acting on the i-th qubit.
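The error-propagation recipe can be sketched numerically as follows, assuming the mean squared error per unit total time is $t\,\Delta^2\hat{O}(t)/|\partial\langle\hat{O}(t)\rangle/\partial\omega|^2$ as in (10). The helper `mse_times_T`, the finite-difference step, and the noise-free example (N uncorrelated spins initially along y, precessing about z, with $\hat{J}_x$ measured) are illustrative assumptions rather than the authors' code.

```python
import numpy as np

def mse_times_T(mean_fn, var_fn, omega, t, step=1e-6):
    """Delta^2 omega * T = t * Var[O(t)] / (d<O(t)>/d omega)^2, cf. (10).

    mean_fn(omega, t) and var_fn(omega, t) return the mean and variance of the
    measured observable; the derivative is taken by central differences.
    """
    slope = (mean_fn(omega + step, t) - mean_fn(omega - step, t)) / (2 * step)
    return t * var_fn(omega, t) / slope**2

# Noise-free illustration: coherent spin state along y, J_x measured.
# The sign of the rotation is a convention and does not affect the result.
N = 1000
mean_jx = lambda w, t: -0.5 * N * np.sin(w * t)
var_jx = lambda w, t: 0.25 * N * np.cos(w * t) ** 2

if __name__ == "__main__":
    t = 0.2
    print(mse_times_T(mean_jx, var_jx, 0.3, t), 1.0 / (N * t))
```

For this uncorrelated input the result reduces to $1/(Nt)$, the SQL-like scaling in N, independently of $\omega$.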
In subsequent sections we will determine the precisions attainable under our model (2) for specific input states and measurements. The single-qubit channel of the model has four Kraus operators $K_1, \dots, K_4$, whose form is given in (12). The coefficients $a_i$, $b_i$ appearing there are real and depend on the frequency $\omega$, the noise parameters $\gamma$, $\alpha_x$, $\alpha_y$, $\alpha_z$, and the time t (see the Appendix). However, to simplify notation we suppress these dependences. Because of trace-preservation, $\sum_s K_s^\dagger K_s = 1$, the coefficients must satisfy the normalisation constraint (13). For later calculations, it will be useful to compute the evolution of both $\hat\sigma_x$ and $\hat\sigma_y$ under the Kraus map. For $\hat\sigma_x$, applying the map as in (11) gives (14). Using (13), the evolution under the channel can then be written as (Pauli operators with no explicit time dependence are time-independent)
$$\hat\sigma_x(t) = \xi_x\hat\sigma_x + \chi_x\hat\sigma_y, \tag{15}$$
where the coefficients $\xi_x = 1 - 2(a_1^2 + a_3^2 + a_4^2)$, $\chi_x = 2(a_3 b_3 + a_4 b_4)$ are again real and encode the full dependence of the evolved operator on time, frequency, and the noise parameters. They are given in the Appendix. Similarly, one obtains
$$\hat\sigma_y(t) = \xi_y\hat\sigma_y + \chi_y\hat\sigma_x, \tag{16}$$
with $\xi_y = 1 - 2(a_2^2 + a_3^2 + a_4^2)$, $\chi_y = -2(a_3 b_3 + a_4 b_4)$.
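The coefficients $\xi_x$ and $\chi_x$ can also be obtained numerically without constructing the Kraus operators, by exponentiating the Heisenberg-picture generator in the Pauli basis. The sketch below assumes the single-qubit model $\hat{H} = (\omega/2)\hat\sigma_z$ with noise term $-(\gamma/2)[\hat{O} - \sum_i\alpha_i\hat\sigma_i\hat{O}\hat\sigma_i]$; all function names are hypothetical, and the small Taylor-series matrix exponential is an implementation choice that avoids extra dependencies.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = [SX, SY, SZ]

def heisenberg_generator(omega, gamma, alpha):
    """Real 3x3 matrix G such that the generator maps sigma_j to sum_i G[i, j] sigma_i."""
    H = 0.5 * omega * SZ
    G = np.zeros((3, 3))
    for j, O in enumerate(PAULIS):
        dO = 1j * (H @ O - O @ H)  # unitary part, i[H, O]
        dO = dO - 0.5 * gamma * (O - sum(a * P @ O @ P for a, P in zip(alpha, PAULIS)))
        for i, P in enumerate(PAULIS):
            G[i, j] = 0.5 * np.trace(P @ dO).real
    return G

def expm_taylor(A, squarings=10, terms=20):
    """Matrix exponential by scaling and squaring with a truncated Taylor series."""
    A = np.asarray(A, dtype=float) / 2**squarings
    out = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms + 1):
        term = term @ A / k
        out = out + term
    for _ in range(squarings):
        out = out @ out
    return out

def xi_chi_x(omega, gamma, t, alpha=(1.0, 0.0, 0.0)):
    """Coefficients of sigma_x and sigma_y in sigma_x(t), cf. (15)."""
    E = expm_taylor(heisenberg_generator(omega, gamma, alpha) * t)
    return E[0, 0], E[1, 0]

if __name__ == "__main__":
    print(xi_chi_x(omega=0.5, gamma=1.0, t=0.3))
```

Without noise the coefficients reduce to a pure rotation, and for small $\omega$ under transversal noise $\chi_x/\omega$ approaches $(e^{-t\gamma} - 1)/\gamma$, the value quoted in the Appendix.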

BEATING THE SQL WITH REALISTIC STATES AND MEASUREMENTS
Several experiments have demonstrated super-classical sensitivity of magnetometry with atomic ensembles by squeezing the collective atomic spin [2][3][4][5][6]. Considering the perpendicular noise model, we now show that spin-squeezed states and Ramsey-type measurements together with interrogation-time optimisation are sufficient not only to reach precisions unattainable by classical protocols, but also to maintain super-classical precision scaling with the particle number.

Collective spin
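Ramsey interferometry performed on a collection of spin-1/2 particles (qubits) effectively corresponds to collective-spin measurements [30][31][32]. Here, we consider the components of collective spin along x and y,
$$\hat{J}_x = \frac{1}{2}\sum_{k=1}^{N}\hat\sigma_x^{(k)}, \qquad \hat{J}_y = \frac{1}{2}\sum_{k=1}^{N}\hat\sigma_y^{(k)}, \tag{17}$$
which specify the observables measured in scenarios (b) and (a) of Fig. 1 respectively. The evolution of $\hat{J}_x$ under the model (2) follows directly from (15),
$$\hat{J}_x(t) = \xi_x\hat{J}_x + \chi_x\hat{J}_y, \tag{18}$$
and similarly for $\hat{J}_y$ using (16). The derivatives w.r.t. the estimated parameter then read
$$\frac{\partial\langle\hat{J}_x(t)\rangle}{\partial\omega} = \frac{\partial\xi_x}{\partial\omega}\langle\hat{J}_x\rangle_0 + \frac{\partial\chi_x}{\partial\omega}\langle\hat{J}_y\rangle_0, \tag{19}$$
and analogously for $\hat{J}_y$ after interchanging x ↔ y. We also compute (note that taking the square and evolving do not commute because the evolution is not unitary, i.e. $\hat{J}_x^2(t) \neq (\hat{J}_x(t))^2$)
$$\hat{J}_x^2(t) = \frac{N}{4}\left(1 - \xi_x^2 - \chi_x^2\right) + \left(\xi_x\hat{J}_x + \chi_x\hat{J}_y\right)^2, \tag{20}$$
so that from (18) and (20) we obtain the variance
$$\Delta^2\hat{J}_x(t) = \frac{N}{4}\left(1 - \xi_x^2 - \chi_x^2\right) + \xi_x^2\,(\Delta^2\hat{J}_x)_0 + \chi_x^2\,(\Delta^2\hat{J}_y)_0 + 2\xi_x\chi_x\,(\mathrm{Cov}(\hat{J}_x,\hat{J}_y))_0, \tag{21}$$
with Cov denoting the covariance and the subscript 0 indicating quantities evaluated in the input state. The variance for $\hat{J}_y$ is again obtained by just replacing x ↔ y. For a specific initial state of the atomic ensemble with both its expectation values and variances known at t = 0, we can substitute the above expressions into (10), in order to quantify the precision attained in scenarios (a) and (b) of Fig. 1 for a given interrogation time t.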
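The substitution into (10) can be sketched as below for a measurement of $\hat{J}_x$ under purely transversal noise ($\alpha_x = 1$). Two assumptions are made here and are not taken from the paper's Appendix: the closed form used for $\xi_x$, $\chi_x$ is derived from the Heisenberg equations $\dot{\hat\sigma}_x = -\omega\hat\sigma_y$, $\dot{\hat\sigma}_y = \omega\hat\sigma_x - \gamma\hat\sigma_y$, and the evolved variance is taken as $\frac{N}{4}(1 - \xi_x^2 - \chi_x^2)$ plus the propagated initial (co)variances. Function names are hypothetical.

```python
import numpy as np

def xi_chi(omega, gamma, t):
    """xi_x, chi_x for alpha_x = 1 (assumed closed form; requires gamma != 2 * omega)."""
    s = np.sqrt(complex(gamma**2 / 4 - omega**2))
    damp = np.exp(-gamma * t / 2)
    xi = damp * (np.cosh(s * t) + 0.5 * gamma * np.sinh(s * t) / s)
    chi = -damp * omega * np.sinh(s * t) / s
    return xi.real, chi.real

def mse_jx(N, omega, gamma, t, mean_x, mean_y, var_x, var_y, cov_xy=0.0, h=1e-7):
    """Delta^2 omega * T for a J_x measurement, given the input-state moments."""
    xi, chi = xi_chi(omega, gamma, t)

    def mean(w):
        x, c = xi_chi(w, gamma, t)
        return x * mean_x + c * mean_y  # <J_x(t)>, cf. (18)

    slope = (mean(omega + h) - mean(omega - h)) / (2 * h)  # cf. (19)
    var = (0.25 * N * (1 - xi**2 - chi**2) + xi**2 * var_x
           + chi**2 * var_y + 2 * xi * chi * cov_xy)       # cf. (21)
    return t * var / slope**2                              # cf. (10)

if __name__ == "__main__":
    N, omega, t = 1000, 0.3, 0.2
    # Coherent spin state along y: <J_y> = N/2, Var(J_x) = N/4, Var(J_y) = 0.
    for gamma in (1e-9, 1.0):
        print(gamma, mse_jx(N, omega, gamma, t, 0.0, N / 2, N / 4, 0.0))
```

With negligible noise the coherent spin state recovers $1/(Nt)$; switching on transversal noise increases the error for the same interrogation time.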

One-axis twisted spin-squeezed states
There is no unique definition of spin-squeezing [45], but generally spin-squeezed states are states in which fluctuations of the collective-spin component are reduced in a particular direction, when compared to the value they would have in a state with all individual spins aligned, i.e. in a coherent spin state (CSS), an eigenstate of the corresponding spin-component with maximal eigenvalue. Spin-squeezed states are useful for metrology due to their enhanced sensitivity to any change of the collective spin in the squeezed direction, e.g. caused by precession in a magnetic field.
A number of experiments, in particular [2], employ the so-called two-axis-twisted spin-squeezed states, which can be generated by quantum non-demolition measurement of the collective atomic spin mediated by light. However, here we focus on one-axis-twisted spin-squeezed states (OATSSs) because they are amenable to analytical treatment. As two-axis-twisted states allow for stronger suppression of the collective-spin variance in a particular direction, i.e. stronger squeezing, we expect them to attain precisions at least as good as those derived below for OATSSs. At the same time, quantum advantage with OATSSs in magnetometry has also been demonstrated in the experiment [6].
OATSSs are a particular kind of spin-squeezed states first introduced by Kitagawa and Ueda [47]. They can be produced by first preparing atoms in a CSS along one direction, and then applying an evolution with a Hamiltonian quadratic in one of the perpendicular spin components. E.g. for spin-1/2 particles one can start from an eigenstate of $\hat{J}_x$ with eigenvalue N/2 (all spins aligned along x) and apply an evolution with Hamiltonian proportional to $\hat{J}_z^2$. This will generate a state with minimum uncertainty at an angle to both the y and z axes, which depends on the strength of the evolution. The state can then be rotated to align the direction of minimum uncertainty with one of the axes.
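For scenarios (a) and (b) of Fig. 1, we consider two cases where the initial CSS is either along x or y, and the collective-spin component with minimum uncertainty is $\hat{J}_y$ or $\hat{J}_x$ respectively. For scenario (a), the mean values of the collective spin are [47]
$$\langle\hat{J}_x\rangle_0 = \frac{N}{2}\cos^{N-1}\frac{\mu}{2}, \qquad \langle\hat{J}_y\rangle_0 = \langle\hat{J}_z\rangle_0 = 0, \tag{22}$$
whereas the variances read
$$(\Delta^2\hat{J}_x)_0 = \frac{N}{4}\left[N\left(1 - \cos^{2(N-1)}\frac{\mu}{2}\right) - \frac{N-1}{2}A\right], \qquad (\Delta^2\hat{J}_y)_0 = \frac{N}{4}\left[1 + \frac{N-1}{4}\left(A - \sqrt{A^2 + B^2}\right)\right], \tag{23}$$
with $\mu$ being the squeezing parameter, $A = 1 - \cos^{N-2}\mu$, and $B = 4\sin\frac{\mu}{2}\cos^{N-2}\frac{\mu}{2}$. We note that the covariance vanishes for this state, $(\mathrm{Cov}(\hat{J}_x,\hat{J}_y))_0 = 0$. The equivalents of (22) and (23) for scenario (b) are obtained by interchanging x ↔ y.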
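The input-state moments can be evaluated numerically as sketched below. The expressions are the standard Kitagawa-Ueda results for a spin $S = N/2$, rewritten with the A and B defined above; they are an assumption in the sense that they are taken from [47] as commonly quoted rather than from this paper's own equations. The function name is hypothetical, and small $\mu$ (so that $\cos\mu > 0$) is assumed.

```python
import numpy as np

def oatss_moments(N, mu):
    """Mean spin, variance along the mean-spin direction, and minimal transverse
    variance of a one-axis-twisted state (assumed Kitagawa-Ueda expressions)."""
    c = np.cos(mu / 2)
    A = 1 - np.cos(mu) ** (N - 2)
    B = 4 * np.sin(mu / 2) * c ** (N - 2)
    mean = 0.5 * N * c ** (N - 1)
    var_mean_dir = 0.25 * N * (N * (1 - c ** (2 * (N - 1))) - 0.5 * (N - 1) * A)
    var_squeezed = 0.25 * N * (1 + 0.25 * (N - 1) * (A - np.hypot(A, B)))
    return mean, var_mean_dir, var_squeezed

if __name__ == "__main__":
    N = 100
    for mu in (0.0, 0.02, 0.05, 0.1):
        mean, var_par, var_sq = oatss_moments(N, mu)
        # Squeezed variance relative to the coherent-spin-state value N/4.
        print(mu, mean, var_par, var_sq / (0.25 * N))
```

At $\mu = 0$ the state is a CSS, with squeezed-direction variance N/4 and no fluctuations along the mean spin. For very large N the direct evaluation of A loses precision through cancellation, so this sketch is only meant for moderate N.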

Mean-squared-error scaling under transversal noise
The mean squared errors of estimation, which are achieved in scenarios (a) and (b) of Fig. 1, can be calculated by using (22), (23), (19), (21) (and the equivalents for $\hat{J}_y$) and substituting into (10). The best precision is then obtained by optimising the evolution time t and the squeezing $\mu$ for each N. The general expressions are rather involved, and we have been able to obtain their minima only numerically. However, any explicit choice of t(N) and $\mu(N)$ provides a precision which is guaranteed to be attainable.
Specifically, for scenario (b) a choice which appears to be nearly optimal is $\mu = (\gamma/\omega)^{1/4}(N/4)^{-4/5}$ and $t = (\gamma\omega)^{-1/2}N^{-1/8}$ [48]. For this choice we expand (10) in 1/N to find the asymptotic mean squared error (24), which scales as $1/N^{5/4}$. Since the scaling is better than the 1/N of the SQL, this demonstrates that super-classical precision scaling is indeed possible with spin-squeezed states and Ramsey-type measurements in the presence of transversal noise. The possibility for a large quantum enhancement depends strongly on the geometry. We can see this by comparing with scenario (a). There, any choice of $\mu \propto 1/N^{s/(s+1)}$ and $t \propto 1/N^{s}$ with s > 1 leads to
$$\Delta^2\omega_{(a)}\, T \approx \frac{2\gamma}{N}, \tag{25}$$
which coincides with the best achievable precision for the parallel-noise setting [15] constrained by (5). Numerical optimisation for varying N, $\gamma$, and $\omega$ indicates clearly that no better precision can be achieved. Thus for this geometry, under transversal noise only SQL-like scaling is possible and the quantum enhancement over classical, non-entangled strategies is bounded, while for scenario (b) the quantum enhancement is unbounded with increasing N. The difference between the two geometries shows up only for quantum strategies, that is, when squeezed states are employed. If the initial states are not squeezed but are simply CSS states along x for scenario (a) or y for scenario (b), then the precision takes the same form in both cases, denoted $\Delta^2\omega_{CSS}$ and given by (26). Thus, we can benchmark the quantum enhancement in either scenario against this classical value. In particular, let us consider the numbers from [2]. In this experiment, $N \approx 10^{11}$, $T_2 \approx 30$ ms, $\kappa \approx 10^{10}$ (Ts)$^{-1}$, and $B_{rf} \approx 36$ fT, which gives $\gamma = 2/T_2 \approx 67$ Hz and $\omega = 2\kappa B_{rf} \approx 3.6 \times 10^{-3}$ Hz, and the measurement time was $t \approx 1$ ms. The experiment was not performed with OATSSs, but we compute the quantum enhancements which OATSSs would provide. We insert the numbers in the full expressions (from (10)) for $\Delta^2\omega_{(a)}$ and $\Delta^2\omega_{(b)}$ and vary the squeezing. The best quantum enhancement attainable in scenario (a) is $\Delta^2\omega_{CSS}/\Delta^2\omega_{(a)} \approx 8$. In scenario (b), on the other hand, the enhancement can reach $\Delta^2\omega_{CSS}/\Delta^2\omega_{(b)} \approx 2 \times 10^{7}$, corresponding to a 4500-fold improvement in precision. This underlines the advantage offered by geometry. However, these maximal enhancements require rather prohibitive squeezings of −60 dB and −73 dB respectively [49]. If we restrict the squeezing to at most −8 dB as discussed in the outlook of [2], then scenario (b) provides an enhancement of $\Delta^2\omega_{CSS}/\Delta^2\omega_{(b)} \approx 6.3$, corresponding to a factor of 2.5 in precision, while scenario (a) for the same numbers gives $\Delta^2\omega_{CSS}/\Delta^2\omega_{(a)} \approx 2.5$, corresponding to a factor of 1.6.
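The conversions between the quoted mean-squared-error ratios and the "improvement in precision" figures can be checked with a few lines of arithmetic, assuming that precision refers to the root-mean-squared error, so that the improvement factor is the square root of the ratio. The variable names are illustrative.

```python
import math

T2 = 30e-3                         # spin-decoherence time in seconds, as quoted
gamma = 2.0 / T2                   # noise strength gamma = 2 / T2, in Hz

# Ratios of mean squared errors quoted in the text.
enhancement_b_max = 2e7            # scenario (b), optimal squeezing
enhancement_b_8dB = 6.3            # scenario (b), squeezing limited to -8 dB
enhancement_a_8dB = 2.5            # scenario (a), squeezing limited to -8 dB

# Precision improves by the square root of the mean-squared-error ratio.
precision_b_max = math.sqrt(enhancement_b_max)
precision_b_8dB = math.sqrt(enhancement_b_8dB)
precision_a_8dB = math.sqrt(enhancement_a_8dB)

if __name__ == "__main__":
    print(f"gamma = {gamma:.1f} Hz")
    print(f"precision factors: {precision_b_max:.0f}, "
          f"{precision_b_8dB:.2f}, {precision_a_8dB:.2f}")
```

These reproduce the rounded values of about 67 Hz, 4500, 2.5 and 1.6 given in the text.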
We note that, as for (7), the error (24) vanishes as ω → 0. We refer the reader to the Appendix for a discussion of this limit.

Non-transversal noise sources
In a realistic implementation, in addition to the dominant transversal noise, other sources of decoherence will be present. For example, in the experiment [2], $B_{rf}$ may not be perfectly matched to the Larmor frequency of B or the condition $B_{rf} \ll B$ may not be sufficiently fulfilled, which should then be modelled as extra depolarising noise in the rotating frame (8). Moreover, when considering longer interrogation times t both atomic collisions and spontaneous emission may start to play a role. We assess the effect of such additional noise sources by considering a deviation from perfect transversality. In particular, we take a small component of dephasing directed along the z axis, such that $\alpha_x = 1 - \epsilon$ and $\alpha_z = \epsilon$. As discussed in [19], once any such parallel-dephasing contribution is present, the asymptotic scaling must return to its SQL-like behaviour, that is,
$$\Delta^2\omega\, T \geq \frac{c_{xz}(\gamma, \epsilon)}{N}, \tag{27}$$
with $c_{xz}(\gamma, \epsilon)$ lower-bounded by the minimum of (5), i.e. $c_z(\epsilon\gamma, \omega, t) \geq 2\epsilon\gamma$. This cross-over is illustrated in Fig. 2.
Although the asymptotic scaling is now again SQL-like, geometry can strongly influence the achievable quantum gain and the effective N at which the cross-over to SQL-like scaling happens. In Fig. 2 we show the mean squared error scalings attained in scenarios (a) and (b) using respectively OATSSs along x and y and measurements of $\hat{J}_y$ and $\hat{J}_x$, and compare them to a strategy without entanglement, simply using a CSS along x and measurement of $\hat{J}_y$, corresponding to the non-entangled strategy implemented in [2]. We see that while the strategy in (b) can saturate the bound $2\gamma\epsilon/N$, the strategy in (a) only reaches $2\gamma(1 - \epsilon)/N$ as imposed by (25). Thus the mean squared error of geometry (b) is a factor $\epsilon/(1 - \epsilon)$ lower than (a), which may be significant when the noise is dominantly transversal. Furthermore, super-classical scaling persists over a larger range of N in geometry (b). In the figure, the locations where the OATSS strategies for (a) and (b) reach 90% of their asymptotic gain over the non-entangled CSS strategy are indicated. Clearly, the cross-over happens at much larger N in scenario (b). As $\epsilon \to 0$, the cross-over must go to infinity. To get an idea of the behaviour, we can take the N at which the asymptotic bound $2\epsilon\gamma/N$ crosses the asymptote (24) for perfectly transversal noise. This intersection scales as $(\omega/\gamma)^4/\epsilon^4$. Thus, significant gain in precision by squeezing is attained over a larger range of N if the geometry is chosen correctly.

CONCLUSION AND OUTLOOK
For quantum metrology to be relevant in practical situations, it is important that good performance can be attained under realistic noise with states and measurements which are amenable to implementation in the laboratory. While recent results have shown that for many noise types, precision scaling can only improve over the classical limit by a constant, here we have demonstrated that under transversal dephasing, super-classical scaling can be preserved with experimentally accessible states and measurements, and we have argued that this noise model is relevant to recent atomic magnetometry experiments. We have shown that the choice of geometry is important for the attainable quantum improvement both asymptotically and for parameter settings corresponding to recent experiments. Furthermore, we have assessed the robustness of the model to other non-transversal sources of noise and have found that quantum enhancement could still be achieved for atomic ensembles of macroscopic size with an adequate choice of geometry.
Our results give a clear message that quantum-enhanced metrology maintains its relevance even in the presence of noise, and we hope that they will encourage the search for other practically motivated scenarios where quantum strategies provide an advantage. For instance, it has been suggested that the transversal noise model applies also to NV-centers in diamonds [22], and very recently a noise-robust magnetometry scheme employing SQUID junctions has been proposed [50]. Finally, in the Appendix we speculate about the potential of adaptive techniques which bias the estimated parameter towards the zero value, for which our current precision bounds fail. We expect that the question of what happens in this limit could be consistently resolved by employing Bayesian techniques [18,27,37].

APPENDIX
The map corresponding to evolution under the master equation (2) during time t can be written as a composite map of the form $\mathcal{E}_\omega^{\otimes N}$. Following Andersson et al. [51], the single-qubit maps are then given by
$$\mathcal{E}_\omega(\rho) = \sum_{i,j=0}^{3} S_{ij}\,\tilde\sigma_i\,\rho\,\tilde\sigma_j, \tag{28}$$
where $\tilde\sigma_i$ are the normalised Pauli operators, $\tilde\sigma_i = \hat\sigma_i/\sqrt{2}$, and $\hat\sigma_0$ denotes the identity. All elements of the matrix S are zero, except $S_{00} = A_+ + B_+$, ..., where we have defined $\Gamma = 2\omega/\gamma$, $\alpha_\pm = \alpha_x \pm \alpha_y$, and $\tilde\alpha = \sqrt{\alpha_-^2 - \Gamma^2}$, together with the coefficients entering S. A Kraus representation of the map $\mathcal{E}_\omega$ can be obtained by diagonalising the matrix S. Denoting the eigenvalues and normalised eigenvectors of S by $\lambda_j$ and $v_j$ respectively, one can find a valid set of Kraus operators for the channel,
$$K_j = \sqrt{\lambda_j}\sum_{i=0}^{3}(v_j)_i\,\tilde\sigma_i,$$
with j = 1, . . . , 4, which gives the set in (12). The coefficients in (12) are rather involved, and we do not explicitly state them here. Instead, we directly give the expressions for $\xi_x$, $\chi_x$, $\xi_y$, and $\chi_y$ of (15) and (16), which for general noise can be expressed through the quantities defined above. In case of perfectly transversal noise they further simplify, since $\alpha_x = 1$ implies $\alpha_z = 0$ and $\tilde\alpha = \sqrt{1 - \Gamma^2}$.

Analytical scaling for GHZ states
Using the error-propagation method (see (10)) employed in the main text, we can also confirm the results of [19] analytically for the GHZ input states,
$$|\mathrm{GHZ}\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle^{\otimes N} + |1\rangle^{\otimes N}\right), \tag{34}$$
by considering the parity operator in the x-direction,
$$\hat{P}_x = \bigotimes_{k=1}^{N}\hat\sigma_x^{(k)}, \tag{35}$$
as the observable being measured. Similarly to the case of collective-spin operators and (18), we may utilise (15) to write the parity operator at time t as
$$\hat{P}_x(t) = \bigotimes_{k=1}^{N}\left(\xi_x\hat\sigma_x^{(k)} + \chi_x\hat\sigma_y^{(k)}\right). \tag{36}$$
In the computational basis $\{|0\rangle, |1\rangle\}^{\otimes N}$, such an operator just flips all of the qubits, and hence only the off-diagonal terms contribute when calculating its expectation value for a GHZ state (34). Every $\hat\sigma_x$ contributes a factor of 1 while $\hat\sigma_y$ contributes a factor of ±i. Thus, the expectation value of the measurement becomes
$$\langle\hat{P}_x(t)\rangle = \frac{1}{2}\left[(\xi_x + i\chi_x)^N + (\xi_x - i\chi_x)^N\right], \tag{37}$$
and, since $\hat{P}_x^2 = 1$ is preserved by the trace-preserving evolution, the variance is $\Delta^2\hat{P}_x(t) = 1 - \langle\hat{P}_x(t)\rangle^2$. We compute the mean squared error of estimation via (10), after setting the interrogation time to $t = (3/\gamma\omega^2 N)^{1/3}$, as was found in [19] from numerical analysis. Expanding the corresponding $\Delta^2\omega\, T$ in 1/N, we find the asymptotic scaling to read
$$\Delta^2\omega\, T \approx g(\gamma, \omega, N)\,\frac{(\gamma\omega^2)^{1/3}}{N^{5/3}}, \tag{38}$$
where $g(\gamma, \omega, N)$ represents oscillating terms that are lower-bounded by $e^2/3^{1/3}$. The constant pre-factor here is larger than the pre-factor $3^{2/3}/2$, which was numerically verified to be optimal (optimised over all possible measurements) for GHZ states [19]. This suggests that either the parity measurement is suboptimal or the above dependence of the interrogation time t on N should be improved in the parity-based scenario. Nevertheless, (38) suffices to prove the super-classical precision scaling, $1/N^{5/3}$, as well as the $(\gamma\omega^2)^{1/3}$-behaviour of the asymptotic coefficient.

A note on vanishing parameter value
For ω = 0, both the GHZ-achievable bound (7) and the OATSS-based expression (24) vanish. This does not mean that the precision is unbounded for the two cases, but rather suggests that the results give no information in such a limit. It is therefore not clear, what precision scaling can then be achieved.
In general, for the channel (28) at $\omega = 0$, we get $\xi = 1$, $\chi = 0$, and $\partial\xi/\partial\omega = 0$, $\partial\chi/\partial\omega = (e^{-t\gamma} - 1)/\gamma$. For a GHZ state (34) and parity measurement (35), one can show utilising (10) that for fixed t the mean squared error scales as $c(\gamma, t)/N^2$, with the coefficient c given by (39). This is minimised at $t_{opt} = \kappa/\gamma$, where $\kappa$ is a numerical constant. Similarly, at $\omega = 0$ for an OATSS along y squeezed in x (as in scenario (b) of the main text) with squeezing parameter $\mu = (N/4)^{-2/3}$ one finds the scaling (40), and again the optimal time is $t_{opt} = \kappa/\gamma$. Thus, on the one hand, the local estimation approach which we have employed above indicates that an improved scaling, even reaching the Heisenberg limit for GHZ states, is possible at the special $\omega = 0$ parameter value. Even if the value of $\omega$ is a priori non-zero, one might then think that the precision scaling can be improved by adopting an iterative, adaptive strategy [52][53][54]. By applying a bias (e.g. in case of magnetometry, a magnetic field in the opposite direction to the estimated field) to decrease the parameter after obtaining its first estimate, a better estimate is obtained with a precision which is less heavily constrained by bounds (7) and (24), due to the lower effective value of $\omega$. On the other hand, the prior information on $\omega$ required to adjust the bias may scale prohibitively. We can compute the estimate mean squared error for GHZ states and parity measurements (see above) and expand it in $\omega$ to obtain $\Delta^2\omega\, T \approx c(\gamma, t)/N^2 + O(\omega^2)$, with c given by (39). For the Heisenberg-scaling term to dominate for a fixed t, the higher-order terms in the expansion must be negligible in comparison. However, we find (for even terms; odd terms vanish) that the k'th term scales as $\omega^k N^{k-3}$, which implies that we need $\omega \ll N^{-(k-1)/k}$ to neglect the higher-order terms. For this to hold for all k, we need $\omega \ll 1/N$, which means that the prior information on $\omega$ must already be Heisenberg limited, as in the case of the decoherence-free local estimation scenario [18,27].
At first sight, this may indicate that such an adaptive scheme may not be successful for any prior distribution of finite width, and that the value of ω must be perfectly known and set to zero for the above improved scalings to be observed. However, recent results [37], based on the Bayesian approach to estimation, indicate that in the decoherence-free case the Heisenberg scaling is attained irrespectively of the prior knowledge of ω. Hence, we expect the transversal-noise model to behave similarly due to its decoherence-free-like regime at short interrogation times, which would then prove the above adaptive strategy to also be efficient.