Optimal protocols for quantum metrology with noisy measurements

Measurement noise is a major source of noise in quantum metrology. Here, we explore preprocessing protocols that apply quantum controls to the quantum sensor state prior to the final noisy measurement (but after the unknown parameter has been imparted), aiming to maximize the estimation precision. We define the quantum preprocessing-optimized Fisher information, which determines the ultimate precision limit for quantum sensors under measurement noise, and conduct a thorough investigation into optimal preprocessing protocols. First, we formulate the preprocessing optimization problem as a biconvex optimization using the error observable formalism, based on which we prove that unitary controls are optimal for pure states and derive analytical solutions of the optimal controls in several practically relevant cases. Then we prove that for classically mixed states (whose eigenvalues encode the unknown parameter) under commuting-operator measurements, coarse-graining controls are optimal, while unitary controls are suboptimal in certain cases. Finally, we demonstrate that in multi-probe systems where noisy measurements act independently on each probe, the noiseless precision limit can be asymptotically recovered using global controls for a wide range of quantum states and measurements. Applications to noisy Ramsey interferometry and thermometry are presented, as well as explicit circuit constructions of optimal controls.


I. INTRODUCTION
Quantum metrology is one of the pillars of quantum science and technology [1][2][3][4][5]. This field deals with the fundamental precision limits of parameter estimation imposed by quantum physics. Notably, it seeks to use non-classical effects to enhance the estimation precision of unknown parameters in quantum systems, which has led to the development of improved sensing protocols in various experimental platforms [6][7][8][9][10][11]. To characterize the metrological limit of quantum sensors, the quantum Cramér-Rao bound (QCRB) [12,13], which is saturable for a large number of experiments, is conventionally used. It is defined using the quantum Fisher information (QFI) [14][15][16], which is one of the most useful and celebrated tools in quantum metrology, with a considerable amount of research focused on developing better ways to calculate and bound it [17][18][19][20][21][22].
To tackle the effect of measurement noise on quantum metrology, interaction-based readouts were proposed [46][47][48][49][50][51] and demonstrated experimentally [52][53][54], where bespoke inter-particle interactions that enhance phase estimation precision in spin ensembles are applied before the noisy measurement step and after the probing step. The idea of employing unitary controls in a preprocessing manner, i.e., after the unknown parameter has been imparted but prior to the final measurement, was later formulated as the imperfect (or noisy) QFI problem [50,55], where the preprocessing is optimized over all unitary operations. Classical post-processing methods, such as measurement error mitigation [56][57][58], can then work in complement to the quantum preprocessing method for parameter estimation under noisy measurements.
Apart from a few specific cases, such as qubit sensors with lossy photon detection [55], setting the metrological limit under measurement noise by computing the imperfect QFI has been difficult, limiting its practical application. In this work, we propose a more general measurement optimization scheme, where arbitrary quantum controls (i.e., general quantum channels that can be implemented using unitary gates and ancillas) are applied before the noisy measurement. The goal is twofold: to identify the FI optimized over all quantum preprocessing channels for general quantum states and measurements, which we call the quantum preprocessing-optimized FI (QPFI) and which quantifies the ultimate power of quantum sensors under measurement noise; and to obtain the corresponding optimal controls, which can be applied to achieve the optimal sensitivity in practical experiments.
We systematically study the QPFI, along with the corresponding optimal preprocessing controls, in this work. In Sec. II, we first define the QPFI and review related concepts. We then introduce the concept of error observables in Sec. III, and use it to demonstrate that the QPFI problem can be cast as a biconvex optimization problem [59]. In turn, this allows us to find analytical conditions for optimality, and to identify optimal controls saturating the QPFI in the setting of commuting measurements applied to pure states (see Sec. IV). The case of classically mixed states (i.e., states for which the unknown parameter is encoded in the eigenvalues) is studied in Sec. V. Besides analytical solutions, we also prove that unitary controls are optimal for pure states under general measurements, and that coarse-graining controls are optimal for classically mixed states under commuting-operator measurements, with a counterexample illustrating the non-optimality of unitary controls.
For general mixed states, we further prove useful bounds on the QPFI in Sec. VI. In terms of the asymptotic behavior of identical local measurements acting on multi-probe systems, in Sec. VII we identify a sufficient condition for the convergence of the QPFI to the QFI using an optimal encoding protocol based on the Holevo-Schumacher-Westmoreland (HSW) theorem [60,61]. We show that the relevant condition is satisfied by a generic class of quantum states, including low-rank states, permutation-invariant states, and Gibbs states (with an unknown temperature), while previously only the pure state case was proven [55].
Our results provide a theoretically-accessible precision bound for quantum metrology under noisy measurements, along with a roadmap towards preprocessing optimization in sensing experiments.

II. DEFINITIONS
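Given a quantum state $\rho_\theta$ as a function of an unknown parameter $\theta$, the procedure to estimate $\theta$ goes as follows (see Fig. 1a): (1) Perform a quantum measurement $\{M_i\}$ on $\rho_\theta$, which gives a measurement outcome $i$ with probability $p_{i,\theta} = \mathrm{Tr}(\rho_\theta M_i)$; (2) Infer the value of $\theta$ using an estimator $\hat\theta$, which is a function of the measurement outcome $i$; (3) Repeat the above two steps multiple times and use the average of $\hat\theta$ over many trials as the final estimate of $\theta$. Here, the quantum measurement $\{M_i\}$ is mathematically formulated as a positive operator-valued measure (POVM) [62] that satisfies $M_i \geq 0$ and $\sum_i M_i = 1$. For an estimator that is locally unbiased at $\theta$, i.e.,
$$\sum_i p_{i,\theta}\,\hat\theta(i) = \theta, \qquad \sum_i \partial_\theta p_{i,\theta}\,\hat\theta(i) = 1, \tag{1}$$
the estimation error $\Delta\hat\theta$ after $N_{\rm expr}$ repetitions of the experiment obeys the Cramér-Rao bound (CRB)
$$\Delta\hat\theta \geq \frac{1}{\sqrt{N_{\rm expr}\, F(\rho_\theta,\{M_i\})}}, \tag{2}$$
where the Fisher information (FI) is
$$F(\rho_\theta,\{M_i\}) = \sum_i \frac{(\mathrm{Tr}(\partial_\theta\rho_\theta M_i))^2}{\mathrm{Tr}(\rho_\theta M_i)}. \tag{3}$$
The CRB is often saturable asymptotically (i.e., when $N_{\rm expr} \to \infty$) using the maximum likelihood estimator [63][64][65], and therefore the FI, which is inversely proportional to the variance of the estimator, serves as a good measure of the degree of sensitivity of $\{p_{i,\theta}\}$ with respect to $\theta$. One caveat is that the CRB only applies to locally unbiased estimators and can be violated by biased estimators. Additionally, there exist singular cases where maximum likelihood estimators are no longer necessarily asymptotically unbiased, e.g., when the support of $\{p_{i,\theta}\}$ varies in the neighborhood of $\theta$, and the CRB may not apply to them [66]. However, for self-consistency, this paper will focus only on optimizing the FI, regardless of the limitations of the CRB. The QFI of $\rho_\theta$ is the FI maximized over all possible quantum measurements on $\rho_\theta$ (see Appx. A for further details), and we will refer to the optimal measurements as QFI-attainable measurements. Formally, the QFI is defined by [12][13][14]
$$J(\rho_\theta) = \max_{\{M_i\}} F(\rho_\theta,\{M_i\}), \tag{4}$$
giving rise to the QCRB
$$\Delta\hat\theta \geq \frac{1}{\sqrt{N_{\rm expr}\, J(\rho_\theta)}}, \tag{5}$$
which characterizes the ultimate lower bound on the estimation error. Going forward, we will also overload the notation and write
$$F(\{p_{i,\theta}\}) = \sum_i \frac{(\partial_\theta p_{i,\theta})^2}{p_{i,\theta}} \tag{6}$$
to denote the FI of a classical probability distribution $\{p_{i,\theta}\}$, satisfying $p_{i,\theta} \geq 0$ and $\sum_i p_{i,\theta} = 1$. Note that, from now on, we will implicitly assume that the summation is taken over terms with non-zero denominators.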
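As a concrete illustration of the FI of a measured state, the following minimal sketch evaluates the sum of $(\mathrm{Tr}(\partial_\theta\rho_\theta M_i))^2/\mathrm{Tr}(\rho_\theta M_i)$ over outcomes with non-zero probability. The qubit state family, the value of `theta`, and the bit-flip probability `m` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def fisher_information(rho, drho, povm, tol=1e-12):
    """Classical FI of p_i = Tr(rho M_i), skipping zero-probability outcomes."""
    total = 0.0
    for M in povm:
        p = np.real(np.trace(rho @ M))
        dp = np.real(np.trace(drho @ M))
        if p > tol:  # sum only over terms with non-zero denominators
            total += dp**2 / p
    return total

# Hypothetical probe: |psi_theta> = cos(theta/2)|0> + sin(theta/2)|1>, whose QFI is 1.
theta = 0.7
psi = np.array([np.cos(theta / 2), np.sin(theta / 2)])
dpsi = 0.5 * np.array([-np.sin(theta / 2), np.cos(theta / 2)])
rho = np.outer(psi, psi)
drho = np.outer(dpsi, psi) + np.outer(psi, dpsi)

# Ideal projective measurement and a noisy version with bit-flip probability m (assumed value).
projective = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
m = 0.1
noisy = [(1 - m) * projective[0] + m * projective[1],
         m * projective[0] + (1 - m) * projective[1]]

fi_ideal = fisher_information(rho, drho, projective)
fi_noisy = fisher_information(rho, drho, noisy)
print(fi_ideal, fi_noisy)
```

For this particular state the ideal projective measurement attains the QFI, while the noisy readout yields a strictly smaller FI; this gap is what preprocessing aims to reduce.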
In practice, the optimal measurements achieving the QFI are not always implementable, restricting the range of applications of the QCRB. For example, the projective measurement onto the basis of the symmetric logarithmic derivative operators, which is usually a correlated measurement among multiple probes, is known to be optimal [14], while quantum measurements in experiments are usually noisy and not exactly projective. Here, we consider a metrological protocol in which arbitrary quantum controls can be implemented after the unknown parameter $\theta$ has been imparted to the quantum sensor state $\rho_\theta$ and before a fixed quantum measurement is performed (see Fig. 1b). We call this additional step "preprocessing" (in full, "pre-measurement processing"). Note that the idea of implementing preprocessing quantum controls to improve sensitivity goes beyond the FI formalism and applies to other figures of merit of quantum sensors [67]. This model effectively describes quantum experiments where the measurement error is dominant, while the gate implementation error and the state preparation error are relatively small, a noise model that arises naturally in modern quantum devices such as nitrogen-vacancy centers [23][24][25][26] and superconducting qubits [30].
To quantify the sensitivity of estimating $\theta$ on $\rho_\theta$ with the measurement $\{M_i\}$ fixed, we define the FI optimized over all preprocessing quantum channels, or the quantum preprocessing-optimized Fisher information (QPFI), to be
$$F^{P}(\rho_\theta,\{M_i\}) := \sup_{\mathcal{E}} F(\mathcal{E}(\rho_\theta),\{M_i\}), \tag{7}$$
where $\mathcal{E}$ is an arbitrary quantum channel (or a CPTP map [68]). See Appx. B for mathematical properties of the QPFI. In particular, when the quantum measurement is fixed, the CRB induced by the QPFI, i.e., $\Delta\hat\theta \geq 1/\sqrt{N_{\rm expr}\, F^{P}(\rho_\theta,\{M_i\})}$, provides a practical and tighter Cramér-Rao-type bound, compared to the QCRB, for parameter estimation under noisy measurements. We assume in the following discussions that all measurements are non-trivial (i.e., $\exists\, M_i \not\propto 1$, for all $\{M_i\}$) and $\partial_\theta\rho_\theta \neq 0$, so that the QPFI is always positive. Unless stated otherwise, we will denote the systems that $\rho_\theta$ and $\{M_i\}$ act on by $\mathcal{H}_S$ and $\mathcal{H}_{S'}$, respectively, and we will refer to $\mathcal{H}_S$ as the input system and $\mathcal{H}_{S'}$ as the output system. We do not assume $\mathcal{H}_S \cong \mathcal{H}_{S'}$ here. This broader context is of particular interest when the quantum state $\rho_\theta$ cannot be directly measured (e.g., readout of superconducting qubits via a resonator [30] and readout of nuclear spins via an electron spin in a nitrogen-vacancy center [69][70][71][72]), or when the quantum state is restricted to a subsystem of the entire system while the quantum measurement can be performed globally.
Note that for generic noisy measurements, the supremum in Eq. ( 7) is usually attainable, i.e., there exists an optimal E such that F (E(ρ θ ), {M i }) is maximized (see Appx.C).However, there exist singular cases where F (E(ρ θ ), {M i }) has no maximum, due to the singularity of the FI at the point Tr(E(ρ θ )M i ) = 0 (see Sec. IV C for an example).In such cases, there still exist near-optimal quantum controls that attain sup E F (E(ρ θ ), {M i })−η for any small η > 0. In fact, we prove in Appx.C that: and the QPFI F P (ρ θ , {M i }) is attainable for any ϵ ∈ (0, 1].
In the following, we will focus mostly on the case where the QPFI is attainable. We will discuss the behavior of the QPFI, exploring numerical optimization algorithms and analytical solutions to the optimal controls for certain practically relevant quantum states and measurements.
We will also examine the FI optimized over all unitary preprocessing channels, which we call the quantum unitary-preprocessing-optimized Fisher information (QUPFI) [50,55],
$$F^{U}(\rho_\theta,\{M_i\}) := \sup_{U} F(U\rho_\theta U^\dagger,\{M_i\}),$$
where $U$ is an arbitrary unitary gate. (Note that our QUPFI is the same as the imperfect QFI in [55].) Unlike the QPFI, we assume $\mathcal{H}_{S'} \cong \mathcal{H}_S$ (and do not distinguish between $S'$ and $S$) when we talk about the QUPFI, so that it is well defined. We note here that Theorem 1 holds for the QUPFI as well.
The optimal preprocessing controls that attain the QPFI and the QUPFI usually depend on $\theta$, whose value should be roughly known before the experiment. Otherwise, one might use the two-step method by first using $\sqrt{N_{\rm expr}}$ states to obtain a rough estimate $\hat\theta \approx \theta$, and then performing the optimal controls based on $\hat\theta$ on the remaining $N_{\rm expr} - \sqrt{N_{\rm expr}}$ states [73][74][75]. The two-step procedure introduces a negligible amount of error asymptotically.
Before we proceed, we prove a relation between the QPFI and the QUPFI that will be useful later.Proposition 2. Let H S and H S ′ be the input and output systems of E. Suppose H A1 and H A2 are ancillary systems such that (11) where we use subscripts to denote the systems the operators are acting on.
can be implemented by acting unitarily on H S and an ancillary system H A1 , and then tracing over an auxiliary system H A2 , if dim(H A2 ) ≥ r E (Stinespring's dilation [68]).For any quantum channel with the input system H S and the output system H S ′ , there always exists a Kraus representation From Eq. ( 3), it follows that: where we omit the subscripts for simplicity.Note that the Stinespring's dilation technique is also useful in relating the QFI of a mixed state to the QFI of its purification in an extended Hilbert space [17,18].Taking the supremum over E in the above equality, we have On the other hand, for any U from proving the other direction of Eq. (11).

III. ERROR OBSERVABLE FORMULATION
In this section, we will formalize the optimization of the FI over quantum preprocessing controls as a biconvex optimization problem using the concept of error observables. Using this new formulation, the preprocessing optimization problem becomes numerically tractable with standard algorithms for biconvex optimization [59], and also analytically tractable for practically relevant quantum states (see Sec. IV).
Here, we consider the preprocessing optimization problem in Eq. (7). On the surface, it may appear from the definition of the FI (Eq. (3)) that the target function $F(\mathcal{E}(\rho),\{M_i\})$ is mathematically formidable. To simplify the target function, we introduce the error observable $X$ and the squared error observable $X_2$, defined by
$$X = \sum_i x_i M_i, \qquad X_2 = \sum_i x_i^2 M_i,$$
where $x_i$ is interpreted as the difference between the estimator value $\hat\theta(i)$ and the true value $\theta$, i.e., $x_i = \hat\theta(i) - \theta$.
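We assume there are $r$ measurement outcomes and use $x$ to denote the vector $(x_1, \ldots, x_r)$. The local unbiasedness conditions (Eq. (1)) for a single-shot measurement then become
$$\mathrm{Tr}(\rho_\theta X) = 0, \qquad \mathrm{Tr}(\partial_\theta\rho_\theta X) = 1. \tag{15}$$
It can be verified mathematically (which is essentially a proof of the CRB) that the minimum of the variance of the estimator under the local unbiasedness conditions is the inverse of the FI; that is,
$$F(\rho_\theta,\{M_i\})^{-1} = \min_{x}\ \mathrm{Tr}(\rho_\theta X_2), \quad \text{s.t. Eq. (15)}. \tag{16}$$
The problem above is a convex optimization over the variables $x$. An analogous formulation holds for the QFI, $J(\rho_\theta)^{-1} = \min_X \mathrm{Tr}(\rho_\theta X^2)$, with $X$ an arbitrary Hermitian matrix subject to the constraints in Eq. (15). This formulation has several useful applications [78][79][80]. In particular, an algorithm was proposed in [55] based on Eq. (18), to optimize the QFI of quantum channels. Combining Eq. (16) and Eq. (7), we have that
$$F^{P}(\rho_\theta,\{M_i\})^{-1} = \inf_{x,\mathcal{E}}\ \mathrm{Tr}(\mathcal{E}(\rho_\theta) X_2), \quad \text{s.t. } \mathrm{Tr}(\mathcal{E}(\rho_\theta) X) = 0,\ \mathrm{Tr}(\mathcal{E}(\partial_\theta\rho_\theta) X) = 1. \tag{19}$$
Let $\mathcal{H}_S$ and $\mathcal{H}_{S'}$ be the input and output systems of $\mathcal{E}$ and let $\{|k\rangle_S\}_{k=1}^{\dim(\mathcal{H}_S)}$ and $\{|j\rangle_{S'}\}_{j=1}^{\dim(\mathcal{H}_{S'})}$ be two orthonormal bases of $\mathcal{H}_S$ and $\mathcal{H}_{S'}$, respectively. In the rest of this section, we use matrix representations of operators in the above bases. It is convenient to represent a CPTP map $\mathcal{E}$ using a linear operator acting on $\mathcal{H}_{S'} \otimes \mathcal{H}_S$. Let $\mathcal{E}(\cdot) = \sum_i K_i(\cdot)K_i^\dagger$ be the Kraus representation of $\mathcal{E}$. Then, the linear operator $\Omega = \sum_i |K_i\rangle\rangle\langle\langle K_i|$ is usually called the Choi matrix of $\mathcal{E}$ [68], where $|\star\rangle\rangle := \sum_{jk} (\star)_{jk} |j\rangle_{S'}|k\rangle_S$ and $(\star)_{jk} = \langle j|_{S'} (\star) |k\rangle_S$. $\Omega$ corresponds to a CPTP map if and only if $\Omega \geq 0$ and $\mathrm{Tr}_{S'}(\Omega) = 1_S$. The action of $\mathcal{E}$ on any density operator $\sigma$ can be expressed using $\Omega$ through $\mathcal{E}(\sigma) = \mathrm{Tr}_S((1 \otimes \sigma^T)\Omega)$ (we use $(\cdot)^T$ to denote the matrix transpose). Using the Choi matrix representation in Eq. (19), we have:
Theorem 3. The optimal value of the following biconvex optimization problem gives the inverse of the QPFI:
$$\inf_{x,\Omega}\ \mathrm{Tr}\big((X_2 \otimes \rho_\theta^T)\Omega\big), \quad \text{s.t. } \mathrm{Tr}\big((X \otimes \rho_\theta^T)\Omega\big) = 0,\ \mathrm{Tr}\big((X \otimes \partial_\theta\rho_\theta^T)\Omega\big) = 1,\ \Omega \geq 0,\ \mathrm{Tr}_{S'}(\Omega) = 1_S. \tag{20}$$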
Eq. ( 20) is a biconvex optimization problem of variables x and Ω. Fixing Ω, Eq. ( 20) is a quadratic program with respect to x, and fixing x, Eq. ( 20) is a semidefinite program with respect to Ω; each of which is efficiently solvable when the system dimensions are moderate and the domain of variables is compact.
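The Choi-matrix conventions used in this section can be checked numerically. The sketch below, offered only as an illustration, builds $\Omega = \sum_i |K_i\rangle\rangle\langle\langle K_i|$ with the output system as the first tensor factor, then verifies $\mathrm{Tr}_{S'}(\Omega) = 1_S$ and $\mathcal{E}(\sigma) = \mathrm{Tr}_S((1 \otimes \sigma^T)\Omega)$. The amplitude-damping channel, its strength `g`, the test state `sigma`, and all function names are assumptions chosen for the example.

```python
import numpy as np

def choi_from_kraus(kraus):
    """Omega = sum_i |K_i>><<K_i|, with |K>> = sum_jk K_jk |j>_S' |k>_S."""
    vecs = [K.reshape(-1) for K in kraus]  # row-major flattening: index j * d_in + k
    return sum(np.outer(v, v.conj()) for v in vecs)

def apply_choi(omega, sigma, d_out, d_in):
    """E(sigma) = Tr_S((1 (x) sigma^T) Omega)."""
    full = np.kron(np.eye(d_out), sigma.T) @ omega
    full = full.reshape(d_out, d_in, d_out, d_in)
    return np.einsum('jkmk->jm', full)  # trace out the input system S

def trace_out_output(omega, d_out, d_in):
    """Tr_S'(Omega), which must equal the identity on S for a CPTP map."""
    return np.einsum('jkjl->kl', omega.reshape(d_out, d_in, d_out, d_in))

# Hypothetical example: qubit amplitude damping with strength g.
g = 0.3
kraus = [np.array([[1.0, 0.0], [0.0, np.sqrt(1 - g)]]),
         np.array([[0.0, np.sqrt(g)], [0.0, 0.0]])]
omega = choi_from_kraus(kraus)

sigma = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])
via_choi = apply_choi(omega, sigma, 2, 2)
via_kraus = sum(K @ sigma @ K.conj().T for K in kraus)
print(np.allclose(via_choi, via_kraus))
```

With this representation, the objective and constraints of the preprocessing problem are linear in $\Omega$ for fixed $x$, which is what makes the fixed-$x$ subproblem a semidefinite program.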
Note that the domain of $x$ is unbounded in Eq. (20). In practice, one may impose a bounded domain on $x$ so that the minimum of Eq. (20) always exists. For cases where the QPFI is attainable, the optimal value of the bounded version will be equal to that of Eq. (20) when the size of the bounded domain is sufficiently large. For singular cases where the QPFI is not attainable, the optimal value of the bounded version will approach that of Eq. (20) with an arbitrarily small error as the size of the domain increases. We describe an algorithm called the global optimization algorithm [81] in Appx. D that can solve the bounded version of Eq. (20).
Finally, we note that Theorem 3 does not directly generalize to the case of QUPFI because the Choi matrices of unitary operators do not form a convex set.On the other hand, besides the set of quantum channels, our approach is also useful in optimizing the FI over other sets of quantum controls when the constraints on their Choi matrices can be represented using semidefinite constraints, e.g., the set of quantum channels that act only on a subsystem of the entire system.
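For a fixed state and measurement, the minimization over the error vector $x$ described earlier in this section (minimum variance under the local unbiasedness constraints equals the inverse FI) has a closed-form minimizer. The sketch below assumes the standard Lagrange-multiplier solution $x_i = \partial_\theta p_i/(p_i F)$, which is not written out in the text, and checks it against random feasible perturbations. The probability vector `p` and its derivative `dp` are made-up numbers.

```python
import numpy as np

def optimal_error_vector(p, dp):
    """Minimizer of sum_i x_i^2 p_i subject to sum_i x_i p_i = 0 and sum_i x_i dp_i = 1."""
    F = np.sum(dp**2 / p)  # classical Fisher information
    return dp / (p * F), F

# Hypothetical outcome distribution and its derivative (dp must sum to zero).
p = np.array([0.5, 0.3, 0.2])
dp = np.array([0.4, -0.1, -0.3])
x_star, F = optimal_error_vector(p, dp)
variance = np.sum(x_star**2 * p)

# Any other feasible x differs from x_star by a vector in the null space of the constraints.
rng = np.random.default_rng(0)
A = np.vstack([p, dp])
perturbed = []
for _ in range(100):
    v = rng.normal(size=p.size)
    v -= np.linalg.pinv(A) @ (A @ v)  # remove the component along the constraint rows
    perturbed.append(np.sum((x_star + v)**2 * p))

print(variance, 1 / F, min(perturbed))
```

The same quadratic structure is what the alternating scheme exploits: with the channel fixed, updating $x$ is a small quadratic program of exactly this form.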

IV. PURE STATES
In this section, we consider the special case where ρ θ = ψ θ = |ψ θ ⟩ ⟨ψ θ | is pure, which is most common in sensing experiments.We first consider the optimization of the FI over the error vector x and the unitary control U , and obtain two necessary conditions for the optimality of (x, U ).We use these conditions to prove equality between the QPFI and the QUPFI for pure states, showing that unitary controls are optimal for such states (when H S ∼ = H S ′ ).We also obtain an analytical expression of the QPFI for binary measurements (i.e., measurements with only two outcomes), and a semianalytical expression and analytical bounds for general commuting-operator measurements (i.e., measurements {M i } that satisfy [M i , M j ] = 0 for all i, j).In particular, we prove that the optimal control is given by rotating the pure state and its derivative into a two-dimensional subspace spanned by two of the common eigenstates of the commuting-operator measurements.

A. Necessary conditions for optimal controls
Proposition 2 shows that the optimization for the QPFI can be reduced to an optimization for the QUPFI using the ancillary system.Thus, here we first focus on the following optimization problem over the unitary control We obtain necessary conditions for the optimality of (x, U ) that will be useful later.
Lemma 4. If (x, U ) is an optimal point for Eq. ( 21), it must satisfy In particular, suppose where the normalization factor Then Eq. ( 24) is equivalent to the following two conditions: (1) Proof.Assume (x, U ) satisfies the constraints Eq. ( 22) and Eq.(23).Then for any unitary operator V such that Tr(U also satisfies the constraints Eq. ( 22) and Eq. ( 23), where 1 is a r-dimensional vector of which each element is 1.
We call the transformation above a "V-transformation" on (x, U). After a V-transformation, the target function becomes which shall be no smaller than Tr(U ρ θ U † X 2 ) when (x, U) is optimal. Let V = e −idG where dG is an arbitrary infinitesimally small Hermitian matrix. The first-order derivative of Eq. (28) with respect to dG must be zero, which then implies Eq. (24). Specifically, to simplify the notation, let ρ := U ρ θ U † and ρ̇ := U ∂ θ ρ θ U † . Then the difference between the target function after and before the V-transformation must be zero up to the first order of dG, i.e., where in the first step we take the inverse of both sides and ignore higher-order terms, in the second step we multiply both sides by −iTr(ρX 2 ), and in the last step we use the fact that if an operator A satisfies Tr(dGA) = 0 for any Hermitian dG, then A = 0. For pure states, Eq. (24) can be further simplified. Using the definitions of |ϕ⟩ and |ϕ ⊥ ⟩, we have U ρ θ U † = |ϕ⟩ ⟨ϕ| and where h.c. stands for the Hermitian conjugate and we use Then we have |u⟩ = 0, combining |u⟩ ∝ |ϕ⟩ and ⟨ϕ|u⟩ = 0. Note that |u⟩ = 0 is equivalent to Condition (2) after multiplying both sides by ⟨ϕ|X2|ϕ⟩ 2 √ n . Finally, we note that from Condition (1) and Condition (2), the necessary condition in Eq. (24) can be recovered straightforwardly, proving the equivalence between Eq. (24) and Conditions (1) and (2) for pure states.
As a sanity check, consider the special case where is a projection onto an orthonormal basis of H S ′ .Then we have X 2 = X 2 , so Condition (2) is trivially satisfied.Furthermore, choose (x, U ) such that the error observable X = 1 2 √ n (|ϕ ⊥ ⟩ ⟨ϕ|+|ϕ⟩ ⟨ϕ ⊥ |), so that Condition (1) is satisfied.Moreover, the variance of the estimation is implying that the QFI is achievable using the above projective measurement, since J(ρ θ ) = 4n for pure states [14,82].For general quantum measurements, the QUPFI might be strictly smaller than the QFI, in which case for the optimal choice of (x, U ), It is interesting to note that X 2 ≥ X 2 , for general POVM measurements.This follows directly from writing and noting that each term in the above sum is positive semi-definite.
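The operator inequality $X_2 \geq X^2$ for general POVMs can be spot-checked numerically. The following sketch draws a random POVM and a random error vector and inspects the spectrum of $X_2 - X^2$. The construction of the random POVM (normalizing random positive matrices by the inverse square root of their sum) and the dimensions used are illustrative choices, not part of the paper.

```python
import numpy as np

def random_povm(dim, outcomes, rng):
    """Random POVM: M_i = S^{-1/2} A_i S^{-1/2} with A_i >= 0 and S = sum_i A_i."""
    raw = []
    for _ in range(outcomes):
        G = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
        raw.append(G @ G.conj().T)
    S = sum(raw)
    w, V = np.linalg.eigh(S)
    S_inv_sqrt = V @ np.diag(w**-0.5) @ V.conj().T
    return [S_inv_sqrt @ A @ S_inv_sqrt for A in raw]

rng = np.random.default_rng(1)
povm = random_povm(3, 4, rng)
x = rng.normal(size=4)

X = sum(xi * M for xi, M in zip(x, povm))       # error observable
X2 = sum(xi**2 * M for xi, M in zip(x, povm))   # squared error observable
gap_eigs = np.linalg.eigvalsh(X2 - X @ X)
print(gap_eigs)  # all eigenvalues should be non-negative up to rounding
```

For a projective measurement onto an orthonormal basis the gap vanishes identically, which is the sanity-check case discussed above.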

B. Unitary controls are optimal
Using the definitions of |ϕ⟩ and |ϕ ⊥ ⟩ in Eq. ( 26), we observe that Eq. ( 21) can be rewritten as where ψ θ = |ψ θ ⟩ ⟨ψ θ | is pure.Here |ϕ⟩ and |ϕ ⊥ ⟩ are two arbitrary normal vectors that are orthogonal.From Eq. (38), changing |ϕ⟩ to |ϕ⟩ /(2 √ n) makes it clear that F U (ρ θ , {M i }) can be written as the product of J(ψ θ ) = 4n (39) and a state-independent constant.We have where Or more explicitly, (Note that going from Eq. ( 41) to Eq. ( 42), we only need to optimize the target function over x with a fixed (|ϕ⟩ , |ϕ ⊥ ⟩) and use standard methods for quadratic programming, e.g., Lagrange multipliers [76]).Note that Eq. ( 40) and Eq. ( 42) were also proven using a different method in [55].γ({M i }) is the normalized QUPFI for any pure states with unit QFIs and it is a function of {M i } that lies in [0, 1], which is the ratio between the QUPFI and the QFI for any pure states.It is independent of the exact ψ θ and can fully characterize the power of quantum measurements in terms of estimation on pure states.Note that Eq. ( 40) decomposes the QUPFI into the product of the QFI, as a function of states, and the normalized QUPFI, as a function of measurements.This result is useful when experimentalists have control over input states in sensing processes.It implies when a pure input state ψ 0 undergoes unitary evolution U θ , the optimal choices of the input state that maximizes the output FI are identical in situations with or without measurement noise.
Using Condition (1) in Lemma 4, we now prove that unitary controls are always optimal, that is, the QPFI is equal to the QUPFI when $\mathcal{H}_S \cong \mathcal{H}_{S'}$. We have the following theorem.
Theorem 5. Consider a pure state $\psi_\theta$ and a quantum measurement $\{M_i\}$ acting on the same system. Unitary preprocessing controls are always optimal among quantum preprocessing controls for optimizing the FI, i.e.,
$$F^{P}(\psi_\theta,\{M_i\}) = F^{U}(\psi_\theta,\{M_i\}). \tag{43}$$
Or equivalently, where A is an ancillary system of an arbitrary size.
Proof.We first consider the situation where the QUPFI is attainable, that is, there always exists an (x, U ) such that the infimum in Eq. ( 21) is attainable.Using Condition (1) in Lemma 4, we can rewrite Eq. (38) as s.t.⟨ϕ|X|ϕ⟩ = 0, where and Proposition 2 imply It is equivalent to the optimization problem min where σ is an arbitrary density operator and corresponds to Tr A (|ϕ⟩ ⟨ϕ|).We will show below that for any σ * that is optimal for Eq. ( 47), there exists an optimal pure state solution |ϕ * * ⟩ ⟨ϕ * * | for Eq.(47).Then the optimal values of Eq. ( 45) and Eq. ( 47) must be the same, proving Eq. (43).Assume (x * , σ * ) is optimal for Eq.(47).Without loss of generality, we assume supp(σ * ) ⊆ supp((X * ) 2 ), because otherwise σ * projected onto the support of (X * ) 2 is another optimal solution because the constraints in Eq. ( 47) are invariant and the target function is no larger after the projection.We now show there exists another optimal solution (x * * , |ϕ * * ⟩ ⟨ϕ * * |).First, note that X * = i x * i M i and X * 2 = i (x * i ) 2 M i satisfy Tr(σ * X * ) = 0 and Tr(σ * (X * ) 2 ) = 1/(4n) from Eq. ( 47), and where Π is the projection onto the support of σ * .Note that Eq. ( 48) is true because Condition (1) in Lemma 4 and, i=1 is orthonormal in H S .We claim that we can always choose such that ⟨ϕ * * |X * |ϕ * * ⟩ = 0, by picking a suitable {φ k } d k=1 .To see this, observe that: is a real, continuous function f (φ 1 , . . ., φ d ) of {φ k } d k=1 ∈ R d , where we omitted the sum over k = k ′ terms because Tr(σ * X * ) = 0 implies that it vanishes.Note that for any fixed {φ k } d k=1 , the sum of all 2 d terms f (φ ) is zero, implying that one, or more, of these terms is zero, or that some are negative and others are positive.
When the QPFI of Eq. ( 21) is not attainable, we take d and using Theorem 1, we have i }) where in the second step we use the equality between the QPFI and the QUPFI in the case where the QUPFI is attainable.
So far, we have proven that Eq. ( 44) is true when dim(H A ) ≥ dim(H S ′ ) 2 , due to Proposition 2 and the equality between the QPFI and the QUPFI.It also holds for any

C. Analytical solution for binary measurements
Here we provide an analytical solution to the QPFI and the corresponding optimal preprocessing control using Proposition 2 for binary measurements where r = 2.

Measurement on a qubit
We first consider the simplest case where the measurement is on a single qubit.Let X = x 1 M 1 + x 2 M 2 where for some m 1 , m 2 ∈ [0, 1], where {|1⟩ , |2⟩} is an orthonormal basis.Moreover, we assume m 1 > m 2 and 1 − m 1 ≥ m 2 .(When m 1 = m 2 , we must have γ({M i }) = 0 because the measurement outcome does not depend on θ.)Here m 2 and 1 − m 1 can be interpreted as the error probabilities that state |2⟩ is mistaken for |1⟩, and state |1⟩ is mistaken for |2⟩, respectively.Consider first the case where 1 > m 1 > m 2 > 0, that is, the error probabilities are both non-zero.We show in Appx.E 1 that all solutions that satisfy the two necessary conditions in Lemma 4 give the same optimal FI.One optimal solution to the preprocessed state is where Here the optimal unitary control U * can be chosen as any unitary such that Eq. ( 26) is true for Eq. ( 54) and Eq. ( 55).(In the following, we will only use (|ϕ * ⟩ , |ϕ ⊥ * ⟩) to represent the optimal preprocessing unitary with the implicit assumption that U * can be chosen as any unitary rotating (|ψ θ ⟩ , |ψ ⊥ θ ⟩) to (|ϕ * ⟩ , |ϕ ⊥ * ⟩)).Note that the symmetry transformations |ϕ ⊥ * ⟩ → − |ϕ ⊥ * ⟩, |1⟩ → e iω |1⟩ and |2⟩ → e iω ′ |2⟩ for any ω, ω ′ ∈ R will generate alternative optimal solutions, and they all provide the same optimal normalized FI: Note that this result was obtained also in [55] using a different method based on the Bloch sphere representation.
Here $1 - \gamma(\{M_i\})$ is exactly equal to the fidelity between the two binary probability distributions $(m_1, 1 - m_1)$ and $(m_2, 1 - m_2)$, i.e., $\gamma(\{M_i\}) = 1 - \big(\sqrt{m_1 m_2} + \sqrt{(1 - m_1)(1 - m_2)}\big)^2$. Take the symmetric binary measurement as an example, where $m_1 = 1 - m$, $m_2 = m$ and $m < 1/2$, and $m$ represents the probability of a bit-flip error in the measurement. Then we have $p^* = 1/2$ (as expected from the bit-flip symmetry), and $\gamma(\{M_i\}) = 1 - 4m(1 - m)$, which is equal to 1 in the noiseless case, and drops to 0 when $m \to 1/2$.
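The fidelity statement above can be cross-checked against a direct optimization. The sketch below assumes the parametrization $|\phi\rangle = \sqrt{p}|1\rangle + \sqrt{1-p}|2\rangle$, $|\phi^\perp\rangle = \sqrt{1-p}|1\rangle - \sqrt{p}|2\rangle$, for which a short calculation (ours, not reproduced from the paper) gives the FI-to-QFI ratio $(m_1 - m_2)^2 p(1-p)/(q(1-q))$ with $q = m_1 p + m_2(1-p)$. It compares a grid search over $p$ with the closed form $1 - (\sqrt{m_1 m_2} + \sqrt{(1-m_1)(1-m_2)})^2$. Function names and the numerical values of $m_1$, $m_2$, $m$ are illustrative.

```python
import numpy as np

def gamma_closed_form(m1, m2):
    """One minus the fidelity between (m1, 1 - m1) and (m2, 1 - m2)."""
    fidelity = (np.sqrt(m1 * m2) + np.sqrt((1 - m1) * (1 - m2)))**2
    return 1.0 - fidelity

def gamma_grid_search(m1, m2, points=200001):
    """Maximize the FI/QFI ratio over the population p of basis state |1>."""
    p = np.linspace(0.0, 1.0, points)[1:-1]  # drop the endpoints p = 0 and p = 1
    q = m1 * p + m2 * (1 - p)                # probability of the first outcome
    ratio = (m1 - m2)**2 * p * (1 - p) / (q * (1 - q))
    k = np.argmax(ratio)
    return ratio[k], p[k]

# Asymmetric readout errors (assumed values).
closed_asym = gamma_closed_form(0.9, 0.2)
grid_asym, p_asym = gamma_grid_search(0.9, 0.2)

# Symmetric bit-flip noise with probability m (assumed value).
m = 0.1
closed_sym = gamma_closed_form(1 - m, m)
grid_sym, p_sym = gamma_grid_search(1 - m, m)
print(closed_asym, grid_asym, p_asym, closed_sym, grid_sym, p_sym)
```

In the symmetric case the maximizer sits at $p = 1/2$, while asymmetric errors shift the optimal population away from the balanced superposition.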
In the case of perfect projective measurements where 1 = m 1 > m 2 = 0, we show in Appx.E 1 that the QPFI is equal to the QFI and is attainable for any 0 < p * < 1.The case where 1 > m 1 > m 2 = 0 is singular, in the sense that the QPFI is no longer attainable but only approachable.It corresponds to the situation where one type of error (|2⟩ mistaken for |1⟩) is zero, while the other (|1⟩ mistaken for |2⟩) is non-zero.In this case, we have γ({M i }) = m 1 using Eq. ( 57) and Theorem 1.

Measurement on a qudit
Next, we consider the general case where the measurement is on a qudit and we assume dim(H S ′ ) = d ≥ 2. Without loss of generality, we assume where {|j⟩} d j=1 is an orthonormal basis of H S ′ .We also assume m i ≥ m j for all i ≤ j without loss of generality.
Here we assume 1 > m 1 > m d > 0, which guarantees the attainability of the QPFI (see Lemma S1 in Appx.C) and the non-triviality of quantum measurements.(The singular cases where m 1 = 1 or m d = 0 can be derived using Theorem 1.) We show in Appx.E 2 that the optimal solution to |ϕ⟩ is supported on basis states corresponding to at most two different values of m i and the problem is simplified to selecting the optimal basis states and applying the qubit-case results.We show that is an optimal solution, where The normalized QPFI is given by Viewing {(m i , 1 − m i )} d i=1 as d binary probability distributions, the optimal strategy is always to select the two probability distributions that have the minimum fidelity (i.e., the largest distance) between each other.

D. Semi-analytical solution and analytical bounds for commuting-operator measurements
Here we consider commuting-operator measurements, where all measurement operators commute, which is among the most common types of measurements in quantum sensing experiments, e.g., projective measurements affected by detection errors.
Assume dim(H S ′ ) = d ≥ 2. Without loss of generality, we assume where {|j⟩} d j=1 is an orthonormal basis of H S ′ and r i=1 m (i) j = 1 for all j.Again, we assume m (i) j > 0 for all i, j to exclude the singular cases where the QPFI is not attainable.
In order to find the optimal control, we first prove the following theorem which states that the optimal |ϕ⟩ can be restricted to a two-dimensional subspace spanned by two basis states, i.e., the optimal unitary controls rotate the pure state and its derivative to a subspace spanned by two of the eigenstates of the commuting-operator measurement.
Theorem 6.For commuting-operator measurements (Eq.( 63)), there always exists an optimal solution to √ p |l⟩ for two basis states |k⟩ and |l⟩ and The proof is provided in Appx.F 1. Then we see that the normalized QPFI for commuting-operator measurements will be using Theorem 6, where and {M i }| span{|k⟩,|l⟩} is the quantum measurement restricted in the subspace spanned by |k⟩ and |l⟩.
We show in Appx.F 2 that where p * kl ∈ (0, 1) is the unique solution to (67) and the corresponding optimal preprocessed state in span{|k⟩ , |l⟩} is (The symmetry transformations |ϕ ⊥ * ⟩ → − |ϕ ⊥ * ⟩, |k⟩ → e iω |k⟩ and |l⟩ → e iω ′ |l⟩ for any ω, ω ′ ∈ R will generate alternative optimal solutions.)The optimal preprocessed state (|ϕ * ⟩ , |ϕ ⊥ * ⟩) in the entire Hilbert space that achieves Eq. ( 64) is chosen as For the special case where r = 2, the problem reduces to the binary measurement problem discussed in Sec.IV C and p * kl can be found analytically.In general, however, the analytical solution to p * kl might not exist since it is a root of a high degree polynomial (Eq.( 67)) and numerical methods are needed.Nonetheless, a simple analytical upper bound on γ({M i }) can still be obtained, as shown in the following theorem (see a detailed proof in Appx.F 3).
Theorem 7.For commuting-operator measurements (Eq.( 63)), the normalized QPFI γ({M i }) satisfies When there exists a (k, l) that minimizes i m such that the set , 1 ≤ i ≤ r contains at most two elements, the inequality is tight.
To derive lower bounds on γ({M i }), one could replace p * kl with any 0 ≤ p ≤ 1 in the expression Eq. ( 66).For example, taking p = 1/2, we have (as also shown in [55]) where we use m Combining the upper and lower bounds, we observe that It means that the QPFI will be close to the QFI when there exist two basis states |k⟩ and |l⟩ such that the fidelity between two probability distributions {m l } is close to zero (meaning that they are almost perfectly distinguishable).
The upper bound in Eq. (70) is saturated when the measurement is binary. Another physical example is lossy photodetection. The probability of detecting $i$ photons given a Fock state of $k$ ($i \leq k$) photons is $m^{(i)}_k = \binom{k}{i}(1 - \eta)^i \eta^{k-i}$, where $1 - \eta$ is the quantum efficiency of the photodetector. Assuming the maximal number of photons is $N$, it is simple to see that the optimal basis states are the Fock states $|0\rangle$ and $|N\rangle$. Since $m^{(i)}_0 m^{(i)}_N$ is non-vanishing only for $i = 0$, $\gamma(\{M_i\})$ saturates the upper bound: $\gamma(\{M_i\}) = 1 - \eta^N$. (Technically, we need to assume all $m^{(i)}_k > 0$ to avoid the singularity issue, but the above statement holds because the value of $\gamma(\{M_i\})$ can be calculated by first adding a small perturbation to the detection errors (as in Theorem 1) and then taking the limit as the perturbation vanishes.) Finally, note that although Theorem 6 and Theorem 7 do not directly tell us how to choose the two optimal basis states, such a choice may sometimes be obvious. For example, consider an $n$-qubit system $(\mathrm{span}\{|1\rangle, |2\rangle\})^{\otimes n}$ measured by $\{M, 1 - M\}^{\otimes n}$ (independently on each subsystem), with $M = (1 - m)|1\rangle\langle 1| + m|2\rangle\langle 2|$. Then using Theorem 6, due to the bit-flip symmetry and the fact that tracing out some parts of the quantum state will not increase its QPFI, it is clear that rotating $(|\psi_\theta\rangle, |\psi^\perp_\theta\rangle)$ into $\mathrm{span}\{|1\rangle^{\otimes n}, |2\rangle^{\otimes n}\}$, or any other basis states, e.g., must be an optimal choice. In general, it remains open whether there is a simple criterion to help us select the optimal $k$ and $l$ besides a direct calculation of Eq. (66) (or sometimes Eq. (70)) for different $k$ and $l$.
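The lossy-photodetection example can be reproduced numerically under two assumptions that are not fully legible in the text above: that the detection statistics are binomial, with each photon lost independently with probability $\eta$, and that the upper bound on $\gamma$ takes the form of one minus the smallest classical fidelity between the outcome distributions of two basis states, as the surrounding discussion of fidelities suggests. The function names and the values of `N` and `eta` are illustrative.

```python
import numpy as np
from math import comb

def loss_matrix(N, eta):
    """m[i, k]: probability of detecting i photons given Fock state |k> (assumed binomial loss)."""
    m = np.zeros((N + 1, N + 1))
    for k in range(N + 1):
        for i in range(k + 1):
            m[i, k] = comb(k, i) * (1 - eta)**i * eta**(k - i)
    return m

def classical_fidelity(a, b):
    """Fidelity (squared Bhattacharyya coefficient) between two distributions."""
    return np.sum(np.sqrt(a * b))**2

N, eta = 4, 0.3  # assumed maximal photon number and per-photon loss probability
m = loss_matrix(N, eta)

pairs = [(k, l) for k in range(N + 1) for l in range(k + 1, N + 1)]
fidelities = {pair: classical_fidelity(m[:, pair[0]], m[:, pair[1]]) for pair in pairs}
best_pair = min(fidelities, key=fidelities.get)
gamma_bound = 1.0 - fidelities[best_pair]
print(best_pair, gamma_bound, 1 - eta**N)
```

Under these assumptions the least-overlapping pair is the vacuum and the $N$-photon Fock state, and the resulting value coincides with $1 - \eta^N$.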

V. CLASSICALLY MIXED STATES
In this section, we consider another type of quantum states, which we call classically mixed states, with commuting-operator measurements. A classically mixed state is a state that commutes with its derivative, e.g., a Gibbs state whose temperature is to be estimated [83]. In this section, we use the following form of classically mixed states: $\zeta_\theta = \sum_{i=1}^{D} \lambda_{i,\theta}\,|i\rangle\langle i|$, (73) where D = dim(H_S), λ_{i,θ} are functions of θ (we will drop the subscript θ for conciseness), {|i⟩} is an orthonormal basis of H_S that is independent of θ, and we use ζ_θ to represent classically mixed states. Note that the QFI of Eq. (73) is $J(\zeta_\theta) = \sum_{i:\lambda_i>0} (\partial_\theta\lambda_i)^2/\lambda_i$. Also, note that we assume in this section, without loss of generality, that the commuting-operator measurement {M_i} and the classically mixed state ζ_θ share the same eigenstates $\{|i\rangle\}_{i=1}^{\max\{d,D\}}$, as it is always possible to apply a unitary rotation in the preprocessing control so that they are aligned.
We first show that optimizing the FI over quantum channels is equivalent to finding optimal stochastic matrices (which describe the transitions of a classical Markov chain) for the classical preprocessing optimization problem. Then we prove that the optimal control always corresponds to a stochastic matrix whose elements are all 0 or 1, which we call a coarse-graining stochastic matrix. It implies that the QPFI is always attainable, and that the QPFI can in some cases be strictly larger than the QUPFI. Finally, we closely examine the case of a binary measurement on a single qubit.
A. Optimization over stochastic matrices

Lemma 8. Consider classically mixed states Eq. (73) and commuting-operator measurements Eq. (63). Then $F^{P}(\zeta_\theta,\{M_i\}) = \sup_{P\in S_{d,D}} J(\{m^{(i)T} P \lambda_\theta\})$, (74) and when d = D, $F^{U}(\zeta_\theta,\{M_i\}) \le \sup_{P\in S^{db}_{D,D}} J(\{m^{(i)T} P \lambda_\theta\})$, (75) where S_{d,D} represents the set of d × D stochastic matrices of which every column vector sums up to one, S^{db}_{D,D} represents the set of D × D doubly stochastic matrices of which every column and row vector sums up to one, m^{(i)} is a column vector whose entries are m^{(i)}_j, and λ_θ is a column vector whose entries are λ_i.
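Proof. Let $E(\cdot) = \sum_j K_j(\cdot)K_j^\dagger$ be an arbitrary quantum channel. Then we have $\mathrm{Tr}(E(\zeta_\theta)M_i) = \sum_{\ell,k} m^{(i)}_\ell P_{\ell k}\lambda_k = m^{(i)T}P\lambda_\theta$, (76) where the matrix P satisfies $P_{\ell k} = \sum_j |\langle \ell|K_j|k\rangle|^2$. We must have $\sum_\ell P_{\ell k} = \sum_j \langle k|K_j^\dagger K_j|k\rangle = 1$. Thus, P is a stochastic matrix. For any quantum channel, there exists a stochastic matrix such that Eq. (76) holds true, proving that the left-hand side is no larger than the right-hand side in Eq. (74). Moreover, when E(•) = U(•)U† is a unitary channel, P_{ℓk} = |U_{ℓk}|^2 must be doubly stochastic, implying Eq. (75).
On the other hand, for any stochastic matrix P, we define $K_{\ell k} = \sqrt{P_{\ell k}}\,|\ell\rangle\langle k|$. (77) $E(\cdot) = \sum_{\ell k} K_{\ell k}(\cdot)K_{\ell k}^\dagger$ is then a quantum channel. For any stochastic matrix, there exists a quantum channel such that Eq. (76) holds true, proving that the left-hand side is no smaller than the right-hand side in Eq. (74).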
We show in Lemma 8 that the problem of optimizing preprocessing quantum controls on classically mixed states with commuting-operator measurements is equivalent to a classical version of preprocessing optimization, where $F^{P}(\lambda_\theta,\{m^{(i)}\}) := \sup_{P\in S_{d,D}} J(\{m^{(i)T}P\lambda_\theta\})$ (78) represents the classical FI with respect to a classical distribution λ_θ and a noisy measurement {m^{(i)}} satisfying Σ_i m^{(i)} = 1 (1 is a vector with all elements equal to 1), optimized over any stochastic mapping described by stochastic matrices. Here $J(\{m^{(i)T}P\lambda_\theta\}) = \sum_i (m^{(i)T}P\,\partial_\theta\lambda_\theta)^2/(m^{(i)T}P\lambda_\theta)$. In particular, for perfect measurements where $(m^{(i)})_j = \delta_{ij}$, $F^{P}(\lambda_\theta,\{m^{(i)}\}) = J(\lambda_\theta) = \sum_i (\partial_\theta\lambda_i)^2/\lambda_i$ is the classical FI. Note that Theorem 10 presented later implies that the supremum of the FI over stochastic matrices is always attainable using some P ∈ S_{d,D}, which means we are allowed to replace sup_{P∈S_{d,D}} by max_{P∈S_{d,D}} in the definition (Eq. (78)).
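The classical quantity optimized in Lemma 8 can be evaluated directly. The sketch below is illustrative only. It computes the FI of the outcome probabilities $p_i = m^{(i)T} P \lambda_\theta$ for a given stochastic matrix P. The function names, the qubit distribution (cos²θ, sin²θ), the value θ = 0.3, and the flip rate 0.1 are hypothetical choices, not taken from the text.

```python
import numpy as np

def classical_fi(p, dp):
    """Classical Fisher information sum_i (dp_i)^2 / p_i over outcomes with p_i > 0."""
    p, dp = np.asarray(p, float), np.asarray(dp, float)
    keep = p > 1e-15
    return float(np.sum(dp[keep] ** 2 / p[keep]))

def preprocessed_fi(m, P, lam, dlam):
    """FI of p_i = m^(i)T P lam. Rows of m are the vectors m^(i); P is d x D."""
    m, P = np.asarray(m, float), np.asarray(P, float)
    assert np.allclose(P.sum(axis=0), 1.0), "every column of P must sum to one"
    assert np.allclose(m.sum(axis=0), 1.0), "sum_i m^(i) must be the all-ones vector"
    return classical_fi(m @ P @ lam, m @ P @ dlam)

theta = 0.3                                    # hypothetical parameter value
lam = np.array([np.cos(theta) ** 2, np.sin(theta) ** 2])
dlam = np.array([-np.sin(2 * theta), np.sin(2 * theta)])

perfect = np.eye(2)                            # (m^(i))_j = delta_ij
noisy = np.array([[0.9, 0.1], [0.1, 0.9]])     # hypothetical symmetric flip rate 0.1
fi_perfect = preprocessed_fi(perfect, np.eye(2), lam, dlam)
fi_noisy = preprocessed_fi(noisy, np.eye(2), lam, dlam)
print(fi_perfect, fi_noisy)
```

With the perfect measurement and P equal to the identity, the function reduces to the classical FI of λ_θ, which is 4 for this distribution. The noisy measurement gives a strictly smaller value.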

B. Coarse-graining controls are optimal
We first consider the classical case and prove that Eq. (78) can always be attained using some d × D stochastic matrix P where every element of P is either 0 or 1. We call this type of stochastic matrix a coarse-graining stochastic matrix in the sense that P sums up one or multiple entries of λ_θ to one entry in Pλ_θ, which is a coarse graining of measurement outcomes.

Lemma 9. Consider a classical probability distribution λ_θ ∈ R^D and a measurement {m^{(i)}} ⊆ R^d (satisfying Σ_i m^{(i)} = 1). When F^P(λ_θ, {m^{(i)}}) is attainable, there exists a d × D coarse-graining stochastic matrix P such that,

Proof. Suppose F^P(λ_θ, {m^{(i)}}) is attainable and P* is an optimal solution. We will show that there exists an optimal solution P whose every column vector contains one (and only one) element equal to , where a*^{(i)} and b*^{(i)} are constants, independent of t_1. The second order derivative of f(t_1) is which is always non-negative. Therefore, f(t_1) is a convex function and always attains its maximum at the boundary t_1 = 0 or t_1 = a*_1. Repeating this argument, one can show that there exists an optimal solution P such that there is only one positive entry in every column.
Note that it is not necessarily true that the optimal coarse-graining stochastic matrix that maximizes J({m^{(i)T} P λ_θ}) is a full-rank matrix. Consider the following example. Let and m^{(2)} = (0, 1/2, 1). Then it is clear that the following stochastic matrix is optimal, because J({m^{(i)T} P* λ_θ}) = J(λ_θ) = 4. However, it can be verified by enumeration that J({m^{(i)T} P λ_θ}) ≤ 3 whenever P is a permutation matrix, showing the non-optimality of the full-rank stochastic matrices.
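The enumeration mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the authors' example verbatim. The measurement vector m^{(2)} = (0, 1/2, 1) is taken from the text and m^{(1)} = 1 − m^{(2)} follows from normalization. The distribution λ_θ = (cos²θ, sin²θ/2, sin²θ/2) is an assumed choice that reproduces the quoted values J(λ_θ) = 4 and the bound 3 for permutations. The value θ = π/4 is the one used in the proof of Theorem 11.

```python
import itertools
import numpy as np

theta = np.pi / 4
s2, c2, s2t = np.sin(theta) ** 2, np.cos(theta) ** 2, np.sin(2 * theta)
lam = np.array([c2, s2 / 2, s2 / 2])      # assumed distribution (not from the source)
dlam = np.array([-s2t, s2t / 2, s2t / 2])  # its derivative with respect to theta
m = np.array([[1.0, 0.5, 0.0],            # m^(1) = 1 - m^(2)
              [0.0, 0.5, 1.0]])           # m^(2), as given in the text

def fi(P):
    """Classical FI of the outcome distribution p_i = m^(i)T P lam."""
    p, dp = m @ P @ lam, m @ P @ dlam
    keep = p > 1e-15
    return float(np.sum(dp[keep] ** 2 / p[keep]))

def coarse_graining_matrices(d, D):
    """Yield all d x D matrices with exactly one 1 per column (d**D of them)."""
    for rows in itertools.product(range(d), repeat=D):
        P = np.zeros((d, D))
        P[np.array(rows), np.arange(D)] = 1.0
        yield P

best_cg = max(fi(P) for P in coarse_graining_matrices(3, 3))
best_perm = max(fi(np.eye(3)[list(perm)])
                for perm in itertools.permutations(range(3)))
print(best_cg, best_perm)
```

For this assumed distribution the best coarse-graining matrix merges the second and third entries and recovers the full value 4, whereas no permutation matrix exceeds 3.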
Using Lemma 8, we can show a similar result to Lemma 9 in the quantum case, that is, coarse-graining channels are optimal quantum controls.
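Theorem 10. Consider classically mixed states Eq. (73) and commuting-operator measurements Eq. (63). The QPFI is always attainable using the following type of quantum channels, which we call coarse-graining channels, $E(\cdot) = \sum_{\ell k} P_{\ell k}\,|\ell\rangle\langle k|(\cdot)|k\rangle\langle \ell|$, (82) where P is a d × D stochastic matrix satisfying $\sum_\ell P_{\ell k} = 1$ and P_{ℓk} = 0 or 1.

Proof. By definition, there exists a sequence of channels {E_n} such that $\lim_{n\to\infty} F(E_n(\zeta_\theta),\{M_i\}) = F^{P}(\zeta_\theta,\{M_i\})$. According to Eq. (77) in the proof of Lemma 8 and the arguments in Lemma 9, for every E_n there exists a channel Ẽ_n of the form Eq. (82) such that $F(\tilde{E}_n(\zeta_\theta),\{M_i\}) \ge F(E_n(\zeta_\theta),\{M_i\})$. Since there are finitely many channels of the form Eq. (82), there must exist an E* = Ẽ_n for some n such that $F(E^*(\zeta_\theta),\{M_i\}) = F^{P}(\zeta_\theta,\{M_i\})$, proving the attainability of the QPFI.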
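A coarse-graining channel can be written out explicitly in terms of Kraus operators. The sketch below assumes the Kraus form $K_{\ell k} = \sqrt{P_{\ell k}}\,|\ell\rangle\langle k|$, which is a reading of the construction in the proof of Lemma 8 rather than a quotation. The matrix P and the eigenvalues λ are hypothetical.

```python
import numpy as np

def coarse_graining_kraus(P):
    """Kraus operators sqrt(P[l, k]) |l><k| for every nonzero entry of P (assumed form)."""
    d, D = P.shape
    kraus = []
    for l in range(d):
        for k in range(D):
            if P[l, k] > 0:
                K = np.zeros((d, D))
                K[l, k] = np.sqrt(P[l, k])
                kraus.append(K)
    return kraus

def apply_channel(kraus, rho):
    """Apply the channel rho -> sum_j K_j rho K_j^dagger."""
    return sum(K @ rho @ K.conj().T for K in kraus)

# Hypothetical coarse-graining matrix: keep level 1, merge levels 2 and 3 into level 3.
P = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0],
              [0.0, 1.0, 1.0]])
kraus = coarse_graining_kraus(P)
completeness = sum(K.conj().T @ K for K in kraus)  # should be the identity (trace preserving)

lam = np.array([0.5, 0.2, 0.3])                    # hypothetical eigenvalues of the state
out = apply_channel(kraus, np.diag(lam))
print(np.diag(out))
```

The completeness relation holds because every column of P sums to one, and the output populations are P λ, so the quantum channel reproduces the classical coarse graining.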
Theorem 10 also implies that there is a gap between the QUPFI and the QPFI for general quantum states, unlike for pure states where the QUPFI is equal to the QPFI.
Theorem 11. There exists a classically mixed state ζ_θ and a commuting-operator measurement {M_i} such that $F^{U}(\zeta_\theta,\{M_i\}) < F^{P}(\zeta_\theta,\{M_i\})$. (84)

Proof. Consider the example discussed below Lemma 9, where we fix θ = π/4. Theorem 10 implies that for where In general, given any stochastic matrix P, the probabilities for measurement outcomes 1 and 2 must have the form for some 0 ≤ a, b ≤ 1. Moreover, J({p 1). Note that the situation where (a, b) = (1, 0) or (0, 1) is not possible if P is doubly stochastic. Applying Lemma 8, Eq. (84) is then proven.
The intuition behind this type of gap between the QPFI and the QUPFI stems from the fact that nonunitary operations, e.g., the coarse-graining channel, have the power of reducing the rank of quantum states, while unitary operations do not.Consequently, when certain conditions are met: (i) the noisy measurement under consideration is noiseless in a lower-dimensional subspace, e.g., span{|1⟩ , |3⟩} in the example above and (ii) the rank of the quantum state can be compressed without reducing its QFI, e.g., collapsing span{|2⟩ , |3⟩} into span{|3⟩}, non-unitary preprocessing operations can achieve the optimal QFI.In contrast, relying solely on unitary preprocessing for high-rank states results in unavoidable measurement noise and suboptimal performance.
Finally, we note that although the implementation of general quantum preprocessing channels can sometimes be challenging, since it requires preparing a clean ancillary system for Stinespring's dilation (see Proposition 2), the resources needed to perform coarse-graining channels can be reduced in many cases. Firstly, the ancilla size required to perform coarse-graining channels is in principle smaller than the d^2 required in general. In fact, any coarse-graining channel defined by Eq. (82) can be simulated using a d-dimensional ancilla, e.g., by first performing a unitary operation on H_S ⊗ H_A1 that maps |k⟩_S |0⟩_A1 → |k⟩_S |ι(k)⟩_A1 for all k, where ι(k) is the index of the row such that P_{ι(k)k} = 1, and then discarding the probe system H_S. Secondly, the coarse-graining channel can in some cases be performed by resetting parts of the system, with no additional ancillas. For example, consider a two-qubit quantum state The coarse-graining channel mapping |00⟩ → |00⟩, |10⟩ → |11⟩ and |11⟩ → |11⟩ is optimal and it can be performed by first resetting the second qubit to |0⟩ and then applying a CNOT gate that maps |00⟩ → |00⟩ and |10⟩ → |11⟩. Note that resetting qubits is usually considered much less noisy than measuring them, e.g., in nitrogen-vacancy centers [27,28].
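The reset-then-CNOT realization described above can be checked on a diagonal two-qubit state. This is a small sketch in which the weights on |00⟩, |10⟩ and |11⟩ are hypothetical, since the specific state is not reproduced here. The reset is modeled by the Kraus operators 1 ⊗ |0⟩⟨0| and 1 ⊗ |0⟩⟨1|.

```python
import numpy as np

I2 = np.eye(2)
ket0 = np.array([[1.0], [0.0]])
ket1 = np.array([[0.0], [1.0]])

# Reset of the second qubit to |0>: Kraus operators 1 (x) |0><0| and 1 (x) |0><1|.
reset_kraus = [np.kron(I2, ket0 @ ket0.T), np.kron(I2, ket0 @ ket1.T)]

# CNOT with the first qubit as control, in the basis |00>, |01>, |10>, |11>.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

def reset_then_cnot(rho):
    """Reset qubit 2, then apply CNOT (control: qubit 1, target: qubit 2)."""
    rho_reset = sum(K @ rho @ K.T for K in reset_kraus)
    return CNOT @ rho_reset @ CNOT.T

w = np.array([0.5, 0.0, 0.2, 0.3])  # hypothetical weights on |00>, |01>, |10>, |11>
out = reset_then_cnot(np.diag(w))
print(np.diag(out))
```

The populations of |10⟩ and |11⟩ are merged into |11⟩ while |00⟩ is untouched, which is the coarse-graining map |00⟩ → |00⟩, |10⟩ → |11⟩, |11⟩ → |11⟩ with no ancilla.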

C. Binary measurement on a single qubit
With Theorem 10, in principle, one can find the QPFI for classically mixed states and commuting-operator measurements by exhausting all channels of the form Eq. (82), which form a finite set. However, since the number of coarse-graining stochastic matrices (d^D) grows exponentially with the state dimension D, exhaustive search quickly becomes too costly. Here we closely examine a special case where a classically mixed state is measured by a binary measurement on a single qubit. We will show that the time to find a solution can be reduced to linear complexity by narrowing down the possible forms of the optimal controls.
To be specific, consider the binary measurement Then using Lemma 8, we first have and and t is a column vector in [0, 1]^D, corresponding to the first row of the stochastic matrix P in the proof of Lemma 8. (Note that although Theorem 10 makes it possible to restrict t to {0, 1}^D, we keep the generality of t by allowing it to be in [0, 1]^D for later use.) Without loss of generality, we can arrange the order of the positive elements in λ_θ such that Then we assert that where 1_{≤i} represents the vector whose first i elements are equal to 1 and the rest are zero, and 1_{≥i+1} = 1 − 1_{≤i}. Now we prove Eq. (91). Choose an optimal t* ∈ [0, 1]^D that maximizes f(t). We prove Eq. (91) in each of the following three cases: (i) t*^T ∂_θ λ_θ = 0. Then the QPFI is zero and Eq. (91) is trivial.
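The reduction from exhaustive search to a threshold search can be illustrated numerically. The sketch below rests on assumptions that are not quoted from the text: the binary measurement is taken to be M_1 = (1 − m_1)|1⟩⟨1| + m_2|2⟩⟨2|, and the entries of λ_θ are ordered by decreasing ∂_θλ_i/λ_i. The function names and all numerical values are hypothetical.

```python
import itertools
import numpy as np

def binary_fi(t, lam, dlam, m1, m2):
    """FI of a binary outcome with p_1 = (1 - m1) s + m2 (1 - s), where s = t . lam."""
    s, ds = t @ lam, t @ dlam
    p = (1 - m1) * s + m2 * (1 - s)
    dp = (1 - m1 - m2) * ds
    return dp ** 2 / (p * (1 - p))

def threshold_search(lam, dlam, m1, m2):
    """Evaluate only the 2(D + 1) threshold vectors in the assumed ordering."""
    order = np.argsort(-dlam / lam)  # decreasing d(log lam_i)/d(theta)
    D = len(lam)
    best = 0.0
    for i in range(D + 1):
        t = np.zeros(D)
        t[order[:i]] = 1.0           # ones on the first i entries of the ordering
        best = max(best,
                   binary_fi(t, lam, dlam, m1, m2),
                   binary_fi(1 - t, lam, dlam, m1, m2))
    return best

def brute_force(lam, dlam, m1, m2):
    """Evaluate all 2**D vectors t in {0, 1}^D."""
    return max(binary_fi(np.array(bits, float), lam, dlam, m1, m2)
               for bits in itertools.product([0, 1], repeat=len(lam)))

lam = np.array([0.10, 0.15, 0.20, 0.25, 0.30])     # hypothetical eigenvalues
dlam = np.array([0.30, -0.10, 0.05, -0.20, -0.05])  # hypothetical derivatives, sum to zero
fi_threshold = threshold_search(lam, dlam, 0.1, 0.2)
fi_brute = brute_force(lam, dlam, 0.1, 0.2)
print(fi_threshold, fi_brute)
```

In this instance the threshold search over 2(D + 1) candidates returns the same value as the search over all 2^D binary vectors. Using running sums of λ_i and ∂_θλ_i would make each candidate an O(1) update once the entries are sorted.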

VI. GENERAL QUANTUM STATES
In Sec. IV and Sec. V, we obtained results on preprocessing optimization for pure states and classically mixed states. Here, we consider the QPFI for general mixed states and derive useful upper and lower bounds on it.
A. Upper bound

Theorem 12. Given any density operator ρ_θ and quantum measurement {M_i}, we have $F^{P}(\rho_\theta,\{M_i\}) \le \gamma(\{M_i\})\,J(\rho_\theta)$.

Proof. Suppose H_A1 and H_A2 are ancillary systems such that where H_S and H_S′ are the systems ρ_θ and {M_i} act on. We also define an additional environmental system Using the purification-based definition of QFI [17,18], we have Choose ψ*_θ to be the optimal purification that minimizes J(ψ_θ) such that J(ρ_θ) = J(ψ*_θ). Then where we use Proposition 2, Eq. (40) and Theorem 5.
Theorem 12 provides an upper bound on the QPFI for general quantum states. In particular, it shows that the ratio between the QPFI and the QFI is always upper bounded by a state-independent constant γ({M_i}), which is attainable when the state is pure and gives rise to the following CRB for general quantum states under noisy measurement {M_i}:

B. Lower bound

Lemma 13. Consider a density operator ρ_θ and quantum measurement {M_i}. where {|i⟩_C} is an orthonormal basis of an auxiliary system H_C. Then

The proof of Lemma 13 is straightforward: it immediately follows from the definition of the QPFI. The equality holds true when {M_i} is a projection onto an orthonormal basis of H_S′, i.e., $\{M_i = |i\rangle\langle i|\}_{i=1}^{\dim(H_{S'})}$. Note that the equality in Lemma 13 also holds when ρ_θ is a classically mixed state and the QFI-attainable measurement is chosen to be the projective measurement onto the basis of H_S so that T(ρ_θ) = ρ_θ. For general mixed states, since T(ρ_θ) is a classically mixed state, the results in Sec. V can be applied here to analyze F^P(T(ρ_θ), {M_i}) and derive lower bounds for general mixed states. For example, one can divide the measurement operators into two subsets, restrict the measurement to a two-dimensional subspace, and then use our previous result on the binary measurement on a qubit for classically mixed states to derive an efficiently computable lower bound on the QPFI.
Note that unlike the upper bound (Theorem 12), there are no constant lower bounds independent of ρ θ on the ratio between F P (ρ θ , {M i }) and J(ρ θ ).For example, consider the single qubit case where which tends to zero as θ → 0 (and the optimal preprocessing is identity when θ ∈ (0, π/4)).On the other hand, J(ρ θ ) = 4 is a constant, showing that F P (ρ θ , {M i })/J(ρ θ ) has no state-independent constant lower bounds.

VII. GLOBAL PREPROCESSING: ASYMPTOTIC LIMITS
In this section, we consider the power of global quantum preprocessing in the asymptotic limit (see Fig. 2).We consider a multi-partite system H S = H ⊗n and H S ′ = H ′ ⊗n where dim H = D and dim H ′ = d, a set of quantum states ρ (n) θ in H ⊗n , and quantum measurements {M i } ⊗n that can be written as tensor products of identical measurements on each subsystem H ′ .Arbitrary (and usually global) preprocessing quantum channels E are applied before the noisy measurement.We will show that for a generic class of quantum states, the QPFI can reach the QFI asymptotically for large n.Note that the QPFI is in general not achievable [55] when E can only act locally and independently on each subsystem.
A. Attaining the QFI with noisy measurements

Theorem 14. Given a set of quantum states {ρ^{(n)}_θ}, where ρ^{(n)}_θ is a function of θ and acts on H^{⊗n} for each n, we have $\lim_{n\to\infty} F^{P}(\rho^{(n)}_\theta,\{M_i\}^{\otimes n})/J(\rho^{(n)}_\theta) = 1$ if for each ρ^{(n)}_θ the following are true:
• There exists a quantum measurement {T^{(n)}_i} whose number of measurement outcomes is r_n such that
• The regularity conditions are satisfied: n) , where λ i := Tr(ρ

Theorem 14 provides a sufficient condition to attain the QFI using noisy measurements in the asymptotic limit n → ∞. We will first provide a proof of Theorem 14, and return to the physical understanding of the sufficient condition later. Readers who are not interested in the technical details can skip the proof and advance to the discussion part.

In the proof, we will make use of a quantum-classical channel T_n defined using {T^{(n)}_i}, and an encoding channel Ξ_E, such that θ ) asymptotically (see Fig. 2). Intuitively speaking, the first step T_n is to simulate the (asymptotically) QFI-attainable measurement {T ), leading to the asymptotic attainability of the QFI. Here Ξ_E, along with a corresponding decoding channel Ξ_D, is chosen such that Ξ_E • M^{⊗n} • Ξ_D is asymptotically equal to a completely dephasing channel with a transmission rate ≈ C(M), which is guaranteed to exist by the HSW theorem [60,61].
Proof of Theorem 14.We first choose an α such that lim n→∞ log r n /n < α < C(M).According to the definition of the classical capacity of quantum channels [68], for any ϵ > 0, there exists an n 0 such that for any n > n 0 , there exist an encoding channel Ξ E and a decoding channel Ξ D such that where D 2 is a completely dephasing qubit channel acting on qubit Hilbert space H b , i.e., D [68] defined by ∥Φ∥ ⋄ = max{∥(Φ ⊗ 1)(X)∥ 1 , ∥X∥ 1 ≤ 1} (Φ and 1 act on systems of the same dimension, and ∥•∥ 1 denotes the trace norm).Moreover, ϵ = e −Ω(n) (see a proof in Appx.G).
For any operator σ, we have We also assume n 0 is large enough such that for any where we choose {|e i ⟩} rn i=1 to be a subset of the computational basis in H ⊗⌊αn⌋ b . Without loss of generality, we assume λ i = Tr(ρ i ) > 0 for all i (we can always exclude the terms that are equal to zero), then T n (ρ where we use the monotonicity of the QFI in the first and third inequalities and J({p i,θ }) = J( i p i,θ |i⟩ ⟨i|) for any classical probability distribution {p i,θ } in the second equality.Then we have Next we aim to show is lower bounded by a constant that approaches 1 for large n.First, assume n > n 0 , we have where we use D where We will also assume n is large enough such that δ i < λ i , which is possible due to Eq. ( 104) and the regularity condition (1).
Then we have rn i=1 In the second equality above, we use the Taylor expansion In the last inequality above, we use Eq. ( 104) to derive that Here we use the inequality /µ i from the Cauchy-Schwarz inequality.Note that it also holds that ∥∂ θ σ θ ∥ 1 ≤ J(σ θ ) 1/2 for general mixed states σ θ [84,85].
Finally, from the monotonicity of the QFI (i.e., i ) and the regularity conditions (1) and (2), we have Taking the limit n → ∞ in Eq. (109), from ϵ = e^{−Ω(n)} and Eq. (111), we have Combining Eq. (106), Eq. (108), and Eq. (112), we have

Since lim = 1 and proving the theorem.

FIG. 2. The parameter θ of a state ρ^{(n)}_θ in an n-partite system is estimated using n identical noisy measurements acting on each subsystem, described by {M_i}^{⊗n}. The QPFI can approach the QFI in the asymptotic limit n → ∞ if the sufficient condition in Theorem 14 is satisfied. The optimal control is the composition of a quantum-classical channel T_n, defined using a measurement {T^{(n)}_i} that is asymptotically QFI-attainable, and an encoding channel Ξ_E chosen as the optimal encoding channel for M^{⊗n} from the HSW theorem. Note that the decoding channel Ξ_D from the HSW theorem only needs to be used in a classical postprocessing manner.

B. Discussion
Here we discuss the intuitions behind the sufficient condition in Theorem 14 and describe the relevant situations where it is satisfied.We will see that the sufficient condition is satisfied for a generic class of quantum states ρ (n) θ and noisy measurements {M i }.
Let us first explain the meaning of the condition Eq. (102). It states that there exists an (asymptotically) QFI-attainable measurement for ρ^{(n)}_θ that has a small number of measurement outcomes. Specifically, the number of measurement outcomes r_n should be smaller than 2^{C(M)n} (asymptotically), where C(M) is the classical capacity of the quantum measurement {M_i} under consideration, i.e., Theorem 14 applies when log r_n < C(M)n + o(n). (116) The requirement (Eq. (116)) is satisfied by many practically relevant quantum states and measurements. In fact, whenever the classical capacity of M is positive, log r_n = o(n) is a sufficient (but not necessary) condition for Eq. (116). Below we provide several typical examples where a QFI-attainable measurement with a subexponential number of outcomes exists. See Appx. A for additional details.
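The outcome-count requirement can be made concrete for a single-qubit measurement with a symmetric flip rate m. The sketch below assumes that the classical capacity of this measurement equals that of the binary symmetric channel, C = 1 − h(m) with h the binary entropy. This is an assumption for illustration, not a statement quoted from the text. The check drops the o(n) term, so it is only a crude finite-n test, and the function names are hypothetical.

```python
import math

def binary_entropy(m):
    """Binary entropy h(m) in bits."""
    if m in (0.0, 1.0):
        return 0.0
    return -m * math.log2(m) - (1 - m) * math.log2(1 - m)

def capacity_binary_symmetric(m):
    """Assumed classical capacity of the noisy qubit measurement: 1 - h(m)."""
    return 1.0 - binary_entropy(m)

def outcome_budget_ok(log2_r_n, n, m):
    """Crude finite-n check of log r_n < C(M) n, ignoring the o(n) correction."""
    return log2_r_n < capacity_binary_symmetric(m) * n

n, m = 100, 0.2
poly_ok = outcome_budget_ok(math.log2(n + 1), n, m)  # r_n = n + 1 outcomes (polynomial)
exp_ok = outcome_budget_ok(n / 2, n, m)              # r_n = 2^(n/2) outcomes (exponential)
print(capacity_binary_symmetric(m), poly_ok, exp_ok)
```

A polynomial number of outcomes fits within the budget for any m < 1/2 once n is large enough, whereas an exponential number of outcomes fits only if its rate is below the capacity.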
(1) Low-rank states.For pure states, it was known that there exist 2-outcome QFI-attainable measurements [14].(Note that [55] contains another proof of Theorem 14 when ρ (n) θ is pure.)More generally, any ρ (n) θ that is supported on a subspace with a subexponential dimension also has a QFI-attainable measurement with a subexponential number of outcomes.
(2) Symmetric states. The second example with a QFI-attainable measurement with a subexponential number of outcomes is symmetric (permutation-invariant) states (e.g., tensor products of n identical mixed states). According to the Schur-Weyl duality [86,87], $H^{\otimes n} \cong \bigoplus_\nu H_\nu(U(D)) \otimes H_\nu(S_n)$, where H_ν(U(D)) and H_ν(S_n) are irreducible representation spaces of the unitary group U(D) and the permutation group S_n with index ν. Any symmetric state ρ^{(n)}_θ can be written as $\rho^{(n)}_\theta = \bigoplus_\nu p_\nu\,\rho^{(n)}_\nu \otimes \frac{1}{\dim(H_\nu(S_n))}$, where ρ^{(n)}_ν are mixed states acting on H_ν(U(D)) and p_ν satisfies Σ_ν p_ν = 1 (both of which can be functions of θ). Then a QFI-attainable measurement with a subexponential number of outcomes exists. Let us estimate the number of measurement outcomes: ν corresponds to Young diagrams (i.e., partitions of n into D parts), implying that the number of different indices ν is O(n^{D−1}). For any ν, dim(H_ν(U(D))) is equal to the number of semistandard Young tableaux, which is at most O(n^{D(D−1)/2}) according to the Weyl dimension formula [88]. The number of measurement outcomes is thus upper bounded by $O(n^{D-1}) \times O(n^{D(D-1)/2}) = O(n^{(D-1)(D+2)/2})$.

(3) Gibbs states. For classically mixed states ρ^{(n)}_θ, the projection onto the eigenstates of ρ^{(n)}_θ is QFI-attainable but has exponentially many measurement outcomes. However, we argue that in many cases, a subexponential number of projections onto direct sums of eigenspaces are sufficient to attain the QFI up to the leading order, so that Theorem 14 applies. For instance, consider the Gibbs state $\rho^{(n)}_\theta = \sum_\nu p_\nu\,|\nu\rangle\langle\nu|$, where {|ν⟩} are energy eigenstates with eigenvalues {E_ν} and θ is the inverse temperature to be estimated. The QFI is equal to the variance of energy, i.e., $J(\rho^{(n)}_\theta) = \sum_\nu p_\nu E_\nu^2 - \big(\sum_\nu p_\nu E_\nu\big)^2$, where $p_\nu = e^{-\theta E_\nu}/\sum_{\nu'} e^{-\theta E_{\nu'}}$. Assume the energy eigenvalues lie in [0, E), where E = Θ(n) (which is a standard assumption in condensed matter systems) and divide them into intervals k=1 onto the direct sums of eigenspaces corresponding to all eigenvalues in I_k. The FI is where . Combining with the regularity condition (2), it implies that F (ρ θ ) up to the leading order.

Next, let us explain the intuitions behind the regularity conditions:

(1) Regularity condition (1) states that when the probability of obtaining measurement outcome i depends on θ (i.e., ∂_θ λ_i ≠ 0), it must be no smaller than the inverse of a subexponential function of n, that is, the probability to detect i cannot be exponentially small. This is also a practically reasonable assumption, as we would want to exclude the singular cases where an exponentially small signal provides a non-trivial contribution to the QFI.
(2) Regularity condition (2) requires that the QFI of ρ^{(n)}_θ does not decrease with n asymptotically, which should be satisfied in any practically relevant cases. It also requires the QFI to be subexponential, which is a natural assumption in quantum sensing experiments (note that the Heisenberg limit implies $J(\rho^{(n)}_\theta) = O(n^2)$).

Lastly, we briefly comment on the resources required to implement optimal preprocessing controls. First, the total number of ancillary qubits required to implement the desired preprocessing channel Ξ_E • T_n is at most O(n), because in general log(D^n r_n) ancillary qubits are sufficient to implement the QFI-attainable q-c channel T_n and another log(D^n r_n) ancillary qubits are sufficient to implement the encoding channel Ξ_E. The gate complexity to implement T_n is expected to depend on the structure of the quantum state ρ^{(n)}_θ. For example, for symmetric states, the Schur transform, which is efficiently implementable [89], can be an important step in T_n. Unitary gates that are used in aligning the output basis of T_n to the input basis of the encoding channel Ξ_E should also be taken into consideration. For example, in the special case where ρ_n is a low-rank classically mixed state, T_n should be a rotation that matches the eigenstates of ρ_n to the input basis of Ξ_E. The gate complexity to implement the optimal encoding channel Ξ_E is high in general. However, when r_n is subexponential (as we discussed above), the encoding channel does not need to be capacity-achieving as it only needs to reliably transmit an exponentially small amount of information, potentially making it relatively easier to implement (the details are left for future discussion). For example, when r_n = 2, a simple repetition code mapping |0⟩ to |0⟩^{⊗n} and |1⟩ to |1⟩^{⊗n} will be optimal. Finally, note that although we have shown that Ξ_E • T_n is optimal, other simpler optimal preprocessing channels may still exist. For example, for pure states, unitary controls are optimal according to Theorem 5, requiring no ancillas; and a design of an optimal preprocessing unitary is presented in [55].

FIG. 3 (caption fragment). S_sorting is a sorting channel with circuit depth O(log^2 n) that uses O(n log^2 n) ancillary qubits and outputs one qubit in state Eq. (129). It first sorts the bit-string and then swaps the first qubit with the ⌊n sin^2 θ_0⌋-th qubit. (The D-shape detectors mean the qubits are completely discarded.)

C. Examples
Lastly, we present three simple but natural examples with powerful global preprocessing controls that can be efficiently implemented using O(log^2 n)-depth circuits, assuming arbitrary two-qubit gates and all-to-all connectivity (see details in Appx. H). In these three examples, we always assume H = H′ = span{|0⟩, |1⟩} are qubit systems and the quantum measurement is M_0 = (1 − m)|0⟩⟨0| + m|1⟩⟨1|, M_1 = 1 − M_0. In the first two examples, our preprocessing circuits manage to achieve a FI that is asymptotically equal to the QFI for any noise rate m. In the third example, our preprocessing circuit achieves a FI that is asymptotically equal to 2/π of the QFI, which still beats local controls in the noise regime m ≥ 0.1011. The guideline to design these circuits is to convert the quantum state to a two-level state in span{|0⟩^{⊗n}, |1⟩^{⊗n}} whose probability (or amplitude) distribution encodes θ. Then a majority voting post-processing method can be used to estimate θ with a vanishingly small measurement error. Specifically, in the majority voting post-processing method, we partition the measurement outcomes from measuring the two-level state using {M_0, M_1}^{⊗n}, which are represented by n-bit strings in {0, 1}^n, into two sets depending on whether the Hamming weight of the string is larger than ⌊n/2⌋. The FI of this binary probability distribution achieves the desired value asymptotically.
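The effect of majority voting can be illustrated with a short calculation. The sketch below assumes a two-level state whose population q = cos²θ in |0⟩^{⊗n} encodes θ, independent bit flips with probability m on each of the n outcomes, and odd n so that the vote is symmetric. The function names and the values θ = π/3, m = 0.2 are hypothetical.

```python
import math

def majority_error(n, m):
    """Probability that more than floor(n/2) of n bits flip, each with probability m."""
    return sum(math.comb(n, j) * m ** j * (1 - m) ** (n - j)
               for j in range(n // 2 + 1, n + 1))

def fi_ratio(n, m, theta):
    """FI after majority voting divided by the noiseless FI, with q = cos^2(theta)."""
    q = math.cos(theta) ** 2
    dq = -math.sin(2 * theta)
    eps = majority_error(n, m)            # effective symmetric error rate (n odd)
    p = (1 - eps) * q + eps * (1 - q)     # probability that the vote returns 0
    dp = (1 - 2 * eps) * dq
    return (dp ** 2 / (p * (1 - p))) / (dq ** 2 / (q * (1 - q)))

theta, m = math.pi / 3, 0.2
ratios = {n: fi_ratio(n, m, theta) for n in (1, 11, 51)}
print(ratios)
```

The effective error rate of the vote decays exponentially in n, so the ratio approaches 1, in line with the vanishing measurement error described above.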
The first example is phase sensing using GHZ states [5], where and an optimal preprocessing circuit U G that achieves is shown in Fig. 3a, mapping |ψ The majority voting post-processing method gives an optimal estimator of θ.
The second example is phase sensing using product pure states (usually known as Ramsey interferometry [90]), where and an optimal preprocessing circuit of depth O(log n) that achieves is shown in Fig. 3b.Here we assume θ 0 is a rough estimate of θ such that |θ − θ 0 | ≪ 1/ √ n.The first step is to implement global Hadamard gates and Pauli-X rotations such that |ψ ⊗n .The second step is to apply a desymmetrization gate DS and a C(NOT) n−1 gate such that the state is approximately mapped to with an error O(n(θ − θ 0 ) 2 ).The majority voting postprocessing method gives an optimal estimator of θ.
We show a preprocessing channel E_G (in Fig. 3c) of circuit depth O(log^2 n) that achieves After a sorting channel S_sorting and discarding all qubits except the first qubit, the first qubit is in state where p_θ is the probability that, after flipping n coins each of which lands heads with probability sin^2 θ, the number of heads is smaller than or equal to ⌊n sin^2 θ_0⌋. A FI asymptotically equal to 8n/π can then be achieved using a C(NOT)^{n−1} gate with n − 1 ancillas initialized in |0⟩^{⊗n−1} and the majority voting post-processing method.
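The value 8n/π can be checked numerically from the coin-flip description of p_θ. The sketch below computes the FI of the binary distribution (p_θ, 1 − p_θ) at θ = θ_0 and ignores the residual measurement noise, which majority voting suppresses. The function names and the values n = 2000, θ_0 = 0.6 are hypothetical. The derivative uses the standard identity d/dq P(X ≤ k) = −n·Binomial(n − 1, q).pmf(k).

```python
import math

def binom_logpmf(n, k, q):
    """Log of the Binomial(n, q) probability mass at k."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(q) + (n - k) * math.log(1 - q))

def threshold_fi(n, theta, theta0):
    """FI of the bit 'number of heads <= floor(n sin^2 theta0)' for head probability sin^2 theta."""
    q = math.sin(theta) ** 2
    k0 = math.floor(n * math.sin(theta0) ** 2)
    p = sum(math.exp(binom_logpmf(n, k, q)) for k in range(k0 + 1))
    dp_dq = -n * math.exp(binom_logpmf(n - 1, k0, q))  # derivative of the binomial CDF in q
    dp = dp_dq * math.sin(2 * theta)                   # chain rule, dq/dtheta = sin(2 theta)
    return dp ** 2 / (p * (1 - p))

n, theta0 = 2000, 0.6
fi_sorted = threshold_fi(n, theta0, theta0)
target = 8 * n / math.pi
print(fi_sorted, target, fi_sorted / target)
```

For large n the ratio is close to 1, which matches the Gaussian approximation of the binomial distribution: p_θ ≈ 1/2 and (∂_θ p_θ)^2 ≈ 2n/π at θ = θ_0.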
Note that ρ^{(n)}_θ is a symmetric state. According to the discussion in Sec. VII B, the QPFI should be asymptotically equal to the QFI, but whether there exists an efficient implementation of the optimal preprocessing circuits is unknown. Here we demonstrate the advantage of global controls by providing an efficient but suboptimal circuit in Fig. 3c. The first part (S_sorting) of our circuit can be viewed as the optimal quantum-classical channel T_n in Theorem 14. The second part that encodes one qubit into n qubits is, however, suboptimal. (In order to faithfully transmit all probability distribution information, the encoding channel in the second part needs to encode log(n + 1) qubits into n qubits.) Nonetheless, our circuit in Fig. 3c is superior to any local preprocessing controls in the noise regime where m ≥ 1/2 − 1/√(2π) ≈ 0.1011. This can be proven by noting that the optimal FI achievable using arbitrary local channels satisfies max from Eq. (100). Thus it is always smaller than 8n/π when m ≥ 1/2 − 1/√(2π). In particular, when m → 1/2, the linear constant of the locally optimized FI is vanishingly small, while the suboptimal global one is still above a positive number.
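The noise threshold can be reproduced with a one-line calculation. The sketch below assumes that the locally optimized FI per probe is bounded by 4(1 − 2m)^2. This form is inferred from the quoted threshold and is not taken from Eq. (100) itself. The function and constant names are hypothetical.

```python
import math

def local_fi_bound_per_probe(m):
    """Assumed upper bound on the locally optimized FI per probe: 4 (1 - 2m)^2."""
    return 4.0 * (1.0 - 2.0 * m) ** 2

GLOBAL_FI_PER_PROBE = 8.0 / math.pi  # asymptotic FI per probe of the circuit in Fig. 3c

# Crossing point where 4 (1 - 2m)^2 = 8 / pi.
m_star = 0.5 - 1.0 / math.sqrt(2.0 * math.pi)
print(m_star, local_fi_bound_per_probe(m_star), GLOBAL_FI_PER_PROBE)
```

Under this assumption the two expressions cross at m ≈ 0.1011. For larger m the global circuit has the larger FI per probe, and for smaller m the local bound is larger.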

VIII. CONCLUSIONS AND OUTLOOK
We conducted a systematic study of the preprocessing optimization problem for noisy quantum measurements in quantum metrology.The QPFI (i.e., the FI of noisy measurement statistics optimized over all preprocessing quantum channels), that we defined and investigated in depth, sets an ultimate precision bound for noisy measurement of quantum states.Our results provide, in many cases, both numerically and analytically, approaches to identifying the optimal preprocessing controls that will be of great importance in alleviating the effect of measurement noise in quantum sensing experiments.
We also considered, specifically, the asymptotic limit of the QPFI in multi-probe systems with individual measurement on each probe.We proved the convergence of the QPFI to the QFI when there exists an (asymptotically) QFI-attainable measurement with a sufficiently small number of measurement outcomes, by establishing a connection to the classical channel capacity theorem.It would be interesting to explore, in future works, if the number of outcomes for QFI-attainable measurements can be easily bounded given a quantum state.
Although we have discussed only two types of quantum preprocessing controls, CPTP maps and unitary maps, our biconvex formulation might be generalized to cover other more restricted types of quantum controls. We also narrowed the analytical forms of optimal controls for pure states and classically mixed states down to rotations onto the span of two basis states and coarse-graining channels, respectively, but it remains open whether a simple method exists to help us determine the exact operations.
Finally, there are a few important directions to extend our results to, e.g., incorporating the state preparation optimization into the QPFI optimization problem, considering the preprocessing optimization in multiparameter estimation where the incompatibility of optimal preprocessings for different parameters might become an issue, and finding optimal preprocessings for other information processing tasks beyond quantum metrology such as state tomography and discrimination.

In the limit m → 0, we have (which is expected because {M_i} when m = 0 is QFI-attainable for ρ_θ and the QFI is additive). On the other hand, one can immediately find cases where F^P(ρ_θ ⊗ ρ_θ, {M_i}) > 2F^P(ρ_θ, {M_i}), e.g., when m = 0.1 and θ = π/8. For any fixed m > 0, there is a threshold of θ above which the sign of F^P(ρ_θ ⊗ ρ_θ, {M_i}) − 2F^P(ρ_θ, {M_i}) changes from positive to negative. Finally, we can consider multiple states under multiple measurements. We have, by definition, and the inequality can be strict (see the convergence to the QFI in the asymptotic limit in Sec. VII).
where we use Tr(E n (ρ θ )M i ) > min i λ min (M i ) > 0 for all n.Then we must have F (E(ρ θ ), {M i }) ≥ F P (ρ θ , {M i }) using Eq.(C1).Since F (E(ρ θ ), {M i }) ≤ F P (ρ θ , {M i }) by definition, we have proving the existence of the optimal channels.The existence of the optimal unitaries can also be proven analogously.
As a corollary of Lemma S1, we show that for any measurement whose QPFI (or QUPFI) may not be attainable, there always exists a measurement in its neighborhood such that its QPFI (or QUPFI) is attainable and close to that of the original measurement.
Lemma S2.For any quantum state ρ θ , quantum measurement {M i } and η > 0, there always exists {M (ϵ) i } and a constant c > 0 such that the corresponding QPFI and the QUPFI are attainable, and when ϵ < c, By definition, we can pick E and U such that and assume ϵ is small enough such that Note that a similar construction of ρ (ϵ) , instead of M (ϵ) i , was used in [91] to remove singularity of the QFI.Using Lemma S1, it is clear that the QPFI and the QUPFI for {M (ϵ) i } are attainable.Furthermore, we have proving Eq. (C6).Eq. (C7) is also true, similarly.When the results also follow trivially.
Finally, we are ready to provide a proof of Theorem 1, which shows a way to calculate the QPFI by considering the limit of the QPFI for a set of generic noisy measurements in its neighborhood.(Note that the theorem stated below also holds for the QUPFI.) d , where d = dim(H S ′ ) and 0 < ϵ < 1.Then and the QPFI F P (ρ θ , {M i }) is attainable for any ϵ ∈ (0, 1].
Appendix D: Global optimization algorithm for biconvex optimization problems

In Sec. III, we showed that the QPFI can be obtained from the following biconvex optimization problem (Eq. (20)): s.t. Ω ≥ 0, Tr_S′(Ω) = 1_S, Tr((X ⊗ ρ^T_θ)Ω) = 0, Tr((X ⊗ ∂_θ ρ^T_θ)Ω) = 1. The constraints on Ω guarantee that any feasible Ω is contained in a convex compact set R_2 (the absolute value of each entry of Ω should not be larger than dim(H_S)). We could also set a convex compact region R_1 on x, so that the following optimization problem generates the same optimal value as Eq. (20). min As discussed in Sec. III, this is possible when the size of R_1 is sufficiently large, in normal cases when the infimum in Eq. (20) is attainable. Otherwise, the optimal value of Eq. (D2) can still approach that of Eq. (20) for a sufficiently large size of R_1.
Here we describe the global optimization algorithm [81] for Eq. (D2), which is guaranteed to converge to the global optimum of Eq. (D2) in finite steps. See [59] for a general survey of algorithms for biconvex optimization.
We first rewrite Eq. (D2) as min where $f(x, \Omega)$ is the biconvex target function and $h_i(x, \Omega)$ are bi-affine functions. The global optimization algorithm finds the global optimum of Eq. (D2) by solving a set of primal problems and relaxed dual problems, which generate upper and lower bounds on the optimum, respectively. The upper and lower bounds converge to the global optimum up to a small error in finite steps. The algorithm is described as follows.
Step 1: Initialization.
Define initial upper and lower bounds $(f^U, f^L)$ on the global optimum, where $f^U$ and $-f^L$ can be chosen as two very large numbers. Set the counter $K = 1$. Set a convergence tolerance parameter $\varepsilon$. Choose a starting point $x_1$. Define three empty sets $K^{\mathrm{feas}}$ (set of feasible problems), $K^{\mathrm{infeas}}$ (set of infeasible problems), and $S$ (set of candidates for the lower bound).
Step 2: Primal problem.
(1) Consider the primal problem for $x = x_K$ if it is feasible (that is, if there exists some $\Omega \in R_2$ that satisfies the constraints): The strong duality theorem [76] indicates that $P(x_K)$ can be solved through where the Lagrange function Z is a positive semidefinite matrix acting on $\mathcal{H}_{S'} \otimes \mathcal{H}_S$ and $y$ is a vector of real numbers. Solve Eq. (D5) and store the optimal values $(\Omega_K, y_K, Z_K)$. Set $f^U = \min\{f^U, P(x_K)\}$ and $K^{\mathrm{feas}} = K^{\mathrm{feas}} \cup \{K\}$.
(2) If Eq. (D4) is infeasible, solve the relaxed primal problem for $x = x_K$ instead: The strong duality theorem implies where the Lagrange function $L_1(x, \Omega, y, Z) := \sum_i y_i h_i(x, \Omega) - \mathrm{Tr}(\Omega Z)$. Solve Eq. (D8) and store the optimal values $(\Omega_K, y_K, Z_K)$. Let $K^{\mathrm{infeas}} = K^{\mathrm{infeas}} \cup \{K\}$.
Step 3: Determine the current region of x.
Suppose $\Omega$ is parameterized by a vector of real numbers $\Omega_i$. Since $\Omega$ is contained in a compact set, $\Omega_i$ has upper and lower bounds that we denote by $\Omega_i^U$ and $\Omega_i^L$. Consider the partial derivatives of the Lagrange functions defined by , ∀x} (the last equality follows from the KKT conditions [76]) and $\Omega_i$ is called a connected variable of the Lagrange functions if $i \in I_k$. We can also define the linearized Lagrange functions L(x, Ω, , Ω U i } be the set of combinations of upper and lower bounds on $\Omega_i$ for all $i \in I_k$. We abuse the notation a bit and use $\Omega \in B_k$ to denote the case where the part of connected variables $\Omega_{I_k}$ in $\Omega$ is contained in $B_k$ and the other part is arbitrary. We will see that the other part is irrelevant in our calculations and can be ignored. In this sense, there are in total $2^{|I_k|}$ elements $\Omega \in B_k$, which is finite. We also define $R(k, \Omega)$ to be a region of $x$ as a function of $\Omega \in B_k$ defined by where "$\le_{\Omega_i}$" represents "$\le$" if $\Omega_i = \Omega_i^U$, and "$\ge$" if $\Omega_i = \Omega_i^L$. The relaxed dual problem in the next step will be solved in the region of $x$ that is contained in
Step 4: Relaxed dual problem.
Determine the set of indices for connected variables $I_K$. Note that for any $k$, $L(x, \Omega, y_k, Z_k)|^{\mathrm{lin}}_{\Omega_k}$ is a function of the connected variables only and is fixed if the connected variables $\Omega_{I_k}$ of $\Omega$ are fixed. Therefore we will also write L(x, Ω, , solve the following relaxed dual problem: For each $\Omega^\star$, store the solution $(\mu^\star, x^\star)$ of Eq. (D10) in $S$.
Step 5: Select a new lower bound and determine $x_{K+1}$. From the set $S$, select the minimum $\mu_{\min}$ and the corresponding $x_{\min}$. Set $f^L = \mu_{\min}$ and $x_{K+1} = x_{\min}$. Delete $(\mu_{\min}, x_{\min})$ from the set $S$ of candidates for the lower bound.
Step 6: Check for convergence. Check whether $f^L > f^U - \varepsilon$. If so, stop; otherwise, set $K = K + 1$ and return to Step 2.
The global optimization algorithm described above works in a branch-and-bound way where x is partitioned into different regions and different candidates of lower bounds of the global optimum are explored in each iteration.The subproblems that are solved in each iteration are semidefinite programs (Eq.(D5) and Eq.(D8)) and quadratically constrained quadratic programs (Eq.(D10)) which can be solved efficiently (for a moderate system dimension) using algorithms for convex optimization [76].The running time of the entire algorithm depends largely on the number of subproblems that are solved in each iteration which is exponential in the number of connected variables.Methods that can reduce this complexity were discussed in [81].
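To make the exponential scaling concrete, the following minimal Python sketch enumerates the combinations of upper and lower bounds on the connected variables, one relaxed dual subproblem per combination. It is only an illustration of the counting argument, not an implementation of the algorithm of [81]; the function name `bound_combinations` and the numerical bounds are hypothetical.

```python
from itertools import product

def bound_combinations(lower, upper, connected):
    """Enumerate every assignment of the connected variables to one of
    their two bounds. `lower` and `upper` map a variable index to its
    bound; `connected` plays the role of the index set I_k."""
    idx = sorted(connected)
    combos = []
    for choice in product((0, 1), repeat=len(idx)):
        # choice[j] == 0 picks the lower bound, 1 picks the upper bound
        combos.append({i: (upper[i] if c else lower[i])
                       for i, c in zip(idx, choice)})
    return combos

# Hypothetical bounds for three connected variables.
lower = {0: -1.0, 2: -0.5, 5: 0.0}
upper = {0: 1.0, 2: 0.5, 5: 2.0}
combos = bound_combinations(lower, upper, connected={0, 2, 5})
print(len(combos))  # one relaxed dual subproblem per combination
```

With $|I_k|$ connected variables the list has $2^{|I_k|}$ entries, which is why reducing the number of connected variables matters in practice.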
Appendix E: Binary measurements on pure states

Measurement on a qubit
Here we consider a binary measurement on a single qubit where Without loss of generality, we assume and Condition (2) is trivially true when $p = 0, 1$, and otherwise translates into: where we use and $J(\psi_\theta) = 4n$.

Measurement on a qudit
Now consider a $d$-dimensional system with $d > 2$ and a binary measurement $M_1 = M$ and $M_2 = \mathbb{1} - M$ on it where We now show that the support of $|\phi\rangle$, $\mathrm{supp}\{|\phi\rangle\} = \{i : \phi_i \neq 0\}$, must correspond to at most two different values of $m_i$ when $|\phi\rangle$ is optimal. We prove this by contradiction. Without loss of generality, assume which contradicts $m_1 > m_2 > m_3$. Thus, we conclude that the support of $|\phi\rangle$ must correspond to at most two different values of $m_i$. Therefore we have The reason is that when $m_k \ge m_l$, increasing $m_k$ or decreasing $m_l$ while the other element is fixed will only increase $\gamma_{kl}(\{M_i\})$. We see this by computing the derivative of $\gamma_{kl}(\{M_i\})$ with respect to $m_k$. We have

Appendix F: Commuting-operator measurements on pure states

We take one step further from binary measurements and consider the commuting-operator measurements where and $\sum_i M_i = \mathbb{1}$. We also assume $m_j^{(i)} > 0$ for all $i, j$.

Optimal solution for commuting-operator measurements
Now we proceed to compute general $\gamma(\{M_i\})$ for commuting-operator measurements. First, consider the optimization for measurements restricted to a two-dimensional subspace spanned by $|k\rangle, |l\rangle$ for some $k \neq l$, i.e., and $\sum_i (M_i)_{kl} = \mathbb{1}_{\mathrm{span}\{|k\rangle, |l\rangle\}}$.
Let $(x^*, |\phi^*\rangle)$ be an optimal solution when $|\phi\rangle, |\phi^\perp\rangle$ are restricted to $\mathrm{span}\{|k\rangle, |l\rangle\}$ and $|\phi^*\rangle = \sqrt{p_{kl}}\,|k\rangle + \sqrt{1 - p_{kl}}\,|l\rangle$ (we also assume $\langle a^*\rangle_k > \langle a^*\rangle_l$). Using Eq. (17), we see that the optimal a * i y where we use in the last step. From Eq. (F12), we have, According to Condition (2), From Eq. (F13)-Eq. (F16), we have This gives a unique solution for $p_{kl}$ because the left-hand side is a monotonically increasing function in [0, i (m and the right-hand side is a monotonically decreasing function in [0, i (m . However, a simple analytical solution for $p_{kl}$ from Eq. (F17) might not exist because it is a root of a high-degree polynomial. Then we have where $p^*_{kl}$ is the unique solution to Eq. (F17). Finally, using Theorem 6. Note that although Eq. (F17) might only be solvable numerically in practice for a multiple-outcome measurement, our solution for pure states and commuting-operator measurements is still a substantial simplification compared to the original biconvex problem for general states and measurements.

Proof of Theorem 7
Here we prove a simple upper bound on the normalized QPFI:
Theorem 7. For commuting-operator measurements (Eq. (63)), the normalized QPFI $\gamma(\{M_i\})$ satisfies When there exists a $(k, l)$ that minimizes i m , 1 ≤ i ≤ r contains at most two elements, the inequality is tight.
Proof. From the discussions in Appx. F 1 and Appx. F 2, we have that Eq. (70) is then proven, noting that for any p ∈ where in the first equality we use i pm When there are at most two different $i$ and $j$ (i.e., $r = 2$), such a $p$ always exists, and the upper bound is tight (which also follows directly from Eq. (62)). In general, when the set contains at most two distinct elements, the upper bound is tight and the optimal preprocessed state can be chosen as , 1 ≤ i ≤ r must contain at least two distinct elements; otherwise, $\{M_i\}$ must be trivial.
Alternatively, we can also prove Eq. (70) directly from its original definition (Eq. (42)) without using Theorem 6. We have where $a_k = \langle k|\phi\rangle$ (which we assume to be real, without loss of generality).

Let $X_1, \ldots, X_n$ be $n$ independent random variables each identically distributed to $X$. Using the Hoeffding inequality, we have On the other hand, according to the definition of $T_\varepsilon$, we have It implies $\mathrm{Tr}(\Pi \sigma^{\otimes n}) \ge 1 - 2\exp(-2n\varepsilon^2/x_{\mathrm{upp}}^2)$ using Eq. (G8). Similarly, using the Hoeffding inequality for independent random variables distributed to $Y$ and Eq. (G9), we have $\sum_{a \in \Sigma^n} \mathrm{Tr}(\Lambda_a \sigma_a) \ge 1 - 2\exp(-2n\varepsilon^2/y_{\mathrm{upp}}^2)$. Plugging these bounds into Eq. (G5), we have
$$\delta \le 16\, e^{-2n\varepsilon^2/x_{\mathrm{upp}}^2} + 8\, e^{-2n\varepsilon^2/y_{\mathrm{upp}}^2} + 2^{4 - n\varepsilon}, \qquad \text{(G14)}$$
proving the lemma.

can be implemented using $O(n)$ gates in depth $O(\log n)$. The $S_{\mathrm{sorting}}$ channel can be implemented using $O(n \log^2 n)$ gates and $O(n \log^2 n)$ ancillary qubits in depth $O(\log^2 n)$.
We first investigate the implementation of $C(\mathrm{NOT})^{n-1}$ gates and DS gates. Note that in experimental platforms like Rydberg atoms where long-range interactions are available, the $C(\mathrm{NOT})^{n-1}$ gate can be implemented in a single step [98]. However, we focus on the standard quantum circuit model here, where only two-qubit gates are allowed.
Ref. [99] included detailed quantum circuits for $C(\mathrm{NOT})^{n-1}$ gates and DS gates using $O(n)$ gates in depth $O(\log n)$. For completeness, we briefly discuss these circuits here, in the case where $n = 2^k$.
To implement $C(\mathrm{NOT})^{n-1}$, one starts with a Hadamard gate on the first qubit, and then implements $C_1(\mathrm{NOT})_2$ (a CNOT gate whose control qubit is the first qubit and whose target qubit is the second) in the first step, $C_1(\mathrm{NOT})_3$ and $C_2(\mathrm{NOT})_4$ in the second step, and so on. The circuit continues in the same way. In the final step, i.e., the $k$-th step, $C_l(\mathrm{NOT})_{2^{k-1}+l}$ for $l = 1, 2, \ldots, 2^{k-1}$ are implemented. One can verify that the above $O(\log n)$-depth circuit implements a $C(\mathrm{NOT})^{n-1}$ gate using $O(n)$ single- or two-qubit gates.
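The layer structure of this CNOT cascade can be checked with a short Python sketch. It tracks classical bit values only, assuming all target qubits start in $|0\rangle$ and omitting the initial Hadamard gate, so it verifies the fan-out action on computational basis states rather than simulating the full quantum circuit. The function names are hypothetical.

```python
def fanout_layers(k):
    """Layers of (control, target) CNOT pairs, 1-indexed, for n = 2**k qubits.
    Step s applies C_l NOT_{2^(s-1)+l} for l = 1, ..., 2^(s-1)."""
    return [[(l, 2 ** (s - 1) + l) for l in range(1, 2 ** (s - 1) + 1)]
            for s in range(1, k + 1)]

def apply_layers(bits, layers):
    """Apply the CNOT layers to a classical bit string (list of 0/1)."""
    bits = list(bits)
    for layer in layers:
        for c, t in layer:
            bits[t - 1] ^= bits[c - 1]  # CNOT: target ^= control
    return bits

k = 3
n = 2 ** k
layers = fanout_layers(k)
depth = len(layers)                                   # k = log2(n) steps
gate_count = sum(len(layer) for layer in layers)      # n - 1 CNOT gates
out_one = apply_layers([1] + [0] * (n - 1), layers)   # control set to 1
out_zero = apply_layers([0] * n, layers)              # control set to 0
```

Within each step the controls and targets are disjoint, so all gates of a step can run in parallel; the control value is copied to twice as many qubits after every step.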
To implement DS, one can equivalently consider the circuit implementation of $\mathrm{DS}^\dagger$ and then take the Hermitian conjugate of each gate and reverse the gate order. $\mathrm{DS}^\dagger$ is a gate that prepares W states, where To implement $\mathrm{DS}^\dagger$, one starts with a Pauli-X gate on the first qubit, then performs a two-qubit gate that is a composition of a $C_1 H_2$ (controlled-Hadamard) gate followed by a $C_2 \mathrm{NOT}_1$ gate (again, we use subscripts $l$ to denote the $l$-th qubit) in the first step, $C_1 H_3 + C_3 \mathrm{NOT}_1$ and $C_2 H_4 + C_4 \mathrm{NOT}_2$ in the second step, and so on. The circuit continues in the same way. In the final step, i.e., the $k$-th step, $C_l H_{2^{k-1}+l} + C_{2^{k-1}+l} \mathrm{NOT}_l$ for $l = 1, 2, \ldots, 2^{k-1}$ are implemented. One can verify that the above $O(\log n)$-depth circuit implements a $\mathrm{DS}^\dagger$ gate using $O(n)$ single- or two-qubit gates.
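As a sanity check of this construction, the sketch below simulates the described gate sequence on the input $|0\cdots 0\rangle$ using a sparse dictionary from bit strings to amplitudes, written in plain Python. It assumes the standard Hadamard convention $H|0\rangle = (|0\rangle + |1\rangle)/\sqrt{2}$, $H|1\rangle = (|0\rangle - |1\rangle)/\sqrt{2}$; the helper names are hypothetical, and only the action on this single input state is checked.

```python
import math

def controlled_h(state, c, t):
    """Apply a Hadamard to qubit t when qubit c is 1 (0-indexed bit tuples)."""
    out = {}
    s = 1 / math.sqrt(2)
    for bits, amp in state.items():
        if bits[c] == 0:
            out[bits] = out.get(bits, 0.0) + amp
            continue
        for new_t in (0, 1):
            # Only the |1> -> |1> matrix element of H carries a minus sign.
            sign = -1.0 if (bits[t] == 1 and new_t == 1) else 1.0
            nb = bits[:t] + (new_t,) + bits[t + 1:]
            out[nb] = out.get(nb, 0.0) + sign * s * amp
    return out

def cnot(state, c, t):
    """Apply a CNOT with control c and target t."""
    out = {}
    for bits, amp in state.items():
        nb = bits[:t] + (bits[t] ^ bits[c],) + bits[t + 1:]
        out[nb] = out.get(nb, 0.0) + amp
    return out

def prepare_w(k):
    """Run the k-step circuit on |0...0> for n = 2**k qubits."""
    n = 2 ** k
    state = {(1,) + (0,) * (n - 1): 1.0}    # Pauli-X on the first qubit
    for s in range(1, k + 1):
        for l in range(2 ** (s - 1)):        # 0-indexed version of l
            m = 2 ** (s - 1) + l
            state = controlled_h(state, l, m)
            state = cnot(state, m, l)
    return {b: a for b, a in state.items() if abs(a) > 1e-12}

w = prepare_w(2)  # four-qubit example
```

Each step doubles the number of single-excitation basis states while halving their probability, so after $k$ steps the state is an equal-weight superposition over the $n = 2^k$ single-excitation strings.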
Finally, we discuss the implementation of $S_{\mathrm{sorting}}$, which can be decomposed into a sorting network that implements for $b \in \{0, 1\}^n$ and a SWAP gate that swaps the first qubit with the $\lfloor n \sin^2\theta_0 \rfloor$-th qubit. Now we discuss the implementation of the sorting network (Eq. (H49)), which directly follows from a classical sorting network because our input state is a classically mixed state and the sorting channel is incoherent. To be specific, we define a comparator to be a two-qubit quantum channel such that
$$|ij\rangle \to \begin{cases} |ij\rangle & \text{when } i \ge j, \\ |ji\rangle & \text{when } j > i. \end{cases} \qquad \text{(H50)}$$
It can be implemented using a unitary gate acting on two probe qubits and one ancillary qubit such that
$$|ij\rangle |0\rangle \to \begin{cases} |ij\rangle |0\rangle & \text{when } i \ge j, \\ |ji\rangle |1\rangle & \text{when } j > i, \end{cases} \qquad \text{(H51)}$$
and discarding the ancillary qubit afterwards. Our sorting network (Eq. (H49)) then follows from a classical sorting network, replacing all its classical comparators with the two-qubit sorting channels described above.
Here we use a classical sorting network called a bitonic sorter [100] that uses $O(n \log^2 n)$ comparators in depth $O(\log^2 n)$. Note that it is also possible to construct sorting networks of depth $O(\log n)$ (and size $O(n \log^2 n)$) [101], although the hidden constant is large, making it impractical. We briefly summarize the bitonic sort algorithm in the following pseudocode. Note that here we assume $n = 2^k$ (we can always add more qubits prepared in $|0\rangle$ to make $n$ a power of 2).
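A minimal Python sketch of a standard iterative bitonic sorting network is given below, assuming $n = 2^k$ and the convention of Eq. (H50) that the larger bit is placed first. It acts on classical bit strings, which suffices here because the sorting channel is incoherent. The function names are hypothetical; a comparator flagged `larger_first = False` is assumed to correspond to the same comparator with its two wires exchanged.

```python
from itertools import product

def bitonic_layers(n):
    """Comparator layers (i, j, larger_first) of a bitonic sorter, n = 2**k."""
    assert n > 0 and n & (n - 1) == 0, "n must be a power of 2"
    layers = []
    size = 2                      # length of the blocks being merged
    while size <= n:
        gap = size // 2           # distance between compared wires
        while gap >= 1:
            layer = []
            for i in range(n):
                j = i ^ gap
                if j > i:
                    # Blocks alternate direction; the last merge is uniform.
                    layer.append((i, j, (i & size) == 0))
            layers.append(layer)
            gap //= 2
        size *= 2
    return layers

def run_network(bits, layers):
    """Apply the comparator layers to a classical bit string."""
    bits = list(bits)
    for layer in layers:
        for i, j, larger_first in layer:
            out_of_order = bits[i] < bits[j] if larger_first else bits[i] > bits[j]
            if out_of_order:
                bits[i], bits[j] = bits[j], bits[i]
    return bits

n = 8
layers = bitonic_layers(n)
num_comparators = sum(len(layer) for layer in layers)
# Exhaustive check over all n-bit inputs: the output is sorted, larger bits first.
all_sorted = all(run_network(b, layers) == sorted(b, reverse=True)
                 for b in product((0, 1), repeat=n))
```

For $n = 2^k$ the network has $k(k+1)/2$ layers of $n/2$ comparators each, consistent with the $O(\log^2 n)$ depth and $O(n \log^2 n)$ size quoted above.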
where $\log$ is the binary logarithm and $C(\mathcal{M})$ is the classical capacity of the quantum-classical channel $\mathcal{M}(\cdot) = \sum_i \mathrm{Tr}((\cdot) M_i)\, |i\rangle\langle i|_C$ ($\{|i\rangle_C\}$ is an orthonormal basis of an auxiliary system $\mathcal{H}_C$).

⊗⌊αn⌋ 2 ($|e_i\rangle\langle e_i|$) = $|e_i\rangle\langle e_i|$ in the first equality. On the other hand, consider
and a $C(\mathrm{NOT})^{n-1}$ gate. The circuit depth is $O(\log n)$. (c) Phase sensing using classically mixed states.

= 1, in the last equality we multiply the expression by a factor of $1 = \sum_i \left( p\, m_k^{(i)} + (1 - p)\, m_l^{(i)} \right)$, and the last inequality follows from Cauchy-Schwarz. Assume $(k, l)$ minimizes i m(i) k m (i)l . Then the equality above holds when $\exists p$, such that for any $i, j$, m