Optimally controlled quantum discrimination and estimation

Quantum discrimination and estimation are pivotal for many quantum technologies, and their performance depends on the optimal choice of probe state and measurement. Here we show that their performance can be further improved by suitably tailoring the pulses that make up the interferometer. Developing an optimal control framework and applying it to the discrimination and estimation of a magnetic field in the presence of noise, we find an increase in the overall achievable state distinguishability. Moreover, the maximum distinguishability can be stabilized for times that are more than an order of magnitude longer than the decoherence time.


I. INTRODUCTION
Quantum control has become a very versatile tool for quantum technologies [1,2], including quantum computation [3][4][5][6] and quantum simulation [7,8]. It is based on defining a figure of merit which quantifies how well the desired target is reached and which is taken to be a functional of yet unknown external fields [1,2]. Minimization, respectively maximization, of the functional yields pulse shapes for the external fields that drive the system to a target state or that implement a desired gate operation [1,2]. Various methods are now routinely being used to derive the pulse shapes, including both gradient-based optimization methods such as GRAPE [9], Krotov's method [3,10,11], or the GOAT algorithm [12], as well as gradient-free optimization such as CRAB [13,14].
Quantum discrimination and quantum estimation underlie many applications in quantum information science, including quantum hypothesis testing, quantum detection and quantum sensing. While quantum control has been employed to improve the precision in quantum estimation [47][48][49][50][51][52][53][54][55][56], the use of quantum control in quantum discrimination remains scarce [57,58]. This is so despite the fact that one may expect quantum control to help identify fundamental performance bounds of quantum discrimination, similar to those found for quantum computation [6,59], or derive pulse shapes for improved performance with direct relevance to experiments [8,60].
All that is required is to adapt the quantum optimal control toolbox to the specific use case of quantum discrimination.
* hdyuan@mae.cuhk.edu.hk; † christiane.koch@fu-berlin.de
Here, we develop a unified framework of optimal quantum control for quantum discrimination and quantum estimation. As the figure of merit, we employ the distance between two states that underwent different dynamics, more specifically that evolved under slightly different magnetic field strengths. In the limit of the difference in field strength going to zero, optimizing this figure of merit becomes equivalent to optimizing the quantum Fisher information. We use quantum optimal control to maximize the distance between the two states by shaping the external fields that make up the interferometer. Intuitively, this can be understood as tailoring the external field to drive the states evolving under different dynamics away from each other, instead of towards a common target. Since both states depend on the control, the distance between them is typically not a linear function of the states, in contrast to the case of a fixed target. Krotov's method for quantum optimal control [10,11] can be used in such a case. We employ it here to optimize discrimination and estimation of a magnetic field in the presence of noise, increasing the performance compared to the standard scheme based on a Ramsey interferometer. Our work thus contributes a quantum control perspective to current efforts for improving quantum sensing protocols based on Ramsey interferometry, using squeezed [61] or anticoherent [62] states, variable detuning of the pulses [63], or machine learning of the complete protocol [64].
The article is organized as follows. We introduce the figures of merit for discrimination and estimation in Sec. II, and then present the quantum control method used to optimize them. In Sec. III, we apply the method to the discrimination and estimation of magnetic fields to demonstrate the features and advantages of the control. We summarize our findings in Sec. IV.

II. MODEL AND CONTROL PROBLEM
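We consider the dynamics described by the Gorini-Kossakowski-Sudarshan-Lindblad master equation [65],
\[ \frac{d}{dt}\rho_m(t) = \mathcal{L}_m(t)[\rho_m(t)] = -i\,[H_m(t), \rho_m(t)] + \sum_k \gamma_k \Big( L_k \rho_m(t) L_k^\dagger - \tfrac{1}{2}\big\{ L_k^\dagger L_k, \rho_m(t) \big\} \Big), \tag{1} \]
where $H_m(t) = H_{d,m} + H_c(t)$, $m = 1, 2$, consists of a drift $H_{d,m}$ and a drive $H_c(t)$. Since the drift cannot be measured directly, the discrimination (estimation) is achieved by measuring the time-evolved state $\rho_m(T)$, starting from an initial state $\rho_{\mathrm{in}} = |\Psi_{\mathrm{in}}\rangle\langle\Psi_{\mathrm{in}}|$. For the discrimination of two Hamiltonians, the two states $\rho_1(T)$ and $\rho_2(T)$ should be made as distinguishable as possible. For the estimation, the precision can likewise be connected to the distinguishability of states evolved under two neighboring Hamiltonians, $H_{d,1} = H(B - \delta B/2)$ and $H_{d,2} = H(B + \delta B/2)$, where $\delta B$ is an infinitesimally small shift [66]. The difference between the discrimination and the estimation is the figure of merit.
The figure of merit for the discrimination is typically taken as the success probability $P_{\mathrm{succ}}$ to distinguish the two final states $\rho_1(T)$ and $\rho_2(T)$, which can be related to the trace distance $D_{\mathrm{tr}}$ as [15]
\[ P_{\mathrm{succ}} = \tfrac{1}{2}\big[ 1 + D_{\mathrm{tr}}(\rho_1(T), \rho_2(T)) \big], \tag{3} \]
\[ D_{\mathrm{tr}}(\rho_1, \rho_2) = \tfrac{1}{2} \| \rho_1 - \rho_2 \|_{\mathrm{tr}}, \tag{4} \]
where $\|\rho\|_{\mathrm{tr}} = \mathrm{Tr}\{\sqrt{\rho^\dagger \rho}\}$. The figure of merit for the estimation is typically taken as the precision, which is bounded by the quantum Cramér-Rao bound,
\[ \delta\hat{B}^2 \geq \frac{1}{R\, F_Q}, \]
where $\delta\hat{B}^2$ is the variance of an unbiased estimator $\hat{B}$, $R$ is the number of repetitions of the experiment, and $F_Q$ is the quantum Fisher information, which determines the precision limit. For the two Hamiltonians $H_{d,1} = H(B - \delta B/2)$ and $H_{d,2} = H(B + \delta B/2)$, the quantum Fisher information can be related to the Bures distance $D_{\mathrm{bures}}$ between $\rho_1(T)$ and $\rho_2(T)$ as [66]
\[ F_Q = \frac{4\, D_{\mathrm{bures}}^2(\rho_1(T), \rho_2(T))}{\delta B^2}, \tag{5} \]
where the Bures distance between two states is defined as [67]
\[ D_{\mathrm{bures}}(\rho_1, \rho_2) = \sqrt{ 2 - 2\, \mathrm{Tr}\sqrt{ \sqrt{\rho_1}\, \rho_2 \sqrt{\rho_1} } }. \tag{6} \]
We consider distinguishing two Hamiltonians, $H_{d,1} = B_1 \sigma_z/2$ and $H_{d,2} = B_2 \sigma_z/2$. The discrimination of the two Hamiltonians can be related to the estimation when $B_1 = B - \delta B/2$ and $B_2 = B + \delta B/2$, which corresponds to the estimation of the strength of a magnetic field oriented along the z-axis.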
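As a quick consistency check of the link between the Bures distance and the quantum Fisher information, the following sketch evaluates the noise-free Ramsey case, in which both states remain pure. It assumes the common conventions $D_{\mathrm{bures}}^2 = 2(1 - \sqrt{F})$, with $F$ the fidelity, and $F_Q \approx 4 D_{\mathrm{bures}}^2/\delta B^2$ for small $\delta B$; the function names are illustrative and not taken from the paper.

```python
import math

def ramsey_overlap(delta_b, t):
    """|<psi_1|psi_2>| for |+> evolved under B1*sigma_z/2 and B2*sigma_z/2.

    Assumes hbar = 1, no noise, and delta_b = B2 - B1.
    """
    return abs(math.cos(0.5 * delta_b * t))

def bures_distance_sq(overlap):
    """Squared Bures distance for pure states (assumed convention 2*(1 - sqrt(F)))."""
    return 2.0 * (1.0 - overlap)

def qfi_estimate(delta_b, t):
    """Finite-difference estimate of the quantum Fisher information."""
    return 4.0 * bures_distance_sq(ramsey_overlap(delta_b, t)) / delta_b**2

if __name__ == "__main__":
    for delta_b in (1e-1, 1e-2, 1e-3):
        print(delta_b, qfi_estimate(delta_b, t=3.0))
```

Under these assumptions the estimate approaches $T^2$ as $\delta B \to 0$, the textbook value for a phase generated by $\sigma_z/2$ with the probe state $|+\rangle$.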
We first compare two protocols for the discrimination: the standard Ramsey protocol and the protocol employing optimized control fields. Each protocol starts with preparing the qubit in the initial state $\rho_{\mathrm{in}} = |\Psi_{\mathrm{in}}\rangle\langle\Psi_{\mathrm{in}}|$ and is based on deducing whether the field is $B_1$ or $B_2$ by measuring the time-evolved state $\rho_m(T)$. The Ramsey scheme prepares an initial state on the Bloch sphere's equator and lets it subsequently evolve under the constant drift $H_{d,m}$, i.e., $H_c(t) = 0$. In contrast, the optimized protocol additionally employs time-dependent fields, i.e., $H_c(t) \neq 0$. These control fields are optimized to make the two states $\rho_1(T)$ and $\rho_2(T)$ as distinguishable as possible. In other words, the optimized control fields need to maximize the distance measure $D(\rho_1, \rho_2)$. For the discrimination, the distance is the trace distance (4), since it is directly related to the success probability of the discrimination, cf. Eq. (3). Expressed in terms of the Bloch vectors $\vec{r}_1$ and $\vec{r}_2$ of the states $\rho_1$ and $\rho_2$, it reads $D_{\mathrm{tr}}(\rho_1, \rho_2) = \|\vec{r}_1 - \vec{r}_2\|/2$, with $\|\cdot\|$ the Euclidean vector norm [68]. Thus, the trace distance coincides, up to a factor of two, with the geometric distance between the Bloch vectors $\vec{r}_1$ and $\vec{r}_2$, and maximal distinguishability is achieved iff $\vec{r}_1$ and $\vec{r}_2$ are opposite points on the Bloch sphere. Hence, the maximization of $D_{\mathrm{tr}}$ will be our physical goal for the discrimination.
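The Bloch-vector form of the trace distance and its relation to the success probability can be illustrated with a few lines of code. This is a minimal sketch, assuming equal prior probabilities for the two hypotheses; the function names are hypothetical.

```python
import math

def trace_distance(r1, r2):
    """Qubit trace distance from Bloch vectors: ||r1 - r2|| / 2."""
    return 0.5 * math.dist(r1, r2)

def success_probability(r1, r2):
    """Optimal success probability for equal priors: (1 + D_tr) / 2."""
    return 0.5 * (1.0 + trace_distance(r1, r2))

if __name__ == "__main__":
    # Opposite points on the Bloch sphere are perfectly distinguishable.
    print(success_probability((1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)))
    # Identical states leave nothing better than guessing.
    print(success_probability((0.0, 0.0, 1.0), (0.0, 0.0, 1.0)))
```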
The presence of the drive Hamiltonian $H_c(t)$ allows one to influence the evolution of $D_{\mathrm{tr}}$. We make the general assumption
\[ H_c(t) = \tfrac{1}{2}\big[ E_x(t)\,\sigma_x + E_y(t)\,\sigma_y + E_z(t)\,\sigma_z \big], \tag{7} \]
where $E_x(t), E_y(t), E_z(t) \in \mathbb{R}$ are control fields that couple to the qubit via $\sigma_x$, $\sigma_y$ and $\sigma_z$, respectively. Note that while $H_c(t)$ is identical for both Hamiltonians $H_1(t)$ and $H_2(t)$, it influences the dynamics differently in the two cases due to the difference in the drift Hamiltonians. It can thus be used to maximize $D_{\mathrm{tr}}$. The presence of $H_c(t)$ turns the discrimination problem into a control problem: how should the three fields $E_x(t)$, $E_y(t)$ and $E_z(t)$ be chosen such that $D_{\mathrm{tr}}$ is maximized at time $T$, when the state $\rho_m(T)$ is measured? We derive suitable control fields employing optimal control theory [1]. To this end, we introduce the optimization functional
\[ J[\{\rho_m\}, \{E_k\}] = J_T[\{\rho_m(T)\}] + \int_0^T g[\{\rho_m(t)\}, \{E_k(t)\}, t]\, dt, \tag{8} \]
where $J_T$ is the relevant figure of merit that quantifies the failure probability or error at final time $T$ and $g$ captures additional running costs at intermediate times.
The sets $\{\rho_m\}$ and $\{E_k\}$ are forward-propagated states and control fields, respectively, here given by $\{\rho_1, \rho_2\}$ and $\{E_x, E_y, E_z\}$. Equation (8) is the most general form of an optimization functional and therefore constitutes the standard ansatz to formulate an optimization target [69]. For the task of maximizing $D_{\mathrm{tr}}$, we choose $J_T$ as
\[ J_T = 1 - D_{\mathrm{HS}}(\rho_1(T), \rho_2(T)), \tag{9} \]
with $D_{\mathrm{HS}}$ the Hilbert-Schmidt distance [70],
\[ D_{\mathrm{HS}}(\rho_1, \rho_2) = \tfrac{1}{2} \langle \rho_1 - \rho_2, \rho_1 - \rho_2 \rangle, \tag{10} \]
where $\langle A, B \rangle = \mathrm{Tr}\{A^\dagger B\}$. Note that the relation $D_{\mathrm{tr}}^2 = D_{\mathrm{HS}}$ only holds for qubits, in which case maximization of $D_{\mathrm{tr}}$ and maximization of $D_{\mathrm{HS}}$ are equivalent. Both distances are appropriate measures of state distinguishability, but $D_{\mathrm{HS}}$ is more suitable for maximization in optimal control [71,72] because it allows for analytical gradients with respect to the states $\rho_1$ and $\rho_2$.
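The qubit relation $D_{\mathrm{tr}}^2 = D_{\mathrm{HS}}$ can be checked numerically. The sketch below builds density matrices from Bloch vectors and evaluates both distances; it assumes the normalization $D_{\mathrm{HS}} = \mathrm{Tr}\{(\rho_1 - \rho_2)^2\}/2$, and all helper names are illustrative.

```python
import math

def density_matrix(r):
    """2x2 qubit density matrix (I + r . sigma) / 2 from a Bloch vector r."""
    x, y, z = r
    return [[0.5 * (1 + z), 0.5 * (x - 1j * y)],
            [0.5 * (x + 1j * y), 0.5 * (1 - z)]]

def hilbert_schmidt_distance(rho1, rho2):
    """Tr{(rho1 - rho2)^dagger (rho1 - rho2)} / 2, i.e. half the squared Frobenius norm."""
    return 0.5 * sum(abs(rho1[i][j] - rho2[i][j]) ** 2
                     for i in range(2) for j in range(2))

def trace_distance(rho1, rho2):
    """Half the trace norm of rho1 - rho2.

    The difference is a traceless Hermitian 2x2 matrix [[a, b], [b*, -a]]
    with eigenvalues +/- sqrt(a^2 + |b|^2), so half its trace norm is that root.
    """
    a = rho1[0][0] - rho2[0][0]
    b = rho1[0][1] - rho2[0][1]
    return math.sqrt(abs(a) ** 2 + abs(b) ** 2)

if __name__ == "__main__":
    rho1 = density_matrix((0.3, -0.2, 0.5))
    rho2 = density_matrix((-0.1, 0.4, 0.2))
    print(hilbert_schmidt_distance(rho1, rho2), trace_distance(rho1, rho2) ** 2)
```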
In the following, we briefly describe our numerical algorithm of choice. We use Krotov's method [73], an iterative and gradient-based optimization technique, to minimize $J_T$, cf. Eq. (9). We achieve the minimization of $J_T$ by minimizing the total functional $J$, cf. Eq. (8), assuming $g$ to take the form [3]
\[ g[\{E_k(t)\}, t] = \sum_k \frac{\lambda_k}{S_k(t)} \big[ E_k(t) - E_k^{\mathrm{ref}}(t) \big]^2, \tag{11} \]
where $\lambda_k$ is a numerical parameter, $S_k(t) \in (0, 1]$ a shape function and $E_k^{\mathrm{ref}}(t)$ a reference field. Equation (11) is a standard choice to control the pulse fluence and prevents the optimization from converging towards unphysical pulse shapes. With the choice of Eq. (11), Krotov's method allows the derivation of a closed form for the field update [10], Eq. (12), where the superscripts $(i)$ and $(i+1)$ indicate the previous and current iteration, respectively. The states $\rho_m$ and the co-states $\chi_m$ are obtained by forward and backward propagation, respectively [10]. The superscripts of the Liouvillians $\mathcal{L}$, cf. Eq. (1), indicate the respective iteration of the control fields. The reference field in Eq. (12) is taken to be the field from the previous iteration, i.e., $E_k^{\mathrm{ref}}(t) = E_k^{(i)}(t)$. Hence, the running cost $g$ vanishes as the fields converge, and the total functional $J$ essentially coincides with the relevant figure of merit $J_T$ that we seek to minimize. See Ref. [10] for a detailed description of Krotov's method.

III. RESULTS AND DISCUSSION
The general time scale on which one can expect a given control task to be feasible is an important property of the dynamics. For instance, for a control problem where an initial state should be transferred into a given target state, it is determined by the general speed of the evolution, typically set by the Hamiltonian, and the distance between initial and target state. In our case, however, we are interested in the relative distance $D_{\mathrm{HS}}$ between the two time-evolved states $\rho_1(t)$ and $\rho_2(t)$ and not in their distance from the initial state $\rho_{\mathrm{in}}$. Hence, the time scale on which $D_{\mathrm{HS}}$ increases is defined by their relative speed of evolution. In detail, two different time scales are relevant for the problem of maximizing $D_{\mathrm{HS}}$. On the one hand, there is a quantum speed limit (QSL), i.e., a minimal time necessary to perfectly distinguish the two states. Such a minimal time is defined for every physical control task. Here, it is determined by $\delta B$ via the coherent part of the dynamics and can be estimated by
\[ T_{\mathrm{QSL}} = \frac{\pi}{\delta B}. \tag{15} \]
This is the minimal time required for perfect state distinguishability, i.e., $D_{\mathrm{HS}} = 1$, in the Ramsey protocol and under the assumption of no dissipation. On the other hand, dissipation continuously decreases $D_{\mathrm{HS}}$, since it causes both states, $\rho_1(t)$ and $\rho_2(t)$, evolving under $H_1(t)$ and $H_2(t)$, to approach the same steady state $\rho_{\mathrm{ss}}$. The time scale set by the dissipation is, in contrast to the QSL, independent of $\delta B$. Since the impact of relaxation and pure dephasing, characterized by $T_1$ and $T_2$, respectively, is quite different, we consider them individually in the following. This is reasonable since in most physical settings the noise is either $T_1$ or $T_2$ dominated. We take $|\Psi_{\mathrm{in}}\rangle = |+\rangle = (|0\rangle + |1\rangle)/\sqrt{2}$ as initial state, in accordance with the standard Ramsey scheme [74], i.e., in our dynamical description, we do not account for the process preparing $|\Psi_{\mathrm{in}}\rangle$.
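The quantum speed limit of the noise-free Ramsey protocol follows from simple precession on the Bloch sphere: the two Bloch vectors rotate about the z-axis with a relative angle $\delta B\, t$, so they are antipodal once $\delta B\, t = \pi$. A minimal sketch, assuming $\hbar = 1$ and hypothetical function names:

```python
import math

def ramsey_bloch_vector(b, t):
    """Bloch vector of |+> after free precession under b*sigma_z/2 (no noise)."""
    return (math.cos(b * t), math.sin(b * t), 0.0)

def hs_distance(r1, r2):
    """Qubit Hilbert-Schmidt distance expressed via Bloch vectors: |r1 - r2|^2 / 4."""
    return 0.25 * sum((a - b) ** 2 for a, b in zip(r1, r2))

def qsl_time(delta_b):
    """Time at which the relative precession angle reaches pi."""
    return math.pi / abs(delta_b)

if __name__ == "__main__":
    b1, b2 = 1.0, 1.2
    t = qsl_time(b2 - b1)
    print(t, hs_distance(ramsey_bloch_vector(b1, t), ramsey_bloch_vector(b2, t)))
```

In this idealized setting the distance follows $\sin^2(\delta B\, t/2)$, which first reaches one at $t = \pi/\delta B$.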
Figure 1 shows the distinguishability $D_{\mathrm{HS}}$ as a function of the protocol length $T$ for the Ramsey and optimized protocol. In detail, the dotted lines in Fig. 1(a) show the dynamics of $1 - D_{\mathrm{HS}}$ for the Ramsey protocol, i.e., $H_c(t) = 0$, for several $\delta B$ under relaxation, i.e., a single Lindblad operator $L = |0\rangle\langle 1|$ with $\gamma = 1/T_1$. The dashed vertical lines indicate the QSL of Eq. (15). Starting at $D_{\mathrm{HS}} = 0$ at $T = 0$, the distinguishability $D_{\mathrm{HS}}$ increases until it reaches its maximum $D_{\mathrm{HS}}^{\max}$ at approximately $T \approx T_{\mathrm{QSL}}$. For times $T > T_{\mathrm{QSL}}$, the distinguishability $D_{\mathrm{HS}}$ decreases exponentially as the relaxation causes $\rho_1(t)$ and $\rho_2(t)$ to evolve towards the same ground/steady state $\rho_{\mathrm{ss}} = |0\rangle\langle 0|$.
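The Ramsey dynamics under relaxation can be written down in closed form, which makes the rise and subsequent decay of the distinguishability easy to reproduce. The sketch below assumes the ground state $|0\rangle$ sits at $+z$ on the Bloch sphere, $\hbar = 1$, and the initial state $|+\rangle$; the function names are illustrative.

```python
import math

def bloch_relaxation(b, gamma, t):
    """Bloch vector of |+> under b*sigma_z/2 and decay L = |0><1| with rate gamma."""
    coherence = math.exp(-0.5 * gamma * t)   # transverse components decay at gamma/2
    z = 1.0 - math.exp(-gamma * t)           # population flows towards |0> (+z here)
    return (coherence * math.cos(b * t), coherence * math.sin(b * t), z)

def hs_distance(r1, r2):
    """Qubit Hilbert-Schmidt distance: |r1 - r2|^2 / 4."""
    return 0.25 * sum((a - b) ** 2 for a, b in zip(r1, r2))

def purity(r):
    """Tr{rho^2} = (1 + |r|^2) / 2 for a qubit."""
    return 0.5 * (1.0 + sum(c * c for c in r))

if __name__ == "__main__":
    gamma, b1, b2 = 1e-3, 1.0, 1.01
    for t in (100.0, 314.0, 1000.0, 5000.0):
        r1, r2 = bloch_relaxation(b1, gamma, t), bloch_relaxation(b2, gamma, t)
        print(t, hs_distance(r1, r2), purity(r1))
```

In this model the distance equals $e^{-\gamma t}\sin^2(\delta B\, t/2)$, and the purity dips at intermediate times before returning to one as both states approach the pure ground state.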
The decay of the state distinguishability due to relaxation can be completely suppressed by using tailored, i.e., optimized, control fields. The markers in Fig. 1(a) show the reachable distinguishability $D_{\mathrm{HS}}$ at the respective final time $T$ used in the optimization. There are two interesting effects to notice. On the one hand, the reachable maximum $D_{\mathrm{HS}}^{\max}$ increases compared to the Ramsey protocol. Hence, in the presence of relaxation, optimized control fields allow in general for better distinguishability, despite a longer protocol duration (factor 2) to reach $D_{\mathrm{HS}}^{\max}$. On the other hand, the improvement in state distinguishability can be stabilized at that maximally reachable distance against decay for protocol durations $T$ much longer than the $T_1$ time. Figure 1(a) demonstrates it for times $T$ up to $10 \times T_1$ but suggests it should, in principle, be feasible for even longer times. Figure 1(c) shows the purities for states $\rho_1(t)$ and $\rho_2(t)$ corresponding to the data in Fig. 1(a), both for the Ramsey protocol (dotted lines) and at final time $T$ after an evolution under the optimized control fields (markers). The dotted lines show an intermediate purity loss in the Ramsey protocol due to the relaxation. The final gain in purity for $t \to \infty$ is here a sign of the incoherent process of both states approaching the same (pure) ground/steady state. In contrast, the behavior of the purity in case of the improved and stabilized $D_{\mathrm{HS}}$ depends on $\delta B$. While for larger $\delta B$ the loss of purity is avoided at all $T$ by the respective optimized control fields, the improvement in case of small $\delta B$ comes along with a loss in purity.
FIG. 2. Guess and optimized control field for the case of (a) relaxation, with the control field $E_y(t)$, and (b) pure $T_2$ dephasing, where the control is $E_x(t)$. The Bloch sphere dynamics is depicted in Fig. 3(a) and (b), respectively.
The improvement and stabilization of $D_{\mathrm{HS}}$ is achieved via a simple control strategy which is most conveniently understood on the Bloch sphere, cf. Fig. 3(a). To this end, we choose the control field $E_z$ such that it cancels the known $B$, i.e., $E_z(t) = -B$. This eliminates the fast, coherent oscillations of $\vec{r}_1(t)$ and $\vec{r}_2(t)$ around the z-axis, which do not contribute to the distinguishability $D_{\mathrm{HS}}$. Furthermore, in order to protect both states, $\vec{r}_1(t)$ and $\vec{r}_2(t)$, as much as possible from the detrimental relaxation, i.e., to prevent their vector norms from shrinking, we kick both states from their initial position on the equator to the vicinity of the ground/steady state $\rho_{\mathrm{ss}} = |0\rangle\langle 0|$. This is achieved by a $\pi/2$-like pulse via $E_y$ right at the beginning of the protocol. The states stay close to $\rho_{\mathrm{ss}}$ for the largest part of the protocol, where they evolve effectively decoherence-free. For the final measurement, both states are transferred back to the equator by a second, inverse $\pi/2$-like pulse.
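The effect of the choice $E_z(t) = -B$ can be made explicit by looking at the rotation vector that governs the coherent Bloch-sphere dynamics. The sketch below assumes a drive of the form $H_c = (E_x\sigma_x + E_y\sigma_y + E_z\sigma_z)/2$ with $\hbar = 1$; the helper name is hypothetical.

```python
def rotation_vector(b_m, e_x, e_y, e_z):
    """Rotation vector Omega with dr/dt = Omega x r (coherent part only).

    Corresponds to H = [e_x*sigma_x + e_y*sigma_y + (b_m + e_z)*sigma_z] / 2.
    """
    return (e_x, e_y, b_m + e_z)

if __name__ == "__main__":
    b, delta_b = 5.0, 0.5
    # With e_z = -b, only the residual detunings -delta_b/2 and +delta_b/2 remain.
    print(rotation_vector(b - delta_b / 2, 0.0, 0.0, -b))
    print(rotation_vector(b + delta_b / 2, 0.0, 0.0, -b))
```

Only the small, opposite residual rotations distinguish the two states; the common fast precession with frequency $B$ drops out.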
Note that this strategy of protecting both states close to the ground/steady state for as long as possible has been identified in steps. Initially, we allowed the optimization of all three control fields $E_x$, $E_y$, $E_z$ and started optimizing without any strategic choice for their guess fields. The above strategy (with only slight deviations) was identified even then. Its reduced version consists of a constant $E_z$ and no $E_x$ at all, such that $E_y$ is the only time-dependent field that needs to be optimized. Figure 2(a) shows, in an exemplary case, the guess and optimized form of $E_y(t)$ when guiding the optimization with a guess field that already incorporates the initial $\pi/2$-like kick in the beginning and its inverse counterpart at the end [75]. Compared to the guess field, the optimization increases the intensity of the first kick such that the rotation from the initial equatorial state $\rho_{\mathrm{in}} = |+\rangle\langle +|$ towards $\rho_{\mathrm{ss}}$ is carried out as fast as possible. The corresponding dynamics on the Bloch sphere is shown in Fig. 3(a). After the first kick, the states remain most of the time close to the ground/steady state $\rho_{\mathrm{ss}}$, which effectively protects them from losing purity. The second, inverse kick is much smoother and transfers the states symmetrically to the equatorial plane such that $D_{\mathrm{HS}}$ becomes maximal at $T$, i.e., the final time of measurement. The optimized field in Fig. 2(a) and its corresponding dynamics on the Bloch sphere, cf. Fig. 3(a), have been picked as a representative of an entire class of solutions for the problem of maximizing distinguishability in the presence of relaxation. The exact details of the optimized control field and corresponding dynamics differ depending on $\delta B$ and $T$, but the general control strategy remains similar.
We now turn to the case of pure dephasing with Lindblad operator $L = \sigma_z$ and rate $\gamma = 1/T_2$. Figure 1(b) shows the dynamics for the Ramsey protocol as dotted lines. In comparison to the case of $T_1$ decay, cf. Fig. 1(a), pure dephasing has a more severe influence on $D_{\mathrm{HS}}$ even if the decay rates are identical, $\gamma = 1/T_2 = 1/T_1$. But also in this case, optimization is capable of improving $D_{\mathrm{HS}}$ over the Ramsey protocol, again at the expense of longer protocol durations (factor 2). The effect of stabilizing $D_{\mathrm{HS}}$ at the maximal reachable distance for times much longer than the decay time can be observed as well. Nevertheless, the dynamics both in the Ramsey protocol and under the optimized control fields look quite different compared to relaxation. With pure dephasing, no unique steady state exists but rather a set of states, namely the coherence-free states given by $\{\rho_{\mathrm{ss}} = p\,|0\rangle\langle 0| + (1 - p)\,|1\rangle\langle 1| \;|\; p \in [0, 1]\}$, i.e., all states on the z-axis of the Bloch sphere. Since neither the drift $H_{d,m}$ nor the dephasing changes any state's z-projection, the two states $\vec{r}_1(t)$ and $\vec{r}_2(t)$, starting in the equatorial plane, precess around the z-axis while losing purity, i.e., they shrink within the equatorial plane. Hence, they evolve towards the Bloch sphere's center, i.e., the completely mixed state. This is evidenced by the dotted lines in Fig. 1(d), which show the purity evolving towards 1/2 under the Ramsey protocol.
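The stronger impact of pure dephasing can be traced to the decay rate of the coherences. For the dissipator $\gamma(\sigma_z \rho\, \sigma_z - \rho)$, the off-diagonal elements decay at rate $2\gamma$, compared to $\gamma/2$ for relaxation with $L = |0\rangle\langle 1|$. The sketch below illustrates this for the Ramsey protocol, assuming $\hbar = 1$ and the initial state $|+\rangle$; function names are illustrative.

```python
import math

def bloch_dephasing(b, gamma, t):
    """Bloch vector of |+> under b*sigma_z/2 and pure dephasing L = sigma_z, rate gamma."""
    coherence = math.exp(-2.0 * gamma * t)   # transverse components decay at 2*gamma
    return (coherence * math.cos(b * t), coherence * math.sin(b * t), 0.0)

def hs_distance(r1, r2):
    """Qubit Hilbert-Schmidt distance: |r1 - r2|^2 / 4."""
    return 0.25 * sum((a - b) ** 2 for a, b in zip(r1, r2))

def purity(r):
    """Tr{rho^2} = (1 + |r|^2) / 2 for a qubit."""
    return 0.5 * (1.0 + sum(c * c for c in r))

if __name__ == "__main__":
    gamma, b1, b2 = 1e-3, 1.0, 1.01
    for t in (100.0, 314.0, 1000.0, 5000.0):
        r1, r2 = bloch_dephasing(b1, gamma, t), bloch_dephasing(b2, gamma, t)
        print(t, hs_distance(r1, r2), purity(r1))
```

Here the distance equals $e^{-4\gamma t}\sin^2(\delta B\, t/2)$ and the purity tends to 1/2, in line with the states shrinking towards the center of the Bloch sphere.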
An optimization of all three available control fields $E_x$, $E_y$, $E_z$ again yields a simple control strategy. As in the case of relaxation, it can also be realized by a single time-dependent control field, which is what we discuss here for simplicity. This time, the time-dependent control is $E_x(t)$, while $E_z(t) = -B$ again cancels the known field $B$ and $E_y$ is not needed at all. Figure 2(b) shows the guess field for $E_x(t)$, which exhibits a peak at the beginning. This peak is modified by the optimization such that it splits the two states $\vec{r}_1(t)$ and $\vec{r}_2(t)$ within the equatorial plane as a first step and then rotates them onto the z-axis in a second step, see Fig. 3(b) for the corresponding dynamics. Once the states reach the z-axis, $E_x(t) \approx 0$ is essentially turned off and the states become invariants of the dynamics, which implies that their distinguishability $D_{\mathrm{HS}}$ can essentially be preserved forever. This readily explains the stabilization observed in Fig. 1(b). The respective optimized field and dynamics in Figs. 2(b) and 3(b) again represent an example for the entire class of solutions for the problem of maximal distinguishability in the case of pure dephasing. The exact details depend again on $\delta B$ and $T$.
FIG. 3. Bloch sphere dynamics under the optimized control fields; (a) corresponds to the case of relaxation in Fig. 1(a) while (b) corresponds to the case of pure dephasing in Fig. 1(b).
Next, we relate the improved distinguishability $D_{\mathrm{HS}}$ observed in Fig. 1 to the quantum Fisher information $F_Q$, cf. Eq. (5). The latter, however, depends on the Bures distance $D_{\mathrm{bures}}$, which is a distance metric on the set of density matrices, just as the trace distance $D_{\mathrm{tr}}$ or the Hilbert-Schmidt distance $D_{\mathrm{HS}}$. Unlike the trace distance discussed above, $D_{\mathrm{bures}}$ cannot be related to $D_{\mathrm{HS}}$, not even in the case of qubits. Nevertheless, the increase of $D_{\mathrm{HS}}$ is expected to increase $D_{\mathrm{bures}}$ as well [72]. For the maximization of $D_{\mathrm{HS}}$ shown in Fig. 1, this is in fact true and $D_{\mathrm{bures}}$ is readily improved alongside $D_{\mathrm{HS}}$.
Note that Eq. (5) is only valid for small $\delta B$. Moreover, it needs to be weighted by the protocol duration $T$ in order to quantify the amount of information that can be obtained per unit time for any given protocol. Accordingly, Fig. 4 shows the quantum Fisher information $F_Q$ weighted by the protocol duration for small values of $\delta B$. In the case of pure dephasing, cf. Fig. 4(b), there is a small improvement in $D_{\mathrm{HS}}$, respectively $D_{\mathrm{bures}}$, for the optimized protocol compared to the Ramsey protocol. This is, however, almost completely canceled by the slightly longer protocol duration $T$. As a result, the maximally reachable value of $F_Q/T$ is almost identical for the Ramsey and optimized protocols. In contrast, for relaxation, cf. Fig. 4(a), the significant improvement of $D_{\mathrm{HS}}$, respectively $D_{\mathrm{bures}}$, realized by the optimized protocol gives rise to an improvement of $F_Q/T$ despite the slightly longer protocol duration $T$. We thus expect a metrological gain of the optimized protocol compared to the Ramsey protocol.
So far, we only considered decay rates determined by $T_1 = 1000$ and $T_2 = 1000$. However, since the dissipation sets a time scale for the control task that is independent of the QSL set by $\delta B$, cf. Eq. (15), it is natural to ask whether the control strategy identified above depends on the decay rates. To this end, we examine how the improvement of $D_{\mathrm{HS}}$, respectively $D_{\mathrm{tr}} = \sqrt{D_{\mathrm{HS}}}$, observed in Fig. 1 behaves for different relaxation and dephasing times. In detail, we are interested in the behavior of
\[ M_\gamma(\delta B) = \min_T \big[ 1 - D_{\mathrm{tr}}(\rho_1(T), \rho_2(T)) \big] \tag{16} \]
as a function of $\delta B$ and for various decay rates $1/T_1$ and $1/T_2$. The function $M_\gamma$ measures, for a given $\delta B$, the maximally reachable distinguishability $D_{\mathrm{tr}}^{\max}$, independent of the time it takes to reach it. In other words, $M_\gamma(\delta B) = 1 - D_{\mathrm{tr}}^{\max}$. If, for a given physical process, the protocol duration is not crucial and only the maximally achievable state distinguishability is of importance, $M_\gamma(\delta B)$ is the relevant figure of merit. For the Ramsey protocol, Eq. (16) can be solved analytically to yield
\[ M_\gamma(\delta B) = 1 - \frac{\delta B}{\sqrt{\gamma^2 + \delta B^2}} \exp\!\Big[ -\frac{\gamma}{\delta B} \arctan\!\Big( \frac{\delta B}{\gamma} \Big) \Big] \tag{17} \]
for relaxation with $\gamma = 1/T_1$. For pure dephasing, the solution takes the same form but differs by a factor of four in the rate, i.e., $\gamma = 4/T_2$. The dotted lines in Fig. 5(a) and (b) show $M_\gamma$ for the Ramsey protocol for relaxation and pure dephasing, respectively. The dotted lines perfectly fit the numerical values given by the opaque markers, as expected for an analytical solution. For the dynamics under the optimized control fields, we can evaluate Eq. (16) numerically, cf. the non-opaque markers in Fig. 5. Remarkably, these show an almost identical functional dependence compared to the Ramsey scheme. We therefore fit the data obtained for the optimized protocol to Eq. (17) using effective relaxation and dephasing times as fitting parameters. This yields the solid lines in Fig. 5, which indeed show that $M_\gamma(\delta B)$ accurately describes the dependence also for the optimized data points with effective decay times $T_{1,\mathrm{eff}}$ or $T_{2,\mathrm{eff}}$, see the legends in Fig. 5.
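For the Ramsey protocol, the analytical dependence of $M_\gamma$ on $\delta B$ can be cross-checked numerically. The sketch below assumes the Ramsey trace distance $D_{\mathrm{tr}}(t) = e^{-\gamma t/2}\,|\sin(\delta B\, t/2)|$, which follows from the Bloch-sphere dynamics for relaxation with $\gamma = 1/T_1$ (and for pure dephasing with $\gamma = 4/T_2$). Maximizing over $t$ gives $\tan(\delta B\, t/2) = \delta B/\gamma$ and hence the closed form used in `m_gamma_closed_form`. The derivation and all function names are an illustration under these assumptions, not the authors' code.

```python
import math

def ramsey_trace_distance(t, delta_b, gamma):
    """Ramsey trace distance with an effective decay rate gamma (assumed form)."""
    return math.exp(-0.5 * gamma * t) * abs(math.sin(0.5 * delta_b * t))

def m_gamma_closed_form(delta_b, gamma):
    """1 - max_t D_tr(t), evaluated at tan(delta_b * t / 2) = delta_b / gamma."""
    ratio = delta_b / gamma
    d_max = ratio / math.sqrt(1.0 + ratio**2) * math.exp(-math.atan(ratio) / ratio)
    return 1.0 - d_max

def m_gamma_numeric(delta_b, gamma, steps=20000):
    """Brute-force grid search over the first oscillation period."""
    t_max = 2.0 * math.pi / delta_b  # the envelope decays, so later maxima are lower
    d_max = max(ramsey_trace_distance(k * t_max / steps, delta_b, gamma)
                for k in range(steps + 1))
    return 1.0 - d_max

if __name__ == "__main__":
    t1 = t2 = 1000.0
    for delta_b in (1e-3, 1e-2, 1e-1):
        print(delta_b,
              m_gamma_closed_form(delta_b, 1.0 / t1),   # relaxation
              m_gamma_closed_form(delta_b, 4.0 / t2),   # pure dephasing
              m_gamma_numeric(delta_b, 1.0 / t1))
```

Fitting optimized-protocol data to the same functional form with an effective decay time would then amount to a one-parameter fit in `gamma`.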
This is in fact not obvious, as the coherent dynamics of the Ramsey and optimized protocol differ drastically, which makes the resemblance in their functional behavior of $M_\gamma$ remarkable. For relaxation, the effective decay times satisfy $T_{1,\mathrm{eff}}/T_1 \approx 2.4$, whereas for pure dephasing, the ratio is $T_{2,\mathrm{eff}}/T_2 \approx 1.2$. Thus, the maximally reachable distinguishability $D_{\mathrm{tr}}^{\max}$ behaves as though it had been measured by a Ramsey protocol with a 2.4 times longer $T_1$, respectively a 1.2 times longer $T_2$ time, which improves the distinguishability accordingly. Given the protection strategy of the dynamics, the prolongation of the decay times is not surprising, since the overall impact of the dissipation on the states is reduced.

IV. CONCLUSIONS
In summary, we have studied how optimized control fields can help to improve the distinguishability of two states of a qubit, both of which evolve under different drift but identical drive Hamiltonians while being exposed to either relaxation or pure dephasing. Our results show two improvements with respect to a standard Ramsey protocol for state discrimination.
First, optimized control fields increase the overall achievable state distinguishability, at the expense of slightly longer protocol durations. When comparing this improved state distinguishability against the prolonged protocol duration, in the case of relaxation, we observe a metrological gain, evidenced by the quantum Fisher information weighted by the protocol duration. In contrast, both effects (the improved state distinguishability and the prolonged protocol duration) roughly cancel in the case of pure dephasing.
Second, by utilizing optimized control fields, we are not only able to improve the state distinguishability but also to stabilize it at its maximum for times that are at least one order of magnitude longer than the decay times due to the environmental noise. The control strategy utilizes decoherence-free subspaces in all cases, where the states can be effectively stored and protected before being separated right before their measurement. We find the required control fields to be both simple and experimentally feasible.
Our study demonstrates the capabilities of optimal control to effectively reduce the environment's detrimental influence. For the considered state discrimination problem, and compared to the standard Ramsey scheme, it reveals an alternative protocol with improved noise resistance. Our results thus suggest exploring state discrimination and its impact on quantum metrological applications from a new perspective.