Experimental Test of Entropic Noise-Disturbance Uncertainty Relations for Three-Outcome Qubit Measurements

Information-theoretic uncertainty relations formulate the joint immeasurability of two non-commuting observables in terms of information entropies. The trade-off of the accuracy in the outcome of two successive measurements manifests in entropic noise-disturbance uncertainty relations. Recent theoretical analysis predicts that projective measurements are not optimal, with respect to the noise-disturbance trade-offs. Therefore the results in our previous letter [PRL 115, 030401 (2015)] are outperformed by general quantum measurements. Here, we experimentally test a tight information-theoretic measurement uncertainty relation for three-outcome positive-operator valued measures (POVM), using neutron spin-1/2 qubits. The obtained results violate the lower bound for projective measurements as theoretically predicted.

Introduction.-According to the rules of quantum mechanics, any single observable or even a set of compatible observables can be measured with arbitrary accuracy. However, classically unanticipated consequences appear when measuring non-commuting observables jointly, either simultaneously or successively. Heisenberg's seminal paper from 1927 [1] predicts a lower bound on the uncertainty of a joint measurement of incompatible observables. On the other hand, it also sets an upper bound on the accuracy with which the values of non-commuting observables can be simultaneously prepared. While in the past these two statements have often been mixed, they are now clearly distinguished as measurement uncertainty and preparation uncertainty relations, respectively. While Heisenberg's paper only presented his idea heuristically, the first rigorously proven uncertainty relation for position Q and momentum P was provided by Kennard [2] as $\Delta(Q)\Delta(P) \geq \hbar/2$, in terms of standard deviations defined as $\Delta(A)^2 = \langle\psi|A^2|\psi\rangle - \langle\psi|A|\psi\rangle^2$. In 1929, Robertson [3] generalized this relation to arbitrary pairs of observables A and B as

$$\Delta(A)\Delta(B) \geq \frac{1}{2}\,|\langle\psi|[A, B]|\psi\rangle|, \tag{1}$$

with the commutator $[A, B] = AB - BA$.
It is widely accepted [4] (but nevertheless under discussion [5,6]) that the uncertainty relation as formulated by Robertson in terms of standard deviations, $\Delta(A, |\psi\rangle)\Delta(B, |\psi\rangle) \geq \frac{1}{2}|\langle\psi|[A, B]|\psi\rangle|$, lacks an irreducible or state-independent lower bound, meaning that the bound can become zero even for non-commuting observables. Furthermore, the standard deviation is not an optimal measure of uncertainty for all states. Consequently, Deutsch began to seek a theorem of linear algebra in the form $\mathcal{U}(A, B, \psi) \geq \mathcal{B}(A, B)$ and suggested using the (Shannon) entropy as an appropriate measure. Note that Heisenberg's (and Kennard's) inequality $\Delta Q\,\Delta P \geq \hbar/2$ has that form, but its generalization Eq. (1) does not. Uncertainty relations in terms of entropy were introduced to solve both problems. The first entropic uncertainty relation was formulated by Hirschman [7] in 1957 for the position and momentum observables, and was later improved in 1975 by Beckner [8] and Bialynicki-Birula and Mycielski [9].
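The vanishing of the Robertson bound can be made concrete with a small numerical check. The sketch below evaluates both sides of the relation for the qubit observables σ_x and σ_y in the state |+x⟩; this choice of observables and state, and the helper names `matmul`, `expval` and `std_dev`, are illustrative assumptions that do not appear in the cited works.

```python
import math

SX = [[0, 1], [1, 0]]            # Pauli sigma_x
SY = [[0, -1j], [1j, 0]]         # Pauli sigma_y

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expval(op, psi):
    """Expectation value <psi|op|psi> for a normalized two-component state."""
    return sum(psi[i].conjugate() * op[i][j] * psi[j]
               for i in range(2) for j in range(2))

def std_dev(op, psi):
    """Standard deviation of a Hermitian operator op in the state psi."""
    var = expval(matmul(op, op), psi).real - expval(op, psi).real ** 2
    return math.sqrt(max(var, 0.0))  # clip tiny negative rounding errors

psi = [complex(1 / math.sqrt(2)), complex(1 / math.sqrt(2))]  # |+x>

xy, yx = matmul(SX, SY), matmul(SY, SX)
commutator = [[xy[i][j] - yx[i][j] for j in range(2)] for i in range(2)]

lhs = std_dev(SX, psi) * std_dev(SY, psi)   # product of standard deviations
rhs = 0.5 * abs(expval(commutator, psi))    # Robertson bound for this state
print(f"lhs = {lhs:.3e}, rhs = {rhs:.3e}")
```

Both sides vanish (up to rounding) although σ_x and σ_y do not commute, so the inequality carries no information for this state.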
The extension to non-degenerate observables on a finite-dimensional Hilbert space was given by Deutsch in 1983 [4] as

$$H(A) + H(B) \geq -2\log\left(\frac{1 + c}{2}\right), \tag{2}$$

where H denotes the Shannon entropy and the incompatibility $c = \max_{i,j}|\langle a_i|b_j\rangle|$ is the maximum overlap between the eigenvectors $|a_i\rangle$ and $|b_j\rangle$ of observables A and B, respectively. This relation was later improved by Maassen and Uffink [10], yielding the well-known entropic uncertainty relation

$$H(A) + H(B) \geq -2\log c. \tag{3}$$

Entropic uncertainty has proven to be a useful tool in entanglement witnessing [11], complementarity [12] and quantum information theory [13]. Initially, procedures to quantify error and disturbance were based on distance measures between target observables and measurements [5,14] or the associated probability distributions [15]. More recently, interest has risen in information-theoretic measures, introduced first by Buscemi et al. [16], but also in several subsequent alternative approaches [17-20].
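As a numerical illustration of the two entropic bounds, the following sketch computes the maximum overlap c for the eigenbases of σ_z and σ_x and compares the entropy sum for one example state with the bounds of Deutsch and of Maassen and Uffink. The choice of observables, the example state |+z⟩, the use of base-2 logarithms and all function names are assumptions made for illustration.

```python
import math

def shannon(probs):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def overlap(u, v):
    """Absolute value of the inner product <u|v> of two state vectors."""
    return abs(sum(complex(x).conjugate() * y for x, y in zip(u, v)))

def outcome_probs(basis, psi):
    """Born-rule probabilities for measuring psi in the given eigenbasis."""
    return [overlap(b, psi) ** 2 for b in basis]

s = 1 / math.sqrt(2)
Z_BASIS = [[1, 0], [0, 1]]        # eigenvectors of sigma_z
X_BASIS = [[s, s], [s, -s]]       # eigenvectors of sigma_x

c = max(overlap(a, b) for a in Z_BASIS for b in X_BASIS)
deutsch_bound = -2 * math.log2((1 + c) / 2)
maassen_uffink_bound = -2 * math.log2(c)

psi = [1, 0]                      # example state |+z>
entropy_sum = (shannon(outcome_probs(Z_BASIS, psi))
               + shannon(outcome_probs(X_BASIS, psi)))
print(f"H(A) + H(B) = {entropy_sum:.3f}")
print(f"Deutsch bound = {deutsch_bound:.3f}, "
      f"Maassen-Uffink bound = {maassen_uffink_bound:.3f}")
```

For this pair of observables the Maassen-Uffink bound equals one bit and is saturated by the example state, whereas the Deutsch bound is weaker.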
Theory.-To formally study measurement uncertainty relations one must define measures for two key properties of a measurement device, more precisely a quantum instrument, $\mathcal{M}$ [21,22] (which may in general implement an arbitrary quantum measurement with any number of outcomes): how accurately it measures a target observable A (noise), and how much it disturbs subsequent measurements (disturbance). While several definitions of noise have previously been studied theoretically and experimentally, we utilize the information-theoretic approach of [16], formulated as follows and schematically illustrated in Fig. 1 [13]. The noise is defined in the following scenario: the eigenstates of A are randomly prepared with probability $p(a) = 1/d$ before $\mathcal{M}$ is applied, producing an outcome m with probability $p(m|a) = \mathrm{Tr}(M_m |a\rangle\langle a|)$. If $\mathcal{M}$ accurately measures A, then the value of m should allow one to infer a; if the measurement is noisy, m yields less information about a. This noise is quantified in terms of the conditional Shannon entropy: denoting the random variables associated with a and m as $\mathbb{A}$ and $\mathbb{M}$, respectively, the noise of $\mathcal{M}$ for a measurement of A is [16]

$$N(\mathcal{M}, A) = H(\mathbb{A}|\mathbb{M}) = -\sum_{a,m} p(a, m)\log_2 p(a|m), \tag{4}$$

where $p(a, m) = p(a)\,p(m|a)$ and $p(a|m)$ can be calculated from Bayes' theorem.

FIG. 1. Scheme for (a) noise and (b) disturbance for two-level systems. (a) The eigenstates $|a_{\rm in}\rangle$ of A (or $|b_{\rm in}\rangle$ of B for disturbance) are prepared with equal probability $p(\pm) = 1/2$, before being measured by $\mathcal{M}$, producing outcome m and transforming the state according to $\mathcal{M}_m$. (b) For the disturbance, the input states are $|\pm b\rangle$, again with probability $p(\pm) = 1/2$. The result of the first measurement is classically communicated to a device applying a correction transformation $\mathcal{E}_m$ on the post-measurement state. The disturbance is obtained upon a subsequent projective measurement of B yielding outcome $b'$ at the end.
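The noise of Eq. (4) can be evaluated directly from the conditional probabilities p(m|a). The sketch below is a minimal illustration under the stated assumption of uniformly prepared eigenstates; the function name `noise` and the example probability tables are hypothetical and not taken from the experiment.

```python
import math

def noise(p_m_given_a):
    """Noise H(A|M) in bits.

    p_m_given_a[a][m] is p(m|a); the d eigenstates a are assumed to be
    prepared with equal probability 1/d.
    """
    d = len(p_m_given_a)
    n_out = len(p_m_given_a[0])
    # Joint distribution p(a, m) = p(a) p(m|a)
    joint = [[p_m_given_a[a][m] / d for m in range(n_out)] for a in range(d)]
    # Marginal p(m), needed for Bayes' theorem p(a|m) = p(a, m) / p(m)
    p_m = [sum(joint[a][m] for a in range(d)) for m in range(n_out)]
    total = 0.0
    for a in range(d):
        for m in range(n_out):
            if joint[a][m] > 0:
                total -= joint[a][m] * math.log2(joint[a][m] / p_m[m])
    return total

# Hypothetical examples: a perfect two-outcome measurement, a completely
# uninformative one, and a three-outcome measurement with one "blind" outcome.
print(noise([[1.0, 0.0], [0.0, 1.0]]))
print(noise([[0.5, 0.5], [0.5, 0.5]]))
print(noise([[0.6, 0.4, 0.0], [0.0, 0.4, 0.6]]))
```

In the third example the middle outcome occurs equally often for both inputs and therefore contributes one full bit of uncertainty, weighted by its probability.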
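The entropic disturbance $D(\mathcal{M}, B)$ of the apparatus $\mathcal{M}$ on the measurement of B is defined with respect to a procedure analogous to that for the noise. Uniformly distributed eigenstates $\{|b_i\rangle\}$ with eigenvalues $b_i$, associated with the random variable $\mathbb{B}$, are fed to the same instrument $\mathcal{M}$, from which a post-measurement state [...] (5)

Using these notions of noise and disturbance, for arbitrary observables A and B in finite-dimensional Hilbert spaces, the noise-disturbance (measurement) relation

$$N(\mathcal{M}, A) + D(\mathcal{M}, B) \geq -\log_2 \max_{a,b}|\langle a|b\rangle|^2 \tag{6}$$

holds [16]. In [23] we experimentally tested

$$g[N(\mathcal{M}, A)]^2 + g[D(\mathcal{M}, B)]^2 \leq 1, \tag{7}$$

where g[x] is the inverse of the function h(x) defined as

$$h(x) = -\frac{1 + x}{2}\log_2\frac{1 + x}{2} - \frac{1 - x}{2}\log_2\frac{1 - x}{2}. \tag{8}$$

As it turned out, the proof given in [23] for this relation was incorrect and the relation does not hold in general, as was pointed out in [24]. It should be noted that the relation does hold for projective measurements, although it can be violated by non-projective dichotomic measurements.

The bound of Eq. (7) can be violated by considering a three-outcome measurement $\mathcal{M}^\theta$ with the associated positive-operator valued measure (POVM) given by $\{M^\theta_{-1}, M^\theta_0, M^\theta_1\}$, where $M^\theta_m = p_m(\mathbb{1} + \vec{n}(\theta)_m\cdot\vec{\sigma})$ with weights and directions

$$p_0 = \frac{\cos\theta}{1 + \cos\theta}, \quad p_{-1} = p_1 = \frac{1}{2(1 + \cos\theta)}, \quad \vec{n}_0 = (1, 0, 0)^T, \quad \vec{n}(\theta)_{\pm 1} = (-\cos\theta, 0, \pm\sin\theta)^T, \tag{9}$$

which is illustrated in Fig. 2 for three distinctive values of the parameter θ. Note that for $\theta = \pi/2$ the POVM $\mathcal{M}^\theta$ degenerates to a projective measurement in the ±z-direction with elements $M^{\pi/2}_{\pm 1} = \frac{1}{2}(\mathbb{1} \pm \sigma_z)$. For $A = \sigma_z$, Eq. (4) gives the noise

$$N(\mathcal{M}^\theta, \sigma_z) = \frac{\cos\theta + h(\sin\theta)}{1 + \cos\theta}. \tag{10}$$

In order to determine a lower bound on the disturbance $D(\mathcal{M}^\theta, \sigma_x)$, let us consider the correction $\mathcal{E}^{\rm opt}_m$ that maps $\vec{n}_{\pm 1}$ onto the negative x-axis and $\vec{n}_0$ onto the positive x-axis, respectively. Using Eq. (5) one can then calculate the joint distribution $p(b', b)$ and thus the upper bound on the minimum disturbance for $B = \sigma_x$ as

$$D(\mathcal{M}^\theta, \sigma_x) \leq \frac{h(\cos\theta)}{1 + \cos\theta}. \tag{11}$$

This noise-disturbance pair from Eqs. (10) and (11) violates Eq. (7) for all $\theta \in\, ]0, \pi/2[$, which is experimentally tested here. In Sec. I of the Supplemental Material [25], details of the theoretical framework are elaborated.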
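The violation described above can be checked numerically. The following sketch rests on two assumptions that should be verified against Refs. [23,24]: the relation tested in [23] is taken to be g[N]² + g[D]² ≤ 1 with h the binary entropy of (1 ± x)/2, and the POVM directions are taken as n_0 = (1, 0, 0) and n_{±1} = (−cos θ, 0, ± sin θ), which is the choice compatible with the stated weights, the projective limit at θ = π/2 and the described correction. All function names are hypothetical.

```python
import math

def h(x):
    """Entropy in bits of the binary distribution ((1 + x)/2, (1 - x)/2)."""
    return -sum(p * math.log2(p) for p in ((1 + x) / 2, (1 - x) / 2) if p > 0)

def g(y, tol=1e-12):
    """Inverse of h on [0, 1], found by bisection (h falls from 1 to 0)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(mid) > y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def cond_entropy(joint):
    """H(X|Y) in bits for a table joint[x][y] = p(x, y)."""
    p_y = [sum(row[y] for row in joint) for y in range(len(joint[0]))]
    return -sum(p * math.log2(p / p_y[y])
                for row in joint for y, p in enumerate(row) if p > 0)

def povm(theta):
    """ASSUMED parametrization: (weight, (n_x, n_z)) for m = -1, 0, 1."""
    c, s = math.cos(theta), math.sin(theta)
    w0, w1 = c / (1 + c), 1 / (2 * (1 + c))
    return {-1: (w1, (-c, -s)), 0: (w0, (1.0, 0.0)), 1: (w1, (-c, s))}

def outcome_prob(element, bloch):
    """p(m | input) = p_m (1 + n_m . r) for an input Bloch vector r = (x, z)."""
    weight, (nx, nz) = element
    return weight * (1 + nx * bloch[0] + nz * bloch[1])

def noise_and_disturbance(theta):
    elems = povm(theta)
    # Noise: sigma_z eigenstates, Bloch vectors (0, +/-1), each with p = 1/2
    joint_am = [[0.5 * outcome_prob(elems[m], (0.0, a)) for m in (-1, 0, 1)]
                for a in (1, -1)]
    # Disturbance: sigma_x eigenstates (+/-1, 0); the correction sends m = 0 to
    # +x and m = +/-1 to -x, so the final outcome b' is fixed by m
    joint_bb = [[0.0, 0.0], [0.0, 0.0]]        # joint_bb[b][b'], index 0 is +1
    for i, b in enumerate((1, -1)):
        for m in (-1, 0, 1):
            joint_bb[i][0 if m == 0 else 1] += 0.5 * outcome_prob(elems[m], (b, 0.0))
    return cond_entropy(joint_am), cond_entropy(joint_bb)

theta = math.pi / 4
N, D = noise_and_disturbance(theta)
lhs = g(N) ** 2 + g(D) ** 2      # exceeds 1 when the assumed relation is violated
print(f"N = {N:.4f}, D = {D:.4f}, g(N)^2 + g(D)^2 = {lhs:.4f}")
```

Evaluating the entropies from the probability tables rather than from closed-form expressions keeps the sketch close to the operational definitions of noise and disturbance.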
A schematic illustration of the experimental setup is depicted in Fig. 3. An incoming monochromatic neutron beam with mean wavelength $\lambda \simeq 2.02$ Å ($\Delta\lambda/\lambda \simeq 0.02$) is polarized along the vertical (+z) direction by refraction from a swivelling CoTi multilayer array, henceforth referred to as supermirror. To prevent depolarization by stray fields, a 13 Gauss guide field $B^{\rm GF}_z$ pointing in the positive z-direction, produced by coils in Helmholtz configuration, is applied along the entire setup (Helmholtz coils not depicted in Fig. 3).
The probability of preparation of one of the two possible initial states, that is, $|\pm z\rangle$ for the noise and $|\pm x\rangle$ for the disturbance measurement, is determined by a classical random number generator applying one of two possible currents in the spin rotator coil DC-1. Within the coil DC-1 a local magnetic field $B_y$, pointing in the positive y-direction, is applied. Larmor precession around the y-axis is induced, and the strength of $B_y$ is tuned such that it causes a spin rotation by an angle of 0 or $\pi$ for the noise and $+\pi/2$ or $-\pi/2$ for the disturbance measurement, respectively.
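The effect of the four rotation angles on the incoming +z polarization can be illustrated with a rotation of the Bloch vector about the y-axis. This is a geometric sketch only: the right-handed sign convention for the rotation and the function name `rotate_about_y` are assumptions, and the actual sense of precession depends on the field direction and the sign of the neutron's gyromagnetic ratio.

```python
import math

def rotate_about_y(bloch, alpha):
    """Rotate a Bloch vector (x, y, z) by the angle alpha about the y-axis."""
    x, y, z = bloch
    return (x * math.cos(alpha) + z * math.sin(alpha),
            y,
            -x * math.sin(alpha) + z * math.cos(alpha))

PLUS_Z = (0.0, 0.0, 1.0)   # polarization behind the first supermirror

# Rotation angles named in the text for the two measurement types
noise_inputs = [rotate_about_y(PLUS_Z, a) for a in (0.0, math.pi)]
disturbance_inputs = [rotate_about_y(PLUS_Z, a)
                      for a in (math.pi / 2, -math.pi / 2)]

for label, vec in zip(("0", "pi", "+pi/2", "-pi/2"),
                      noise_inputs + disturbance_inputs):
    print(label, tuple(round(v, 6) for v in vec))
```

Rotations by 0 and π leave the spin along ±z (eigenstates of σ_z), while rotations by ±π/2 bring it onto the ±x axis (eigenstates of σ_x).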
For the three-outcome POVM $\mathcal{M}^\theta$, another spin rotator coil (DC-2) and the second supermirror (analyzer 1) are applied. As seen from the definition of the POVM, $M^\theta_m = p_m(\theta)(\mathbb{1} + \vec{n}(\theta)_m\cdot\vec{\sigma})$, each POVM element consists of a measurement direction given by $\vec{n}_m$ and a weighting denoted as $p_m$, dependent on the parameter θ. While the former is adjusted by an appropriate magnetic field strength $B_y$ in DC-2, the latter is set by the horizontal angle of refraction inside the supermirror. Note that the change in angle of the supermirror only affects the transmission (weighting) and does not change the polarization of the neutrons, making this procedure a valid experimental realization of the POVM $\mathcal{M}^\theta$.
For the noise-disturbance measurement the whole function of the quantum instrument has to be specified (not just the POVM it induces), which includes the transformation of the post-measurement state. Consequently, a correction operation $\mathcal{E}^{\rm opt}_m$ is applied in order to minimize the disturbance $D_{\mathcal{E}}(\mathcal{M}^\theta, B)$. In our experiment $\mathcal{E}^{\rm opt}_m$ maps $\vec{n}_{\pm 1}$ onto the negative x-axis and $\vec{n}_0$ onto the positive x-axis, which is achieved by Larmor precession within DC-3.
Finally, DC-4 and the third supermirror (second analyzer) perform the B measurement, which is a simple projective measurement of the observable $B = \sigma_x$. At the end of the beam line a boron trifluoride counting tube (detector 2 in Fig. 3) registers all incoming neutrons. The two successively performed measurements of $\mathcal{M}^\theta$ and B result in six output intensities $I^b_{m,b'}$ for $B = \sigma_x$ (disturbance measurement), for each setting of θ (see Sec. II of the Supplemental Material [25] for details of the data evaluation). For the noise measurement no B measurement is required, thus only three output intensities $I^a_m$ (with m = −1, 0, 1) are obtained.

Data treatment.-Uniformly distributed eigenstates of the observable $A = \sigma_z$, denoted as $\{|a_i\rangle\} = \{|+z\rangle, |-z\rangle\}$, are sent onto the apparatus $\mathcal{M}^\theta$. The correlation between the eigenvalue $a_i$ corresponding to the prepared state and the outcome m measured by the apparatus $\mathcal{M}^\theta$ is given by the joint probability $p(a, m)$, which in turn allows us to determine the noise. The conditional probability $p(a|m)$ is then obtained via $p(a|m) = p(a, m)/p(m)$, which allows the noise $N(\mathcal{M}^\theta, A)$ to be calculated using Eq. (4). The noise $N(\mathcal{M}^\theta, A)$ of the three-outcome POVM $\mathcal{M}^\theta$ is determined using the reduced setup; here an additional counting tube (detector 1 in Fig. 3) is inserted by mounting it directly onto the exit window of the first analyzer (second supermirror). This is done to maintain optimal positioning relative to the beam when the supermirror is rotated to implement the POVM weights.
With this configuration a maximal count rate $I_{\max} = 350$ counts per second is recorded. During the measurement the POVM parameter θ is varied between π/2 and 0 in steps of π/34 (see Sec. II of the Supplemental Material [25] for details of the noise measurement). For each value of θ, three intensities, belonging to the POVM outputs $M^\theta_0$, $M^\theta_{+1}$ and $M^\theta_{-1}$, are recorded in a measurement time $t_{\rm meas} = 400$ seconds. The joint probability is obtained via $p(a, m) = I^a_m / \sum_{a,m} I^a_m$ and the conditional probability via $p(a|m) = p(a, m)/p(m)$, which allows the noise $N(\mathcal{M}^\theta, A)$ to be calculated using Eq. (4).
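The step from recorded intensities to the noise can be sketched as follows. The code assumes that both eigenstates of A are prepared equally often, normalizes the intensities by their total to obtain p(a, m), and then applies Bayes' theorem and the conditional entropy. The function name `noise_from_counts` and the numbers in `example` are made up for illustration and are not measured data.

```python
import math

def noise_from_counts(counts):
    """Noise H(A|M) in bits from counts[a][m].

    counts[a][m] is the number of neutrons recorded at POVM output m when
    the eigenstate with eigenvalue a was prepared.
    """
    total = sum(sum(row.values()) for row in counts.values())
    # Joint probabilities p(a, m) from normalization by the total counts
    joint = {(a, m): n / total
             for a, row in counts.items() for m, n in row.items()}
    # Marginal p(m) for Bayes' theorem p(a|m) = p(a, m) / p(m)
    p_m = {}
    for (a, m), p in joint.items():
        p_m[m] = p_m.get(m, 0.0) + p
    return -sum(p * math.log2(p / p_m[m])
                for (a, m), p in joint.items() if p > 0)

# Made-up counts for illustration only (not measured data)
example = {+1: {-1: 40, 0: 400, +1: 560},
           -1: {-1: 560, 0: 400, +1: 40}}
print(f"N = {noise_from_counts(example):.3f} bits")
```

Background and contrast corrections, which precede this step in the actual data evaluation, are deliberately left out here.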
With the six conditional probabilities $p(a|m)$ we can calculate the noise $N(\mathcal{M}^\theta, \sigma_z)$ via

$$N(\mathcal{M}^\theta, \sigma_z) = -\sum_{a,m} p(a, m)\log_2 p(a|m).$$

The experimental results of the disturbance measurement $D_{\mathcal{E}}(\mathcal{M}^\theta, \sigma_x)$ can be seen in Fig. 4. The values obtained for the disturbance at small values of θ are slightly higher than theoretically predicted. This is due to the fact that for small values of $p(b, b')$ in Eq. (14) the disturbance $D_{\mathcal{E}}(\mathcal{M}^\theta, \sigma_x)$ is very sensitive to the input data. Unlike in the case of the noise $N(\mathcal{M}^\theta, \sigma_z)$, for the disturbance certain probabilities are predicted to be zero over the entire range of θ (see Sec. II.2 of the Supplemental Material [25] for details).
Final results.-A parametric plot of the experimental results of the noise-disturbance measurement is given in Fig. 5, where the disturbance $D_{\mathcal{E}}(\mathcal{M}^\theta, \sigma_x)$ is plotted versus the noise $N(\mathcal{M}^\theta, \sigma_z)$. Note that the final results from [...] for the last four noise-disturbance pairs (low disturbance, high noise, bottom right) for better statistics. Here, only noise-disturbance pairs for which the size of the error bars makes it possible to decide whether projective or POVM measurements perform better are shown (see Sec. II of the Supplemental Material [25] for details of the disturbance measurement).
Discussion and Outlook.-In addition, Fig. 5 gives an experimental comparison with the results from the projective measurements from [23], in terms of N (M pr , σ z ) versus D E (M pr , σ x ). Our experimental data clearly confirm that the three-outcome POVM measurement outperforms usual projective measurements, evidently reproducing the tighter bound theoretically predicted in [24].
At this point we want to emphasize that Fig. 4 gives an intuitive explanation why the three-outcome POVM, defined in Eq. (9), outperforms projective measurements: although there is a loss coming from the noise in the POVM measurement (meaning higher noise values compared to the projective measurement), this loss is surpassed by the gain in the obtained disturbance (significantly lower disturbance values than for the projective measurement). This behavior is a peculiarity of the applied three-outcome POVM. In general, increasing the number of possible outcomes has a negative (increasing) effect on the noise-disturbance bound [24]. A next step would be the investigation of two consecutive three-outcome POVM measurements. So far, only the first measurement apparatus implemented a POVM, which was followed by a projective measurement. It is of interest to replace the projective measurement apparatus with a second three-outcome POVM measurement and to study the resulting disturbance on the second POVM measurement.
Conclusion.-We experimentally tested a tight information-theoretic measurement uncertainty relation, in terms of a proposed three-outcome POVM, using neutron spin-1/2 qubits. The obtained results of the noise-disturbance trade-off relation for the three-outcome POVM outperform prior results for projective measurements over almost the entire measured range of the tested POVM parameter θ.
The authors thank Alastair A. Abbott and Cyril Branciard for helpful discussions. This work was supported by the Austrian Science Fund (FWF) Projects No. P 30677-N36 and P 27666-N20.

Supplemental Material
In this supplement, we provide technical details of the data evaluation required for the determination of noise and disturbance, accompanied by the underlying theoretical framework. This complements the conceptual description given in the main text.

[...] In order to determine a lower bound on the disturbance $D(\mathcal{M}^\theta, \sigma_x)$, let us consider the correction $\mathcal{E}^{\rm opt}_m$ that maps $\vec{n}_{-1}$ and $\vec{n}_1$ onto the negative x-axis and $\vec{n}_0$ onto the positive x-axis, respectively. Using [...] where the joint probabilities $p(b, b')$ are given by [...] with the optimal correction denoted as $\mathcal{E}^{\rm opt}_m[P(\vec{n}_m)] = \frac{1}{2}\left(\mathbb{1} + (-1)^m\, \vec{e}_x\cdot\vec{\sigma}\right)$ and the marginal given by summation, $p(b') = \sum_b p(b, b')$. Note that our applied correction operation, that is, mapping $\vec{n}_{-1}$ and $\vec{n}_1$ onto the negative x-axis and $\vec{n}_0$ onto the positive x-axis, differs from the procedure originally proposed, which leaves the state unchanged on outcome m = 0. However, both approaches are optimal. Finally, the disturbance is calculated from the four joint probabilities $p(b, b')$ from above via the conditional entropy $H(\mathbb{B}|\mathbb{B}')$, as

$$D_{\mathcal{E}}(\mathcal{M}^\theta, \sigma_x) = H(\mathbb{B}|\mathbb{B}') = -\sum_{b,b'} p(b, b')\log_2 p(b|b').$$

II.1 Noise Measurement
Uniformly distributed eigenstates of the observable $A = \sigma_z$, denoted as $\{|a_i\rangle\} = \{|+z\rangle, |-z\rangle\}$, are sent onto the apparatus $\mathcal{M}^\theta$. The correlation between the eigenvalue $a_i$ corresponding to the prepared state and the outcome m measured by the apparatus $\mathcal{M}$ is used to determine the noise $N(\mathcal{M}^\theta, A)$. This correlation is quantitatively characterized by the joint probability distribution $p(a, m)$. The conditional probability $p(a|m)$ is then obtained via $p(a|m) = p(a, m)/p(m)$, which allows the noise $N(\mathcal{M}^\theta, A)$ to be calculated using Eq. (S. 4). The noise $N(\mathcal{M}^\theta, A)$ of the three-outcome POVM $\mathcal{M}^\theta$ is determined using the reduced setup, where the detector ($^3$He cylindrical counting tube, diameter ø = 1 inch) is directly mounted onto the exit window of the second supermirror to maintain optimal positioning when the supermirror is rotated (to implement the POVM weights). With this configuration a maximal count rate $I_{\max} = 350$ counts per second is recorded.
During the measurement the POVM parameter θ is varied between π/2 and 0 in steps of π/34. For each value of θ, three intensities, belonging to the POVM outputs $M^\theta_0$, $M^\theta_1$, and $M^\theta_{-1}$ (denoted as $I^a_m$ with m = −1, 0, 1), are recorded in a measurement time $t_{\rm meas} = 400$ seconds; they are plotted in Fig. S. 1. The particular order of the POVM elements, that is, starting with $M^\theta_0$ followed by $M^\theta_{+1}$ and $M^\theta_{-1}$, has experimental reasons, namely to reduce the number of movements of the neutron optical components.
For each value of θ an initial state (eigenstate of $A = \sigma_z$) is chosen by a random number generator. The result is blinded during the measurement but stored in a file for later comparison with the obtained values for the noise $N(\mathcal{M}, A)$. The following sequence was randomly generated: [...]

The count rates are detangled according to their corresponding POVM output, and data corrections are performed. First, a background correction is applied by subtracting the background count rate of $I^{\rm bg}_m = 1.37 \pm 0.03$ counts per second, resulting in the intensity ${}^{\rm bgCorr}I^a_m$. A second correction takes the finite contrast of our system, measured as C = 95 %, into account. Next, the count rates are normalized by the total count rate. The statistical [...] $p(a = +1, m = +1)$ and $p(a = -1, m = +1)$ can be derived for each individual value of θ.
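The background correction and normalization steps can be sketched as below. The error model (Poisson counting error combined in quadrature with the background uncertainty), the function names and the example counts are assumptions for illustration; the contrast correction is omitted because its explicit form is not given in this section.

```python
import math

def corrected_rate(counts, t_meas, bg_rate, bg_err):
    """Background-corrected count rate and its uncertainty.

    counts  : neutrons recorded in the measurement time t_meas (seconds)
    bg_rate : background rate in counts per second, with uncertainty bg_err
    """
    rate = counts / t_meas
    rate_err = math.sqrt(counts) / t_meas      # Poisson counting error (assumption)
    return rate - bg_rate, math.hypot(rate_err, bg_err)

def normalize(rates):
    """Divide each corrected rate by the total rate to obtain probabilities."""
    total = sum(rates)
    return [r / total for r in rates]

# Made-up raw counts for the three POVM outputs (not measured data),
# combined with the background rate and measurement time quoted in the text
raw_counts = {-1: 12000, 0: 52000, +1: 76000}
rates = {m: corrected_rate(n, 400.0, 1.37, 0.03) for m, n in raw_counts.items()}
probs = normalize([rate for rate, _ in rates.values()])
print(dict(zip(rates.keys(), (round(p, 4) for p in probs))))
```

Normalizing after the background subtraction ensures that the resulting probabilities sum to one for every setting of θ.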
The results for $M^\theta_{+1}$ are plotted in Fig. S. 2 (a); apart from θ = 0, where the initial states are indistinguishable, the initial state can be inferred with a distinctive probability from $p(a = +1 \vee a = -1, m = +1)$. For the next output element, $M^\theta_0$, the situation is different. As can be seen from the normalized count rate of $M^\theta_0$, which is plotted in Fig. S. 2 (b), it is impossible to infer which eigenstate of $\sigma_z$ was sent, since the theoretical predictions are exactly the same. Finally, we take a look at the third output element, $M^\theta_{-1}$, which is depicted in Fig. S. 2 (c). Note that all theory curves from the output port $M^\theta_{-1}$ for input state $|+z\rangle$ correspond to those of $M^\theta_{+1}$ for input state $|-z\rangle$. Using $p(a|m) = p(a, m)/p(m)$, the conditional probabilities $p(a = +1|m = +1)$ and $p(a = -1|m = +1)$ are calculated; they are depicted together with the theoretical predictions in Fig. S. 3 (a). The identical data sets of $M^\theta_0$ are taken for the joint probabilities $p(a = +1, m = 0) = p(a = -1, m = 0)$ and for the conditional probabilities $p(a = +1|m = 0) = p(a = -1|m = 0)$, which are plotted in Fig. S. 3 (b). The conditional probabilities $p(a = \pm 1|m = -1)$ are determined in an analogous manner from $p(a = +1 \vee a = -1, m = -1)$ via $p(a = +1, m = -1)$ and $p(a = -1, m = -1)$, resulting in $p(a = +1|m = -1)$ and $p(a = -1|m = -1)$, which are illustrated in Fig. S. 3 (c).
The theoretical predictions (red and blue curves in

II.2 Disturbance Measurement
For the disturbance measurement $D_{\mathcal{E}}(\mathcal{M}^\theta, B)$ the three-outcome POVM measurement is followed by a subsequent projective measurement of the observable $B = \sigma_x$. In addition, an optimal correction operation $\mathcal{E}^{\rm opt}_m$ in between the two measurements maps $\vec{n}_{-1}$ and $\vec{n}_1$ onto the negative x-axis and $\vec{n}_0$ onto the positive x-axis, respectively. Uniformly distributed eigenstates of the observable B, denoted as $\{|b_i\rangle\} = \{|+x\rangle, |-x\rangle\}$ and associated with the random variable $\mathbb{B}$, are fed to the same instrument $\mathcal{M}^\theta$. Due to the disturbing nature of the measurement apparatus $\mathcal{M}^\theta$, a loss of correlation generally occurs. The correlation between the eigenvalue b corresponding to the prepared state and the outcome $b'$ of the second, now projective, measurement, which is used to define the disturbance, is characterized by the joint probability distribution $p(b, b')$, which allows the disturbance $D_{\mathcal{E}}(\mathcal{M}^\theta, B)$ to be calculated using Eq. (S. 6).
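The evaluation of the disturbance from measured probabilities can be sketched as follows: the probabilities p(m, b, b') are summed over the POVM outcome m to obtain p(b, b'), and the conditional entropy H(B|B') is computed from the result. The function name `disturbance` and the example tables are hypothetical and do not represent measured data.

```python
import math

def disturbance(p_mbb):
    """H(B|B') in bits from p_mbb[(m, b, b_prime)] = p(m, b, b')."""
    # Joint distribution p(b, b') by summation over the POVM outcome m
    joint = {}
    for (m, b, b_prime), p in p_mbb.items():
        joint[(b, b_prime)] = joint.get((b, b_prime), 0.0) + p
    # Marginal p(b') needed for p(b|b') = p(b, b') / p(b')
    marginal = {}
    for (b, b_prime), p in joint.items():
        marginal[b_prime] = marginal.get(b_prime, 0.0) + p
    return -sum(p * math.log2(p / marginal[bp])
                for (b, bp), p in joint.items() if p > 0)

# Made-up table for illustration only: all twelve combinations equally likely,
# so that the final outcome b' carries no information about the input b
uniform = {(m, b, bp): 1 / 12
           for m in (-1, 0, 1) for b in (1, -1) for bp in (1, -1)}
print(f"D = {disturbance(uniform):.3f} bits")
```

Entries with zero probability are skipped in the sum, which matters here because certain probabilities are predicted to vanish over the entire range of θ.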
In the actual experiment, the detector (boron trifluoride cylindrical counting tube, diameter ø = 3 inch, active volume of length $L_{\rm act} = 30$ cm) was placed horizontally, transverse to the beam. This was done to account for the beam displacement $\Delta y \sim 10$ mm caused by the tilt of the second supermirror when setting the POVM weights. With this configuration a maximal count rate $I_{\max} = 25$ counts per second is recorded. During the measurement the POVM parameter θ is varied between π/2 and 0 in steps of π/34. For each value of θ, six intensities $I^b_{m,b'}$, belonging to the $+b'$ and $-b'$ measurement of the POVM outputs $M^\theta_0$, $M^\theta_{+1}$, and $M^\theta_{-1}$, are recorded in a measurement time $t_{\rm meas} = 400$ seconds; they are plotted in Fig. S. 5 (for higher statistics a second data set with $t_{\rm meas} = 800$ seconds was also recorded). For each value of θ an initial state (eigenstate of $B = \sigma_x$) is chosen by a random number generator. Again, the result is blinded during the measurement but stored in a file for later comparison with the obtained values for the disturbance $D_{\mathcal{E}}(\mathcal{M}^\theta, B)$. The following sequence was randomly generated: [...]

As before in the noise measurement, the count rates are detangled according to their corresponding B measurement and POVM output. Next, a background correction is applied by subtracting the background count rate of $I^b_{\rm bg} = 0.176 \pm 0.008$ counts per second, resulting in the intensity $I^b_{\rm bgCorr}(M^\theta_m)$, and an overall contrast of C = 0.97 is taken into account. Following the same procedure as for the noise, the count rates are normalized (equipped with statistical and systematic errors) by the total number of counts, which gives the six probabilities $p(b = +1 \vee -1, m, b')$ with m = −1, 0, 1 and $b' = \pm 1$, which are plotted in Fig. S. 6 (a), (b) and (c), left and right, respectively.
Again the data points are separated according to the input state, $|+x\rangle \to b = +1$ and $|-x\rangle \to b = -1$, which gives in total 12 probabilities $p(m, b, b')$ with m = −1, 0, 1, $b = \pm 1$ and $b' = \pm 1$ (not shown here). Since the disturbance is defined as

$$D_{\mathcal{E}}(\mathcal{M}^\theta, B) = H(\mathbb{B}|\mathbb{B}') = -\sum_{b,b'} p(b, b')\log_2 p(b|b'), \tag{S. 14}$$

we have to calculate the joint probability $p(b, b')$ via $p(b, b') = \sum_{m=-1}^{1} p(m, b, b')$, which is plotted in Fig. S. 7. The