Practical, Reliable Error Bars in Quantum Tomography

Precise characterization of quantum devices is usually achieved with quantum tomography. However, most methods which are currently widely used in experiments, such as maximum likelihood estimation, lack a well-justified error analysis. Promising recent methods based on confidence regions are difficult to apply in practice or yield error bars which are unnecessarily large. Here, we propose a practical yet robust method for obtaining error bars. We do so by introducing a novel representation of the output of the tomography procedure, the"quantum error bars". This representation is (i) concise, being given in terms of few parameters, (ii) intuitive, providing a fair idea of the"spread"of the error, and (iii) useful, containing the necessary information for constructing confidence regions. The statements resulting from our method are formulated in terms of a figure of merit, such as the fidelity to a reference state. We present an algorithm for computing this representation and provide ready-to-use software. Our procedure is applied to actual experimental data obtained from two superconducting qubits in an entangled state, demonstrating the applicability of our method.

In the realistic regime where finite data are collected, the error bars provided by most methods which are widely applied in current experiments [19,[22][23][24] are typically ill justified and may lead to deceiving conclusions [25][26][27].To remedy this problem, Blume-Kohout [27] and Christandl and Renner [28] resort to confidence regions.These are regions in state space of all density matrices in which the state lies with high probability.In contrast to Bayesian methods [25], the reliability statements do not depend on any prior distribution.However, confidence regions are a priori difficult to construct explicitly [29].Furthermore, they are designed for worst-case scenarios and are often not representative of the intuitive extent of the error.
Our main result is a novel representation of the output of the tomography procedure-a summary of what the tomographic data tells us about the state of the system-which we call quantum error bars.This description is (i) concise, being given in terms of a few parameters only, (ii) intuitive, providing a fair idea of the "spread" of the error, and (iii) useful for precise statements, containing all necessary information for constructing confidence regions.Our method, in particular, inherits the mathematical robustness of the confidence region approach.
The quantum error bars are designed to mimic the role of classical error bars.Classically, an error bar typically represents the standard deviation of the distribution of a physical quantity caused by noise or statistical errors; this distribution is usually assumed to be Gaussian.Observe that, precisely, classical error bars (i) are a concise description of the error, (ii) provide a fair, intuitive idea of the "spread" of the quantity of interest, and (iii) allow us to calculate precise statements such as the required error interval to consider (e.g., 5 standard deviations) for a specific requested certainty level (e.g., one in a million).
Our statements are formulated in terms of a figure of merit which can be chosen freely.Our method works best when the figure of merit is the fidelity to a pure target state, the expectation value of an observable, or the trace distance to any reference state.This encompasses most tomography settings.
The quantum error bars are constructed as follows.The input is the experimental data from a general quantum tomography experiment.Then we construct a particular distribution µ( f ) of the chosen figure of merit f , which has the property of containing the necessary information to construct confidence regions at any confidence level using the method of Ref. [28].We show that in a wide range of situations and for a class of figures of merit, the distribution µ( f ) can be approximated by a simple analytical expression with three parameters.The quantum error bars are then straightforwardly deduced from these parameters.We provide a simple numerical algorithm to obtain the quantum error bars from the measurement data.By fitting a numerical approximation to µ( f ) with our approximate analytical model, we obtain the values of the parameters of the model which directly translate to the quantum error bars.The practicality of our method is demonstrated by applying it to experimental data from two superconducting qubits.
The rest of this Letter is structured as follows.First, we briefly explain our quantum tomography setup and the concept of confidence regions.We then derive our main technical MEASUREMENTS FIG. 1. Setup of quantum tomography.Measurements are taken on n copies of a quantum system.The outcomes allow us to infer what the state of the quantum system is.In this example a qubit is measured using Pauli operators.Here, the experimental data are most consistent with the state being |↑ , located at the top of the Bloch sphere (in green).However, because only finite data are collected, there is an uncertainty associated with this statement.In the method of Ref. [28], a distribution µ B n (ρ) (the red gradient) is determined from the data, from which confidence regions can be constructed (delimited by the dotted line).These are regions in state space in which the state lies with high probability.
results, namely, the definition of µ( f ), its approximate theoretical model, and the algorithm to estimate µ( f ) numerically.Finally, we demonstrate the applicability of our method on experimental data.
Quantum Tomography Setup.-Alarge number n of copies of a quantum system are measured using independent, possibly different, measurement settings (Fig. 1) [66].We list all the of the distinct positive operator-valued measure (POVM) effects in one set {E k }, and denote by n k the number of times the POVM effect E k was observed.We then construct the likelihood function, which will be needed in our analysis.It is defined as the probability with which the observed data would occur if the true state were n copies of ρ, along with the log-likelihood, with a conventional (−2) factor [27,33].
Confidence Regions.-In the following, we briefly review the method of Ref. [28] for constructing confidence regions, on which our method is based.
Confidence regions of confidence level 1 − ε are defined as regions in state space which contain the true state with a probability of at least 1 − ε.Crucially, it is the complete procedure of assigning a region to tomographic data which is certified and not the particular region itself (despite the slightly misleading terminology).More precisely, for a particular "true" state ρ true , the measurement outcomes observed in the tomography procedure are only one possible outcome data set among the enormous amount of theoretically possible data sets.Now, a data analysis procedure associates with each observed data set a corresponding region in state space.This tomography procedure is said to yield confidence regions of confidence level 1 − ε if, for any true state ρ true , the tomography procedure associates with the observed data set a region which contains ρ true , except for some data sets with total probability ε.In other words, the complete tomography procedure is successful except with probability ε, in which case the observed data set may cause the procedure to report a bad region.These "exceptional data sets" may be interpreted as misleading but highly unlikely situations.For example, if we flip a fair coin many times and observe the sequence of all "heads," any reasonable inference scheme would wrongly report that the coin is highly biased.However this outcome only happens with disproportionately small probability; introducing the parameter ε above allows us to disregard such extremely unlikely cases.
The method of Ref. [28] is formulated using the estimate density µ B n [67], defined as where c B n is a normalizing factor such that dρ µ B n (ρ) = 1, and where dρ is the Hilbert-Schmidt measure normalized such that dρ = 1 [68,69].The main result of Ref. [28] is a criterion for certifying a procedure for yielding confidence regions of confidence level 1 − ε.The criterion is the following: the procedure should map to any tomographic data (essentially) a region R in state space which satisfies where δ (•) denotes the Dirac delta function.Now, fix a threshold value f , and consider the region R f in state space consisting of all states whose figure of merit is greater than or equal to f (Fig. 2).The weight of the region R f according to the distribution µ B n (ρ) is exactly given by FIG. 2. Construction of confidence regions from the distribution µ ( f ) on the figure of merit.High weight intervals with respect to µ ( f ) (left plot) correspond to high weight regions in state space with respect to µ B n (ρ) (right diagram) which are (essentially) confidence regions, according to Ref. [28].
confidence level (See Appendix D for how to transpose the δ -enlargement in [28] into a shift of the threshold value f .).
Determining µ( f ) Numerically.-Wepropose a practical procedure which determines a numerical estimate of µ( f ).We resort to a Monte Carlo-type scheme known as the Metropolis-Hastings algorithm [71] (cf.also Refs.[72,73]).This algorithm is a standard, well-tested scheme widely used in computational physics-for instance, to simulate the behavior of statistical systems at finite temperature [74]-and there are standard methods for controlling the uncertainties resulting from the use of this procedure [75].Using this algorithm, we conduct a random walk in the quantum state space and produce random samples distributed according to the distribution µ B n (ρ).By collecting the values of f (ρ) at the sampled points into a histogram, we obtain an estimate for µ( f ).(See Appendix A for the details of the random walk procedure).
Theoretical Model for µ( f ).-It turns out that, for a selection of common figures of merit, we may understand the numerical estimate of µ( f ) with a theoretical model.Suppose f (ρ) is the fidelity to a pure reference state, the expectation value of an observable, or the trace distance to any reference state.Then, under some reasonable assumptions [76], we derive the following approximate theoretical model for µ( f ) (see Appendix B), with three fit parameters a 1 , a 2 , and m (with m 0), and one constant normalization factor C; h is a constant depending only on the choice of the figure of merit.Specific values of the constant h for some figures of merit are summarized in Table I.
The parameters (a 2 , a 1 , m) are then mapped onto new parameters which are more representative of the shape of the function.The latter is viewed as a "skewed" Gaussian (see Appendix C).The parameter f 0 determines the position of the peak, the parameter ∆ is the half width of the "deskewed" Gaussian, and γ characterizes the deviation from a perfect Gaussian.The parameters ( f 0 , ∆, γ) are the quantum error bars.
Application to Experimental Data.-We have applied the ln µ ( f ) ≈ −a 2 x 2 − a 1 x + m ln x + c, where: 3. Analysis of measurement data from two superconducting qubits prepared in a Bell state.We determined effective measurement operators which model the noise in the measurement process.The histogram of the fidelity to the target state |ψ (the blue data points), produced using our procedure, fits well to our theoretical model in Table I.The quantum error bars are a concise, intuitive, and precise characterization of the fit model, which is interpreted as a skewed Gaussian function.The parameter f 0 is the peak maximum, ∆ is the half width of the original Gaussian, and γ characterizes the skewing in terms of the displacement of the sides of the peak from the exact Gaussian at the relative height 1/e.This example involving experimental data demonstrates a good level of practical applicability of our method.algorithm to experimental data from two superconducting qubits prepared in a Bell state according to the setup described in Refs.[10,77].The data were kindly provided by the authors of Ref. [10].The two qubits were measured using slightly noisy individual Pauli operators, with a total of n = 55 677 measurements.The numerical estimation of µ ( f ) corresponding to the fidelity to the target Bell state is depicted in Fig. 3. (See Appendix F for details of the analysis of the experiment, including the modeling of the measurements [78] into effective POVM operators.) Quantum Error Bars.-The quantum error bars ( f 0 , ∆, γ) displayed in Fig. 3 are a concise and useful description of the error analysis, from which reliable operational statements can be made.Indeed, they provide the necessary information for constructing confidence regions for any given confidence level.
Also, as seen in Fig. 3, our error bars have the intuitive interpretation as representing the "spread" of the figure of merit according to µ ( f ).As such, the error bars are much smaller than the size of a confidence region for a small epsilon in a worst-case scenario, and they are in fact of comparable size to those obtained by bootstrapping [22,24,27,41,79] (see Appendix G).
Discussion.-Our work bridges the apparent gap between carrying out a mathematically rigorous, well-justified error analysis and using an ad hoc procedure yielding smaller error bars.The quantum error bars provide a convenient and precise representation of the information provided by the tomography procedure.
While the fit model for µ( f ) is subject to some assumptions and approximations, it applies well to many examples studied by the authors in developing this work-for n ∼ 100 total measurements already-and has been tested with up to five qubits.Note that the numerical procedure is not subject to these assumptions, and a deviation from the fit model could easily be noticed in some extreme examples considered (for example, with goodness-of-fit measures).A further detailed discussion on the reliability of our method is presented in Appendix H.
It is relatively straightforward to apply our method to experimental setups consisting of a few qubits.Our procedure is restricted neither to particular measurement settings nor to specific quantum states, and it applies, for example, to adaptive tomography.In general, noise in the measurement procedure has to be modeled into effective POVM effects analogously to our approach for the two superconducting qubits.(In contrast, other approaches do not require this [80][81][82].)We have developed a software which implements our procedure [83] that is expected to be directly applicable to most experimental settings.
For worst-case scenarios such as quantum cryptography [84], it is still desirable to improve the methods for explicitly constructing confidence regions.We do anticipate that the bounds used in Ref. [28] may be tightened to yield smaller confidence regions for the same confidence level.If the construction is not altered, the procedure presented here would not require any change, as the same histograms may still serve for constructing confidence regions using the tightened proof.
We also insist that our results do not rely on any particular interpretation of "probability," such as a Bayesian or a frequentist one.This is because we consider experiments which can, in principle, be repeated arbitrarily many times, which is a regime where these interpretations are equivalent [28].Nonetheless, the Bayesian viewpoint is convenient, as the distribution µ ( f ) happens to coincide with the Bayesian posterior corresponding to an agent starting the tomography procedure with a Hilbert-Schmidt uniform prior.
Furthermore, even though our results are formulated in the context of quantum state tomography, the same procedure may be applied to quantum process tomography [85,86].Indeed, the Choi-Jamiołkowski isomorphism implies that determining a quantum process is mathematically the same as determining a bipartite quantum state.

SUPPLEMENTAL MATERIAL
In this appendix, we provide a detailed description of how our method is implemented, how it is applied to practical examples, as well as additional discussions referred to from the main text.An overview of our method is given in Fig. 4.
Our software, along with instructions for download and use, may be downloaded at the location: https://tomographer. github.io/tomographer.The figure of merit is given as a function f (ρ) of the quantum state.For example, the squared fidelity to a reference state ρ Ref is represented as Recall that the relevant object in the method of Christandl and Renner is the estimate density µ B n (ρ), given by Eq. ( 3) of the main text.
Given the figure of merit f (ρ) of interest, the reduced distribution on this of merit of µ B n (ρ) is where the Dirac delta ensures the integration is performed over the shell of states in state space which have the given figure of merit f .The quantity µ ( f ) corresponds to the total weight given by µ B n (ρ) to all states with a given fixed figure of merit f .In the following, we develop a method to compute µ ( f ) numerically.We resort to a Monte Carlo-type scheme known as the Metropolis-Hastings algorithm [71] (cf.also Refs.[72,73]).This scheme is widely used in computational physics, for instance to simulate the behavior of statistical systems at finite temperature [74].This algorithm conducts a random walk which produces random samples distributed according to a given distribution P (x).The parameters of the algorithm are a starting point x 0 as as well as a "jump distribution" Q (x |x).The jump distribution is assumed to be symmetric (Q , and is used to update the current step in the random walk.(For example, Q (x |x) is often chosen as a Gaussian in some relevant coordinates centered at x).The i-th step of the random walk goes as follows: 2. Calculate a = P(x )/P(x i ).If a > 1, then set x i+1 := x unconditionally; if a < 1, then decide randomly to set x i+1 := x with probability a, or else to set x i+1 := x i .
The sequence of points {x i }, albeit correlated, are then asymptotically distributed according to the distribution P (x).

>> fi t(…)
FIG. 4. Overview of our quantum tomography analysis and how to apply our method.The measurement data are the input to our procedure.Our numerical method, which is based on a Metropolis-Hastings random walk in state space, outputs a histogram of a chosen figure of merit.We provide software accomplishing this [83].This histogram is a numerical approximation to the distribution µ ( f ) of the figure of merit.In a second step, this numerical estimate is fitted by a theoretical model.The fit can be done, for instance, using MAT-LAB.This yields a full description of the relevant distribution which allows to construct in principle confidence regions for any confidence level.This description is given in terms of three parameters, which are then effectively the "error bars." In order to calculate the quantity µ ( f ), we draw a large number of random samples in the quantum state space according to the distribution µ B n (ρ), i.e. with µ B n (ρ) playing the role of P (x).The histogram of values f (ρ) evaluated at those samples then provide a numerical estimate of µ ( f ).Crucially, it is not necessary to calculate the normalization constant c B n in the definition of µ B n (Eq.(3) of the main text) because in the Metropolis-Hastings algorithm we only have to evaluate ratios of probabilities.
For our random walk, we represent a quantum state ρ of dimension d by a square complex matrix T with tr T T † = 1, such that ρ = T T † .To any such T corresponds a valid density matrix ρ, and to any density matrix ρ corresponds at least one such T (e.g.T = ρ 1/2 ).Additionally, the constraint tr T T † = 1 corresponds to requiring that the components of T , real and imaginary parts taken separately into a real vector y, lie on the surface of the unit (2d 2 − 1)-hypersphere.Random density matrices may be sampled from the Hilbert-Schmidt measure by choosing such random points on this hypersphere [68].In fact, the matrix entries T i j of T are simply the components of a vector |ψ of dimension d 2 which purifies ρ.Indeed, if we trace out the second system from |ψ We hence choose to perform a Metropolis-Hastings random walk on the (2d 2 − 1)-hypersphere corresponding to the possible T matrices.The jump candidate is calculated from a point y i by choosing a vector ω of 2d 2 normally distributed values and setting y = ( y i + η step ω)/ y i + η step ω , where η step is a chosen step size.The jump distribution obtained in this way is symmetric.
We follow the the prescriptions given in Ref. [75] for the correct usage and appropriate error analysis of the Metropolis-Hastings algorithm.First, since the random starting point is likely to be a point which has very little weight under P (x), the random walk needs to equilibrate, or thermalize, until it reaches points which have a non-negligible values of P (x).This first set of points traversed until the walk has thermalized should be discarded.Also, because consecutively collected samples may be very correlated, it is useful to keep only one sample in a certain number N sweep (the "sweep size"), while throwing away each time the N sweep − 1 points between two recorded samples.In our examples, the sweep size N sweep is chosen of the order of 1/η step ; this gives at least the chance to the random walk to traverse all of state space between two recorded samples.Errors on the final numerical histogram points may be estimated either by calculating the standard deviation of independent runs of the simulation, or by a binning analysis which takes into account the correlation of the samples during a single run.We refer to Ref. [75] for a detailed discussion of these techniques.
In our case, for each histogram bin, we associate to each recorded sample the value 1 if the point is in the bin, or 0 otherwise.The final numerical estimate of µ ( f ) is produced by averaging the time series for each bin, which corresponds up to a constant to generating a histogram of counts.These time series are suitable for use in a binning analysis to obtain error bars on the numerical estimate of µ ( f ).We now derive an approximate theoretical model to fit our numerical histogram.This is useful in order to provide a succint description of the result in terms of only a few parameters.It also serves as a consistency check allowing us to assert that our results are well understood from a theoretical point of view.
In general, the function µ ( f ) can be very complicated, so an exact analytical description is unlikely.Rather, our goal is to find a decent approximation of this function in a region close to where µ B n has high weight.
In fact, the bell shape of the curves in Fig. 3 of the main text is typical when the figure of merit is taken to be the fidelity to a pure target state, the expectation value of an observable or the trace distance to any reference state.Intuitively, this shape is the result of the concurrence of two effects: a volume factor reflecting the increasing surface of a shell of fixed figure of merit as we get far from the reference point, and the approximately exponential decrease of the likelihood function itself.For example, in the case of the trace distance to the maximum likelihood estimate, there are many more states with high distance to ρMLE than there are very close-this is the increasing volume factor.The following derivation makes this argument more precise.
Let's now derive the fit model.We parameterize ρ with a generalized Bloch vector [87,88].Take an orthonormal basis {A j } of the Lie algebra su(d), with j = 1, . . ., M and M = d 2 − 1.The A j are hermitian, traceless and obey tr A j A j = δ j j (an example are the normalized generalized Gell-Mann matrices [87,88], or, for k qubits, the normalized tensor product of Pauli operators).We may now write a general state ρ as ρ (a j ) = (1/d)1+∑ j a j A j with real coefficients a j obeying some nontrivial constraints such that ρ 0. The Hilbert-Schmidt distance is given by the Euclidean distance of the generalized Bloch vectors, tr ρ(a j ) − ρ(b j ) 2 = ∑ j a j − b j 2 .Now because the Hilbert-Schmidt measure dρ is induced by the Hilbert-Schmidt metric [68,69], we may write (A1) as with a new constant c and λ (ρ) = −2 ln Λ (ρ), and where implicitly the arguments to λ (•) and f (•) are to be transformed into ρ appropriately.
The conditions that the a j have to satisfy in order to represent a positive semidefinite matrix are complicated [87,89,90].However, it turns out that we don't need to know the exact form of these constraints.Rather, we assume that: (i) f has an extremal value close to the region of interest (viz., near ρ MLE ); (ii) the surface of a shell of states of a given figure of merit f tends to zero as f tends to this extremal value.
These assumptions are rather natural and are indeed automatically satisfied if f (ρ) is one of the cases considered in the main text (the squared fidelity to a pure reference state, the trace distance to a reference state ρ Ref , or the expectation value of an observable).In the case of a distance measure, such as the trace distance, the extremum is usually zero at the reference state itself, and the surface of the shell of states with very small distance to ρ Ref clearly shrinks to zero.In the case of the expectation value of an observable, the extremum is attained at the border of state space.Because the border of state space has no flat facets, the surface of a shell of given expectation value (a hyperplane intersected with state space) also shrinks to zero as we approach the border.
where in the last two integrals the terms in square brackets are to be evaluated at the points r which satisfy r = f (r, Ω) and f = f (r, Ω), respectively.Note that the figure of merit f (r, Ω) must be invertible for fixed Ω and for r 0; this is usually the case with our choice of a Ref j above.Note also that Expression (B2) is in fact still exact, albeit very difficult to calculate explicitly.To proceed further, we will use Laplace's approximation, and assume that the main contribution to the integral is a region close to a single point Ω 0 on the shell of fixed figure of merit f where the integrand is maximal.Then, we have where w [ f , Ω 0 ] is a "width factor," which accounts for the total weight of the peak in Laplace's approximation, including a possible truncation of the peak caused by the border of state space.Note that the condition (ii) above for the figure of merit ensures that w [ f , Ω 0 ] doesn't explode when f approaches the extremum of f (ρ).
At this point, we need to make further assumptions about the behavior in f of the individual terms in (B3).First, we consider the regime in which the likelihood is close to Gaussian in the Hilbert-Schmidt coordinates, (we shift a by a Ref without loss of generality and for later practical reasons; also λ B is a symmetric matrix).This is generically the case for most practical scenarios where a reasonable amount of measurements were taken.Second, we need to assume something about the figure of merit f (r, Ω): we'll suppose that where h is some known constant, and g (Ω) some function.
The figures of merit considered in the main text automatically satisfy this assumption.First, if the figure of merit is any distance measure to ρ Ref which is given by a norm, such as the trace distance, then f (r, Ω) = ρ (r, Ω) − ρ Ref = ∑ j (a j (r, Ω) − a Ref j ) A j = r ∑ j Ω j A j , recalling that our hyperspherical coordinates are defined by a(r, Ω) = a Ref + r Ω with Ω the unit vector in the direction Ω.Also, f (ρ) obeys property (B5) if it is the expectation value of an observable, f (ρ) = tr (ρW ): with a = r Ω + a Ref we can write f (ρ) = tr (1/d) + ∑ j a j A j W = r Ω• w+( a Ref • w+trW /d), where w is the vector with components w j = tr (A j W ). Recall in this case that a Ref is on the border of state space.Furthermore, recall that the squared fidelity to a pure reference state can be written as the expectation value of an observable, Armed with both assumptions (B4) and (B5), we see that ∂ f /∂ r = g(Ω) as well as r = ( f − h) /g (Ω), and we obtain where c = c • sign (g (Ω 0 )).The term in the exponential in (B6), being quadratic in r, is then also quadratic in f − h.At this point, we further assume that Ω 0 (where λ ( f , Ω 0 ) is minimal at fixed f ) is approximately constant in f , and that the term w [ f , Ω 0 ] is either approximately constant or, being a volume factor, varies as a power of r, and thus of f − h.We finally obtain our fit model, with 3 fit parameters a 1 , a 2 , m and one constant normalization factor C. The value m includes the (M − 1) power plus any contribution from the weight factor w [ f , Ω].The expression for the logarithm of µ ( f ) is numerically more suitable for fitting.We thus obtain our fit model for ln µ ( f ), depending on whether f h for all valid f or f h for all valid f .Indeed, either of these two conditions hold as we have chosen the center a Ref j of our hyperspherical coordinates as an extremal point of f (ρ).Table I of the main text summarizes the appropriate fit model for a selection of figure of merits.
It is further worth mentioning that for larger f , the exponential will dominate all the other terms; for example, in this regime, the details of the function w ( f , Ω) is not relevant for most figures of merit.
There are certain situations in which our approximate fit model fails to accurately describe the behavior of µ ( f ).If too few measurements are taken, the likelihood function is not approximately Gaussian as we have assumed (however this is usually the case already for, e.g., n ∼ 100 total measurements).Our derivation also no longer applies if Ω 0 happens to not be constant with f , or if a different figure of merit is considered such as the fidelity to a mixed reference state.Furthermore, in some cases the boundary of state space might interfere with our approximation (it might for example constrain Ω 0 causing it to vary with f ), or the Laplace method might not be a good approximation if λ ( f , Ω) has e.g.several minima for fixed f .However in examples we have studied these cases always caused our model to fit poorly to the numerical estimate in the region of the peak; we consider it very unlikely that the fit model fits the peak well but fails to describe the tail accurately.See also Appendix H for a more general discussion of the reliability of our method.which for x > 0 exhibits a characteristic skewed bell curve as depicted in Fig. 3 of the main text.The function y(x) is exactly the model for ln µ( f ), with x = s ( f − h) for a constant h and for a sign s = ±1 depending on the figure of merit, as given by (B8).The function (C1) can be seen as a skewed version of a parabola with summit at (x 0 , y(x 0 )) (see Figure 5a).The de-skewing operation applied to y(x) consists in finding the parabola y deskewed (x) = −a (x − x 0 ) 2 + y 0 with the same the peak position and curvature as y(x).In other words, we find a parabola y deskewed (x) such that the functions y(x) and y deskewed (x) must match at x = x 0 to second order.The maximum of y(x) is at the point x 0 satisfying 0 = (dy/dx)| x 0 , that is, 0 = −2 a 2 x 0 − a 1 + m/x 0 .Solving for x 0 while ensuring that x 0 > 0 yields The condition (d 2 y/dx 2 )| x 0 = (d 2 y deskewed /dx 2 )| x 0 yields a = a 2 + m/(2 x 2 0 ).In terms of (a, x 0 , y 0 ), the original parameters (a 2 , a 1 , c) read We can already define ∆, which is the first of our quantum error bars.It is defined as The parameter ∆ is the half width of the Gaussian function e y deskewed (x) at relative height 1/e (Figure 5b): Indeed, the standard deviation of a Gaussian is precisely the half width of the Gaussian peak at relative height 1/e with respect to the Gaussian peak maximum.In our case, ∆ is interpreted as the half width of the Gaussian, before the skewing operation is applied.
It remains to understand the effect of the m parameter in terms of skewing.Consider the intercepts of y deskewed (x) with the line y = y 0 − ξ , which are at x ± = x 0 ± ξ 1/2 ∆.(These points correspond to the cross-section of the peak of e y deskewed (x) at a relative height e −ξ .)If we view the function y(x) as the result of skewing y deskewed (x) via the transformation above parametrized by m, then the intercepts with the line y = y 0 −ξ are shifted by some δ x ± which vary as a function of m.Let us determine δ x ± to first order in m.For infinitesimal m, the equation y(x ± ) = y 0 − ξ defining x ± varies correspondingly as y(x ± + δ x ± ) + δ y(x ± + δ x ± ) = y 0 − ξ .Keeping only the terms of first order in m we obtain Noting that we only need (dy/dx)| x ± to zeroth order in m, we have Also, with δ a 2 = −m/(2x 2 0 ), δ a 1 = 2m/x 0 and δ c = (δ a 2 ) x 2 0 + δ a 1 − m ln x 0 , we have Developing the logarithm as a Taylor series in ∆, the first two orders cancel and we have Then we obtain from (C5), also recalling that a = ∆ −2 , We now introduce the skewing factor γ as such that to lowest order in ∆, the shift of the "sides of the peak" given by x ± for a relative height e −ξ is directly proportional to γ (see also Figure 5b): More precisely, the shift for infinitesimal m is given by We may straightforwardly define f 0 = s x 0 + h as the position of the peak in terms of the figure of merit f , by invoking the relation x = s ( f − h) which we used to write (C1).Finally, we obtain a set of parameters ( f 0 , ∆, γ), along with a normalization constant y 0 , which now all have a direct interpretation in terms of features of the modeled distribution (Figure 5c).
In summary, they are given in terms of the fitted parameters (a 2 , a 1 , m) as: recalling that s = ±1 and h in the relation x = s ( f − h) are fixed by the choice of figure of merit, as given in Table I of the main text.The position of the peak is at f = f 0 .The half width of the peak (after it is de-skewed) is given as ∆ (at relative height 1/e), and the factor γ measures how much the peak is skewed towards larger f values (respectively lesser f values, if s = −1), with a direct interpretation in terms of the horizontal shift of the sides of the peak.
Appendix D: Confidence regions from the distribution µ ( f ) Here we see how to construct confidence regions from the histogram obtained by our method.As explained in the main text, regions with high weight in state space may be promoted to confidence regions by the method of Christandl and Renner.
Consider the region with all states ρ which have at least a given value of the figure of merit: The direction of the inequality in (D1) depends on which figures of merit are considered desirable.The direction used here reflects the fidelity to a target state, in which case higher fidelities are desirable.If, e.g., a proper distance measure such as the trace distance is used, the opposite inequality is preferable.
It is straightforward to see that the weight of the region R f in state space according to the measure µ B n (ρ) dρ is directly given by the weight of the function µ ( f ) over the corresponding range of f values which are included in the region R f .For example, if the figure of merit is the fidelity to a target state, the weight α ( f ) of the region R f is given by The value of f required for a region R f to encompass a particular weight 1 − ε/ poly(n) is thus given by inverting (D2).This may either be done directly from the numerical histogram points, or from a fit model.This gives us a region with high weight with respect to µ B n (ρ).
The method of Christandl and Renner may now be used to upgrade these regions to confidence regions.Choose a confidence level 1 − ε, and calculate the corresponding poly(n) and δ as given in ref. [28].Recall that a region with weight 1 − ε/ poly(n), once enlarged by δ in purified distance, is a confidence region with confidence 1 − ε.
In general, the δ -enlargement can be translated into a cost in the corresponding bounding figure of merit f .We'll derive here this cost for our particular cases of interest of the fidelity to a pure reference state, the expectation value of an observable and the trace distance to any reference state.Consider first the case where the figure of merit corresponds to the trace distance to a reference state ρ Ref , Note the reverse inequality is used in (D1).Then, consider the region R f +δ .Now, we'll see that R f +δ contains the set R f enlarged by δ in purified distance.Indeed, if ρ ∈ R f , and σ is such that P (ρ, σ ) = 1 − F 2 (ρ, σ ) δ , then we may use the triangle inequality, along with the fact that D (ρ, σ ) P (ρ, σ ) [91], to see that D (σ , ρ Ref ) f + δ , and deduce that σ ∈ R f +δ .Thus, if R f is a region with weight 1 − ε/ poly(n), then R f +δ is a confidence region of confidence level at least 1 − ε.This construction is depicted graphically in Fig. 6.Of course, the same reasoning applies to the case where f (ρ) = P (ρ, ρ Ref ) is the purified distance to a reference state.
Consider also the case where f (ρ) = tr (ρW ) is the expectation of an observable W . First, assume that the reverse inequality direction is used in (D1).Then the δ -enlargement of R f is included in the region R f +wδ , where we've assumed that the eigenvalues z j of W lie within an interval of size w, i.e. w − z j w + and w = w + − w − , or equivalently, w − 1 W w + 1 and w = w + − w − .Indeed, assume f (ρ) f and P (ρ, σ ) δ .Then D (ρ, σ ) P (ρ, σ ) and by properties of the trace distance there exists Remark that the error bars on the numerical estimate of µ ( f ), or on the relevant fit parameters, should a priori be propagated to the obtained confidence regions.However since µ ( f ) decays exponentially, small errors on the fit paramterers will only have a negligible effect on the f required to contain a given weight 1 − ε/ poly(n) as given by (D2).This is just like classical error bars-error bars hardly need their own error bars.
The final confidence regions obtained are still generally unmeaningfully large.For example, if we try to construct confidence regions for the two superconducting qubits and choose, say, ε = 5%, then we see that ε/ poly (n) ∼ 10 −37 .Yes, that's small.[93] The corresponding f required (see construction in Fig. 6) is f ≈ 0.85, and we can calculate δ ∼ 0.1.The final confidence region comprises then all states with a fidelity to the target state in the range [0.75,1].This analysis is in itself Histogram of the squared fidelity to the target state |Ψ under the distribution which is relevant for the construction of confidence regions using the method of Christandl and Renner [28].The numerical estimate (blue points) were obtained using our procedure based on a Metropolis-Hastings random walk.The theoretical model fits the numerics well (red curve).The quantum error bars characterize the distinctive features of the curve, which is interpreted as applying a "skewing" operation on an exact Gaussian (green curve).Confidence regions may be constructed from this histogram by choosing regions with states that have a certain minimum fidelity to the target state, chosen such that the set has high weight with respect to this distribution.The true value of the squared fidelity of the state from which we have simulated measurements is in fact 0.963; the shift is due to the increasing volume factor towards lower fidelity values.b.The same analysis is applied to the case of the expectation value of an entanglement witness.The witness is chosen to have positive expectation value only for entangled states, with a maximum at the maximally entangled state |Ψ where Ψ|W |Ψ = 2. c.The same analysis is again repeated for the case where the figure of merit is the trace distance to the maximum likelihood estimate.
Points with obviously huge error bars were excluded from the fit.The raw fit for ln µ ( f ) is presented in Fig. 8, along with a plot of the residuals.The corresponding fit parameters are (with 95% confidence bounds):

(E2)
The value m is close to M − 1 = 14, which is the value we would predict from (B6) without the w [ f , Ω 0 ] term.This indicates that this latter volume factor is approximately constant in this case.The quantum error bars can be obtained with (C13), f 0 = 0.0377 ; Our model also fits the numerically obtained histogram in Figures 7a and 7b well.For Figure 7a, the appropriate fit model is ln µ and for Figure 7b we have used the model ln µ  The log scale allows to better appreciate the quality of the fit.The plot shows the residuals of the fit for ln µ ( f ), that is, the difference between the logarithm of the numerical histogram points and the fitted model for the function ln µ ( f ).The low deviation from zero for the central points underscores the quality of our model (and that the error bars may be overestimated).The points at the sides correspond to regions with very low probability and where the numerical estimate is anyway expected to be unreliable.

Appendix F: Application to experimental data and modeling noisy measurements
As an illustration, we apply our method to analyze measurement data obtained for two superconducting qubits in a Bell state prepared using the setup reported in [10].The data were kindly provided by the authors of Ref. [10].The measurement on an individual qubit is carried out by a transmission measurement on a resonator coupled to that qubit [78].This measurement yields a random real value I which is distributed differently whether the qubit is in the |0 state or in the |1 state.Single-shot readouts are possible to reasonable accuracy using a simple threshold, because the two distributions of I corresponding to |0 and |1 have almost nonoverlapping support [10].However, we choose to model the measurement process more precisely, as our method assumes the POVM effects correctly incorporate any noise introduced by the measurement device itself.Here, we model the measurement process as a real-valued POVM.A calibration measurement yields the distributions q 0 (I) and q 1 (I) for trusted preparations of the |0 and |1 states respectively.The measurement of the Pauli operator σ i is performed by applying the appropriate unitary gate U i with high fidelity to bring the measurement basis onto the computational basis.The effects corresponding to the real-valued POVM including the rotation with U i are then (We have ignored here errors in implementing the gate U i .) We could have used these POVM effects directly for each measured value for each qubit in the expression for the loglikelihood given by Eq. ( 2) of the main text, however for practical purposes (to reduce the number of different POVM effects), we have coarse-grained the values I into 20 different bins, yielding the discrete distributions q 0 (k) and q 1 (k) for bin number k.In other words, if the measured value I is in bin number k, then the corresponding POVM effect is The joint POVM effect corresponding to combining individual measurements on the two qubits is simply given by tensoring the two POVM effects.For example, if the value I A is measured on qubit A (falling in bin k A ) and the value I B is measured on qubit B (which falls in the bin k B ), then the joint POVM effect is simply where U i (resp.U j ) is the rotation applied to qubit A (resp.qubit B) before measuring the qubit in the computational basis.
We have analyzed the measurement data using the procedure described above.There were in total n = 55 677 measurements.The histogram corresponding to the squared fidelity to the target Bell state is depicted in Fig. 3 of the main text.Our theoretical model (B8b) (with h = 1) fits the numerical estimate well.The fit parameters are (with 95% confidence bounds), A currently used ad hoc technique for obtaining error bars is bootstrapping [22,24,27,41,79].In our simulated experiment above, we have performed a simple parametric bootstrapping analysis for comparison.We have simulated new measurement outcomes from ρ MLE , using the same amount of  7a.We have resampled random measurement outcomes from the maximum likelihood estimate and plotted the resulting reconstructed squared fidelities to the target state |Ψ (green histogram bars).The bias of the curve from our method is due to the increasing volume factor in the direction of lower fidelity values.We see here that our method yields error bars of the same order of magnitude as bootstrapping.However our error bars are well-justified, because they can serve to construct confidence regions for any confidence level.
measurements and the same settings as for the original simulated experiment.We repeated the procedure many times to obtain in total 300 new datasets.For each dataset, we have reconstructed the corresponding maximum likelihood estimate, and determined its squared fidelity to the target state |Ψ .The histogram of these values is presented in Fig. 9, compared with the result of our method in Figure 7a.We see that the width of the distribution is approximately the same.The bias of ∼ 1% squared fidelity between the two methods is due to the increasing volume factor picked up by our method in the direction of decreasing fidelities.Our error bars, however, have the robust operational meaning as a means to construct confidence regions.
Appendix H: Discussion of the reliability of the method In our work, we provide several levels of reliability statements.First, obviously if µ ( f ) can be exactly determined, then our error analysis is perfectly reliable, assuming the given measurement operators are accurate.In practice though, it is only possible to approximate µ ( f ) with numerical techniques.However these methods are standard and well-tested, and come with reliable error estimates [75].Thus with minimal reasonable assumptions an error analysis based on this approximation of µ ( f ) is also reliable.
Furthermore, we provide an approximate theoretical model with only three fit parameters and which explains well the numerical estimate obtained.The quality of the fit over many examples studied by the authors not only presents additional strong indication that the numerical estimate is faithful, but also shows that the result admits a simple representation with few parameters.Recall that the numerical method does not rely on the assumptions and approximations used to derive the theoretical fit model.Also, because of its form the fit model is relatively robust to small uncertainties in the fit parameters.
Our approximate fit model might fail though to describe the distribution accurately in some extreme cases, for example if too few measurements are taken (however examples with n ∼ 100 total measurements were well fit).The model is also known not to apply to figures of merit such as the fidelity to a non-pure state, or more generally, figures of merit which do not satisfy (B5).However, the fit in these cases is usually of bad quality, especially in the region of the peak.In practice, it is sufficient to rely on goodness-of-fit measures or visual inspection of the quality of the fit to assert its validity.
Appendix I: Overview of our software We are releasing a software suite which accomplishes our procedure in a wide range of settings [83].The project is composed of a program ready for use, which is built upon a modular, generic C++ framework designed for flexibility and speed.
We expect our program to be directly usable in most experimental applications.Our program takes as input a list of POVM effects E k , which are assumed to be independent, and a list of frequencies n k which indicate how many times each corresponding POVM effect was observed.Further inputs include settings for the histogram range and number of bins, which figure of merit to use, parameters of the Metropolis-Hastings random walk, the number of times to repeat the random walk, the error analysis method, etc.The output of the program is the histogram as displayed for example in Fig. 7, as well as Fig. 3 of the main text, with corresponding error bars.We refer to the project's hosted location [83] for further documentation and detailed information about its usage.The histograms presented in this work were all obtained using our software.
This program is itself built upon a generic C++ framework with a collection of tools which may be used to specialize our method to more complex setups.We provide for example tools to specify the data for a quantum tomography problem, an implementation of an abstract Metropolis-Hastings random walk, an interface to collect statistics during this random walk, as well as tools for parallel processing several random walk instances.The code is written using a technique called C++ template metaprogramming [94], which allows to write generic code which is flexible and reusable, but which at compile-time is translated into highly optimized low-level machine instructions.Our project relies on the Eigen and Boost libraries [95,96], in particular for linear algebra calculations.These libraries also make extensive use of this technique.Some tasks are not covered by our program.If required by the figure of merit, finding the maximum likelihood estimate ρ MLE can be accomplished by minimizing the loglikelihood λ (ρ).Since this function is convex, the solution can be found efficiently.In most of our examples, we used CVX, a MATLAB package for specifying and solving convex problems [97,98].Also, in order to determine the fit parameters corresponding to the histogram, we resorted to MATLAB's curve fitting toolbox.A MATLAB script is provided along with the software to ease this task, and to calculate the quantum error bars.
Our program is currently limited to POVM effects with a product structure.However, this is generically the case, even e.g. for adaptive tomography [99][100][101].We have successfully used our code to analyze simulated measurements of Pauli operators of up to at least 5 qubits on our hardware, and we expect further improvements will increase this limit.
.e., which has high weight under the distribution µ B n [70].Confidence Regions for a Figure of Merit.-We may now use this criterion to devise an explicit procedure for constructing confidence regions, where the regions R are chosen to be defined via level sets of a figure of merit.A figure of merit f (ρ) may be any function of the quantum state.For example, f (ρ) = F 2 (ρ, |ψ Ref ψ Ref |) expresses the fidelity to a reference state |ψ Ref .The reduced distribution of the estimate density µ B n (ρ) onto the figure of merit f is given by FIG.3.Analysis of measurement data from two superconducting qubits prepared in a Bell state.We determined effective measurement operators which model the noise in the measurement process.The histogram of the fidelity to the target state |ψ (the blue data points), produced using our procedure, fits well to our theoretical model in TableI.The quantum error bars are a concise, intuitive, and precise characterization of the fit model, which is interpreted as a skewed Gaussian function.The parameter f 0 is the peak maximum, ∆ is the half width of the original Gaussian, and γ characterizes the skewing in terms of the displacement of the sides of the peak from the exact Gaussian at the relative height 1/e.This example involving experimental data demonstrates a good level of practical applicability of our method.
Appendix A: Procedure for determining µ( f ) using a Metropolis-Hastings random walk Appendix B: The fit model for µ ( f ) Furthermore recall that the squared fidelity to a pure reference state |ψ Ref can be written as the expectation value of the observable |ψ Ref ψ Ref |.Denote by ρ Ref a relevant reference point where the figure of merit is extremal, and let a Ref j such that ρ Ref = (1/d)1 + ∑ j a Ref j A j .(We recycle the notation ρ Ref since whenever the figure of merit is a distance measure to a reference state, the same reference state is to be used here.)We go to hyperspherical coordinates (r, Ω) centered at the reference point a Ref j , with d M a j = dr dΩ r M−1 , and introduce the change of vari-ables r → r = f (r, Ω):

F 2
(ρ, |ψ Ref ψ Ref |) = tr(ρ|ψ Ref ψ Ref |).However, if the figure of merit is the fidelity or purified distance to a mixed reference state, it does not satisfy in general the Ansatz (B5).
Appendix C: The quantum error bars Now we proceed to transform the parameters (a 2 , a 1 , m) into more meaningful quantities, corresponding to distinctive features of the corresponding function.Consider the function y(x) = −a 2 x 2 − a 1 x + m ln(x) + c , (C1)

FIG. 5 .
FIG.5.Model function for µ( f ) as skewed Gaussian.a.The function y(x) = −a 2 x 2 − a 1 x + m ln x + c (used to model ln µ( f )) can be seen as a skewed parabola, with a "skewing operation" parametrized by γ.The green curve y deskewed (x) is an actual parabola.b.The model function y(x) is fully characterized by the location of the peak x 0 , the half width ∆ of the "de-skewed" parabola at relative height 1/e, and a parameter γ characterizing the shift of the sides of the peak at relative height 1/e.c.The variable f relates to x via a simple shift and possible reflection, given by x = s ( f − h) for a constant h and s = ±1 depending on the figure of merit (see TableIof the main text).Hence, with f 0 = s x 0 + h, the curve µ( f ) is fully characterized by the parameters ( f 0 , ∆, γ), which we call the quantum error bars.
and σ ∈ R f +wδ .If the forward direction inequality is used in (D1) instead of the reverse, then the same argument above is easily adapted to show that the δenlargement of R f is included in the region R f −wδ , where w is defined in the same way.Note also that the case of the squared fidelity to a pure reference state |ψ ref is given by f(ρ) = F 2 (ρ, |ψ Ref ψ Ref |) = ψ Ref |ρ |ψ Ref ,and is thus the expectation value of the observable |ψ Ref ψ Ref |.More precisely, we have w + = 1, w − = 0 and w = 1, and the inequality direction used in (D1) encompasses larger values for the fidelity in the region; the enlarged set to consider is then simply R f −δ .

FIG. 7 .
FIG.7.Simulated example data analyzed using our procedure, along with the quantum error bars.Two qubits in a noisy entangled state were measured with individual Pauli operators.a. Histogram of the squared fidelity to the target state |Ψ under the distribution which is relevant for the construction of confidence regions using the method of Christandl and Renner[28].The numerical estimate (blue points) were obtained using our procedure based on a Metropolis-Hastings random walk.The theoretical model fits the numerics well (red curve).The quantum error bars characterize the distinctive features of the curve, which is interpreted as applying a "skewing" operation on an exact Gaussian (green curve).Confidence regions may be constructed from this histogram by choosing regions with states that have a certain minimum fidelity to the target state, chosen such that the set has high weight with respect to this distribution.The true value of the squared fidelity of the state from which we have simulated measurements is in fact 0.963; the shift is due to the increasing volume factor towards lower fidelity values.b.The same analysis is applied to the case of the expectation value of an entanglement witness.The witness is chosen to have positive expectation value only for entangled states, with a maximum at the maximally entangled state |Ψ where Ψ|W |Ψ = 2. c.The same analysis is again repeated for the case where the figure of merit is the trace distance to the maximum likelihood estimate.
as specified by TableIof the main text.The respective quantum error bars are indicated on the corresponding plots in Fig.7.

FIG. 8 .
FIG.8.Fit of the logarithm of the numerically obtained histogram.The log scale allows to better appreciate the quality of the fit.The plot shows the residuals of the fit for ln µ ( f ), that is, the difference between the logarithm of the numerical histogram points and the fitted model for the function ln µ ( f ).The low deviation from zero for the central points underscores the quality of our model (and that the error bars may be overestimated).The points at the sides correspond to regions with very low probability and where the numerical estimate is anyway expected to be unreliable.

FIG. 9 .
FIG.9.Comparison with error bars from existing bootstrapping methods.The blue points and red curve reproduce the numerical estimate and fit in Figure7a.We have resampled random measurement outcomes from the maximum likelihood estimate and plotted the resulting reconstructed squared fidelities to the target state |Ψ (green histogram bars).The bias of the curve from our method is due to the increasing volume factor in the direction of lower fidelity values.We see here that our method yields error bars of the same order of magnitude as bootstrapping.However our error bars are well-justified, because they can serve to construct confidence regions for any confidence level.