Experimental single-setting quantum state tomography

Quantum computers solve ever more complex tasks using steadily growing system sizes. Characterizing these quantum systems is vital, yet becoming increasingly challenging. The gold-standard is quantum state tomography (QST), capable of fully reconstructing a quantum state without prior knowledge. Measurement and classical computing costs, however, increase exponentially in the system size - a bottleneck given the scale of existing and near-term quantum devices. Here, we demonstrate a scalable and practical QST approach that uses a single measurement setting, namely symmetric informationally complete (SIC) positive operator-valued measures (POVM). We implement these nonorthogonal measurements on an ion trap device by utilizing more energy levels in each ion - without ancilla qubits. More precisely, we locally map the SIC POVM to orthogonal states embedded in a higher-dimensional system, which we read out using repeated in-sequence detections, providing full tomographic information in every shot. Combining this SIC tomography with the recently developed randomized measurement toolbox ("classical shadows") proves to be a powerful combination. SIC tomography alleviates the need for choosing measurement settings at random ("derandomization"), while classical shadows enable the estimation of arbitrary polynomial functions of the density matrix orders of magnitudes faster than standard methods. The latter enables in-depth entanglement studies, which we experimentally showcase on a 5-qubit absolutely maximally entangled (AME) state. Moreover, the fact that the full tomography information is available in every shot enables online QST in real time. We demonstrate this on an 8-qubit entangled state, as well as for fast state identification. All in all, these features single out SIC-based classical shadow estimation as a highly scalable and convenient tool for quantum state characterization.

Quantum systems are prepared in laboratories and in engineered devices such that their state delicately encodes quantum information essential for achieving goals in both science and technology. Any small adjustments, changes in the environment, or active control all change the state. Yet, an accurate mathematical description of the state is a necessary component for most higher-level tasks. A crucial requirement for ensuring the performance of quantum devices is thus having methods for accurately determining the quantum states that have been prepared. The goldstandard approach for this fundamental task is quantum state tomography (QST) or simply tomography [1]. QST enables the full reconstruction of the system's quantum state from an exponential number of measurements. Often, however, we are not even interested in the full quantum state, but rather certain features, like entanglement across a particular bipartition. Yet, to access such non-linear functions, one would like to have the reconstructed quantum state.
Formally, QST methods use an informationally complete set of measurements to reconstruct the complete description of the quantum state. The optimal measurement for collecting the necessary tomography data has long been known to be the so-called SIC POVMs [2,3]. SIC POVMs are constructed from the minimal number of d 2 measurements for a d-dimensional system, which are arranged in a way that maximizes the pairwise distance in Hilbert space. SIC POVMs are known to exist for several lowdimensional systems [4,5], and for qubits, take the form of 4 non-orthogonal vectors arranged as a tetrahedron in the Bloch sphere, see Fig. 1(d). While SIC POVMs uniquely offer access to the complete tomographic information in every single experimental run (shot), implementing these measurements in practice is very challenging, requiring purpose-built setups [6,7], sequential measurement [8,9], or ancilla assisted schemes [10,11]. Hence, tomography remains almost exclusively performed using the simpler, but over-complete Pauli basis, requiring 3 N orthogonal measurement settings, each with 2 N outcomes for an N-qubit system. The resulting overhead effectively limits full tomography to system sizes of only a few qubits.
From a conceptual point of view, the qubit SIC POVM is favorable, as a single experimental shot already contains complete tomographic information. This distinct advantage has also been recognized in recent theoretical work on adaptive tomography for linear cost functions [11], as well as neural network quantum state tomography [12,13]. Experimentally, it is also much cheaper to repeat the same measurement setting many times than to switch settings an exponential number of times as the system size grows with Pauli tomography. So, this feature can have a significant impact in practice. Moreover, since full tomographic information is contained in every shot, the experimenter is free to stop the tomography at any point, e.g. when certain quantities of interest have converged. In contrast, other QST approaches would require at least one shot for each measurement setting to collect sufficient information in the first place. This discrepancy is particularly rele-arXiv:2206.00019v1 [quant-ph] 31 May 2022 vant when we are not interested in the full density matrix, but only in certain (non-linear) properties, which often require far fewer shots than the 3 N minimum in Pauli tomography [14,15]. Finally, for randomized measurement schemes [16], where ideally a different measurement setting is required in each shot, the SIC approach obviates this requirement completely ('derandomization'), making these schemes even more practical. Hence, in most situations, SIC tomography has the potential to substantially outperform standard methods for tomography or for the direct estimation of state properties.
Here, we describe our realization of SIC POVMs in a trapped ion quantum processor and their use for characterizing unknown quantum states. We put an emphasis on demonstrating the speed and robustness obtained from reducing the number of measurement settings in conjunction with new data processing techniques that come with rigorous accuracy guarantees. With our approach, we are able to comfortably reconstruct the full 8-qubit quantum state encoded in the electronic energy levels of calcium ions in real-time using a standard laboratory computer. Moreover, we demonstrate the simultaneous real-time estimation of Renyi entropies across all bipartitions using a samplingfree classical shadow method [14]. This enables full entanglement characterization of arbitrary (but close to pure) quantum states with orders of magnitude fewer experimental shots than standard QST methods.

I. SIC TOMOGRAPHY
QST aims at reconstructing an unknown quantum state ρ from an informationally complete set of measurements, which spans the entire Hilbert-space of the quantum system. The minimal number of measurement outcomes to reconstruct an arbitrary d-dimensional quantum state then is d 2 . An experimenter performs these measurements on many copies of ρ referred to as experimental shots, and attempts to reconstruct ρ from the observed measurement counts. The standard approach to QST of N-qubit systems combines tomographic measurements of each individual qubit. The over-complete Pauli basis is a particularly prominent choice, see Fig. 1(c). Three distinct measurement settings are required to evenly cover the Bloch sphere and obtain tomographic single-qubit information. Extending this to N-qubit systems produces 3 N distinct measurement settings that need to be explored, see In contrast to (single-qubit) Pauli basis measurements, (single-qubit) SIC POVMs provide access to a complete set of tomographic data from a single experimental shot. No change in measurement settings is required. This desirable feature extends to N-qubit measurements: a single measurement setting per qubit suffices to obtain tomographically complete data, see Fig. 1(b). SIC tomography utilizes (tensor products of) the single-qubit SIC Pauli tomography (a) uses 3 basis measurements per qubit to obtain tomographic information about an unknown N-qubit system, see (c). Each basis measurement is contingent on one of three possible unitary rotations -red boxes in (a). This produces a total of 3 N different measurement settings that need to be accessed. SIC tomography (b), on the other hand, uses the same measurement setting for each qubit, see (d). This non-orthogonal measurement is achieved by isometrically embedding each 2-level system (qubit) into a larger 4-level system (ququart) -blue boxes in (b) -and subsequently measuring this larger system. The experimental realization of this embedding within each ion is shown in Fig. 2.
POVM depicted in Fig. 1(d) (or a local rotation thereof). Following Naimark's dilation theorem [18], every POVM can be realized as a projective measurement on a higherdimensional Hilbert space. Using this result, together with a qudit quantum processor [17], we realize the qubit SIC POVM as a projective measurement on a 4-level system (ququart) employing two more states within each calcium ion. For this purpose, we map each qubit locally to a ququart using the unitarŷ Here, the four two-dimensional vectors contained in the first two columns represent the measurement vectors of the qubit basis given in Fig. 1(d). The optimized gate sequence for mapping the measurement states from qubit to ququart is shown in Fig. 2(b) and consists of five local rotations  40 Ca + ion representing a qubit or ququart with important transitions marked: (blue) dipole transition for cooling and detection, (red) metastable quadrupole transition for encoding qubits and ququarts within the Zeeman sub-manifold and (brown) additional transitions for repumping. (b) Gate sequence for locally mapping the SIC POVM from Fig. 1(d) on qubit-level to four orthogonal basis states of a ququart denoted in (a). This enables full read-out of the SIC POVM in a single experimental shot by means of a four-outcome projective measurement. (c) Experimental realization of SIC tomography comprised of cooling (DC = doppler cooling, PGC = polarization gradient cooling, SBC = sideband cooling), preparation of the state to be analyzed, mapping from qubit to ququart according to the SIC POVM, and finally the four-outcome projective measurement. For the latter, three sequential fluorescence detections (DET) are required [17], see Appendix AI for details.
for each qubit, with phase gates absorbed into the rotation phases. Therefore, our single-setting SIC tomography implementation remains the very same independent of qubit number.

II. RECONSTRUCTION METHODS & CLASSICAL SHADOWS
Even with a complete set of measurements, reconstructing ρ is computationally demanding, especially if one insist on enforcing physicality constraints. Two standard QST methods in the field are linear inversion (lin-inv) and maximum likelihood estimation (MLE) [19], which can be readily applied to both Pauli and SIC data. Lin-inv provides an analytical approach to estimating ρ from a complete set of projectors Π j that span the entire Hilbert space. Access to (approximations of) the associated probabilitiesp j ≈ Tr(Π j ρ) allows us to reconstruct the underlying state: with |Π j the vectorized projector, obtained by stacking the columns of Π j . Further, S −1 p denotes the Moore-Penrose pseudo inverse of the measurement superoperator S p = ∑ j |Π j Π j | [20], andp j = n j /N j is the observed frequency of outcome j after averaging over N j experimental shots. As an unconstrained method, lin-inv version bears the risk of producing unphysical estimators for ρ featuring negative eigenvalues. This is particularly pronounced when few experimental shots are used and is very problematic for estimating non-linear observables. Physical constraints are thus typically introduced through MLE, which, following Ref. [20], can be approximated by a convex optimization problem Here, S = ∑ j | j Π j | denotes a change of basis operator, | f = ∑ j n j N j | j a column vector of the observed frequencies, and W a diagonal matrix of statistical weights W . Optimization is performed under the constraints that the estimator for ρ is positive semidefinite (ρ ≥ 0) with unit trace (Tr(ρ) = 1), i.e. it must be a valid quantum state. The convex optimization in Eq. (3) is computationally more efficient than full MLE and recovers the latter in the limit of large sample sizes. Nonetheless, the computational complexity remains intractable for anything but very small systems. Lin-inv is much more efficient by comparison. Nonetheless, inverting the superoperator S p also becomes more challenging as system size increases. Viewed as a matrix, every tomographically complete Nqubit superoperator S p must have (at least) 4 N rows and (at least) 4 N columns. Performing the inversion row-by-row can offer some relief in terms of memory load, but the exponential number of multiplications remains challenging. Finally, physical constraints (ρ ≥ 0 and Tr(ρ) = 1) can be incorporated into lin-inv by truncating negative eigenvalues to obtain the closest quantum state under the Frobenius norm [21][22][23] referred to as projected least squares (PLS). It should be noted that more principled, yet ever more computationally challenging, approaches exist [24,25].
So far, we considered full QST, i.e. experimentally extracting a complete description of ρ, which is traditionally required for predicting certain properties of complex quantum systems, especially non-linear functions, most prominently purity or entanglement. In large-scale systems, however, predicting such properties becomes very costly independent of the data acquisition (SIC, Pauli) and reconstruction (lin-inv, MLE) method, both in regards to the number of required shots and in regards to the computational power required to analyze the data.
A promising alternative comes in the form of classical shadows [14,15] as a general-purpose method to construct classical descriptions of quantum states using very few experimental shots. Consequently, the classical shadows framework allow for the prediction of L different functions of the state with high accuracy, using order log(L) experimental shots. Importantly, the number of shots is independent of the system size and saturates informationtheoretic lower bounds. Moreover, target properties can be selected after the measurements are completed. A big drawback of existing classical shadow methods, however, is that they require a different measurement to be sampled randomly for each shot [16], which is demanding and slows down data acquisition. We show in the following that SIC POVMs naturally alleviate this sampling requirement ('derandomization'). SIC POVMs are thus an ideal choice for unlocking the full potential of the classical shadows framework. This has, in parts, been already pointed out in Ref. [11], which explores adaptive SIC tomography for linear cost functions inspired by VQE. Instead, we are here interested in a general framework for efficiently predicting general linear and nonlinear properties of the quantum state.
Formally, classical shadows provide an alternative approach for a linear-inversion estimator deduced from SIC measurements on an N-qubit state ρ. Each experimental shot m, containing complete tomographic information, can be be assigned to an size-N stringî m,1 , ...,î m,N ∈ {1, 2, 3, 4} ×N , where each quartic value keeps track of the SIC POVM outcome observed.
For each shot m, an N-qubit estimator for the density matrixσ m = N n=1 3|ψˆi m,n ψˆi m,n | − I is obtained, referred to as a classical shadow. A total of M such estimators can be experimentally inferred and accumulated to approximate ρ asρ A crucial observation here is that, compared to standard linear inversion in Eq. (2), the processing of classical shadows is performed in the dimension of the quantum state 2 N · 2 N , which is the minimal possible dimension for full tomography, see Appendix AVIII A 2. Moreover, predicting linear observables using classical shadows is even more efficient as it suffices to reconstruct a subset of ρ solely where operators act on. In Appendix AVIII B, we show how we can formalize these considerations to derive a measurement budget for estimating linear observables. Suppose that we are interested in estimating a total of L 1 subsystem observables tr(O l ρ), where each O l only acts non-trivially on (at most) K ≤ N qubits. Then, measurements suffice to jointly ε-approximate all observables with probability (at least) 1 − δ . We emphasize that this is a novel, rigorous a-priori bound based on minimal assumptions. In practice, convergence sets in (much) earlier. A full derivation and additional context is provided in the Appendix. For now, we merely point out that improvements of order 2 K are possible for the exponential scaling in case the observables in question have small Hilbert-Schmidt norm, as is the case for fidelities. Apart from linear observables, classical shadows also promise to allow for efficient estimation of non-linear functions, see Appendix AVIII C. Whereas the full scope of non-linear functions is covered in the Appendix, here we focus on a quadratic estimator in form of the (subsystem) puritŷ as purity generally obeys hard convergence properties and also provides a means to measuring entanglement via Rényi-entropies (see below). The latter are given by the negative logarithm of subsystem purities, corresponding to certain bipartitions of the state. Similar to before we can derive a measurement budget, where measurements allow for ε-approximating L subsystem purities of size (at most) K with probability (at least) 1 − δ . We emphasize that this is again a novel, rigorous apriori bound based on minimal assumptions, actual convergence sets in much quicker. However, this bound still marks an improvement over the best available results for purity estimation with randomized single-qubit measurements [26,27]. The improvement follows from exploiting the geometric structure of SIC POVMs and we refer to Appendix AVIII C for additional context and complete proofs. These results demonstrate that classical shadows in combination with SIC measurements offer a powerful tool set for measuring entanglement in a scalable fashion [26,27]. Whereas our experimental studies will primarily focus on quadratic estimators of Rényi-entropy (Eq. (9)), classical shadows can be extended to higher order estimators following the same principles: (i) rewrite a degree-d polynomial as tr Oρ ⊗d , (ii) replace each ρ with an independent classical shadowσ m and (iii) average over all different sub-selection of distinct classical shadows. We refer to Refs. [14,26,27] for details. Finally, we remark that classical shadow estimators from Pauli basis measurements can in principle be obtained by randomly sampling over the measurement settings from shot to shot. Although experimentally feasibly, this is highly impractical. Remarkably, the Pauli basis also leads to slower convergence than SIC measurements.

III. EXPERIMENTAL SETUP & SIC IMPLEMENTATION
Experimental results in this work are obtained on a trapped-ion quantum processor based on a linear string of 40 Ca + ions, each encoding a single qubit in the (meta-)stable electronic states {S 1/2 (m = −1/2) = |0 , D 5/2 (m = −1/2) = |1 } [28]. A universal set of quantum gate operations is realized upon coherent laser-ion interaction and comprises of arbitrary local single-qubit rotations together with two-qubit entangling operations, enabling all-to-all connectivity. A binary qubit measurement is implemented by scattering on the dipole transition, where fluorescence is only observed if the ion is in the |0 state, thereby separating the computational basis states {S 1/2 (m = −1/2) = |0 , D 5/2 (m = −1/2) = |1 }; see Fig. 2(a). Equivalent control over the entire S-and D-state Zeeman manifold allows for encoding a higher dimensional quantum decimal digit (qudit) with up to 8 levels in each ion, combined with full fluorescence read-out of the whole qudit space [17], see Appendix AI.
The present work builds on this capability by utilizing up to four levels per ion to implement SIC POVMs. To this extend, two additional levels D 5/2 (m = −3/2) = |2 and D 5/2 (m = +1/2) = |3 are taken into account, see Fig. 2(a). Upon applying the mapping sequence depicted in Fig. 2(b), each qubit is locally extended to a ququart where each basis state encodes one SIC-vector. A fouroutcome projective measurement is implemented by three sequential fluorescence detections, where before the second detection the population between state |0 and |1 is flipped and likewise before the final detection the states |0 and |2 are flipped. This enables us to evaluate the full ququart state probabilities from three binary outcomes. The entire experimental sequence comprised of cooling, state preparation, mapping the SIC POVM to the ququart and four-outcome read-out is shown in Fig. 2(c) to which we refer as a single experimental shot. We remark that this SIC tomography procedure works independently on each qubit and that such a single experimental shot delivers the complete tomographic information of the N-qubit system.

IV. RECONSTRUCTION TIME
While SIC POVMs can significantly speed up data acquisition, the classical resources needed for reconstructing and storing the quantum state ρ is typically an additional bottleneck in QST (see Eq. (2) and Eq. (3)). In the following, we compare the computational time for reconstructing ρ following various tomography approaches. For the moment, we solely focus on reconstruction time and discuss the convergence properties of the various methods later and in the Appendix AII. For a system of N qubits, we consider tomography data comprised of M = 100 · 3 N shots. This corresponds to 100 shots for each measurement set-ting used in Pauli tomography, which, on the trapped ion platform, has proven to be a good trade-off accounting for statistics, systematic drifts, and measurement time.   Fig. 1(c)) and SIC tomography ( Fig. 1(d)), as well as SIC-based classical shadows (Eq. (4)) as a function of the system size. For each qubit number N, the reconstruction considers M = 100 · 3 N experimental shots (i.e. 100 shots per Pauli basis, see text). The analysis is conducted on a standard desktop computer and plotted double logarithmically in the number of experimental shots. We find both MLE methods to require the highest computational resources, with SIC MLE significantly faster as the processed dimension is lower with 4 N ·4 N compared to 6 N ·4 N for Pauli tomography. Lin-inv with SIC measurements shows an initial time-offset to Pauli tomography arising from handling ququart (dim = 4 N ) instead of qubit data (dim = 2 N ), but grows much more slowly with system size. Among the lin-inv methods, we find the classical shadow reconstruction to be the fastest for an increasing number of qubits, as it accumulates each individual shot in dimension 2 N · 2 N (see Eq. (4)), avoiding the costly matrix inversion in dimension 4 N · 4 N or 6 N · 4 N as required for SIC and Pauli tomography, respectively. Figure 3 illustrates the classical reconstruction time versus the number of qubits N for experimental data used throughout this manuscript. Whereas absolute time reflects a laboratory desktop computer, relative scaling between methods remains generally valid. Note that the plot is double-logarithmic in the number of shots M = 100 · 3 N . While MLE methods always obey physical constraints, solving the convex optimization problems is costly and only feasible for small system sizes. We find MLE with SIC measurements to be more efficient, due to handling matrices of maximum size 4 N · 4 N , in contrast to 6 N · 4 N for the over-complete Pauli basis. However, MLE quickly becomes infeasible as the number of qubits increases. SIC lin-inv suffers an initial offset to Pauli lin-inv due to computing ququart instead of qubit state probabilities, which for the MLE approaches was masked in the overhead of Fig. 4. SIC-based classical shadow tomography of a 5-qubit AME state. We compare the convergence of lin-inv (Eq. (2)) and classical shadow tomography using SIC measurements (Eqs.(4)-(6)). We additionally use PLS following Ref. [21][22][23], explained in text. (a) In terms of convergence, we find both SIC lin-inv from Eq. (2) and SIC shadows lin-inv from Eq. (4) to overlap which is expected due to their similarity. We remark that in terms of reconstruction SIC shadows lin-inv is computationally more efficient, see Fig. 3. PLS incorporates physical constraints (Tr (ρ) = 1 and ρ > 0), but underestimates the fidelity for small number of shots and converges very slowly requiring at least an order of magnitude more shots (indeed slower than MLE, additionally confirmed by experiments in Appendix AII and numerical simulations in Appendix AVI). Shaded regions illustrate 1 standard deviation around the mean of multiple batches of equal shot numbers. (b) When estimating purity, lin-inv shows highly unphysical behaviour under insufficient statistics and not yet converged in the plot. Physical constraints can be corrected by PLS, although at the cost of much slower convergence. On the other hand, classical shadow purity estimators (Eq. (6)) display very quick convergence. (c) Estimating Rényi-entropies (Eq. (9)) from subset purities across all bipartitions converges very quickly for all methods. The lack of difference between the methods is likely due to the small system sizes, and for increasing bipartition sizes we expect similar behaviour as in (b). Note that bipartitions are denoted by tuples (2,3)-qubits and (1,4)-qubits referring to the number of qubits in each part. convex optimization from Eq. (3). As the number of qubits increases, SIC makes up for this, as the computations are performed in a smaller dimension. Although computationally much cheaper than MLE, even lin-inv is becoming increasingly costly due to the memory requirements of processing the inverse superoperator S p from Eq. (2). Already for 6 qubits this requires 268.4 MB and 3100 MB for SIC and Pauli measurements, respectively. Scaled up further, this will rapidly exceed the memory of today's computers. Alternatively, inversion of S p could be done rowby-row to reduce memory load, but this would be more time-consuming than pre-calculating the inverse S −1 p as we have done here. While linear observables under lin-inv are proven to quickly converge (see Appendix AVIII B), nonlinear functions suffer from the nonphysical properties in the form of negative eigenvalues, see Fig. 4(b). Furthermore, PLS adds negligible computational overhead over lin-inv and is thus neglected in this comparison. PLS does, however affect the convergence and accuracy of the estimators, as shown and discussed further below.
Finally, we find the best scaling for the SIC-based classical shadows from Eq. (4), where data is processed at the dimension of the density matrix, 2 N · 2 N , avoiding matrix inversion or optimization altogether. Instead, individual experimental shots are accumulated, offering convenient updates of ρ for every new set of data as where further below. As a consequence of this individual accounting for every shot, the computational complexity of this method grows linearly with the total number of shots. While this linear overhead leads to slightly worse performance for very small systems, it is more than compensated by the improved exponential scaling with qubit number (2 N · 2 N vs. at least 4 N · 4 N ) for large systems. Hence, the SIC-based classical shadows clearly outperform all other methods for 6 or more qubits. Note, that for large-scale systems the gap between classical shadows and standard lin-inv becomes even bigger as row-by-row lin-inv becomes requisite.

V. ESTIMATING PROPERTIES OF THE STATE
Here we shift our attention towards convergence of the different tomography estimators. In particular, we showcase the classical shadows' unique feature of efficiently predicting non-linear properties of even large-scale quantum systems. To this end, we experimentally perform tomography on a 5-qubit AME state [29] 2 AME states are the most entangled states in the sense that they are maximally entangled in all bipartitions [30]. This makes them interesting for applications in quantum error correction [31], quantum teleportation, quantum secretsharing and superdense coding [32]. Alas, their general existence remains unknown for all but the smallest systems. We prepare a locally-rotated GHZ state and probe the state via live-update SIC tomography. The local π/4 rotation ensures that the state is not aligned with any tomography basis and can serve as a proxy for an arbitrary entangled state. Data analysis is performed in parallel to data acquisition to get the quickest possible feedback. Shot-by-shot reconstruction is realized via classical shadows (Eq. (4)-(A20)). (a) Experimental results on purity (left), fidelity (middle top), and all (2,6)-qubit bipartition Rényi-entropies (middle bottom) obtained live in a time-regime where the data acquisition time dominates. We find the purity from classical shadows to have converged after less than 1000 s, with the other measures converging significantly faster. Fidelity converges very fast below 500 s and Rényi-entropies saturate almost immediately. Note, that for the latter, the curves for SIC shadows lin-inv and SIC shadows quadratic overlap due to the fast convergence for small subsystem sizes. (b) Comparison of the time for data acquisition ( ) following the expected linear curve along times-curves for data analysis utilizing different methods and metrics covering the entire 12 500 s of data taking. When analyzing fidelity, parity and all (2,6)-qubit bipartition Rényi-entropies (•) as shown in the experimental results of (a), live update is possible for around 2500s. This is expected, since the shot-by-shot data analysis scales quadratically in total shot number, see Eq. (6). Estimating only Rényi-entropies for all bipartitions ( ), on the other hand, remains feasible for the entire time as it overlaps with data acquisition ( ) following the linear curve. Additionally, we simulate the estimation of all Rényi-entropies of an 18-qubit state (× × ×) by considering the data acquisition overhead ( ), demonstrating that also this in principle remains feasible to perform live.
We characterize the 5-qubit AME state from Eq. (8) using SIC tomography data. The results in Fig. 4(a) indicate that both lin-inv methods converge very quickly in fidelity, as is likewise expected for all linear observables, see Appendix AVIII B. In terms of convergence SIC lin-inv from Eq. 2 and SIC shadows lin-inv from Eq. 4 expectantly overlap due to their similarity, whereas the latter was found to be more efficient in terms of reconstruction, see Fig. 3. In contrast, incorporating physical constraints with PLS drastically slows convergence [23], because truncation of negative eigenvalues produces a bias. We repeatedly confirm this bias by experiments in Appendix AII as well as numerical simulations in Appendix AVI. We further estimate the state's purity, as an example of an archetypal non-linear function of the full state. Here lin-inv shows highly unphysical results under insufficient statistics which not yet converged in the plot, while the classical-shadow purity estimators from Eq. (6) converges rapidly after only about 3000 shots, see Fig. 4(b). Finally, most inspired by applications, we probe the state's entanglement by estimating all second-order Rényi-entropies with the reduced density matrix ρ A for part A of a bipartition (A,Ā) together forming ρ [33]. In Fig. 4(c), we present results on Rényi-entropies following Eq. (6) and particularly cover all bipartitions denoted by tuples (1,4)qubits and (2,3)-qubits, referring to the number of qubits in each part. Note that, for classical shadow prediction, only the subset qubits in the smaller partition need to be taken into account. This leads to a drastic speed-up of the analysis for larger scale systems. Moreover, all predictions can be analyzed after the data has been acquired.
For comparison, we also analysed the AME state with Pauli tomography, see Appendix Fig. A2. Slowest convergence is consistently found for PLS, which is also notably slower than regular MLE. We additionally confirm this by numerical simulations considering only shot noise, i.e. statistical noise, see Appendix AVI). Given only few experimental shots, SIC tomography outperforms Pauli tomography in case of MLE reconstruction, likely because the SIC POVM provides the optimal information gain per shot. Curiously, however, for large shot numbers, Pauli MLE starts to outperform SIC MLE in terms of convergence. We suspect this to be a result of the over-completeness of the Pauli basis, where very large shot numbers may lead to improved accuracy for each orthogonal direction. Note that all methods converge to the same point, as verified in numerical simulations, see Appendix In conclusion, we find SIC tomography to be preferable over Pauli tomography in regards to both, classical computation time, and convergence speed. At the same time, the underlying classical shadow formalism provides the potential for scaling to large quantum systems. We emphasize that the moderately lower quantitative performance of SIC tomography observed in our data is not inherent to the method, but due to experimental imperfections, i.e. the additional overhead in mapping SIC POVMs to ququarts, and the four-outcome read-out, see Appendix AI. These technical imperfections can be overcome on future devices.

VI. LIVE-UPDATE TOMOGRAPHY
Since the SIC POVM contains full tomographic information within each shot, it provides a unique way to speed up QST. Combined with classical shadows, which work by accumulating estimates in each shot according to Eq. (4), SIC tomography can be performed in real-time (or 'online'), i.e. a live update is performed for every new set of experimental shots. Apart from reducing time overheads by performing experiment and analysis at the same time, this approach has the big advantage that the experiment can be stopped as soon as all properties of interest are (believed to be) accurately estimated. Based on these ideas, we demonstrate a live reconstruction of a maximally-entangled 8-qubit state. Specifically, we use a GHZ-state with an additional local π/4 rotation of all qubits as a proxy for a generic maximally-entangled state that is not aligned with any of the tomography bases to provide a fair comparison. Figure 5(a) presents results on the estimation of fidelity, purity and Rényi-entropies (Eq. (9)), the latter for all possible (2,6)-qubit bipartitions. Purity of the full 8-qubit state is found to converge the slowest, after around 1000 s. This is still a drastic speedup compared to lin-inv, which, after an order of magnitude longer averaging time still produces unphysical results. Fidelity converges after less than 500 s and Rényi-entropies saturate almost immediately on the presented time-scale. We remark that both curves for SIC shadows lin-inv and SIC shadows quadratic overlap as a consequence of the fast converging 2-qubit subsets, see Appendix Fig. A3 on bigger bipartitions. In Fig. 5(a), live updates are tracked up to 2500 s, which is the the limit beyond where the analysis starts to take longer than the data acquisition (due to the quadratic scaling in the number of shots for purity estimation, see Eq. (6)). We extend the discussion about the time relation between data acquisition and data processing in Fig. 5(b), where we acquire 100 experimental shots in about 2.4 s and show the entire 12 500 s of data taking. Over this time, we observe that the computational time nicely follows the expected quadratic growth with shot number, relating to the number of approximated ρ's to compare in Eq. (6). Dropping the full-system purity, we find that all bipartition Rényi-entropies can be estimated in real-time throughout the entire time of data taking. On top, simulations suggest that even on an 18qubit state all bipartition Rényi-entropies can be estimated live for around 1000 s.
In practice, we accumulate 100 shots for each classical shadow approximating ρ following Eq. (6) which we refer to as batching. This post-processing trick enables a trade-off between computational time and convergence speed, which is studied thoroughly through numerical simulations in Appendix Fig. A4. A suitable batch-size must be decided on a case-by-case basis. From a practical point of view, the experimental noise might also fall into consideration as it affects the targeted accuracy. Thus, batching experimental shots for the analysis comes as a handy tool for reducing analysis time with a limited effect on convergence and thereby extends the window for doing real-time analysis.

VII. DISCUSSION & OUTLOOK
We have demonstrated that real-time SIC tomography enables the prediction of Rényi-entropies in less experimental shots than required for a minimal Pauli tomography implementation (see Fig. 5). Depending on the state and estimator, SIC tomography has the potential to significantly speed up the prediction of many more properties of quantum states. Beyond Rényi-entropies, we challenge SIC tomography in a state identification game partly inspired by Ref. [34]. We show that it excels over other state-of-the-art methods (see Appendix Fig. A5). For the challenge in question, SIC tomography required less than 20 shots to correctly identify a randomly chosen 4-qubit linear cluster state among 16 orthogonal state possibilities, clearly outperforming any Pauli QST method. Note that, such a speed-up in favour of SIC tomography will become even more pronounced on up-scaled systems.
Though SIC POVMs are optimal for tomography [4], they are not known to exist in every dimension. Indeed, just resolving this issue would have far-reaching consequences for the foundations of mathematics [35]. For a given Hilbert space of dimension d = 2 N , a SIC constructed in that space is referred to as a global SIC, whereas a measurement composed of N two-dimensional SICs is referred to as a local SIC. If they indeed exist, global SIC tomography would be sample-optimal in the sense that it saturates fundamental lower bounds from information theory [23,36] (joint measurements across many state copies could still yield further improvements [36,37]). Global SIC measurements would, however, be challenging to realize on quantum hardware. Quadratic circuit sizes (in the number of qubits) may be necessary, because the associated SIC states form a 2-design -a concept closely related to chaos and information scrambling [38]. And recent works provide lower bounds on the minimum circuit depth required to achieve information scrambling [39]. We conclude that, although local SICs may be less efficient than global SICs in terms of measurement complexity [23,40], they are much cheaper to implement. In addition, they are informationally complete and optimal amongst all possible local measurements [40].
To further improve system predictions, one might want to adapt the measurement basis depending on the state to analyse, potentially reducing quantum shot noise and offering a speed-up in convergence. The concept of adaptive tomography follows those means, where the measurement setting is adjusted based on the outcomes of prior measurements [11,[41][42][43][44][45]. In the most extreme cases, the measurement settings are changed after each shot. Although this has been demonstrated in some settings [46][47][48][49][50], it requires both additional classical computation and physical setting adjustments, rendering it potentially very time consuming.
In contrast, the SIC POVM representation from Fig. 1(d) turns out to be very efficient and practical. Moreover, we studied convergence properties of states with different overlap to either SIC POVM or Pauli basis, see Appendix AVII. Interestingly, purely local states analyzed by SIC significantly increase convergence when a component along one of the non-orthogonal SIC POVMs vanishes obeying the concept of unambiguous state discrimination [51]. For randomly aligned, or correlated states, however, the effect vanishes making the local rotation of the SIC POVM irrelevant. However, for estimating Pauli observables, the rotation does matter, with the optimal alignment given such that the overlap with the Pauli basis is symmetric, see Appendix Fig. A10(b). This rotated SIC will be notably useful for VQE applications, as those rely on efficient Pauli observable measurements.
Finally, we emphasize that the combination of SIC POVM measurements with the classical shadow formalism is well-suited for directly estimating higher-order polynomials of an unknown density matrix ρ. As discussed in Appendix AVIII D and following ideas from Refs. [26,27], this opens the door for mixed-state entanglement characterization of large-scale systems in real time, a likely requirement in the development of scalable quantum technology.
Note: In the final stages of this project we became aware of independent and complementary research using SIC POVMs on a superconducting quantum processor [52]. Experimental implementations here and in the main text are performed on a trapped-ion quantum computer, which is schematically shown in Fig. A1(a). The device operates on a string of 40 Ca + ions stored in ultra high vacuum using a linear Paul trap. Each ion acts as a qubit encoded in the electronic levels S 1/2 (m = −1/2) = |0 and D 5/2 (m = −1/2) = |1 denoting the computational subspace [28]. Quantum state manipulation is realized upon coherent laser-ion interaction. A universal gate-set comprises of addressed single-qubit rotations with an angle θ around the x-and the y-axis of the form R σ j (θ ) = exp(−iθ σ j /2) with the Pauli operators σ j = X j or Y j acting on the j-th qubit, together with two-qubit Mølmer-Sørenson entangling gate operations MS i, j (θ ) = exp(−iθ X i X j /2) [53]. Multiple addressed laser beams, coherent among themselves, allow for arbitrary two-qubit connectivity across the entire ion string [17]. Optionally, all ions can be addressed simultaneously using a global beam to enable both collective local operations as well as collective entangling operations MS(θ ) = exp(−iθ ∑ j< X j X /2). We choose whatever is more efficient for the underlying experiment. Initial state preparation in |0 is reached after a series of Doppler cooling, polarization-gradient cooling and sideband cooling. Read-out is realized by exciting a dipole transition coupled to the lower qubit level |0 and collecting its scattered photons, from which the computational basis states |0 and |1 can be identified. Thereby, a qubit's state is revealed by accumulating probabilities from multiple experimental runs. The dipole laser collectively covers the entire ion string, which enables a complete read-out in a single measurement round. Additional pump lasers support efficient state preparation as well as cooling and prevent the occupation of unwanted meta-stable states outside the computational subspace {|0 , |1 }. Beyond qubit level, we hold equivalent control over the entire S-and D-state Zeeman manifold with up to 8 levels in each ion, allowing us to encode a higher dimensional quantum decimal digit (qudit), see Fig. AI(b). In this work we make use of up to four levels, denoting a ququart via additionally employing D 5/2 (m = −3/2) = |2 and D 5/2 (m = +1/2) = |3 alongside both qubit states. Ququart read-out can be performed via three consecutive fluorescence detections, where before the second detection the population between |0 and |1 is switched and before the third and final detection the population between |0 and |2 is switched, see main text Fig. 2(c). Combining this three binary outcomes enables us to evaluate the ququart state probability within a single experimental run. Note that, fluorescence detection in case of measuring a bright state heats up the ion string due to photon scattering. This is counter acted by a sequence of Doppler and polarization-gradient cooling after each individual detection to keep the quality of post measurement bit-flip operations high and thereby suppress detection errors.
The last paragraph of this experimental setup section is dedicated to technical errors limiting our tomography experiments. Performance on SIC tomography is generally found to be moderately lower compared to Pauli tomography, see main text Fig. 4. Evidently, this performance decrease is not inherent to the method, but rather owed to technical errors for two main reasons: 1) The SIC tomography implementation generates an overhead of 5 local pulses per qubit used for mapping qubit to ququart, depicted in Fig. 2(b), as well as two additional bit-flips realizing the four-outcome read-out, see Fig. 2(c). In contrast each Pauli setting requires just 1 local pulse per qubit, yet requiring three orthogonal measurements per qubit to extract full tomography information. Our trapped-ion setup has a single-qubit gate fidelity of 0.9994(3) estimated from Randomized benchmarking as well as 2-qubit gate infidelity of roughly 0.98(1) estimated from a decay of fully entangling MS-gates [17]. The latter two-qubit gate fidelity might slightly fluctuate from pair to pair. Moreover cross-talk to adjacent ions have an influence. 2) After each insequence-detection the CCD camera demands for a 3 ms pause to process the data, before the upcoming sequence continues. However, we utilize this time for re-cooling the ionstring via Doppler and polarization-gradient cooling. During this pause the ions are exposed to amplitude damping due to spontaneous decay from the upper D-states states, having a life-time at about 1 s. Accounting for all this throughout measurement taking, we observe a loss of fidelity per qubit between Pauli and SIC tomography of less than 1 %. Thus, extracting complete tomography information in a single experimental run, i.e. shot, comes at the expense of a more complex experiment, that however is of technical nature and can be overcome on future devices with reduced single-qubit error-rates as well as with faster processing CCD-cameras, which nowadays already exist. More importantly, only SIC tomography offers the unique potential to predict non-linear properties in large-scale systems, as pointed out here further below and in the main text.

AII. COMPARING TOMOGRAPHY METHODS
We start off by presenting complementary experimental data covering the 5-qubit AME-state from main text Fig. 4. Whereas in the main text the focus was on scalable approaches, especially SIC-based classical shadows, here we compare these results to Pauli tomography according to Fig. 1(a),(c) using linear inversion and MLE reconstruction following Eq. (3). Generally, linearly reconstructed density matrices exhibit negative eigenvalues in the case of insufficient statistics, which manifest themselves particularly in unphysical values for non-linear functions of the density matrix, as indicated by purity values above 1. Physical constraints can be imposed on lin-inv, through truncation of negative eigenvalues following Ref. [21][22][23], which we refer to as projected least squares (PLS). Importantly, PLS adds only a negligible computational overhead to lin-inv. Consequently, the covered set of tomography approaches is representative in the field of quantum computation and quantum information. Figure A2 depicts results on fidelity, purity, and negativity, with the latter being a common measure of entanglement, albeit one that is challenging to access with experiments.
Note that for 5-qubit Pauli tomography 3 5 = 243 settings are required, where for this particular case each setting was repeated a 100 times, leading to the stated maximum shot number of 24 300. 100 shots per setting prove to be a good tradeoff in the trapped ion platform accounting for both statistics and systematic drifts in the experiment. The moderately lower performance (in terms of the numerical values) of SIC tomography is not inherent to the method, but comes from technical imperfections due to experimental overhead. Particularly, the mapping of the SIC POVM to ququart states, as well as the four-outcome read-out, both essential for single-setting tomography, add experimental complexity, see Appendix AI. Figure A2(a) shows that the fidelity converges very quickly for lin-inv approaches, as is expected for general linear operators, see Appendix AVIII B. Interestingly, Pauli MLE performs worse than all other methods at very few shows, while SIC-based methods perform very well, even for low shot numbers, since this measurement extracts the maximal amount of information for a generic state. PLS on the other hand, despite the computational efficiency, converges much slower than the other methods, even including MLE [23]. For quadratic measures in Fig. A2(b) it becomes clear that lin-inv methods produce highly unphysical results which take a long time to converge, limiting the usefulness of these methods in practice. PLS solves this problem, but again shows very slow convergence. Both problems are solved by the SIC-based classical shadow purity estimator from Eq. (A20), which demonstrates both fast convergence, and accurate (physical) estimates. Finally we study negativity as a commonly used measure of quantum entanglement [54]: where ρ Γ A represents the partial transpose with respect to subsystem A of a bipartition (A,B) together forming ρ. The 1-norm in Eq. (A1) denotes the absolute sum of all negative eigenvalues given by ρ Γ A . By construction, the partial transpose of a separable state cannot have negative eigenvalues, such that the negativity vanishes. Quantum negativity is an entanglement monotone: if it is positive, then the underlying state must be entangled. The converse, however, need not be true in general [55]. Comparison between SIC and Pauli tomography for a standard set of reconstruction methods. Using the 5-qubit AMEstate from main text Fig. 4, we analyze fidelity, purity and negativity from Eq. (A1). Due to the increased experimental complexity, we observe that SIC methods generally converge to slightly lower values than Pauli, however this is not inherent to the method, but rather the implementation. (a) Fidelity converges quickest for lin-inv approaches, as expected for general linear functions, see Appendix AVIII B. Convergence of PLS, however, is very slow compared to all other methods. SIC MLE at low shot numbers performs better than Pauli MLE as SIC measurements provide the maximum information gain. (b) SIC-based classical shadows ("SIC shadows quadratic") demonstrate the best convergence for purity. MLE methods converge similarly to (a), but lin-inv methods show very slow convergence with highly unphysical results, making their use problematic in practice. (c) As a commonly used entanglement measure, we evaluate the negativity of the reconstructed states, see text for details. As expected, this property converges the slowest, since it is a global property of the density matrix (making this non scalable), with lin-inv again showing highly unphysical results. Bright shaded regions represent 1 standard-deviation around the mean-value when averaging multiple sets at the given shot number.
As in the main text, we consider the bipartition [2,3] for the 5-qubit system. We find a significantly slower convergence than for Rényi-entropy (see Fig. 4). This is due to the requirement for processing the entire density matrix ρ for quantum negativity as opposed to the classical shadow subsystem purity estimator, which is only evaluated on the smaller partition. The same convergence behaviour is confirmed by numerical simulations discussed in Fig. A7 below. Classical shadows, on the other hand, allow for tighter classifications of entanglement [26,27]. The key idea is to probe the presence of negative eigenvalues in the partial transpose by comparing degree-d polynomials in the underlying density matrices. As d increases, these tests become tighter and eventually recover the negativity condition for entanglement (N (ρ) > 0). Classical shadows allow the estimation of all polynomials involved, but the classical post-processing cost becomes less and less favorable as the polynomial degree d increases [27].
We emphasize that from the given set of tomography schemes, SIC-based classical shadow estimators deliver the best results in terms of both convergence as well as practicability. MLE reconstruction typically fails due to lack in computational power and lin-inv neglects physical constraints -a shortcoming that becomes very pronounced for nonlinear observables. Incorporating physical constraints by projection (PLS) remains computationally efficient, but leads to considerably poorer convergence behaviour. Finally, classocal shadows are the only approach for efficiently predicting non-linear functions, such as mixed-state entanglement [26,27] of large-scale systems.

AIII. COMPLEMENTARY RESULTS ON ROTATED 8-QUBIT GHZ-STATE
In the live update discussion of main text Fig. 5 all two-qubit bipartitions were evaluated on top of fidelity and purity. Here we analyzed the same data in post-processing to present results on all bipartitions ([1,7]-qubits, [2,6]-qubits, [3,5]-qubits and [4,4]-qubits). This evaluation for all possible pairs was not possible in real-time on a standard desktop computer. We average until 50,000 shots (roughly 1200 s of data taking), where the SIC based classical shadow purity estimator (Eq. (A20)) of the 8-qubit state, representing the most demanding property, has converged. Data was taken for a total of 12 500 s. Note that at around 40,000 shots almost no change is visible in the classical shadow purity as well as the respective standard-deviation. The latter is due to systematic experimental drifts over the course of the long time measurement. Here we batched 100 shots for each approximate ρ following Eq. (4), which speeds up the analysis without significant loss in accuracy, see Appendix AIV for a thorough study of batch-sizes considering both analysis time and accuracy. We also observe that the convergence of bipartitions from lin-inv become significantly slower than classical shadows as the subsets get bigger. For PLS individual bipartitions even visually separate, indicating very slow convergence behaviour, confirming similar observations throughout the main text and appendix. Pauli tomography was neglected here as all bipartition Rényi-entropies from SIC-based classical shadows have converged faster than the time it would take to just obtain a single measurement per Pauli setting. Moreover, there is no efficient way of adding physical constraints to the reconstructed state ρ as MLE for an 8-qubit state is unfeasible on standard desktop computers, and PLS delivers significantly worse convergence. The findings here agree with those of the 5-qubit AME-state, previously discussed. SIC-based classical shadow estimators demonstrated to be most suitable for predicting non-linear functions towards larger system-sizes. Apart from the exemplified quadratic measures utilized in purity and Rényi-entropy, classical shadow estimators support higher polynomial functions following the same principles [26].

AIV. CLASSICAL SHADOWS CONVERGENCES AND PRACTICABILITY
In the live update studies from main text Fig. 5(b), we demonstrated real-time analysis of ongoing SIC tomography experiments, until all properties of interest were accurately estimated. How long the analysis can be performed in real time, however, depends on the size of the subsystems to be analyzed, as well as the type and number of functions to be estimated in parallel. Whereas time consumption for linear observables by means of SIC classical shadows remains constant over the course of data acquisition -individual experimental shots are simply processed and accumulated according to Eq. (A15) -non-linear functions generally require higher order products of all combinations from the given set of shots, see Eq. (A20). The resulting scaling is governed by the maximum polynomial order of the function in question minus one. For quadratic functions, in particular, the computation time for every new shot grows linearly with the number of already accumulated shots M. Hence, over the course of the data-acquisition, this can eventually become computationally demanding, especially for large problem sizes where a large number of shots must be accumulated. Thus, to perform non-linear function analysis in real-time, one either keeps subsystems (and by that shot requirements) relatively small, or uses a so-called batching approach, where multiple shots are bundled to estimate a more accurate ρ and thereby reduce the number of costly higher order product combinations. In this section, we discuss the potential, as well as limits of increasing the batch-size in contrast to comparing single shots. To develop a quantitative statement we perform simulations on a 3-qubit linear cluster state considering only statistical quantum projection noise. Along those lines, simulated tomography data is sampled from a multinomial distribution considering different sample sizes to mimic experimental shots. Each noisy set of tomography data is then reconstructed by means of SIC-based classical shadows, in particular, focusing on the quadratic purity estimator from Eq. (A20). Since we are only interested in changes in convergence behaviour, the choice of state does not affect the qualitative statement of these numerical simulations. We compare 100 different shot numbers for 100 different batch-sizes in a practical regime for 3-qubit states. On the one extreme we compare all combinations of shadows obtained from a single shot each, which is known to be statistically optimal (see also the discussions around Eq. (A20)). On the other extreme we compare just two shadows, each linearly more accurate, since they are obtained from averaging half the shots. In between we can trade-off the quality of the individual estimators versus the number of comparisons between estimators. By additionally accounting for analysis time a sweet-spot can be determined on a case-by-case basis.
We analyze the convergence behaviour of these different strategies in Fig. A4 by means of the standard deviation of the classical shadow purity estimator, when repeating every point in the 2d-grid 100 times. We plot "shots" against "batchsize" using a logarithmic color coding. Darker regions refer to a more accurate purity estimate for the underlying state. A region at a specific color always shows a certain almost vertical extension, indicating that batching has a very small effect on accuracy. At the same time, however, it offers to speed up the data analysis significantly. We note that the observed pattern qualitatively remains the same for bigger system sizes and accordingly more shots. Especially for bigger subsystems, where a lot of statistics is required, batching has the potential to significantly speed up analysis. We made use of this method in main text Fig. 5, where 100 experimental shots were used for each classical shadow. This turned out to be sufficient to estimate all relevant target properties in real-time.  Fig. A4. Numerical simulations of batching SIC based classical shadow purities. We plot the purity's standard deviation from 100 repetitions of each point in the 2d grid using a logarithmic color scheme. A 3-qubit linear cluster state is utilized in these numerical simulations, although the particular state and qubit number does not effect the qualitative picture. Regions of similar accuracy are found to have a certain almost vertical extension, indicating that bunching does not significantly degrade accuracy, while greatly reducing computation complexity. The ideal batch size must be evaluated on a case-by-case basis, weighing convergence against analysis time.

AV. FAST STATE IDENTIFICATION WITH CLASSICAL SHADOWS
We now consider a quantum game, where a quiz master targets us experimenters, having access to a quantum computer with a state to prepare and a question about a certain property. Importantly, the question is only revealed after performing the experiment. Such a setup is partly inspired by recent works on quantum-enhanced learning [34]. Here, we follow a game where the goal is to prepare a random target state from a fixed set of 16 states and then pinpoint this target state in as few experimental shots as possible. Fig. A5 depicts the results, where SIC tomography allows us to receive a reward in less than 20 shots. In stark contrast, the minimum number of shots for the same task using Pauli tomography is 3 4 = 81. The figure of merit for this game is the estimated distance between the states as the minimum difference in fidelity between the target state and all others. A reward is obtained as soon as that minimum distance remains positive. A quiz master constructs a quantum game where she provides us experimenters, having access to a quantum computer, with a circuit to prepare a state and perform measurements. After the measurement stage, we must report a certain property. Importantly, the property is only revealed after the experiment has been completed, preventing us from simply measuring the property in question. The aim is to win this challenge with as few experimental shots as possible.
In the present case a single state from a fixed set was randomly chosen and implemented, which we then had to identify in as few shots as possible. SIC tomography delivered a reward in less than 20 shots, whereas the smallest Pauli tomography instance requires 3 4 = 81 shots. Distance refers to the minimum fidelity difference between the target state and all others, see Fig. A6(b).
Specifically, in the decision game of Fig. A5, we randomly prepare 1 out of 16 orthogonal 4-qubit linear cluster states. These states correspond to all combinations of input states |± on the 4 qubits. The target state is then identified by comparing the the linear-inversion fidelities between the prepared state and all 16 possible targets. Note that, fidelity is a good option for state identifications as linear observables generally converge quickest under lin-inv, as we showed experimentally (see Fig. 4 and Fig. 5) and also formally derived in Appendix AVIII B. Figure A6(a) depicts fidelities with respect to all 16 states for SIC and Pauli tomography against the number of experimental shots. As expected, only one of the curves is close to fidelity 1 indicating the target state, whereas all others approach 0 within experimental uncertainties. The distinguishing performance is even better visualized by plotting the difference between the target state fidelity and all others in Fig. A6(b). The minimum of these distances (i.e. the worst case) is also shown in the background of Fig. A5 and used as our state distinguishability criteria. Surprisingly, less than 20 experimental shots are required to pin point the state, clearly undercutting the minimum for Pauli tomography. This argument can in principle be extended to larger systems, where we have seen that properties such as linear observables or Rényi-entropies (see Fig. 5) converge much faster than the 3 N experimental shots required for the smallest instance of Pauli tomography. Fast state identification represents another fruitful example where SIC-based classical shadows not only appear more practical but also impart quicker convergence for certain tasks than Pauli tomography. The difference in performance between the tomography approaches again originates from SIC's bigger experimental overhead, see Appendix AI.

AVI. NUMERICAL SIMULATIONS COMPARING TOMOGRAPHY APPROACHES
This section aims to support previous experimental findings through numerical simulations of convergence under statistical quantum shot noise. To this extent, simulated tomography data is generated by sampling from a multinomial distribution, where the sample size is given by the number of shots. For the sake of comparability, we perform these numerical convergence simulations using use the 5-qubit AME-state from main text Fig. 4 and Eq. (8). We study both SIC and Pauli tomography and cover all reconstruction methods as experimentally studied in Appendix AII, i.e. lin-inv, For illustration purposes, we plot the fidelity difference between the generated state and all possible states as individual curves, which provides a distance measure. The minimum of these differences was used as a state distinguishability criteria to gain a reward. PLS, MLE, and, for quadratic functions also SIC-based classical shadows. Numerical results are presented in Fig. A7 on a double-logarithmic scale, covering infidelity (to better illustrate convergence), and trace-distance as another commonly used property for state distinguishability Lin-inv approaches are again found to converge significantly faster than all other methods in terms of infidelity, while the trace-distance shows the opposite behaviour due to being much more complicated to estimate. This is indicative of the unphysical nature of lin-inv estimates. In contrast, MLE approaches respects those physical boundaries, which result in valid values on all estimators at the cost of a slower convergence. Here we also confirm previous experimental observations (see Fig. A2) that SIC MLE performs better for small shot numbers than Pauli MLE. This is likely due to the fact that SIC POVMs provide the optimal information gain in each shot. Curiously, however, for larger numbers of shots, Pauli MLE eventually converges faster. This might be due to the overcompleteness of the Pauli basis but remains to be fully understood. While not specifically presented in this manuscript, we found the very same behaviour for other states having different overlap with Pauli basis and SIC POVM, hence this effect does not seem to be due to the choice of state. PLS again produces the slowest converges as we have already seen across our experimental studies. Overall, quantum negativity exhibits the slowest convergence, which confirms what we experimentally observed in Fig. A3. We note that the negativity calculation always requires the full density matrix independent of the bipartition. This is in stark contrast to R'enyi-entropy based measures and general non-linear functions supported on a subset of the system, which can be estimated efficiently using SIC-based classical shadows. The latter again show the fastest convergence for purity according to Eq. (A20), which will generally be true for classical shadows by construction. To keep simulations efficient the batch-size was chosen as a constant fraction of the total number of shots, which has a negligible effect on convergence, see Appendix AIV. Finally, these numerical simulations could reproduce all features and findings from the experimental studies, thus indicating that there are no principal limitations to our experimental implementation of both SIC and Pauli tomography. Moreover, all reconstruction methods converge to the same values indicating no principle draw back across the various methods.

AVII. OVERLAP WITH TOMOGRAPHY BASIS IN EXPERIMENT & SIMULATION
Our experimental studies covered by Fig. 4 and 5 focus on maximally entangled states that were differently aligned with respect to Pauli basis and SIC POVM in order to not particularly favor either of them. We resume this discussion in more detail by investigating states of varying orientation, with respect to the measurement states, to study its effect on both experiments and numerical simulations. To this extend, convergence behaviour of first purely local states followed by entangled states is demonstrated.
We start off by presenting experimental results on SIC tomography of local states up to 8 qubits having different overlap with the Pauli basis and SIC POVM, depicted in Fig. A8. Analyzed metrics include fidelity and purity using lin-inv and classical shadows as those represent the scalable approaches. The given states were collectively prepared and subsequently To this extend quantum shot noise is incorporated via sampling ideal tomography data from a multinomial distribution, with sample sizes corresponding to number of shots. We additionally present infidelity (1-fidelity) and complementary trace-distance following Eq. (A2) and plot in doublelogarithmic scale to more clearly visualize several orders of magnitudes. Note that infidelity estimation with lin-inv sometimes delivers unphysical results, indicated by values out of range of the plot). partial traced to study multiple qubit numbers, see Appendix AI. Brighter colors denote higher qubit numbers. Among the Pauli basis states |0 and |1 , the latter performs significantly better under the SIC POVM. This reflects a general theme, where in using non-orthogonal bases it is preferable if one component vanishes, which is related to the concept of unambiguous state discrimination [51]. In the example of |1 this is indeed true for the first SIC-vector aligned with |0 (see Fig. A10(a)). For superposition states, we find that states that maximize the overlap with one SIC vector (Fig. A8(c)) perform worse than those that feature more even overlap with all SIC vectors (Fig. A8(d)). Generally higher qubit number states result in lower fidelities, due to experimental imperfections. We again emphasize the particularly fast convergence of the SIC-based classical shadow purity estimator, being here only moderately slower than fidelity.
To confirm these results for purely local states and to extend the discussion to entangled states of different orientation, we performed numerical simulations on 4 qubit states under quantum shot noise as previously explained within Appendix AVI. Figure A9 contains results for SIC tomography as well as now for Pauli tomography for predicting infidelity via lin-inv, which is efficient and enables scalability in contrast to MLE, that however, would not change the essence of statements. Figures A9(a-d) cover local states both along a SIC-vector (a,c) as well as orthogonal to it (b,d), i.e. in the opposite direction of the Bloch-sphere. These simulations reproduce the effects seen experimentally in Fig. A8, where states aligned with a SIC vector (a,c) converge more slowly than those orthogonal to it (b,d). In contrast for Pauli tomography, representing an orthogonal basis, best results are obtained for states that are aligned with the basis, as seen in Fig. A9(a-b).
In case of local states, measurements are uncorrelated and the above state dependent convergence is to be expected. The situation might change when moving to entangled states. In Figs. A9(e-f) we apply controlled-phase gates states along the SIC vectors from (c,d) to generate an entangled state that is still (in a sense) aligned with the SIC basis. The resulting convergence is similar for both tomography methods. Curiously, SIC tomography performs equally well as for special states like a GHZ state (Fig. A9(g)) or a more generic rotated GHZ state (Fig. A9(h)). In contrast, Pauli tomography, which also shows little dependence on the state once entanglement is involved, does outperform for the GHZ state (g), which is perfectly aligned with the Pauli basis. Importantly, the rotated GHZ state, which is somewhat randomly aligned (b) |1 state orthogonal to the first SIC-vector, whose component completely vanishes. Convergence in (b) is significantly faster than in (a) as only 3 of the non-orthogonal SIC vectors take part. Boosted convergence is related to unambiguous state discrimination [51] limiting non-orthogonal measurements. (c) Superposition state with maximized overlap with one of the SIC-vectors. (d) Superposition state with minimum overlap with the SIC-vectors. Here, convergence of the state in (d) is better than (a) as the state's information is more regularly distributed over the SIC-vectors, not favouring one, resulting in a higher information gain. Also note that the achievable fidelity generally drops with higher qubit number due to experimental imperfections.
with both SIC POVM and Pauli basis (h) shows no difference between the tomography methods. The same rotated state, yet on 8 qubits, was utilized for the real-time analysis in main text Fig. 5 to make sure the comparison does not favour any approach.
Inspired by these findings, it can be beneficial to rotate the SIC POVM used throughout this manuscript from Fig. A10(a) to favour the particular state or application that it is used for. Particularly, the alignment depicted in Fig. A10(b), which has equal overlap with every Pauli basis vector, leads to improved prediction of Pauli observables. Intuitively, this alignment ensures that each shot contains equal information about every Pauli observable. Geometrically, imagine a triangle spanned by the three Pauli basis vectors in positive direction. Then, the orthogonal state to the first SIC-vector perpendicularly intersects this triangle area through its center. The front area of the SIC tetrahedron orientates parallel with the triangle. Thus, favouring Pauli eigenstates or stabilizer states, for which we find plenty of applications across the entire field of quantum computation and quantum information. Particularly, VQE applications, which rely on the efficient estimation of many Pauli observables could benefit from this choice of SIC POVM.
Note that, experiments on rotated SICs can straightforwardly be realized by changing the local mapping sequence from Fig. 2(b) and do not add further complexity to the implementation.

AVIII. SIC POVM-BASED CLASSICAL SHADOW FRAMEWORK
On the contrary, SIC POVMs were originally discovered, because of their exceptional tomographic capabilities [2]. Ever since, both (tensor products of N) single-qubit SIC POVMs and global 2 N -dimensional SIC POVMs have served as idealized measurements for state reconstruction tasks. Single-qubit SIC POVMs, also known as tetrahedral POVMs, have also been used to acquire training data for neural network quantum state tomography [12,13].
Geometrically speaking, SIC POVMs [2] and the overcomplete Pauli basis [56] both form complex projective 2-designs (the single-qubit Pauli basis is actually a 3-design [57][58][59]). Roughly speaking, this means that the first two (three) moments exactly reproduce the moments of uniformly (Haar) random states. As detailed below, closed form expression for Haar-random moments can then be used to compute measurement operators and estimators analytically. For Pauli basis measurements, this observation culminated in efficient PLS estimators for full state tomography [23], as well as the classical shadow formalism for directly predicting (non-)linear properties of the underlying state [14,26]. Subsequently, some of these ideas have been extended to (single-qubit) SIC POVM measurements. Ref. [11], in particular, highlights that SIC POVM measurements can outperform Pauli basis measurements in VQE-type energy measurements and, more generally, for predicting linear state properties.
Here, we build on all these ideas and provide a self-contained derivation of classical shadows from (single-qubit) SIC In case of the Pauli basis, however, some improved convergence can still be observed for specific states, such as the GHZ-state (g), which is aligned with the Pauli basis and thus performs better.
POVM measurements, as well as rigorous sample complexity bounds for general linear and quadratic property estimation. The actual definition of SIC POVM shadows is virtually identical to existing (Pauli basis) classical shadows, because both form complex projective 2-designs. Sample complexity bounds, on the other hand, require novel proof techniques. They require computing variances which correspond to 3rd order polynomials in the measurement ensemble. This is comparatively easy for Pauli basis measurements, which form a 3-design. But for SIC POVMs -which only form a 2-design -these existing techniques don't apply. We overcome this drawback with new proof methods that directly use the symmetries within a SIC POVM rather than an abstract 3-design property. To our knowledge, these theoretical arguments are novel and may be of independent interest. A. Classical shadows from SIC POVM measurements

The single-qubit case
Single-qubit density matrices live in the (real-valued) vector space of Hermitian 2 × 2 matrices which we denote by H 2 . On this space, single-qubit SIC POVMs are known to form so-called projective 2-designs [2]. Mathematically, this means that discrete averages over (outer products of) SIC-vectors reproduce uniform averages over all possible pure states up to second moments [60]. This is captured by the following two averaging formulas: Here, dv denotes the unique pure state measure (normalized to dv = 1) that assigns the same infinitesimal weight to each state (dv = dw). The two qubit swap operator operator F|v ⊗ |w = |w ⊗ |v acts by permuting tensor factors. The first averaging formula (A3) confirms that the collection 1 2 |ψ i ψ i | : i = 1, 2, 3, 4 ⊂ H 2 forms a valid quantum measurement for single-qubit systems (POVM). Let ρ ∈ H 2 be a density matrix, i.e. a Hermitian matrix with unit trace whose eigenvalues are non-negative. Then, obeys Pr [i|ρ] ≥ 0, because density matrices don't have negative eigenvalues ( ψ i |ρ|ψ i ≥ 0). Moreover, Eq. (A3) ensures proper normalization: The second averaging property (A4) is more interesting. It ensures that SIC POVM measurements are informationally complete, i.e. we can reconstruct every density matrix ρ based on outcome probabilities. There are many ways to establish this property. Here, we choose one that is based on the following observation. If we weigh SIC projectors |ψ i ψ i | with the probability Pr [i|ρ] of observing this outcome, Eq. (A4) allows us to compute the resulting average. Let tr 1 (·) denote the partial trace over the first of two qubits (tr 1 (A ⊗ B) = tr(A)B and linearly extended to all of H ⊗2 2 ). Then, where the last equation follows from the interplay between partial trace and swap operator. The final expression is equivalent to applying a depolarizing channel with parameter p = 1/3 to the quantum state in question: Viewed as a linear map on H 2 , this channel has a uniquely defined inverse: Although a linear map, this is not a physical operation. We can, however, use it in the classical post-processing stage to counterbalance the effect of averaging over SIC elements. Indeed, linearity and Eq. (A5) ensure The left hand side of this display features a linear combination involving SIC outcome probabilities Pr [i|ρ], while the right hand side exactly reproduces the underlying state ρ. This equips us with a concrete state reconstruction formulathe so-called linear inversion estimator. But, at least at first sight, this formula is only useful if we have precise knowledge of the SIC outcome probabilities Pr [i|ρ]. And, with current quantum technology, these probabilities must be estimated from repeatedly performing SIC POVM measurements on independent copies of ρ and approximating these probabilities by frequencies.
The classical shadow formalism provides an alternative perspective on this estimation process. Suppose that we perform a single SIC POVM measurement of an unknown quantum state ρ (single shot). Then, we obtain a random measurement outcomeî ∈ {1, 2, 3, 4} with probability Pr î |ρ each. Inspired by the left hand side of Eq. (A7), we can use this outcomê i to construct a Monte Carlo estimator of ρ: This is a random 2 × 2 matrix that can assume 4 different forms -one for each possible outcomeî ∈ {1, 2, 3, 4}. It does exactly reproduce the underlying quantum state ρ in expectation over the observed single-shot outcome: It is worthwhile to emphasize that eachσ has the same eigenvalue structure: λ + = 2 and λ − = −1. In turn, these random matrices all have unit trace (tr(σ ) = 2 − 1 = 1), but are unphysical in the sense that one eigenvalue is always negative. Eq. (A8) represents a physical density matrix ρ as the expectation of 4 unphysical estimators. This desired expectation value can be approximated by empirically averaging M independently generated Monte-Carlo estimators. Suppose that σ 1 , . . . ,σ M are M iid (independently and identically distributed) Monte Carlo estimators. Then, their empirical average obeysρ and the rate of convergence can be controlled with arguments from probability theory. This will be the content of the next two subsections.

Extension to multi-qubit systems
The formalism and ideas presented above readily extend to quantum systems comprised multiple qubits. Let ρ ∈ H ⊗N 2 H 2 N be a N-qubit density matrix. We can perform a single-qubit SIC POVM measurement on each of the N qubits. As in the single qubit case, each such measurement yields one out of four possible outcomes. In total, a single-shot measurement produces a string (i 1 , . . . , i N ) of outcomes. There are 4 N such outcomes and the probability of obtaining any one of them is given by Here, we have introduced the short-hand notation |ψ i 1 , . . . , ψ i N = N n=1 |ψ i n ∈ C 2 ⊗N C 2 N . These expressions are non-negative, because the N-qubit density matrix does not have negative eigenvalues. Eq. (A3), applied to each qubit separately, moreover ensures proper normalization: Again, the second averaging property is more interesting: for single-qubit density matricesρ ∈ H 2 , we already know that where D 1/3 : H 2 → H 2 is the depolarizing channel from Eq. (A6). It is now easy to check that this equation extends to general Hermitian 2 × 2 matrices: We can use this observation to show that N single-qubit SIC POVM measurements are tomographically complete. To achieve this it is helpful to first decompose ρ ∈ H ⊗N 2 into a sum of elementary tensor products: where the last equality follows from linearity of depolarizing channels. The final expression is equivalent to applying N independent, single-qubit depolarizing channels to the N-qubit quantum state ρ. Viewed as a linear map on H ⊗N 2 , this tensor product channel has a uniquely defined inverse D −⊗N 1/3 : H ⊗N 2 → H ⊗N 2 . For elementary tensor products, this tensor product of inverse depolarizing channels factorizes nicely into tensor products. In particular, Again, this is not a physical operation, because it produces matrices with negative eigenvalues. However, we can nonetheless use it in the classical post-processing stage to counterbalance the N-qubit averaging effect encountered in Eq. (A11): The left hand side of this display features a linear combination involving single-qubit SIC outcome probabilities Pr [i 1 , . . . , i N |ρ], while the right hand side exactly reproduces the underlying N-qubit density matrix. This provides us with a concrete reconstruction formula for arbitrary N-qubit states. In fact, it is a natural and relatively straightforward extension of the single-qubit linear inversion estimator (A7) to N qubits. As was the case for single qubits, the classical shadow formalism provides an alternative perspective on such a linear inversion estimation process. Suppose that we perform N single-qubit SIC POVM measurements of an unknown N-qubit state ρ (single shot). Then, we obtain a random outcome string (î 1 , . . . ,î N ) ∈ {1, 2, 3, 4} ×N with probability Pr î 1 , . . . ,î N |ρ each. Inspired by the left hand side of Eq. (A12), we can use this random outcome string to construct a Monte Carlo estimator of ρ: This is a random 2 N × 2 N matrix that decomposes nicely into tensor products of single-qubit contributions. Each tensor factor contributes a matrix with eigenvalue λ n,+ = +2 and λ n,− = −1. The eigenvalues of the tensor productσ then correspond to N-fold products of these two possible numbers. The largest eigenvalue is +2 N , the smallest is −2 N−1 , sô σ is a very unphysical random matrix. The randomness stems from an actual quantum measurement and depends on the underlying quantum state. This ensures thatσ reproduces ρ in expectation: in full analogy to the single-qubit case. As detailed in the next section, the rate of convergence will depend on the number of qubits N. The larger the space, the longer it takes for convergence to kick in, and this scaling can become unfavorable. It is therefore worthwhile to emphasize another distinct advantage of the tensor product structure of the estimators (A13): marginalization to subsystem density operators is straightforward. Let ρ be an N-qubit state, but suppose we are only interested in a subsystem K ⊂ [N] = {1, . . . , N} comprised of only |K| ≤ N qubits. Such a subsystem is fully described by the reduced |K|-qubit density matrix ρ K = tr ¬K (ρ) ∈ H ⊗K 2 that results from tracing out all qubits not in K. This partial trace is a linear operation that plays nicely with the tensor product structure in Eq. (A13). Each tensor product factor has unit trace, which ensures that The object at the very left is a random 2 |K| × 2 |K| matrix that can be generated from performing a complete N-qubit SIC POVM measurement to obtain outcomes (î 1 , . . . ,î N ) ∈ {1, . . . , 4} ×N . Subsequently, we only use outcomes that correspond to qubits in K to directly construct an estimator for the subsystem density matrix ρ K in question. This trick reduces the question of convergence to a problem that only involves |K| qubits, not N. This can be highly advantageous if |K| N.
What is more, we can use the same N-qubit measurement outcome (î 1 , . . . ,î N ) ×N to construct estimators for multiple subsystems K 1 , . . . , K L ⊂ [N] at once. This allows us to use the same N-qubit measurement statistics to estimate many subsystem properties in parallel.

B. Convergence for predicting linear observables
Suppose that we have access to M independent Monte Carlo approximationsσ 1 , . . . ,σ M of an unknown N-qubit state ρ. Each of them arises from measuring N single-qubit SIC POVMs on an independent copy of ρ. We can then use these approximations to estimate observable expectation values tr(Oρ) with O ∈ H This bound is reminiscent of existing variance bounds for randomized single-qubit Pauli measurements [14], but slightly weaker (random Pauli basis measurements achieve Var [tr(Oσ )] ≤ 2 N tr(O 2 )). The proof exploits the 2-design property of SIC POVM measurements and will be supplied in Sec. AIX A below. For now, we use this variance bound to conclude strong convergence guarantees for observable estimation with SIC POVM measurements. The key ingredient is a strong tail bound for sums of iid random variables with known variance and bounded magnitude. The Bernstein inequality is a stronger version of the better known Hoeffding inequality, see e.g. [61,Corollary 7.31] or [62,Theorem 2.8.4]: Let X 1 , . . . , X M iid ∼ X ∈ R be iid random variables with expectation µ = E [X] and variance σ 2 = E (X − µ) 2 that also obey |X m | ≤ R (almost surely). Then, for ε > 0 For the task at hand, we writeô lK on the |K|-qubit subsystems in question. These Hilbert-Schmidt norms can be related to the operator norm, which is bounded: Here, we have used the fact that each O l,K is a matrix of size (at most) 2 |K| · 2 |K| . This implies the bound max 1≤l≤L tr O 2 l,K ≤ 2 |K| , which can be very pessimistic. Inserting it into Eq. (A19) yields Eq. (5) in the main text.
Proof of Theorem 1. A maximum deviation larger than ε occurs if at least one individual predictionô l is further than ε off from the actual target tr(O l ρ). The union bound, also known as Boole's inequality, tells us that the probability of such a maximum deviation is upper bounded by the sum of individual deviation probabilities. These, in turn, can be controlled via the tail bound from Eq. (A18): We see that each of these summands diminishes exponentially in the measurement budget M. The right-hand side of Eq. (A19) ensures that each term contributes at most δ /L to this sum. Since there are L summands in total, we conclude Pr max 1≤l≤L This is equivalent to the advertised display.

C. Convergence for predicting (subsystem) purities
Classical shadows can also be used to predict non-linear quantum state properties, see e.g. [14,26]. A prototypical example is the purity of an N-qubit density matrix ρ: The purity equals one if and only if ρ describes a pure quantum state |φ φ |. Conversely, it achieves its minimum value for the maximally mixed state: ρ = 1 2 I ⊗N achieves p(ρ) = 1/2 N 1. We now describe how to obtain a purity estimator based on classical shadowsσ 1 , . . . ,σ M that arise from measuring N single-qubit SIC POVMs on (independent copies of) ρ. By construction, eachσ m is a Monte Carlo estimator of ρ.
This formula describes an empirical average of M 2 random variables with the correct expectation value p(ρ). This, in turn, ensures E [p] = p 2 (ρ). However, in contrast to before, the individual random variables are not necessarily statistically independent. The first two terms tr (σ 1σ2 ) and tr (σ 1σ3 ), for instance, both depend onσ 1 . This prevents us from reusing exponential concentration inequalities, like the Bernstein inequality, to establish rapid convergence to this desired expectation value. More general, albeit weaker, concentration arguments still apply. Chebyshev's inequality, for instance, implies ε 2 for any ε > 0.
In words: the probability of an ε-deviation (or larger) is bounded by the variance Var [p] of our estimator divided by ε 2 . This variance can be decomposed into individual contributions: Var tr σσ Var tr σσ (A22) and we refer to [26,Supplemental material] for details. Here, ρ is the underlying state andσ ,σ denote independent instances of a classical shadow approximation. This reformulation contains two variance terms that depend on one (linear contribution) and two independent classical shadows (quadratic contribution), respectively. We can use Lemma 1 to control the first term. Set O = ρ to conclude because tr(ρ 2 ) ≤ 1 for any underlying quantum state. Bounding the quadratic variance term requires more work. The following statement is a consequence of the geometric structure of SIC POVM measurements and substitutes existing arguments which rely on 3-design properties which do not apply here.
Lemma 2. Letσ ,σ be independent classical shadows of an underlying N-qubit state ρ. Then, We provide a detailed argument in Sec. AIX B below. For now, we insert both bounds into Eq. (A22) to obtain This variance bound diminishes as M increases. For fixed ε ∈ (0, 1) and δ ∈ (0, 1), a measurement budget of M ≥ 5 × 3 N + 1 /(δ ε 2 ) ensures Var [p] ≤ 4 5 ε 2 δ + 2 25 ε 4 δ 2 < ε 2 δ . We can insert this implication into the Chebyshev bound (A21) to obtain a rigorous convergence guarantee for purity estimation: In words: with probability (at least) 1 − δ , the purity estimatorp is ε-close to the true purity. Again, the required measurement budget M scales exponentially in the number of qubits involved. For global purities, this exponentially increasing measurement demand quickly becomes prohibitively expensive -a situation that cannot be avoided due to recent fundamental lower bounds [63]. However, once more, the situation changes if we consider subsystem purities instead. Let K ⊆ [N] be a subsystem comprised of |K| qubits. The associated density matrix is ρ K = tr ¬K (ρ) and we can estimate it by averaging appropriately marginalized classical shadows: Importantly, this estimation process now only depends on the |K| < N qubits involved, such that This scaling is much more favorable, especially for small subsystems (|K| N). Similar to linear observable estimation, we can use this assertion to predict many subsystem purities based on the same classical shadows. A union bound argument, similar to the proof of Theorem 1 above, readily implies the following statement.
Theorem 2. Suppose we are interested in predicting L subsystem purities p ρ K l of an unknown N-qubit state ρ. Let K = max 1≤l≤L |K l | be the largest subsystem size involved and set ε, δ ∈ (0, 1). Then, N-qubit SIC POVM measurements on (independent copies of) ρ are likely to ε-approximate all subsystem purities simultaneously. More precisely, the resulting subsystem purity estimatorsp K defined in Eq. (A24) obey The dependence on subsystem size K and accuracy ε is virtually identical to convergence guarantees for linear observable prediction, see Theorem 1. However, the dependence on the number of subsystem purities Land the inverse confidence 1δ , now enter linearly, not logarithmically. This is a consequence of the fact that the individual contributions top K l are not statistically independent. In turn, we had to resort to Chebyshev's inequality instead of stronger exponential tail bounds like the Bernstein inequality.
It is possible to obtain a scaling proportional to log(2L/δ ) by using a more sophisticated estimation procedure known as median of means estimation, see e.g. [14] for details. Practical tests with real data do, however, suggest that median of means estimation actually reduces the approximation quality overall [26]. This is not a contradiction, because statements like Theorem 2 are conservative mathematical statements about the worst-case rate of convergence. In practical applications, convergence can -and usually does -set in much earlier.
In words, second Rényi entropy is maximal for all subsystems simultaneously. This, however, is not a consequence of entanglement, but a trivial consequence of the fact that the state is very (maximally) mixed.
Fortunately, there exist entanglement criteria that extend to (very) mixed states. Chief among them is the PPTcriterion [64][65][66]. Let ρ be an N-qubit quantum state and let (A,Ā) be a bipartition of the qubits into two disjoint sets. Then, ρ is entangled (across the bipartition) if the partial transpose density matrix is not positive semidefinite (i.e. it has negative eigenvalues): ρ T A ≥ 0 implies ρ is entangled across the bipartition.
The partial transpose is defined by transposing tensor factors belonging to subsystem A, i.e. (ρ 1 ⊗ · · · ⊗ ρ N ) T A = a∈A ρ T a ā∈Ā ρā and linearly extended to all N-qubit density matrices. A quick sanity check confirms that the maximally mixed state doesn't pass the PPT condition (I T = I): (1/2 I) = τ ≥ 0.
Very entangled states, like the 2-qubit Bell state |Ω = 1/ √ 2(|00 + |11 ) do, in contrast, have partial transposes with negative eigenvalues: where F|x ⊗ |y = |y ⊗ |x denotes the swap operator which has one negative eigenvalue (λ min (F) = −1). We call a state ρ with ρ T A ≥ 0 a PPT-entangled state (with respect to the bipartition (A,Ā). The PPT condition is a sufficient, but not necessary, condition for entanglement. It is known that there exist states which are entangled, but nonetheless obey ρ T A ≥ 0 [55]. So, it is fruitful to view the PPT criterion as a one-sided test for entanglement: if ρ T A ≥ 0, we can be sure that the state is entangled. But, ρ T A ≥ 0 doesn't necessarily imply that the state is not entangled (i.e. separable).
The PPT criterion (A26) is conceptually appealing, but it does require full and accurate knowledge of the density matrix ρ. This, in turn, typically requires full state tomography which quickly becomes prohibitively expensive. It is, however, possible to test consequences of ρ T A ≥ 0 by comparing moments of the partially transposed density matrix. The simplest consistency check is the so-called p 3 -criterion [26]: The final simplification follows from the fact that partial transposition preserves the purity. The contrapositive of this implication serves as a (one-sided) test for entanglement: if tr ρ T A 3 < tr(ρ 2 ) 2 (for some bipartition (A,Ā), then the underlying state must be PPT-entangled (across this bipartition). Classical shadows can be used to directly estimate the trace moments involved in this test. Indeed, tr(ρ 2 ) is just the purity, while tr ρ T A 3 can be rewritten as a linear function on three copies of the underlying state: We refer to [26,Eq. (4)] for a precise reformulation. Subsequently, we can approximate this function by averaging over triples of distinct (and therefore independent) classical shadows: The convergence analysis from above can, in principle, be extended to this form of cubic approximation. For randomized Pauli basis measurements, this has been done in the supplemental material of Ref. [26]. We leave a parallel treatment of cubic estimation with SIC POVM shadows for future work. The experimental and numerical results from the present work indicate that a SIC POVM-based approach is expected to be both cheaper and easier than existing approaches based on Pauli basis measurements. Finally, we point out that Ref. [27] extended the intuition behind the p 3 -criterion (A27) to a complete family of polynomial consistency checks that compare polynomials of degree d with polynomials of degree (d − 1) and lower. This produces a hierarchy of in total d max = 2 N consistency checks that is complete in the sense that a state ρ passes all of them if and only if ρ T A ≥ 0. Although polynomial estimation with classical shadows becomes more and more challenging as the degree d increases, the lower levels of this hierarchy may still be attainable with (comparatively) modest experimental and postprocessing effort.

A. Variance bounds for observable estimation
Here, we supply the proof of Lemma 1. for any underlying N-qubit state ρ.
Proof. The classical shadowσ is constructed from performing single-qubit SIC POVM measurements on an underlying N-qubit quantum state ρ. Recall from Eq. (A9) that each of the 4 N possible outcome strings i 1 , . . . , i N ∈ {1, 2, 3, 4} occurs with probability In words: the probability of any particular outcome occurring is bounded by 2 −N . This allows us to bound the dominating part of the variance by Such a decomposition into tensor products allows us to factorize the above bound into a product of single qubit contributions: with respect to the Hilbert-Schmidt inner product (Parseval's identity). This is the advertised result.
The proof strategy behind this statement differs from existing arguments in the literature, most notably Refs. [14,26,27]. These use the fact that Pauli basis measurements form a 3-design, a structural property that doesn't apply to SIC POVMs. The key idea behind this new proof technique is to notice that trace inner products of SIC POVM shadows can only assume very discrete values. Recall that and, because |ψ i n , |ψ j n ∈ C 2 are SIC vectors, each contribution can only assume one of two discrete values: 9 ψ i n |ψ j n 2 − 4 = +5 if i n = j n , −1 else if i n = j n .
So, the magnitude of tr(σσ ) scales exponentially in the number of coincidental measurement outcomes (i n = j n ). This observation can be used to control the variance of this trace inner product. We first illustrate this for N = 2 qubits, which is enough to convey the main gist. The proof of Proposition 1 is then a straightforward, yet somewhat technical, generalization to an arbitrary number of qubits.
Lemma 4. Suppose that we perform two N-qubit SIC POVM measurements on (distinct copies of) a quantum state ρ and let K ⊆ [N] = {1, . . . , N} be a subset of K = |K| qubits. Then, the probability that the obtained measurement outcomes are equal (i k = j k ) for all k ∈ K obeys where ρ K = tr ¬K (ρ) is the reduced density matrix supported on the relevant qubit subset and each D 1/3 is a single-qubit depolarizing channel.
The proof follows from exploiting the fact that the two SIC POVM measurements are statistically independent, as well as the 2-design property of SIC POVMs. We defer it to the end of this section. For the task at hand, Lemma 4 bounds all remaining probabilities in Eq. (A30). Doing so, conveniently cancels the existing powers of 3 and produces the bound advertised in Proposition 1 for K = 2 qubits: This argument can be readily extended to an arbitrary number of qubits.
Proof of Proposition 1. The trace inner product between two N-qubit SIC POVM shadows can only assume discrete values. Indeed, Eq. (A28) states that tr (σσ ) = ±5 c , where c is the number of coincidental measurement outcomes. We can use indicator functions to single out all possibilities for coincidences and multiplying them with the correct scaling factor provides a closed-form expression of the trace inner product in terms of measurement outcomes alone. In the following, we will use K ⊆ [N] to denote a subset of coincidental indices. The complementary set (where indices mustn't coincide) will be denoted byK = In the second line, we have re-expressed conditions for non-coincidence ({i¯k = j¯k) as linear combinations of coincidences on larger subsystems (K ∪ T with T ⊆K). The last line follows from introducing the union U = K ∪ T and rewriting K as K \ T. The inner sum over subsets T ⊆ U has no effect on the indicator function. The size of such sets ranges from T = |T| = 0 up to T = |T| = |U| and for each T , there are |U| T subsets of that size. For a fixed set U, we therefore obtain This is the advertised bound for the variance of N-qubit purity estimators.
Finally, we provide the proof for the bound on coincidental SIC POVM measurement outcomes.