Classical Shadow Tomography with Locally Scrambled Quantum Dynamics

We generalize the classical shadow tomography scheme to a broad class of finite-depth or finite-time local unitary ensembles, known as locally scrambled quantum dynamics, where the unitary ensemble is invariant under local basis transformations. In this case, the reconstruction map for the classical shadow tomography depends only on the average entanglement feature of classical snapshots. We provide an unbiased estimator of the quantum state as a linear combination of reduced classical snapshots in all subsystems, where the combination coefficients are solely determined by the entanglement feature. We also bound the number of experimental measurements required for the tomography scheme, so-called sample complexity, by formulating the operator shadow norm in the entanglement feature formalism. We numerically demonstrate our approach for finite-depth local unitary circuits and finite-time local-Hamiltonian generated evolutions. The shallow-circuit measurement can achieve a lower tomography complexity compared to the existing method based on Pauli or Clifford measurements. Our approach is also applicable to approximately locally scrambled unitary ensembles with a controllable bias that vanishes quickly. Surprisingly, we find a single instance of time-dependent local Hamiltonian evolution is sufficient to perform an approximate tomography as we numerically demonstrate it using a paradigmatic spin chain Hamiltonian modeled after trapped ion or Rydberg atom quantum simulators. Our approach significantly broadens the application of classical shadow tomography on near-term quantum devices.


I. INTRODUCTION
Quantum state tomography [1][2][3] is an essential task in many quantum technology applications.It seeks to reconstruct a quantum state from experimental data of repeated measurements.While reconstructing the full density matrix of a many-body system quickly becomes unfeasible with increasing system size due to the curse of dimensionality [4,5], predicting a collection of (possibly exponentially many) properties of the quantum system can still be efficiently achieved with an only polynomial number of state copies, which was the idea of shadow tomography proposed by Aaronson [6,7].The idea is further improved by the recent work [8] to propose the classical shadow tomography, which significantly reduces the demand on the quantum hardware and enables efficient classical post-processing.
Given a copy of an unknown quantum state ρ of N qubits, the classical shadow tomography protocol (see Fig. 1) first transforms the state ρ → ρ = U ρU † by a unitary U , which is randomly sampled (independently each time) from some probability distribution P (U ), then measures the transformed state ρ in the computational basis, ρ → |b b|, which collapses the system to a product state |b labeled by a bit-string b ∈ {0, 1} ×N of measurement outcomes b = (b 1 , • • • , b N ) with the probability P (b|ρ ) = b|ρ |b .Based on the observed bit-string b and the classical description of the unitary U , a classical snapshot σU,b = U † |b b|U can be constructed in princi- * yzyou@physics.ucsd.eduple, which essentially encodes the measurement outcomes together with their basis choice (pulled back through the unitary evolution).Repeating such measurements on independent and identical copies of ρ for a few times, a collection of classical snapshots E σ|ρ = {σ U,b } can be obtained (which correlates with ρ).Ref. [9] showed that as long as the unitary ensemble is expressive enough (i.e.tomographically complete), there exist a linear reconstruction map M −1 such that the density matrix ρ can be formally recovered as ρ = Eσ∈E σ|ρ M −1 [σ].This also enables the prediction of many properties of ρ, like the expectation value of any physical observable O as: O = Tr(Oρ) = Eσ∈E σ|ρ Tr(OM −1 [σ]).The construction of classical snapshots σU,b and the computation of their associated properties are performed on a classical computer.
depending on the type of observables O that we are interested in, one needs to employ different strategies to design the unitary circuit U .Two limiting cases have been analyzed in Ref. [8]: (i) if the observable is low-rank (such as many-body overlap fidelity), it is most efficient to adopt deep circuits, such that U effectively forms a global Haar random ensemble; (ii) if the observable is high-rank and quasi-local, it would be more efficient to adopt shallow circuits (e.g. the on-site Haar random).Otherwise the sample complexity will be high.However, the flexibility to interpolate between these two limits has not been available yet, such that the tomography protocol can not adjust to the target observables in a more adaptive manner.Second, more importantly, in existing quantum simulation platforms, applying random unitary circuits is very challenging, because it requires high degrees of sophisticated quantum controls.In particular, for programmable quantum simulators of large systems based on trapped ions or Rydberg atom systems, [11][12][13] a certain set of entangling unitary evolution is much more favorable to implement than typical random unitaries that require fine-tuned control.Therefore, it is desirable to develop a method applicable for systems with limited controls.
In this work, we address these challenges by generalizing the classical shadow tomography methods to a broad class of unitary ensembles.In our approach, the specific details of the unitary ensemble is not important as long as the ensemble generates locally scrambled quantum dynamics [14].Rigorously speaking, the probability distribution P (U ) of evolution unitaries is invariant under local basis transformations, i.e. ∀V ∈ U(d) N : P (U ) = P (U V ) = P (V U ) where V = i V i is a product of local unitary operator V i on each qudit.This basically means that the unitary evolution U is efficient in scrambling local quantum information, such that the initial local basis choice is quickly "forgotten" under the quantum dynamics.Examples of locally scrambled quantum dynamics includes random unitary circuits (including random Clifford circuit at the 3-design level) [15][16][17][18][19][20] and quantum Brownian dynamics [21][22][23][24][25].As the unitary ensemble does not care about local basis choice, the only information that matters will be the quantum entanglement that the unitary dynamics can create in the quantum system.Therefore, for locally scrambled quantum dynamics, the reconstruction map only depends on the entanglement property of the classical snapshots.The density matrix ρ can be reconstructed as a linear superposition of the classical snapshot σ reduced in different subsystems.The combination coefficient can be calculated from the entanglement feature [26,27] of the classical snapshots, which is simply the collection of average purities of classical snapshots in all possible subregions.
Since our method is applicable to a broad class of quantum dynamics, it is natural to consider an ensemble of realistic Hamiltonian evolutions that are readily available in near-term quantum devices.To this end, we introduce an approximate classical shadow tomography (with a non-vanishing but small bias) applicable to an ensemble of time-dependent Hamiltonian evolution that generates approximately locally scrambled dynamics.We numerically demonstrate this idea by using a simple spin chain Hamiltonian modeled after programmable trapped ions or Rydberg atom array systems.We introduce the local frame potential to characterize the bias and we show the bias decreases rapidly for the initial short period of time, and reaches a vanishingly small plateau value for the proposed Hamiltonian.Surprisingly, we find even a single instance from an ensemble of Hamiltonian evolution suffices to perform an approximate tomography, implying that our method is hardware efficient for existing quantum devices [28,29].
In the following, we will first establish the general theoretical framework to calculate the reconstruction map in Sec.II A and to bound the sample complexity in Sec.II B. We also provide a two-qudit toy model to analytically demonstrate our construction in Sec.II C. We comment on how to carry out the computation efficiently in Sec.II D. Then we apply our construction for local unitary circuits and numerically demonstrates its accuracy in quantum fidelity and Pauli observable estimation tasks in Sec.III A, as well as their scaling of sample complexity in Sec.III B. Finally, we show in Sec.III D that our approach can be extended to broader classes of unitary ensembles that are approximately locally scrambled.We propose a frame potential to characterize the level of approximation, which serves as a powerful indicator to design nearly-locally-scrambled unitary ensembles that are available for existing analog quantum simulators [28,29].We summarize our classical post-processing protocol and outline a few interesting future applications in Sec.IV

A. Reconstruction Map from Entanglement Features
To be general, we consider a quantum system consists of N qudits, where each qudit has the Hilbert space dimension d (where d = 2 corresponds to the qubit system).The protocol of classical shadow tomography describes a process that first measures the unknown quantum state ρ in a random basis specified by the unitary transformation U and then prepare the classical snapshot σU,b ≡ U † |b b|U based on the measurement outcome b.The randomness involved in the process includes (i) sampling U from the distribution P (U ) and (ii) obtaining the measurement outcome b conditioned on the evolved state ρ = U ρU † with the probability P (b|ρ ) = b|ρ |b = Tr(σ U,b ρ).Inspired by the discussion in Ref. [30], we define as the posterior snapshot ensemble, as it is conditioned on the observation of ρ.The posterior snapshot ensemble reduces to the prior snapshot ensemble when there is no knowledge contained in ρ, i.e. ρ = d −N 1.For the prior distribution P (σ U,b ), the outcome b is uniformly drawn from all possible outcomes in {0, 1, • • • , d − 1} ×N (independent of U, ρ).The prior snapshot ensemble E σ only depends on the unitary ensemble With the notation introduced above, the expected classical snapshot σ can be expressed as which is related to the original state ρ by a quantum channel M, called the measurement channel.It is easy to check that the measurement channel M is tracepreserving, completely positive and self-adjoint.It is generally difficult to obtain an explicit expression of M for generic unitary ensemble E U (or for generic prior snapshot ensemble E σ ).Results of M are known for global and on-site 2-design unitaries [9,31,32] (possibly with noise [33,34]), fermionic Gaussian unitaries [35], and many-body Gaussian unitaries [10].
We can make progress in computing the measurement channel M (and its inverse) for yet another class of unitary ensemble, namely the locally scrambled unitaries [14], for which P (U ) obeys the local-basis invariance condition where the local scrambling unitary V is an element in the group U(d) N (the tensor product of the on-site unitary group U(d) of each qudit).This condition is sufficient to ensure the prior ensemble E σ of snapshot states σ to be invariant under σ → V † σV , In this case, we say that E σ is a locally scrambled ensemble.In fact, our following derivation only requires the weaker condition Eq. ( 5) at the state level, instead of Eq. ( 4) at the channel level, though it will be practically more straight forward to design unitary circuits that satisfies Eq. ( 4) by assembling locally scrambled unitary gates.Nevertheless, as long as the states σ are locally scrambled (even if the unitaries U may or may not be locally scrambled), we will be able to insert local basis transformations V in Eq. (3), and average V over any ensemble of our choice, We can choose the ensemble of V = i V i to be such that every V i is independently a local 2-design unitary.With this choice, the ensemble average of V can be evaluated by averaging every V i over the Haar unitary measure following Ref.[36,37], and the result can be written as (see Appendix A for derivation) with B, C summing over all possible subregions of the N qudit system, where each subregion is labeled by a subset of Ω N = {1, • • • , N } (as an element in the power set 2 is the 2nd entanglement feature [26,27] of the prior snapshot ensemble E σ , where S C (σ) denotes the 2nd Rényi entanglement entropy of the state σ in region C.The entanglement feature Eσ,C is merely a property of the unitary ensemble E U (which determines E σ ).It describes how the unitary channel entangles a product state in general.It depends on neither the underlying state ρ to be reconstructed nor any particular snapshot state σ collected in the tomography process.
Given the entanglement feature W Eσ,C , Eq. ( 7) spells out how the expected classical snapshot σ is written as a linear combination of reduced density matrices ρ B in all regions, which explicitly specifies the measurement channel M as a linear map σ = M[ρ] from ρ to σ.Therefore, any reduced classical snapshot σ A must also be a linear combination of reduced density matrices ρ B , which implies that the measurement channel can be represented as a matrix M AB such that σ A = B M AB ρ B .Suppose the map M is invertible (i.e. the unitary ensemble is tomographically complete), the inverse map M −1 (the reconstruction map) must also be a linear map that combines all reduced classical snapshots σ A to reconstruct ρ B = A (M −1 ) BA σ A .In particular, we are most interested to reconstruct the full density matrix ρ (because all reduced density matrices follows from its partial trace), which must also be a linear combination of σ A with some coefficients r A ∈ R, where ) follows the same definition as the reduced density matrix.The reconstruction map M −1 is not a physical channel, because the reconstruction coefficients r A may not be positive definite in general.Nevertheless, M −1 is still trace-preserving and self-adjoint.Since M −1 is linear, we have ρ , which enables us to reconstruct the underlying state ρ from the ensemble of classical snapshots.The collection of ρ = M −1 [σ] is also called the classical shadow [8] of ρ, which can then be used to predict many properties of ρ efficiently.Now the key problem is to compute r A from W Eσ,C .For a system of N qudits, there will be 2 N many reconstruction coefficients r A .To determine them, we substitute Eq. ( 7) to Eq. ( 9) and find with the fusion coefficient f A,B,C given by which is universally determined by the qudit dimension d.Here δ A,B denotes the Kronecker delta of two regions Eq. ( 10) will hold for any choice of ρ if and only if where Ω N = {1, • • • , N } is the full set that labels the full system of N qudits.By solving this linear equation, we can determine the reconstruction coefficients r A in terms of of the entanglement feature W Eσ,C , such that the reconstruction map M −1 can be constructed according to Eq. (9).
In conclusion, we provide a general framework to compute the reconstruction map for the classical shadow tomography with locally scrambled quantum dynamics.The protocol is summarized as: 1. Given the prior snapshot ensemble E σ , first calculate its entanglement feature by

Solve for the reconstruction coefficient r
Eσ,C = δ B,Ω N .

Then the reconstruction map is given by
All computations are supposed to be carried out on a classical computer in the post-processing procedure.Although solving for r A may be computationally demanding for large systems, it only needed to be done once and its result can be applied to process all classical snapshots collected from all possible states ρ to be learned.

B. Variance Estimation and Sample Complexity
Given the ensemble E σ|ρ of classical snapshots collected from measuring the unknown state ρ, we can use the reconstruction map M −1 to predict properties of ρ.For example, let O be a traceless Hermitian operator representing a physical observable.Its expectation value O ≡ Tr(Oρ) can be predicted via (13) where we have used the self-adjoint property of M −1 to transpose its action from σ to O. We can interpret ô(σ) ≡ Tr(M −1 [O]σ) as the single-shot estimation of the observable (based on a particular classical snapshot σ), such that O = Eσ∈E σ|ρ ô(σ).
The variance of the single-shot estimation is defined as Var ô ≡ Eσ∈E σ|ρ ô(σ) 2 − (E σ∈E σ|ρ ô(σ)) 2 , which can be bounded by (the first term in Var ô) The bound O E σ|ρ can be considered as a generalized ρdependent notion of the (squared) shadow norm [8] of an operator O (whereas the shadow norm originally defined in Ref. [8] further maximizes over all possible underlying states ρ to remove the dependence on ρ).Assuming E σ is locally scrambled, following the same approach of inserting and averaging local-basis transformations as in Eq. ( 6), the bound in Eq. ( 14) becomes where g, h are group elements in the S N 3 (product of 3fold permutation groups over N qudits).Wg g,h is the Weingarten function of permutations g and h, which is equivalent to traditional Weingarten function as Wg g,h = Wg(gh −1 , d), where d is the local Hilbert dimension of qudit.O 2 ρ,g is a generalized operator norm for O, which is defined as where χ g is the representation of the S N 3 permutation g in the 3-fold Hilbert space.W (3) Eσ,h is the 3rd entanglement feature of the ensemble E σ , defined as Note that the 2nd entanglement feature previously defined in Eq. ( 8) can be consistently cast into the form of Eq. ( 17) in terms of permutation operators (see Ref. [27]).
In practice, the expectation value O is always estimated based on a finite collection of the snapshot states.Let M be the number of samples of ρ measured in the data acquisition stage (each sample results in a snapshot state σk ).The finite average estimation ō = 1 M M k=1 ô(σ k ) will fluctuate around the true expectation value O with a variance that scales as (Var ô)/M .By the Chebyshev inequality, the probability for ō to deviate from O by more than amount is bounded by Therefore, to control the failure probability within a threshold δ, i.e.Pr(|ō A larger (smaller) shadow norm O 2 E σ|ρ indicates that more (less) samples are needed.
However, the ρ-dependent shadow norm O 2 E σ|ρ is generally complicated to evaluate.If we are not interested in the shadow norm for a specific state ρ, but rather the expectation of the shadow norm over an ensemble of states {V ρV † } that are similar to ρ by local basis transformations V ∈ U(d) N , we can actually define a ρ-independent shadow norm by averaging over V , The expected shadow norm can be expressed purely in terms of the entanglement features of E σ and E O (see Appendix B for derivation), Eσ,A∩B∩C W (2) where the coefficient v A,B,C,D is given by In conclusion, given a traceless Hermitian operator O, its expected shadow norm O 2 Eσ provides a typical lower bound for the number of samples needed in order to control the error of the prediction ō given by the classical shadow tomography within the probability bound Pr(|ō− O | ≥ ) ≤ δ.Here we have only analyzed the sample complexity for a single linear observable.For the analysis of multiple and/or non-linear observables, we refer to the original paper of Ref. [8].Their result applies to our case simply by replacing the shadow norm with our version.

C. A Toy Example of Two-Qudit System
To demonstrate our framework and to gain some analytical intuition, we present a toy example to compute the reconstruction map in a two-qudit (N = 2) system.We assume that the two-qudit system always evolves under a locally scrambled quantum dynamics, which can be modeled (for example) by a finite-time Brownian evolution [38] driven by random Hamiltonians.Every classical snapshot σU,b = U † |b b|U is generated by the reversed evolution from the product state |b b|.In the long-time limit (Fig. 2(a)), the entanglement feature W (2) , 1) follows from that of Page states, where the subregion basis are arranged in the order of {}, {1}, {2}, {1, 2}.This is because the evolution of entanglement feature under any locally scrambled quantum dynamics always converges to the Page state, regardless of the initial state, as proven in Ref. [14].In the short-time limit (Fig. 2(b)), σ remains as a product state, therefore the entanglement entropy vanishes for all regions, which translates to W Eσ = (1, w, w, 1), (24) with w varies between 2d d 2 +1 (the long-time limit) and 1 (the short-time limit).The physical meaning of w is the average single-qudit purity in the snapshot state σ.
Given W Eσ in Eq. ( 24), Eq. ( 12) reads (25) By solving this linear equation, the reconstruction coefficient r A can be obtained The behavior of r A as a function of w is shown in Fig. 2(c), which continuously interpolates the two limits.In the short-time limit, w = 1 and Eq. ( 26) reduces to matching the result of on-site 2-design circuits [31,32].In the long-time limit, w = 2d d 2 +1 and Eq. ( 26) reduces to r A = (−1, 0, 0, (d 2 + 1)/d 2 ), corresponding to the reconstruction map matching the result of global 2-design circuits [31,32].The general result in Eq. ( 26) provides the reconstruction map that interpolates these two limits, which allows us to perform classical shadow tomography for intermediate unitary channels that are neither on-site nor global 2design.
To investigate the sample complexity of the tomography scheme in the two-qudit system, we consider a traceless Hermitian operator O (i.e.Tr O = 0) and define two parameters k 1 and k 2 to parameterize the purity: Then the entanglement feature of the observable O can be arranged as the following vector with the same choice of region basis as in Eq. ( 24).Given r, W E O and W (2) Eσ , we have all the information needed to calculate the shadow norm, according to Eq. ( 21) where The operator locality crucially affects k tot .Consider modeling a local operator O loc by a random operator drawn from the Gaussian unitary ensemble (GUE) and acting on the first qudit only, we have hence k tot = d.On the other hand, for a global operator O glb modeled by a global GUE random operator acting on both qudits simultaneously, we have hence . In these two cases, the shadow norm in Eq. ( 31) becomes Their dependence in w is plotted in Fig. 2(d).In the short-time limit (w = 1), O loc 2 Eσ , meaning that the shallow circuit is more efficient in predicting local observables.In the long-time limit (w = 2d such that there is no difference in predicting both local and global observables in terms of the sample efficiency, because all operators are equally scrambled in this limit.

D. Additional Remarks on Computational
Methods and Future Directions Efficient numerical methods have been developed [20,39] to calculate the evolution of entanglement feature Eσ under locally scrambled quantum dynamics by solving the corresponding entanglement dynamics equation (without simulating the quantum dynamics using brute force).However, we will leave this approach for future exploration.In this work, we will compute the entanglement feature beforehand based on the definition Eq. ( 8), by direct sampling from the prior snapshot ensemble E σ .For experimentally generated random unitaries whose distribution is a priori unknown, it is also possible to estimate the entanglement feature efficiently from Rényi entropy measurements [40][41][42] following the definition Eq. ( 8).
As shown in Ref. [39], for one-dimensional quantum systems, the entanglement feature vector W Eσ with the fact that f A,B,C is factorizable to every qudit, one can develop efficient MPS-based numerical approach to find the solution of r A (also as a MPS).However, we will defer the development of this approach to future work.In the following numerical demonstrations, we will directly solve Eq. ( 12) for small systems as a proof of concept.
The MPS representations for r, W Eσ and W E O also enables us to calculate the shadow norm O 2 Eσ efficiently by a four-way MPS contraction.The ability to compute the shadow norm efficiently will be particularly useful if we want to design optimal unitary channels to minimize the sample complexity for a given set of designated observables.It is possible to apply machine learning approaches (such as deep reinforcement learning) to perform the circuit structure optimization.Therefore, our construction provides the flexibility to allow the classical shadow tomography to adapt to designated observables, which has not been possible before.We will also leave this promising direction to future research.

III. NUMERICAL DEMONSTRATIONS
To demonstrate the effectiveness of our approach, we consider three types of unitary ensembles for the unitary channel in the data acquisition protocol, as illustrated in Fig. 3.

A. Classical Shadow Tomography with Shallow Random Unitary/Clifford Circuits
We first consider using random unitary circuits (RUCs) [15] for the unitary channel.As illustrated in Fig. 3(a), the unitary circuit consists of two-qubit local unitary gates arranged in the brick-wall pattern with a periodic boundary condition.Each gate in the circuit is independently drawn from the Haar random unitary ensemble.The depth L of the circuit can be adjusted.Obviously, RUCs are locally scrambled, as any local-basis transformation (from both left and right) can be absorbed by the Haar random unitary gates in the circuit.Therefore, we expect our reconstruction map to work perfectly in this case for any choice of the circuit depth L. GH , denoted by GH).The inset shows the variance Var F of the predicted fidelity as a function of system size N .In both subfigures, the sample size is 5000.Error bar indicates 3-standard-deviation estimated by the bootstrap method.Points are split horizontally to avoid the overlap of markers.
For illustration purpose, we start with a Greenberger-Horne-Zeilinger (GHZ) state ρ = |Ψ Ψ|, where Eσ,C to determine the reconstruction map M −1 .This calculation is done for once and stored in the classical memory for future reference.In our numerical simulation of the data acquisition process, we sample the RUC, apply it to the GHZ state |Ψ , and perform the computational basis measurement.We generate a collection of classical snapshots E σ|ρ = {σ} of size M by repeated measurements.We then estimate the fidelity F of the reconstructed state by Following the philosophy of classical shadow tomography, one should view Eq. ( 35) as a prediction task.If the shadow tomography is successful, then the estimated fidelity should converge to F = 1.This estimation can be achieved accurately by a few measurements, even though the full density matrix estimation avg σ∈E σ|ρ M −1 [σ] may still have large fluctuations.In addition, when the reconstruction is biased, for example the experimental channel M doesn't match the theoretical assumption of the unitary ensemble, then the fidelity estimation will deviate from one.Fig. 4 (a) shows that the entanglementfeature-based reconstruction map M −1 EF indeed gives unbiased estimation of fidelity F for different circuit depths L and for different system sizes N .Furthermore, when the GHZ state is prepared with Z errors, our method can give the correct fidelity estimation that decreases linearly with the probability of Z error, which is challenging for the current state-of-art machine-learning quantum state tomography method [8,43] (see Appendix D for more discussions).To compare with the existing classical shadow tomography method [8], we consider the reconstruction maps ) assumes the unitary ensemble is global (or on-site local) Haar random.They can be viewed as special limits where circuit depth L tends to infinity and zero respectively.Although they can also achieve an unbiased estimation of quantum fidelity, the tomography efficiency differs.In Fig. 4(b), the error bar shows how the (3-times) standard deviation of the estimated fidelity scales with the number of qubits N at 5000 sample size.As we can see (both from the error bar and from the inset of Fig. 4(b)), the variance of (on-site) local Haar estimation increases drastically as N increases, which implies an increasingly high sample complexity for large systems.
In the other limit, the variance of global Haar estimation is independent of system size, achieving the optimal sample complexity as advocated in Ref. [8].However, to realize the global Haar ensemble, the circuit depth needs to be at least of order O(N ), which is quite demanding for quantum devices.If we approximate the global Haar ensemble with finite-depth circuits and use the reconstruction map M −1 GH on data collected from finitedepth circuit measurements, [9] this will yield systematically biased predictions for physical quantities when the circuit is not deep enough.In Fig. 5(a), we show that the biased prediction tends to over-estimate the fidelity, leading to the unphysical result of F > 1 (the correct behavior is F = 1).This occurs because, when the measurement channel M in data acquisition protocol disagrees with the reconstruction channel M −1 in classical post-processing protocol, the reconstructed density matrix 1 M σ∈E σ|ρ M −1 [σ] may not be positive-definite (see Appendix E for detailed discussions), resulting in the unphysical fidelity estimation.This bias gets worse for larger system size.In Fig. 5(b), we also show the estimation of For GHZ state, the correct behavior is P 0 = 0.5, and we still see significant bias when applying M −1 GH for shallow circuits.However, with the entanglement-feature-based reconstruction map M −1 EF , as demonstrated in Fig. 4 (b), we are able to achieve an unbiased fidelity estimation with a 3-layer shallow circuit, approaching similar variance level (i.e.similar sample efficiency) as global Haar ensemble while keeping a low circuit complexity.This clearly demonstrates the advantage of our approach.

B. Scaling of Variance and Tomography Complexity
The above discussion motivate us to define the tomography complexity as C = (L + 1)M , where L is the circuit complexity (the number of layers in the quantum circuit), and M is the sample complexity (the number of sample needed).M will be proportional to the single-shot variance Var ô.Suppose applying each layer of quantum gates and performing measurements both take a unit of time on the quantum device, then C is roughly the total amount of time needed to collect the classical shadow from M copies of the quantum state, which characterizes the complexity of the data acquisition protocol.This notion of complexity is consistent with the Quantum Algorithmic Measurement (QUALM) complexity introduced in Ref. [44] (with LM and M being their gate and query complexities respectively).In the following, we will investigate the scaling of single-shot variance Var ô as a function of circuit depth L and system size N for both low-rank operators (such as fidelity) and full-rank oper-ators (such as Pauli operators), and show how the tomography complexity C can guide us to find the optimal circuit depth L * .
with different support k.The dots are experimental results from simulation, and the lines are theoretical prediction using operator shallow norm by Eq. (21).They match perfectly.
For low-rank operators, we will focus on quantum fidelity, which is important in many quantum information applications, such as (variational) state preparation.We will define the zero-depth limit (L → 0) of the RUC to be a single layer of on-site Haar-random gates because even if there is no two-qubit gate in the "zero-depth" circuit, we still assume that the unitary ensemble is locally scrambled such that on-site scrambling unitaries continue to persist.In this limit, the single-shot variance Var F of fidelity estimation scales exponentially with the number of qubits N .On the other hand, in the deep circuit limit (L → ∞), RUCs will approach the global Haar unitary ensemble, and the variance Var F will be independent of system size.We are interested to inves-tigate how Var F behaves in the shallow circuit regime.In Fig. 6(a), we calculated Var F numerically using the bootstrap method for the 9-qubit GHZ state.It shows the variance Var F will decrease quickly in the shallow circuit regime.Interestingly, we found that an empirical formula Var F ∝ exp c N (L+1) α fits the data well in the shallow circuit regime, with α = 0.7 ± 0.1.In Fig. 6(b), we plot Var F as a function of N (L+1) α for different fixed circuit depth L. We find curves with different choices of circuit depth L all collapse together with the same coefficient c = 0.47 ± 0.08.
The physical intuition behind the empirical formula has to do with the operator growth in RUCs.If the quantum circuit is very shallow, then the computational basis measurement will only probe local information in the original basis.If the circuit becomes deeper, computational basis measurement can probe information in larger regions in the original basis, because the measurement operator has grown under the (backward) circuit evolution.Suppose the size of the measurement operator grows in a power-law manner ∼ (L + 1) α [45] with respect to the depth L of the RUC, the relative size of the system will effectively shrink to N eff = N (L+1) α , such that Var F should scale universally with N eff , as proposed in the empirical formula.We might expect α = 1/2 (or α = 1), if the operator grew diffusively (or ballisticaly).However, the best fit of our numerical result seems to indicate an effective operator growth between the diffusive and ballistic limits.Due to the limited system size in this study, we are unable to determine whether our observation persists to the thermodynamic limit.We will leave this intriguing scaling behavior for further investigation in the future.Nevertheless, for any α, the variance decreases faster than exponential with L in the shallow circuit regime, which already speaks for the advantage of applying shallow circuits in classical shadow tomography.
For full-rank operators, we mainly focus on consecutive strings of Pauli operators of the form where Z is the Pauli-Z operator, and I is the identity operator.We define the locality of the Pauli string operator by its length k.In the shallow circuit limit (L → 0), the variance of estimation for Z (k) scales Var Z (k) ∝ 4 k .So shallow circuit is only efficient for predicting the local observables, and becomes inefficient for non-local observables.In the deep circuit limit (L → ∞), as the unitary ensemble becomes globally Haar, there is no difference between local and non-local operators in this limit, and Var Z (k) ∝ 2 N .A simple comparison indicates: when k N/2, Var Z (k) will decrease with L, thus deep circuits will have lower sample complexity; when k N/2, Var Z (k) will increase with L, thus shallow circuits will have lower sample complexity.In Fig. 6(c), the dots shows the variance Var Z (k) as a function of circuit depth L for different support k.The trend agrees with our simple argument.For non-local operators, their variance will quickly decrease with the circuit depth L, while the variance for local operators will mildly increase with L. The behavior is theoretically described by how the operator shadow norm O 2 Eσ depends on both the circuit depth L and the operator locality k, which are separately encoded in the entanglement features of E σ and E O .We calculate the shadow norm Z (k) 2 Eσ based on the entanglement feature formalism using Eq. ( 21), and plot the result as lines in Fig. 6(c).The theoretical calculation agrees perfectly with the numerical results, which also indicates that the shadow norm bounds the single-shot variance (and hence the sample complexity) quite tightly.Based on the discussion in Sec.II B, the sample complexity M is proportional to the single-shot variance.Given the scaling of variance with the the circuit depth L, we can study the scaling of the sample complexity, as well as that of the tomography complexity C. For fidelity estimation task, C scales as For sufficiently large systems, the complexity C can have a non-trivial minimum at a finite circuit depth L * (αcN ) 1/α − 1.Our simulation result in Fig. 7(a) verifies such behavior.For small systems (N 5), random single-qubit measurements can efficiently benchmark the quantum state, so we do not need to use a finite-depth circuit for data acquisition.However, as the system size N gets larger, to maintain the prediction accuracy, singlequbit measurements will require more and more samples that have to grow exponentially with N .As shown in Fig. 6(a), applying a few layers of quantum circuits before the measurement can quickly bring down the singleshot variance (and hence reduce the sample complexity).However, we also do not want to go too far in the circuit depth, because that would increase the circuit complexity.Therefore, we expect an optimal circuit depth L * where the sample complexity and the circuit complexity reach a balance, and the total tomography complexity is minimized.This explains the advantage of shallow circuits in classical shadow tomography, as compared to the existing method that requires either on-site Haar random (L → 0) or global Haar random (L → ∞) unitaries.
We also study the tomography complexity C ∝ (L + 1) Var Z (k) for the full-rank observables, such as Pauli strings Z (k) , as shown in Fig. 6(b).In this case, what matters is the locality k of the full-rank operator (the length k of the Pauli string).For local operators (small k), on-site measurement will be most efficient.However, for non-local operators (large k), we observe that the tomography complexity is minimized at some finite circuit depth, again demonstrating the advantage of employing shallow circuits in classical shadow tomography.For different classes of physical observables, we can use the tomography complexity C as an objective function to guide the design of the optimal circuit structure.We will leave this promising direction for future investigation.

C. Classical Shadow Tomography with Fixed Quantum Circuits or Hamiltonian Dynamics
Compared to other classical shadow tomography protocols, our method can be applied to a large family of unitary ensembles that only requires the local scrambling condition, which is more appealing to near term quantum devices.One of the biggest challenges in realizing the original proposal of global Clifford classical shadow tomography is that the realization of global Clifford unitary requires ∼ N 2 many local Clifford gates (for a Nqubit system), which remains challenging for near term quantum devices.Even though global Clifford shadow tomography is very efficient in predicting non-local properties, it has not been implemented even for few-qubit systems as far as we know.
As we have seen in Sec.III A, the quantum entanglement created by the unitary channel plays an important role in reducing the sample complexity.With the quantum entanglement generated by the unitary channel, the classical shadow tomography is essentially an entanglement-assisted non-local measurement proto-col.To circumvent the difficulty of sampling (fullyscrambled) deep random unitaries but still harness the power of entanglement, we can use the idea of locally scrambled unitaries to design randomized measurement protocols that have sandwich structures like Fig. 3 (b), where random single-qubit Clifford gates (green boxes) are introduced at the beginning and the end of the unitary channel, and a fixed unitary circuit/quantum dynamics (the blue box) is sandwiched in between to provide entanglement generation.This sandwiched protocol satisfies the local scrambling condition rigorously, therefore the reconstruction map in Eq. ( 12) can be applied.
We will give two examples to demonstrate this sandwiched protocol.In the first example, as shown in Fig. 8(a), the fixed unitary is taken to be a fixed Clifford circuit consist of a sequence of controlled-NOT (CNOT) gates that generates entanglement.In the second example, as illustrated in Fig. 8(b), the fixed unitary is generated by the time evolution of a Rydberg atom Hamiltonian [46]: Both cases are ready to be implemented with near term quantum devices, such as trapped ion based quantum simulator or Rydberg based quantum simulator, given the fact that single qubit Clifford gates can be efficiently implemented, and randomized Pauli measurements have been demonstrated.[47] < l a t e x i t s h a 1 _ b a s e 6 4 = " For comparison, we use both our proposed sandwiched protocol and the standard randomized Pauli measurement to perform the classical shadow tomography on a GHZ state and to evaluate the fidelity of the reconstructed state.The results are shown in Fig. 9.As we can see, the variance of the fidelity estimation based on randomized Pauli measurements grows exponentially with increasing system size.As expected, the variance (or the sample complexity) reduces dramatically if one adds a fixed unitary generated by the CNOT circuit or the Rydberg Hamiltonian dynamics.More specifically, the variance of prediction is reduced by one order of magnitude even for a small system of N = 9 qubits.In both cases, the quantum entanglement generated by the locally scrambled quantum dynamics helps to improve the tomography efficiency.This result demonstrates the power of our protocol: it is both very flexible in terms of the design and very efficient in terms of the sample complexity.

D. Approximate Classical Shadow Tomography with Local Hamiltonian Dynamics
Requiring an unitary ensemble to be strictly locally scrambled could be restrictive.To this end, we would like to explore a broader class of unitary ensembles that are only approximately locally scrambled.In particular, we study unitary evolutions U = e −iHT generated by a local Hamiltonian H for finite amount of time T , as depicted in Fig. 3(b).Two classes of Hamiltonians are of particular interest.In the first class, we consider a model of random local Hamiltonians where each term H i,i+1 is independently sampled as 2local GUE random matrices.We dub this class the GUE2 ensemble to remind ourselves that the Hamiltonian is 2-local.The local Hamiltonian describes a disordered one-dimensional quantum system in general.Once every H i,i+1 term is sampled, we will use the Hamiltonian H to drive the time evolution without changing H during the evolution.The unitary GUE2 ensemble is only invariant under N , such that that its corresponding prior snapshot ensemble E σ will transform as σU,b = V , which does not satisfy the locally scrambling condition at the state level (i.e. the invariance under σ → V † σV ).However, we anticipate that under a sufficient amount of time evolution, the original local basis choice (of |b ) will be quickly randomized given the chaotic nature of the local Hamiltonian, such that the initial choice of V |b b|V † or |b b| will not make a substantial difference statistically, so the GUE2 ensemble will become approximately locally scrambled after some local thermalization (scrambling) time T Th .
Another more realistic class of random Hamiltonians to be considered is based on the quantum Ising model with both disordered coupling in space and random fields in time (40) where the local coupling is drawn from a uniform distribution, and the angle of magnetic field θ t ∼ Uni[0, 2π] is also random.We use this Hamiltonian to drive the quantum dynamics in discrete time steps.In each period of time, the magnetic field h will be applied along a different random direction θ t in the x-y plane for all spins uniformly.However, J ij will remain the same throughout the time evolution.The ensemble of unitary consists of which we name as the Disordered Quantum Ising Model or DQIM for short.The DQIM ensemble is friendly for quantum technology such as Rydberg-atom-based [28] or trapped-ion-based [29] quantum simulators.Similar construction of approximate unitary designs by Hamiltonian evolution with random quenches in time was also proposed in Ref. [40,41].We would like to investigate how well our framework applies to these two cases.Each approximately locally scrambled unitary ensembles E U leads to a prior snapshot ensemble that is also approximately locally scrambled.We propose to characterize how close the prior snapshot ensemble E σ is towards its local-basis invariant limit by the following frame potential Recall that in deriving Eq. ( 6) from Eq. ( 3), we only require the 2nd moment to match, i.e.
therefore we will be most interested in the 2nd frame potential F (2) Eσ .The frame potential F Eσ for any ensem-ble E σ is lower bounded by its locally-scrambled (U(d) Ntwirled) limit F (2) The fact that F (2) E LS σ is expressed purely in terms of the entanglement feature of E σ indicates that it is indeed free of any local-basis-dependent information.We can define the gap between the frame potential and its locallyscrambled limit as which turns out to match the trace-square-difference between the 2nd moment Eσ σ⊗2 and its local twirling EV,σ(V † σV ) ⊗2 .The frame potential gap ∆ (2) Eσ serves as an indicator of the validity of our approach, as it vanishes if E σ is locally scrambled such that our construction becomes exact.
Different unitary ensembles can lead to different frame potential gaps of E σ , which can be used to evaluate the quality of the unitary ensemble in obeying the local scrambling condition.In Fig. 10(a), we first focus on the frame potential gap ∆ (2) for the GUE2 ensemble.We find the gap will first decay exponentially and then saturate to a plateau at a very low level.The quickly vanishing gap implies that the GUE2 ensemble quickly becomes approximately locally scrambled as time evolves.We define the characteristic time associated with the exponential decay as T Th , i.e. ∆ (2) (T ) ∝ exp(−T /T Th ), which can be considered as the local scrambling (thermalization) time.Such an exponential decaying behavior in the early time regime is generally expected for non-critical quantum dynamics, which admit typical local energy scales (or time scales).In addition, the inset plot in Fig. 10(a) shows that T Th is independent of the system size N , as the slope remains the same for different N within error bar.Unlike global scrambling (global thermalization) which requires a long time (∼ N ) to achieve, achieving local scrambling only requires a fixed amount of time set by the ultra-violet energy scale that is independent of the system size N .This is another advantage of using locally scrambled quantum dynamics for classical shadow tomography in practice.
As for the DQIM ensemble, we fix the strength of the magnetic field at h = π/4, since this value produces the fastest on-site scrambling of a single qubit.According to the definition Eq.( 40), the only tuning parameter will be the mean value J of Ising couplings (which also sets their disorder strength).We calculate the frame potential gap ∆ (2) for DQIM ensemble with different J.We observe that the frame potential gap always decays exponentially in the early time regime, in correspondence to the local thermalization process.Then it will typically crossover to a plateau (i.e.saturate to a finite constant) in the late time.The early-time exponential decay region is larger for larger J, and we use the exponential decay regime to define the local scrambling time T Th .The result is shown in Fig. 10(b).The DQIM ensemble also approaches local scrambling as time evolves, although the final saturation plateau is not as low as the GUE2 ensemble.Larger Ising coupling J will lower the saturation plateau and shorter the local scrambling time T Th , as shown in Fig. 10(c).In addition, as shown in Fig. 10(d), we find the frame potential gap for a single realization quenched-disorder Hamiltonian does not deviate significantly from the ensemble mean value.This indicates that a single fixed disordered Ising chain under a randomly rotating uniform magnetic field is already good to generate an approximately locally scrambled ensemble that can be used for classical shadow tomography.
In practice, we use the two proposed approximated ensembles, (i) the GUE2 ensemble and (ii) a single instance of the DQIM ensemble, to perform the tomography task and predict the fidelity of a 7-qubit GHZ state.In Fig. 11(a), we see the predicted fidelity will be biased in the beginning (the biased fidelity can be greater than one, see Appendix E for more discussions), due to the fact that the quantum dynamics is still on its way to establish local scrambling.After around T ∼ 10T Th , the local scrambling condition is approximately established, then the entanglement-feature-based reconstruction map M −1 EF can provide a good reconstruction of the quantum state, as indicated by the convergence of the quantum fidelity to identity.In Appendix E, we further investigate the quantum fidelity of ρ projected to the physical space (to tame the unphysical F > 1 behavior) and show that the reconstruction is nearly perfect after around T ∼ 10T Th .In addition, Fig. 11(b) also shows the local scrambling time for GUE2 is independent of system size, which is consistent with the same behavior in Fig. 10(a).The results in Fig. 11 suggest that the entanglement-feature-based approach could be applicable for approximately locally scrambled unitary ensembles.The reconstruction bias vanishes as the frame potential gap decays.As long as the frame potential gap is low enough, the bias is also expected to be vanishingly small for all predictions.This significantly broadens the application of classical shadow tomography to a large class of quantum dynamics that can be achieved on NISQ devices.

IV. SUMMARY AND DISCUSSIONS
Our result can be further extended to more general measurement channels, which can involve ancilla qubits and partial measurements.The unitary channel can be noisy and the measurements can be weak.Under generalized measurements, the state ρ collapses to ρ → K a ρK † a /(Tr K a ρK † a ), where K a is the Kraus operators [48] associated with the measurement outcome a.We can define the measurement operator σa = K † a K a (with the standard normalization a σa = 1), which forms the prior snapshot ensemble E σ = {σ a |P (σ a ) = d −N }, and the posterior snapshot ensemble will be E σ|ρ = {σ a |P (σ a |ρ) = Tr σa ρ} correspondingly.As long as the generalized prior snapshot ensemble E σ is locally scrambled, i.e. ∀V ∈ U(d) N : P (σ) = P (V † σV ), our theoretical framework automatically applies, and all formulations in this work remain valid in the same form.This enables us to consider classical shadow tomography with very general data acquisition protocols.
12. Classical post-processing protocol to estimate the operator expectation value and shadow norm.
The entanglement feature formalism plays a central role in our approach.Fig. 12 summarizes the proposed classical post-processing protocol to predict the expectation value O of a physical observable O, together with its estimated variance (given by the shadow norm O 2 Eσ divided by the sample size M ).Given the circuit structure, the entanglement feature (EF) solver calculates the entanglement feature W Eσ of the prior snapshot ensemble as defined in Eq. ( 8) (the algorithm is developed in previous works [14,20,39]).The result is passed to the inverse channel solver to calculate the reconstruction coefficients r A by solving Eq. ( 12).With r A , we can predict any physical observable O by O = d N A r A o A where o A = Eσ∈E σ|ρ Tr Oσ A (the median-of-means trick [8] can be used here if multiple observables are to be predicted).For every sample of classical description of the Kraus operator K, a quantum circuit simulator (running on a classical computer) is needed to construct the (efficient representation of) measurement operator σ = K † K.The classical simulation could be efficient if the circuit is Clifford [49] (our formalism applies to random Clifford circuits with no problem).The part of computation in the dashed box of Fig. 12 should be repeated for every sample to evaluate the ensemble average.Finally, given the reconstruction coefficient r and the entanglement features W (2)

Eσ and W
(2) E O , the shadow norm O Eσ can be calculated, which provides an estimation for variance of the predicted observable.Although it takes some effort to process the entanglement feature data and to calculate the reconstruction coefficients, such computation (everything outside the dashed box in Fig. 12) only occurs once for a given circuit structure, therefore this computational effort is usually affordable (especially when efficient tensor-network approaches are developed and employed) [50].Ψ〉 FIG. 13.Illustration of holographic classical shadow tomography scheme, where the quantum circuit is arranged in a hierarchical structure (forming the hyperbolic bulk space).
The theoretical framework established in this work extends the classical shadow tomography to general quantum circuits, which opens up many possible applications.As one interesting example, we consider performing the classical shadow tomography in the "holographic bulk" by transforming the original state by a random Clifford circuit arranged in a hierarchical structure (see Fig. 13), similar to the multi-scale entanglement renormalization ansatz (MERA) network [51,52] or the holographic quantum error-correcting code [53].Following the idea of holographic duality, local measurements in the holographic bulk translate to measurements at all different scales on the holographic boundary.Therefore it is conceivable that the holographic classical shadow tomography could achieve high sample efficiency for operators of all scales, potentially evading the dichotomy between sample complexity and circuit complexity.
Another interesting application is to consider random circuits hybrid with random measurements inserted into the circuit at a fixed rate [54][55][56][57][58]. Conditioned on the intermediate measurement outcomes, the hybrid quantum circuit forms a quantum channel that transmits quantum information from end to end.Driven by the measurement rate, the final state can undergo an entanglement transition [19,59,60] (or the quantum channel can undergo a purification transition [61] equivalently).When the measurement rate is high, the quantum information in the initial state can be efficiently extracted by intermediate measurements (eavesdroppers), such that the channel has zero transmission capacity.When the measurement rate is lower than a critical threshold, the channel will have a finite capacity and can transmit quantum information in an error-correcting manner.[18,20,62] However, it is unclear how to take advantage of the selforganized quantum error correction in these hybrid quan-tum circuits.We anticipate that the classical shadow tomography with a flexible measurement scheme can help decoding the measurement-induced quantum errorcorrecting code.We will leave these interesting applications for future explorations.
Finally, the classical shadow tomography provides an efficient interface that converts quantum states to classical shadow data, which enables us to exploit the power of classical computation, especially data-driven and machine learning approaches, to advance our understanding of complex quantum systems and to solve challenging quantum many-body problems.As shown in Ref. [63], classical algorithms that learns from the classical shadow data has provable performance advantages over conventional numerical approaches that do not learn of data.Our work further adds to this promising direction by providing a more flexible classical shadow tomography scheme that works with very general measurement protocols (beyond on-site Pauli measurements), which could lead to potentially more efficient classical-shadow-based learning algorithms.

V. ACKNOWLEDGMENT
where O 2 g is inherited from Eq. ( 16) Compared with Eq. ( 16), we can see that the ensemble average E V ∈U(d) N in Eq. (B3) removes the ρ dependence by effectively replacing ρ with d −N 1 (the prior density matrix that defines the prior POVM E σ ).This explains the consistency in our notation that O 2 Eσ = E σ∈Eσ ô(σ) 2 follows from essentially the same definition as in Eq. ( 14).Note that the reconstruction map M −1 always commutes with the local basis transformation E O ,D Tr ((χ C ) A,B ⊗ 1)χ g .

(B5)
Here χ C denotes the swap operator supported in region C that acts between the first two copies of the Hilbert space, and (χ C ) A,B denotes the reduction of χ C in region A and B respectively in the first and the second copies of the Hilbert space, which results in (χ C ) A,B = χ A∩B∩C d |A∩B∩C|−|C| .The operator entanglement feature W 2 follows from the same definition given in Eq. (8).r A , r B are the reconstruction coefficients given by the solution of Eq. (12).Substitute Eq. (B5) to Eq. (B2), we can evaluate the summation of g, h given that g,h∈S N 3 Tr(χ A∩B∩C χ g )Wg g,h W (3) Eσ,A∩B∩C .The reduction of the 3rd entanglement feature to the 2nd entanglement feature is a consequence of the fact that ρ drops out from the tensor product in Eq. (B3), such that only 2-fold Hilbert space is required to define O 2 Eσ .Thus we finally arrive at the expression for the operator shadow norm purely in terms of the entanglement features of E σ and E O , where the coefficient v A,B,C,D is given by Appendix C: Efficient classical post-processing algorithm with tensor network method 1. Overview of tensor network based classical post-processing In the main text, we have derived the following protocol for predicting quantities of states with locally scrambled quantum circuits: It seems that the number of coefficient r A scales exponentially with system size, therefore not efficient.Surprisingly, with clever design and the help of tensor network, there indeed exists efficient tensor network method that can achieve efficient classical post-processing.Fig. 14 summarizes the classical post-processing workflow of predicting operator expectations with tensor network methods.
On the right side of workflow, given the circuit structure, the entanglement feature can be efficiently encoded as a tensor network using "EF solver".Then the tensor network representation of the entanglement feature is inputted into the M −1 solver, whose output is a tensor network representation of the reconstruction coefficient r A .The nice thing is that one only need to solve the tensor network representation for r A once.And this representation can be stored for future usage.
On the left of the workflow, we do experiments on quantum devices and calculate classical shadows σ.And one should notice that our formulation is general enough to include Clifford circuits that do not have group structure.And the classical shadows of those circuits can be calculated efficiently.Then we can combine the classical shadows σ and tensor network representation of r A to predict operator expectations O .
A detailed discussion on how to model both Tr (Oσ A ) and r A with tensor network is discussed in the following two subsections.In the main text, we have shown the reconstruction channel under local scrambling assumption can be written as

! σ #
At first sight, the exponential summation of subregion A seems to be troublesome.However, it can be circumvented by tensor network methods.Here, we will introduce a concrete algorithm.First of all, if the unitaries used in the classical shadow experiment are Clifford gates, then classical shadows σ are stabilizer states, and each of them can be efficiently stored with O(N 2 ) memory on a classical computer, where N is the system size.Proof: If the circuit is composed of Clifford gates, then classical shadow σ is a stabilizer state with stabilizer group generated by The reduced state σ A = D Ā[σ] restricted to region A is also a stabilizer state with stabilizer group S A ⊆ S defined by taking the elements of S which have zero support on Ā.This is obviously a subgroup of S since it is closed under multiplication and inversion.Without loss of generality, we can write where we drop the second trace since the internal bond dimension is one, and each tensor o (ai) i on site i with binary physical index (a i = 0 or 1) is  In the main text, we argued that the vector r A can be represented as a MPS.Here, we illustrate how to find such a MPS using variational method.First of all, we have shown that the reconstruction coefficient r A satisfies the linear equation: where W C is the second entanglement feature vector created by the unitary ensemble.In Ref. [64], the authors shows entanglement feature vector W C can be efficiently encoded using MPS representation.The physical intuition behind this efficient representation is that if one views W C |C , then this state will possess low entanglement Ref. [65].Therefore, it can be represented as a MPS with low bond dimension.In Fig. 15, the blue nodes indicate the MPS representation of W C .For translation invariant circuit structure (such as the brick-wall circuit), the time complexity to construct the MPS representation for W C is O(1) (independent of the system size).For general circuit structure, the time complexity is at most O(N ).
In Eq. (C8), the fusion coefficient f Note that this fusion factor f A,B,C can be factorized to each site as f A,B,C = i f ai,bi,ci where as the tensor subscripts a i , b i , c i = 0, 1 enumerates over boolean variables.Therefore, the fusion factor f A,B,C can be represented as the gray tensors in Fig. 15.
To find the MPS representation of vector r A , we use the variational method.We can write an MPS ansatz for r A and try to find the best parameters in the MPS by doing variational optimization.The same idea has been explored in machine learning tensor network optimization [66][67][68], and differential programming of tensor networks [69][70][71].With differential programming, we can find the best parameters in the MPS ansatz for r A by minimizing the L1 or L2 loss of the left-hand side tensor and right-hand side tensor of Fig. 15.With a fixed bond dimension, the algorithm complexity is O(N ).In addition, we can utilize the symmetry of the unitary ensemble to minimize the training parameters in the MPS ansatz.In practice, we find that r A can be represented as a MPS with a low bond dimension using the variational method.A detailed discussion of this new computational method will be in another paper.
In addition, we would like to point out that after the first draft of our paper, our new proposal has caught much attention from both theoretical and experimental sides.Especially, the formal solution of Eq. (C8) can be solved [72], It would be also interesting to directly encode Eq. (C11) with a MPS without the help of variational optimization.And we leave this to a future study.In the main text, we have seen when the measurement channel M in data acquisition and the reconstruction channel M −1 in classical post-processing mismatch, the reconstructed density matrix 1 M σ∈E σ|ρ M −1 [σ] may not be positive-definite.And it results in biased prediction of physical quantities.In Fig. 5 (a) and Fig. 11, we have seen the biased prediction of fidelity that is larger than one.In Fig. 17, we plot the eigenvalues of reconstructed density matrix of 7-qubit GHZ state using DQIM ensemble with T /T Th = 1.38.In the main text, we have seen the DQIM ensemble with one period of evolutional time or T /T Th = 1.38 is not sufficient to achieve the local scrambling assumption, such that there is mismatch between data acquisition channel M and reconstruction channel M −1 .We see the spectrum of density matrix contains some negative eigenvalues.In addition, the approximate shadow tomography based on locally scrambling Hamiltonian evolution, such as DQIM ensemble or GUE2 ensemble, is approximately unbiased when local scrambling is approximately satisfied or frame potential gap is vanishingly small.In Fig. 11, we have seen they all can give unbiased prediction of quantum fidelity when T ≥ 10T Th .We directly visualize the reconstructed density matrix using approximated DQIM ensemble in Fig. 19 and Fig. 20.As we see in Fig. 20, at T /T Th = 1.95, the locally scrambling assumption is not satisfied, and reconstructed density matrix is biased.In contrast, at T /T Th = 25.3 (Fig. 19), the reconstructed density matrix using a single instance of DQIM Hamiltonian is perfect, justifying the validity of our approach when the locally scrambling assumption is approximated satisfied.
Further more, for biased reconstruction, in order to make it positive definite, we can nonlinear project the reconstructed ρ to the convex set of physical states C = {ρ|ρ 1, Tr(ρ) = 1} by minimizing Π C (σ) = arg min ρ∈C Tr((ρ − σ) 2 ), (E1) which is the method mentioned in Ref. [30].If we have more prior knowledge about the quantum state, such as it is a pure state, then we can further impose those assumptions into the projection.Here, as an illustration, we utilize the knowledge that the target quantum state is pure, and we project the reconstructed ρ to a pure state ρ in C by choosing the eigenstate of ρ with the largest eigenvalue.As shown in Fig. 18, for the approximated ensembles, the GUE2 and DQIM are biased in the short time region, and projected state ρ has a fidelity less than one.And when locally scrambling assumption is approximately satisfied, the projected ρ will have fidelity that is approximately 0.99.With these checks: • unbiased prediction of physical quantities, see Fig. 11 • high fidelity of reconstructed density matrix projected back to physical space, see Fig. 18 we confirm the approximated shadow tomography can perform unbiased reconstruction.
and E O = {V † OV |V ∈ U(d) N } denotes the locally scrambled ensemble (or known as U(d) N -twirling) associated with the observable O in question.

FIG. 2 .
FIG. 2. Two-qudit unitary channel in (a) the long-time (Page state) limit and (b) the short-time (product state) limit.(c) Reconstruction coefficients rA and (d) the shadow norm O 2 Eσ v.s. the single-qudit purity w, for d = 2. w varying from 1 to 4/5 effectively models the circuit depth (or evolution time) growing from 0 to ∞.
Eσ admits efficient matrix product state (MPS) representation, even if snapshot states in E σ are volume-law entangled.Combining the MPS representation of W(2)

1 FIG. 3 .
FIG. 3. Classical shadow tomography with (a) finite-depth random unitary/Clifford circuits (of L layers), (b) a fixed unitary twirled by single qubit random Clifford gates, and (c) discrete-time Hamiltonian dynamics (of T steps).

FF
FIG. 4. (a) Fidelity estimation of GHZ state with RUC of different circuit depth L using entanglement-feature-based reconstruction M −1 EF (denoted by EF) over different number N of qubits.(b) Fidelity estimation of GHZ state using shallow RUC (3-layer, with M −1 EF , denoted by EF), random on-site (local Haar) gates (0-layer, with M −1 LH , denoted by LH) and global Haar unitary (∞-layer, with M −1GH , denoted by GH).The inset shows the variance Var F of the predicted fidelity as a function of system size N .In both subfigures, the sample size is 5000.Error bar indicates 3-standard-deviation estimated by the bootstrap method.Points are split horizontally to avoid the overlap of markers.

1 FIG. 5 .
FIG. 5. (a) Fidelity estimation of the reconstructed GHZ state with RUC of finite depth L. (b) Estimation of observable P0 = | Ψ|00 • • • 0 | 2 (the projection operator to the |00 • • • 0 state) on the reconstructed GHZ state with RUC of finite depth L. In both cases, the reconstruction uses the global Haar reconstruction map.The sample size is 5000.Error bar indicates 3-standard-deviation estimated by the bootstrap method.

1
FIG. 6.(a) Single-shot variance of estimated fidelity v.s.circuit depth L for a 9-qubits GHZ state.(b) Single-shot variance of estimated fidelity as a function of the effective system size N eff .The best fit for Var F ∝ exp c N (L+1) α gives c = 0.47 ± 0.08 and α = 0.72 ± 0.1.(c) Variance of full rank operator estimation on 9-qubit GHZ state.The full rank operators are Pauli-Z operators of the form:Z (k) = Z ⊗k I ⊗(N −k)with different support k.The dots are experimental results from simulation, and the lines are theoretical prediction using operator shallow norm by Eq.(21).They match perfectly.

1
FIG. 7. (a)Tomography complexity C ∝ (L + 1) Var F as a function of circuit depth L for the fidelity (low-rank observable) estimation task.Dots are tomography complexities for GHZ states of qubit number N by our numerical simulation.Solid curves are best fits based on the empirical formula Eq. (37).(b) Tomography complexity C ∝ (L + 1) Var Z (k) for the Pauli string (full-rank observable) estimation task.Dots are numerical simulation results.Solid curves are analytic calculations using the operator shadow norm formula Eq. (21).

1 FIG. 8 .
FIG. 8. Classical shadow tomography with a fixed unitary and onsite random Clifford gates.The fixed unitary can be generated with single quantum circuit, such as (a) CNOT gates, or fixed quantum dynamics, such as (b) Rydberg Hamiltonian dynamics.

F
FIG. 9. Fidelity estimation of GHZ state using randomized Pauli measurements (denoted as Pauli), classical shadow tomography with fixed CNOT gates as in Fig. 8 (a) and classical shadow tomography with fixed Rydberg Hamiltonian dynamics as in Fig. 8 (b).The inset shows the variance Var F of the predicted fidelity as a function of system size N .The sample size is 10000.Error bar indicates 3-standard-deviation estimated by the bootstrap method.Points are split horizontally to avoid the overlap of markers.

FIG. 10 . 2 )
FIG. 10.(a) Frame potential gap ∆ (2) Eσ of the GUE2 ensemble as a function of evolution time T .The inset shows the decay behavior for different system sizes N .(b) Frame potential gap of DQIM ensemble at different coupling strength J, in comparison with that of the GUE2 ensemble.(c) The dependence of the local scrambling time T Th on the coupling strength J.(d) Frame potential gap for single instances in the DQIM ensemble.Each instance corresponds to a light-green curve in the background.

FIG. 11 .
FIG. 11.Fidelity prediction by (a) different approximated locally scrambled ensembles, and (b) the GUE ensemble at different system sizes N .Sample number is 10000 and error bar indicates 3-standard deviation.

2 g = V † OV 2 g 2 g
acts on each qudit separately and hence does not interfere with the partial trace operation.This indicates that the norm O is invariant under the transformation V .This suggests us to define a locally scrambled ensemble E O (or known as U (d) N -twirling) associated with any given observable OE O ≡ {V † OV |V ∈ U(d) N }, in Eq. (B3) can be redefined as its ensemble average

FIG. 14 .
FIG.14.Classical post-processing protocol to estimate the operator expectation value and shadow norm.

•
Proposition I: Given O is a Pauli observable and σ is a stabilizer state, the vector o = {o A |o A = Tr (Oσ A )} has an efficient matrix product state (MPS) representation with internal bond dimension equals one, where σ A = D Ā[σ].And D A [•] is the depolarizing channel acting on region A.

)
It is obvious that the expectation Tr (Oσ A ) = 0 when supp(O) A. Moreover, the only scenario when Tr (Oσ A ) is non-zero is ±O ∈ S A .Therefore, we haveTr (Oσ A ) = 0, ±O / ∈ S A Tr(Oσ A ) = Tr(Oσ), ±O ∈ S A .(C5)From the above equation, it is clear that for any Pauli observable O, o A = Tr (Oσ A ) can be represented as a trivial MPS with bond dimension D = 1:

)
This concludes that even the vector o = {o A |o A = Tr (Oσ A )} contains exponentially many elements, it has an efficient MPS representation with bond dimension D = 1.This MPS representation can be easily constructed: given Pauli observable O, first calculate Tr(Oσ), then construct the MPS using Eq.(C7).Tr(Oσ) can computed in O(N 2 ) time, because O is a Pauli observable and σ is a stabilizer state.The remaining MPS tensors can be constructed in O(N ) time by traversing through the Pauli string.

3 .
FIG. 15.A cartoon illustration of variational solving MPS representation of rA.

DQIM J = 1 , 1 FIG. 17 .
FIG. 17. Eigenvalues of reconstructed density matrix ρ of 7-qubit GHZ state using mismatched channels.The unitary ensemble is single instance of DQIM ensemble with J = 1, and T /T Th = 1.38.Under this condition, the unitary ensemble is not locally scrambled.

FIG. 19 .
FIG.19.Unbiased reconstruction of a 7-qubit GHZ density matrix, using a single instance of Hamiltonian in the DQIM ensemble at T /T Th = 25.3 (after the local scrambling condition is achieved).

FIG. 20 .
FIG.20.Biased reconstruction of a 7-qubit GHZ density matrix, using a single instance of Hamiltonian in the DQIM ensemble at T /T Th = 1.95 (before the local scrambling condition is achieved).
Ω N ).|B| denotes the size (cardinality) of the region B. ρ B = (Tr B ρ) ⊗ (1 B /d | B| ) is the reduced density matrix of ρ in region B embedded back into the total Hilbert space.B denotes the complement of region B. Note that B and B do not need to be consecutive regions in the space, and they can intertwine with each other in general.Wg B,C = (d 2 − 1) −N (−1/d) |B C| is the Weingarten function of regions B and C, where B C denotes the subregions that belong to either B or C but not both.