Quantum entanglement recognition

Entanglement constitutes a key characteristic feature of quantum matter. Its detection, however, still faces major challenges. In this letter, we formulate a framework for probing entanglement based on machine learning techniques. The central element is a protocol for the generation of statistical images from quantum many-body states, with which we perform image classification by means of convolutional neural networks. We show that the resulting quantum entanglement recognition task is accurate and can be assigned a well-controlled error across a wide range of quantum states. We discuss the potential use of our scheme to quantify quantum entanglement in experiments. Our developed scheme provides a generally applicable strategy for quantum entanglement recognition in both equilibrium and nonequilibrium quantum matter.


I. INTRODUCTION
Entanglement has turned into a central concept across various branches of physics, ranging from quantum technological applications [1] to the characterization of quantum matter [2,3]. It has remained a key challenge, however, to quantify the entanglement content of a given quantum state, especially under realistic experimental conditions beyond the ground-state and pure-state paradigm. This challenge is rooted in the fundamental property that entanglement measures are nonlinear functions of the density matrix, whereas quantum measurements only yield information linear in it, according to the axiomatic foundations of quantum mechanics. For quantum systems involving a limited number of degrees of freedom, entanglement can still be quantified experimentally by reconstructing the full density matrix via tomography [4-8], by means of measurements on identical copies of quantum states [9-13], or through the statistics of randomized measurements [14-16].
By applying machine learning techniques, we show in this work that entanglement measures can be accurately extracted from limited information linear in the system's density matrix, thereby significantly reducing the necessary measurement resources. The key element of the proposed scheme is a protocol to generate a two-dimensional statistical image from a given quantum many-body state. Utilizing conventional image recognition based on convolutional neural networks, we then perform a quantitative classification of the entanglement entropy and logarithmic negativity, ranging over a broad class of quantum many-body wave functions from ground and excited states to nonequilibrium quantum evolution for pure and mixed states. We apply these techniques to various one-dimensional quantum spin-1/2 chains and find that this classification is remarkably accurate and can be assigned a well-defined error. Importantly, we find that the artificial neural networks (ANNs) for quantum entanglement recognition can generalize, which we study quantitatively across different kinds of spin chains. In particular, the obtained ANNs can, at a controlled error, even predict nonequilibrium dynamics, generated e.g. by a Lindblad master equation, despite not having seen such an evolution before. We further discuss how this quantum entanglement recognition might be utilized experimentally as an entanglement estimator upon training with simulated data.

II. IMAGE GENERATION FROM QUANTUM STATES
The key element of our quantum entanglement recognition scheme is the statistical image generating protocol, which we now introduce. Let ρ_0 denote the density matrix of a quantum (many-body) system. In the following, we study models of L spin-1/2 degrees of freedom for simplicity, although the protocol can be extended straightforwardly to any lattice model with finite local Hilbert spaces. We perform measurements on the quantum state with the string O = σ^z_1 ⊗ ··· ⊗ σ^z_L of Pauli matrices σ^z_l, yielding as outcomes spin configurations s = (s_1, ..., s_L) with s_l = ±1. Such measurements can be performed in quantum computing platforms such as trapped ions [17] and superconducting qubits [18-20], or in ultracold atomic systems via quantum gas microscopes [21]. In general, a single measurement basis is not sufficient for entanglement detection. We therefore generate a more detailed picture of ρ_0 by applying a set of W fixed but random local unitary transformations U_{i>1} = u_{i,1} ⊗ ··· ⊗ u_{i,L} with i = 2, ..., W, where each local unitary u_{i,l} on site l is drawn independently from the circular unitary ensemble (CUE) [22]. The full information about ρ_0 can be obtained by measuring along W = 2^L orthogonal directions, which would amount to full state tomography. Here, we explore whether a limited number of measurement axes is sufficient for entanglement quantification. For that purpose, we start with a simple, experimentally accessible way of generating independent measurement directions via local unitary rotations. Measuring the rotated ρ_0 → ρ_i = U_i ρ_0 U_i^† as before, we obtain the probabilities

p_ij = ⟨j| U_i ρ_0 U_i^† |j⟩,    (1)

where |j⟩ denotes the states of the measurement basis.

FIG. 1. (b) The network reads statistical images such as those in (a). After processing the image information, the network classifies the entanglement by assigning the images to different labels corresponding to binned intervals of the considered entanglement measure, shown here for a total number N_bin = 4 of bins. (c) Distribution of the error in entanglement quantification for weakly entangled ground states (WE1) of the ferromagnetic transverse-field Ising model with field strengths h > 1, over a range of 2 ≤ N_bin ≤ 10, for a spin chain with L = 10 sites. Error distances δ are measured in units of bins. A similar performance is obtained for larger system sizes, as shown in the inset for the case of L = 18.
To begin, we consider the case W = 11. In this way we obtain a two-dimensional representation of ρ_0 in terms of the probabilities p_ij, shown for three exemplary states of different entanglement classes in Fig. 1(a). While we have chosen the simplest form of local unitary transformations U_i for the image-generating protocol, note that this choice is not unique; the protocol can just as well comprise many-site unitary transformations respecting the symmetry of the entanglement measure (see Appendix E for details). For any protocol, a natural choice, though not strictly essential, is to take U_1 to be the D × D identity operator, with D = 2^L the Hilbert-space dimension, i.e. ρ_1 = ρ_0, such that p_1j = ⟨j|ρ_0|j⟩. We found that while this choice does not affect the performance of the network trained with the simplest protocol proposed here, it turns out to be crucial for the network to achieve a comparable performance when trained with some of the alternative protocols.
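As an illustration of the protocol, the following minimal sketch (our own toy code, not the paper's implementation) builds a statistical image for a two-qubit Bell state: each row holds the outcome probabilities p_ij in a basis rotated by a product of independent Haar-random (CUE) single-qubit unitaries, with the first row taken in the unrotated σ^z basis.

```python
import cmath, math, random

def haar_u2(rng):
    # 2x2 Haar-random (CUE) unitary via the standard Euler-angle parametrization
    ph, a, b = (rng.uniform(0.0, 2*math.pi) for _ in range(3))
    th = math.asin(math.sqrt(rng.random()))
    g = cmath.exp(1j*ph)
    return [[g*cmath.exp(1j*a)*math.cos(th),   g*cmath.exp(1j*b)*math.sin(th)],
            [-g*cmath.exp(-1j*b)*math.sin(th), g*cmath.exp(-1j*a)*math.cos(th)]]

def kron(A, B):
    # Kronecker product of two square matrices stored as nested lists
    n, m = len(A), len(B)
    return [[A[i//m][j//m]*B[i%m][j%m] for j in range(n*m)] for i in range(n*m)]

def matvec(U, psi):
    return [sum(U[i][j]*psi[j] for j in range(len(psi))) for i in range(len(U))]

def statistical_image(psi, W, seed=0):
    # Row i holds p_ij = |<j| U_i |psi>|^2; the first row uses U_1 = identity,
    # the remaining W-1 rows use fixed products of random CUE single-qubit rotations
    rng = random.Random(seed)
    L = round(math.log2(len(psi)))
    rows = [[abs(c)**2 for c in psi]]
    for _ in range(W - 1):
        U = [[1.0]]
        for _ in range(L):
            U = kron(U, haar_u2(rng))
        rows.append([abs(c)**2 for c in matvec(U, psi)])
    return rows

# image of a maximally entangled two-qubit Bell state for W = 5 measurement bases
bell = [1/math.sqrt(2), 0.0, 0.0, 1/math.sqrt(2)]
img = statistical_image(bell, W=5)
```

Each row is a normalized probability distribution over the 2^L spin configurations; stacking the W rows yields the W × 2^L image fed to the network.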

III. SUPERVISED LEARNING OF ENTANGLEMENT
In the following we outline how the introduced statistical image generation can be used to perform a quantitative entanglement classification task by means of a supervised learning procedure. For pure states a natural entanglement measure is the half-chain entanglement entropy

S = −Tr[ρ_< ln ρ_<],

with ρ_< = Tr_> ρ the reduced density matrix of the first half of the chain, obtained by tracing out the degrees of freedom of the remainder (denoted by >) from the full density matrix ρ. For mixed states we use instead the logarithmic negativity [23],

E_N = ln ||ρ^{T_<}||,

where ||ρ^{T_<}|| denotes the trace norm of ρ^{T_<} and ρ^{T_<} is the partial transpose of ρ on the degrees of freedom of the first half of the chain. For the desired quantification task we bin the range of values of the respective entanglement measure into N_bin equally spaced intervals. We fix an interval I_S = [0, S_max] for the entanglement entropy S, say, with a suitably chosen S_max, and decompose I_S = ∪_{n=1}^{N_bin} I_{S_n} into the intervals I_{S_n} = [(n−1)ΔS, nΔS) with n = 1, ..., N_bin = S_max/ΔS. Each of these bins, labeled by n, corresponds to a category to which we aim to assign the images. This binning is necessary when there are no training images in some range of entanglement values; consequently, the largest permissible value of N_bin is such that no interval I_{S_n} is empty. With this we can now perform entanglement classification as in conventional image recognition. We adapt a convolutional neural network to process the two-dimensional image of the quantum state as shown in Fig. 1(b). Two layers of feature maps, comprising 64 and 32 features respectively, are extracted from the images. This feature information is then flattened and fitted, across the images from the training library, to their respective N_bin labels. See Appendix A for more details on the network.
We quantify the classification accuracy on test images by the signed binning error distance δ = n − n_ANN, measuring the difference between the bin label n obtained from the exact entanglement content and the label n_ANN predicted by the ANN. A typical distribution of δ for 2 ≤ N_bin ≤ 10 is shown in Fig. 1(c) for a specific benchmark problem. Importantly, the entanglement classification exhibits a well-defined error, whose distribution is sharply peaked around δ = 0, implying no error, with appreciable further weight only at δ = ±1.
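The labeling step can be made concrete with a small sketch (our own toy two-qubit code; S_max and N_bin are free parameters): it computes the half-chain entanglement entropy S = −Tr[ρ_< ln ρ_<] from the reduced density matrix and assigns the corresponding bin label n.

```python
import math

def reduced_dm(psi):
    # rho_< = Tr_> |psi><psi| for a two-qubit pure state (amplitudes |00>,|01>,|10>,|11>)
    return [[sum(psi[2*a + c]*psi[2*b + c].conjugate() for c in range(2))
             for b in range(2)] for a in range(2)]

def entanglement_entropy(psi):
    # S = -Tr[rho_< ln rho_<] from the two eigenvalues of the 2x2 reduced density matrix
    rho = reduced_dm(psi)
    tr = (rho[0][0] + rho[1][1]).real
    det = (rho[0][0]*rho[1][1] - rho[0][1]*rho[1][0]).real
    disc = math.sqrt(max(tr*tr - 4*det, 0.0))
    lams = [(tr + disc)/2, (tr - disc)/2]
    return -sum(l*math.log(l) for l in lams if l > 1e-12)

def bin_label(S, S_max, N_bin):
    # n = 1, ..., N_bin with I_Sn = [(n-1)*dS, n*dS); S = S_max is put in the last bin
    dS = S_max / N_bin
    return min(int(S // dS) + 1, N_bin)
```

A product state then falls in bin n = 1 and a Bell state (S = ln 2 = S_max) in bin n = N_bin; the signed error distance is simply δ = bin_label(S_exact, ...) − n_ANN.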

IV. RESULTS
It is the key goal of the following analysis to explore the performance of the quantum entanglement recognition scheme for a large variety of quantum states.

A. Models
As benchmark systems we choose a set of paradigmatic one-dimensional quantum many-body models: the transverse-field Ising chain with either ferromagnetic (TFI+) or antiferromagnetic (TFI−) couplings,

H_{TFI±} = ∓ Σ_i σ^x_i σ^x_{i+1} + h Σ_i σ^z_i,

as well as the XX model,

H_XX = Σ_i (σ^x_i σ^x_{i+1} + σ^y_i σ^y_{i+1}).

Here, σ^{x,y,z}_i denote the Pauli matrices at site i = 1, ..., L and h is the magnetic field strength. While throughout this work we show numerical data for L = 10, we emphasize that the entanglement recognition does not depend significantly on L. This scalability with system size is exemplified in Fig. 1(c), where we also include data for L = 18. Although we focus on a particular set of models, we find that our results do not depend crucially on the model details, as discussed below, suggesting that our observations are generic and therefore applicable in a broad context.

B. Ground and excited states
We start by studying weakly entangled ground states of the considered spin chains. First, we explore the performance by testing on the same class of states, e.g., on ground states of TFI+ after training the ANN with ground states of the same model (see Appendix B for details). The network performs remarkably well in this case, as can be seen in Fig. 1(c), which shows a typical distribution of the binning error δ. Almost independently of N_bin, one finds a strongly peaked distribution with appreciable weight beyond δ = 0 only at δ = ±1. As the mean error practically vanishes, the performance is effectively captured by the standard deviation, which can therefore be used as a well-defined error quantifier. The fact that the network typically fails at most by assigning a state to a nearest-neighboring bin suggests that the dominant error originates from instances where the actual entanglement entropy lies close to the border between two bins. A particularly important consequence of Fig. 1(c) is that the performance of the network improves upon enlarging N_bin. This can be quantified by δS = S_n − S_n^ANN = δ × ΔS, where S_n denotes the binned value of the computed entanglement entropy and S_n^ANN the one predicted by the ANN. The resulting standard deviation σ of δS is shown in Fig. 2(a) for all models considered, showing a clear improvement in prediction accuracy for larger N_bin.
As a next step we explore the capability of the ANN to generalize to unfamiliar data by testing the network with states from model classes different from those it was trained on. Remarkably, the distributions of δ exhibit the same structure as in Fig. 1(c). A summary over all training/test combinations is shown in Fig. 2(b,c) for N_bin = 10, containing both the mean μ and the standard deviation σ of δS. As expected, the network performs best when tested on the same class of states it was trained on. However, in some instances it can also generalize to different states, e.g. when testing on TFI+ states, albeit with a slightly poorer performance and in a non-reciprocal fashion. These observations suggest that while the ANN primarily learns model-specific features of the entanglement, it does in fact also learn some universal features of quantum entanglement, the extent of which appears to depend on the type of states used for training. We further point out that our scheme also applies to excited states with volume-law entanglement, where we again find strongly peaked distributions of δ. For one representative case, H_TFI+, we have included the corresponding σ in Fig. 2(a). We have also checked that the network performs equally well when the Ising chain is made nonintegrable and across different entanglement measures (see Appendix D for details).

C. States obtained from unitary dynamics
As a next step we explore quantum entanglement recognition in nonequilibrium dynamics. This is of particular importance for many quantum simulator platforms, where realizing time evolution is much more natural than, for instance, ground-state preparation. Here, we are exclusively interested in whether the ANN can be trained to quantify the entanglement dynamics associated with time evolution under the same Hamiltonian H. This effectively probes the ability of the network to differentiate between the entanglement present in different superpositions of the eigenstates of H. For the case of unitary dynamics, the training library consists of images evenly sampled from the statistical image time series of 1000 different initial randomly polarized states evolving under H. In this case there are essentially no gaps in the entanglement distribution of the training images, so that, unlike for ground and excited states, binning into intervals is not necessary. Consequently, we are no longer constrained to have the network perform a 'discrete classification' task. Instead, the network can be modified to perform a 'continuous classification' task by appending to the network in Fig. 1(b) a single output neuron that directly predicts the value of the entanglement measure S or E_N. See Appendix A for details on the network structure and Appendices B and C for details on the training image library. As a benchmark, we consider the evolution of randomly polarized states under H_TFI+. The resulting prediction for the entanglement entropy S^ANN(t) (green curve), based on the statistical image time series, is compared to the exact real-time evolution S(t) (black curve) in Fig. 3(a), for a specific instance (thin green/black lines) and averaged over 100 different initial randomly polarized states (solid lines), with a shaded region indicating the associated standard deviation of the error in the network prediction.
As one can see, the network is able to almost precisely predict the entanglement dynamics, achieving essentially vanishing error for S(t) under unitary evolution across the timescales probed.
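The qualitative setting can be illustrated on a toy two-qubit example (our own sketch, not the paper's L = 10 simulation): a polarized state evolving under an Ising-type coupling H = σ^x ⊗ σ^x builds up half-chain entanglement that oscillates between 0 and ln 2.

```python
import math

def half_chain_entropy(psi):
    # S = -Tr[rho_< ln rho_<] for a two-qubit pure state via the reduced density matrix
    r00 = abs(psi[0])**2 + abs(psi[1])**2
    r11 = abs(psi[2])**2 + abs(psi[3])**2
    r01 = psi[0]*psi[2].conjugate() + psi[1]*psi[3].conjugate()
    tr, disc = r00 + r11, math.sqrt(max((r00 - r11)**2 + 4*abs(r01)**2, 0.0))
    return -sum(l*math.log(l) for l in [(tr + disc)/2, (tr - disc)/2] if l > 1e-12)

def evolve(psi0, t):
    # H = sigma^x (x) sigma^x obeys H^2 = 1, so exp(-iHt) = cos(t)*1 - i*sin(t)*H
    Hpsi = [psi0[3], psi0[2], psi0[1], psi0[0]]
    return [math.cos(t)*a - 1j*math.sin(t)*b for a, b in zip(psi0, Hpsi)]

# entanglement growth of the polarized state |00> under the Ising coupling
S_t = [half_chain_entropy(evolve([1.0, 0.0, 0.0, 0.0], 0.1*k)) for k in range(16)]
```

Sampling statistical images of such a time series at many times, labeled by the exact S(t), is precisely the kind of training data used for the continuous-classification network.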

D. States obtained from dissipative dynamics
In an experimentally realistic context the major challenge is to quantify entanglement for mixed states. For this purpose we study exemplarily the logarithmic negativity E_N for non-unitary time evolution as described by a Lindblad master equation of the form

∂_t ρ = −i[H, ρ] + γ Σ_{i=1}^{L} (σ^x_i ρ σ^x_i − ρ),

with γ characterizing the dissipation strength. Analogous to the case of unitary dynamics, the training images are obtained from statistical image time series of 1000 different initial randomly polarized states evolving under the above Lindblad master equation. We find that the ANN is capable of accurately tracking the evolution of E_N for the studied dissipative dynamics. Analogous to the purely unitary case, we compare in Fig. 3(b) the network-predicted E_N^ANN(t) (blue curves) to the exact result E_N(t) (black curves) for the case of weak dissipative dynamics with γ = 0.01. For early times t < γ^{-1} = 100, the evolution is approximately unitary and E_N(t) increases linearly with time. At later times t > 100, however, E_N(t) gradually decays as dissipation sets in, leading to a reduction of entanglement. The network performs very well, although slightly worse than in the unitary case, as can be seen from the slightly broader range of fluctuations in the prediction error (shaded region) in Fig. 3(b) compared to Fig. 3(a). We characterize the average performance of the network by the time-dependent mean μ(t) [Fig. 4(a)] and standard deviation σ(t) [Fig. 4(b)] of the error in the entanglement prediction, δS^ANN(t) for the unitary case and δE_N^ANN(t) for the non-unitary case, taken over 100 different initial randomly polarized states.

FIG. 4. (a) Mean μ(t) and (b) standard deviation σ(t) of the prediction error in the respective entanglement measures, taken over the 100 initial states used to produce the plots in Fig. 3(a,b). For the dissipative case, color variation from blue (noiseless) to red indicates the level of noise introduced to the statistical image time series via sampling the wavefunction a finite number M of times to generate each row of the image.
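For a minimal, self-contained illustration of the target quantity E_N = ln ||ρ^{T_<}||, the sketch below (our own code; the Werner state is chosen only as a convenient two-qubit mixed-state example) computes the logarithmic negativity via an explicit partial transpose and a Jacobi eigenvalue iteration.

```python
import math

def werner(p):
    # rho = p |Phi+><Phi+| + (1-p)/4 * identity: a standard two-qubit mixed state
    rho = [[(1 - p)/4 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for i in (0, 3):
        for j in (0, 3):
            rho[i][j] += p/2
    return rho

def partial_transpose_first(rho):
    # (rho^{T_<})_{(a c),(b d)} = rho_{(b c),(a d)}: transpose on the first qubit
    pt = [[0.0]*4 for _ in range(4)]
    for a in range(2):
        for b in range(2):
            for c in range(2):
                for d in range(2):
                    pt[2*a + c][2*b + d] = rho[2*b + c][2*a + d]
    return pt

def sym_eigvals(A, tol=1e-12):
    # Jacobi eigenvalue iteration for a real symmetric matrix (nested lists)
    n = len(A)
    A = [row[:] for row in A]
    for _ in range(100):
        p, q, mx = 0, 1, 0.0
        for i in range(n):
            for j in range(i + 1, n):
                if abs(A[i][j]) > mx:
                    p, q, mx = i, j, abs(A[i][j])
        if mx < tol:
            break
        th = 0.5*math.atan2(2*A[p][q], A[q][q] - A[p][p])
        c, s = math.cos(th), math.sin(th)
        for k in range(n):
            Apk, Aqk = A[p][k], A[q][k]
            A[p][k], A[q][k] = c*Apk - s*Aqk, s*Apk + c*Aqk
        for k in range(n):
            Akp, Akq = A[k][p], A[k][q]
            A[k][p], A[k][q] = c*Akp - s*Akq, s*Akp + c*Akq
    return [A[i][i] for i in range(n)]

def log_negativity(rho):
    # E_N = ln ||rho^{T_<}||, with the trace norm as the sum of |eigenvalues|
    return math.log(sum(abs(l) for l in sym_eigvals(partial_transpose_first(rho))))
```

For the pure Bell state (p = 1) this yields E_N = ln 2, and for p ≤ 1/3 the partial transpose is positive and E_N = 0, matching the analytic result E_N = ln[(1+3p)/2] for p > 1/3.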

E. Generalizing to time evolution under perturbed Hamiltonian
An additional challenge in experiments is the presence of perturbations to the model Hamiltonian being emulated. For applications in experiments it is therefore desirable for an ANN trained for a specific Hamiltonian H to remain accurate in quantifying the entanglement of states evolved by a weakly perturbed Hamiltonian H̃. Specifically, we consider here an integrability-breaking perturbation, H̃_TFI+ = H_TFI+ + δ Σ_{<i,j>} σ^z_i σ^z_j, where δ characterizes the strength of the perturbation. The effect of the perturbation on the network's performance is shown in Fig. 5. We find that for a weak perturbation, δ = 0.01, the ANN trained on images of states evolving under H_TFI+ with weak dissipation γ = 0.01 is still able to accurately quantify the entanglement dynamics on average, but does so with a larger error at times t > γ^{-1} = 100, when dissipation begins to dominate. For a strong perturbation, δ = 0.2, the error grows from t > 0 onwards, and the network starts to underpredict the entanglement on average after t ≳ 2γ^{-1}. In the worst-case scenario, when in addition the actual dissipation is twice as large as that used for training, the ANN begins to overpredict on average after t ≳ 4γ^{-1}, by an amount that increases with time. Interestingly, a stronger dissipation does not further increase the standard deviation of the prediction error. This suggests that dissipation modifies the statistical images in a systematic way which, on the one hand, can be recognized by the ANN but, on the other, does not permit a simple extraction of the dissipation strength. Consequently, the ANN overpredicts (underpredicts) on average the late-time entanglement when the actual dissipation is weaker (stronger) than that used for training. A strong perturbation to the Hamiltonian, however, strongly deforms the statistical images and cripples the performance of the ANN.

V. SUMMARY AND DISCUSSION
In this work we have introduced quantum entanglement recognition based on machine learning techniques. Our protocol provides a controlled and unbiased way to extract quantum state information by applying projective measurements in W predetermined but randomly selected bases. By organizing this information into a statistical image, the entanglement quantification task is mapped onto the conventional image recognition task, for which convolutional neural networks have been optimized to achieve excellent performance. In the large-W limit, W ~ 2^L, these statistical images capture essentially all the information present in the density matrix, such that it would not be surprising if a trained convolutional neural network were able to successfully reconstruct the entanglement associated with a statistical image.
The central result of this work is to show that this expectation is indeed correct and, more remarkably, that only a small set of measurement bases, W ≪ 2^L, is required. This is the key feature going beyond previous works applying machine learning techniques to characterize quantum matter [24-28], which operate either in a single measurement basis [24,25,27,28] or with low-order correlation functions [26]. By applying the scheme to various classes of quantum many-body states, we show that the resulting quantum entanglement recognition can be assigned a well-defined error, making the scheme accurate and reliable. While the networks primarily learn model-specific features of the entanglement, as demonstrated in Fig. 2(b,c) by the inability of a network trained on one class of ground states to always generalize to another, we show that the networks in our scheme are at least capable of generalizing to weak perturbations, e.g., in the context of nonequilibrium dynamics. This implies that the networks can learn universal features of quantum entanglement and, from a theoretical standpoint, that such features are already well encoded in the state information obtained from W ≪ 2^L measurement bases. A generalizable network is particularly important for its potential use in experiments, where the microscopic details of the dynamics might not be known in full detail.

FIG. 5. Testing the network trained on statistical images of states evolving under H_TFI+ with dissipation parameter γ = 0.01 on states evolving under the perturbed version H̃_TFI+ = H_TFI+ + δ Σ_{<i,j>} σ^z_i σ^z_j, with weak (dark blue) and strong (green) perturbations δ at the same dissipation, as well as with stronger dissipation for the strongly perturbed case (yellow). (a) Logarithmic negativity E_N(t) (dashed lines) and the network prediction E_N^ANN(t) (solid lines), averaged over 100 initial states; (b) mean μ(t) and (c) standard deviation σ(t) of the prediction error. In (b,c), the blue reference curves from Fig. 4(a,b), i.e. for δ = 0 and γ = 0.01, are shown for comparison.
In the experimental context, we showed that the network is indeed able to perform accurately in the presence of weak perturbations. In addition, the probabilities p_ij from Eq. (1) can only be estimated from a finite number M of measured spin configurations, introducing noise into the statistical images. In Fig. 4(a,b) we have included results for such noisy images, where one can see that even with M = 3000 samples per rotation the dynamics can be reproduced well, with a mean |μ(t)| ≲ 0.1 and standard deviation σ(t) ≲ 0.2 of the prediction error δE_N^ANN(t), i.e. within 5-10% of the actual values of E_N(t). For the considered W = 5 rotations this implies a total number of 15000 measurements, which is much less than in a recent experiment probing Renyi entropies [16]. Let us emphasize, however, that we have not attempted to optimize the ANN for the entanglement recognition task at finite M, so that further improvements on this front are likely. This opens the door to the development of machine learning tools that directly enable experimental studies of quantum entanglement in systems beyond the few-body context. This is all the more important as many phases of strongly correlated quantum matter, such as quantum spin liquids, are characterized by their entanglement content.

Appendix A: Network structure

Based on the statistical images generated from quantum many-body states, we perform a conventional image recognition task using a convolutional neural network. In the case of ground and excited states, for which images are labeled by their half-chain entanglement entropy binned into N_bin intervals, we used a network consisting of two consecutive convolution layers followed by a dense layer with N_bin nodes corresponding to the different image labels. The first convolution layer scans a statistical image and constructs 64 different feature maps, which are in turn scanned by the second layer to construct 32 new feature maps.
This feature information is then flattened and fitted, across the images from the training library, to their respective N_bin labels of the dense layer via the 'categorical crossentropy' loss function.
In the case of states obtained from dynamics, for which images are labeled directly by their (continuous) half-chain entanglement entropy or logarithmic negativity, an additional dense layer with a single node is appended to the above network. The data is then fitted to this last node via the 'mean squared error' loss function. To increase the quantification precision, we increase the number of nodes in the preceding dense layer to N_bin = 50.
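A forward pass through such a network can be sketched in plain Python (our own toy code with small sizes and random, untrained weights; the actual network uses 64 and 32 feature maps and is trained as described above):

```python
import math, random

def conv_layer(maps_in, kernels):
    # 'valid' convolution of the input maps with each set of k x k kernels
    # (one kernel per input map, summed), followed by ReLU
    k = len(kernels[0][0])
    H, W = len(maps_in[0]), len(maps_in[0][0])
    out = []
    for ker_set in kernels:
        fm = [[0.0]*(W - k + 1) for _ in range(H - k + 1)]
        for m, ker in zip(maps_in, ker_set):
            for i in range(H - k + 1):
                for j in range(W - k + 1):
                    fm[i][j] += sum(ker[a][b]*m[i + a][j + b]
                                    for a in range(k) for b in range(k))
        out.append([[max(v, 0.0) for v in row] for row in fm])
    return out

def dense_softmax(x, Wmat, bias):
    # fully connected layer with softmax -> probabilities over the N_bin labels
    z = [sum(w*xi for w, xi in zip(row, x)) + b for row, b in zip(Wmat, bias)]
    mz = max(z)
    e = [math.exp(v - mz) for v in z]
    return [v/sum(e) for v in e]

rng = random.Random(1)
rk = lambda: [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
img = [[rng.random() for _ in range(11)] for _ in range(11)]  # a statistical image

f1 = conv_layer([img], [[rk()] for _ in range(4)])                 # 4 maps (64 in the paper)
f2 = conv_layer(f1, [[rk() for _ in range(4)] for _ in range(2)])  # 2 maps (32 in the paper)
flat = [v for fm in f2 for row in fm for v in row]                 # flatten
N_bin = 4
probs = dense_softmax(flat, [[rng.uniform(-1, 1) for _ in flat] for _ in range(N_bin)],
                      [0.0]*N_bin)
```

Training then adjusts the kernels and dense weights so that the softmax output matches the one-hot bin label (categorical crossentropy), or so that an appended single linear node matches S or E_N (mean squared error).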
Appendix B: Training image libraries

For unbiased training of the ANN, the training library is constructed by generating an approximately equal number of widely varied reference images in each of the N_bin categories, i.e. images from states with entanglement spread across I_{S_n} for each bin n.

Ground states:
We solve for the ground states at different magnetic field strengths h > 1 for each of the model classes to obtain the corresponding libraries of labeled images with a uniformly distributed entanglement entropy S ∈ I_S = [0, ln 2]. To ensure that the images carry no directional bias toward a specific orientation on the Bloch sphere, a uniform random rotation U = ⊗_{i=1}^{L} u is applied to each state before generating its statistical image. We separately train the ANN with the library of a particular model class and then test the entanglement classification on new states it has not seen before, either from the same or from a different model class. Each training library contains 50000 images, while an additional 10000 images are generated for each test library.

Excited states:
We focus on the TFI+ model (as well as a nonintegrable modification) for the study of excited states. We solve for the eigenstates and their corresponding entanglement entropies (and logarithmic negativities) for a range of near-critical field strengths h ∈ (1, 1.07). For each value of h, the eigenstates are grouped into N_bin bins ranging between their maximum and minimum entanglement entropies. One state is randomly selected from each bin for image generation, while the remaining eigenstates are discarded.

Unitary and dissipative dynamics:
We time evolve 1000 different initial randomly polarized states for 1000 time steps, under H_TFI+ for the unitary case and under the Lindblad master equation ∂_t ρ = −i[H_TFI+, ρ] + γ Σ_{i=1}^{L} (σ^x_i ρ σ^x_i − ρ) with γ = 0.01 for the dissipative case. For each initial state, the time-evolved states at 250 evenly spaced time steps are selected for image generation, giving a total of 250000 images on which the ANN is trained.

Appendix C: Numerical tools
The ground and excited states are obtained via exact diagonalization for spin chains of L = 10 sites. To speed up image generation, ITensor [29] was employed to solve for the ground state of H_TFI+ on a spin chain with L = 18 sites as well as to perform the local unitary rotations (inset of Fig. 1). In this case, smaller images with W = 5 had to be used due to limited computational memory. The time evolution of a given initial state is solved numerically with the mesolve function of QuTiP (Quantum Toolbox in Python) [30,31].

Appendix E: Alternative image-generation protocols

In addition to the (simplest) protocol U_i = u_{i,1} ⊗ ··· ⊗ u_{i,L} presented in the main text, we have considered statistical image generation via alternative protocols. To illustrate the independence of the results on the protocol, as long as it respects the symmetry of the (half-chain) entanglement measures, we show in Figs. 7-9 the network performance based on the following multi-qubit unitary transformations:

U^A_{i>1} = u_{i,(1,5)} ⊗ u_{i,2} ⊗ u_{i,3} ⊗ u_{i,4} ⊗ u_{i,6} ⊗ ··· ⊗ u_{i,L},    (E1)
U^B_{i>1} = u_{i,(1,4,5)} ⊗ u_{i,2} ⊗ u_{i,3} ⊗ u_{i,6} ⊗ ··· ⊗ u_{i,L},    (E2)
U^C_{i>1} = u_{i,(1,2,3,4,5)} ⊗ u_{i,(6,7,8,9,10)},    (E3)

with i = 2, ..., W, where u_{i,(j_1,j_2,...,j_n)} denotes an SU(2^n) unitary operator drawn independently from the circular unitary ensemble (CUE) [22] that acts on sites j_1, j_2, ..., j_n. The performance based on the different protocols is essentially identical when testing on the same class of ground states that the ANN was trained on, but varies in the extent to which the networks generalize to the other classes.
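Sampling the multi-site CUE unitaries u_{i,(j_1,...,j_n)} used in these protocols can be sketched as follows (our own code): a Haar-random unitary of any dimension is obtained by Gram-Schmidt orthonormalization of a complex Gaussian (Ginibre) matrix, i.e. a QR decomposition with the R_ii > 0 phase convention.

```python
import math, random

def cue(dim, rng):
    # Haar-random (CUE) unitary of any dimension: Gram-Schmidt orthonormalization
    # of a complex Gaussian (Ginibre) matrix, equivalent to QR with R_ii > 0
    cols = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]
            for _ in range(dim)]
    Q = []
    for v in cols:
        for q in Q:
            ov = sum(qc.conjugate()*vc for qc, vc in zip(q, v))
            v = [vc - ov*qc for vc, qc in zip(v, q)]
        nrm = math.sqrt(sum(abs(vc)**2 for vc in v))
        Q.append([vc/nrm for vc in v])
    return [[Q[j][i] for j in range(dim)] for i in range(dim)]  # columns = Q vectors

# a 4x4 sample, as needed for the two-site blocks such as u_{i,(1,5)}
u4 = cue(4, random.Random(0))
```

The full transformations U^{A,B,C} are then assembled as Kronecker products of such blocks with single-site CUE unitaries.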
Appendix F: Ability of the network to interpolate

We show here an interesting finding: the network is able to interpolate in the case of dissipative dynamics. Specifically, we trained a network on images drawn from states evolved under ∂_t ρ = −i[H_TFI+, ρ] + γ Σ_{i=1}^{L} (σ^x_i ρ σ^x_i − ρ) with γ = 0.01 and γ = 0.03, and tested it on states evolved under the same dissipative dynamics but with γ = 0.02. The network performance is shown in Fig. 10. In the absence of image noise, remarkably, the network accurately predicts the entanglement dynamics, with a performance comparable to testing with the same dissipation parameter it was trained on, shown for the case of γ = 0.01 in Fig. 4(a,b) of the main text. In the presence of sampling noise, the quantitative deviation is 2 to 3 times worse than in Fig. 4(a,b).

FIG. 10. Same as Fig. 4(a,b), but for the network trained on dissipative evolution with γ = 0.01 and γ = 0.03 and tested on γ = 0.02.