Quantum Compressed Sensing with Unsupervised Tensor Network Machine Learning

We propose tensor-network compressed sensing (TNCS) for compressing and communicating classical information via quantum states trained by unsupervised tensor network (TN) machine learning. The main task of TNCS is to reconstruct the full classical information as accurately as possible from a generative TN state, knowing as small a part of that information as possible. In applications to datasets of hand-written digits and fashion images, we train the generative TN (a matrix product state) on the training set and show that images in the testing set can be reconstructed from a small number of pixels. Related issues, including applications of TNCS to quantum encrypted communication, are discussed.


I. INTRODUCTION
An important perspective of quantum information is to transfer and process classical information by taking advantage of quantum physics. Taking the dense/super-dense coding protocol [1-7] as an example, the idea is to use a previously shared entangled state between a sender and the receiver(s) to send more classical information than is possible without entanglement as a resource. Another recently developed example is machine learning with tensor networks (TN) [8-14]. The aim is to employ TN (see the reviews in Refs. [15-20]) as a novel machine-learning model to learn, classify, and/or generate classical information in the quantum many-body Hilbert space.
In this work, we propose tensor-network compressed sensing (TNCS) by combining the ideas of quantum communication [21], compressed sensing [22] (see also the book in Ref. [23]), and unsupervised TN machine learning [10]. Let us consider the following scenario. Alice wants to send an image of, e.g., a signature, to Bob in a secure way. She sends only a small number $N_f$ of pixels to Bob through a classical channel that might be unsafe or even public. After Bob receives these pixels, he measures the quantum state previously provided by Alice, in a way determined by the received pixels, and then reconstructs the full information from the measured state. $N_f$ should be as small as possible, so that other parties without the provided quantum state cannot access the full information from these pixels.
In TNCS, Alice first trains the quantum state $|\Psi\rangle$ with the unsupervised TN machine learning algorithm. She then needs to determine which $N_f$ pixels to send to Bob so that $N_f$ can be as small as possible. We propose to choose the pixels in a specific order, dubbed entanglement ordering (EO), that is determined by the entanglement properties of $|\Psi\rangle$. With the $N_f$ pixels sent by Alice, Bob can recover the unsent pixels from $|\Psi\rangle$ through a generative process of TN machine learning.
We test TNCS on datasets of hand-written digits and fashion images (namely MNIST [24] and fashion-MNIST [25]). Three ways of choosing the sent pixels are tested: random ordering (RO), ordering by variance (VO), and EO. Comparing the average peak signal-to-noise ratio (PSNR) [see Eq. (8)] between the reconstructed images and the original ones from the testing set, EO exhibits the highest PSNR. We also discuss several related issues, including the potential quantum nature of TNCS, the necessary "ambiguous" correlations of the information, and security for quantum encrypted communication.

II. TENSOR NETWORK COMPRESSED SENSING
Suppose Alice wants to send an image of a hand-written digit "3". She first trains the quantum state $|\Psi\rangle$ as the generative model for the training set of "3" in MNIST. This can be done with the unsupervised TN machine learning algorithm [10], where $|\Psi\rangle$ takes the form of a matrix product state (MPS) [26]. In the training process, the tensors in the MPS are updated to minimize the negative log-likelihood, so that the whole state optimally captures the joint probability distribution of the training set. See Appendix A for more details.
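With $|\Psi\rangle$, Alice needs to decide which $N_f$ pixels to send to Bob, so that $N_f$ can be as small as possible, or so that the accuracy with which Bob reconstructs the image is as high as possible for a fixed $N_f$. This is similar to compressing an image of $N$ pixels to $N_f \ll N$ pixels with the assistance of $|\Psi\rangle$. In the language of probability, the problem can be restated as follows. Knowing the $N_f$ pixels $\{x^{[\mathrm{sent}]}\} = (x_{p_1}, x_{p_2}, \cdots)$ at the positions $(p_1, p_2, \cdots)$, the (conditional) probability distribution of the rest of the pixels $\{x^{[\mathrm{rest}]}\}$ satisfies
$$P(\{x^{[\mathrm{rest}]}\}) = \Big| \Big( \bigotimes_{n \in \mathrm{rest}} \langle s(x_n) | \Big) | \Phi \rangle \Big|^2, \tag{1}$$
where $|s(x_n)\rangle$ stands for the state associated with the $n$-th pixel $x_n$. The state $|\Phi\rangle$ is an $(N - N_f)$-qubit state, obtained by implementing measurements according to $\{x^{[\mathrm{sent}]}\}$ as
$$|\Phi\rangle = \frac{1}{C} \Big( \bigotimes_{n \in \mathrm{sent}} \langle s(x_n) | \Big) |\Psi\rangle, \tag{2}$$
with $C$ a constant that normalizes $|\Phi\rangle$. The task is to find the $N_f$ pixels $\{x^{[\mathrm{sent}]}\}$ that minimize the Shannon entropy
$$S = -\sum_{\{x^{[\mathrm{rest}]}\}} P(\{x^{[\mathrm{rest}]}\}) \ln P(\{x^{[\mathrm{rest}]}\}). \tag{3}$$
Aiming at this task, let us begin with a simpler question: which pixel should be sent if Alice sends only one pixel? This can be determined by the single-site entanglement entropy (SEE), which (say for the $n$-th qubit) is defined as
$$S^{\mathrm{ent}}_n = -\mathrm{Tr}\, \hat{\rho}_n \ln \hat{\rho}_n, \tag{4}$$
where
$$\hat{\rho}_n = \mathrm{Tr}_{/n} |\Psi\rangle\langle\Psi| \tag{5}$$
is the reduced density matrix of the $n$-th qubit, with $\mathrm{Tr}_{/n}$ the trace over all degrees of freedom except those of the $n$-th qubit.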
$S^{\mathrm{ent}}_n$ quantifies the information about the rest of the system that one gains from knowing the $n$-th qubit. Such a quantity has been used to safely reduce the number of pixels in supervised TN machine learning [27]. With $S^{\mathrm{ent}}_n$, Alice can choose the $\tilde{n}$-th pixel with $\tilde{n} = \arg\max_n S^{\mathrm{ent}}_n$, so that Bob gains as much information as possible from one sent pixel.
Based on the above scheme, we propose the following protocol to select $\{x^{[\mathrm{sent}]}\}$, dubbed the entanglement ordering measurement protocol (EOMP).
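As an illustration of the single-site entanglement entropy, the following minimal NumPy sketch computes the reduced density matrix and its entropy for a small state stored as a dense vector. The function names, the dense-vector representation (practical only for a few qubits, unlike the MPS used in this work), and the three-qubit test state are assumptions made for illustration.

```python
import numpy as np

def single_site_rdm(psi, n, num_qubits):
    """Reduced density matrix of qubit n for a pure state vector psi.

    Assumes psi has length 2**num_qubits, with qubit 0 as the most
    significant index (illustrative convention, not from the paper).
    """
    t = psi.reshape([2] * num_qubits)
    # Bring qubit n to the front; flatten all other qubits into one index.
    m = np.moveaxis(t, n, 0).reshape(2, -1)
    # Contracting the flattened index traces out every qubit except n.
    return m @ m.conj().T

def see(psi, n, num_qubits):
    """Single-site entanglement entropy -Tr(rho ln rho), in nats."""
    w = np.linalg.eigvalsh(single_site_rdm(psi, n, num_qubits))
    w = w[w > 1e-12]  # drop zero eigenvalues (0 ln 0 := 0)
    return float(-np.sum(w * np.log(w)))

# Toy state: qubits 0 and 1 in (|01> + |10>)/sqrt(2), qubit 2 in |0>.
psi = np.zeros(8)
psi[0b010] = psi[0b100] = 1.0 / np.sqrt(2.0)

entropies = [see(psi, n, 3) for n in range(3)]
best = int(np.argmax(entropies))  # qubit whose value is most informative
```

In this toy state the two entangled qubits each carry ln 2 of entropy while the product qubit carries none, so either entangled qubit is a sensible first pixel to send.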
1. With an $N$-qubit state $|\Psi(N)\rangle$, calculate the SEE $S^{\mathrm{ent}}_n$ of all qubits, and find the qubit that has the maximal SEE, i.e., $\tilde{n} = \arg\max_n S^{\mathrm{ent}}_n$.
2. From the reduced density matrix of the $\tilde{n}$-th qubit, $\hat{\rho}_{\tilde{n}}$, calculate its dominant eigenstate $|s_{\tilde{n}}\rangle$.
A simple example that helps to understand EOMP is provided in Appendix B. After Bob receives the $N_f$ pixels (and the corresponding positions in the image), he measures $|\Psi\rangle$ to obtain $|\Phi\rangle$ [Eq. (2)]. The unsent pixels are then generated from $|\Phi\rangle$. In Ref. [10], the authors employ a probabilistic process to generate the images, where the pixels are sampled to be black or white (0 or 1). The probability $P(x)$ of the $n$-th pixel being 0 or 1 is determined from $\hat{\rho}_n$ as $P(x) = \langle x|\hat{\rho}_n|x\rangle$, $x = 0, 1$ (note that $\sum_x P(x) = \mathrm{Tr}\,\hat{\rho}_n = 1$ due to the normalization of $|\Psi\rangle$). This means that one simply measures the qubit in the eigenbasis of the Pauli matrix $\hat{\sigma}^z$. Here, to generate gray-scale images (as the original images are gray-scale), we generate the pixels $\{x^{[\mathrm{rest}]}\}$ that have the maximal probability, i.e.,
$$\{x^{[\mathrm{rest}]}\} = \arg\max_{\{x\}} \prod_n \langle s(x_n)|\hat{\rho}_n|s(x_n)\rangle, \tag{6}$$
where the product over $n$ runs through all the pixels in $|\Phi\rangle$. This means that each measurement basis state $|s(x_n)\rangle$ is the dominant eigenstate of the corresponding single-site reduced density matrix of $|\Phi\rangle$ [Eq. (5)].
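The selection protocol can be sketched on a small dense state vector as follows. Only the first two steps (maximal SEE and dominant eigenstate) are taken from the text; the remaining steps, projecting the chosen qubit onto its dominant eigenstate, renormalizing, and repeating on the reduced state, are an assumption made here to complete the loop, as are all function names and the toy state. An actual implementation would operate on the MPS rather than on a dense vector.

```python
import numpy as np

def rdm(t, axis):
    """Single-site reduced density matrix of the qubit on `axis` of tensor t."""
    m = np.moveaxis(t, axis, 0).reshape(2, -1)
    return m @ m.conj().T

def entropy(rho):
    """Von Neumann entropy -Tr(rho ln rho), in nats."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log(w)))

def entanglement_order(psi, num_qubits, n_f):
    """Return the original indices of n_f qubits picked by maximal SEE.

    Assumed loop: pick the qubit with maximal SEE, project it onto the
    dominant eigenstate of its reduced density matrix, renormalize, repeat.
    """
    t = psi.reshape([2] * num_qubits)
    remaining = list(range(num_qubits))  # original labels of the axes of t
    order = []
    for _ in range(n_f):
        rhos = [rdm(t, a) for a in range(t.ndim)]
        a = int(np.argmax([entropy(r) for r in rhos]))
        _, vecs = np.linalg.eigh(rhos[a])   # eigenvalues in ascending order
        s = vecs[:, -1]                     # dominant eigenstate
        # Assumed step: apply <s| to the chosen qubit, then renormalize.
        t = np.tensordot(s.conj(), t, axes=([0], [a]))
        t = t / np.linalg.norm(t)
        order.append(remaining.pop(a))
    return order

# Toy 3-qubit state in which qubit 0 has the largest single-site entropy.
psi = np.zeros(8)
psi[0b000] = np.sqrt(0.60)
psi[0b110] = np.sqrt(0.25)
psi[0b101] = np.sqrt(0.15)
order = entanglement_order(psi, 3, 2)
```

Because the projection uses the dominant eigenstate rather than the pixel values of a particular image, the resulting order depends only on the state, which matches the statement that EO does not change with the image to be sent.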

III. RESULTS
We test TNCS on the MNIST and fashion-MNIST datasets. Each dataset has 10 classes of images, with 60000 training images and 10000 testing images in total. Each image contains 28 × 28 = 784 gray-scale pixels. Fig. 1 shows two original images and the images reconstructed from different numbers $N_f$ of known pixels, picked in three different orders (EO, RO, and VO). EO is determined by the EOMP with the state $|\Psi\rangle$. Note that the MPS is trained on the training images, while the reconstructed images are from the testing dataset. For VO, the pixels are picked according to the variance over the training images. The variance of the $n$-th pixel is calculated as
$$V_n = \frac{1}{K} \sum_{i=1}^{K} (x_{i,n} - \bar{x}_n)^2, \tag{7}$$
where the summation runs through the training images, $K$ is the number of training images, and $\bar{x}_n$ is the mean of the $n$-th pixel over the training images. In VO, the pixels with larger variance in the training images are considered to convey more information and are thus selected for sending. RO means picking the pixels of each image randomly. EO and VO are determined by the state and the training set, respectively, and do not change with the specific image to be sent.
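For comparison, variance ordering is straightforward to sketch in NumPy. The array layout (one flattened image per row) and the function name are assumptions for illustration, and the tiny dataset below is synthetic rather than taken from MNIST.

```python
import numpy as np

def variance_order(images, n_f):
    """Indices of the n_f pixels with the largest variance, in descending order.

    `images` is assumed to have shape (K, N): K training images, each
    flattened to N pixels.
    """
    v = images.var(axis=0)  # (1/K) * sum_i (x_{i,n} - mean_n)^2 per pixel
    # Stable sort so that ties are broken by pixel index.
    return np.argsort(-v, kind="stable")[:n_f]

# Synthetic example with K = 4 images of N = 3 pixels.
images = np.array([
    [0.0, 0.0, 0.0],
    [0.0, 1.0, 0.5],
    [0.0, 0.0, 0.5],
    [0.0, 1.0, 0.0],
])
picked = variance_order(images, 2)
```

Pixel 0 never changes and is therefore ranked last, while pixel 1, which flips between 0 and 1, is ranked first.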
For $N_f = 0$, $|\Psi\rangle$ simply generates the image (denoted as $\{x\}$) that has the maximal probability in $|\Psi\rangle$ [see Eq. (6)]. We call such an image, generated with no known pixels, the quantum average. One $|\Psi\rangle$ gives one unique quantum average (we assume that all $\hat{\rho}_n$ have non-degenerate eigenvalues). As shown in Fig. 2, the quantum average differs from the simple average $\bar{x}_n = \sum_i x_{i,n}/K$, since no correlations are considered in the simple average. Such correlations (and entanglement) enter the quantum average through the reduced density matrices. For $N_f > 0$, the more pixels are known, the closer the generated image is to the specific image to be reconstructed.
Fig. 1 shows how well an image can be reconstructed from $|\Psi\rangle$ with different numbers $N_f$ of pixels. Take the reconstruction of a dress image as an example (last three rows in Fig. 1). The original image is quite different from the quantum average ($N_f = 0$). With only $N_f \simeq 5$ known pixels picked by EO, the sleeves emerge. In contrast, the sleeves do not appear until 50 pixels are known if the pixels are picked randomly. For VO, the sleeves also emerge with 5 pixels but are badly shaped. The shape of the sleeves is reconstructed with $N_f \simeq 20$ in VO to a quality similar to that with $N_f \simeq 5$ in EO. The length of the sleeves is corrected with $N_f \simeq 50$ for EO and $N_f \simeq 110$ for RO and VO.
Fig. 3 shows the average PSNR for reconstructing all the images of "3" in MNIST and of dresses in fashion-MNIST. We take the virtual bond dimension (see Appendix A) as $\chi = 16$ and $\chi = 40$. The PSNR between two images $\{x\}$ and $\{y\}$ is defined as
$$\mathrm{PSNR}(\{x\}, \{y\}) = 10 \log_{10} \frac{784}{\sum_n (x_n - y_n)^2}. \tag{8}$$
Generally, the PSNR increases with $N_f$ and $\chi$, as expected. With the same $\chi$, EO achieves the highest PSNR among the three orderings. Both the SEE and the variance measure the amount of information carried by a pixel. Consider a pixel (labeled $n$) that is always black in all the training images: such a pixel obviously carries no information, and we have $S^{\mathrm{ent}}_n = V_n = 0$.
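A minimal sketch of the PSNR computation is given below. It assumes the standard definition with a unit peak value (pixels normalized to [0, 1]), which for 784 pixels reads 10 log10(784 / Σ_n (x_n − y_n)²); the `peak` parameter and the function name are illustrative additions.

```python
import numpy as np

def psnr(x, y, peak=1.0):
    """Peak signal-to-noise ratio (in dB) between two images of equal size.

    Assumes pixel values lie in [0, peak]; identical images give infinity.
    """
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    sq_err = np.sum((x - y) ** 2)
    if sq_err == 0.0:
        return float("inf")
    # peak^2 / MSE, with MSE = sq_err / (number of pixels)
    return float(10.0 * np.log10(peak ** 2 * x.size / sq_err))

original = np.zeros(784)
reconstructed = np.full(784, 0.1)   # uniform error of 0.1 on every pixel
value = psnr(original, reconstructed)
```

A uniform error of 0.1 per pixel corresponds to a mean squared error of 0.01 and hence to 20 dB, which gives a feel for the scale of the PSNR values.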
On the other hand, if a pixel changes dramatically across the training images, this pixel normally (though not necessarily) contains more information, and we will have large $S^{\mathrm{ent}}_n$ and $V_n$. One essential difference is that $S^{\mathrm{ent}}_n$ and $V_n$ are properties of the quantum state and of the classical data, respectively. In our case, the quantum quantity (EO) outperforms the classical one (VO). More discussion is given in Sec. IV.
Fig. 4 shows which pixels are selected in EO and VO for different values of $N_f$. To illustrate the order, a pixel is marked redder than the pixels that come after it in the order. Both EO and VO manage to capture the general shapes. In particular, a "checker-board" pattern appears in EO for relatively large $N_f$. This brings higher efficiency for the following reason. Since any two nearest-neighbor pixels should be strongly correlated, the corresponding qubits are expected to be in a highly entangled state. This means that one only needs to know the information of one qubit (pixel) to access the information of the other qubit (pixel). Taking the maximally entangled two-qubit state $|01\rangle + |10\rangle$ (up to normalization) as an example, if one knows that the first qubit is in the state $|0\rangle$ (or $|1\rangle$), meaning that the first pixel is $x_1 = 0$ (or $x_1 = 1$), one knows that the second qubit is in the state $|1\rangle$ (or $|0\rangle$), meaning that the second pixel is $x_2 = 1$ (or $x_2 = 0$). In this case, one only needs to send the information of one of the pixels, and the rest is obtained from the state.
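The two-qubit example can be checked numerically with a few lines of NumPy. The sketch assumes binary pixels encoded in the computational basis (pixel value 0 as the first basis state, 1 as the second); the function name is illustrative.

```python
import numpy as np

# Amplitudes psi[x1, x2] of the normalized state (|01> + |10>)/sqrt(2).
psi = np.zeros((2, 2))
psi[0, 1] = psi[1, 0] = 1.0 / np.sqrt(2.0)

def infer_second_pixel(psi, x1):
    """Project qubit 1 onto |x1> and return the most probable x2 with its probabilities."""
    phi = psi[x1]                      # unnormalized state of qubit 2 given x1
    phi = phi / np.linalg.norm(phi)    # normalization constant C
    probs = np.abs(phi) ** 2           # Born probabilities for x2 = 0, 1
    return int(np.argmax(probs)), probs

x2_given_0, probs_0 = infer_second_pixel(psi, 0)
x2_given_1, probs_1 = infer_second_pixel(psi, 1)
```

Knowing the first pixel fixes the second with probability one, so sending only one of the two pixels loses nothing in this idealized case.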

IV. DISCUSSIONS
In the following, we raise several relevant questions from different aspects of TNCS.
What differences does quantum physics make in TNCS? Though EO works better than VO, we do not claim a quantum advantage over classical information processing on the basis of these two specific methods. In fact, even RO, where the pixels are picked randomly, works well. This indicates that TNCS is a valid and efficient scheme for transferring and reconstructing images via quantum states.
Nevertheless, TNCS does provide a new path for investigating quantum advantages over classical information techniques. One question is how to define new (classical or quantum) quantities that further reduce $N_f$ and/or increase the PSNR. Possible choices include the (classical) correlation functions of the training data, the (quantum) correlation functions of $|\Psi\rangle$, and the $k$-site entanglement entropy with $k > 1$. The current results with the SEE and the variance show some signs of differences between quantum and classical means. The performance of both quantum and classical means needs to be pushed to its limits before quantum advantages can be discussed more clearly.
Ambiguous correlations of information in TNCS. Another immediate question about TNCS is how to determine the samples (denoted by A) for training the quantum state $|\Psi\rangle$, and how they relate to the information (denoted by B) to be transferred or reconstructed. Here we require that the reconstructed images do not belong to those used for training the state. This brings flexibility, meaning that Alice does not have to know exactly what is in B while preparing $|\Psi\rangle$ from A.
However, B has to be "ambiguously" correlated with A somehow. Let us consider an extreme situation, where all training samples in A are formed by uncorrelated random numbers. The trained state $|\Psi\rangle$ is an entangled state. However, such a state obviously cannot be used to effectively transfer a random image, as no correlations exist between the random image and the state.
In this work, we choose A and B as the training and testing images of the same dataset, respectively.For instance, A and B are handwritten digits "3" or images of dresses.Although the "microscopic information" (pixels) of all the images in A and B are different from each other, a human being can recognize the "macroscopic information" of each image as a digit "3" (or a dress) without any problem.This suggests that A and B must be correlated somehow.In other words, we here ensure the existence of the "ambiguous" correlations between A and B by the "macroscopic" information.How to characterize and quantify such ambiguous correlations is an important issue to TNCS, and will be also helpful to further understand and model the recognition process.
Security in encrypted communication and potential application. In the scenario depicted above, TNCS can be used to securely send information via quantum states. Since an unknown $|\Psi\rangle$ cannot be cloned, a recipient cannot copy the state for others. The information is secure under the assumption that those without $|\Psi\rangle$ cannot reconstruct the full information solely from $N_f \ll N$ pixels. Moreover, there are many ways to enhance the security and prevent the full information from being cracked from the known pixels.
For example, Alice can introduce a one-to-one (reversible) deterministic map $\{y^{[\mathrm{sent}]}\} = F(\{x^{[\mathrm{sent}]}\}; \{x^{[\mathrm{rest}]}\})$ to encrypt $\{x^{[\mathrm{sent}]}\}$. Without $F$, the pixels $\{x^{[\mathrm{sent}]}\}$, which are sent through a possibly unsafe channel, could contain critical information (see for example the second and fifth rows in Fig. A1 in Appendix C, which are almost meaningful images). The purpose of $F$ is to keep any meaningful information out of what is sent in place of $\{x^{[\mathrm{sent}]}\}$.
Such an $F$-encrypted TNCS contains the following steps: 1) Alice designs the function $F$ and trains $|\Psi\rangle$ on the images formed by $\{x^{[\mathrm{rest}]}\}$ and $\{y^{[\mathrm{sent}]}\}$; 2) Alice sends $|\Psi\rangle$ to Bob; 3) for the information to be sent, Alice sends $\{y^{[\mathrm{sent}]}\} = F(\{x^{[\mathrm{sent}]}\}; \{x^{[\mathrm{rest}]}\})$ and the function $F$ to Bob through classical channels that may not be safe; 4) Bob obtains $\{x^{[\mathrm{rest}]}\}$ from $|\Psi\rangle$ and $\{y^{[\mathrm{sent}]}\}$ (as in the standard TNCS), and obtains $\{x^{[\mathrm{sent}]}\}$ from $\{y^{[\mathrm{sent}]}\}$, $\{x^{[\mathrm{rest}]}\}$, and the inverse of $F$. Bob then has the full information $\{x^{[\mathrm{sent}]}\} + \{x^{[\mathrm{rest}]}\}$. The information is safe since those without $|\Psi\rangle$ cannot obtain $\{x^{[\mathrm{rest}]}\}$, and thus cannot obtain $\{x^{[\mathrm{sent}]}\}$ even if they have $F$ and $\{y^{[\mathrm{sent}]}\}$.
Since the information to be sent is not restricted to the data used to train $|\Psi\rangle$, Alice can provide copies of $|\Psi\rangle$ to multiple parties in advance, and send any piece of "ambiguously" correlated information to each party at any time afterwards. Different pieces of information can be sent via copies of the same state.
Meanwhile, Alice does not allow other parties to access the coefficients of $|\Psi\rangle$, which guarantees that she is the only provider of the state. One potential risk is that Alice provides too many copies of $|\Psi\rangle$ to others, with which the coefficients of $|\Psi\rangle$ could be cracked by, e.g., quantum state tomography [28]. In our case, this risk is low since $N$ is large, and it can be easily controlled through the number of copies provided to other parties. Note that schemes can be designed to avoid such a risk; see an example in Appendix D.
With TNCS, the "microscopic" information cannot be exactly reconstructed. One possible application is therefore the transfer of signatures, where the exact "microscopic" information is not necessarily needed. However, it would be interesting to develop a modified version of TNCS that aims at reconstructing the full information exactly. One may use strings of 0's and 1's with a much smaller length ($N \ll 784$) for training and reconstruction. The images in A and B may not contain any "macroscopic" information (they may not be, e.g., meaningful images), but the "ambiguous" correlations are still necessary. How to construct the training dataset for sending/reconstructing a certain kind of information, and what efficiency, accuracy, and security can be achieved, are open questions.
Other open issues about TNCS. The TNCS proposed in this work can be further improved in several respects. Regarding the TN model, the MPS is the simplest TN and is a good choice for 1D data such as time series. For images, which are in fact 2D data, MPS is still usable, as shown in, e.g., Refs. [8,10,27]. Tree TN and MERA (originally proposed in Refs. [13,29] for physical problems) have been suggested as more suitable TNs for learning 2D images [9,14,30]. Other TNs such as PEPS [15,31] are much less investigated in the literature.
One may also consider developing different ways of generating the information. The way we propose in this work can be easily carried out in classical simulations, but is surely more challenging to implement in experiments or quantum computations. The probabilistic way proposed in Ref. [10] is more feasible in experiments. As TNCS can be utilized as a classical or quantum method, one may choose (or develop) the generating scheme that serves the specific purpose.
by the gradient method as $A^{[n]} \leftarrow A^{[n]} - \tau\, \partial f / \partial A^{[n]}$, with $\tau$ the gradient step; see Ref. [8] or [10] for more details. After convergence, $|\Psi\rangle$ gives the joint probability of the pixels. The probability for any image $\{x\}$ in $|\Psi\rangle$ is given as

FIG. 1. Examples of original and generated images in MNIST and fashion-MNIST in the entanglement order (EO), random order (RO), and variance order (VO). The number of known features $N_f$ varies from 0 to 170, while the total number of features in an image is 784. We take the bond dimension of the generative MPS as $\chi = 40$.

FIG. 3. (Color online) Average peak signal-to-noise ratio (PSNR) of the reconstructed images in the testing dataset of (a) the handwritten digits "3" in MNIST, and (b) the dresses in fashion-MNIST. The bond dimension of the MPS is taken as $\chi = 16$ or 40. The number of known pixels for reconstruction ranges from $N_f = 0$ to 80.

FIG. 4. (Color online) The $N_f$ pixels selected in EO and VO. To illustrate the order by color, a pixel is marked redder than the pixels that come after it in the order.
FIG. A1. (Color online) Images (digits "3" in the first row and dresses in the fourth row), the pixels $\{x^{[\mathrm{sent}]}\}$ (the second and fifth rows), and the generated images (the third and sixth rows) with $N_f = 80$ known pixels. The generated images in the same row are from the same state, written in the form of an MPS. We take the bond dimension of the MPS as $\chi = 40$.