Variational Quantum Eigensolver with Fewer Qubits

We propose a qubit efficient scheme to study ground state properties of quantum many-body systems on near-term noisy intermediate scale quantum computers. One can obtain a matrix product state (MPS) representation of the ground state using a number of qubits smaller than the number of physical degrees of freedom. By increasing the number of qubits, one can exponentially increase the bond dimension of the MPS variational ansatz on a quantum computer. Moreover, we construct circuit blocks which respect the U(1) and SU(2) symmetries of the physical system and show that they can significantly speed up the training process and alleviate the gradient vanishing problem. To demonstrate the feasibility of the qubit efficient variational quantum eigensolver in a practical setting, we perform a first-principles classical simulation of differentiable circuit programming. Using only 6 qubits one can obtain the ground state of a 4 x 4 square lattice frustrated Heisenberg model with fidelity over 97%. Arbitrarily long ranged correlations can also be measured on the same circuit after variational optimization.


I. INTRODUCTION
Studying ground state properties of quantum many-body systems is a promising native application of quantum computers. Given limited qubit resources and noisy realizations of near-term quantum devices [1,2], a practical approach is to employ the variational quantum eigensolver (VQE) [3][4][5][6][7][8], which runs in a classical-quantum hybrid mode. In this scheme, a parameterized quantum circuit provides a variational ansatz for the ground state. A classical optimizer tunes the circuit parameters to reduce the expected energy of the target Hamiltonian of the output quantum state. There were already several small scale experimental demonstrations of VQE for molecules and quantum magnets [9][10][11][12][13]. These early experiments mostly employed gradient free or Bayesian approaches for classical optimization. Recent progress on unbiased gradient estimation on quantum circuits [14][15][16][17][18][19][20][21][22][23][24][25] breaks the information bottleneck between classical and quantum processors, thus providing a route towards scalable optimization of circuits with a large number of parameters.
There are nevertheless more challenges in the training of variational quantum circuits. The gradients of an unstructured, randomly parametrized circuit vanish exponentially as a function of the number of parameters [26] due to the concentration of measure in high dimensional spaces [27,28]. Intuitively, this can be understood from the fact that the overlap between a random initial quantum state and a target state is exponentially small in the many-body Hilbert space. This difficulty motivates one to design the circuit architecture and initialize the circuit parameters with insights from classical tensor networks [29,30] and quantum chemistry ansätze [31][32][33]. Furthermore, since the number of required qubits is the same as the problem size in standard VQE applications, one has to push the number of controllable qubits well beyond the current technology to convincingly surpass the classical simulation approach in finding the ground states of quantum many-body systems. Related approaches such as the quantum approximate optimization algorithm [34] and related fields such as quantum machine learning [15,16,35] suffer from the same problem.
* wanglei@iphy.ac.cn
We address these problems by adopting the qubit efficient circuit architecture [29] for the variational quantum eigensolver. By measuring qubits sequentially and reusing the measured qubit, one can produce quantum states for an arbitrarily large system with a fixed number of qubits. This approach amounts to generating matrix product states (MPS) [36][37][38] on a quantum computer [39,40]. Note that, although it is well known that an MPS with small bond dimension can be efficiently simulated classically [37], having quantum resources allows one to reach exponentially large bond dimensions that are inaccessible to classical computers.
Despite its one-dimensional geometry, MPS is a versatile variational ansatz that has been successfully applied to systems with diverse lattice geometry and topology [41]. The success lies in the fact that many physics and chemistry systems of interest exhibit relatively small entanglement entropy in their ground states [42,43]. In this way, one is able to significantly boost the performance of classical MPS based approaches even with an intermediate scale quantum computer.
Variational optimization of MPS generated on an actual quantum device has previously been demonstrated in an experiment [44]. The experiment exploits the fact that the radiation field of a cavity QED naturally realizes [45] the continuous MPS [46] with a few tuning parameters. Here, we focus on variational MPS calculation on programmable gate model quantum computers. This setup provides us with more systematic and precise control of the bond dimension and the number of variational parameters. Moreover, crucial technical advances such as gradient-based learning [14][15][16][17][18][19][20][21][22][23][24][25] and quantum number preserving circuit design greatly speed up the training process, and make it practically useful for solving challenging quantum many-body problems.
The paper is organized as follows. In Sec. II we introduce the qubit efficient scheme of preparing MPS on quantum circuits and a gradient-based variational training approach for obtaining the ground state of generic quantum many-body Hamiltonians. In Sec. III we demonstrate the utility of this scheme by numerically simulating the VQE of a 4 × 4 frustrated Heisenberg model using only 5 qubits (or 6 for circuits respecting the SU(2) symmetry). We show that the MPS-inspired, symmetry-preserving circuit architecture greatly improves the performance and alleviates the gradient vanishing problem. Finally, in Sec. IV, we carry out a detailed gate counting to estimate the timings of actual experiments, and point to future research directions. Code and pretrained circuit parameters can be found at the GitHub repository [47].
Figure 1. Variational training of the MPS prepared on a parametrized quantum circuit in the qubit efficient scheme. The upper left is the quantum circuit with one reusable qubit for the physical degrees of freedom and V qubits for the virtual degrees of freedom. Green triangles represent input single-qubit states initialized to $|0\rangle$, and the yellow square $q_k^{\alpha_k}$ is the k-th output bit measured on Pauli basis $\alpha_k$. After measurement, the first qubit is reset to $|0\rangle$ and then entangled with the remaining V qubits before the next measurement. Each gray box represents a multilayer parameterized quantum circuit detailed in Fig. 2. Each block has a similar circuit structure, and the circuit parameters $\theta_1, \theta_2, \ldots, \theta_{N-V}$ are independent. By assembling the bit strings measured on different bases one obtains an estimate of the expected energy of a Hamiltonian. Using the differentiable circuit setup [15] one can obtain unbiased estimates of the energy gradients and update the quantum circuit parameters with a classical optimizer.

II. CIRCUIT ARCHITECTURE AND TRAINING APPROACH
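Consider a generic quantum many-body Hamiltonian written in terms of the Pauli operators,
$$H = \sum_{i,\alpha} h_i^{\alpha} \sigma_i^{\alpha} + \sum_{i,j,\alpha,\beta} h_{ij}^{\alpha\beta} \sigma_i^{\alpha} \sigma_j^{\beta} + \cdots, \tag{1}$$
where $i, j = 1, 2, \ldots, N$ are site indices with N being the system size, and $\alpha, \beta = x, y, z$ are indices of the Pauli axes. We only show the first few terms for the sake of concreteness, although higher order polynomials of Pauli operators are allowed. In the variational quantum eigensolver approach [3], the ground state $|\psi(\theta)\rangle$ is represented by the output of a parametrized quantum circuit, where $\theta = \{\theta_i\}$ are the circuit parameters.
The variational energy is a summation of expectation values of Hamiltonian terms in the variational ground state,
$$\langle H \rangle_{\theta} = \sum_{i,\alpha} h_i^{\alpha} \langle \sigma_i^{\alpha} \rangle_{\theta} + \sum_{i,j,\alpha,\beta} h_{ij}^{\alpha\beta} \langle \sigma_i^{\alpha} \sigma_j^{\beta} \rangle_{\theta} + \cdots. \tag{2}$$
To estimate the expected energy, one can identify maximally commuting sets of Hamiltonian operators and measure all the commuting terms together on the corresponding bases.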
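As a rough illustration of how measured bit strings turn into the expectation values that enter the variational energy, the following sketch averages the product of Pauli eigenvalues over a batch of samples. The function name `pauli_product_expectation`, the toy samples, and the convention that bit 0 maps to eigenvalue +1 and bit 1 to −1 are assumptions made for this example only.

```python
def pauli_product_expectation(bitstrings, sites):
    """Estimate <prod_{i in sites} sigma_i> from bit strings that were
    measured in the matching Pauli basis on every site in `sites`.

    Assumed convention: bit 0 -> eigenvalue +1, bit 1 -> eigenvalue -1.
    """
    total = 0
    for bits in bitstrings:
        parity = sum(bits[i] for i in sites) % 2   # number of -1 factors, mod 2
        total += 1 - 2 * parity                    # +1 for even parity, -1 for odd
    return total / len(bitstrings)


# Four hypothetical z-basis samples of a 4-site system.
samples = [(0, 1, 0, 1), (1, 0, 1, 0), (0, 1, 1, 0), (0, 1, 0, 1)]
zz_01 = pauli_product_expectation(samples, (0, 1))  # two-site correlator
z_0 = pauli_product_expectation(samples, (0,))      # single-site magnetization
```

All terms that share a measurement basis can be evaluated from the same batch of bit strings, which is why grouping commuting terms reduces the number of circuit runs.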
One can prepare an MPS as a variational state of N qubits using a smaller number $V + 1 \ll N$ of qubits [29,39,40]. The idea is to treat one of them as the physical qubit and use the remaining V qubits as virtual degrees of freedom to mediate quantum entanglement. Sequentially measuring and reusing the first qubit allows one to produce an arbitrarily long MPS. The circuit structure is illustrated in the upper left corner of Fig. 1. First, one initializes the V + 1 qubits to the product state $|0\rangle \otimes \cdots \otimes |0\rangle$ and applies a circuit block parameterized by $\theta_1$ to all qubits. Then, one measures the first qubit on the Pauli basis $\sigma_1^{\alpha_1}$ and stores the output $q_1^{\alpha_1}$ in a classical memory. Next, one recycles the measured qubit and resets it to the state $|0\rangle$. One then entangles it with the remaining V qubits again by applying a second circuit block with parameters $\theta_2$. After repeating these procedures until one has collected N − V − 1 bits of classical information, one measures all qubits to collect the last V + 1 bits. This sequential measure-and-reuse scheme is equivalent to measuring an N-qubit MPS on the same basis. In Appendix A, we provide a concrete example of preparing and sampling a cluster state in the qubit efficient scheme.
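The measure-and-reuse procedure can be sketched with a small state-vector simulation. This is only an illustration under stated assumptions: the circuit blocks are replaced by random unitaries (a stand-in for the parametrized blocks of Fig. 2), all measurements are in the z basis (other Pauli bases would need a basis rotation before each measurement), and the names `random_unitary` and `sample_bitstring` are hypothetical.

```python
import numpy as np


def random_unitary(dim, rng):
    """Random unitary from a QR decomposition (stand-in for a circuit block)."""
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))  # fix the column phases


def sample_bitstring(blocks, n_virtual, rng):
    """Sample one N-bit string using only V + 1 simulated qubits."""
    dim_v = 2 ** n_virtual
    # Axis 0 is the physical qubit, axis 1 the V virtual qubits.
    psi = np.zeros((2, dim_v), dtype=complex)
    psi[0, 0] = 1.0
    bits = []
    for k, block in enumerate(blocks):
        psi = (block @ psi.reshape(-1)).reshape(2, dim_v)
        if k == len(blocks) - 1:
            break  # after the last block all qubits are measured together
        p_one = np.sum(np.abs(psi[1]) ** 2)
        bit = int(rng.random() < p_one)                 # measure the physical qubit
        bits.append(bit)
        virtual = psi[bit] / np.linalg.norm(psi[bit])   # collapse the virtual qubits
        psi = np.zeros((2, dim_v), dtype=complex)
        psi[0] = virtual                                # reset physical qubit to |0>
    probs = np.abs(psi.reshape(-1)) ** 2
    outcome = int(rng.choice(probs.size, p=probs / probs.sum()))
    bits.extend(int(b) for b in np.binary_repr(outcome, width=n_virtual + 1))
    return bits


n_sites, n_virtual = 6, 2
rng = np.random.default_rng(0)
blocks = [random_unitary(2 ** (n_virtual + 1), rng) for _ in range(n_sites - n_virtual)]
bits = sample_bitstring(blocks, n_virtual, rng)
```

With N − V blocks, the first N − V − 1 blocks each contribute one bit and the final block contributes V + 1 bits, so every call returns N bits while the simulated register never exceeds V + 1 qubits.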
Given the variational circuit, we aim at solving the optimization problem $\theta_{\mathrm{opt}} = \operatorname{argmin}_{\theta} \langle H \rangle_{\theta}$. Gradient-based optimization algorithms are crucial to scaling to a large number of variational parameters [16]. Suppose that all the parameters of the quantum circuit appear in the form $e^{-i\theta_i \Sigma/2}$ with $\Sigma^2 = 1$. The analytical expression of the gradient with respect to the parameter $\theta_i$ reads [15],
$$\frac{\partial \langle H \rangle_{\theta}}{\partial \theta_i} = \frac{1}{2}\left( \langle H \rangle_{\theta_i + \pi/2} - \langle H \rangle_{\theta_i - \pi/2} \right). \tag{3}$$
One can thus estimate the energy gradient by shifting the parameter to $\theta_i \pm \pi/2$ while keeping the other parameters fixed, and use it for gradient descent optimization of the energy. Unlike numerical differentiation, Eq. (3) is an exact gradient estimator, which is crucial for unbiased stochastic optimization with a noisy estimate of the gradients [48].
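The shift rule for gates of the form $e^{-i\theta\Sigma/2}$ with $\Sigma^2 = 1$ can be checked numerically on a toy problem. The sketch below is an illustration, not the circuit used in this work: it assumes a single qubit, two rotation gates, and an arbitrary test Hamiltonian $X + 0.5Z$, and compares the shifted-energy difference with a central finite difference.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def rot(sigma, theta):
    """exp(-i * theta * sigma / 2), valid because sigma @ sigma = identity."""
    return np.cos(theta / 2) * I2 - 1j * np.sin(theta / 2) * sigma


def energy(thetas, hamiltonian=X + 0.5 * Z):
    """Exact <H> for the toy state Rz(theta_1) Rx(theta_0) |0>."""
    psi = np.array([1, 0], dtype=complex)
    psi = rot(X, thetas[0]) @ psi
    psi = rot(Z, thetas[1]) @ psi
    return float(np.real(np.vdot(psi, hamiltonian @ psi)))


def parameter_shift_gradient(thetas):
    """Gradient from energies evaluated at theta_i +/- pi/2."""
    grad = np.zeros(len(thetas))
    for i in range(len(thetas)):
        shift = np.zeros(len(thetas))
        shift[i] = np.pi / 2
        grad[i] = 0.5 * (energy(thetas + shift) - energy(thetas - shift))
    return grad


thetas = np.array([0.3, 1.1])
grad = parameter_shift_gradient(thetas)

# Central finite difference, used here only as an independent cross-check.
eps = 1e-6
fd = np.array([(energy(thetas + eps * e) - energy(thetas - eps * e)) / (2 * eps)
               for e in np.eye(2)])
```

On hardware the two shifted energies are themselves sampled, but because the shift is finite the estimator stays unbiased, unlike a finite difference with a small step.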
To recapitulate, the key point of the proposed MPS variational algorithm is to estimate the energy gradient of an N-qubit Hamiltonian with respect to an MPS prepared on a quantum circuit with fewer ($V + 1 \ll N$) qubits. The steps are shown in Fig. 1:
1. Tune a selected circuit parameter $\theta_i$ to $\theta_i + \pi/2$, and collect bit strings by repeated measurements on various bases according to the Hamiltonian terms. Then repeat for $\theta_i - \pi/2$.
2. Estimate the energy expectation value by assembling the statistics of all Hamiltonian terms.
3. Estimate the gradient of all parameters via Eq. (3). Feed the gradient information into a classical optimizer.
4. Update the circuit parameters according to the suggestions of the classical optimizer.
This completes one training epoch. The training stops when a prescribed convergence criterion is met. After reaching convergence, one may measure physical observables of interest on the optimized circuits.
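The training loop above can be sketched end to end on a deliberately tiny problem. The following is a toy under stated assumptions, not the simulation reported in this paper: a single qubit in the state $R^x_\theta|0\rangle$, the Hamiltonian $H = \sigma^z$ (so the exact energy is $\cos\theta$), a batch of 4096 simulated shots per energy estimate, and plain gradient descent standing in for the classical optimizer. The names `sample_energy` and `train` are hypothetical.

```python
import math
import random


def sample_energy(theta, shots, rng):
    """Estimate <Z> for Rx(theta)|0> from `shots` simulated z-basis measurements."""
    p_one = math.sin(theta / 2) ** 2                  # probability of reading bit 1
    ones = sum(rng.random() < p_one for _ in range(shots))
    return 1 - 2 * ones / shots                       # bit 0 -> +1, bit 1 -> -1


def train(theta, epochs=200, lr=0.1, shots=4096, seed=0):
    """Steps 1-4: shifted measurements, gradient estimate, parameter update."""
    rng = random.Random(seed)
    for _ in range(epochs):
        e_plus = sample_energy(theta + math.pi / 2, shots, rng)
        e_minus = sample_energy(theta - math.pi / 2, shots, rng)
        grad = 0.5 * (e_plus - e_minus)               # noisy but unbiased gradient
        theta -= lr * grad                            # plain gradient descent step
    return theta


theta_opt = train(theta=0.5)
final_energy = math.cos(theta_opt)                    # exact energy at the optimum
```

Even though every gradient is estimated from a finite batch, the parameter settles near $\theta = \pi$, where the energy reaches its minimum of −1.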

III. APPLICATION TO HEISENBERG MODEL
As a concrete example, we apply the approach detailed in the previous section to the frustrated Heisenberg model on a square lattice:
$$H = \sum_{\langle i,j \rangle} \mathbf{S}_i \cdot \mathbf{S}_j + J_2 \sum_{\langle\langle i,j \rangle\rangle} \mathbf{S}_i \cdot \mathbf{S}_j, \tag{4}$$
where $\langle i,j \rangle$ and $\langle\langle i,j \rangle\rangle$ denote nearest and next-nearest neighbor pairs, respectively. $J_2 > 0$ is the strength of the frustration term that suppresses the Néel order. The energy expectation value and its gradient can be efficiently evaluated by sampling the circuit output on three bases $\sigma^x$, $\sigma^y$ and $\sigma^z$. In the following discussion, we consider the model on an open square lattice of size N = 4 × 4 with $J_2 = 0.5$. The sites are zigzag ordered in our ansatz as shown in the lower right of Fig. 1.
Frustrated quantum spin models are crucial to the study of quantum magnets with many open problems [49,50]. Classical computational approaches to these problems are limited either by the sign problem [51] or by the high computational cost at larger bond dimensions [52]. Variational optimization of MPS on near-term quantum computers is a promising approach which may deliver valuable insights into open problems in this field.
Figure 2(a) shows a general internal structure of the variational circuit which can be implemented efficiently on quantum hardware [11]. Each layer contains 3(V + 1) parameters in the rotational gates $R^x_\theta$ and $R^z_\theta$. We use CNOT gates with no variational parameters as the entanglers to generate entanglement between qubits. We repeat this construction d times within each circuit block. Thus there are M = 3d(V + 1) parameters in each block. As we show below, taking into account the physical symmetries in the design of the VQE ansatz can reduce the number of parameters and improve the training performance.

A. Circuits with Conserved Quantum Numbers
The Heisenberg model (4) has a U(1) symmetry with good quantum number $S^z$. To preserve this symmetry [33,53], we construct a circuit block consisting of $e^{-i\theta\sigma_i^z/2}$ and $e^{-i\theta\,\mathrm{SWAP}(i,j)/2}$ gates [54]. The latter gate is equivalent up to a phase factor to the $\mathrm{SWAP}^{\alpha}$ gate with $\alpha = \theta/\pi$ [55]. Viewing the setup as a wide circuit with N qubits, it is clear that the quantum number of the initial state is conserved during the evolution. To obtain the ground state in the $S^z = 0$ sector, we prepare a spin-balanced initial state for the variational calculation. Fig. 2(b) shows that, by applying an additional X gate before the variational gates in the odd steps, one has an antiferromagnetic product state $|1010\ldots10\rangle$ as the initial state.
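The conservation of $S^z$ by these gates can be verified directly on two qubits. The sketch below is a minimal numerical check: it builds $e^{-i\theta\,\mathrm{SWAP}/2}$ and $e^{-i\theta\sigma^z/2}$ as matrices, composes them with arbitrary example angles, and compares the commutator with the total $S^z$ against that of a CNOT gate. The helper names and the angles are chosen for this example only.

```python
import numpy as np

I2 = np.eye(2)
Z = np.diag([1.0, -1.0])
SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=complex)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)


def swap_rotation(theta):
    """exp(-i * theta * SWAP / 2), using SWAP @ SWAP = identity."""
    return np.cos(theta / 2) * np.eye(4) - 1j * np.sin(theta / 2) * SWAP


def z_rotation(theta):
    """exp(-i * theta * sigma_z / 2) on one qubit."""
    return np.diag([np.exp(-1j * theta / 2), np.exp(1j * theta / 2)])


def commutator_norm(a, b):
    return np.linalg.norm(a @ b - b @ a)


# Total S^z of two spin-1/2 degrees of freedom.
total_sz = 0.5 * (np.kron(Z, I2) + np.kron(I2, Z))

# One two-qubit piece of a U(1) preserving layer, with arbitrary angles.
block = swap_rotation(0.7) @ np.kron(z_rotation(0.3), z_rotation(-1.2))
u1_violation = commutator_norm(block, total_sz)      # numerically zero
cnot_violation = commutator_norm(CNOT, total_sz)     # clearly nonzero
```

The nonzero commutator for CNOT illustrates why the general block of Fig. 2(a) mixes different $S^z$ sectors, whereas the symmetric block keeps the state in the sector of its initial state.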
One can further exploit the full SU(2) symmetry of the Heisenberg model (4). While there are sophisticated approaches [56,57] to implement this non-Abelian symmetry in classical simulations, the implementation is straightforward on quantum circuits. As shown in Fig. 2(c), we first prepare the input state in the total spin $S^2 = 0$ sector, where the simplest choice is the singlet product state $|\psi_0\rangle = \bigotimes_{i=1}^{N/2} |{\uparrow\downarrow} - {\downarrow\uparrow}\rangle$. To prepare $|\psi_0\rangle$, we use an additional ancilla qubit to carry the entanglement of the physical qubit in the odd and even steps [58]. In the odd step, we prepare a spin singlet between the physical qubit and the ancilla qubit, $|{\uparrow\downarrow} - {\downarrow\uparrow}\rangle = \mathrm{CNOT}(1, a)\,H(1)\,X(1)\,|0\rangle_1 \otimes |0\rangle_a$. Here, CNOT(1, a) is the inverse controlled-NOT gate, which flips the ancilla qubit when the physical qubit is in state $|0\rangle$. In the even step, we swap the ancilla and physical qubits. The physical qubits in the odd and even steps thus form a spin singlet. Then, we repeatedly apply parametrized SU(2) symmetric operations to the initial state to generate the variational output. We choose the generators to be the $\mathrm{SWAP}^{\alpha}$ gates [55] between a collection of qubit pairs [59]. We show the whole circuit of the SU(2) symmetric ansatz in Appendix B. The SU(2) symmetric variational state reads $|\psi(\theta)\rangle = \prod_{\{i,j\}} e^{-i\theta_{i,j}\,\mathrm{SWAP}(i,j)/2} |\psi_0\rangle$, where the product is ordered by the circuit architecture. This variational ansatz resembles the classical variational ansatz for quantum spins in the valence bond basis [60]. However, in general, the state cannot be sampled efficiently using the classical Monte Carlo method due to the appearance of complex weights [61]. Moreover, since the swap operations do not commute with each other, there is an additional difficulty in devising an efficient classical Monte Carlo scheme to sample from the variational ansatz. Therefore, variational optimization of this ansatz on a quantum device highlights the possible quantum advantage of the proposed qubit efficient VQE scheme.
Figure 2. The internal structure of the circuit blocks shown in Fig. 1. (a) A general unstructured setup. $R^{\alpha}_{\theta} = e^{-i\theta\sigma^{\alpha}/2}$ represents a parametrized single qubit rotation gate. (b) U(1) preserving block. The leftmost X gate is applied only for odd steps to flip the input state to $|1\rangle$. The double crosses are the $\mathrm{SWAP}^{\alpha}$ gates [55]. (c) SU(2) preserving block. The left and right panels are for odd and even steps, respectively. The last qubit is an ancilla for creating singlets between consecutive steps. The gates enclosed in the dashed box are repeated d times, where d is denoted as the depth of the block.
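The singlet preparation and the SU(2) invariance of the parametrized swap gate can be checked on the two-qubit level. The sketch below is a small numerical illustration, assuming the basis ordering $|q_1 q_a\rangle$ with the physical qubit first; the matrix `INV_CNOT` encodes the inverse controlled-NOT described in the text (the ancilla flips when the physical qubit is $|0\rangle$), and the variable names are chosen for this example only.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=complex)
# Inverse CNOT: flips the ancilla (second qubit) when the physical qubit is |0>.
INV_CNOT = np.array([[0, 1, 0, 0],
                     [1, 0, 0, 0],
                     [0, 0, 1, 0],
                     [0, 0, 0, 1]], dtype=complex)

# Odd step: X, then H on the physical qubit, then the inverse CNOT.
zero_zero = np.array([1, 0, 0, 0], dtype=complex)
singlet = INV_CNOT @ np.kron(H, I2) @ np.kron(X, I2) @ zero_zero

# Total spin components and S^2 for two spin-1/2 degrees of freedom.
spin = [0.5 * (np.kron(p, I2) + np.kron(I2, p)) for p in (X, Y, Z)]
spin_squared = sum(s @ s for s in spin)
total_spin_residual = np.linalg.norm(spin_squared @ singlet)   # S^2 |singlet> = 0

# A parametrized swap gate commutes with every total spin component.
theta = 0.9
gate = np.cos(theta / 2) * np.eye(4) - 1j * np.sin(theta / 2) * SWAP
su2_violation = max(np.linalg.norm(gate @ s - s @ gate) for s in spin)
```

Since the initial state has $S^2 = 0$ and every variational gate commutes with all components of the total spin, the output state stays in the singlet sector for any choice of the angles.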

B. Numerical Results
To assess the feasibility of the qubit efficient VQE scheme on near-term quantum devices, we perform a faithful classical simulation of the training process. We simulate a circuit of V + 1 qubits instead of an equivalent N-qubit circuit without qubit reuse. Therefore, even in the classical simulation, we do not have direct access to the final wavefunction but only to the measured bit strings. We estimate the energy and its gradient from batches of 4096 samples. Note that we purposely do not exploit the classical backpropagation algorithm (which reduces the complexity of gradient estimation from $O(M^2)$ to $O(M)$) in order to be in line with the repetitive experimental measurement [16].
We use V = 4 qubits for the virtual degrees of freedom of the MPS. The maximum entanglement entropy of the ansatz is thus 4 ln 2 given the full capacity of the variational blocks. We employ the Adam optimizer with a learning rate of 0.1 [62] for the stochastic gradient descent training. We compare the three different circuit blocks shown in Fig. 2. All of them have fixed depth d = 5. The variational parameters are randomly initialized with a uniform distribution in [0, π]. As shown in Fig. 1, there are in total N − V = 12 circuit blocks.
The general circuit structure shown in Fig. 2(a) contains M = 3(V + 1)(N − V)d = 900 variational parameters. In 500 steps of training, the energy per site decreases to −0.416. For comparison, the exact ground state energy per site is $E_{\mathrm{exact}} = -0.46909731$. As the energy decreases, the fidelity of the variational state with respect to the exact ground state increases from $5.7 \times 10^{-3}$ to 0.69.
Next, the U(1) symmetric circuit structure in Fig. 2(b) contains 10 single qubit gates and 5 two-qubit gates in each layer. Hence the number of circuit parameters is also 900. However, the training efficiency increases significantly as shown in Figure 3. The ground state energy per site reaches −0.454, with a ground state fidelity of 0.92.
Finally, the SU(2) symmetric circuit structure in Fig. 2(c) gives the best variational energy even though it has only M = (V + 1)(N − V)d = 300 variational parameters. Using the same hyperparameters for training, the variational energy decreases to −0.463, and the fidelity reaches 0.97. After obtaining the variational state, we can measure physical observables on the circuit. For example, the spin-spin correlation in the z direction $\langle \sigma_i^z \sigma_j^z \rangle$ measured on the SU(2) symmetry preserving circuit is shown in Fig. 4(a). For comparison, using the same training hyperparameters and the SU(2) symmetric variational circuit ansatz of Fig. 2(c), we obtain a ground state with fidelity 0.98 for the unfrustrated Heisenberg model with $J_2 = 0$. As shown in Fig. 4(b), the checkerboard pattern of the antiferromagnetic correlation is more visible in the unfrustrated case. These results suggest that, even with a moderate number of qubits, the qubit efficient VQE scheme is able to offer useful physical insights.
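The parameter counts quoted for the three circuit blocks follow from simple bookkeeping, reproduced in the sketch below. The function name and the dictionary keys are hypothetical; the per-layer counts are taken from the text (3(V + 1) rotations for the general block, 10 single qubit plus 5 two-qubit gates for the U(1) block at V = 4, and V + 1 swap rotations for the SU(2) block).

```python
def parameter_counts(n_sites=16, n_virtual=4, depth=5):
    """Total number of variational parameters for the three block types."""
    n_blocks = n_sites - n_virtual            # N - V circuit blocks
    per_layer = {
        "general": 3 * (n_virtual + 1),       # three rotation angles per qubit
        "u1": 10 + 5,                         # counts quoted in the text for V = 4
        "su2": n_virtual + 1,                 # one parametrized swap gate per qubit
    }
    return {name: count * depth * n_blocks for name, count in per_layer.items()}


counts = parameter_counts()
```

For N = 16, V = 4 and d = 5 this gives 900, 900 and 300 parameters, so the SU(2) block reaches the highest fidelity with one third of the parameters of the other two.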

C. On Gradient Vanishing
We inspect the variance of the gradient signal for various system sizes to investigate the gradient vanishing problem in the training of variational quantum circuits [26]. To compute the gradient variance, we sample 1000 gradients at random circuit parameters.
First, we consider an unfrustrated Heisenberg model on an open chain of length N. Here N should be regarded as the effective circuit width since the output bit strings lie in the Hilbert space of size $2^N$. Fig. 5(a) shows the variance of the gradient for circuit blocks with U(1) and SU(2) symmetries with V = 4. Interestingly, the variance of the gradient shows a power-law decay in contrast to the exponential decay found in a circuit with generic structure [26]. Therefore, it appears that the MPS circuit structure alleviates the gradient vanishing problem at least for the problem under consideration. We attribute this to the fact that the low entropy variational ansatz captures the right inductive bias for the ground state of the target problem.
Next, for an N = 20 Heisenberg chain, we examine the scaling of the gradient variance with the number of virtual qubits V. Again, Fig. 5(b) shows that using symmetry greatly enhances the gradient. We observe an exponential decrease of the gradient in the regime $V \ll N$. Although the gradient increases with V when V becomes comparable to N, its values are still much smaller than those in the small V limit, which shows that the MPS-inspired ansätze are easier to train than an unstructured quantum circuit of generic structure.

IV. DISCUSSIONS
Classical quantum many-body computation approaches provide valuable insights into quantum algorithms. Using the proposed qubit efficient VQE scheme, one can access the ground state properties of quantum systems using fewer qubits. In particular, by exploiting the physical symmetries in the quantum circuit architecture design, one can alleviate the gradient vanishing problem and speed up the convergence to the ground state.
In addition to the ability to reach exponentially large bond dimension with respect to the number of qubits, the quantum circuit MPS ansatz also shows some peculiar features compared to its classical counterparts. The number of variational parameters is controlled by the block depth, which is decoupled from the maximum bond dimension $2^V$. Therefore, reaching an exponentially large bond dimension does not necessarily imply exponentially many variational parameters. In fact, it may be inefficient to let all the tensor elements of an MPS be independent variational parameters. In this respect, the quantum circuit MPS ansatz, especially with symmetric circuit blocks, imposes additional entanglement structure on the general MPS ansatz.
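The decoupling of parameter count and bond dimension can be made concrete with a rough count. The sketch below is an estimate under stated assumptions: a classical MPS is taken to have uniform bond dimension $D = 2^V$ on every bond, so that $N \cdot 2 \cdot D^2$ is an upper bound on its tensor elements (boundary tensors and gauge freedom are ignored), while the circuit count uses the general block formula 3(V + 1)(N − V)d at fixed depth. The function names are hypothetical.

```python
def dense_mps_parameters(n_sites, n_virtual, local_dim=2):
    """Upper bound on the tensor elements of an MPS with bond dimension 2**V."""
    bond_dim = 2 ** n_virtual
    return n_sites * local_dim * bond_dim ** 2


def circuit_parameters(n_sites, n_virtual, depth, per_qubit_per_layer=3):
    """Parameters of the general circuit ansatz: 3 (V + 1) (N - V) d."""
    return per_qubit_per_layer * (n_virtual + 1) * (n_sites - n_virtual) * depth


# Compare the two counts for N = 16 sites at fixed block depth d = 5.
for v in (4, 8, 12):
    print(v, dense_mps_parameters(16, v), circuit_parameters(16, v, depth=5))
```

The dense count grows as $4^V$, whereas the circuit count at fixed depth changes only polynomially with V, which is the sense in which the circuit ansatz is a structured subset of all MPS with the same bond dimension.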
The wall clock time of both classical simulation and actual experiment can be estimated by counting the gate operations, as summarized in Table I. In the classical simulation, the  [68,69], which means longer time is required for solving the same model. Furthermore, we note that gate operations in the lines 2 − 5 of Table I are also trivially parallelizable on QPUs. Therefore, we envision that building a cluster of QPUs [5,6] may provide further advantage for high-throughput gradient estimation of the VQE calculation. In this case, one only needs classical communications to collect the gradients measured on all QPUs. Technically, having intermediate scale quantum circuits running in parallel is also easier than building a fully entangled large scale quantum computer. In this way, we expect running the variational algorithm on parallel QPUs will soon win over classical processors with exponential gate time.
A practical issue is that the total running time $M T_{\mathrm{gate}}$ should be within the coherence time of the qubits, which limits the variational ansatz to shallow circuit blocks on near-term devices. Table II shows the variational energy and fidelity for various depths d obtained for the N = 4 × 4 frustrated Heisenberg model at $J_2 = 0.5$. One sees that it is possible to reach fidelity 0.917 with only 60 parametrized gates, which is within the reach of present-day quantum technology. For larger problem sizes, one will need to increase the circuit depth linearly with N. Assuming area law scaling of the entanglement entropy, one also needs to scale the circuit width V linearly with the boundary size for an accurate variational description of the ground state.
A crucial step for the qubit efficient MPS preparation scheme is the measure and reset operation. The cost of this step is device dependent. It is straightforward for trapped ions. However, for SQUIDs, a single qubit measurement can take several microseconds, which is even slower than applying a gate. Fabricating low-latency quantum circuits that support fast measure and control is a rewarding direction in light of the proposed qubit efficient VQE scheme. Alternatively, one can employ the same circuit architecture without reusing the measured qubits, see Appendix B. In this case, one still has the benefit of enhanced gradient signal shown in Sec. III C.
The use of conserved quantum numbers in circuit construction also allows one to access excited states in various quantum number sectors. In this regard, it is interesting to consider what the universal gate sets are under various physical symmetry constraints. In addition to internal symmetries, the spatial translational symmetry may be taken into account via parameter sharing in the circuit blocks. This naturally raises the question of whether one can study infinitely large periodic systems with a finite number of qubits.
The proposed scheme directly applies to Hamiltonians with arbitrarily long-range interactions. In particular, fermionic systems can be easily studied using the Jordan-Wigner transformation [70]. A general quantum chemistry problem is more challenging than the quantum spin problem considered here since it contains $O(N^4)$ terms in the Hamiltonian. Nevertheless, the total number of measurements can be reduced using the techniques of [8,71,72].
Another interesting direction is to perform time evolution and measure time-dependent quantities in the qubit efficient scheme. Since one does not have access to the full wave function directly in the qubit efficient scheme, the Trotter decomposition based time evolution [73,74] may not be directly applicable. Variational quantum algorithms for time evolution [75,76] appear to be good candidates for this purpose.

V. ACKNOWLEDGMENT
We thank Miles Stoudenmire for sharing the idea of Ref. [29] prior to its publication. We thank Yun-Fei Pu and Ding-Shun Lv for providing valuable information on experimental feasibility. We thank Norbert Schuch for comments on the U(1) preserving circuit construction and Yan-Xia Liu for helpful discussions on the Bethe Ansatz. We thank Pan Zhang for generous allocation of GPU hours and Xiu-Zhe Luo for contributions to Yao.jl.
In the qubit efficient scheme, the 1st to 12th qubits are the same physical qubit, which is reused after measurement. The 13th to 16th qubits are the V = 4 qubits which mediate entanglement in the final output state. The 17th qubit is the ancilla qubit for constructing the singlet product initial state. The operations for generating singlets (i.e., X, H, controlled gates and SWAP gates) commute with the parametrized swap gates in the dashed box, so they can be moved to the beginning of the circuit.