Correlation-Informed Permutation of Qubits for Reducing Ansatz Depth in VQE

The Variational Quantum Eigensolver (VQE) is a method of choice to solve the electronic structure problem for molecules on near-term gate-based quantum computers. However, the circuit depth is expected to grow significantly with problem size. Increased depth can both degrade the accuracy of the results and reduce trainability. In this work, we propose a novel approach to reduce ansatz circuit depth. Our approach, called PermVQE, adds an additional optimization loop to VQE that permutes qubits in order to solve for the qubit Hamiltonian that minimizes long-range correlations in the ground state. The choice of permutations is based on mutual information, which is a measure of interaction between electrons in spin-orbitals. Encoding strongly interacting spin-orbitals into proximal qubits on a quantum chip naturally reduces the circuit depth needed to prepare the ground state. For representative molecular systems, LiH, H$_2$, (H$_2$)$_2$, H$_4$, and H$_3^+$, we demonstrate for linear qubit connectivity that placing entangled qubits in close proximity leads to shallower depth circuits required to reach a given eigenvalue-eigenvector accuracy. This approach can be extended to any qubit connectivity and can significantly reduce the depth required to reach a desired accuracy in VQE. Moreover, our approach can be applied to other variational quantum algorithms beyond VQE.


I. Introduction
Quantum computing is expected to revolutionize computational chemistry by achieving polynomial scaling in both the number of quantum particles and the quality of the description of the system (e.g., number of orbital basis functions or numerical grid points) [1,2]. The Variational Quantum Eigensolver (VQE) has emerged as a viable algorithm [3] to find asymptotically exact lowest eigenvalues for solutions to the Schrödinger equation on Noisy Intermediate-Scale Quantum (NISQ) devices. VQE-based ground state electronic energy calculations of small molecular systems (e.g., H 2 , BeH 2 , H 2 O, alkali metal hydrides and H 12 ) in minimal basis sets (e.g., contracted Gaussians sto-ng family) were experimentally implemented using superconducting circuits [4][5][6][7][8] and trapped ions [9] as physical qubits. As the age of quantum supremacy dawns [10], demonstrating quantum advantage for chemistry will naturally involve considering larger molecules and/or basis sets. In addition to quantum circuit width (number of qubits), this will, in turn, increase the quantum circuit depth, which typically grows polynomially in the problem size [11].
The growing circuit depth with problem size causes two main issues. One issue is the accumulation of hardware noise, which impacts the accuracy of the results. The other issue is the trainability of the ansatz parameters, i.e., whether the parameters have large enough gradients to allow for progress in the optimization. This concern * These authors contributed equally. † zhy@lanl.gov ‡ lcincio@lanl.gov § pdub@lanl.gov arises since increased circuit depth leads to smaller gradients [12][13][14][15], and accurate estimation of small gradients on a quantum computer requires a large number of runs (or "shots"). In fact, these two issues are related as the accumulation of hardware noise also leads to smaller gradients [16]. This highlights the importance of keeping the circuit depth shallow in variational ansätze. Some strategies partially address these problems, such as improved classical optimizers [17][18][19], parameter initialization strategies [20][21][22], error mitigation [23][24][25], and noise resilience [26]. On the other hand, the most direct strategy would be to somehow reduce ansatz circuit depth. Some promising approaches have been proposed for this purpose, largely focusing on adaptive ansätze [27][28][29][30]. Given that reducing ansatz depth will improve both the training complexity and the accuracy of VQE, it is a crucial research direction that could bring us closer to realize quantum advantage for electronic structure calculations.
It is worth emphasizing that lack of complete qubit connectivity on NISQ devices [31] makes it necessary to increase the depth of quantum circuit. Thus, any efforts to minimize ansatz depth in VQE should account for the connectivity of the specific NISQ hardware. For instance, if two qubits are highly entangled in the exact solution of an electronic structure problem, yet the qubits are physically distant on a device with limited connectivity, a deeper circuit is required for accurate simulation. Long sequences of noisy two-qubit gates are necessary to entangle distant qubits, which can degrade the fidelity and reduce trainability.
In this work, we introduce a novel approach to reduce ansatz depth by permuting the pattern with which the Hamiltonian is embedded onto qubits. Specifically, we employ a correlation-informed approach, where the permutation is chosen based on the mutual information defining entanglement between two individual spinorbitals [32,33]. Our main idea is illustrated in Fig. 1 on the example of selected spin-orbitals of LiH molecule. Entanglement of spin orbitals triggers permutation of the embedding to ensure that these orbitals are encoded into qubits that are placed physically close to each other on an actual quantum chip.
We numerically implement such qubits permutations assuming a linear qubit connectivity for the LiH, H 2 , (H 2 ) 2 , H 4 , and H + 3 molecular systems. In all cases, permutations significantly reduce the circuit depth required to reach a given energy accuracy, relative to the unpermuted case. To realize this permutation technique, we introduce PermVQE algorithm, an added layer on top of the original VQE posed in [3]. At a fixed circuit depth L, this algorithm, as depicted in Fig. 2 starts from the approximate ground state wave function of a given initial Hamiltonian, then uses it to calculate an optimal reordering of qubits and builds a permuted Hamiltonian. The algorithm iteratively repeats this process until an adequate reordering of qubits is obtained, and the best-permuted qubit Hamiltonian is used to variationally calculate the best-possible (within L) ground state wave function and associated energy.

A. Overview
A diagram of the PermVQE algorithm is shown in Fig. 2. As in the standard VQE for electronic structure, the first step is the initial mapping of the fermionic Fock-space states as well as creation and annihilation operators into Hilbert-space states and Pauli operators of qubits based on one of established second-quantized encoding methods, vide infra. This is followed by a standard VQE loop to variationally learn an approximate correlated ground state wave function at a fixed circuit depth L. The next step is to perform local tomography on two-qubit subsets in order to generate mutualinformation matrix, which we call the entanglement map. This step also minimizes a cost function that quantifies the amount of long-range correlations (for the specific hardware's connectivity), while varying the qubit index labels. As a result we arrive at an optimal permutation of the qubits and hence a new Hamiltonian, which is then fed back to the VQE subroutine for another iteration. We now provide more details on these various subroutines.

B. Initial qubit mapping
In PermVQE, the first step is an initial spin-orbital to qubit mapping, subject to a chosen transformation [1,2]. As discussed below in Sec. III, the performance of Per-mVQE can be affected by the initial qubit mapping. In particular, it may be desirable to select an initial qubit mapping that leads to a more sparse entanglement map, i.e., minimizing the number of qubits being entangled. This would then allow for qubit permutations to have a more significant effect. In Sec. III, we compare three popular second quantized basis set encoding methods: Jordan-Wigner (JW) [34], Bravyi-Kitaev (BK) [35], and Parity [36,37]. We find that the Jordan-Wigner transformation facilitates sparser entanglement map, and hence improves performance of PermVQE.

C. Entanglement map
After obtaining the initial qubit Hamiltonian and running VQE under fixed circuit depth L, the next step of PermVQE is to produce an entanglement map reflecting electronic correlations in the approximate ground state. For this purpose, we calculate the quantum mutual information for all pairs of qubits, which provides a measure of the total correlation including both quantum and classical correlations. Previous results obtained for the orbital ordering problem in Density Matrix Renormalization Group (DMRG) method in classical quantum chemistry calculations [32] showed that the quantum mutual information is a reliable parameter to quantify the correlation between two quantum particles. The quantum mutual information between qubits i and j is defined as follows: where S i and S ij are the single-qubit and two-qubit von Neumann entropies, respectively. The Kronecker δ sets Figure 2. The PermVQE Algorithm. After obtaining approximate ground state wave function from the initial Hamiltonian at a fixed circuit depth L through VQE, the entanglement map is produced based on mutual information Iij. Minimization of a cost function (which quantifies the amount of long-range correlations) over possible permutations of qubits, returns the permuted Hamiltonian, which is then fed back into the VQE algorithm. This procedure is iteratively repeated until the best-permuted Hamiltonian is generated from which the best-possible (within L) ground state wave function and energy are obtained.
all diagonal elements I ii to zero. The single-qubit von Neumann entropy S i is given by: and S ij is defined analogously. Here ρ i is the reduced one-body density matrix of qubit i and {λ We note that ρ ij (and its marginals ρ i and ρ j ) can be obtained from two-qubit tomography, which involves measuring 15 Pauli operators. There are n(n−1)/2 pairs of qubits, with n being the number of qubits. Hence the number of operator measurements needed to construct the entanglement map is 15n(n − 1)/2. Some of these operators will commute and thus can be measured simultaneously.
Based on the mutual information values for each pair of qubits, we build an n × n matrix. This matrix I = {I ij }, called the entanglement map, is useful to illustrate the amount and the length-scale of the correlations in the approximate ground state. Moreover, we use it to define a cost function described next.

D. Cost Function
To quantify the amount of long-range correlations, we introduce a cost function as follows. For a NISQ device with a given connectivity, let d ij denote the distance between qubits i and j, which can be precisely defined as the number of edges in the shortest path through the connectivity graph between these qubits. Alternatively, one can view d ij as the minimum number of swap gates (plus one) needed to make qubits i and j nearest neighbors. Then, for any qubit connectivity, a cost function can be defined as where f (·) is a monotonously increasing function of d ij .
As an example, one could choose a power-law function: being the choice of cost function employed in our numerics below. Specifically, we consider a linear connectivity in our numerics, in which case d ij = |i − j|, and the cost function becomes C(I) = i<j |i − j| 2 I ij . The choice of the cost function is dictated by the optimization procedure described below.

E. Cost-Function Minimization
We define a permutation P as a bijection from the set of qubit indices to itself. The action of P will affect the entanglement map I, and we are interested in solving the optimization problem: After solving this optimization problem, the PermVQE algorithm uses P opt to permute the qubit indices and produce a new Hamiltonian. For example, if the original Hamiltonian is H = X 1 Y 2 Z 3 and P opt transforms indices as follows: There are several methods suitable for minimization of C(P IP −1 ) over qubit permutations. In principle, one could take a brute-force approach. That is, one could explicitly construct the n! different permutations and check which one of them produces the lowest value of C(P IP −1 ). While this approach will certainly find the best permutation for a given entanglement map, the computational cost grows factorially in n, and hence scalability to large problem sizes is problematic. Instead, we focus on a more practical technique, as follows.

Spectral Graph Algorithm
The mutual information matrix I can also be considered as a weighted graph where the weighted edge rep-resents the mutual information between two qubits. In particular, the minimization of a quadratic cost function, C(I) = i<j |i − j| 2 I ij , can be related to spectral graph theory [38,39]. For a given I, we can define the graph Laplacian L as follows: Spectral graph theory has shown [38] that the Fiedler vector, or the second smallest eigenvector of L, is the solution that can provide low values of the following function: which is exactly the cost function that we defined above. Sorting the entries of the Fiedler vector in either ascending or descending order provides the optimized ordering of qubits [38]. Note that the diagonalization of the Laplacian matrix L scales polynomially with number of qubits n. Hence, the cost-function minimization based on the spectral graph algorithm provides an efficient way of finding optimal qubit ordering, with polynomial scaling in n.

A. Ising Toy Models with Exact Wave Functions
To demonstrate the effect of permutations on circuit depth required to reach exact solution in the VQE, we start with toy models and exact wave functions giving rise to exact entanglement maps (here only one loop for qubits reordering is needed). For that purpose we analyze several 6-qubit model Ising Hamiltonians with artificially engineered entanglement. The generic Ising Hamiltonian is given by: We choose the heuristic two-layer RyRz ansatz shown in Fig. 3, as this hardware-efficient ansatz is naturally adapted for NISQ devices to ensure the nearest neighbor connectivity between qubits. The ansatz circuit with L layers is denoted as an "L-Depth" circuit. We consider five different model Hamiltonians, given in Table I. The entanglement maps for those Hamiltonians are shown in Fig. S1 in Supplemental Material (SM). From a comparison of different entanglement maps, we notice that the entanglement is more localized in the permuted case supporting the choice of the cost-function defined above. We find that for the majority of considered Hamiltonians, qubit permutations significantly reduce the number of ansatz layers required to reach the exact energy of the system as shown in Table I  H 1 and H 2 Hamiltonians, the exact solution is already found at L = 1, where only the distinct qubit pairs are entangled. As expected, larger depth further reduces the advantage of permutations since long circuit depth allows to solve the problem exactly for the unpermuted (default) Hamiltonian. Thus, we expect that qubit permutations could be an ideal method to variationally learn the ground state wave functions on NISQ devices where circuit depths are limited. For each Hamiltonian, we also compare the difference in permutated vs unpermuted cost-function values to the difference in permuted vs unpermuted depth required to converge to the true ground state, see Table S1 in SM. Notably, a larger difference in cost-function appears to be concomitant with a larger difference in the required ansatz depth, which may indicate a correlation between these parameters. However, for a large number of entangled qubits (e.g. H 5 ), this correlation fails. Nevertheless, the cost-function difference can be used as an initial estimate of the possible permutations.

B. Molecular Systems with Exact Wave Functions
For the actual molecular systems, the entanglement maps are expected to be vastly more complicated. Moreover, the type of mapping of fermionic-to-qubit Hamiltonian can also affect the entanglement map of the system. To demonstrate proof-of-principle advantage of qubit permutations on a circuit depth required to reach near-to-exact solution in the VQE of chemical systems, we consider five molecular LiH, H 2 , (H 2 ) 2 T-shaped Van der Waals complex, H 4 third-order saddle point and H + 3 cyclic cation systems, which were investigated in ideal environment (exact wave function, exact entanglement map, no noise consideration). Here, we compare ∆E = E VQE − E exact energy error as a function of ansatz depth L, where E VQE is converged variational energy when using unpermuted or permuted Hamiltonian in the best identified (JW) encoding (see below), and E exact is exact energy in the full or reduced active space obtained by direct Hamiltonian matrix diagonalization in the basis of Slater Determinants of spin-orbitals. For all systems a full active space of sto-3g basis set [40] was used, except LiH and H 2 molecules, for which a reduced active space of 10 spin-orbitals and a full active space of 6-31G basis set [41] were selected, respectively. Slightly-modified versions of hardware-efficient RyRz and Ry ansätze [42] were used for all the molecules (see Figs. S2-S3 in SM), except H + 3 for which a particle-preserving ansatz was employed, vide infra. For all these systems, we present graphs of converged VQE energy with and without permutations in the main text.

Initial Qubit Mapping
There are multiple choices of spin-orbital to qubit mappings [1,2]. For electronic structure problem, three popular second quantized encoding basis set methods Jordan-Wigner (JW) [34], Bravyi-Kitaev (BK) [35], and Parity [36,37] primarily differ in the pattern to store information on the occupation number, the parity and the number of qubit operations for simulating fermionic creation or annihilation operators. While it was shown that the BK mapping can reduce the number of operations to O(log 2 (N )) for simulating each fermionic operator [43], entanglement maps in this encoding contain a larger number of entangled qubits, which may reduce the advantage of permutations (Fig. 4A). The same behavior was observed for the Parity mapping (Fig. 4B). The reason for this is likely relevant to the fact that both BK and Parity store information on the occupation number nonlocally, thus delocalizing the entanglement. In contrast, the occupation number-locality preserving JW mapping produces sparse entanglement map (Fig. 4C). We expect that the advantage of permutations will be more pronounced in this mapping, which is further used as default throughout this paper. For example, LiH in the JW mapping includes 6 entangled qubits unlike the BK and Parity mappings in which 8 qubits are entangled, see Fig. 4. The comparison of JW entanglement maps for other molecular systems is shown in Fig. S4 in SM. We also compare exact entanglement maps for qubit Hamiltonians generated with IBM's Qiskit [44] and Google's Openfermion [45] frameworks for the LiH molecule in Fig. S5 in SM. Entanglement maps are identical in JW encoding after qubit permutations, while they are different for BK and Parity mappings. This result is expected taking into account that Qiskit orders (by default) spinorbitals based on spin (with "up" first and "down" next), whereas Openfermion performs even-odd ordering.

LiH Molecule
For the analysis of LiH in sto-3g basis, the lowestenergy spatial orbital was assumed to be doubly occupied and its contribution is integrated out to an effective field felt by the active space [46] of remaining five orbitals, Fig. 1. The latter spans ten spin-orbitals, and as such the system can be described with a 10-qubit circuit.
For RyRz and Ry ansätze, the effect of qubit permutations on circuit depth is shown in Figs. 5a and 5e, respectively. We note here that for both ansätze at depth L = 2, the VQE converged to a correlated wave function whose energy is comparable to the Hartree-Fock (HF) one. For depth L = 4, ca. 70% of pseudo-correlation energy ∆E pc (Fig. 5a) is captured for the permuted Hamiltonian, where ∆E pc is defined as the difference between the Hartree-Fock (HF) energy E HF and the exact E exact ground state energy for reduced active space (∆E pc = 12.2 kcal/mol). In contrast, for the unpermuted (default) qubit Hamiltonian, the VQE energy is still comparable to that of HF energy even for the depth L = 8, indicating that it generates insufficient entanglement of the qubits. Similar results are observed with Ry ansatz, Fig. 5e.

H2, (H2)2 and H4 Molecular Species
The improvement in energy convergence due to qubit permutations on 8-qubit circuit depth are well-observed for H 2 (6-31G basis), non-covalently bound (Van der Waals) complex (H 2 ) 2 (sto-3g basis), and H 4 third-order saddle point (three imaginary frequencies in the Hessian matrix, sto-3g basis), Fig. 5b-d and Fig. 5f-h for RyRz and Ry ansätze, respectively. Note that the level of correlations in these molecular systems is expected to increase as follows: H 2 < (H 2 ) 2 H 4 . Consequently, smaller difference between permuted and unpermuted E VQE can be observed in the corresponding panels. We also note here that H 4 is so strongly correlated, that already at depth L = 2 converged VQE corresponds to a highlycorrelated wave function even when unpermuted Hamiltonian is employed, E HF − E VQE (unpermuted) ∼ 100 kcal/mol, Fig. 5d,h.

H + 3 Cation
Since VQE performs an unconstrained energy optimization in the Fock space of the original electronic problem, calculation of exact energy for charged molecules (ions) could be a challenging task [47]. Indeed, as chargeneutral molecules are usually energetically more preferable over their ions, VQE will invariably collapse to a lower energy of the corresponding neutral form. Among other options [47], one could design quantum circuit which conserves the number of electrons, and force the search to be in the expected subspace [48].
The JW mapping, which relates fixed-particle-number subspaces to the fixed-excitation subspaces in the qubit Hilbert space, seems to be the simplest to approach such a circuit. A complication, however, is a large number of two-qubit gates that act on states |00 or |11 in the HF reference state, in which case electron preserving gates must act as the identity. Thus, we remove these ineffective gates in our ansätze when reporting the depth of our circuit to get a more appropriate comparison between unpermuted vs permuted Hamiltonians. We display this result for our H + 3 cation in Fig. 6a,b along with a decomposition of the electron-preserving gates, which we take from Ref. [48]. Note that the HF layer was initialized depending on the permutation used. Thus, if qubit 1 corresponds to an occupied orbital and after the permutation it swaps with unoccupied qubit 5, we initialize qubit 5 in a |1 state and qubit 1 in a |0 state. Using this approach, we again found that permutations help us to reduce the circuit depth required to reach lower ∆E although the effect is less profound compared to that for neutral molecules, Fig. 7. At a higher depth L = 8, the results equalize.

C. Molecular Systems with Approximate Wave Functions (PermVQE)
So far we demonstrate the proof-of-concept beneficial effect of qubit permutations on circuit depth for toy Ising models and various molecular systems based on ideally permuted Hamiltonians where exact entanglement maps were built from the exact wave functions under noisefree conditions. We now describe PermVQE presented in Fig. 2 on the example of LiH molecule based on approximate wave function under noise-free and noisy conditions without resorting to exact solution. Each iteration uses the current best approximation to the ground state. Because von Neumann entropy S is a function of density-matrix values, the approximate wave function for molecular systems must be at minimum correlated (e.g.  improved over uncorrelated HF wave function) before S is calculated.
For LiH molecule in the reduced active space of sto-3g basis, some correlations are observed at L = 1 at L = 2 for Ry ansatz based on the entanglement maps shown in Fig. S6 in SM. Despite the apparent correction from L = 1 to L = 2, the system is not correlated enough for further improvement. As expected, upon increasing L under noise-free conditions (L ≥ 3), one can clearly see the profound effect of qubit permutations on the VQE converged energy, Fig. 8A. For example, at L = 7, the converged VQE energy is at the level of chemical accu- racy for the permuted Hamiltonian (∆E = 1.2 kcal/mol). In contrast, the converged VQE energy is at the level of HF energy for the unpermuted case. We present the tabulated number of cost-function evaluations per PermVQE iteration for converged VQE per L for unpermuted and permuted Hamiltonians, as well as selected PermVQE iterations for L = 1, 2, 5 and 10 (up to three for the system under studies) including corresponding JW entanglement maps in Table S2 and Fig. S7 in SM.
We also note the following empirical observation. One could intuitively expect that the approximate wave function to build the best-permuted Hamiltonian (within given L) should ideally have a reasonable overlap with the exact wave function. Results obtained for VQE based on unpermuted vs permuted Hamiltonian for L = 2 to 7 demonstrate a weak dependence on the accuracy of the initial wave function. The only relevant criterion seems to be that the variational state develops sufficient correlations, such that the correlation pattern is noticeable. Our experiments suggest that correlations do not have to be accurately computed for the purpose of finding favourable permutation of qubits. Indeed, the wave function is further refined in each iteration under fixed L, e.g. see Fig. S7 in SM.
Finally we study the effects of qubit permutations on noisy simulator. Optimized ansatz parameters under a noisy environment were used for the energy evaluation. No error mitigation was performed here and error rates were kept very low to obtain meaningful results, being 5 × 10 −5 for 1-qubit gates, 5 × 10 −4 for 2-qubit gates. The average energy values after 10 trials are shown in Fig. 8B. For the unpermuted (default) Hamiltonian, the energy is not improved below the HF value and an increase of ansatz depth leads to a greater error in energy ∆E due to additional noise that those layers introduce. Similar behavior is observed for several initial points of permuted Hamiltonian (L ≤ 5). However, a noticeable improvement is observed from L ≥ 6 for the case of permuted Hamiltonian. After L = 7, the ∆E further increases as expected. We stress that the final permutation (for a given L) is obtained in several steps of the algorithm outlines in Fig. 2. In each step, the current best approximation to the ground state is used (as opposed to the exact ground state used for proof-of-concept calculations presented in previous sections).

IV. Conclusions
In this work we show that encoding strongly interacting spin-orbitals of molecular systems into proximal qubits on a linear chain architecture quantum chip, naturally reduces the circuit depth needed to prepare the ground state for the quantum chemistry electronic structure problem. Assuming direct mapping between qubits and canonical spin-orbitals (eigenfunctions of the Fock operator), i.e. each qubit represents the occupation number of a particular spin-orbital, the two-qubit mutual information describes the entanglement between two individual qubits, whereas JW encoding is the most-suitable for permuting qubits. With this observation, we developed a PermVQE algorithm which uses default (unpermuted) Hamiltonians generated with IBM's Qiskit or Google's Openfermion frameworks. PermVQE then iteratively converts inputs into permuted Hamiltonians based on mutual information from approximate wave function, and performs VQE with improved precision at a given circuit depth L. We remark that a different approach based on mutual information maximization to select entangling ansätze (rather than our mutual information minimization to select permutations) was reported in Ref. [49] during the preparation of this manuscript.
We believe that the proposed method will facilitate the simulation of larger molecular systems with NISQ devices by reducing the depth length, and contribute to the demonstration of chemical advantage. Furthermore, we remark that our correlation-informed permutation approach can be combined within the other variational quantum algorithms beyond VQE, for example, to reduce ansatz depth in variational quantum algorithms for simulating dynamics [50][51][52][53], solving linear systems [54,55], and compiling [26,56]. Finally, we note that PermVQE can be employed for any hardware connectivity. Future study might include extending the permutation approach to take advantage of more highly connected qubit architectures, such as grid or hexagonal lattices, which would require modified and more complicated cost functions.

A. Appendix: Ansatz Methods
Hardware-efficient RyRz ansatz that was built on the basis of IBM's Qiskit software package [44], was used to prepare quantum states for Ising toy models, whereas partially modified RyRz and Ry ansätze were used to prepare quantum states for neutral molecules. For RyRz, additional Ry and Rz rotational final layers were included to provide more flexibility to the parameterized circuit, Fig. S2 in SM. For Ry, all Rz gates were removed and additional final Ry rotational layer was included, see Fig.  S3 in SM. This ansatz can be used since all ground states of systems that we analyze possess time-reversal symmetry. Thus, we expect all state coefficients to be purely real, eliminating the need for Z rotations. Electronpreserving ansatz shown in Fig. 6 was used to model H + 3 cation.

B. Appendix: Computational Methods
All quantum simulations were performed using the Qiskit 0.19.6 package [44]. For the VQE and noisy VQE, the Qiskit's Statevector and Qasm simulators were used, respectively. For a classical optimization, the COBYLA [57] protocol was used with 200000 maximum iterations for investigated molecular systems. The lowest eigenvalue was obtained from multiple trial runs (5-10) for each depth of the ansatz. For model Ising Hamiltonians, the same classical optimizer was used with maximum 10000 iterations. The PermVQE calculations with approximate reference wave functions were performed with the same classical optimizer with maximum 100000 iterations and maximum allowed 3 consecutive permutations of the corresponding Hamiltonians. The noise model was incorporated by running the final step of VQE calculation using qasm simulator with 10000 shots per Pauli-word evaluation and the following error rates: 5 × 10 −5 for 1-qubit gates, 5 × 10 −4 for 2-qubit gates. The geometries of molecular systems were initially preoptimized (in the gas phase) at the Hartree-Fock or Full Configurational Interaction level for (H 2 ) 2 using electronic structure Gaussian 16 (Revision B.01) package [58]. Table S3 in SM contains additional information (xyz coordinates, energy data). The molecular orbitals were visualized using ChemCraft software (https://www.chemcraftprog.com/). The entanglement maps for each run were built using the exact or approximate wave function of a given system. The brute force approach was applied to find the best permutation of a given entanglement map.