Fermionic Simulators for Enhanced Scalability of Variational Quantum Simulation

Near-term quantum simulators are mostly based on qubit-based architectures. However, their imperfect nature significantly limits their practical application. The situation is even worse for simulating fermionic systems, which underlie most of material science and chemistry, as one has to adopt fermion-to-qubit encodings which create significant additional resource overhead and trainability issues. Thanks to recent advances in trapping and manipulation of neutral atoms in optical tweezers, digital fermionic quantum simulators are becoming viable. A key question is whether these emerging fermionic simulators can outperform qubit-based simulators for characterizing strongly correlated electronic systems. Here, we perform a comprehensive comparison of resource efficiency between qubit and fermionic simulators for variational ground-state emulation of fermionic systems in both condensed matter systems and quantum chemistry problems. We show that the fermionic simulators indeed outperform their qubit counterparts with respect to resources for quantum evolution (circuit depth), as well as classical optimization (number of required parameters and iterations). In addition, they show less sensitivity to the random initialization of the circuit. The relative advantage of fermionic simulators becomes even more pronounced as interaction becomes stronger, or tunneling is allowed in more than one dimension, as well as for spinful fermions. Importantly, this improvement is scalable, i.e., the performance gap between fermionic and qubit simulators only grows for bigger system sizes.

In NISQ era, hybrid quantum-classical variational algorithms are seen as the most promising route to demonstrate supremacy over fully classical computation paradigm [52,53].In these algorithms, the target outcome is written variationally in terms of a minimum cost function which is measured as the output of a parameterized quantum circuit.The measured cost function is fed into a classical optimizer to be minimized which updates the parameters of the circuit.By iterating the algorithm eventually, the cost function reaches its minimum.Indeed, dividing the complexity between a quantum circuit and a classical optimizer allows a fairly shallow quantum circuit to potentially achieve a quantum advantage.Thus, a complete resource analysis of any variational quantum algorithm necessarily involves analysis of both quantum and classical resources.Quantum resources can be quantified through circuit depth or equivalently the number of two-qubit gates.Classical resources, however, quantify the complexity of the optimization problem through the number of required iterations as well as the number of optimizable parameters.A lot of efforts have been dedicated to enhancing the efficiency of variational quantum algorithms by saving both quantum and classical resources through inventing error mitigation methods [54][55][56][57][58], efficient design of quantum circuits [59,60], exploiting symmetries [61][62][63][64] and accelerating the classical optimizer [65][66][67][68].
Variational Quantum Eigensolver (VQE) [52,[69][70][71][72] is perhaps the most widely investigated variational quantum algorithm which has been demonstrated experimentally for both condensed matter systems [21,73,74] and quantum chemistry problems [24,[75][76][77][78][79][80][81].The goal of the VQE algorithm is to prepare an individual eigenstate, e.g. the ground state, of a given Hamiltonian.For the most basic version of the VQE, the average energy is used as the cost function whose minimization results in the ground energy of the system.Thus, the output of the optimal circuit represents the ground state.
The conventional qubit-based quantum simulators cannot directly simulate fermionic systems.Certain mathematical transformations, such as Jordan-Wigner [82][83][84][85] or Bravyi-Kitaev [86], are needed to map fermionic operators into Pauli strings [12].The resulting qubit Hamiltonian might be highly non-local, which makes the VQE algorithm more costly due to increased required measurement and makes it more susceptible to Barren plateau [87][88][89][90][91][92] which significantly slows down the training.Recently, the network of trapped atoms in optical tweezers has been exploited to realize an analog Fermi Hubbard quantum simulator [93][94][95].In a recent proposal [96], such systems have been proposed for realizing a fermionic digital quantum simulator in which two-particle quantum gates can be performed by exciting the atoms to Rydberg states.This idea can also be utilized for simulating gauge field theories [97,98].Indeed, the feasibility of fermion-based quantum simulators opens huge possibilities for simulating fermionic systems without the overhead cost of the transformation from fermionic operators to Pauli strings.A quantitative analysis is indeed essential to find out whether fermionic simulators can make VQE simulations more resourceefficient.If so, how this resource efficiency depends on the geometry of the fermionic system and scales with the system size.
The outline of the paper is as follows.We briefly review the background material for this work in the rest of this section, basics of the VQE algorithm are introduced in II, followed by their implementations for qubit (II A) and fermion based (II B) simulators.We detail our circuit design and numerical simulation techniques in Sec.III.Sec.IV contains the results for variational simulation of the spinless Fermi-Hubbard model with fermionic simulators in 1D open chain (IV A), ladder (IV B), and 2D lattice (IV C) configurations.This is extended to the spinful Fermi-Hubbard model in Sec.V with 1D open chains (V A), and ladder (V B) configurations.The scalability of classical and quantum resource requirements for ground-state finding with fermionic and qubit simulators is analyzed in Sec.VI. Results for Fermionic simulation for the water molecule are presented in Sec.VII, before we conclude in Sec.VIII. .

II. VARIATIONAL QUANTUM EIGENSOLVER ALGORITHM
In this section, we briefly review the Variational Quantum Eigensolver (VQE), as one of the most widely used near-term algorithms in the field of quantum simulation.VQE has been developed to determine individual eigenvalues of a many-body system and prepare their corresponding eigenstates.
This implies that if |ψ( ⃗ θ)⟩ is expressible, i.e. includes the ground state for a certain choice of ⃗ θ= ⃗ θ opt , then by minimizing the average energy, as the cost function, one can get the ground state energy as E 0 =⟨ψ( ⃗ θ opt )|H|ψ( ⃗ θ opt )⟩.In this case, the output of the simulator corresponds to the ground state, namely |E 0 ⟩=|ψ( ⃗ θ opt )⟩.The parameterized quantum state |ψ( ⃗ θ)⟩ can be obtained through the operation of a parameterized quantum circuit U( ⃗ θ).Theoretically, a quantum analog of the universal approximation theorem has been developed, which indicates that sufficiently deep quantum circuits can approximate any target function to the desired accuracy [70].However, the practically relevant problem of designing the shallowest possible quantum circuit for simulating the ground state of a Hamiltonian remains an area of intensive research.This includes incorporating the symmetries of the system into the design of the circuit [62,63,99] or exploiting evolutionary [60,100] and machine learning [101][102][103] algorithms for simplifying the circuit.
As mentioned before, VQE algorithm relies on two different types of resources: (i) quantum resources which can be quantified through the depth of the quantum circuit or equivalently the number of two-qubit gates R Q ; and (ii) classical resources which quantifies the complexity of the classical optimization through where l I is the average number of iterations that the optimizer needs to iterate for converging to a given precision.The resource efficiency, namely minimizing both R Q and R C , is essential for the scalability of the VQE algorithm and achieving a quantum advantage in NISQ era.
A. VQE Algorithm for Fermionic Systems on Qubit-Based Quantum Simulators In this paper, we focus on simulating the ground state of fermionic many-body systems in both condensed matter physics and quantum chemistry.However, most of the quantum simulators that are available are qubit-based which do not satisfy Fermi statistics.This means that to implement VQE on such systems, one has to first map the fermionic Hamiltonian into a qubit one.The spinless fermionic annihilation and creation operators acting onsite j, represented by c j and c † j respectively, follow the Fermi statistics as where δ jk is the Kronecker delta function, and { , } is the anti-commutator.There are several methods to map these fermionic operators into qubit basis.The Jordan-Wigner transformation is most prominent among such maps [104], and relates Fermion operators {c j } to Pauli spin operators via the following mapping where n j is the number operator, σ α j (with α=x, y, z) is the Pauli operator acting on site j and σ ± j =(σ x j ± iσ y j )/2.Note that the string operator ⊗ j−1 k=1 σ z k is highly non-local which creates multi-body interaction in the qubit basis.In particular, when the fermionic Hamiltonian contains long-range tunnelings or describes highdimensional lattices, the corresponding qubit Hamiltonian becomes highly non-local.The emergence of such non-local terms puts extra resource overheads on VQE simulation and creates several drawbacks, including (i) the number of measurements for estimating average energy increases; (ii) the optimization becomes more susceptible to the Barren plateau phenomenon due to nonlocal terms in the cost function which makes the convergence slower [87,90]; and (iii) the circuit design becomes challenging and quite arbitrary.Indeed, if one could bypass the Jordan-Wigner transformation step for mapping fermions to qubits, then the above inefficiencies would be prevented, resulting in a more efficient simulation of fermionic quantum systems.

B. VQE simulation on Fermion-Based Quantum Simulators
In the qubit quantum computational paradigm, the qubits are assumed to be distinguishable.On the other hand, elementary particles or atoms, either Bosons or Fermions, are completely indistinguishable.Thus, for Fermionic or Bosonic computation models, the natural approach is to consider distinguishable localized energy modes on a similar footing as qubits.For Bosons, this leads to elementary gate operations being linear optical unitaries augmented with some non-Gaussian operation like a photon detector, being universal [105].For Fermions, the situation is more interesting, as projective measurement on Fermion modes does not achieve universality when coupled with free-Fermionic quadratic unitaries [106].Importantly for our purpose, Bravyi and Kitaev [86] obtained the following set of Fermionic unitaries which are universal when restricted to global particlenumber conserving transformations Physically, the first unitary acts on a single site and encodes information about particle numbers for each mode, the second unitary signifies interaction between fermions in different sites, and the last unitary signifies particle hopping between sites.This is an interesting feature that makes fermions very distinct from qubits.While in qubit systems, one type of two-qubit operation, e.g.controllednot, is enough for universal computation, in fermionic quantum simulators one needs two different types of twoparticle operations.
Recently, a novel approach for developing digital fermionic quantum simulators in neutral atom arrays with interactions mediated by Rydberg states has been proposed [96].In fact, the authors argue that the following generalization of the Bravyi-Kitaev gate set can be precisely realized in experiments.
in which where, U tun jj ′ is a tunneling gate allowing fermions to shift at different sites, and U int jj ′ is an interaction gate which gives relative phases between different charge configurations.Thus, the fermion systems can be simulated using these two gates on fermionic quantum simulators directly, instead of being implemented indirectly on a qubit-based simulator by using the Jordan-Wigner transformation with the additional overhead of quantum resources.
The digital fermionic quantum simulator in Ref. [96] exploits fermionic atoms trapped in optical tweezers.In order to implement the interaction and tunneling gates U int jj ′ and U tun jj ′ one can excite the atoms into Rydberg states.In particular, tunneling gates can be realized by different methods, two of which are critical, the MERGE protocol and the SHUTTLE protocol.In the MERGE method [45,95,107], two nearby optical traps are brought so close together that atoms can tunnel through to each other and then separate these two traps.In this case, the parameters ⃗ θ of U tun jj ′ are determined by the behavior of tweezers and the customdesigned merging and splitting protocols.On the other way, namely the SHUTTLE approach, two tweezer arrays are employed each trapping different spin states of the atom [108].Then, when these tweezers overlap, Rydbergmediated level transitions in the spin subspace also lead to coherent delocalization of positional degrees of freedom.This delocalized position wave function can then be collapsed in the new site of choice, making it possible to implement the U tun jj ′ .The interaction gate U int jj ′ can be similarly implemented by bringing the corresponding optical tweezers within Rydberg blockade distance and then driving a laser pulse to create a Rydberg excitation that couples to the internal angular momentum degrees of freedom.By suitably choosing the shape of the laser (purple) is applied on all neighboring sites and then R z j (θ) (red) is applied on all sites, (c) Decomposition of one layer of fermionic circuit used in this paper, first U tun jj ′ (orange) is applied on all neighboring sites and then U int jj ′ (green) is applied on the same way.
pulse, U int jj ′ [109] can be obtained.For unrelated site pairs, both U tun jj ′ and U int jj ′ can be parallelized, i.e., implemented simultaneously all across the length of an array of neutral atoms.

III. CIRCUIT DESIGN FOR BOTH QUBIT AND FERMION QUANTUM SIMULATORS
In this paper, we focus on Hamiltonians in which the number of fermions is conserved.By mapping to qubit basis, the resulted Hamiltonian H q preserves the number of excitation, namely [H q , S z ]=0 where S z =1/2 j σ z j .To incorporate this symmetry in the circuit design we rely on two-qubit gates U qubit jj ′ and single qubit rotations R z j (θ) which take the form In Fig. 1(a), we depict the quantum circuit which realizes the two-qubit operation U qubit jj ′ .By combining these gates one can construct a quantum circuit that naturally preserves the number of particles.One layer of such a circuit is shown in Fig. 1(b).In order to enhance the expressivity, several layers of the circuit are concatenated to make a deeper circuit.
For designing the circuit in fermionic digital quantum simulators we can use different arrangements of U tun jj ′ and U int jj ′ .We have found out that a particular arrangement performs better for the VQE simulation.In this advantageous configuration, in a given circuit layer, all the tunneling gates U tun jj ′ are grouped together to act on all the bonds and then followed by similar arrangements for interaction gates U int jj ′ .In Fig. 1(c), we depict one layer of a typical circuit for a one-dimensional system.Note that in the following section, the qubit and fermion circuits have a little difference as shown in Fig. 1, but keep the same gate arrangements.Initially, the circuit parameters of both VQEs are randomly set to near 0 to ensure ⟨H⟩ of both VQEs start from similar values.
For the classical optimization part, we use the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm [110][111][112][113], and restrict maximum iterations l I to 150 to minimize the average energy ⟨H⟩, which is sufficient in most cases.It is worth mentioning that other classical optimizers also work and do not qualitatively affect the conclusions.In addition, for each case, we repeat the procedure 100 times for random initializations, over which the results are averaged to be statistically meaningful.
It is worth emphasizing that depending on the Hamiltonian to be simulated, the above qubit circuit design may not be the most efficient one.For example, unitary coupled cluster ansatz-based circuits are more favored for molecular simulations.In particular, an adaptive optimization of circuit design [114], has become very popular in recent years.We stress that this technique remains equally available for fermionic simulators too, with the operator pool of Ref. [114] being the fermionic gates in Eq. ( 9).However, for the sake of fair comparison between qubit and fermionic quantum simulators, we use the same strategy, i.e. adopting symmetry-preserving circuits and using the same classical optimizer, for both qubits and fermions.Indeed, one can use more complicated strategies, such as adaptive optimization of the circuit design [114], in both cases and improve the performance.Here, our objective is to provide a comparison of the performance of the qubit and the fermionic simulators and thus these improvements are out of the scope of the paper and left for future investigations.

IV. VQE SIMULATION OF SPINLESS FERMIONIC SYSTEMS
Now we start the analysis of Fermionic quantum systems.In this section, we neglect the explicit contribution of spin degrees of freedom and show how VQE simulation of such systems on a fermionic digital quantum simulator can outperform the performance of qubit-based quantum simulators.We focus on the simplest VQE case, namely finding the ground energy of spinless fermionic Hamiltonians.In the absence of spin degrees of freedom, the Pauli exclusion principle dictates that there can only be one electron per site.Therefore, a typical spinless fermion Hamiltonian reads as where, t represents particle tunneling, V is interaction between neighboring sites and µ is the chemical potential.The summation runs over all nearest neighbor sites ⟨jj ′ ⟩ on a given geometry of the system.This Hamiltonian conserves the number of fermions as [H, n]=0, where n= j n j .This implies that all the eigenstates of the Hamiltonian have a fixed number of fermions.In particular, the number of fermions N f in the ground state depends on V which results in the filling factor f =N f /N .As a further simplification, we assume that the sites are shallow, i.e. µ ≪ t, which is ensured by simply choosing µ=0 and t=1.Note that while this parameter choice is not the most general, our focus here is to present a proof-of-principle improvement with the fermionic gates.Besides, for smaller atoms like Li, or relatively weak trapping laser fields, this assumption can be physically justified.It is also the regime where the MERGE protocol mentioned above is most successful.We now denote H(t, V, µ) in the equation above as simply H(V ).
We simulate N =12 sites arranged in three specific geometric configurations, a 1×12 Chain (Fig. 2), a Ladder (Fig. 3) with 2 rungs, i.e., 2×6 system, and a 3×4 Rectangular Lattice (Fig. 4).Note that by spreading the system in two dimensions the entanglement in the ground state increases, which is known as area-law.Therefore, we expect that the simulation of such systems should be more resource-demanding than simple one-dimensional chains.
A. Geometry 1: Open Chain First, we consider the open boundary spinless fermion chain model (Fig. 2(a)).Physically, when V =0, the model reduces to the standard tight-binding model, and the ground state is at half-filling f =1/2.As the interaction V increases, the corresponding energy penalty leads to a reduction of the fermion numbers N f in the ground state, one electron at a time, i.e., in discrete steps to and so on (see Fig. 2(b)).For simulation, we consider two values of V at V =0 and V =2, with the corresponding ground state fermion numbers being N f =6 and N f =5 respectively.The fermion and qubit circuit structure is shown in Fig. 2(c) and (d), consisting of U int jj ′ and U tun jj ′ from Eq. ( 9).In each circuit layer, the tunneling gates are grouped together which is then followed by interaction gates.There are two steps to fulfill either tunneling or interaction gates for all neighboring sites, resulting in a total depth of 4 for each layer.
The simulation results, quantified by average energy ⟨H⟩=⟨ψ( ⃗ θ)|H|ψ( ⃗ θ)⟩ and fidelity F=|⟨ψ( ⃗ θ)|E 0 ⟩| 2 , for V =0 are depicted in Figs.2(e)-(f).Both fermion and qubit circuits contain four layers.As the figure shows, the fermion quantum simulator converges faster and shows smaller error bars.This means that fermionic simulators are not only faster but also more robust against random initialization.Note that non-interacting onedimensional fermionic systems can be efficiently solved through Jordan-Wigner transformation on classical computers.Thus, aside from benchmarking, it is practically more important to investigate the non-solvable interacting fermions.We consider the interaction strength V /t=2 for which the filling factor of the ground state changes to N f =5.The resulting average energy and fidelity are shown in Figs.2(g)-(h), respectively.Different from the non-interacting case, convergence is achieved with only L=4 layers for the fermionic circuit but requires L=6 layers for the qubit circuit.To achieve the threshold fidelity F=0.95, the fermion VQE only needs l I =75 iterations, while the qubit simulator requires l I =106 iterations.
Remarkably, in the interacting case, the improvement in convergence speed is even more significant than in the non-interacting case.The error bars still remain smaller for fermionic simulators.These results clearly show the superiority of fermionic quantum simulators over their qubit counterparts for simulating the fermionic manybody systems.

B. Geometry 2: Ladder
Now we consider the ladder model with two rungs, see Fig. 3(a), as an intermediate between the 1D chain discussed above and the full 2D model.As before, increasing the repulsive interaction V leads to a decline in the ground state fermionic number density, as shown in Fig. 3(b).For designing the quantum circuit, we follow the same logic as 1D systems.In each circuit layer, we first group the tunneling gates U tun jj ′ together and then perform the interaction gate U int jj ′ in a similar fashion.Since there are more bonds in the ladder geometry the full operation of either tunneling or interaction gates can be fulfilled in three steps as shown in Fig. 3(c), resulting in total depth 6 for each layer.The results, depicted in Figs.3(e)-(f), show that the fermionic circuit converges far faster, requiring only l I =33 iterations to reach a target fidelity of F=0.95, compared to 89 iterations for the qubit circuit.As the results show by changing the geometry from a 1D chain to a ladder the improvement achieved by the fermionic simulator becomes even more pronounced.This is because the corresponding qubit Hamiltonian in the case of ladder is significantly more non-local which results in overhead resources.We note that both the qubit and fermionic circuits only need L=3 layers to converge within the allowed number of iterations, which is less than L=4 layers required for the 1D chain previously.However, the circuit depth of each layer in this case is higher compared to the 1D chain and thus more gates are required.Again the error bars in fermionic simulators are smaller than the qubit simulators showing more robustness against the random initializations.

C. Geometry 3: Rectangular Lattice
As demonstrated above, by extension the geometry in 2D structures of the fermionic circuit shows a noticeable performance advantage over the qubit circuits.We now consider a more pronounced 2D system, which is a 3×4 rectangular lattice, shown in Fig. 4(a).In general, we expect the 2D system to be more challenging to simulate than the 1D system because of stronger entanglement and the presence of more couplings in the system.As before, the filling factor of the ground state depends on the repulsive interaction V , as shown in Fig. 4(d).We again follow the same logic of grouping the gates in the circuit.For the rectangular lattice, we need four steps to perform either tunneling gates U tun jj ′ or interaction gates U tun jj ′ on all neighboring bonds, as shown in Fig. 4(c).This means that each circuit layer has a depth of 8.For converging to the ground energy, this fermionic circuit only needs three layers.In Figs.4(e)-(f), the average energies and the corresponding fidelities obtained by both fermionic and qubit simulators are shown.In order to converge to fidelity F=0.95, one needs L=3 and L=6 layers in fermionic and qubit circuits, respectively.These results show that for higher dimensions, the gap in performance widens between fermionic and qubit circuit ansatzes.As the figures show, for the qubit circuit, even with twice as many layers, it is still hard for it to converge to the true ground energy.In terms of fidelity, the qubit circuit needs l I =135 iterations to achieve a fidelity of F=0.95 but fermionic approach only needs l I =39 iterations.
Indeed, the VQE simulation of spinless fermionic Hubbard Hamiltonian shows that the fermionic circuit construction is superior to the qubit circuit construction for the simulation of spinless Hubbard Hamiltonians, with the performance advantage widening as we move towards higher dimensional systems, or indeed, for the same system with bigger interaction strengths.This can be seen more quantitatively in the table I, where we compare the performance of fermionic and qubit simulators in terms of both classical and quantum resources.

V. SIMULATING SPINFUL FERMIONIC MANY-BODY SYSTEMS
In the previous section, we simulated the ground state of spinless fermionic Hubbard Hamiltonian and established the advantage gained through the use of fermionic quantum simulators.We now want to extend these re-  sults for the spinful Hubbard Hamiltonian, described as As before, we again assume t=1 and µ=0.Notice that since there are two spins, the intersite repulsion term U is non-trivial between oppositely signed spins.We can then denote the full Hamiltonian H(t, U, V, µ) in terms of adjustable variables as H(U, V ).
For the spinless system, each site only has two different states non-occupied or occupied.For N -sites, this information can be encoded in an N -qubit register.For the full description of spin 1/2 fermions, each site has four possible configurations, non-occupied, occupied with one spin-up electron, occupied with one spin-down electron, and occupied with two opposite-spin electrons.For neutral atom arrays, this entails applying an external magnetic field to induce hyperfine level splittings.Therefore, for both qubit and fermionic simulators each site is encoded by two registers, one represents spin-up and one spin-down.cuit design is altered for the spinful fermionic case from the spinless case discussed in the last section.As mentioned before, there are two types of spinless gates in G set.For spinful fermions, the realizable gates depend on the details of trapping laser beams and pulses that one can apply.If one uses the same setup as Ref. [96] for realizing spinful fermions then the tunneling gate U tun jj ′ is replaced by two unitaries for each spins as

Before progressing further, let us describe how our cir-
where α= ↑ , ↓ represent the spin of the fermion.The interaction gate U int jj ′ , however, is replaced by four different gates where, α, α ′ ∈ {↑, ↓} label the four possible interaction gates based on spin degrees of freedom.Based on the gate implementation of Ref. [96], these gates can only be realized sequentially.Therefore, in our circuit design for spinful fermions, we also perform these gates sequentially.Note that by using alternative trapping methods and exploiting more Rydberg states one may be able to merge some of these unitaries and thus simplify the circuit.Nonetheless, for the sake of consistency, we stick to the operators given in Eqs. ( 13) and ( 14).The benchmarking qubit circuit is built in analogy with the Fermionic circuits by retaining the same connections, and with weights defined by Eq. ( 10) earlier.

A. Geometry 1: Open Chain
As mentioned above, in spinful systems with N sites one has to use 2N registers.Hence, to keep the same simulator size as the spinless model, we have to cap the system size of the spin model at N =6 sites.Notice that when N =6, the simplest ladder example with two rungs coincides with the simplest 2D rectangular lattice of 2×3 sites.We follow the exact same methodology as the spinless case.
First, we consider the fermion spin chain model with size N =6 as shown in Fig. 5(a).The corresponding simulator structure is on the right.As we discussed, the simulator consists of 12 registers, 6 each for each spin configuration.The number of fermions N f in the ground state for varying V and U is shown in Fig. 5(b).As V or U increases, the number of fermions declines from 6 to 2, step by step.In general, the onsite repulsion between two fermions is stronger than the coulomb repulsion between two neighbor sites, so we set V =0.5 and U =2.5.To simulate this spin model, the circuit of fermion VQE is designed as shown in Fig. 5(c), where tunneling gates only act on two neighbor spin-up sites or two neighbor spin-down sites in accordance with Pauli exclusion principle, as given by the hopping term of the Hamiltonian in Eq. (12). in contrast, interaction gates are executed on the spin subspace at every site.Therefore, in each layer, for implementing tunneling gates two steps are needed while for interaction gates three steps.This makes the depth of a single layer equal to L=5.The numerical results for average energy and fidelity are shown in Figs.5(e)-(f).The results are even more strongly in favor of the fermionic circuit than the spinless examples before.The qubit-based VQE converges far more slowly to the actual ground state even with L=7 layers, taking l I =325 iterations to reach fidelity F=0.95, while the fermionic simulators with L=5 layers converge to the ground state with the same fidelity F=0.95 after only l I =73 iterations.The quantum resource requirement, i.e., two-party elementary gate counts, is also more than halved by the fermionic circuit (150 vs. 420 for the qubit circuit).

B. Geometry 2: Ladder
For a fermionic Hubbard model with N =6 sites, the only two-dimensional geometry is a 2×3 ladder.The schematic of the system and its corresponding simulator with 12 registers are shown in Fig. 6(a).The number of fermions N f in the ground state of the system as a function of U and V is depicted in Fig. 6(d).To simulate this model, the corresponding one-layer quantum fermionic circuit is displayed in Fig. 6(c).Similarly, the tunneling gates only act on sites that represent the same spins while interaction gates operate between all neighboring sites.Therefore, in a single circuit layer, the tunneling gates are realized through 3 steps while interaction gates require 6 steps to fulfill.This means that every circuit layer has a depth of 9.The numerical results for the average energy and the corresponding fidelities on a qubit circuit with L=6 layers and fermionic circuits with L=5 layers are depicted in Figs.6(e)-(f).Again the results demonstrate that a fermionic circuit converges to fidelity F=0.95 after l I =35 iterations while the qubit circuit requires l I =74 iterations.This means that the fermionic simulators are more efficient with respect to both quantum and classical resources.In summary, the fermionic circuit is again, clearly better in the spinful case.The complete table for counting classical and quantum resources for each simulation is provided in Table II, which quantitatively backs up this assertion.

C. Resource Efficiency
Table II details both classical and quantum resources utilized by qubit and Fermionic simulators for various configurations of the spinful Hubbard Hamiltonian.Even more prominently than the spinless case, the Fermionic simulator consumes significantly less quantum resource R Q and classical resource R C .As an illustrative example, We could only converge the qubit simulator for the open chain to 95% fidelity after 325 iterations, however, the Fermionic simulator converges to this target fidelity nearly five times faster (73 iterations) and with shallower circuits (only 5 layers vs 7 layers for the qubit), thus proving its superiority.

VI. RESOURCE EFFICIENCY AND SCALABILITY
Tables I and II detail both classical and quantum resources utilized by qubit and Fermionic simulators for various configurations of the spinless and spinful Hubbard Hamiltonians.The quantum resource counts R Q , i.e., the two-body gate count for the whole circuit, increases as the systems acquire more and more width, for both Fermionic and qubit architectures.However, unlike the Fermionic case, the qubit circuit requires a lot more depth to converge for the 2D lattice.Also, while the quantum resource required increases for the 2D lattice, the open chain actually requires the most classical resource for Fermionic simulators, which is in contrast with the qubit simulator.
As demonstrated above, the fermionic simulator showed significant advantages over the qubit-based simulator on both quantum and classical resources.From table I and table II for spinless and spinful cases respectively, we know these advantages grow bigger with the spatial dimension of the system.The important question is -do these advantages scale up when the system size N increases ?.This is crucial for a NISQ quantum simulator because of limited resources available.Here, we take the counts of R Q and R C to compare the resource consumption of qubit-and fermionic-based quantum simulators.
We note from our data that the resource requirements grow polynomially with system size N , i.e., While the exponents quoted above are not conclusive since we had to confine ourselves to short chains, they nonetheless confirm that vis-a-vis the qubit simulator, the fermionic simulator shows more resource efficiency for both quantum and classical parts of the protocol as N increases, i.e., the relative advantage with respect to the qubit simulator is scalable.In Table III, we list the scaling exponents for other geometries studied in the paper.Due to    In the previous sections, we discussed the use of direct fermionic circuits to implement the variational algorithm for obtaining the ground state of spinless or spinful fermionic many-body models with several different geometric configurations.These results all point towards a significant advantage being obtainable in terms of both classical and quantum resource efficiency than the standard Qubit-based VQE schemes.
In this section, we consider an example from the domain of quantum chemistry to illustrate the advantage of Fermionic simulators.We remember that chemical bonds happen because interacting electrons, each delocalized between different atomic orbitals, are in a bound state, i.e., have a negative ground state energy.This molecular Hamiltonian can thus be written as a many-electron Fermionic Hamiltonian where in the second-quantized notation, c i (c † i ) are lowering and raising Fermionic operators for the i-th orbital.The ground state energy of the molecular Hamiltonian then gives an idea of the strength of the bond.Thus, qubit-based quantum simulators again encounter the same problem of fermionto-qubit encoding-induced overheads, and we expect the Fermionic simulator to be significantly better.In fact, in Ref. [96], two four-body generalized gates corresponding to the terms in coupled cluster ansatz of approximating the molecular Hamiltonian [115] were constructed from the repeated application of Fermionic gates U tun jj ′ and U int jj ′ .With the help of these gates, Ref. [96] found the ground state of LiH as a case study.Here, we provide an explicit demonstration of the efficiency of the fermionic simulator by finding the ground energy of a molecule of water, whose molecular structure is shown in Fig. 7(a) [116].For the H 2 O molecule, we have 4 active orbitals and 4 electrons hopping between them.The simulator has to, therefore, contain 8 registers to account for spin degrees of freedom.Different from the approach of [96], we do not explicitly use the four-body Fermionic gates as building blocks of the unitary, instead using twobody tunneling and interaction gates introduced earlier.One layer of the fermionic quantum circuit is shown in Fig. 7(c), which shows that the tunneling gates can be implemented in 2 steps while the interaction gates require 3 steps, therefore, a single layer of the circuit has the depth of 5.The results of average energy and fidelity are shown in Figs.7(d)-(e), respectively.Again, compared with qubit circuits, fermionic circuits outperform qubit-based circuits in terms of convergence speed (l I = 8 for Fermionic vs. 13 for qubit, less is better), and quantum resources (R Q = 72 for Fermionic vs. 144 for qubit, less is better).In terms of classical resource cost, both are roughly similar (R C = 1280 for Fermionic vs. 1560 for qubit, less is better), and the error bars representing variability of output on randomized initializations are narrower for the Fermionic circuit than the qubit circuit.Thus, even for the generic Fermionic circuit, an advantage over the qubit paradigm is present.Please note that Fig. 7(d)-(e) are in the semilog scale and they also indicate the onset of the barren plateau problem at larger iterations, albeit beyond our fidelity accuracy threshold.
Let us also note that we have only considered the total particle number conservation property in this algorithm.Incorporating all available symmetries of molecules, such as the total spin, time-reversal, and point-group and crystallographic symmetries, into the construction of fermionic quantum simulators should further reduce the required resource costs and improve the performance of the fermion simulation algorithms even more.

VIII. CONCLUSION
Efficient simulation of strongly interacting fermionic systems is crucial in every physical and technological domain.Shortcomings of classical computation are long known for the simulation of quantum systems, which leaves quantum simulators as the only viable solution.Quantum simulators are rapidly emerging in various physical platforms, with qubit architectures becoming the default arrangement.However, the imperfect nature of NISQ simulators prohibits universal large-scale quantum computation, which makes resource efficiency a critical issue for the foreseeable future.Simulation of fermionic systems on qubit-based quantum simulators requires fermion-to-qubit mappings which introduce additional resource overhead, limiting the scalability to larger systems and introducing new challenges for trainability.Analog fermionic quantum simulators [94,95] have been experimentally demonstrated in neutral atom arrays, paving the way for the realization of programmable fermionic digital simulators.Moreover, in a recent parallel work [117], the inverse problem of setting the optical tweezer trap configuration for various Hubbard model configurations was also treated.In this paper, we compare the performance of fermionicand qubit-based simulators for VQE simulations of the fermionic many-body ground state in both condensed matter systems and quantum chemistry problems.Our results show that fermionic quantum simulators offer a clear, scalable, and significant advantage in terms of both classical as well as quantum resources.In addition, fermionic simulators show less sensitivity to random initializations of circuit parameters.Thus, our work opens up the possibility of simulating much bigger fermionic systems in future, which can potentially have a big impact on material design and quantum chemistry.

FIG. 1 .
FIG. 1.(a) Decomposition of two-qubit gates U qubit jj ′ (purple) in terms of rotations and CNOT gates, (b) Decomposition of one layer of qubit circuit used in this paper, first U qubit jj ′

FIG. 2 .
FIG. 2. (a) Open chain configuration for N =12 spinless Fermi-Hubbard model.(b) Dependence of particle number density on the Coulomb repulsion strength V for the ground state.(c) One layer of the fermionic circuit which is decomposed into four steps (orange steps denote U tun jj ′ , green steps denote U int jj ′ .).(d) One layer of the qubit circuit which is decomposed into three steps (purple steps denote U qubit jj ′ , and red steps denote the phase rotations).(e) Performance of the fermionic circuit-based VQE (orange) vs the qubit circuit-based VQE (green) in terms of convergence to ground state energy for Fermi-Hubbard model interaction strength V = 0. (f) Performance of the fermionic circuit-based VQE (orange) vs the qubit circuit-based VQE (green) in terms of fidelity with ground state for Fermi-Hubbard model interaction strength V = 0. (g) Performance of the fermionic circuit-based VQE (orange) vs the qubit circuit-based VQE (green) in terms of convergence to ground state energy for Fermi-Hubbard model interaction strength V = 2. (h) Performance of the fermionic circuit-based VQE (orange) vs the qubit circuit-based VQE (green) in terms of fidelity with ground state for Fermi-Hubbard model interaction strength V = 2. Tunneling strength t = 1 throughout.

FIG. 3 .
FIG. 3. (a) Ladder 6×2 configuration for N =12 spinless Fermi-Hubbard model.(b) Dependence of particle number density on the Coulomb repulsion strength V for the ground state.(c) One layer of the fermionic circuit which is decomposed into 6 steps (orange steps denote U tun jj ′ , green steps denote U int jj ′ .).(d) One layer of the qubit circuit which is decomposed into 4 steps (purple steps denote U qubit jj ′ , and red steps denote the phase rotations).(e) Performance of the fermionic circuit-based VQE (orange) vs the qubit circuit-based VQE (green) in terms of convergence to ground state energy for Fermi-Hubbard model interaction strength V = 2. (f) Performance of the fermionic circuit-based VQE (orange) vs the qubit circuit-based VQE (green) in terms of fidelity with ground state for Fermi-Hubbard model interaction strength V = 2. Tunneling strength t = 1 throughout.

FIG. 4 .
FIG. 4. (a) Rectangular 3 × 4 configuration for N =12 spinless Fermi-Hubbard model.(b) One layer of the qubit circuit is decomposed into 5 steps (purple steps denote U qubit jj ′ , and red steps denote the phase rotations).(c) One layer of the fermionic circuit which is decomposed into 8 steps (orange steps denote U tun jj ′ , green steps denote U int jj ′ ).(d) Dependence of particle number density on the Coulomb repulsion strength V for the ground state.(e) Performance of the fermionic circuit-based VQE (orange) vs the qubit circuit-based VQE (green) in terms of convergence to ground state energy for Fermi-Hubbard model interaction strength V = 2. (f) Performance of the fermionic circuit-based VQE (orange) vs the qubit circuit-based VQE (green) in terms of fidelity with the ground state for Fermi-Hubbard model interaction strength V = 2. Tunneling strength t = 1 throughout.
with scaling exponents β Q and β C respectively.As an illustration, demonstrated in Fig 8, our data for the spinless 1D open chain indicates that β Q = 2.19, β C = 3.90 for the fermionic circuit vs β Q = 2.81, β C = 4.48 for the qubit circuit.

FIG. 5 .
FIG. 5. (a) Open chain configuration for N =6 spinful Fermi-Hubbard model as simulated by qubit (left) and fermonic (right) simulators.(b) Dependence of particle number density on the site depth U and Coulomb repulsion strength V for the ground state.(c) One layer of the fermionic circuit which is decomposed into six steps (orange steps denote U tun jj ′ ,αα , green steps denote U int jj ′ ,αα ′ .).(d) One layer of the qubit circuit which is decomposed into five steps (purple steps denote U qubit jj ′ , and red steps denote the phase rotations).(e) Performance of the fermionic circuit-based VQE (orange) vs the qubit circuit-based VQE (green) in terms of convergence to ground state energy for V = 0.5, U = 2.5.(f) Performance of the fermionic circuit-based VQE (orange) vs the qubit circuit-based VQE (green) in terms of fidelity with ground state for V = 0.5, U = 2.5.Tunneling strength t = 1 throughout.

FIG. 6 .
FIG. 6.(a) Ladder 2×3 configuration for N =6 spinful Fermi-Hubbard model as simulated by qubit (top) and fermonic (bottom) simulators.(b) One layer of the fermionic circuit which is decomposed into 9 steps (orange steps denote U tun jj ′ ,αα , green steps denote U int jj ′ ,αα ′ ).(c)One layer of the qubit circuit is decomposed into 7 steps (purple steps denote U qubit jj ′ , and red steps denote the phase rotations).(d) Dependence of particle number density on the site depth U and Coulomb repulsion strength V for the ground state.(e) Performance of the fermionic circuit-based VQE (orange) vs the qubit circuit-based VQE (green) in terms of convergence to ground state energy for V = 0.5, U = 2.5.(f) Performance of the fermionic circuit-based VQE (orange) vs the qubit circuit-based VQE (green) in terms of fidelity with ground state for V = 0.5, U = 2.5.Tunneling strength t = 1 throughout.

FIG. 7 .
FIG. 7. (a) The molecular geometry of water with 4 spin orbits, (b) One layer of the qubit circuit decomposed into 5 steps (purple steps denote U qubit jj ′ , red denote phase rotation R z j .)(c) One layer of the fermionic circuit decomposed into 7 steps (orange steps denote U tun jj ′ , green steps denote U int jj ′ ).(d) Performance of the fermionic circuit based VQE (orange) vs. the qubit circuit-based VQE (green), in terms of convergence to ground state energy.(e) Performance of the fermionic circuitbased VQE (orange) vs. the qubit circuit-based VQE (green), in terms of fidelity to the actual ground state.

FIG. 8 .
FIG. 8. (a) Required quantum resource RQ, and (b) the corresponding classical resource RC, as system size N increases for the spinless 1D open chain.Orange lines denote fermionic based and green lines denote qubit-based simulator performance.Inset to Fig. (a) depicts fitting lines for log RQ vs log N (slopes ≈ 2.19 for Fermionic vs. ≈ 2.81 for qubit ).Inset to Fig.(b) depicts fitting lines for log RC vs log N (slopes ≈ 3.90 for Fermionic vs. ≈ 4.48 for qubit).Interaction strength V = 2 in every case.Tunneling strength t = 1 throughout.
p real parameters ⃗ θ={θ 1 , θ 2 , ...., θ lp }, then the ground state energy is bounded by average energy, namely The simplest version of the VQE is for simulating the ground state and is built on the Ritz variational principle, i.e., if one chooses a trial state |ψ( ⃗ θ)⟩=U( ⃗ θ)|ψ 0 ⟩ in which U( ⃗ θ) is a unitary operator parametrized by l

TABLE I .
Table for resource count of the simulation of various configurations of spinless fermionic Hubbard Hamiltonians with N =12 sites where the parameters are chosen to be t=1 and V =2.

TABLE II .
Table for resource count of the simulation of various configurations of spinful fermionic Hubbard Hamiltonians with N =6 sites where the parameters are chosen to be t=1, U =2.5 and V =0.5.C = 5.86, β Q = 3.24) than the spinless (β C = 4.48, β Q = 2.81).However, for the fermionic architecture, the simulation of the spinful chain is roughly as scalable as the spinless in terms of gate counts (β Q = 2.23 for the spinful chain vs β Q = 2.19 for the spinless chain) and the spinful chain actually shows better simulation scalability in terms of classical optimization resource requirements (β C = 3.35 for the spinful chain vs β C = 3.90 for the spinless chain).Together, they indicate that given the severe constraints of the NISQ era, Fermionic simulators can allow us to tackle systems, especially spinful systems, far larger than the existing qubit simulators can.