Low Depth Quantum Simulation of Electronic Structure

Quantum simulation of the electronic structure problem is one of the most researched applications of quantum computing. The majority of quantum algorithms for this problem encode the wavefunction using $N$ Gaussian orbitals, leading to Hamiltonians with ${\cal O}(N^4)$ second-quantized terms. We avoid this overhead and extend methods to the condensed phase by utilizing a dual form of the plane wave basis which diagonalizes the potential operator, leading to a Hamiltonian representation with ${\cal O}(N^2)$ second-quantized terms. Using this representation we can implement single Trotter steps of the Hamiltonians with linear gate depth on a planar lattice. Properties of the basis allow us to deploy Trotter and Taylor series based simulations with respective circuit depths of ${\cal O}(N^{7/2})$ and $\widetilde{\cal O}(N^{8/3})$ for fixed charge densities - both are large asymptotic improvements over all prior results. Variational algorithms also require significantly fewer measurements to find the mean energy in this basis, ameliorating a primary challenge of that approach. We conclude with a proposal to simulate the uniform electron gas (jellium) using a low depth variational ansatz realizable on near-term quantum devices. From these results we identify simulations of low density jellium as a promising first setting to explore quantum supremacy in electronic structure.


INTRODUCTION
The problem of electronic structure is to simulate the stationary properties of electrons interacting via Coulomb forces in an external potential. The solution of this problem has wide implications for all areas of chemistry, condensed matter physics, and materials science, and is of industrial relevance in the design and engineering of new pharmaceuticals, catalysts, and materials. Recently, quantum computers have emerged as promising tools for tackling this challenge, offering the potential to access difficult electronic structure with reduced computational complexity. However as the age of "quantum supremacy" dawns, so has the realization that many "efficient" quantum algorithms still require more resources than will be available in the near-term.
Originally proposed by Feynman [1], the efficient simulation of quantum systems by other, more controllable quantum systems formed the basis for modern constructions of quantum computation. This early insight has since been refined to encompass more universal and versatile constructions of simulation [2,3]. By combining quantum phase estimation [4] with these techniques, Aspuru-Guzik et al. showed the first efficient quantum algorithm for solving quantum chemistry problems [5]. This initial algorithm was based on adiabatic state preparation combined with Trotter-Suzuki decomposition of the unitary time-evolution operator [6,7] in second quantization.
Many algorithmic and theoretical advances have followed since the initial work in this area. The quantum simulation of electronic structure has been proposed via an adiabatic algorithm [8], via Taylor series time-evolution [9], in second quantization, in real space [10,11], in the configuration interaction representation [12,13], and using a quantum variational algorithm [14,15]. Starting with [16], researchers have sought to map these algorithms to practical circuits and reduce the overhead required for implementation by both algorithmic enhancements [17][18][19][20] as well as physical considerations [21,22]. As a second quantized formulation is generally regarded as most practical for near-term devices, many works have also tried to find more efficient ways of mapping fermionic operators to qubits [23][24][25][26][27].
a Throughout this paper we use the computer science conventions that f ∈ Θ(g) for any functions f and g if f is asymptotically upper and lower bounded by a multiple of g. f ∈ o(g) implies that f /g → 0 in the asymptotic limit. O indicates an asymptotic upper bound and Ω indicates an asymptotic lower bound. A tilde on top of the bound notation, e.g. O(N ), indicates suppression of polylogarithmic factors. In contrast to formally rigorous bounds, a tilde inside of a bound, e.g. O( N ) indicates the bound is obtained empirically.
A major challenge in developing low depth quantum algorithms for quantum chemistry is that electronic structure Hamiltonians often have as many as O(N 4 ) terms, where N is the number of basis functions. This is problematic as many algorithms for time-evolution and energy estimation have costs which scale explicitly with the number of terms. In this paper, we introduce new basis functions which reduce the number of Hamiltonian terms to Θ(N 2 ). We exploit this and other properties of the basis to demonstrate algorithms for state preparation and time-evolution which are simultaneously practical at small sizes and asymptotically more efficient than any in the literature. We conclude with a proposal to simulate the uniform electron gas on a near-term device using planar circuits of only linear depth.
Section I discusses several strategies for quadratically reducing the number of terms in the second quantized electronic structure Hamiltonian. The approach we focus on is to use a plane wave basis and its dual obtained by a unitary rotation, which we call the "plane wave dual basis". In Section I A we show that the dual basis diagonalizes the potential operators, leading to a Hamiltonian with Θ(N 2 ) terms and other desirable properties. The plane wave basis is especially natural when treating periodic systems (e.g. crystalline solids), allowing us to conveniently extend quantum simulation methods to condensed phase systems of interacting electrons. The basis is compact for uniform and near-uniform electron gasses, realized in simple metals as well as electrons in semiconductor wells, and there is well-developed infrastructure (e.g. pseudopotentials) to enable compact representations of other realistic materials. In Section I B we describe a generalization of the fast Fourier transform to second quantized systems of fermions. We show that this operation can be implemented on a planar lattice of qubits with linear depth and that it maps a quantum state between the plane wave basis (where the kinetic operator is diagonal) and the plane wave dual basis (where the potential operator is diagonal). Section II introduces algorithmic improvements for three different quantum approaches to simulating electronic structure. The scaling advantages of the techniques introduced in this paper are compared to prior results in Table I. In Section II A we take advantage of the fermionic fast Fourier transform to show that single Trotter steps of the Hamiltonian can be implemented using circuits of only O(N ) gate depth on a planar lattice without ancillae. This is a large improvement over the previous best result of O(N 9/2 ) depth. We bound the gate depth of Trotterization on a planar lattice within this representation at O(N 7/2 /ǫ 1/2 ) where ǫ is the target precision. This represents more than a quadratic improvement over the best previously proven bounds for Trotterization. In Section II B, we show that the Taylor series method of time-evolution has gate depth O(N 8/3 ) with logarithmic dependence on ǫ. In Section II C we discuss how the structure of the plane wave dual basis reduces the measurements required when estimating the energy through Hamiltonian averaging, significant in the context of variational quantum algorithms. We show that O(N 4 /ǫ 2 ) circuit repetitions are sufficient to estimate the energy of the Hamiltonian to absolute error ǫ. However, we also argue that when one desires to study the properties of a material in its thermodynamic limit, relative error µ is a more relevant metric and in that context, only O(N 2 /µ 2 ) repetitions are required. Even for fixed absolute error, our bounds on the required measurements represent a substantial improvement over prior estimates.
Section III proposes an experiment for simulating the uniform electron gas (also known as jellium) on a near-term quantum device based on the techniques of Section I and Section II. Though one of the simplest models of realistic electronic structure, jellium is readily tuned between simple and complex physics through a single parameter, the electron density. Jellium has foundational importance both for practical computational materials science, as well as basic condensed matter physics: the energy density of jellium is the starting point for all practical density functional approximations used in quantum chemistry and materials simulations; the system can be physically realized to good approximation in real materials such as the alkali metals, and in semiconductor wells; and two dimensional jellium in a magnetic field is the standard setting in which to discuss the fractional quantum Hall effect. In addition, many questions remain about jellium physics in the low density regime, where unbiased classical simulations are intractable for system sizes of interest, and biased simulations do not reach the accuracy to definitively resolve between competing phases. In Section III A we describe a quantum variational algorithm for jellium which can be executed on a planar lattice of qubits with O(N ) circuit depth, based on the results of Section I. We conclude with an outlook on how to extend these simulations to more general quantum chemical problems and the potential for the jellium problem to serve as a setting for early demonstrations of quantum supremacy over a problem of practical interest.
This paper provides a number of supporting technical results in appendices. In Appendix A we show a finitedifference discretization of the Hamiltonian with O(N 2 ) terms. In Appendix B we review the well known form of the Hamiltonian in the plane wave basis and in Appendix C we derive its representation in the plane wave dual basis, including connections to discrete variable representations. Appendix D shows a closed-form representation of the plane wave dual Hamiltonian mapped to qubits under the Jordan-Wigner transformation. In Appendix E we discuss the discretization errors associated with Gaussian molecular orbitals and plane wave orbitals and argue that both bases have the same asymptotic error scaling. In Appendix F we provide bounds on components of the plane wave dual Hamiltonian that are relevant to the results of Section II. In Appendix G we bound the Trotter error in the simulations of Section II A. In Appendix H we show a method for implementing controlled phase operations between all qubits on a planar lattice with gate depth of only O(N ). In Appendix I we prove results about the scaling of the fermionic fast Fourier transform. In Appendix J we provide new circuits for evolving under a sum of commuting Pauli strings and use that result to bound the cost of Trotterizing Hamiltonians in the plane wave dual basis without using the fermionic fast Fourier transform. Finally, in Appendix K we show an alternative implementation of the Taylor series algorithm which improves over the simpler scheme explored in Section II B.

I. ELECTRONIC STRUCTURE HAMILTONIANS WITH FEWER TERMS
Within the Born-Oppenheimer approximation, the properties of materials, molecules and atoms emerge from the behavior of electrons interacting in the external potential of positively-charged nuclei. In the non-relativistic case, the dynamics of these electrons are governed by the Coulomb Hamiltonian, where we have used atomic units, r i represent the positions of electrons, R i represent the positions of nuclei, and ζ i are the charges of nuclei. T is referred to as the kinetic term, U the (nuclear) potential term, and V the electron-electron repulsion potential term. The electronic structure problem is to estimate the properties of the eigenfunctions (especially the lowest energy eigenfunction) of the time-independent Schroedinger equation defined by this Hamiltonian.
To convert the differential equation into a practical computational problem, one typically first chooses some form of discretization. Moreover, the antisymmetry of electrons must be enforced either in the solutions (first quantization) or in the operators (second quantization) 1 . Most quantum computing research focuses on second quantization, in which the Hamiltonian is formulated as where a † p and a p are fermionic raising and lowering operators satisfying the anticommutation relation {a † p , a q } = δ pq , the coefficients h pq and h pqrs are determined by the discretization that has been chosen, and the sums now run over the number of discretization elements for a single particle. Specifically, if electron j is represented in a space of spin-orbitals {φ p (r j )} then a † p and a p are related to Slater determinants through the equivalence, where η is the number of electrons in the system and |0 is the vacuum. From inspection, one sees that the number of terms in Eq.
(2) may be as high as O(N 4 ) where N is the size of the discrete representation. This presents a major problem for realizing quantum simulation algorithms on near-term quantum devices as most quantum algorithms have some explicit dependence on the number of terms. For instance, the cost of implementing a Trotter step requires a number of gates that scales at least linearly in the number of terms. Likewise, the number of measurements required for variational quantum algorithms scales at least linearly in the number of terms. The most commonly used discretization in classical electronic structure is known as a Galerkin discretization. The Galerkin discretization is derived from the weak formulation of the Schroedinger equation in Hilbert space, given by finding |φ (spanned by the basis vectors {|φ p }) such that φ p | H |φ = E φ p |φ for all p. This is contrasted with the strong formulation (see Appendix A) that insists the original differential equation hold at all points in space r, as opposed to assessing error on the restricted subspace spanned by {|φ p }. The Galerkin formulation leads to the following coefficients which define the second quantized Hamiltonian of Eq. (2): where U (r) is the external potential Coulomb interaction, V (r, r ′ ) is the two-electron Coulomb interaction and the φ p (r) = r|φ p are the single-particle orbitals that define the basis. An important feature of Galerkin discretizations (again, in contrast to e.g. finite-difference discretizations) is that basis set error is variational, meaning that energies from exact diagonalization monotonically approach the continuum basis set limit from above. The basis functions φ p (r) are chosen in a number of ways. Perhaps the most common choice for treating molecular systems is atom-centered Gaussian basis functions, conventionally termed an atomic orbital basis. These functions resemble the mean-field orbitals of single atoms and provide a computationally convenient formulation for the evaluation of the above integrals. Parameters of the Gaussians are optimized so that modest numbers of such basis functions can compactly represent the low-energy eigenstates of atomic and molecular Hamiltonians with qualitative accuracy. However, a drawback of these functions is that the associated Hamiltonians contain O(N 4 ) terms for modest size systems, despite their relative locality eventually leading to O(N 2 ) terms in an asymptotic limit [22]. Moreover, to prepare a compact initial state for a molecular simulation, it is common to rotate from the atomic orbital basis to the molecular orbital basis, which minimizes the mean-field molecular energy. This basis is even more delocalized than the atomic orbital basis and contains even more terms at all system sizes.
1 Several papers have investigated quantum simulation of electronic structure in first quantization [10][11][12][13]. When scaling to the continuum basis limit (as opposed to scaling towards larger systems), these encodings have an asymptotic spatial advantage; first quantization requires O(η log N ) qubits whereas second quantization requires O(N ) qubits. In first quantization one must initialize the simulation in an explicitly antisymmetric initial state, which is potentially costly. As discussed in [11], bounded-error quantum simulations in a real space basis may (in the worst case) require that one compute the potential to a number of bits that is exponential in N as a consequence of singularities in the Hamiltonian which occur when electrons occupy the same location in space. But the primary reason why these algorithms remain less popular than their second-quantized counterparts is that all proposed implementations [10][11][12][13] require complex on-the-fly logic which preclude near-term implementation and dramatically increase the number of T gates required for error-correction.
Gaussian bases were introduced more than half a century ago to reduce the cost of evaluating the integrals in Eq. (4) and Eq. (5) for the mean-field quantum chemistry calculations of interest at the time [45]. However, with advances in classical computing power, the evaluation of such integrals for systems with up to several hundred atoms is no longer a major bottleneck. Further, the requirements of a basis for efficient quantum algorithms are quite different than for classical algorithms. In a quantum algorithm, we primarily desire the computational basis (i) to have a small number of terms, so as to minimize the cost of basic algorithms such as time evolution, or the number of measurements in variational quantum algorithms, and (ii) to allow for a simple preparation of a relevant initial quantum state. To some extent these are conflicting requirements, as (i) can be obtained by locality of the basis in real space, while (ii) implies locality of states in energy space, or delocalization in real space. For example, the traditional Gaussian basis satisfies (ii) but not (i) in medium sized molecules. We should note that while the number of basis functions is also an important quantum resource (corresponding to the number of logical qubits), in many cost models the circuit size is more important. In a fault-tolerant architecture, the number of physical qubits required is largely a function of the number of non-Clifford gates in the original algorithm and does not strongly depend on the number of logical qubits. While existing quantum hardware is limited to a small number of qubits, the expectation is that manufacturing more qubits will be easier in the near-future than significantly increasing coherence time, suggesting that even in a non-fault-tolerant context, gate depth is a more important resource than number of qubits.
To consider how one might circumvent the O(N 4 ) scaling of terms in the Hamiltonian, consider a set of spatially disjoint functions {φ p (r)}, which are defined such that the intersection of the supports of φ p (r) and φ q (r) is the empty set for all p = q. The consequence of this is that the product φ p (r)φ q (r) = 0 for all r and all p = q. Taking this definition with Eq. (5), it is clear that h pqrs = 0 unless p = s and r = q; thus, there are at most O(N 2 ) elements defining the Hamiltonian. To enable a meaningful kinetic energy operator, one would match derivatives at the boundaries of the functions (e.g., as in finite element methods) or, alternatively, allow for overlapping basis functions. In either case, one achieves the desired scaling of O(N 2 ) terms in the Hamiltonian for all system sizes. Another possibility is to use a non-Galerkin grid-based representation, as embodied in finite-difference methods. In Appendix A, we provide explicit forms for the second quantized molecular electronic structure Hamiltonian in such a discretization with O(N 2 ) terms. We focus on a different route to reducing the number of terms in the Hamiltonian, namely to use a pair of basis sets in which the different components in the Hamiltonian (kinetic and potential) are diagonal. This property is offered by the plane wave basis and its dual representation, which we now discuss.

A. The Plane Wave Dual Basis
Like Gaussian orbitals, plane waves have also enjoyed a long history of use in classical approaches to electronic structure. While plane waves have never been studied as a basis for quantum computation of electronic structure, they have many desirable properties as a basis; for instance, their periodicity makes them convenient for crystalline solids. The plane wave basis is defined subject to periodic boundary conditions in a computational cell of volume Ω and the integrals in Eq. (4) and Eq. (5) are defined using the Coulomb potential obtained from solving Poisson's equation subject to periodic boundary conditions (see Appendix B for review), where R j are nuclei coordinates, ζ j are nuclei charges and k ν is a vector of the plane wave frequencies at the ν th harmonic of the computational cell in three dimensions, excluding the zero mode. We will assume a cubic cell for simplicity. The zero mode gives a divergent term but for all charge-neutral systems the divergence from the electronelectron interaction cancels with the divergence from the external potential and contributes only a constant term which depends on the unit cell shape (for a derivation of this term, see Appendix F in [46]). Using a plane wave basis enforces a periodic charge distribution, natural for crystalline solids. As discussed in Appendix E, one can also represent finite systems such as molecules using plane waves by choosing the cell volume Ω to be sufficiently large so that the periodic images do not interact [46] or by using a truncated Coulomb operator, which completely eliminates periodic images [47]. Because plane waves have no knowledge of the atomic positions, one requires more plane waves than Gaussian orbitals in order to obtain the same level of basis set accuracy for most materials. Pseudopotentials reduce the ratio of the number of plane waves needed to the number of Gaussian orbitals needed for the same energy accuracy to only roughly a factor of ten [48,49]. However, within a pseudopotential formulation, the asymptotic rate of convergence in both basis sets is dominated by the resolution of the electronelectron cusp, giving a basis set discretization error that scales as O(1/N ) [50,51]. Thus, the asymptotic scaling of algorithms for simulating electronic structure (including the single-molecule case) can be compared directly whether N represents Gaussian orbitals or plane wave orbitals. We substantiate this notion concretely in Appendix E. Also note that in condensed phase systems with delocalized electrons, plane waves are especially competitive with Gaussians, and in special cases such as jellium (the focus of Section III) are substantially more compact.
Within the plane wave basis, we can see immediately that the two-body Coulomb operator has only O(N 3 ) terms instead of O(N 4 ) terms. This reduction in the number of terms arises due to momentum conservation which constrains the allowable transitions between plane waves as they are eigenstates of the momentum operator. As we review in Appendix B, the complete Hamiltonian in the plane wave basis takes the well-known form: where σ ∈ {↑, ↓} is the spin degree of freedom and we have truncated the operators to the support of plane waves with frequencies k ν = 2πν/Ω 1/3 such that ν is a three-dimensional vector of integers with elements in [−N 1/3 , N 1/3 ]. In the above summation notation, addition of momenta is carried out modulo the maximum momentum. Aliasing the momenta in this way is equivalent to evaluating the integrals in Eq. (4) and Eq. (5) by sampling at N evenly spaced grid points, a common practice in electronic structure codes sometimes called dualling [52,53]. Dualling causes the plane wave Hamiltonian matrix elements to deviate from a Galerkin discretization, but this discrepancy is similar to basis error and vanishes as the number of plane waves increases. Importantly, the dualling form of the plane wave matrix elements is essential to give the desirable properties of the matrix elements in the dual basis we now discuss. The Fourier transform of the complete plane wave basis (i.e. in the limit of infinite volume Ω and infinite momentum cutoff) is a basis of delta functions (a grid). But by applying the discrete Fourier transformation to a basis of N plane waves, one obtains a new set of basis functions resembling a smooth approximation to a grid with lattice sites at the locations r p = p (Ω/N ) 1/3 . We call these functions the "plane wave dual basis". In electronic structure, the plane wave dual basis has previously been considered in the context of reduced scaling density functional calculations [54,55]. As a basis set where each function is associated with a real-space coordinate value, the plane wave dual basis can also be viewed as a discrete variable representation (DVR) [56]. In particular, it is a relative of the sinc DVR basis widely used in quantum dynamics simulations [56][57][58][59][60]; although unlike in the standard sinc basis where the kinetic energy operator is approximate when using a finite basis, here the kinetic energy operator is always treated exactly. However, the primary novelty about the plane wave dual basis in this work is its use in quantum computation and the specific properties of the basis that we exploit to enable especially efficient quantum algorithms. We derive the closed-form expressions for the plane wave dual basis functions and associated operators in Appendix C, and further elucidate connections to DVR there.
While the plane wave dual basis functions are not strictly localized in space, they nevertheless diagonalize the potential operators of Eq. (7) within the dualling approximation, analogous to the conversion between the plane wave and real space forms of the potential operators via a continuous Fourier transform in a complete plane wave basis. As we derive in Appendix C, by applying this Fourier transform, the Hamiltonian in the dual basis becomes H = 1 2 N ν,p,q,σ k 2 ν cos [k ν · r q−p ] a † p,σ a q,σ

B. The Fermionic Fast Fourier Transform
A useful feature of the Hamiltonian representation introduced in Section I A is that one can rotate the system from the plane wave dual basis (where the potential operator is diagonal) to the plane wave basis (where the kinetic operator is diagonal) using an efficient quantum circuit that is related to the fast Fourier transform. This operation allows one to efficiently prepare the initial state for classes of interesting physical systems whose ground state is well approximated by a mean-field state of delocalized electron orbitals (Section III), as well as to improve the efficiency of quantum measurements (Section II C). The usual quantum Fourier transform would be appropriate to diagonalize the kinetic energy operator for a binary encoding of the state in real space, as used in [2, 10,11,[61][62][63][64][65][66]. However, our second-quantized encoding of the state necessitates a special version of the fast Fourier transform which we refer to as the "fermionic fast Fourier transform" (FFFT). Note that the word "quantum" does not appear in this name because our implementation of the FFFT does not offer any quantum advantage over its classical analog.
The fast Fourier transformation was first applied to fermionic systems for quantum computing purposes in [67] and improved in the context of tensor network simulations in [68]. While [68] showed that the FFFT could be realized with O(log N ) depth using arbitrary two-qubit gates, in Appendix I, we extend the method of [67] to show that the FFFT can be implemented for three spatial dimensions using a planar lattice of qubits with O(N ) depth. While past work has focused entirely on describing the FFFT under the Jordan-Wigner transformation [67,68], we generalize the approach to arbitrary mappings including Bravyi-Kitaev [23,24,44] and other modern approaches [25][26][27].
The essential function of the FFFT is to perform the following single-particle rotation: As a clarifying example, the two-dimensional FFFT that acts on spin orbitals p and q (which are not necessarily adjacent in lexicographical ordering) is where f swap generates the "fermionic swap operator", which has been proposed for use in quantum computer simulations in [41]. We define f swap in a mapping-independent way (i.e. not specific to Jordan-Wigner) as This operator is referred to as a fermionic swap because it has the property that it swaps the spin orbitals p and q (up to a global phase) while maintaining proper anti-symmetrization. For example, f swap a † p f swap = a † q and vice versa. Using these definitions it can be shown (see Appendix I) that This reveals that F 0 acts as a Hadamard transform, or two-dimensional Fourier transform, on the creation operators that define the two dimensional subspace. Then, by combining these operations together with phase shifts one can follow the same reasoning used in the Cooley-Tukey fast Fourier transform algorithm [69] to construct the FFFT out of these operations and phase shifts. We present this argument formally in Appendix I. We also show that the entire FFFT in three dimensions can be implemented on a planar lattice of qubits with gate depth of O(N ). Just as the quantum Fourier transform diagonalizes the kinetic operator in real space simulations, it is shown in Appendix B and Appendix C that Thus, an alternative expression for the molecular electronic structure Hamiltonian in the plane wave dual basis is We choose to write the kinetic operator using the FFFT relation to emphasize that the Hamiltonian has the special property that all components of it are diagonal in either the plane wave or plane wave dual representations. In addition to the advantages of having only Θ(N 2 ) terms, in the subsequent sections we will make frequent use of this diagonal property. In some circumstances, we will also desire to perform simulation in the plane wave dual basis after preparing an initial state that is a product state in the plane wave basis; this can be accomplished by applying the FFFT to a product state. Finally, we note that while prior work has leveraged the diagonality of momentum and potential operators in real space, our use of second quantization allows us to use dramatically fewer qubits and also avoids the challenge of anti-symmetrizing the initial state which complicates first-quantized methods [11,70].

II. IMPROVED ALGORITHMS FOR QUANTUM COMPUTATION
We analyze the cost of applying several types of quantum simulation algorithms to the Hamiltonians introduced in Section I in this section. In Section II A and Section II B, we focus on Trotter-Suzuki and Taylor series algorithms for time-evolution, which can be used to prepare electronic structure ground states when used in conjunction with the phase estimation algorithm [4,5]. Specifically, the phase estimation algorithm will project a simulation register into eigenstate |j with probability | j|ψ 0 | 2 where |ψ 0 is the "reference state" from which the phase estimation procedure begins. The phase estimation procedure involves taking a short time dynamical simulation e −iHδ such that the error in the eigenvalues of the effective Hamiltonian for the simulated unitary is at most O(ǫ). If the cost of this short dynamical simulation is F (ǫ) then the cost of phase estimation is then in O(F (ǫ)/ǫ). Thus minimizing the costs of dynamical simulation is vitally important for phase estimation.
Typically, one is interested in projecting to the ground state and |ψ 0 is chosen to be the Hartree-Fock state, defined as the lowest energy single Slater determinant approximation to the ground state [71]. The Hartree-Fock algorithm is a classical self-consistent mean-field procedure for finding this state in terms of a series of single-particle rotations. To prepare the Hartree-Fock state from any product state one can evolve under the anti-Hermitian operator pq θ pq a † p a q for some amplitudes θ pq = −θ * qp determined by the Hartree-Fock procedure. The cost of performing this evolution depends on the values of θ pq . However, the Hartree-Fock state does not need to be prepared exactly in order to provide a good reference and a variational outer-loop can be employed to take fewer Trotter steps [17]. Accordingly, we assume that the gate complexity of state preparation is less than the cost of the algorithms described in Section II A and Section II B. For systems of delocalized electrons, such as jellium (the focus of Section III), state preparation can be efficiently accomplished with log(N ) gate depth for arbitrary two-qubit gates, or with linear gate depth using a planar architecture, using the FFFT of Section I B.
Before beginning our analysis we will make a few comments about how the results of this paper should be compared to prior work. Most prior quantum algorithms for electronic structure have focused on the simulation of finite systems consisting of a small number of atoms [5]. As discussed in Section I and Appendix E, one can also use the plane wave dual basis for such simulations by choosing the unit cell volume Ω to be large, or by truncating the Coulomb operator, which exactly eliminates the periodic images. However, we expect that the plane wave dual basis will be most useful for simulating systems with periodicity in at least one of the spatial dimensions, such as crystalline wires, surfaces, and solids, similar to the main current uses of plane wave bases in classical electronic structure. In either case, one is interested in the cost of simulation when the number of basis functions N grows towards the continuum limit. We do not expect the total energy to be extensive in N .
One should also bound the cost of simulation as the number of particles η grows. While it is reasonable to wonder how the cost of simulation grows with molecule size, molecules do not necessarily grow in a systematic fashion. For instance, molecules can have larger η by replacing lighter atoms with heavier onces or by adding atoms to a molecule. When using plane waves to treat materials one is usually interested in the properties of an infinite material that is periodic over some computational cell (a collection of unit cells) of volume Ω. As with molecules, one is sometimes interested in how the complexity of a method scales as one fixes the computational cell size and increases the number of particles (e.g. by replacing lighter atoms with heavier ones). But unlike when simulating molecules, the notion of scaling towards the thermodynamic limit is well-defined in the context of periodic solids. The thermodynamic limit is approached as one grows η by increasing the number of unit cells in the computational cell while keeping a fixed averaged density ρ = η/Ω. Accordingly, for both molecules and materials we report the asymptotic scaling of algorithms in terms η, N and ρ but are most interested in the fixed density scalings corresponding to molecules growing by addition of atoms and materials growing towards the thermodynamic limit. In Section II C we report the number of measurements required in terms of both a fixed absolute error ǫ and a fixed relative error µ = ǫ η as one is interested in fixed relative error while scaling towards the thermodynamic limit but interested in fixed absolute error otherwise. This is because physical total energies are extensive in η.

A. The Cost of Time-Evolution using Trotter-Suzuki Methods
Trotterization is perhaps the simplest method for simulating electron dynamics in the plane wave dual basis. Trotterization solves the problem of compiling e −iHt into fundamental gates by noting that if H = L ℓ H ℓ , where each e −iH ℓ t can be easily compiled into fundamental gates, then e −iHt can be simulated by a time-dependent Hamiltonian that rapidly switches each term on and then off. If the frequency of these switches is sufficiently high then, from the perspective of the quantum system, the entire Hamiltonian is active throughout the evolution; e.g. for large r, Here we have employed the second-order Trotter formula, which is often more practical for chemistry simulations than higher-order decompositions [18]. The value of r that is needed for this expansion depends subtly on the terms in the Hamiltonian. If the Hamiltonian terms commute then the error in the simulation is zero. Thus, the error does not depend on the norms of the Hamiltonian terms, but rather it depends on their commutators. Specifically it was shown in [18] that where |ψ is a state restricted to the η-electron manifold. Thus, once a particular ordering of the terms is chosen then an upper bound on the scaling of r can be found based on the commutator norms of the terms.
The conventional approach to second-quantized simulation would be to Trotterize the Hamiltonian of Eq. (9). This approach is outlined in detail along with improved methods for simulating evolution under the kinetic terms in the Hamiltonian in Appendix J. However, the approach we analyze here is to simulate evolution by switching between the plane wave dual basis and the plane wave basis to diagonalize the potential and kinetic operators. Using Representing the kinetic terms as diagonal operators has two effects. Firstly it reduces the number of commutators in Eq. (17) which leads to better bounds on the error. The second advantage is that the kinetic operator only contains O(N ) local terms that all commute. This allows us to simulate the kinetic operator in depth O(1) after performing this basis transformation on a quantum computer that has arbitrary single qubit rotations as a fundamental gate.
To understand the cost of this approach note that each of the r steps in the Trotter algorithm comprises of, from Eq. (18), two simulations of the potential energy Hamiltonian, the FFFT and its inverse, and a simulation of the kinetic operator in the plane wave basis. The operator U + V is the sum of Θ(N 2 ) number operators. Each number operator can be simulated using O(1) CNOT gates and single qubit rotations [16]. It follows that e −i(U+V )t can be simulated using O(N 2 ) gates and in depth O(N ) if the quantum computer has all to all connectivity between qubits, without the use of ancillae. We show in Appendix H that it can also be simulated in a planar nearest neighbor architecture in depth O(N ) without ancillae. Similarly, e −i 1 2 ν,σ k 2 ν a † ν,σ aν,σt requires O(N ) gates to implement. The number of gates required to perform the FFFT scales as O(N log N ) [67,68] with O(N ) depth for the three-dimensional transform implemented for qubits connected on a planar lattice, which is proven rigorously in Appendix I. Consequently, costs are asymptotically dominated by simulation of the potential. Thus, the Trotter simulation can be performed using a circuit of depth O(N r) on a planar lattice.
In the Appendix G we show that to simulate for time t and achieve error ǫ it suffices to choose r such that implying that the gate depth of our approach to Trotter-Suzuki based simulations is There are a number of ways that we can understand this scaling depending on how problem size grows. If we assume we grow our simulation size without changing the system density (i.e. ρ = η/Ω ∈ O(1)) then the gate depth is in O(N 5/3 η 11/6 t 3/2 /ǫ 1/2 ). If we only are interested in the scaling with N then we can take η ∈ O(N ) to find that the gate depth is in O(N 7/2 ). This is more than quadratically better than the best known rigorous bounds on the circuit depth for Trotter-based chemistry simulations, O(N 8 ) [41]. However, just as the bounds for r in [41] proved to be polynomially loose [18,21], we expect the empirical performance of our approach to be better than Eq. (20) suggests. Despite the obvious differences in scaling, a full comparison between prior Trotter-Suzuki work in different bases and this result remains challenging. This is because comparing costs for any specific system will depend on the precise N needed in the given basis, which will be problem specific. Nonetheless, the quadratic difference between the two complexities strongly suggests that for fault-tolerant applications, our simulation method will be competitive because most of the physical qubits required for the simulation arise from executing single qubit rotations fault tolerantly [72,73]. As a final note, while we use an exact evaluation of the potential here it is possible to leverage the local nature of the plane wave dual basis in order to approximate the potential on-the-fly using the Barnes-Hut algorithm or other fast multipole methods, which require O(N ) gates. Thus, it is possible, in principle, to achieve a gate complexity that matches the cited circuit depths to within logarithmic equivalence. However, a naive application of the fast multipole method would require that the quantum computer coherently apply the algorithm for each configuration in the superposition and is likely to be impractical in near-term quantum computers.

B. Bounding Cost of Time-Evolution using Taylor Series Methods
With the exception of [9,11,13,74,75], all prior papers which analyze the time-evolution of electronic structure Hamiltonians use the Trotter-Suzuki decomposition. Even the most elaborate Trotter schemes scale sub-polynomially but not poly-logarithmically with respect to the reciprocal of the simulation error, 1/ǫ, [76,77]. In [78,79], Berry et al. combined the results of [76,80,81] to show a technique for performing time-evolution of arbitrary Hamiltonians with sub-logarithmic dependence on the inverse precision. Since then, several papers have introduced other "post-Trotter" methods with improved dependence on ǫ [82,83]. The "Taylor series" techniques of [78] were first applied to chemistry in [9,13]. The result of [9] is an algorithm with gate complexity of O(N 5 ) and the result of [13] is a more complicated algorithm which exploits the sparseness of the configuration interaction representation of the Hamiltonian in order to perform simulation with gate complexity O(η 2 N 3 ), where η is the number of electrons.
Using the Taylor series method in the plane wave dual representation we will be able to outperform both of these bounds. In this section, we will show that one can perform time-evolution of the Hamiltonian with gate complexity of O(N 4 ) using an approach that is much simpler than the aforementioned Taylor series based algorithms. This scheme is similar to the "database algorithm" protocol of [9], which scaled at least as O(N 6 ) in that work. In Appendix K, we build on the results of this section to show a more complicated algorithm, inspired by the "on-the-fly" algorithms from [9] and [13], which in the plane wave dual basis has gate complexity of O(N 11/3 ) and gate depth of O(N 8/3 ), making this the most efficient algorithm for time-evolution of an electronic structure system in the literature.
We will not go into details about how the Taylor series method works and instead refer readers to [79]. We will describe what is required to implement the techniques and bound the cost of our approach. The Taylor series method begins with the observation that any local Hamiltonian, e.g. Eq. (9), can be expressed as where W ℓ are real scalars and H ℓ are self-inverse operators which act on qubits; e.g., the H ℓ are the strings of Pauli operators in Eq. (9). The Taylor series simulation technique is described in [79] in terms of queries to two oracle circuits. The first oracle circuit acts on an empty ancilla register of O(log L) qubits and prepares a particular superposition state related to Eq. (21), where Λ is a normalization parameter that turns out to have significant ramifications for the overall algorithm complexity. The second oracle circuit we require acts on the ancilla register |ℓ as well as the system register |ψ and directly applies one of the H ℓ to the system, controlled on the ancilla register. For this reason, we refer to the ancilla register |ℓ as the "selection register" and name the oracle accordingly, Note that the self-inverse nature of the H ℓ operators implies that they are both Hermitian and unitary, which means they can be applied directly to a quantum state. Suppose that the circuit prepare(W ) can be applied at gate complexity P and the circuit select(H) can be applied at gate complexity S. Then, the main result of [79] is that one can straightforwardly perform a quantum simulation under H for time t to unitary operator precision ǫ at gate complexity with spatial overheads and precision costs polylogarithmically bounded in ǫ. Since the bound on the Hamiltonian norm from Appendix F is obtained using the triangle-inequality, it also asymtotically bounds Λ at O(N 7/3 /Ω 1/3 + N 5/3 /Ω 2/3 ). We now describe how these two oracles can be implemented so that (S + P ) ∈ O(N 2 ). First, we discuss implementation of select(H). From Eq. (9) it is clear that there are L = Θ(N 2 ) terms which can be indexed by only two indices, p and q. For the purposes of this section we will further suppose that p indexes both spin and position so that even values of p correspond to spin-up orbitals and odd values of p correspond to spin-down orbitals. We will ignore the identity term and index the local Z p terms whenever p = q. For p = q there are three terms in Eq. (9) which we will refer to as the ZZ term, the XZX term and the Y ZY term. An ancilla qubit, b will be introduced and if b = 0 then the pair (p, q) will refer to the ZZ term whereas if b = 1, the pair (p, q) will refer to the XZX and Y ZY terms. If p > q we will refer to the XZX terms whereas if q > p, we will refer to the Y ZY term. Accordingly, our select(H) circuit should have the following actions, Note that the condition involving (p + q) mod 2 is necessary when the model contains a spin degree of freedom in order to conserve spin. This efficient encoding requires only 2 log N ancilla for the selection register. The logic to select a term, shown in Eq. (25), involves only the operations >, ∧ and =, which can all execute with O(1) gates.
Since the actual H ℓ contain up to N Pauli operators, we see that select(H) can be circuitized with gate complexity S ∈ O(N ). For a specific implementation of how even more complex Pauli strings can be implemented from a selection oracle with this same gate complexity, see Section III of [9]. Using the notation established in Eq. (25) the preparation oracle should have the following actions, where We have added a factor of 1/2 to the Z and ZZ coefficients of Eq. (9) as those terms will execute twice; when p = q this happens due to the b degree of freedom and when b = 0 this happens because both p > q and p < q will occur. To implement prepare(W ) one can use an approach which mirrors the "database algorithm" introduced in [9]. The idea is based on results from [84] which show that an arbitrary quantum state on m qubits can be prepared using a circuit with no more than O(2 m ) CNOT gates. Since prepare(W ) initializes a state on O(log L) qubits where L = Θ(N 2 ), the techniques of [84] would allow one to implement prepare(W ) at gate complexity P ∈ 2 O(log L) ∈ O(N 2 ). We have thus shown a constructive approach to Taylor series simulation of Eq. (9) with total gate complexity If we fix the system phase at ρ = η/Ω ∈ O(1) and assume that η ∈ Θ(N ) then we see that the algorithm scales asymptotically as O(N 4 t). Though less efficient than the method of Section II A by a factor of √ N , this algorithm has logarithmic dependence on ǫ, which is a superpolynomial advantage in ǫ over all Trotter schemes and an exponential advantage in ǫ over the method of Section II A. In Appendix K we extend these ideas to show a more involved implementation of prepare(W ) which results in overall gate complexity O(N 11/3 ) and gate depth of O(N 8/3 ). The concept of that approach is to compute the coefficients "on-the-fly" similar to the "on-the-fly" algorithm in [9]. There are several ways in which these results could be improved. First, our bound on Λ for the database algorithm is likely loose and should be studied numerically in order to estimate practical scaling. Second, following the construction detailed in [13], one could simulate the plane wave dual Hamiltonian in the configuration interaction representation using the Taylor series approach and in doing so, reduce the spatial requirement of this algorithm from O(N ) to O(η). That improvement would be especially meaningful in the dual basis due to the spatial overhead associated with using plane waves instead of Gaussian orbitals.

C. Fewer Measurements for Variational Quantum Algorithms
An alternative to quantum phase estimation and other methods requiring time evolution for the study of electronic systems are quantum variational algorithms such as the variational quantum eigensolver [14,15,85]. These methods have garnered significant recent attention due to their simple experimental implementation and robustness to control errors [39]. Variational quantum algorithms involve a parameterized procedure (usually a parameterized quantum circuit) for preparing quantum states (the variational ansatz). The variational ansatz is iteratively improved by measuring an objective function and then using a classical optimization routine to suggest new parameters. The bottleneck we focus on here is the measurement step. While variational algorithms do not require long coherent evolutions, they usually require a large number of circuit repetitions for measurement purposes; the abstract of [41] claims the primary challenge of these methods is that "the required number of measurements is astronomically large for quantum chemistry applications". Here, we show that use of the plane wave dual basis enables new bounds and strategies that drastically reduce the number of circuit repetitions required.
Usually (but not always, e.g. see [86]) the measurement objective is the expectation value of the energy on the current quantum state. The expense of this step typically depends on the norm and form of the Hamiltonian, and the exact method that is used to evaluate it. The simplest and most practical method of expectation value estimation relies on a form of quantum operator averaging that leverages the structure of these Hamiltonians as sums of tensor products of Pauli operators. The expectation value of the energy may be estimated by measuring the individual tensor products of Pauli operators H ℓ on repeated, independent state preparations and summing the resulting estimates H ℓ together, weighted by their coefficient W ℓ , to get an estimate of the expectation value, This method has the advantage that negligible coherence time is required beyond state preparation in order to perform the required measurements, making it particularly amenable to implementation on quantum devices without errorcorrection. If one assumes no additional prior information and allows a variable number of measurements per term, the number of repeated preparations and measurements M to estimate the value of H to a precision ǫ is known [41,87] to be bounded by where we have used the triangle inequality upper-bounds to the norm of the plane wave dual Hamiltonian derived in Appendix F. Already, this bound is significantly lower than the best proven bound on the number of measurements required when using a Gaussian basis, O(N 8 /ǫ 2 ); however, both bounds are loose. Hamiltonians in the plane wave dual basis have a few special properties that allow us to make even fewer measurements. In particular, the Coulomb operators U and V , are diagonal. A consequence of this is the local H ℓ terms from each of these operators all commute with each other, allowing the use of a separate, unbiased estimator for the mean of U + V , without the use of an ancilla qubit [86]. While the kinetic operator is not diagonal in the plane wave dual basis representation, one can perform the FFFT prior to measurement. This would change to the plane wave basis and diagonalize the kinetic operator. Thus, instead of independent wavefunction preparations for each H ℓ within the sum for T , U and V , either the entire operator U + V or the entire operator T can be measured completely on each circuit repetition. As variances add linearly for independent measurements, if we were to measure T , U and V individually by sampling bit strings in their eigenbasis, we would require a number of circuit repetitions scaling as where we have used bounds from Appendix F. In the final bound we have provided the scaling at fixed density, consistent with scalings in other sections of this paper. We see that for either finite molecules or bulk materials, since it must be the case that η ∈ O(N ), this scaling is no worse than O(N 4 /ǫ 2 ). As discussed at the beginning of Section II, one is often interested in studying the cost of converging periodic electronic structure calculations to the thermodynamic limit. However, simulation of the ground state energy within fixed absolute error is unreasonable in the thermodynamic limit as the energy scale of the system grows scales asymptotically as O(η). Accordingly, when growing towards the thermodynamic limit one would be interested in achieving a fixed relative error µ = ǫ η. In terms of µ, we see that scaling towards the thermodynamic limit is where in the final bound we have made the reasonable assumption that η 4/3 N 2/3 grows faster than N 4/3 /η 4/3 . A practical difficulty for simple operator averaging on near-term devices with Pauli operators built from Jordan-Wigner strings is the sensitivity to measurement error on each of the individual qubits in a long Pauli string [40]. In the plane wave dual basis, one has the advantage that the diagonal operators are always two-local in the Jordan-Wigner representation, thus mitigating this problem. The kinetic operator may be treated in this way by applying the FFFT. However, one might seek to avoid the coherent overhead of applying the FFFT in order to diagonalize the kinetic operator. This would be especially advisable when using a near-term device prone to errors during the FFFT execution. If one were to measure U + V at once but measure T by sampling the H ℓ then the total number of measurements would scale as where T is the triangle-inequality upper-bound on the norm of T from Appendix F. At fixed density ρ ∝ η/Ω ∈ Θ(1) this quantity also scales as O(N 2 /µ 2 ) for fixed relative error µ and η ∈ Θ(N ). Alternative methods for evaluating the objective function using techniques from phase estimation have been studied in some detail [86]. These methods require a number of initial state preparations that scales quadratically better in ǫ and measures fewer qubits in the process, which mitigates the impact of measurement error. This quadratic scaling improvement comes at the cost of requiring larger circuit depth, which can make these approaches impractical for existing experimental platforms which are often limited by coherence time. Specifically, if one prepares the state of interest with a unitary U, then it is possible to estimate the expectation value of the energy to a precision ǫ using O( ℓ |W ℓ |/ǫ) applications of U and U † . This implies an asymptotic bound on the cost of energy estimation of For fixed density, η ∈ Θ(N ), and relative error µ, we can bound this scaling at O(N/µ). As was shown in the original work, this scaling is comparable up to logarithmic factors to the application of iterative phase estimation using the best known Hamiltonian simulation algorithms. This makes it feasible to use variational approaches to improve state preparation for quantum phase estimation applications on fault-tolerant quantum devices.

III. PROPOSAL TO SIMULATE JELLIUM ON A NEAR-TERM DEVICE
In this section, we discuss an experimental proposal for near-term devices based on the advances of Section I. In particular, we focus on the quantum simulation of the homogeneous or uniform electron gas, also known as jellium. We believe that jellium is an attractive system to target with early quantum computers due to its simplicity yet foundational importance for many areas of physics and materials science. Further, it is naturally compatible with the plane wave and dual basis simulation formalism we have described so far. The widespread use of jellium as a benchmark on which to test new classical simulation methods, as well as continuing unresolved physical questions in the system, positions it as an intriguing arena in which to contrast quantum and classical simulations.
Jellium is defined as a system of interacting electrons with a uniform electron density ρ and a homogeneous compensating positive background charge, such that the overall system is charge neutral [88]. As a finite realization, we consider a system of η electrons in a box of volume Ω with periodic boundary conditions, where the jellium Hamiltonian becomes exactly Eq. (9) with a constant external potential, i.e. all ζ j = 0. Jellium is of interest in different physical dimensions; both two-and three-dimensional jellium are realized to a good approximation in real materials. For example, two-dimensional jellium is approximated well by electrons confined in semiconductor wells [89], while three-dimensional jellium is a model for the valence electron density of alkali metals such as sodium [90]. Historically, the physics of jellium has helped elucidate some of the most basic concepts in condensed matter physics. For example, Wigner's observation that electrons in jellium must crystallize as the electron density is decreased [91] was the first example of an interaction driven metal-insulator transition. Later, the ground state physics of jellium in two dimensions in a strong magnetic field became the canonical setting to understand the quantum Hall effect [92]. Simulations of jellium also play a central role in computational applications. This is because the energy density of jellium is the starting approximation in density functional calculations, the mostly widely performed calculations in quantum chemistry and materials science. In particular, the local density approximation gives the (exchangecorrelation) energy E xc of a material with a generic, non-uniform, electronic density ρ(r), as where ǫ UEG xc (ρ(r)) is the (exchange-correlation) energy density of jellium at density ρ(r). For this reason, the history of density functionals has been tied to improvements in approximate simulations of the jellium energy density [93][94][95].
For the above reasons, simulating the properties of jellium with classical methods is a standard classical benchmark. This also argues for using it as a benchmark for quantum simulations, and in this context we briefly outline the current limitations of classical techniques, and the setting in which quantum simulations may be most useful. The phase diagram of jellium is usually discussed in terms of the Wigner-Seitz radius r s , which is related to the density by 4πr 3 s /3 = Ω/η = ρ −1 in three dimensions. While the ground state of jellium at high densities (metallic, r s ∼ 1 Bohr radii per particle) and at very low densities (insulating, r s ∼ 100 Bohr radii per particle) is well established, the precise phase diagram in the low to intermediate density regime is uncertain due to competing electronic and spin phases [93,[96][97][98][99][100]. In the high-density regime, the system is dominated by kinetic energy, and expansion techniques based on perturbation theory perform well [101,102]. Outside this density regime, the main simulation tool has been quantum Monte Carlo in the continuum formulation [93,[96][97][98][99][100], and more recently, in basis set formulations such as full configuration interaction quantum Monte Carlo (FCIQMC) [103,104] and auxiliary field quantum Monte Carlo (AFQMC) [105,106]. The latter basis set calculations use plane waves and can be directly compared to quantum simulations in the plane wave dual formulation. Due to the fermion sign problem, it is difficult to obtain data with acceptable stochastic error with exact quantum Monte Carlo methods (e.g. with released nodes [93], FCIQMC without initiators [103], or AFQMC without constrained phase bias [106]) for systems with η > 50. Instead, simulations use a bias to control the sign problem, such as the fixed node approximation. Although much useful information can be extracted in the presence of this bias, the systematic error is hard to estimate, and is thought to be as large as half a percent in the energy [96,103]. Unfortunately, this error is on a similar scale to the energy difference between competing phases in the intermediate density regime. We expect quantum simulations, even for modest η ≈ 50 and modest N ≈ 100, to offer bias free results that cannot currently be obtained by classical techniques; beyond their role in understanding the approximations used in classical methods and in demonstrating "quantum supremacy', such simulations will provide a new way to resolve the complicated jellium phase diagram in the low density regime.
In the next part of Section III, we consider how to use the advances introduced in Section I within the specific context of a practical quantum algorithm for jellium simulation on near-term devices. While the Trotter and Taylor algorithms described in Section II A and Section II B can be used for ground state simulation, either by simulating adiabatic state preparation [8] or by projecting to a ground state using quantum phase estimation [4,5], such approaches are likely to require error-correction for their implementation. However, in the case of jellium, a good initial state preparation is extremely simple. This makes variational quantum algorithms for jellium particularly interesting, given their additional suitability for near-term devices. [14,15].

A. Linear Depth Quantum Variational Algorithm for Planar Architectures
As with all variational algorithms, one prepares an ansatz |ψ( θ) for the ground state which is described in terms of parameters θ which are selected in order to minimize the expectation value of the Hamiltonian, ψ( θ)| H |ψ( θ) .
Usually, one prepares |ψ( θ) by applying a parameterized quantum circuit to a suitable reference state |ψ 0 so that |ψ( θ) = U ( θ) |ψ 0 . Thus, the power of a variational algorithm depends on the quality of the reference state |ψ 0 and the structure of the parameterized circuit U ( θ). The reference state is often chosen to be the mean-field solution to the problem. Mean-field solutions to jellium are diagonal in the plane wave basis, and provide useful starting points for quantum Monte Carlo simulations even at quite low densities [100]. One can begin quantum simulation in a product state associated with the plane wave basis and then apply the FFFT to obtain the mean-field state of jellium in the dual basis. As shown in Appendix I, the FFFT can be implemented with O(N ) gate depth on a planar lattice.
A variational strategy that is particularly practical for the near-term is based on a low-order Trotter approximation of adiabatic state preparation. This ansatz is related to the quantum approximate optimization algorithm [107] and has been shown to perform well in the context of electronic structure [41]. Following the scheme of [41], the idea is to Trotterize the adiabatic algorithm defined by evolution under Thus, the schedule is to start in the ground state of the one-body Hamiltonian and slowly turn on the two-body terms.
Note that H(0) = T for jellium, which is the Hamiltonian of a free particle. This choice of schedule further justifies use of |ψ 0 = FFFT |0 as the reference since this makes |ψ 0 an eigenstate of T in the plane wave dual basis. One should choose |0 to have the correct particle number and spin-symmetry to describe the target state as an error-free simulation would conserve these quantum numbers. We use the fact that we can write Eq. (35) for any molecular Hamiltonian in the Jordan-Wigner transformed plane wave dual basis as for scalar values of θ which should be apparent from Eq. (9). We can Trotterize the adiabatic evolution as where M is the total number of repetitions of the Trotter step. As discussed in Section II A, each of these Trotter steps can be implemented with gate depth O(N ) on a planar lattice of qubits with no ancilla. Thus, the total gate depth of this ansatz would be O(N M ). Rather than try to variationally determine all parameters to minimize the final Hamiltonian H(1), the suggestion of [41] is to train the ansatz "in layers"; i.e., to train the first Trotter step to minimize H(1/M ), the second to minimize H(2/M ) and so on. The results of [41] suggest that this ansatz may perform well for values of M as low as ten or less. Note that while initial states other than a product state of plane waves may be needed in systems other than jellium, the variational ansatz can be used for any molecule. Variational algorithms were experimentally demonstrated in [40] and [39] using superconducting qubit platforms from industrial quantum computing groups which are expected to reach the quantum supremacy threshold in the near-future [34]. Such platforms would have qubits connected on a planar lattice and could only implement shallow circuits due to limited coherence. For such an early demonstration, we can make further simplifications to the M = 1 variational ansatz. To explain this strategy, we notice that the expectation value of the Hamiltonian after applying the M = 1 variational ansatz can be expressed as where we can see that H( θ) amounts to a local basis transformation on the Hamiltonian H. Since this transformation can be applied efficiently with classical post-processing, we see that the ansatz preparation can be simplified to In practice, one would probably also take the rotation angles in the FFFT as variational parameters. Thus, our "minimal resource variational ansatz" consists of the FFFT, a high entanglement operation known to produce a good reference, followed by entangling gates between all pairs of qubits and then a single layer of phase gates on each qubit.
As a final note, the outer-loop of this variational quantum algorithm will only need to optimize over O(N ) distinct parameters, as opposed to O(N 2 ) distinct parameters, due to the translational invariance of the jellium system. In order to resolve distinct phases in low-density jellium, a reasonable target is to obtain energies accurate to a fixed relative error of half of one percent. The minimal variational ansatz of Eq. (39) may be sufficient to prepare accurate ground states of jellium in certain parts of the phase diagram; in the high density regime even the mean-field state |ψ 0 is a good initial description. But we also expect that this single Trotter step ansatz will fail to resolve the ground state in more complex regimes. Thus, this proposal immediately raises two unresolved questions: "how many Trotter steps will we be able to implement on a near-term device?" and "how many Trotter steps would be required to surpass all classical methods in the low density regime?". By compiling all aspects of this procedure to a natively realizable gate set, we should be able to estimate how many Trotter steps would be possible within the limitations of expected coherence times and gate fidelities. This analysis will be the subject of a future paper. However, the second question is more difficult to answer without a quantum device, especially because the radix-2 decimation implementation of the FFFT requires that problem sizes are a power of two. Whether or not quantum supremacy is immediately achievable using this approach to jellium simulation, experimentally studying this ansatz will provide important insights into the effectiveness of Trotter-based variational quantum algorithms for problems of correlated electrons.

CONCLUSION
In this work, we have introduced efficient techniques that use the plane wave basis and its dual for quantum simulations of electronic structure. The kinetic and potential operators are respectively diagonal in these bases, providing a Hamiltonian representation with only a quadratic number of terms in basis size. We also described an efficient second quantized fermionic fast Fourier transform to map between the two bases which can be implemented with linear gate depth on a planar lattice of qubits. Using the diagonality of the Hamiltonian components in these dual basis sets, we showed that Trotter steps can be implemented with linear gate depth on a planar lattice. We use these properties to implement time-evolution using Trotter and Taylor series methods with lower overhead than all prior approaches and also reduce the number of measurements required for quantum variational algorithms. Finally, we identified jellium as a concrete electronic structure problem to target on near-term quantum devices. Jellium is attractive due to its fundamental significance in conceptual and numerical electronic structure theory and because of its tunability into regimes where classical simulations are currently inadequate. Exploiting its natural expression in the plane wave basis, we proposed a simple quantum variational algorithm which can be executed with low circuit depth on near-term quantum hardware. Understanding the performance of this algorithm for jellium will provide important insights into the near-term feasibility of quantum supremacy in realistic problems of electronic structure.
Beyond the confines of this work, we expect that the advances we have described will have ramifications across many different approaches to quantum simulation. For example, the quadratic reduction in the number of Hamiltonian terms, as well as the lower scaling bounds on the Hamiltonian norm, will translate generally to decreased complexity in the overhead for perturbative gadgets, or in quantum simulations within the configuration interaction representation. The techniques may further be used in conjunction with error-corrected simulations. Moving beyond jellium as a physical system, quantum simulations in the plane wave basis may practically be extended to real materials by incorporating a single-particle pseudopotential, without essential modifications of the results in this proposal. Ultimately, we believe that our work illustrates the potential of exploring fundamental reformulations of the electronic structure problem in order to reduce the complexity of quantum simulations.
An alternative to the Galerkin discretization derived from the weak form of the Schroedinger equation is a finitedifference formulation, which is associated with the strong formulation of the differential equation. In the past, many works have explored the use of finite-difference discretizations (either implicity or explicitly) [61,62,[64][65][66] although never before in a second quantized simulation of an electronic structure system. Still, discretizing these systems in this way is straightforward and follows from this past work. Assuming a uniform partitioning of space, the value of position operators are assigned to a set of grid points with values determined by the position of the grid point. Generalizations to non-uniform grid spacings are also possible.
One might consider this approach analogous to choosing basis functions of the form φ i (r) = δ(r − r i ) in the Galerkin formulation, where δ is the Dirac delta function and r i is the location of a grid point, but with several important differences. In this case, the derivative operators are discretized in an entirely different way, using a finite-difference stencil, rather than integration over such basis functions. This follows from the discussion of functions with disjoint support in the main text. Moreover, while an inner product in the Galerkin formulation between two functions |ψ = i b i |ψ i and |φ = i c i |φ i has a natural definition induced by the definition of the inner product on the space of {|φ i } given by ψ|φ = i,j b * i c j ψ i |φ j , the same is not true in the finite difference scheme. In this case, one must choose a definition that is consistent with some sensible measure on the space.
To see how these differences are formulated in practice, we will consider an example. Assume a uniform volume partition for the system that consists of N = M 3 orbitals which are each indexed by four indices, x ∈ Z ∈ [0, M ), y ∈ Z ∈ [0, M ), z ∈ Z ∈ [0, M ). In this case, the kinetic energy operator may be expressed using a finite-difference 7-point stencil for the Laplacian, where h is the spacing between grid points. Central difference stencils of this type, utilizing three points along each axis, have errors that scale as O(h 2 ) in their representation of the derivative operator. In this case, we can see that the kinetic energy operator has exactly 7 N terms, and note that other size stencils may be used to reduce the discretization error. The most accurate stencil, which extends across the entire length of the simulated system, would still only have O(N 2 ) terms. An important difference to note between this choice and the Galerkin discretization is that error in expressing the finite-difference formulation of the kinetic energy operator can lead to sub-variational energies in principle. However, this is easily managed in practice with reasonably sized stencils and spatial partitions. With a uniform grid of points positioned as above and spaced by the same distance h along each axis, we may use the rectangular rule to define an inner product on single particle functions. In this scheme a single particle function |φ is defined by its values at the grid points φ(x, y, z, σ). Note that we will now also consider the spin degree of freedom σ = {↑, ↓}. We can define the inner product between two single particle functions |ψ and |φ explicitly as ψ|φ = h 3 x,y,z,σ ψ(x, y, z, σ) * φ(x, y, z, σ).
(A2) and label individual points |φ x,y,z,σ such that φ x,y,z,σ |φ x ′ ,y ′ ,z ′ ,σ ′ = δ xx ′ δ yy ′ δ zz ′ δ σσ ′ and x, y, z, σ|φ x ′ ,y ′ ,z ′ ,σ ′ = φ(x, y, z, σ). With these definitions of the kinetic energy and inner product, we can express the second quantized coefficients for one-body operators in the following way. If we define compound indices p = (x p , y p , z p , σ p ) with corresponding Kronecker delta functions δ pq = δ xpxq δ ypyq δ zpzq δ σpσq where we have used the shorthand notation q+ α to indicate shifting the α axis by 1 lattice point. We define the standard number operator as n x,y,z,σ = a † x,y,z,σ a x,y,z,σ . Similarly, the coefficients of the two-body potential become where we have separated the same-point repulsion into a second term characterized by λ. It follows that the two-body part of the operator may also be written as where λ scales the repulsive interaction between electrons of opposite spin when they occupy the same spatial orbital. We can see that there are N/2 terms on the left and N (N − 1)/2 unique terms on the right, for a total of N 2 /2 terms in the two-body potential. While the exact value of λ does not matter in the continuum limit, the chosen value determines the convergence of basis set discretization error. The approximation we advocate here is to treat λ as the mean repulsion between a uniform charge distribution in the cell, i.e. λ = 1 2 dx 1 dx 2 dy 1 dy 2 dz 1 dz 2 Note that the analytical evaluation of this integral is provided as the main result of [109]. Note further that one could also choose to evaluate the long-range Coulomb interaction between orbitals p and q using integrals which assume uniform charge density within the cell. This choice may lead to slightly different convergence behavior but the results will certainly agree in the continuum limit. Putting these results together, we arrive at the second quantized position space Hamiltonian in a finite-difference representation, H = h 2 x,y,z,σ 6 n (x,y,z,σ) − a † (x−1,y,z,σ) a (x,y,z,σ) − a † (x+1,y,z,σ) a (x,y,z,σ) (A7) − a † (x,y−1,z,σ) a (x,y,z,σ) − a † (x,y+1,z,σ) a (x,y,z,σ) − a † (x,y,z−1,σ) a (x,y,z,σ) − a † (x,y,z+1,σ) a (x,y,z,σ) x,y,z,σ n (x,y,z,σ) U (x, y, z, σ) + 0.941156 h x,y,z n (x,y,z,↑) n (x,y,z,↓) .
which implicitly defines both the one-body and two-body coefficients, h pq and h pqrs for the second quantized Hamiltonian, noting that some normal ordering may be required to bring it to its final form. This Hamiltonian contains strictly O(N 2 ) terms, as desired. While we do not use this result for any of the algorithms of this paper, understanding the finite-difference formulation on a grid is helpful to appreciate differences with the plane wave dual basis. Furthermore, it is possible that this form of the Hamiltonian has advantages that could make it easier to simulate in the context of future quantum algorithms, perhaps based on 1-sparse decompositions of the finite-difference stencil.

Appendix B: Electronic Structure Hamiltonian in Plane Wave Basis
In this section we will review analytical forms for the electronic structure Hamiltonian in a basis of plane waves of the following form in three dimensions, The length scale of our basis is parameterized by the cell volume Ω. The kinetic energy operator is a one-body operator. The coefficients of the kinetic energy operator T are Thus, where c † ν and c ν are canonical fermionic raising and lowering operators and σ ∈ {↑, ↓} represents spin. Clearly, this operator is diagonal since plane waves are eigenstates of the momentum operator.
When working with plane waves it is convenient to define the Fourier transform of the Coulomb potential, and the inverse of this Fourier transform, a solution to Poisson's equation with periodic boundary conditions: Note that there would appear to be a singularity in this periodized representation of the Coulomb operator when k ν = 0; however, whenever treating a charge-neutral system the singularities from interactions with the positive and negative charges cancel to contribute only a finite constant which depends on the cell shape. This factor can be computed using an Ewald sum, shown explicitly in Appendix F of [46]. The external potential arising from interactions with nuclei can be expressed as where nuclei j has position R j and atomic number ζ j . With this we compute the external potential coefficients as Accordingly, we can write the external potential operator as where the condition p = q is equivalent to dropping the zero momenta mode of the external potential which, as explained earlier, cancels with the zero mode of the electron-electron interaction. As explained in the main text, we choose to alias the momenta modes so that, in this case, k p−q is always contained within the original set of plane waves. This introduces a slight deviation from the Galerkin formulation and corresponds to evaluating matrix elements by N evenly spaced samples on a real space grid. Doubling the quadrature spacing would yield an exact evaluation but without the aliasing (dualling) approximation we would not obtain the convenient exactly diagonal form of the potential matrix elements in the the dual basis that we rely upon. The two-electron interaction coefficients are obtained from the integrals, The condition that ν = p − s = r − q arises from conservation of momentum. From this we arrive at r = q + ν and s = p − ν, which implies the final form of the two-electron term in momentum space is where we can see that this summation satisfies momentum conservation since ν = p − (p − ν) = (q + ν) − q. Thus, the total expression for H = T + U + V (up to a constant shift that depends on the unit cell shape) is given by

Appendix C: Electronic Structure Hamiltonian in Plane Wave Dual Basis
In the prior section we derived a closed form for the molecular electronic structure Hamiltonian in the plane wave basis. We now translate that Hamiltonian into the plane wave dual basis via unitary discrete Fourier transform. The unitary discrete Fourier transform of the plane wave basis is computed in each dimension separately as x is the x-component of the plane wave basis function ϕ ν (r) = ϕ νx (x)ϕ νy (y)ϕ νz (z), ν = (ν x , ν y , ν z ) and r = (x, y, z). As the expression for φ px (x) in Eq. (C1) takes the form of a geometric series, we can find the following closed form, which is a smooth approximation to a grid with lattice sites at the locations r p = p (Ω/N ) 1/3 . Basis functions of the above form (which can be conveniently labeled by the real-space coordinates of their centers) are commonly used in quantum dynamics simulations under the name of discrete variable representations (DVR) [56][57][58][59][60]. The sinc DVR, introduced in [59] is closely related to the plane wave dual basis. As seen from Eq. (C1), the plane wave dual basis is obtained as a sum over unit weighted plane waves with reciprocal lattice momenta up to a maximum cutoff momentum. The sinc DVR is obtained as a continuous integral over unit weight plane waves up to the maximum cutoff momentum. One of the primary weaknesses of the sinc DVR basis is the need to approximate the kinetic energy operator when using a finite number of sinc functions. This is removed in the plane wave dual basis, as the kinetic energy operator is represented exactly.
Rather than compute the integrals over these basis functions by quadrature, it is more straightforward to Fourier transform Eq. (B11) in order to obtain a representation of the electronic structure Hamiltonian in the plane wave dual basis. Accordingly, we define raising and lowering operators in the plane wave basis as the Fourier transform of their plane wave dual counterparts, Using these relations we can write the kinetic energy operator of the previous section in the dual space as We can transform the external potential in a similar fashion Recognizing that p − q spans the full set of momentum vectors in our system due to aliasing (dualling), we can replace the sum over p = q and the indices p − q and q with a sum over ν = 0 and q = 0. This leads to a DVR-like representation with diagonal potential operators. We find where we have used the fact that the summation grouped on the right side of the first equation is equal to zero unless p ′ = q ′ . This is because the negative modes of k q will have exactly the opposite phase as the positive modes of k q . This leads to the diagonal form of the final expression. Finally, we turn our attention towards transforming the two-electron operator. The following relations are helpful, where the first relation comes from conservation of particle number and the second relation is the Fourier convolution theorem. We can compute the interaction term in the plane wave dual basis as Putting these results together, we find the final expression for the total Hamiltonian in the plane wave dual basis, As we can see, there are only O(N 2 ) terms.

Appendix D: Plane Wave Dual Basis Hamiltonian Mapped to Qubits
Whereas fermions are indistinguishable, anti-symmetric particles, qubits are distinguishable and have no special symmetries. Thus, in order to encode a fermionic system on a quantum computer in second quantization one must map the operator algebra of fermions to the operator algebra of qubits. The algebra of fermions is defined by the canonical fermionic anti-commutation relations, The oldest (and simplest) method which accomplishes this is the Jordan-Wigner transformation [43]. A significantly more complicated method is known as the Bravyi-Kitaev transformation [23,24,44]. The Bravyi-Kitaev transform yields operators that are log N local as opposed to the Jordan-Wigner transformation, which is N local, in general. More recently, there has been work on generalizing these transformations [25][26][27]. Understanding the structure of these transformations is important for compiling circuits efficiently. However, for our purposes, the locality overhead is not necessarily detrimental in terms of gate depth (although it does effect gate count on a fully connected architecture) and so we analyze the Jordan-Wigner transformation for the sake of simplicity. The Jordan-Wigner transformation consists of the following mapping, where X p , Y p and Z p represent Pauli operators acting on tensor factor p. By inspection, one can confirm that the mapping of Eq. (D2) reproduces the algebra of Eq. (D1).
To actually apply the Jordan-Wigner transformation, one must map the fermionic orbitals specified in Eq. (C9) by the indices (p, σ) to a single qubit index; e.g., The Jordan-Wigner transformation is particularly simple for the plane wave dual basis molecular Hamiltonian. Applying Eq. (D2) to operators that appear in Eq. (C9), we find that We note that all of the qubit terms that come out of n p n q are diagonal (and thus commute). From Eq. (D4) we can write the position space second quantized Jordan-Wigner transformed qubit Hamiltonian as Expanding these terms and recollecting the qubit operator coefficients we find

Appendix E: Comparing Discretization Error in Fourier and Gaussian Bases
In this section we discuss convergence of basis set discretization errors in both plane wave and Gaussian bases. The basis set discretization error is defined with respect to the ground state energy in the continuum basis (N = ∞) as where |ψ N is a wavefunction limited to the support of Slater determinants with up to N single-particle basis functions (in our context those functions are either plane waves or Gaussian orbitals). Throughout this work, but especially in Section II and Table I, we directly compare the asymptotic scaling of algorithms using a plane wave basis and algorithms using a Gaussian orbital basis. We compare these scalings in terms of the same parameter, "N ", which represents the number of plane waves for some algorithms and the number of Gaussian orbitals for others. In order for such comparisons to be valid, we need to establish that the number of plane waves required for a particular calculation is asymptotically equivalent to the number of Gaussian orbitals required for the same calculation.
In Appendix E 1 we review results from the literature which establish that ∆E ∈ O(1/N ) regardless of the detailed form of the single-particle basis functions. This has been established by many numerical studies over the years and also proved up to second-order in perturbation theory for Gaussians in [110] and for plane waves in [111]. Although most of the results we describe are standard, we gather them here for completeness and also provide an intuitive explanation for this phenomenon based on simple arguments from approximation theory.
In Appendix E 2, we describe how a plane wave basis calculation is done in practice for systems with reduced periodicity, e.g. for molecules or surfaces. Using the methodology of [47], we show that one can exponentially suppress errors arising from the fictitious periodic image charges that occur when using plane waves to describe non-periodic systems. Taken together, these results allows us to directly compare the asymptotic scalings of algorithms using a plane wave basis with the asymptotic scalings of algorithms using a Gaussian orbital basis, even for non-periodic systems such as single-molecules. As the dual basis is a unitary rotation of the plane wave basis, all results presented here also hold equally for the dual basis.

Scaling of Intrinsic Discretization Error
We first present an intuitive argument for the basic result and then discuss several earlier works which establish the result more rigorously. As is well known from approximation theory and Fourier analysis, the rate of convergence of a basis expansion of a function is governed by its smoothness. For example, for an infinitely differentiable function (in any dimension), the asymptotic Fourier amplitudes from a Fourier transform decay exponentially in magnitude with respect to the number of Fourier modes, and thus approximating the function with a cutoff in the Fourier series (e.g. a finite basis) incurs an exponentially small error with the size of the basis, i.e. O(e −κN ) for some finite positive κ. For non-analytic functions, if the basis functions themselves do not incorporate the non-analytic behavior, then the error of the basis expansion only converges algebraically like O(N −α ), where α depends on the particular expectation value we are interested in as well as the nature of the non-analyticity.
Kato proved that the electronic wavefunction we are interested is continuous but has a discontinuous (yet finite) first derivative at the nuclei (the electron-nuclear cusp) and at the electron-electron coincidences (the electron-electron cusp) [112]. The asymptotic rate of convergence of both the plane wave expansion and Gaussian expansion is governed by their ability to capture these cusp-like behaviors. Around a cusp, the wavefunction may be expanded as Ψ (s) = Ψ (0) 1 + a 1 s + a 2 s 2 + . . .
where s is a radial coordinate around the cusp (e.g. |r p − R j | for the electron-nuclear cusp or |r p − r q | for the electronelectron cusp) and where we have kept the spherical part of the wavefunction for simplicity. The linear coefficient a 1 is determined by the type of cusp (e.g. a 1 = −ζ for a nuclear cusp and a 1 = 1/2 for the electron-electron cusp). An expansion in an analytic function basis (e.g. plane waves or Gaussians) necessarily omits the linear s (or it would have a discontinuous first derivative by assumption) and thus, asymptotically incurs error in some volume S close to the cusp, where S is the smallest spatial feature resolvable by the basis, which is O(1/N ). While appropriately constructed Gaussian basis sets can resolve local features such as the electron-nuclear cusp at a rate faster the O(1/N ) (see below for more detail), the same is not true of the electron-electron cusp, which occurs at all points in configuration space where coordinates of two or more electrons coincide. Evaluating the energy error in the ground state as and using the leading terms in the kinetic energy and potential energy in the Hamiltonian, proportional to (1/s)(d/ds) and 1/s respectively, the linear term in the wavefunction gives an error, to leading order in s, as ∆E ∈ O(S). By this intuitive argument, the error in the energy incurred by the cusp should scale asymptotically as O(1/N ).
The O(1/N ) scaling for the contribution of the electron-electron cusp to the energy has long been observed empirically using Gaussian basis sets, see e.g. [113][114][115] and extrapolating the so-called electron correlation energy using this asymptotic form is a common practice in electron structure theory [71]. The complicated form of molecular Gaussian basis sets prevents a more rigorous derivation of this form beyond arguments similar to the ones we presented above. However, for the case of two-electron atoms (the simplest electronic structure system demonstrating an electron-electron cusp), a rigorous partial wave analysis is possible at the level of a perturbative treatment of the electron-electron interaction [110]. This finds that at second order perturbation theory, the energy convergence of each partial wave goes like (ℓ + 1/2) −4 where ℓ is the angular momentum of the partial wave. Adding up the contributions of each partial wave to a maximum cutoff ℓ = L, gives a convergence like O(1/L 3 ), and the total number of angular functions up to the cutoff L is also O(L 3 ), thus the convergence in this case is again O(1/N ) [110]. In the case of plane waves, the O(1/N ) scaling for the contribution of the electron-electron cusp has been shown under both the random phase approximation [116] and second order perturbation theory [111]. In [111], there is also a comprehensive numerics study which demonstrates the O(1/N ) plane wave convergence.
In practice, when using a Gaussian basis, one includes basis functions that are centered on the nuclei. Then, although the Gaussians are formally analytic around the nucleus, one can choose series of Gaussians with increasingly large exponents such that they effectively mimic the sharp features of the electron-nuclear cusp. For an optimally chosen set of coefficients, one can thus improve on the algebraic convergence for the electron-nuclear cusp, and it has been shown that the convergence of the Gaussian basis for the electron-nuclear contribution scales as O(e −κ √ N ) [117,118]. However, this improvement is not possible using a single-particle basis alone for the electron-electron cusp, as this is a cusp in the inter-electron coordinate. In the case of plane waves, an equivalent acceleration of convergence for the electron-nuclear cusp can be obtained if one uses pseudopotentials, which restores the analyticity of the wavefunction around the nucleus. In this case, as argued above using arguments from Fourier analysis, the smoothness of the wavefunction means that neglecting electron-electron interaction effects (e.g. as is done in density functional theory) the plane wave error scales as O(e −κN ). In real materials, pseudopotentials are a mainstay of plane wave calculations. It is also possible to introduce a second set of functions to augment the plane wave description of the wavefunction around the nuclear region [119][120][121], and such augmented basis sets allow for exponential convergence in the electronnuclear cusp without pseudopotentials. Thus, the convergence of both Gaussian and plane wave calculations is limited by resolution of the electron-electron cusp, which scales as O(1/N ), as discussed earlier.
Since the asymptotic convergence of the Gaussian basis and plane wave basis is the same, the asymptotic complexity of algorithms designed using either the plane wave basis or the Gaussian basis may be directly compared for real molecules and materials. However, it is also useful to have an idea of the relative prefactors in the convergence. The precise prefactor depends on the system and accuracy required. As a concrete example, the cubic diamond and cubic silicon density functional energies using the Perdew-Burke-Ernzerhof exchange-correlation functional and the Goedecker-Teter-Hutter pseudopotential can be converged to an accuracy of 10 milli-eV per atom using approximately 150 plane waves per atom and 250 plane waves per atom respectively; the same accuracy in a Gaussian basis with the same pseudopotential requires a quadruple-zeta double-polarization basis or larger, which for these systems has 26 Gaussian basis functions per atom, a factor of 6-10. While this example is for a density functional calculation, it serves to illustrate the relative spatial resolution of the two bases, which is the main factor in resolving the electronelectron cusp in correlated calculations. In [49] an analysis carried out at the correlated wavefunction level finds that the number of Gaussians needed to reproduce a plane wave calculation of fixed dimension (for a surface adsorption problem) to chemical accuracy is approximately less by a factor of 20-30, although this is a significant overestimate since the number of plane waves used is substantially more than is required for chemical accuracy. In summary, a rough estimate for the plane wave basis size versus Gaussians basis size for the same accuracy is approximately ten.
Within the context of performing quantum simulation experiments on the most advanced hardware platforms (specifically industrial transmon platfroms being designed at Google, IBM, Intel, Rigetti and elsewhere) in the next few years, gate count (not qubit count) is the primary concern. While most expect that more qubits can be manufactured in a scalable fashion, there is no clear path to substantially improving the gate fidelities already achieved by the most advanced transmon platforms. And the total fidelity of a circuit decreases exponentially in the number of gates. Less obvious is the fact that gate count (not logical qubit count) also determines the primary overhead in quantum errorcorrection. This is because a very large number of physical qubits (often hundreds of times more than the number of qubits required for a logical bit) are required to perform state distillation in order to implement non-transversal gates (e.g. T gates in the toric / surface code). Thus, we expect the scaling advantages of our approach to translate into practical gains for a variety of interesting quantum simulations, both in the near-term and in the distant future.

Modeling Non-Periodic Systems with a Periodic Basis
Plane waves are often used as a basis for systems with reduced periodicity, e.g. surfaces (periodic in two dimension), nanowires (periodic in one dimension), or even single-molecules (periodic in zero dimensions) [122]. The main concern to address with plane waves in such simulations is that that they enforce a periodic charge density and thus produce fictitious image interactions between computational cells. A simple way to avoid this is to make the computational cell volume Ω sufficiently large so that periodic images of the cells do not interact. This is typically what is done for surface calculations, where it is necessary only to extend the cell volume in one or two of the spatial directions. However, a more efficient and rigorous procedure, particularly for systems that are periodic in zero dimensions such as single-molecules, is to use a truncated Coulomb operator with a slightly larger cell size [123][124][125].
To see how this works, we consider an isolated molecule. The total density of a molecule decays exponentially quickly away from its center, and thus the molecule may be inscribed in a box of volume Ω = D 3 with only exponentially small parts of the density (and contributions to the energy) outside of the box. By using a Coulomb operator truncated at distance D [47], such that and by carrying out the simulation in a box of size 8 Ω = (2 D) 3 , we ensure that there is no Coulomb interaction at all between the repeated images of the molecule, up to exponentially small terms in Ω arising from the density of the molecule outside of the box. While the Fourier amplitudes of the normal Coulomb operator are 4π/k 2 , the Fourier amplitudes of the truncated Coulomb interaction become 4π(1 − cos[|k|D])/k 2 . The exact analytical form of this correction gives the following Coulomb operators in the plane wave basis: These operators follow straightforwardly from the derivation in Appendix B if the Fourier transformed potentials of Eq. (6) are convolved with the (1 − cos[|k ν |D]) correction inside of the sum over ν.
In the dual basis, the truncated Coulomb operator can be implemented even more straightforwardly: one simply drops all n p n q terms for which |r p − r q | > D. As with the plane waves, to maintain resolution, we increase the number of basis functions by exactly a factor of eight. Taken together with the prior arguments in this appendix, this concludes our argument that electronic structure simulations of systems of reduced periodicity can be carried out using plane wave (and dual) orbitals with the same asymptotic scaling as Gaussian orbitals.

Appendix F: Operator Norm Bounds
In this appendix we bound the norms of the Hamiltonian components H = T + U + V in the plane wave dual basis. These bounds are used extensively in determining the asymptotic scalings discussed in Section II. However, we note that these bounds are likely loose and that one should compute these bounds numerically in order to determine practical scaling. Recall that we restrict the support of all operators to N plane waves with momenta in each dimension not exceeding absolute value proportional to N 1/3 /Ω 1/3 .
We begin with the two-body potential operator V , as given in Eq. (C8). For any state |ψ inside the η-electron manifold of the Hilbert space we wish to estimate As the sum above does not have a closed form in three dimensions, we will upper bound it using integrals. In particular, we use the fact that for monotonically decreasing f , We will break the sum into three cases corresponding to one, two and three dimensional sums for the potential operator. First let us consider the case of the one-dimensional sum encountered when ν y = ν z = 0, Now consider the two-dimensional case encountered when ν z = 0, (νx,νy) =(0,0) The one-dimensional sum above occurs when ν x > 0, ν y = ν z = 0. This sum is O(1) from Eq. (F3). The second case term can be bounded using the fact that 1/(ν 2 x + ν 2 y ) is a monotonically decreasing function of both variables Finally consider the three-dimensional case. Using the exact same reasoning spelled out before, but using a spherical integral rather than a polar integral, we find from Eq. (F5) that in three dimensions (νx,νy,νz ) =(0,0,0) Thus, from Eq. (F3), Eq. (F5) and Eq. (F6) we find that, Note that the dimensions of the potential are in units of inverse length and the energy scales as η 2 as expected.
We can now bound the norm of the external potential operator U , as given in Eq. (C6). For any state |ψ inside the η-electron manifold of the Hilbert space we wish to estimate where we have assumed that j ζ j = η as this must be true when treating periodic systems which, in general, must be charge neutral. Thus, we can see that the external potential has the same bound as the two-body potential, We use the equality of these bounds when estimating the variance of measuring U + V in Section II C. Finally, we bound the norm of the kinetic energy operator T . It turns out that the kinetic energy operator is much easier to tightly bound in momentum space and so we derive the bound from Eq. (B3) rather than from Eq. (C4). The bound holds for both cases as a consequence of Parseval's theorem and the unitarity of the discrete Fourier transform. The bound is computed as However, in some cases (e.g. for the value of Λ in the Taylor series method in Section II B) we are interested in the triangle inequality upper bound on the operator norm, which is not invariant under a Fourier transform. Thus, it may also be useful to bound T in the plane wave dual basis with a triangle inequality as where ∇ 2 = ∂ 2 x + ∂ 2 y + ∂ 2 z is the Laplacian, which acts on r. We do this so that we can expand the inner sum using a geometric series in ν: We can now see that and this leads us to our final bound by completing the sum over p, This turns out to be exactly consistent with the bound we obtained from the momentum space operator but it was necessary to show the triangle inequality norm remained the same. Finally, the norm of the Hamiltonian H = T +U +V and the upper bound on its expectation value is thus In this section we will describe a circuit which swaps qubits on a planar lattice so as to place them all adjacent at least once with circuit depth O(N ). This circuit is useful in many contexts, including for the implementation of the potential operator which consists of terms having the form Z i Z j . We describe the process informally below for the case of square lattices before providing a formal proof that the method works for a wide class of rectangular lattices. The motivation for restricting qubit connectivity to planar lattice comes from existing superconducting qubit platforms which have this restriction. For the purpose of explanation, we will illustrate the scheme for a 4 by 4 grid of qubits. Our circuit is implemented in four steps.
Step 1. Define a closed-loop 1D path through the qubits. This will always be possible on any rectangular arrangement of qubits on a planar lattice. For instance, for the 4 by 4 grid, one possible closed-loop path is shown in Figure 1a. We then decompose this path into two different, disconnected graphs which we will call the "left stagger" and "right stagger". We show an example of this decomposition in Figure 1. In the first step we draw a closed-loop 1D path through the qubits, e.g. Figure 1a. We then decompose the 1D path into a "left stagger" (Figure 1b) and a "right stagger" (Figure 1c).
Step 2. Alternate layers of SWAP gates on the "left stagger" and "right stagger" conformations of the graph. If U L is a layer of SWAP gates associated with the "left stagger" and U R is a layer of SWAP gates associated with the "right stagger" then one should implement (U R U L ) N/2 where N is the number of qubits. This circuit has depth of exactly N cycles and returns all of the qubits to their original positions.
A key insight is that half of the qubits will circulate along the 1D path in a clockwise fashion and half of the qubits will circulate around the circuit in a counter-clockwise fashion. To see this, it is helpful to imagine the qubits as being colored in a checkerboard fashion. We demonstrate the first four layers of this pattern for the 4 by 4 lattice in Figure 2. If we imagine the qubits colored as in Figure 2 then we can clearly see that the blue qubits will circulate clockwise and the red qubits will circulate counterclockwise. Because the qubits will return to their original locations after (U R U L ) N/2 , all of the blue qubits must have "moved through" all of the red qubits and thus, all of the blue qubits have been adjacent to all of the red qubits. What remains is to make all of the blue qubits adjacent to all of the blue qubits and all of the red qubits adjacent to all of the red qubits.
Step 3. Alternate between two staggered layers of parallel SWAP gates to move all the "colors" of the checkerboard pattern to seperated sides of the qubit array. In the worst case, this will require √ N /2 cycles. We demonstrate this in Figure 3a and Figure 3b.
Step 4. Repeat Steps 1 through 3, in parallel, for the divided sectors of the array. One should alternate between horizontal and vertical color divisions for Step 3. Once the divided sector size has reached four, a single layer of SWAPs is all that remains to ensure every qubit has neighbored at least once.
Steps 1-3 require exactly N + √ N /2 layers of gates in the worst case. After every repetition of Steps 1-3, the circuit is divided into sectors of half the number of qubits as in the prior iteration. Accordingly, one will need to repeat Steps In the second step we alternate between applying UL (Figure 1b) and UR (Figure 1c). If we color the qubits in a checkboard fashion then we can see that all of the qubits of one color (in this case, blue) will move along the 1D path in a clockwise fashion whereas all of the qubits of the other color (in this case, red) will move along the 1D path in a counterclockwise fashion. We show the first four, of sixteen layers, required to circulate these qubits all the way through the 1D path. Finally, we need to show that for each odd y that there exists an edge (x, y) in some P n M (C M ). To see this assume that for all 0 ≤ p ≤ r that (x, y) is in C M · · · P r M (C M ) for all odd y is [x − 1, . . . , x + 4r + 1]. We can therefore apply P M to the graph P r M (C M ) which includes the edges (x, x + 4(r + 1) − 1) and (x, x + 4(r + 1) + 1). Thus (x, x + 4(r + 1) − 1) and (x, x + 4(r + 1) + 1) is in C M · · · P r+1 M (C M ). Thus (x, y) is in the union for all odd y greater than x − 1 and less than (x + 4(r + 1) − 1). Since this trivially holds for r = 0 and because the function is periodic with period M/2 we then have that our claim holds for all even x.
Assume x is odd and there exists even y such that (x, y) is not in C M P M (C M ) · · · P M/2 k (C M ). Since edges are symmetric this implies that (y, x) is not in the union of graphs as well. We have shown above that each even vertex has every even vertex as a neighbor and hence this is impossible. Therefore the claim holds for all x. To understand how this works, let us first consider F M (C M ). As argued in Lemma 2 each vertex in Z M appears in an edge with every other vertex in Z M that has the opposite parity after an appropriate number of applications of the P M operation. Thus, the union of the resulting edges forms a complete bipartite graph on Z M elements. The results are then saved to ensure that every edge that we have found can be decoded as an edge later.
Next we apply Q M to the graph. This mapping is equivalent to splitting both layers of the complete bipartite graph into separate sub-graphs and drawing edges between the vertices to form a cycle graph isomophic to C M/2 . If M ≥ 4 then neither of these graphs consists of elements that have not shared an edge with each other. Thus we reduce the original problem to two instances of the initial problem. By recursing we again reduce the sub-graph to a complete bipartite graph, which reduces the number of edges in the complete graph that have not been observed by a factor of 2. After recursing this process O(log(M )) times it is then clear that every possible combination of edges is observed and saved. Since the map is by construction invertible, these saved edges can be decoded to edges in the original vertex set which completes our proof.
Theorem 3 is notably restricted to cases where M is a power of two. This is an important restriction for the simple scheme outlined here because if we do not make this assumption then the approach that we take to recursively building the edge sets will not work. We can also make this work in cases where M = 2 q X for X ∈ O(1) using O(q) operations from the above set by recursing until the problem is reduced to building edges between sets of size X, which can be handled brute force using bubble sort in O(1) steps. However, in general, if M = 2P for prime P then such a construction will not lead to a low depth circuit and idiosyncratic approaches may be needed to make the strategy work. For this reason we focus our attention in the following on graphs with M = 2 k vertices. In the following lemma we will use these techniques to show how to simulate the potential term in low-depth on a nearest-neighbor quantum computer that consists of an an integer number of qubits laid out in a rectangular lattice.
Lemma 4. Let S be a set of 2 k qubits on a nearest neighbor rectangular lattice of dimension 2 d × 2 k−d such that swap gates and e −iZZφ gates can only be performed between neighboring qubits in S. Then (x,y)∈S e −iφxya † x axa † y ay can be performed on a quantum computer in depth O(2 k ).
Proof. We prove this result by leveraging Theorem 3 but to do so we need to embed the cycle graph described in the theorem within the square lattice. To see that such an embedding is possible, first note that every cycle has a Hamiltonian path. Any rectangular grid of size 2 d − 1 × 2 k−d also contains the disjoint union of 2 k−d−1 cycles and edges that connect these cycles to their neighbors. In particular, if we start a path at (0, 0) then by following the Hamiltonian path we can arrive at (1, 0). This qubit is adjacent to vertex (2, 0) which is also part of a disjoint cycle and hence there exists a Hamiltonian path for the union of both cycles that links (0, 0) to (0, 3). Repeating this argument we see that there is a Hamiltonian path connecting each vertex in the union of these cycles that terminates at (0, 2 k−d − 1). Now if we introduce another row of vertices beneath this cycle with labels (−1, 0), . . . (−1, 2 k − 1) that have edges between horizontally adjacent qubits as well as edge s between vertex (−1, 0) and (0, 0) as well as (−1, 2 k−d − 1) and (0, 2 k−d − 1). Thus there exists a Hamiltonian cycle that can be embedded in every rectangular lattice of dimension 2 d × 2 k−d . This cycle can be viewed as the cycle graph C 2 k . Now that we have shown we can implement qubits on a cycle graph in a square lattice we next need to show that we can manipulate the qubits in the manner described in Theorem 3. To do so we need to first discuss implementing F 2 q for q = 1, . . . , k. The operation F 2 q can be implemented by swapping qubits every even qubit and its odd neighbor with higher index, and then swapping each even qubit with its odd neighbor with lower index. This shifts the value of every even qubit two sites in the opposite direction from the data in the corresponding odd qubits. Ergo it performs the transformation f on the labels ascribed to each qubit site. Each transformation can be done in depth O(1) swaps and in turn the whole series of swaps requires O(2 q ) depth. Furthermore, for each unique edge that is found in this process we can easily apply e −iφZZ to each edge in depth O(1). Thus, we can apply F 2 q and perform the necessary phase rotations in depth O(2 q ).
The operation Q 2 q can be implemented in the following way. Apply bubble sort using local swap operations to the qubits. Since there are 2 q vertices within each set that Q 2 q acts on this can be done using a serial bubble sort algorithm using 2 2q swap operations, however by using parallel bubble sort one can perform O(2 q ) comparisons at the same time allowing the algorithm to execute in depth 2 q . This allows us to sort the qubits such that the vertices 0, . . . , 2 q−1 − 1 are assigned even labels and the remaining vertices are assigned odd labels. Thus an application of F 2 q , Q 2 q requires depth O(2 q ). If the graph has already been partitioned into a disjoint union of Hamiltonian cycles then it is clear that applying C 2 q to each of these cycles can be done in depth O(2 q ) because these graphs do not interact gate operations can be applied on them simultaneously. Following the steps outlined in Theorem 3 we can produce every edge in the complete graph on 2 k entries in depth which completes our proof. Now if we define the total number of vertices on the graph to be N = 2 k then the depth required by the simulation is O(N ). This means that in architectures that allow nearest neighbor interactions that act on disjoint qubits to be applied at unit cost requires at most linear time. This is significant for Trotter based simulations as well as variational algorithms where exponentials of such terms have to be employed in state preparation.
The analogous claim for a † q follows again by symmetry. Next given this result, we need to examine the two-level fermionic Fourier transform. This is important because it is the primitive upon which the FFFT is built. The circuit in Figure 4 illustrates how the eight-mode FFFT leverages, F † 0 and the related operators F † k = e −i2πk/Ma † q aq F † 0 , to perform a fermionic Fourier transform [67]. The following corollary illustrates that F † 0 performs the necessary two-mode transformation and then the subsequent theorem will use this fact to demonstrate the general construction for the FFFT for more than eight modes and for representations other than the Jordan-Wigner transform.
Corollary 6. Let a † p and a † q be creation operators acting on disjoint spin orbitals and let F † 0 be defined as per Eq. (11) Proof. From Lemma 5 we have that Although the magnitudes of the creation operator matches what is needed by the two-dimensional FFFT, the phases are not correct. The phases for the transformation of a † p can be corrected by introducing two phase shift operators: However, if we apply the same transformation to a † q then we find This unwanted phase of i can be corrected by applying a e −i(π/2)a † q aq gate prior to the application of the partial fermionic swap e ifswapπ/4 and gives us the claimed unitary gate. Proof. Our construction for the FFFT consists of two types of gates. Specifically, we use F 0 gates between two adjacent spin orbitals, f swap gates and finally phase shifting gates e −insφ where n s is the number operator acting on an arbitrary spin orbital s. For every two level subsystem in the problem we can represent the corresponding creation operators as a vector. For example, let c † p = [1, 0] ⊤ and c † q = [0, 1] ⊤ . Thus, applying F 0 on this subspace is equivalent to applying the two-dimensional Fourier transform on the vectors that correspond to the elements. Similarly the phase shifters can be used to set the phases arbitrarily for the creation operators, which allows us to shift the phases of the corresponding vector components arbitrarily. Thus, these components allow the Hadamard gate and an arbitrary diagonal unitary to be performed on the corresponding set of vectors.
The FFFT of a vector of length M = 2 k for positive integer k requires O(M log(M )) operations from our gate library. The result is such that, for the p th computational basis vector that this process maps e p → 1 √ M j e −2πijp/M e j . The algorithm does this by applying a divide and conquer approach to the Fourier transform wherein the discrete Fourier transform on dimension M is broken up into two Fourier transforms on dimension M/2. The elements of these two Fourier transforms are combined by first applying phases to the components of the vector of the form on two dimensional subspaces corresponding to different mixtures of even and odd Fourier components. In order to estimate the gate complexity of the algorithm we first need to convert these two-level transformations into operators on the fermionic modes. Again encoding c † p as [1, 0] ⊤ and c † q as [0, 1] ⊤ we have that the equivalent fermionic transformation is carried out by a unitary F k such that Since e iφa † q aq a † q e −iφa † q aq = e iφ a † q , the gate F k can be expressed using Corollary 6 as F k is also unitary, as required, since F 0 is unitary from Lemma 5. This requires O(1) gates from our gate set. By translating the gate operations between the two sets it is clear that if we were not restricted to two-level F k gates then the process could be executed in O(M log M ) gates from this gate library. However, owing to this restriction we have to perform fermion swap gates in order to move each q to be adjacent to its corresponding p. To do this, O(log(M )) such fermionic swaps are required. We choose to implement the sort using parallel bubble sort along the lexicographical ordering of the fermion modes, which on M elements requires O(M 2 ) nearest neighbor fermionic swaps to re-arrange the elements. Since this process needs to be repeated O(log(M )) times, the number of fermionic swaps required in the overall algorithm is at most O(M 2 log(M )). However, the depth is a factor of M lower than this if parallel bubble sort is employed.
We can now use the previous result to explain how the three-dimensional FFFT can be performed with low depth. The result follows similar reasoning as the previous theorem but with the complication that the FFFT is not easily expressible as a low depth circuit using nearest-neighbor gates when applied to two out of the three dimensions. The strategy that we employ to avoid this problem is to reorder the spin-orbitals using fermionic swaps.
Corollary 8. The three-dimensional FFFT on N = M 3 spin orbitals, where M is a positive integer power of two can be implemented using O(N 2 ) quantum gates taken from a library that includes F 0 gates on nearest neighbor gates, fermionic swap gates and phase gates. It also requires requires depth O(N ).
Proof. Let us begin by assuming the following canonical ordering: n(ν x , ν y , ν z ) = ν x + ν y M + ν z M 2 . The three dimensional FFFT by definition is composed of independent FFFTs in the x-direction, y-direction and z-direction. Let each node correspond to a vertex label of a Hamiltonian path embedded in the lattice. Such a path exists because the number of lattice sites is even since M is even. For fixed ν y and ν z , all the fermionic modes which participate in the Fourier transform are contiguous by the definition of a Hamiltonian path. Therefore, each can be simulated using the result of Theorem 7. There are M 2 groups of qubits with fixed ν y and ν z and O(M 2 ) gates are required to apply the x-Fourier transform with each group. Thus, the entire process requires O(M 4 ) ⊂ O(N 2 ) gates from Theorem 7. Each of the M 2 FFFTs are independent and can be parallelized. Therefore, we can perform the x component of the fermionic Fourier transform with depth O(M ) ⊂ O(N ).
Next, let us consider the y-Fourier transform. We apply this Fourier transform by using fermionic swap operations to transform the basis to one where the effective ordering is now changed to n(ν y , ν x , ν z ). We achieve this by again performing a bubble sort along the lexicographical ordering of the fermion modes, using fermionic swap operations for the exchange. Bubble sort on N elements requires, in the worst case scenario O(N 2 ) swap operations (the evaluation of n is performed in classical preprocessing and thus does not require any quantum operations). Thus, we can sort the qubits into the ordering n(ν y , ν x , ν z ) using O(N 2 ) fermionic swap gates. These swaps are carried out between adjacent vertices on the Hamiltonian path inscribed in the two-dimensional lattice and thus commute and can be directly simulated using nearest neighbor interactions. By parallelizing swaps in bubble sort we see that depth O(N ) can be attained. Once sorted, we can again apply the result of Theorem 7 to the resulting M 2 y-Fourier transforms within groups of qubits for which ν x = ν z . Thus, the y-component of the FFFT can be performed in O(N 2 + M 4 log(M )) = O(N 2 ) gates and depth O(N ).
The z-component of the FFFT can be performed using the exact same protocol as the y-component, this time sorting the bits so that the ordering is n(ν z , ν y , ν x ) and then (if necessary) using fermionic swaps to sort back to the original ordering of spin-orbitals. Thus, by summing the complexities of the Fourier transforms along each of the three components we obtain the claimed complexities for a nearest neighbor architecture on a planar lattice where M is a positive integer power of two. Although the fermionic swap gate between two lexicographically adjacent fermionic modes is not necessarily a two-local qubit gate, this is the case under the Jordan-Wigner transformation. Thus, we have demonstrated that O(N ) layers of gates suffice to implement the FFFT on a planar qubit architecture.
Note that the fermionic swap operation has many other potential uses in quantum simulation. As an example, one application would be in the implementation of operator nesting [18]. While this procedure typically requires ancilla to evaluate Jordan-Wigner strings when parallelizing commuting operations, one could perform nesting in-place by using fermionic swap operations to move qubits acted upon by Hamiltonian terms that act on disjoint sets of qubits next to each other in lexicographical ordering.
While we have examined simulation using the fermionic fast Fourier transform within the plane wave dual basis, it is important to note that this approach is not necessary. For purposes of comparison, we outline here the method by which one would simulate chemical dynamics within the basis using the Jordan-Wigner representation of the spin orbitals. The Hamiltonian is well suited for such simulations because it can be conveniently expressed as a sum of Pauli operators as shown in Eq. (D6). The simplest term that appears in such a Trotter decomposition is of the form e −i2φnp . Such terms are easy to implement. It is easy to see from Eq. (D4) that this is equal to e iφZ , up to an irrelevant global phase. This is a single qubit rotation which can either be directly implemented in non-fault tolerant architectures or performed using a sequence of O(log(1/ǫ)) gates in a fault-tolerant architecture. The next simplest such terms are of the form e 4iφpqnpnq . Such terms are slightly more sophisticated and good networks are known for these exponentials as given in [16]. While such terms are seldom dominant for secondquantized quantum simulation, for molecules represented in the plane wave dual basis they are among the most numerous terms. Therefore, it warrants taking some time to devise optimal networks for these circuits. First, while the approach of [16] groups all three non-identity terms in the expansion of Eq. (D4) for n p n q into a single circuit, this is not necessarily optimal. This is because the single qubit terms can be grouped together. Instead, by decomposing the Hamiltonian as per Eq. (D6) directly into Pauli operators we can execute the single qubit terms that come from both the n p and n p n q terms simultaneously. This allows them to be executed with O(N ) gates and depth O(1).
The Z p Z q term is slightly more challenging. The strategy that we employ, as seen in Figure 5, is to break up the sum into sets of N − 1 terms all of which can be computed by CNOTs acting on disjoint qubits in a logarithmic number of layers. The simplest such group is {Z 1 Z 2 , Z 3 Z 4 , . . . , Z N −1 Z N , Z 1 Z 3 , . . . , Z N −2 Z N , . . . , Z 1 Z N −1 }.
There are O(N ) such sets and so we can perform all N (N − 1)/2 exponentials using at most N (N − 1)/2 rotations, O(N ) of which need to be executed sequentially. This is a factor of 3 reduction from the networks of [16] and in addition this approach requires no ancilla to be parallelized. Next let us focus on the kinetic term. We employ a new strategy for simulating the kinetic term that is based on ideas from [72]. The circuit works by diagonalizing the Hamiltonian a † p a q + a † q a p by transforming qubits p and q into the Bell basis. This is done because X p X q and Y p Y q further subdividing each term with one more index so that each W p,q,b,ν is a sum of µ phases with the same magnitude, W p,q,b,ν ≈ ζ µ−1 m=0 W p,q,b,ν,m W p,q,b,ν,m ∈ {−1, +1} ζ ∈ Θ ǫ L t µ ∈ Θ max p,q,b,ν |W p,q,b,ν | ζ .
(K2) To accomplish this one-the-fly, we perform logic on the output of sample(W ) which acts as, sample(W ) |p |q |b |ν |0 ⊗ log µ → |p |q |b |ν | W p,q,b,ν where W p,q,b,ν is a digital approximation with log µ bits to the real-valued W p,q,b,ν . Since the values of W p,q,b,ν shown in Eq. (K1) are straightforward arithmetic functions of p, q, b and ν, together with simple logic, we see that sample(W ) can be implemented at gate complexity O(1) with respect to N and ǫ. Note that some of this arithmetic (such as reversible computation of the reciprocal) can be costly to compute to high precision in practice. Furthermore, if we were concerned about scaling with number of nuclear charges, we could also break up the Z p coefficients in terms of the nuclei j using a number of ancilla scaling logarithmically in the number of nuclei. Given the sample(W ) circuit, we can construct the prepare(W ) circuit by performing logic followed by phasekickback on the output of the sample(W ) register. The values of W p,q,b,ν,m are always either +1 or -1 but we actually need the square root of these values for the prepare(W ) superposition (see Eq. (26)). Thus, we need the circuit kickback(W ) |m | W p,q,b,ν → |m | W p,q,b,ν W p,q,b,ν > (2m − µ) ζ i |m | W p,q,b,ν W p,q,b,ν ≤ (2m − µ) ζ .
(K4) kickback(W ) can also be implemented with gate complexity O(1) with respect to N and ǫ. We put these circuit together with some Hadamard gates to form the complete prepare(W ) circuit as shown in Figure 7. We see that While our implementation of prepare(W ) is significantly more efficient than the method outline in Section II B, by breaking up the Hamiltonian into these different terms the normalization Λ becomes, which is significantly higher than the value of Λ which applies to the method of Section II B. Since the gate complexity of implementing select(H) is O(N ) and the gate complexity of implementing prepare(W ) is O(1), from Eq. (24) we find that the total gate complexity of our Taylor series approach is no more than with only polylogarithimic dependence on precision. We see that the gate complexity at fixed density becomes O(N 11/3 ), which is better than the O(N 4 ) scaling of the method in Section II B. Furthermore, the oracle for select(H) can be parallelized to O(1) depth using arbitrary two-qubits gates. This can be taken advantage of by our "on-the-fly" algorithm but not by our database algorithm due to the difference in scaling of prepare(W ). To see this consider the following. The select(H) oracle consists of five cases depending on the values of p, q and b. These cases can be executed sequentially without sacrificing more than a constant factor in depth. The cases corresponding to the kinetic energy terms are the only ones that require O(N ) sized circuits. However, they can be performed in depth O(1) using the following protocol. First, fanout a qubit string that replicates N copies of p, q and b. This can be achieved in O(log N ) depth. Next for each qubit compute the value of the control bit that decides whether the conditions for that term to be activated are met. This requires O(log N ) operations. Next compute for qubit j whether j = q, j = p or j ∈ (p, q) and using Toffoli gates conditioned on these qubits as well as the flag that determines whether the term is activated to begin with, apply X, Y or Z on the qubit in question as dictated by select(H). By construction, the depth needed for this process is O(1). After this has been performed, uncompute all ancillae, which can be done in O(1) depth. The entire process requires then clearly requires O(1) depth. Since r = O(N 8/3 ) segments are required for the simulation from Eq. (K6) and each segment can be performed in O(1) depth, we find that the overall gate depth of our algorithm is O (N 8/3 ). This depth is substantially lower than any previously described algorithm for electronic structure simulation in the literature.