Simulating Time Evolution with Fully Optimized Single-Qubit Gates on Parameterized Quantum Circuits

We propose a novel method to sequentially optimize arbitrary single-qubit gates in parameterized quantum circuits for simulating real and imaginary time evolution. The method utilizes full degrees of freedom of single-qubit gates and therefore can potentially obtain better performance. Specifically, it simultaneously optimizes both the axis and the angle of a single-qubit gate, while the known methods either optimize the angle with the axis fixed, or vice versa. It generalizes the known methods and utilizes sinusoidal cost functions parameterized by the axis and angle of rotation. Furthermore, we demonstrate how it can be extended to optimize a set of parameterized two-qubit gates with excitation-conservation constraints, which includes the Hop and the Reconfigurable Beam Splitter gates. We perform numerical experiments showing the power of the proposed method to find ground states of typical Hamiltonians with quantum imaginary time evolution using parameterized quantum circuits. In addition, we show the method can be applied to real time evolution and discuss the tradeoff between its simulation accuracy and hardware efficiency.


I. INTRODUCTION
Quantum simulations of materials, which are becoming popular as promising applications of quantum computing, are practically useful in designing functional materials.In fact, there have been significant number of quantum and quantum-classical hybrid algorithms developed for such simulations [1][2][3][4][5].One can track the time evolution of a quantum system by the real time evolution (RTE) algorithm, which is important to investigate, for example, the quantum dynamics under the irradiation of laser [6,7].In addition to the quantum phase estimation (QPE) algorithm [8,9], and the variational quantum eigensolver (VQE) [10][11][12], the imaginary time evolution (ITE) algorithm [13,14] might be applied to obtain the ground state energy and wavefunction.Although it is hard to run the QPE on current noisy intermediate-scale quantum devices, the VQE and ITE may be more realizable [10,[15][16][17][18][19][20][21][22][23]; actually several experimental studies on small systems have been reported [24,25].
The basic ingredients for running quantum-classical hybrid algorithms are first to set a parameterized quantum circuit (PQC), sometimes called the trial wave function or simply the ansatz, and then to iteratively update its parameters by classical optimizers so that its final output state approximates the target state.The approximation accuracy of the target state achieved via quantum-classical hybrid algorithms heavily depends on the choice of PQC and the classical optimization strategy.Recent great efforts have revealed several essential properties of the components of quantum-classical hybrid algorithms [26][27][28].Among those literature works, we find interesting gradient-free optimizers that make full use of the specific parameterization of standard PQCs.
More precisely, those optimizers can analytically optimize a subset of the parameters at each iteration by exploiting the special type of analytic form of the cost function.That is, the parameters are locally optimized in a coordinate-wise manner and updated deterministically.Specifically, Nakanishi et al. [29] and Ostaszewski et al. [30] showed that the cost of VQE becomes a sinusoidal function of a single-qubit rotation, and thus we can determine the optimal rotational angle.Ostaszewski et al. also proposed a sequential optimization method for selecting the best rotational axes of qubits from the x, y, or z rotations.Going further to relax the ansatz-dependency of VQE, a generalization of such gradient-free optimizers called the free-axis selection (or, Fraxis) [31] was proposed.The Fraxis algorithm analytically determines the best rotational axis (not limited to the discrete set consisting of x, y, or z rotations) for each single-qubit gate when the rotational angles are fixed.Note that these gradient-free optimizers directly determine the local optima and are different from the generic ones such as the Nelder-Mead [32] or SPSA [33].
In this article, with a particular attention to the task of simulating real and imaginary time evolution on a PQC, we make a further progress of this gradient-free optimization method.That is, for simulating time evolution we prove that both the rotational axis and angle of each single-qubit gate in the PQC can be analytically optimized in a coordinate-wise manner.The set of rotational axes and angles of single-qubit gates constitutes the parameters of the PQC, and hence we have derived a fully-optimized gradient-free optimizer for simulating the time evolution.This means that the PQC now acquires the ability for searching the quantum state in the Hilbert space that best approximates the time evolution in the coordinate-wise sense, and hence opening the new path to efficiently approximate the target state being driven via the time evolving propagators.Moreover, we show that this method can be applied to optimize some multi-qubit gates, including the excitation-conserving two-qubit gates which play important roles in the quantum chemistry applications.We conducted numerical simulations of ITE for 1-dimensional (1D) Heisenberg model and H 2 molecule, and showed that the proposed algorithm could approximate the target ground state faster and better than some previous methods.In addition, we also conducted numerical simulations of RTE for 1D Ising model using the proposed algorithms, and confirmed the reproduction of more accurate dynamics in our methods.
Here we provide a summary of some related works on the quantum-computing approach for quantum simulations.Note that some existing methods for RTE and ITE [34][35][36] utilize McLachlan's variational principle [37] to update the parameters of PQC.The resultant updating rule requires, however, evaluating an inverse matrix of the size of parameters dimension at each iteration step, which is computationally demanding.Several approaches circumvent this issue.The method proposed by Motta et al. [14] is to use a length-varying quantum circuit where a local Hamiltonian approximating the ITE can be efficiently optimized at each depth (corresponding to the time step).The resultant quantum circuit achieves a controllable accuracy, while the length inevitably becomes large.Nishi et al. [38] extended this result and showed that a non-local approximation of Hamiltonian could result in a shallower circuit and applied the method to the MaxCut problem.As for RTE, the algorithm proposed by Barison et al. [39] utilize a global optimization for parameters without matrix inversion.However, it requires the direct implementation of time evolving propagators, and then the application is limited to small quantum systems.Benedetti et al. [40] reported a new algorithm that optimizes the parameters of a fixed-length PQC in a coordinate-wise manner to approximate the RTE and ITE, which is based on parametererization of each singlequbit gate with one parameter as in [29,30,41,42] and therefore not fully optimized.Although there was an attempt to optimize a PQC whose some of single-qubit gates are constrained to have the same rotational angle [29], we are not aware of previous methods that explicitly optimize parameterized multi-qubit gates similarly as ours.
This article is organized as follows.The problem formulation and the proposed method are described Section II.Some extension to optimize multi-qubit gates is also shown there.Section III is devoted to show the numerical simulations.Finally we conclude the paper in Section IV.Detailed derivation of equations are given in Appendix.

II. METHODS
We first review the hardware-efficient objective function for time evolving simulation proposed by Benedetti et al. [40], which focuses on the overlap between a target state and a trial state, and introduce a measure of hardware-efficiency of objective function.Then, we introduce Free Quaternion Selection for Quantum Simulation (abbreviated as FQS, where QS has two meanings: Quaternion Selection and Quantum Simulation), which can fully optimize an arbitrary single-qubit gate with hardware efficiency.Next, we show that the FQS can be extended to special multi-qubit gates such as the excitation-conserving gates.

A. HARDWARE-EFFICIENT QUANTUM SIMULATION OF TIME EVOLUTION
Suppose a time evolution simulation based on the Hamiltonian given as Ĥ = K k=1 h k Ôk , where Ôk denotes a tensor product of Pauli operators for m-qubit Ôk ∈ {σ 0 = I, σ x , σ y , σ z } ⊗m , h k is a real valued coefficient, and K ∼ O(poly(m)).The time evolving propagator e −i Ĥt is approximated with the first-order Trotter decomposition as where N is the number of time steps and ∆t ≡ t/N .Now a propagator e −ih k Ôk ∆t is applied to an arbitrary state |ψ k−1 , which results in a state |ψ k as ( Suppose an initial state |ψ k−1 is approximated by a PQC as is an optimal parameter set, and |0 is the computational basis state.Provided that a PQC has sufficient expressibility, there exists an optimal parameter set ϑ * . Therefore, a time evolution can be simulated if parameter sets that reproduce the time evolving propagator are somehow found.Hereafter, we focus on a series of processes to determine the kth optimal parameter set ϑ * . For readability, we write ϑ * [k] and ϑ * [k−1] as ϑ * , ϑ ′ , respectively, when it is obvious from the context.To determine the optimal parameter set ϑ * , Benedetti et al. [40] proposed a recursive optimization using the objective function M( ϑ) based on the Euclidean distance as and where Θ is the parameter space, and F ( ϑ) is defined as Now we suppose a PQC U ( ϑ) consists of D parameterized single-qubit gates and arbitrary unitary gates without parameters such as CNOT gate.An element of the parameter space Θ is given as ϑ = (ϑ 1 , • • • , ϑ D ), where ϑ d represents the parameters for the dth parameterized gate.Conventionally, ϑ d is a scalar value, e.g., rotational angle ϑ = θ ∈ R for a single-qubit axis-fixed rotation gate such as R x , but in this paper to deal with a general single-qubit gate ϑ d can be vector-valued as detailed in the next section.To evaluate M( ϑ), it is common to apply Hadamard tests with controlled gates for different parameters between ϑ and ϑ ′ .Because All D elements in ϑ usually differ from those in ϑ ′ , the Hadamard test requires additional O(D) controlled gates and ancilla qubits, it may be difficult for near-term quantum devices when D is large.
The Hadamard test can be replaced by direct measurements as proposed by Mitarai et al. [43], which are often called hardware-efficient because they need less controlled gates and ancilla qubits.However, the direct measurements require more types of circuits, i.e., replacing D controlled gates with direct measurements incur the cost of evaluating O(4 D ) types of circuits.Here we see the trade-off among different quantum resources.To reduce the required quantum resources, it is possible to restrict the number of the gates to be updated in the objective function Eq. ( 5).On the other hand, such restricted updates may not be sufficient to minimize the simulation error M( ϑ) for e −ih Ô∆t .To balance between the hardware efficiency and simulation accuracy, we employ P ∈ N series of updates procedure where parameterized gates are grouped into P sets, and the respective sets are represented by Λ p , (p = 1, 2, • • • , P ).Here, Λ p consists of the gate indices as Λ p ⊆ {1, 2, • • • , D}.In the P series update, we divide the propagator into P terms maintaining total ∆t time evolution.Although the division of ∆t is not unique, we uniformly assign ∆t/P similarly as [40].Then, the original objective function in Eq. ( 5) is replaced by a series of the following objective functions where ϑ (p−1) denotes a parameter set whose ϑ d , (d ∈ ∪ p−1 q=1 Λ q ) have been updated from those in ϑ (0) ≡ ϑ ′ .Here, U ({ϑ d }; ϑ (p−1) ), d ∈ Λ p denotes a PQC in which the elements of Λ p are variable.ϑ (p) is recursively obtained from ϑ (p−1) by solving the following problems where ϑ (p) is defined by substituting {ϑ * d }, d ∈ Λ p into ϑ (p−1) .Eventually, we obtain the solution ϑ * = ϑ (P ) , which approximates the state evolved by the propagator e −ih Ô∆t from the state with ϑ ′ .This update procedure are repeated for the total K Trotterized time propagators to simulate time evolution of ∆t as in Eq. ( 1) and (2).The optimization method for Eq. ( 6) is not unique, and classical optimizers are conventionally employed, most of which update simultaneously multiple parameters with the cost of Hadamard tests consisting of multiple controlled gates and ancilla qubits.In the present study, instead, we employ coordinate-wise update, where parameters are sequentially updated for respective singlequbit gates.The procedure is summarized in Algorithm 1.
Because there are at most |Λ p | different parameterized gates between U † ( ϑ (p−1) ) and U ({ϑ d }; ϑ (p−1) ), the evaluation of the objective function Eq. ( 6) requires additional O(|Λ p |) controlled gates and one ancilla qubit.We introduce a hardware-efficiency measure η for an algorithm to simulate time evolution with Eq. ( 6) defined as The measure varies in the range of 1/D ≤ η ≤ 1; in case of the lowest value η = 1/D the objective function in Eq. ( 6) is regressed to the original form in Eq. ( 5).On the other hand, in the highest value η = 1, namely, the most Note that the objective function in Eq. ( 6) does not prescribe any optimization methods and PQC structures.In general, it is important to employ a PQC with sufficient expressibility to describe the state of interest.While it is common to extend a quantum circuit by adding parameterized gates, the correlation among parameters emerges as a new obstacle for optimization upon increase of parameters.To circumvent this problem, it is required to both (1) simultaneously update correlating parameters and (2) construct a high-expressibility PQC with as fewer number of parameters as possible.To this end, we propose a new optimization method for time evolution simulation by generalizing the free-axis selection (Fraxis) algorithm [31], which makes full use of degree of freedom with respect to a single-qubit gate.It is different from the previous work [40], where the objective function is analytically optimized for axis-fixed rotation gates by the analog of the NFT(Rotosolve) method.Since a general single-qubit gate with three parameters is decomposed into R z -R y -R z gates, the NFT can be applied to a general single-qubit gates by sequential update of the three gates.However, it may fail to consider the correlation among parameters of the R z -R y -R z gates.
In contrast, the new optimization method can incorporate correlation between all parameters of a single-qubit gate.The three parameters of a single-qubit gate corresponds to a three-dimensional rotation which is best captured by the selection of quaternion system, and hence the name free quaternion selection of our proposed method.It is also worth noting, when η = 1 (|Λ p | = 1, ∀p), the full optimization of single-qubit gate by FQS can be conducted with only seven types of direct measurements without Hadamard test, which is smaller than the types required for sequential optimization of a decomposed generalized single-qubit gate proposed in the previous work [40].
In the next subsections, we firstly introduce the FQS formulation based on imaginary time evolution using the objective function for the most hardware-efficient case η = 1, where real time t is replaced by imaginary time τ as t → −iτ .More specifically, in imaginary time evolution, a target state as in Eq. (2) becomes where ∆τ ≡ τ /N , and N is a normalization factor as N = e −h Ô∆τ U ( ϑ ′ ) |0 2 , which can be ignored in optimization problem of the objective function Eq. ( 5).In the Result section, we demonstrate the applications of imaginary time evolution for finding the ground state of the 1D Heisenberg model and H 2 molecule.Although we supposed η = 1 and imaginary time evolution for simplicity in the following derivation, we emphasize that the FQS algorithm is neither limited to the most hardware-efficient objective functions of η = 1 nor imaginary time evolution.Note that the FQS algorithm with η = 1 is a simple extension of η = 1 by using a coordinate-wise update for each general single-qubit gate in the same Λ p as in Algorithm 1.In particular, we also demonstrate the FQS application to simultaneous optimization for an excitation-conserving gate consisting of two parameterized single-qubit gates, and thus η = 1/2.
The appropriate hardware efficiency η should be determined from the trade-off between the performance of quantum devices and the required simulation accuracy.To demonstrate this point, we also applied FQS to real time evolution of the 1D Ising model with several hardware-efficiency levels.

B. THE PROPOSED METHOD Free Quaternion Selection for Quantum Simulation
A general single-qubit gate with parameters of a rotational angle θ and a rotational axis n is written as (10) where σ 0 is identity and σ = (σ x , σ y , σ z ).Here, n = (n x , n y , n z ) denotes a normalized vector corresponding to the rotational axis.Suppose a PQC U ( ϑ) consisting of D general single-qubit gates.In the PQC, a parameter set ϑ d of the dth parameterized gate denotes where θ d ∈ R and n d = 1.For simplicity, we suppose the dth gate set Λ d contains only the dth parameterized gate (i.e., the most hardware-efficient case η = 1) and the total number of the gate sets is D.Then, the unitary operator with Λ d in Eq. ( 6) is written as where V 1 and V 2 denotes the unitary operators corresponding to the partial circuits prior and posterior to the R n d in the PQC, respectively.Substituting Eq. ( 12) into Eq.( 6) we obtain where g 0 and g = (g x , g y , g z ) are defined as where 1) , and ς µ ∈ {σ 0 , −iσ x , −iσ y , −iσ z } (See Appendix A, B for detailed derivation).Here, we used the following notations, Because the objective function Eq. ( 13) has sinusoidal form, the optimal parameter ϑ * d = (θ * d , n * d ) are trivially obtained as where we choose l ∈ Z satisfying θ * d ∈ [0, 4π].Thus, the optimal value of ϑ * d can be determined from g µ .The first term in Eq. ( 14) can be determined by a pa- On the other hand, quantum computations are required for the second term of Eq. ( 14).In general, the evaluation of the second terms requires four types of measurements in total using the Hadamard test with controlled operation on For η = 1, however, we can replace the Hadamard test with modest number of direct measurements.
In the following part of this section, we describe the details of the direct measurement protocol.First, we note in the second term of Eq. ( 14) can be transformed into single-qubit gate as R nµ (ϕ µ ), where ϕ µ and nµ are determined with θ ′ and n ′ d that are obtained in the previous processes as for µ = 0 and for µ = 0 where δ µt and ǫ µst denotes the Kronecker delta and the three-dimensional Levi-Civita symbol, respectively (See Appendix B for details).Next, we define a generator G d as a function of rotation angle θ and axis n Furthermore, the generator is transformed as where :; The quantum circuits for computing the expected values Q±,µ, (µ = 0, x, y, z) as in Eq. ( 22) to update the parameter of the dth single-qubit gate.Here, obtained in the previous processes.Rµ(±π/2), which is defined as e ∓iσµπ/4 , is inserted after the gate of interest.V1 is the sub-circuit containing all parameterized single-qubit gates that have been updated, V2 is that containing the rest of single-qubit gates to be updated, and Ô is that containing measurements related to the Trotter-decomposition term (namely, the j-th qubit is rotated by H or HS † before measured in the computational basis when Ôj is σx or σy, respectively).18)- (19)]

Algorithm 2 Imaginary Time Evolution with Free Quaternion Selection for Quantum Simulation
Note that Q ±,ν is independent of (θ, n) and can be evaluated by direct measurement with a PQC in Fig. 1, where a single-qubit gate e ∓iσµπ/4 is inserted after the gate of interest.The generator agrees with the second term in Eq. ( 14) when (ϕ µ , nµ ) is substituted into (θ, n).Hence, once Q +,0 , Q ±,x , Q ±,y , Q ±,z are obtained for the dth gate, we can evaluate the second term in Eq. ( 14) without additional quantum computation as shown in Algorithm 2. Therefore, the optimal values of single-qubit gate parameters can be analytically determined with only seven types of expectation values in the direct measurement scheme: no ancilla qubits, no additional CNOT gates, which is rather advantageous on present real devices with limited qubit connectivity and significant error from control operation.Therefore, we believe the di-rect measurement scheme of FQS as the one of the most hardware-efficient protocol for time evolving simulation.
In the following, for simplicity we refer to the algorithm described in this section as FQS(1q, 3p), where the 1q denotes its targeting parameterized single-qubit gates, and the 3p denotes the full parameterization of each gate: the (unit) quaternion which can be identified with a set of rotational angle and axis or direct parameters of a singlequbit gate [44].We emphasize the fact that all parameters of a single-qubit gate are simultaneously optimized in FQS(1q, 3p).Thus, the time evolution is more accurately simulated by making full use of expressibility of a target gate.Obviously, FQS can be used to optimize the specific single-qubit gates such as rotation gates with fixed axis (i.e., NFT [29] or Rotosolve [30]) and the Fraxis [31] gate, which is equivalent to R n (π).PQCs consisting of fixed-axis rotation gates each with 1 parameter, or the Fraxis gates each with 2 parameters can be optimized with FQS(1q, 1p) or FQS(1q, 2p), respectively.In particular, the objective functions tailored to the specialized FQS are shown in Eq. (C.1) and Eq.(C.2) at the Appendix.Although NFT [29] and Fraxis [31] are mainly used for VQE, there are many similarities with the specialized FQS in optimization procedure.For simplicity, we also refer to FQS(1q, 1p) and FQS(1q, 2p) as NFT and Fraxis, respectively.

Free quaternion selection for multi-qubit gate
Here, we extend the FQS to special multi-qubit gates that can be decomposed as, where A, B, and C denote arbitrary unitary gates without parameters (such as the CNOT and Pauli gates), and R and R † share the same parameter ϑ = (θ, n).Table I lists some examples of the well-known gates in this class.They are excitation-conserving, swap, Hop, and Reconfigurable Beam Splitter (RBS) gates.Strictly speaking, they are all instances of the excitation-conserving (or, particle-number-preserving) gate set which can be implemented by the native gate set of the Fermionic Simulation as in [45], but for convenience, we name the gates as in Table I.A product of Hop gates is known to be universal with respect to real-valued probability amplitudes of quantum states on fixed particle number, making the gates attractive for quantum chemistry applications [20].The RBS gates have been proposed as building blocks of quantum neural network [46].
Because R n (θ) and R † n (θ) gates share the same parameters in those multi-qubit gates, these gates are simultaneously updated with an optimization scheme similar to the FQS method.In this case, each Λ includes two single-qubit gates making the hardware-efficiency measure η ≤ 1/2.Although in the following, we suppose the simple case that each Λ p consists of only two single-qubit gates, namely η = 1/2 (|Λ p | = 2, ∀p = 1, 2, • • • , P ), with shared parameters which are written as ϑ p = (θ p , n p ), it is straightforward to generalize it to |Λ| > 2 by using a coordinate-wise update with a multi-qubit gate in Eq. ( 23).
Given the excitation-conserving gate in Table I, the objective function for the pth gate set is rewritten by substituting Eq. ( 23) and θ = π into Eq.( 6) as where V 1 and V 2 denotes the parts of the PQC as in the previous subsection.By definition of a 3 × 3 asymmetric matrix G (p) as Eq. ( 24) is transformed in a quadratic form as where superscript T denotes a transpose operation.The optimal value can be computed from the symmetric matrix S (p) , defined as so that the optimal parameter becomes where the eigenvector corresponding to the largest eigenvalue of S (p) is the analytical solution.As for the objective function Eq. ( 6) with respect to Swap, Hop, or RBS gates, we also derive the analytical optimization form as describe in Appendix D.
To obtain a solution of Eq. ( 28), we need to evaluate the elements of G (p) in Eq. ( 25), which can be written in an unified expression similarly to the previous subsection, as (29) where W A , W B , and W C are defined as Note that (ϕ 1 , n1 ), (ϕ 2 , n2 ) are determined by each element of G (p) as in the previous subsection.In addition, an analytical solution of the objective function i,j (CNOT), Z c i,j (CZ) gates, and their products, where the superscript c represents a controlled-gate with the first (control) and second (target) subscripts.In the right-most column, the direct sum of matrices 1 ⊕ M ⊕ ±1 denotes the block diagonal matrix Diag (1, M, ±1).
for Swap, Hop, or RBS gates derived in Appendix D also requires quantities in the same form as Eq. ( 29).
In principle, these quantities in Eq. ( 29) can be evaluated with Hadamard test with two control operations on R n2 (ϕ 2 ) and R n1 (ϕ 1 ) as shown in Fig. 2.However, direct measurements without ancilla qubits and CNOT gates are available similarly as proposed in literature [43].
It should be also noted that in the case of excitationconserving gate, eight Hadamard tests are required because G xy = −G yx , while four measurements for the Hop and RBS gates (See Appendix D).
Here, we denote the FQS method for optimizing Eq. ( 28) as FQS with u-qubit gates of 2 parameters; FQS(uq, 2p), where u is the number of qubits subject to nontrivial action of A, B, and C. On the other hand, the FQS method to optimize only one degree of freedom out of three in ϑ p = (θ p , n(ψ p , φ p )) is termed FQS with u-qubit gates of 1 parameter; FQS(uq, 1p), which can be applied to Hop, RBS, swap, and the excitationconserving gate with one fixed degree of freedom.
In particular, as shown in Table I, FQS(2q, 2p) generalizes the excitation-conserving gate, the Hop gate, and the RBS gate.In contrast to the conventional excitationconserving gate where only rotational angle ψ is the optimization target, the FQS(2q, 2p) can simultaneously update not only ψ but also φ, which seems to be advantageous to exhibit higher expressibility and to avoid local minimum and saddle points.To verify this feature, we also carry out controlled experiments where the two parameters ψ, φ of an excitation-conserving gate are sequentially and separately optimized.We also note that because the total number of electrons has to be preserved in quantum chemistry calculation, limiting the parameter search space by taking into account the preservation is essential to reduce the computational cost.The FQS(2q, *p) methods are useful for such preservation constraints.

III. RESULTS AND DISCUSSION
In this section, we verify the performance of the proposed FQS methods in real and imaginary time simulations of typical Hamiltonians.The ITE simulations were executed with η = 1 to find the ground state of the 1D Heisenberg model and H 2 molecule.As for real time evolution, we simulated the 1D Ising model with various hardware-efficiency.All simulations presented in the paper were carried out using statevector simulator of Qiskit [47].The settings of each experiments are detailed in Appendix E.

A. 1D HEISENBERG MODEL WITH FQS(1Q, 3P)
Here, we consider a 5-qubit 1D Heisenberg model under the periodic boundary condition.The Hamiltonian is given as (31) where the coupling constant J and the external fields h satisfy J = h = 1, and G = {V, E} is the undirected :; 2. A quantum circuit for Hadamard test to evaluate Eq. ( 29) with respect to the optimization of the pth gate set.|0 a is the ancilla qubit.W0, W1, and W2 are the components of the PQC defined in Eq. (30), where W0 includes the gates that have been updated, and W2 includes the rest of gates to be updated, and Rn 1 (ϕ1), Rn 2 (ϕ2) act on the same qubit as the dth gate.Using the circuits, we can evaluate the expectations Ẑa ⊗ Ô and Ẑa ⊗ Î , and the linear combination of them with their coefficients − sinh (∆τ h/P ) and cosh (∆τ h/P ) yields the quantity Re 0| U † ( ϑ (p−1) )e −h Ô∆τ /P W2Rn 2 (ϕ2)W1Rn 1 (ϕ1)W0 |0 .
graph of the lattice with 5 nodes.The imaginary time propagators were prepared under the first-order Trotter decomposition with fixed time step ∆τ = 0.50.We carried out 21 independent ITE simulations obtained with FQS in comparison with NFT and Fraxis, where we consistently employed an ansatz with ladder-like entangler shown in Fig. 3

(a).
Since it is not possible to employ the identical state as the initial conditions consistently for NFT, Fraxis, and FQS, we separately compared FQS with Fraxis and NFT by preparing different initial conditions as shown in Fig. 4. In comparison to NFT with R y ansatz, the initial rotational axes in FQS were slightly perturbed from the y-axis while identical values were assigned to the initial rotational angles, which were randomly selected.Otherwise the axes were not updated from the initial direction keeping the state vector in real space.Although unvarying axis is not trivial from Eq. ( 14), it seems to be reasonable, because a state expressed by R y ansatz corresponds to a real vector, and a resulting state by exact ITE based on the stationary Hamiltonian initiating from a state in real vector space trivially remains in real.Note that the axis rearrangement from the y-axis is critical for the Heisenberg model, because its ground state cannot be expressed by the R y ansatz.Indeed, NFT based on the R y ansatz did not sufficiently reach to the ground state.Although the effect of the initial perturbation was not pronounced in the beginning of the simulation, the difference of ITE paths of FQS and NFT became distinct around τ = 100 as seen in Fig. 4, which implies the complex component gradually increased in the course of the simulation and became dominant at this point.In this case, obviously, FQS is advantageous with respect to expressibility for the ground state.
As confirmed in the previous study [31], the ansatz shown in Fig. 3(a) with the Fraxis gates can express states with lower energy than the R y ansatz.FQS, however, made distinct difference in comparison with Fraxis as in Fig. 4, where we employed the identical initial states whose rotational axes were randomly selected.Since in conventional optimization rotational axes of single-qubit gates are fixed, the initial choice of the axes, namely ansatz design, plays critical roles.In contrast, because the rotational axes are adaptively switched during simulation in circuit structure optimization such as Rotoselect [30] and Fraxis [31], they alleviate circuit design dependency achieving higher expressibility.While the axis selection are limited to x-, y-, and z-axes in Rotoselect, Fraxis can find better axes under the condition of fixed rotational angle θ = π.Nevertheless, due to the fixed rotational angles, Fraxis may not be sufficient to reproduce accurate states on ITE path.Indeed, Fig. 4 shows that the ITE path reproduced by FQS is more accurate than the one by Fraxis.This shows that the full expressibility of single-qubit gates in FQS brings large improvement in ITE simulations.
In addition to higher expressibility, the advantage of FQS is its simultaneous optimization of multi parameters.Similar to NFT and Fraxis, FQS can analytically obtain the exact landscape of cost function.While the number of parameters to be updated at each iteration is limited to one in NFT and two in Fraxis, FQS can update at most three parameters which fully parameterize a single-qubit gate.To confirm it, we carried out additional ITE simulations for the 1D Heisenberg model in Eq. ( 31) with the ansatz in Fig. 3(a).
For fair comparison, we prepared two settings such that two simulations were performed on PQC with equivalent expressibility.In the first condition (Setting-A), all D = 15 single-qubit gates U d (ϑ) in the ansatz with two layers (a) A 5-qubit PQC for 1D Heisenberg model in result A were treated by FQS(1q, 3p) as shown in Table II.On the other hand, in the second condition (Setting-B), these 15 gates were decomposed into 15 R y and 30 R z gates (D = 45) as U (ϑ) = R z (φ)R y (ψ)R z (λ), and the 45 gates were sequentially optimized with NFT.
We carried out 30 independent simulations with different initial parameters that were shared by the two settings, where the initial rotational axes were determined randomly, and the initial rotational angles were fixed to π. Figure 5 shows the cumulative distributions of the fidelity between the ground state and the resulting state at 40, 80, 160, and 320 time step.Since it is less likely to be trapped at local minimum for ITE, the fidelity in both simulations became gradually larger with the time steps.Note that because the total number of gates in NFT simulation is larger than that in FQS, actions of the respective ITE propagators in NFT are scaled down more than FQS for better reproduction even with the coordinate-wise update.Nevertheless it is obvious that FQS reproduced ITE paths more accurately than NFT, which shows the importance to simultaneously optimize multiple parameters.
It should be also noted that NFT and FQS(1q, 3p) require three and seven types measurements per gate update, respectively (see Section II-B).NFT was originally proposed for Variational Quantum Eigensolver (VQE), where the number of required measurement types can be reduced to two except for the first optimization step, because the cost function is equivalent to the observable (expectation value of energy), and thus the estimated value of the cost function in the prior optimization can be reused in the following optimization.In contrast, because in this time evolution simulation the objective function in Eq. ( 5) differs from the observable in Eq. ( 22), it is not possible to reduce the number of measurement types in each optimization step.As a result, the NFT optimization of a general single-qubit gate requires nine measurement types because the gate is decomposed into three fixed-axis rotation gates, e.g., one R y and two R z gates.Therefore, the computational cost for a general single-qubit gate by FQS(1q, 3p), which requires seven measurement types, is actually smaller than that of NFT while achieving higher accuracy due to taking into account correlation among parameters.

B. H 2 MOLECULE WITH FQS(2Q, 2P)
As described in Method Section, the FQS can be applied to the set of two-qubit gate decomposable as in  Eq. (23).In this section, we confirm performance of the FQS for excitation-conserving gates useful for quantum chemical calculation.We chose H 2 molecule with the atomic distance of 0.74 Å as a benchmark system, where the molecular Hamiltonian obtained by Hartree-Fock method with STO-3G basis was mapped to 4-qubit Hamiltonian by Jordan-Wigner transformation.The imaginary time propagators were prepared as well as in the previous section with fixed time step ∆τ = 1.0.In this section, we employed an ansatz with the structure The excitation-conserving PQC for H 2 molecule with Jordan-Wigner mapping.
Decomposition of the excitation-conserving gate N (ψ, φ) using Ry and Rz gates.
shown in Fig. 6.When compared to the Hop and RBS gates, the excitation-conserving gate has an additional degree of freedom, which allows to express relative phase in complex space.Although the advantage of FQS(2q, 2p) is the simultaneous update of two parameters taking the correlation into account, the advantage is not unveiled trivially in treatment of molecular Hamiltonian.This is because the eigenstates can be represented by vectors in real space.To confirm this point, we compared two ITE simulations, where ψ and φ are variable for FQS(2q, 2p), while φ = π for FQS(2q, 1p).Here, we refer the simulation conditions for FQS(2q, 1p) and FQS(2q, 2p) as Setting-C and -D, respectively, (See Table I).In both setting, the gate set Λ p consists of the single-qubit gates in pth excitation-conserving gate.
Given φ = π as an initial condition, FQS(2q, 2p) yielded the same trajectory as FQS(2q, 1p) with φ = π (Data is not shown).The unvarying behavior of φ is not trivial, but seems to be reasonable because a state represented by a real vector remains in real space by ITE with molecular Hamiltonian.Hence, for FQS(2q, 2p) simulations we randomly chose the initial value of φ.On the other hand, FQS(2q, 1p) and FQS(2q, 2p) shared the same initial values of ψ, which were randomly generated.
Figure 8(a) shows all simulations started from the similar energy level, which implies the initial states contain the ground state with amplitude in the same scale.Note that FQS(2q, 1p) with φ = π reached to the chemical accuracy (∆E = 10 −3 a.u.) in the best case, which im-plies the excitation-conserving ansatz restricted to φ = π can sufficiently cover the ground state.However, in the worst case of FQS(2q, 1p), the energy was not improved from the HF level, which implies that the expressibility for the ground state does not necessarily guarantee sufficient expressibility to reproduce accurate ITE path.In contrast, FQS(2q, 2p) initiating from a state in complex space outperformed FQS(2q, 1p) with φ = π.This is presumably because FQS initiating from complex vector can make use of larger Hilbert space which includes not only complex space represented by φ = π but also extended real space caused by the phase cancellation among φ's in excitation-conserving gates.Thus, the time evolution in FQS can possibly be described better with φ = π.
Next, we evaluated the correlation of two parameters (ψ, φ) in each excitation-conserving gate.To this end, we decomposed an excitation-conserving gate into R y and R z gates as Fig. 7 according to [48].Then we sequentially updated ψ and φ with FQS(2q, 1p) in different gate sets (Setting-F).Since these multiple R y and R z gates share the parameters ψ and φ, respectively, required controlled operations still remain two for one excitation-conserving gate, namely η = 0.5, even after the gate decomposition.On the other hand, the number of optimization is doubled as P = 10, that is effective time step is scale down, according to increase of the total number of parameterized gate D, which can lead to overestimation of its performance through scaling effect of the propagator e −ih Ô∆t/P in Eq. ( 6).Hence, for fair comparison with FQS(2q, 1p) and FQS(2q, 2p), we carried out twice sweep update for each optimization of Eq. ( 6) in the FQS(2q, 2p) method, where we employed P = 10 allowing overlap of Λ (see Appendix E).We refer this simulation condition as Setting-E in Table II.
For statistical accuracy, we independently conducted 20 ITE simulations by using randomly-generated common initial states for both FQS(2p, 1p) and FQS(2q, 2p) methods.Figure 8 showed a boxplot of the energy in course of simulation time, which exhibits distinct difference where FQS(2q, 2p) reached to lower energy states when compared to FQS(2q, 1p).This discrepancy implies the importance of taking into account correlation between ψ and φ.
FQS(2q, 2p) requires eight-type measurements to evaluate Eq. ( 29), because G xy = −G yx holds for the excitation-conserving gate.On the other hand, four-type measurements are required in FQS(2q, 1p) for the respective gates.Therefore, the number of required measurements for a single excitation-conserving gate in Eq. ( 23) with FQS(2q, 2p) is equivalent to that in separate optimization with FQS(2q, 1p).
Considering the twice sweep, FQS(2q, 2p) simulation in Fig. 8(b) are twice as expensive with respect to the measurement cost.To evaluate the simulation accuracy with consistent measurement cost, FQS(2p, 1p) with Setting-F should be compared with FQS(2q, 2p) with Setting-D where all parameters are updated once per a propagator.Eight types of measurements are required in both simulations.
As shown in Fig. 8, notably, the performance of FQS(2q, 2p) (Setting-D) was almost retained in this case when compared to the twice-sweep simulation (Setting-E) in Fig. 8(b), although the worst case in single sweep resulted in slightly larger energy.Altogether, the FQS(2q, 2p) application to the excitation-conserving gate realizes incorporation of the parameter correlation without any additional cost when compared to FQS(2q, 1p), which remarkably outweighs the time scaling of the propagator according to the number of the parameterized gates.

C. REAL TIME EVOLUTION FOR 1D ISING MODEL WITH FQS(1Q, 3P)
We applied FQS algorithm to real time evolution of a 4-qubit 1D Ising model with transverse-field under the open boundary condition.The Hamiltonian is given as where J = h = 1.We employed the ground state |0

⊗4
of the Hamiltonian without transverse-field as the initial state at t = 0.The time propagators were prepared using the first-order Trotter decomposition with fixed time step ∆t = 0.01.We used the 2-layer PQC with a linear entangler as shown in Fig. 3(b), where the single-qubit gate U d represents a general single-qubit gate in FQS and R z -R y -R z gates in NFT so that FQS and NFT simulations were performed on PQC with equivalent expressibility as in the previous section.
In order to demonstrate the trade-off between simulation accuracy and hardware efficiency, we prepared several simulation settings with different values of hardware efficiency η.In the most hardware-efficient case (η = 1), a propagator is divided by the total number of single-qubit gates D, and the respective single-qubit gates are updated according to Eq. ( 6) with respect to a divided propagator e −ih k Ôk ∆t/D , where D = 12 in FQS and D = 36 in NFT.
As the next hardware-efficient case, we employed a group update for a propagator of e −ih k Ôk ∆t/P with the objective function in Eq. ( 6).Here, the respective gate sets Λ p , (p = 1, . . ., P ) consist of single-qubit gates in identical layers as in Fig. 3 "|Λ| = 1", "Layer", and "All" denote the level of hardwareefficiency determining the gate set in one optimization problem in Eq. (7).
parameterized gates are sequentially updated to approximate the action of one propagator e −ih k Ôk ∆t .We refer to such simulations as NFT(All) and FQS(All).
Figure 9 shows total magnetization per site N −1 i Z i and infidelity at each time step with different levels of hardware-efficiency, where the infidelity is defined as The NFT method failed to reproduce dynamics of the total magnetization for all three cases, in particular with η = 1/D and 1.Although NFT improved with the use of different initial state (See Supplementary materials), FQS still outperformed NFT for all cases.Considering equivalent expressibility of PQC between NFT and FQS, the results imply parameter correlations within a single gate (intragate correlation) are more important in real time dynamics than in imaginary time evolution.It seems to be inline with the fact that the required accuracy with respect to reproduction of propagators in realtime evolution is much higher than that in imaginary time evolution to find the ground state.In addition, although the performance of FQS with η = 1 was not necessarily satisfactory, it was drastically improved with group update in FQS with η = 1/m and 1/D.The improvement was due to incorporation of the correlations between multiple gates in one group (intergate correlation) that are assigned to one divided propagator.In FQS (η = 1), the divided propagators are uniformly assigned to all parameterized gates.However, although the error with respect to one divided propagator can be suppressed by fine-grained time steps, the action of each divided propagator may not be sufficiently represented by a single gate update.On the other hands, in FQS with η = 1/m and 1/D, the description of action of one propagator reproduced better by multi-gate update in one optimization.Indeed, higher accuracy was achieved in FQS with η < 1 where the respective propagators were assigned to more parameterized gates in one optimization.

IV. CONCLUSION
In this paper, we proposed a new method called FQS for time evolution simulation with full optimization of a single-qubit gate with respect to its rotational angle and axis.The time evolution is reproduced by sequential optimization of Euclidean norm between target and trial states, instead of the conventional gradient based approach.Because FQS can incorporate correlation among parameters into optimization, it can achieve more accurate simulation.We extended FQS to the excitation-conserving gates that have been widely employed in quantum chemical applications.To verify the performance of the proposed method, we applied it to quantum imaginary time evolution for 1D Heisenberg model and H 2 molecule and confirmed that it effectively led to quantum states that were closer to the true ground states.We also applied FQS to real time evolution of 1D Ising model.Unlike imaginary time evolution, the most hardware-efficient setting with FQS did not reproduce the dynamics with satisfactory accuracy in real time evolution, although the advantage of FQS over other methods was significant.However, its dynamics accuracy was drastically improved when the hardware efficient condition was relaxed with the use of O(m) controlled gates, where m is the number of qubits.
Although in the present work the gate updating order is fixed from left to right (from the one closest to the input qubits to that closest to the output qubits) of the quantum circuit assuming generality of ansatz and Hamiltonian, the order may not be optimal and there remain rooms for improvement.One way to determine the order is based on the support of the time propagator.These technical improvement may allow the use of FQS in real applications implemented on a bigger size circuit; for instance, calculation of broad vibrational absorption spectra of floppy molecules [49], simulation of short time molecular dynamics observed by femto-second time-resolved spectroscopy [50].Lastly note that, other than time evolution simulation, FQS is applicable to general optimization problems whose objective functions are given by Euclidean norm between a target state and a trial state of a PQC.Therefore, an improved FQS may also be potentially applicable to such optimization problems in a practical level.Substituting Eq. (A.6) into Eq.(A.2), we obtain The exact solution for maximization of F (d) (ϑ d ) is trivially given as where l ∈ Z is chosen such that θ * d ∈ [0, 4π].The sign of n * d does not affect the expectation values while it changes the global phase.

B. EVALUATION OF gµ WITHOUT HADAMARD TEST
Here, we clarify that g µ in Eq. (A.6) can be estimated by evaluation of several expectation values.To this end, we transform g µ as where we employed the condition that U (ϑ d ; ϑ (d−1) ) differs from U ( ϑ (d−1) ) with respect to only the dth gate in the third equality. 1), and ρ ′ , Ô′ are defined as Note that the first term in Eq. (B.1) is transformed as To obtain the above, we used a relation σ a σ b = δ ab I + c=x,y,z iǫ abc σ c , where δ ab and ǫ abc denotes the Kronecker delta and the three-dimensional Levi-Civita symbol, respectively.
Because a parameter ϑ is already known, the first term in Eq. (B.1) can be calculated without quantum circuits.Thus, to obtain g µ , only the following quantities in the second term in Eq. (B.1) are required

Re tr Ô′ ς
Here, we introduce a real value α µ and a real unit vector It is straightforward to prove the existence of the α µ and β µ as follows.

C. THE RELATIONSHIP OF FQS WITH NFT AND FRAXIS
In this section, we derive FQS(1q, 2p) (Fraxis) and FQS(1q, 1p) (NFT) from FQS(1q, 3p).First, in a Fraxis gate the rotational angle θ in a general single-qubit expression R n (θ) is fixed to π while the rotation axis n is arbitrary.Substituting θ d = π and ϑ d = n d into Eq.(A.2), we obtain the objective function for Fraxis Note that g 0 in Eq. (A.6) does not appear in this objective function.Thus, the optimal n * d that maximizes the objective function is trivially n * d = g/ g as in Eq. (A.8).The same procedure as for FQS(1q, 3p) can be used for evaluation of g, and thus FQS(1q, 2p) requires only six types of measurements Q In NFT method a single-qubit gate R n (θ) has a fixed axis ñ such as R x (θ).From Eq. (A.2), the objective function for NFT is given as  Then, we can evaluate Eq. (C.3) and Eq.(C.4) in the same way as FQS(1q, 3p), and the number of types of required measurements is three for FQS(1q, 1p).

D. FQS FOR HOP, RBS, AND SWAP GATES
Considering the case η = 1/2 (|Λ p | = 2, ∀p = 1, 2, • • • , P ), we derive the analytical optimization for the objective function in Eq. ( 6) with specific multi-qubit gates.Here we suppose the pth gate set includes only two single-qubit gates with shared parameters ϑ p = (θ p , n p ) as in Eq. (23).Given the swap gate in Table I where the rotational axis is fixed to ñ = (0, 0, 1), the objective function F (p) (ϑ p ) is rewritten as Then, the optimal value of θ * p for the optimization problem Eq. (D.1) is given as where we choose l ∈ Z satisfying θ * p ∈ [0, 2π].For the cases of the Hop and RBS gates in Table I, the rotational angle θ is fixed to π and the rotational axis n is restricted in the XZ-plane.Thus, n can be expressed with one parameter ψ as n(ψ) = (sin (ψ/2), 0, cos (ψ/2)).Expanding Eq. ( 26) with n p (ψ p ) as a function of ψ p , we obtain the the objective function that has the same form as Eq.(D.1), where h 0 , h 1 , h 2 , h 3 , and θ p are replaced by G 33 , G 31 , G 13 , G 11 , and ψ p , respectively.Altogether, for any gate types in Table I only Eq. ( 29) are required for estimation of the optimal parameter.Figure 4: In the case of NFT and Fraxis, the single-qubit gates in Fig. 3(a) were replaced by R y and R n (π) gates, respectively, where indices of the parameterized gates are also shown in Fig. 3(a).We consistently employed the following conditions Setting-A for NFT, Fraxis, and FQS.

E. EXPERIMENTAL DETAILS 1D Heisenberg model
Setting-A: The PQC employed contains 15 parameterized single-qubit gates and the gates in an identical layer are indexed from ascending order of corresponding qubits (D = 15).We used 15 gate sets in total (P = 15) where Λ p = {p}, (p = 1, • • • , 15).In the optimization for the respective gate sets Λ p , the gate parameters of each gate set were updated once in ascending order of gate indices.
Setting-B: The 15 general single-qubit gates in the FQS simulation in Setting-A were decomposed into 15 Ry and 30 Rz gates as U (ϑ) = R z (φ)R y (ψ)R z (λ) (D = 45).Then, the 45 gate sets are prepared as Λ p = {p}, (p = 1, • • • , 45) (P = 45).The three gates obtained by decomposition are assigned a series of indices from the front of the circuit and separately updated with the NFT method in ascending order of the gate set index.

H 2 molecules
For imaginary time evolution of H 2 molecular Hamiltonian, we employed a PQC shown in Fig. 6, which contains five excitation-conserving gates.An index of single-qubit gates in pth excitation-conserving gate is assigned from the front of the circuit such as 2p − 1, 2p.
Setting-F : The dth excitation-conserving gates are decomposed into multiple R y and R z gates as in Fig. 7, where the parameters (ψ, φ) in excitation-conserving gates are shared by two R z gates and two R y gates, respectively.Then ψ and φ are separately and sequentially updated once for through ten optimizations in the order of φ 1 → ψ 1 → φ 2 → ψ 2 • • • φ 5 → ψ 5 by using FQS(2q, 1p).

FIG. 3 .
FIG. 3. PQCs for result A and result C. The PQC (a) and (b) are used for simulating imaginary time evolution with a 5-qubit 1D Heisenberg model and real time evolution with a 4-qubit 1D Ising model, respectively.The number of layers is represented by L (l = 1, • • • , L), and U d is the dth singlequbit gate whose type is determined corresponding to the optimization method, e.g.U d = Ry(θ) for NFT, U d = Rn(π) for Fraxis, and U d = Rn(θ) for FQS in the first part of result A.

FIG. 4 .
FIG.4.Averaged trajectories of 21 independent simulations for 1D Heisenberg model using the PQC shown in Fig.3(a)with two layers.Black and red lines represent the results obtained with NFT and Fraxis, where initial rotational angles and axes are randomly selected, respectively.Blue and Green lines represent the result with FQS, where the initial states are identical to those of NFT and Fraxis, respectively.
FIG.5.Cumulative distributions of the fidelity between the ground state and the imaginary time evolved state in 1D Heisenberg model using the PQC shown in Fig.3(a) with 2 layers.Orange and blue lines represent the cumulative distributions of NFT and FQS(1q, 3p), respectively.
FIG. 8. Energy in the course of simulation steps N with ∆τ = 1.Black and red lines correspond to the results of FQS(2q, 1p) and FQS(2q, 2p), respectively.Box and whisker plot denote quantiles obtained from the ITE simulation results with 20 different initial parameter sets.Dash line represents the HF energy level.

FIG. 9 .
FIG. 9. Dynamics of (Top) total magnetization per site and (Bottom) the infidelity of simulation for 4-qubit 1D Ising model, where the initial states were |0 ⊗4 .The solid lines and dashed lines represent the results of FQS and NFT, respectively.The red dotted line represents the exact simulation."|Λ| = 1", "Layer", and "All" denote the level of hardwareefficiency determining the gate set in one optimization problem in Eq.(7).

TABLE I .
Examples of the special two-qubit gates in the form of ARn(θ)BR † n (θ)C.The parameterized single-qubit gate Rn(θ) is supposed to act on the second qubit (more precisely, Rn(θ) denotes I1⊗Rn(θ)).The arbitrary gates A, B, and C without parameters are represented by Z2 (Pauli Z), X c

TABLE II .
Simulation settings of imaginary time evolution in Section III-A and III-B Then, we only evaluate g 0 and g d for executing FQS(1q, 1p).More specifically, the following quantities corresponding to the second term in Eq. (B.1) are required, d ≡ ñd • g.