Digitized-counterdiabatic quantum approximate optimization algorithm

The quantum approximate optimization algorithm (QAOA) has proved to be an effective classical-quantum algorithm serving multiple purposes, from solving combinatorial optimization problems to finding the ground state of many-body quantum systems. Since QAOA is an ansatz-dependent algorithm, there is always a need to design ansatz for better optimization. To this end, we propose a digitized version of QAOA enhanced via the use of shortcuts to adiabaticity. Specifically, we use a counterdiabatic (CD) driving term to design a better ansatz, along with the Hamiltonian and mixing terms, enhancing the global performance. We apply our digitized-counterdiabatic QAOA to Ising models, classical optimization problems, and the P-spin model, demonstrating that it outperforms standard QAOA in all cases we study.


I. INTRODUCTION
Hybrid classical-quantum algorithms have the potential to unleash a broad set of applications in the quantum computing realm. The challenges involved in realizing fault-tolerant quantum computer have promoted the study of such hybrid algorithms, which proved to be relevant to modern noisy intermediate-scale quantum (NISQ) devices [1,2] with few hundred qubits and limited coherence time. One notable example is that of variational quantum algorithms (VQA), which is implemented by designing variational quantum circuits to minimize the expectation value for a given problem Hamiltonian. VQA is advantageous given the fact that preparing a tunable circuit ansatz is found to be difficult on a classical computer. It has already been widely applied in quantum chemistry [3][4][5][6][7][8], condensed matter physics [9][10][11], solving linear system of equations [12], combinatorial optimization problems [13,14] and several others [15,16]. Remarkably, one of the early implementations of the VQA was performed using photonic quantum processors [17], which prompted further theoretical progress [18][19][20][21][22][23]. VQA has been demonstrated in superconducting qubits [3,6,18] and trapped ions [8,24,25].
One compelling outcome of VQA is the development of the quantum approximate optimization algorithm (QAOA) [26], which provides an alternative for solving combinatorial optimization problems using shallow quantum circuits with classically optimized parameters. In the past few years, there has been a rapid development in QAOA-based techniques that have been applied not only for solving conventional optimization problems like MaxCut but also for solving ground state problems in different physical systems [25,27,28]. Improved * These two authors contributed equally. † koushikpal09@gmail.com ‡ enr.solano@gmail.com § chenxi1979cn@gmail.com versions of QAOA, like ADAPT-QAOA [29] and Digital-Analog QAOA [30] have also been reported recently. Like any combinatorial optimization problem, QAOA depends on optimizing a cost function to obtain the desired optimal state corresponding to a p-level parametrized quantum circuit. In addition, the choice of the approximate trial state, from which the cost function is obtained, is crucial to the success of the QAOA algorithm. Generally, this is done by using quantum adiabatic algorithms (QAA) which produce near-optimal results for large p which is not suitable for current NISQ devices. Moreover, due to the requirement of large p, the cost of classical optimization increases and the algorithms suffer from the problem of vanishing gradients and local minima [31][32][33]. Several studies have been reported in past few years showing that high fidelity quantum states can be prepared by assisting QAA with additional driving interaction [34]. These studies establish that for certain problems, the inclusion of additional driving terms can reduce the computational complexity, and with it the circuit depth. These driving terms are usually calculated using methods developed under the umbrella of so-called shortcuts to adiabaticity [35,36], which have been introduced to improve the traditional quantum adiabatic processes, removing the requirement for slow driving [37]. Instances of these methods include counterdiabatic (CD) driving [38][39][40], fast-forward approach [41,42], and invariantbased inverse engineering [43,44]. Among them, CD driving is interesting and has been used to study fast dynamics [45][46][47][48][49], preparation of entangled states [50][51][52][53], adiabatic quantum computing [34,54] and quantum annealing [55][56][57].
In the context of QAOA, the advantage of the introduction of CD driving is twofold. The CD driving decreases the circuit depth, while reducing the number of optimization parameters. On the other hand, it provides a better approximate trial state which is beneficial for finding the optimal target state. In this work, we propose a novel algorithm, digitized-counterdiabatic quantum approximate optimization algorithm (DC-QAOA), which improves the performance of the conventional QAOA using CD driving. In this context, it is worthwhile to mention the work of Ref. [58], also inspired by CD driving techniques. This article is organized as follows. In Sec. II, we introduce DC-QAOA algorithm and explain it in detail, comparing it with the quantum adiabatic evolution and QAOA. In the following sections we present a comparative study of the proposed DC-QAOA and the conventional QAOA method in the context of various physical systems. In Sec. III, we prepared the ground state of three different types of 1D Ising spin models, namely, the Longitudinal field Ising model (LFIM), the transverse field Ising model (TFIM), and the GHZ state. In Sec. IV, we studied classical optimization problems such as the MaxCut problem and the Sherrington-Kirkpatrick model, while in Sec. V, different variants of P-spin model are considered. In doing so, we establish, by comparing the approximation ratios, that the DC-QAOA is advantageous compared to the QAOA for shallow quantum circuits. We conclude with a discussion in Sec. VI. The conventional QAOA method can be viewed as a combination of two distinct parts: the quantum part consists of a parameterized circuit ansatz, which is in turn complemented by a classical optimization algorithm to determine the parameters that minimize (maximize) a predefined cost function. The circuit ansatz for the quantum part is governed by an annealing Hamiltonian, where λ(t) ∈ [0, 1] is the annealing schedule for t ∈ [0, T ]. H mixer = i h i σ x i is the mixing Hamiltonian that produces an equal (weighted) superposition state in the computational basis to begin with, whereas the desired final state is the ground state (or an eigenstate) of H prob . In continuous annealing, the system evolves from the eigenstate of H mixer to the eigenstate of H prob through adiabatic evolution. The corresponding digital adiabatic circuit ansatz can be designed using the trotterized time evolution operator [59,60] where we consider that H a (t) can be decomposed into M klocal terms, i.e., into terms H m (t) which have k-body interactions at most. Note that the U(0, T ) is a product of p subunitaries, each corresponding to an infinitesimal propagation step ∆t. An adiabatic evolution using U(0, T ) can always produce an exact target state at the cost of resorting to a large value of p. This can be translated to the language of QAOA if one parameterizes U(0, T ) as where the evolution operators are U m (β p ) = exp (−iβ p H mixer ) and U p (γ p ) = exp (−iγ p H prob ).
Here, the annealing schedule is characterized by the discrete set of parameters {β p , β p−1 , . . . , β 1 } and {γ p , γ p−1 , . . . , γ 1 }. (γ, β) defines a 2p parameter space that corresponds to the depth of the circuit ansatz and the cost function F(γ, β) is optimized classically to obtain an optimal parameter set (γ * , β * ), which produces the desired target state, i.e., |ψ(γ * , β * ). Note that in most cases this target state is chosen to be the ground state of H prob .
As the case of adiabatic evolution, QAOA requires large p to obtain a near-optimal trial state, even with the assistance of the classical optimizer. In addition, the realization of U(β, γ) for an interacting many-body system for large p becomes inefficient due to the large number of gates involved. In DC-QAOA, we focus on improving the quantum part of QAOA, by adding a variational parameter in each step, i.e., The application of another parameter decreases the size of p drastically. This additional parameter can be quantified as the inclusion of the CD driving term in the problem Hamiltonian. The resulting circuit ansatz is shown in Fig. 1.
In general, CD driving amounts to using an additional control Hamiltonian in Eq. (1), required for suppressing nonadiabatic transitions [38][39][40]61]. This is especially effective for many-body systems with tightly spaced eigenstates. CD driving comes at a cost, as it generally involves nonlocal many-body interactions, and their exact specification of the CD Hamiltonian term requires access to the spectral properties of the driven system [38,40,45]. As a way out, variational approximations have been proposed to obtain the CD terms [50,62,63]. In this context, one can use the adiabatic gauge potential for finding an approximate CD driving without spectral information of the system [63,64].
In the following sections, a pool of CD operators is defined using the nested commutator approach of the adiabatic gauge potential, provided by Claeys et. al. [65], Here we considered up to the second order in the expansion of the nested commutator, l = 2, which gives rise to an operator pool A = {σ y , σ z σ y , σ y σ z , σ x σ y , σ y σ x }, including solely local and two-body interactions. Note that the choice of this operator pool depends on H a (t) and may contain other operators depending on the problem Hamiltonian. However, A contains every possible CD operator that can be derived from the problem Hamiltonians, used in the present study. We chose the CD term as a combination of these operators for each system based on the success probability of the algorithm. For instance, the local CD driving term provides better success probability in the case of the transverse-field Ising model and P-spin model, however not suitable for solving the MaxCut Hamiltonian. The CD coefficients α k are transformed into the additional variational parameter associated with the CD driving. The addition of such a new free parameter increases the degrees of freedom, making it possible to reach broader parts of the Hilbert space of the Hamiltonian with a lower circuit depth than in QAOA. Furthermore, as DC-QAOA only requires the operator form of the CD driving combined with the additional set of parameters α, it eliminates the requirement of complex calculation of the CD coefficients. DC-QAOA is also more flexible in regards to the boundary conditions compared to the CD evolution which permits the application of the driving term even for one step only. Moreover, the operators can be chosen heuristically and according to the requirement of the system which is being studied.
Although there are several ways to define the cost function, we opt for the most convenient one which is the energy expectation value of the problem Hamiltonian calculated for the trial wave function, where ψ(γ, β, α) represents the approximate trial state produced by the digitized CD ansatz. The efficiency of our algorithm can be measured in terms of the approximation ratio, given by, where E 0 is the ground state energy of the system. Classical optimization techniques are an integral part of variational algorithms, which helps to find the optimal parameters that minimize the cost function. Our work mainly considers two optimization techniques, namely Momentum Optimizer and Adagrad Optimizer, which are specific examples of stochastic gradient descent (SGD) algorithms. Momentum Optimizer is a variant of SGD in which a momentum term is added along with the gradient descent. The prime purpose of the momentum term is to increase the parameter update rate when gradients are in the same direction and decrease the update rate when gradients point in a different direction [66]. On the other hand, Adagrad Optimizer's main purpose is to change the update rate based on the past descent results [67]. Adagrad has shown great improvements in the robustness of SGD [68]. These two classical optimization techniques work pretty well for the cases we consider. This is because these optimization routines have proven faster convergence than gradient descent. Moreover, some of the cases we study involve a large Hilbert space, which may lead to local minima in the energy landscape. In the presence of steep gradients, the use of these techniques proves beneficial. This problem dependence of the performance is shared with other optimization routines such as Nesterov Momentum, Adam, and AdaMax. An overview and comparison about challenges faced by the different types of gradient descent optimization, can be found in Ref. [69].

III. ISING SPIN MODELS
1D quantum Ising spin chains are the manifestation of the simplest many-body systems that are widely studied in existing quantum processors. Numerous computational problems can be mapped to finding the ground state of the Ising-like Hamiltonians, which makes it suitable for benchmarking various quantum algorithms. The general form of the Hamiltonian of 1D Ising spin model is given by, where σ δ i denotes the Pauli matrices at the ith site, and < i, j > corresponds to the nearest-neighbor interaction with strength J i j . The on-site interaction terms h i and k i represent the longitudinal and transverse fields, respectively. We consider the periodic boundary conditions so that our model describes a ring of interacting spins [70][71][72]. Note that three special cases can be retrieved from Eq. (8): i) longitudinal field Ising model (LFIM) when k i = 0, ii) transverse field Ising model (TFIM) when h i = 0, and iii) a special case when both k i = 0 and h i = 0, for which the resulting ground state of H prob is the highly entangled Greenberger-Horne-Zeilinger (GHZ) state [73][74][75][76]. For simplicity, we choose the system to be homogeneous i.e., J i j = J as well as h i = h z and k i = h x . To prepare an equal superposition of the qubits, as a input of the circuit ansatz, the mixing Hamiltonian is chosen as H mixer = i σ x i . To implement DC-QAOA, as mentioned in Sec. II, along with the problem and mixer Hamiltonian, we include the CD term to define the circuit ansatz. The CD operator is chosen heuristically from the operator pool A. For instance, in the case of LFIM, the ground state is ferromagnetic and constitutes a large energy gap with the first excited state for the chosen interaction strengths. In such cases, the local driving term A t = i σ y i can produce the ground state. On the other hand, the ground state of TFIM is closely spaced with the nearby excited states, which makes the local driving term insufficient. Similarly, the local driving term is also not suitable for GHZ state [61]. Instead, the second-order term A t = i σ z i σ y i+1 is more likely to produce a better result. The unitary operator that represents the CD part of the circuit ansatz is given by, where A q t represents the respective q-local CD operator chosen from the CD pool A. For instance, if q = 1 then A q t = {A t } j and if q = 2, then A q t = {A t } j, j+1 . The circuit is designed using the gate model of quantum computing whereas the classical optimization is the stochastic gradient descent method. Fig. 2 depicts the improvement obtained by DC-QAOA over traditional QAOA. In the simulation, we study a 12-qubit system, for which we compute R for different p values. For LFIM, as shown in Fig. 2a, R = 1 even for p = 1 with DC-QAOA which constitutes considerable improvement over QAOA, that requires p = 3 to achieve unit R. Hence, for Fig. 2a, the number of variational parameters required to achieve unit R = 1 for DC-QAOA is 3p = 3 whereas for QAOA it is 2p = 6. We also see that, for a lower number of layers i.e., p = 1, 2, 3, DC-QAOA converges faster to the unit R compared to QAOA. Furthermore, while DC-QAOA shows better convergence at lower depths, for the TFIM and the GHZ states, the exact ground state can only be achieved with p ≥ L/2 layers. This effect can be attributed to the Lieb-Robinson bound [77,78] which forces the circuits for TFIM and GHZ state to scale linearly with the system size in order to achieve unit R.
To compare the resource requirements, both classical and quantum, one can inspect two crucial elements of these methods. In the case of systems with nearest-neighbor interactions, the increase in circuit depth per layer by adding a CD term will be constant, and it depends on the CD term chosen. The circuit depth can be quantified as d × p, and the CD driving increases it to (d + d cd ) × p, where d cd represents the increment in depth per layer. For LFIM, the CD term σ y gives d cd = 1, whereas for TFIM d cd = 4 for σ z σ y . On the other hand, the increase in parameter space due to the CD term is always from 2p to 3p, making DC-QAOA advantageous specifically for low p values. In the limit of large p, the performance of QAOA and DC-QAOA becomes comparable for fixed system size.

IV. CLASSICAL OPTIMIZATION PROBLEMS
Thus far, we have discussed the applications of DC-QAOA for finding the ground state of the Ising model and preparing entangled states. Combinatorial optimization problems are another set of problems that can be encoded in the ground state of a quantum Hamiltonian, diagonal in the computational basis. Here we discuss the application of DC-QAOA for solving combinatorial optimization problems, where the main objective is to find the optimal solution for a given classical cost function. MaxCut is one fundamental combinatorial optimization problem that has been solved using QAOA.
For the MaxCut problem, let us consider a graph G = (V, E), where V and E being the vertex set and edge set respectively. We consider a classical cost function C(z) defined on binary strings z = (z 1 , z 2 , . . . , z n ), and aim at separating the vertices into two sets so that the number of edges cut by C(z) is maximized. This maximizes the classical cost function where w i j represents the edge weight between vertices i and j.
Depending on the sets that the vertices of each edge are in after the cut, binary values (either 0 or 1) are assigned to variables z i and z j corresponding to respective vertices. This situation can be encoded in the ground state of the problem Hamiltonian by mapping the binary variables to Pauli operators Note that Eq. (11) also belongs to the Ising class and is equivalent to Eq. (8) for GHZ states if only nearest neighbor interaction is considered, which is the case of the 2-regular MaxCut.
Here, to verify the performance of our algorithm, we consider unweighted (w i j = J i j = 1) 3-regular MaxCut problem, with each vertex connected to three other vertices. The CD operator pool can be obtained from the NC expansion and is given by A = {σ z σ y , σ y σ z }. In Fig. 3 (a), the approximation ratio R for different graph sizes with up to 14 vertices (qubits) are shown for a single layer (p = 1). We notice that for small graph sizes, say 4 qubits), DC-QAOA is superior as it reaches unit R. However, for a bigger graph R decreases gradually while exceeding the performance of QAOA. Although this can be improved for p > 1 but for large depth DC-QAOA, the number of parameters for each step scales as 3p, so the landscape of the cost function most likely has a complicated form, and we expect to see the problem of vanishing gradients (Barren plateau). A detailed analysis is needed for p > 1 DC-QAOA, which we leave for future work. Interestingly, if J i j is chosen as random all-to-all two-body interactions, Eq. (11) represents the so-called Sherrington-Kirkpatrick (SK) model. SK model is a classical spin model proposed by Sherrington and Kirkpatrick [79,80] where J i j are interaction terms such that J = {∀J i j } has zero mean and unit variance. For instance, they can be randomly chosen from the set J = {−1, 1} with probability 1/2. The SK model is interesting for DC-QAOA as it can be studied as a combinatorial search problem on a complete graph. QAOA on the SK model has been extensively studied recently [81,82]. Here, ten different instances of J i j values are considered in a system of L = 6 spins. Note that the couplings J i j are nonuniform and the CD term depends on the choice of J i j . As this model involves similar interactions to that in the Max-Cut problem, we chose the CD term from the same operator pool, A = {σ z σ y , σ y σ z }, calculated from the nested commutator ansatz. In fact, the CD-term chosen for the SK model is A t = J i j σ z i σ y j , where the operators are applied to all the sites due to its all-to-all connectivity.
In Fig. 3b, the approximation ratio (R) is shown with respect to a varying number of layers (p). We observe that R is higher for DC-QAOA as compared to QAOA and that as the number of layers increases DC-QAOA and QAOA start to converge to the same value. This shows that DC-QAOA is efficient for instances where the circuit ansatz is low-layered. In fact, for low layers, although not giving the exact ground state DC-QAOA gives significantly enhanced R. This could be advantageous as we can find optimal parameters which could be used as initial parameters for high-layered QAOA.
V. P-SPIN MODEL As a final benchmark, we consider the P-spin model, which is a long-range exactly-solvable fully-connected model [83][84][85][86]. The system Hamiltonian reads While the ground-state of Hamiltonian (12) is trivial, the presence of a quantum phase transition makes its preparation challenging by quantum annealing [83]. For P = 2 this Hamiltonian exhibits a second-order phase transition whereas a first-order phase transition occurs for P ≥ 3, closing the energy gap exponentially with increasing system size. This has motivated proposals to change the first-order phase into second-order phase transition by making the Hamiltonian non-stoquastic [87,88]. The nature of the ground state also depends on P. For odd P the ground state is non-degenerate, while for even P it has a two-fold degeneracy with Z 2 symmetry, which makes the choice of the CD operator difficult [57]. We study DC-QAOA in a 6 qubit P-spin model for the nontrivial case of h 0 using local CD operator A t = i σ y i . QAOA and DC-QAOA are compared for three different cases: P = 3 , h = 1 and P = 4, h = {0, 1} respectively. Fig. 4 and Fig. 5 shows the advantage obtained by DC-QAOA for P = 4 and P = 3 respectively. For P = 4, R as a function of number of iterations is shown for p = 1 for 10 random parameter initialization. We observe that for a finite number of iterations, DC-QAOA shows higher R values as compared to QAOA for both h = 0 and h = 1. It is evident that, in the case of h = 0, QAOA is highly dependent on the choice of initial parameters and lands into local minima in some instances. By contrast, DC-QAOA shows unit R for every instance. For P = 3, h = 1, R is shown as a function of number of layers (p = 1, 2, 3) for 10 random initial parameters. As expected, for DC-QAOA, R values end up close to unity even for p = 1 and R values increase as the number of layers increases. However, this is not surprising for P = 3 as the ground state is a product state making it favorable for the local CD operator. The more intriguing case is in Fig. 4b, where the approximation ratio reaches close to unity for p = 1 even when the ground state is degenerate. This occurs simply because the trial state converges to a particular one of the two due to the local CD driving. This is in  contrast with QAOA, which does not achieve the target state for p = 1 in any case.

VI. DISCUSSION AND CONCLUSION
We have introduced a quantum algorithm leveraging the strengths of shortcuts to adiabaticity for quantum approximate optimization algorithms. Specifically, we have formulated a variant of QAOA using CD driving, called DC-QAOA, and established its enhanced performance over QAOA in finding ground states of different models. We benchmark our algo-rithm by considering various examples, starting with Ising spin models, preparing entangled states, classical optimization problems like MaxCut and SK model and, the P-spin model. Including the CD term to the circuit ansatz, the performance of the QAOA algorithm is enhanced. Results reveal that for low-layered circuits, DC-QAOA converges to the ground state faster than state-of-the-art QAOA. Thus, adding a new free parameter in the form of a gate chosen from a predefined set (CD term) increases the performance of the algorithm for shorter circuit depths. Thus, DC-QAOA turns out to be a preferable algorithm for circuits of shorter depth.
In conclusion, DC-QAOA outperforms QAOA for all the models we have studied. For high-depth circuits, DC-QAOA can be applied for initial layers only to enhance the performance of the standard QAOA. An interesting prospect would be to use the resulting optimal parameters from low depth DC-QAOA as the initial parameters of a high-depth QAOA in order to obtain the minima of the cost function efficiently. Our work shows that implementing principles of shortcuts-toadiabaticity to enhance quantum algorithms has both fundamental and practical importance. The experimental realization of DC-QAOA on real hardware offers an exciting prospect for further progress.
Note: As we finished this work, we learned about the recent preprint devoted to QAOA assisted by CD [89].