Recursive greedy initialization of the quantum approximate optimization algorithm with guaranteed improvement

The quantum approximate optimization algorithm (QAOA) is a variational quantum algorithm, where a quantum computer implements a variational ansatz consisting of $p$ layers of alternating unitary operators and a classical computer is used to optimize the variational parameters. For a random initialization, the optimization typically leads to local minima with poor performance, motivating the search for initialization strategies of QAOA variational parameters. Although numerous heuristic initializations exist, an analytical understanding and performance guarantees for large $p$ remain elusive. We introduce a greedy initialization of QAOA which guarantees improving performance with an increasing number of layers. Our main result is an analytic construction of $2p+1$ transition states - saddle points with a unique negative curvature direction - for QAOA with $p+1$ layers that use the local minimum of QAOA with $p$ layers. Transition states connect to new local minima, which are guaranteed to lower the energy compared to the minimum found for $p$ layers. We use the GREEDY procedure to navigate the number of local minima, which increases exponentially with $p$ under the recursive application of our analytic construction. The performance of the GREEDY procedure matches available initialization strategies while providing a guarantee for the minimal energy to decrease with an increasing number of layers $p$.


I. INTRODUCTION
The quantum approximate optimization algorithm (QAOA) [1] is a prospective near-term quantum algorithm for solving hard combinatorial optimization problems on Noisy Intermediate-Scale Quantum (NISQ) [2] devices. In this algorithm, the quantum computer is used to prepare a variational wave function that is updated in an iterative feedback loop with a classical computer to minimize a cost function (the energy expectation value), which encodes the computational problem. A common bottleneck of the QAOA is the convergence of the optimization procedure to one of the many low-quality local minima, whose number increases exponentially with the QAOA circuit depth $p$ [3,4].
Much effort has been devoted to finding good initialization strategies to prevent convergence to such low-quality local minima. Researchers have proposed to first solve a relaxed classical optimization problem and use its solution as an initial guess [5], to use machine learning to infer patterns in the optimal parameters [6], to interpolate optimal parameters between different circuit depths [3], or to exploit the parallels between the QAOA and quantum annealing [4]. Recently, the success of the interpolation strategies that appeal to annealing was attributed to the ability of the QAOA to effectively speed up adiabatic evolution via the so-called counterdiabatic mechanism [7]. This result was used to explain the cost function concentration for typical instances and the concentration of optimal, typically smoothly varying, parameters, which were previously introduced in Refs. [8] and [9], respectively.
Despite this progress, all proposed initialization strategies remain heuristic or physically motivated at best, and our understanding of the QAOA optimization remains limited. One of the main puzzles is the exponential improvement of the QAOA performance with circuit depth $p$, observed numerically [3,10]. Here we propose an analytic approach that relates QAOA properties at circuit depths $p$ and $p+1$. The recursive application of our result leads to a QAOA initialization scheme that guarantees improvement of performance with $p$.
Our analytic approach relies on the consideration of stationary points of the QAOA cost function beyond local minima. Inspired by the theory of energy landscapes [11], we focus on stationary configurations with a unique unstable direction, known as transition states (TS). We show that $2p+1$ distinct TS can be constructed analytically for a QAOA at circuit depth $p+1$ (denoted as QAOA$_{p+1}$) from minima at circuit depth $p$. All these TS for QAOA$_{p+1}$ exhibit the same energy as the QAOA$_p$ minimum from which they are constructed, thus providing a good initialization for QAOA$_{p+1}$. Descending in the negative curvature direction connects each of the $2p+1$ TS to two local minima of QAOA$_{p+1}$, which are thus guaranteed to exhibit lower energy than the initial minimum of QAOA$_p$. Iterating this procedure leads to an exponentially increasing (in $p$) number of local minima which are guaranteed to have a lower energy at circuit depth $p+1$ than at $p$ [12]. We visualize this hierarchy of minima and their connections in a graph and propose a Greedy approach to explore its structure. We numerically show that optimal parameters at every circuit depth $p$ are smooth (i.e., the variational parameters change only slowly between circuit layers) and directly connect to a smooth parameter solution at $p+1$ through the TS. Our results explain existing QAOA initializations and establish a recursive analytic approach to study QAOA.
The rest of the paper is organized as follows. In Section II we review the QAOA, present newly found symmetries, and introduce the analytical construction of TS. In Section III we show how TS can be used as an initialization to systematically explore the QAOA optimization landscape. From this, we introduce a new heuristic method, dubbed Greedy, for exploring the landscape and provide a comparison to popular optimization strategies. Finally, in Section IV we discuss our results and potential future extensions of our work. Appendices A-F present detailed proofs of our analytical results, as well as supporting numerical simulations.

II. QAOA OPTIMIZATION LANDSCAPE

A. MaxCut problem on random regular graphs
The QAOA was originally proposed for a graph partitioning problem known as finding the maximal cut (MaxCut) [1] and has also been applied to a variety of other optimization problems [13][14][15]. MaxCut seeks a partition of the vertices of a given undirected graph $G$ into two groups such that the number of edges that connect vertices from different groups is maximized. Finding the solution of MaxCut for a graph with $n$ vertices is equivalent to finding a ground state of the $n$-qubit classical Hamiltonian $H_C = \sum_{\langle i,j\rangle \in E} \sigma^z_i \sigma^z_j$, with the sum running over the set of graph edges $E$ and $\sigma^z_i$ being the Pauli-Z matrix acting on the $i$-th qubit.
The depth-$p$ QAOA algorithm [1] minimizes the expectation value of the classical Hamiltonian over the variational state $|\beta, \gamma\rangle$ with angles $\beta = (\beta_1, \ldots, \beta_p)$ and $\gamma = (\gamma_1, \ldots, \gamma_p)$ shown in Fig. 1(a):
$$ |\beta, \gamma\rangle = \prod_{j=1}^{p} e^{-i\beta_j H_B}\, e^{-i\gamma_j H_C}\, |+\rangle^{\otimes n}, $$
where layers with larger $j$ act later (stand further to the left). Here $H_B = -\sum_{i=1}^{n} \sigma^x_i$ is the mixing Hamiltonian and the circuit depth $p$ controls the number of applications of the classical and mixing Hamiltonians. The initial product state $|+\rangle^{\otimes n}$, where all qubits point in the $x$-direction, is an equal superposition of all possible graph partitions and is also the ground state of $H_B$.
Finding the minimum of $E(\beta, \gamma) = \langle \beta, \gamma| H_C |\beta, \gamma\rangle$ over the angles $(\beta_1, \ldots, \beta_p)$ and $(\gamma_1, \ldots, \gamma_p)$, which form a set of $2p$ variational parameters $(\beta, \gamma)$, yields the desired approximation to the ground state of $H_C$, equivalent to an approximate solution of MaxCut. The scalar function $E(\beta, \gamma)$ thus defines a $2p$-dimensional energy landscape in which the QAOA seeks the best minimum. The performance of the QAOA is typically reported in terms of the approximation ratio $r_{\beta,\gamma} = E(\beta, \gamma)/C_{\min}$, where $C_{\min}$ is the cost function value of the maximum cut, i.e., the ground-state energy of $H_C$. Symmetries of the QAOA ansatz, when restricted to graphs with only odd connectivity such as the random 3-regular graphs (RRG3) used in this work, restrict the parameter range to the following fundamental region:
$$ \beta_i \in \left[-\tfrac{\pi}{4}, \tfrac{\pi}{4}\right]; \quad \gamma_1 \in \left(0, \tfrac{\pi}{4}\right); \quad \gamma_j \in \left[-\tfrac{\pi}{4}, \tfrac{\pi}{4}\right], $$
with $i \in [1, p]$ and $j \in [2, p]$. Note that the fundamental region presented above is smaller than what has been previously reported [3,16]; see Appendix A for details.
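To make the definition of the energy landscape concrete, the following is a minimal sketch of a brute-force state-vector evaluation of $E(\beta, \gamma)$ and of the approximation ratio. It is an illustration only, not the simulation code used for the results in this paper. The function names `maxcut_diagonal`, `qaoa_energy`, and `approximation_ratio` are hypothetical, and the complete graph on four vertices (the smallest 3-regular graph) is an assumed test instance.

```python
import cmath
import math

def maxcut_diagonal(n, edges):
    """Diagonal of H_C = sum_{<i,j> in E} Z_i Z_j in the computational basis."""
    return [sum(1 - 2 * (((z >> i) ^ (z >> j)) & 1) for i, j in edges)
            for z in range(1 << n)]

def qaoa_energy(betas, gammas, n, edges):
    """E(beta, gamma) for the depth-p ansatz, by explicit state-vector simulation."""
    diag = maxcut_diagonal(n, edges)
    dim = 1 << n
    psi = [complex(1.0 / math.sqrt(dim))] * dim       # |+>^{(x) n}
    for beta, gamma in zip(betas, gammas):
        # exp(-i gamma H_C) is diagonal in the computational basis
        psi = [cmath.exp(-1j * gamma * d) * a for d, a in zip(diag, psi)]
        # H_B = -sum_i X_i, so exp(-i beta H_B) factorizes into exp(+i beta X_q)
        c, s = math.cos(beta), 1j * math.sin(beta)
        for q in range(n):
            bit = 1 << q
            for z in range(dim):
                if not z & bit:
                    a0, a1 = psi[z], psi[z | bit]
                    psi[z], psi[z | bit] = c * a0 + s * a1, s * a0 + c * a1
    return sum(d * abs(a) ** 2 for d, a in zip(diag, psi))

def approximation_ratio(betas, gammas, n, edges):
    """r = E / C_min, with C_min the smallest diagonal entry of H_C."""
    return qaoa_energy(betas, gammas, n, edges) / min(maxcut_diagonal(n, edges))

# Assumed test instance: K4, the smallest 3-regular graph
K4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
```

The cost of this sketch grows as $2^n$, so it is only meant for small illustrative instances; at vanishing angles the state stays $|+\rangle^{\otimes n}$ and the energy is zero.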

B. Energy minima and transition states
Previous studies of the QAOA landscape were restricted to local minima of the cost function $E(\beta, \gamma)$, since they can be directly obtained using standard gradient-based or gradient-free optimization routines. Local minima are stationary points of the energy landscape (defined by $\partial_i E(\beta, \gamma) = 0$ for derivatives running over all $i = 1, \ldots, 2p$ variational angles) where all eigenvalues of the Hessian matrix $H_{ij} = \partial_i \partial_j E(\beta, \gamma)$ are positive, that is, the Hessian at the local minimum is positive-definite. However, the study of energy landscapes [11] of chemical reactions and molecular dynamics has shown that TS, which correspond to stationary points with a single negative eigenvalue of the Hessian matrix (index-1), also play an important role [12]. There, TS are particularly relevant as they correspond to the highest-energy configurations along a reaction pathway. They often serve as bottlenecks in the reaction process and thus are crucial for understanding reaction rates, designing catalysts, and predicting chemical behavior. By studying the role of transition states in the QAOA landscape, we aim to uncover insights that could lead to improved optimization strategies or better convergence properties of the algorithm. This motivates the construction of TS achieved below.
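The distinction between local minima (index 0) and transition states (index 1) can be illustrated on a two-parameter landscape, which is the situation for QAOA$_1$. The sketch below is an assumption-laden toy: the helper names `hessian_2d` and `hessian_index_2d` are hypothetical, the Hessian is estimated by central finite differences, and the test landscapes are simple quadratic forms rather than a QAOA cost function.

```python
import math

def hessian_2d(f, x, y, h=1e-4):
    """Central finite-difference estimate of (f_xx, f_xy, f_yy) at (x, y)."""
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h ** 2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h ** 2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h ** 2)
    return fxx, fxy, fyy

def hessian_index_2d(f, x, y, h=1e-4, tol=1e-6):
    """Number of negative Hessian eigenvalues: 0 = minimum, 1 = transition state."""
    a, b, d = hessian_2d(f, x, y, h)
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)      # closed-form 2x2 eigenvalues
    return sum(1 for lam in (mean - radius, mean + radius) if lam < -tol)

# Toy landscapes with a stationary point at the origin (not QAOA energies)
bowl = lambda x, y: x ** 2 + 0.5 * y ** 2      # local minimum
saddle = lambda x, y: x ** 2 + 3 * x * y       # one unstable direction
```

For more than two parameters the same classification requires a full symmetric eigenvalue solver; the closed-form expression used here only applies to $2 \times 2$ matrices.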

C. Analytic construction of transition states
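The structure of the QAOA variational ansatz allows us to analytically construct the TS of QAOA$_{p+1}$ using any local minimum of QAOA$_p$:
Theorem 1 (TS construction, simplified version). Assume that we found a local minimum of QAOA$_p$ denoted as
$$ \Gamma^{p}_{\min} = (\beta^\star_1, \ldots, \beta^\star_p, \gamma^\star_1, \ldots, \gamma^\star_p). $$
Padding the vector of variational angles with zeros at positions $i$ and $j$ results in
$$ \Gamma^{p+1}_{\rm TS}(i, j) = (\beta^\star_1, \ldots, \beta^\star_{i-1}, 0, \beta^\star_i, \ldots, \beta^\star_p, \gamma^\star_1, \ldots, \gamma^\star_{j-1}, 0, \gamma^\star_j, \ldots, \gamma^\star_p) \tag{3} $$
being a TS for QAOA$_{p+1}$ when $j = i$ or $j = i + 1$ for all $i \in [1, p]$, and also for $i = j = p + 1$.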
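Proof. The argument consists of two steps. First, by relating the first derivatives over the newly introduced parameters to derivatives over the existing angles, we show that Eq. (3) is a stationary point of QAOA$_{p+1}$. More specifically, we observe that the gradient components where the zero insertion is made satisfy the following relations:
$$ \partial_{\beta_l} E\big|_{\Gamma^{p+1}_{\rm TS}(l,l)} = \partial_{\beta_{l-1}} E\big|_{\Gamma^{p}_{\min}}, \qquad \partial_{\gamma_l} E\big|_{\Gamma^{p+1}_{\rm TS}(l,l)} = \partial_{\gamma_l} E\big|_{\Gamma^{p}_{\min}}, $$
$$ \partial_{\beta_l} E\big|_{\Gamma^{p+1}_{\rm TS}(l,l+1)} = \partial_{\beta_l} E\big|_{\Gamma^{p}_{\min}}, \qquad \partial_{\gamma_{l+1}} E\big|_{\Gamma^{p+1}_{\rm TS}(l,l+1)} = \partial_{\gamma_l} E\big|_{\Gamma^{p}_{\min}}. $$
The two boundary components, $\partial_{\beta_1} E$ at $\Gamma^{p+1}_{\rm TS}(1,1)$ and $\partial_{\gamma_{p+1}} E$ at $\Gamma^{p+1}_{\rm TS}(p+1,p+1)$, have no counterpart in QAOA$_p$ and vanish separately, because $|+\rangle^{\otimes n}$ is an eigenstate of $H_B$ and $H_C$ commutes with itself. Since $\nabla E(\beta, \gamma)|_{\Gamma^{p}_{\min}} = 0$, it directly follows that the TS constructed using Theorem 1 are also stationary points.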
In the second step, we show that the Hessian at the TS has a single negative eigenvalue. To this end, in Appendix B we show that we can always write the Hessian at the TS in the following form:
$$ H[\Gamma^{p+1}_{\rm TS}(l, k)] = \begin{pmatrix} H(\Gamma^{p}_{\min}) & v(l, k) \\ v^{T}(l, k) & h(l, k) \end{pmatrix}, $$
where $H(\Gamma^{p}_{\min}) \in \mathbb{R}^{2p \times 2p}$, $v(l, k) \in \mathbb{R}^{2p \times 2}$, and $h(l, k) \in \mathbb{R}^{2 \times 2}$. Here, the largest block $H(\Gamma^{p}_{\min})$ corresponds to the old Hessian at the stationary point. The matrix $h(l, k)$ corresponds to the second derivatives of the energy with respect to the new parameters that are initially set to zero, whereas the matrix $v(l, k)$ represents the "mixing" terms, with one derivative taken over the old parameters and the second derivative over one of the new parameters, which are initialized at zero. By employing this representation of the Hessian at the TS, we utilize the eigenvalue interlacing theorem (Ref. [17], Theorem 4 on page 117, summarized in Theorem 3) to establish that $H[\Gamma^{p+1}_{\rm TS}(l, k)]$ has at most two negative eigenvalues. Subsequently, we prove that the determinant of $H[\Gamma^{p+1}_{\rm TS}(l, k)]$ is negative for each of the $2p + 1$ transition states, which implies the presence of only one negative eigenvalue (i.e., the index-1 direction). It is important to note that this result is independent of the choice of classical Hamiltonian, which is fixed to encode MaxCut in this work.
The simplified theorem above ignores the possibility of vanishing eigenvalues of the Hessian, which can be ruled out only on physical grounds. This issue and the complete proof of the theorem are discussed in Appendix B.

[Figure 2 caption] For each local minimum of QAOA$_p$ we generate $p + 1$ TS for QAOA$_{p+1}$, find the corresponding minima as in Fig. 1(b), and show them on the plot connected by an edge to the original minimum of QAOA$_p$. The position along the vertical axis quantifies the performance of QAOA via the approximation ratio, and points are displaced on the horizontal axis for clarity. Color encodes the depth of the QAOA circuit, and large symbols along with the red dashed line indicate the path taken by the Greedy procedure, which keeps the best minimum for any given $p$, resulting in an exponential improvement of the performance with $p$. The Greedy minimum coincides with an estimate of the global minimum for $p = 6$ (dashed line) obtained by choosing the best minimum from $2^p$ initializations on a regular grid.

III. FROM TRANSITION STATES TO QAOA INITIALIZATION

A. Initialization graph
For each local minimum of QAOA$_p$, Theorem 1 provides $p + 1$ symmetric TS where zeros are padded at the same position, $i = j$, as in Fig. 1(a), and additionally $p$ non-symmetric TS with $j = i + 1$, where zeros are padded in adjacent layers of the QAOA circuit. Fig. 1(b) shows how one can descend from a given TS along the positive and negative index-1 direction, finding two new local minima of QAOA$_{p+1}$ with lower energy. Thus Theorem 1 provides us with a powerful tool to systematically explore the local minima of the QAOA in a recursive fashion.
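The zero-padding of Theorem 1 is simple enough to spell out in a few lines. The sketch below enumerates the $p + 1$ symmetric and $p$ non-symmetric padded points for given angle vectors; the function name `transition_states` and the tuple layout of its output are hypothetical choices made for illustration.

```python
def transition_states(betas, gammas):
    """Enumerate the 2p+1 zero-padded points of Theorem 1.

    Returns tuples (i, j, new_betas, new_gammas) with 1-based positions i, j,
    where a zero is inserted as the i-th beta and as the j-th gamma.
    """
    p = len(betas)
    points = []
    for i in range(1, p + 2):
        for j in (i, i + 1):
            if j > p + 1:
                continue                      # (p+1, p+2) is not a valid position
            new_betas = list(betas[:i - 1]) + [0.0] + list(betas[i - 1:])
            new_gammas = list(gammas[:j - 1]) + [0.0] + list(gammas[j - 1:])
            points.append((i, j, new_betas, new_gammas))
    return points

# Example with placeholder angles for p = 2 (not an actual QAOA minimum)
example = transition_states([0.4, 0.2], [0.1, 0.3])
symmetric = [pt for pt in example if pt[0] == pt[1]]
```

Whether a padded point is a TS depends on the input being a genuine local minimum of QAOA$_p$; the function itself only performs the padding.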
Such exploration of the QAOA initializations for a particular graph with $n = 10$ vertices is summarized in Fig. 2. We find a unique minimum for QAOA$_1$ using a grid search (see Appendix E) in the fundamental region defined in Eq. (E1). From this minimum we construct two symmetric TS according to Eq. (3) and descend from these TS in the index-1 directions with the Broyden-Fletcher-Goldfarb-Shanno (BFGS) [18][19][20][21] algorithm, finding two new local minima of QAOA$_2$. These minima are connected to the minimum of QAOA$_1$, since it was used to construct the TS. Repeating this procedure recursively for each of the $p + 1$ symmetric TS [22], we obtain the tree in Fig. 2. Assuming that all minima found in this way from symmetric TS are unique, their number would increase as $2^{p-1} p!$. Numerically, we observe that the number of unique minima is much smaller than this naïve counting, increasing approximately exponentially with $p$.
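The naïve counting follows from a one-line recursion: every minimum at depth $p$ yields $p + 1$ symmetric TS, each connected to two minima at depth $p + 1$, so the count is multiplied by $2(p + 1)$ per added layer, starting from a single minimum at $p = 1$. A short check of the closed form, with the hypothetical helper name `naive_minima_count`:

```python
import math

def naive_minima_count(p):
    """Upper bound on the number of minima reached from symmetric TS at depth p."""
    count = 1                                  # unique minimum assumed at p = 1
    for depth in range(1, p):
        count *= 2 * (depth + 1)               # (depth + 1) TS, two minima each
    return count

counts = [naive_minima_count(p) for p in range(1, 7)]
closed_form = [2 ** (p - 1) * math.factorial(p) for p in range(1, 7)]
```

This bound assumes that no two descents land in the same minimum, which is the assumption the numerics show to be far from satisfied.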

B. Greedy maneuvering through the graph
The exponential growth of the number of minima with the QAOA depth $p$ makes the naïve construction and exploration of the full graph a challenging task. To deal with the rapidly growing number of minima we introduce:
Corollary 1.1 (Greedy recursive strategy). Using the lowest-energy minimum that is found for QAOA depth $p$, we generate $2p + 1$ transition states (TS) for QAOA$_{p+1}$. Each transition state corresponds to the same state in the Hilbert space as the initial local minimum, so the energy of all the transition states is the same and equal to the energy of the initial local minimum. We then optimize the QAOA parameters starting from each of these transition states and select the best new local minimum of QAOA$_{p+1}$ to iterate this procedure. This Greedy recursive strategy is guaranteed to lower the energy at every step.
Proof. Let the initial local minimum at QAOA depth $p$ have energy $E_p$. Since all the $2p + 1$ transition states are generated from this minimum and have the same energy $E_p$, when we optimize the QAOA parameters for QAOA$_{p+1}$ starting from these transition states, all the converged local minima will have energy less than or equal to $E_p$. As a result, the energy can either decrease or stay the same (provided that the curvature vanishes, which we do not expect on physical grounds, see Appendix B), but it cannot increase. Therefore, the Greedy recursive strategy is guaranteed to lower or maintain the energy at every step.
The Greedy path that is taken by this strategy in the initialization graph is shown in Fig. 2 as a red dashed line. We can see that this heuristic allows one to maneuver the increasingly complex graph with its numerous local minima very effectively and to find the global minimum for circuit depths up to $p = 7$. A detailed description of the algorithm is presented in Appendix E.
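The structure of the Greedy recursion can be sketched on a toy instance. The code below is not the algorithm of Appendix E: it assumes the complete graph on four vertices as the problem instance, replaces BFGS by a plain backtracking gradient descent with finite-difference gradients, and, instead of computing the index-1 eigenvector, leaves each TS by small kicks of the two newly inserted angles (motivated by the observation that the index-1 direction has dominant weight on those angles). All names (`energy`, `descend`, `pad`, `greedy_step`) and the values of `eps`, `steps`, and the grid are hypothetical.

```python
import cmath
import math

N = 4                                                     # assumed instance: K4
EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
DIAG = [sum(1 - 2 * (((z >> i) ^ (z >> j)) & 1) for i, j in EDGES)
        for z in range(1 << N)]

def energy(theta):
    """E for theta = [beta_1..beta_p, gamma_1..gamma_p], by state-vector simulation."""
    p = len(theta) // 2
    psi = [complex(1.0 / math.sqrt(len(DIAG)))] * len(DIAG)
    for beta, gamma in zip(theta[:p], theta[p:]):
        psi = [cmath.exp(-1j * gamma * d) * a for d, a in zip(DIAG, psi)]
        c, s = math.cos(beta), 1j * math.sin(beta)
        for q in range(N):
            for z in range(len(DIAG)):
                if not z & (1 << q):
                    a0, a1 = psi[z], psi[z | (1 << q)]
                    psi[z], psi[z | (1 << q)] = c * a0 + s * a1, s * a0 + c * a1
    return sum(d * abs(a) ** 2 for d, a in zip(DIAG, psi))

def descend(theta, steps=40, h=1e-5):
    """Backtracking gradient descent; a step is accepted only if it lowers E."""
    theta, e = list(theta), energy(theta)
    for _ in range(steps):
        grad = []
        for k in range(len(theta)):
            up, dn = theta[:], theta[:]
            up[k] += h
            dn[k] -= h
            grad.append((energy(up) - energy(dn)) / (2 * h))
        lr = 0.5
        while lr > 1e-6:
            trial = [t - lr * g for t, g in zip(theta, grad)]
            e_trial = energy(trial)
            if e_trial < e:
                theta, e = trial, e_trial
                break
            lr *= 0.5
        else:
            break                                         # no improving step found
    return theta, e

def pad(theta, i, j):
    """Insert beta = 0 at layer i and gamma = 0 at layer j (1-based), as in Theorem 1."""
    p = len(theta) // 2
    b, g = list(theta[:p]), list(theta[p:])
    return b[:i - 1] + [0.0] + b[i - 1:] + g[:j - 1] + [0.0] + g[j - 1:]

def greedy_step(theta, eps=0.05):
    """One Greedy iteration p -> p+1; the padded point is kept if nothing improves."""
    p = len(theta) // 2
    best, best_e = pad(theta, p + 1, p + 1), energy(theta)
    for i in range(1, p + 2):
        for j in (i, i + 1):
            if j > p + 1:
                continue
            ts = pad(theta, i, j)
            for kick_b in (eps, -eps):                    # leave the saddle by kicking
                for kick_g in (eps, -eps):                # the two new angles
                    start = ts[:]
                    start[i - 1] += kick_b
                    start[p + 1 + j - 1] += kick_g
                    cand, e = descend(start)
                    if e < best_e:
                        best, best_e = cand, e
    return best, best_e

# p = 1: coarse grid in the fundamental region, followed by local descent
grid = [[-math.pi / 4 + math.pi / 2 * a / 10, math.pi / 4 * (g + 1) / 11]
        for a in range(11) for g in range(10)]
theta, e = descend(min(grid, key=energy))
energies = [e]
for _ in range(2):                                        # depths p = 2 and p = 3
    theta, e = greedy_step(theta)
    energies.append(e)
```

Because every candidate is compared against the energy of the padded point, the list `energies` is non-increasing by construction, which mirrors the guarantee of Corollary 1.1. The limited number of descent steps means the returned points are only approximately converged.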
To systematically explore how Greedy maneuvers the initialization graph, we compare it to two initialization strategies proposed in the literature. The so-called Interp approach [3] interpolates the optimal parameters found for circuit depth $p$ to $p + 1$ and uses the result as the subsequent initialization. This procedure creates a smooth parameter pattern that mimics an annealing schedule. Numerical studies demonstrated that Interp has the same performance as the best out of $2^p$ random initializations. The second method that we use for comparison is the Trotterized quantum annealing (TQA) method [4], which initializes QAOA$_p$ using $\gamma_j = (1 - \frac{j}{p})\Delta t$ and $\beta_j = \frac{j}{p}\Delta t$.
The step size $\Delta t$ is a free parameter determined in a pre-optimization step. The TQA has similar performance to Interp at moderate circuit depths, while notably having a lower computational cost. Obtaining an initialization for QAOA$_p$ within the Interp framework requires running the optimization for all $p' = 1, \ldots, p - 1$, while in the TQA the search for an optimal $\Delta t$ is performed directly for a given $p$. Fig. 3 reveals that the Greedy approach yields similar performance to existing methods. While the performance of TQA slightly degrades at higher $p$, Greedy is fully on par with the Interp initialization. The comparable performance between Greedy and earlier heuristic approaches is surprising. Indeed, the Greedy method for QAOA$_p$ explores $p + 1$ symmetric TS and chooses the best out of the resulting up to $2(p + 1)$ minima (if none are equivalent), in contrast to Interp, which uses a single smooth initialization pattern at every $p$ and thus comes at a smaller computational cost.
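For reference, the TQA schedule quoted above is a two-line formula. The sketch below implements it exactly as written in the text, with $j = 1, \ldots, p$; the function name `tqa_initialization` is hypothetical, and the choice of $\Delta t$ (found by a pre-optimization in Ref. [4]) is left to the caller.

```python
def tqa_initialization(p, dt):
    """TQA angles as quoted in the text: gamma_j = (1 - j/p) dt, beta_j = (j/p) dt."""
    betas = [(j / p) * dt for j in range(1, p + 1)]
    gammas = [(1 - j / p) * dt for j in range(1, p + 1)]
    return betas, gammas

betas, gammas = tqa_initialization(p=4, dt=0.8)   # dt = 0.8 is a placeholder value
```

By construction $\beta_j + \gamma_j = \Delta t$ in every layer, so the single parameter $\Delta t$ fixes the whole linear ramp.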

C. Smooth pattern of variational angles and heuristic initializations
We find that having a smooth dependence of the variational angles on $p$ (referred to as a "smooth pattern") is an important characteristic for efficiently maneuvering the initialization graph. A smooth pattern means that the variational angles change gradually and continuously as the QAOA depth $p$ increases, without abrupt jumps or discontinuities. This smoothness property can be visually inspected by plotting the variational angles as a function of $p$ and observing whether the curve appears continuous and smooth. Assuming we found a smooth pattern of QAOA$_p$, Theorem 1 produces a TS of QAOA$_{p+1}$ by padding it with zeros, effectively introducing a discontinuity (bump). Optimization from the TS with such a bump can proceed by rolling down either side of the saddle, see Fig. 4(a), finding two new minima. Remarkably, the eigenvector corresponding to the index-1 direction of the Hessian has dominant weight on the variational angles with initially zero value; see Appendix D for details. Thus, descending along the index-1 direction, we can either enhance or heal the resulting discontinuity in the pattern of variational angles. As a result, among the two new local minima of QAOA$_{p+1}$, one typically exhibits a smooth parameter pattern where the bump was removed, while the other minimum has an enhanced discontinuity; see Fig. 4(b) for an example. Utilizing these observations in a numerical study, we find that minima exhibiting a non-smooth parameter pattern usually perform worse than, or the same as, smooth minima. In fact, in the Greedy procedure we find that in most cases, in particular at the beginning of the protocol, smooth minima are selected. However, there are cases where a non-smooth minimum is selected if it exhibits the same energy as the smooth one. Greedy then branches off in the optimization graph into a sub-graph involving only non-smooth minima. Usually, this process of branching off is followed by a smaller gain in performance from increasing $p$.

The preferred smoothness of QAOA optimization parameters has been explored in the literature [3,7,23] and is believed to be linked to quantum annealing [24] (QA). In QA the ground state of the Hamiltonian $H_C$ is obtained by preparing the ground state of $H_B$ and smoothly evolving the system to $H_C$ such that the system remains in the ground state during the evolution. A fast change, as generated by a bump in the protocol, leads to leakage into excited energy levels and thus to a decreased overlap with the target ground state of $H_C$. Since the QAOA can be understood as a Trotterized version of QA [1,3,4] for large $p$, we believe that a similar process is present in the QAOA and thus makes a smooth parameter pattern preferable.
We find that smooth Greedy minima coincide with Interp minima, as shown in Fig. 4(b). Interp naturally creates a smooth parameter pattern since the minimum found at $p$ is interpolated to a QAOA$_{p+1}$ initialization. The optimizer only slightly alters the parameters from their initial values, as can be seen in Fig. 4(b). Geometrically, the Interp initialization can be obtained from the symmetric TS constructed by Theorem 1: Interp is the rescaled center-of-mass point of all symmetric TS, with the rescaling factor $(p + 1)/p$ being physically motivated. Taking the center of mass of all TS smooths out the discontinuities present in individual TS.
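The geometric statement above can be checked directly. The sketch below compares the rescaled center of mass of the $p + 1$ symmetric zero-paddings with the linear interpolation rule of the Interp strategy, written here as $[\theta^{(p+1)}]_i = \frac{i-1}{p}\theta_{i-1} + \frac{p-i+1}{p}\theta_i$ with $\theta_0 = \theta_{p+1} = 0$. That formula is quoted from memory of Ref. [3] and should be treated as an assumption; the function names are hypothetical. Both maps act identically on the $\beta$ and $\gamma$ vectors, so a single vector of angles suffices.

```python
def symmetric_paddings(theta):
    """The p+1 vectors obtained by inserting a zero at each position of theta."""
    return [list(theta[:l]) + [0.0] + list(theta[l:]) for l in range(len(theta) + 1)]

def rescaled_center_of_mass(theta):
    """Center of mass of all symmetric paddings, rescaled by (p + 1) / p."""
    p = len(theta)
    pads = symmetric_paddings(theta)
    return [(p + 1) / p * sum(v[k] for v in pads) / (p + 1) for k in range(p + 1)]

def interp(theta):
    """Interpolation rule of the Interp strategy (assumed form, see lead-in)."""
    p = len(theta)
    ext = [0.0] + list(theta) + [0.0]          # theta_0 = theta_{p+1} = 0
    return [(i - 1) / p * ext[i - 1] + (p - i + 1) / p * ext[i]
            for i in range(1, p + 2)]

angles = [0.20, 0.35, 0.50]                    # placeholder smooth pattern, p = 3
com = rescaled_center_of_mass(angles)
itp = interp(angles)
```

The agreement is an algebraic identity: component $i$ of the padded vectors equals $\theta_{i-1}$ for the $i - 1$ paddings placed before it, $\theta_i$ for the $p + 1 - i$ paddings placed after it, and zero once.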
The rescaling is related to the notion of the "total time" of the QAOA, given by the sum of all variational angles, $T = \sum_j (|\gamma_j| + |\beta_j|)$ [3,25], which resembles the total annealing time in the limit $p \to \infty$. This parameter has been shown to scale as $T \sim p$ [4], naturally explaining the role of the factor $(p + 1)/p$ in yielding the correct increased total time of QAOA$_{p+1}$. In other words, the Interp strategy seems to essentially execute a Greedy search without optimizing in the index-1 direction from the TS. This insight lends credence to the success of Interp. However, only Greedy offers a guarantee for performance improvement with increasing $p$, while for Interp this behavior is supported only by numerical simulations.

IV. DISCUSSION
In this work we analytically demonstrated that minima of QAOA$_p$ can be used to obtain transition states (TS) for QAOA$_{p+1}$, which are stationary points with a unique negative eigenvalue of the Hessian. These TS provide an excellent initialization for QAOA$_{p+1}$, because they connect to two new local minima with lower energy. This construction allows us to visualize how local minima emerge at different energies for increasing circuit depth using an initialization graph. Categorizing the local minima on this graph by their smooth (discontinuous) patterns of variational parameters, we find that the smooth minima achieve the best performance. Incorporating the smooth nature of minima allows us to establish a relation between the Greedy approach for the exploration of the initialization graph and the best available initialization strategy [3].
The use of TS and their analytic construction for the study of QAOA provide the first steps towards an in-depth understanding of the full optimization landscape of the QAOA. The constructed TS are guaranteed to provide an initialization that improves the QAOA performance, suggesting that our construction may be useful for establishing analytic QAOA performance guarantees [1,14,26] for large $p$ in a recursive fashion. Of particular interest here is an analytical understanding of the numerically observed exponential performance improvement with circuit depth. On the practical side, the established relation between heuristic initializations [3] and Greedy exploration of TS suggests that our construction of TS may be useful as a starting point for constructing simple initialization strategies in a broader class of quantum variational algorithms, such as the variational quantum eigensolver [27,28] and quantum machine learning [29].
In addition, our results invite a more complete characterization of the QAOA landscape using the energy landscapes perspective [11]. What fraction of minima does our procedure find out of the complete set of QAOA local minima? Are there more TS, and are our analytically constructed TS typical? How is the Hessian spectrum distributed at these minima and TS? How do these properties depend on the choice of the QAOA classical Hamiltonian, particularly for classical problems with intrinsically hard landscapes [30]? Answering these and related questions will most likely lead to practical ways of further speeding up the QAOA by reducing the overhead of the classical optimization [31].

... for the QAOA$_p$ (i.e., QAOA with circuit depth $p$) ansatz. Here we use bold notation for both $\beta$ and $\gamma$ parameters to denote a length-$p$ vector of angles, i.e., $\beta = (\beta_1, \ldots, \beta_p)$ and $\gamma = (\gamma_1, \ldots, \gamma_p)$. The use of symmetries allows us to restrict the manifold of variational parameters, leading to a more efficient exploration of the QAOA landscape. This section expands upon previous results of Ref. [3].
We begin by rewriting the exponents of both the classical and the mixing Hamiltonian as:
$$ e^{-i\gamma_l H_C} = \prod_{\langle i,j\rangle \in E} \left(\cos\gamma_l - i \sin\gamma_l\, \sigma^z_i \sigma^z_j\right), \qquad e^{-i\beta_l H_B} = \prod_{i=1}^{n} \left(\cos\beta_l + i \sin\beta_l\, \sigma^x_i\right). $$
From here it is apparent that adding $\pi$ to any of the parameters, $\beta_l \to \beta_l + \pi$ or $\gamma_l \to \gamma_l + \pi$ for any $l \in [1, p]$, does not change the cost function value $E(\beta, \gamma)$. Indeed, this leads to the appearance of an overall sign that cancels within the expectation value of the classical Hamiltonian. Therefore we can easily restrict the search space to (i) $\beta_l, \gamma_l \in [-\frac{\pi}{2}, \frac{\pi}{2}]$.
For the $\beta$ parameters we can restrict the parameter space even further. In Ref. [3] the authors restrict the parameters as $\beta_l \in [-\frac{\pi}{4}, \frac{\pi}{4}]$ due to the following considerations. Consider adding $\frac{\pi}{2}$ to $\beta_l$: the exponent $e^{-i(\beta_l + \frac{\pi}{2})H_B} = e^{-i\beta_l H_B} e^{-i\frac{\pi}{2} H_B}$ acquires an additional product of all $\sigma^x$ operators. This operator flips all spins, effectively being the generator of the $Z_2$ symmetry of the classical Ising Hamiltonian $H_C$. Therefore, such a shift of $\beta_l$ has no effect on the cost function and we restrict (ii) $\beta_l \in [-\frac{\pi}{4}, \frac{\pi}{4}]$.
Yet another symmetry is recovered by taking the complex conjugate of the energy. As both the classical and the mixing Hamiltonians are real-valued, one has
$$ \left[E(\beta, \gamma)\right]^{*} = E(-\beta, -\gamma). $$
And because the energy is also real-valued ($H_C$ is Hermitian), we recover another symmetry of the cost function: (iii) $E(\beta, \gamma) = E(-\beta, -\gamma)$.
The symmetries (i)-(iii) introduced above were discussed in Refs. [3,4]. But we can restrict the search space even further. In particular, we demonstrate that for the QAOA cost function on 3-regular random graphs (RRG3) the following additional symmetry holds: (iv) flipping the sign of any $\beta_l \to -\beta_l$, $l \in [1, p]$, together with shifts of the angles $\gamma_l$ and $\gamma_{l+1}$ as $\gamma_{l,l+1} \to \gamma_{l,l+1} \pm \frac{\pi}{2}$, leaves the cost function invariant. Note that for $l = p$ only the angle $\gamma_p$ has to be shifted.
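Let us prove this property for regular graphs with odd connectivity (i.e., 3-regular, 5-regular, ...). In order to demonstrate the property (iv) for $j < p$, it is enough to show that:
$$ e^{\mp i\frac{\pi}{2} H_C}\, e^{i\beta_j H_B}\, e^{\mp i\frac{\pi}{2} H_C} \sim e^{-i\beta_j H_B}, $$
where $\sim$ stands for equivalence up to a global phase and the two signs can be chosen independently. In other words, we use the property that $e^{-i\frac{\pi}{2} H_C} \sim \prod_i \sigma^z_i$ acts as a product of $\sigma^z$ operators over all spins, which relies on the fact that each vertex is connected to an odd number of edges (interaction terms), together with $\sigma^z_i \sigma^x_i \sigma^z_i = -\sigma^x_i$. This leads to the relation
$$ e^{-i(\gamma_{j+1} \pm \frac{\pi}{2}) H_C}\, e^{i\beta_j H_B}\, e^{-i(\gamma_j \pm \frac{\pi}{2}) H_C} \sim e^{-i\gamma_{j+1} H_C}\, e^{-i\beta_j H_B}\, e^{-i\gamma_j H_C}. $$
Thus, the change of sign of $\beta_j$ can be compensated by the shifts of the "adjacent" angles $\gamma_j$ and $\gamma_{j+1}$ by $\pi/2$, leading to the property (iv) when $j < p$. In the particular case of $j = p$, the property (iv) is obtained using the following relation
$$ e^{i\beta_p H_B}\, e^{-i(\gamma_p \pm \frac{\pi}{2}) H_C} \sim \Big(\prod_i \sigma^z_i\Big)\, e^{-i\beta_p H_B}\, e^{-i\gamma_p H_C}, $$
together with the fact that $\prod_i \sigma^z_i$ commutes with $H_C$ and therefore drops out of the expectation value. Finally, let us rewrite the property (iv) by sequentially applying this symmetry for all indices $j$ starting from $k$ and ending at $p$; the angles $\gamma_{k+1}, \ldots, \gamma_p$ are then each shifted twice, which is trivial by (i). We obtain the following property, equivalent to (iv) and dubbed (iv'):
$$ \beta_j \to -\beta_j \ \text{ for all } j \in [k, p], \qquad \gamma_k \to \gamma_k \pm \tfrac{\pi}{2}. $$
This allows us to restrict all $\gamma$ angles to the region $[-\frac{\pi}{4}, \frac{\pi}{4}]$. Moreover, the sign-flip symmetry (iii) allows us to make one of the $\gamma$ angles, for instance $\gamma_1$, positive, cutting the search space in half.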
In addition, let us apply property (iv') for $k = 1$ (i.e., including all layers of the unitary circuit) and supplement it with a global sign flip, operation (iii). As a result, we obtain the following symmetry:
$$ E(\beta, \gamma_1, \gamma_2, \ldots, \gamma_p) = E(\beta, \mp\tfrac{\pi}{2} - \gamma_1, -\gamma_2, \ldots, -\gamma_p). $$
This indicates that there is a $p$-dimensional plane in the landscape with coordinates $\gamma = (\pm\frac{\pi}{4}, 0_{p-1})$ which acts as a mirror. This plane is characterized by a vanishing gradient of the cost function and the Hessian having $p$ vanishing eigenvalues. However, it is located on the edge of our search space and it has a vanishing expectation value of the cost function, corresponding to the approximation ratio $r = 0$, which is very far from the good-quality local minima.
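The symmetries (i)-(iv) are easy to test numerically on a small graph with odd connectivity. The following sketch evaluates the cost function at a reference point and at its symmetry-transformed images; the complete graph on four vertices, the reference angles, and the names `energy` and `shift` are assumptions chosen for illustration, and the state-vector simulation is a brute-force stand-in for an actual QAOA simulator.

```python
import cmath
import math

N = 4                                                     # assumed instance: K4 (3-regular)
EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
DIAG = [sum(1 - 2 * (((z >> i) ^ (z >> j)) & 1) for i, j in EDGES)
        for z in range(1 << N)]

def energy(betas, gammas):
    """E(beta, gamma) by explicit state-vector simulation."""
    psi = [complex(1.0 / math.sqrt(len(DIAG)))] * len(DIAG)
    for beta, gamma in zip(betas, gammas):
        psi = [cmath.exp(-1j * gamma * d) * a for d, a in zip(DIAG, psi)]
        c, s = math.cos(beta), 1j * math.sin(beta)
        for q in range(N):
            for z in range(len(DIAG)):
                if not z & (1 << q):
                    a0, a1 = psi[z], psi[z | (1 << q)]
                    psi[z], psi[z | (1 << q)] = c * a0 + s * a1, s * a0 + c * a1
    return sum(d * abs(a) ** 2 for d, a in zip(DIAG, psi))

def shift(v, k, delta):
    """Copy of v with entry k (0-based) shifted by delta."""
    w = list(v)
    w[k] += delta
    return w

b, g = [0.31, -0.17, 0.23], [0.12, 0.41, -0.29]           # arbitrary point, p = 3
half = math.pi / 2
e_ref = energy(b, g)
checks = {
    "(i) beta_2 + pi": energy(shift(b, 1, math.pi), g),
    "(i) gamma_2 + pi": energy(b, shift(g, 1, math.pi)),
    "(ii) beta_1 + pi/2": energy(shift(b, 0, half), g),
    "(iii) global sign flip": energy([-x for x in b], [-x for x in g]),
    "(iv) l = 2 < p": energy([b[0], -b[1], b[2]], [g[0], g[1] + half, g[2] - half]),
    "(iv) l = p": energy([b[0], b[1], -b[2]], shift(g, 2, half)),
}
max_deviation = max(abs(value - e_ref) for value in checks.values())
```

The check of (iv) uses opposite signs for the two $\gamma$ shifts on purpose, to illustrate that the signs can be chosen independently.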
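In summary, collecting all symmetries discussed above, we restrict the fundamental search region to
$$ \beta_i \in \left[-\tfrac{\pi}{4}, \tfrac{\pi}{4}\right]; \quad \gamma_1 \in \left(0, \tfrac{\pi}{4}\right); \quad \gamma_j \in \left[-\tfrac{\pi}{4}, \tfrac{\pi}{4}\right], \qquad i \in [1, p],\ j \in [2, p]. $$

Appendix B: Construction of transition states
In this section, we show how to use a local minimum of the QAOA$_p$ to construct a set of $2p + 1$ transition states (TS) at circuit depth $p + 1$. These are stationary points with all but one Hessian eigenvalue being positive. More precisely, we show the following statement:
Theorem 2. Let $\Gamma^{p}_{\min} = (\beta^\star_1, \ldots, \beta^\star_p, \gamma^\star_1, \ldots, \gamma^\star_p)$ be a local minimum of QAOA$_p$. Define the following $2p + 1$ points by padding this vector with zeroes at distinguished positions:
$$ \Gamma^{p+1}_{\rm TS}(i, j) = (\beta^\star_1, \ldots, \beta^\star_{i-1}, 0, \beta^\star_i, \ldots, \beta^\star_p, \gamma^\star_1, \ldots, \gamma^\star_{j-1}, 0, \gamma^\star_j, \ldots, \gamma^\star_p), $$
with $i \in [1, p + 1]$ and $j = i$ or $j = i + 1$ (subject to $j \le p + 1$). Then each of these points is either (i) a TS for QAOA$_{p+1}$ or (ii) has a non-regular Hessian.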
Theorem 1 in the main text is a streamlined version of this statement that does not mention the possibility of degenerate Hessians. We expect that the Hessian matrix of a local minimum of QAOA$_p$ is non-degenerate in the absence of symmetries and provided the circuit is not overparametrized [32] (if there exists some combination of variational angles whose change does not influence the quantum state, the Hessian has a vanishing eigenvalue). Analogously, in the case of the Hessian at the TS of QAOA$_{p+1}$, we numerically find that option (ii) never happens. Below, we relate the two additional eigenvalues of the Hessian at the TS to the expectation value of a physical operator over the variational state. This expectation value is non-zero in the absence of special symmetries or fine-tuning, providing a physical justification for why we do not observe zero eigenvalues in the Hessian spectra of our TS.

Cost function gradient
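Let us start by computing the energy gradient $\nabla E(\beta, \gamma)$. Derivatives of the quantum state with respect to the parameters $\beta_l$, $\gamma_l$ are given by the following expressions:
$$ \partial_{\beta_l} |\beta, \gamma\rangle = -i\, U_{>l}\, H_B\, U_{\le l}\, |+\rangle, \qquad \partial_{\gamma_l} |\beta, \gamma\rangle = -i\, U_{\ge l}\, H_C\, U_{<l}\, |+\rangle, \tag{B2} $$
where $U_{\le l} = U_l U_{l-1} \cdots U_1$ with $U_j = e^{-i\beta_j H_B} e^{-i\gamma_j H_C}$, and analogously for $U_{<l}$, $U_{\ge l}$, and $U_{>l}$. To simplify the notation we write $|+\rangle$ instead of $|+\rangle^{\otimes n}$. We can now deduce the components of the energy gradient $\nabla E(\beta, \gamma)$ from Eq. (B2). They read
$$ \partial_{\beta_l} E = i \langle +| U^\dagger_{\le l} \left[H_B,\, U^\dagger_{>l} H_C U_{>l}\right] U_{\le l} |+\rangle, \qquad \partial_{\gamma_l} E = i \langle +| U^\dagger_{<l} \left[H_C,\, U^\dagger_{\ge l} H_C U_{\ge l}\right] U_{<l} |+\rangle. $$
Our goal is to prove that, given a local minimum $\Gamma^{p}_{\min} = (\beta^\star_1, \ldots, \beta^\star_p, \gamma^\star_1, \ldots, \gamma^\star_p)$ of QAOA$_p$, the set of $2p + 1$ points
$$ \Gamma^{p+1}_{\rm TS}(l, k) = (\beta^\star_1, \ldots, \beta^\star_{l-1}, 0, \beta^\star_l, \ldots, \beta^\star_p, \gamma^\star_1, \ldots, \gamma^\star_{k-1}, 0, \gamma^\star_k, \ldots, \gamma^\star_p), \tag{B4} $$
with $l$ ranging from 1 to $p + 1$ and either $k = l$ or $k = l + 1$, are all TS. The first step is to prove that they are all stationary points, that is, each such point leads to a vanishing gradient. From the above expression, it follows that we only have to consider the gradient components where the zero insertion is made, since the others are zero due to the point $\Gamma^{p}_{\min}$ being a local minimum (i.e., its derivatives vanish). For the derivatives over the newly introduced angles, using Eq. (B2), we see that
$$ \partial_{\beta_l} E\big|_{\Gamma^{p+1}_{\rm TS}(l,l)} = \partial_{\beta_{l-1}} E\big|_{\Gamma^{p}_{\min}}, \qquad \partial_{\gamma_l} E\big|_{\Gamma^{p+1}_{\rm TS}(l,l)} = \partial_{\gamma_l} E\big|_{\Gamma^{p}_{\min}}, $$
$$ \partial_{\beta_l} E\big|_{\Gamma^{p+1}_{\rm TS}(l,l+1)} = \partial_{\beta_l} E\big|_{\Gamma^{p}_{\min}}, \qquad \partial_{\gamma_{l+1}} E\big|_{\Gamma^{p+1}_{\rm TS}(l,l+1)} = \partial_{\gamma_l} E\big|_{\Gamma^{p}_{\min}}, $$
where the index $l$ ranges from 1 to $p + 1$ for the $(l, l)$ case and from 1 to $p$ in the $(l, l + 1)$ case. These observations reduce the derivatives over the new angles to derivatives over angles from local minima of QAOA$_p$. And these vanish by definition because we started in a local minimum, which is itself a stationary point, that is, $\nabla E(\beta, \gamma)|_{\Gamma^{p}_{\min}} = 0$. We emphasize that these arguments do not apply to two special cases that should be treated separately: $\partial_{\beta_1} E$ at $\Gamma^{p+1}_{\rm TS}(1, 1)$, which vanishes because $|+\rangle$ is an eigenstate of $H_B$, and $\partial_{\gamma_{p+1}} E$ at $\Gamma^{p+1}_{\rm TS}(p+1, p+1)$, which vanishes because $H_C$ commutes with itself.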
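The relations between gradient components before and after padding follow from the structure of the circuit alone, so they can be checked at an arbitrary point of the QAOA$_p$ landscape, not only at a minimum. The sketch below does this with central finite differences for $p = 2$ on the complete graph with four vertices. The instance, the reference angles, and the names `energy`, `pad`, and `partial` are assumptions made for illustration; the index bookkeeping in the comments translates the 1-based layer labels of the text into 0-based list positions.

```python
import cmath
import math

N = 4                                                     # assumed instance: K4
EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
DIAG = [sum(1 - 2 * (((z >> i) ^ (z >> j)) & 1) for i, j in EDGES)
        for z in range(1 << N)]

def energy(theta):
    """E for theta = [beta_1..beta_p, gamma_1..gamma_p]."""
    p = len(theta) // 2
    psi = [complex(1.0 / math.sqrt(len(DIAG)))] * len(DIAG)
    for beta, gamma in zip(theta[:p], theta[p:]):
        psi = [cmath.exp(-1j * gamma * d) * a for d, a in zip(DIAG, psi)]
        c, s = math.cos(beta), 1j * math.sin(beta)
        for q in range(N):
            for z in range(len(DIAG)):
                if not z & (1 << q):
                    a0, a1 = psi[z], psi[z | (1 << q)]
                    psi[z], psi[z | (1 << q)] = c * a0 + s * a1, s * a0 + c * a1
    return sum(d * abs(a) ** 2 for d, a in zip(DIAG, psi))

def pad(theta, l, k):
    """Insert beta = 0 at layer l and gamma = 0 at layer k (1-based)."""
    p = len(theta) // 2
    b, g = list(theta[:p]), list(theta[p:])
    return b[:l - 1] + [0.0] + b[l - 1:] + g[:k - 1] + [0.0] + g[k - 1:]

def partial(theta, idx, h=1e-5):
    """Central finite-difference derivative of E with respect to theta[idx]."""
    up, dn = list(theta), list(theta)
    up[idx] += h
    dn[idx] -= h
    return (energy(up) - energy(dn)) / (2 * h)

p = 2
theta = [0.31, -0.17, 0.12, 0.41]                         # arbitrary point, not a minimum
pairs = []                                                # (padded derivative, reference)
for l in range(1, p + 2):                                 # symmetric case (l, l)
    ts = pad(theta, l, l)
    ref_beta = partial(theta, l - 2) if l >= 2 else 0.0   # old beta_{l-1}, or special case
    ref_gamma = partial(theta, p + l - 1) if l <= p else 0.0   # old gamma_l, or special case
    pairs.append((partial(ts, l - 1), ref_beta))          # new beta_l
    pairs.append((partial(ts, p + 1 + l - 1), ref_gamma)) # new gamma_l
for l in range(1, p + 1):                                 # non-symmetric case (l, l+1)
    ts = pad(theta, l, l + 1)
    pairs.append((partial(ts, l - 1), partial(theta, l - 1)))          # beta_l vs old beta_l
    pairs.append((partial(ts, p + 1 + l), partial(theta, p + l - 1)))  # gamma_{l+1} vs old gamma_l
max_mismatch = max(abs(a - b) for a, b in pairs)
```

At a local minimum of QAOA$_p$ all reference derivatives vanish, which is the statement that the padded points are stationary.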

Cost function Hessian
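We now proceed with the study of the Hessian for each of the stationary points in the set $\Gamma^{p+1}_{\rm TS}(l, k)$, with $l$ ranging from 1 to $p + 1$ and $k$ being $l$ or $l + 1$. Using basic row and column operations we decompose the Hessian as follows:
$$ H[\Gamma^{p+1}_{\rm TS}(l, k)] = \begin{pmatrix} H(\Gamma^{p}_{\min}) & v(l, k) \\ v^{T}(l, k) & h(l, k) \end{pmatrix}, \tag{B12} $$
where $H(\Gamma^{p}_{\min}) \in \mathbb{R}^{2p \times 2p}$, $v(l, k) \in \mathbb{R}^{2p \times 2}$, and $h(l, k) \in \mathbb{R}^{2 \times 2}$. It is important to note that the determinant of the Hessian at the point $\Gamma^{p+1}_{\rm TS}(l, k)$ remains unchanged by such a reordering of rows and columns. To see this, recall that switching two rows or two columns causes the determinant to switch sign. Since we switch the same number $x$ of rows and columns, the overall sign $(-1)^{2x} = 1$ does not change. In terms of matrix elements, the two columns of $v(l, k)$ contain the mixed derivatives $\partial_{\theta} \partial_{\beta_l} E$ and $\partial_{\theta} \partial_{\gamma_k} E$, with $\theta$ running over the $2p$ angles inherited from $\Gamma^{p}_{\min}$, while
$$ h(l, k) = \begin{pmatrix} \partial_{\beta_l}\partial_{\beta_l} E & \partial_{\beta_l}\partial_{\gamma_k} E \\ \partial_{\gamma_k}\partial_{\beta_l} E & \partial_{\gamma_k}\partial_{\gamma_k} E \end{pmatrix}, $$
with all derivatives evaluated at $\Gamma^{p+1}_{\rm TS}(l, k)$. Our goal is to restrict the properties of the Hessian (B12) using the fact that the Hessian at circuit depth $p$ is a positive-definite matrix, a consequence of the fact that we start at a local minimum $\Gamma^{p}_{\min}$. To this end, we use a powerful theorem from matrix analysis.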
Theorem 3 (Eigenvalue interlacing theorem [17] (Theorem 4 on page 117)). Let $A \in \mathbb{R}^{n \times n}$ be a symmetric matrix and $B \in \mathbb{R}^{m \times m}$ with $m < n$ be a principal submatrix (obtained by removing both the $i$-th column and the $i$-th row for some values of $i$). Suppose $A$ has eigenvalues $\lambda_1 \le \ldots \le \lambda_n$ and $B$ has eigenvalues $\kappa_1 \le \ldots \le \kappa_m$. Then
$$ \lambda_k \le \kappa_k \le \lambda_{k+n-m} \quad \text{for } k = 1, \ldots, m. $$
The eigenvalue interlacing theorem relates the ordered set of Hessian eigenvalues $\{\lambda^{p+1}_i\}$ for QAOA$_{p+1}$ to the Hessian eigenvalues $\{\lambda^{p}_i\}$ of QAOA$_p$ in the following way:
$$ \lambda^{p+1}_k \le \lambda^{p}_k \le \lambda^{p+1}_{k+2}, \qquad k = 1, \ldots, 2p. $$
Using the fact that $H(\Gamma^{p}_{\min})$ has $\lambda^{p}_k > 0$ for all $k$, we see that the Hessian of QAOA$_{p+1}$ at the point $\Gamma^{p+1}_{\rm TS}(l, k)$ has at most two negative eigenvalues, $\lambda^{p+1}_{1}$ and $\lambda^{p+1}_{2}$. In what follows we establish that among these two eigenvalues, exactly one is negative and the other one is positive. This is achieved by demonstrating that the full Hessian matrix has a negative determinant,
$$ \det H[\Gamma^{p+1}_{\rm TS}(l, k)] < 0, \tag{B15} $$
which rules out the possibility that the remaining eigenvalues $\lambda^{p+1}_{1,2}$ have the same sign (which would cancel in the determinant).
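The interlacing statement can be illustrated numerically for the case relevant here, $n - m = 2$, where a positive-definite principal block forces all but at most two eigenvalues of the full matrix to be positive. The sketch below uses a hand-picked symmetric $4 \times 4$ matrix (not a QAOA Hessian) and a small cyclic Jacobi eigenvalue routine written for this illustration; the name `symmetric_eigenvalues` is hypothetical.

```python
import math

def symmetric_eigenvalues(A, sweeps=50, tol=1e-22):
    """Eigenvalues of a real symmetric matrix by cyclic Jacobi rotations."""
    n = len(A)
    A = [row[:] for row in A]
    for _ in range(sweeps):
        off = sum(A[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if A[p][q] == 0.0:
                    continue
                # rotation angle that zeroes the (p, q) entry
                theta = 0.5 * math.atan2(2 * A[p][q], A[q][q] - A[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):                        # A <- A J
                    akp, akq = A[k][p], A[k][q]
                    A[k][p], A[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):                        # A <- J^T A
                    apk, aqk = A[p][k], A[q][k]
                    A[p][k], A[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted(A[i][i] for i in range(n))

# Toy "new Hessian": positive-definite 2x2 block bordered by two extra rows/columns
A = [[2.0, 0.5, 1.0, 0.3],
     [0.5, 3.0, -0.7, 0.2],
     [1.0, -0.7, -1.0, 1.5],
     [0.3, 0.2, 1.5, 0.0]]
B = [row[:2] for row in A[:2]]                            # principal submatrix ("old Hessian")

lam = symmetric_eigenvalues(A)
kap = symmetric_eigenvalues(B)
interlaced = all(lam[k] - 1e-9 <= kap[k] <= lam[k + 2] + 1e-9 for k in range(2))
negative_count = sum(1 for x in lam if x < 0)
```

Interlacing alone only bounds the number of negative eigenvalues by two; deciding between one and two requires the sign of the determinant, as in the argument above.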
Below we first prove Relation (B15) for the cases where the zeros are inserted at the last (i) or at the first (ii) layer of the unitary circuit. We then conclude by considering the general case (iii), where zeros are inserted in the "bulk" of the unitary circuit. Whenever it is clear from context, we drop the indices $(l, k)$ for better readability. Furthermore, for all the cases considered below, we introduce the short-hand notation $b$ for a specific second-order derivative, see Eq. (B16). This matrix element plays a special role in the calculation of $\det H(\Gamma^{p+1}_{\rm TS}(l,k))$. It is important to note that, while the specific expression for $b$ differs among the stationary points in the set given by Eq. (B4), it has a non-zero value, $b \neq 0$. Indeed, below we express $b$ as the expectation value of a non-vanishing operator over the QAOA variational state, which is non-zero in the absence of special symmetries.
a. Case (i): The first step is to compute the matrix elements of $v(p+1, p+1)$. From now on we drop the indices and simply write $v$ and $h$ to reduce notational overhead. The first column of $v$ corresponds to $v_{\beta_j,\beta_{p+1}} = \partial_{\beta_j}\partial_{\beta_{p+1}} E(\beta, \gamma)$ evaluated at the TS $\Gamma^{p+1}_{\rm TS}$ [Eq. (B17)], where we introduce the short-hand notation $a_j$ for better readability. Analogously, considering matrix elements of the form $v_{\gamma_j,\beta_{p+1}} = \partial_{\gamma_j}\partial_{\beta_{p+1}} E(\beta, \gamma)$, we obtain Eq. (B18).

Evaluating the second derivatives in Eq. (B17) and Eq. (B18) at $j = p+1$ gives the first column of the $2\times 2$ matrix $h$. In particular, evaluating Eq. (B17) at $j = p+1$ leads to $U_{>j} = I$ and $U_{\le j} = U$. Here we used $U_{>p+1} = I$: when the derivative is taken with respect to the last layer ($p+1$) of the unitary circuit, there is no unitary to the left of it, which in the notation introduced in Eq. (B2) is equivalent to $U_{>p+1} = I$. Doing the same for Eq. (B18) gives the matrix element $b$ of Eq. (B20).

Finally, let us look at the matrix elements of the form $v_{\beta_j,\gamma_{p+1}} = \partial_{\beta_j}\partial_{\gamma_{p+1}} E(\beta, \gamma)$ and analogously $v_{\gamma_j,\gamma_{p+1}}$, corresponding to the second column of $v$. Let us first inspect $\partial_{\gamma_{p+1}} E(\beta, \gamma)$. When evaluated at the point $\Gamma^{p+1}_{\rm TS}$, we obtain $[H_C, U^\dagger_{p+1} H_C U_{p+1}] = 0$, since $U_{p+1} = I$ and $H_C$ commutes with itself. Hence, as long as the second derivative is taken with respect to an element ($\beta$ or $\gamma$) at index $j < p+1$, the final result is zero, so the second column of $v$ vanishes. As we already saw in Eq. (B20), $h_{\beta_{p+1},\gamma_{p+1}} = b$. Using similar arguments, we show that $\partial_{\gamma_{p+1}}\partial_{\gamma_{p+1}} E(\beta, \gamma) = 0$, which corresponds to the $h_{\gamma_{p+1},\gamma_{p+1}}$ matrix element of $h$.

We are then ready to construct the Hessian at the TS under consideration: it has the block form of Eq. (B12), with a matrix $v$ whose second column is zero and with
$$h = \begin{pmatrix} h_{\beta_{p+1},\beta_{p+1}} & b \\ b & 0 \end{pmatrix}.$$
Using the expression for the determinant of a block matrix [17],
$$\det\begin{pmatrix} A & B \\ C & D \end{pmatrix} = \det(D)\,\det\!\left(A - B D^{-1} C\right), \qquad (\mathrm{B24})$$
we rewrite the determinant of the full Hessian as follows:
$$\det[H(\Gamma^{p+1}_{\rm TS})] = \det(h)\,\det\!\left[H(\Gamma^{p}_{\min}) - v h^{-1} v^{T}\right] = -b^{2}\,\det[H(\Gamma^{p}_{\min})],$$
where we used that $v h^{-1} v^{T} = 0$ in the last step. Since $H(\Gamma^{p}_{\min})$ is positive definite, we then see that as long as $b \neq 0$ the determinant of the Hessian at the TS is negative, $\det[H(\Gamma^{p+1}_{\rm TS})] < 0$.
The explicit expression (B20) for $b$ relates it to the expectation value of the commutator $[[H_B, H_C], H_C]$ over the variational wave function. Since this commutator is a non-vanishing operator, its expectation value is generically non-zero, $b \neq 0$. This concludes the proof of Theorem 1 for the case when zeros are inserted at the last layer of the unitary circuit.
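The determinant argument of case (i) can be verified on a toy example. The sketch below uses a random positive-definite matrix in place of $H(\Gamma^{p}_{\min})$ and arbitrary numbers for the entries of $v$ and $h$ (all names and values are illustrative assumptions). It only reproduces the structure used in the text: the second column of $v$ vanishes and the $\gamma\gamma$ entry of $h$ is zero.

```python
import numpy as np

rng = np.random.default_rng(2)
p = 3

# Positive-definite stand-in for the Hessian at the QAOA_p minimum.
M = rng.normal(size=(2 * p, 2 * p))
H_min = M @ M.T + np.eye(2 * p)

# v has an arbitrary first column and a vanishing second column.
a = rng.normal(size=2 * p)
v = np.column_stack([a, np.zeros(2 * p)])

# h has an arbitrary beta-beta entry, off-diagonal entry b, and zero gamma-gamma entry.
b, h_bb = 0.7, rng.normal()
h = np.array([[h_bb, b], [b, 0.0]])

H_ts = np.block([[H_min, v], [v.T, h]])

det_full = np.linalg.det(H_ts)
det_formula = -b ** 2 * np.linalg.det(H_min)          # det(h) * det(H_min), since v h^{-1} v^T = 0
n_negative = int((np.linalg.eigvalsh(H_ts) < 0).sum())
```

The two determinants agree, the result is negative for any $b \neq 0$, and together with the interlacing bound this leaves exactly one negative eigenvalue.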
b. Case (ii): As before, we focus on computing the matrix elements of $v = v(1, 1)$ and $h = h(1, 1)$. Starting from the first column of $v$, with matrix elements $v_{\beta_j,\beta_1}$ and $v_{\gamma_j,\beta_1}$ for $j \in [2, p+1]$, we find Eq. (B26). Moving on to the second column of $v$, with matrix elements $v_{\beta_j,\gamma_1}$ and $v_{\gamma_j,\gamma_1}$ for $j \in [2, p+1]$, we obtain Eq. (B27), where for better readability we introduce the short-hand notation $c_j$ with $j \in [2, p]$. Finally, evaluating the expressions in Eq. (B26) and Eq. (B27) at $j = 1$ leads to the matrix elements of the $2\times 2$ matrix $h$, given in Eq. (B28), and the value of $c_{p+2}$ follows from evaluating Eq. (B27) at $j = 1$.
Invoking once again the expression for the determinant of a block matrix, Eq. (B24), and using that the point $\Gamma^{p}_{\min}$ is a local minimum (with the Hessian being non-singular), we see that as long as $b \neq 0$ the determinant of the Hessian at the TS is negative. The fact that the parameter $b$ in Eq. (B28) is non-vanishing can be inferred from an argument similar to the one used at the end of Appendix B 2 a.

c. Case (iii): $l, k \in [2, p]$
So far we have proven that when the zeros are inserted at the first (ii) or last (i) layer of the unitary circuit, the corresponding points $\Gamma^{p+1}_{\rm TS}$ of QAOA$_{p+1}$ are TSs. In both cases, we proved that the determinant of the Hessian of QAOA$_{p+1}$ at the given points is negative. To do this, we used that one of the columns of the $2p \times 2$ matrix $v$ is zero, which greatly simplifies the computation of the determinant. In what follows, we show that these simplifications, unfortunately, do not occur when the zeros are inserted in the bulk of the unitary circuit. Instead, we observe that the matrix $v(l, k)$ is constructed by taking the $l$-th ($\beta_l$) and $(p+1+k)$-th ($\gamma_k$) columns of the Hessian of QAOA$_p$ at the local minimum $\Gamma^{p}_{\min}$. This fact, together with the invariance of the determinant under linear operations performed on rows or columns, leads to the desired result.
We begin by explicitly computing the matrix elements of $h(l, k)$ and $v(l, k)$ and then relating them to matrix elements of the Hessian $H(\Gamma^{p}_{\min})$. For the sake of concreteness, we focus on the particular case of a symmetric TS, i.e., $k = l$. The other case, $k = l+1$, can be covered by an analogous chain of arguments. As before, in what follows we drop the indices for better readability. Starting from $h$, we obtain Eq. (B31). Looking at the properties listed in Eq. (B5), one might be tempted to identify the matrix elements of $h$ with the corresponding matrix elements of $H(\Gamma^{p}_{\min})$. However, upon closer inspection, we see that these are not the same. Comparing the two expressions, we realize that although they are not equal, they are related via the Jacobi identity
$$[A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0$$
for operators $A$, $B$ and $C$.

Considering now the matrix elements of $v$, we find that the $2p \times 2$ rectangular matrix $v$ corresponds to the matrix formed by taking the columns $H(\Gamma^{p}_{\min})_{m,\beta_{l-1}}$ and $H(\Gamma^{p}_{\min})_{m,\gamma_l}$, with $m = 1, \dots, 2p$, of $H(\Gamma^{p}_{\min})$. Using this result and the fact that the determinant is invariant under linear operations performed on rows or columns, we subtract the rows $H(\Gamma^{p}_{\min})_{\beta_{l-1},m}$ and $H(\Gamma^{p}_{\min})_{\gamma_l,m}$, with $m = 1, \dots, 2p$, from $v^{T}$, which brings the lower-left block of the Hessian to zero without changing $\det(H(\Gamma^{p+1}_{\rm TS}))$. Using once again the expression for the determinant of a block matrix, Eq. (B24), and the fact that $\det(h(l, l)) = -b^{2}$ is negative ($b \neq 0$ due to an argument similar to that in Appendix B 2 a), we obtain $\det H(\Gamma^{p+1}_{\rm TS}) < 0$, concluding our proof for the general TS.
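The row-subtraction step of case (iii) can be illustrated with random matrices. The sketch below is an assumption-laden toy example (the column indices `idx`, the random blocks, and all names are hypothetical): when $v$ consists of two columns of the upper-left block, subtracting the matching rows from $v^{T}$ zeroes the lower-left block, and the determinant factorizes into the determinant of the upper-left block times that of a modified $2\times 2$ block.

```python
import numpy as np

rng = np.random.default_rng(3)
p = 3

# Positive-definite stand-in for the Hessian at the QAOA_p minimum.
M = rng.normal(size=(2 * p, 2 * p))
H_min = M @ M.T + np.eye(2 * p)

# v is built from two columns of H_min (hypothetical indices).
idx = [1, 4]
v = H_min[:, idx]

# Arbitrary symmetric 2 x 2 block.
h = rng.normal(size=(2, 2))
h = (h + h.T) / 2

H_ts = np.block([[H_min, v], [v.T, h]])

# Subtract rows idx of the upper blocks from the lower blocks.
lower_left = v.T - H_min[idx, :]                 # vanishes identically
h_reduced = h - v[idx, :]                        # equals h - H_min[idx][:, idx]

det_full = np.linalg.det(H_ts)
det_factorized = np.linalg.det(H_min) * np.linalg.det(h_reduced)
```

Here `lower_left` is the zero matrix and `det_full` agrees with `det_factorized`, so the sign of the full determinant is fixed by the sign of the reduced $2\times 2$ determinant.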

Appendix C: Counting of unique minima
The number of minima found in the initialization graph construction presented in the main text naïvely scales as $N_{\min}(p) = 2^{p-1} p!$. This follows from our recursive construction: each local minimum of QAOA$_p$ is used to construct $p+1$ symmetric TSs, and for each TS we then find two new minima of QAOA$_{p+1}$, see Figs. 1 and 2. This factorial growth is, however, only sustained if every TS produces two new minima that are all distinct from each other. Numerically, we find that this is not the case and that the number of unique minima is significantly smaller. The increase in the number of unique minima is consistent with an exponential dependence proportional to $e^{\kappa p}$ [we find that $N_{\min}(p)$ can be approximated as $N_{\min}(p) \approx 0.19\, e^{0.98 p}$]. However, the limited range of $p$ does not allow us to completely rule out factorial growth, see Fig. 5. The much smaller number of unique minima, compared to the naïve counting, demonstrates that different TSs often lead to similar minima, as illustrated in Fig. 4.
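The two growth laws can be compared directly. The short sketch below (function names are illustrative; the fit coefficients 0.19 and 0.98 are the values quoted in the text) evaluates the naïve count, checks it against the recursion $N_{\min}(p+1) = 2(p+1)\,N_{\min}(p)$ with $N_{\min}(1) = 1$, and prints it next to the exponential fit.

```python
import math

def naive_minima_count(p):
    """Naive count 2^(p-1) * p! from the recursive construction."""
    return 2 ** (p - 1) * math.factorial(p)

def recursive_count(p):
    """Same count via N(p+1) = 2 (p+1) N(p): p+1 symmetric TSs, two minima each."""
    count = 1
    for depth in range(1, p):
        count *= 2 * (depth + 1)
    return count

def fitted_minima_count(p, prefactor=0.19, rate=0.98):
    """Exponential fit to the number of unique minima quoted in the text."""
    return prefactor * math.exp(rate * p)

for p in range(1, 11):
    print(p, naive_minima_count(p), round(fitted_minima_count(p), 1))
```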

Appendix D: Properties of the index-1 direction
The index-1 direction is the direction of negative curvature at a TS of QAOA$_{p+1}$, which we use to find two new minima of QAOA$_{p+1}$, as illustrated in Fig. 2(a). The index-1 direction is obtained by finding the eigenvector corresponding to the unique negative eigenvalue of the Hessian, $H(\Gamma^{p+1}_{\rm TS})$. Numerically, we showed in Fig. 2(b) that optimization initialized along the $\pm$ index-1 direction either heals or enhances the perturbation introduced by the creation of the TS from the local minimum of QAOA$_p$.
Interestingly, we find that the index-1 vector has dominant components at the positions where zero angles were inserted, as well as at the positions of adjacent angles. In contrast, all other components of the index-1 vector have nearly zero weight, as illustrated in Fig. 6. The large contribution along the component corresponding to the zero insertion can be physically motivated by the fact that the gate with the zero parameter initially has no effect on driving the initial state $|+\rangle^{\otimes n}$ towards the ground state of $H_C$. Hence, the energy can be lowered by "switching on" the action of this gate, i.e., by moving the value of the corresponding variational angle away from zero.
Interestingly, we see that the neighboring gates with nonzero parameters are also changed along the index-1 direction. The next-nearest neighboring gates do not appear to be involved in this process. We note that this numerical observation allows one to guess the index-1 direction a priori, without having to diagonalize the Hessian $H(\Gamma^{p+1}_{\rm TS})$. This may be useful for the practical implementation of our initialization on available quantum computers.
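For reference, the explicit route via diagonalization can be sketched in a few lines. The helper names (`index1_direction`, `perturbed_starts`), the perturbation size `eps`, and the toy Hessian below are assumptions for illustration; only the use of the eigenvector of the unique negative eigenvalue and the two $\pm$ starting points follows the text.

```python
import numpy as np

def index1_direction(hessian):
    """Unit eigenvector belonging to the unique negative eigenvalue of a symmetric Hessian."""
    evals, evecs = np.linalg.eigh(hessian)        # eigenvalues in ascending order
    if not (evals[0] < 0 < evals[1]):
        raise ValueError("expected exactly one negative eigenvalue")
    return evecs[:, 0]

def perturbed_starts(gamma_ts, direction, eps=0.05):
    """Two starting points displaced from the TS along +/- the index-1 direction."""
    gamma_ts = np.asarray(gamma_ts, dtype=float)
    return gamma_ts + eps * direction, gamma_ts - eps * direction

# Toy Hessian with a single negative eigenvalue along the second coordinate.
toy_hessian = np.diag([2.0, -1.0, 3.0])
v = index1_direction(toy_hessian)
start_plus, start_minus = perturbed_starts(np.zeros(3), v)
```

The overall sign of `v` is arbitrary, which is why both `start_plus` and `start_minus` are used as initial points for the subsequent optimization.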

Appendix E: Description of GREEDY algorithm
In the following, we provide a detailed description of the GREEDY QAOA initialization, as well as the subroutines required to implement the algorithm. To this end, we first provide pseudo-code for a gradient-based QAOA parameter optimization routine. The algorithm is a so-called variational hybrid algorithm, which means that the quantum computer is used in a closed feedback loop with a classical computer. The quantum computer implements a variational state and measures observables, while the classical computer keeps track of the variational parameters and updates them in order to minimize the energy expectation value.

    Update (β, γ) using gradient information
    until E(β, γ) has converged
    Return minimum Γ^p_min

For very shallow circuit depths, such as $p = 1$, the optimization landscape is sufficiently low dimensional and simple that global optimization routines can be used to find the optimal parameters. One of the most straightforward global optimization routines is the so-called grid search. There, the parameters are initialized on a dense grid and a parameter optimization routine, such as the QAOA sub-routine, is carried out for each point in the grid. Then, only the lowest-energy local minimum is kept.

    Compute or approximate the index-1 unit vector v for each TS

The index-1 direction $v_i$ can either be found explicitly by diagonalizing the Hessian matrix or by using the heuristic approximation outlined in the previous section. While explicit diagonalization incurs classical computation costs that scale polynomially with $p$, and thus can be done efficiently, the approximation to the index-1 direction is expected to give similar performance of the QAOA subroutine at a lower classical computational cost.
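The two subroutines described above can be sketched classically as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the energy is any Python callable (here a toy two-parameter landscape standing in for $E(\beta, \gamma)$ at $p = 1$), gradients are taken by finite differences instead of being measured on a quantum device, and the function names, step sizes, and grid resolution are hypothetical.

```python
import itertools
import numpy as np

def num_grad(energy, x, h=1e-6):
    """Central finite-difference gradient (stand-in for gradients measured on hardware)."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (energy(x + e) - energy(x - e)) / (2 * h)
    return g

def qaoa_subroutine(energy, x0, step=0.5, tol=1e-6, max_iter=2000):
    """Gradient descent with backtracking; the energy never increases along the run."""
    x = np.asarray(x0, dtype=float)
    e_x = energy(x)
    for _ in range(max_iter):
        g = num_grad(energy, x)
        if np.linalg.norm(g) < tol:            # converged to a stationary point
            break
        t = step
        while t > 1e-12 and energy(x - t * g) >= e_x:
            t /= 2                             # shrink the step until the energy decreases
        if t <= 1e-12:
            break
        x = x - t * g
        e_x = energy(x)
    return x, e_x

def grid_search(energy, points_per_axis=8, dim=2, upper=np.pi):
    """Run the subroutine from every grid point and keep the lowest-energy result."""
    axis = np.linspace(0.0, upper, points_per_axis, endpoint=False)
    runs = [qaoa_subroutine(energy, np.array(x0))
            for x0 in itertools.product(axis, repeat=dim)]
    return min(runs, key=lambda run: run[1])

# Toy landscape with global minimum -1 at (pi/4, pi/2); not a QAOA energy.
toy_energy = lambda x: -np.sin(2 * x[0]) * np.sin(x[1])
x_best, e_best = grid_search(toy_energy)
```

The backtracking rule is one simple way to guarantee a monotonically decreasing energy; any other gradient-based optimizer could be substituted for `qaoa_subroutine`.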
Appendix F: Additional graph ensembles and system size scaling
In the main text, we numerically investigated the performance of our method on random 3-regular graphs (RRG3) with system size $n = 10$. In the following, we present results for larger system sizes as well as two more graph types: weighted random 3-regular graphs (RWRG3), where the Hamiltonian is given by $H_C = \sum_{\langle i,j\rangle \in E} w_{ij}\, \sigma^z_i \sigma^z_j$ and $w_{ij}$ are random weights $w_{ij} \in [0, 1)$, as well as random Erdős-Rényi graphs (RERG) with edge probability $p_E = 0.5$.
Fig. 8 shows the performance comparison between Greedy, TQA, and Interp on RWRG3 and RERG. We can see that for RWRG3 the performance of the three methods is comparable, while for RERG TQA performs worse than the other two methods. Greedy and Interp yield (nearly) the same performance for both graph ensembles on the system size that we considered ($n = 10$).
Fig. 9 compares the performance for RRG3 with different system sizes. Interp and Greedy yield very similar performance for smaller system sizes ($n = 8$, indicated by light color), while they yield the same performance for larger system sizes ($n = 16$, indicated by dark color). TQA performs slightly worse than Greedy and Interp for all system sizes considered. We can furthermore see that the gain in performance from every additional layer decreases for bigger system sizes. This is due to the fact that, in order for the QAOA to "see" the whole graph, a circuit depth $p$ scaling as $p \sim \log n$ is required [14].

Figure 1. (a) Circuit diagram that implements the QAOA ansatz state with circuit depth $p$, see Eq. (1). Gray boxes indicate the identity gates that are inserted when constructing a TS, as indicated in Theorem 1. (b) Local minima $\Gamma^{p}_{\min}$ of QAOA$_p$ generate a TS $\Gamma^{p+1}_{\rm TS}$ for QAOA$_{p+1}$ that connects to two new local minima, $\Gamma^{p+1}_{\min_{1,2}}$, with lower energy.

Figure 2. Initialization graph for the QAOA for the MaxCut problem on a particular instance of RRG3 with $n = 10$ vertices (inset). For each local minimum of QAOA$_p$ we generate $p+1$ TSs for QAOA$_{p+1}$, find the corresponding minima as in Fig. 1(b), and show them on the plot connected by an edge to the original minimum of QAOA$_p$. The position along the vertical axis quantifies the performance of QAOA via the approximation ratio, and points are displaced on the horizontal axis for clarity. Color encodes the depth of the QAOA circuit, and large symbols along with the red dashed line indicate the path that is taken by the Greedy procedure, which keeps the best minimum for any given $p$, resulting in an exponential improvement of the performance with $p$. The Greedy minimum coincides with an estimate of the global minimum for $p = 6$ (dashed line) obtained by choosing the best minimum from $2^p$ initializations on a regular grid.

Figure 3. Performance comparison between different QAOA initialization strategies used for avoiding low-quality local minima. The Greedy approach proposed in this work yields the same performance as Interp [3] and slightly outperforms TQA [4] at large $p$. Global refers to the best minimum found out of $2^p$ initializations on a regular grid. Data is averaged over 19 non-isomorphic RRG3 with $n = 10$; shading indicates the standard deviation. System size scaling for up to $n = 16$ and a performance comparison for different graph ensembles can be found in Appendix F.
Figure 4. (a) Cartoon of the descent from two different TSs of QAOA$_{p+1}$, generated from a QAOA$_p$ minimum with a smooth pattern, which leads to the same new smooth-pattern minimum of QAOA$_{p+1}$, also reached from the Interp [3] initialization. Two additional non-smooth local minima typically have higher energy. (b) The corresponding initial and converged parameter patterns for the RRG3 graph shown in Fig. 2 for $p = 10$.

Figure 5. Number of minima found in the initialization graph in Fig. 2 with system size $n = 10$. The orange line describes a naïve counting argument ($2^{p-1} p!$), while the blue line shows the actual number of distinct minima, which can be approximated as $0.19\, e^{0.98 p}$.
G J Q H M j o 4 g P N U O n 1 g 9 7 X 6 d u T N a w = " > A A A C B H i c b V D L S s N A F J 3 4 r P U V d d n N Y B H c W J J S 1 G W p m y 4 r 2 A e 0 N U y m N + 3 Q m S T M T I Q S s n D j r 7 h x o Y h b P 8 K d f + P 0 s d D W A x c O 5 9 z L v f f 4 M W d K O 8 6 3 t b a + s b m 1 n d v J 7 + 7 t H x z a R 8 c t F S W S Q p N G P J I d n y j g L I S m Z p p D J 5 Z A h M + h 7 Y 9 v p n 7 7 A a R i U X i n J z H 0 B R m G L G C U a C N 5 d g H u 0 4 u e I H o k R c o y 3 P N B E 6 + M 6 1 4 t 8 + y 1 p P 1 Y r 1 b H / P W N W s x c 4 L + w P r 8 A e x x l 6 I = < / l a t e x i t > e i 2 H B < l a t e x i t s h a 1 _ b a s e 6 4 = " K S F 3 X r N Z q D F W m o 9 P L 7 s b u t r 8 c 4 A = " > A A A C B H i c b V A 9 S w N B E N 3 z M 8 a v q G W a x S D Y G O 4 0 q G W I T c o I 5 g O S e O x t J s m S 3 b t j d 0 8 I x x U 2 / h U b C 0 V s / R F 2 / h v 3 k i s 0 8 c H A 4 7 0 Z Z u Z 5 I W d K 2 / a 3 t b K 6 t r 6 x m d v K b + / s 7 u 0 X D g 5 b K o g k h S Y N e C A 7 H l H A m Q 9 N z T S H T i i B C I 9 D 2 5 v c p H 7 7 A a R i g X + n p y H 0 B R n 5 b M g o 0 U Z y C 0 W 4 j 8 9 6 g u i x F D F L c M 8 D T d w L X H d r i V s o 2 W V 7 B r x M n I y U U I a G W / j q D d j 5 o e R B p / O F w 0 j j n W A 0 0 T w g E m g m k 8 N I V Q y c y u m Y y I J 1 S a 3 v A n B W X x 5 m b T O y 8 5 l u X J b K V V r W R w 5 V E T H 6 B Q 5 6 A p V U R 0 1 U B N R 9 I i e 0 S t 6 s 5 6 s F + v d + p i 3 r l j Z z B H 6 A + v z B + 3 6 l 6 M = < / l a t e x i t > e i 3 H B < l a t e x i t s h a 1 _ b a s e 6 4 = " Figure 6.(a) Illustration of the circuit implementing the QAOA at a TS.Gray gates correspond to the zero insertion.The index-1 direction has mainly weight at the position of the zeros as well as the two adjacent gates.(b) Numerical example of the index-1 vector and the QAOA parameter pattern at the TS.Arrows correspond to the magnitude and sign of the entries in the index-1 direction.Only entries at β1, β2, γ2 and γ3 have a large magnitude, all 
other entries are nearly zero.
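The zero insertion illustrated in Fig. 6(a) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names, the 0-based indexing, and the reading of "symmetric" as inserting the zero at the same layer index in both the $\beta$ and $\gamma$ lists are assumptions made here. Because a gate with zero angle is the identity, the padded $(p+1)$-layer circuit prepares the same state as the $p$-layer circuit it was built from.

```python
from typing import List, Sequence, Tuple

Params = Tuple[List[float], List[float]]


def insert_zero_layer(
    betas: Sequence[float], gammas: Sequence[float], i: int, j: int
) -> Params:
    """Pad a depth-p parameter set to depth p + 1.

    A zero is inserted before position i of the betas and before
    position j of the gammas (0-based, hypothetical convention).
    """
    new_betas = list(betas[:i]) + [0.0] + list(betas[i:])
    new_gammas = list(gammas[:j]) + [0.0] + list(gammas[j:])
    return new_betas, new_gammas


def symmetric_candidates(
    betas: Sequence[float], gammas: Sequence[float]
) -> List[Params]:
    """Return the p + 1 paddings that use the same index for both lists."""
    p = len(betas)
    return [insert_zero_layer(betas, gammas, i, i) for i in range(p + 1)]
```

For a depth-2 parameter set, `symmetric_candidates` returns three candidate points of depth 3, one for each possible position of the inserted zero layer.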

Algorithm 2: Grid search subroutine
1: Given a circuit depth $p$, construct an evenly spaced grid on the fundamental region, with $i \in [1, p]$ and $j \in [2, p]$
2: Run the QAOA subroutine initialized from each point in the grid
3: Return the local minimum with the lowest energy, $\Gamma^{p}_{\min}$

Using the two subroutines presented above we can provide a detailed pseudo-code for the Greedy QAOA algorithm; see Fig. 7 for a visualization.

Algorithm 3: Greedy QAOA
1: Choose maximum circuit depth $p_{\max}$
2: Choose small offset $\epsilon \ll 1$
3: Grid search for $p = 1$ to find $\Gamma^{p=1}_{\min}$ ▷ See grid search subroutine
4: repeat
5: Construct $p + 1$ symmetric TS $\Gamma^{i}$
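The grid search subroutine can be sketched as follows, under several assumptions that are not part of the original text: `toy_energy` is a hypothetical stand-in for the QAOA energy at $p = 1$, `local_minimize` is a plain finite-difference gradient descent standing in for the QAOA subroutine, and the bounds passed to `grid_search` are placeholders rather than the fundamental region used in the paper.

```python
import itertools
import math
from typing import Callable, List, Sequence, Tuple

Energy = Callable[[Sequence[float]], float]


def toy_energy(x: Sequence[float]) -> float:
    """Hypothetical landscape with several local minima; x = (beta, gamma)."""
    beta, gamma = x
    return math.sin(4 * beta) * math.sin(2 * gamma) + 0.1 * gamma ** 2


def numerical_gradient(energy: Energy, x: Sequence[float], h: float = 1e-6) -> List[float]:
    """Central finite-difference gradient of energy at x."""
    grad = []
    for k in range(len(x)):
        up, dn = list(x), list(x)
        up[k] += h
        dn[k] -= h
        grad.append((energy(up) - energy(dn)) / (2 * h))
    return grad


def local_minimize(
    energy: Energy, x0: Sequence[float], step: float = 0.1, iters: int = 500
) -> Tuple[List[float], float]:
    """Gradient descent that only accepts steps which lower the energy."""
    x, e = list(x0), energy(x0)
    for _ in range(iters):
        grad = numerical_gradient(energy, x)
        trial = [xi - step * gi for xi, gi in zip(x, grad)]
        e_trial = energy(trial)
        if e_trial < e:
            x, e = trial, e_trial
        else:
            step *= 0.5  # shrink the step when no improvement is found
            if step < 1e-10:
                break
    return x, e


def grid_search(
    energy: Energy, bounds: Sequence[Tuple[float, float]], points_per_axis: int = 5
) -> Tuple[List[float], float]:
    """Run the local optimizer from every grid point and keep the best result."""
    if points_per_axis < 2:
        raise ValueError("points_per_axis must be at least 2")
    axes = [
        [lo + (hi - lo) * k / (points_per_axis - 1) for k in range(points_per_axis)]
        for lo, hi in bounds
    ]
    best = None
    for start in itertools.product(*axes):
        x, e = local_minimize(energy, start)
        if best is None or e < best[1]:
            best = (x, e)
    return best


# Placeholder bounds for (beta, gamma); not the region used in the paper.
example_bounds = [(-math.pi / 4, math.pi / 4), (-math.pi / 2, math.pi / 2)]
best_params, best_energy = grid_search(toy_energy, example_bounds)
```

Since every accepted step lowers the energy, the returned value is never higher than the energy at any grid point. The descent in this sketch is unconstrained, so an optimized point may leave the grid region; a real implementation would replace `toy_energy` with an evaluation of the QAOA cost function.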

Figure 7. Flow diagram visualizing the Greedy QAOA initialization algorithm presented in Algorithm 3.

Figure 9. System size scaling for performance comparison on RRG3. Color shade indicates system size: light color is $n = 8$ and dark color is $n = 16$, with the system size changing in steps of two between those values. Data is averaged over 19 non-isomorphic RRG3 graphs.