Non-ergodic delocalized states for eﬃcient population transfer within a narrow band of the energy landscape

,


I. INTRODUCTION
The idea to use quantum computers for the solution of search and discreet optimization problems has been actively pursued for decades, mostly notably in connection to Grover's algorithm [1], quantum annealing [2][3][4][5][6][7][8][9][10], and more recently, quantum approximate optimization [11].Quantum tunneling of collective spin excitations was proposed and studied experimentally as a mechanism for moving between states in the energy landscape that can lead to shorter transition time scales compared to classical Simulated Annealing approaches under certain conditions [4].Experimental evidence of the faster time scales was later corroborated numerically using an imaginary-time Quantum Monte Carlo (QMC) algorithm [12,13].Furthermore, recent studies [14,15] have shown that in QMC, the tunneling corresponds to the Kramers escape through the free-energy barrier in an extended spin system that includes spin replicas in an imaginary time direction.As a result, the incoherent quantum tunneling rate does not have a scaling advantage over such a QMC simulation.This happens because incoherent tunneling dynamics corresponded to sequential transitions connecting individual minima, where each transition is dominated by a single tunneling path [14].In this paper we explore the qualitatively different tunneling dynamics where a large number of tunneling paths interfere constructively, giving rise to "minibands" of the non-ergodic many-body states delocalized in the computational basis (i.e. in the Fock space).We demonstrate that the transport within the minibnads can be used for efficient quantum search in spin glass problems.
To describe the search task we start from the binary optimization problem where the goal is to find the minimum of a classical energy function, E(z), defined over the set of 2 n configurations of n bits (bit-strings) z = (z 1 , z 2 , . . ., z n ) where z k = {0, 1}.In quantum algorithms E(z) is typically encoded in an n-qubit Hamiltonian diagonal in the basis of states |z called the computational basis.Hard optimization problems have their counterparts in spin glass models of statistical physics [16,17].The energy function of a hard optimization problem is characterized by a large number of spurious local minima.Low-energy minima can be separated by a large Hamming distance (number of bit flips transforming one to another).Such landscape gives rise to an interesting computational primitive: given an initial bit-string z j with atypically low energy, we wish to produce other bit-strings with energies in a narrow range ∆E cl around the initial one.In general, this can be a difficult search problem if the number of bit-strings of interest is exponentially small compared to 2 n .
Inspired by the Hamiltonian-based approaches to quantum search [18] and optimization [2][3][4] we propose the following quantum population transfer (PT) protocol: first preparing the system in a computational state |z j with classical energy E(z j ), we then evolve it under the Hamiltonian without fine-tuning the evolution time nor the strength of the time-independent transverse field B ⊥ .At the final moment we projectively measure in the computational basis and check if the outcome z is a "solution", i.e., z = z j and the energy E(z) is inside the window ∆E cl .The second term in the Hamiltonian (2) proportional to B ⊥ is responsible for the PT.It is usually referred to as a "driver Hamiltonian" in the Quantum Annealing literature [3].
We note that the output of PT z can be used as an input of a classical optimization heuristic such as simulated annealing or parallel tempering in a "hybrid" optimization algorithm [19] where quantum and classical steps can be used sequentially to gain the complementary advantages of both [20].
For random optimization problems diagonal matrix elements E(z) of the Hamiltonian (1) correspond to a problem instance sampled from a particular statistical ensemble.Since off-diagonal matrix elements connect states separated by one bit-flip, Eq. ( 2) describes the Hamiltonian of the tight-binding model with diagonal disorder.The underlying lattice for this model is Boolean hypercube [21] where individual sites correspond to bitstrings.The model (2) can be viewed as a generalization of the Anderson model initially introduced in the context of transport in finite dimensional lattices [22].In this model, as well as in the original Anderson model, there exist bands of localized and extended states separated in  2) separated by 2B ⊥ .A narrow impurity band of width W B ⊥ is marked in light green.The sequence of short black lines depicts the energies of marked states E(zi).Dashed lines depict the elementary path to leading order perturbation theory in B ⊥ for the tunneling matrix element cij(E) given in (16).In this paper we focus on the case of relatively large transverse fields B ⊥ > 1 so that the IB energies lie above the ground state of the total Hamiltonian (2) that corresponds to nearly all qubits polarized in x direction.
energy by a so-called "mobility edge".Originally, extensions of Anderson model appeared in a variety of manybody problems in condensed matter physics [23,24] giving rise to the concept of many-body localization (MBL).It was demonstrated in Ref. [21] that MBL is responsible for the failure of Quantum Annealing to find a solution of the constraint satisfaction problem (although, the detailed analysis of this effect is still needed [7,25]).
In models of quantum spin glasses the existence of the two types of eigenstates and the mobility edge were studied in Refs.[5,26,27].We expect the Andreson models on Boolean hypercube have an intermediate phase characterized by the onset of non-ergodic delocalized states forming narrow minibands.Such a phase has been observed in tight binding models on Random Regular [28] and fully connected graphs [29].
For spin glass models (2) with B ⊥ below the quantum spin glass transition, the probability density function (PDF) of the eigenvalues E β of H is localized around the mean classical energy, with an exponentially decaying tail reaching towards the low energy states.We choose the interval of energies ∆E cl to be at the tail of the distribution, of E β but sufficiently far from the ground state so that the typical spacing of eigenvalues is exponentially small in n.Under these conditions classical states in-side the energy window ∆E cl are located near deep local minima of the classical energy landscape E(z).Hamming distances between the minima scale with n and the number of them is exponentially small compared to 2 n yet still exponentially large in n.
In this paper we apply the PT protocol with Hamiltonian (2) to a simple yet nontrivial model of E(z) with the properties mentioned above Here M 1 marked states |z j (n-bit-strings z j ) are chosen uniformly at random from all bit-strings of length n, with energies E(z j ) independently distributed around −n within a narrow band of width W B ⊥ .All other states z have energies E(z) = 0 and are separated by a large gap n from the very narrow band of marked states (see Fig. 1).This model is inspired by the impurity band model in doped semiconductors [30].It also corresponds to a classical unstructured search problem with multiple marked states.
We provide a detailed description of the PT dynamics in the above model by developing a microscopic analytical theory of the "minibands" of non-ergodic delocalized states [31].We derived an effective downfolded Hamiltonian in the energy strip associated with the PT.Its matrix elements correspond to the tunneling between the deep local minima and obey the heavy-tailed statistics.The ensemble of downfolded Hamiltonians for PT corresponds to the preferred basis Levi matrices (PBLM).We use the cavity method for Levi matrices [32][33][34][35][36][37] to find analytically the fractal dimension of the delocalized minibands, and the probability distribution of their spectral width.This allowed us to find the probability distribution and the scaling with n of the PT times.
It is crucial that the dynamics within the IB of the model (3) in the transverse field can be non-ergodic yet delocalized in computational basis.The model is by no means unique from this point of view.We believe that the extended but non-ergodic quantum states exist for quantum extensions of any problem Hamiltonian, which is characterized by a classical spin glass behavior: for Random Energy Model [38], Sherrington-Kirkpatrick model [39], p-spin model [40], K-Satisfiability [41], etc.
Indeed, the main difference between classical and quantum spin-glass models is the existence of the manybody localized (MBL) phase in the latter case.However we see no reason to expect a direct transition between the MBL and ergodic phases without intermediate nonergodic phase similar to the case of ordinary Anderson localization in finite-dimensional space.This difference is due to the fact that the number of relevant bit-strings at a given Hamming distance d from a given one increases for spin-glass models exponentially with d, or even quicker, whereas for finite-dimensional models this increase is only polynomial.
A key challenge in developing a theory of non-ergodic delocalized phase for quantum spin glass models is the calculation of the statistics of the tunneling matrix elements between deep local minima separated by large Hamming distances d.We derived analytically its dependence on the transverse field B ⊥ and Hamming distance d using WKB theory of collective spin tunneling in asymptotic limit of large n.We demonstrated that in the delocalized phase it is qualitatively different from that given by the leading order perturbation theory in B ⊥ , known as a forward scattering approximation (FSA) that has been previously used in these problems [26,[42][43][44].As a consequence, our results for the scaling of the PT time with n and the structure of the delocalized eigenstates in IB model are qualitatively different from the FSA predictions.
In model (3), the most efficient classical algorithm is purely random search with running time ∼ 2 n .We find that the typical runtime of the PT algorithm t PT displays the following scaling dependence on n Here Ω 1 is the number of computational basis states within the target window of energies that contribute with comparable probabilities to the quantum state at the end of PT.The expression applies in the range of transverse fields n 1/2 B ⊥ − 1=O(1) (for arbitrary B ⊥ see Eq. ( 31)).
The dependence of t PT on Ω is the same as in the multi-target Grover quantum algorithm that searches for Ω marked states starting from the fully-symmetric state |S = 2 −n/2 z |z .In the Hamiltonian version of this algorithm [18], one uses the projector to |S as a driver, H D = w |S S|.This algorithm is proven to be optimal for problems without structure.We emphasize that according to Eq. ( 4) the exponential scaling of t PT with n differs from that in the Grover algorithm by a term ∼ B −2 ⊥ that can be made arbitrary small at sufficiently large transverse fields.
PT algorithm is qualitatively different from the quantum annealing, adiabatic optimization and Hamiltonian implementation of Grover search because it exploits the structure of the excited energy spectrum.The PT Hamiltonian H (2) is non-integrable and its eigenstates are delocalized in the low-energy manifold.
In analytically tractable example considered here the PT algorithm has new and potentially advantageous features compared to the Grover algorithm whose Hamiltonian is integrable and all of its eigenstates but one are localized.Therefore the quantum evolution resulting from the Grover Hamiltonian cannot form a massive superposition of Ω 1 solutions if it starts from a computational basis state.The algorithm must always start from the state |S .Moreover, Grover's algorithm performance is exponentially sensitive to fine-tuning of the weight of the driver w on the scale δw ∼ 2 −n/2 √ Ω.In contrast, the scaling of the runtime of PT (4) with n is robust to the choice of B ⊥ that can take on a broad range of values for The nearly optimal (Grover-like) performance of the PT protocol is the consequence of the asymptomatic orthogonality between the eigenstates in the marked state subspace to the rest of the Hilbert space.This suppresses the population transport from the marked states to the O(2 n ) of states |z with classical energies E z = 0 even at large B ⊥ .Such "orthogonality catastrophe" cannot be obtained within the perturbative in B ⊥ approach such as FSA.
The paper is organized as follows.Sec.II contains a qualitative discussion of the main results.In Sec.III we develop a down-folding procedure to reduce the original problem to the nonlinear eigenproblem in the marked state subspace.In Sec.IV we calculate the off-diagonal (tunneling) matrix elements of the down-folded Hamiltonian and studied their dependence on n and Hamming distance using Wentzel-Kramers-Brillouin (WKB) theory.In Sec.V we develop an expansion of the nonlinear eigenproblem near the center of the IB shifted by transverse field and obtain the effective Hamiltonian H of the PT problem.In Sec.VI we study the statistical ensemble of Hamiltonians H . Sec. VII discusses numerical results.In Sec.VIII we study the PT within the Born approximation.In Sec.IX we estimate the number of states in the miniband.In Sec.X we provide an overview of the cavity method for dense random matrices.In Sec.XI we solve the cavity equations and obtain the distributions of the real and imaginary parts of self-energy.In Sec.XII we discuss the complexity of PT algorithm.In Sec.XIII we provide a comparison between PT and Grover's algorithm with multiple target states and systematic errors in oracle phase and driver weight.In Sec.XIV we provide a summary and concluding remarks.

II. QULATITATIVE DISCUSSION OF RESULTS
Each marked state |z j is a deep local minimum of E(z) separated from other minima by a typical Hamming distance n/2 while the separation from the nearest market state is also extensive d min =O(n) for M =2 µn and µ < 1.
The transverse field B ⊥ gives rise to multiqubit tunneling between the states.The tunneling amplitudes from a given minimum to its neighbors located at a Hamming distance d decrease exponentially with d while the number of neighbors increases exponentially with d for d = O(n).As a result, an eigenstate |ψ β of H associated with the impurity band can become delocalized over a large subset of marked states S β with size 1 |S β | ∝ M α and 0 < α ≤ 1.For α = 0 the eigenstate |ψ β is localized, for α = 1 the eigenstate is delocalized in the entier space of marked states.For 0 < α < 1 the eigenstate can be considered "non-ergodic" and its support set S β is sparse in the space of the marked states.
We express the transition probability from |z j to |z , in terms of the eigenstates and corresponding eigenvalues of H, where H |ψ β = E β |ψ β .In the delocalized phase, for a given state |z j there exists a large set of eigenstates |ψ β that have peaks at |z j .These eigenstates possess important properties [28,29,45]: they have largely overlapping supports ∩ β S β ≈ S (z j ), and they are close in energy thus forming a narrow mini-band.The mini-band width Γ may be interpreted as the inverse scrambling time and determines the width of the plateau in the Fourier-transform of the typical transition probability P (ω, z|z j ) [29]. 1 In other words, the significant PT of P (t, z|z j ) from the initial marked state |z j ∈ S into the other states of the same miniband S occurs over the time t PT ∼ 1/Γ.The window ∆E cl is related to the miniband width Γ.
Understanding the properties of non-ergodic delocalized states is crucial for describing the dynamics of quantum spin glasses driven by many-body coherent tunneling processes.Developing its microscopic theory is a challenging problem.This paper studies the transport problem in an "impurity band" (IB) model ( 3) by making use of the down-folded Hamiltonian in the marked state subspace derived in Secs.III, V.While the original Hamiltonian (2) is sparse in the basis of states |z (it couples only states separated by Hamming distance 1), the down-folded Hamiltonian H (38) is a dense M × M matrix.
The transverse field leads to a uniform shift ∼ B2 ⊥ of the marked state energies as shown in Sec.V, (33)- (34).Diagonal elements of H ii are given by the marked state energies counted off from the center of the shifted impurity band.Their PDF is assumed to be exponentially bounded with some width W .
Each pair of marked states is coupled via multi-qubit tunneling.The off-diagonal matrix elements H ij = V (d ij ) cos φ(d ij ) are completely determined by the Hamming distance d ij between the marked states z i and z j .The amplitude V (d) decays steeply with d, inversely proportional to a square root of n d (see Eq. ( 39)).The phase φ shown in Fig. 5 monotonically increases by O(1) when d is changed by 1.In the analysis of spectral properties of H ij the quantity cos φ(d ij ) can be replaced by a random sign.The explicit form of V (d) and phi(d) is obtained using WKB theory of collective spin tunneling.At B ⊥ > 1 the tunneling paths correspond to long spin-flip sequences connecting the initial and final states.They include many loops passing through the the states with Ez = 0 that are neglected in FSA.
The typical matrix element between the two marked states is V typ ∼ n 2 2 −n/2 e −n/(4B 2 ⊥ ) .The typical matrix element between a given marked state and its nearest neighbor is also exponentially small in n but it is exponentially larger than the value V typ .This fact corresponds to a strong hierarchy of the off-diagonal matrix elements of H ij which is a signature of their heavy-tailed probability density function [33,37].Such matrices are called Levi matrices.
The PDF of the rescaled squared amplitudes The particular form of scaling is the direct consequence of the fact that our problem has no "structure": the tunneling matrix elements depend only on Hamming distance and marked states are chosen at random.
The key difference of the ensemble of matrices H ij from Levy matrices studied in the literature [33][34][35]37] is that the dispersion, W , of the diagonal matrix elements is much larger than the typical magnitude of the off-diagonal elements V typ .Therefore H ij can be called preferred basis Levi matrices (PBLM).
We note that the existence of heavy tails in the PDF of the off-diagonal matrix elements of the down-folded Hamiltonian H is due to the infinite dimension of the Hilbert space of the original problem (2) for n → ∞.This happens because the exponential decay of the matrix elements with the Hamming distance d is compensated by the exponential growth of the number of states at the distance d from a given state.We believe that the PBLM structure is a generic feature of the effective Hamiltonians for PT at the tail of the density of states in quantum spin glass problems.
Unlike the standard Levi ensemble, the eigenstates of PBLM allow for the existence of non-ergodic delocalized states when the width W is much bigger than the largest off-diagonal matrix element in a typical row of H ij and much smaller than the the largest off-diagonal element in a matrix For smaller dispersion W V typ M 1/2 the matrix eigenstates are ergodic while for W V typ M the eigenstates are localized.Such phase diagram resembles the one in the Rosenzweig-Porter (RP) model [29,36].The difference of RP from PBLM is that the statistics of the offdiagonal matrix elements in the RP ensemble are Gaussian [46] rather than polynomial (6).In this paper we will focus on exploring PT transfer within the non-ergodic delocalized phase, which is more likely to generalize to other models.We note that the localized phase does not support population transfer.
Because of the PBLM structure of the Hamiltonian H one can expect that the runtime of the PT protocol t PT will have a heavy-tailed PDF whose form is of practical interest.It is closely related to the PDF of the miniband widths Γ ∼ 1/t PT .We obtained the PDF(Γ) by making use of the cavity method for random symmetric matrices [32,33,35,47].
In previous work the cavity equations were solved only in their linearized form, i.e., near the localization transition.We were able to solve fully nonlinear cavity equations in the delocalized non-ergodic phase.We obtained the boundaries of the phase in terms of the ratio of W/V typ and also the form of P(Γ) inside the phase.It is given by the alpha-stable Levi distribution [33,48] with the tail index 1, most probable value Γ typ = V typ (πΩ log Ω/4) 1/2 , and characteristic dispersion πΓ typ /(4 log Ω) where Ω is the typical number of states in the miniband.This number Ω = (πM V typ /W ) 2 is a square function of the ratio of the typical tunneling matrix element V typ to the level separation W/M .In a non-ergodic delocalized phase M Ω 1 and the typical PT time t PT ∼ 1/Γ typ obeys the condition We build on the observations made in the IB model and provide qualitative arguments that PT will have a quadratic speed up over QMC in some quantum search problems where tunneling is a computational bottleneck.

III. DOWNFOLDING INTO THE SUBSPACE OF THE MARKED STATES AND NONLINEAR EIGENPROBLEM
The driver Hamiltonian H D in Eq (2) connects bitstrings that are separated by a Hamming distance d=1.We note that, on one hand, marked states are separated by large Hamming distances d ij with typical value d = n/2.Therefore a pair of marked states |i and |j is coupled by elementary spin-flip processes corresponding to high orders (H D ) k of the driver Hamiltonian with k ≥ d ij .On the other hand, the resolvent of the driver Hamiltonian connects directly every pair of marked states.Furthermore, because H D is invariant under permutations of bits, the matrix elements G ij (E) = z i | G(E) |z j depend only on the Hamming distance d ij between the corresponding states.They are exponentially small in n for extensive d ij = O(n).Therefore, one might expect that under certain conditions the quantum evolution stays approximately confined to the marked state subspace and can be naturally described by the downfolded Hamiltonian whose M × M matrix representation is dense in the basis of marked states.
We use the identity where E and |ψ is an eigenvalue and the corresponding eigenvector of H.We introduce a new vector that has no support in the subspace orthogonal to that of marked states.Then, multiplying both parts of equation ( 10) by √ H cl , we obtain after simple transformations where The operator Λ plays the role of a "driver Hamiltonian" in the downfolded picture, and it couples states in the marked subspace.
Equation ( 12) can be written in matrix form (see Appendix, Sec.A for details) where Here δ ij is the Kronecker delta and is a coupling coefficient that depends only on a Hamming distance |z i − z j | between the bit-strings z i and z j .
We note that ( 14) has the form of a nonlinear eigenproblem.A solution of ( 14) for the M -dimensional vector |A with nonzero norm requires where where This condition along with Eqs. ( 14)-( 17) completely defines the eigenvector projections onto the marked state subspace and the corresponding eigenvalues.
We observe that there are exactly M roots E β of (17) that originate from M classical energies of the marked states E(z j ) at B ⊥ = 0.These eigenvalues and the corresponding eigenstates will be the sole focus of our study.
Here we just mention briefly that the rest of the states originate in the limit H cl → 0 from the eigenstates of the driver Hamiltonian whose energy levels −B ⊥ (n − 2m) (shown in Fig. 1) correspond to the total spin-x projections n−2m ∈ [−n, n].The levels −B ⊥ (n−2m) have degeneracy n m , which is partially lifted due to the coupling to the impurity band with M states.The splitting of the driver energy levels −B ⊥ (n − 2m) increases as a function of transverse field in the vicinity of "resonances" with the levels of the impurity band where B ⊥ (n − 2m) ≈ −n for integer values of m.At resonance, the eigenstates of the driver with total spin-x projection n−2m are strongly hybridized with the marked states |z j .As will be discussed below, the width of the resonances remains exponentially small in n for B ⊥ = O(n 0 ).In Fig. 6 we plot the evolution of the energy spectrum of the Hamiltonian H as a function of transverse field for the case of two impurity states M = 2.

IV. COUPLING COEFFICIENTS IN THE DOWNFOLDED HAMILTONIAN
The coupling coefficient c ij (E) ≡ c(E, d ij ) for i = j determines the off-diagonal matrix element of the downfolded Hamiltonian (15) corresponding to the tunneling transition that connects marked states |z i and |z j .In the IB model, the tunneling matrix element depends only on the Hamming distance d ij between the states.It can be calculated in explicit form from Eq. (16).For this we use the basis of eigenstates |x of the driver Hamiltonian H D |x = H x D |x in Eq. (16).They correspond to bitstrings x = (x 1 , . . ., x n ) of individual qubits polarized in positive x a = 0 and negative x a = 1 direction of the x axis.The eigenvalues of the driver H x D = −B ⊥ (n − 2h x ) depend only on the Hamming weight of the bit-strings x.Therefore one can perform explicitly the partial summation over basis vectors |x in (16) under the conditions that a x a = k for all bit positions a such that z a j = z a i , and a x a = l for all a where z a j = z a i .Finally the result (16) can be written as a double sum over k ∈ (0, n − d ij ) Here d ij is the Hamming distance between bit-strings z i and z j .Plots of coupling coefficients as a function of Hamming distance d based on (20) are given in Fig. 2 (20) is quite involved and is not suitable for the study of the asymptotic properties of the population transfer in the limit of large n.
For a very weak transverse field B ⊥ n −1/2 using perturbation theory in B ⊥ to the leading order one can obtain a standard expression [26] for the coupling coefficient, |c(E, d)| d !(B ⊥ /n) d .It is given by the sum of the transition amplitudes over the d! shortest paths between the states |z i and |z j separated by a Hamming distance d.Intermediate states |z along each path correspond to E(z) = 0 while energies of initial and final states are −n (see Fig. 1).
For larger transverse field values (but still B ⊥ 1) the perturbative expression in the small-B ⊥ limit can be modified to include the range of In that range One can see that for small B ⊥ matrix element falls down with d extremely steeply despite the presence of the factorial factor d ! in (21).We note that this perturbation (FSA) expression is qualitatively valid in the range It gives a correct leading order form of the mobility edge in quantum REM [26,[42][43][44] at small B ⊥ 1.For transverse field, B ⊥ > |E|/n, the dependence of c ij (E) on d ij changes qualitatively.It becomes nonmonotonic, reaching its minimum at the point n/2 of minimum overlap between the bit-strings z i and z j .In a certain region around the minimum it has oscillatory behavior, as seen in Fig. 2. The boundary of this region is shown with black dots.The details of the behavior in the oscillatory region are shown in Fig. 4. The exponential dependence of the envelope of c(E, d) on d is captured by the factor 1/ n d and is independent on the transverse field strength.This region of d and values of B ⊥ > |E|/n are of the most relevance to the transport in non-ergodic minibands which is of central interest to in this paper.

A. WKB calculation of coupling coefficients
In this paper we develop an approach (described in the Appendix B) based on the WKB theory for large spin [49] to calculate the coefficient c(E, d) for n 1 and arbitrary values of transverse fields B ⊥ without relying on perturbation theory in B ⊥ .The coefficient c ij (E) can be expressed in terms of the operator of the total spin-x projection We will utilize the basis of eigenstates |m of the operator S z = n k=1 σ k z corresponding to its eigenvalues m ∈ [−n/2, n/2] and the maximum value of the total spin S = n/2 The state |n/2 − d is a normalized sum of all computational basis states |z with d spins pointing in the negative z direction and n − d spins pointing in the positive Here |z| = n k=1 z k and δ k,d is a Kronecker delta.Because the coefficients c ij (E) (20) depend only on the Hamming distance |z i − z j | between the bit-strings z i and z j , we can assume, without loss of generality, that in Eq. ( 22) one of the bit-strings, e.g., |z j , corresponds to all individual spins pointing in the positive z direction The main observation is that we can pick, instead of the state |z i , any computational basis state |z whose Hamming weight satisfies the condition |z| = |z i | without changing the value of the coefficient Therefore averaging both sides of the Eq. ( 22) over the states |z i that satisfy the condition Here G m, n 2 (E) = m| (E + 2B ⊥ S x ) −1 |n/2 are the matrix elements of the resolvent (9) of the transverse field Hamiltonian H D between the states (24) that belong to a maximum total spin subspace S = n/2.
As will be shown below, for typical instances of the ensemble of Hamiltonians H, the Hamming distance from a randomly selected marked state to its closest neighbor is an extensive quantity O(n).Therefore the above Colored lines show the dependence of the rescaled logarithm of the coupling coefficient n −1 log c 2 (E, d), Eq. ( 20), on the rescaled Hamming distance d/n for n = 400.The energy E is set to the value ⊥ that reflects the overall shift of the impurity band due to the transverse field (cf.(33), (34)).Different colors correspond to different values of the transverse field B ⊥ =1.93 (red), 1.43 (blue), 1.11 (green), 1.01 (brown), 0.99 (purple), 0.95 (gray).The scale along the y-axis suggests that c(E (0) , d) scales exponentially with n for d/n = O(n 0 ).The inset shows the leading order factor in the d-dependence of the coupling coefficient for B ⊥ > |E|/n (cf.(30)).Black dots show the boundaries d = n/2−m0, n/2+m0 of the region of the oscillatory behavior of c(E, d) with d given by WKB theory (29) (see Appendix B for details).
off-diagonal matrix elements of the resolvent can be analyzed in a semiclassical approximation corresponding to S = n/2 1.This approximation for the quantum propagator of a large spin and diagonal elements of the resolvent was considered in [50,51] using the spin coherent state path-integral representation.The analysis in these papers was quite involved because the path-integral formulation requires a careful treatment of the fluctuation determinant and a so-called Solari-Kochetov correction in the action.Also, these results were focused on a general case of large spin Hamiltonian and only considered diagonal elements of the resolvent.Because of this, instead of trying to extend the results in [50,51] to our case, we follow a different path.
The resolvent satisfies the equation where I is the identity operator.We write this equation in the basis of states |m (23).From (9) we obtain In the limit of large n 1 we solve this equation using the discrete Wentzel-Kramers-Brillouin (WKB) approximation method [49,52].In the WKB analysis of Eq. ( 27) the function 2u(m) plays the role of an effective potential for the classical system with coordinate m and energy E. For 2u(m) > E the WKB solution for the resolvent G m,n/2 (E) displays an oscillatory behavior with m while for 2u(m) < E it exponentially increases with m.The boundaries of the oscillatory region m ∈ [−m 0 (E), m 0 (E)] are "turning points" of the classical motion and are given by the condition 2u(m 0 ) = E (see Fig. 3) where In Fig. 4 we plot the comparison between the coefficient c(E, d) computed based on the exact expression (20) and the WKB asymptotic (details of the WKB analysis are given in Appendix B).

𝐴 = 𝜋𝑟 %
Type equation here.In what follows we will be interested in the region d ∈ [n/2−m 0 , n/2+m 0 ] with m 0 (n/2) 2 − (E/B ⊥ ) 2 defined by the condition u(m 0 ) = E.This is the region of oscillatory behavior of c(E, d) with d where the leading order exponential dependence on n is given by the expression with the prefactor given in Appendix, Eqs.(B23),(B24).
The function θ(B ⊥ ) in (30) equals It behaves at large argument as θ 1/(4B 2 ⊥ ).26)).The solid blue line corresponds to the exact expression (20), while the approximate WKB solution is shown with red points.
An explicit form of the WKB phase φ(E, d) in ( 30) is given in Appendix, Eq. (B11).The dependence of the phase on d for different values of B ⊥ is shown in the Fig. 5.This phase varies by O(1) when d is changed by 1 and it is responsible for fast oscillation of the coupling coefficient with the Hamming distance between marked states d.Its dependence on d simplifies in the limit of large transverse field B ⊥ 1: where χ(x) 1 − 2 arcsin(x)/π + O(n −1 ).The second term in (32) is much smaller the the first one, and varies very little when d is changed by 1.A predominately linear dependence of φ(E, d) on d at large fields can be seen in Fig. 5.This property will be important in the analytical study of population transfer.
-1.0 -0.5 0.0 0.5 1.0 For large transverse fields the magnitude of the squared coupling coefficient (30) can be estimated to exponential accuracy as c We note that the number of marked states M d accessible via all possible d-bit flips from a given state is M d = M 2 −n n d .Therefore the leading order dependence of the coupling coefficient on d is proportional to 1/ √ M d .As will be shown later, in the limit of large transverse fields this leads to a nearly Grover complexity of the PT algorithm, up to a factor ∼ exp[−n/(4B 2 ⊥ )], which gives very small correction to Grover scaling for large B ⊥ .However when d decreases below the boundary value d < n/2 − m 0 , the coupling coefficient grows exponentially faster than 1/ √ M d , as follows from the discussion in Appendix (cf.Eq. (B13)).

V. DOWNFOLDED HAMILTONIAN NEAR THE CENTER OF THE IMPURITY BAND
The coupling coefficients c(E, d) (20) decay exponentially with Hamming distances for d = O(n) (see details in Sec.IV).Marked states are selected at random and Hamming distances between them are order n when the number of the states M is exponentially smaller than 2 n .Because the off-diagonal matrix elements of the downfolded Hamiltonian H ij (E) ∝ c(E, d ij ) they are exponentially small in n.At the same time the width of the distribution of energies of the marked states E(z j ) = −n+ j is also assumed to be very small, W B ⊥ (it is exponentially small in n for the cases of interest).Therefore we can solve the nonlinear eigenvalue problem ( 14)-( 17) by an iterative approach treating the off-diagonal part of H(E) and terms ∝ j as a perturbation.Details are given in Appendix C.
At zeroth-order in the perturbation, the down-folded Hamiltonian 16), (20) the explicit form of the equation for Here ∆ 0 is the root of the above transcendental equation that satisfies the condition lim B ⊥ →0 ∆ 0 = 0.In general, the sum (34) is dominated by the region of values of d such that |d − n/2| = O(n 1/2 ) where the factor 2 −n n d reaches its maximum ∼ n −1/2 .In that region we replace the binomial coefficient with a Gaussian function of d and the summation with the integral over d.Taking the integral we obtain ∆ 0 in a form of a series expansion in powers of n −1 A comparison between the exact and asymptotic solutions for ∆ 0 is shown in Fig. 6.For B ⊥ n 1/2 the overall shift of the energies of the marked states is negative and quadratic in B ⊥ .
According to Eq. ( 18) all M degenerate eigenstates |ψ β have the same weight on the marked state subspace.In the large n limit we have Under the condition ∝ B 2 ⊥ /n 1, the eigenstates are  The plot shows the repeated avoided crossing between the two systems of eigenvalues.One system (colored lines) corresponds to the eigenvalues of the transverse field (driver) The second system of eigenvalues corresponds to the energies of the two marked states in the limit B ⊥ → 0. The splitting of the eigenvalues is exponentially small in n and not resolved in the plot.The asymptotic expression (33), (34) for the two eigenvalues E dominated by their projections on the marked state subspace.In the limit n → ∞ they are asymptotically orthogonal to the computational basis states outside the IB.Such "orthogonality catastrophe" cannot be obtained within the perturbative in B ⊥ approach such as FSA.
The exact dependence of the weight Q on transverse field B ⊥ is given in Fig. 7.The expression ( 36) is valid for B ⊥ away from their "resonant" values B ⊥,p n/(n − 2p) where the M -fold degenerate energy level "crosses" the eigenvalues of the driver Hamiltonian, E (0) = −B ⊥ (n − 2p), for integer values of p, as shown in Fig. 6.The width of such resonance regions ∆B ⊥,p ∝ 2 −n/2 n p remains exponentially small in n for n/2 − p n 1/2 .In this study we will focus on the off-resonance case depicted in Fig. 1.One can see from Fig. 6 that B ⊥,p increases with p and so is the width of resonance region.For B ⊥ parametrically large compared to unity one needs to make sure that n is also large enough so that the width of the resonance regions is small (cf.Fig. 8).Away from resonance, all M impurity band eigenstates are well localized in the marked states subspace (cf.( 36)).
and the integer p is equal to its maximum possible value p = pmax for which the weight factor In the spirit of the degenerate perturbation theory, there exists an effective Hamiltonian H that determines the correct zeroth order eigenstates and removes the degeneracy of the energy levels Its matrix in the basis of the marked states has the form where we neglected small nonimportant corrections (see Appendix C).Using the ex-pression for the coupling coefficient (30) given in Appendix B, ((B23),(B24)) we have Here φ(d) ≡ φ(E (0) , d) is a WKB phase shown in Fig. 5 that describes the oscillation of the matrix elements with the Hamming distance.Its explicit form is given in Appendix B, Eq. (B11) and also above in Eq. ( 32) for the case of large transverse fields.The amplitude V ij equals where i =j and the coefficient A(ρ) equals (cf.(B24)) It is independent on n apart from the phase φ(n/2) whose explicit form is The function θ(B ⊥ ) is given in (31).Expanding (31) in the limit B ⊥ 1, In that limit θ 1.We note that even for modest values of transverse field, e.g., B ⊥ 1.46 (corresponding to that in the Fig. 4) the first term provides a good estimate to the value of θ 0.13 (error 9%).We shall refer to H in (38) as the Impurity Band (IB) Hamiltonian.
The form of the IB Hamiltonian (38) only applies to the region of oscillatory behavior where m 0 is given in (29).This above condition for d ij is always satisfied in a typical row of the matrix d ij for the values of M considered in the paper (see the discussion in Appendix G and Eq.(G32)).

VI. STATISTICAL ENSEMBLE OF THE IMPURITY BAND HAMILTONIANS
Properties of the eigenstates and eigenvalues (37) of the IB Hamiltonian H (38) determine the population transfer within the Impurity Band and are thus of the central interest for us in this study.They depend on the statistical ensemble of IB Hamiltonians.In the model considered in this paper diagonal elements j of H are selected at random, independently from each other and from the choice of the corresponding marked states |z j .In the present discussion we assume that the PDF p( ) of j is exponential bounded with the width W .The results do not depend on the particular form of p( ).For the sake of specificity in calculations we will use the window function form where θ(x) is a Heaviside theta function.For the physical effects discussed in the paper to take place the width W needs to scale down exponentially with n

A. Off-diagonal matrix elements
For fixed energies j the matrix of the IB Hamiltonian H ij is entirely determined by the symmetric matrix of Hamming distances d ij between the bit-strings corresponding to the marked states.The set of M bit-strings is randomly sampled from the full set of all possible 2 n bit-strings {0, 1} n without replacement, see Appendix D. Elements of the matrix d ij above or below the main diagonal will be considered independent from each other and taken from the binomial distribution p d , under condition 1 M 2 n/2 .Then, for a given row of the matrix M × M of Hamming distances d ij , the numbers of elements M  9).According to (38), (30) the statistical ensemble of IB Hamiltonians (38) corresponds to that of symmetric random matrices whose associated graphs are fully connected and matrix elements are statistically independent.
As will be seen below the spectral properties of H that are relevant for our study are determined by V 2 ij and not by the oscillatory factor in (38).Therefore we will be interested in the PDF of where i = j.

Typical and extreme values of the off-diagonal matrix elements Vij
For a randomly chosen row of the matrix of Hamming distances d ij the most probable value (mean) of its elements equals to n/2.According to (39), the off-diagonal matrix elements V ij decrease rapidly with the Hamming distance d ij , reaching the minimum value at d ij n/2.Therefore a typical minimum value of the matrix elements V ij corresponds to a typical value overall.We estimate it using Eq. ( 39) and Stirling's approximation where coefficient A = A(E (0) , 1/2) (40) is essentially nindependent between the resonances and θ is given in (31).The matrix elements V ij that scale with n as the typical value in (48 We note that in the Fig. 9 the plot points do not reach the boundaries of the interval d = 0, n.In the matrix of Hamming distances d ij the typical smallest off-diagonal element in a randomly chosen row can be estimated as follows M p dmin = 1 where p d is binomial distribution (46) Using Stirling's approximation for factorials in the limit n 1 it is easy to show that minimum Hamming distance in a row is extensive for M = 2 µn , µ < 1.
The typical largest magnitude off-diagonal matrix element in a randomly chosen row of V ij is equal to V (d min ).Using Stirling's approximation in (39) we get, Using (48) one can see that the maximum off-diagonal matrix element in a randomly chosen row is still exponentially small in n.
Similarly, one can estimate the typical value of the absolute minimum d abs min of a Hamming distance d ij between a pair of marked states.This distance remains extensive for µ < 1, M = 2 µn .This distance corresponds to the overall largest in magnitude element of the matrix Using ( 48) the largest element is exponentially small in n provided that µ < 1/2 which corresponds to the condition of statistical independence of the elements of V ij .
A tight bound for the maximum eigenvalues of H can be obtained using Gerschgorin circle theorem [53], see Appendix E.

B. Heavy tails
It can be shown that the variance of H ij is not a good statistical characteristic of its PDF and is dominated by the extremely rare atypical instances of the ensemble (see details in the Appendix F).We observe that the relationship between the typical matrix element (48), maximum matrix element in a randomly chosen row of V ij (50), and the largest element of V ij overall (51) form a strong hierarchy that is a characteristic of the ensemble of dense matrices with broad non-exponential distribution of matrix elements (Levy matrices) [33].The form of the hierarchy [37] suggests (up to a logarithmic factors) the following asymptotic behavior at the tail of the PDF of the matrix elements: for |V ij | V typ .We will build on the above observation and obtain the explicit form of the PDF of the matrix elements P (V 2 ij ) (47), including its tails.In the asymptotic limit of large n 1 we consider n to be a continuous variable (the validity of this approximation will be justified below).We replace the summation over d in (47) by an integral and Kronecker delta δ(x) by Dirac delta This expression is obtained using the analytical continuation of the binomial distribution p d (46) from the in-teger domain d ∈ (0, n) onto the interval of a real axis x ∈ (0, n) in terms of the Beta function and the resulting identity In what follows we will study the rescaled quantities where i = j and V typ is given in (48).We apply Stirling's approximation for the binomial coefficient in Eq. ( 39) and ( 46) and obtain asymptotic expressions for V 2 (d) and p d , respectively.Plugging them into the (52) and taking the integral there we can obtain the PDF whose form is given in Appendix, Eqs.(G14),(G15).
The following assumption will be applied throughout the paper According to Eqs. ( 39), ( 49),(G5) a typical largest element in a randomly chosen row of the matrix w ij is ∼ M .Therefore based on (55) the following condition is satisfied in a randomly chosen row of w ij Under this condition, the PDF of w ij takes a particularly simple form, g(w) g ∞ (w) with normalization condition ∞ 1 g ∞ (w)dw = 1.Details of the derivation are given in Appendix G.
The above analysis assumes the scaling behavior (39) of V ij with d ij that requires |n/2 − d ij | < m 0 with m 0 given in (29).As shown in Appendix G this condition is always satisfied for a typical row of d ij provided the constraint (55) on the values of M .

C. Preferred basis Levy matrices (PBLMs)
The problem of population transfer is reduced to the analysis of the described above ensemble of real symmetric M × M matrices H ij of the down-folded IB Hamiltonian (38).The matrices H ij form an ensemble of preferred basis Levy matrices (PBLMs), a generalization of Levy matrices actively studied in the literature (cf., e.g., [33][34][35]37]).Unlike Levy matrices PBLMs have a new control parameter: the ratio of typical diagonal to off-diagonal matrix elements W/V typ that controls the preferential basis (computation basis).This distinction is analogous to that between Gaussian Orthogonal ensemble (GOE) and the Gaussian ensemble with broken SU(N) symmetry, the Rosenzweig-Porter (RP) model [46].
Recent studies of RP ensemble [29] demonstrated two localization transitions that occur with varying parameter that controls the relative weight of the diagonal and off-diagonal matrix elements.One of them is the Anderson transition from localized to the extended states that are non-ergodic and posses distinct multifractal features.These states and the corresponding eigenvalues are organized in "minibands" so that the states within the same miniband mostly share the same support over basis states.The spectral width of the minibands is polynomially small (in M ) compared to W .The second transition is from the extended non-ergodic states to the extended ergodic states similar to the eigenstates of the Gaussian Orthogonal Ensemble.We demonstrate analogous behavior in the IB model and analyze the population transfer in the non-ergodic regime.

VII. NUMERICAL SIMULATIONS: MINIBANDS OF NON-ERGODIC DELOCALIZED STATES
In this Section we report exact diagonalization analysis of both the eigenvector statistics and the dynamical eigenstate correlator.Instead of the sparse 2 n ×2 n Hamiltonian Eq. ( 2), it is efficient to diagonalize the dense M × M matrices obtained by down-folding the Hamiltonian into the marked states subspace.This allows access to systems of n = 200 qubits, reducing the finite size effects.The down-folded matrix Hamiltonian ensemble, is constructed as in Sec.VI, where the diagonal elements m are distributed uniformly in the energy window and the off-diagonal elements are constructed by sampling Hamming distances between uniformly random bitstrings of length n and using Eq. ( 20) with E = E (0) determined from Eqs. (33), (34).
We introduce the scaling of the width of the distribution of m with the matrix size M , where γ is a real non-negative parameter that controls the scaling of the typical diagonal to off-diagonal matrix element V typ given in Eq. ( 48), and λ is an auxiliary constant of order one.

A. Eigenvector statistics
We define the inverse participation ratios (IPRs) I q and the entropy H z as, where ψ β denotes an eigenstate with eigenvalue E β .IPR I 2 is the second moment of the wave function probability distribution | ψ β |i | 2 in the computational basis (bitstrings) |i .The entropy H z characterizes the support set of an eigenstate in the computational basis [54], i.e. the subset of bitstrings where the probabilities | ψ β |i | 2 are concentrated.Fig. 10 shows the participation ratio I 2 as a function of the ratio of mean level spacing δ to the typical matrix element V typ , a measure of the number of states in resonance with a typical classical level i .The regime δ V typ corresponds to the localized phase, where the eigenstates have significant weight on a small number of bitstrings that are close to each other in Hamming distance.In this regime I 2 ∼ 1 and is system size independent.In our model marked states are separated by Hamming distance d ≈ n/2 + O( √ n) with high probability and therefore most localized states have sharp peaks at exactly one bitstring, hence I 2 ≈ 1.As the ratio δ /V typ decreases I 2 becomes system size dependent.Fig. 11 indicates that the combination I 2 M/3 ∼ 1 be-  12 where we introduce the multi-fractal dimensions D q and D 1 which determine the scaling of I q and H z with M , respectively, where c q is a q-dependent fitting parameter.The extracted dimensions shown in Fig. 12 as a function of the parameter γ vary continuously between D q = 1 in the ergodic phase γ ≤ 1 and D q = 0 in the localized phase γ ≥ 2, with 1 < γ < 2 corresponding to non-ergodic regime for q = 1, 2.
. The multifractal dimensions D1 (defined in Eq. ( 63)) and D2, (defined in Eq. ( 62)) as functions of γ for the ensemble of IB Hamiltonians with the dispersion of classical energies W = λVtypM γ/2 , with λ = 3.3.All the multifractal dimensions Dq approach 1 in the ergodic regime (γ = 1) and 0 in the localized regime (γ = 2).The difference between D1 and D2 is also likely due to finite size effects.We also extract a scaling exponent from the dynamical correlator (see Eqs. ( 65),( 66)).Dot-dashed line corresponds to the analytical value in the Rosenzweig-Porter limit given by Eq. (69).

B. Eigenstate overlap correlator for non-ergodic minibands
Population transfer dynamics in the non-ergodic regime can be characterized by the survival probability, see Section II.The Fourier transform of the survival probability for a given initial marked state i is given by, Note that the limit ω → 0 gives the inverse participation ratio of a given bitstring in the basis of eigenstates, The average of p i (ω) over the initial state is related to the overlap correlation function K(ω) defined by [29], The fractal dimension extracted from the scaling of K(0) with M is shown in Fig. 12, it follows closely those ex- ω/Γε, where Γε = ΓtypM ε and Γtyp = 2Σ typ is the typical mini-band width and Σ typ ∝ VtypM 1−γ/2 (log M ) 1/2 , Eq. ( 125).Different curves correspond to different values of M , and collapse well with ε = 0.05.We used the ensemble of IB Hamiltonians with a dispersion of classical energies W = λVtypM γ/2 , with γ = 1.2 and λ = 3.3.
tracted from the IPR in the computational basis.The collapse of the plots in Fig. 13 is achieved when the frequency is rescaled by the characteristic energy, with a fitting parameter ε 1.The correlator K(ω) is constant for a range of energy differences ω < Γ ε and decays quickly ∝ ω −2 as ω > Γ ε .This can be interpreted in terms of the formation of non-ergodic mini-bands of eigenstates that share support in computation basis: for an average bitstring there is a range of eigenenergies E β within a width Γ ε around a bitstring dependent value j where the eigenfunction overlaps with z j are relatively large, whereas for larger energy difference the correlation decays quickly below the value corresponding to uncorrelated case K(ω) < 1/M i.e. the amplitudes repel each other.The relation between the survival probability and eigenfunction overlap correlator, Eq. ( 66) suggests that the characteristic population transfer is given by the inverse of the characteristic energy scale of the miniband width Γ ε , the range of energy eigenstates with significant amplitude at the given bitstring.The auxiliary fitting parameter takes a small value ε = 0.05 indicating only a small deviation from Γ typ most likely due to finite size effects.In Appendix M we show the results of direct simulation of dynamics of the model in the course of the PT protocol and confirm the scaling of the PT time.

C. Discussion of numerical results
The size of the matrix of marked states used in exact diagonalization M ≤ 20000 is a small fraction of the size of the total Hilbert space Hamiltonian 2 n × 2 n with n = 200.For such a small sample the distribution of Hamming distances d ij between marked states is dominated by In this regime the square of the off-diagonal matrix element, see Sec.IV, has approximately Gaussian dependence on d ij (cf.Eqs. ( 39),( 48)) and the probability to find a pair of bitstrings at a smaller distance d ij is strongly suppressed.The sign of H ij rapidly fluctuates as a function of d ij resulting in a negligible average The distribution of off-diagonal matrix elements in Eq. ( 68) is non-Gaussian and instead has a heavy tail that cannot be fully characterized by the variance alone, see Section VI B and Appendix F where we introduced the class of Preferred Basis Levy Matrices and derived the asymptotic form of the distribution of matrix elements.For numerically accessible matrix sizes M we expect the deviation from the Gaussian distribution in the observables to be very small.
The eigenstate statistics and the respective fractal dimensions for the model Eq. ( 68) can be calculated using strong disorder perturbation theory.The calculation proceeds similarly to that in Ref. 29 resulting in, Comparison of the approximate Eq. ( 69) with numerical results is shown in Fig. 12 as the dot-dashed line.It appears that the D 1 and D 2 do not quite coincide with each other nor with Eq. ( 69), which may be due to finite size effects.
It is instructive to draw an analogy between characteristics of the PBLMs and that of the Rosenzweig-Porter (RP) model from random matrix theory, see Ref. 29 and 46 and references therein, where the matrix elements are given by Gaussian random variable with zero mean and variance for diagonal and all off-diagonal matrix elements set H 2 ii = 1 and H 2 ij ∝ M γ .Transition points between localized, delocalized and non-ergodic delocalized regimes as well as perturbative expressions for fractal dimensions Eq. (69) are consistent in the two models.The dynamical correlator also shows similar behavior indicative of the formation of minibnads of non-ergodic eigenstates with the leading exponent 1 − γ/2 in the scaling of the population transfer time with M coinciding in the two models.The prefactor (log M ) 1/2 however is affected by the heavy tail of the distribution of the matrix elements and needs to be calculated analytically.It is difficult to extract it accurately from the numerical simulations due to the finite size effects.

VIII. BORN APPROXIMATION FOR THE TRANSITION RATES
In this section we develop a simple picture relying on Fermi Golden Rule (FGR) to study the rates of population transfer away from a given marked state to a set of other marked states inside the same miniband.Assume that the system is initially prepared at a randomly chosen marked state |z j .The probability amplitude to remain in the initial state |z j equals where |ψ(t) evolves with the IB Hamiltonian H (38) and If the eigenstates dominantly coupled to the marked state |z j are extended then the amplitude ψ(z j , t) will undergo decay in time.
Here we calculate ψ(z j , t) using a simple effective Fano-Anderson model for the decay of a discrete state into a continuum [55].This model captures the Born approximation for the ensemble of Hamiltonians introduced in Sec.VI.The model Hamiltonian H is obtained from the IB Hamiltonian H (38) by zeroing out all off-diagonal matrix elements except those in the jth column and the jth row connecting state |z j to the rest of the marked states.The Hamiltonian H has the form where the summation is over m ∈ [1..M ], m = j.We consider the dynamics on a time scale when the population of the state |z j decays into the other states and introduce a small imaginary part −iη to their energies.
It is assumed to be much bigger than the typical energy spacing, η δ = W/M but smaller than the time scale on which the decay takes place.We introduce the parameterization similar to that in Sec.VII for the distribution of energies j , where λ is a (redundant) number of order of O(M 0 ).
The amplitude ψ(t, z j ) has a well-known form [55] where we used a short-hand notation for real and imaginary parts of self-energy of the marked and we keep z real.Calculating the above integral to the leading order in H jm (j = m) we get where The quantity Γ j above is the total decay rate of the state |z j which is twice the imaginary part of the selfenergy Σ j .The latter equals to the "width" of the level j due to the decay.Expressions (76),(77) correspond to a well-known Born approximation for the self-energy Σ j .Using (75) we get, where we defined a function, (79) The matrix elements H 2 mj , see (38), (39), depend only on the Hamming distance d mj .The dominant contribution in to the sum (78) comes from the transitions to the states with | j − m | η.If the number of such states is large the sum can be replaced by the integral corresponding to the approximation where the Lorentzian δ( j − m , η) ≈ δ( j − m , η) is replaced with its average over realizations of m .The average is independent of η W which therefore drops out from the PDF of the transition rate Γ and the resulting level width Σ = Γ/2.This case corresponds to the leading order Born approximation described in the next Section VIII A.
A more accurate treatment of δ( j − m , η) as a random variable results in the form of the PDF of Γ (and Σ ) being explicitly dependent on η.The physical meaning of η is the decay rate at the "children" sites m , m = j which gives rise to the width Σ or the energy level j at the parent site.In a large system the statistics of the decay rate for children and parents are expected to be the same.The crude approximation that captures this effect is obtained by substituting η with typical value of Σ .This corresponds to self-consistent Born approximation described in Sec.XI A 2. It gives rise to a more accurate expression for PDF of Σ (and Γ) whose shape is rescaled compared to the leading order Born.Systematic analysis is given by the cavity method described in Secs.X, XI.

A. Leading order Born approximation
We can break down the decay rate Γ j = 2Σ j into a sum over different decay channels where each term in the sum corresponds to the transition rate from the initial state |z j into the subset of the marked states on a given Hamming distance d from |z j (see Fig. 14).The factor j η (d) in ( 80) is a spectral density of the marked states located at a distance d from the state |z j within the window of energies η around j where ∆(d) is a Kronecker delta and δ( , η) is defined in (79).
We denote as M where coefficient p d defined in (46) is the probability that a randomly chosen state is located on a Hamming distance d = 0 from |z j .The mean separation between the adjacent energies m in the sum (81) equals where δ = W/M is mean spacing between the marked state energies.A substantial contribution to the sum in (81) comes from the terms corresponding to the marked states whose energy levels j lie within the width η from the energy m , i.e., they satisfy the resonant condition | j − m | η as shown in Fig. 14.
The contribution to a sum from each resonance is ∼ 1/η and the number of the resonances in a given decay channel is Ω d ∼ M (d) j η/W (cf.Fig. 14).It is shown in the Appendix J that the dominant contribution to the typical values of Σ j (80) comes from the values of d that correspond to Ω d 1.For them the function δ( j − m , η) in Eq. (81) changes weakly between the adjacent values of m , and in the leading order Born approximation we estimate the sum over m in (81) by replacing it with an integral.Then the spectral density can be estimated as where we required and p( ) is PDF of the marked state energies with the width W (see (44)).
We plug (85) into the expression (80), obtaining the following relation where the sum is dominated by values of d corresponding to large values M The term involving cos 2φ(d) above oscillates around 0 on the scale d ∼ 1 (cf.Eq. ( 32)).Therefore the contributions to the sum from the terms ∝ M ) cos 2φ(d) and drop the second term in the r.h.s of (87) that contains cos 2φ(d).
Essentially, the above approximation corresponds to replacing the oscillatory part in the expression for the offdiagonal matrix elements ) as follows: where β ij are instances of a dichotomous random variable that takes values ±1 with probability 1/2.This approximate model of the ensemble of H ij will be also used in cavity method calculation in Sec.XI.Using the expression (44) for p( ) and also Eqs. ( 53), ( 39), (48), we obtain the relation between the PDFs of the random variables Here w m are random variables independently sampled from the probability distribution g ∞ (w) given in (57).
The level widths Σ j of individual marked states for 1 ≤ j ≤ M are samples of the random variable Σ .In Eq. (89) we introduced the characteristic value of the level width Σ * .This equation relates the PDF of Σ (or the decay rate Γ = 2Σ ) to that of and M s M .We note that the resulting expression for the level width Σ of a marked state formally corresponds to that given by FGR for the decay of the discrete level into the continuum [55].The energies of the marked states m into which a given marked state |z j decays form a miniband of the width Σ j .The decay occurs simultaneously in many channels corresponding to different Hamming distances between the initial marked state and the states of the miniband.
The heavy-tailed PDF of the random variable s M is studied in details in Appendix I. Using Generalized Central Limit Theorem (GCLT) for the sums of a large number of identical heavy-tailed random variables [33,48] it can be represented in the form where x obeys a so-called Levy alpha-stable distribution L 1,1 1 (x) [33] defined in the Appendix, Eq. (I8), and shown in Fig. 15.Scaling factor and shift are (γ Euler 0.577 is the Euler constant).They display very weak logarithmic dependence on M as compared with the main factor ∝ V 2 typ /δ in (89).The width of the PDF of s M is shrunk by a factor (log M ) 1/2 1 and the location of its maximum is increased by a factor (log .We follow here the definition introduced in [33] and used in subsequent papers on Levi matrices in physics literature.In mathematical literature [56,57] a different definition is usually used, corresponding to f (x; α, β, C 1/α , 0) = L C,β α (x).
The PDF of s M has polynomial tail.Therefore decay rates of marked states Γ j = 2Σ j can take a range values that are much bigger than their typical values 2Σ * (89), up to M times bigger in the sample of the size M .These atypically large decay rates correspond to rare clusters of marked states that are located anomalously close to each other.When clusters are formed by O(1) states the above picture of the decay fails.

IX. NUMBER OF STATES IN A MINIBAND WITHIN BORN APPROXIMATION
Using the expression (89) for the miniband width we can estimate the number of marked states Ω in a miniband corresponding to a given state |z j .As before, we divide the states into the groups of the sizes Ω d , each corresponding to the transitions away from |z j with a fixed number of flipped bits d.The level width can be written in the form Σ j = M d=1 Σ j,d where Σ j,d is the partial level-width due to the transitions with flipping d bits.Then, using (87) and making use of the expression (83) for the average values of M (d) j we obtain The quantity ∼ 1 in (94).This results in the interesting phenomenon due to cancellation mentioned din the previous section: while the typical number of marked states in a decay channel varies very steeply with d, typical values of partial decay rates Σ j,d in different channels do not.
The estimate for the typical number of states in the miniband at the distance where p d = 2 −n n d and Ω is the total number of states in the miniband.
One can also write the partial decay rate as Γ The above estimate gives the correct time scale ∼ 1/V (d res min ) over which the two states become hybridized.We note however that the total number of channels is n − 2d res min = O(n).As all Γ (d) j are nearly the same, each channel contributes a small fraction O(1/n) to the total rate.Therefore V (d res min ) ∼ Γ j /n and marked state |z j decays into the large number of marked states within a miniband before it has a chance to hybridize with the nearest one at a distance d res min .This property is markedly different from the situation at finite dimension [22].
Using the scaling ansatz (72) we estimate the mean separation between the energies of marked states as Using the Eqs.( 89) and (95) we obtain the estimates for typical values of the decay rates and number of marked states in a miniband We immediately observe that in the range of γ > 2 the number of marked states in a miniband vanishes.
It corresponds to a localized phase, consistent with the fact that typical energy spacing δ becomes greater than the typical tunneling matrix element V typ connecting the states.The number of states in a miniband Ω cannot be greater than the total number of states M in the IB.The expression above does not apply for γ ≤ 1.This regime corresponds to ergodic phase.
In the region 2 > γ > 1 the separation between adjacent eigenvalues of H is of the same order as δ .The typical number of marked states in a miniband Ω corresponds to the typical number of non-ergodic delocalized eigenstates of H that form the miniband.
The number of states in a miniband scales as a fractional power of M less than one.This is a hallmark of nonergodic delocalized phase.

X. CAVITY METHOD: SUMMARY OF THE PREVIOUS RESULTS
The cavity method has been actively used to study Anderson Localization in Levy matrices in the last several decades [33-37, 47, 58] starting from the seminal work [33].In the present work we use cavity method to study the properties of minibands of delocalized non-ergodic states that were previously discovered in the studies of Rosenzweig-Porter [29,36] and Regular Random Graph (RRG) [28,45] models.Initial studies suggested the existence of the mixed region with localized but non-ergodic states [33].However, recent numerical studies based on exact diagonalization using very large number of samples established that initially large crossover region between localized and extend states collapses in the limit of increasing matrix sizes [35].Multifractal properties of eigenstates in the localized phase and at criticality were studied in [37] using strong disorder perturbation theory.
Numerical solution of cavity equations to study localization transition in Levi matrices with power-law distributions were obtained using population dynamics algorithm [34] utilizing the approach developed in [58].An alternative approach is based on the integral equation for the PDF of the diagonal elements of the resolvent [33,47].It was obtained in the limit where imaginary part of the self-energy is vanishingly small [33,35,47] (with the limit of infinite matrix size taken first).This allows one to derive analytically the global density of states [33,47] and the mobility edge E * (α) which gives the α-dependence of the energy E * separating extended and localized eigenvalues of H [35].
The cavity method proceeds as follows.First, we generate a random M × M matrix H ij (38) from the ensemble described in Sec.VI.Then we add a new row (and a symmetric column) of independent numbers identically distributed as those in the old matrix H ij .This is done by generating a random energy 0 from the distribution 1 W p A ( /W ); then generating a random bit-string z 0 , computing the array of Hamming distances d j0 between z 0 and z j and the corresponding matrix elements H j0 = H 0j for integer j ∈ [1, M ].As a result we obtain a new (M + 1) × (M + 1) matrix H +1 , where +1 emphasizes that it has one more row and one more column than H .We will number elements of the new matrix by indices running over the range [0, M ] where the index 0 corresponds to the added marked state |z 0 .The cavity equations have the form [33,47] where It does not involve the non-diagonal matrix elements of the Green's function G mm (z) when statistical average H 0m = 0.This is effectively our case as well (see Eq.( F1)).
The main assumption of cavity method is that in the limit M → ∞ the difference between the PDFs of Σ +1 0 (z) and Σ 0 (z) disappears.This results in a self-consistent equations for the self-energy.Following [32] we add small imaginary parts to the diagonal matrix elements H mm = m − iη.It is a small "fictitious" quantity that is still assumed to be much bigger than the marked state energy spacing η W/M .Results are not expected to depend on the value of η provided its scaling with M is chosen appropriately, as will be discussed below.We separate the real and imaginary parts of the self-energy, Σ m (z) = Σ m (z) − iΣ m (z) (cf.(74)), obtaining where the function δ(x, y) ≡ 1 π y x 2 +y 2 was already introduced in (79).
The self-consistent Eqs.(99) were derived by Abou-Chacra, Anderson and Thouless [32] for matrices on Bethe lattices and by Bouchaud and Cizeau for Levy matrices [33].The solution of these equation was only found in the case when they can be linearized in Σ m [32,33,35] giving the location of mobility edge E * (α) as a function of the power α in the tail of the PDF of the matrix elements . Here we will provide a full solution of the nonlinear equations.
We will solve the self-consistent equations (99) under the assumption that pairs of variables (Σ m , Σ m ) for each state m ∈ [0, M ] are taken from the same PDF P(Σ , Σ ; z) defined over the domain x ∈ (−∞, ∞), In what following for brevity we omit the explicit dependence on the parameter z.Following [32] we introduce the characteristic function F (k 1 , k 2 ) of the PDF P(Σ , Σ ) where Here f =H 2 0m and the average is performed with the joint PDF P(Σ , Σ ) 1 W p A W df P (f ).The above relation between F (k 1 , k 2 ) and G (k 1 , k 2 ) is actually an equation for the PDF P(Σ , Σ ) because both G and F depend on P.

XI. SOLUTION OF CAVITY EQUATIONS IN NON-ERGODIC DELOCALIZED PHASE A. Analysis of the imaginary part of self-energy
We note that the exponent in the integrand of the above expression for G depends on Σ and − z only via their combination Σ + − z.In the non-ergodic delocalized phase the typical width of the PDF of Σ is much more narrow than the width W of p( ) (44).We will also consider small values of |z| W . Therefore in the first approximation we will neglect Σ and z compared to .Then G (k 1 , k 2 ) depends only on the marginalized PDF Once this PDF is obtained, the PDF P(Σ , Σ ) can be analyzed from its characteristic function F (0, k 2 ).
Inverting it we obtain the self-consistent equation for P(Σ ) in the limit M → ∞ Here θ(k) = 1 − G η (0, k) and the domain of integration for all variables is [0, ∞).The function p η+Y (h) above is a conditional PDF of a random variable with Y fixed and δ(x, y) given in (79).The explicit form of the PDF p η+Y (h) is obtained in Sec.K of the Appendix, Eqs.(K6),(K8).
To achieve further progress we use the approximation (88) and drop oscillatory factors in the off-diagonal matrix elements H 0m .Then we have for the PDF P (57) and in what follows we will use the rescaled variable w = f /V 2 typ for the squared matrix elements, in accordance with (53).Instead of the variable h in (101) we will use the re-scaled variable that obeys the distribution (see details in Appendix K, (K19)).Then θ(k) takes the form Here φ Y (u) is a characteristic function We now make a key observation: in the limit of large x 1 and for W Y the following relations holds for the PDF φ Y (u) and its characteristic function (see the corresponding Eqs.(L26) and (L10) in Appendix L) The reason for this can be explained as follows.For large deviations of x = wY 2 /( 2 + Y 2 ) the conditional PDF p( |x) of the marked state energy is narrowly peaked in the range of values | | ∼ Y .In contrast, typical energy values are much bigger ∼ W .This narrowing of the conditional PDF p( |x) gives rise to a small factor πY /W in the r.h.s. of (106).
We observe that lim k→∞ θ(k) = 0 and for M → ∞ the integral in ( 101) is dominated by |k| 1.We make an assumption (whose validity becomes obvious below) that for small enough k the integral in ( 104) is dominated by values of Σ such that kV 2 typ /(Σ + η) 1. Therefore we will use in (104) the approximate expression for the characteristic function φ Σ +η given by Eq. ( 106).We rescale Σ with the typical value of imaginary part of selfenergy of marked states Σ * (89) obtained in FGR-based calculation in Sec.VIII.Making a change of variables we rewrite the self-consistent equation (101) in the limit x 1, W Y for the rescaled PDF ρ(s) in the following form: and Σ * and V typ are defined in ( 89) and ( 48), respectively.We observe that Ω corresponds to the typical number of marked states in the mini-band that we estimated in Sec.VIII using the Born approximation.
Truncating the expansion at terms ∼ (log M ) −1/2 we get where C ≈ 0.577 is the Euler constant.It is clear from comparing individual terms in Eq. ( 111) with the exponential in Eq.( 108) that q = O( √ log Ω).This justifies the order of truncation (see details in Appendix I, Eq.I5).
We plug the above expression for ρ(s) into (113) and express µ Ω in terms of a new variable Then this variable satisfies the following equation that involves a scale-free Levy distribution and a single parameter where we used an explicit form of β η (109).We note that the self-consistent equation for the function ρ(s) is now reduced to the simple transcendental equation (115).Using explicit form of σ Ω and b Ω (92),(93) one can see that ζ Ω is large compared to unity in the delocalized phase, With this property the equation for x (115) can be solved by iteration using the asymptotic expansion of Levy distribution at large arguments, ).To the leading order Then using (114) the expression for µ Ω is where we neglected terms ∼ σ 3 Ω log Ω that are much smaller than the width σ Ω of the distribution ρ(s We note that the dependence of µ Ω (118) on the initial (fictitious) level broadening η disappears when the later is chosen to be much smaller than the mini-band width [28,36,45], W/M η Σ * σ Ω .Using (89), (72) the scaling behavior of η with M in the non-ergodic delocalized regime must satisfy the condition Finally the expression for the distribution function of the imaginary part of self-energy has the form Here Σ typ is a shift of the distribution and C its scale parameter (characteristic width).Also, Using the scaling ansatz (72) for the width W of the IB in terms of M , the typical number of states in a mini-band (number of resonances) equals, Using the same scaling ansatz (72) and the expressions for σ Ω (92) and µ Ω (122) we obtain, Here V typ ∼n 1/2 2 −n/2 e −n/(4B 2 ⊥ ) is given in (48).The shift Σ typ corresponds to the typical value of Σ .One can see from the above that it is log Ω ∼ log M 1 times bigger than the distribution width.We note in passing that distribution of Σ determines that of the miniband width Γ = 2Σ (97).

Comparison between the cavity method and leading-order Born approximation
It is instructive to compare the above distribution of Σ obtained using the cavity method with that obtained within the Born approximation (89)-(93).In both cases the distribution of Σ is given by the appropriately rescaled and shifted Levy alpha-stable distribution L 1,1 1 (x).In both cases, the scale parameter C (characteristic width) of the disribution has the form C = σ S Σ * with σ S = π/(4 log S).In the case of the Born approximation S = M , corresponding to the total number of marked states, and in the case of cavity method S = Ω M , corresponding to the (much smaller) number of states in the mini-band.Using (124) we estimate Therefore Born approximation underestimates the width of the distribution of Σ .The ratio (127) is especially pronounced near the localization transition γ = 2. Value of σ −1 Ω shrinks to zero at the transition while that of σ M does not depend on the closeness to the transition point.
We note however that factors σ Ω and σ M depend on M only logarithmically.At the same time, the leading order (power-law) dependence of the rescaling coefficient on M is given by the factor Σ * ∝ M 1−γ/2 , and is identical in the cavity method and the Born approximation-based expressions.
The situation is similar with the shift parameter Σ typ in the Levy distribution of Σ corresponding its typical value, Σ typ Σ * /σ S with S=M (Born approximation) and S=Ω (cavity method).The leading-order dependence of the shift on M is the same in both cases and is given by Σ * .In both cases the shift is greater than the rescaling coefficient by a factor ∼ log M .However the Born approximation overestimates the shift by a factor (2 − γ) −1/2 .

Comparison between the cavity method and self-consistent Born approximation
The leading-order Born approximation recovers the typical shift Σ typ and the scale parameter C of the distribution of Σ with exponential accuracy in log M .However it gives an incorrect dependence of the prefactor on log M in these coefficients.The main approximation in Sec.VIII A was to assume that the sum in the expression for the spectral density ρ j η (d) (81) can be replaced by an integral.We revisit the decay rate equation (78) using the statistical ensemble (88) Here in the l.h.s.we omitted the subscript in Σ j and made the rescaling V (d jm ) 2 = V 2 typ w m .Random variables x m are sampled from the distribution g η (x) given in (106) and plotted in Fig. 16).Using GCLT for the sum in (128) one can obtain the PDF of Σ .The details are given in Appendix L and here we provide the result, Here x is a random variable that obeys Levy distribution L 1,1 1 (x), coefficient σ Ω is given in (123) and b Ω is given in (93) where one should replace M with the number of marked states in a mini-band of width η Unlike the discussion in the cavity method, the statistics of Σ explicitly depends on η.We make a self-consistent assumption and set η equal to the characteristic width of the miniband We conclude that the typical number of states in a miniband Ω η = Ω given by the self-consistent Born approximation is the same as that given by the cavity method, Eqs.(110).Therefore using (129) one can see that the width C of the distribution of Σ is also the same in both methods.The difference between the typical values of Σ in the two methods is This error is much smaller than in the case discussed in Sec.XI A 1 (cf.Eq. ( 127)) where the self-consistent condition is not used.However it exceeds the distribution width C for sufficiently large M 1 because in the nonergodic delocalized phase log σ −1 Ω ∼ log log M .

B. Real part of self-energy
In this section we will find the marginalized probability distribution of real parts of self-energy We consider the first equation in (99).Following the arguments provided in Sec.XI A we neglect the terms z−Σ m in the r.h.s of the equation and drop the oscillatory factors in H 0m using the probability distribution ) instead.Then Eq. (99a) takes the form Here r m are instances of a random variable R such that where , f, Σ are random variables independently sampled from the distributions p( ), P (f ) and P(Σ ), respectively.Using GCLT, in the asymptotic limit of M → ∞ the sum in (133) is determined by the tail of the probability distribution of r at |r| → ∞.This analysis is very similar to the one already discussed in Sec.VIII,XI A and in Appendix I.Here we omit details of the calculations and simply provide the result.The tail of the PDF of r in the limit |r| → ∞ has the form (ρ 1).The distribution function P(Σ ) of the sum in (133) is the Cauchy distribution Here the expression for σ M ∼ 1/ √ log M is given in (92).Cauchy distribution has the form very similar to the stable distribution L 1,1 1 (x) that describes the fluctuations of the Σ (120) up to the shift and rescaling coefficients.Both distributions are displayed in Fig. 15.The tail of the Cauchy distribution differs from that of L 1,1 1 (x) by a factor of 2. Unlike that of Σ the distribution of Σ is symmetric for impurity states with energies near the center of the band.The typical value of Σ is greater than that of Σ by a constant factor The width of the distribution of Σ is the same as its typical value while the width C of the distribution of Σ is smaller by a factor ∼ 1/ log M (cf.Eqs.(125),( 126)).These relations between the distributions of Σ and Σ have implications for the complexity of the population transfer as discussed below.We also note that the real and imaginary parts of self-energy of a given marked state are correlated with each other because according to Eqs. (99a),(99b) the values of Σ j and Σ j depend on the same set of parameters (H jm , m , etc).In this work we will not study their correlations.

C. Dynamic correlations
For states close to the center of the band of marked states the typical value of the mini-band width can be connected to the average of the dynamical correlator, with the delta function regularized by a finite scale η, which can be inverted to obtain, where we introduced the Thouless energy, The typical value of the mini-band width was obtained in Eq. ( 125).From the comparison of the respective Fig. 12 we conclude that the scaling of the typical population transfer time 1/ω Th and the scaling of the value of the dynamical correlator K(ω) are consistent in numerical and analytical calculations, subject only to a small correction in the scaling exponent ε = 0.05.

XII. COMPLEXITY OF THE POPULATION TRANSFER PROTOCOL
After the system is prepared at a given marked state |z j at t = 0 the probability for the population to be transferred to other marked states is 1 − ψ 2 (z j , t).At the initial stage the survival probability decays exponentially (76) with the mean decay time 1/Γ j = 1/(2Σ j ).
The initial marked state decays into the eigenstates |ψ β of the IB Hamiltonian H with typical energies E β inside the narrow interval corresponding to the miniband associated with |z j .It has a width Σ j and is centered around H jj = j .Typical classical energies of the bitstrings measured at the end of PT protocol will obey the probability distribution P( − j − Σ j ) with P given in (136).The success of PT protocol is to find a bitstring distinct from z j at a time t with energy inside that window ∆E cl around j .The expected time to succeed in PT equals Here p ∆E is the probability of detecting a bit-string inside the target window ∆E cl under the condition that initial state has decayed.Let us assume that the PT window is as wide as the typical miniband width, ∆E cl = Σ typ .In this case p mb differs from 1 only by a constant factor that does not depend on M (cf.( 137)).Therefore we will detect the bit-string inside the PT window with finite probability as long as we waited long enough for the transition away from the initial marked state to occur.Because the initial state |z j is picked at random we can estimate typical time to success of PT t PT ∼ 1/Σ typ corresponding to the inverse typical width of the miniband.All of the states in a miniband are populated at (roughly) the same time t PT because transition rate to a subset of states on a distance d away from |z j depends on d very weakly (see Eq. ( 94) and related discussion in Sec.IX)).
From a computational perspective it is of interest to characterize the PT by the relation between the the typical success time of PT t PT and the number of states Ω over which the population is spread during PT where we set ∆E cl ∼ Σ * (see discussion above).We note that the time t G for the Grover algorithm for unstructured quantum search to find Ω items in a database of the size 2 n is t G ∼ (2 n /Ω) 1/2 .PT time t PT scales worse than Grover time t G by an additional exponential factor e 2θn e n 2B 2 ⊥ (43).The scaling exponent 2θ can be made arbitrarily small at large transverse fields One can expect that the distributions of eigenvalues and eigenvectors inside the mini-band are very similar to those in the ergodic case, albeit with the appropriately rescaled effective dimension Ω of the Hilbert space [29].For example, the energy spectrum of the mini-bands in the non-ergodic delocalized phase of Rosenzweig-Porter (RP) model corresponds to the Gaussian Orthogonal ensemble.There, according to the semicircle law [59], the typical spectral width of the mini-band (∼ 1/t PT ) is proportional to the square root of the number of states Ω in it.Therefore the Grover scaling (141) for PT is consistent with semicircle law in the Gaussian random matrix models that allow for non-ergodic delocalized phase such as RP model.However in the case of Levy matrices the distribution of eigenvalues has polynomial tails [33], their spectrum is not bounded and semi-circle law does not apply.As mentioned above, this leads to a broad distribution of PT rates.There exist statistically significant clusters of states of a relatively small size that will be populated faster than typical case because the corresponding classical bit-strings are located closer to each in Hamming distance than the typical inter-state separation.At first glance, this tendency is counter to the Grover scaling (141).We note however that fluctuations of Σ and Σ are correlated with each other.Faster decay of a marked state will also correspond to bigger self-energy shift which will reduce the likelihood of finding a marked state with its energy inside the target window ∆E cl ∼ Σ * .
However the Grover scaling still survives in a typical case corresponding to PT away from a randomly selected bit-string.For Levy matrices [33] it reflects the fact that the typical width Σ typ of the curve of the global density of states along the energy axis must scale as a square root of the corresponding typical number of states (area under the curve).We consider here a different protocol inspired by the Hamiltonian version of Grover algorithm proposed in [18].The new protocol finds marked states in the IB model H cl starting from the ground state of H D which is a fully symmetric state |S = 2 −n/2 n j=1 |z in a computational basis.This protocol can be implemented by adjusting the value of transverse field B ⊥ ≈ 1 so that the ground state energy of the driver is set near the center of the IB.Then we can replace the full driver with the projector on its ground state, H D → −nB ⊥ |S S|.The quantum evolution is guided by the Hamiltonian: With the initial condition |ψ(0) = |S .In the case where all impurity energies are equal to each other, {E(z j ) = −n} M j=1 , and B ⊥ = 1 the Hamiltonian H G is a generalization of the analog version of Grover search [18] for the case of M target states.The system performs Rabi oscillations between the initial state |S and the state which is an equal superposition of all marked (solution) states.Time to solution is the half-period of the oscillations, the "Grover time" Hamiltonian versions of Grover search with transverse field driver whose ground state were tuned at resonance with that of the solution state were considered in [60,61].
Robustness of the Grover algorithms to phase noise was considered previously in the case of a single marked state [62,63].Here we investigate the role of systematic phase errors in quantum oracle for the case of multiple solutions by assuming that marked state energies take distinct values E(z j ) = −n + j randomly distributed over some narrow range W .We will also investigate the systematic error in the Grover diffusion operator [1].In the Hamiltonian formulation [18] this corresponds to the deviation from unity of the parameter B ⊥ that controls the weight of the driver in (143).We will define where 0 is the driver error.We denote the computational basis states as |j ≡ |z j with j ∈ [1, N ], N = 2 n and assume that marked states correspond to the range j ∈ [1, M ].We also introduce the state |0 = where j ∈ [0, M ] and H j0 G = H 0j G .On a time scale t 1/δ = M/W much smaller than the inverse spacing of the energies j the quantum evolution with initial condition |ψ(0) = |0 corresponds to the decay of the discrete state with energy 0 into the continuum [55] with the finite spectral width W [64].It is a similar problem to that discussed in the Sec.VIII.

Sensitivity to systematic oracle phase error
We first consider the case of relatively large oracle errors (wide energy band W ) and modest driver errors In this case, following the results of the Sec.VIII on the solution of the Fano-Andreson model [64] we obtain an exponential decay of the initial amplitude (cf.( 76)) where Σ 0 (z) = Σ 0 (z) + iΣ 0 (z) is a self-energy and The state |0 undergoes an exponential decay with the rate Γ 0 = 2Σ 0 .After the characteristic time t PT ∼ 1/Γ 0 the population is transferred into a subset of the marked states with energies inside the window The number of marked states (solutions) to which the population is transferred is Ω ∼ Σ 0 /δ .The relation between t PT and Ω is the same as in the Grover algorithm (143).It also recovers the scaling with Ω and n, up to a factor exp(−n/(2B ⊥ 2 )), for the time of PT considered in the rest of the paper that uses transverse field as a driver and starts from any marked state instead of a fully-symmetric state.
To characterize the effect of oracle errors we introduce the scaling ansatz for the marked states bandwidth W ∼ 2 −n/2 M γ/2 similar to that in (72).We observe that the number Ω of solution states populated over the time t PT cannot be greater than M by construction.For W V √ M (or γ < 1) the value of Ω M and the scaling of the transfer time t PT with M is the same as t G in the ideal Grover algorithm (143).In the region given by (146) (or 2 > γ > 1) the algorithm performance is degraded because Ω M .For W V M (or γ > 2) the algorithm fails to find even one solution.

Sensitivity to the systematic driver error
We now consider the sensitivity of the algorithm to an error in the weight of the driver Hamiltonian, i.e., to the nonzero value of the parameter e 0 = n(1 − B ⊥ ) (144).We assume that 0 W while the spread of the marked state energies the condition (146), so that absent driver errors, PT time would follow a Grover-like scaling law (150).
In this case the state |0 is coupled non-resonantly to a continuum with narrow bandwidth.The expression for the population transfer to the marked states can be obtained from the time-dependent perturbation theory in the parameter 0 /W Maximum transfer occurs at the time t 0 = π/ 0 with the total transferred probability p 0 = 4M V 2 / 2 0 .Typical time t PT t 0 /p 0 to achieve the successful population transfer to marked states involves repeating the experiment 1/p 0 times where Γ 0 is given in (149) and the first multiple in r.h.s gives the typical transfer time in the absence of driver errors.The later leads to an increase of the transfer time by a large factor 0 /W .For the maximum possible bandwidth W when nearly all states are populated, W ∼ Γ 0 ∼V √ M , the time of population transfer (151) is As expected, when the driver error exceeds inverse Grover time 1/t G the performance of analogue Grover algorithms (142) degrades relative to t G .This is a direct consequence of the fact that the quantum evolution begins from fully symmetric state which is a ground state of the driver Hamiltonian whose energy is tuned at resonance with the marked states.In this case the transverse field Hamiltonian driver effectively corresponds to the projector (142).Because the ground state is not degenerate, the resonance region is exponentially narrow (∼ 2 −n/2 √ M ).This results in the exponential sensitivity of the Grover algorithm performance to the value of driver weight.This critical behavior was studied in the work on quantum spatial search [65] for the case of one marked state.
In contrast, in the PT protocol considered earlier in the paper there was no need to fine-tune the value of B ⊥ other than making it large, B ⊥ 1.This happened because the effective coupling between the marked states described by the down-folded Hamiltonian H (38) was not due to any one particular eigenstate of the driver (such as the state |S for the Grover case).Instead this coupling was formed due to an exponentially large (in n) number of non-resonant, virtual transitions between the marked states and highly exited states of the transverse field Hamiltonian H D .This resulted in a significant improvement in robustness for the proposed PT relative to the analogue Grover algorithm.

B. Grover search starting from a marked state
We now consider an implementation of the analogue Grover search that starts from the marked state similar to the PT protocol considered in previous Sections.The transition amplitude U ij (t) = i| exp(−iH G t) |j between the two marked states can be written in the form Here ψ λ (j) = j|ψ λ are amplitudes of the eigenstates of H G in the M + 1 dimensional subspace and λ are the corresponding eigenvalues that obey the equation Here Instead of providing a detailed analysis of the above solution we provide an order of magnitude estimate to extract the relevant scaling behavior.We again assume that the spread of the marked state energies, W = t −1 G = O(V √ M ) corresponds to the inverse of the Grover time t G needed to find any one of the solutions with equal probability.The typical separation between the adjacent vales of j is δ = W/M ∼ V / √ M .
It follows from (154) that in the ordered array obtained by combining together the sets of energies {e j } M j=0 and eigenvalues {λ m } M m=0 their values appear alternatively and sequentially, e.g., j−1 < λ j < e j < λ j+1 .The typical separation between the adjacent elements in the array is |λ j − j | ∼ δ .We observe that for a given value of λ the sum in the expression for Z(λ) (155) is dominated by the small, O(1), number of terms with | m − λ| ∼ δ , each term of the order of M .Indeed, there are O(M ) remaining terms corresponding to | m − λ| ∼ W .The magnitude of those terms is V 2 /W 2 ∼ 1/M and their aggregated contribution to the sum is O(1).Therefore we can estimate Z(λ) = O(M ) and for the amplitudes we have For a given initial state |i at time t we pick the final state |j within the energy window j − j ∼ ∆ = 1/t around i .The sum in the expression (153) for the transition amplitude U ji (t) is dominated by the number of terms Ω = ∆/δ ∼ ∆ √ M /V corresponding to the eigenvalues λ inside the same window of energies.For those terms λ − i , j − λ ∼ ∆ giving the estimate for the amplitudes ψ λ (i), ψ λ (j) ∼ 1/Ω (cf.(156)).The magnitude of the sum in (153) can be estimated as |U ij (t)| ∼ Ω|ψ λ (i)ψ λ (j)| ∼ 1/Ω.On the other hand, because ordered values of λ and m alternate in sequence the probability |U ij (t)| 2 is distributed over Ω marked states and |U ij (t)| ∼ Ω −1/2 .By equating the above two estimates for |U ij (t)| we immediately obtain Ω ∼ 1 and therefore In the case when there are only a few marked states (M ∼ 1 and W ∼ V ) the probability is initially localized on a given marked state |i and then it spreads over to others states separated in energy by V during the time t G ∼ 1/V ∼ 2 n/2 .In this case the algorithm time scales with n identically to that of the analogue Grover search that starts at the fully symmetric state |S .Similar performance is achieved by the PT protocol using transverse field B ⊥ 1 and discussed in previous sections.
The difference from analogue Grover search starting at |S from the above PT protocol using a transverse field becomes dramatic for large number of marked states M 1.Both analogue Grover search and the PT proto-col benefit from the increase in M : the algorithmic time shrinks ∝ 1/ √ M and the number of marked (solution) states Ω in the number of states in the final superposition increases with M .
In contrast, the quantum search with H G starting form the marked state |i does not create massive superpositions of solution states when M increases.Instead it involves a very few others states that are adjacent in energy, The time of the algorithm increases with M (157).This happens because unlike the Hamiltonian H with a transverse field (2), the Hamiltonian H G is integrable.The wave-function remains localized near the initial marked state.

XIV. CONCLUSION
We analyze the computational role of coherent multiqubit tunneling that gives rise to bands of nonergodic delocalized quantum states as a coherent pathway for population transfer (PT) between computational states with close energies.In this regime PT cannot be efficiently simulated by QMC.
We consider optimization problems with an energy function E(z) defined over the set of 2 n n-bit-strings z.We define a computational primitive with the objective to find bit-strings z j = z i inside some narrow energy window ∆E cl around the energy of the initial bit-string z i .The problem is hard for sufficiently low starting energy E(z i ) in the region proliferated by deep local minima that are separated by large Hamming distances.
We propose to solve this problem using the following quantum population transfer (PT) protocol: prepare the system in a computational state |z j with classical energy E(z j ), then evolve it with the transverse-field quantum spin Hamiltonian.Classical energies E(z) are encoded in the problem Hamiltonian diagonal in the basis of states |z similar to quantum annealing (QA) approaches [2][3][4].
A key difference from QA or analogue quantum search Hamiltonians [18,65] is that the transverse field is kept constant throughout the algorithm and is not fine-tuned to any particular value.At the final moment of PT we projectively measure in the computational basis and check if the outcome z is a "solution", i.e., z = z j , and the energy E(z) is inside the window ∆E cl .
In this paper we analyzed PT dynamics in Impurity Band (IB) model with a "bimodal" energy function: E(z) = 0 for all states except for M "marked" states |z j picked at random with energies forming a narrow band of the width W separated by a large gap O(n) from the rest of the states.This landscape is similar to that in analogue Grover search [18,60] with multiple target states and a distribution of oracle values for the targets.The best known classical algorithm for finding another marked state has cost O(2 n /M ).
The transverse field gives rise to tunneling between a pair of marked states corresponding to a sum over a large number of virtual transitions connecting the two marked states via the states with E(z) = 0.As a result the PT dynamics is described by the down-folded M × M Hamiltonian H that is dense in the space of the marked states |z j .Its off-diagonal matrix elements H ij = V (d ij ) cos φ(d ij ) depend only on the Hamming distance d and are obtained using WKB method.The distribution of matrix elements H ij has a heavy tail decaying as a cubic power for V (d) V typ .This is a remarkable result of the competition between the very steep decay of the off-diagonal tunneling matrix element with the Hamming distance d, and the steep increase in the number of marked states M d ∝ n d at distance d.We emphasize that such polynomial tail in the distribution of matrix elements is only possible either in infinite dimension or in presence of long-range interactions (e.g, dipolar glass).
The dispersion of the diagonal elements V typ with γ ∈ [1,2].Therefore we call H ij a Preferred Basis Levi matrix (PBLM), a generalization of the Levi matrix from the random matrix theory.We demonstrate two localization transitions in the PBLM ensemble whose locations are determined by the strong hierarchy of elements of the PBLM H ij .In the range 1 < γ < 2 there exist minibands of non-ergodic delocalized eigenstates of H . Their width is proportional to 1/t PT W .Each miniband associated with a support set S over the marked states.If γ > 2 then W exceeds the largest matrix element of H ij and the support set is empty, all eigenstates are localized.If γ < 1 then W is smaller than the typical largest element in a row of H ij and the support set extends to all marked states -all eigenstates are "ergodic".
We find the distribution of the miniband width Γ = 1/t PT analytically by solving the non-linear cavity equations for an ensemble of PBLMs.Unlike previous analyses focused on linearized cavity equations near the Anderson transition, we find the solution of the fully non-linear cavity equations in the non-ergodic delocalized phase.
The distribution of miniband widths Γ obeys alphastable Levi law with tail index 1.The typical value of Γ and its characteristic variance exceeds the typical matrix element of H by a factor Ω 1/2 where Ω = (M V typ /W ) 2 is a size of the support set in a typical miniband.
We demonstrate that quantum PT finds another state within a target window of energies Ω in time t PT ∝ 2 n/2 Ω −1/2 exp(n/(2B 2 ⊥ )).The scaling exponent of t PT with n differs from that in Grover's algorithm by a factor ∝ B −2 ⊥ , which can be made small with large transverse fields n B 2 ⊥ 1. Crucial distinctions between this case and the Hamiltonian in the analogue version of Grover's algorithm [18] for the case of multiple target states are the non-integrability of our model, and the delocalized nature of the eigenstates within the energy band W . Furthermore, analogue Grover's algorithm for multiple targets is exponentially sensitive in n to the weight of the driver Hamiltonian, and cannot be initialized with a computational basis state.
The model (2) considered in the paper belongs to the class of n-local infinite range spin glasses similar to quantum Random Energy Model in transverse field [66].However the key feature of our analysis -transport via miniband of non-ergodic delocalized states at the tail of the density of states dominated by deep local minima -is ubiquitous to a broad class of quantum spin glass models (2), such as transverse field Sherrington Kirkpatrick, p-spin model [40], K-Satisfiability, etc.
In the above models one can identify two distinct energy scales.The first scale is the typical change in classical energy corresponding to one bit flip: E flip B ⊥ .The second scale is the typical width of non-ergodic minibands Γ < ∆E cl , which decreases exponentially with n.The tunneling transitions between the states inside the miniband require a large number of spin flips, and therefore E flip Γ. Starting from the initial state |z i inside the strip of energies ∆E cl , the quantum evolution is confined within the corresponding miniband.The quantum PT can be described by an effective down-folded Hamiltonian H ij defined over a subset of computational basis states whose classical energies lie within the energy strip ∆E cl at the tail of the density of states.
We note that once a computational problem contains a structure, the associated minibands can be organized in a more complex way than in the IB model considered in our paper.E.g., the population transfer can proceed via the tree of resonances [28,45].In the structured problems the typical tunneling matrix elements H ij can be exponentially greater in n than those in the Grover algorithm and than the transition rates in the classical local search algorithms.Extensions of our approach for the analysis of the computational complexity of Population Transfer for generic spin glass models presents a promising direction for the future research.
the Eq.(B6) has two types of WKB solutions that correspond to real or imaginary momentum p(m) depending on the value of m relative to the turning points m In the region 2 (B5) is rapidly oscillating with d and can be written in the form where is a phase of WKB solution and C (E) is the constant of integration that will be discussed below.
On the other hand, in the two regions 2 is decreasing exponentially with d.For example, in the left region We omit here for brevity the expression in the right region (B12).

Determination of the integration constant in WKB solution
Within the WKB approach the integration constant C (E) can be obtained by matching the exponential asymptotic (B13) with the solution obtained near the boundary of the interval d = 0. However as discussed in Sec.VI B of the main text, for the relevant range of the 2 The expression (B8) for m 0 (E) should be evaluated for the energy E = E 0) −n − B 2 ⊥ corresponding to the eigenvalues of impurity band (33), (35) In this case the O(n 0 ) corrections in the r.h.s. of (B8) vanishes.
The terms O(1/n) above and can be neglected in the exponents of the WKB solutions (B10) and (B13).
model parameters the properties of the typical sample in the ensemble of the IB Hamiltonians H depend only on G n 2 −d, n 2 in the region of its oscillatory behavior (B9) away from the boundaries of the interval d = 0, n.To avoid the analysis in the region of no consequence for us we determine C (E) by equating the above WKB asymptotic for G n 2 −d, n 2 at the center of the interval d = n/2 with expression for G 0, n 2 at that point obtained in a different way.
Using Eq. ( 20) we write c(E, n/2) in the integral form (o → +0).The integral can be expressed in terms of the Gamma function Γ(x).In the region of not too small transverse fields (B7) it has the form Using Sterling formulae for Gamma function we obtain in the limit n For large transverse fields a 1 and we have θ a 2 /4.Using Eq. (B1) we obtain the asymptotic of the Green function at the zone center Here we used the equality for the phase WKB φ(E, n/2) (B11) at the zone center On the other hand, from the WKB expression (B10) we get By comparing the Eqs.(B18) and (B20) we finally obtain the constant of integration C (E) exp(−nθ(a)) (sin φ(E, n/2)) 2 .(B21) One can use (B21) in (B10) and ( 26) to obtain the expression for c(E, d) in the region (B9).Before providing the result we observe that for energies E not too far from the Impurity Band center (cf.Eq. (B4)) the expression for nθ(a) can be expanded in powers of , (B24) It is related to A(ρ) in the Eq. ( 40) of the main text as follows: A(ρ) = A(E (0) , ρ).The phase φ(E, n/2) in (B24) has an explicit form 2. Limit of large transverse fields B⊥ 1 In the limit of large transverse fields the tuning point m 0 (B8) is very close to the boundary of the interval m = L so that one has a small parameter We note that for large transverse fields B ⊥ 1 the phase is a sum of the two terms.First term changes rapidly with d with the slope π/2 and second term changes very little (by an amount O(n −1 )) when d is changed by 1.
We note that unlike the study of the WKB eigenfunctions where one has to select the WKB solution that decays into the classically forbidden region (B12), the Green function G n/2−d,n/2 (E) corresponds to the solution that increases exponentially with m = n/2−d > m 0 .Using the oscillating (B10) and exponentially growing (B13) WKB solutions one can obtain the coefficient c(E, d) from the relation (26).This will provide an asymptotic WKB form of c(E, d) almost everywhere on the interval d ∈ [0, n] except for the small vicinities of the turning points, |n/2 − m 0 (E) − d| = O(n 0 ) and end points, n − d, d = O(n 0 ).In Fig. 4 we plot the comparison between the coefficients c(E, d) computed based on exact expression (20) and the results of asymptotic WKB analysis using Eqs.(B10),(B13).
Appendix C: Linearization of the down-folded Hamiltonian near the center of the Impurity Band We divide the Hamiltonian H(E) for a given E on two parts, accordingly where we defined We write similar expansions for energies and amplitudes E ≈ E (0) + E (1) , ψ(z j ) ≈ ψ (0) (z j ) + ψ (1) (z j ), (C4) and get H(E) ≈ H (0) (E (0) ) + H (0) (E (0) ) ∂E E (1) + H (1) (E (0) ), where the parts of the Hamiltonian H (0,1) are given above.We plug the above expansions into the system of equations ( 14) M j=1 H ij (E)A j =EA i , and use (A13) to express A (0) j = n 1/2 ψ (0) (z j ).Equating terms of the same order in j and c(E, d ij ), i = j, we obtain the equation for eigenstates and eigenvalues in zeroth order n[c(E (0) , 0) − 1]ψ (0) (z j ) = E (0) ψ (0) (z j ) , (C5) Similarly to the above we find from Eqs. ( 18),( 19) the zeroth-order approximation to the the total probabilistic weight of an eigenfunctions |ψ over the marked state subspace Q (0) jk = δ jk Q where a. Zeroth-order of the perturbation theory Eq. (C5) admits the solution corresponding to the Mfold degenerate energy level that originates from the band of the marked states, E (0) → −n in the limit of B ⊥ → 0. The corresponding M eigenstates ψ β (z j ) (β ∈ [1..M ]) have support over the part of computational basis corresponding to marked states: ψ β zj = 0, j ∈ (1, M ).Using c(E, 0) from (C16) the explicit form of the equation (C5) for eigenvalue in zeroth order is given in the main text, Eqs.(33), (34) which we repeat here for convenience.
Here ∆ 0 is the root of the above transcendental equation that satisfies the condition lim B ⊥ →0 ∆ 0 = 0.In general, the sum (34) is dominated by the region of values of d such that |d − n/2| = O(n 1/2 ).We obtain ∆ 0 in a form of a series expansion in powers of n −1 Similarly, using c(E, 0) from (20) in the equation (C8) for the zeroth-order total weight over the marked state subspace we obtain (C12) Using (35) and employing similar approximations to that from the above we get an asymptotical expression in large n limit We recall that in our study n is asymptotically large and we always assume that the transverse field B ⊥ = O(n 0 ) (but can be parametrically large, B ⊥ 1).The denominator in Eqs.(C10),(C12) corresponding to d = m will become zero at "resonant" transverse field value B ⊥ = B ⊥ m which is a root of the equation (??) in the main text.In the range of B ⊥ under consideration n/2 − m n 1/2 .Near the mth resonance the term with d = m in the sum (34) becomes anomalously large due to a small denominator despite the factor p m being very small.We keep this term (34) along with the terms corresponding to |n/2 − d| ∼ n 1/2 and obtain where we introduced rescaled transverse field difference from its value at resonance ⊥ m .There the weight factor Q is decreasing dramatically (cf.Fig. 7) and the above perturbation theory breaks down.The width of the resonant regions ∆B ⊥ m (??) remains exponentially small in n for n/2 − m n 1/2 .In this study we will only focus on the off-resonance case, assuming the condition b. First order of the perturbation theory The first order equation (C6) determines the correct zeroth order eigenstates {ψ β (z j )} M β=1 and removes the degeneracy of the energy levels.To evaluate the coefficients a, b in (C6) we calculate c(E, 0) away from resonance using the same approach as that in the evaluation of the sum in (34) c(E, 0) − nB Appendix G: PDF of the squared off-diagonal matrix elements of impurity band Hamiltonian In this section we provide the details of the derivation of the PDF for the non-oscillatory parts of the (squared) off-diagonal matrix elements V 2 ij of the IB Hamiltonian.As discussed in the main text, in the asymptotical limit of large n 1 one can make an approximation that n is a continuous variable and we replace the summation over d in (47) by an integral and Kronecker delta δ(x) by Dirac delta.This results in the Eq. ( 52) displayed below for convenience It was discussed in the main text (see also below) that the condition for this validity of this approximation is It corresponds to the number of marked states M that is not very large.For example, it can still scale exponentially with n so that M = 2 µn , µ = O(n 0 ), but the coefficient µ in the exponent needs to be small µ 1.
The expression (G1) is obtained using the analytical continuation p x of the binomial distribution p d (46)  In what following we will study the rescaled quantities where i = j, V typ is given in (48)  we get from Eq. ( 39) for where υ(ρ) is given in (41).Eq. (G5) takes the form Then the expression for the PDF for w ij g(w ij ) = V 2 typ P (V 2 typ w ij ) , (G9) can be written in the form (cf. (G1)) g(w) = 2n And the value of the PDF on the lower boundary is In the case of delocalized non-ergodic states (98) The PDF p η (h; 0) ≡ p η (h) is plotted in Fig. 18.The PDF reaches the local maximum on the lower boundary h min corresponding to values of marked state energies W located at the edges of the IB.In the region h ∼ 1/η the probability density reaches very small values, p η (h, z) ∼ η 2 , corresponding to the energies of marked states | − z| η.Maximum value of h = 1/η corresponds to exact resonance = z.The PDF p η (h; 0) has an integrable singularity at this point.It is of interest to consider the PDF of the sum of random variables h m over all marked states (K13) In the non ergodic phase W η mean value of h m is much smaller than its standard deviation

Figure 1 .
Figure 1.Cartoon of the level diagram.Horizontal blue lines depict the energy levels −B ⊥ (n − 2m) of the driver Hamiltonian HD in Eq. (2) separated by 2B ⊥ .A narrow impurity band of width WB ⊥ is marked in light green.The sequence of short black lines depicts the energies of marked states E(zi).Dashed lines depict the elementary path to leading order perturbation theory in B ⊥ for the tunneling matrix element cij(E) given in(16).In this paper we focus on the case of relatively large transverse fields B ⊥ > 1 so that the IB energies lie above the ground state of the total Hamiltonian (2) that corresponds to nearly all qubits polarized in x direction.

−Figure 3 .
Figure 3.The black line shows the plot of 2u(m) (28) vs m between the interval boundaries ±m=L=(n + 1)/2.The horizontal dashed-dotted blue line depicts the region of oscillatory behavior of G m,n/2 (E) with m for a given E described by the WKB solution (30) (see also Eq. (B5) in Appendix) and shown in Fig. 4. The boundaries of this region are the turning points m = ±m0(E) given by Eq. (29) and depicted with blue dots.The regions of m ∈ [m0(E), L]∪[−L, −m0(E)] correspond to the exponential growth of G m,n/2 (E) with m (or decrease with d = n/2 − m).The WKB solution for the right region is given in Eq. (B10).

Figure 4 .
Figure 4.The blue curve shows the d-dependence of the (rescaled) coupling coefficients c(E, d) computed from the exact expression (20) with n = 224 and E = −226.15.We denote the binomial coefficient as n d ≡ C n d .The transverse field is B ⊥ = 1.459.For this value of B ⊥ the impurity band levels E(zj) lie approximately in the middle of the interval between the p = 34th and p = 35th excited energy levels −B ⊥ (n−2p) of the driver Hamiltonian.Red points depict the d-dependence of the same rescaled coefficients c(E, d) given by G n/2−d,n/2 exp(nθ) and determined by the asymptotic WKB expressions given in Appendix (see (B10),(B13)).Dashed lines indicate the boundaries of the oscillatory behavior of the WKB solution (B9).The inset shows the plot for the exponential d-dependence of the rescaled coupling coefficient −c(E, d) in the region of its monotonic behavior d ∈ [1, n/2 − m0(E)] (cf.Eqs.(B13),(26)).The solid blue line corresponds to the exact expression(20), while the approximate WKB solution is shown with red points.

Figure 5 . 2 ⊥
Figure 5. Plots of the WKB phase φ d ≡ φ(E, d) of the oscillations of the coupling coefficient c(E, d) with the Hamming distance d for a number of qubits n = 1000.Both axes are rescaled by n.The phase is plotted relative to its value at d = n/2.We set the energy E = E 0) where E (0) −n − B 2 ⊥ reflects the overall shift of the impurity band due to the transverse field (cf.(33),(34)).Different color curves correspond to different values of B ⊥ > |E|/n with B ⊥ =1.1 (brown), B ⊥ =1.2 (orange), B ⊥ =1.5 (red), B ⊥ =2.1(green), B ⊥ =3.2 (blue), B ⊥ = 10 (black).Each curve varies in its own range n/2−d ∈ [−m0, m0] where m0 is given in (29) and determines the region of oscillatory behavior of the coupling coefficients (see Appendix B for details).For B ⊥ 1 the region of oscillatory behavior shrinks to a point d n/2.In the limit of large values of B ⊥ 1 this behavior occupies almost the entire range d ∈ [0, n].

Figure 6 .
Figure 6.Solid lines show the dependence on the transverse field B ⊥ of the eigenvalues E β of the non-linear eigenvalue problem with Hamiltonian H(E) for the case of n = 50 and M = 2.The plot shows the repeated avoided crossing between the two systems of eigenvalues.One system (colored lines) corresponds to the eigenvalues of the transverse field (driver) Hamiltonian HD = −B ⊥

2 =
E (0) neglecting the tunneling splitting and setting E(zj) = −n for all j ∈ [1, M ] are shown with dashed gray line.

Figure 7 .Figure 8 .
Figure 7.The solid red line shows the dependence of the total weight Q vs transverse field B ⊥ for n = 40.Vertical black and blue lines, respectively, depict the locations of p-even and podd resonances B ⊥ = B ⊥ p defined in the text.The total weight Q undergoes sharp decreases in the vicinity of even resonances.For p < 5 the resonance regions are so narrow that dips in Q are not seen.The width of the regions grow steeply with p.

Figure 9 .
Figure 9. Red points show the empirical probability distribution M (d) j vs d with M (d) j = M j=1 δ(dij − d).Here dij is a matrix of Hamming distances dij between the set of M randomly chosen n-bit-strings (marked states) and i is a randomly chosen marked state.The distribution corresponds to M = 10 7 and n = 60.Black stars connected by a black line show the samples m d from multinomial distribution with mean values M (d) j = M p d where p d is binomial distribution (46).
(d) j with d ij = d are samples from the multinomial distribution with mean values M (d) j = M p d (see Fig.

Figure 10 .
Figure 10.The inverse participation ratio I2 = i | i|ψ β | 4 as a function of the average classical (at vanishing transverse field) energy level spacing δ in units of the typical coupling Vtyp for different numbers, M , of states in the impurity band.We see that for δ /Vtyp ≥ 1 the eigenstates become localized and I2 → 1 independent of M , indicative of eigenstates localized on single bitstring each.

Figure 11 .
Figure 11.The re-scaled inverse participation ratio I2M/3 as a function of the re-scaled impurity band width W/(M Vtyp) for different numbers, M , of states in the impurity band.We see that in the ergodic regime, W/(M Vtyp) ≤ 1, we have I2M/3 = 1, corresponding to the orthogonal Porter-Thomas distribution of states in the impurity band.The inset shows the numerical probability distribution of normalized probabilities M p for an eigenstate over computational states z in the ergodic regime in black, and the analytical orthogonal Porter-Thomas distribution in red.Qualitative arguments in Section VIII suggest that in the non-ergodic delocalized regime I2M/3 ∝ (W/(M Vtyp)) 2 .The black line is proportional to (W/(M Vtyp))2 and we see that I2M/3 aligns with this quantity as long as we do not enter the localized regime δ /Vtyp ≥ 1, see Fig.10.

d jm =d 3 d jm =d 2 d jm =d 1 minibandFigure 14 .
Figure 14.Cartoon of the energies of the marked states m within the impurity band.Energy levels are shown with solid black lines forming groups arranged vertically.All states |zm within one group lie at the same Hamming distance djm = d from a given state |zj with d increasing from right to left.The energy level j is depicted at the right side of the figure with thick black line.Arrows depict the transitions away from the initial state |ψ(0) = |zj into the marked states |zm whose energy levels lie inside the miniband of the width Γj centered at j , i.e., they satisfy the condition | j − m| Γj.Miniband width is indicated with the gray shading area.Arrows of the same color depict transitions within one decay channel, connecting the state |zj to the states a Hamming distance d away from it.Smaller values of d correspond to bigger typical level spacings δ d j (84) and fewer states in a miniband Ω d (95) within the decay channel given by d.
(d) j the number of marked states that are separated by a Hamming distance d from the state |z j (number of terms in the sum (81)) − d jm ) .(82) As discussed in Sec.D the elements of the set {M (d) j } n d=1 are sampled from the multinomial distribution with mean values

(d) j 1 (
see Appendix J).The steep exponential decrease with d of the matrix element V 2 (d) ∝ 1/ n d (39) is canceled by equally steep growth with d of the average number of states in the d channel M (d) j ∝ n d (83).As a result, the binomial factors can-cels out and the average quantity M (d) j V 2 (d) changes only by O(n −1 ) when d changed by 1.
average out.In what following we shall neglect the crossproduct of fluctuational and oscillatory parts (M
)Ω d where the product V (d)Ω d does not depend on d (except from the prefactor).Of course, the analysis based on the decay rate does not apply for the transition to the channels with very few states.The condition Ω d 1 leads to Γ (d) j ∼ V (d) for d = d res min corresponding to the typical Hamming distance from |z j to the nearest marked state in a miniband where the condition V (d) δ d j is satisfied (see Eq. (J1) in Appendix).
search starting from a fully symmetrized state So far we have studied the PT protocol with the Hamiltonian (2) H = H D +H cl that starts from a given marked state of an IB model H cl (3) and aims at finding a different marked state inside a given window of energies using a transverse field Hamiltonian H D = −B ⊥ n m=1 σ x m (2) as a driver.

2 ⊥E 2 + O n 2 B 4 ⊥ E 4 .
(C16)   It follows from the Eqs.(30) and(46) that for d ∈ (n/2 − m 0 , n/2 + m 0 ) the coefficient c 2 (E, d) ∝ 1/ n d decreases exponentially with d, while the distribution p d ∝ n d increases exponentially with d.The binomial factors cancel out and the expression under the summation in (F3) contains very slowly-varying with d (nonoscillatory) part.However for d ∈ (0, n/2 − m 0 ) the coefficient c(E, d) grows exponentially faster than 1/ n d with decreasing d (see Eqs. (B13), (26)).Therefore the variance (F3) is dominated by non-extensive values of d = O(n 0 ) that are much smaller than the smallest Hamming distance d min = O(n) (49) in a randomly chosen row of d ij .Therefore the variance of H ij is not a good statistical characteristic of the PDF of H ij .It is dominated by the extremely rare atypical instances of the ensemble.
+1 |j that is orthogonal to all the marked states.The subset of basis vectors S = {|j } M j=0 spans the M + 1 dimensional subspace with the remaining set S ⊥ of basis vectors spanning the orthogonal N − M − 1 dimensional subspace.One can show that H G does not have matrix elements that couple S with S ⊥ .Assuming that N M one can consider the decay of the state |0 instead of the state |S .We use (144) and omit constant terms and small corrections O(M/N ) in H G .The non-zero matrix elements H ij G = i| H G |j in this subspace S have the form