Large-$N$ expansion and $\theta$-dependence of $2d$ $CP^{N-1}$ models beyond the leading order

We investigate the $\theta$-dependence of 2-dimensional $CP^{N-1}$ models in the large-$N$ limit by lattice simulations. Thanks to a recent algorithm proposed by M. Hasenbusch to mitigate the critical slowing down of topological modes, combined with simulations at imaginary values of $\theta$, we manage to determine the vacuum energy density up to the sixth order in $\theta$ and up to $N = 51$. Our results support the analytic predictions, which are known up to the next-to-leading term in $1/N$ for the quadratic term in $\theta$ (the topological susceptibility), and up to the leading term for the quartic coefficient $b_2$. Moreover, we give a numerical estimate of further terms in the $1/N$ expansion for both quantities, pointing out that the $1/N$ convergence for the $\theta$-dependence of this class of models is particularly slow.


INTRODUCTION
The existence of field configurations with non-trivial topology characterizes the non-perturbative properties of QCD and QCD-like models, leading in particular to a non-trivial dependence on a possible coupling to the topological charge operator q(x), the so-called θ parameter. A non-zero θ implies an additional factor exp(iθQ) in the path-integral of the theory, where Q = ∫ d⁴x q(x) is the global topological charge (winding number); since Q is integer-valued for field configurations decaying fast enough at infinity, the theory is invariant under θ → θ + 2π, so that θ behaves as an angular variable. For small values of θ, the free energy (vacuum energy) density f(θ) can be Taylor expanded around θ = 0, a common parameterization being [1]

f(θ) − f(0) = (1/2) χ θ² (1 + b₂ θ² + b₄ θ⁴ + …) .     (1)

The expansion contains only even terms because θ breaks CP-symmetry explicitly and the theory is CP-invariant at θ = 0. The quadratic coefficient χ is the topological susceptibility and is related to the second cumulant of the topological charge distribution at θ = 0, while the coefficients b₂ₙ are related to higher-order cumulants of this distribution. Since θ-dependence is connected to intrinsically non-perturbative properties of Quantum Field Theories, a numerical approach based on lattice Monte Carlo simulations is the natural first-principles approach to its investigation. However, various analytical strategies make it possible to obtain useful information, at least in certain limits.
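As a concrete illustration, χ and b₂ can be estimated from a sample of global topological charges measured at θ = 0, using the standard cumulant relations χ = ⟨Q²⟩/V and b₂ = −⟨Q⁴⟩_c/(12⟨Q²⟩). The following is a toy sketch on synthetic Gaussian data (not the analysis code of this work):

```python
import numpy as np

def theta_expansion_coefficients(Q, volume):
    """Estimate chi and b2 from samples of the global topological charge Q
    at theta = 0:  chi = <Q^2>_c / V,  b2 = -<Q^4>_c / (12 <Q^2>_c)."""
    Q = np.asarray(Q, dtype=float)
    m2 = np.mean(Q**2)              # second moment = second cumulant here
    m4 = np.mean(Q**4)
    k4 = m4 - 3.0 * m2**2           # connected fourth cumulant
    return m2 / volume, -k4 / (12.0 * m2)

# Toy check: for a Gaussian Q distribution all b_2n vanish.
rng = np.random.default_rng(0)
Q = rng.normal(0.0, 3.0, size=2_000_000)
chi, b2 = theta_expansion_coefficients(Q, volume=100.0)
```

For the Gaussian sample above one expects chi ≃ 9/100 and b2 ≃ 0 within statistical fluctuations.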
Semiclassical approaches to θ-dependence consider classical configurations with non-trivial topology, like instantons and anti-instantons, and compute the path-integral by integrating fluctuations around such classes of configurations. One usually considers configurations with just one instanton or anti-instanton, so that the whole information is contained in the single-instanton effective action. In this way, one obtains results valid for an ensemble of independent (non-interacting) instantons and anti-instantons (dilute instanton gas approximation, or DIGA), leading to a universal dependence of the free energy on θ:

f(θ) − f(0) = χ (1 − cos θ) .     (2)

This approximation is expected to work well at least in some regimes, like for SU(N) Yang-Mills theories in the deconfined, high-temperature phase, where the typical instanton effective action gets large (because of asymptotic freedom) and topological charge fluctuations become rare and dilute. On the other hand, DIGA is known to break down when instanton interactions cannot be neglected, like in the confined phase of QCD and similar models. In this regime, an alternative approach is represented by an expansion in the inverse of the number of field components, i.e. in 1/N for SU(N) Yang-Mills theories or for CP^{N−1} models. Under very general assumptions, like requiring the existence of a non-trivial dependence on θ, the large-N expansion leads at least to semi-quantitative insights, like the prediction that the coefficients b₂ₙ in Eq. (1) be suppressed as 1/N^{2n} [2,3], which for SU(N) Yang-Mills theories has been checked by lattice simulations during the last few years [4–12].

* Electronic address: mario.berni@pi.infn.it
† Electronic address: claudio.bonanno@pi.infn.it
‡ Electronic address: massimo.delia@unipi.it
The case of CP^{N−1} models in two dimensions, which is the subject of the present investigation, is special, because the large-N expansion makes it possible to obtain for them, as for other vector-like models, also quantitative predictions. Leading order computations are available for χ [13] and for all b₂ₙ coefficients [11,14], and even next-to-leading corrections are known for the topological susceptibility [15]. Despite the fact that numerical simulations for these models are less demanding than those for SU(N) Yang-Mills theories, lattice results have failed up to now to provide a clear confirmation of these quantitative analytical predictions, apart from the case of the leading 1/N term for χ [16–19]. Numerical results appeared in some cases to be even inconsistent with next-to-leading predictions for χ [18,19]. A recent investigation [20] pointed out that at least consistency can be recovered once further terms in the 1/N expansion are taken into account. The purpose of the present investigation is to go beyond simple consistency, trying to achieve a quantitative agreement between lattice computations and analytical predictions, at least for the next-to-leading correction to χ and for the leading term in the b₂ coefficient: in Ref. [20] it was suggested that, in order to do so, one should explore N = 50 or larger. At the same time, we would like to achieve a numerical estimate of the further terms in the 1/N expansion for both quantities.
In order to achieve our goal, we have pushed our investigation up to N = 51, where however standard algorithms face severe critical slowing down problems in the decorrelation of the topological charge [18], which can only partially be ameliorated by numerical strategies like simulated (or parallel) tempering in the coupling of the theory [17,20]. For this reason, we have decided to adopt an algorithm recently introduced by M. Hasenbusch in Ref. [19], in which simulations with open and periodic boundary conditions are smartly combined in a parallel tempering framework. In addition to that, following the same strategy adopted in Ref. [20], we will assume analyticity around θ = 0 and exploit simulations performed at imaginary values of θ in order to improve the signal-to-noise ratio, something which turns out to be essential in order to achieve a precise determination of the higher-order cumulants of Q.
The paper is organized as follows. In Section 2 we provide a concise review of CP N −1 models and of large-N predictions for their θ-dependence. In Section 3 we give details about the lattice discretization and the numerical algorithm employed in our study. Numerical results and their analysis within the framework of the 1/N expansion are presented in Section 4. Finally, in Section 5, we give our conclusions.

CP^{N−1} MODELS AND THEIR θ-DEPENDENCE IN THE LARGE-N LIMIT
CP^{N−1} models in two space-time dimensions share many properties with Yang-Mills theories: apart from θ-dependence, they are also characterized by confinement of fundamental matter fields; for this reason they have long represented a theoretical test-bed for the study of non-perturbative physics in gauge theories [13, 21–23].
The elementary fields belong to the projective space of N-component complex vectors. The projective condition is enforced by normalizing the modulus of the vector fields to one and by writing an action which is independent of the residual arbitrary local phase of the fields. In some formulations, such as the one considered in our study, this is rephrased by introducing an auxiliary, non-propagating abelian gauge field A_μ, so that the arbitrary local phase is gauged away, with the advantage of having an action quadratic in the fields. In particular, the Euclidean action, already including the θ-term, reads

S(θ) = ∫ d²x (N/g) D̄_μ z̄(x) D_μ z(x) − iθQ ,     (3)

where z is a complex N-component scalar field satisfying z̄(x)z(x) = 1, D_μ = ∂_μ + iA_μ is the usual U(1) covariant derivative, g is the 't Hooft coupling, which is kept fixed as N → ∞, and

Q = (1/2π) ∫ d²x ε_{μν} ∂_μ A_ν(x)     (4)

is the global topological charge. The free energy (or vacuum energy) density is defined, using the path-integral formulation of the theory, as

f(θ) = −(1/V) log ∫ [dz̄][dz][dA] exp(−S(θ)) ,     (5)

where V is the 2d space-time volume. From this expression it is clear that the parameters entering the Taylor expansion of f(θ) around θ = 0, see Eq.
(1), can be related to the cumulants k_n of the path-integral distribution of the topological charge, P(Q), computed at θ = 0:

χ = k₂/V ,    b₂ = −k₄/(12 k₂) ,    b₄ = k₆/(360 k₂) .     (6)

On a more quantitative level, one can show that [11,15,27,28]

N ξ²χ = e₁ + e₂/N + O(1/N²) ,    e₁ = 1/(2π) ,     (7)

N² b₂ = b̄₂ + O(1/N) ,    b̄₂ = −27/5 ,     (8)

where the length scale ξ appearing in Eq. (7) is the second moment correlation length, defined as

ξ² = (1/4) ∫ d²x |x|² G(x) / ∫ d²x G(x) ,     (9)

with

G(x) = ⟨Tr [P(x) P(0)]⟩ − 1/N ,     (10)

where

P_{ij}(x) = z_i(x) z̄_j(x)     (11)

is the local projector built from the z fields. The next-to-leading correction to ξ²χ is the result of a non-trivial analytic computation performed in Ref. [15], leading to the prediction e₂ ≃ −0.0605. Numerical simulations have fully confirmed the leading order behavior of ξ²χ [16,19,29,30]; however, so far they have been elusive in confirming the prediction for e₂: many numerical works on CP^{N−1} theories show a deviation from the leading term which appears to be of the opposite (positive) sign, and only recently the hypothesis has been made that this could be due to a poor convergence of the series, caused by quite large next-to-next-to-leading-order (NNLO) contributions [20]. Also for the O(θ⁴) coefficient b₂, consistency with the prediction in Eq. (8) is found only assuming large NNLO corrections [20].

NUMERICAL SETUP
In the following we describe various aspects of the numerical methods used in this investigation, starting from the discretization adopted for the path-integral and for the topological observables, then describing the strategy based on the introduction of an imaginary θ term and on analytic continuation, and finally discussing the application of the Hasenbusch algorithm [19] to our numerical setup.

A. Lattice discretization
The theory has been put on a square lattice of size L and, even if the updating algorithm considers different kinds of boundary conditions at the same time in a parallel tempering framework, average values of observables have been computed only in the case of periodic boundary conditions (p.b.c.). We have adopted the tree-level Symanzik-improved lattice discretization for the non-topological part of the action [29], which in standard notation reads

S_L = −2Nβ_L Σ_{x,μ} { c₁ Re[z̄(x+μ̂) U_μ(x) z(x)] + c₂ Re[z̄(x+2μ̂) U_μ(x+μ̂) U_μ(x) z(x)] } ,     (12)

where c₁ = 4/3, c₂ = −1/12, β_L ≡ 1/g_L is the inverse bare coupling and U_μ(x) are the U(1) elementary parallel transporters. The coefficients c₁ and c₂ are chosen so as to cancel logarithmic corrections to the leading O(a²) approach to the continuum limit, where a is the lattice spacing.
The continuum limit is achieved, by asymptotic freedom, as β_L → ∞. In this limit the lattice correlation length ξ_L diverges as 1/a; ξ_L is defined as usual in terms of two-point correlation functions [31]:

ξ_L² = (1 / (4 sin²(π/L))) [ G̃_L(0)/G̃_L(k) − 1 ] ,     (13)

where G̃_L(p) is the Fourier transform of G_L, the discretized version of the two-point correlator of P defined in Eq. (11), and k = 2π/L is the smallest non-zero lattice momentum. Corrections to continuum scaling can be expressed as powers of 1/ξ_L so that, for the adopted discretization, the expectation value of a generic observable will scale towards the continuum as:

⟨O⟩(ξ_L) = ⟨O⟩_cont + A/ξ_L² + O(1/ξ_L⁴) .     (14)

Regarding the topological charge Q, several lattice discretizations exist, all agreeing in the continuum limit, where a well defined classification of relevant configurations in homotopy sectors is recovered. In general, any lattice discretization Q_L of the topological charge operator will be related to the continuum one by a finite multiplicative renormalization [30]:

Q_L ≃ Z(β_L) Q .     (15)

The above relation holds when one considers correlation functions of Q_L, i.e. it is not valid configuration by configuration, where one should instead write Q_L = ZQ + η, where η is a noise contribution related to field fluctuations at the ultraviolet (UV) scale, which is stochastically independent of the global topological background Q but can lead to further additive renormalizations as correlations with higher powers of Q_L are considered. Various smoothing algorithms have been commonly adopted in the literature to dampen the UV fluctuations responsible for such renormalizations, like cooling [32], the gradient flow [33], or smearing; these procedures have been shown to be practically equivalent, once they are appropriately matched to each other [34–36]. In this study we adopt cooling, because of its relative simplicity: it consists in the sequential application of local modifications of the lattice fields, in which the action is minimized locally at each step.
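The momentum-space estimator of the second-moment correlation length can be sketched as follows; this toy example uses a synthetic exponentially decaying correlator on a periodic lattice (an assumption for illustration, not simulation data):

```python
import numpy as np

def second_moment_xi(G, L):
    """Second-moment correlation length on a periodic L x L lattice:
    xi_L^2 = (Gtilde(0)/Gtilde(k) - 1) / (4 sin^2(pi/L)),
    with k = (2*pi/L, 0) the smallest non-zero momentum."""
    Gt = np.fft.fft2(G).real          # G is symmetric, so Gtilde is real
    return np.sqrt((Gt[0, 0] / Gt[1, 0] - 1.0)
                   / (4.0 * np.sin(np.pi / L)**2))

# Toy correlator G(x) ~ exp(-|x|/xi0), with periodic distances.
L, xi0 = 128, 4.0
d = np.minimum(np.arange(L), L - np.arange(L)).astype(float)
r = np.sqrt(d[:, None]**2 + d[None, :]**2)
xi = second_moment_xi(np.exp(-r / xi0), L)
```

Note that for a 2d exponential correlator the second-moment length is sqrt(3/2) · xi0 ≈ 4.9 here, not xi0 itself: the two definitions of correlation length agree only up to a scheme-dependent factor.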
For the purpose of smoothing, the minimized action can differ from the one used to define the path-integral; our choice has been to set c₁ = 1 and c₂ = 0 in Eq. (12) for cooling.
The most straightforward discretization of Q makes use of the plaquette operator Π_{μν}(x):

Q_L = (1/4π) Σ_x ε_{μν} Im[Π_{μν}(x)] ,     (16)

where, as usual,

Π_{μν}(x) = U_μ(x) U_ν(x+μ̂) Ū_μ(x+ν̂) Ū_ν(x) .     (17)

This choice leads to an analytic function of the gauge fields which is non-integer valued and has Z < 1. There are alternative definitions, known as geometric, which are always integer-valued for p.b.c. (hence they have Z = 1); one possibility [29], based on the link variables U_μ(x), is

Q_U = (1/2π) Σ_x Im ln Π_{12}(x) ,     (18)

while the other [37] is based on the projector P defined in Eq. (11). The geometric charge Q_U can easily be interpreted as the sum of the magnetic fluxes (modulo 2π) going out of each plaquette, normalized by 2π, which is integer-valued for any 2d compact manifold.
We have adopted the unsmoothed non-geometric definition Q_L to introduce a θ-term in the action, even if this implies the presence of renormalization effects, which will be discussed below: Q_L is linear in the fields, hence it allows one to make use of standard efficient algorithms like over-heatbath. As for measurements, it has been checked [20] that all definitions yield practically indistinguishable results after the application of a modest amount of cooling, in particular O(10) sweeps of local minimization of each site/link variable over the whole lattice; in any case, we adopted the geometric one (measured after 25 cooling steps), which always yields exactly integer values.
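The geometric definition based on plaquette phases can be sketched for a pure 2d compact U(1) field; this is an illustrative stand-alone implementation (the array layout and function name are our own), exploiting the fact that the sum of principal-value plaquette angles is an exact multiple of 2π with p.b.c.:

```python
import numpy as np

def geometric_charge(theta_links):
    """Geometric topological charge of a compact U(1) field on a periodic
    2d lattice: sum of plaquette phases reduced to the principal interval
    (-pi, pi], divided by 2*pi (always an integer with p.b.c.)."""
    t0, t1 = theta_links            # link phases in the two directions
    # plaquette angle at x: t0(x) + t1(x+mu0) - t0(x+mu1) - t1(x)
    plaq = t0 + np.roll(t1, -1, axis=0) - np.roll(t0, -1, axis=1) - t1
    plaq = np.remainder(plaq + np.pi, 2.0 * np.pi) - np.pi  # principal value
    return int(np.rint(plaq.sum() / (2.0 * np.pi)))
```

The un-reduced plaquette angles telescope to zero on a periodic lattice, so only the 2π jumps introduced by the principal-value reduction survive in the sum, which is why the result is exactly integer.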

B. Imaginary-θ method
The coefficients b₂ₙ appearing in the Taylor expansion of f(θ) around θ = 0 are observables plagued by a large noise-to-signal ratio, especially when one tries to determine them in terms of the topological charge distribution at θ = 0, as in Eq. (6), since one has to measure tiny non-gaussianities in an almost-Gaussian distribution. A better strategy is to add a source term to the action, i.e. to consider the theory at θ ≠ 0, and study the dependence of the lower cumulants on θ, which contains the relevant information on the higher order cumulants. This is not possible in practice as it stands, because the theory at non-zero θ has a complex path-integral measure and is therefore not suitable for numerical Monte Carlo simulations. However, one can consider a purely imaginary source: this strategy has been developed to study QCD at finite baryon density [38–41], and has been successfully applied also to the study of θ-dependence [7, 9, 42–50].
In practice, one sets θ = −iθ_I and assumes analyticity around θ = 0. The action is modified as follows,

S → S(θ_I) = S − θ_I Q ,     (19)

from which it follows that the cumulants of Q are related to the corresponding derivatives of the free energy. Using the expression for f(θ) in Eq. (1) we have

k₁/V = χ θ_I (1 − 2 b₂ θ_I² + 3 b₄ θ_I⁴ + …) ,
k₂/V = χ (1 − 6 b₂ θ_I² + 15 b₄ θ_I⁴ + …) ,     (20)
k₃/V = −χ θ_I (12 b₂ − 60 b₄ θ_I² + …) .

Such equations provide an improved way of measuring χ and the b₂ₙ coefficients, since one can perform a global best fit exploiting the information contained in the θ_I-dependence of the lowest order cumulants, which is statistically more accurate [9,51].
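A global fit of this kind can be sketched as follows; this toy example fits the imaginary-θ expansions of the first two cumulants, k₁/V = χθ_I(1 − 2b₂θ_I² + 3b₄θ_I⁴) and k₂/V = χ(1 − 6b₂θ_I² + 15b₄θ_I⁴), to synthetic data generated from assumed parameter values (not our analysis code):

```python
import numpy as np
from scipy.optimize import curve_fit

def model(theta_stacked, chi, b2, b4):
    """Global model: k1/V and k2/V stacked into one vector so that a
    single curve_fit call fits both cumulants simultaneously."""
    th = theta_stacked[: len(theta_stacked) // 2]
    k1_V = chi * th * (1 - 2*b2*th**2 + 3*b4*th**4)
    k2_V = chi * (1 - 6*b2*th**2 + 15*b4*th**4)
    return np.concatenate([k1_V, k2_V])

th = np.linspace(0.0, 6.0, 7)               # imaginary-theta grid
truth = (0.005, -0.01, 0.0)                 # assumed chi, b2, b4
rng = np.random.default_rng(3)
data = model(np.concatenate([th, th]), *truth)
data += rng.normal(0.0, 1e-4, size=data.size)   # fake statistical noise
popt, pcov = curve_fit(model, np.concatenate([th, th]), data,
                       p0=(0.004, 0.0, 0.0))
chi_fit, b2_fit, b4_fit = popt
```

In a real analysis each point would carry its own statistical error, passed to `curve_fit` through the `sigma` argument.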
In the practical numerical implementation of this procedure, as in Ref. [20], we have used the geometric definition Q_U, taken after a few cooling steps, which is free of renormalizations, to define the cumulants k_n. On the other hand, as explained above, it is convenient, for algorithmic reasons, to discretize the imaginary-θ term in the action by means of the non-geometric definition given in Eq. (16):

S_L → S_L(θ_L) = S_L − θ_L Q_L ,     (21)

so that one would like to know how to re-express Eqs. (20) in terms of θ_L. As explained above, the relation between Q_L and Q is, configuration by configuration, Q_L = ZQ + η, where η is a UV noise. That means that, for any n,

⟨Qⁿ⟩_{θ_L} = ⟨Qⁿ e^{θ_L (ZQ + η)}⟩ / ⟨e^{θ_L (ZQ + η)}⟩ = ⟨Qⁿ e^{Z θ_L Q}⟩ / ⟨e^{Z θ_L Q}⟩ ,     (22)

where the η-dependent factors drop out because η is stochastically independent of Q. Based on that, for any k_n one has

k_n(θ_L) = k_n(θ_I = Z θ_L) ,     (23)

so that the Taylor expansion of the cumulants in Eq. (20) can be rewritten in terms of θ_L by simply replacing θ_I = Z θ_L, i.e. the renormalization constant Z represents just an additional fit parameter.

C. The Hasenbusch algorithm
The lattice action in Eq. (21), being analytic in all fields, can easily be sampled by standard local algorithms. However, these algorithms become non-ergodic approaching the continuum limit, failing to correctly sample the path-integral. The non-ergodicity is due to the fact that, in the continuum theory, different topological sectors are disconnected, and a smooth deformation of the gauge fields cannot change the homotopy class of the configuration. This means that, approaching the continuum limit, the number of Monte Carlo steps required to change Q increases exponentially with 1/a ∼ ξ_L: this is usually known as critical slowing down (CSD), and represents a well known problem in a wide range of theories sharing the presence of topological excitations [4, 18, 52–60]. Moreover, the problem worsens exponentially with increasing N [18], so that the study of the large-N limit rapidly becomes unfeasible, even at not-so-large values of ξ_L.
Since CSD is related to the existence of non-trivial homotopy classes, a possible solution is to switch from periodic boundary conditions to open boundary conditions (o.b.c.) [56]: topological sectors disappear and Q can smoothly change between different values. That does not come for free: finite size effects are more severe, constraining one to measure observables only in the bulk of the lattice; Q is no longer integer-valued, and the information on the n-th order cumulant is typically obtained in terms of integrated n-point correlation functions of the topological charge density, with a consequent worsening of the signal-to-noise ratio.
The idea put forward in Ref. [19], in order to bypass these complications while still taking advantage of the improvement brought by o.b.c., is to consider a collection of similar systems, differing among them by the value of a global parameter which gradually interpolates between o.b.c. and p.b.c.: while each system has an independent Monte Carlo evolution, swaps of configurations between different systems are proposed from time to time in a parallel tempering framework. In this way, the fast decorrelation of Q achieved for the open system is progressively transferred to the periodic one, which is also the system where observables are actually measured, thus completely avoiding the complications of open boundaries. The analysis of Ref. [19] shows that it suffices to open the boundary conditions just along a line defect of length comparable to ξ_L. Moreover, hierarchical updates around the defect, combined with discrete translations of the periodic system from time to time, help optimizing the algorithm. For more details, we refer to Ref. [19].

[Figure caption: the shown time window corresponds to 0.0025% of the total statistics collected for that run. In this case swap acceptances range from 60% to 20% and are larger for c(r) closer to 1.]
The only differences characterizing our implementation are that the algorithm is adapted to the Symanzikimproved action, and that it is used also for simulations at non-zero imaginary θ. Actually, a few preliminary tests showed that the choice of boundary conditions in the θ-term does not affect the efficiency of the algorithm: this is expected, since it is just the usual action term which develops barriers between different topological sectors. Therefore, in order to avoid further complications, we decided to keep periodic boundary conditions in the θ term for all replicas.
The line defect D was put and held fixed on the time boundary: for the replica r, the action terms involving links crossing D are multiplied by a coefficient c(r), with c(0) = 1 corresponding to p.b.c., c(N_r − 1) = 0 corresponding to o.b.c., and intermediate replicas interpolating between the two cases. The replica swap was proposed, after every update sweep, for each couple of consecutive replicas, and then accepted according to a Metropolis step with probability

p = min{ 1, exp(−∆S_L) } ,

where ∆S_L is the global change in the action of the two involved replicas. The length of the defect L_d was chosen so that L_d ∼ ξ_L at the highest β_L. Concerning N_r, for each N we chose it so that, at the highest value of β_L, the lowest acceptance was not lower than 20%.
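The swap step can be sketched generically as follows; the `action(cfg, s)` interface (evaluating a configuration in the ensemble of replica s) is our own illustrative assumption, not the actual implementation of Ref. [19]:

```python
import math
import random

def propose_swap(replicas, action, r):
    """Metropolis swap of the configurations of replicas r and r+1.
    `action(cfg, s)` returns the action of configuration `cfg` in the
    ensemble of replica s; since the ensembles differ only by the defect
    coupling c(s), in practice Delta S_L only involves defect links."""
    a, b = replicas[r], replicas[r + 1]
    dS = (action(b, r) + action(a, r + 1)) - (action(a, r) + action(b, r + 1))
    if dS <= 0.0 or random.random() < math.exp(-dS):
        replicas[r], replicas[r + 1] = b, a     # accept: exchange configs
        return True
    return False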
In Fig. 2 we show how a given configuration evolves through different values of c(r) in a typical run. To get the algorithm working properly one should check that swaps happen uniformly, as in the shown example; otherwise the fast decorrelation achieved for o.b.c. is not transferred efficiently across the systems. Finally, in Fig. 3 we compare the MC evolution of Q, with and without parallel tempering, for N = 51 and for two different values of β_L. Without parallel tempering the charge is almost frozen, while many fluctuations are observed during the same clocktime in the other case, allowing us to perform measurements which would have been practically impossible with just the local algorithm.

[Table I caption (partial): We also report the total accumulated statistics, where each measure was taken after every parallel tempering step. The imaginary-θ fit was always performed in the range [0, 6] with 7 points in steps of δθ_L = 1, except for the β_L with θ_L,max = 6.5, where 11 points were taken (δθ_L = 1 up to θ_L = 3 and then δθ_L = 0.5); θ_L,max = 0 indicates that no simulation at imaginary θ has been performed: in this case a single high-statistics run at θ = 0 has been used to determine χ and ξ at the same time. N_r indicates the number of replicas used for parallel tempering; the number of hierarchical levels was always 3 (see [19] for more details on the hierarchical update).]

NUMERICAL RESULTS
In Tab. I we summarize the parameters and statistics of the simulations performed in the present study, which covered N = 21, 31, 41 and 51. In addition, we will also consider results obtained at lower N in previous studies, in order to investigate the large-N behavior of the theory. Results for N = 21 and N = 31 have already been reported in Ref. [20], but with lower statistics and/or smaller correlation lengths than in the present work.
Statistics at θ_L = 0 are generally larger because in this case, apart from the cumulants of the cooled charge Q_U, we also determined the value of the correlation length ξ_L, which is needed with the best possible precision since it affects the final precision of the continuum extrapolation of ξ²χ. Statistical errors on the cumulants have been estimated by means of a bootstrap analysis.
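A bootstrap error estimate of the kind mentioned above can be sketched as follows; blocking the Monte Carlo time series before resampling (block length chosen larger than the autocorrelation time) is a common variant, and the specific function below is an illustrative sketch, not our analysis code:

```python
import numpy as np

def blocked_bootstrap_error(series, statistic, block=100, n_boot=500, seed=0):
    """Bootstrap error of `statistic` over a Monte Carlo time series;
    resampling whole blocks tames residual autocorrelations."""
    rng = np.random.default_rng(seed)
    n_blocks = len(series) // block
    blocks = np.asarray(series[: n_blocks * block]).reshape(n_blocks, block)
    estimates = np.array([
        statistic(blocks[rng.integers(0, n_blocks, n_blocks)].ravel())
        for _ in range(n_boot)
    ])
    return estimates.std(ddof=1)

# Sanity check on uncorrelated Gaussian data: the error of the mean
# should come out close to sigma / sqrt(n) ~ 0.007 here.
rng = np.random.default_rng(4)
x = rng.normal(0.0, 1.0, size=20_000)
err = blocked_bootstrap_error(x, np.mean)
```

For a derived quantity such as b₂ = −k₄/(12 k₂), the whole cumulant combination would be passed as `statistic`, so that correlations between k₂ and k₄ are propagated automatically.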
In the following we will first discuss the impact of finite size effects, in order to justify the choices of the lattice sizes used in our simulations. Then we will illustrate the procedure and discuss the systematics both for the analytic continuation from imaginary θ and for the extrapolation of our results to the continuum limit; in this respect, we notice that some of our results for ξ²χ have been produced without relying on analytic continuation (see Tab. I), in order to provide a robust test that the systematics of analytic continuation and continuum extrapolation are actually under control.
Finally, we will study the large-N expansion of ξ 2 χ, b 2 , b 4 based on our and on previous numerical results, illustrating the systematics, the comparison with existing analytic predictions and the estimate of further terms in the expansion.

A. Finite size effects
In all of our simulations, and for each fixed value of N , we decided to approach the continuum limit keeping the ratio L/ξ L fixed, so as to ensure that the physical volume is kept fixed. In general, taking a ratio L/ξ L ≫ 1 should ensure that finite size effects be negligible. However, as discussed in Ref. [61], this is not the case for the θ-dependence of 2d CP N −1 models in the large-N limit: the large-N limit and the thermodynamic limit do not commute for this class of models, and wrong results might be obtained if the former limit is taken first, leading in practice to the more restrictive condition (L/ξ L ) 2 ≫ N . Since this is strictly related to the particular large-N behavior of the θ-dependence of the theory, we will try to give an intuitive explanation of this condition (see also Ref. [24] for a related discussion).
Basically, the reason can be traced back to the fact that both ξ²χ and the b₂ₙ coefficients vanish in the large-N limit; indeed, ξ²χ vanishes as 1/N, thus, indicating by l = La the size in physical units, one expects for large N

⟨Q²⟩ = l² χ = (l/ξ)² ξ²χ ∼ (l/ξ)² / (2πN) .

If l/ξ = L/ξ_L is kept fixed while N → ∞, one ends up with a system where ⟨Q²⟩ ≪ 1: in these conditions, the distribution of Q is strongly peaked at Q = 0, with only rare occurrences of Q = ±1 and even rarer occurrences of higher values of |Q|. It is easy to check that such a distribution leads to b₂ₙ coefficients which are very close to those predicted by DIGA, i.e., for instance, b₂ = −⟨Q⁴⟩_c/(12⟨Q²⟩) ≃ −1/12. Let us stress that this happens in practice in many other situations, like for instance for 4d Yang-Mills theories in the high-T phase, where χ vanishes rapidly above T_c and one cannot afford to take very large lattice volumes, ending up again with ⟨Q²⟩ ≪ 1; however, in that case the system is really close to DIGA even in the thermodynamic limit, so that, luckily enough, this does not imply significant systematic errors (see Ref. [62] for a thorough discussion of this point).
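The claim that a sharply peaked charge distribution reproduces DIGA values can be checked directly: in DIGA, Q = n₊ − n₋ with independent Poisson-distributed instanton and anti-instanton numbers, for which all even cumulants of Q are equal, so b₂ = −k₄/(12 k₂) = −1/12 regardless of the instanton density. A small numerical check (toy sampling, not simulation data):

```python
import numpy as np

# DIGA toy model: Q = n_plus - n_minus with independent Poisson counts.
# All even cumulants of Q then equal <n_plus> + <n_minus> = 2*lam,
# so b2 = -k4/(12 k2) = -1/12 independently of lam.
rng = np.random.default_rng(5)
lam, n = 0.05, 5_000_000          # <Q^2> = 2*lam << 1: "small volume" regime
Q = (rng.poisson(lam, n) - rng.poisson(lam, n)).astype(float)
m2, m4 = np.mean(Q**2), np.mean(Q**4)
k2, k4 = m2, m4 - 3.0 * m2**2     # connected cumulants (odd ones vanish)
b2 = -k4 / (12.0 * k2)
```

The same sampling with larger lam would give the same b₂, which is exactly the sense in which DIGA fixes the shape of the θ-dependence independently of ⟨Q²⟩.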
For 2d CP^{N−1} models, instead, the large-N limit is qualitatively different from DIGA: indeed, one expects b₂ ∝ 1/N², which is quite different from the DIGA prediction. Stated in another way, the peculiarity of 2d CP^{N−1} models is that, while global topological excitations become rarer and rarer as N → ∞, they never become really dilute, as one would naively expect. The additional condition (L/ξ_L)² ≫ N ensures that ⟨Q²⟩ is at least of O(1) or larger, so that the non-trivial interactions between topological excitations, which characterize the large-N limit, can be properly taken into account. In order to illustrate the above considerations in practice, in Fig. 4 we report the behavior of ξ²χ and b₂ as a function of the lattice size for N = 41. It clearly appears that in the small volume limit the system is described by DIGA (b₂ ≃ −1/12), and that finite size corrections are still significant for L/ξ_L = 10. We stress, however, that the range of L/ξ_L for which finite size effects are visible at this particular value of N is still compatible with the range observed for other generic (non-topological) observables in the same class of models, i.e. L/ξ_L ≲ 20 [63]: in order to really observe discrepancies with respect to the indications of Ref. [63] one should study a case for which L/ξ_L ≳ 20 and L²/(Nξ_L²) ≲ 1, which however requires N to be of the order of a few hundreds. In any case, in our simulations we kept L²/(ξ_L²N) ∼ 20 for all explored values of N, meaning that L/ξ_L depends on N (see Tab. I). We stress that the condition L²/(ξ_L²N) ≫ 1 was also ensured in the numerical simulations reported in Ref. [20].

B. Analytic continuation from imaginary θ and continuum extrapolation
In Fig. 5 we show an example of the global imaginary-θ fit discussed in Section 3 B. The best-fit procedure was performed according to Eq. (20), in all cases for just the first 3 cumulants, exploiting the whole available imaginary-θ range. In order to assess the impact of systematic effects related to analytic continuation, we have tried in each case to change the fit range and the order (i.e. the truncation) of the fit polynomial, verifying that the variation of the fit parameters was within statistical errors, or adding it to the total error otherwise.
A complete summary of the results obtained for Z, ξ L , ξ 2 χ, b 2 and b 4 at all explored values of N and β is reported in Tab. II.
Results collected at different values of β_L have then been used to obtain continuum extrapolated quantities. In order to discuss the procedure that we have adopted in all cases, we illustrate in detail our analysis for the continuum extrapolation of ξ²χ at N = 21. Results at finite ξ_L are reported in Fig. 6 as a function of 1/ξ_L², and include both results from Ref. [20] (obtained via analytic continuation) and from this work (obtained just from simulations at θ = 0). In general, for the lattice discretization adopted in this work, one expects corrections to the continuum limit for a generic observable O, including only the two lowest non-trivial terms, to be as follows:

O(ξ_L) = O_cont + A/ξ_L² + B/ξ_L⁴ .

In general, for all the quantities considered in this study, a linear extrapolation in 1/ξ_L², i.e. setting B = 0, has worked perfectly well, i.e. with a reduced χ² of order 1 for the best fit, in the whole explored range of ξ_L. However, in order to correctly assess the impact of systematic errors related to the extrapolation, we have also analyzed the effect of including the non-linear term (B ≠ 0), or of considering the linear fit in a restricted range of ξ_L, i.e. discarding points which are farther from the continuum limit. The systematics of this procedure are reported in Tab. III, and some of the best fits are reported in Fig. 6 as well. After considering the observed systematics, our final determination for ξ²χ at N = 21 has been ξ²χ = 0.00765(4), to be compared with ξ²χ = 0.00759(5) from Ref. [20], ξ²χ = 0.00767(5) from Ref. [19] (adopting a different discretization) and ξ²χ = 0.00800(20) from Ref. [18].
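A continuum extrapolation of this kind reduces to a weighted polynomial fit in 1/ξ_L²; the following sketch (with invented noise-free data, for illustration only) shows the linear and quartic variants:

```python
import numpy as np

def continuum_limit(xi_L, O, dO, quartic=False):
    """Weighted least-squares fit of O(xi_L) = O_cont + A/xi_L^2
    (+ B/xi_L^4 when quartic=True); returns (O_cont, error)."""
    x = 1.0 / np.asarray(xi_L, dtype=float)**2
    V = np.vander(x, 3 if quartic else 2, increasing=True)  # 1, x, (x^2)
    W = np.diag(1.0 / np.asarray(dO, dtype=float)**2)
    cov = np.linalg.inv(V.T @ W @ V)
    coef = cov @ (V.T @ W @ np.asarray(O, dtype=float))
    return coef[0], np.sqrt(cov[0, 0])

# Synthetic data lying exactly on O_cont + A/xi^2 (A = 0.1 assumed).
xi = np.array([5.0, 10.0, 20.0, 40.0])
O = 0.00765 + 0.1 / xi**2
cont, err = continuum_limit(xi, O, dO=np.full(4, 1e-5))
```

Comparing the `quartic=False` and `quartic=True` results on the same data is one simple way to quantify the extrapolation systematics discussed above.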
The procedure above has been repeated for all explored quantities and for all N. In Figs. 7, 8 and 9 we show the continuum extrapolations of ξ²χ, b₂ and b₄ for N = 31, 41 and 51; for ease of reading we report just the linear best fits over the whole range, even if the final continuum determinations take into account the whole systematics, as in the example above.
Final results are shown in Tab. IV, where we report the continuum limit of ξ²χ, b₂ and b₄ for N = 21, 31, 41 and 51, along with continuum results for these observables at other values of N, taken from Refs. [19,20].

C. Results for ξ²χ and its large-N scaling

The main purpose of this section is to make use of the results reported in Tab. IV to investigate the large-N behavior of the topological susceptibility and compare it with analytical predictions. In Fig. 10 we plot the quantity Nξ²χ, which should approach a constant for N → ∞, as a function of 1/N, together with the LO and NLO analytic computations, and one of our best fits, to be discussed in the following.
Similarly to what has been done in Ref. [20], we fit our data with a function of the type

N ξ²χ = e₁ + e₂/N + e₃/N² + e₄/N³ ,

which includes up to N³LO corrections in the 1/N expansion: all best-fit systematics are summarized in Tab. V; in all cases, the LO term has been fixed to the well established analytic prediction e₁ = 1/(2π). The systematics for the e₂ coefficient are also plotted in Fig. 11 in order to make the discussion clearer. If one sets e₃ = e₄ = 0, thus allowing only for the NLO correction, acceptable or marginally acceptable best fits are obtained when data for N < 13 are discarded. Nevertheless, the results obtained for e₂ are not stable and show a systematic drift as the fit range is changed, suggesting that NNLO corrections could be important. They are positive and in clear disagreement with the analytic prediction e₂ = −0.0605 [15] (as also reported in previous literature) if the fitted range is large enough, but decrease systematically and beyond errors as the fit range is restricted to larger and larger values of N, finally becoming negative and compatible with the analytic prediction, even if within very large error bars (see Fig. 11).
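The large-N fit just described can be sketched as follows; the data here are synthetic values generated from assumed coefficients (e₂ = −0.0605, the prediction of Ref. [15], and e₃ = 1.5, the estimate quoted below), whereas a real fit would use the continuum results of Tab. IV with their errors:

```python
import numpy as np
from scipy.optimize import curve_fit

E1 = 1.0 / (2.0 * np.pi)   # LO coefficient, fixed to the analytic value

def nchi(N, e2, e3):
    # N xi^2 chi = e1 + e2/N + e3/N^2 (the e4/N^3 term is dropped here)
    return E1 + e2 / N + e3 / N**2

N = np.array([9.0, 13.0, 21.0, 26.0, 31.0, 41.0, 51.0])
y = nchi(N, -0.0605, 1.5)                 # assumed "true" coefficients
(e2_fit, e3_fit), pcov = curve_fit(nchi, N, y, p0=(0.0, 0.0))
```

Repeating the fit with the lowest-N points removed, or with e₃ set to zero, mimics the range and truncation systematics summarized in Tab. V.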
On the other hand, when the NNLO correction is included in the fit (e₃ ≠ 0), the results obtained for e₂ are reasonably stable and compatible with the analytic predictions. Moreover, the NNLO term e₃ appears quite stable as well, also when the NLO coefficient e₂ is fixed to the theoretically predicted value, and even when a further term (e₄ ≠ 0) is included in the fit, so that we can provide a quite conservative estimate, e₃ = 1.5(5), which considers all variations observed for this coefficient in the various fits. The values obtained for e₄ are not sufficiently precise or stable to allow for any estimate; however, we can state it is of O(10).

[Table IV caption (partial): continuum results for N ξ²χ · 10³, b₂ · 10³ and b₄ · 10⁵; the result for N = 9 is taken from Ref. [20], while the one for N = 10 is from Ref. [19]. The result obtained for ξ²χ at N = 41 is in good agreement (less than 1σ) with that obtained using a different discretization in Ref. [19] (ξ²χ = 0.00391(2)), where however no determination was given for b₂ and b₄; for N = 31 one should instead compare with the results of Ref. [20] (ξ²χ = 0.00503(6), b₂ = −0.00231(22)); also in this case the agreement is reasonable (1.6σ and 0.9σ respectively).]
Finally, we point out that the values obtained for the reduced χ² in the fits including N²LO and N³LO corrections are generally low: one possible interpretation is that we have been too conservative in estimating the errors on continuum-extrapolated quantities, which might also explain why the fit including just e2 yields acceptable values of χ² but is not stable.
The present results clarify why previous lattice studies observed positive deviations with respect to the LO prediction, despite e2 being negative: the NNLO term has the opposite sign and, given its estimated magnitude, is expected to dominate up to N ∼ |e3/e2| ≈ 20-30. Our analysis fully supports the analytic prediction for e2. On the other hand, if we had to provide an independent determination of e2, a conservative estimate based on our systematics would be e2 = −0.05(3), which is still quite inaccurate despite the large numerical effort; the difficulty is clearly related to the fact that e2 turns out to be quite small in magnitude, with respect to both e1 and e3, so that it is hardly detectable.
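The crossover scale quoted above follows from simple arithmetic on the central values of the coefficients: the NNLO term e3/N² exceeds the NLO term |e2|/N whenever N < e3/|e2|, so below that scale the net deviation from the LO value is positive even though e2 is negative. A minimal check, using the central values e2 = −0.0605 and e3 = 1.5 from the text:

```python
# Crossover between NLO and NNLO terms in N*xi^2*chi = e1 + e2/N + e3/N^2.
# Central values from the text; below N* the positive NNLO term wins.
e2, e3 = -0.0605, 1.5
N_star = e3 / abs(e2)
print(round(N_star))  # ~25, inside the quoted 20-30 range

# Sign of the total subleading correction around the crossover:
for N in (10, 25, 40):
    print(N, e2 / N + e3 / N**2 > 0)
```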

D. Large-N scaling for b2 and b4
We turn now to the analysis of the large-N limit of b2. In this case, following again the lines of Ref. [20], we employ a fit function including up to N³LO corrections,

N² b2 = b̄2 + k1/N + k2/N² + k3/N³, (29)

which was the minimal polynomial in 1/N capable of fitting the data available in Ref. [20] after fixing the leading-order term to the predicted theoretical value, b̄2 = −27/5. In Tab. VI we report the systematics of the best-fit procedure, while in Fig. 12 we plot the best fit obtained by fitting the data to Eq. (29) in the whole available range; the systematics for b̄2 are also reported, to improve clarity, in Fig. 13. As can be appreciated, the new data collected at large N permit us to perform a best fit without fixing the value of the LO term. Results obtained for b̄2 are not stable as the fit range is changed if only NLO corrections are included in the fit (k2 = k3 = 0), and tend to become more and more compatible with the analytic prediction b̄2 = −5.4 as the range is restricted to larger and larger values of N. When k2 and/or k3 are included, results for b̄2 are more stable and always compatible with the analytic prediction: an independent conservative estimate in this case would be b̄2 = −5(1).

[TABLE V: Summary of the fit systematics for the determination of the large-N behavior of ξ²χ using the fit function N ξ²χ = e1 + e2/N + e3/N² + e4/N³ (numerical entries garbled in extraction). Blank spaces mean that the corresponding coefficient was set to 0 in the fit procedure, while numerical values with no error mean that the corresponding coefficient was fixed to that value.]
The present precision also allows us to obtain a rough estimate of the order of magnitude of the corrections to the large-N scaling of b2: a conservative estimate for the NLO term, which is sufficiently stable in all performed fits, is k1 = 120(60). Our results indicate that the coefficients of the expansion seem to grow by around one order of magnitude at each step, resulting in a very slow convergence towards the large-N limit, as already observed for the topological susceptibility.
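The slowness of this convergence can be made concrete with the central values quoted above: even at the largest N simulated here, the NLO term is still a sizable fraction of the LO one. A quick check, using b̄2 = −5.4 and k1 = 120 from the text as inputs:

```python
# Relative size of the NLO correction to N^2 b2 at the largest N simulated,
# using the central values quoted in the text (illustrative arithmetic only).
b2bar, k1, N = -5.4, 120.0, 51
rel = (k1 / N) / abs(b2bar)
print(f"{rel:.2f}")  # -> 0.44: still a ~40% correction at N = 51
```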
Concerning b4, the continuum-extrapolated results obtained so far, which are reported in Fig. 14 in terms of the quantity N⁴ b4, are compatible with zero within statistical and systematic uncertainties, so that we can only set upper bounds. Nevertheless, such upper bounds are already interesting when compared to the LO large-N prediction. Indeed, with the present data one would naively set an upper bound |b̄4| ≲ 20 on the modulus of the LO coefficient, which is almost one order of magnitude smaller than the theoretical prediction, pointing to the presence of large NLO corrections also in the case of b4.

[TABLE VI: Summary of the fit systematics for the determination of the large-N behavior of b2 using the fit function N² b2 = b̄2 + k1/N + k2/N² + k3/N³. The conventions are the same as in Tab. V.]

CONCLUSIONS
The main purpose of our study was to clarify the matching between lattice computations and analytic large-N predictions regarding the dependence of 2d CP^{N-1} models on the θ-parameter. The picture emerging from previous lattice studies pointed to an apparent disagreement in the sign of the deviation of the topological susceptibility from its LO 1/N prediction, and to values for the LO 1/N² behavior of the b2 coefficient missing the predicted analytic value by around a factor of 2. A possible way out was proposed in Ref. [20], which showed that, assuming large higher-order contributions in the 1/N expansion, the disagreement could disappear, leading at least to partial consistency between lattice data and analytic computations. In Ref. [20] it was also pointed out that, assuming the presence of such higher-order corrections, it would be necessary to reach at least N = 50 to make the situation clearer.
In this work we have accomplished this goal, exploiting a recent algorithm proposed by M. Hasenbusch to mitigate the critical slowing down of topological modes [19], which we have adapted to our Symanzik-improved discretization in the presence of a θ-term. This allowed us to push our investigation up to N = 51.
In this way we have been able to provide independent determinations of the NLO coefficient for χ and of the LO coefficient for b2, both in agreement with the analytic predictions. In particular, we have estimated e2 = −0.05(3) (analytic prediction e2 ≃ −0.0605 [15]) and b̄2 = −5(1) (analytic prediction b̄2 = −27/5 = −5.4). At the same time, we have provided first estimates for the NNLO contribution to χ (e3 = 1.5(5)) and for the NLO contribution to b2 (k1 = 120(60)), which are presently unknown analytically.
No definite result has been obtained for b4, instead, because of the larger statistical uncertainties involved in the determination of this observable; we could only set upper bounds, which however seem to fall short of the LO large-N prediction by around one order of magnitude, pointing to the presence of large NLO corrections also in this case.
Our results, apart from successfully confirming the present analytic estimates for χ and b2, point out that the convergence of the large-N expansion is particularly slow for this class of models. As suggested in Ref. [20], this could be due to the non-analytic behavior which is expected at N = 2.