$\theta$-dependence in the small-$N$ limit of $2d$ $CP^{N-1}$ models

We present a systematic numerical study of $\theta$-dependence in the small-$N$ limit of $2d$ $CP^{N-1}$ models, aimed at clarifying the possible presence of a divergent topological susceptibility in the continuum limit. We follow a twofold strategy, based on one side on direct simulations for $N = 2$ and $N = 3$ on lattices with correlation lengths up to $O(10^2)$, and on the other side on the small-$N$ extrapolation of results obtained for $N$ up to $9$. Based on that, we provide conclusive evidence for a finite topological susceptibility at $N = 3$, with a continuum estimate $\xi^2 \chi = 0.110(5)$. On the other hand, results obtained for $N = 2$ are still inconclusive: they are consistent with a logarithmically divergent continuum extrapolation, but do not yet exclude a finite continuum value, $\xi^2 \chi \sim 0.4$, with the divergence taking place for $N$ slightly below 2 in this case. Finally, results obtained for the non-quadratic part of $\theta$-dependence, in particular for the so-called $b_2$ coefficient, are consistent with a $\theta$-dependence matching that of the Dilute Instanton Gas Approximation at the point where $\xi^2 \chi$ diverges.


INTRODUCTION
The CP N −1 models in two space-time dimensions have been extensively studied in the literature because they represent an interesting theoretical laboratory for the study of gauge theories [1][2][3]. As a matter of fact, many intriguing non-perturbative properties, such as confinement, the existence of field configurations with nontrivial topology and the related θ-dependence are features that these models share with 4d Yang-Mills theories.
The Euclidean action of these models, including the topological term, can be written through a nonpropagating Abelian field A µ as where N is the number of components of the complex scalar field z, which is subject to the constraintzz = 1, D µ = ∂ µ + iA µ and is the topological charge. The θ-dependent vacuum energy (density) is defined through the path integral as [dA]e −S(θ) (3) and it can be parametrized in terms of the cumulants Q n c of the topological charge distribution at θ = 0 * Electronic address: marioberni91@gmail.com † Electronic address: claudio.bonanno@pi.infn.it ‡ Electronic address: massimo.delia@unipi.it P (Q) as: where the topological susceptibility parametrizes the leading θ 2 term while the coefficients b 2n ≡ (−1) n 2 (2n + 2)! Q 2n+2 c Q 2 (6) parametrize the non-quadratic part. One of the most interesting features of the CP N −1 models is the possibility of performing a systematic expansion of any observable, including those related to θdependence, in the inverse of the number of field components 1/N when N → ∞ while g is kept fixed [1]; that closely resembles the 1/N expansion of 4d SU (N ) gauge theories for large number of colors. The similarities are non-trivial, since the relevant scaling quantity turns out to be θ/N in both cases [4][5][6][7], leading to the prediction that the quantitiesb 2n ≡ N 2n b 2n are finite in the large-N limit: this scaling behavior has been verified by numerical lattice simulations both for the CP N −1 [7,8] and for SU (N ) gauge theories [7,9] up to the quartic coefficient b 2 . From a quantitative point of view, CP N −1 models are more predictive, since the leading term in the 1/N expansion is known [1,6,7,9] for all coefficients in the θexpansion of the free energy in Eq. (4), and also the first subleading term in the case of the topological susceptibility [10], while in the case of 4d SU (N ) gauge theories one has just phenomenological predictions, based on the spectrum of pseudoscalar mesons, for the leading term in the 1/N expansion of the topological susceptibility [11,12].
Numerical simulations underlined also some differences between the two theories. Indeed, while in the Yang-Mills case the large-N expected scaling practically holds already for N ≥ 3 [7], this is not quite the case for the CP N −1 models. Indeed, the large-N limit of b 2 shows significant deviations from large-N predictions even for N ∼ 50, and an agreement with lattice data can be recovered only by including large higher-order corrections in 1/N ; a similar behavior is observed for the topological susceptibility [8,13].
Another important difference emerges when looking at the small-N limit. Indeed, while this is predicted and observed to be regular in the case of 4d SU (N ) gauge theories, a singular behavior is expected for 2d CP N −1 approaching N = 2. For instance, one expects a divergence of the topological susceptibility of the CP 1 theory, which can be justified in perturbation theory on the basis of the ultraviolet (UV) divergence of the instanton size distribution [14][15][16][17][18] P N (ρ) ∝ ρ N −3 (7) occurring for N = 2. This result has been tested in many lattice studies and there seems to be a general consensus about the singular behavior of χ for N = 2 and about its origin due to the presence of small instantons, see, e.g., Refs. [19][20][21][22][23][24]. However, there are still many aspects deserving a more careful investigation.
First of all, as we discuss later on, the actual verification of the divergent behavior occurring for N = 2 requires to perform a continuum limit extrapolation which is highly non-trivial, since the divergence is expected to appear as a logarithm of the UV cutoff, i.e., of the lattice spacing a, which could be difficult to disentangle from a regular power-law behavior in a and requires numerical results spanning over orders of magnitude in terms of the dimensionless correlation length of the system.
The second issue is that a divergent behavior for the topological susceptibility has been claimed to be observed also for N = 3 in Ref. [22], where the authors employed a standard lattice action along with an overlap definition of the topological charge, however this is in contradiction with previous results obtained using a different action discretization and a geometric lattice definition of Q [25]: the origin of this discrepancy may be due to the fact that, according to Eq. (7), also for N = 3 small instantons are expected to dominate, so that also in this case the continuum limit has to be handled with care.
Finally, one would like to understand whether the small-N divergent behavior regards just the topological susceptibility or also other coefficients in the Taylor expansion of the free energy in Eq. (4). In asymptotically free theories, small size instantons are expected to be weakly interacting, so that a possible conjecture [13] is that θ-dependence in the N = 2 limit be described by the Dilute Instanton Gas Approximation (DIGA): with the divergence appearing just in χ, while the b 2n coefficients are finite and have alternate sign: A DIGA-like small-N behavior of f (θ) could explain why, unlike the case of SU (N ) gauge theories, large corrections are observed when studying the large-N limit of CP N −1 models, being the 1/N series at large N not able the capture the small-N behavior of the theory. A first evidence of near-DIGA behavior for the quartic coefficient b 2 for N = 2 has been reported in Ref. [26], where however no continuum extrapolation for this quantity is reported. As for higher values of N , no result is known for N < 9.
The goal of the present work is to provide a systematic study of the small-N θ-dependence of CP N −1 models by lattice simulations, in order to go beyond the present state of the art. To do so, we have attacked the problem from two different sides. On one hand, we have performed extensive numerical simulations for N = 2 and N = 3 considering dimensionless correlation lengths spanning over two orders of magnitude, namely reaching values of ξ going above 10 2 , and considering various different ansatzes for the continuum extrapolation in order to fairly assess our systematic uncertainties on the final results. On the other hand, we have also considered numerical simulations for larger values of N , namely N ∈ [4,9], for which the continuum extrapolation is easier and better defined, then trying to obtain information on N = 2, 3 by a small-N extrapolation of these results. We have applied this double-front strategy to the determination of the topological susceptibility and of the first coefficients of the θ expansion of f (θ), up to O(θ 6 ): consistency of results obtained in the two different ways provides a solid way to assess the reliability of our final statements, among which, for instance, the fact that the θ-dependence is finite for N = 3.
The paper is organized as follows. In Sec. 2 we describe the lattice setup adopted for discretizing the theory and for determining the cumulants of the topological charge distribution, as well as our strategy for taking the continuum limit of our results. In Sec. 3 we present and discuss our numerical results and finally, in Sec. 4, we draw our conclusions.

NUMERICAL SETUP
In this section we briefly discuss various issues related to the discretization of the model and of its observables, in particular those related to topology, and to the continuum extrapolation of the numerical results.

A. Lattice discretization
We discretized the action in Eq. (1) at θ = 0 on a periodic square lattice of size L using the tree-level Symanzikimproved lattice discretization [27]: where c 1 = 4/3 and c 2 = −1/12 are improvement coefficients, β L ≡ 1/g L is the inverse bare coupling and U µ (x) are the U (1) gauge link variables. The adoption of the improved action cancels out logarithmic corrections to the leading O(a 2 ) behavior of the discretization errors, where a is the lattice spacing.
Being CP N −1 models asymptotically free for all values of N , the continuum limit is approached as β L → ∞. The a → 0 limit can be traded for that of a diverging lattice correlation length ξ L . Our choice is for the second moment correlation length ξ, defined in the continuum theory in terms of the two-point correlation function of as To define the lattice length ξ L , we adopted the following definition [28], expressed through the Fourier transform G L (p) of G L (x) (i.e., the discretization of Eq. (11)):

B. Discretization of the topological charge
Regarding the topological charge Q, several equivalent lattice discretizations Q L can be adopted, all having the same continuum limit. However, at finite lattice spacing, these definitions and their correlations are related to the continuum by a finite multiplicative renormalization Z [29,30]: where q is the topological charge density. We adopted a geometric definition of the lattice charge [27,31], i.e., a definition with Z = 1, meaning that it yields always an integer number for every lattice configuration. Among the several geometric definitions, we chose one which involves only the link variables [27]: where Π µν (x) ≡ U µ (x)U ν (x +μ)Ū µ (x +ν)Ū ν (x) is the plaquette operator. Despite the fact that Z = 1, renormalization effects are still present, since in general Q L is related to the physical charge Q, configuration by configuration, by a relation like where η is a noise with zero average stemming from fluctuations at the UV scale. For a geometric charge, such noise appears in the form of the so-called dislocations [32,33], i.e., integer valued fluctuations at the scale of the UV cutoff and proliferating over physical contributions as the continuum limit is approached. The presence of a non-zero η gives rise to further additive renormalizations as one considers cumulants of the topological charge, including the topological susceptibility (see Ref. [34] for a review).
To suppress such noise, a smoothing method, such as cooling [32,[35][36][37][38][39][40] or the gradient flow [41,42], can be adopted. Since it has been shown that they are all numerically equivalent [43,44], we adopted cooling for its simplicity and numerical cheapness. Indeed, this method consists in a sequence of local steps of action minimization that damp UV fluctuations: for simplicity, the action that is minimized is the non-improved one, i.e., that in Eq. (10) with c 1 = 1 and c 2 = 0.
Contrary to Refs. [8,13], in this study we did not adopt neither simulations at imaginary values of θ in order to improve the signal-to-noise ratio of cumulants [7,45,46], nor improved algorithms in order to defeat the critical slowing down of topological modes [47][48][49][50]. This is due to the relative ease in obtaining precise determinations of the cumulants of the topological charge for small values of N , even using standard algorithms such as heat-bath or over-relaxation. For these reasons, the coefficients b 2n are determined in this study by simply using the definition given in Eq. (6), with the average taken over the path integral distribution at θ = 0, where the choice for Q is the geometrical topological charge in Eq. (15) measured after a certain number of cooling steps, as discussed later on.
C. Continuum limit at small N Since ξ L = ξ/a diverges as 1/a in the continuum limit, finite lattice spacing corrections can be expressed as a function of 1/ξ L . Thus, with the adoption of the Symanzik-improved action in Eq. (10), one expects ultraviolet corrections to the lattice expectation value of a generic observable O to have the form However, when approaching N → 2 one expects, at least for topological observables, the presence of additional corrections coming from physical topological fluctuations of small size: such physical contributions are neglected in the discretized theory until the lattice spacing is small enough, leading to an additional dependence on a, hence on ξ L . An a priori estimate of such effects can be done only with some assumptions, nevertheless it can be a useful guide. For instance, taking the perturbative estimate for the instanton size distribution reported in Eq. (7) and assuming that topological fluctuations are dominated by a non-interacting gas of small instantons and anti-instantons, one has that the number of instantons n I and anti-instantons n A are distributed as two independent Poissonians with equal mean: where the integral is carried over sizes ranging from the UV scale, set by the lattice spacing a, up to a certain IR length scale ρ 0 , which is proportional to La. Since, with these hypotheses, we have the following predictions: Ultraviolet corrections predicted by Eqs. (20) become negligible as N grows, in particular one expects them to disappear for N ≥ 4, where the contribution from small instantons becomes negligible. On the other hand, the contribution of small instantons becomes dominant for N = 2 and 3, where it leads either to a logarithmically divergent continuum limit for N = 2, or to linear, instead of quadratic, corrections in the lattice spacing for N = 3. Notice that, under the assumptions of independent Poisson distributions for n I and n A , the b 2n coefficients are finite with values as predicted by DIGA in Eq. (9), so that no further corrections are expected, with respect to Eq. (17), in their approach to the continuum limit. Such considerations will be used with caution in the following. In particular, in order to correctly assess our systematics on the continuum limit, we will consider the possible presence of generic power law corrections in the lattice spacing for N ≤ 4, both for χ and for the other terms in the θ-expansion.

D. Continuum limit and smoothing
Since we adopt a smoothing method in order to remove field fluctuations at the UV cutoff scale, which are responsible for unphysical contributions to lattice topological observables, we need to fix how the amount of smoothing is changed as one approaches the continuum limit.
Smoothing algorithms work in general as diffusive processes, affecting field correlations up to a given distance, i.e., up to a given smoothing radius, which is proportional, in dimensionless lattice units, to the square root of the amount of smoothing, i.e., to √ n cool for cooling, where n cool is number of cooling steps, or to √ t for the gradient flow, where t is the flow time. It is a standard procedure to change the amount of smoothing so that the smoothing radius is kept fixed in physics units: that would correspond to change n cool proportionally to ξ 2 L . However, for a model like the one we are exploring, where small distance physical contributions are expected to be quite significant, one should be careful and take care, additionally, of the dependence of continuum results on the physical smoothing radius, eventually sending it to zero.
An alternative to this difficult double limit procedure is to take the continuum limit at fixed number of cooling steps n cool , so that the smoothing radius goes to zero proportionally to a and there is no possibility that physical contributions at small scales are smoothed away. There are good reasons to believe that such procedure works correctly in the present case.
As we have discussed above, for a geometric charge like the one used in this study renormalization effects are essentially due to dislocations leading to a wrong and/or ambiguous counting of the topological winding number. Such dislocations consist of exceptional field fluctuations living at the lattice spacing scale, hence it is reasonable to expect that they will be suppressed by a given and fixed number of cooling steps n cool , independently of the value of the lattice spacing a.
Based on such considerations, in the following we will consider results obtained by performing the continuum limit at fixed n cool , then carefully checking the possible systematics related to this procedure. In particular, we will show that while results obtained at finite lattice spacing, but at different number of cooling steps, usually differ from each other well beyond their statistical errors, the so obtained continuum-extrapolated results are not significantly dependent on n cool , and usually well within statistical errors.

NUMERICAL RESULTS
In Tables I-VII we summarize the parameters of the performed simulations, along with the total accumulated statistics. Configurations were generated using standard local algorithms, in particular our elementary Monte Carlo step consisted in 4 lattice sweeps of over-relaxation followed by a sweep of over-heath-bath; measures were taken every 10 Monte Carlo steps. We simulated CP N −1 models with N ranging from 2 to 8. For each value of N we simulated several runs at different values of the correlation length (i.e., at different values of β L ); for each ξ L we measured ξ 2 χ, b 2 and b 4 in order to be able to extrapolate these quantities towards the continuum limit.
Concerning the choice of the lattice size, we performed simulation at fixed L/ξ L , ensuring that L/ξ L ∼ 12-15 for each run to have finite size effects under control [51], see Fig. 1. In those cases when that was not possible (more precisely, for high-ξ L runs for N = 2), several lattices with different sizes were simulated and then the infinite volume limit was performed by fitting the L-dependence of every observable O with the law where O ∞ is the desired quantity and a and b are additional fit parameters. An example of extrapolation towards the thermodynamic limit is shown in Fig. 2. First, we consider the case N > 3, for which a finite continuum limit is surely expected for the topological susceptibility, with no qualitative differences in the continuum scaling compared to the large-N case. For this reason, in order to extrapolate the quantity ξ 2 χ towards the continuum limit we have fitted its dependence on ξ L according to the ansatz Only for N = 4, due to its proximity to N = 2 and 3 (see later discussion), we have also considered the possible presence of further power law corrections. An example Several sources of systematic errors have been checked: first, the extrapolation was performed fitting data in several ranges of ξ L to check that the obtained extrapolations were all consistent with each other and that, as the fit range is restricted, the O(x 4 ) corrections became negligible. Second, as anticipated in Sec. 2 D, we extrapolated the continuum limit at fixed number of cooling step n cool for several values of n cool , checking that this procedure does not introduce significant systematics in continuum-extrapolated values.
When changing the number of cooling steps, we observe that, while measures at coarse lattice spacing differ, the dependence on n cool is less and less visible as the continuum limit is approached, making the continuum extrapolation stable as n cool is varied. In Fig. 4 we show an example of the continuum extrapolation at two different values of n cool for N = 5 while, in Fig. 5, we show, for the same case, that any variation in the continuum extrapolation observed when changing n cool is well contained inside our statistical errors. The final, continuum determinations obtained for ξ 2 χ are reported in Tab. VIII.  In the N = 3 case we fit the dependence of ξ 2 χ on ξ L according to the following function: where we consider both the extrapolations obtained with fixed c = 1, as suggested by the ansatz in Eq. (20), and with c left as a free parameter. In both cases, the ansatz in Eq. (23) provides a very good description of our numerical results, and the x 4 corrections turn out to be    necessary only when considering in the fit range correlation lengths as small as ξ L 5. Moreover, even when c is treated as a free parameter, its values turn out to be compatible with 1 within errors, thus giving further numerical support to the ansatz in Eq. (20).
Examples of continuum extrapolations are shown in Fig. 6, where we also report our final continuum estimate for N = 3, ξ 2 χ = 0.110 (5). The quoted error includes all possible systematics related to the variability of the fit parameter a when changing the fitting ansatz, the fitting    range (with ξ min varied between 2 and 7) and the number of cooling steps (with n cool varied between 10 and 50). Let us discuss more in detail the systematics related to cooling. Fig. 6 shows that results obtained for n cool = 10 and 50 are significantly different from each other, nevertheless, as also evident for one particular fit ansatz in Fig. 6, their continuum limit shows very little variations, when compared to statistical errors on the fit parameters.
There is a simple way to understand why results at different n cool differ so much at finite ξ L , while coinciding in the ξ L → ∞ limit. As already discussed in Sec. 2 D, cooling acts as a diffusive process which smooths away field fluctuations (both physical and unphysical) below an effective radius r = ar(n cool ), where the radius in lattice spacing units,r, is a function of n cool only, i.e. independent of the lattice spacing a. are more abundant. On the other hand, the effect must fade away as a → 0.
If the above picture is correct, results obtained at different a and different n cool , but such that r = ar(n cool ) is the same, should coincide: for instance, results shown in Fig. 6 for n cool = 10 should go onto those at n cool = 50 if they are shifted along the horizontal axis by a constant factor equal tor(10)/r (50). Such experiment has been performed for n cool = 10, 25, 50 in Fig. 7, showing that it works perfectly: plotting all data as a function of an effective variable x eff proportional to r/ξ =r(n cool )/ξ L , wherer(n cool ) has been found empirically, data at different values of n cool collapse perfectly onto each other over the whole range of explored correlation lengths; a similar collapse can be obtained also for other values of N . Therefore, the dependence of ξ 2 χ on n cool observed at finite lattice spacing can indeed be simply interpreted in terms of a global effective rescaling of 1/ξ L , so that such dependence naturally fades away when ξ L → ∞.
To summarize, our results for N = 3 provide solid evidence that ξ 2 χ is indeed finite in the continuum limit of this theory. On the other hand, that will be further checked and supported by an independent small-N extrapolation based on N > 3 results only, which is discussed in Sec. 3 D and will make the evidence conclusive.    20) could be correct, leading to a susceptibility which diverges logaritmically in the continuum limit, i.e. has the following dependence on x = 1/ξ L : On the other hand, corrections to such prediction leading to a finite ξ 2 χ cannot be excluded a priori, in this case one should consider a dependence on ξ L like that used in Eq. (23) for the N = 3 case, i.e.: where c is a positive exponent. We have to say that, unfortunately, despite the fact that our present results extend over almost two orders of magnitudes in terms of the correlation length ξ L , we are still not able to clearly distinguish between the two possibilities, in the sense that both ansatzes, Eqs. (24) and (25), return acceptable values of the χ 2 /dof test, even if marginally better for the convergent ansatz. Let us consider, for instance, the data reported in Fig. 8, where we consider only results obtained for ξ L > 10. In this range O(x 4 ) corrections turn out to be unnecessary. Considering data obtained for n cool = 50 we obtain χ 2 /dof = 7.9/6 for the divergent ansatz in Eq. (24) and χ 2 /dof = 3.8/5 for the convergent ansatz in Eq. (25); in the latter case we also obtain c = 0.30 (15) and a prediction ξ 2 χ = 0.43(12) (for n cool = 20, the latter prediction changes to ξ 2 χ = 0.42 (11), c = 0.33 (16), with χ 2 /dof = 3.8/5). Notice that, in the case of the convergent fit, the coefficient c is at two standard deviations from zero, which is the boundary where also this fit becomes divergent: this is consistent with the fact that the logarithmic fit is marginally acceptable.
The situation does not change appreciably when changing the fit range, with both ansatzes remaining acceptable even if with slightly lower values of the χ 2 /dof test in the case of a finite continuum susceptibility: if the latter scenario could be assumed a priori we would conclude ξ 2 χ ∼ 0.40 (15) for N = 2. However, the fact that no conclusive answer can be obtained based on our present data for N = 2, is confirmed by looking at the best fit curves reported in Fig. 8: the two curve profiles (convergent and divergent) are hardly distinguishable in the explored range of ξ L and deviate from each other only for much larger values of ξ L . Another independent possibility to discriminate between these two different behaviors is to use data obtained for N > 2 and extrapolate them towards N = 2. This topic is covered in Sec. 3 D. FIG. 8: Extrapolation towards the continuum limit of ξ 2 χ for N = 2 using data in the range ξL > 10. The solid and dashed lines represent, respectively, the best fits obtained using fit functions h(x) = a1 log(x/a2) + a3x 2 (where x = 1/ξL) and g(x) = a+a1x 2 +a2x 4 +a3x c with measure taken after n cool = 50 cooling steps. The former best fit yields χ 2 /dof = 7.9/6 while, in the latter case, we obtain the continuum limit a = 0.43 (12) and the exponent c = 0.30(15) (χ 2 /dof = 3.8/5). The dotted line, instead, represents again the best fit obtained using fit function g(x) = a + a1x 2 + a2x 4 + a3x c but with measures taken after n cool = 20, which gives a = 0.42 (11) and c = 0.33(16) (χ 2 /dof = 3.8/5). In this section we aim at tackling the question about the small N behavior of the topological susceptibility from another independent front, by extrapolating continuum results obtained for ξ 2 χ at N > 2 towards N = 2. A summary of such results, including all N < 10, is reported in Tab. VIII. Using Eq. (7) and assuming non-interacting instantons, the expected behavior of ξ 2 χ when approaching the N → 2 limit is [13] Therefore, we fitted the N -dependence of ξ 2 χ for N > 2 using the following function: Best fit of the small-N behavior of ξ 2 χ for N ranging from 3 to 9. The solid and dashed lines represent, respectively, the best fits obtained using fit function (27) with N * = 2 or left as a free parameter. The best fits yield, respectively, χ 2 /dof = 5.1/5 and χ 2 /dof = 4.56/4. The best fit is quite good and is shown in Fig. 9, the corresponding parameters are the following: a = 0.119 (10), N * = 1.90 (14), γ = 0.91(4), χ 2 /dof = 4.56/4. This best fit, yielding a central value for N * < 2, technically supports a finite extrapolation towards N = 2; however, N = 2 is well within one standard deviation from N * , so that even this approach reveals to be inconclusive, at least for this issue. As shown in Fig. 10, the error on the best fit blows up as N → 2 is approached, being N * very close to it, so that this extrapolation turns out to be compatible with the hypothetical finite value found from the convergent continuum limit of Sec. 3 C, ξ 2 χ = 0.40 (15), but within quite large uncertainties. As a matter of fact, a best fit of the same data set with (26) but fixing N * = 2 yields a perfectly compatible results: a = 0.112(2) and γ = 0.89(1) (χ 2 /dof = 5.1/5); this best fit is depicted in Fig. 9 as well. Finally, as a further consistency check of the solidity of these results, we fitted our small-N data also in narrower ranges, in particular considering only data for N > 3 or for N > 4. Best fits are shown in Fig. 11. In both cases, N * and γ turned FIG. 10: Best fit of the small-N behavior of ξ 2 χ for N ranging from 3 to 9. The dashed line represents the best fit obtained using fit function (27) leaving N * as a free parameter, the shadowed band represents the fit error and the diamond points represent the hypothetical finite continuum limit obtained from the convergent continuum extrapolation of ξ 2 χ for N = 2.
out, again, to be compatible with 2 and 1 respectively: Moreover, extrapolating these fits towards N = 3, we obtain values for ξ 2 χ(N = 3) which are, in both cases, in perfect agreement with the continuum limit obtained from direct N = 3 simulations in Sec. 3 B. Based on these results, we conclude that the finiteness of the topological susceptibility for N = 3 can be definitely assessed.

E. Small-N behavior of the b2n coefficients
In this section we study the small-N behavior of the b 2n coefficients, in particular, we aim at checking the hypothesis that these coefficients are finite in the N → 2 limit and that, in this limit, they approach the DIGA prediction. With our statistics, we could only obtain reliable estimations of b 2 , while already b 4 always turned out to be compatible with zero, after continuum extrapolation, in all explored cases. Thus, we can presently discuss only the small-N behavior of b 2 .
As we have discussed in Sec. 2 C, contrary to what happens for the topological susceptibility, in this case we do not expect a priori modifications to the standard form of UV corrections reported in Eq. (17). As a confirmation of this expectation, a continuum extrapolation performed according to the fit function in Eq. For N = 2 and 3 we have also considered the possible addition of generic power law corrections to Eq. (22) proportional to x c , as we have done for ξ 2 χ, however this turned out to be irrelevant in this case, with modifications to the continuum extrapolation staying within errors. The only modification which can be noticed in the N = 2, 3 cases is that the approach to the continuum limit is steeper, however this is compensated by the larger values of ξ L available from our simulations in these cases.
All continuum extrapolations are reported in Tab. IX. Also in this case the quoted errors include systematics, which have been assessed by observing the variation of central fit values when changing the fit function, the fit range and the number of cooling steps n cool . Our data for N = 2 confirm that b 2 is finite, with a continuum estimate b 2 (N = 2) = −0.070(4) which is ∼ 3σ off from the DIGA prediction b DIGA 2 = −1/12 ≃ −0.0833. We can try to extrapolate, based on present results, the value N * of N for which b 2 reaches the DIGA prediction, to that aim we try to fit our data according to a critical behavior like: FIG. 13: Extrapolation towards the continuum limit of b2 for N = 3. The solid and dashed lines represent the best fits obtained using the fit function g(x) = a+a1x 2 (where x = 1/ξL) using measures taken after n cool = 50 cooling steps in the ranges ξL > 8 and ξL > 15. The best fits yield, respectively, a = −0.0471(8) (χ 2 /dof = 4.65/5) and a = −0.0485(11) (χ 2 /dof = 1.06/2). The dotted line represents instead the best fit obtained using measures taken after n cool = 20 cooling steps in the range ξL > 8. The best fit yields a = −0.0478 (8) with χ 2 /dof = 2.85/5. Data obtained for different numbers of cooling steps have been plotted slightly shifted to improve readability. The diamond point represents our final estimation of the continuum limit. FIG. 14: Extrapolation towards the continuum limit of b2 for N = 2. The solid and dashed lines represent the best fits obtained using the fit function g(x) = a + a1x 2 (where x = 1/ξL) using measures taken after n cool = 50 cooling steps in the ranges ξL > 20 and ξL > 30. The best fits yield, respectively, a = −0.0677(13) (χ 2 /dof = 6.95/5) and a = −0.0700(17) (χ 2 /dof = 2.13/3). The dotted line represents instead the best fit obtained using measures taken after n cool = 20 cooling steps in the range ξL > 20. The best fit yields a = −0.0670 (16) with χ 2 /dof = 4.55/5. Data obtained for different numbers of cooling steps have been plotted slightly shifted to improve readability. The diamond point represents our final estimation of the continuum limit.  While we do not have any argument to support such an ansatz, it turns out to work quite well: a best fit in the range [2,9] is reported in Fig. 15 and returns the following parameters: a = 0.0345 (18), N * = 1.94(6), γ = 0.352(25), χ 2 /dof = 1.2/5.
It is interesting to observe that the N * found in this way from the analysis of b 2 is perfectly compatible with the one found for the critical fit of ξ 2 χ, and still compatible with N * = 2.

CONCLUSIONS
In this work, we have presented a systematic numerical study of the peculiar features of the θ-dependence of the vacuum energy f (θ) of 2d CP N −1 models in the small-N limit. To that aim, we have performed numerical simulations for N ∈ [2,8].
One of the most interesting questions regards the value of N , if any, at which the topological susceptibility χ diverges: this is predicted to be N = 2 by perturbative computations of the instanton size distribution. A second interesting question regards the fate of the b 2n coefficients, which parametrize the non-quadratic part of θdependence and which could possibly approach the values predicted by the Dilute Instanton Gas Approximation at the point where χ diverges, if the theory can be approximated in this case by a gas of small and non-interacting instantons and anti-instantons.
Our strategy has been twofold. On one hand, we have dedicated particular efforts to correctly assess the continuum limit of ξ 2 χ for N = 2 and N = 3 from direct simulations of these theories: to that aim, we have performed simulations on lattices with correlations lengths ranging from a few units up to O(10 2 ). On the other hand, we have exploited results obtained for larger values of N , where the continuum extrapolation is easier, in order to perform a small-N extrapolation. Based on this double strategy, we have obtained consistent and conclusive evidence that the topological susceptibility is finite for N = 3, providing the estimate ξ 2 χ = 0.110 (5).
On the other hand, results for N = 2 are still inconclusive: results obtained directly at N = 2 are consistent with a logarithmically-divergent continuum extrapolation, but do not yet exclude a finite continuum value with ξ 2 χ ∼ 0.4, which is even marginally favored from the point of the χ 2 /dof test. A similar picture emerges from the extrapolation from results obtained for N > 2, which provides evidence for a critical behavior ξ 2 χ ∝ 1/(N − N * ) γ , with N * = 1.90 (14). Therefore, future numerical studies are still needed in this case to definitely settle the issue.
As for the b 2 coefficient, we have provided continuum extrapolations down to N = 2. While there is no compelling reason to expect that the DIGA prediction be valid at the point where ξ 2 χ diverges, it is interesting to observe that our numerical results are consistent with that. For N = 2 we obtain b 2 (N = 2) = −0.070(4), i.e., around 3σ off the DIGA value b DIGA 2 = −1/12 ≃ −0.0833, while an extrapolation from our results at all values of N (see Eq. (28)) provides evidence for b 2 reaching b DIGA 2 for N = 1.94 (6), which is consistent with the value N * at which ξ 2 χ diverges reported above.