Confronting impact parameter dependent JIMWLK evolution with HERA data

The small-$x$ evolution of protons is determined from numerical solutions of the JIMWLK equations, starting from an initial condition at moderate $x$ for a finite size proton. The resulting dipole amplitude is used to calculate the total reduced cross section $\sigma_r$ and charm reduced cross section $\sigma_{rc}$, as well as diffractive vector meson production. We compare results to experimental data from HERA and discuss fundamental problems arising from the regime sensitive to non-perturbative physics. We emphasize that information on the gluonic content of the proton, gluon spatial distributions and correlations over wide ranges in $x$, which can in principle be constrained by our study, are essential ingredients for describing the initial state in proton-proton and proton-ion collisions. Future electron nucleus collisions at an electron-ion collider will provide important additional insight for heavier nuclei. We further demonstrate that it is not possible to probe gluon saturation in electron-proton collision at HERA energies and that electron-heavy ion collisions will be essential to access the saturation regime.


I. INTRODUCTION
Deep inelastic scattering (DIS) is a clean and powerful process to explore the structure of hadrons as a function of longitudinal momentum fraction x and distance scale Q 2 . The most precise DIS measurements so far have been performed at the DESY-HERA electron-proton collider, and the HERA experiments H1 and ZEUS have recently published their precise combined measurements of the proton structure functions [1][2][3][4]. These measurements make it possible to constrain the total parton densities inside the proton to a very good precision. In addition, measurements of diffractive vector meson production [5][6][7][8][9][10][11][12][13] provide information on the spatial extent of the gluon distribution and in case of incoherent diffraction, even its fluctuations [14][15][16]. In the future, if an electron-ion collider is realized [17][18][19], DIS with a variety of nuclear targets will allow unprecedented detailed studies of proton and nuclear structure (see for example [20][21][22][23]).
A successful framework for describing DIS at high energy is given by the dipole model [24][25][26][27][28]. Here, the scattering process is studied in the rest frame of the target proton or nucleus, where the virtual photon first splits into a quark-antiquark pair (the color dipole), which subsequently interacts with the target. Furthermore, at high enough energy we can assume the transverse distance between the quark and antiquark, r, to be constant during the interaction with the target.
The properties of the target at small enough longitudinal momentum fraction x (high enough energy) enter via Wilson lines [29], which can be computed in the color glass condensate framework [30,31] and descibe the color rotation of the incoming quark or anti-quark. The interaction of the dipole with the target is thus described by a two-point function expressed by the Wilson lines at the position of the quark and anti-quark. 1 This two-point function is the dipole operator, whose average over color configurations yields the dipole amplitude Here, b = (x + y)/2 is the impact parameter and r = x − y represents the size and orientation of the color dipole.
There have however not been any calculations of DIS structure functions, or diffractive vector meson production on the level of Wilson lines including the full leading logarithmic JIMWLK evolution [44][45][46][47][48][49][50]. The JIMWLK equation is a renormalization group equation for the Wilson lines, obtained by integrating out the quantum fluctuations at smaller and smaller Bjorken-x. Writing the JIMWLK equation in its stochastic Langevin form provides a convenient picture of the small-x evolution as a random walk in color space and allows numerical solutions. The first numerical results were obtained for infinite nuclei in Refs. [51,52], and the finite proton geometry was studied in [53].
In this work we perform for the first time calculations of proton structure functions and diffractive vector meson production using this most fundamental description of the target proton by its JIMWLK evolved Wilson lines. 2 This has various advantages over simple parametrization models or those invoking BK evolution. For example, our description includes the expected growth of the proton with decreasing x [53], and Wilson line configurations are correlated between different values of x [55].
This additional information can be of great use in the description of the initial state in other collision systems, such as heavy ion collisions performed at the Relativistic Heavy Ion Collider (RHIC) and at the Large Hadron Collier (LHC). In particular the three dimensional structure of the initial state [55] and the transverse geometry in proton-heavy ion collisions [56] rely strongly on the detailed configuration of Wilson lines in the incoming target and projectile. Apart from the fundamental interest in the proton and nuclear structure at small x, it is thus crucial to understand the Wilson line configurations in protons and nuclei at high energy from electron scattering events in order to constrain the initial states in complex nuclear collisions.
Avoiding approximations like those done in the IPsat or bCGC models and performing explicit small x evolution of a finite size system comes with a variety of problems. One of the issues we will encounter has already appeared when studying the BK evolution of finite size systems [38,[57][58][59][60]: The dipole amplitude decreases at large r and eventually vanishes, while in the parametrized models it is always assumed to approach one, at any impact parameter b. The behavior in our framework is easy to understand. For increasing r = |r|, it will be more and more likely that both ends of the dipole (the quark and anti-quark) miss the target, and evaluating both Wilson lines in Eq. (1) in the vacuum will lead to N = 0.
At r values greater than the inverse pion mass, nonperturbative confinement effects should appear. A simple dipole at such large r is not the right degree of freedom and a more complicated non-perturbative soft contribution, for example described via vector meson dominance [61][62][63][64] should take over. We will detail this problem further in the main text of this work.
Beyond this important problem, other issues, that can be easily avoided in more ad-hoc calculations of the dipole amplitude, appear. For example, we will have to introduce infrared regulators for both the initial condition of the evolution as well as the JIMWLK kernel, to regulate otherwise appearing Coulomb tails. Again, this is a problem that arises because we cannot deal with confinement effects from first principles. Also, an ultraviolet regulator could be required to modifiy the initial transverse momentum spectrum of gluons, which affects also the Q 2 -dependence of the cross sections. This regulator is similar to the anomalous dimension used in [65], that modifies the UV behavior of the unintegrated gluon (at NLO) to produce comparisons with HERA data. However, this is similar to the BK truncation and does not involve the exact evolution of the JIMWLK equations as performed in this work. distribution, which in our case is that of the McLerran-Venugopalan model [66,67] in the initial condition.
The remainder of this paper is organized as follows. First, in Sec. II we discuss how the small-x evolution of the proton shape is obtained from the JIMWLK evolution. The applied model for the in principle nonperturbative initial condition for this evolution is presented in Sec. III. Details of the numerical implementation are given in Sec. IV. The ovservables of interest in this work, proton structure functions and diffractive cross sections, are presented in Sec. V. In case of protons without geometric substructure, the results for the structure functions are presented in Sec. VI and for the diffractive processes in Sec. VII. Finally, we study the small-x evolution of the proton structure fluctuations in Sec. VIII. Finally, the importance of the possible non-perturbatively large dipoles is studied in detail in Sec. IX.

II. HIGH ENERGY EVOLUTION FROM THE JIMWLK EQUATION
At high energy, the scattering of an electron off a hadronic target (proton or heavier nucleus) is described by the electron emitting a virtual photon, which in turn splits into a quark anti-quark pair. This color dipole then propagates eikonally through the target, meaning that the gluon fields of the target merely rotate the incoming probe's color and transfer transverse momentum. The transverse location of the dipole remains unaffected. The color rotation is mediated by Wilson lines, path ordered exponentials of the color fields along the probes trajectory.
In the color glass condensate framework, the Wilson lines are stochastic variables with an energy (or equivalently x or rapidity y) dependent probability distribution. This energy dependence is described by the JIMWLK renormalization group equation, which can be written as a functional Fokker-Planck equation [68]. This in turn can be expressed as a Langevin equation for the Wilson lines V x (in the fundamental representation) themselves [69], The deterministic drift term is with S x = 1/x 2 and T a the generators of the adjoint representation. U x are Wilson lines in the adjoint representation. The random noise is Gaussian and local in coordinates, color, and rapidity with expectation value zero and ξ a x,i (y)ξ b y,j (y ) = δ ab δ ij δ (2) xy δ(y − y ).
The coefficient of the noise in the stochastic term is where When Eq. (2) is discretized, the Wilson line at higher rapidity y + dy is obtained as The delta function in (4) is replaced by δ(y m − y n ) → δ mn /dy when discretizing the evolution on a lattice. Following Ref. [70] the drift term can be eliminated, avoiding the necessity to evaluate adjoint Wilson lines, and one rapidity step can be cast into the form where ξ z = (ξ a z,1 t a , ξ a z,2 t a ). The long distance Coulomb tails that are encountered when solving rapidity evolution from the JIMWLK equation will give an exponential growth of the cross section with rapidity [53,57], eventually violating the Froissart bound [71]. This should be regulated by confinement scale physics, and we regulate the long-distance behavior by following the prescription of Ref. [53] and perform the replacement Here K 1 is the modified Bessel function, and the dimensional parameter m ∼ Λ QCD will be constrained later. The replacement (9) introduces an exponential suppression at large distances, leaving the short-distance part unmodified. We will also consider the running coupling effects in our analysis. We adpot the the so called square root prescription, in which the coupling constant in Eq. (8) is moved inside the integral and evaluated as α s (r) = 12π where r = |x − z|, µ 0 = 0.28 GeV and c = 0.2. The scale Λ QCD (in coordinate space) will be fixed by fitting the structure function data.

III. INITIAL CONDITION FOR THE SMALL-x EVOLUTION
The initial condition for the JIMWLK evolution, the Wilson lines V x at each point in the transverse plane at the initial rapidity y, which corresponds to the initial Bjorken-x x 0 , is obtained similarly to the IP-Glasma model [72,73], except that here we do not use any input from the IPsat model. This is preferable, because the parameters in the IPsat model are determined based on a different, parametrized, x-evolution. The color charge densities ρ a (x) in the transverse plane are assumed to be random Gaussian variables with an MV model [66] correlator The color charge density g 4 µ 2 T p (b) is related to the saturation scale at the given transverse coordinate [74]. We will study both a proton with a Gaussian shape and one that includes additional fluctuations via three randomly positioned hotspots [15,16]. For the Gaussian proton we have where B p controls the proton size. The overall normalization is set by g 4 µ 2 .
For the fluctuating proton, we sample the hotspot positions in the transverse plane relative to the origin, b i , from a Gaussian distribution with width B qc , assuming a uniform angular distribution. The density profile of each hotspot in the transverse plane is also assumed to be Gaussian with width parameter B q . Thus, the extension to a fluctuating proton corresponds to the replacement in Eq. (12). N q can be interpreted as the number of large x partons, typically chosen to be 3, for the three constituent quarks. In addition, we also allow the saturation scale of each constituent quark to fluctuate independently, following a log-normal distribution. As discussed in Ref. [16], the fluctuations are sampled according to The sampled g 4 µ 2 values are furher normalized by the expectation value of the distribution e σ 2 /2 in order not to change the average g 4 µ 2 .
Once the initial color charge densities are set, the Wilson lines at each point in the transverse plane are obtained by solving the Yang-Mills equation as in Ref. [72]. Formally, the classical color field A + can be written as where ρ = ρ a t a . This can be written in Fourier space as Here, we have introduced an infrared regulatorm that suppresses the non-perturbative long-distance Coulomb tails. Generally one expectsm ∼ Λ QCD . The mass parameterm here in momentum space does not necessarily have to be the same as m used in the JIMWLK equation in Eq. (9), which is written in coordinate space, except that we require both of them to be of the order of Λ QCD . In practice, we usem = 0.4 GeV, as it is found in Ref. [16] to produce a Gaussian spectrum for coherent diffractive J/Ψ production, compatible with the experimental data. The standard MV model is known to give faster Q 2 evolution than what is seen in the HERA structure function data. In dipole model fits using the BK evolved amplitude [37,41] this problem is solved by effectively modifying the typical MV-like r 2 behavior of the amplitude at small r by introducing an anomalous dimension γ that makes the dipole amplitude decrease faster with decreasing r. In our setup, to obtain a similar effect we modify the color charge density in momentum space by multiplying Eq. (17) by a suppression factor e −|k|v : Here v is a free parameter 3 . The Wilson lines in coordinate space can finally be obtained by Fourier transforming A + (x − , k) back to coordinate space and writing [74] V where we have also discretized the x − direction into N y independent slices. Note that the A + i (x) in each slice are obtained from a ρ distribution (11) with the amplitude multiplied by N −1 y .

IV. NUMERICAL IMPLEMENTATION
As described in the previous section, the initial condition at x = x 0 is determined by an MV model with a spatially dependent color charge density. We sample the color charges from a Gaussian determined by (11), compute the Wilson lines (19), and evolve every proton configuration by solving the JIMWLK equation (8).
As the JIMWLK evolution is implemented as a random walk in color space, different evolutions of the same initial condition will not result in the same configuration at smaller x. To obtain our final results, we average over 100 initial conditions, each of them evolved only once. Calculations are performed on a 2-dimensional 700 2 lattice, total length being L = 5.12 fm. We have checked that results have converged and smaller lattice spacings do not modify the results. The resulting Wilson lines are saved on a grid with rapidity separation ∆y ≈ 0.2, the exact value depending on the chosen α s , and the final results are obtained by interpolating the resulting quantities linearly in ln 1/x.
The free parameters to be determined by the HERA data are the following: • B qc and B q : replace B p in case of a fluctuating proton geometry • v: ultraviolet damping factor that controls the Q 2 evolution speed in the initial condition. This is correlated with B p , as filtering out the high frequency modes effectively increases the proton size.
• σ, which controls the magnitude of the overall saturation scale fluctuations The proton size B p is chosen to be compatible with the HERA data in the initial condition (we require that the |t| slope of the coherent J/Ψ production in the initial condition is approximately 3.8 GeV −2 , see Sec. V B). Then, the other parameters are fitted to the HERA structure function data. As we will discuss in detail later, we use the charm reduced cross section to constrain the parameters, as it is not sensitive to the contribution from large dipoles, which are not completely described in our framework. This is because non-perturbative physics becomes important as the dipole size reaches the inverse pion mass.
As discussed in the introduction, in the dipole picture, cross sections can be expressed in terms of correlators of Wilson lines. In this work, we study proton structure functions and diffractive processes, which are sensitive to the dipole amplitude, which measures the correlation between two Wilson lines Here, the average is taken over different color field configurations of the proton, b = (x + y)/2 is the impact parameter and r = x − y describes the size and orientation of the color dipole.
For comparison, we will also show results obtained using the IPsat model parametrization, fitted to HERA data in Ref. [39] (see also Ref. [76] for a discussion of the large dipole contributions to the structure functions when the IPsat parametrization is applied). In this parametrization, the dipole amplitude is written as where and µ 2 = µ 2 0 +C/r 2 . The initial condition for the DGLAP evolution [77][78][79] of the collinear factorization gluon distribution xg(x, µ 2 ) and µ 0 are free parameters of the model (where xg(x, µ 2 = µ 2 0 ) is a parametrized function of x), determined by the fit to HERA data, and C = 4. The proton density function has the same form as in our JIMWLK calculation,

A. Structure functions
Proton structure functions can be written in terms of the virtual photon-proton cross section σ γ * p as where T and L refer to transverse and longitudinal polarization of the virtual photon. The most precise combined HERA data [2] is released in the form of a reduced cross section (25) where y = Q 2 /(xs) is the inelasticity of the ep scattering with center-of-mass energy √ s.
In the CGC framework, the virtual photon-proton cross section can be calculated as (26) The target average refers to averaging over different proton configurations, and the summation is taken over quark flavors f = {u, d, s, c}. The light quark masses are set to 0.14 GeV, and the charm mass is 1.4 GeV in this work. Our conclusions are not sensitive to the precise mass values chosen. When evaluating the charm quark contribution we employ the standard kinematical shift [80]x The virtual photon wave function Ψ f L,T can be calculated from QED, see e.g. [81]. For the transverse polarization, the squared wave function summed over quark helicities and averaged over photon polarizations reads Here e f is the fractional charge of the quark and ε 2 = z(1 − z)Q 2 + m 2 f . K 0 and K 1 are the modified Bessel functions of the second kind.

B. Diffractive scattering
Diffractive vector meson production can provide additional insight, in particular on the geometric structure of the target. For coherent diffractive processes, where the proton stays intact, the cross section can be written as [14,36] dσ γ * p→V p where is the scattering amplitude. The indices T, L indicate the photon polarization. In case of diffractive events, we will only consider photoproduction (Q 2 = 0). Consequently, only the transverse polarization will appear.
The incoherent cross section can be written as the variance [14] (see also Refs. [15,16,[82][83][84][85]): Charm reduced cross section σ rc The diffractive vector meson production scattering amplitude can be written as [36], Here the transverse momentum transfer is ∆ = (P − P ) ⊥ ≈ √ −t. This equation can be interpreted as follows: First, the incoming virtual photon fluctuates into a quark-antiquark dipole with transverse separation r, the quark carrying the momentum fraction z. This splitting is described by the same virtual photon wave function Ψ T,L that is used when calculating the proton structure function. The elastic scattering amplitude for the dipole to scatter off the target is N (r, b, x P ). Finally, the vector meson is formed, and the qq → V formation is described by the vector meson wave function Ψ V . Note that Ψ V is a significant source of uncertainty in our calculations. In this work we use the Boosted Gaussian wave function parametrization from Ref. [36] (see Ref. [86] for a new development of more rigorous calculations of the vector meson wave functions). When calculating the cross section, we also take into account the so called skewedness correction, which originates from the fact that in the dilute limit, the two exchanged gluons carry significantly different longitudinal momentum fractions [87][88][89]. We apply this by calculat-ing the effect of the skewedness correction in the IPsat model at given W (and use the fact that the correction is approximately independent of t), and assume that the effect is the same in the formulation using the MV model and JIMWLK evolution.
For J/Ψ production, the structure of the target is probed at the scale which can be interpreted as the longitudinal momentum fraction of the proton carried by the color-neutral "pomeron". Here W is the center-of-mass energy in the photon-proton scattering, and in this work we neglect the t dependence of x P , as |t| The cross section is Fourier transformed into momentum space, and the transverse momentum transfer ∆ is the Fourier conjugate to b − (1 − z)r. Note that this means that a detailed knowledge of the impact parameter profile is needed, in contrast to the calculations of the proton structure functions where the impact parameter integral mainly affects the overall normalization. This makes diffractive scattering a sensitive probe of the proton geometry and, via Eq. (31), its fluctuations.
Charm reduced cross section σ rc

VI. DESCRIPTION OF THE PROTON STRUCTURE FUNCTION DATA
The proton structure functions and the reduced cross sections have been measured very accurately by the HERA experiments H1 and ZEUS [1][2][3][4] over a wide range of x and Q 2 . The experimental data is released separately for the total reduced cross section, and for the charm contribution to it. In case of transverse polarization, the total photon-proton cross section includes the so called "aligned jet" contribution where the momentum fraction z ∼ 1 or z ∼ 0. In this limit, the dipole sizes are limited by 1/m 2 f instead of 1/Q 2 (see Eq. (28)), and thus at all Q 2 there is a significant contribution to F 2 from possibly non-perturbatively large dipoles which would be sensitive to the confinement scale physics. As these effects are not included in our analysis (see also [38]), we prefer to first fit the charm structure function data (along with the proton size constraint from diffractive J/Ψ production), where the large charm mass suppresses confinement scale effects at all Q 2 . Note that in case of F L , where only longitudinally polarized photons contribute, the aligned jet contribution is absent, as the endpoints z → 0, 1 are suppressed by a factor z 2 (1 − z) 2 , see Eq. (29).
The slope of the |t| dependent coherent diffractive cross section for exclusive J/Ψ production is used to fix B p in the initial condition (x = x 0 = 0.01), fixing the proton size at that value of x. Then, by varying g 4 µ 2 and α s (at fixed coupling) or Λ QCD (with running coupling), and calculating the Wilson lines at smaller x by solving the JIMWLK equation, we find the parametrization that gives approximately the best possible χ 2 when comparing with the HERA charm structure function data from Ref. [4].
Note that as the total photon-proton cross section is proportional to both the (squared) saturation scale and the geometric size of the proton, the evolution speed is controlled by both the coupling constant α s and the infrared regulator m in the JIMWLK equation that controls the growth of the proton size. Thus, the HERA reduced cross section data is not enough to determine uniquely these two parameters. The fit quality remains excellent if m is increased (decreased), if one also increases (decreases) the strong coupling constant. This is demonstrated in Appendix A, where we show that varying m by 50% does not affect the quality of the fit when α s is adjusted (it is strongly correlated with m). In this work, unless otherwise noted we use m = 0.2 GeV. However, as m controls the evolution of the proton geometry at long distances, it has a large effect on the small-|t| part of the coherent vector meson production spectra, where the structure at long distances is probed. This is demonstrated in Appendix A. This behavior could make it possible to constrain both α s and m individually.
Comparison to the HERA combined charm structure function data [4] with our results for the optimal parameters is shown in Fig. 1 using the MV model initial condition without proton geometry fluctuations. The description of the data is equally good with both fixed and Total reduced cross section σ r running coupling (χ 2 /N ≈ 4). The too fast Q 2 evolution of the MV model discussed in Sec. III is clearly visible 4 , but at moderate Q 2 the agreement with the HERA data is good. The IPsat results are not shown, but they would be on top of the HERA data [39].  (18). One can see an improvement of the Q 2 evolution speed compared to the experimental data. More significant than what is visible by eye is the improvement of the fit, quantified by χ 2 /N which becomes ≈ 2.7, when including the UV damping. In general, we find that our fit prefers large values for the ultraviolet regulator v, with increasing v also decreasing the extracted value for α s . We were not able to find a single v value preferred by the fit, and thus fix it to v = 0.3 GeV −1 when showing results with ultraviolet damping.
The model parameters used to describe the charm production data are shown in Table I. Note that as already discussed, the ultraviolet damping factor e −|k|v in Eq. (18) effectively makes the proton larger by filtering out the high frequency modes, which is compensated by   [4] without including geometric fluctuations, with the proton size constrained by the coherent J/Ψ spectra. The infrared regulator in the JIMWLK evolution is m = 0.2 GeV. Note that with nonzero ultraviolet damping v > 0, the physical interpretation of the parameters is somewhat obscured.
a more sharply peaked initial color charge distribution in the MV model at the origin (B p is much smaller than in the case v = 0). Similarly, as this suppression factor clearly reduces the overall color charge density, this effect is compensated by a larger g 4 µ 2 in the initial condition. The physical interpretation of the parameters, for example as color charge density and proton size, are somewhat obscured in case the ultraviolet regulator is employed. Note that in both parametrizations the resulting saturation scales extracted from the dipole amplitude are very similar. Using the parametrization constrained by the charm data (and the |t|-slope of the exclusive J/Ψ production), we next calculate the total reduced cross section including the light quarks in addition to charm. The results are shown by the solid lines in Fig. 3, where we find that especially at x values close to the initial condition the total photon-proton cross section is significantly underestimated in our calculation. Again, the IPsat result is not shown, but the description of the data would be almost perfect (χ 2 /N ≈ 1.2). The soft contribution, and the contribution from dipoles smaller than r soft = 0.4 fm, are discussed later in this section.
In the IPsat model fits one obtains a good simultaneous description of both charm and total structure function data. As the charm data is well described also by our calculation, the resulting dipole amplitudes must thus be comparable at small dipole sizes. The large differences in the total reduced cross section can then be understood by analyzing the large dipole contributions. In our case, the dipole scattering amplitude vanishes when the quarks miss the proton, which heavily suppresses contributions from dipoles larger than the proton size (note that the gluonic transverse root mean square radius of the proton is quite small, r rms = 2B p ≈ 0.55 fm at the initial condition). The proton growth towards smaller x allows finite contributions from larger dipoles, which manifests itself in a better description of the small-x structure function data. On the other hand, this effect could also lead to an artificially large x-evolution speed.
In the IPsat or BK fits [37,39,41]) the dipole amplitude goes to unity at large dipoles indepedently of the impact parameter. In particular in the IPsat parametrization the total dipole-proton cross section scales like ∼ ln r at large dipole sizes r, in sharp contrast to the behavior obtained in our framework. Effectively, this means that in the IPsat fits one models the non-perturbative confinement scale physics by imposing a requirement that large dipoles scatter with probability 1.
To make the above discussion more transparent, we show in Fig. 4 the dipole amplitudes in our framework in comparison to those in the IPsat model [39] in the initial condition for different impact parameters. First, the striking difference between the models at large r is obvious. While the IPsat model always assumes N → 1 as r → ∞, the dipole amplitude in our calculation goes to zero at large r. Furthermore, dipole amplitudes are found to grow more slowly with the dipole size r than in the IPsat parametrization. This can be understood, as in the IPsat the probed saturation scale remains approximately constant when the impact parameter is fixed, which is somewhat unphysical. In contrast, in our calculation, when the separation between the quarks increases, at zero impact parameter both of them move to less dense regions of the proton, and the effective saturation scale decreases.
The rapidity evolution of the dipole amplitude is shown in Figs. 5 and 6, where we again compare to the IPsat model. The resulting amplitude with and without UV damping in the initial condition are shown in Fig. 5 for the center of the dipole located at the center of the proton, and in Fig. 6 for one end of the dipole located at the center of the proton. In both cases we can see that inclu- sion of the UV damping introduces an effective anomalous dimension and at small r the dipole is similar to that in the IPsat parametrization. The JIMWLK rapidity evolution effectively reduces the anomalous dimension (which also happens with BK evolution [92]), and deviations from IPsat increase. If one end of the dipole remains in the center of the proton, the dipole amplitude always increases with increasing dipole size as shown in Fig. 6, in contrast to the b ≈ 0 case. This can be understood, because the dipole amplitude will go to zero only when both Wilson lines are the vacuum ones. The effect of dropping dipole amplitude at large r for IPsat in the case where one end of the dipole is held fixed occurs because in that case increasing r leads to increasing b (the central point between the dipole ends), and the probed saturation scale is exponentially suppressed as Q 2 s ∼ e −b 2 /2Bp . Within a model using impact parameter dependent BK evolution [38], it was found that a numerically significant non-perturbative "soft" contribution to the total cross section must be included to describe the experimental structure function data. In order to quantify how large we expect the soft contribution to be, we follow the approach of Ref. [38] and separate contributions to the structure functions into perturbative and soft parts. The soft contribution to the structure functions is calculated as and corresponding perturbative contributions are com-  puted as discussed above and imposing an upper limit r < r soft for the dipole sizes. The total structure functions (and finally the total reduced cross section) are obtained as a sum of perturbative and soft components. For the charm production the soft component has a negligible effect. The resulting reduced cross section (that includes the soft contribution) is also show in Fig 3. We use r soft = 0.4 fm as a scale to separate soft and hard physics, and the proton transverse area is fixed to σ 0 /2 = 13.6 mb in order to get a good description of the HERA data in the lowest Q 2 bin. The parameter values are close to the values used in Ref. [38] (0.56 fm and 14.6 mb, respectively). When the soft component is included, the description of the HERA data is good, except at very high Q 2 where too fast Q 2 dependence is again observed similar to the case of the charm structure function. We note that in principle the soft contribution is expected to also be x dependent, but there is currently no way to compute this x dependence from first priciples, as it is done for the perturbative contribution. The dotted line in Fig. 3 represents the perturbative contribution for r < r soft , to which the soft contribution is added.

VII. EXCLUSIVE J/Ψ PRODUCTION
The coherent diffractive J/Ψ production cross section as a function of squared momentum transfer |t| is shown in Fig. 7. The results at both fixed and running coupling are shown at W = 75 GeV(which can be compared with the H1 data [93], but again due to the lack of the large dipole contributions the calculation would underestimate the data as discussed in more detail later) and at high energies W = 440 GeV. Note that these energies correspond to evolution of 1.8 and 5.3 units of rapidity from the initial condition, respectively. After a few units of rapidity evolution the results obtained with fixed and running coupling start to deviate. With running coupling, the low-|t| part of the cross section is enhanced compared to the case of fixed coupling. The reason for this is that the running coupling increases the evolution speed of the long-distance modes relative to the short-distance ones, and the low-|t| part is sensitive to long-distance physics ( √ −t is Fourier conjugate to the impact parameter).
To access the proton size evolution, we study in more detail how the coherent spectrum evolves with W . The |t| spectrum is sensitive to the Fourier transform of the density profile, as can be seen from Eq. (32). Experimentally, the proton size is characterised by measuring the slope of the diffractive |t| spectra, B G , defined via The diffractive slope has been measured as a function of the photon-proton center-of-mass energy W by the H1 and ZEUS collaborations [5,6,93]. This data clearly shows that the proton size grows as a function of W . Recently, the ALICE collaboration has also measured γp scattering in ultraperipheral proton-nucleus collisions [12] and observed hints of the proton growth down to very small x. The proton size B G extracted from our calculation is shown in Fig. 8 and compared with H1 and ZEUS data. Recall that we have fixed B p at the initial condition such that the resulting slope B G is approximately 3.8 GeV −2 at x = x 0 . The JIMWLK evolution results in a proton size evolution compatible with the HERA and ALICE data. For comparison, we also show the result obtained from the IPsat model, where the proton density profile does not change, and the only effect of the W evolution is to increase the saturation scale Q 2 s . Because the profile function does not factorize in the IPsat model, there is a small residual W dependence of B G .
Even though both the fixed and running coupling evolution yield very similar descriptions of the charm structure function data, the proton size evolves faster when running coupling corrections are included. We explained this behavior in the discussion of Fig. 7: the running cou-pling suppresses short-wavelength modes relative to the longer-wavelength ones that are probed at small |t|.
In these comparisons one should realize that both the computed and experimentally measured spectra are not exactly Gaussian (in ∆ ≈ √ −t), and thus it matters which range in |t| is fitted when the slope B G is extracted. The different experimental measurements have also different cuts, the ZEUS and older H1 datasets [5,6] have only an upper cut |t| < 1.8 GeV 2 or |t| < 1.2 GeV 2 , whereas the latest H1 analysis [93] excludes the measured spectra below |t| < 0.1 GeV 2 . In any case, there is little data available at small |t| especially in the older analyses, thus we shall also impose a requirement |t| > 0.1 GeV 2 when extracting the slope B G . Because the small |t| region, important for B G , is sensitive to large impact parameters, it is also especially sensitive to the infrared regulators, as discussed in Ref. [16]. Thus, it is important to study the effect of m on the extraction of B G , which is done in Appendix A.
To illustrate the growth of the proton and the change in its transverse structure, we show one example for the evolution of the trace of the Wilson lines, 1− 1 Nc Re tr V (x) in Fig. 9. For comparison, the same evolution with the ultraviolet regulator v = 0.3 GeV −1 at the initial condition is shown in Fig. 10. The expected effect of the UV regulator of eliminating short range structures is clearly visible in the initial condition. JIMWLK evolution reintroduces short range structures and makes the proton grow as in the case of the MV model.
Next, we investigate the effect of a finite UV regulator on the coherent diffractive J/Ψ cross section. As shown in Fig. 11 with the MV model initial condition (v = 0), the calculated energy dependence of the total cross section is significantly faster than that of the experimental data. Including a UV regulator improves the energy dependence significantly, however, the result is still smaller than the data, for all W . The reason is the lack of a soft non-perturbative contribution, just like it was for the total reduced cross section. We will return to the issue with the normalization in Sec. IX.
The improvement of the energy dependence stems from the fact that in our fit to the charm reduced cross section, the case with UV regulator prefers a somewhat smaller x evolution speed. In case of MV model initial condition, the fit tries to compensate more the too fast Q 2 evolution by having a smaller initial g 4 µ 2 and faster evolution speed.

VIII. FLUCTUATING PROTON GEOMETRY
When we include fluctuations in the proton geometry and in the overall saturation scale, it becomes possible to study, in addition to the coherent cross section, also the incoherent process which measures the amount of eventby-event fluctuations. Here, similar to Ref. [16], we fix the parameters defining the fluctuating proton geometry (B qc and B q ) and the overall saturation scale controlled  by g 4 µ 2 by requiring that we get a good description of the H1 spectra at W = 75 GeV [93]. Because of the problem with non-perturbative contributions from large r to the diffractive cross section, this normalization is different from what the fit to the reduced charm cross section would require. Consequently, the parameter set used in Refs. [15,16] also cannot reproduce the normalization of  [93]. Note that the proton color charge density is also fixed by the J/Ψ data. The results with and without UV damping in the initial condition are shown.
the charm production data. 5 We study the evolution of the diffractive cross sections Here we only show results with ultraviolet damping (v = 0.3) in the initial condition. Note that the proton color charge density normalization in the initial condition is fixed by the J/Ψ data. We compare to experimental data from H1 [5,93], ZEUS [6] and ALICE [12]. towards large center-of-mass energies using JIMWLK evolution. We note that in this case the initial x 0 ≈ 10 −3 (as parameters are fixed by J/Ψ data at W = 75 GeV, unlike previously when we studied the structure function data with the round protons). When solving the JIMWLK equation, we use the values for α s and m that were previously constrained by the reduced cross section data.
The description of the H1 coherent and incoherent J/Ψ photoproduction data is shown in Fig. 12. The parameters describing the proton geometry are constrained by requiring a good description of both coherent and incoherent J/Ψ production data. Note that as our setup is slightly different than in our previous work [16] (we are not using the IPsat model where the proton density function appears in the exponent of the dipole amplitude used to extract the saturation scale), we do not get exactly the same parameters even though we are comparing with the same dataset when using a similar initial condition with v = 0.0.
Similar to the case of the round proton, we also perform the analysis by using the initial condition with an ultraviolet regulator v = 0.3 GeV −1 . As discussed earlier, this causes the short distance structure to be mostly filtered out and the proton effectively grows and becomes smoother. Consequently, when we use v = 0.3 GeV −1 , we are forced to distribute color charges as very sharp peaks that are relatively far away from each other in transverse space. As the overall normalization g 4 µ 2 in this section is fixed by the J/Ψ data at W = 75 GeV, consequently these parametrizations would overestimate the charm structure function (e.g. the parametrization with v = 0.3 GeV −1 overestimates σ r,c by 40%), but the x evolution speed is found to be compatible with the data.
The JIMWLK evolved |t|-integrated coherent and incoherent cross sections as functions of W are shown in Fig. 13, comparing to IPsat with a fluctuating proton and data for the diffractive cross sections from H1 [5,93], ZEUS [6] and ALICE [12]. While the growth of the coherent cross section with W is stronger than the incoherent, the incoherent cross section never decreases with W as was observed in the calculation in [94]. Here the results at fixed coupling and with ultraviolet damping in the initial condition are shown, in which case the coupling constant obtained, α s = 0.18, constrained by the x-dependence of the DIS data, is approximatively compatible with the HERA measurements. The larger α s = 0.21 which is a result of the fit with a pure MV model initial condition would result in too fast W dependence of the total cross sections, as already seen in case of the round proton in Fig. 11 In Figs. 14 and 15 we show one example for the rapidity evolution of the proton shape when starting from a fluctuating proton with three hot spots. The difference between the figures is that in Fig. 15 the ultraviolet damping factor is included in the initial condition, which removes short distance fluctuations. Comparing to Figs. 9 and 10, one notices that at large rapidities the shapes for initially round and fluctuating protons become similar.
The behavior observed in Fig. 13 that the coherent cross section dominates at high energies, is expected. The reason is that if we start from proton configurations with large event-by-event fluctuations, the evolution is fastest in the regions where the local saturation scale is small, compared to the centers of the hot spot that start to reach the saturated region. This causes the hot spots to grow, and the proton effectively gets smoother, which reduces the fluctuations. Also, with increasing Q s , the typical length scale of fluctuations becomes smaller, producing more domains, effectively decreasing the total amount of geometry fluctuations. Eventually, when the black disk limit is reached, the coherent cross section gets contributions from the whole transverse area of the proton where the dipole amplitude is saturated to unity. The incoher-  ent cross section, on the other hand, can not receive any contribution in this region, and becomes only sensitive to the edge of the proton. This behavior is most clearly visible in the ratio of the incoherent to the coherent diffractive cross sections, shown in Fig. 16 and compared with the H1 data [93]. We present results for the choice of parameters that produce a good fit to the H1 spectra at W = 75 GeV and compare to the parameter set where g 4 µ 2 is adjusted to fit the charm reduced cross section. The ratio is shifted slightly when changing the parameters, but the W dependence, which is a prediction based on JIMWLK evolution, is unaffected.
For comparison, we will show results obtained using an IPsat model with fluctuating hot spot structure parametrized in Ref. [16]. The proton structure parameters in that case are B qc = 3.3 GeV −2 , B q = 0.7 GeV −2 and σ = 0.5. The resulting cross section ratio is much flatter as a function of W , because it lacks important physics, including the proton growth and evolution of the fluctuating sub-structure, and only the overall saturation scale depends on energy. Additionally, we note that the cross section ratio is slightly above the data also at W = 75 GeV where the parameters are constrained in Ref. [16]. This is due to the steeper slope of the experimental coherent t spectra than what can be reproduced by a proton with root mean square size B q + B qc = 4 GeV −2 as used in Ref. [16], which describes data for W = 90 GeV well.
In Fig. 17, we compare to the case with UV damping in the initial state. Unsurprisingly, because the UV filter removes some fluctuations while keeping the overall size the same, the ratio of incoherent to coherent cross section is reduced. The evolution of the ratio with energy is similar, possibly slightly slower when UV damping is used. This makes the results with and without UV damping become more similar with evolution (this can also be observed in Figs. 5 and 6), which can be understood by the structure becoming dominated by JIMWLK effects and losing memory of the initial state.

IX. LARGE DIPOLES
As shown in the previous sections, our results for inclusive structure functions and diffractive cross sections underestimate the experimental data, unlike computations relying on the IPsat parametrization. On the other hand, both approaches give a good description of the charm structure function (but we also note that the IPsat model can not describe the observed proton growth in 1/x). This difference was explained by noticing that the behavior of the dipole amplitude for large (compared to the size of the proton and to the inverse of the saturation scale) dipoles is very different in the two frameworks.
As previously mentioned, in our model and at zero impact parameter, the quarks do not probe the densest part of the proton, but that at distances r/2 from the center. This is in contrast to IPsat where the saturation scale is probed at the impact parameter (the point between the quark and anti-quark). This difference becomes important when the dipole size is of the order of the proton size. In principle, this is the region where confinement scale physics should dominate, but no such phenomena are included in our model. In the IPsat model, large enough dipoles scatter with probability one, which can

incoherent/coherent
Fit W = 75 GeV J/Ψ data g 4 µ 2 fixed to σ r,c IPsat H1 FIG. 16: Dependence of the incoherent-to-coherent cross section ratio on the color charge density normalization in the initial condition (for the solid line the g 4 µ 2 is fixed to J/Ψ spectra, and for the dashed line it is fixed by the charm production cross section data). The results are compared with the HERA data [93]. No ultraviolet damping is included here. The experimental uncertainties are computed assuming completely independent uncertainties for the coherent and incoherent cross sections. be seen as an effective description of confinement scale physics.
In order to quantify how sensitive our results for different observables are to the description of large dipoles, and to estimate the importance of the confinement scale physics when describing the HERA data, we now study the dependence of the cross sections on an upper limit r max , we impose on the dipole sizes to be included in the calculation. For simplicity, we do not include the geometry fluctuations in this analysis, as the results presented here only depend on the average shape of the proton. The modified initial condition with an ultraviolet damping factor v = 0.3 GeV −1 constrained in Sec. VI is used in this section. The results with an MV model initial condition are qualitatively similar. For a similar analysis in the IPsat model, see Ref [76]. The already encountered insensitivity of the charm structure functions to large dipoles is shown in Fig. 18. Here, the reduced charm cross section σ rc is compared with the result obtained by neglecting dipoles larger than r max = 0.3 fm. The resulting reduced cross section is found to be identical with and without this cutoff. This supports our approach to extract the free parameters (except for the proton size) by fitting the charm cross section data, as it is not sensitive to the physics at the confinement scale. This is further demonstrated in Fig. 19, where the charm contribution to F 2 at x = x 0 = 0.01 is calculated at different values for the photon virtuality Q 2 as a function of r max . More precisely, what is shown in Fig. 19 is the amount of total F 2,c recovered by including dipoles up to r max , divided by the IPsat model result for r max = ∞. The latter serves as a stand-in for the experimental value, as IPsat provides a very good description for this and the following observables.
For comparison, the results from the IPsat model 6 with the same cutoff r max (where large dipoles are given more weight) are shown as dotted lines. We find that even at small Q 2 , the total structure function is recovered already at r max = 0.3 fm. 7 In the IPsat model, convergence happens at somewhat larger r max ≈ 0.4 fm, which is expected given the larger dipole amplitudes at large r. A similar analysis for the total F 2 at x = 0.01 is shown in Fig. 20. Here the aligned jet contribution from large dipoles is especially large when we employ the IPsat parametrization, where even at large Q 2 = 200 GeV 2 over 10% of the total cross section originate from dipoles larger than 1 fm. In our framework the contribution form dipoles larger than the inverse proton size are suppressed, thus the total F 2 is significantly smaller than in the IPsat model, and at large Q 2 , the results converge at around r max ≈ 0.3 fm. However, at small Q 2 dipoles up to 1 fm contribute.
For the exclusive cross section, which depends on the square of the dipole amplitude, we expect an even stronger dependence on the cutoff r max . Additionally, the weight is given to different dipole sizes than in inclusive charm production even though in both cases only the charm dipoles contribute, as the dipole sizes are set by the overlap between the vector meson and the virutal photon wave functions, not the square of the photon wave function. The total diffractive J/Ψ photoproduction cross section as a function of r max is shown in Fig. 21. For the smaller W = 31 GeV at the initial condition we find indeed a large difference of a factor of 2.9 compared to the IPsat result at r max = ∞ (see also at larger W the difference is reduced. The Bjorken-x dependence of the r max dependence is also weak and especially at small W the normalization of the cross section is not described by our calculation. This is due to the missing non-perturbative contribution affecting dipoles larger than 0.4 fm, not included in our framework, but effectively present in the IPsat calculation. We note that the vector meson wave function is rather uncertain and modifications could move weight away from large r, potentially improving the description within our framework.
In Fig. 22 we show how the proton size extracted from  FIG. 23: Incoherent to coherent cross section ratio with and without cutoff for the large dipoles compared with the H1 data [93].
exclusive J/Ψ production depends on the cut on large dipoles. As long as the cut is not unreasonably small and r max 0.4 fm, our results are compatible with the HERA data. Especially the evolution speed with W is independent of r max . However, this does not mean that the full non-perturbative result would have the same W dependence, since the unknown soft contribution could lead to a modification.
Finally in Fig. 23 we show how the incoherent to coherent cross section ratio depends on the contribution from large dipoles. The cross section ratio as a function of center-of-mass energy W is shown using both the full solution to the JIMWLK evolution, and the result obtained by imposing a cut r < r max = 0.3 fm for the dipole sizes. This cutoff changes the overall normalization of the cross sections, but the cross section ratio changes only slightly. In particular the energy dependence of the cross section ratio is independent of the large dipole cutoff.
The analysis done in this section has strong implications for the potential discovery of saturation effects in e+p collisions. We found that at small r the perturbative description is valid and observables that exclude large dipoles can be well described. However, non-perturbative effects will set in when r is on the order of Λ −1 QCD and the differences between our model and saturation models like IPsat at large r clearly demonstrate that this region is not under control. For realistically achieveable x values at HERA energies, the dipole amplitude should be affected by this non-perturbative physics at values of r that do not exceed 1/Q s by much. Thus, the observables that are under control are not affected by saturation effects, and the observables that could be sensitive to saturation are not fully accessible in our perturbative framework.
We thus conclude that access to saturation effects at currently realistic collider energies can only be achieved in collisions with heavy nuclei, where at a given energy, Q 2 s is increased by A 1/3 , with A the mass number of the nucleus. Thus, saturation effects could be established at small enough r, where the perturbative treatment of the color glass condensate is still under good control.

X. CONCLUSIONS AND OUTLOOK
We have presented the first comparison of HERA data on structure functions and diffractive vector meson production to calculations involving the explicit numerical solution of the JIMWLK equations with impact parameter dependent MV model initial conditions. We have used the charm reduced cross section and the slope of the |t| spectrum for coherent J/Ψ production to constrain the parameters of the model. Predicting the total reduced cross section as well as the coherent diffractive cross section (including its normalization) using these parameters leads to a significant underestimation of the experimental data. This discrepancy is rooted in the contribution from large dipoles, which is largely underestimated in our calculation. When the dipole misses the proton, the dipole amplitude goes to zero. This is not unreasonable, however, dipole sizes on the order of the proton size cannot be described perturbatively, because confinement effects become important. In fact, we find that limiting contributions from our perturbative calculations to r < 0.4 fm and adding a non-perturbative contribution for the soft part, which is represented by a simple toy model in this work, allows for a good description of the total reduced cross section.
For the IPsat model, good agreement is found with almost the entire range of HERA data [39]. This is because the large r behavior of the dipole amplitude is qualitatively different from our result. In the IPsat model the dipole amplitude always approaches one at large r, leading to significant contributions to some ob-servables from dipole sizes exceeding 1 fm. One could understand this contribution as an effective way to include non-perturbative physics, but it is not clear that the concept of a dipole amplitude at such large values of r makes sense.
Observables that do not get contributions from very large dipole sizes, like the reduced charm cross section, can be well described within our framework. Other observables need to be supplemented with a non-perturbative contribution, which could be modeled via vector meson dominance or other more ad-hoc approaches.
We identified the ratio of incoherent to coherent diffractive J/Ψ production cross sections and the energy dependence of B G as other observables that are robustly described in our framework, and are only weakly affected by contributions from large dipoles.
We have further analyzed the effect of running coupling and various infrared and ultraviolet regulators on the JIMWLK evolved observables. We find that the energy dependence of the proton size B G , accessible via coherent diffractive vector meson production, is sensitive to running coupling effects as well as the infrared regulator, even when other parameters are retuned to reproduce the charm reduced cross section. This is because the proton size is more sensitive to infrared physics than other observables. The energy dependence of B G is little affected by contributions from large dipoles.
Generally, our results demonstrate that it is extremely difficult if not impossible to access saturation effects in e+p collisions at HERA energies, because either dipoles larger than 1/Q s do not contribute or the perturbative framework breaks down for a given observable. We conclude that electron-ion collisions involving ions with large A are needed to increase Q s at a given energy and study saturation in a theoretically controlled way. Consequently, an electron ion collider will be essential to access gluon saturation and study this interesting and complex regime of non-linear QCD in the laboratory. Council, Grant ERC-2015-CoG-681707. BPS acknowledges a DOE Office of Science Early Career Award. To study sensitivity on the infrared regulator m in the JIWMLK kernel (9), we study the description of the HERA data using different values for m. As the role of the infrared regulator is to suppress unphysical Coulomb tails, smaller values for the cutoff result in faster growth of the proton size, which is expected to affect the evolution speed of all observables. In this section, we use the MV model initial condition (v = 0) and do not include geometry fluctuations for simplicity.
In practice we keep the initial color charge density g 4 µ 2 fixed and adjust the fixed coupling constant α s separately for each infrared cutoff m to get the best possible description of the charm production data. The results are shown in Fig. 24.
After retuning α s to describe the charm reduced cross section, the effect of the infrared regulator on the proton size parameter B G , measured in diffractive scattering, is shown in Figs. 25 and 26. We find that especially the small-|t| part of the spectra is sensitive to the infrared regulator, leading to a strong m dependence of the energy dependence of B G shown in Fig. 25. However, if one follows the experimental cuts as closely as possible and excludes the region |t| < 0.1 GeV 2 , the results are almost independent of m, as shown in Fig. 26.
To explain the m dependence in the extracted diffractive slope, we show the coherent J/Ψ production cross section as a function of |t| in Fig. 27. It is clear that if the infrared regulator is small, the long-distance modes cause the proton to grow much faster, creating an enhancement at small |t|. Also, in that case the spectrum is far from Gaussian, and extraction of the diffractive slope depends strongly on the |t| range used in the analysis. We note that small enhancement at small |t| (compared  with purely Gaussian spectra extrapolated from larger |t| to zero momentum transfer) can be seen in the latest HERA data at W = 75 GeV [93].