In depth analysis of the combined HERA data in the dipole models with and without saturation

We present an updated impact parameter dependent saturation model (IPsat) determined trough a fit to the combined HERA I and I+II reduced cross section data. The same HERA data are used to fit the linearized (IPnonsat) version of the applied dipole amplitude, which makes it possible to estimate the magnitude of the saturation effects in various experiments. We find that both parametrizations provide comparable descriptions of the considered data when an effective confinement scale dynamics is incorporated with quark masses. Moreover, it is possible to consistently determine the light and charm quark masses. The role of potentially non-perturbatively large dipoles is examined in detail, with the result that, especially in case of the structure function $F_2$, their contribution is numerically significant. Potential to discriminate between the two models in future $e+p$ and $e+A$ experiments is also illustrated.


I. INTRODUCTION
Quantum Chromodynamics, the theory of the strong interaction, is a vast field with a plethora of diverse phenomena still unexplored. Despite being the object of study of the high energy and nuclear physics communities both theoretically and experimentally, the true nature of a proton, its constituents, their interactions and how they come together to conform it remain elusive. In the collinear framework at a given resolution scale Q 2 , the proton can be described as composed of quarks and gluons carrying a fraction x of the proton momentum. Once known at an initial scale Q 2 0 , the partonic densities can be determined at any scale Q 2 using the DGLAP evolution equations [1][2][3][4]. This picture is successfully supported by extensive experimental evidence; however it can not be valid for all the kinematic range: as one explores lower x values the DGLAP equations predict an infinite rise of the gluon density which would break unitarity. It follows that some phenomena, e.g. gluon recombination, has to enter in order to tame this dangerous behavior. This dynamically generates the saturation scale Q 2 s , which determines when the transition to the non-linear regime of QCD takes place. Moreover, if Q 2 s is large, perturbative calculations become possible as the strong coupling constant is weak. A successful theoretical framework to describe QCD in this region is known as the Color Glass Condensate [5].
There are many theoretical models that incorporate saturation to QCD calculations with different approaches and considerations, a popular one being the IPsat parametrization [6][7][8][9][10]. The parameters that provide the necessary non-perturbative input to these models are determined through fits to available data, the bulk of them being high precision deep inelastic scattering (DIS) data measured at HERA in electron-proton (e + p) colli-sions [11][12][13][14]. Despite the wide kinematic range covered by this collider there are no spectacular signals of deviation from the DGLAP predictions and some observed discrepancies might be due to reasons other than saturation. Recently a possible hint of a non-linear regime from HERA data at low x was shown to be feasibly explainable by the inclusion of resumed logarithmic corrections [15]. Saturation model calculations have also been able to provide a natural description for the nearly flat center-ofmass energy dependence of the diffractive to total cross section ratio at HERA [16,17] (see also Ref. [18]). In general, there is no clear consensus on whether or not the onset of saturation has been reached and it will be necessary to perform a thorough and detailed exploration of the kinematic space beyond our current knowledge in order to observe the non-linear regime of QCD. Future facilities such as the Electro-Ion Collider (EIC) [19,20], the Large electron-Hadron Collider (LHeC) [21] and the Future Circular Collider (FCC-eh) [22] hold the key to this door.
In this study we present a new determination of the IPsat and its linearized version ("IPnonsat") description of the HERA combined data [11][12][13][14] in the framework of the dipole model. What is new here, compared to the previous literature [6][7][8], is that we also fit the IPnonsat model parametrization to the precise combined HERA data which allows us to explore the expected magnitude of saturation effects in current and future collider experiments. In addition, by simultaneously fitting the total cross section and the charm contribution to it, it becomes possible to determine the quark masses in this framework. For consistency, a variable flavor number scheme is also applied.
This work is organized as follows. In Sec. II we describe the inclusive photon-proton interaction in terms of the dipole model for both IPsat and IPnonsat formulations. The analysis of the combined inclusive and charm data from HERA is the subject of Sec. III, while in Sec. IV we present an analysis of the obtained dipole-proton scattering amplitude. The application of our determined parameters to the exclusive production of vector mesons is discussed in Sec. V. In Sec. VI we consider the potential of the EIC and the LHeC to provide a signal of saturation in both e + p and e + A collisions. Finally Sec. VII summarizes our findings.

II. PHOTON-PROTON SCATTERING IN THE DIPOLE PICTURE
The most precise study of the proton structure has been performed in e + p DIS experiments at HERA [11][12][13][14]. The process is described as the electron emitting a virtual photon with momentum q, which then probes the proton with a resolution scale Q 2 = −q 2 . In the dipole picture, applicable at high energy and not too large Q 2 , the virtual photon-proton scattering process can be factorized in two parts: the γ * splitting into a qq pair, and the dipole-target scattering. The total γ * p cross section is subsequently obtained as the imaginary part of the forward elastic γ * p → γ * p scattering amplitude using the optical theorem. The photon splitting into a dipole with transverse separation r is described in terms of the photon wave function Ψ f L,T (r, z, Q 2 ), where f is the quark flavor, L and T refer to transverse and longitudinal polarizations, |r| = r, and z is the fraction of the longitudinal momentum of the photon carried by the quark. The total photon-proton cross section is then given by where dσ dip d 2 b is the proton-dipole cross-section with b denoting the impact parameter, and one has to sum over all the quark flavors included in the analysis (u, d, s, c and b in this work). The photon wave functions for the transverse and longitudinal polarizations summed over spins and helicities read [23] and Here, e q is the fractional charge of the quark q and m q is the quark mass.
The proton structure functions F 2 and F L can be written in terms of the total photon-proton cross section as The most precise combined HERA data is released in the form of the reduced cross section (6) where y = Q 2 /(xs) is the inelasticity of the e + p scattering with center-of-mass energy √ s. In the IPsat model the saturation scale of the target depends on the impact parameter, and the cross section is written as The proton density profile is assumed to be Gaussian, and we use fixed B p = 4 GeV −2 based on HERA exclusive J/Ψ production data. Thus, the effective transverse area of the proton is not a free parameter in the model, and the root mean square radius of the proton is 2B p . However, we note that this parametrization describes only part of the observed growth of the proton already at the HERA energies [24,25] due to the Gribov diffusion. Including this effect would require us to either parametrize the proton width B p and try to fit it simultaneously to the HERA data, or solving impact parameter dependent small-x evolution equations such as in Refs. [26][27][28]. We do not want to include exclusive data into our fit due to additional model uncertainties, and we leave it for a future work. At the lowest order in perturbation theory the function F is proportional to the DGLAP evolved gluon distribution function with x being the momentum fraction of the proton carried by the gluon, and the scale µ 2 is a function of r 2 . This parametrization gives the correct pQCD limit for the dipole cross section, σ dip ∼ r 2 . At large dipoles, the saturation effects are described by having an eikonalized gluon distribution function, which gives dσ dip / d 2 b → 2 at large r, corresponding with the interaction probability being unity at large dipoles. The scale at which the gluon density and the strong coupling constant are evaluated is chosen to be µ 2 = µ 2 0 + C/r 2 . Here, unlike in previous fits [7,8], we fix µ 2 0 = 1.1 GeV 2 and let C be a free parameter. This allows us to force µ 2 to remain in the perturbative region. In our fit we only include data in the Q 2 bins that satisfy Q 2 > µ 2 0 , which guarantees the applicability of the perturbative calculation. We also consider data in the kinematical region x < 0.01 where the dipole picture can be considered to be most reliable.
The gluon density at the initial scale µ 0 is parametrized as where A g and λ g are free parameters to be determined by the fit. Unlike previous works, we use a variable flavor number scheme when evaluating the strong coupling constant α s and when solving the DGLAP evolution for the gluon distribution. Neglecting the change in the number of flavors and using a fixed number of quark flavors as the data moves in Q 2 is not the most adequate strategy from the theoretical point of view but, in practice, it is solely reflected in different values of the fitted parameters, without a sizable effect in the description of the data in the currently probed kinematic range. For simplicity we refrain from including the quark singlet contribution to the DGLAP evolution which should also be present. However we have checked that the fit quality and the resulting dipole amplitude are not significantly affected by its inclusion, and that the fit drives the quark singlet contribution to zero at the initial scale. Furthermore we choose the high-x behavior to be an integer exponent (6) instead of the standard 5.6 in order to speed up the DGLAP evolution performed in Mellin space. We have checked that this exponent does not have a significant impact on the determination of the parameters. The strong coupling constant is required to satisfy α s (M z = 91.1876 GeV) = 0.1183 [13]. When evaluating the heavy quark contribution, the Bjorken x is replaced by in order to take into account the kinematical shift, where q refers to the quark flavor c or b. The quark masses m f for the light and charm quark are kept as free parameters and constrained by the fit. In the fit we only include data that satisfy x c < 0.01 for the charm quark. The b quark mass is set to 4.75 GeV, and the b quark contribution to the structure function is included if the corresponding Bjorken-x satisfies x b < 0.1.
Effectively in the IPsat model we fit the x dependence of the cross section to the HERA data, and extrapolate it down to smaller values of Bjorken-x when calculating predictions e.g. for the future DIS experiments. The other approach used in small-x phenomenology is to evolve the dipole amplitude in x using the perturbatively derived Balitsky-Kovchegov evolution equation [29,30], and incorporate some of the DGLAP effects in the initial condition of the evolution, which is fitted to the HERA data. Currently these fits are done at the leading logarithmic accuracy with running coupling corrections, and very good description of the HERA data is obtained if the x evolution speed is also adjusted in the fit process by fitting the coordinate space scale at which the running coupling constant is evaluated [31,32]. Recently, there has been a lot of progress in developing the theory to NLO accuracy, see e.g. Refs [33][34][35][36][37][38][39][40].
In order to quantify the magnitude of the saturation effects, we also study the linearized version of the IPsat parametrization (7): to which we refer as the IPnonsat model. We emphasize that the rigorous way to look for the saturation effects is to compare saturation model calculations with the perturbative QCD results obtained by applying collinear factorization. In practice, however, comparing IPsat and IPnonsat results can be used as a first estimate for the strength of the saturation effects in the given process. In order to enable such a comparison, we fit the IPnonsat model parameters to the same HERA data.

III. DESCRIPTION OF THE HERA REDUCED CROSS SECTION DATA
The H1 and ZEUS experiments from HERA have published two combined datasets for the reduced cross section: the first one in Ref. [11] with the charm contribution in Ref. [12] where the HERA-I data are combined, and the latest final combined result for the inclusive reduced cross section including all HERA (HERA I+II) data in Ref. [13]. Recently the latest charm reduced cross section data from HERA I+II have been made available [14]. We will perform fits to both HERA I and HERA I+II datasets, but we will consider the fit to HERA I data to be our main result, as the charm cross section from HERA I+II is not yet published and the additional data in the newer dataset mostly affects the high-Q 2 region not included in the analysis. Moreover, the newer total reduced cross section dataset has more than twice as many data points in the region of interest of this work, which renders that data less sensitive to the charm quark if one does not introduce artificial weight factors (the HERA I+II charm reduced cross section data has as many points as in the HERA I results).
We include data in the region 1.5 < Q 2 < 50 GeV 2 . The lower limit, which we require to be larger than µ 2 0 , guarantees that there is a large scale justifying the perturbative calculation. As the validity of the dipole picture becomes questionable at high Q 2 , we only include data up to Q 2 = 50 GeV 2 in our main fit, though we also show results for fits done in the larger virtuality range with Q 2 max = 500 GeV 2 . The free parameters in this work are A g and λ g that describe the gluon distribution at µ 2 = µ 2 0 , and C that controls the momentum space scale corresponding to the given dipole size |r|. In addition, and as mentioned in the previous section, the light quark and charm quark masses are to be determined by the fit. However, as the bottom quark contribution is small, the fit can not reliably determine the b quark mass and we set it to m b = 4.75 GeV.
The dependence of the fit quality on the light quark mass m light is shown in Fig. 1 when we fit the HERA I data [11,12]. Throughout this work we show results  I: All dimensionfull parameters in GeV. X + Y points means X points for σr and Y points for σ r,charm . Bottom mass is m b = 4.75 GeV, and Bp = 4 GeV −2 . In the IPsat fit the light quark mass is fixed to prevent numerical instability. The starting scale for the DGLAP evolution is µ 2 0 = 1.1 GeV 2 . Fit results with HERA I data [11,12] and HERA I+II data [13,14] [11] and charm [12] reduced cross section data as a function of the light quark mass.
obtained by using the fit to the HERA I data. Here, the charm mass and all the other parameters are allowed to vary freely with the fixed light quark mass. As one moves to lower values of m light in the IPsat fits, the χ 2 reaches a plateau, making it hard to determine a best fit extraction of it's value, similarly as in Ref. [8]. Therefore, and in order to have finite quark mass to act as an infrared regulator, we fix m light = 0.03 GeV for the IPsat case. The situation is different in the IPnonsat fit, where a relatively large light quark mass ∼ 0.14 GeV is preferred. This can be interpreted as an effective confinement requirement. The effect of a nonzero quark mass is to suppress dipoles larger than ∼ 1/m light , that in IPnonsat model have an unphysically large (unitarity violating) cross section. The final fit quality in both IPsat and IPnonsat models is similar, suggesting that, when describing the inclusive DIS data, the effective confinement effect in the IPnonsat has a comparable effect as the gluon saturation in the IPsat parametrization.  [11] and charm [12] reduced cross section data as a function of the charm quark mass. In the IPsat fit the light quark mass is fixed to 0.03 GeV (see text).
Unlike previous works [7,8] we find that the fit clearly prefers a charm quark mass ∼ 1.35 GeV, with the χ 2 presenting a clear minimum. One main difference of our work with that of Ref. [8] is that the charm data are included in the fit, which allows us to constrain the charm mass simultaneously with the other parameters. The quality of the fit as a function of the charm mass is shown in Fig. 2, where all parameters are allowed to vary (except the light quark mass which is fixed to 0.03 GeV in the IPsat model), while keeping m c constant.
In Table I we present the fitted parameters for two different values of Q 2 max . As can be seen from the χ 2 , the description of the precise HERA data is excellent, as already noted in previous works [6][7][8]. What is new here compared to the previous literature is the IPnonsat model parametrization, that we find to describe the com- bined HERA data equally well 1 . This is demonstrated in Figs. 3 and 4 where the reduced cross section and the charm contribution to it are shown and compared to the IPsat and IPnonsat model results, the curves being practically one on top of the other. We are also able to determine the quark masses from the fit. For the charm contribution, we note that there is some tension, the HERA data suggesting slightly slower Q 2 evolution that what is the outcome of our combined fit. This could be due to shortcomings of the model in describing the heavy quarks, as it happens in the collinear factorized framework where higher order corrections are needed for a proper description of the charm and bottom data [41]. It also might be influenced by the fact that the HERA collider was not particularly tuned to measuring heavy quarks, an issue that will be addressed in future colliders such as the EIC.
The fits done to HERA I and HERA I+II combined datasets result in comparable parameters (the largest difference being the scale parameter C, on which the results depend only logarithmically). Also, the newer dataset prefers a slightly smaller charm quark mass, but the differences are small. Thus the difference at the level of an observable will be negligible between the two different fits 1 In Ref. [6] the IPnonsat model was fitted to older H1 and ZEUS data that have much larger uncertainties than the combined dataset used in this work performed to different datasets. We will demonstrate this in Appendix B. We consider the top row of Table. I to be our main fit, as it relies on published datasets and only includes measurements in the kinematical domain where the applied model can be considered to be most reliable. The bottom quark reduced cross section included in the latest combination of HERA heavy quark data [14] is discussed in Appendix A.

Contribution from large dipoles
As discussed above, especially in the IPnonsat model, a nonzero effective light quark mass is needed in order to obtain a satisfactory description of the HERA data. This means that the reduced cross section data is sensitive to the contributions from (possibly non-perturbatively) large dipoles, whose formation should be suppressed by confinement scale effects. In the IPsat parametrization, the imposed unitarity requirement limits the scattering probability not to exceed unity at large dipoles, but large dipoles can still have a numerically significant contribution.
The fractional contribution from large dipoles to the total F 2 and F L is shown in Figs. 5 and 6, respectively. These calculations are done using the IPsat fit (first line of Tab. I), using the IPnonsat fit would result in very similar r max dependence. For F 2 , even at relatively large Q 2 ∼ 500 GeV 2 , 10% of the total structure function originates from dipoles larger than 1 fm. On the other hand,  the HERA reduced cross section data have relative uncertainties at the percentage level, much smaller than the contribution we obtain from the non-perturbatively large dipoles.
The reason for this large contribution is that there is a large so called aligned jet contribution: in the limits z → 0 and z → 1 the large dipole contribution to the transverse photon-proton cross section is only suppressed by the light quark mass as ∼ e −m light r . This can be seen from the virtual photon wave function, Eq. (2). On the other hand, as can be seen from Fig. 6, in the case of F L at moderate Q 2 the contribution from the region r 1 fm is negligible. This is due to the extra factor z 2 (1 − z) 2 in the longitudinal photon wave function, Eq. (3), which suppresses the endpoint contributions. Thus, we would prefer to fit the F L data instead of the reduced cross section measurements which is dominated by F 2 . However, the HERA F L data [42,43] are not precise enough for a detailed comparison with the dipole model calculations.
Future DIS facilities EIC and LHeC have plans to measure proton structure functions (including F L ) at an unprecedented accuracy [19][20][21]. Similarly, studying only the charm contribution to the total cross section limits the contribution from large dipoles as demonstrated in Fig. 7. In case of the F 2,charm , even at small Q 2 contribution from dipoles larger than ∼ 0.6 fm is negligible (but very small dipoles are not sensitive to the saturation effects either). In general we find that F 2 , F L and F 2,charm are sensitive to different dipole sizes, and future DIS data covering all these structure functions will provide much more precise constraints.
In order to further study how much large dipoles affect the fit result, we perform the fits to HERA I inclusive and charm cross section data up to Q 2 max = 50 GeV 2 with different cutoffs for large dipoles r max . The resulting fit quality is shown in Fig. 8. We find that in order to obtain a good fit to the HERA data, we have to include dipoles up to ∼ 2 . . . 2.5 fm in the IPsat model. In the case of the IPnonsat parametrization, the fit can compensate the effect of the r max cutoff as the light quark mass is also a fit parameter, thus the fit quality is more stable with respect to the infrared cutoff. The IPnonsat fit drives the light quark mass to zero when r max becomes ∼ 1.6 fm, and it is not possible to fit the HERA data with a much smaller cutoff. The dependence of the light quark mass on r max is shown in Fig. 9 which further demonstrates that in the IPnonsat model the inclusion of large dipoles requires a larger light quark mass to suppress the contribution from this unphysical region. In the IPsat model, the fits prefer a zero light quark mass at all r max .

IV. DIPOLE SCATTERING AMPLITUDE
The resulting dipole amplitude at b = 0 is shown in Fig. 10, where we compare it with the previous results from [8] labeled as "RSKV". Even thought our study has incorporated some refinements (variable flavor number scheme in the DGLAP evolution, quark masses determined by the fit, inclusion of the charm reduced cross section data), the base model is essentially the same and, by fitting similar data sets one is expected to obtain compatible dipole amplitudes, despite some numerical differences in the fitted parameters. In the IPnonsat model the evolution at the initial scale in x is very slow (λ g being close to 0), thus there is basically no evolution at large r, in the region where the IPsat parametrization has already reached unity (and where the IPnonsat model gives unphysical results).
To demonstrate the evolution of the gluon distribution function we plot xg(x, µ 2 = µ 2 0 + C/r 2 ) in Fig. 11 as a function of r using both the IPsat and IPnonsat fitted parameters to initialize the DGLAP evolution. At large scales the two parametrizations have small differences, as the effect of the initial condition is washed out in the evolution. Close to the initial scale µ 2 0 there is basically no evolution if the IPnonsat model parametrization is used (λ g ≈ 0, which forces the dipole scattering amplitude not to grow in the region where it is already violating unitarity), unlike in the case of the IPsat model.
At large scales and at sufficiently large x 10 −3 it is also visible that the gluon density starts to decrease as the scale µ 2 ∼ C/r 2 increases. This is expected, as the momentum conservation in the DGLAP evolution  removes the larger-x gluons as they are splitting into the smaller-x ones. Similar results were already found in Ref. [6]. This effect is strong close to x ∼ 10 −2 , where the decreasing gluon density is probed already by dipoles that have large enough sizes to contribute significantly on F 2 . The point at which the non-linear effects become relevant is characterized by the saturation scale Q 2 s . To determine it we use the definition The extracted saturation scale as a function of x is shown in Fig. 12. Here, Q 2 s is extracted at the central impact This definition gives b ≈ 0.46 fm. The difference between the IPsat and IPnonsat parametrizations remains small at all values of Bjorken-x, the IPnonsat model having in general slightly faster evolution. As expected based on previous analyses (e.g. [8,32]), the saturation scale of the proton is at the Λ QCD range in the region x ∼ 10 −2 , and the region of Q 2 s being perturbative is reached below x 10 −4 . . . 10 −5 (note that the absolute value of Q 2 s depends on the definition (12)).

V. EXCLUSIVE VECTOR MESON PRODUCTION
Additional information about the proton structure can be obtained by studying exclusive processes. They are particularly powerful in probing the gluonic structure, as at leading order in collinear factorization the vector meson production cross section is proportional to the squared gluon distribution [44]. In addition to being sensitive to the total gluonic density, in exclusive process the momentum transfer ∆ = √ −t is the Fourier conjugate to the impact parameter, which makes it possible to probe also the impact parameter dependence.
In the dipole picture, the scattering amplitude for exclusive vector meson production reads (see e.g. Ref. [7]) This expression has a straightforward interpretation. First, the incoming virtual photon splits into a quarkantiquark pair as described by the virtual photon wave function Ψ. The dipole then scatters elastically off the proton with cross section σ dip , and ultimately forms the final state vector meson described by the wave function Ψ V . The scattering amplitude in the momentum space is obtained by calculating the Fourier transform from the coordinate space, with the momentum transfer ∆ being the Fourier conjugate to the impact parameter b. Here we have neglected the off-forward correction to the vector meson wave function [45]. The exclusive vector meson production cross section reads In addition, we include the corrections due to the real part of the scattering amplitude neglected when deriving the above result, and the so called "skewedness correction" which takes into account the fact that in the twogluon exchange the two gluons carry different amount of longitudinal momentum [46]. These corrections are included as in Ref. [47] and, to a good approximation, only affect the overall normalization of the diffractive cross section. Unlike the virtual photon wave function used to calculate inclusive cross sections, the vector meson wave function can not be calculated perturbatively. We use here the Boosted Gaussian parametrization as in Ref. [7], where one assumes that the vector meson is a quarkantiquark state with spin and polarization structure the same as in the case of the photon. This assumption makes it possible to write the overlap between the vector meson V and the virtual photon wave function in case of the transverse polarization as (16) and for the longitudinal polarization z) . (17) The scalar part of the vector meson is parametrized as The advantage of this parametrization is that the wave function has the proper short-distance behavior ∼ z(1 − z) in the limit of massless quarks. The normalization factors N T,L and the width R are fixed by requiring that the decay width to the electron channel (calculated using the longitudinal polarization as in Ref. [7]) reproduces the experimental value, and that the wave function is properly normalized. As these parameters depend on the quark masses, we calculate them for J/Ψ, ρ and φ for the same values obtained in the fits to inclusive data for the IPsat and IPnonsat parametrizations. The obtained values and the comparison with the results from [7] are shown in Table. II. We remark that if one were to calculate the decay width using the transverse polarization, the final numbers would be slightly different as noted in Ref. [7].
Due to the small light quark mass especially in the IPsat model parametrization, the photoproduction of ρ and φ mesons can not be reliably calculated from our model. In the case of J/Ψ, the charm quark mass provides the necessary large scale that cuts out large dipoles, and makes it possible to calculate exclusive J/Ψ production down to Q 2 = 0 GeV 2 . This is advantageous, as recently it has become possible to measure exclusive vector meson photoproduction in ultraperipheral collisions at RHIC and the LHC [49].
In the literature there are some inconsistencies using the vector meson wave function parametrization from Ref. [7] together with dipole model fits with different choices for the charm quark mass. In order to quantify the effect of having a consistent quark mass in the dipole model fit and in the vector meson wave function, we show in Fig. 13 the J/Ψ production cross section using our IPsat fit (where m c ≈ 1.35 GeV) and the widely used wave function from Ref. [7] (where m c = 1.4 GeV, referred as KMW). The larger quark mass in the KMW parametrization reduces the cross section by approximately 14%. We   [7] where the charm quark mass is mc = 1.4 GeV. The other IPsat and IPnonsat curves use the wave function parametrizations from Table II. The W = 75 GeV data are from Ref. [48] and the W = 100 GeV data from Refs. [24,25]. The high-energy results are scaled by 5 for illustrational purposes.
note that the uncertainties related to the modeling of the vector meson wave function are larger than this, see e.g.
Refs. [7,50]. The IPsat and IPnonsat results are practically on top of each other at small |t|. The agreement with the HERA data is good, except that we can not reproduce the small change of the t slope at |t| 0.1 GeV 2 visible in the W = 75 GeV data 2 . At large |t| the different form factors generate different 2 Which is described accurately in the IP-Glasma model calculation in Refs. [51,52]  spectra. The Fourier transform of the IPnonsat dipole amplitude is exactly Gaussian, and the spectra goes like e −Bp|t| . In the IPsat parametrization, the proton density profile is actually ∼ exp(−e −b 2 /2Bp ), thus its Fourier transform is a more complicated function which generates diffractive dips at large −t. At W ∼ 100 GeV, we get the location for the first diffractive minimum to be |t| ∼ 2.5 GeV 2 (see also Ref. [53] for discussion about the energy dependence of the dip location).
The total J/Ψ production cross section, calculated using our IPsat and IPnonsat model fits, is shown in Fig. 14. The results are compared with the HERA data from the H1 [24,48] and ZEUS [25] collaborations, and with the recent measurement by the ALICE collaboration [54] on ultraperipheral proton-lead collision (which can be seen as a photon-proton scattering due to the Z 2 enhancement for the photon flux emitted from the nucleus). The mod- The dashed line is obtained by using the ρ wave function provided in Ref. [7] where the light quark mass is m light = 0.14 GeV. The other IPsat and IPnonsat curves use the wave function parametrizations from Tab. II. The experimental data are from the H1 collaboration [55].
els are found to be in agreement with the current data 3 , but the future more precise LHC data at even higher W (requiring larger center-of-mass energy for the ultraperipheral proton-nucleus scattering or more forward rapidities) will be in the region where the difference between the models is large.
Let us then study the production of light mesons. The differential ρ electroproduction cross section in different Q 2 bins is shown in Fig. 15 and compared with the H1 data [55]. In the lowest Q 2 bins the applicability of our framework is questionable as, again, we do not have a large scale suppressing nonperturbative contributions. The agreement with the H1 data is good especially at higher Q 2 using both IPsat and IPnonsat parametrizations. Note that the light quark mass is much larger in the IPnonsat model, which explains why the cross section at small −t is actually smaller in the IPnonsat calculation. Again, by calculating the ρ production with the IPsat model parametrization and the KMW wave function [7] we find that the larger light quark mass suppresses the cross section at small Q 2 bins similarly to the case of J/Ψ production, and that the different impact parameter profile causes diffractive dips at large |t| when we use the IPsat parametrization.

A. Proton targets
As we saw in Sec. III, both the IPsat and IPnonsat parametrizations give equally good descriptions of the HERA data. Due to the different behavior of the gluon distribution xg at small x, and especially as the gluon distribution is eikonalized in the IPsat parametrization, differences are expected to arise when extrapolating to smaller values of Bjorken-x. This we already found in the case of exclusive J/Ψ production in Fig. 14, where both parametrizations give comparable results in the range covered by the HERA data, but differ by ∼ 50% in the kinematics covered by recent and near-future LHC experiments. On the contrary, for inclusive DIS the structure function F 2 predicted by both, the IPsat and IPnonsat, prametrizations overlap almost perfectly for a wide range of x values, as shown in Fig. 16. Even at very small x ∼ 10 −6 the two models differ only at the level of a few percent.
To see if the future high-energy DIS experiments can measure the structure functions with an accuracy lower or comparable to the difference between the two models, we show in Fig. 17 the ratio of F 2 obtained using the IPnonsat and IPsat parametrizations compared with the projected accuracy of the LHeC measurement [21]. As the uncertainty estimates for the LHeC consist of projected absolute, not relative uncertainties, the relative uncertainties shown as a colorfull bands are obtained by comparing the projected uncertainty to the result obtained by applying the IPsat parametrization. As already seen in Fig. 16, the differences are at a few percent level, and only slightly larger than the projected experimental accuracy at the LHeC. FCC-eh would probe x values down to x ∼ 10 −7 with comparable precision, thus at least in F 2 there would not be a striking difference between the IPsat and IPnonsat extrapolations. For F L the model differences are similar, but the experimental accuracy is much lower.
The data from future DIS machines on F 2 , F 2,charm and F L will thus make it possible to constrain dipole-proton scattering much more accurately, thanks to the fact that different observables are sensitive to different dipole sizes. However, in inclusive e+p scattering the IPsat model predicts the non-linear effects to be small. It thus becomes necessary to study nuclear DIS, where one expects the saturation effects to be enhanced by a large factor A 1/3 .

B. Nuclear targets
Using the Optical Glauber model one can extend the dipole-proton scattering amplitude to the dipole-nucleus one. Calculating the dipole-nucleus scattering by averaging over the positions of the individual nucleons from the Woods Saxon distribution following Ref. [6] one obtains where σ dip is the total dipole-proton cross section integrated over impact parameter, see Eq. (7). For large nuclei, this gives Only if, in addition to having a large A, the dipole-proton cross section is small (which requires small r as σ dip ∼ ln r) one obtains the smooth nucleus result In practice, as large dipoles have numerically significant contribution to F 2 , this approximation is not a realistic and results in too small nuclear suppression as discussed in Ref. [6]. Here T A is the Woods Saxon distribution integrated over the longitudinal coordinate, and the nuclear radius is R A = 1.13A 1/3 − 0.86A −1/3 fm. The normalization is chosen such that d 2 bT A (b) = 1. The corresponding dipole-nucleus amplitude in the IPnonsat model is the first term from the series expansion The nuclear suppression factor for the structure function F 2 is shown in Fig. 18, where we calculate For comparison, the experimental data points for Calcium and Lead from [56,57,57] are shown 4 . The Bjorken-x dependence of the F 2 and F L nuclear suppression factors is studied in Fig. 19. We find that in the case of F 2 , the x dependence is weak due to the fact that a significant part of F 2 originates from large saturated dipoles, whose scattering cross section is not affected by the relatively slow evolution of the saturation scale Q 2 s ∼ xg(x, µ 2 ). In the case of F L , which is dominated by smaller dipoles, a significantly faster x dependence is found.
We also find that when x increases, at some point the nuclear suppression starts to increase, which is counter intuitive. A similar observation was already seen in Ref. [59]. We note that if we were to do a BK evolution for the nucleus similarly as in Ref. [32], we would get a suppression factor for F 2 that always increases with increasing x. We consider the fact that the F 2 suppression factor has a maximum as a function of x to be an artefact of the shortcomings of the IPsat parametrization (e.g. decreasing xg with increasing scale in Fig. 11 which efectively decreases the saturation scale probed by larger dipoles).
Let us next study nuclear suppression in exclusive vector meson production. Here, we analyze the Q 2 dependence of light ρ and φ meson production that allows us to scan the transition from saturation to dilute region (see also Ref. [60]). In addition, we include J/Ψ, which is significantly smaller and heavier, and should experience less non-linear effects. As the total coherent cross section scales like A 2 , and the width of the first diffractive peak is proportional to 1/R 2 A ∼ A −2/3 , we study the suppression factor where V refers to the vector meson species. Note that diffractive cross sections are enhanced more strongly by the large nucleus compared to the inclusive scattering which scales linearly in A. The numerical factor c in the denominator can be obtained as a ratio of the form factors for the nucleus and the proton as the form factors determine the t spectra. Here the form factors areT A ( For the Gold nucleus with A = 197, this gives c ≈ 0.5011. By construction, this definition gives R = 1 in case of the linear IPnonsat parametrization. The Q 2 dependence of the suppression factor is relatively weak as shown in Fig. 20, with R reaching unity only at Q 2 ∼ 1000 GeV 2 . Physically the reason here is that even if at large Q 2 the photon preferably splits into a small dipole which does not see nonlinear effects, the requirement that a light (and large) vector meson is formed at the final state gives a small weight for the small dipoles. Instead, the specified final state requires the dipole to be relatively large, and it becomes necessary to go to very large Q 2 to give enough weight on small dipoles so that the suppression factor becomes close to 1. In case of J/Ψ the vector meson wave function always picks up relatively small dipoles, thus the maximum suppression is only around 0.7. For a discussion about the nuclear suppression in incoherent scattering, the reader is referred to Ref. [47].
In Fig. 20 we also show the W dependence of the suppression factor, which is found to be relatively modest. However, in the future Electron Ion Collider it will be useful to have maximally large Q 2 lever arm to study the evolution of the suppression factor from the saturated to the dilute region (see also Refs. [20,60]). For example, in the case of ρ production and at x P = 10 −2 (which is around the maximum x where our model can be considered to be valid), the maximum Q 2 that can be reached at an EIC with √ s N N = 90 GeV is Q 2 max ≈ 100 GeV 2 , which would make it possible to observe the evolution of the nuclear suppression from R ∼ 0.2 to R ∼ 0.8.

VII. CONCLUSIONS
The possibility of finding new and exciting QCD phenomena is just around the corner, with the next generation of DIS colliders to come soon. In view of this it is timely to exploit to the fullest the available data. In this path this work bring us one step forward, by determining not only the IPsat model with the modern data sets but also for the first time the linearized IPnonsat parametrization is determined from the same data in a fully consistent way. The main differences to previous works are the inclusion of the charm data to the global fit, which allows us to constrain the quark masses, and the use of a variable flavor number scheme. Also, for the first time, the combined HERA I+II datasets are used in dipole model fits with a similar outcome than in case of the HERA I combined results. We find that both models, with and without saturation, result in almost identical cross sections at HERA kinematics, and that the differences in e + p scattering are expected to be small even in the LHeC or FCC-eh kinematics. The nonlinear effects, however, become significant if a nuclear target/beam is used and should be easily observed in the future Electron-Ion Collider.
Despite some differences in the setup with the previous literature, the resulting dipole amplitude and calculated cross sections are similar than in the previous work [8]. This is a consequence of performing the fits using comparable data sets, which extrapolates to similar dipole amplitudes. Therefore for the IPsat case the models found in the literature will provide reasonable numerical results in the kinematical range accessible in current and future colliders. We emphasize that having a linearized "IPnonsat" model independently constrained by the HERA data is necessary for estimating the size of the saturation effects in these experiments.
The similar cross sections obtained from both IPsat and IPnonsat parametrizations are understood in terms of the effective description of the confinement scale physics. The linearized dipole cross section violates unitarity at large dipoles, and the fit compensates that by imposing an effective confinement effect damping dipoles larger than ∼ 1/m light . In the IPsat model, the unitarity requirement limits the contribution from unphysically large dipoles and a large light quark mass is not required in order to obtain a good description of the HERA data.
The inclusive structure function F 2 (and thus the reduced cross section σ r ) is especially sensitive to the dipoles expected to be heavily influenced by confinement effects not completely included in this work. On the other hand, F L and F 2,charm are not sensitive at all to dipoles larger than ∼ 1/Λ QCD . Thus, the future more precise F L and F 2,charm data, together with inclusive structure function measurements, will allow us to perform a much more precise test of the saturation picture.
Both IPsat and IPnonsat parametrizations give comparable predictions for structure functions at the energies available in future DIS experiments such as LHeC and FCC-eh. Slightly larger differences are seen when calculating predictions for exclusive vector meson production, but in order to really see the onset of non-linear nature of QCD, we find that it is crucial to perform DIS with nuclear targets. These effects should become clearly visible already at the EIC energies. Additionally, the potential for inclusive diffraction to separate between the linear and non-linear parametrizations could be studied in future work. Q 2 evolution is obtained compared with the HERA measurements, similarly as in case of charm reduced cross section (see Fig. 4).  Table I. Bjorken x values decrease to the left.  Table I.