Measurement of the muon charge asymmetry in inclusive pp to WX production at sqrt(s) = 7 TeV and an improved determination of light parton distribution functions

Measurements of the muon charge asymmetry in inclusive pp to WX production at sqrt(s) = 7 TeV are presented. The data sample corresponds to an integrated luminosity of 4.7 inverse femtobarns recorded with the CMS detector at the LHC. With a sample of more than twenty million W to mu nu events, the statistical precision is greatly improved in comparison to previous measurements. These new results provide additional constraints on the parton distribution functions of the proton in the range of the Bjorken scaling variable x from 10^-3 to 10^-1. These measurements and the recent CMS measurement of associated W + charm production are used together with the cross sections for inclusive deep inelastic ep scattering at HERA in a next-to-leading-order QCD analysis. The determination of the valence quark distributions is improved, and the strange-quark distribution is probed directly through the leading-order process g + s to W + c in proton-proton collisions at the LHC.


Introduction
In the standard model (SM), the dominant processes for inclusive W-boson production in pp collisions are annihilation processes: u d̄ → W + and d ū → W − , involving a valence quark from one proton and a sea antiquark from the other. Since there are two valence u quarks and one valence d quark in the proton, W + bosons are produced more often than W − bosons. The Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider (LHC) has investigated this production asymmetry in inclusive W-boson production and measured the inclusive ratio of total cross sections for W + and W − boson production at √ s = 7 TeV to be 1.421 ± 0.006 (stat) ± 0.032 (syst) [1]. This result is in agreement with SM predictions based on various parton distribution functions (PDFs) such as the MSTW2008 and CT10 PDF sets [2,3]. Measurements of the production asymmetry between W + and W − bosons as a function of boson rapidity can provide additional constraints on the d/u ratio and on the sea antiquark densities in the proton. For pp collisions at √ s = 7 TeV, these measurements explore the PDFs for the proton for Bjorken x from 10^-3 to 10^-1 [4]. However, it is difficult to measure the boson rapidity production asymmetry because of the energy carried away by neutrinos in leptonic W-boson decays. A quantity more directly accessible experimentally is the lepton charge asymmetry, defined as
$$ A(\eta) = \frac{\dfrac{d\sigma}{d\eta}(W^+ \to \ell^+ \nu) - \dfrac{d\sigma}{d\eta}(W^- \to \ell^- \bar{\nu})}{\dfrac{d\sigma}{d\eta}(W^+ \to \ell^+ \nu) + \dfrac{d\sigma}{d\eta}(W^- \to \ell^- \bar{\nu})} \tag{1} $$
where dσ/dη is the differential cross section for W-boson production and subsequent leptonic decay and η = − ln [tan (θ/2)] is the charged lepton pseudorapidity in the laboratory frame, with θ being the polar angle measured with respect to the beam axis.
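As a minimal numerical sketch of these two definitions, the pseudorapidity η = − ln [tan (θ/2)] and an asymmetry of the form (σ + − σ − )/(σ + + σ − ) can be evaluated as below. The function names are illustrative and not taken from the analysis software; the inclusive ratio 1.421 quoted above is used only as an example input and corresponds to an η-integrated asymmetry of about 0.174.

```python
import math

def pseudorapidity(theta):
    """eta = -ln[tan(theta/2)], with theta in radians from the beam axis."""
    return -math.log(math.tan(theta / 2.0))

def charge_asymmetry(sigma_plus, sigma_minus):
    """(sigma+ - sigma-) / (sigma+ + sigma-), e.g. for one eta bin."""
    return (sigma_plus - sigma_minus) / (sigma_plus + sigma_minus)

# Example input: W+/W- cross section ratio of 1.421, in units where sigma(W-) = 1.
print(round(charge_asymmetry(1.421, 1.0), 3))
```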
High-precision measurements of the W-boson lepton charge asymmetry can improve the determination of the PDFs. Both the W-boson lepton charge asymmetry and the W-boson production charge asymmetry were studied in p p̄ collisions by the CDF and D0 experiments at the Tevatron collider [5][6][7]. The ATLAS, CMS, and LHCb experiments also reported measurements of the lepton charge asymmetry using data collected at the LHC in 2010 [8][9][10][11]. An earlier measurement of the W-boson electron charge asymmetry is based on 2011 CMS data corresponding to an integrated luminosity of 0.84 fb −1 [12].
The impact of CMS measurements of the lepton charge asymmetry on the global PDF fits has been studied by several groups [13][14][15][16][17], who concluded that improvements in the PDF uncertainties for several quark flavors could be achieved with more precise data. In this paper, we report a measurement of the muon charge asymmetry using a data sample corresponding to an integrated luminosity of 4.7 fb −1 collected with the CMS detector at the LHC in 2011. The number of W → µν events (more than 20 million) in this data sample is 2 orders of magnitude larger than for our previous measurement [10].
This precise measurement of the muon charge asymmetry and the recent CMS measurement of associated W + charm production [18] are combined with the cross sections for inclusive deep inelastic e ± p scattering at HERA [19] in a quantum chromodynamics (QCD) analysis at next-to-leading order (NLO). The impact of these measurements of W-boson production at CMS on the determination of light-quark distributions in the proton is studied and the strange-quark density is determined.
This paper is organized as follows. A brief description of the CMS detector is given in Section 2. The selection of W → µν candidates is described in Section 3. The corrections for residual charge-specific bias in the measurement of the muon transverse momentum (p T ) and

Event reconstruction
The signature of a W → µν event is a high-p T muon accompanied by missing transverse momentum E / T due to the escaping neutrino. The CMS experiment has utilized a particle-flow algorithm in event reconstruction, and the E / T used by this analysis is determined as the negative vector sum of the transverse momenta of all particles reconstructed by this algorithm [21]. The W → µν candidates were collected with a set of isolated single-muon triggers with different p T thresholds, which is the major difference with respect to the previous CMS measurement where nonisolated single-muon triggers were used [10]. The isolated muon trigger requires that in the neighboring region of the muon trigger candidate both the transverse energy deposits in calorimeters and the scalar sum of the p T of the reconstructed tracks are small, and it reduces the trigger rate while maintaining a relatively low muon p T threshold. We use all the data-taking periods during which the isolated muon triggers were not prescaled (i.e. they were exposed to the full integrated luminosity).
Other physics processes can produce high-p T muons and mimic W → µν signal candidates. We consider the SM background contributions from multijet production (QCD background), Drell-Yan (Z/γ * → ℓ + ℓ − ) production, W → τν production [electroweak (EW) background], and top-quark pair (tt) production. In addition, cosmic-ray muons can penetrate through the center of the CMS detector and also mimic W → µν candidates.
Monte Carlo (MC) simulations are used to help evaluate the background contributions in the data sample and to study systematic uncertainties. Primarily, we use NLO MC simulations based on the POWHEG event generator [22] where the NLO CT10 PDF model [3] is used. The generated events are interfaced with the PYTHIA (v.6.422) event generator [23] for simulating the electromagnetic final-state radiation (FSR) and the parton showering. The τ lepton decay in the W → τν process is simulated by the TAUOLA MC package [24]. We simulate the QCD background with the PYTHIA event generator where the CTEQ6L PDF model [25] is used. The CMS detector simulation is interfaced with GEANT4 [26]. All generated events are first passed through the detector simulation and then reconstructed in the same way as the collision data.
Pileup is the presence of multiple interactions recorded in the same event. For the data used in this analysis, there are an average of about 7 reconstructed primary interaction vertices for each beam crossing. The MC simulation is generated with a different pileup distribution than we observe in the data. Therefore, the MC simulation is weighted such that the mean number of interactions per crossing matches that in data, using the inelastic pp cross section measured by the CMS experiment [27].
The selection criteria for muon reconstruction and identification are described in detail in a previous report [28]. Therefore, only a brief summary is given here. A muon candidate is reconstructed using two different algorithms: one starts with a track measured by the silicon tracker and then requires a minimum number of matching hits in the muon detectors, and the other starts by finding a track in the muon system and then matching it to a track measured by the silicon tracker. Muons used in this measurement are required to be reconstructed by both algorithms. A global track fit, including both the silicon tracker hits and muon chamber hits, is performed to improve the quality of the reconstructed muon candidate. The track p T measured by the silicon tracker is used as the muon p T and the muon charge is identified from the signed curvature. Cosmic-ray contamination is reduced by requiring that the distance of the closest approach to the leading primary vertex is small: |d xy | < 0.2 cm. The remaining cosmic-ray background yield is estimated to be about 10^-5 of the expected W → µν signal, and is therefore neglected [10]. The track-based muon isolation, Iso track , is defined to be the scalar sum of the p T of additional tracks in a cone with a radius of 0.3 around the muon candidate ($R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2} < 0.3$, with ∆φ and ∆η being the differences between the muon candidate and the track in the η-φ plane). Muons are required to have Iso track /p T < 0.1. Only muons within |η| < 2.4 are included in the data sample.
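The track-based isolation described above can be sketched as follows. This is an illustration under stated assumptions, not the CMS reconstruction code: the dictionary keys (`pt`, `eta`, `phi`) and function names are hypothetical, and `tracks` is assumed to contain only the additional tracks, not the muon's own track.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Distance in the eta-phi plane, with delta-phi wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def track_isolation(muon, tracks, cone=0.3):
    """Scalar sum of the pT of the additional tracks inside the cone."""
    return sum(t["pt"] for t in tracks
               if delta_r(muon["eta"], muon["phi"], t["eta"], t["phi"]) < cone)

def passes_isolation(muon, tracks, max_rel_iso=0.1):
    """Relative isolation requirement Iso_track / pT < 0.1."""
    return track_isolation(muon, tracks) / muon["pt"] < max_rel_iso
```

The wrapping of ∆φ matters for muons near φ = ±π, where a nearby track can have a φ value of opposite sign.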
In each event, muons passing the above selection criteria are ordered according to p T , and the leading muon is selected as the W → µν candidate. The leading muon is required to be the particle that triggered the event. In addition, the muon is required to have p T > 25 GeV, which is safely above the trigger turn-on thresholds. Events that have a second muon with p T > 15 GeV are rejected to reduce the background from Drell-Yan dimuon events ("Drell-Yan veto"). The rejected events, predominantly Z/γ * → µ + µ − events, are used as a Drell-Yan control sample to study the modeling of the E / T and also to provide constraints on the modeling of the p T spectrum of W and Z bosons. In addition, this sample is used to estimate the level of background from Drell-Yan events where the second muon is not identified. The muon is corrected for a bias in the measurement of the momentum (discussed below) prior to the application of the p T selection.
The W → µν candidates that pass the above selection criteria are divided into 11 bins in absolute value of muon pseudorapidity |η|. The bin width is 0.2, except that the last three |η| bins are [1.6-1.85], [1.85-2.1], and [2.1-2.4], respectively. The muon charge asymmetry is measured in each of the |η| bins, along with the determination of the correlation matrix of the systematic uncertainties between different |η| bins.
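For illustration, the |η| binning described above (eight bins of width 0.2 up to 1.6, followed by the three wider bins) can be written down explicitly. The helper name `abs_eta_bin` is hypothetical.

```python
from bisect import bisect_right

# Bin edges in |eta|: 0.0, 0.2, ..., 1.6, then 1.85, 2.1, 2.4 (11 bins in total).
ETA_EDGES = [round(0.2 * i, 1) for i in range(9)] + [1.85, 2.1, 2.4]

def abs_eta_bin(eta):
    """Return the |eta| bin index (0-10), or None outside the acceptance."""
    a = abs(eta)
    if a >= ETA_EDGES[-1]:
        return None
    return bisect_right(ETA_EDGES, a) - 1
```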

Muon momentum correction and efficiency studies
The measured momentum of the muon depends critically on the correct alignment of the tracker system and the details of the magnetic field. Even after the alignment of the tracker detector a residual misalignment remains, which is not perfectly reproduced in the MC simulation. This misalignment leads to a charge dependent bias in the reconstruction of muon momenta, which is removed by using a muon momentum correction. The detailed description of the method for the extraction of the correction factors using Z/γ * → µ + µ − events is given in Ref. [29]. Here we provide only a short summary of the method. First, corrections to muon momentum in bins of η and φ are extracted separately for positively and negatively charged muons using the average of the 1/p T spectra of muons in Z/γ * → µ + µ − events. The mean values of the 1/p T spectra at the MC generator level, varied by the reconstruction resolution, are used as a "reference". The mean values of the reconstructed 1/p T spectra in data or simulation are tuned to match the reference. Second, the correction factors derived in the previous step are tuned further by comparing the dimuon invariant mass in each bin of muon charge Q, η, and φ to the ones at the MC generator level varied by the reconstruction resolution. The same procedure is performed for both data and reconstructed MC events, and correction factors are determined separately. The correction factors are extracted using the same η binning defined above in order to avoid correlations between different η bins.
The dataset used to derive the corrections was collected with a double-muon trigger with asymmetric p T thresholds of 17 and 8 GeV. Both muons are required to have p T > 25 GeV, which exceeds significantly the trigger p T thresholds. The simulation has been corrected for the muon efficiency difference between data and MC simulation as discussed below. We illustrate the relative size of the derived corrections using a 40 GeV muon as an example. For muons within |η| < 0.2, the corrections derived using the 1/p T spectra are less than 1.5% and 0.4% for data and MC simulation, respectively. A φ modulation of these corrections is observed. The maximum corrections are larger in high-η region, and for muons with |η| > 2.1 these corrections can be as large as 3.5% and 1.4% for data and MC simulation, respectively. The additional corrections derived using the dimuon invariant mass are smaller. For muons within the complete detector acceptance, the additional corrections are less than 0.5% and 0.2% for data and simulation, respectively. These additional corrections show no evidence of η-φ dependence and fluctuate around zero within the statistical uncertainties of the final corrections. The statistical uncertainties of the corrections for various η-φ bins are uncorrelated. By comparing the correction factors for positively and negatively charged muons in each bin, we can determine relative corrections from misalignment and from mismodeling of the magnetic field in the tracker system. The mismodeling corrections for muons with |η| > 2.1, where maximum deviations from zero are evident, are less than 0.3% and 0.4% for data and MC simulation, respectively. In contrast, in the same detector region, the corrections due to misalignment are about 4.4% and 1.7% for data and MC simulation, respectively. Hence, the bias comes predominantly from misalignment. 
Figure 1 shows the average dimuon invariant mass (mass profile) as a function of muon Q and η before and after the correction, which includes both the contributions from tracker misalignment and mismodeling of the magnetic field. The dimuon mass profiles after the correction are compared to the reference mass profile for data and MC simulation. They agree well with the reference, so the muon momentum bias is largely removed. The reference mass profile is expected to be a function of η because of the p T requirements for the two daughter muons in Z/γ * → µ + µ − decays. Values of the dimuon mass profile as a function of muon η are averaged over φ, while the muon scale corrections correct for muon momentum bias in both η and φ.
Figure 1: The dimuon mass profile as a function of muon η for µ − (a, c) and µ + (b, d), where (a) and (b) are before the correction and (c) and (d) are after the correction. The generated muon p T varied by reconstruction resolution in data is used to obtain the dimuon invariant mass of the reference.
The overall efficiency in the selection of muon candidates includes contributions from reconstruction, identification (including isolation), and trigger efficiencies. The muon reconstruction efficiency includes contributions from the reconstruction efficiency in the tracker system ("tracking") and in the muon system. The muon "offline" efficiency is the product of reconstruction and identification efficiencies. The contribution of each component to the overall efficiency (tracking, muon standalone reconstruction, identification, and trigger) is measured directly from the Z/γ * → µ + µ − events using the tag-and-probe method [1,28]. In this method one of the daughter muons is used to tag the Z/γ * → µ + µ − event and the other muon candidate is used as a probe to study the muon efficiencies as a function of Q, η, and p T . For every event a positively charged muon can be selected as the tag and a negatively charged probe candidate is used to study the efficiencies for negatively charged muons. The same procedure is repeated by selecting a negatively charged muon as the tag to study efficiencies for positively charged muons. Each individual efficiency is determined in 22 bins of muon η, as defined above, and 7 bins of p T (15-20, 20-25, 25-30, 30-35, 35-40, 40-45, and >45 GeV) for both µ + and µ − . The same procedure is applied to both data and MC simulation and scale factors are determined to match the simulation efficiencies to the data.
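A simplified counting version of the tag-and-probe efficiency and of the data/MC scale factor for a single (Q, η, p T ) bin might look as follows. This is only a sketch: it assumes background-free passing and failing probe counts and a simple binomial uncertainty, whereas a real measurement typically extracts the counts from fits to the dimuon mass distribution. All names are hypothetical.

```python
import math

def efficiency(n_pass, n_fail):
    """Probe efficiency and its binomial uncertainty for one bin."""
    n = n_pass + n_fail
    eff = n_pass / n
    err = math.sqrt(eff * (1.0 - eff) / n)
    return eff, err

def scale_factor(eff_data, eff_mc):
    """Factor applied to simulated events to match the efficiency in data."""
    return eff_data / eff_mc
```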
The measured average tracking efficiency in each η bin varies from 99.6 to 99.9% with a slight inefficiency in the transition regions from the barrel to the endcap segments and at the edge of the tracker system. The ratio of tracking efficiencies for µ + and µ − is consistent with unity within statistical uncertainty. In the transition regions from the DT to the CSC, there is evidence that the muon offline efficiency has a slight asymmetry between µ + and µ − . The ratio of efficiencies for positively and negatively charged muons differs from unity by up to 1.0 ± 0.3%. The trigger efficiency ratio is also found to differ from unity in some η regions. The maximum deviation is at η > 2.1 where the efficiency for µ + is about 2.0 ± 0.5% higher than that for µ − . Figure 2 shows the η distribution for the leading µ + and µ − in the Z/γ * → µ + µ − sample. The dimuon invariant mass is required to be within 60 < m µµ < 120 GeV. Here, the MC simulation is corrected for muon momentum bias, efficiency, and modeling of the Z-boson transverse momentum (q T ) before normalizing to the measured data. The modeling of the Z-boson q T spectrum is discussed in detail in Section 6.4.4. The η dependence in data and in MC simulation is in good agreement.
Figure 2: The η distribution of the leading µ + (a) and µ − (b) in the Z/γ * → µ + µ − sample. The dimuon invariant mass is within 60 < m µµ < 120 GeV. The MC simulation is normalized to the data. The light shaded band is the total uncertainty in predicting the Z/γ * → µ + µ − event yields using MC simulation, as described in Section 6.

Extraction of the asymmetry
The asymmetry is calculated in bins of |η| from the yields of W + and W − . In this section, we explain how the yields are obtained from the E / T distributions, and we discuss corrections to the E / T needed in the accurate estimation of the yields. Finally, we explain how backgrounds are taken into account.
The raw charge asymmetry (A raw ) is defined in terms of the numbers N W + and N W − of W + and W − signal events:
$$ A_{\text{raw}} = \frac{N_{W^+} - N_{W^-}}{N_{W^+} + N_{W^-}} \tag{2} $$
The yields N W + and N W − are obtained from simultaneous binned maximum-likelihood fits of the E / T distributions; the signal yields and the normalization of the QCD background are free parameters. The likelihood is constructed following the Barlow-Beeston method [30] to take into account the limited size of the MC signal event sample. The shapes of the E / T distributions for the W → µν signal and the background contributions are taken from MC simulations after correcting for mismodeling of the detector response and the q T distribution of the W bosons, as discussed further in Section 6.4.4 below. The pileup of each MC sample is matched to the data using an "accept-reject" technique based on the observed and simulated pileup distributions. This technique avoids a large spread of weights that would come from simply reweighting the MC events; the E / T templates are constructed using the accepted MC events.
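The text does not spell out the accept-reject procedure, so the following is only one plausible realization, given as a sketch. Each MC event is kept with a probability proportional to the data/MC ratio of the pileup distributions, normalized so that the largest probability is one; the accepted events then carry unit weight. The event key `n_pu` and the function names are assumptions made for this example.

```python
import random

def acceptance_probabilities(data_pdf, mc_pdf):
    """Per-pileup-value acceptance probability: (data/MC ratio) / (max ratio)."""
    ratios = {n: data_pdf.get(n, 0.0) / p for n, p in mc_pdf.items() if p > 0}
    r_max = max(ratios.values())
    return {n: r / r_max for n, r in ratios.items()}

def accept_reject(events, probs, rng):
    """Keep each MC event with the probability assigned to its pileup value."""
    return [ev for ev in events if rng.random() < probs.get(ev["n_pu"], 0.0)]
```

Compared with per-event reweighting, this trades MC statistics for unweighted templates, which is consistent with the motivation given above of avoiding a large spread of weights.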
A total of 12.9 million W + → µ + ν and 9.1 million W − → µ − ν candidate events are selected. The expected backgrounds from QCD, EW, and tt events are about 8%, 8%, and 0.5%, respectively. The single top-quark and diboson (WW/WZ/ZZ) production is less than 0.1% and is neglected. The variation of the background composition as a function of |η| is taken into account.
The estimate for the Drell-Yan background is based on the observed yields in a Drell-Yan control sample. The W → τν background scales with the W → µν signal using a factor determined from a MC simulation. The tt background is normalized to the NLO cross section obtained from MCFM [31][32][33]. Efficiency correction factors are applied to the simulation before determining the background normalization.
The level of the QCD background is determined by the fit. A constraint on the relative amount of QCD background in the W + and W − samples is obtained from a QCD-enriched control sample collected using a muon trigger with no isolation requirement. This constraint induces a correlation of N W + and N W − , and the resulting covariance is taken into account when evaluating the statistical uncertainty on A raw .
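To make the role of the covariance concrete, the sketch below propagates the uncertainties of the two fitted yields to an asymmetry of the form (N W + − N W − )/(N W + + N W − ) with standard first-order error propagation. The function and parameter names are illustrative; `rho` denotes the correlation coefficient between the two yields.

```python
import math

def raw_asymmetry(n_plus, n_minus):
    return (n_plus - n_minus) / (n_plus + n_minus)

def raw_asymmetry_uncertainty(n_plus, n_minus, sig_plus, sig_minus, rho):
    """First-order propagation including the covariance rho * sig_plus * sig_minus."""
    s = n_plus + n_minus
    d_plus = 2.0 * n_minus / s**2    # dA/dN+
    d_minus = -2.0 * n_plus / s**2   # dA/dN-
    cov = rho * sig_plus * sig_minus
    var = (d_plus * sig_plus) ** 2 + (d_minus * sig_minus) ** 2 \
        + 2.0 * d_plus * d_minus * cov
    return math.sqrt(var)
```

Because the two derivatives have opposite signs, a positive correlation between the yields reduces the uncertainty in the asymmetry, and a negative correlation increases it.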
In the following sections, we discuss the corrections to the E / T and then report the results of the fit to the E / T distributions.

Corrections of the missing energy measurement
The analysis depends critically on the control of the E / T distributions. Several corrections are needed to bring the simulation into agreement with the observed distributions. The E / T depends on both the measured muon kinematics and the kinematics of the hadrons recoiling against the W boson. The corrections for the calibration of the muon momentum, discussed in Section 4 above, are applied by adding the p T correction to E / T vectorially. The kinematic corrections for the so-called "hadronic recoil," which are based on the control sample of Z/γ * → µ + µ − events, are explained in detail here.
By definition, the hadronic recoil, u, is the vector sum of transverse momenta of all reconstructed particles except for the muon(s). For Z/γ * → µ + µ − events,
$$ \vec{u} = -\vec{\slashed{E}}_T - \vec{q}_T \tag{3} $$
where q T is the transverse momentum of the dimuon system and E / T ≈ 0. The components of u parallel and perpendicular to q T are u || and u ⊥ , respectively. The mean of u ⊥ , ⟨u ⊥ ⟩, is approximately zero, while the mean of u || , ⟨u || ⟩, is close to the mean of the boson q T . Differences in the distributions from data and MC are ascribed to detector effects, the simulation of jets, pileup and the underlying event, all of which should be nearly the same for Z/γ * → µ + µ − and W → µν events. The distributions of u || and u ⊥ in Z/γ * → µ + µ − events are used to derive corrections for the simulation that improve the modeling of E / T for W → µν signal events as well as for backgrounds; this technique was employed previously by the Tevatron experiments and by CMS [34][35][36]. We correct both the scale and resolution of E / T .
A comparison of the E / T distributions for Z/γ * → µ + µ − events in data and MC shows that the agreement is not perfect. Both show a small φ modulation, but the phase and amplitude of the modulation are not the same. This modulation follows from the fact that collisions, including hard interactions that produce W events as well as pileup events, do not occur exactly at the origin of the coordinate system. This modulation can be characterized by a cosine function, C cos (φ − φ 0 ). The dependence of the amplitude C and phase term φ 0 on the number of primary vertices is extracted from the Z/γ * → µ + µ − event sample by fitting a φ-dependent profile of u || − ⟨u || ⟩. The amplitude C is observed to depend linearly on the number of primary vertices, while the phase φ 0 is almost independent of pileup. The φ modulation of E / T can be removed by adding a vector in the transverse plane, $\Delta\vec{\slashed{E}}_T = C\cos\phi_0\,\hat{x} + C\sin\phi_0\,\hat{y}$, to E / T for each event.
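The per-event shift described above can be sketched as follows, taking the amplitude C to be linear in the number of primary vertices and the phase φ 0 to be constant. The parameters `slope`, `intercept`, and `phi0` stand for fitted values that are not given in the text; they and the function names are hypothetical.

```python
import math

def phi_modulation_shift(n_vtx, slope, intercept, phi0):
    """Shift vector (C cos phi0, C sin phi0) with C linear in n_vtx."""
    c = slope * n_vtx + intercept
    return c * math.cos(phi0), c * math.sin(phi0)

def correct_met_phi(met_x, met_y, n_vtx, slope, intercept, phi0):
    """Add the shift vector to the missing transverse momentum components."""
    dx, dy = phi_modulation_shift(n_vtx, slope, intercept, phi0)
    return met_x + dx, met_y + dy
```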
The dependence of ⟨u || ⟩ on the Z-boson q T should be approximately linear, and this behavior is indeed observed in both data and MC. This dependence is further studied according to the direction of the leading jet, namely, in four bins of jet |η|. The jets are formed by clustering particle-flow candidates using the anti-k T jet clustering algorithm [37] with a distance parameter of 0.5, and the muons are not included in the reconstruction of jets. The behavior of ⟨u || ⟩ with q T for MC and data agrees very well when the leading jet is in the central region of the detector. When the leading jet is in the forward direction, a modest difference is observed, amounting to less than 10% in the highest |η| bin.
The distributions of u || − ⟨u || ⟩ and u ⊥ are fit to Gaussian functions whose widths are parametrized as functions of q T . They depend strongly on the pileup, so they are also fit as functions of the number of vertices in the event. The weak dependence of ⟨u || ⟩ on the leading jet |η| is neglected. The widths of the u || − ⟨u || ⟩ and u ⊥ distributions are slightly larger in data than in MC. For example, when there are seven reconstructed vertices in the event (which corresponds to the mean number for this data set), the widths are 4-10% larger.
A test of the hadronic recoil corrections is carried out with Z/γ * → µ + µ − events. The hadronic recoil u is calculated in each MC event, and the parallel component u || is rescaled by the ratio of ⟨u || ⟩ in data and in MC. Furthermore, the smearing of u || and u ⊥ is adjusted to match the resolutions measured with the data. The E / T is recalculated according to Eq. (3). Figure 3 shows E / T and the φ of the E / T after applying the hadronic recoil corrections. The data and MC simulation are in excellent agreement, demonstrating that this empirical correction to E / T works very well for Z/γ * → µ + µ − events.
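The scale part of this correction can be sketched as below, assuming the recoil is related to the missing transverse momentum and the boson q T by u = −E / T − q T (as vectors). The recoil is decomposed along and perpendicular to q T , the parallel component is rescaled, and E / T is recomputed. The resolution smearing is omitted for brevity, and the names (`recoil_components`, `corrected_met`, `scale_par`) are hypothetical.

```python
import math

def recoil_components(met, q):
    """Decompose u = -MET - q into components parallel and perpendicular to q."""
    ux, uy = -met[0] - q[0], -met[1] - q[1]
    qt = math.hypot(q[0], q[1])
    ex, ey = q[0] / qt, q[1] / qt      # unit vector along q
    u_par = ux * ex + uy * ey
    u_perp = -ux * ey + uy * ex        # projection on (-ey, ex)
    return u_par, u_perp

def corrected_met(met, q, scale_par=1.0):
    """Rescale u_par (e.g. by a data/MC ratio) and recompute MET = -(u + q)."""
    u_par, u_perp = recoil_components(met, q)
    u_par *= scale_par
    qt = math.hypot(q[0], q[1])
    ex, ey = q[0] / qt, q[1] / qt
    ux = u_par * ex - u_perp * ey
    uy = u_par * ey + u_perp * ex
    return -(ux + q[0]), -(uy + q[1])
```

With `scale_par=1.0` the input E / T is returned unchanged, which is a convenient closure check.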
Applying the hadronic recoil correction determined in Z/γ * → µ + µ − events to other MC simulations, such as W → µν events, requires defining a variable equivalent to the boson q T in Z/γ * → µ + µ − events. In W → µν events, the hadronic recoil is defined to be
$$ \vec{u} = -\vec{\slashed{E}}_T - \vec{p}_T \tag{4} $$
where p T is the muon transverse momentum. The hadronic recoil is decomposed into u || and u ⊥ components relative to q T . The hadronic recoil correction is applied in the manner above, and E / T is recalculated. For the W → µν signal events, q T is the vector sum of the transverse momentum of the reconstructed muon and the generated neutrino. For W → τν events, the generated W-boson q T is used. For selected Drell-Yan background events, one muon is not reconstructed or not identified, so q T is calculated using the p T of the lost muon at the generator level. For the QCD background events, q T is identified with the p T of the reconstructed muon. Figure 4 shows the E / T distribution for the QCD control sample. We have selected only those events that pass a non-isolated muon trigger but that fail the isolated muon trigger. We also impose an anti-isolation selection cut: Iso track /p T > 0.1. With the application of the hadronic recoil corrections, the data and simulation are in very good agreement.
Figure 3: [...] Here, the hadronic recoil derived from the data was used to correct the MC simulation. The Z/γ * → τ + τ − + tt contribution (dark shaded region) in data is normalized to the integrated luminosity of the data sample using a MC simulation, and the normalization of the Z/γ * → µ + µ − MC simulation (light shaded region) is taken as the difference between the data and the estimated Z/γ * → τ + τ − + tt contribution. In this data sample, the Z/γ * → τ + τ − + tt contribution is negligible.

Extraction of the asymmetry from fits to the missing transverse energy
The W → µν signal yields are obtained by fitting the E / T distributions with all corrections applied. The events are selected with the default muon p T threshold of 25 GeV. The fits for W + and W − are shown in Fig. 5 for three ranges of |η|, namely, 0.0 ≤ |η| < 0.2, 1.0 ≤ |η| < 1.2, and 2.1 ≤ |η| < 2.4. The ratio of the data to the fit result is shown below each distribution, demonstrating good agreement of the fits with the data. Table 1 summarizes the fitted yields N W + and N W − , the correlation coefficient, the χ 2 value for each fit, and the raw asymmetry A raw . The χ 2 values indicate that the fits are good. The uncertainty in A raw takes the covariance of N W + and N W − into account. Corrections to A raw for potential bias are discussed in the next section.
As an important cross-check, we repeat the analysis with a higher muon p T threshold of 35 GeV. The background compositions change significantly; the QCD background is reduced to about 1%. Furthermore, the predicted asymmetry differs from that predicted for the default analysis with the 25 GeV threshold. The results are summarized in Table 1; they can be compared directly to the earlier measurement done with electrons [12].
Figure 4: The E / T distribution for µ + (a) and µ − (b) in the data sample dominated by the QCD background. The hadronic recoil derived from data has been used to correct the MC simulation. The W → µν contribution (light shaded region) is normalized to the integrated luminosity of the data sample using a MC simulation, and the normalization of the QCD simulation (dark shaded region) is taken as the difference between the data and the estimated W → µν contribution. The W → µν contribution in this data sample is negligible. The dark shaded band in each ratio plot shows the statistical uncertainty in the QCD MC E / T shape, and the light shaded band shows the total uncertainty, including the systematic uncertainties due to QCD E / T modeling as discussed in Section 6.

Systematic uncertainties and corrections
The systematic uncertainties arise from many sources, including the measurement of the muon kinematics (efficiency, scale and resolution), the modeling of the E / T distributions, backgrounds, the boson q T distribution and final-state radiation. In general, the total systematic uncertainty is 2-2.5 times larger than the statistical uncertainty (see Table 2) and the main contributions come from the muon efficiency and from the QCD background. In the sections below, we discuss each source of systematic uncertainty, starting with muon-related quantities, followed by the E / T measurement, backgrounds, and boson-related modeling issues.
We evaluate many of these uncertainties using a MC method, in which 400 sets of pseudo-data are fitted to obtain the distribution of A raw values. This method allows us to propagate the uncertainties of the corrections to the measurement in a rigorous manner.
Several sources of potential bias are considered. To evaluate the bias, we defined a "true" muon charge asymmetry, A true , calculated by taking the muon four-vectors and charge directly from the MC generator.

Muon kinematics
One source of potential bias for A raw is the charge of the muon. The rate of charge mismeasurement, w, is very small but not zero. The measured asymmetry will differ from the true asymmetry by a factor (1 − 2w) assuming that the rate of mismeasurement is the same for µ + and µ − . The muon p T resolution can induce a spread of the measured asymmetry from A true , which varies from 1.5 to 5.0% [28] as a function of |η|. The resolution of |η| is several orders of magnitude smaller than the bin widths used in this measurement; consequently, event migration around p T -η thresholds has a negligible effect on the measured asymmetry.
Figure 5: Examples of the extraction of the W → µν signal from fits to E / T distributions of W → µν candidates in data: 0.0 ≤ |η| < 0.2 (a, b), 1.0 ≤ |η| < 1.2 (c, d), and 2.1 ≤ |η| < 2.4 (e, f). The fits to W + → µ + ν and W − → µ − ν candidates are in panels (a, c, e) and (b, d, f), respectively. The ratios between the data points and the final fits are shown at the bottom of each panel. The dark shaded band in each ratio plot shows the statistical uncertainty in the shape of the MC E / T distribution, and the light shaded band shows the total uncertainty, including all systematic uncertainties as discussed in Section 6.
Table 1: Summary of the fitted N W + , N W − , the correlation between the uncertainties in N W + and N W − (ρ (N W + , N W − ) ), the χ 2 of the fit, and the extracted A raw for each |η| bin. The number of degrees of freedom (n dof ) in each fit is 197. Here, ρ (N W + , N W − ) and A raw are expressed as percentages.
The muon momentum correction affects both the yields and the shapes of the E / T distributions.
To estimate the systematic uncertainty from this source, the muon 1/p T correction parameters in each η-φ bin and the muon scale global correction parameters are varied 400 times within their uncertainties. Each time the event yields can be slightly different in both data and MC simulation, and the extraction of the asymmetry is done for each of the 400 cases. The root mean square (RMS) of the measured A raw variations in each muon |η| bin is taken as the systematic uncertainty and the bin-to-bin correlations are assumed to be zero.
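The pseudo-experiment procedure can be sketched generically as below. Gaussian variations of the correction parameters and the interpretation of the RMS as the spread around the mean of the pseudo-experiments are assumptions of this example; `measure` is a hypothetical stand-in for the full asymmetry extraction in one |η| bin.

```python
import random
import statistics

def pseudo_experiment_rms(nominal, errors, measure, n_toys=400, seed=1):
    """Vary the parameters n_toys times and return the spread of the results."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_toys):
        # Each parameter is varied independently within its uncertainty.
        varied = [rng.gauss(p, e) for p, e in zip(nominal, errors)]
        results.append(measure(varied))
    return statistics.pstdev(results)

# Toy usage: an "asymmetry" that responds linearly to a single scale parameter.
rms = pseudo_experiment_rms([0.0], [0.01], lambda p: 0.2 + p[0])
```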
The systematic uncertainties resulting from the muon momentum corrections are typically less than 40% of those from the uncertainties in the muon efficiencies (discussed below) for the p T > 25 GeV sample. However, the two uncertainties are comparable for the p T > 35 GeV sample for two reasons: first, the charge-dependent bias from the alignment increases with p T ; second, the Jacobian peak of the W → µν events is close to 35 GeV.

Muon efficiency ratio
A difference in the muon efficiencies for positively and negatively charged muons will cause the ratio of the selection efficiencies for W + and W − to differ from unity. This would bias the measured charge asymmetry and we correct the A raw for this bias.
As discussed previously, the muon offline and trigger efficiencies are measured in 7 bins in p T and 22 bins in |η| for both µ + and µ − . The offline efficiency ratio between µ + and µ − is very close to unity in most of the detector regions. However, there is evidence that this ratio deviates from unity in the transition regions between the DT and CSC detectors.
We correct for this bias using efficiencies for µ + and µ − extracted from the Z/γ * → µ + µ − data and MC samples. For each |η| bin, an average W selection efficiency ε(W±) is obtained from the expression
\[ \epsilon(W^{\pm}) = \frac{\sum \left( k\, \epsilon^{\pm}_{\text{data}}(p_T, \eta) / \epsilon^{\pm}_{\text{MC}}(p_T, \eta) \right)}{\sum \left( k / \epsilon^{\pm}_{\text{MC}}(p_T, \eta) \right)}, \]
where ε± data (p T , η) and ε± MC (p T , η) are total muon efficiencies, k are additional event-by-event weights introduced by the W-boson q T weighting described below, and the sum is over the selected W → µν events. The efficiency ratio r W+/W− = ε(W+)/ε(W−) is used to correct A raw for the efficiency bias using
\[ A^{\text{true}} = A^{\text{raw}} - \frac{1 - (A^{\text{raw}})^{2}}{2} \left( r^{W^+/W^-} - 1 \right), \]
which is an expansion to leading order in r W+/W− − 1. In addition, all MC samples are corrected for any data/MC efficiency difference.
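To illustrate the size of the terms neglected by a leading-order expansion in the efficiency ratio, the sketch below compares one form of such a correction, A − (1 − A²)(r − 1)/2, with the exact relation between the true and raw asymmetries. The function names and input values are hypothetical, and the correction formula is the one obtained by expanding the exact relation to first order in r − 1.

```python
def raw_from_true(a_true, r):
    """Exact raw asymmetry when W+ events are selected with an
    efficiency r times that of W- events."""
    n_plus = (1.0 + a_true) / 2.0   # relative true W+ yield
    n_minus = (1.0 - a_true) / 2.0  # relative true W- yield
    return (r * n_plus - n_minus) / (r * n_plus + n_minus)


def correct_leading_order(a_raw, r):
    """First-order correction in (r - 1) of the raw asymmetry."""
    return a_raw - 0.5 * (1.0 - a_raw ** 2) * (r - 1.0)


# Illustrative check: a 1% efficiency imbalance on a 20% asymmetry.
a_raw = raw_from_true(0.20, 1.01)
a_corrected = correct_leading_order(a_raw, 1.01)
residual = a_corrected - 0.20  # second order in (r - 1)
```

For efficiency ratios within a percent of unity, the residual is of order (r − 1)², well below the per-bin uncertainties quoted in this paper.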
To estimate the systematic uncertainty due to the muon efficiencies, the muon efficiency values in data and MC simulation are modified according to their errors in each p T -η bin independently and 400 pseudo-efficiency tables are generated. In each pseudo-experiment the efficiency values are used to correct the MC simulation and A raw . The raw asymmetry is further corrected for the efficiency ratio r W + /W − described above. The RMS of the resulting asymmetries in each |η| bin is taken to be the systematic uncertainty originating from the determination of the ratio of the muon efficiencies. In this study, the variations for different |η| bins are completely independent from each other, so the systematic uncertainties due to the efficiency ratio have zero correlation between different |η| bins.
As a cross-check, Fig. 6 shows a comparison of the measured muon charge asymmetry between positive and negative η regions, taken separately and then overlaid. They are in very good agreement with each other, for both muon p T thresholds.

Backgrounds
The QCD background is estimated in part from the data. Nonetheless, a nonnegligible systematic uncertainty remains. We also discuss the uncertainty from the Drell-Yan background, and from the tt̄ and W → τν backgrounds. The luminosity uncertainty enters in the estimation of these backgrounds, as discussed below.

QCD background
The total QCD background normalization is a parameter in the signal fit. The ratio of the QCD backgrounds in the W + and W − samples is fixed to the ratio observed in the QCD control region for each muon η bin. The ratios are about 1.02 for the first ten η bins and approximately 1.05 for the last η bin, and are similar for the muon p T > 25 and 35 GeV selections. There are two sources of systematic uncertainty in the QCD background. The first is related to the ratio of the backgrounds in the W + and W − samples ("QCD +/−"), and the second is related to the modeling of the shape of the E / T distribution in QCD events ("QCD shape").
To evaluate the systematic uncertainties "QCD +/−", the QCD ratio is varied by ±5 and ±15% for muon p T thresholds of 25 and 35 GeV, respectively. The resulting shifts in the A raw are taken as the uncertainties. For the last |η| bin, the variations are 10% (25 GeV) and 20% (35 GeV). These variations of the QCD ratio span the maximum range indicated by the QCD MC simulation. As an additional cross-check, we fix the QCD shape to be the same for µ + and µ − and allow the two QCD normalizations to float in the extraction of the signal. We find that the fitted values for the ratio of the QCD backgrounds for W + and W − are within the uncertainties quoted above. The bin-to-bin correlation of these uncertainties in the asymmetries is assumed to be zero.
The second source of systematic uncertainties is a difference in the shape of the QCD background for W + and W − . The QCD E / T shape is taken from the MC simulation and the recoil correction is applied as discussed in Section 5.1. Two types of variations in the shape of the QCD E / T distribution are considered. First, the shape of the QCD E / T distribution without the hadronic recoil correction is used in the extraction of the signal. This is done in a correlated way for the W + and W − samples. Second, the shape of the E / T distribution for the QCD background is varied separately for the W + and W − samples (within the statistical uncertainties) and the resulting shapes are used in the signal extraction. These two contributions to the uncertainties from the "QCD shape" are then added in quadrature. The bin-to-bin correlation of the systematic uncertainties due to each shape variation is assumed to be 100%.

Drell-Yan background
The Z/γ * → µ + µ − events in the Drell-Yan control region are used to check the Drell-Yan normalization. This is done in bins of dimuon invariant mass: 15-30, 30-40, 40-60, 60-120, 120-150, and >150 GeV. The Z/γ * → µ + µ − MC simulation in each bin is compared to the data yields after correcting the simulation for the data/simulation difference in pileup, Z-boson q T , E / T modeling, and efficiencies. After correcting for the detector bias and physics mismodeling, the MC simulation describes the data well, as shown in Fig. 2 for the dimuon invariant mass between 60 and 120 GeV. The data yield in this mass bin is about 3% higher than the predictions from the next-to-next-to-leading-order (NNLO) cross section as calculated with FEWZ 3.1 [38].
The ratios of data to MC simulation of the Z/γ * → µ + µ − event yields as a function of the dimuon mass are used to rescale the MC prediction of the Drell-Yan background. We take the shift in the A raw with and without this rescaling as the systematic uncertainty. This and the PDF uncertainties in the Z/γ * → µ + µ − yields are considered as systematic uncertainties due to "Drell-Yan background normalization". This uncertainty is almost negligible at central |η| bins and increases in the forward |η| bins. The Drell-Yan background is larger in the forward region because of the lower efficiency of the "Drell-Yan veto" due to less detector coverage. The systematic uncertainties in the Drell-Yan background are assumed to have 100% correlation from bin to bin.

The tt̄ and W → τν backgrounds
The tt̄ and Z/γ * → τ + τ − backgrounds are normalized to the integrated luminosity of the data sample after correcting for the muon efficiency difference between data and MC simulation. The uncertainty of the integrated luminosity is 2.2% [39]. The normalization of all the MC backgrounds is varied by ±2.2%, and the resulting maximum shift in A raw is taken as the systematic uncertainty due to the luminosity determination. The bin-to-bin correlations are 100%.
The tt̄ background estimate also depends on the theoretical prediction [31][32][33], to which we assign an additional 15% uncertainty. The bin-to-bin correlation is 100%.
The W → τν background is normalized to the W → µν yields in data with a ratio obtained from a MC simulation. This ratio is largely determined by the branching fraction of τ decaying to µ. A 2% uncertainty is assigned to the W → τν to W → µν ratio [40]. The correlation of this uncertainty is 100% bin-to-bin.

Modeling Uncertainties
The remaining systematic uncertainties pertain to the modeling of the detector and the signal process W → µν. We discuss first the issues concerning the E / T distribution, then FSR and finally the q T distribution.

Modeling of missing transverse momentum
To evaluate the systematic uncertainty due to the φ modulation of E / T , the correction for the φ modulation is removed and the shift in the A raw is taken as the systematic uncertainty.
The hadronic recoil correction changes the shape of the E / T distribution of all MC samples. To calculate the uncertainties resulting from this source, the average recoil and resolution parameters are varied within their uncertainties, taking into account the correlations between them. This is done 400 times; the RMS of the resulting A raw variations is taken as the systematic uncertainty, and the bin-to-bin correlations are calculated.
Pileup can affect the E / T shapes. To estimate the effect of mismodeling the pileup in the simulation, the minimum-bias cross section is varied by ±5% and the pileup distributions expected in data are regenerated. The MC simulation is then weighted to match the data, and the resulting shift in A raw is treated as a systematic uncertainty due to the pileup. Pileup affects the E / T shapes for all muon η bins in the same direction with a correlation of 100%.
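Weighting a simulation to a target pileup distribution is commonly done with per-bin ratios of normalized histograms. The sketch below assumes this standard approach (the text does not spell out the implementation) and uses hypothetical histogram contents:

```python
def pileup_weights(data_hist, mc_hist):
    """Per-bin weights that reshape the simulated pileup distribution
    into the one expected in data. Both inputs are lists of bin
    contents with identical binning; empty MC bins get weight 0."""
    data_total = sum(data_hist)
    mc_total = sum(mc_hist)
    weights = []
    for d, m in zip(data_hist, mc_hist):
        if m > 0:
            weights.append((d / data_total) / (m / mc_total))
        else:
            weights.append(0.0)
    return weights


# Hypothetical three-bin example (not analysis inputs).
weights = pileup_weights([1.0, 2.0, 1.0], [1.0, 1.0, 2.0])
```

Repeating the weight calculation with the data distribution regenerated for a shifted minimum-bias cross section, and refitting the asymmetry each time, would give the shift that is quoted as the pileup uncertainty.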

Final-State Radiation
The emission of FSR photons in W decays reduces the muon p T and can cause a difference in acceptance between W + and W − . We studied the impact of the FSR on the muon charge asymmetry using the POWHEG W → µν MC sample. In this sample, FSR is implemented using a similar approach to parton showering and is approximate at the leading order (LO). We compare the muon charge asymmetry before and after FSR, and the difference is found to be within 0.07-0.12% and 0.03-0.11% for muon p T selections of 25 and 35 GeV, respectively. The raw asymmetry values are not corrected for FSR. Instead, the full shift in the muon charge asymmetry predicted by the POWHEG MC is taken as an additional systematic uncertainty and the bin-to-bin correlation is assumed to be 100%.

PDF uncertainty
The evaluation of PDF uncertainties follows the PDF4LHC recommendation [41]. The NLO MSTW2008 [2], CT10 [3], and NNPDF2.1 [42] PDF sets are used. All simulated events are weighted to a given PDF set and the overall normalization is allowed to vary. In this way, the uncertainties in both the total cross sections and the shape of the E / T distribution are considered. To estimate the systematic uncertainty resulting from the uncertainties in the CT10 and MSTW2008 PDFs, asymmetric master equations are used [2, 3]. For CT10, the 90% confidence level (CL) uncertainty is rescaled to 68% CL by dividing by a factor of 1.64485. For the NNPDF set, the RMS of the A raw distributions is taken. The half-width of the envelope obtained by combining all three PDF uncertainty bands is taken as the PDF uncertainty. The CT10 error set is used to estimate the bin-to-bin correlations. The PDF uncertainties are about 10% of the total experimental uncertainty.
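The ingredients named above can be sketched in a few lines. This is a hedged illustration: the asymmetric master formula is written in its commonly used Hessian form, the envelope is taken as half the distance between the highest upper edge and the lowest lower edge of the bands, and all function names and numbers are hypothetical.

```python
import math

CL90_TO_CL68 = 1.64485  # rescaling factor quoted in the text


def asymmetric_hessian(central, eigenvector_pairs):
    """Asymmetric (up, down) uncertainties from pairs of observable
    values (plus, minus) evaluated on each eigenvector direction."""
    up_sq = 0.0
    down_sq = 0.0
    for plus, minus in eigenvector_pairs:
        up_sq += max(plus - central, minus - central, 0.0) ** 2
        down_sq += max(central - plus, central - minus, 0.0) ** 2
    return math.sqrt(up_sq), math.sqrt(down_sq)


def rescale_90_to_68(uncertainty):
    """Convert a 90% CL uncertainty to 68% CL."""
    return uncertainty / CL90_TO_CL68


def envelope_half_width(bands):
    """Half-width of the envelope of (central, up, down) bands."""
    upper = max(c + up for c, up, _ in bands)
    lower = min(c - down for c, _, down in bands)
    return 0.5 * (upper - lower)


# Hypothetical asymmetry values on three eigenvector pairs.
up, down = asymmetric_hessian(0.200, [(0.203, 0.198),
                                      (0.199, 0.2005),
                                      (0.201, 0.2015)])
```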

W-boson q T modeling
To improve the agreement between data and simulation, the W-boson q T spectrum is weighted using weight factors determined by the ratios of the distribution of boson q T for Z/γ * → µ + µ − events in data and MC simulation. We assume that the corrections are the same for W and Z events. This assumption is tested using two different sets of MC simulations: one from the POWHEG event generator and the other from MADGRAPH [43]. Here, the MADGRAPH simulation is treated as the "data", and the ratio of Z-boson q T of the MADGRAPH and POWHEG simulations is compared to the same ratio in simulated W-boson events. This double ratio is parametrized using an empirical function to smooth the statistical fluctuations, and additional weights are obtained using the fitted function. We weight the POWHEG simulation to be close to the MADGRAPH simulation and measure the asymmetry again. The deviation of A raw is taken as the systematic uncertainty due to mismodeling of W-boson q T . The default boson q T weighting is based on the POWHEG simulation.
Table 2 summarizes the systematic uncertainties in all |η| bins. For comparison, the statistical uncertainty in each |η| bin is also shown. The dominant systematic uncertainties come from muon efficiencies, QCD background, and the muon momentum correction. The correlation matrix of systematic uncertainty among |η| bins is reported in Table 3. The correlations among |η| bins are small and do not exceed 37 and 14% for muon p T thresholds of 25 and 35 GeV, respectively. Much of the correlation is due to the systematic uncertainties in FSR and QCD background. The total covariance matrix, including both statistical and systematic uncertainties, is provided in the Appendix.
Table 2: Systematic uncertainties in A for each |η| bin. The statistical uncertainty in each |η| bin is also shown for comparison. A detailed description of each systematic uncertainty is given in the text. The values are expressed as percentages, the same as for the asymmetries.
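The bin-to-bin correlation assumptions quoted for the individual sources (0% or 100%) translate directly into a covariance matrix. The sketch below is a simplified illustration with hypothetical inputs; sources whose correlations are computed rather than assumed (such as the recoil and PDF uncertainties) would need their own matrices and are not covered.

```python
def build_covariance(sources):
    """Sum the covariance contributions of several sources.

    sources: list of (sigmas, fully_correlated) tuples, where sigmas
    holds the per-bin uncertainty of one source and fully_correlated
    selects 100% (True) or 0% (False) bin-to-bin correlation.
    """
    n_bins = len(sources[0][0])
    cov = [[0.0] * n_bins for _ in range(n_bins)]
    for sigmas, fully_correlated in sources:
        for i in range(n_bins):
            for j in range(n_bins):
                if i == j or fully_correlated:
                    cov[i][j] += sigmas[i] * sigmas[j]
    return cov


def to_correlation(cov):
    """Convert a covariance matrix to a correlation matrix."""
    n = len(cov)
    return [[cov[i][j] / (cov[i][i] * cov[j][j]) ** 0.5
             for j in range(n)] for i in range(n)]


# Hypothetical two-bin example: one uncorrelated and one fully
# correlated source (values in percent, not taken from Table 2).
cov = build_covariance([([0.10, 0.20], False), ([0.05, 0.05], True)])
corr = to_correlation(cov)
```

In this toy case the off-diagonal correlation is about 11%, showing how a small fully correlated source produces only a modest total correlation when uncorrelated sources dominate.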

Results and discussion
The measured asymmetries A, after all the corrections, are shown in Fig. 7 as a function of muon |η| and summarized in Table 4. In Fig. 7 both statistical and systematic uncertainties are included in the error bars. These asymmetries are compared to predictions based on several PDF sets, which are also listed in Table 4. We cross-check the theoretical predictions using the DYNNLO 1.0 [46,47] MC tool and the agreement between FEWZ 3.1 and DYNNLO 1.0 is within 1%. The predictions using the CT10 and HERAPDF1.5 PDF sets are in good agreement with the data. The predictions using the NNPDF2.3 PDF set (which include the previous CMS electron charge asymmetry result and other LHC experimental measurements [44]) are also in good agreement with the data. The predictions using the MSTW2008 PDF set are not in agreement with the data, as seen in our previous analyses [10, 12]. The more recent MSTW2008CPdeut PDF set is a variant of the MSTW2008 PDF set with a more flexible input parametrization and deuteron corrections [15]. This modification has significantly improved the agreement with the CMS data, even though this PDF set does not include LHC data, as shown in Fig. 7.
Since the per-bin total experimental uncertainties are significantly smaller than the uncertainty in the current PDF parametrizations, this measurement can be used to constrain PDFs in the next generation of PDF sets.
Table 4: Summary of the measured asymmetries A and the theoretical predictions obtained with the CT10, NNPDF2.3 [44], HERAPDF1.5 [45], and MSTW2008CPdeut [15] PDF sets. The PDF uncertainty is at 68% CL. For each |η| bin, the theoretical prediction is calculated using the averaged differential cross sections for positively and negatively charged leptons. The numerical uncertainty of the theoretical predictions is less than 10% of the statistical uncertainties of the measurements. The values are expressed as percentages.
Columns: |η|; A (± stat ± syst); CT10; NNPDF2.3; HERAPDF1.5; MSTW2008CPdeut.
Figure 9 shows a comparison of this result to the previous CMS electron charge asymmetry measurement extracted from part of the 2011 CMS data [12]. The electron charge asymmetry has been measured with a slightly different η binning because of the different subdetector geometry in the calorimeter and the muon system. We have calculated the bin-by-bin differences between these two measurements using the first seven η bins, where identical bin definitions are used, and the differences are fitted with a constant. The fitted constant is larger than zero by about 1.7 standard deviations, and the muon channel exhibits a slightly higher asymmetry in these seven η bins than the electron channel. The electron charge asymmetry uses a statistically independent data sample. A combination of both results can be used to improve the global PDF fits. The correlation between the electron charge asymmetry and this result is expected to be small. The completely correlated systematic sources of uncertainty include the luminosity measurement, tt̄ background, W → τν background, and PDF uncertainty.
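Fitting a set of bin-by-bin differences with a constant, as described above for the muon and electron channels, reduces to an inverse-variance weighted mean when the uncertainties are treated as uncorrelated. The sketch below makes that simplifying assumption, and the input numbers are purely hypothetical rather than the measured differences.

```python
import math


def fit_constant(values, sigmas):
    """Least-squares fit of a constant to independent measurements.

    Returns (constant, uncertainty, chi2)."""
    weights = [1.0 / (s * s) for s in sigmas]
    total_weight = sum(weights)
    constant = sum(w * v for w, v in zip(weights, values)) / total_weight
    uncertainty = 1.0 / math.sqrt(total_weight)
    chi2 = sum(((v - constant) / s) ** 2 for v, s in zip(values, sigmas))
    return constant, uncertainty, chi2


# Hypothetical differences (in percent) in seven bins, each with an
# assumed uncertainty of 0.3; not the values from the two analyses.
diffs = [0.2, -0.1, 0.3, 0.1, 0.0, 0.4, 0.2]
errs = [0.3] * 7
constant, uncertainty, chi2 = fit_constant(diffs, errs)
significance = constant / uncertainty  # deviation from zero in sigma
```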
The theoretical predictions for the lepton charge asymmetry are given for the kinematic region specified by the lepton p T threshold. The p T distribution of the W boson affects the acceptance, and hence, the predicted charge asymmetry. However, the effect on W + and W − is largely correlated. Therefore, the impact on the lepton charge asymmetry measurement mostly cancels. Figure 10 shows the comparison of these results to the NLO CT10 PDF predictions based on FEWZ 3.1 and RESBOS [48][49][50][51]. RESBOS performs a resummation in boson q T at NLO (and approximate NNLO) plus next-to-next-to-leading logarithm, which yields a more realistic description of boson q T than a fixed-order calculation such as FEWZ 3.1. The difference between the FEWZ 3.1 and RESBOS predictions is negligible and our measurement, however precise, is not sensitive to the difference.
Figure 10: Comparison of the measured muon charge asymmetry to theoretical predictions based on the FEWZ 3.1 [38] and RESBOS [48][49][50][51] tools. The NLO CT10 PDF is used in both predictions. Results are shown for muon p T > 25 (a) and muon p T > 35 GeV (b). The CT10 PDF uncertainty is shown by the shaded bands.

The QCD analysis of HERA and CMS results of W-boson production
The main objective of the QCD analysis presented in this section is to exploit the constraining power and the interplay of the muon charge asymmetry measurements, presented in this paper, and the recent measurements of W + charm production at CMS [18] to determine the PDFs of the proton. These two data sets, together with the combined HERA inclusive cross section measurements [19], are used in an NLO perturbative QCD (pQCD) analysis.
Renormalization group equations, formulated in terms of DGLAP evolution equations [52][53][54][55][56][57], predict the dependence of the PDFs on the energy scale Q of the process in pQCD. The dependence on the partonic fraction x of the proton momentum cannot be derived from first principles and must be constrained by experimental measurements. Deep inelastic lepton-proton scattering (DIS) experiments cover a broad range of the (x, Q 2 ) kinematic plane. The region of small and intermediate x is probed primarily by the precise data of HERA, which impose the tightest constraints on the existing PDFs. However, some details of flavor composition, in particular the light-sea-quark content and the strange-quark distribution of the proton, are still poorly known. Measurements of the W- and Z-boson production cross sections in proton-(anti)proton collisions are sensitive to the light-quark distributions, and the constraining power of the W-boson measurements is applied in this analysis.
The muon charge asymmetry measurements probe the valence-quark distribution in the kinematic range 10 −3 ≤ x ≤ 10 −1 and have indirect sensitivity to the strange-quark distribution. The measurements of the total and differential cross sections of W + charm production have the potential to access the strange-quark distribution directly through the LO process g + s → W + c. This reaction was proposed as a way to determine the strange-quark and antiquark distributions [58][59][60].
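The quoted x range can be made plausible with leading-order kinematics, in which a boson of mass M produced at rapidity y in collisions at center-of-mass energy √s is formed from partons with momentum fractions x = (M/√s) e^{±y}. The sketch below uses this standard relation; the W mass value and the choice of |y| = 2.4 as an illustrative upper edge (borrowed from the muon |η| coverage, although the muon pseudorapidity is not the boson rapidity) are assumptions.

```python
import math

M_W = 80.4       # GeV, approximate W-boson mass (assumed input)
SQRT_S = 7000.0  # GeV, pp center-of-mass energy


def momentum_fractions(y, mass=M_W, sqrt_s=SQRT_S):
    """Leading-order parton momentum fractions (x1, x2) for a boson
    of the given mass produced at rapidity y."""
    x0 = mass / sqrt_s
    return x0 * math.exp(y), x0 * math.exp(-y)


x_central = momentum_fractions(0.0)      # both fractions near 0.0115
x_forward = momentum_fractions(2.4)      # about 0.13 and 0.001
```

Central production therefore probes x of order 10^-2, while the forward bins reach down to about 10^-3 for one parton and up to about 10^-1 for the other.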
Before the LHC era, constraints on the strange-quark distribution were obtained from semi-inclusive charged-current scattering at the NuTeV [61,62] and CCFR [63] experiments. Dimuon production in neutrino-nucleus reactions is sensitive to strangeness at LO in QCD in reactions such as W + + s → c. These measurements probe the (anti)strange-quark density at x ≈ 10 −1 and Q 2 of approximately 10 GeV 2 , but their interpretation is complicated by nuclear corrections and uncertainties in the charm-quark fragmentation function. The NOMAD Collaboration reported a recent determination of the strange-quark suppression factor
\[ \kappa_s(Q^2) = \frac{\int_0^1 x \left[ s(x, Q^2) + \bar{s}(x, Q^2) \right] dx}{\int_0^1 x \left[ \bar{u}(x, Q^2) + \bar{d}(x, Q^2) \right] dx}, \]
where the value κ s (Q 2 = 20 GeV 2 ) = 0.591 ± 0.019 is determined at NNLO by using dimuon production [64]. The measurements of semi-inclusive hadron production on a deuteron target at HERMES [65] have been recently reevaluated [66] to obtain the x dependence of the strange-quark distribution at LO at an average Q 2 = 2.5 GeV 2 . In that analysis the strange-quark distribution is found to vanish above x = 0.1, but this result depends strongly on the assumptions of the kaon fragmentation function.
In a recent analysis by the ATLAS Collaboration [67], the inclusive cross section measurements of W- and Z-boson production were used in conjunction with DIS inclusive data from HERA. The result supports the hypothesis of a symmetric composition of the light-quark sea in the kinematic region probed, i.e., s̄ = d̄.
The LHC measurements of associated production of W bosons and charm quarks probe the strange-quark distribution in the kinematic region of x ≈ 0.012 at the scale Q 2 = m 2 W . The cross sections for this process were recently measured by the CMS Collaboration [18] at a center-of-mass energy √ s = 7 TeV with a total integrated luminosity of 5 fb −1 . The results of the QCD analysis presented here use the absolute differential cross sections of W + charm production, measured in bins of the pseudorapidity of the lepton from the W decay, for transverse momenta larger than 35 GeV.

Details of the QCD analysis
The NLO QCD analysis is based on the inclusive DIS data [19] from HERA, measurements of the muon charge asymmetry in W production for p T > 25 GeV, and measurements of associated W + charm production [18]. The treatment of experimental uncertainties for the HERA data follows the prescription of HERAPDF1.0 [19]. The correlations of the experimental uncertainties for the muon charge asymmetry and W + charm data are taken into account.
The theory predictions for the muon charge asymmetry and W + charm production are calculated at NLO by using the MCFM program [31,32], which is interfaced to APPLGRID [68].
The open source QCD fit framework for PDF determination HERAFITTER [19,69,70] is used and the partons are evolved by using the QCDNUM program [71]. The TR' [2, 72] general mass variable flavor number scheme is used for the treatment of heavy-quark contributions with the following conditions: (i) heavy-quark masses are chosen as m c = 1.4 GeV and m b = 4.75 GeV, (ii) renormalization and factorization scales are set to µ r = µ f = Q, and (iii) the strong coupling constant is set to α s (m Z ) = 0.1176.
The Q 2 range of HERA data is restricted to Q 2 ≥ Q 2 min = 3.5 GeV 2 to assure the applicability of pQCD over the kinematic range of the fit. The procedure for the determination of the PDFs follows the approach used in the HERAPDF1.0 QCD fit [19].
The following independent combinations of parton distributions are chosen in the fit procedure at the initial scale of the QCD evolution Q 2 0 = 1.9 GeV 2 : xu v (x), xd v (x), xg(x) and xŪ(x), xD̄(x), where xŪ(x) = xū(x) and xD̄(x) = xd̄(x) + xs̄(x). At Q 0 , the parton distributions are represented by the functional forms of Eqs. (8-12), which are based on the generic shape $x f(x) = A x^{B} (1 - x)^{C}$. The normalization parameters A u v , A d v , A g are determined by the QCD sum rules, the B parameter is responsible for the small-x behavior of the PDFs, and the parameter C describes the shape of the distribution as x → 1. A flexible form for the gluon distribution is adopted here, where the choice of C g = 25 is motivated by the approach of the MSTW group [2, 72].
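How a sum rule fixes a normalization parameter can be shown for the simplest shape, xf(x) = A x^B (1 − x)^C, for which the quark-number integral ∫ f(x) dx equals A times the Euler beta function B(B, C + 1). The sketch below assumes this bare three-parameter form (the forms actually used in the fit may contain additional terms) and uses hypothetical parameter values.

```python
import math


def xf(x, a, b, c):
    """Momentum density x*f(x) = A x^B (1 - x)^C (assumed bare form)."""
    return a * x ** b * (1.0 - x) ** c


def euler_beta(p, q):
    """Euler beta function B(p, q) via gamma functions."""
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)


def valence_normalization(n_valence, b, c):
    """Normalization A such that the number integral of f(x) over
    0 < x < 1 equals n_valence (2 for u_v, 1 for d_v)."""
    # integral of A x^(B-1) (1-x)^C dx = A * B(B, C+1)
    return n_valence / euler_beta(b, c + 1.0)


# Hypothetical shape parameters, for illustration only.
a_uv = valence_normalization(2, 0.7, 4.0)
a_dv = valence_normalization(1, 0.7, 4.0)
```

With B and C left free in the fit, the valence normalizations follow from the number sum rules in this way, and the gluon normalization follows analogously from the momentum sum rule, which is why those three parameters are not counted as free.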
Two types of analyses are made. The first is denoted as "fixed-s fit" and is performed by fitting 13 parameters in Eqs. (8-12) to analyze the impact of the muon charge asymmetry measurements on the valence-quark distributions. Additional constraints B Ū = B D̄ and A Ū = A D̄ (1 − f s ) are imposed with f s being the strangeness fraction, f s = s̄/(d̄ + s̄), which is fixed to f s = 0.31 ± 0.08 as in Ref. [2].
The second analysis is denoted as "free-s fit", in which the interplay between the muon charge asymmetry measurements and W + charm production data is analyzed. The strange-quark distribution is determined by fitting 15 parameters in Eqs. (8-12). Here, instead of Eq. (11), d̄ and s̄ are fitted separately by using the functional forms
\[ x\bar{d}(x) = A_{\bar{d}}\, x^{B_{\bar{d}}} (1 - x)^{C_{\bar{d}}}, \qquad x\bar{s}(x) = A_{\bar{s}}\, x^{B_{\bar{s}}} (1 - x)^{C_{\bar{s}}}. \]
Additional constraints A ū = A d̄ and B ū = B d̄ are applied to ensure the same normalization for ū and d̄ densities at x → 0. The strange-antiquark parameter B s̄ is set equal to B d̄ , while A s̄ and C s̄ are treated as free parameters of the fit, assuming xs = xs̄. This parametrization cannot be applied to HERA DIS data alone, because those data do not have sufficient sensitivity to the strange-quark distribution.

The PDF uncertainties
The PDF uncertainties are estimated according to the general approach of HERAPDF1.0 [19] in which experimental, model, and parametrization uncertainties are taken into account. A tolerance criterion of ∆χ 2 = 1 is adopted for defining the experimental uncertainties that originate from the measurements included in the analysis. Model uncertainties arise from the variations in the values assumed for the heavy-quark masses m b , m c with 4.3 ≤ m b ≤ 5 GeV, 1.35 ≤ m c ≤ 1.65 GeV, and the value of Q 2 min imposed on the HERA data, which is varied in the interval 2.5 ≤ Q 2 min ≤ 5.0 GeV 2 . The parametrization uncertainty is estimated similarly to the HERAPDF1.0 procedure: for all parton densities, additional parameters are added one by one in the functional form of the parametrizations such that Eqs. (8-11) are generalized to $A x^{B} (1 - x)^{C} (1 + Dx)$ or $A x^{B} (1 - x)^{C} (1 + Dx + Ex^{2})$. In the free-s fit, in addition, the parameters B s̄ and B d̄ are decoupled. Furthermore, the starting scale is varied within 1.5 ≤ Q 2 0 ≤ 2.5 GeV 2 . The parametrization uncertainty is constructed as an envelope built from the maximal differences between the PDFs resulting from all the parametrization variations and the central fit at each x value. The total PDF uncertainty is obtained by adding experimental, model, and parametrization uncertainties in quadrature. In the following, the quoted uncertainties correspond to 68% CL.

Results of the QCD analysis
The muon charge asymmetry measurements, together with HERA DIS cross section data, improve the precision of the valence-quark distributions over the entire x range in the fixed-s fit. This is illustrated in Fig. 11, where the u and d valence-quark distributions are shown at the scale relevant for the W-boson production, Q 2 = m 2 W . The results at Q 2 = 1.9 GeV 2 can be found in the supplemental material. A change in the shapes of the light-quark distributions within the total uncertainties is observed. The details of the effect on the experimental PDF uncertainty of the u valence, d valence, and d/u distributions are also given in the supplemental material.
In the next step of the analysis, the CMS W + charm measurements are used together with the HERA DIS data and the CMS muon charge asymmetry. Since both CMS W-boson production measurements are sensitive to the strange-quark distribution, a free-s fit can be performed. Including these two CMS data sets in the 15-parameter fit is advantageous because the d-quark distribution is significantly constrained by the muon charge asymmetry data, while the strange-quark distribution is directly probed by the associated W + charm production measurements. In the free-s fit, the strange-quark distribution s(x, Q 2 ) and the strange-quark fraction R s (x, Q 2 ) = (s + s̄)/(ū + d̄) are determined. The global and partial χ 2 values for each data set are listed in Table 5, where the χ 2 values illustrate a general agreement among all the data sets.
Figure 13: Antistrange-quark distribution s̄(x, Q) and the ratio R s (x, Q), obtained in the QCD analysis of HERA and CMS data, shown as functions of x at the scale Q 2 = 1.9 GeV 2 (left) and Q 2 = m 2 W (right). The full band represents the total uncertainty. The individual contributions from the experimental, model, and parametrization uncertainties are represented by the bands of different shades.
Table 5: Global χ 2 /n dof and partial χ 2 per number of data points n dp for the data sets used in the 15-parameter QCD analysis.

Columns: Data sets; Global χ 2 /n dof ; Partial χ 2 /n dp.
In Fig. 12, the resulting NLO parton distributions are presented at Q 2 0 = 1.9 GeV 2 and Q 2 = m 2 W . The strange-quark distribution s(x, Q 2 ) and the ratio R s (x, Q 2 ) are illustrated in Fig. 13 at the same values of Q as in Fig. 12. The total uncertainty in Fig. 12 is dominated by the parametrization uncertainty, in which most of the expansion in the envelope is caused by decoupling the parameters B s̄ and B d̄ . The strange-quark fraction rises with energy and reaches a value comparable to that of u and d antiquarks at intermediate to low x. Also, a suppression of R s at large x is observed, which scales differently with the energy. This result is consistent with the prediction provided by the ATLAS Collaboration [67], where inclusive W- and Z-boson production measurements were used to determine r s = 0.5(s + s̄)/d̄. In Ref. [67], the NLO value r s = 1.03 ± 0.19 (exp) is quoted at x = 0.023 and Q 2 = 1.9 GeV 2 . In the framework used, the two definitions of the strange-quark fraction are very similar at the starting scale Q 2 0 and the values R s and r s can be directly compared. In the free-s fit, the strangeness suppression factor is determined at Q 2 = 20 GeV 2 to be $\kappa_s = 0.52\,^{+0.12}_{-0.10}\,\text{(exp.)}\,^{+0.05}_{-0.06}\,\text{(model)}\,^{+0.13}_{-0.10}\,\text{(parametrization)}$, which is in agreement with the value [64] obtained by the NOMAD experiment at NNLO.
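As a worked check of the agreement between the two κ s values, the three uncertainty components quoted above can be added in quadrature, following the prescription given earlier for the total PDF uncertainty. Applying that prescription to κ s, and comparing with the NOMAD value through a simple pull, are assumptions made here for illustration only.

```python
import math


def combine_in_quadrature(components):
    """Combine asymmetric (plus, minus) uncertainty components by
    adding the upper and lower parts separately in quadrature."""
    plus = math.sqrt(sum(p * p for p, _ in components))
    minus = math.sqrt(sum(m * m for _, m in components))
    return plus, minus


# Values quoted in the text: (exp.), (model), (parametrization).
KAPPA_CMS = 0.52
KAPPA_NOMAD, SIGMA_NOMAD = 0.591, 0.019
plus, minus = combine_in_quadrature([(0.12, 0.10),
                                     (0.05, 0.06),
                                     (0.13, 0.10)])

# The NOMAD value lies above the fit result, so the upper
# uncertainty of the fit is the relevant one for the comparison.
pull = (KAPPA_NOMAD - KAPPA_CMS) / math.sqrt(plus ** 2 + SIGMA_NOMAD ** 2)
```

Under these assumptions the total uncertainty is roughly +0.18 and −0.15, and the two determinations differ by about 0.4 standard deviations.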
The impact of the measurement of differential cross sections of W + charm production on the strange-quark distribution and strangeness fraction R s is also examined by using the Bayesian reweighting [13,14] technique. The results qualitatively support the main conclusions of the current NLO QCD analysis. Details can be found in supplemental material.

Summary
The W → µν lepton charge asymmetry is measured in pp collisions at √ s = 7 TeV using a data sample corresponding to an integrated luminosity of 4.7 fb −1 collected with the CMS detector at the LHC (a sample of more than 20 million W → µν events). The asymmetry is measured in 11 bins in absolute muon pseudorapidity, |η|, for two different muon p T thresholds, 25 and 35 GeV. Compared to the previous CMS measurement, this measurement significantly reduces both the statistical and the systematic uncertainties. The total uncertainty per bin is 0.2-0.4%. The data are in good agreement with the theoretical predictions using CT10, NNPDF2.3, and HERAPDF1.5 PDF sets. The data are in poor agreement with the prediction based on the MSTW2008 PDF set, although the agreement is significantly improved when using the MSTW2008CPdeut PDF set. The experimental uncertainties are smaller than the current PDF uncertainties in the present QCD calculations. Therefore, this measurement can be used to significantly improve the determination of PDFs in future fits.
This precise measurement of the W → µν lepton charge asymmetry and the recent CMS measurement of associated W + charm production are used together with the cross sections for inclusive deep inelastic scattering at HERA in an NLO QCD analysis of the proton structure. The muon charge asymmetry in W-boson production imposes strong constraints on the valence-quark distributions, while the W + charm process is directly sensitive to the strange-quark distribution.
for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC and the CMS detector provided by the following funding agencies:
[9] ATLAS Collaboration, "Measurement of the inclusive W ± and Z/γ cross sections in the e and µ decay channels in pp collisions at √ s = 7 TeV with the ATLAS detector", Phys. Rev. D 85 (2012) 072004, doi:10.1103/PhysRevD.85.072004, arXiv:1109.5141.
[11] LHCb Collaboration, "Inclusive W and Z production in the forward region at √ s = 7 TeV", JHEP 06 (2012)
Figure 20: Ratio R s (x, Q 2 ) resulting from the NLO QCD analysis of HERA and CMS data, presented as a function of x at the scale Q 2 = m 2 W . The light shaded band represents the total PDF uncertainty of the CMS result. For comparison, the results of Bayesian reweighting using HERA I inclusive DIS data and the CMS measurement of W + charm production are shown (dark shaded band). The reweighting results based on the data used in the global NNPDF2.3 fit and the CMS W + charm production are represented by a hatched band.
Figure: Ratio R s (x, Q 2 ), obtained by using Bayesian reweighting, shown as a function of x at the scale Q 2 = m 2 W . The dark shaded band represents the result based on the HERA I DIS and CMS W + charm data. The results of the reweighting obtained by using the CMS W + charm measurements in addition to collider-only data, and in addition to the data used in the global NNPDF2.3 analysis, are illustrated by bands of different hatches.