Measurement of the Transverse Momentum Distribution of W Bosons in pp Collisions at √ s = 7 TeV with the ATLAS Detector

This paper describes a measurement of the W boson transverse momentum distribution using ATLAS pp collision data from the 2010 run of the LHC at $\sqrt{s} = 7$ TeV, corresponding to an integrated luminosity of about 31 pb$^{-1}$. Events from both $W \to e\nu$ and $W \to \mu\nu$ are used, and the transverse momentum of the W candidates is measured through the energy deposition in the calorimeter from the recoil of the W. The resulting distributions are unfolded to obtain the normalized differential cross sections as a function of the W boson transverse momentum. We present results for $p_T^W < 300$ GeV in the electron and muon channels as well as for their combination, and compare the combined results to the predictions of perturbative QCD and a selection of event generators.

G. Aad et al. (ATLAS Collaboration) (Received 31 August 2011; published 18 January 2012) DOI: 10.1103/PhysRevD.85.012005 PACS numbers: 12.38.Qk, 13.85.Qk, 14.70.Fm

I. INTRODUCTION
At hadron colliders, W and Z bosons are produced with nonzero momentum transverse to the beam direction due to parton radiation from the initial state. Measuring the transverse momentum ($p_T$) distributions of W and Z bosons at the LHC provides a useful test of QCD calculations, because different types of calculations are expected to produce the most accurate predictions for the low-$p_T$ and high-$p_T$ parts of the spectrum. This measurement complements studies which constrain the proton parton distribution functions (PDFs), such as the W lepton charge asymmetry in pp collisions [1], because the dynamics which generate transverse momentum in the W do not depend strongly on the distribution of the proton momentum among the partons. The W $p_T$ is reconstructed in $W \to \ell\nu$ events (where $\ell = e$ or $\mu$ in this paper). Because of the neutrino in the final state, the W $p_T$ must be reconstructed through the hadronic recoil, which is the energy observed in the calorimeter excluding the lepton signature. This measurement is therefore also complementary to measurements of the Z $p_T$, which is measured using $Z \to \ell\ell$ events in which the Z $p_T$ is reconstructed via the momentum of the lepton pair [2]. Although the underlying dynamics being tested are similar, the uncertainties on the W and Z measurements are different and mostly uncorrelated. The transverse energy resolution of the hadronic recoil is not as good as the resolution on the lepton momenta, but approximately 10 times as many candidate events are available ($(\sigma_W \cdot \mathrm{BR}(W \to \ell\nu))/(\sigma_Z \cdot \mathrm{BR}(Z \to \ell\ell)) = 10.840 \pm 0.054$ [3]). Testing the modeling of the hadronic recoil through the W $p_T$ distribution is also an important input to precision measurements using the $W \to \ell\nu$ sample, including especially the W mass measurement.
In this paper, we describe a measurement of the transverse momentum distribution of W bosons using ATLAS data from pp collisions at $\sqrt{s} = 7$ TeV at the LHC [4], corresponding to about 31 pb$^{-1}$ of integrated luminosity. The measurement is performed in both the electron and muon channels, and the reconstructed W $p_T$ distribution, following background subtraction, is unfolded to the true $p_T$ distribution. Throughout this paper, $p_T^R$ is used to refer to the reconstructed W $p_T$ and $p_T^W$ is used to refer to the true W $p_T$. The true W $p_T$ may be defined in three ways. The default in this paper is the $p_T$ that appears in the W boson propagator at the Born level, since this definition of $p_T^W$ is independent of the lepton flavor and the electron and muon measurements can be combined. It is also possible to define $p_T^W$ in terms of the true lepton kinematics, with ("dressed") or without ("bare") the inclusion of QED final state radiation (FSR). These define a physical final state more readily identified with the detected particles, so we give results for these definitions of $p_T^W$ for the electron and muon channels. For all three definitions of $p_T^W$, photons radiated by the W via the $WW\gamma$ triple gauge coupling vertex are treated identically to those radiated by a charged lepton.
The unfolding proceeds in two steps. First, a Bayesian technique is used to unfold the reconstructed distribution ($p_T^R$) to the true distribution ($p_T^W$) for selected events, taking into account bin-to-bin migration effects via a response matrix describing the probabilistic mapping from $p_T^W$ to $p_T^R$. This step corrects for the hadronic recoil resolution. Second, the resulting distribution is divided in each bin by the detection efficiency, defined as the ratio of the number of events reconstructed to the number produced in the phase space consistent with the event selection. This converts the $p_T^W$ distribution for selected events into the $p_T^W$ distribution for all W events produced in the fiducial volume, which is defined by $p_T^\ell > 20$ GeV, $|\eta_\ell| < 2.4$, $p_T^\nu > 25$ GeV, and transverse mass $m_T = \sqrt{2 p_T^\ell p_T^\nu (1 - \cos(\phi_\ell - \phi_\nu))} > 40$ GeV [5], where the thresholds are defined in terms of the true lepton kinematics.
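As a rough illustration of the two-step procedure described above, the following Python sketch implements a generic iterative Bayesian unfolding followed by a bin-by-bin efficiency correction. It is not the analysis code: the function names, the two-bin toy response matrix, the flat prior, the efficiencies, and the number of iterations are all assumptions made for illustration, and the response matrix is assumed to be normalized so that each true-bin column sums to one.

```python
import numpy as np

def bayesian_unfold(measured, response, prior, n_iter=4):
    """Iterative Bayesian unfolding (illustrative sketch, not the analysis code).

    measured : background-subtracted reconstructed spectrum, one entry per p_T^R bin
    response : response[r, t] = P(reconstructed in bin r | true value in bin t);
               ASSUMED to have columns summing to 1 (selected events only)
    prior    : starting guess for the true spectrum, one entry per p_T^W bin
    """
    measured = np.asarray(measured, dtype=float)
    response = np.asarray(response, dtype=float)
    truth = np.asarray(prior, dtype=float).copy()
    for _ in range(n_iter):
        # Expected reconstructed spectrum for the current guess of the truth
        folded = response @ truth
        ratio = np.divide(measured, folded,
                          out=np.zeros_like(measured), where=folded > 0)
        # Bayes' theorem: redistribute each reconstructed bin over the true bins
        truth = truth * (response.T @ ratio)
    return truth

def efficiency_correct(unfolded, efficiency):
    """Second step: divide each true bin by its detection efficiency."""
    return np.asarray(unfolded, dtype=float) / np.asarray(efficiency, dtype=float)

# Toy example with two bins (all numbers hypothetical)
response = np.array([[0.8, 0.3],
                     [0.2, 0.7]])
true_selected = np.array([100.0, 50.0])
measured = response @ true_selected  # what the detector would record on average
unfolded = bayesian_unfold(measured, response, prior=[75.0, 75.0], n_iter=100)
fiducial = efficiency_correct(unfolded, efficiency=[0.5, 0.4])
```

In the toy example the input has no statistical fluctuations, so many iterations simply converge to the true spectrum. With real data the number of iterations acts as a regularization parameter, and a small value is normally chosen to avoid amplifying fluctuations.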
The unfolding results in the differential fiducial cross section $d\sigma_{\mathrm{fid}}/dp_T^W$, in which the subscript in $\sigma_{\mathrm{fid}}$ indicates that the cross section measured is the one for events produced within the phase space defined above. The electron and muon differential cross sections are combined into a single measurement via $\chi^2$ minimization, using a covariance matrix describing all uncertainties and taking into account the correlations between the measurement channels as well as across the $p_T^W$ bins. The resulting differential cross section is normalized to the total measured fiducial cross section, which results in the cancellation of some uncertainties, and compared to predictions from different event generators and perturbative QCD (pQCD) calculations.
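A combination of two channels by $\chi^2$ minimization with a full covariance matrix can be sketched as a generalized least-squares fit. The Python example below is only a minimal illustration under that assumption; the function name `combine_channels`, the stacked ordering of the inputs, and the one-bin toy numbers are hypothetical, and the actual combination procedure is the one described in Sec. IX.

```python
import numpy as np

def combine_channels(y_e, y_mu, cov):
    """Least-squares combination of two channels measuring the same bins.

    y_e, y_mu : measured values per bin in the electron and muon channels
    cov       : (2n x 2n) covariance of the stacked vector [y_e, y_mu],
                including cross-channel and bin-to-bin correlations
    Returns the combined values, their covariance, and the minimum chi^2.
    """
    y = np.concatenate([np.asarray(y_e, dtype=float), np.asarray(y_mu, dtype=float)])
    n = len(y_e)
    # Both channels are estimates of the same n underlying bin values
    design = np.vstack([np.eye(n), np.eye(n)])
    cov_inv = np.linalg.inv(np.asarray(cov, dtype=float))
    cov_comb = np.linalg.inv(design.T @ cov_inv @ design)
    combined = cov_comb @ design.T @ cov_inv @ y
    resid = y - design @ combined
    chi2 = float(resid @ cov_inv @ resid)
    return combined, cov_comb, chi2

# One-bin toy example with uncorrelated uncertainties (hypothetical numbers)
combined, cov_comb, chi2 = combine_channels([10.0], [12.0], np.diag([1.0, 4.0]))
```

For uncorrelated inputs this reduces to the familiar inverse-variance weighted average; off-diagonal covariance terms change both the weights and the combined uncertainty.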
This paper is organized as follows. Section II reviews the existing calculations and measurements of p W T . The relevant components of the ATLAS detector are described in Sec. III and the generation of the simulated data used is described in Sec. IV. The event selection is given in Sec. V and the estimation of the backgrounds remaining after that selection is explained in Sec. VI. The unfolding procedure is described in Sec. VII. Section VIII summarizes the systematic uncertainties. The electron and muon channel results, the combination procedure, and the combined results are all given in Sec. IX. We conclude with a discussion of the main observations in Sec. X.

II. QCD PREDICTIONS AND PREVIOUS MEASUREMENTS
At leading order, the W boson is produced with zero momentum transverse to the beam line. Nonzero $p_T$ is generated through the emission of partons in the initial state. At low $p_T$, this is dominated by multiple soft or almost collinear partons, but at higher $p_T$, the emission of one or more hard partons becomes the dominant effect. Because of this, different calculations of $d\sigma/dp_T^W$ may be better suited for different ranges of $p_T^W$. At large $p_T^W$ ($p_T^W \gtrsim 30$ GeV), the spectrum is determined primarily by hard parton emission, and pQCD calculations at a fixed order of $\alpha_s$ are expected to predict $d\sigma/dp_T^W$ reliably [6]. The inclusive cross section prediction is finite, but the differential cross section diverges as $p_T^W$ approaches zero. Differential cross sections calculated to $O(\alpha_s^2)$ are available for $Z/\gamma^*$ production through the FEWZ [7,8] and DYNNLO [9,10] programs, and are becoming available for the W. The MCFM generator [11] can predict $p_T^W$ at $O(\alpha_s^2)$ through the next-to-leading order (NLO) calculation of the W + 1 parton differential cross section.
As $p_T^W$ becomes small, contributions at higher powers of $\alpha_s$ describing the production of soft gluons grow in importance. These terms also contain factors of $\ln(M_W^2/(p_T^W)^2)$ which diverge for vanishing $p_T^W$. The $p_T^W$ distribution is better modeled in this regime by calculations that resum logarithmically divergent terms to all orders in $\alpha_s$ [6,12,13]. The RESBOS generator [13-15] resums the leading contributions up to the next-to-next-to-leading logarithms (NNLL), and matches the resummed calculation to an $O(\alpha_s)$ calculation, corrected to $O(\alpha_s^2)$ using a k-factor depending on $p_T$ and rapidity, to extend the prediction to large $p_T^W$. It also includes a nonperturbative parametrization, tuned to Drell-Yan data from several experiments [15,16], to model the lowest $p_T^W$ values. Parton shower algorithms such as PYTHIA [17] and HERWIG [18] can also provide finite predictions of $d\sigma/dp_T^W$ in the low-$p_T^W$ region by describing the soft gluon radiation effects through the iterative splitting and radiation of partons. PYTHIA implements leading-order matrix element calculations with a parton shower algorithm that has been tuned to match the $p_T^Z$ data from the Tevatron [19-21]. Similarly, the MC@NLO [22] and POWHEG [23-26] event generators combine NLO ($O(\alpha_s)$) matrix element calculations with a parton shower algorithm to produce differential cross section predictions that are finite for all $p_T^W$. Generators such as ALPGEN [27] and SHERPA [28] calculate matrix elements for higher orders in $\alpha_s$ (up to five), but only include the tree-level terms which describe the production of hard partons. Parton shower algorithms can be run on the resulting events, with double-counting of parton emissions in the phase space overlap between the matrix element and parton shower algorithms removed through a veto [27] or by reweighting [29,30]. Although these calculations do not include virtual corrections to the LO process, they are relevant for comparison to the highest-$p_T$ part of the $p_T^W$ spectrum, which includes contributions from a W recoiling against multiple high-$p_T$ jets.
The W $p_T$ distribution has been measured most recently at the Tevatron with Run I data ($p\bar{p}$ collisions at $\sqrt{s} = 1.8$ TeV) by both CDF [31] and D0 [32]. Both of these results are limited by the number of candidate events used (fewer than 1000), and by the partial unfolding which does not take into account bin-to-bin correlations. The present analysis uses more than 100 000 candidates per channel and a full unfolding of the hadronic recoil which takes into account correlations between bins, resulting in greater precision overall and inclusion of higher-$p_T^W$ events compared to the Tevatron results.
Although this is the first measurement of the W $p_T$ distribution at the LHC, the $W \to \ell\nu$ sample at $\sqrt{s} = 7$ TeV has been studied recently by both the ATLAS and CMS collaborations. The ATLAS Collaboration has measured the inclusive $W \to \ell\nu$ cross section [3] and the lepton charge asymmetry in $W \to \mu\nu$ events [1]. The CMS Collaboration has also measured the inclusive cross section [33], and has measured the polarization of W bosons produced with $p_T^W > 50$ GeV, demonstrating that the majority of W bosons produced at large $p_T$ in pp collisions are left-handed, as predicted by the standard model [34].

III. THE ATLAS DETECTOR AND THE pp DATA SET

A. The ATLAS detector
The ATLAS detector [35] at the LHC consists of concentric cylindrical layers of inner tracking, calorimetry, and outer (muon) tracking, with both the inner and outer tracking volumes contained, or partially contained, in the fields of superconducting magnets to enable measurement of charged particle momenta.
The inner detector (ID) allows precision tracking of charged particles within $|\eta| \sim 2.5$. It surrounds the interaction point, inside a superconducting solenoid which produces a 2 T axial field. The innermost layers constitute the pixel detector, arranged in three layers, both barrel and end cap. The semiconductor tracker (SCT) is located at intermediate radii in the barrel and intermediate z for the end caps, and consists of four double-sided silicon strip layers with the strips offset by a small angle to allow reconstruction of three-dimensional space points. The outer layers, the transition radiation tracker (TRT), are straw tubes which provide up to 36 additional $R$-$\phi$ position measurements, interleaved with thin layers of material which stimulate the production of transition radiation. This radiation is then detected as a higher ionization signal in the straw tubes, and exploited to distinguish electrons from pions.
The calorimeter separates the inner detector from the muon spectrometer and measures particle energies over the range $|\eta| < 4.9$. The liquid argon (LAr) electromagnetic calorimeter uses a lead absorber in folded layers designed to minimize gaps in coverage. It is segmented in depth to enable better particle shower reconstruction. The innermost layer ("compartment") is instrumented with strips that precisely measure the shower location in $\eta$. The middle compartment is deep enough to contain most of the electromagnetic shower produced by a typical electron or photon. The outermost compartment has the coarsest spatial resolution and is used to quantify how much of the particle shower has leaked back into the hadronic calorimeter. The hadronic calorimeter surrounds the electromagnetic calorimeter and extends the instrumented depth of the calorimeter to fully contain hadronic particle showers. Its central part, covering $|\eta| < 1.7$, is the tile calorimeter, which is constructed of alternating layers of steel and scintillating plastic tiles. Starting at $|\eta| \sim 1.5$ and extending to $|\eta| \sim 3.2$, the hadronic calorimeter is part of the liquid argon calorimeter system, but with a geometry different from the electromagnetic calorimeter and with copper and tungsten as the absorbing material. The forward calorimeters, also using liquid argon, extend the coverage up to $|\eta| \sim 4.9$.
The muon chambers and the superconducting air-core toroid magnets, located beyond the calorimeters, constitute the muon spectrometer (MS). Precision tracking in the bending plane ($R$-$\eta$) for both the barrel and the end caps is performed by means of monitored drift tubes (MDTs). Cathode strip chambers (CSCs) provide precision $\eta$-$\phi$ space points in the innermost layer of the end cap, for $2.0 < |\eta| < 2.7$. The muon triggers are implemented via resistive plate chambers (RPCs) and thin-gap chambers (TGCs) in the barrel and end cap, respectively. In addition to fast reconstruction of three-dimensional space points for muon triggering, these detectors provide $\phi$ hit information complementary to the precision hits from the MDTs for muon reconstruction.

B. Online selection
The online selection of events is based on rapid reconstruction and identification of charged leptons, and the requirement of at least one charged lepton candidate observed in the event. The trigger system implementing the online selection has three levels: Level 1, which is implemented in hardware; Level 2, which runs specialized reconstruction software on full-granularity detector information within a spatially limited ''Region of Interest''; and the Event Filter, which reconstructs events using algorithms and object definitions nearly identical to those used offline.
In the electron channel, the Level 1 hardware selects events with at least one localized region ("cluster") of significant energy deposition in the electromagnetic calorimeter with $E_T > 10$ GeV. Level 2 and the Event Filter check for electron candidates in events passing the Level 1 selection, and accept events with at least one electron candidate with $E_T > 15$ GeV. The electron identification includes matching of an inner detector track to the electromagnetic cluster and requirements on the cluster shape. The trigger efficiency relative to offline electrons as defined below is close to 100% within the statistical uncertainties in both data and simulation.
The online selection of muon events starts from the identification of hit patterns consistent with a track in the muon spectrometer at Level 1. For the first half of the data used in this analysis, there is no explicit threshold for the transverse momentum at Level 1, but in the second half, to cope with increased rates from the higher instantaneous luminosity, a threshold of 10 GeV is used. Level 2 and the Event Filter attempt to reconstruct muons in events passing the Level 1 trigger using an ID track matched to a track segment in the MS. Both apply a $p_T$ threshold of 13 GeV for all of the data used in this analysis. The trigger efficiency relative to the offline combined muon defined below is a function of the muon $p_T$ and $\eta$, and varies between 67% and 96%. Because of its larger geometrical coverage, the end cap trigger is more efficient than the barrel trigger. The trigger path starting from a Level 1 trigger with no explicit $p_T$ threshold is slightly more efficient (1-2%) than the one with a 10 GeV threshold.

C. Data quality requirements and integrated luminosity
Events used in this analysis were collected during stable beams operation of the LHC in 2010 at $\sqrt{s} = 7$ TeV with all needed detector components functioning nominally, including the inner detector, calorimeter, muon spectrometer, and magnets. The integrated luminosity is $31.4 \pm 1.1$ pb$^{-1}$ in the electron channel and $30.2 \pm 1.0$ pb$^{-1}$ in the muon channel [36,37].

IV. EVENT SIMULATION
Simulated data are used to calculate the efficiency for the $W \to \ell\nu$ signal, to estimate the number of background events and their distribution in $p_T^R$, to construct the response matrix, and to compare the resulting normalized differential cross section $(1/\sigma_{\mathrm{fid}})(d\sigma_{\mathrm{fid}}/dp_T^W)$ to a variety of predictions.
The simulated $W \to \ell\nu$ events used to calculate the reconstruction efficiency correction and to construct the data-driven response matrix are generated using PYTHIA version 6.421 [17] with the MRST 2007 LO* PDF set [38]. The electroweak backgrounds ($W \to \tau\nu$ and $Z/\gamma^* \to \ell^+\ell^-$) are estimated using other PYTHIA samples generated in the same way. Simulated $t\bar{t}$ and single-top events are generated using MC@NLO version 3.41 [22] and the CTEQ6.6 PDF set [39]. For those samples, the HERWIG generator version 6.510 [18] is used for parton showering and JIMMY version 4.1 [40] is used to model the underlying event. The muon channel multijet background estimate uses a set of PYTHIA dijet samples with a generator-level filter requiring at least one muon with $|\eta| < 3.0$ and $p_T > 8$ GeV. The multijet background estimate in the electron channel uses a PYTHIA dijet sample with a generator-level filter requiring particles with energy totaling at least 17 GeV in a cone of radius $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2} = 0.05$. In both channels, the normalization of the multijet background is set by the data. The multijet samples are used to provide an initial estimate of the background in the electron channel, and to extrapolate data-driven background estimates from control data to the signal region in the muon channel.
In all of the simulated data, QED radiation of photons from charged leptons was modeled using PHOTOS version 2.15.4 [41] and taus were decayed by TAUOLA version 1.0.2 [42]. The underlying event and multiple interactions were simulated according to the ATLAS MC09 tunes [43], which take information from the Tevatron into account. Additional inelastic collisions so generated are overlaid on top of the hard-scattering event to simulate the effect of multiple interactions per bunch crossing (''pileup''). The number of additional interactions is randomly generated following a Poisson distribution with a mean of two. Simulated events are then reweighted so that the distribution of the number of inelastic collisions per bunch crossing matches that in the data, which has an average of 1.2 additional collisions. The interaction of the generated particles with the ATLAS detector was simulated by GEANT4 [44,45]. The simulated data are reconstructed and analyzed with the same software as the pp collision data.
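The pileup reweighting described above amounts to weighting each simulated event by the ratio of the data and simulated probabilities for its number of additional interactions. The Python sketch below illustrates the idea; the helper names are hypothetical, and since the measured data distribution is not given here, a Poisson distribution with mean 1.2 is used purely as a stand-in for it.

```python
import math

def poisson_pmf(n, mean):
    """Probability of n additional interactions for a Poisson distribution."""
    return math.exp(-mean) * mean ** n / math.factorial(n)

def pileup_weights(mc_probs, data_probs):
    """Per-multiplicity event weights: data fraction divided by simulated fraction."""
    return [d / m if m > 0 else 0.0 for m, d in zip(mc_probs, data_probs)]

n_max = 30
# Simulation: Poisson distribution with a mean of two additional interactions
mc_probs = [poisson_pmf(n, 2.0) for n in range(n_max)]
# ASSUMPTION: a Poisson with mean 1.2 stands in for the measured data distribution
data_probs = [poisson_pmf(n, 1.2) for n in range(n_max)]
weights = pileup_weights(mc_probs, data_probs)

# After reweighting, the mean multiplicity in simulation matches the target
reweighted_mean = sum(n * p * w for n, (p, w) in enumerate(zip(mc_probs, weights)))
```

In a real analysis the data probabilities would come from the observed distribution of interactions per bunch crossing rather than from an assumed functional form.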
The electroweak and top quark background predictions are normalized using the calculated production cross sections for those processes. For W and Z backgrounds, the cross sections are calculated to next-to-next-to-leading order (NNLO) using FEWZ [7,8] with the MSTW 2008 [46] PDFs (see Ref. [3] for details). The $t\bar{t}$ cross section is calculated at NLO with the leading NNLO terms included [47], setting $m_t = 172.5$ GeV and using the CTEQ6.6 PDF set. The single-top cross section is calculated using MC@NLO with $m_t = 172.5$ GeV and using the CTEQ6M PDF set.
We correct simulated events for differences with respect to the data in the lepton reconstruction and identification efficiencies as well as in energy (momentum) scale and resolution. The efficiencies are determined from selected W and Z events, using the ''tag-and-probe'' method [3]. The resolution and scale corrections are obtained from a fit to the observed Z boson line shape.
Additional $W \to \ell\nu$ samples from event generators other than PYTHIA are used for comparison with the measured differential cross section $(1/\sigma_{\mathrm{fid}})(d\sigma_{\mathrm{fid}}/dp_T^W)$. The MC@NLO sample used is generated with the same parameters as the $t\bar{t}$ sample described above. The POWHEG events are generated using the same CTEQ6.6 PDF set as the main PYTHIA $W \to \ell\nu$ samples, and POWHEG is interfaced to PYTHIA for parton showering and hadronization. ALPGEN version 2.13 [27] matrix element calculations are interfaced to the HERWIG version 6.510 [18] parton shower algorithm, and use JIMMY version 4.31 [40] to model the underlying event contributions. These events are generated using the CTEQ6L1 PDF set [48]. SHERPA event generation was done using version 1.3.0 [28], which includes a Catani-Seymour subtraction based parton shower model [49], matrix element merging with truncated showers [29] and high-multiplicity matrix elements generated by COMIX [50]. The CTEQ6L1 PDF set is used, and the renormalization and factorization scales are set dynamically for each event according to the default SHERPA prescription.

V. RECONSTRUCTION AND EVENT SELECTION
The $p_T^W$ measurement is performed on a sample of candidate $W \to \ell\nu$ events, which are reconstructed in the final state with one high-$p_T$ electron or muon and missing transverse energy sufficient to indicate the presence of a neutrino.
The event selection used in this paper closely follows that used in the inclusive W cross section measurement presented in Ref. [3]. The selection in the muon channel is identical to that used in the W lepton charge asymmetry measurement in Ref. [1]. The event reconstruction and W candidate selection are summarized here.
A. Lepton ($e$, $\mu$, and $\nu$) reconstruction
Electrons are reconstructed as inner detector tracks pointing to particle showers reconstructed as a cluster of cells with significant energy deposition in the electromagnetic calorimeter. This analysis uses electrons with clusters fully contained in either the barrel or end cap LAr calorimeter. These requirements translate into $|\eta_e| < 2.47$ with the transition region $1.37 < |\eta_e| < 1.52$ excluded. To reject background (essentially originating from hadrons), multiple requirements on track quality and the electromagnetic shower profile are applied, following the "tight" selection outlined in Ref. [3]. Track quality criteria include a minimum number of hits in the pixel detector, SCT, and TRT, as well as requirements on the transverse impact parameter and a minimum number of TRT hits compatible with the detection of x-rays generated by the transition radiation from electrons. The energy deposition pattern in the calorimeter is characterized by its depth as well as its width in the three compartments of the LAr calorimeter, and the parameters are compared with the expectation for electrons. The position of the reconstructed cluster is required to be consistent with the location at which the extrapolated electron track crosses the most finely segmented part of the calorimeter. Since electron showers are expected to be well contained within the LAr calorimeter, electron candidates with significant associated energy deposits in the tile calorimeter are discarded. Finally, electron candidates compatible with photon conversions are rejected. Although there is no explicit isolation requirement in the electron identification for this analysis, the criteria selecting a narrow shower shape in the calorimeter provide rejection against nonisolated electrons from heavy flavor decays. With these definitions, the average electron selection efficiency ranges from 67% in the end cap ($1.52 < |\eta| < 2.47$) to 84% in the central region ($|\eta| < 1.37$) for simulated W events.
Muons are reconstructed from tracks in the muon spectrometer joined to tracks in the inner detector. The track parameters of the combined muon are the statistical combination of the parameters of the MS and ID tracks, where the track parameters are weighted using their uncertainties for the combination. Combined muon candidates with $|\eta| < 2.4$, corresponding to the coverage of the RPC and TGC detectors used in the trigger, are used in this analysis. To reject backgrounds from meson decays-in-flight and other poorly reconstructed tracks, the $p_T$ measured using the MS only must be greater than 10 GeV, and the $p_T$ measured in the MS and ID must be kinematically consistent with each other. For both of these requirements, the momentum measured in the muon spectrometer is corrected for the ionization energy lost by the muon as it passes through the calorimeter. There are no explicit requirements on the number of hits associated with the MS track, but the ID track is required to have hits in the pixel detector, the SCT, and the TRT, although if the track is outside of the TRT acceptance that requirement is omitted. Finally, to reject background from muons associated with hadronic activity, particularly those produced by the decay of a hadron containing a bottom or charm quark, the muon is required to be isolated. The isolation is defined as the scalar sum of the $p_T$ of the ID tracks immediately surrounding the muon candidate track ($\Delta R < 0.4$). The isolation threshold scales with the muon candidate $p_T$ and is $\sum p_T^{\mathrm{ID}} < 0.2\, p_T$. The combined muon reconstruction and selection efficiency varies from 90% to 87% as the muon $p_T$ increases from 20 GeV to above 80 GeV.
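The track-based isolation requirement described above can be sketched in a few lines of Python. This is a minimal illustration, not the reconstruction software: the dictionary layout of the muon and track objects and the function names are assumptions, and the list of tracks is assumed not to contain the muon's own ID track.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(d_eta^2 + d_phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def is_isolated(muon, other_tracks, cone=0.4, max_fraction=0.2):
    """Scalar sum of nearby ID-track pT must be below max_fraction * muon pT.

    ASSUMPTION: other_tracks does not include the muon's own ID track.
    """
    sum_pt = sum(t["pt"] for t in other_tracks
                 if delta_r(muon["eta"], muon["phi"], t["eta"], t["phi"]) < cone)
    return sum_pt < max_fraction * muon["pt"]

# Hypothetical muon and tracks (pT in GeV)
muon = {"pt": 40.0, "eta": 0.5, "phi": 3.1}
tracks = [
    {"pt": 5.0, "eta": 0.6, "phi": -3.1},   # nearby across the +-pi boundary
    {"pt": 50.0, "eta": -1.5, "phi": 0.0},  # far from the muon, ignored
]
isolated = is_isolated(muon, tracks)
```

Wrapping the azimuthal difference matters here: the first track is only about 0.08 rad from the muon in $\phi$ even though the raw difference of the two angles is 6.2.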
The transverse momentum of the neutrino produced by the W decay can be approximately reconstructed via the transverse momentum imbalance measured in the detector, also known as the missing transverse energy ($E_T^{\mathrm{miss}}$). The $E_T^{\mathrm{miss}}$ calculation begins from the negative of the vector sum over the whole detector of the momenta of clusters in the calorimeter. The magnitude and position of the energy deposition determines the momentum of the cluster. The cluster energy is initially measured at the electromagnetic scale, under the assumption that the only energy deposition mechanism is electromagnetic showers such as those produced by electrons and photons. The cluster energies are then corrected for the different response of the calorimeter to hadrons relative to electrons and photons, for losses due to dead material, and for energy which is not captured by the clustering process. The $E_T^{\mathrm{miss}}$ used in the electron channel is exactly this calorimeter-based calculation. In the muon channel, the $E_T^{\mathrm{miss}}$ is additionally corrected for the fact that muons, as minimum ionizing particles, typically only lose a fraction of their momentum in the calorimeter. For isolated muons, the $E_T^{\mathrm{miss}}$ is corrected by adding the muon momentum as measured with the combined ID and MS track to the calorimeter sum, with the calorimeter clusters associated with the energy deposition of the muon subtracted to avoid double-counting. In this context, muons are considered isolated if the $\Delta R$ to the nearest jet with $E_T > 7$ GeV is greater than 0.3. Jets are reconstructed using the anti-$k_t$ algorithm [51] and the $E_T$ is measured at the electromagnetic scale. For nonisolated muons, the muon momentum is measured using only the muon spectrometer. In this case, the momentum loss in the calorimeter is kept within the calorimeter sum.
To summarize, the $E_T^{\mathrm{miss}}$ is calculated via the formula
$$E_{x,y}^{\mathrm{miss}} = -\sum_i E_{x,y}^{i} - \sum_j p_{x,y}^{j} - \sum_k p_{x,y}^{k}.$$
In the above, the $E_{x,y}^{i}$ are the individual topological cluster momentum components, excluding those clusters associated with any isolated muon, the $p_{x,y}^{j}$ are the momenta of isolated muons as measured with the combined track, and the $p_{x,y}^{k}$ are the momenta of nonisolated muons as measured in the muon spectrometer. In practice, for the electron channel, only the first term contributes, but for the muon channel all three terms contribute.
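The three contributions described above (calorimeter clusters, isolated muons, and nonisolated muons) enter the missing transverse energy as a negative vector sum. A minimal Python sketch, assuming each object is represented simply as a hypothetical `(px, py)` pair in GeV:

```python
import math

def missing_et(clusters, isolated_muons, nonisolated_muons):
    """Return (Ex_miss, Ey_miss, ET_miss) from the negative vector sum.

    Each argument is a list of (px, py) pairs in GeV (illustrative layout):
      clusters          - calorimeter clusters, excluding those matched to isolated muons
      isolated_muons    - combined-track momenta of isolated muons
      nonisolated_muons - muon-spectrometer momenta of nonisolated muons
    """
    visible = list(clusters) + list(isolated_muons) + list(nonisolated_muons)
    ex = -sum(px for px, _ in visible)
    ey = -sum(py for _, py in visible)
    return ex, ey, math.hypot(ex, ey)

# Hypothetical muon-channel event: two clusters and one isolated muon
ex, ey, met = missing_et(clusters=[(10.0, 0.0), (-3.0, 4.0)],
                         isolated_muons=[(20.0, -10.0)],
                         nonisolated_muons=[])
```

For an electron-channel event the two muon lists would simply be empty, so that only the cluster sum contributes.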

B. Event selection
Candidate W events are selected from the set of events passing a single electron or a single muon trigger. Offline, events are first subject to cleaning requirements aimed at rejecting events with background from cosmic rays or detector noise. These requirements reject a small fraction of the data and are highly efficient for the W signal [3]. Events must have a reconstructed primary vertex with at least three tracks with $p_T > 150$ MeV. They are rejected if they contain a jet with features characteristic of a known noncollision localized source of apparent energy deposition, such as electronic noise in the calorimeter. Such spurious jets can result in events with large $E_T^{\mathrm{miss}}$ but which do not contain a neutrino or even necessarily originate from a pp collision. In the electron channel, events are rejected if the electron candidate is reconstructed in a region of the calorimeter suffering readout problems during the 2010 run [52]. This last requirement results in a $\sim$5% efficiency loss.
After the event cleaning, we select events with at least one electron or muon, as defined above, with transverse momentum greater than 20 GeV. In events with more than one such lepton, the lepton with largest transverse momentum is assumed to originate from the W decay. To provide additional rejection of cosmic rays, muon candidates must point at the primary vertex, in the sense that the offset in z along the beam direction between the primary vertex and the point where the candidate muon track crosses the beam line must be less than 10 mm.
Finally, we require $E_T^{\mathrm{miss}} > 25$ GeV and transverse mass greater than 40 GeV to ensure consistency of the candidate sample with the expected kinematics of W decay.
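The kinematic part of this selection can be written compactly. The Python sketch below uses the transverse mass definition given in the Introduction, with the reconstructed missing transverse energy standing in for the neutrino $p_T$; the function names and the example values are hypothetical, and the trigger, cleaning, and lepton identification requirements are not modeled.

```python
import math

def transverse_mass(lep_pt, lep_phi, met, met_phi):
    """m_T = sqrt(2 * pT(lepton) * ET_miss * (1 - cos(delta phi))), in GeV."""
    arg = 2.0 * lep_pt * met * (1.0 - math.cos(lep_phi - met_phi))
    return math.sqrt(max(0.0, arg))

def passes_w_selection(lep_pt, lep_phi, met, met_phi):
    """Kinematic W selection: lepton pT > 20, ET_miss > 25, m_T > 40 (all GeV)."""
    return (lep_pt > 20.0
            and met > 25.0
            and transverse_mass(lep_pt, lep_phi, met, met_phi) > 40.0)

# Hypothetical event: lepton and missing ET back to back, 40 GeV each
mt = transverse_mass(40.0, 0.0, 40.0, math.pi)
selected = passes_w_selection(40.0, 0.0, 40.0, math.pi)
```

When the lepton and the missing transverse energy are back to back, the transverse mass reaches its maximum of twice their geometric mean, which is why the cut mainly removes events where the two are nearly collinear.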
After all selections, 112 909 $W \to e\nu$ candidates and 129 218 $W \to \mu\nu$ candidates remain in the data. The smaller number of candidate events in the electron channel is mostly due to the lower electron reconstruction and identification efficiency.

C. Hadronic recoil calculation
The reconstruction of the W boson transverse momentum is based on a slight modification of the $E_T^{\mathrm{miss}}$ calculation described above. Formally, the $\vec{p}_T$ of the W boson is reconstructed as the vector sum of the $\vec{p}_T$ of the neutrino and the charged lepton, $\vec{p}_T^{\,W} = \vec{p}_T^{\,\ell} + \vec{p}_T^{\,\nu}$. But the neutrino $p_T$ is reconstructed through the $E_T^{\mathrm{miss}}$, and the $E_T^{\mathrm{miss}}$ is determined in part from the lepton momentum, explicitly in the case of $W \to \mu\nu$ events, and implicitly in $W \to e\nu$ events through the sum over calorimeter clusters. Therefore when the $\vec{p}_T$ of the charged lepton and $E_T^{\mathrm{miss}}$ are summed, the charged lepton momentum cancels out and the W transverse momentum is measured as the summed $\vec{p}_T$ of the calorimeter clusters, excluding those associated with the electron or muon. This part, which consists of the energy deposition of jets and softer particles not clustered into jets, is referred to as the hadronic recoil $\vec{R}$. The reconstructed $p_T^W$ is denoted $p_T^R$ and is defined as the magnitude of $\vec{R}$.
In this measurement, the exclusion of the lepton from p R T is made explicit by removing all clusters with a ÁR < 0:2 relative to the charged lepton. This procedure leaves no significant lepton flavor dependence in the reconstruction of p R T , so that it is possible to construct a combined response matrix describing the mapping from p W T to p R T which can be applied to both channels. To compensate for the energy from additional low-p T particles removed along with the lepton, the underlying event is sampled on an event-by-event basis using a cone of the same size, placed at the same as the lepton. The cone azimuth is randomly chosen but required to be away from the lepton and original recoil directions, to ensure that the compensating energy is not affected significantly by these components of the event.
The distance in azimuth to the lepton is required to satisfy Á > 2 Â ÁR, and the distance to the recoil should match Á > =3. The transverse momentum measured from calorimeter clusters in this cone is rotated to the position of the removed lepton and added to the original recoil estimate. Because this procedure is repeated for every event, the energy in the clusters in the replacement cone contains an amount of energy from the underlying event and from multiple proton-proton collisions (''pileup'') which is correct on average for each event and accounts for event-by-event fluctuations.
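The lepton removal and replacement-cone compensation can be illustrated with a minimal sketch. It assumes clusters are given as hypothetical `(pt, eta, phi)` tuples and that the recoil is the plain vector sum of the retained clusters; the retry limit and the fallback when no valid azimuth is found are illustrative choices, not part of the analysis description.

```python
import math
import random

CONE = 0.2  # Delta R of the lepton-removal cone quoted in the text

def delta_phi(a, b):
    """Signed azimuthal difference wrapped to (-pi, pi]."""
    d = (a - b) % (2.0 * math.pi)
    return d - 2.0 * math.pi if d > math.pi else d

def delta_r(eta1, phi1, eta2, phi2):
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def vector_sum(clusters):
    px = sum(pt * math.cos(phi) for pt, _, phi in clusters)
    py = sum(pt * math.sin(phi) for pt, _, phi in clusters)
    return px, py

def hadronic_recoil(clusters, lep_eta, lep_phi, rng=random):
    """Return (Rx, Ry) after lepton removal and cone replacement."""
    # Drop every cluster inside the cone around the charged lepton.
    kept = [c for c in clusters
            if delta_r(c[1], c[2], lep_eta, lep_phi) >= CONE]
    rx, ry = vector_sum(kept)
    recoil_phi = math.atan2(ry, rx)
    # Draw a cone azimuth away from the lepton and the original recoil.
    for _ in range(1000):
        cone_phi = rng.uniform(-math.pi, math.pi)
        if (abs(delta_phi(cone_phi, lep_phi)) > 2.0 * CONE
                and abs(delta_phi(cone_phi, recoil_phi)) > math.pi / 3.0):
            break
    else:
        return rx, ry  # illustrative fallback: leave the recoil uncorrected
    # Sample the underlying event at the lepton eta and the chosen azimuth.
    sampled = [c for c in kept
               if delta_r(c[1], c[2], lep_eta, cone_phi) < CONE]
    cx, cy = vector_sum(sampled)
    # Rotate the sampled momentum to the lepton azimuth and add it.
    rot = delta_phi(lep_phi, cone_phi)
    rx += cx * math.cos(rot) - cy * math.sin(rot)
    ry += cx * math.sin(rot) + cy * math.cos(rot)
    return rx, ry
```

The reconstructed $p_T^R$ would then be `math.hypot(rx, ry)`. Because the replacement cone is at least $2 \times \Delta R$ away from the lepton in azimuth, it never overlaps the removed region.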

VI. BACKGROUND ESTIMATION
Backgrounds to $W \to e\nu$ and $W \to \mu\nu$ events come from other types of electroweak events ($Z \to \ell\ell$ and $W \to \tau\nu$), $t\bar{t}$ and single-top events, and from multijet events in which a nonprompt lepton is either produced through the decay of a hadron containing a heavy quark (b or c), the decay-in-flight of a light meson to a muon, or through a coincidence of hadronic signatures that mimics the characteristics of a lepton. Figure 1 shows the expected and observed $p_T^R$ distribution in the electron and muon channels, with background contributions calculated as described below.
Electroweak backgrounds ($W \to \tau\nu$, $Z \to \ell\ell$, $Z \to \tau\tau$) and top quark production ($t\bar{t}$ and single top) are estimated using the acceptance and efficiency calculated from simulated data, corrected for the imperfect detector simulation and normalized using the predicted cross sections as described in Sec. IV. These backgrounds amount to about 6% of the selected events in the electron channel, and to about 10% in the muon channel. The background in the muon channel is larger because the smaller geometrical acceptance of the muon spectrometer compared to the calorimeter leads to a greater contribution of $Z \to \mu\mu$ events compared to $Z \to ee$ events. Uncertainties on the summed electroweak and top background rates are 6% at low $p_T^R$ in both channels, rising to 14% above $p_T^R \sim 200$ GeV in the muon channel, and 25% in the electron channel. The leading uncertainties on these backgrounds at low $p_T^R$ are from the theoretical model, since the cross sections used to normalize them have uncertainties ranging from 4% (for W and Z) to 6% (for $t\bar{t}$), and from the PDF uncertainty on the acceptances, which is 3% [3]. The integrated luminosity calibration contributes an additional 3.4% [36,37]. Important experimental uncertainties include the energy (momentum) scale uncertainty, which contributes about 3% (1%) at low $p_T^R$ in the electron (muon) channel, increasing to about 6% (5%) at high $p_T^R$. At high $p_T^R$ ($p_T^R \gtrsim 150$ GeV), there are also significant contributions for both channels from the statistical uncertainty on the acceptance and efficiency calculated from simulated events.
The multijet backgrounds are determined using data-driven methods. In the electron channel, the observed $E_T^{\mathrm{miss}}$ distribution is interpreted in terms of signal and background contributions, using a method based on template fitting. A first template is built from the signal as well as electroweak and top backgrounds, using simulated events. The multijet background template is built from a background-enriched sample, obtained by applying all event selection cuts apart from inverting a subset of the electron identification criteria. The multijet background fraction is then determined by a fitting procedure that adjusts the normalization of the templates to obtain the best match to the observed $E_T^{\mathrm{miss}}$ distribution. This method has been described in Ref. [3], and is applied here bin by bin in $p_T^R$. The multijet background fraction is 4% at low $p_T^R$, and rises to ~9% at high $p_T^R$. Uncertainties on this method are estimated from the stability of the fit result under different event selections used to produce the multijet background templates, by propagating the lepton efficiency and momentum scale uncertainties to the signal templates, and by varying the range of the $E_T^{\mathrm{miss}}$ distribution used for the fit. These sources amount to a total relative uncertainty of 25% at low $p_T^R$, decrease to 5% at $p_T^R \sim 35$ GeV, and progressively rise again to 100% at high $p_T^R$, where very few events are available to construct the templates.
In the muon channel, the multijet background is primarily from semileptonic heavy quark decays, although there is also a small component from kaon or pion decays-in-flight. The estimation of this background component relies on the different efficiencies of the isolation requirement for multijet and electroweak events, and is based on the method described in Ref. [3]. Muons from electroweak boson decays, including those from top quark decays, are mostly isolated, and their isolation efficiency is measured from $Z \to \mu\mu$ events. The efficiency of the isolation requirement on multijet events is measured using a background-enriched control sample, which consists of events satisfying all of the signal event selection except that the muon transverse momentum range is restricted to $15 < p_T < 20$ GeV and the $E_T^{\mathrm{miss}}$ and $m_T$ requirements are dropped. The measured efficiency is extrapolated to the signal region ($p_T > 20$ GeV, $E_T^{\mathrm{miss}} > 25$ GeV, and $m_T > 40$ GeV) using simulated multijet events. Knowledge of the isolation efficiency for both components, combined with the number of events in the $W \to \mu\nu$ candidate sample before and after the isolation requirement, allows the extraction of the multijet background. As for the electron channel, this method is applied for each bin in $p_T^R$, with the number of total and isolated candidates, as well as the signal and background efficiencies, calculated separately for each bin. The isolation efficiency for the background is fitted with an exponential distribution to smooth out statistical fluctuations arising from the limited number of events passing all of the event selection in the simulated multijet data.
The multijet background fraction in the muon channel is found to be 1.5% at low $p_T^R$ and decreases to become negligible for $p_T^R > 100$ GeV. Uncertainties on the estimated multijet background include all statistical uncertainties, including those on both the signal and background isolation efficiency measurements. The full range of the simulation-based extrapolation of the isolation efficiency for the multijet background is taken as a systematic uncertainty. Subtraction of residual electroweak events in the control samples is also included in the systematic uncertainty but is a subdominant contribution. The relative uncertainty on the background rate varies between 25% and 80%, with the largest uncertainties for $p_T^R < 40$ GeV.
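The extraction of the multijet yield from the isolation efficiencies amounts to solving two linear equations per $p_T^R$ bin. The sketch below assumes the standard two-component form, $N = N_{\mathrm{EW}} + N_{\mathrm{MJ}}$ before isolation and $N_{\mathrm{iso}} = \epsilon_{\mathrm{EW}} N_{\mathrm{EW}} + \epsilon_{\mathrm{MJ}} N_{\mathrm{MJ}}$ after it; the algebra is not spelled out in the text, and the function and argument names are hypothetical.

```python
def multijet_after_isolation(n_total, n_isolated, eff_signal, eff_multijet):
    """Multijet yield remaining in the isolated sample for one bin.

    n_total, n_isolated: candidates before and after the isolation cut.
    eff_signal, eff_multijet: isolation efficiencies of the two components.
    """
    if eff_signal == eff_multijet:
        raise ValueError("efficiencies must differ to separate the components")
    # Solve the 2x2 system for the multijet yield before isolation.
    n_mj_before = ((eff_signal * n_total - n_isolated)
                   / (eff_signal - eff_multijet))
    return eff_multijet * n_mj_before
```

For example, 900 electroweak and 100 multijet events with efficiencies of 0.98 and 0.20 give 1000 candidates before and 902 after isolation, from which the function recovers the 20 multijet events that pass.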
VII. UNFOLDING OF THE $p_T^R$ DISTRIBUTION
The unfolding of the $p_T^R$ distribution to the $p_T^W$ distribution is performed in two steps. In the first step, the background-subtracted $p_T^R$ distribution is unfolded to the true $p_T^W$ distribution, using the response matrix to model the migration of events among bins caused by the finite resolution of the detector. The result of this step is the distribution $dN/dp_T^W$ of all reconstructed W events. In the second step, this distribution is divided by a reconstruction efficiency correction relating the number of reconstructed W events to the number of generated fiducial W events within each bin. That correction results in the differential cross section $d\sigma_{\mathrm{fid}}/dp_T^W$.

A. Unfolding of the recoil distribution
The response matrix describes the relation between $p_T^W$ and $p_T^R$, the true and reconstructed W $p_T$, respectively. It reflects the physics of the process (hadronic activity from soft and hard QCD interactions) as well as the response of the calorimeters to low energy particles. This is in principle captured by a response matrix drawn from simulated $W \to \ell\nu$ events, but the simulation of both aspects carries significant uncertainty. Therefore, the treatment of the response matrix includes corrections from Z data to improve the model.
The $Z \to ee$ and $Z \to \mu\mu$ data are used as a model for the hadronic recoil response in W events because the underlying physics is similar but there are two independent ways to measure the $p_T$ of the Z, through the hadronic recoil or the $p_T$ of the charged leptons. The lepton energy resolution is sufficiently good that the dilepton $p_T$ can be used to calibrate the hadronic recoil, with the dilepton $p_T$ standing in for the true $p_T$ and the hadronic recoil remaining the "measured" quantity. One could construct a response matrix purely from $Z \to \ell\ell$ events, but such a matrix would be limited by the relatively small number of $Z \to \ell\ell$ events in the 2010 data and residual differences between W and Z kinematics and production mechanisms. To incorporate the best features of both the W simulation and Z data models, we introduce a parametrization of the hadronic recoil scale and resolution. Fits to the real and simulated Z data using this parametrization are used to correct the simulated W response, and the resulting corrected parametrization is used to fill the response matrix used for the unfolding.
Following this logic, the response matrix is built in three steps. A first version of the response matrix, denoted $M_{\mathrm{MC}}$, is directly filled from simulated $W \to \ell\nu$ signal events as the two-dimensional distribution of $p_T^R$ and $p_T^W$. The parametrized response matrix $M_{\mathrm{param}}$ is also based solely on simulated $W \to \ell\nu$ events but is constructed from a fit to the recoil as described below. The final corrected parametrized response matrix $M_{\mathrm{param}}^{\mathrm{corr}}$ uses the same functional form as $M_{\mathrm{param}}$, but with the fit parameters corrected using the response measured in $Z \to \ell\ell$ data. Only $M_{\mathrm{param}}^{\mathrm{corr}}$ is used in the central value of the measurement, but $M_{\mathrm{MC}}$ and $M_{\mathrm{param}}$ are used in assessing systematic uncertainties, particularly those arising from the response matrix parametrization and the unfolding procedure.
To facilitate the incorporation of corrections from the Z data, we introduce an analytical representation of the detector response to $p_T^W$, and approximate $M_{\mathrm{MC}}$ via a smearing procedure. Decomposing $\vec{R}$ into its components parallel and perpendicular to the W line of flight, $R_\parallel$ and $R_\perp$, the response is observed to behave as a Gaussian distribution with parameters governed by $p_T^W$ and $\Sigma E_T$, where $\Sigma E_T$ is the scalar sum of the transverse energy of all calorimeter clusters in the event. By choosing the coordinate system to align with the W line of flight, any scale offset ("bias") is in the $R_\parallel$ direction by construction, and the Gaussian resolution function is centered at zero in the $R_\perp$ direction. Specifically, the approximated response $M_{\mathrm{param}}$ is obtained from the Monte Carlo signal sample as follows:

$$R_\parallel(p_T^W, \Sigma E_T) = p_T^W + G\big(b(p_T^W),\, \sigma_\parallel(p_T^W, \Sigma E_T)\big), \qquad (3)$$

$$R_\perp(p_T^W, \Sigma E_T) = G\big(0,\, \sigma_\perp(p_T^W, \Sigma E_T)\big), \qquad (4)$$

where G denotes a Gaussian random number, and its parameters $b$, $\sigma_\parallel$ and $\sigma_\perp$ are the Gaussian mean and resolution parameters determined from fits to the simulation. The bias is described according to $b(p_T^W) = \ldots$, where the $p_T^W$ dependence indicates that the fit is performed separately in three regions of $p_T^W$ ($p_T^W < 8$ GeV, $8 < p_T^W < 23$ GeV, and $p_T^W > 23$ GeV). The separation of the fit into regions of $p_T^W$ improves the quality of the fit.

With the parametrization defined, it is possible to build up a response matrix from a set of events using a smearing procedure. Given the $p_T^W$ and $\Sigma E_T$ of each event, $R_\parallel$ and $R_\perp$ can be constructed using random numbers distributed according to Eqs. (3) and (4). Then $p_T^R$ is reconstructed from $R_\parallel$ and $R_\perp$, and the results are used to fill the relationship between $p_T^W$ and $p_T^R$. Applying this procedure to the simulated signal sample results in the approximate response matrix $M_{\mathrm{param}}$.
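The smearing procedure can be sketched as follows. The sketch assumes the parallel component is the true $p_T^W$ plus a Gaussian random number with mean $b$ and width $\sigma_\parallel$, and the perpendicular component is a zero-mean Gaussian with width $\sigma_\perp$. The callables `bias`, `sigma_par` and `sigma_perp` are hypothetical placeholders for the fitted parametrizations, whose functional forms are not reproduced here.

```python
import math
import random
from bisect import bisect_right

def smear_recoil(pt_w, sum_et, bias, sigma_par, sigma_perp, rng):
    """Draw one reconstructed recoil magnitude for a given true pT."""
    r_par = pt_w + rng.gauss(bias(pt_w), sigma_par(pt_w, sum_et))
    r_perp = rng.gauss(0.0, sigma_perp(pt_w, sum_et))
    return math.hypot(r_par, r_perp)

def find_bin(value, edges):
    """Index of the bin containing value, or None if outside the range."""
    index = bisect_right(edges, value) - 1
    return index if 0 <= index < len(edges) - 1 else None

def fill_response(events, truth_edges, reco_edges,
                  bias, sigma_par, sigma_perp, seed=0):
    """events: iterable of (pt_w, sum_et). Returns counts[reco][truth]."""
    rng = random.Random(seed)
    counts = [[0] * (len(truth_edges) - 1)
              for _ in range(len(reco_edges) - 1)]
    for pt_w, sum_et in events:
        pt_r = smear_recoil(pt_w, sum_et, bias, sigma_par, sigma_perp, rng)
        i, j = find_bin(pt_r, reco_edges), find_bin(pt_w, truth_edges)
        if i is not None and j is not None:
            counts[i][j] += 1
    return counts
```

Dividing each truth column of `counts` by the number of generated events in that truth bin would turn the counts into migration probabilities.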
Corrections to this parametrization are derived from $Z \to \ell\ell$ events by applying the same procedure to both real and simulated Z events and using the measured decay lepton pair momentum $p_T^{\ell\ell}$ as the estimator of the true Z boson transverse momentum. The hadronic recoil calculated as described in Sec. V C has no dependence on the lepton flavor, and consistent response is observed in $Z \to ee$ and $Z \to \mu\mu$ events. Therefore we fit the combined data from both channels to minimize the statistical uncertainty. The corrected smearing parameters are defined as follows: [...] Above, $b^{\ell\ell,\mathrm{data}}$ and $b^{\ell\ell,\mathrm{MC}}$ are determined as a function of $p_T^{\ell\ell}$, and then used as a function of $p_T^W$; $b^{W,\mathrm{corr}}$ and $b^{W,\mathrm{MC}}$ are functions of $p_T^W$ throughout. All resolution parameters are functions of the reconstruction-level $\Sigma E_T$. This defines the final, corrected response matrix $M_{\mathrm{param}}^{\mathrm{corr}}$ used in the hadronic recoil unfolding.
The parametrization of the bias and resolution parameters in W and Z simulation are illustrated in Figs. 2(a), 3(a), and 4(a). For these, the bias and resolution are defined with respect to the true (propagator) W and Z momenta. The simulated and data-driven bias and resolution parameters in Z events are displayed in Figs. 2(b), 3(b), and 4(b). For these, the bias and resolution are defined with respect to the reconstructed dilepton $p_T$. In Figs. 2(a) and 2(b), the bias parametrization is shown only over the range which determines the fit parameters, but the parametrization describes the data well up to $p_T^W = 300$ GeV. The response matrix is constructed using the following bin edges, expressed in GeV:
(i) Reconstruction-level distribution: ..., 8, 15, 23, 30, 38, 46, 55, 65, 75, 85, 95, 107, 120, 132, 145, 160, 175, 192, 210, 250, 300.
(ii) Unfolded distribution: 0, 8, 23, 38, 55, 75, 95, 120, 145, 175, 210, 300.
The reconstruction-level binning enables more detailed comparisons between data and simulation before unfolding, and allows a more precise background subtraction as a function of $p_T^R$. It has been used in Fig. 1. The bin edges at the unfolded level provide a purity of at least 65% across the $p_T^W$ spectrum, which is large enough to ensure the stability of the unfolding procedure. The bins are still small enough to keep the model dependence of the result, which enters through the assumption of a particular $p_T^W$ shape within each bin, to a subleading contribution to the overall uncertainty (see the description of the systematic uncertainties in Sec. VIII). The purity is defined as the fraction of events where the event falls in the same bin when the bin edges are defined using $p_T^R$ as it does when the bin edges are defined using $p_T^W$. The unfolding of the hadronic recoil is performed by means of the iterative Bayesian algorithm [53], where the $p_T^W$ distribution predicted by the simulation is used as first assumption of the true $p_T^W$ spectrum, and iteratively updated using the observed distribution. This procedure converges after three iterations.
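For illustration, a textbook form of iterative Bayesian unfolding is sketched below. It assumes a response matrix `response[i][j]` giving the probability for an event in truth bin `j` to be reconstructed in bin `i`; whether every detail matches the algorithm of Ref. [53] as used in the analysis is an assumption, and the function name is hypothetical.

```python
def bayes_unfold(data, response, prior, n_iter=3):
    """Iteratively update the truth estimate from the observed spectrum.

    data:     observed counts per reconstructed bin
    response: response[i][j] = P(reco bin i | truth bin j)
    prior:    first assumption for the truth spectrum (any normalization)
    """
    n_reco, n_true = len(response), len(response[0])
    # Efficiency per truth bin: probability to be reconstructed anywhere.
    eff = [sum(response[i][j] for i in range(n_reco)) for j in range(n_true)]
    estimate = list(prior)
    for _ in range(n_iter):
        # Fold the current estimate to the reconstructed level.
        folded = [sum(response[i][k] * estimate[k] for k in range(n_true))
                  for i in range(n_reco)]
        # Bayes' theorem assigns each observed bin back to the truth bins.
        estimate = [
            (estimate[j] / eff[j]) * sum(
                response[i][j] * data[i] / folded[i]
                for i in range(n_reco) if folded[i] > 0.0)
            if eff[j] > 0.0 else 0.0
            for j in range(n_true)
        ]
    return estimate
```

With a diagonal response matrix the estimate equals the data after one iteration. When every truth bin has unit efficiency, the total number of events is conserved at each iteration.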
The statistical uncertainty on the unfolded spectrum is obtained by generating random replicas of the reconstruction-level data. First, the $p_T^R$ distribution from simulation is scaled to have an integral equal to the number of events observed in data. For each trial, the number of events in each bin is fluctuated according to a Poisson distribution with a mean set by the original bin content. The unfolding procedure is used on the fluctuated distribution, and the $p_T^W$ distribution from the same set of simulated events is subtracted from the result. The resulting ensemble of offsets is used to fill a covariance matrix describing the impact of statistical fluctuations on the result, including correlations between the bins introduced by the unfolding procedure.
Systematic uncertainties receive contributions from the quality of the response parametrization approximation, i.e. from the difference between $M_{\mathrm{MC}}$ and $M_{\mathrm{param}}$; from the statistical precision of the data-driven corrections defining $M_{\mathrm{param}}^{\mathrm{corr}}$; and from the unfolding procedure itself. Their estimation is described in Sec. VIII.
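The replica procedure can be sketched as below. Several choices are assumptions made for the sake of a self-contained example: the covariance is taken as the average outer product of the offsets, the Poisson sampler switches to a rounded Gaussian approximation for means above 50, and `unfold` is a hypothetical callable standing in for the full unfolding step.

```python
import math
import random

def poisson(mean, rng):
    """Poisson draw: Knuth's method for small means, Gaussian above 50."""
    if mean <= 0.0:
        return 0
    if mean > 50.0:
        return max(0, int(round(rng.gauss(mean, math.sqrt(mean)))))
    limit, k, p = math.exp(-mean), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def replica_covariance(reco, truth, unfold, n_trials=500, seed=0):
    """Covariance of unfolded results under Poisson fluctuations of reco.

    reco:   simulated reconstructed spectrum scaled to the data yield
    truth:  true spectrum of the same simulated events, same scaling
    unfold: callable mapping a reconstructed spectrum to an unfolded one
    """
    rng = random.Random(seed)
    n = len(truth)
    offsets = []
    for _ in range(n_trials):
        fluctuated = [poisson(m, rng) for m in reco]
        result = unfold(fluctuated)
        offsets.append([result[j] - truth[j] for j in range(n)])
    # Average outer product of the offsets (assumed estimator).
    return [[sum(o[a] * o[b] for o in offsets) / n_trials
             for b in range(n)] for a in range(n)]
```

With an identity unfolding, the diagonal entries approach the Poisson variances of the input bins and the off-diagonal entries fluctuate around zero; a real unfolding introduces nonzero correlations between neighboring bins.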

B. Efficiency correction
The $W \to \ell\nu$ candidate event reconstruction efficiency is subsequently unfolded by dividing the number of events in each bin of $p_T^W$ by the detection efficiency correction factor for that bin. The correction factor accounts for trigger and detection efficiencies, as well as the migration of events in and out of the acceptance due to charged lepton and $E_T^{\mathrm{miss}}$ resolution effects. It is defined as the ratio of the number of reconstructed events passing all selection in each bin to the number of events produced within the fiducial volume in that same bin. Note that any migration between bins has already been accounted for by the hadronic recoil response unfolding. The efficiency correction is based on the ratio calculated from simulated W events, and is corrected for observed differences between simulated and real data in the trigger and reconstruction efficiencies as well as in the lepton momentum and resolution (see Sec. IV). The corrections for discrepancies between data and simulation are applied as a function of the reconstructed lepton kinematics in each bin of $p_T^W$. The fiducial volume in the denominator is defined by the truth-level kinematic requirements $p_T^\ell > 20$ GeV, $|\eta_\ell| < 2.4$, $p_T^\nu > 25$ GeV, and $m_T > 40$ GeV. For the default, propagator-level $p_T^W$ measurement, the lepton kinematics and transverse mass are defined at the QED Born level, i.e., before any final state QED radiation. For the dressed lepton version of the measurement, the charged lepton momentum is the sum of its momentum after all QED FSR and the momenta of all photons radiated within a cone of $\Delta R = 0.2$ around the lepton. The cone size is chosen to match the cone size used for the lepton removal in the definition of $\vec{R}$. The bare lepton version uses only the charged lepton momentum after all QED FSR.
In the electron channel, the efficiency rises from ~60% at low $p_T^W$ to ~80% at $p_T^W \sim 100$ GeV, and falls towards ~70% at the upper end of the spectrum. In the muon channel, the efficiency rises from ~80% to ~90%, then falls to ~80% in the same $p_T^W$ ranges. The efficiency correction carries systematic uncertainties induced by the imperfect modeling of the lepton trigger and reconstruction efficiencies, by the acceptance of the $E_T^{\mathrm{miss}}$ cut, and by the finite statistics and physics assumptions of the signal simulation sample. Their estimation is described in Sec. VIII.

VIII. SYSTEMATIC UNCERTAINTIES
Systematic uncertainties arise from the background subtraction procedure, from the recoil response model and unfolding procedure, and from lepton reconstruction and calibration uncertainties. Theoretical uncertainties also enter, to a lesser extent. Different strategies are used for the various uncertainties according to the nature of the uncertainty and whether it is introduced before, during, or after the hadronic recoil unfolding. Accordingly, the uncertainties are evaluated by using an ensemble of inputs with the nominal response matrix, an ensemble of response matrices with the nominal input, or by simple error propagation, respectively. The uncertainties on this measurement are represented as covariance matrices, so that correlations between the bins can be included.

A. Background subtraction uncertainties
The systematic uncertainties associated with the background subtraction are estimated by generating an ensemble of pseudoexperiments in which the background estimates have been fluctuated within their uncertainties. The full analysis chain is repeated for each pseudoexperiment and the spread of the unfolded results defines the associated uncertainty. Electroweak, top, and QCD multijet contributions are treated separately, except that the luminosity uncertainty is treated as correlated between the electroweak and top backgrounds. Background subtraction is performed before the unfolding, and the unfolding redistributes the background among the $p_T^W$ bins, so the covariance matrices representing the uncertainties on the backgrounds have nonzero off-diagonal elements.
The electroweak and top backgrounds contribute 0.6% (0.4%) to the measurement uncertainty at low $p_T^W$ in the electron (muon) channel, and up to 4% at high $p_T^W$ in both channels. The multijet background in the electron channel contributes ~0.5% uncertainty for $p_T^W < 50$ GeV, which gradually rises to 4% at $p_T^W \sim 200$ GeV, eventually contributing 15% in the highest $p_T^W$ bin. In the muon channel, the multijet background induced uncertainty has a maximum of 2% at $p_T^W \sim 30$ GeV, which corresponds to the peak of the background rate, and contributes ~0.6% on average in the rest of the spectrum.

B. Hadronic recoil unfolding uncertainties
Systematic uncertainties associated with the response matrix are classified in two categories. In the first category, the impact of a given source of uncertainty is estimated by comparing the unfolded distribution obtained with the nominal response matrix to the result obtained with a response matrix reflecting the variation of this source. The statistical component of the difference is assessed by varying a given input of the response matrix construction to generate a set of related variations of the response matrices. Repeating the analysis with these leads to a set of varied unfolded results, and the induced bias is averaged in each bin of the $p_T^W$ distribution. The associated systematic uncertainty is defined from the spread of the distribution of the results, and is taken as a constant percentage across all $p_T^W$ bins, represented as a diagonal covariance matrix.
By comparing results obtained from the initial Monte Carlo response matrix $M_{\mathrm{MC}}$ with results obtained from the parametrized response matrix $M_{\mathrm{param}}$, the response parametrization is found to induce an uncertainty of 2.4% in the electron channel and 2.0% in the muon channel. The input generator bias is estimated by reweighting the true $p_T^W$ distribution given by the PYTHIA sample to the RESBOS prediction, generating the corresponding response matrix and comparing the result to the nominal result, leading to a systematic uncertainty of 1.2% in the electron channel, and 0.9% in the muon channel. Note that the starting assumption for the Bayesian unfolding is simultaneously modified in the same way, so that this uncertainty includes both the effect of modifying the distribution underlying the response matrix and the assumption of a prior for the unfolding. In addition, it was verified that reweighting the input $p_T^W$ assumption according to the actual measurement result and repeating the procedure does not affect the result beyond the uncertainties quoted above. Lepton momentum scale uncertainties also enter through the Z-based recoil response corrections, because $p_T^{\ell\ell}$ is used in place of the true $p_T^Z$, but this amounts to less than 0.2% in both channels. As described above, these numbers are taken constant across the $p_T^W$ spectrum.

The second category deals with the uncertainties associated with the data-driven corrections to the response parametrization. In this case, we generate an ensemble of random correction parameters by sampling from the distribution defined by the statistical uncertainties on the central value of the parameters returned by the fit. For each parameter set the corresponding response matrix is generated. The treatment is then the same as for the background uncertainties: the analysis chain is repeated for each configuration, and the spread of the unfolded bin contents defines the associated uncertainty in each bin.
In this category, the data-driven correction to the recoil bias and resolution induces an uncertainty of ~1.6% for $p_T^W < 8$ GeV, has a local maximum of ~2.6% at $p_T^W = 30$ GeV, and contributes less than 1% in the remaining part of the spectrum. The uncertainty related to the $\Sigma E_T$ rescaling is 0.2% at low $p_T^W$, rising to 1% at the high end of the spectrum. These numbers are valid for both channels, as the data-driven corrections are determined from combined $Z \to ee$ and $Z \to \mu\mu$ samples, as described in Sec. VII A. Finally, the bias from the unfolding itself is found by folding the $p_T^W$ distribution of simulated $W \to \ell\nu$ events passing the reconstruction-level selection using $M_{\mathrm{MC}}$ and then unfolding it using the same response matrix. The original $p_T^W$ distribution is subtracted from the unfolded one, and the size of the bias relative to the original distribution is taken as the systematic uncertainty from the unfolding procedure. The folded distribution is used for $p_T^R$ instead of the found $p_T^R$ distribution to avoid double-counting the statistical uncertainty. The resulting uncertainty is less than 0.5% in all bins, except for the highest-$p_T^W$ bin in the electron channel, where it is 1%.

C. Efficiency correction uncertainties
In the electron channel, the main contributions to the acceptance correction uncertainty are the reconstruction and identification efficiency uncertainty, and the electron energy scale and resolution uncertainties. The identification efficiency contributes 3% to the measurement uncertainty across the $p_T^W$ spectrum. The scale and resolution uncertainties contribute 0.5% at low $p_T^W$, rising to 10% at $p_T^W \sim 100$ GeV, and decreasing to 6% at the high end of the spectrum.
In the muon channel, the trigger efficiency uncertainty contributes 1% across the spectrum. The reconstruction efficiency contributes 0.7% at low $p_T^W$, linearly rising to 2% at $p_T^W \sim 300$ GeV. The scale and resolution uncertainties contribute 0.5% at low $p_T^W$, rising to 2% at $p_T^W \sim 120$ GeV, and decreasing to 1% towards $p_T^W \sim 300$ GeV. The uncertainty associated with the recoil component of $E_T^{\mathrm{miss}}$ (the first term of Eq. (2), minus any clusters associated with an electron) is estimated as above, by generating random ensembles of resolution correction parameters within the precision of the Z-based calibration. For each parameter set in the ensemble, the $E_T^{\mathrm{miss}}$ distribution is regenerated and the corresponding efficiency correction is recalculated. The width of the resulting distribution of efficiency corrections is taken as the uncertainty. This source contributes less than 0.3% across the $p_T^W$ spectrum in both channels.
In both channels, the Monte Carlo statistical precision is 0.5% at low $p_T^W$ and rises to 4% towards $p_T^W \sim 300$ GeV. The generator dependence of the efficiency is estimated by comparing the central values found for PYTHIA and MC@NLO, and found to be smaller than 0.2%, apart from the last bin where it reaches 1%. Finally, following Ref. [2], the PDF induced uncertainty on the efficiency correction is at the level of 0.1% and neglected in this analysis.

A. Electron and muon channel results
The efficiency-corrected distributions resulting from the two unfolding steps are normalized to unity, and the bin contents are divided by the bin width. In the normalization step, uncertainties that are completely correlated across all of the bins, such as the uncertainty on the integrated luminosity, cancel. The resulting normalized differential fiducial cross section, $(1/\sigma_{\mathrm{fid}})(d\sigma_{\mathrm{fid}}/dp_T^W)$, is given in Table I for both the electron and muon channels, together with the statistical and systematic uncertainties. The differential cross section is calculated with respect to three definitions of $p_T^W$ and the fiducial volume, corresponding to different definitions of the true lepton kinematics: the first uses the Born-level kinematics, the second uses the dressed lepton kinematics calculated from the sum of the post-FSR lepton momentum and the momenta of all photons radiated within a cone of $\Delta R = 0.2$, and the third (bare) uses the lepton kinematics after all QED radiation.
Instead of normalizing the efficiency-corrected distributions to unit integral, they can also be divided by the integrated luminosity of the corresponding data to yield the differential fiducial cross section $d\sigma_{\mathrm{fid}}/dp_T^W$. The resulting differential fiducial cross sections, with the fiducial volume defined by the Born-level kinematics, are shown in Fig. 5. Error bars include both statistical and systematic uncertainties, but not the uncertainty on the integrated luminosity, which is common to both measurements.
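The normalization step can be written compactly. This is a minimal sketch with a hypothetical function name; it takes the efficiency-corrected event counts per bin together with the bin edges.

```python
def normalized_differential(counts, edges):
    """Normalize counts to unit integral and divide by the bin widths."""
    total = float(sum(counts))
    return [c / total / (edges[i + 1] - edges[i])
            for i, c in enumerate(counts)]
```

Multiplying every entry of `counts` by a common factor, as a fully correlated luminosity shift would, leaves the output unchanged up to rounding, which is the cancellation described above.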

B. Combination procedure
After correcting the electron and muon $p_T^W$ distributions to the common fiducial volume using the efficiency corrections described in Sec. VII, we combine the resulting differential fiducial cross sections $d\sigma_{\mathrm{fid}}/dp_T^W$ by $\chi^2$ minimization. The combination is based on the distributions with $p_T^W$ defined by the W propagator momentum because QED final state radiation causes differences between the electron and muon momenta that make a consistent combination based on other definitions unfeasible. To build the $\chi^2$, the uncertainties on the two measurements are sorted according to whether they are correlated between the two channels or not, and a joint covariance matrix describing the uncertainty on both measurements is constructed. Using this covariance matrix, we define a $\chi^2$ between the two measurements and a common underlying distribution. This $\chi^2$ is minimized to find the combined measurement, which is the best estimate of the common underlying distribution.

FIG. 5 (color online). Electron and muon fiducial differential cross sections as a function of $p_T^W$. The error bars include all statistical and systematic uncertainties except the 3.4% uncertainty on the integrated luminosity, which is common to the two measurements and cancels in the ratio.

TABLE I. [...]-level definition ("propag."), the analysis baseline, ignores the leptons and takes the W momentum from the propagator. The dressed and bare definitions of $p_T^W$ are calculated using the momenta of the leptons from the W decay. In the dressed case, the charged lepton momentum includes the momenta of photons radiated within a cone of $\Delta R = 0.2$ centered around the lepton. In the bare case, the charged lepton momentum after all QED radiation is used. The factor p is the power of 10 to be multiplied by each of the three cross section numbers for each channel. It has been factorized out for legibility.
Specifically, the 2 to be minimized is defined as where X is the vector of 2N elements containing the two N-bin distributions to be combined, concatenated: X ¼ fX e 1 ; . . . ; X e n ; X 1 ; . . . ; X n g. The vector " X ¼ f " X 1 ; . . . ; " X n ; " X 1 ; . . . ; " X n g contains two copies of the combined measurement f " X i g. The joint covariance matrix C is described in the next paragraph. The 2 minimization is performed analytically, following the prescription in Ref. [54], yielding the f " X i g. The joint covariance matrix C has 2N Â 2N elements and is constructed from four submatrices: The N Â N covariance matrices C e and C are the covariance matrices for the electron and muon measurements, respectively, and contain all sources of uncertainty on the measurements. The off-diagonal blocks C e are identical and reflect the sources of uncertainty that are correlated between the channels. The 2N Â 2N covariance matrix is constructed from the two N Â N matrices for each source of uncertainty individually, and the resulting set of 2N Â 2N matrices is summed. For sources of uncertainty uncorrelated between the channels, the 2N Â 2N covariance matrix is constructed by copying the N Â N matrices to the corresponding diagonal blocks C e and C . For uncertainties that are correlated between the channels, the diagonal blocks are still filled by copying the covariance matrices from the individual channels. The off-diagonal blocks are filled using the assumption that the channels are 100% correlated, so that the correlations between bins are identical for both channels. That determines the correlation matrix, which sets the magnitudes of the covariance matrix entries relative to the magnitude of the diagonal entries. The diagonal entries, which are the squares on the uncer-tainties on each bin, are taken as the geometrical average of the values for the two channels.
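An analytic minimization of this kind can be sketched with the generalized least squares solution $\bar{X} = (A^T C^{-1} A)^{-1} A^T C^{-1} X$, where $A$ stacks two $N \times N$ identity matrices. Whether this coincides in every detail with the prescription of Ref. [54] is an assumption, and the helper names are hypothetical. The matrix inversion uses plain Gauss-Jordan elimination to keep the example self-contained.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def combine(x_e, x_mu, cov):
    """Combine two N-bin measurements given their joint 2N x 2N covariance.

    Returns the combined spectrum and the chi-squared at the minimum.
    """
    n = len(x_e)
    x = list(x_e) + list(x_mu)
    w = invert(cov)  # inverse of the joint covariance matrix
    # A^T W A, with A two stacked identity matrices
    m = [[w[a][b] + w[a][b + n] + w[a + n][b] + w[a + n][b + n]
          for b in range(n)] for a in range(n)]
    # A^T W X
    v = [sum((w[a][k] + w[a + n][k]) * x[k] for k in range(2 * n))
         for a in range(n)]
    m_inv = invert(m)
    xbar = [sum(m_inv[a][b] * v[b] for b in range(n)) for a in range(n)]
    resid = [x[k] - xbar[k % n] for k in range(2 * n)]
    chi2 = sum(resid[i] * w[i][j] * resid[j]
               for i in range(2 * n) for j in range(2 * n))
    return xbar, chi2
```

For a single bin with uncorrelated measurements $10 \pm 1$ and $12 \pm 2$, the result is the familiar inverse-variance weighted mean of 10.4 with $\chi^2 = 0.8$.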
The statistical uncertainties on the unfolded measurements are uncorrelated because the $W \to e\nu$ and $W \to \mu\nu$ candidate data samples are statistically independent. The systematic uncertainties induced by the subtraction of the estimated background are uncorrelated between the channels, except for the uncertainties on the luminosity and predicted cross sections used to normalize the electroweak and top quark backgrounds. Because the same hadronic recoil response matrix is used for both channels, the uncertainties associated with it are fully correlated between the channels, except for the small contribution from the lepton momentum resolution. The efficiency corrections for each channel are independent, so the associated uncertainties are uncorrelated between the channels.
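The source-by-source construction of the joint covariance matrix can be sketched as follows. This is one possible reading of the procedure described above, not the analysis code: the helper `joint_covariance`, the choice to take the bin-to-bin correlation matrix from the electron channel, and all numerical inputs are assumptions made for illustration.

```python
import numpy as np

def joint_covariance(cov_e, cov_mu, correlated):
    """Build the 2N x 2N covariance matrix for one source of uncertainty."""
    cov_e = np.asarray(cov_e, dtype=float)
    cov_mu = np.asarray(cov_mu, dtype=float)
    n = cov_e.shape[0]
    off = np.zeros((n, n))                    # uncorrelated: empty off-diagonal blocks
    if correlated:
        sig_e = np.sqrt(np.diag(cov_e))
        sig_mu = np.sqrt(np.diag(cov_mu))
        # Assumption: bin-to-bin correlations are identical in both channels,
        # so they are read off the electron-channel matrix here.
        corr = cov_e / np.outer(sig_e, sig_e)
        # Diagonal of the off-diagonal block: geometric mean of the two variances.
        sig = np.sqrt(sig_e * sig_mu)
        off = corr * np.outer(sig, sig)
    return np.block([[cov_e, off], [off.T, cov_mu]])

# Hypothetical two-bin inputs: (name, C_e, C_mu, correlated between channels?)
sources = [
    ("statistical", np.diag([1.0, 4.0]), np.diag([4.0, 1.0]), False),
    ("recoil response", np.outer([1.0, 2.0], [1.0, 2.0]),
     np.outer([4.0, 2.0], [4.0, 2.0]), True),
    ("efficiency", np.diag([0.25, 0.25]), np.diag([1.0, 1.0]), False),
]

# The per-source 2N x 2N matrices are summed to give the total C.
total = sum(joint_covariance(ce, cm, flag) for _, ce, cm, flag in sources)
print(total)
```

Only the source flagged as correlated contributes to the off-diagonal blocks. In a fuller treatment, the luminosity and cross-section normalization of the backgrounds would be entered as separate correlated sources.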

C. Combined results and comparison with predictions
The $\chi^2$ minimization yields a $\chi^2/\mathrm{d.o.f.}$ of 13.0/13, demonstrating good agreement between the electron and muon results. The combined differential cross section, normalized to unity, is shown compared to the prediction from RESBOS in Fig. 6. The RESBOS prediction, which combines resummed and fixed-order pQCD calculations, is based on the CTEQ6.6 PDF set [39] and a renormalization and factorization scale of $m_W$. RESBOS performs the fixed-order calculation at NLO ($\mathcal{O}(\alpha_s)$), and corrects the prediction to NNLO ($\mathcal{O}(\alpha_s^2)$) using a k factor calculated as a function of the boson mass, rapidity, and $p_T$ [13][14][15]. Table II gives the same information numerically, including the separate contribution of different classes of uncertainty.
FIG. 6 (color online). Normalized differential cross section obtained from the combined electron and muon measurements, compared to the RESBOS prediction.
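To put the quoted $\chi^2/\mathrm{d.o.f.}$ of 13.0/13 in context, the corresponding $\chi^2$ probability can be estimated with the short Python sketch below. It integrates the $\chi^2$ probability density numerically with Simpson's rule instead of calling a statistics library. The function name `chi2_pvalue` and the step count are illustrative choices, and the routine assumes more than two degrees of freedom so that the density is finite at zero.

```python
import math

def chi2_pvalue(chi2, ndof, steps=20000):
    """Probability of a chi2 at least this large (assumes ndof > 2, steps even)."""
    half = 0.5 * ndof
    log_norm = half * math.log(2.0) + math.lgamma(half)

    def pdf(t):
        # chi2 probability density with ndof degrees of freedom
        if t <= 0.0:
            return 0.0
        return math.exp((half - 1.0) * math.log(t) - 0.5 * t - log_norm)

    h = chi2 / steps
    total = pdf(0.0) + pdf(chi2)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * pdf(i * h)   # Simpson weights
    cdf = total * h / 3.0
    return 1.0 - cdf

p = chi2_pvalue(13.0, 13)
print(f"chi2 probability: {p:.3f}")
```

A probability near one half, as obtained here, is what is expected when two measurements are consistent within their stated uncertainties.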
In Fig. 7, the combined result $(1/\sigma_{\mathrm{fid}})(d\sigma_{\mathrm{fid}}/dp_T^W)$ is compared to a selection of predictions from both pQCD and event generators. The DYNNLO predictions are from version 1.1 of the program [9,10]. The prediction from the MCFM program is produced as a calculation of $d\sigma_{\mathrm{fid}}/dp_T^W$ for W + 1 parton events and uses MCFM version 5.8 [11]. The leading-order calculation for W + 1 parton production is $\mathcal{O}(\alpha_s)$ and the NLO calculation is $\mathcal{O}(\alpha_s^2)$, so the predictions are comparable to other $\mathcal{O}(\alpha_s)$ and $\mathcal{O}(\alpha_s^2)$ predictions of $p_T^W$ for $p_T^W > 5$ GeV, the minimum jet $p_T$ threshold in the calculation. Both of the pQCD calculations are normalized by dividing the prediction in each bin by the inclusive cross section prediction calculated in the same configuration as the differential cross section, and both have the renormalization and factorization scales set to $m_W$. The $\mathcal{O}(\alpha_s)$ predictions use the MSTW2008 NLO PDF set, and the $\mathcal{O}(\alpha_s^2)$ predictions use the MSTW2008 NNLO PDF set [46]. The uncertainty on the pQCD predictions comes mostly from the renormalization and factorization scale dependence, and studies indicate that it is comparable in magnitude to the 10% and 8% observed for $p_T^Z$ predictions at $\mathcal{O}(\alpha_s)$ and $\mathcal{O}(\alpha_s^2)$ in Ref. [2]. The DYNNLO and MCFM predictions do not include resummation effects and are not expected to predict the data well at low $p_T^W$ because of the diverging prediction for vanishing $p_T^W$. Therefore, the lowest bin ($p_T^W < 8$ GeV) is omitted from Fig. 7. [...] similar to the NLO event generators. The $\mathcal{O}(\alpha_s)$ prediction from FEWZ [7,8] is not shown in Fig. 7 but is in agreement with those from DYNNLO and MCFM. The discrepancy between the predictions and the measurement appears when normalizing to the inclusive cross section and would be compensated by a large but unphysical contribution in the first bin. The ratio moves closer to unity in the high $p_T^W$ range. The $\mathcal{O}(\alpha_s^2)$ predictions agree better with the data than those at $\mathcal{O}(\alpha_s)$. They are within 15% of the data for all $p_T^W$.
The predictions of the event generators PYTHIA, POWHEG, ALPGEN, SHERPA, and MC@NLO are based on the simulated samples described in Sec. IV. Since POWHEG and ALPGEN can be interfaced with more than one parton shower implementation, the notations POWHEG + PYTHIA and ALPGEN + HERWIG are used to make the choice explicit. The PYTHIA, RESBOS, SHERPA, and ALPGEN + HERWIG predictions describe the measurement within 20% over the entire range. For $p_T^W < 38$ GeV, the data indicate a softer spectrum than these predictions. For $38 < p_T^W < 120$ GeV, the data distribution exceeds the RESBOS prediction and undershoots the SHERPA prediction, but agrees with the ALPGEN + HERWIG and, to a lesser extent, pure PYTHIA predictions. For $p_T^W > 120$ GeV, PYTHIA and RESBOS agree in predicting a softer spectrum than ALPGEN + HERWIG and SHERPA, but the data provide no significant discrimination among these predictions.
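The normalization of the fixed-order predictions and the ratio-to-data comparisons quoted above can be sketched in a few lines of Python. The helper names `normalized_spectrum` and `within_tolerance`, the three-bin layout, and every number below are hypothetical and serve only to show the arithmetic. In particular, the inclusive cross section is passed in as a separate input, mirroring the statement that it is calculated separately in the same configuration instead of being summed from the bins.

```python
import numpy as np

def normalized_spectrum(dsigma_dpt, sigma_incl):
    """Return (1/sigma) dsigma/dpT in each bin for a given inclusive cross section."""
    return np.asarray(dsigma_dpt, dtype=float) / sigma_incl

def within_tolerance(pred, data, tol):
    """Return the prediction/data ratio and a per-bin flag for |ratio - 1| <= tol."""
    ratio = np.asarray(pred, dtype=float) / np.asarray(data, dtype=float)
    return ratio, np.abs(ratio - 1.0) <= tol

# Hypothetical prediction: dsigma/dpT per bin (arbitrary units per GeV)
# and an inclusive cross section computed separately (arbitrary units).
pred_dsigma = [5.0, 3.0, 1.0]
pred_sigma_incl = 100.0
pred_norm = normalized_spectrum(pred_dsigma, pred_sigma_incl)

# Hypothetical normalized data in the same bins.
data_norm = [0.050, 0.033, 0.009]

ratio, ok = within_tolerance(pred_norm, data_norm, tol=0.20)
print(ratio, ok)
```

Because the normalization is fixed by the inclusive cross section, a mismodeled lowest bin changes the normalized value in every other bin. This is why a deficit at high $p_T^W$ can be traded against the first bin.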
POWHEG + PYTHIA and MC@NLO, the NLO event generators interfaced with parton shower algorithms, provide a reasonable description of the data for $p_T^W < 38$ GeV, but both underestimate the data starting at $p_T^W \approx 38$ GeV, with a deficit gradually increasing to nearly 40% at high $p_T^W$.
Finally, we compare the combined result to the measurement of $(1/\sigma_{\mathrm{fid}})(d\sigma_{\mathrm{fid}}/dp_T^Z)$ described in Ref. [2]. The W and Z have different masses and couple differently to quarks, so the results cannot be directly compared, but the ratios of the measured to predicted distributions for a common model can be used to qualitatively assess the agreement between the two measurements. The ratios of the W and Z distributions in data to their respective RESBOS predictions are overlaid in Fig. 8. In spite of the different techniques and uncertainties characterizing both measurements, the ratios display similar trends as a function of $p_T^V$, the true boson $p_T$.

X. CONCLUSIONS
The W transverse momentum differential cross section has been measured for $p_T^W < 300$ GeV in $W \to \ell\nu$ events reconstructed in the electron and muon channels using the ATLAS detector. The $W \to \ell\nu$ candidate events are selected from pp collision data produced at $\sqrt{s} = 7$ TeV, corresponding to approximately 31 pb$^{-1}$ from the 2010 run of the LHC.
The measurement is compared to a selection of predictions. The ALPGEN + HERWIG, PYTHIA, RESBOS, and SHERPA predictions match the data within 20% over the entire $p_T^W$ range. MC@NLO provides the closest description of the data for $p_T^W < 38$ GeV, but MC@NLO and POWHEG + PYTHIA both underestimate the data at higher $p_T^W$. Fixed-order pQCD predictions from the DYNNLO and MCFM programs agree very well with each other. At $\mathcal{O}(\alpha_s)$ they predict fewer events at high $p_T^W$ than are observed, but the agreement with the measured distribution is significantly improved by the $\mathcal{O}(\alpha_s^2)$ calculations.
A comparison of the W and Z data relative to the prediction from a given theoretical framework displays similar features across the measured transverse momentum range, supporting the expected universality of strong interaction effects in W and Z production.
Although the measurement is limited by systematic uncertainties over most of the spectrum, the dominant uncertainty sources can be constrained with more integrated luminosity. With the integrated luminosity available from the 2011 run now in progress, future measurements should be able to measure $d\sigma_{\mathrm{fid}}/dp_T^W$ to at least double the current range in $p_T^W$. With improved statistical and systematic uncertainties, it should also be possible to measure the ratios of the W to Z and $W^+$ to $W^-$ differential cross sections as functions of the boson $p_T$, which will further test the predictions of QCD.

ACKNOWLEDGMENTS
We thank CERN for the very successful operation of the LHC, as well as the support staff from our institutions without whom ATLAS could not be operated efficiently. We acknowledge the support of ANPCyT, Argentina; YerPhI, Armenia; ARC, Australia; BMWF, Austria; ANAS, Azerbaijan; SSTC, Belarus; CNPq and FAPESP, Brazil; NSERC, NRC and CFI, Canada; CERN;
[3] ATLAS Collaboration, J. High Energy Phys. 12 (2010) 060.
[5] The origin of the ATLAS coordinate system is at the nominal pp interaction point in the geometrical center of the detector. The z axis points along the anticlockwise beam direction, and the azimuthal and polar angles $\phi$ and $\theta$ are defined in the conventional way, with $\phi = 0$ (the x axis) pointing from the origin to the center of the LHC ring. The pseudorapidity is defined as $\eta = -\ln\tan(\theta/2)$. The transverse momentum $p_T$, the transverse energy $E_T$, and the transverse missing energy $E_T^{\mathrm{miss}}$ are defined in the x-y plane.