Measurement of the t-channel single top quark production cross section in pp collisions at sqrt(s) = 7 TeV

Electroweak production of the top quark is measured in pp collisions at sqrt(s) = 7 TeV, using a dataset collected with the CMS detector at the LHC and corresponding to an integrated luminosity of 36 inverse picobarns. With an event selection optimized for t-channel production, two complementary analyses are performed. The first one exploits the special angular properties of the signal, together with background estimates from data. The second approach uses a multivariate analysis technique to probe the compatibility with signal topology expected from electroweak top quark production. The combined measurement of the cross section is 83.6 +/- 29.8 (stat.+syst.) +/- 3.3 (lumi.) pb, consistent with the standard model expectation.


Electroweak theory predicts three mechanisms for single top quark production in hadron-hadron collisions: t-channel, s-channel, and tW (or W-associated) production. Single-top events have been observed by the D0 and CDF experiments at the Tevatron $p\bar{p}$ collider [1][2][3], and first measurements of individual channels have recently been reported [4,5]. In proton-proton collisions at 7 TeV, t-channel single top quark production (Fig. 1) has the largest cross section and the cleanest final-state topology, because of the presence of a light jet recoiling against the single top quark. Next-to-leading order (NLO) computations with resummation of collinear and soft-gluon corrections at next-to-next-to-leading logarithmic accuracy predict $\sigma_t = 64.3^{+2.1}_{-0.7}\,{}^{+1.5}_{-1.7}$ pb [6], for a top mass of $m_t = 173$ GeV/$c^2$ and with parton distribution functions (PDFs) as given in Ref. [7]. The first uncertainty comes from doubling and halving the renormalization and factorization scales and the second from the PDF uncertainty at the 90% confidence level.
This Letter presents the first measurement of the t-channel single top quark production cross section in pp collisions at $\sqrt{s} = 7$ TeV in the decay channels $t \to e\nu b$, $t \to \mu\nu b$, and $t \to \tau\nu b$ with leptonic $\tau$ decays. Two complementary measurements are performed. The first analysis exploits two angular observables sensitive to t-channel single top quark production: the noncentral pseudorapidity distribution of the light jet, and the cosine of the angle between this jet and the final-state lepton, in the reconstructed top-quark rest frame. A multivariate analysis technique with boosted decision trees (BDT) [8,9] is used in the second method, which probes the overall compatibility of the signal event candidates with the event topology of electroweak top quark production. Hereafter, these analyses will be referred to as the 2D and BDT analyses, respectively.
Both analyses use a data sample corresponding to an integrated luminosity of $35.9 \pm 1.4$ pb$^{-1}$ [10], collected by the Compact Muon Solenoid (CMS) detector [11] operating at the Large Hadron Collider (LHC). The central feature of the CMS detector is a superconducting solenoid providing a field of 3.8 T. Located within the solenoid are the silicon pixel and strip tracker, the crystal electromagnetic calorimeter and the brass/scintillator hadron calorimeter. Muons are measured in gas-ionisation detectors embedded in the steel return yoke. In addition to the barrel and endcap detectors, a quartz-fiber Cherenkov detector extends the jet acceptance to $|\eta| = 5$. The pseudorapidity is defined as $\eta = -\ln[\tan(\theta/2)]$, where $\theta$ is the polar angle of the particle or jet trajectory with respect to the counterclockwise beam direction.
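The pseudorapidity definition above can be sketched in a few lines of Python. The function names are illustrative only and are not taken from any CMS software; the inverse relation $\theta = 2\arctan(e^{-\eta})$ follows directly from the definition.

```python
import math


def pseudorapidity(theta: float) -> float:
    """Return eta = -ln(tan(theta / 2)) for a polar angle theta in radians."""
    if not 0.0 < theta < math.pi:
        raise ValueError("theta must lie strictly between 0 and pi")
    return -math.log(math.tan(theta / 2.0))


def polar_angle(eta: float) -> float:
    """Invert the definition: theta = 2 * atan(exp(-eta))."""
    return 2.0 * math.atan(math.exp(-eta))


# Polar angle (in degrees) at the edge of the forward jet acceptance, |eta| = 5.
edge_deg = math.degrees(polar_angle(5.0))
```

With this definition, $|\eta| = 5$ corresponds to a polar angle of less than one degree from the beam axis, while $\eta = 0$ is perpendicular to it.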
Events are selected by requiring the presence of at least one muon or electron having high transverse momentum ($p_T$). The particle flow (PF) algorithm described in Ref. [12] performs a global event reconstruction and provides the full list of particles identified as electrons, muons, photons, charged and neutral hadrons. A fully reconstructed isolated muon (electron) candidate originating from the leading primary vertex is required [13] with $p_T > 20\,(30)$ GeV/$c$ and $|\eta| < 2.1\,(2.5)$, and a veto is applied on additional leptons passing lower thresholds.
Jets are reconstructed using the anti-$k_T$ algorithm [14] with a distance parameter of 0.5, clustering particles identified by the PF algorithm. Jets within the full calorimeter acceptance are considered, with $p_T > 30$ GeV/$c$ after corrections for the jet energy scale, as determined from simulations and collision data [15]. The BDT analysis first identifies isolated leptons, which are then excluded from the jet clustering step. In the 2D analysis, possible jet-lepton ambiguities are resolved on the basis of the distance $\Delta R \equiv \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$ between the reconstructed jet and the nearest lepton. The event is accepted for further analysis only if exactly two jets are reconstructed.
In order to reduce the large background from W + light partons, we apply a b-tagging algorithm [16] that calculates the signed 3D impact-parameter (IP) significance (IP/$\sigma_{\mathrm{IP}}$) of all the tracks associated with the jet that pass tight quality criteria. The tracks are ordered by decreasing IP/$\sigma_{\mathrm{IP}}$, and a tight selection threshold is applied on the impact-parameter significance of the third track in the list. This threshold corresponds to a b-jet identification efficiency of ∼40% and a misidentification rate of ∼0.1%, determined in data as a function of $p_T$ and $\eta$ [16]. The 2D analysis exploits the expectation that most of the signal events, even in the 2 → 3 process, have only one b quark inside the tracking acceptance ($|\eta| < 2.4$). Events are rejected if the jet failing the tight threshold passes a loose threshold on the IP significance of the second track. The loose threshold corresponds to an efficiency and misidentification rate of about 80% and 10%, respectively. The BDT analysis applies no veto on the second b-tagged jet, and rejects events where the jets are back-to-back, which are found to be poorly reproduced by the W + jets simulation. To further suppress contributions from processes where the muon (electron) does not come from the decay of a W boson, we require a transverse mass of the W boson $M_T > 40\,(50)$ GeV/$c^2$, where the missing transverse energy ($E_T^{\mathrm{miss}}$) from the PF algorithm is used as a measurement of the $p_T$ of the undetected neutrino.
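The two selection ingredients above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the numerical tagging thresholds are not given in the text and are left as parameters, and the transverse mass uses the standard massless-lepton expression $M_T = \sqrt{2\,p_T^{\ell} E_T^{\mathrm{miss}} (1 - \cos\Delta\phi)}$, which the text does not spell out.

```python
import math
from typing import Optional, Sequence


def nth_track_significance(ip_significances: Sequence[float], n: int) -> Optional[float]:
    """Return the n-th largest signed IP/sigma_IP (n = 1 is the largest).

    Returns None when the jet has fewer than n selected tracks.
    """
    ordered = sorted(ip_significances, reverse=True)
    return ordered[n - 1] if len(ordered) >= n else None


def passes_tag(ip_significances: Sequence[float], n: int, threshold: float) -> bool:
    """Tag the jet if its n-th track exceeds the threshold.

    In the text, n = 3 is used for the tight selection and n = 2 for the
    loose one; the threshold values themselves are assumptions of the caller.
    """
    discriminator = nth_track_significance(ip_significances, n)
    return discriminator is not None and discriminator > threshold


def transverse_mass(lep_pt: float, lep_phi: float, met: float, met_phi: float) -> float:
    """W transverse mass for a massless lepton, in the units of the inputs."""
    return math.sqrt(max(0.0, 2.0 * lep_pt * met * (1.0 - math.cos(lep_phi - met_phi))))
```

For example, a jet with track significances `[0.5, 6.2, 4.1, 3.9, -1.0]` has a third-track discriminator of 3.9, so it is tagged for any placeholder threshold below that value.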
The 2D analysis selects 112 (72) events in the muon (electron) decay channel, while the BDT analysis selects 139 (82). In both analyses a signal purity of around 18% (16%) is expected in the muon (electron) decay channel. The main backgrounds are $t\bar{t}$, $Wb\bar{b}$, W + light-partons, Wc, tW, and processes where the lepton does not originate from a W/Z, hereafter called QCD events.
The t-channel events from Monte Carlo simulation used in this study have been generated with the MADGRAPH 4.4 event generator [17]. To give a fair approximation of the full next-to-leading order properties of the signal, we combine the dominant NLO contribution (2→3 diagram $qg \to q' t\bar{b}$ and its charge conjugate) with the leading order diagram (2→2, $qb \to q' t$) by a matching procedure based on Ref. [18]. MADGRAPH is also used for $t\bar{t}$, single top in the s and tW channels, and W/Z + jets. The remaining background samples were simulated using PYTHIA 6.4.22 [19]; these include di-boson production (WW, WZ, ZZ), γ + jets, multi-jet QCD enriched in events with electrons or muons coming from the decay of b and c quarks or muons from the decay of long-lived hadrons, and particles with large probability to leave a high-energy electromagnetic deposit. The CTEQ 6.6 PDF sets [20] are used for all simulated samples. All generated events undergo a full simulation of the detector response based on GEANT4 [21].
The NLO theoretical prediction is used to normalize the single-top production in the s and tW channels [22,23] and di-boson processes [24]. The $t\bar{t}$ cross section is normalized to 150 pb, with uncertainty constrained to the result of a dedicated analysis. The same analysis constrains the $VQ\bar{Q}$ (V = W, Z and Q = b, c) and Wc components, obtaining in particular a factor of 2 ± 1 for $Wb\bar{b}$ over the LO prediction.
The QCD yield is estimated from the same data set by a maximum likelihood fit to the $M_T$ distribution after all other selection criteria have been applied. The $M_T$ distribution for QCD events is taken from a control sample obtained by inverting the lepton isolation requirement. The inverted requirement rejects most of the signal-like events (single top, W/Z + jets, $t\bar{t}$), leaving a QCD-dominated sample. The distribution for the sum of all non-QCD processes is taken from simulation. The uncertainty on this estimate is chosen conservatively, so as to cover the differences observed when varying the fit range and the QCD shape.
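A binned template fit of this kind can be sketched as below. The text does not say how the likelihood is maximized, so this sketch makes its own choice: an EM-style fixed-point iteration for the extended Poisson likelihood with two floating yields, picked here only because it needs no external minimizer. The function names and the toy histograms used to exercise it are hypothetical.

```python
from typing import List, Sequence, Tuple


def normalise(hist: Sequence[float]) -> List[float]:
    """Scale a histogram to unit area so it can serve as a shape template."""
    total = float(sum(hist))
    return [h / total for h in hist]


def fit_two_templates(
    data: Sequence[float],
    qcd_shape: Sequence[float],
    other_shape: Sequence[float],
    max_iter: int = 10000,
    tol: float = 1e-10,
) -> Tuple[float, float]:
    """Fit data bins as n_qcd * qcd + n_other * other (both yields floating).

    Each update n_k <- n_k * sum_i d_i t_ki / mu_i has the maximum of the
    binned Poisson likelihood as its fixed point.
    """
    qcd = normalise(qcd_shape)
    other = normalise(other_shape)
    n_qcd = n_other = 0.5 * sum(data)  # start from an even split
    for _ in range(max_iter):
        mu = [n_qcd * q + n_other * o for q, o in zip(qcd, other)]
        new_qcd = n_qcd * sum(d * q / m for d, q, m in zip(data, qcd, mu) if m > 0)
        new_other = n_other * sum(d * o / m for d, o, m in zip(data, other, mu) if m > 0)
        converged = abs(new_qcd - n_qcd) + abs(new_other - n_other) < tol
        n_qcd, n_other = new_qcd, new_other
        if converged:
            break
    return n_qcd, n_other
```

In an analysis like the one described, `qcd_shape` would come from the anti-isolated control sample and `other_shape` from simulation; the sketch returns only the best-fit yields and does not estimate their uncertainties.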
The BDT analysis normalizes the result of the W + jets simulation to the inclusive W cross section at NNLO [24], while collision data are used in the 2D analysis to extract the normalization of the W + light-partons background. Two control samples are used, orthogonal to the standard selection. Control sample 'region-A', dominated by the W + light-partons background, is defined by the requirement of one isolated lepton and exactly two jets, one of which is required to be within the tracker acceptance and with at least two tracks satisfying the quality selection of the b-tagging algorithm. Both jets should fail the tight b-tagging selection. A second control sample, 'region-B', is defined as a subset of the former where at least one jet passes the loose b-tagging selection although it fails the tight one. In both samples a fit of the $M_T$ distribution is performed, allowing the QCD and W + light-partons backgrounds to float, while all other processes, including heavy-flavour contributions and the t-channel signal, are constrained to their expected values. A scale factor of 1.27 in the muon and 1.05 in the electron decay channel is observed between the number of W + light-partons events obtained from the fit in sample region-B and the predictions from simulation. These scale factors are used to obtain the central value of the predicted background. A ±30% (±20%) uncertainty is assigned on the muon (electron) scale factor, covering the statistical uncertainty from the fit, the difference between the background predictions obtained from the two control samples, region-A and region-B, and the difference between data and simulation results for both samples. The normalization of the Z + jets background is rescaled by the same factor as that for the W + light-partons background.
A top-quark candidate is reconstructed in each event by pairing the b-tagged jet with a W-boson candidate. The latter is reconstructed by imposing the W-boson mass as a kinematic constraint, leading to a quadratic equation in the longitudinal neutrino momentum, $p_{z,\nu}$. When two real solutions are found, the one with the smallest $|p_{z,\nu}|$ is taken; for complex solutions the imaginary component is eliminated by modifying $E^{\mathrm{miss}}_{T,x}$ and $E^{\mathrm{miss}}_{T,y}$ independently, so as to give $M_T = M_W$ [25].
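The W-mass constraint can be made concrete with a short sketch. For a massless lepton and neutrino, defining $a = M_W^2/2 + \vec{p}_T^{\,\ell}\cdot\vec{E}_T^{\mathrm{miss}}$, the constraint gives $p_{z,\nu} = \left[a\,p_{z,\ell} \pm E_\ell\sqrt{a^2 - (p_T^{\ell})^2 (E_T^{\mathrm{miss}})^2}\right]/(p_T^{\ell})^2$. The code below is an illustration under assumptions: the function name and the nominal W mass of 80.4 GeV/$c^2$ are chosen here, and for a negative discriminant it simply returns the real part of the solution, which is a simpler substitute for the $E_T^{\mathrm{miss}}$ modification described in the text.

```python
import math
from typing import List, Tuple

M_W = 80.4  # GeV/c^2, nominal W mass assumed for this sketch


def neutrino_pz(
    lep_px: float, lep_py: float, lep_pz: float,
    met_x: float, met_y: float, m_w: float = M_W,
) -> Tuple[float, List[float]]:
    """Return (chosen p_z, list of real solutions) from the W-mass constraint."""
    lep_pt2 = lep_px ** 2 + lep_py ** 2
    lep_e = math.sqrt(lep_pt2 + lep_pz ** 2)  # massless-lepton energy
    met2 = met_x ** 2 + met_y ** 2
    a = 0.5 * m_w ** 2 + lep_px * met_x + lep_py * met_y
    disc = a * a - lep_pt2 * met2
    if disc < 0.0:
        # Complex solutions (M_T > M_W). Simplification: keep the real part
        # only, instead of adjusting the missing-energy components.
        return a * lep_pz / lep_pt2, []
    root = lep_e * math.sqrt(disc)
    solutions = [(a * lep_pz + root) / lep_pt2, (a * lep_pz - root) / lep_pt2]
    # Two real solutions: take the one with the smallest |p_z|.
    return min(solutions, key=abs), solutions
```

The chosen $p_{z,\nu}$, together with $E_T^{\mathrm{miss}}$, defines the neutrino four-momentum that is added to the lepton and the b-tagged jet to form the top-quark candidate.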
In the 2D analysis a two-dimensional maximum likelihood fit is performed. One of the two fit variables is the cosine of the angle $\theta^*$ between the direction of the outgoing lepton and the spin axis, approximated by the direction of the untagged jet, in the top-quark rest frame [26,27]. This observable has a distinct slope in signal events, coming from the almost 100% polarization of the top quark due to the V − A structure of the electroweak interaction [28]. This property holds true also in many theories beyond the standard model (SM) [29]. The other fit variable is the pseudorapidity of the untagged jet, $\eta_{\mathrm{light\,jet}}$, interpreted as the light-quark jet recoiling against the single top, whose characteristic $\eta$ distribution allows a discrimination against the typically central jets from the main background processes. The distributions in $\cos\theta^*$ and $\eta_{\mathrm{light\,jet}}$ are shown in Fig. 2 for events passing the 2D selection.
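The angular observable can be illustrated with a small sketch that boosts the lepton and the untagged jet into the rest frame of the reconstructed top quark and takes the cosine of the angle between them. Four-vectors are plain `(E, px, py, pz)` tuples and the function names are hypothetical; this is not the analysis code.

```python
import math
from typing import Tuple

FourVector = Tuple[float, float, float, float]  # (E, px, py, pz)


def boost_to_rest_frame(p: FourVector, parent: FourVector) -> FourVector:
    """Lorentz-boost p into the rest frame of the parent four-vector."""
    e, px, py, pz = p
    pe, ppx, ppy, ppz = parent
    bx, by, bz = ppx / pe, ppy / pe, ppz / pe  # parent velocity in units of c
    b2 = bx * bx + by * by + bz * bz
    if b2 == 0.0:
        return p  # parent already at rest
    gamma = 1.0 / math.sqrt(1.0 - b2)
    bp = bx * px + by * py + bz * pz
    coeff = (gamma - 1.0) * bp / b2 - gamma * e
    return (gamma * (e - bp), px + coeff * bx, py + coeff * by, pz + coeff * bz)


def cos_theta_star(lepton: FourVector, light_jet: FourVector, top: FourVector) -> float:
    """Cosine of the lepton / untagged-jet angle in the top-quark rest frame."""
    _, lx, ly, lz = boost_to_rest_frame(lepton, top)
    _, jx, jy, jz = boost_to_rest_frame(light_jet, top)
    norm = math.sqrt(lx * lx + ly * ly + lz * lz) * math.sqrt(jx * jx + jy * jy + jz * jz)
    return (lx * jx + ly * jy + lz * jz) / norm
```

A quick sanity check of the boost is that the top four-vector, boosted into its own rest frame, has zero momentum and an energy equal to its mass.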
The inputs to the fit are the distributions for signal and backgrounds in the $\cos\theta^*$-$\eta_{\mathrm{light\,jet}}$ plane, separately in the muon and electron decay channels. The overall background is allowed to float unconstrained in the fit, while its relative components are fixed according to the background estimates. The QCD and W + light-partons shapes are taken from the anti-isolated and region-A control samples described above, respectively, while all others are taken from the simulation.
The BDT method combines a given set of observables into one single classifier variable bdt. A total of 37 observables have been chosen. Their selection has been inspired by the D0 analysis [30] and optimised for the LHC kinematics. The most discriminant ones are the lepton momentum, the mass of the system formed by the reconstructed W boson and the two jets, the $p_T$ of the system formed by the two jets, the $p_T$ of the jet passing tight b-tagging requirements, and the reconstructed top-quark mass. The validity of the description of all the input variables in the simulation has been checked using a Kolmogorov-Smirnov test in a W-enriched control sample with no b-tagged jet, shown in Fig. 3 (top). The bdt classifier has been validated both in simulation and in data: negligible differences are found by comparing its distribution for signal events with MADGRAPH, SINGLETOP [18], and MC@NLO 3.4 [31], and for $t\bar{t}$ events with MADGRAPH, PYTHIA and MC@NLO. In the W-enriched control sample the distribution of bdt from the simulation is statistically compatible with data.
The cross section is extracted from binned bdt distributions using a Bayesian approach. The normalizations of the backgrounds and the other systematic uncertainties are treated as nuisance parameters. The measured distribution of the classifier bdt is shown in Fig. 3 (bottom).
The following sources of systematic uncertainties are common to both analyses: background normalization; jet energy scale [15], propagated coherently to the $E_T^{\mathrm{miss}}$ measurement; calibration of the unclustered energy deposits contributing to $E_T^{\mathrm{miss}}$, varied by ±10%; b-tagging and mistagging efficiencies [16]; modeling of the signal and of the main backgrounds; and a 4% uncertainty on the integrated luminosity [10].
The uncertainty on the signal model is estimated by comparing MADGRAPH and SINGLETOP events with different fragmentation models. The uncertainty on the $t\bar{t}$ and W/Z + jets models is determined by comparing simulated samples with varied renormalization and factorization scales (between half and double the nominal value, independently for $t\bar{t}$ and for W/Z + jets), initial- and final-state radiation parameters, and two different fragmentation models.
The impact of pile-up is estimated by comparing the default simulated samples with no pile-up and dedicated samples where minimum bias interactions are superimposed with a probability distribution roughly corresponding to the one observed in the overall 2010 dataset. The shapes of the bdt classifier and of both variables used in the 2D analysis are negligibly affected.
In the 2D analysis a conservative systematic uncertainty is assigned to the degree of correlation between $\eta_{\mathrm{light\,jet}}$ and $\cos\theta^*$ (estimated as 6% from simulation) by comparing to the result obtained using the product of uncorrelated one-dimensional distributions for the signal. The W + light-partons background shapes in $\eta_{\mathrm{light\,jet}}$ and $\cos\theta^*$ are extracted from data in the 2D analysis, and studies with simulated events show that the shapes extracted from the control sample are statistically consistent with those in the signal region for the same process. Nevertheless, a small difference is observed in the $\eta_{\mathrm{light\,jet}}$ shapes in the two selections for the Wc process, and we conservatively consider this difference as a systematic uncertainty on all W + jets processes.
The efficiencies of the muon and electron triggers, identification, and isolation for the 2D selection have been evaluated from data using dilepton events at the Z peak [13]. The uncertainties on these efficiencies have a negligible effect on this analysis.
The impact of each individual source of uncertainty on both analyses has been estimated with an ensemble of pseudoexperiments. The dominant systematic uncertainty on the cross section determination comes from the b-tagging efficiency, known within ±15%, because of its large effect on the signal acceptance. Nevertheless, this source has a negligible effect on the shapes of the final discriminant variables in both analyses. Other important systematic uncertainties come from the signal model, the factorization/renormalization scale for W/Z + jets, the jet
Table 1 shows the cross section measured by both analyses in each decay channel, corrected for acceptance and branching ratios. In the muon + electron combination all systematic uncertainties are considered fully correlated, with the exception of the uncertainty on multi-jet QCD obtained from data. All measurements are consistent with each other and with the SM expectation.
Under the assumption that all uncertainties are Gaussian and symmetric, which is fulfilled by the dominant uncertainties, the 2D and BDT cross section measurements are combined with the BLUE technique [32], taking into account a statistical correlation of 51% estimated with pseudoexperiments, and treating all the systematic uncertainties as fully correlated with the exception of those coming from estimates based on data. The combined result is $\sigma_{\mathrm{exp}} = 83.6 \pm 29.8$ (stat. + syst.) $\pm\ 3.3$ (lumi.) pb, where the BDT analysis contributes with the largest weight (89%).
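For two measurements, the best linear unbiased estimate has a closed form, sketched below. This is a simplified illustration, not the combination performed in the analysis: it takes a single total correlation coefficient `rho` between the two measurements, whereas the text builds the covariance from separate statistical and systematic components. The function name and the numbers used to exercise it are placeholders and are not the measured values.

```python
import math
from typing import Tuple


def blue_two(
    x1: float, s1: float, x2: float, s2: float, rho: float
) -> Tuple[float, float, Tuple[float, float]]:
    """Combine two correlated Gaussian measurements.

    Returns (combined value, combined uncertainty, (w1, w2)), where the
    weights minimize the variance of w1 * x1 + w2 * x2 subject to w1 + w2 = 1.
    """
    cov = rho * s1 * s2
    denom = s1 * s1 + s2 * s2 - 2.0 * cov
    if denom <= 0.0:
        raise ValueError("degenerate covariance: measurements are not distinguishable")
    w1 = (s2 * s2 - cov) / denom
    w2 = (s1 * s1 - cov) / denom
    combined = w1 * x1 + w2 * x2
    variance = w1 * w1 * s1 * s1 + 2.0 * w1 * w2 * cov + w2 * w2 * s2 * s2
    return combined, math.sqrt(variance), (w1, w2)


# Placeholder inputs (arbitrary units), chosen only to exercise the formula.
value, error, weights = blue_two(10.0, 2.0, 12.0, 1.0, 0.25)
```

The more precise measurement receives the larger weight, and a strong positive correlation can push the weight of the less precise one towards zero or below, which is the qualitative behaviour behind one analysis dominating a combination.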
The expected and observed significances, including systematic uncertainties, are estimated with an ensemble of pseudoexperiments. The probability for the predicted background distributions to fluctuate up to the observed data corresponds to 3.7 (3.5) Gaussian standard deviations in the 2D (BDT) analysis, combining the electron and muon decay channels, while $2.1^{+1.0}_{-1.1}$ ($2.9^{+1.0}_{-0.9}$) are expected when assuming the SM t-channel production cross section. The combined significance is well approximated by the BDT significance of 3.5 Gaussian standard deviations.
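The correspondence between a background-fluctuation probability and a number of Gaussian standard deviations can be sketched as follows. The one-sided tail convention is an assumption of this sketch; the text does not state which convention is used, and the function names are illustrative.

```python
import math
from statistics import NormalDist


def p_value_from_z(z: float) -> float:
    """One-sided Gaussian tail probability beyond z standard deviations."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def z_from_p_value(p: float) -> float:
    """Inverse mapping: number of standard deviations for a one-sided p-value."""
    return NormalDist().inv_cdf(1.0 - p)


# Tail probabilities for the observed significances quoted in the text.
p_2d = p_value_from_z(3.7)
p_bdt = p_value_from_z(3.5)
```

Under this convention, significances of 3.5 and 3.7 standard deviations both correspond to background-only probabilities of order $10^{-4}$.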
The single-top cross section measurement can be used as a test of the CKM matrix unitarity [33] under the assumption that $|V_{td}|$ and $|V_{ts}|$ are much smaller than $|V_{tb}|$, and therefore that $|V_{tb}| = \sqrt{\sigma_{\mathrm{exp}}/\sigma_{\mathrm{th}}}$, where $\sigma_{\mathrm{th}}$ is the SM prediction under the $|V_{tb}| = 1$ assumption. Using the prior knowledge that $0 \le |V_{tb}|^2 \le 1$, at the 95% confidence level we infer the lower bound $|V_{tb}| > 0.62$ (0.68) from the 2D (BDT) analysis, respectively.
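As a worked illustration of the relation $|V_{tb}| = \sqrt{\sigma_{\mathrm{exp}}/\sigma_{\mathrm{th}}}$, the sketch below inserts the combined cross section and the theoretical prediction quoted in the text. The uncertainty uses first-order propagation of the experimental error only; it ignores the theory uncertainty and the constraint $0 \le |V_{tb}|^2 \le 1$, so it does not reproduce the lower bounds quoted above. Names are hypothetical.

```python
import math

SIGMA_EXP = 83.6                       # pb, combined measurement from the text
SIGMA_EXP_ERR = math.hypot(29.8, 3.3)  # pb, (stat.+syst.) and lumi. in quadrature
SIGMA_TH = 64.3                        # pb, SM prediction for |V_tb| = 1


def vtb_from_cross_sections(sigma_exp: float, sigma_th: float) -> float:
    """|V_tb| under the assumption |V_td|, |V_ts| << |V_tb|."""
    return math.sqrt(sigma_exp / sigma_th)


def vtb_uncertainty(sigma_exp: float, sigma_exp_err: float, sigma_th: float) -> float:
    """First-order propagation: delta|V| = |V| * delta_sigma / (2 * sigma)."""
    return 0.5 * vtb_from_cross_sections(sigma_exp, sigma_th) * sigma_exp_err / sigma_exp


vtb = vtb_from_cross_sections(SIGMA_EXP, SIGMA_TH)
vtb_err = vtb_uncertainty(SIGMA_EXP, SIGMA_EXP_ERR, SIGMA_TH)
```

Because the cross section scales with $|V_{tb}|^2$, the relative uncertainty on $|V_{tb}|$ is half the relative uncertainty on the measured cross section.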
In summary, we confirm the Tevatron observation of single top quark production and present the first measurement of the t-channel single top quark production cross section in pp collisions at $\sqrt{s} = 7$ TeV, finding good agreement with the SM prediction [6].
We wish to congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC machine. We thank the technical and administrative staff at CERN and other CMS institutes, and acknowledge support from: