Measurement of decay-time-dependent $\textit{CP}$ violation in $B^0 \to J/\psi K^0_S$ decays using 2019-2021 Belle II data

We report a measurement of the mixing-induced and direct $\textit{CP}$ violation parameters $S_{\text{CP}}$ and $A_{\text{CP}}$ from $B^0 \to J/\psi K^0_S$ decays reconstructed by the Belle II experiment at the SuperKEKB asymmetric-energy electron-positron collider. The data, collected at the center-of-mass energy of the $\Upsilon(4S)$ resonance, correspond to $190\,\text{fb}^{-1}$ of integrated luminosity. We measure $S_{\text{CP}} = 0.720\pm0.062\pm0.016$ and $A_{\text{CP}} = 0.094\pm0.044^{+0.042}_{-0.017}$, where the first uncertainties are statistical and the second systematic. In the Standard Model, $S_{\text{CP}}$ equals $\sin(2\phi_1)$ to a good approximation.

The results, based on 33 000 signal decays reconstructed in a data sample corresponding to $190\,\text{fb}^{-1}$, are $\tau_{B^0} = (1.499 \pm 0.013 \pm 0.008)\,\text{ps}$ and $\Delta m_d = (0.516 \pm 0.008 \pm 0.005)\,\text{ps}^{-1}$, where the first uncertainties are statistical and the second are systematic. These results are consistent with the world-average values.
Here we report a new measurement of $\tau_{B^0}$ and $\Delta m_d$ using hadronic decays of $B^0$ mesons reconstructed in a $190\,\text{fb}^{-1}$ data set collected by the Belle II experiment at the SuperKEKB asymmetric-energy $e^+e^-$ collider. The data were collected between 2019 and 2021. The $B^0$ mesons are produced in the $e^+e^- \to \Upsilon(4S) \to B\bar{B}$ process, where $B$ indicates a $B^0$ or a $B^+$. Our data set contains approximately 200 million such events. Our measurement tests the ability of Belle II to precisely measure $B^0$ meson decay times and to identify the initial flavor of the decaying $B^0$; such capabilities are crucial for measuring decay-time-dependent $\textit{CP}$ violation and determining $\phi_1$ and $\phi_2$, two of the three angles of the $B^0$ CKM unitarity triangle [12]. Examples of measurements of $\phi_1$ and $\phi_2$ are found in Refs. [13,14].
The flavor of a neutral $B^0$ or $\bar{B}^0$ meson oscillates with frequency $\Delta m_d$ before it decays. The probability density of a $B$ initially being in a particular flavor state and decaying after time $\Delta t$ in the same flavor state ($q_f = +1$) or in the opposite flavor state ($q_f = -1$) is
$$P(\Delta t, q_f \mid \tau_{B^0}, \Delta m_d) = \frac{e^{-|\Delta t|/\tau_{B^0}}}{4\tau_{B^0}}\left[1 + q_f \cos(\Delta m_d \Delta t)\right]. \qquad (1)$$
By measuring the distribution of $\Delta t$ and $q_f$, we determine both $\tau_{B^0}$ and $\Delta m_d$. In each event, we fully reconstruct the "signal-side" $B$ ($B_{\text{sig}}$) via $B^0 \to D^{(*)-}\pi^+$ decays, identifying its flavor via the pion charge, as the contribution from $\bar{B}^0 \to D^{(*)-}\pi^+$ decays is of the order of $10^{-4}$ [15-18] and hence can be neglected here. Throughout this paper, charge-conjugate modes are implicitly included unless stated otherwise.
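
As an illustration of the oscillation probability density described above, the following Python sketch evaluates the standard flavor-mixing form $e^{-|\Delta t|/\tau}/(4\tau)\,[1 + q_f\cos(\Delta m_d \Delta t)]$ and checks numerically that it is normalized once both values of $q_f$ are summed. The function names, the integration window, and the step count are illustrative assumptions and not part of the analysis; the default parameter values are the central values quoted in this paper.

```python
import math

def mixing_pdf(dt, qf, tau=1.499, dm=0.516):
    """Density for decay-time difference dt (ps) and relative flavor qf = +1 or -1.

    tau (ps) and dm (ps^-1) default to the central values quoted in the text.
    """
    return math.exp(-abs(dt) / tau) / (4.0 * tau) * (1.0 + qf * math.cos(dm * dt))

def integrate(f, lo, hi, n=20000):
    """Plain trapezoidal rule; adequate for this smooth, fast-decaying integrand."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h

# Fractions of unmixed (qf = +1) and mixed (qf = -1) decays over a wide window.
unmixed = integrate(lambda t: mixing_pdf(t, +1), -40.0, 40.0)
mixed = integrate(lambda t: mixing_pdf(t, -1), -40.0, 40.0)
chi_d = mixed / (unmixed + mixed)  # time-integrated mixing probability
```

The time-integrated mixed fraction obtained this way can be compared with the closed form $x^2/[2(1+x^2)]$, where $x = \Delta m_d\,\tau_{B^0}$.
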
We use a flavor-tagging algorithm to determine the flavor of the other, or "tag-side", $B$ meson ($B_{\text{tag}}$) when it decays [19]. As the $B$ mesons are produced in a quantum-entangled state, the flavor of $B_{\text{tag}}$ when it decays identifies (or tags) the flavor of $B_{\text{sig}}$ at that instant [20,21]. From that time onwards, the signal-side $B$ freely oscillates in flavor. The variable $\Delta t$ is the difference between the proper decay times of $B_{\text{sig}}$ and $B_{\text{tag}}$. Equation (1) also applies when $B_{\text{sig}}$ decays first, i.e., for negative $\Delta t$.
At SuperKEKB [22], the $\Upsilon(4S)$ is produced with a Lorentz boost in the laboratory frame of $\beta\gamma = 0.28$. Since the $B$ mesons are nearly at rest in the $\Upsilon(4S)$ rest frame, their momenta are mostly determined by the $\Upsilon(4S)$ boost, resulting in a mean displacement between the $B_{\text{sig}}$ and $B_{\text{tag}}$ decay positions of the order of 100 µm along the boost direction. By measuring the relative displacement, and knowing the $\Upsilon(4S)$ boost, we determine $\Delta t$. To measure $\tau_{B^0}$ and $\Delta m_d$, we fit Eq. (1), modified to account for the $B_{\text{tag}}$ decay probability and detection effects, to the background-subtracted $\Delta t$ distribution.
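
The link between displacement and decay time can be made concrete with a short numerical sketch. It assumes the leading-order relation (displacement) $\approx \beta\gamma c\,\Delta t$, i.e., it neglects the $B$ momentum in the $\Upsilon(4S)$ rest frame; the function names are hypothetical.

```python
C_UM_PER_PS = 299.792458  # speed of light in micrometres per picosecond

def mean_displacement_um(tau_ps, beta_gamma=0.28):
    """Mean flight distance along the boost for a particle of lifetime tau_ps."""
    return beta_gamma * C_UM_PER_PS * tau_ps

def delta_t_ps(displacement_um, beta_gamma=0.28):
    """Approximate decay-time difference for a displacement along the boost.

    Treats the B mesons as at rest in the Upsilon(4S) frame.
    """
    return displacement_um / (beta_gamma * C_UM_PER_PS)

scale = mean_displacement_um(1.499)  # about 126 micrometres
dt = delta_t_ps(100.0)               # about 1.19 ps
```

With the lifetime quoted in this paper the natural length scale is roughly 126 µm, in line with the order-of-100-µm displacement stated above.
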
The Belle II detector consists of subsystems arranged cylindrically around the interaction region [23]. The $z$ axis of the laboratory frame is defined as the symmetry axis of the cylinder, and the positive direction is approximately given by the electron-beam direction, which is the beam with higher energy. The polar angle $\theta$, as well as the longitudinal and transverse directions, are defined with respect to the $+z$ axis. Charged-particle trajectories (tracks) are reconstructed by a two-layer silicon-pixel detector (PXD) surrounded by a four-layer double-sided silicon-strip detector (SVD) and a 56-layer central drift chamber (CDC). When the data analyzed here were collected, only one sixth of the second PXD layer was installed. A quartz-based Cherenkov counter measuring the Cherenkov photon time-of-propagation is used to identify hadrons in the central region, and an aerogel-based ring-imaging Cherenkov counter is used to identify hadrons in the forward end-cap region. An electromagnetic calorimeter (ECL) is used to reconstruct photons and to provide information for particle identification, in particular, to distinguish electrons from other charged particles. All subsystems up to the ECL are located within an axially uniform 1.5 T magnetic field provided by a superconducting solenoid. A subsystem dedicated to identifying $K^0_L$ mesons and muons is the outermost part of the detector.
The data are processed with the Belle II analysis software framework [24] using the track reconstruction algorithm described in Ref. [25]. We use Monte Carlo (MC) simulation to optimize selection criteria, determine shapes of probability density functions (PDFs), and study sources of background. We use KKMC [26] to generate $e^+e^- \to q\bar{q}$, where $q$ indicates a $u$, $d$, $c$, or $s$ quark, PYTHIA8 [27] to simulate hadronization, EVTGEN [28] to simulate decays of hadrons, and GEANT4 [29] to model detector response. Our simulation includes beam-induced backgrounds [30]. We optimize and fix our selection criteria using simulated data before examining the experimental data.
We reconstruct $B^0 \to D^{*-}\pi^+$ and $B^0 \to D^-\pi^+$ decays by first reconstructing $D$ mesons via $D^- \to K^+\pi^-\pi^-$, $\bar{D}^0 \to K^+\pi^-$, $\bar{D}^0 \to K^+\pi^-\pi^+\pi^-$, and $\bar{D}^0 \to K^+\pi^-\pi^0$ decays. We then reconstruct $D^{*-}$ mesons in their decay to a $\bar{D}^0\pi^-$ final state, where the pion, which has low momentum in the $\Upsilon(4S)$ rest frame, is referred to as the "slow pion". Finally, we combine a $D^-$ or $D^{*-}$ candidate with a charged particle identified as a pion to form the $B^0$ candidate.
We require that tracks originate from the interaction region and have at least six measurement points (hits) in the SVD or twenty hits in the CDC. Each track must have a distance-of-closest-approach to the interaction point of less than 3 cm along the $z$ axis and less than 0.5 cm in the plane transverse to it, and have a polar angle in the CDC acceptance range $[17^\circ, 150^\circ]$. These requirements reduce backgrounds with poorly reconstructed tracks and tracks from beam background.
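
The track requirements above amount to a simple predicate. The sketch below encodes them in Python; the `Track` container and its field names are hypothetical and do not correspond to the Belle II software interface.

```python
from dataclasses import dataclass

@dataclass
class Track:
    svd_hits: int     # number of SVD measurement points
    cdc_hits: int     # number of CDC measurement points
    dz_cm: float      # distance of closest approach along the z axis
    dr_cm: float      # distance of closest approach in the transverse plane
    theta_deg: float  # polar angle

def passes_track_selection(t: Track) -> bool:
    """Hit, impact-parameter, and acceptance requirements quoted in the text."""
    enough_hits = t.svd_hits >= 6 or t.cdc_hits >= 20
    from_interaction_region = abs(t.dz_cm) < 3.0 and abs(t.dr_cm) < 0.5
    in_acceptance = 17.0 <= t.theta_deg <= 150.0
    return enough_hits and from_interaction_region and in_acceptance
```
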
Photon candidates are identified as localized energy deposits in the ECL not associated with any track. To suppress beam-induced photons, which have different energy spectra depending on their momentum direction, each photon is required to have an energy greater than 30 MeV if reconstructed in the central region of the calorimeter, greater than 80 MeV if reconstructed in the backward region, and greater than 120 MeV if reconstructed in the forward region. Neutral pions are reconstructed from pairs of photon candidates that have an angular separation of less than $52^\circ$ in the lab frame and an invariant mass in the range [121 MeV, 142 MeV].
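
For two massless photons the invariant mass follows from $m^2 = 2E_1E_2(1-\cos\alpha)$, with $\alpha$ the opening angle. The sketch below applies the angular and mass-window criteria quoted above; the function names are hypothetical, energies are in GeV, and the region-dependent photon-energy thresholds are deliberately left out.

```python
import math

def diphoton_mass(e1, e2, opening_angle_deg):
    """Invariant mass of two massless photons: m^2 = 2 E1 E2 (1 - cos(angle))."""
    cos_a = math.cos(math.radians(opening_angle_deg))
    return math.sqrt(2.0 * e1 * e2 * (1.0 - cos_a))

def is_pi0_candidate(e1, e2, opening_angle_deg):
    """Angular-separation and mass-window criteria from the text (GeV units)."""
    m = diphoton_mass(e1, e2, opening_angle_deg)
    return opening_angle_deg < 52.0 and 0.121 <= m <= 0.142
```
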
We reconstruct $D$ mesons by combining two to four particles, one of them being identified as a $K^+$. The mass of $\bar{D}^0$ candidates must lie in a range that depends on the decay mode. Each $D^{(*)-}$ is combined with a remaining positive particle to form a $B_{\text{sig}}$ candidate. To remove background from $B^0 \to D^{(*)-}\ell^+\nu$ decays, we require the particle to be identified as a pion. A small number of Cabibbo-suppressed $B^0 \to D^{(*)-}K^+$ decays pass this requirement. Their yield is 2.7% of that of $B^0 \to D^{(*)-}\pi^+$ decays. These decays have the same $\Delta t$ distribution as $B^0 \to D^{(*)-}\pi^+$, and we treat them as signal.
We identify $B_{\text{sig}}$ candidates using two quantities, the beam-constrained mass $M_{\text{bc}}$ and the energy difference $\Delta E$. These quantities are defined as
$$M_{\text{bc}} \equiv \sqrt{E_{\text{beam}}^2 - |\vec{p}|^2}, \qquad \Delta E \equiv E - E_{\text{beam}}, \qquad (2)$$
where $E_{\text{beam}}$ is the beam energy, and $\vec{p}$ and $E$ are the reconstructed momentum and energy, respectively, of the $B_{\text{sig}}$ candidate. All quantities are calculated in the $e^+e^-$ center-of-mass frame. We calculate $E$ assuming that the track directly from $B_{\text{sig}}$ is a pion. We require that $M_{\text{bc}}$ be greater than 5.27 GeV and that $\Delta E$ be in the range [−0.10 GeV, 0.25 GeV]. The $\Delta E$ range is asymmetric, i.e., shorter on the lower side, to reduce backgrounds from $B$ decays with missing daughters.
We determine the $B_{\text{tag}}$ vertex and flavor using the remaining tracks in the event. Such tracks are required to have at least one hit in each of the PXD, SVD, and CDC and have a reconstructed momentum greater than 50 MeV. Each track must also originate from the $e^+e^-$ interaction point according to the same criteria as used to select $B_{\text{sig}}$ candidates. We require that the $B_{\text{tag}}$ decay includes at least one charged particle. The $B_{\text{tag}}$ momentum is taken to be opposite that of the $B_{\text{sig}}$ candidate in the center-of-mass frame.
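
The two candidate-selection variables can be sketched numerically as follows, using the usual definitions $M_{\text{bc}} = \sqrt{E_{\text{beam}}^2 - |\vec{p}|^2}$ and $\Delta E = E - E_{\text{beam}}$ in natural units. The function names are hypothetical, and the beam energy of 5.29 GeV used in the example is an illustrative assumption (about half the $\Upsilon(4S)$ mass), not a value taken from the text.

```python
import math

def beam_constrained_mass(e_beam, p):
    """M_bc = sqrt(E_beam^2 - |p|^2); GeV units, centre-of-mass frame."""
    p2 = sum(c * c for c in p)
    return math.sqrt(max(e_beam * e_beam - p2, 0.0))

def energy_difference(e_beam, energy):
    """Delta E = E - E_beam; GeV units, centre-of-mass frame."""
    return energy - e_beam

def in_signal_window(e_beam, p, energy):
    """Selection quoted in the text: M_bc > 5.27 GeV, -0.10 <= Delta E <= 0.25 GeV."""
    mbc = beam_constrained_mass(e_beam, p)
    de = energy_difference(e_beam, energy)
    return mbc > 5.27 and -0.10 <= de <= 0.25

# Example: a candidate with |p| = 0.33 GeV and E equal to the assumed beam energy.
mbc_example = beam_constrained_mass(5.29, (0.0, 0.0, 0.33))
```
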
To determine the $B_{\text{sig}}$ decay vertex, we fit its decay chain with the TreeFit algorithm [31,32]. To determine the $B_{\text{tag}}$ decay vertex, we fit its decay products with the Rave adaptive algorithm [33], which accounts for our lack of knowledge of the decay chain by reducing the impact of tracks displaced by potential intermediate $D$ decays. The decay vertex position is adjusted such that the direction of each $B^0$, as determined from its decay vertex and the $e^+e^-$ interaction region (IR) [34], is parallel to its momentum vector. The IR is measured from $e^+e^- \to \mu^+\mu^-$ events. Charged $D$ candidates must have positive flight distances. We require that both vertex fits converge, and that the uncertainty on the decay time, $\sigma_{\Delta t}$, as calculated from the fitted vertex positions, be less than 2 ps. These vertex quality requirements retain approximately 90% of signal events.
The efficiency to reconstruct a $B_{\text{sig}}B_{\text{tag}}$ pair with these requirements is 25%. In 2.2% of selected events, there is more than one $B_{\text{sig}}$ candidate. We retain all such candidates for further analysis.
The main sources of background are misreconstructed $\Upsilon(4S) \to B\bar{B}$ events and nonresonant $e^+e^- \to q\bar{q}$ events. To distinguish between signal and $q\bar{q}$, we train two multivariate classifiers [35]: one for $B^0 \to D^-\pi^+$ decays and one for $B^0 \to D^{*-}\pi^+$ decays. The classifiers exploit the difference in event topologies and use as input the following quantities: Fox-Wolfram moments [36] and an extension thereof [37]; "cone" variables developed by the CLEO collaboration [38]; the angle between the thrust axes of the two $B$ mesons [39]; and the event sphericity [40]. The classifiers are trained and tested using simulated data.
In addition to determining the flavor of each $B_{\text{tag}}$, the flavor-tagging algorithm returns a tag-quality variable $r$, which ranges from 0 for no flavor information to +1 for unambiguous flavor assignment. From the $B_{\text{tag}}$ and $B_{\text{sig}}$ flavors, we determine the relative flavor $q_f$. The data are divided into seven subsamples ($r$ intervals), depending on the $r$ value.
We determine the signal yield by performing an unbinned, extended maximum-likelihood fit to the distributions of $\Delta E$ and the multivariate-classifier output $C$. The fit is performed separately for each $r$ interval and determines the yield of signal events and of $B\bar{B}$ and $q\bar{q}$ background events. As the fit observables $\Delta E$ and $C$ are found to have negligible correlation, the PDFs ($P$) for these variables are taken to factorize: $P(\Delta E, C) = P(\Delta E) \cdot P(C)$. All PDFs are determined separately for each $r$ interval; however, some of the parameters (as noted below) are taken to be common among the $r$ intervals.
The $\Delta E$ PDF for signal is modeled as the sum of two double-sided Crystal Ball functions [41]: one for $B^0 \to D^{(*)-}\pi^+$ decays and one for $B^0 \to D^{(*)-}K^+$ decays. The shape parameters of these functions, as well as the ratio between their normalizations, are fixed to values obtained from simulation. To account for differences between data and simulation, we introduce two additional free parameters: a shift of the mean values of the functions, and a scale factor for their widths. These parameters are taken to be common among the $r$ intervals. The $\Delta E$ PDF for $B\bar{B}$ background is a fourth-order polynomial, and the $\Delta E$ PDF for $q\bar{q}$ background is an exponential function. All parameters of the polynomial are fixed to values obtained from simulation, while the slope of the exponential function is free to vary. The $C$ PDFs for signal and background are taken to be Johnson $S_U$ functions [42]. The Johnson functions across different $r$ intervals have independent mode, standard deviation, skewness, and kurtosis parameters, all determined from simulation. We introduce four free parameters to account for differences between data and simulation that are common across all $r$ intervals: one offset for the modes and one scale for the widths for all $q\bar{q}$-background distributions; and similarly one offset and one scale common to all signal and $B\bar{B}$-background distributions.
We simultaneously fit to data in all seven $r$ intervals. The fit has a total of 28 free parameters: three yields for each of the $r$ intervals, six scale or shift factors, and the slope of the exponential function used for the $\Delta E$ PDF of the $q\bar{q}$ background. The distributions of $\Delta E$ and $C$ summed over all $r$ intervals, along with projections of the fit results, are shown in Fig. 1. The resulting yields are $33\,317 \pm 203$ signal events, $2814 \pm 150$ $B\bar{B}$-background events, and $5594 \pm 125$ $q\bar{q}$-background events.
Using sWeights [43,44] computed with the per-candidate signal fractions obtained from the fit to $\Delta E$ and $C$, we statistically subtract background contributions to the $\Delta t$ and $\sigma_{\Delta t}$ distributions. In this manner, we need not parametrize background distributions when fitting for $\tau_{B^0}$ and $\Delta m_d$.
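
To make the background subtraction concrete, the sketch below computes signal sWeights with the standard sPlot expression for a simplified model with only two species (signal and one background) and fixed PDF shapes. This is an illustrative simplification: the fit described above has three species and two observables, and the function and argument names here are hypothetical.

```python
def sweights(f_sig, f_bkg, n_sig, n_bkg):
    """Signal sWeights for a two-species model.

    f_sig, f_bkg: per-candidate PDF values of the discriminating variables.
    n_sig, n_bkg: fitted yields of the two species.
    """
    denom = [n_sig * fs + n_bkg * fb for fs, fb in zip(f_sig, f_bkg)]
    # Elements of the inverse covariance matrix of the yields.
    a = sum(fs * fs / d ** 2 for fs, d in zip(f_sig, denom))
    b = sum(fs * fb / d ** 2 for fs, fb, d in zip(f_sig, f_bkg, denom))
    c = sum(fb * fb / d ** 2 for fb, d in zip(f_bkg, denom))
    det = a * c - b * b
    # Signal row of the covariance matrix (inverse of [[a, b], [b, c]]).
    v_ss, v_sb = c / det, -b / det
    return [(v_ss * fs + v_sb * fb) / d for fs, fb, d in zip(f_sig, f_bkg, denom)]
```

Individual sWeights may be negative; this is expected, since the subtraction is statistical rather than candidate by candidate.
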
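We measure the lifetime $\tau_{B^0}$ and oscillation frequency $\Delta m_d$ by fitting the background-subtracted $\Delta t$ and $\sigma_{\Delta t}$ distributions. The probability density to observe both $B_{\text{sig}}$ and $B_{\text{tag}}$ decays is obtained from Eq. (1) by including the probability for $B_{\text{tag}}$ to decay and the probability of mistagging its flavor,
$$P(\Delta t, \bar{t}, q_f, r \mid \tau_{B^0}, \Delta m_d) = \frac{e^{-2\bar{t}/\tau_{B^0}}}{2\tau_{B^0}^2}\left\{1 + q_f\,[1 - 2w(r)]\cos(\Delta m_d \Delta t)\right\}, \qquad (3)$$
where $\bar{t}$ is the average of the $B_{\text{sig}}$ and $B_{\text{tag}}$ proper decay times, and $w(r)$ is the probability of the $B_{\text{tag}}$ flavor being incorrectly assigned. The latter is parametrized with a single value for each $r$ interval and is assumed to be independent of the $B_{\text{tag}}$ flavor. The decay-time difference $\Delta t$ can be expressed as
$$\Delta t = \Delta t_\ell - 2\,\frac{\beta_B}{\beta}\,\bar{t}\cos\theta, \qquad (4)$$
where $\Delta t_\ell \equiv \ell/(\beta\gamma\gamma_B)$. In this expression, $\ell$ is the displacement of the $B_{\text{sig}}$ vertex from that of $B_{\text{tag}}$ along the boost direction, $\beta$ is the velocity of the $\Upsilon(4S)$ in the lab frame (with $\gamma = 1/\sqrt{1-\beta^2}$), $\beta_B$ is the velocity of a $B$ in the $\Upsilon(4S)$ rest frame (with $\gamma_B = 1/\sqrt{1-\beta_B^2}$), and $\theta$ is the polar angle of the $B_{\text{sig}}$ direction in the $\Upsilon(4S)$ rest frame. We integrate out the dependence of Eq. (3) on $\bar{t}$ and $\theta$, accounting for the angular distribution in $e^+e^- \to \Upsilon(4S) \to B\bar{B}$, $P_\theta(\cos\theta) = \tfrac{3}{4}(1 - \cos^2\theta)$.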
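
The relation between the vertex displacement and the decay-time difference can be obtained in a few lines. The following is a sketch under stated assumptions ($c = 1$, $\ell$ measured along the boost direction, and $t_{\text{sig}}$, $t_{\text{tag}}$ introduced here only as labels for the two proper decay times); it is meant as a consistency check of the definitions in the text, not as a reproduction of the authors' derivation.

```latex
% Decay positions and times in the Upsilon(4S) rest frame (starred quantities),
% with B_sig emitted at polar angle theta and B_tag back to back:
z^*_{\text{sig}} = +\beta_B\gamma_B\, t_{\text{sig}}\cos\theta, \qquad
z^*_{\text{tag}} = -\beta_B\gamma_B\, t_{\text{tag}}\cos\theta, \qquad
t^*_{\text{sig}} = \gamma_B\, t_{\text{sig}}, \qquad
t^*_{\text{tag}} = \gamma_B\, t_{\text{tag}} .

% Boost to the laboratory frame, z = \gamma\,(z^* + \beta\, t^*):
\ell = z_{\text{sig}} - z_{\text{tag}}
     = \gamma\gamma_B\left[\beta\,(t_{\text{sig}} - t_{\text{tag}})
       + \beta_B\cos\theta\,(t_{\text{sig}} + t_{\text{tag}})\right].

% With \Delta t = t_{sig} - t_{tag} and \bar{t} = (t_{sig} + t_{tag})/2:
\frac{\ell}{\beta\gamma\gamma_B} = \Delta t + 2\,\frac{\beta_B}{\beta}\,\bar{t}\cos\theta .
```

Because the angular distribution is symmetric in $\cos\theta$, the second term averages to zero and acts mainly as an additional smearing of the measured decay-time difference.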
To account for resolution and bias in measuring $\ell$, we convolve Eq. (3) with an empirical response function $R(\delta t \mid \sigma_{\Delta t})$, which is modeled as a linear combination of three components [Eq. (5)], where $\delta t \equiv (\ell - \ell_{\text{true}})/(\beta\gamma\gamma_B)$ and $\ell_{\text{true}}$ is the true value of $\ell$. The first component, $G$, is a Gaussian distribution with mean and standard deviation proportional to the per-candidate $\sigma_{\Delta t}$; this component accounts for 70% of candidates. The second component, $R_t$, is a weighted sum of a Gaussian distribution and two exponentially modified Gaussian functions, each corresponding to a Gaussian convolved with an exponential distribution [Eq. (6)], where $\exp_>(-\kappa x) = \exp(-\kappa x)$ if $x > 0$ and $\exp_>(-\kappa x) = 0$ otherwise, and similarly for $\exp_<(\kappa x)$. The exponential tails account for poorly determined $B_{\text{tag}}$ vertices due to intermediate charm mesons yielding displaced secondary vertices. The fraction $f_t$ of this tail component is zero at low values of $\sigma_{\Delta t}$ and reaches a plateau of 0.2 at approximately $\sigma_{\Delta t} = 25$ ps. This is modeled using three parameters: the maximal tail fraction $f_t^{\max}$ at its plateau, a threshold parameter describing the $\sigma_{\Delta t}$ value at which the tail fraction becomes nonzero, and a slope parameter describing how fast the tail fraction reaches its plateau. The third component has a large width, $\sigma_0 = 200$ ps, to account for the $O(10^{-3})$ fraction of outlying poorly reconstructed vertices.
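
The building block of the tail component, a Gaussian convolved with a one-sided exponential, has a closed form. The sketch below implements the right-tailed case using the standard exponentially-modified-Gaussian expression; the function name and argument conventions are assumptions for illustration and do not reflect how the response function is coded in the analysis.

```python
import math

def emg_pdf(x, mu, sigma, kappa):
    """Gaussian(mu, sigma) convolved with kappa * exp(-kappa * y) for y > 0.

    Right-tailed exponentially modified Gaussian, in its closed form.
    """
    arg = (mu + kappa * sigma * sigma - x) / (math.sqrt(2.0) * sigma)
    expo = 0.5 * kappa * (2.0 * mu + kappa * sigma * sigma - 2.0 * x)
    return 0.5 * kappa * math.exp(expo) * math.erfc(arg)

def emg_pdf_left(x, mu, sigma, kappa):
    """Mirror image: Gaussian convolved with an exponential on the negative side."""
    return emg_pdf(-x, -mu, sigma, kappa)
```

The convolution shifts the mean by $1/\kappa$ relative to the Gaussian mean, which is how such tails bias the reconstructed vertex position in one direction.
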
Equation (5) is the simplest model found to satisfactorily describe the $\delta t$ distribution of simulated events. We fix $\sigma_0$, as well as $\kappa$, $f_>$, $f_<$, and the $f_t$ slope and threshold parameters, to values determined from a fit to simulated data. Figure 2 shows the $\delta t$ distribution of simulated data and the distribution of the fitted model. The parameter $f_t^{\max}$, as well as the scaling factors relating the modes and standard deviations of $G$ and $R_t$ to $\sigma_{\Delta t}$ ($m_G$, $s_G$, $m_t$, and $s_t$), are free to vary in the fit to data.
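After integrating over $\cos\theta$ and $\bar{t}$ and convolving with $R(\delta t)$, the $\Delta t_\ell$ distribution of $B$ meson pairs is
$$P(\Delta t_\ell, \sigma_{\Delta t}, q_f, r \mid \tau_{B^0}, \Delta m_d) = P(\sigma_{\Delta t} \mid q_f, r) \iiint P(\Delta t_\ell - \delta t, \bar{t}, q_f, r \mid \tau_{B^0}, \Delta m_d)\, P_\theta(\cos\theta)\, R(\delta t \mid \sigma_{\Delta t})\, d\delta t\, d\cos\theta\, d\bar{t}, \qquad (7)$$
where $P(\sigma_{\Delta t} \mid q_f, r)$ is the probability to observe $\sigma_{\Delta t}$ for a given value of $q_f$ and $r$, modeled using histogram templates: one for each $r$ interval and value of $q_f$ (14 in total), taken from the data. The sWeights computed using the fit to $\Delta E$ and $C$ are used to statistically subtract the background contribution to the $\sigma_{\Delta t}$ histograms. We fit for $\tau_{B^0}$ and $\Delta m_d$ by maximizing
$$\sum_i s_i \ln P(\Delta t_{\ell,i}, \sigma_{\Delta t,i}, q_{f,i}, r_i \mid \tau_{B^0}, \Delta m_d), \qquad (8)$$
where the sum runs over all $B_{\text{sig}}B_{\text{tag}}$ candidate pairs and $s_i$ is the sWeight of a pair. Fourteen parameters are free in the fit: $\tau_{B^0}$ and $\Delta m_d$; seven values of $w$, one for each $r$ interval; and the five free parameters of the response function.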
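
The effect of maximizing an sWeighted log-likelihood is easiest to see in a toy case. The sketch below replaces the full model by a pure exponential decay-time density, $P(t \mid \tau) = e^{-t/\tau}/\tau$, for which the weighted maximum has a closed form (the weighted mean). The simplified density and the function names are assumptions for illustration only.

```python
import math

def weighted_nll(tau, times, weights):
    """Negative weighted log-likelihood for P(t | tau) = exp(-t / tau) / tau."""
    return -sum(w * (-t / tau - math.log(tau)) for t, w in zip(times, weights))

def fit_lifetime(times, weights):
    """Closed-form maximum of the weighted likelihood: the weighted mean of t."""
    return sum(w * t for t, w in zip(times, weights)) / sum(weights)

# Toy input; the negative weight mimics a background-like candidate.
toy_times = [0.5, 1.0, 2.0, 3.0]
toy_weights = [1.1, 0.9, 1.0, -0.2]
tau_hat = fit_lifetime(toy_times, toy_weights)
```

Candidates with negative weights pull the estimate away from their decay times, which is how the background contribution is removed without an explicit background model.
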
We calculate the statistical uncertainties using 1000 bootstrapped [45] samples obtained from the data. For each sample, we repeat the determination of the sWeights and the fit for $\tau_{B^0}$ and $\Delta m_d$. In this way, the spread of fitted $\tau_{B^0}$ and $\Delta m_d$ values accounts for the statistical fluctuations of the signal and background fractions. We test this analysis method with independent simulated data. In this test, our fitting procedure determines $\tau_{B^0}$ with a small systematic bias of $(0.004 \pm 0.002)$ ps and $\Delta m_d$ with no significant bias, $(0.000 \pm 0.001)\,\text{ps}^{-1}$. We assign the central value of the bias on $\tau_{B^0}$ as a systematic uncertainty. We assign the uncertainty on the bias on $\Delta m_d$, arising from the size of the simulated data, as a systematic uncertainty.
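
The bootstrap procedure can be sketched generically: resample the candidates with replacement, rerun the estimator on each replica, and take the spread of the results. In the sketch below the estimator is an arbitrary callable standing in for the full chain (sWeight determination followed by the lifetime and mixing fit); the function name, the seed, and the toy estimator are assumptions.

```python
import random
import statistics

def bootstrap_std(data, estimator, n_replicas=1000, seed=0):
    """Standard deviation of an estimator over bootstrap replicas of the data."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_replicas):
        # Draw n candidates with replacement from the original sample.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    return statistics.stdev(estimates)
```

For example, `bootstrap_std(decay_times, statistics.mean)` would estimate the statistical uncertainty of a sample mean, with `decay_times` a hypothetical list of measured values.
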
The $\Delta t$ distributions of both opposite-flavor and same-flavor $B$-meson pairs are shown in Fig. 3 for all $r$ intervals combined, along with projections of the fit result. We also check that the fit quality is good in each individual $r$ interval. The figure shows the $\Delta t$-dependent yield asymmetry between the two samples, defined as the difference between the number of opposite-flavor pairs and same-flavor pairs divided by their sum. The fit results and statistical uncertainties for $\tau_{B^0}$ and $\Delta m_d$ are $(1.499 \pm 0.013)$ ps and $(0.516 \pm 0.008)\,\text{ps}^{-1}$, with a −29% statistical correlation factor between them.
There are several sources of systematic uncertainty; these are listed in Table I and described below. The dominant systematic uncertainty is due to potential discrepancies between the assumed values (fixed in the fit) of the response-function parameters and the true values in the data. For each fixed parameter, we repeat the fit with the parameter allowed to vary. We add all the resulting changes in the result in quadrature and include this value as a systematic uncertainty.
Possible misalignment of the tracking detector can bias our results [46]. To estimate this effect, we reconstruct simulated signal events with several misalignment scenarios. Two scenarios are extracted from collision data using day-by-day variations of the detector alignment. Two additional scenarios correspond to misalignments remaining after applying the alignment procedure to dedicated simulated data. We repeat the analysis for each scenario and assign the largest changes in the results as systematic uncertainties.
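
The yield asymmetry and the quadrature combination used above reduce to short formulas. The sketch below assumes unweighted counts with binomial uncertainties and an idealized expectation $(1-2w)\cos(\Delta m_d \Delta t)$ that ignores resolution effects; the function names are hypothetical.

```python
import math

def flavor_asymmetry(n_of, n_sf):
    """(N_OF - N_SF) / (N_OF + N_SF) and its binomial uncertainty."""
    total = n_of + n_sf
    a = (n_of - n_sf) / total
    err = math.sqrt((1.0 - a * a) / total)
    return a, err

def expected_asymmetry(dt, dm=0.516, w=0.0):
    """Idealized asymmetry: mistag dilution (1 - 2w) times cos(dm * dt)."""
    return (1.0 - 2.0 * w) * math.cos(dm * dt)

def quadrature_sum(values):
    """Combine independent uncertainties or shifts in quadrature."""
    return math.sqrt(sum(v * v for v in values))
```

As a usage example, `quadrature_sum([0.013, 0.008])` combines the statistical and systematic uncertainties quoted for the lifetime into a total of about 0.015 ps.
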
Because we adjust the $B_{\text{sig}}$ decay vertex position so that the vector connecting the IR and decay vertex is parallel to the $B_{\text{sig}}$ momentum, the precision to which we know the IR affects our determination of $\ell$. We repeat our analysis on simulated data in which we shift, rotate, and rescale the IR within its measured uncertainties and assign the changes in the results as systematic uncertainties. We perform an analogous check with changes to $\sqrt{s}$ and the magnitude and direction of the boost vector and find that the results change negligibly.
We estimate systematic uncertainties due to mismodeling the $C$ distribution, including possible correlation with $\Delta E$, from the changes in the results observed when fitting to the $\Delta E$ distribution only. In that case, the $B\bar{B}$-background fraction is fixed to the value in simulated data. The result for $\tau_{B^0}$ changes negligibly, but a systematic uncertainty is included for $\Delta m_d$. To check for dependence of the results on the $\Delta E$ model for the $q\bar{q}$ and $B\bar{B}$ backgrounds, we repeat the analysis with each model replaced by a second-order polynomial with all parameters free in the fit. The polynomial parameters are common to all $r$ intervals. The results change negligibly.
To check for dependence of the results on the $\sigma_{\Delta t}$ model, we repeat the fit with several alternative binning choices for the templates, and also with the templates replaced by analytical functions. We assign the largest changes in the results as systematic uncertainties.
We investigate the impact of fixing the yield of $B^0 \to D^{(*)-}K^+$ decays relative to $B^0 \to D^{(*)-}\pi^+$ by repeating the analysis with alternative choices of the $B^0 \to D^{(*)-}K^+$ fraction, corresponding to varying the branching fractions and relevant hadron identification efficiencies by their known uncertainties [47]. The results change negligibly.
To check if potential correlations of $\Delta E$ or $C$ with $\Delta t$ affect our results, we repeat the analysis with sWeights calculated independently for two subgroups of candidate pairs, defined by the sign of $\Delta t$. Likewise, we repeat the analysis for two subgroups defined by whether $|\Delta t|$ is greater or less than 1.150 ps. In both cases, the results change slightly, and we assign the larger of the two changes as systematic uncertainties.
The global momentum scale of the Belle II tracking detector is calibrated to a relative precision of better than 0.1%, and the global length scale to a precision of better than 0.01%. Neither significantly impacts our results.
We further check our analysis by repeating it on subsets of the data divided by data-taking period or by whether the charm meson in the $B_{\text{sig}}$ decay is $D^-$ or $D^{*-}$. The results are all statistically consistent with each other and with our overall results.
The results agree with previous measurements and have systematic uncertainties very similar to those of results from the Belle and BaBar collaborations [3,4]. They demonstrate a good understanding of the Belle II detector and provide a strong foundation for future time-dependent measurements.

The mass of $\bar{D}^0$ candidates must be in the range [1.845 GeV, 1.885 GeV] for $\bar{D}^0 \to K^+\pi^-$ and $\bar{D}^0 \to K^+\pi^-\pi^+\pi^-$, and in the range [1.810 GeV, 1.895 GeV] for $\bar{D}^0 \to K^+\pi^-\pi^0$. The mass of $D^-$ candidates is required to be in the range [1.860 GeV, 1.880 GeV]. The mass range is looser for $\bar{D}^0$ candidates, as the selection requirements placed on the $D^{*-}$ are sufficient to suppress background events containing a fake $\bar{D}^0$. We identify negatively charged pions with momenta below 300 MeV in the center-of-mass frame as slow pion candidates. Each of these candidates is combined with a $\bar{D}^0$ candidate to form a $D^{*-}$ candidate. The energy released in the $D^{*-}$ decay, $m(D^{*-}) - m(\bar{D}^0) - m_{\pi^+}$, must be in the range [4.6 MeV, 7.0 MeV].

FIG. 1. Distributions of $\Delta E$ (top) and $C$ (bottom) in data (points) and the fit model (curves and stacked shaded regions).

FIG. 2. Top: Distribution of $\delta t$ in simulated data (points) and distribution modeled by the response function from the fit to the simulated data (curves) and from the fit to the experimental data (shaded). The shaded area accounts for the statistical and systematic uncertainties on the parameters of the response function. Bottom: Distribution of the pull, defined as the difference between the event count in each bin and its value predicted by the fit, divided by the Poisson uncertainty.

FIG. 3. Distribution of $\Delta t$ in data (points) and the fit model (lines) for opposite-flavor candidate pairs (red) and same-flavor pairs (blue) and their asymmetry (black).

TABLE I. Systematic uncertainties.