Measurement of $C\!P$ violation in $B^{0}\rightarrow K_{S}^{0}\pi^{0}$ decays at Belle II

We report a measurement of the $C\!P$-violating parameters $A$ and $S$ in $B^{0}\to K_{S}^{0} \pi^{0}$ decays at Belle II using a sample of $387\times 10^{6}$ $B\bar{B}$ events recorded in $e^{+}e^{-}$ collisions at a center-of-mass energy corresponding to the $\Upsilon(4S)$ resonance. These parameters are determined by fitting the proper decay-time distribution of a sample of 415 signal events. We obtain $A = 0.04^{+0.15}_{-0.14}\pm 0.05$ and $S = 0.75^{+0.20}_{-0.23}\pm 0.04$, where the first uncertainties are statistical and the second are systematic.

The $B^{0}\to K^{0}\pi^{0}$ decay proceeds mainly via the $b\to sd\bar{d}$ loop amplitude, involving emission and reabsorption of a virtual $W$ boson and a top quark, that carries a weak phase $\arg(V_{tb}V_{ts}^{*})$. Here, $V_{ij}$ denotes Cabibbo-Kobayashi-Maskawa (CKM) matrix elements [1,2]. Throughout this paper, charge-conjugate modes are implicitly included. The decay is suppressed in the Standard Model (SM) due to the smallness of $|V_{ts}|$. As non-SM particles can potentially propagate in the loop, studies of this decay provide sensitivity to physics beyond the SM. Such non-SM physics can manifest itself as an asymmetry in the rates of $C\!P$-conjugate decays, i.e., $C\!P$ violation [3].
In the $B^{0}\to K^{0}\pi^{0}$ channel, $C\!P$ violation results from either interference between two $B^{0}$ decay amplitudes, or interference between a $B^{0}$ decay amplitude and that of a $\bar{B}^{0}$ following $B^{0}$-$\bar{B}^{0}$ mixing. These two phenomena are quantified by the parameters $A$ and $S$, respectively. The parameter $A$ is also denoted as $-C$ in the literature. Neglecting subleading amplitudes with a different weak phase and $C\!P$ violation in mixing, we expect $A = 0$ and $S = \sin 2\phi_{1}$, where $\phi_{1}\equiv\arg(-V_{cd}V_{cb}^{*}/V_{td}V_{tb}^{*})$. The parameter $\sin 2\phi_{1}$ is measured to be $0.70\pm 0.02$ [4] in decays mediated by $b\to c\bar{c}s$ transitions such as $B^{0}\to J/\psi K_{S}^{0}$. However, the contribution from a color- and CKM-suppressed $b\to u\bar{u}s$ tree amplitude, involving the bottom-to-up-quark transition via $W$-boson emission, introduces an extra weak phase [5-9]; this shifts the $S$ value from $\sin 2\phi_{1}$. The resulting difference, $\Delta S\equiv S-\sin 2\phi_{1}$, is estimated in a number of theoretical approaches. Predictions of $\Delta S$ based on QCD factorization range between 0.01 and 0.12 [5,10], while those based on SU(3) symmetry provide a less stringent lower bound of $-0.06$ [6,9,11]. Similarly, the predicted value of $A$ due to the color-suppressed tree amplitude ranges from $-0.01$ to 0.07 [5,6]. Deviations of $\Delta S$ and $A$ from their expected values would indicate either large subleading amplitudes or non-SM physics [12].
In this Letter, we report the first measurement of $A$ and $S$ in the $B^{0}\to K_{S}^{0}\pi^{0}$ decay from the Belle II experiment. We use a sample of $(387\pm 6)\times 10^{6}$ $B\bar{B}$ events collected in $e^{+}e^{-}$ collisions at a center-of-mass (c.m.) energy corresponding to the $\Upsilon(4S)$ resonance.
At $e^{+}e^{-}$ experiments operating near the $\Upsilon(4S)$ resonance, pairs of neutral $B$ mesons are coherently produced in the process $e^{+}e^{-}\to\Upsilon(4S)\to B^{0}\bar{B}^{0}$. When one of these $B$ mesons decays to a $C\!P$ eigenstate $f_{C\!P}$ such as $K_{S}^{0}\pi^{0}$, and the other to a flavor-specific final state $f_{\rm tag}$, the time-dependent decay rate is given by
$$P(\Delta t, q) = \frac{e^{-|\Delta t|/\tau_{B^{0}}}}{4\tau_{B^{0}}}\left[1 + q\left(A\cos(\Delta m_{d}\Delta t) + S\sin(\Delta m_{d}\Delta t)\right)\right], \tag{1}$$
where $\Delta t = t_{C\!P} - t_{\rm tag}$ is the difference in proper times between the two decays, $q$ is the flavor of the tag-side $B$ meson ($+1$ for $B^{0}$ and $-1$ for $\bar{B}^{0}$), $\tau_{B^{0}}$ is the $B^{0}$ lifetime, and $\Delta m_{d}$ is the $B^{0}$-$\bar{B}^{0}$ mixing frequency. This study employs a time-dependent $C\!P$ analysis method similar to that of previous measurements [13,14]. The main challenge is determining the location of the $B^{0}\to K_{S}^{0}\pi^{0}$ decay vertex, which is essential for the $\Delta t$ determination, in the absence of any charged particle originating from the vertex. The analysis is developed and tested with simulation and validated with a control sample of $B^{0}\to J/\psi K_{S}^{0}$ decays before examining the $B^{0}\to K_{S}^{0}\pi^{0}$ candidates in the data.
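As an illustration of how $A$ and $S$ enter the decay rate, the following minimal sketch evaluates the rate for each tag flavor and the resulting raw asymmetry. It assumes the standard form $e^{-|\Delta t|/\tau_{B^{0}}}/(4\tau_{B^{0}})\,[1 + q(A\cos\Delta m_{d}\Delta t + S\sin\Delta m_{d}\Delta t)]$ with no detector resolution or mistagging; the function names are hypothetical and are not part of the Belle II software.

```python
import math

TAU_B0 = 1.519      # B0 lifetime in ps (world average quoted in this Letter)
DELTA_M_D = 0.5065  # B0 mixing frequency in ps^-1 (world average quoted in this Letter)

def decay_rate(dt, q, A, S, tau=TAU_B0, dm=DELTA_M_D):
    """Ideal decay rate for tag flavor q (+1 or -1) at decay-time difference dt (ps).

    Assumed form: exp(-|dt|/tau)/(4 tau) * [1 + q (A cos(dm dt) + S sin(dm dt))].
    """
    oscillation = A * math.cos(dm * dt) + S * math.sin(dm * dt)
    return math.exp(-abs(dt) / tau) / (4.0 * tau) * (1.0 + q * oscillation)

def raw_asymmetry(dt, A, S, tau=TAU_B0, dm=DELTA_M_D):
    """Asymmetry between q = +1 and q = -1 rates at a given dt."""
    plus = decay_rate(dt, +1, A, S, tau, dm)
    minus = decay_rate(dt, -1, A, S, tau, dm)
    return (plus - minus) / (plus + minus)
```

In this idealized case the exponential envelope cancels in the ratio, so the asymmetry reduces to $A\cos(\Delta m_{d}\Delta t) + S\sin(\Delta m_{d}\Delta t)$; at $\Delta t = 0$ it equals $A$.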
The Belle II detector [15,16] operates at the SuperKEKB asymmetric-energy (4 GeV $e^{+}$ on 7 GeV $e^{-}$) collider [17]. The detector consists of several subdetectors surrounding the interaction region in a cylindrical geometry and is divided into two sections depending on the coverage in polar angle $\theta$. The two sections are the barrel ($32.2^{\circ} < \theta < 128.7^{\circ}$) and endcap ($12.4^{\circ} < \theta < 31.4^{\circ}$ or $130.7^{\circ} < \theta < 155.1^{\circ}$). The subdetectors most relevant for our study are a silicon-based vertex detector (VXD), a gas-based central drift chamber (CDC), and an electromagnetic calorimeter (ECL) made of CsI(Tl) crystals. The VXD is the innermost component, comprising two layers of pixel sensors surrounded by four layers of double-sided strip sensors [18]. The second pixel layer was incomplete, covering one-sixth of the azimuthal acceptance, for the data analyzed here. The VXD samples the trajectories of charged particles ("tracks") near the interaction region to determine the decay positions of their parent particles. The CDC is the main device for track reconstruction and measurements of particle momenta and charges. The ECL measures photon energies.
We analyze collision data recorded at the $\Upsilon(4S)$ resonance, corresponding to an integrated luminosity of 362 fb$^{-1}$. We use large samples of simulated $\Upsilon(4S)\to B\bar{B}$ and $e^{+}e^{-}\to q\bar{q}$ ($q = u, d, s, c$) events to optimize the event selection and study background distributions. Simulated $B^{0}\to K_{S}^{0}\pi^{0}$ events are used to model signal decays and calculate the reconstruction efficiency. We use EvtGen [19] to generate $\Upsilon(4S)\to B\bar{B}$ with the subsequent $B$-meson decays and Photos [20] to incorporate final-state radiation from charged particles. The simulation of $q\bar{q}$ background relies on the Kkmc generator [21] interfaced to Pythia [22]. The detector response for final-state particles is simulated with Geant4 [23]. Events are reconstructed using the Belle II software [24,25].
Candidate $K_{S}^{0}$ mesons are reconstructed from pairs of oppositely charged tracks, which are assumed to be pions and fit to a common vertex. The resulting invariant mass is required to lie between 489 MeV and 507 MeV, corresponding to a $\pm 3\sigma$ range around the known $K_{S}^{0}$ mass [26], with $\sigma$ being the resolution. We suppress contamination from prompt $K_{S}^{0}$ candidates and $\Lambda$ decays using two boosted-decision-tree (BDT) classifiers [27]. These BDTs rely mostly on kinematic information from the $K_{S}^{0}$ and its decay products.
Photons are identified as isolated energy deposits in the ECL that are not matched to any track in the CDC. We reconstruct $\pi^{0}$ candidates from pairs of photons that have energies greater than 35 (153) MeV if reconstructed in the barrel (endcap) ECL. The different energy thresholds are used to suppress beam background, which is higher in the endcap than in the barrel section. We require the diphoton mass to lie between 116 MeV and 150 MeV ($\pm 3\sigma$ range in resolution around the $\pi^{0}$ mass [26]). The absolute cosine of the angle between the higher-energy photon's direction in the $\pi^{0}$ rest frame and the $\pi^{0}$ direction in the lab frame must also be less than 0.972. These criteria reduce contributions from misreconstructed $\pi^{0}$ candidates. To improve the momentum resolution, we perform a kinematic fit with the diphoton mass constrained to the known $\pi^{0}$ mass [26].
A neutral $B$-meson candidate is reconstructed by combining a $K_{S}^{0}$ candidate with a $\pi^{0}$ candidate. Two kinematic variables are used to select signal $B$ candidates: the beam-energy-constrained mass ($M_{\rm bc}$) and the energy difference ($\Delta E$). These are calculated as
$$M_{\rm bc} = \sqrt{E_{\rm beam}^{2} - |\vec{p}_{B}|^{2}}, \tag{2}$$
$$\Delta E = E_{B} - E_{\rm beam}, \tag{3}$$
where $E_{\rm beam}$ is the beam energy, and $\vec{p}_{B}$ and $E_{B}$ are the momentum and energy, respectively, of the $B$ meson. All quantities are calculated in the c.m. frame. Correctly reconstructed signal candidates peak in $M_{\rm bc}$ at the known $B^{0}$ mass [26], and peak in $\Delta E$ at zero. For $B^{0}\to K_{S}^{0}\pi^{0}$, the higher-energy photon from the $\pi^{0}$ decay causes a significant correlation between $M_{\rm bc}$ and $\Delta E$ due to leakage of energy deposited in the ECL. To reduce this correlation, when calculating $\vec{p}_{B}$ in Eq. (2) we replace the magnitude of the $\pi^{0}$ momentum with $\sqrt{(E_{\rm beam} - E_{K_{S}^{0}})^{2} - m_{\pi^{0}}^{2}}$, where $E_{K_{S}^{0}}$ is the $K_{S}^{0}$ energy in the c.m. frame and $m_{\pi^{0}}$ is the known $\pi^{0}$ mass. Simulation shows that the modified $M_{\rm bc}$ ($M'_{\rm bc}$) reduces the linear correlation coefficient from 19% to $-1$% and has an improved resolution over that of $M_{\rm bc}$. We retain candidate events satisfying $5.24 < M'_{\rm bc} < 5.29$ GeV and $|\Delta E| < 0.30$ GeV.
To measure the decay-time difference $\Delta t$, we must determine the positions of the signal and tag-side $B$ decay vertices. These vertices are obtained using information from the position and spread of the $e^{+}e^{-}$ interaction region, which is modeled as a three-dimensional Gaussian distribution. The signal $B$ vertex position is obtained by projecting the $K_{S}^{0}$ flight direction, determined from its decay vertex and momentum, back to the interaction region. The intersection of the $K_{S}^{0}$ flight projection with the interaction region provides a good estimate of the signal $B$ decay vertex, since both the transverse flight length of the $B^{0}$ meson ($\approx 40$ µm) and the transverse size of the interaction region ($\approx 10$ µm) are small compared to the $B^{0}$ flight length along the boost direction ($\approx 140$ µm). The tag-side vertex is reconstructed with tracks that are not associated with the $B^{0}\to K_{S}^{0}\pi^{0}$ candidate. Such tracks must have a minimum momentum of 50 MeV and at least one hit in each of the pixel (PXD), strip (SVD), and CDC subdetectors. We also apply an interaction-region constraint similar to that used on the signal side. We approximate $\Delta t$ to be $\Delta\ell/(\beta\gamma\gamma^{*})$, where $\Delta\ell$ is the distance between signal and tag-side vertices along the $e^{-}$ beam direction, $\beta\gamma$ ($\approx 0.28$) is the Lorentz boost of the $\Upsilon(4S)$ in the lab frame, and $\gamma^{*}$ ($\approx 1.002$) is the Lorentz factor of the $B^{0}$ meson in the c.m. frame.
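The kinematic variables and the vertex-separation-to-time conversion described above can be sketched as follows. This is a minimal illustration, not the Belle II implementation: the function names are hypothetical, the modified $M_{\rm bc}$ assumes that the $\pi^{0}$ momentum magnitude is replaced by $\sqrt{(E_{\rm beam} - E_{K_{S}^{0}})^{2} - m_{\pi^{0}}^{2}}$ while its direction is kept, and the factor of $c$ is written explicitly so that $\Delta\ell$ can be given in micrometres.

```python
import math

C_UM_PER_PS = 299.792458  # speed of light in micrometres per picosecond
M_PI0 = 0.1349768         # known pi0 mass in GeV

def mbc_and_delta_e(e_beam, p_b, e_b):
    """Return (M_bc, Delta E) in GeV from c.m.-frame beam energy, B momentum, B energy."""
    p_squared = sum(c * c for c in p_b)
    return math.sqrt(e_beam ** 2 - p_squared), e_b - e_beam

def modified_mbc(e_beam, p_ks, e_ks, p_pi0):
    """M'_bc: keep the pi0 direction, but set its momentum magnitude from E_beam and E_KS.

    The replacement magnitude sqrt((E_beam - E_KS)^2 - m_pi0^2) is an assumed form.
    """
    measured = math.sqrt(sum(c * c for c in p_pi0))
    constrained = math.sqrt((e_beam - e_ks) ** 2 - M_PI0 ** 2)
    scale = constrained / measured
    p_b = [k + scale * p for k, p in zip(p_ks, p_pi0)]
    return math.sqrt(e_beam ** 2 - sum(c * c for c in p_b))

def delta_t_ps(delta_l_um, beta_gamma=0.28, gamma_star=1.002):
    """Approximate decay-time difference (ps) from vertex separation (micrometres)."""
    return delta_l_um / (beta_gamma * gamma_star * C_UM_PER_PS)
```

Because the constrained magnitude does not use the measured photon energies, a mismeasured $\pi^{0}$ energy no longer propagates into $M'_{\rm bc}$, which is the motivation for the reduced correlation with $\Delta E$.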
We employ a BDT classifier that uses 32 event-topology variables to distinguish the $q\bar{q}$ background from $B$-meson decays. The following variables provide the most discrimination: modified Fox-Wolfram moments [28], CLEO cones [29], the thrust value [30] of the rest of the event, and the cosine of the angle between the thrust axis of the signal $B$ and that of the rest of the event. The BDT is trained on samples of simulated $e^{+}e^{-}\to q\bar{q}$ and signal events, each equivalent to about three times the size of the dataset. The BDT outputs a single variable ($C_{\rm BDT}$) that ranges from zero for background-like events to one for signal-like events. We require $C_{\rm BDT}$ to be greater than 0.6, which rejects about 93% of the $q\bar{q}$ background while preserving 80% of the signal. The remainder of the $C_{\rm BDT}$ distribution strongly peaks near 1.0 for signal, leading to difficulty in modeling it with an analytic function. We thus transform it into a new variable, $C'_{\rm BDT} = \log\left[(C_{\rm BDT} - 0.6)/(1.0 - C_{\rm BDT})\right]$, where 0.6 (1.0) is the minimum (maximum) possible value of the remaining $C_{\rm BDT}$ distribution. The $C'_{\rm BDT}$ distribution can be parametrized with a sum of Gaussian functions, and $C'_{\rm BDT}$ is later used as a fit variable.
After applying all selection criteria, 3% of the events have more than one $B$ candidate. Such multiple candidates come from random combinations of final-state particles. In events with multiple candidates, we choose the one with the largest $p$-value resulting from the $\pi^{0}$-mass-constrained fit; if that criterion is ambiguous, we select the candidate with the largest $p$-value from the $K_{S}^{0}$-vertex fit. This selection retains the correct $B$ candidate in 87% of simulated events that have multiple candidates.
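The transformation of the classifier output can be illustrated with a short sketch. It assumes the logarithmic form $\log[(C_{\rm BDT} - C_{\min})/(C_{\max} - C_{\rm BDT})]$ with $C_{\min} = 0.6$ and $C_{\max} = 1.0$; the function name is hypothetical.

```python
import math

def transform_bdt(c_bdt, c_min=0.6, c_max=1.0):
    """Map a classifier output in (c_min, c_max) onto the whole real line.

    Assumed form: log((c_bdt - c_min) / (c_max - c_bdt)). Values piled up just
    below c_max are stretched out towards large positive values.
    """
    if not c_min < c_bdt < c_max:
        raise ValueError("c_bdt must lie strictly between c_min and c_max")
    return math.log((c_bdt - c_min) / (c_max - c_bdt))
```

The mapping is monotonic and vanishes at the midpoint of the allowed range, so a requirement such as $C'_{\rm BDT} > 0.0$ corresponds to $C_{\rm BDT} > 0.8$ under this assumed form.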
The signal efficiency after all selection criteria are applied ($\varepsilon_{\rm rec}$) is 20%. Simulation studies show that 1.7% of signal candidates are incorrectly reconstructed by including a final-state particle from the tag-side $B$ meson. We consider this small component, arising mostly from misreconstructed $\pi^{0}$ candidates, as part of the signal.
The flavor of the tag-side $B^{0}$ meson, $q$, is determined from the properties of final-state particles that are not associated with the reconstructed $B^{0}\to K_{S}^{0}\pi^{0}$ decay. We use a category-based multivariate flavor-tagging algorithm for this purpose [31]. The algorithm outputs two parameters: the $b$-flavor charge $q$ and an event-by-event tagging quality factor $r$, which ranges from zero for no flavor discrimination to one for unambiguous flavor assignment. The dataset is divided into seven $r$ bins that contain similar numbers of events, but have different signal-to-background ratios.
We select events in which $\Delta t$ is well measured by requiring $|\Delta t| < 10.0$ ps and $\sigma_{\Delta t} < 2.5$ ps, where $\sigma_{\Delta t}$ is the uncertainty on $\Delta t$, estimated event-by-event. The $\Delta t$ distribution of these events is fitted to determine $A$ and $S$. For the remaining events, about 40%, the $\Delta t$ distribution is not included in the fit. However, these events are still useful to constrain $A$, which is sensitive to the relative yields of $B^{0}$ and $\bar{B}^{0}$ decays. We thus perform a simultaneous extended maximum-likelihood fit to both subsamples in seven $r$ bins [13]. For each subsample, the likelihood function includes one-dimensional probability density functions (PDFs) for $M'_{\rm bc}$, $\Delta E$, and $C'_{\rm BDT}$; for the first subsample, the likelihood also includes a PDF for $\Delta t$ that depends on the flavor tag $q$. The PDFs for $M'_{\rm bc}$, $\Delta E$, and $C'_{\rm BDT}$ are taken to be the same for both subsamples, as found in simulation.
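The PDFs for the signal component are as follows: $M'_{\rm bc}$ is modeled with the sum of a Crystal Ball function [32] and a Gaussian function with a common mean; $\Delta E$ with the sum of a Crystal Ball and two Gaussian functions, all three with a common mean; and $C'_{\rm BDT}$ with the sum of asymmetric and symmetric Gaussian functions. The $\Delta t$ PDF is given by
$$P_{\rm sig}(\Delta t, q) = \frac{e^{-|\Delta t|/\tau_{B^{0}}}}{4\tau_{B^{0}}}\Big\{\big[1 - q\Delta w_{r} + q\Delta\varepsilon_{{\rm tag},r}(1 - 2w_{r})\big] + \big[q(1 - 2w_{r}) + \Delta\varepsilon_{{\rm tag},r}(1 - q\Delta w_{r})\big]\big[A\cos(\Delta m_{d}\Delta t) + S\sin(\Delta m_{d}\Delta t)\big]\Big\}\otimes R_{\rm sig}, \tag{4}$$
where $w_{r}$ is the fraction of wrongly tagged events; $\Delta w_{r}$ is the difference in $w_{r}$ between $B^{0}$ and $\bar{B}^{0}$; $\Delta\varepsilon_{{\rm tag},r}$ is the asymmetry in their tagging efficiencies, which are the fractions of $B^{0}$ or $\bar{B}^{0}$ signal candidates to which a flavor tag is assigned; and $R_{\rm sig}$ is the $\Delta t$ resolution function.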
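To show how imperfect flavor tagging dilutes the observable asymmetry, the sketch below evaluates a commonly used form of the tagged signal PDF before convolution with the resolution function. The functional form, the argument names (`w`, `dw`, `deps`), and the function name are assumptions made for illustration; the resolution function $R_{\rm sig}$ is deliberately left out.

```python
import math

def signal_dt_pdf(dt, q, A, S, w, dw, deps, tau=1.519, dm=0.5065):
    """Tagged signal dt PDF without resolution smearing (assumed form).

    w    : wrong-tag fraction in the r bin
    dw   : difference in wrong-tag fraction between B0 and anti-B0
    deps : tagging-efficiency asymmetry between B0 and anti-B0
    """
    oscillation = A * math.cos(dm * dt) + S * math.sin(dm * dt)
    constant = 1.0 - q * dw + q * deps * (1.0 - 2.0 * w)
    coefficient = q * (1.0 - 2.0 * w) + deps * (1.0 - q * dw)
    envelope = math.exp(-abs(dt) / tau) / (4.0 * tau)
    return envelope * (constant + coefficient * oscillation)
```

With `w = dw = deps = 0` this reduces to the ideal rate, while `w = 0.5` with `dw = deps = 0` removes all dependence on the tag flavor, which is why events in low-$r$ bins carry little information on $S$.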
The resolution function is described by a double Gaussian convolved with an exponential function; the Gaussian means and widths are scaled by $\sigma_{\Delta t}$. The $\Delta t$ resolution is dominated by the signal-side $K_{S}^{0}$. Simulation shows that the $\sigma_{\Delta t}$ distributions for signal and background are the same. We fix $\tau_{B^{0}}$ and $\Delta m_{d}$ to the world averages of $1.519\pm 0.004$ ps and $0.5065\pm 0.0019$ ps$^{-1}$, respectively [4]. The tagging parameters ($w_{r}$, $\Delta w_{r}$, and $\Delta\varepsilon_{{\rm tag},r}$) are fixed to values obtained from $B^{0}\to D^{(*)-}\pi^{+}$ decays [31]. The effective tagging efficiency $\varepsilon_{\rm eff} = \sum_{r}\varepsilon_{{\rm tag},r}(1 - 2w_{r})^{2}$ is $(30.0\pm 1.2)$%, where $\varepsilon_{{\rm tag},r}$ is the tagging efficiency for the $r$-th bin. The $w_{r}$ and $\Delta\varepsilon_{{\rm tag},r}$ values are in the ranges 2%-48% and 0.8%-3.6%, respectively. All signal shape parameters are fixed to values obtained from simulation and calibrated with control samples as described below.
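The effective tagging efficiency is a sum over $r$ bins of the tagging efficiency weighted by the squared dilution $(1 - 2w_{r})^{2}$. A minimal sketch follows; the function name is hypothetical, and any per-bin inputs passed to it in an example would be illustrative rather than the calibrated values of Ref. [31].

```python
def effective_tagging_efficiency(eps_tag, wrong_tag):
    """Sum over r bins of eps_tag[r] * (1 - 2 * wrong_tag[r])**2.

    eps_tag   : per-bin tagging efficiencies (fractions of candidates in each bin)
    wrong_tag : per-bin wrong-tag fractions w_r
    """
    if len(eps_tag) != len(wrong_tag):
        raise ValueError("need one wrong-tag fraction per r bin")
    return sum(e * (1.0 - 2.0 * w) ** 2 for e, w in zip(eps_tag, wrong_tag))
```

A bin with $w_{r}$ close to 50% contributes almost nothing, whereas a bin with $w_{r}$ of a few percent contributes nearly its full tagging efficiency; the statistical power of a time-dependent measurement scales approximately with this effective efficiency.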
For the $q\bar{q}$ background, an ARGUS function [33] is used for $M'_{\rm bc}$, a straight line for $\Delta E$, and the sum of asymmetric and symmetric Gaussian functions for $C'_{\rm BDT}$. The $\Delta t$ distribution is modeled with the signal resolution function $R_{\rm sig}$, as this background is dominated by prompt $K_{S}^{0}$ decays. We float the $q\bar{q}$ background yield, ARGUS curvature parameter, and $\Delta E$ slope, but fix the ARGUS endpoint and the $C'_{\rm BDT}$ and $\Delta t$ shape parameters to the values obtained from the data sideband $5.24 < M'_{\rm bc} < 5.27$ GeV. All $q\bar{q}$ shape parameters are taken to be identical for all $r$ bins.
For the $B\bar{B}$ background, a two-dimensional kernel density estimation PDF [34] is used to model the ($M'_{\rm bc}$, $\Delta E$) distribution, and the sum of asymmetric and symmetric Gaussian functions is used for $C'_{\rm BDT}$. The $\Delta t$ distribution is modeled with an exponential function convolved with $R_{\rm sig}$. We float the yield of $B\bar{B}$ background and fix its shape parameters from a fit to the simulated sample.
We correct the common mean and core width of the signal $M'_{\rm bc}$, $\Delta E$, and $C'_{\rm BDT}$ PDF shapes for possible differences between data and simulation according to values obtained from a control sample of $B^{+}\to\bar{D}^{0}(\to K_{S}^{0}\pi^{0})\pi^{+}$ decays. To select these events, we apply the same $K_{S}^{0}$ and $\pi^{0}$ criteria as used for the signal channel. To ensure a similar $\pi^{0}$ momentum range for the signal and control channels, we require a minimum $\pi^{0}$ momentum of 1.5 GeV. We perform an unbinned maximum-likelihood fit to the distributions of $M'_{\rm bc}$, $\Delta E$, and $C'_{\rm BDT}$, using PDF shapes similar to those employed to describe the signal decay.
To validate the fitting procedure, we use a control sample of $B^{0}\to J/\psi(\to\mu^{+}\mu^{-})K_{S}^{0}$ decays. To mimic the signal decay, we do not use information from the two muon tracks to reconstruct the signal $B$ decay vertex. We perform an unbinned maximum-likelihood fit to the distributions of $M_{\rm bc}$ and $\Delta t$, using PDF shapes and resolution functions similar to those employed in the fit to the signal sample. The measured $B^{0}$ lifetime, $A$, and $S$ are $1.46\pm 0.05$ ps, $0.10\pm 0.07$, and $0.76\pm 0.12$, respectively, where the uncertainties are statistical only. These results are consistent with their world-average values [4], thus validating our $B^{0}\to K_{S}^{0}\pi^{0}$ fitting procedure. The above sample is also used to correct the common mean and core width of the resolution function for possible differences between data and simulation.
Figure 1 shows the $M'_{\rm bc}$, $\Delta E$, $C'_{\rm BDT}$, and $\Delta t$ distributions in the data along with the fit projections overlaid. For these plots, the seven $r$ bins have been combined, and for all plots except $\Delta t$, both data subsamples (described earlier) are included. In addition, for each plot the signal-enhancing criteria $5.27 < M'_{\rm bc} < 5.29$ GeV, $-0.15 < \Delta E < 0.10$ GeV, $|\Delta t| < 10.0$ ps, and $C'_{\rm BDT} > 0.0$ have been applied except for the variable displayed. Distributions of $\Delta t$ with fit projections overlaid are shown in the Supplementary Material [36]. The resulting signal yield $N_{\rm sig}$, $A$, and $S$ are $415^{+26}_{-25}$, $0.04^{+0.15}_{-0.14}$, and $0.75^{+0.20}_{-0.23}$, respectively. The correlation coefficient between the two asymmetries is $-0.17$%. From the signal yield, we determine the branching fraction as , which is consistent with the world average [4]. Here, f +0 is the fraction of $B^{0}\bar{B}^{0}$ or $B^{+}B^{-}$ production at the $\Upsilon(4S)$ resonance [37] and all quoted uncertainties are statistical.
The systematic uncertainties contributing to $A$ and $S$ are listed in Table I. We estimate the systematic uncertainty due to flavor tagging by individually varying the ($w_{r}$, $\Delta w_{r}$, $\Delta\varepsilon_{{\rm tag},r}$) parameters by their uncertainties for each $r$ bin, while considering correlations. The maximum deviations with respect to the nominal results are taken as systematic uncertainties. The uncertainty due to the $\Delta t$ resolution function is estimated in a similar fashion. In the nominal fit, we assume the $B\bar{B}$ background to be $C\!P$ symmetric. To account for a potential $C\!P$ asymmetry in the $B\bar{B}$ background, we perform a series of fits with the $\Delta t$ PDF formed by varying the $A$ and $S$ values for that background from $-1$ to $+1$ while fixing the effective lifetime value to that determined from simulation. We then calculate the deviations in signal $A$ and $S$ from their nominal values; the largest deviation is assigned as the systematic uncertainty. To evaluate the uncertainty due to a possible asymmetry in the $q\bar{q}$ background, we perform an alternative fit by fixing the asymmetry to that obtained from the data sideband defined earlier. The uncertainty due to the signal PDF shape is estimated using an alternative model based on kernel-density estimation. Similarly, the uncertainty due to the background PDF shape is calculated by varying all fixed parameters by their uncertainties and taking the maximum deviation from nominal results as the uncertainty.
A potential fit bias is checked for by performing an ensemble test comprising 1000 simulated experiments in which signal and $B\bar{B}$ background events are drawn from simulated samples and $q\bar{q}$ background events are generated according to their PDF shapes. We calculate the mean shifts of the fitted values of $A$ and $S$ from their input values and assign them as systematic uncertainties. The systematic uncertainty due to multiple candidate selection is evaluated by performing an alternative fit with all candidates and taking the difference with respect to the nominal value. The impact of misreconstructed signal candidates on $A$ and $S$ is negligible. Uncertainties due to fixed $\tau_{B^{0}}$ and $\Delta m_{d}$ values are calculated by varying these quantities by their uncertainties and repeating the fit; the resulting maximum variations in $A$ and $S$ are assigned as systematic uncertainties. Tag-side interference can arise due to the presence of both CKM-favored and CKM-suppressed tree amplitudes contributing to the tag-side decay [38]. The resulting impact is conservatively estimated by positing that all events are tagged with such hadronic decays. The uncertainty due to VXD misalignment is evaluated by reconstructing events with various misalignment hypotheses as done in Ref. [39]. Assuming all systematic sources to be independent, we add their contributions in quadrature to obtain the total systematic uncertainty of $\pm 0.047$ for $A$ and $\pm 0.040$ for $S$.
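The combination of independent systematic sources in quadrature can be sketched as follows. The function name is hypothetical, and any per-source numbers supplied to it in an example would be placeholders rather than the entries of Table I.

```python
import math

def quadrature_sum(contributions):
    """Total uncertainty from independent sources: sqrt of the sum of squares."""
    return math.sqrt(sum(c * c for c in contributions))
```

For instance, two independent sources of 0.03 and 0.04 combine to 0.05; a single dominant source therefore largely determines the total, and small contributions change it only marginally.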
In summary, we measure the $C\!P$-violating parameters $A$ and $S$ in $B^{0}\to K_{S}^{0}\pi^{0}$ decays using a sample of $387\times 10^{6}$ $B\bar{B}$ events. We obtain $A = 0.04^{+0.15}_{-0.14}\pm 0.05$ and $S = 0.75^{+0.20}_{-0.23}\pm 0.04$, where the first uncertainties are statistical and the second are systematic. This constitutes the first Belle II measurement of $C\!P$ asymmetries in the decay. Our results agree with previous determinations [13,14], and the precision obtained for $S$ is better than (similar to) that achieved at Belle (BABAR), despite using a data sample only 60-80% the size of the samples used in those experiments. The results are consistent with SM predictions and can provide useful constraints on non-SM physics.
This work, based on data collected using the Belle II detector, which was built and commissioned prior to March 2019, was supported by Sci-

FIG. 1: Distributions of (a) $M'_{\rm bc}$, (b) $\Delta E$, and (c) $C'_{\rm BDT}$ with fit projections overlaid for both $B^{0}$ and $\bar{B}^{0}$ candidates satisfying criteria $5.27 < M'_{\rm bc} < 5.29$ GeV, $-0.15 < \Delta E < 0.10$ GeV, $|\Delta t| < 10.0$ ps, and $C'_{\rm BDT} > 0.0$ (except for the variable displayed). The solid curve shows the fit projection, while various fit components are explained in the legends. Distribution of (d) $\Delta t$ for tagged $B^{0}$ and $\bar{B}^{0}$ candidates after subtracting background with the sPlot method [35]. The asymmetry, defined as $[N(B^{0}_{\rm tag}) - N(\bar{B}^{0}_{\rm tag})]/[N(B^{0}_{\rm tag}) + N(\bar{B}^{0}_{\rm tag})]$, is displayed underneath along with the fit projection.

TABLE I: Systematic uncertainties (absolute) contributing to the time-dependent $C\!P$ asymmetries.