Model-independent evidence for $J/\psi p$ contributions to $\Lambda_b^0\to J/\psi p K^-$ decays

The data sample of $\Lambda_b^0\to J/\psi p K^-$ decays acquired with the LHCb detector from 7 and 8~TeV $pp$ collisions, corresponding to an integrated luminosity of 3 fb$^{-1}$, is inspected for the presence of $J/\psi p$ or $J/\psi K^-$ contributions with minimal assumptions about $K^- p$ contributions. It is demonstrated at more than 9 standard deviations that $\Lambda_b^0\to J/\psi p K^-$ decays cannot be described with $K^- p$ contributions alone, and that $J/\psi p$ contributions play a dominant role in this incompatibility. These model-independent results support the previously obtained model-dependent evidence for $P_c^+\to J/\psi p$ charmonium-pentaquark states in the same data sample.

and uses the information contained in the Dalitz variables, (m 2 Kp , m 2 J/ψ p ), or equivalently in (m Kp , cos θ Λ * ), where θ Λ * is the helicity angle of the K − p system, defined as the angle between the p K and − p Λ 0 b (or − p J/ψ ) directions in the K − p rest frame. The (m Kp , cos θ Λ * ) plane is particularly suited for implementing constraints stemming from the H 0 hypothesis by expanding the cos θ Λ * angular distribution in Legendre polynomials P l : where N is the efficiency-corrected and background-subtracted signal yield, and P U l is an unnormalized Legendre moment of rank l, Under the H 0 hypothesis, K − p components cannot contribute to moments of rank higher than 2 J max , where J max is the highest spin of any K − p contribution at the given m Kp value. This requirement sets the appropriate l max value, which can be deduced from the lightest experimentally known Λ * resonances for each J, or from the quark model, as in Reflections from other channels, Λ 0 b → P + c K − , P + c → J/ψ p or Λ 0 b → Z − c p, Z − c → J/ψ K − , would introduce both low and high rank moments (see the supplemental material for an illustration). The narrower the resonance, the narrower the reflection and the higher the rank l of Legendre polynomials required to describe such a structure.
Selection criteria and backgrounds can also produce high-l structures in the cos θ Λ * distribution. Therefore, the data are efficiency-corrected and the background is subtracted. Even though testing the H 0 hypothesis involves only two dimensions, the selection efficiency has some dependence on the other phase-space dimensions, namely the Λ 0 b and J/ψ helicity angles, as well as angles between the Λ 0 b decay plane and the J/ψ and Λ * decay planes. Averaging the efficiency over these additional dimensions (Ω a ) would introduce biases dependent on the exact dynamics of the Λ * decays. Therefore, a six-dimensional efficiency correction is used. The efficiency parameterization, (m Kp , cos θ Λ * , Ω a ), is the same as that used in the amplitude analysis and is described in Sec. 5 of the supplement of Ref. [3].
In order to make the analysis as model-independent as possible, no interpretations are imposed on the m Kp distribution. Instead, the observed efficiency-corrected and background-subtracted histogram of m Kp is used. To obtain a continuous probability density function, F(m Kp |H 0 ), a quadratic interpolation of the histogram is performed, as shown in Fig. 2. The essential part of this analysis method is to incorporate the l ≤ l max (m Kp ) constraint on the Λ * helicity angle distribution: States predicted in Ref. [7] are shown as short horizontal bars (black) and experimentally well-established Λ * states are shown as green boxes covering the mass ranges from M 0 − Γ 0 to M 0 + Γ 0 . The m Kp mass range probed in Λ 0 b → J/ψ pK − decays is shown by long horizontal lines (blue). The l max (m Kp ) filter is shown as a stepped line (red). All contributions from Λ * states with J P values to the left of the red line are accepted by the filter. The filter works well also for the excitations of the Σ baryon [7,11] (not shown). lation between neighboring m Kp bins of where k is the bin index. Here the Legendre moments P N l k are normalized by the yield in the corresponding m Kp bin, since the overall normalization of F(cos θ Λ * |H 0 , m Kp ) to the data is already contained in the F(m Kp |H 0 ) definition. The data are used to determine Here the index i runs over selected J/ψ pK − candidates in the signal and sideband regions for the k th bin of m Kp (n cand k is their total number), i = (m Kp i , cos θ Λ * i , Ω a i ) is the efficiency correction, and w i is the background subtraction weight, which equals 1 for events in the signal region and −β n sig cand /n side cand for events in the sideband region. Values of P U l k are shown in Fig. 3.
[GeV] Instead of using the two-dimensional (2D) distribution of (m Kp , cos θ Λ * ) to evaluate the consistency of the data with the H 0 hypothesis, now expressed by the l ≤ l max (m Kp ) requirement, it is more convenient to use the m J/ψ p (m J/ψ K ) distribution, as any deviations from H 0 should appear in the mass region of potential pentaquark (tetraquark) resonances. The projection of F(m Kp , cos θ Λ * |H 0 ) onto m J/ψ p involves replacing cos θ Λ * with m J/ψ p and integrating over m Kp . This integration is carried out numerically, by generating large numbers of simulated events uniformly distributed in m Kp and cos θ Λ * , calculating the corresponding value of m J/ψ p , and then filling a histogram with F(m Kp , cos θ Λ * |H 0 ) as a weight. In Fig. 4, F(m J/ψ p |H 0 ) is compared to the directly obtained efficiency-corrected and background-subtracted m J/ψ p distribution in the data.
To probe the compatibility of F(m J/ψ p |H 0 ) with the data, a sensitive test can be constructed by making a specific alternative hypothesis (H 1 ). Following the method discussed in Ref. [13] H 1 is defined as l ≤ l large , where l large is not dependent on m Kp and large enough to reproduce structures induced by J/ψ p or J/ψ K contributions. The significance of the l max (m Kp ) ≤ l ≤ l large Legendre moments is probed using the likelihood ratio test: with normalizations I H 0,1 determined via Monte Carlo integration. Note that the explicit event-by-event efficiency factor cancels in the likelihood ratio, but enters the likelihood normalizations. In order for the test to have optimal sensitivity, the value l large should be set such that the statistically significant features of the data are properly described.
[GeV] Beyond that the power of the test deteriorates. The limit l large → ∞ would result in a perfect description of the data, but a weak test since then the test statistic would pick up the fluctuations in the data. For the same reason it is also important to choose l large independently of the actual data. Here l large = 31 is taken, one unit larger than the value used in the model-independent analysis of B 0 → ψ(2S)π + K − [13], as baryons have half-integer spins. The result for F(m J/ψ p |H 1 ) is shown in Fig. 4, where it is seen that l large = 31 is sufficient. To make F(m J/ψ p |H 0,1 ) continuous, quadratic splines are used to interpolate between nearby m J/ψ p bins. The numerical representations of H 0 and of H 1 contain a large number of parameters, requiring extensive statistical simulations to determine the distribution of the test variable for the H 0 hypothesis: F t (∆(−2 ln L)|H 0 ). A large number of pseudoexperiments are generated with n sig cand and n side cand equal to those obtained in the data. The signal events, contributing a fraction (1 − β) to the signal region sample, are generated according to the F(m Kp , cos θ Λ * |H 0 ) function with parameters determined from the data. They are then shaped according to the (m Kp , cos θ Λ * , Ω a ) function, with the Ω a angles generated uniformly in phase space. The latter is an approximation, whose possible impact is discussed later. Background events in sideband and signal regions are generated according to the 6D background parameterization previously developed in the amplitude analysis of the same data (Ref. [3] supplement). The pseudoexperiments are subject to the same analysis procedure as the data. The distribution of values of ∆(−2 ln L) over more than 10 000 pseudoexperiments determines the form of F t (∆(−2 ln L)|H 0 ), which can then be used to convert the ∆(−2 ln L) value obtained from data into a corresponding p-value. A small p-value indicates non-Λ * contributions in the data. A large p-value means that the data are consistent with the Λ * -only hypothesis, but does not rule out other contributions.
Before applying this method to the data, it is useful to study its sensitivity with the help of amplitude models. Pseudoexperiments are generated according to the 6D amplitude model containing only Λ * resonances (the reduced model in Table 1 of Ref. [3]), along with efficiency effects. The distribution of ∆(−2 ln L) values is close to that expected from F t (∆(−2 ln L)|H 0 ) (black open and red falling hatched histograms in Fig. 5), thus verifying the 2D model-independent procedure on one example of the Λ * model. They also indicate that the non-uniformities in (Ω a ) are small enough not to significantly bias the F t (∆(−2 ln L)|H 0 ) distribution when approximating the Ω a probability density via a uniform distribution. To test the sensitivity of the method to an exotic P + c → J/ψ p resonance, the amplitude model described in Ref. [3] is used, but with the P c (4450) + contribution removed. Generating many pseudoexperiments from this amplitude model produces a distribution of ∆(−2 ln L), which is almost indistinguishable from the F t (∆(−2 ln L)|H 0 ) distribution (blue dotted and red falling hatched histograms in Fig. 5), thus predicting that for such a broad P c (4380) + resonance (Γ 0 = 205 MeV) the false H 0 hypothesis is expected to be accepted (type II error), because the P c (4380) + contribution inevitably feeds into the numerical representation of H 0 . Simulations are then repeated while reducing the P c (4380) + width by subsequent factors of two, showing a dramatic increase in the power of the test (histograms peaking at 60 and 300). Figure 5 also shows the ∆(−2 ln L) distribution obtained with the narrow P c (4450) + state restored in the amplitude model and P c (4380) + at its nominal 205 MeV width (black rising hatched histogram). The separation from F t (∆(−2 ln L)|H 0 ) is smaller than that of the simulation with a P c (4380) + of comparable width (51 MeV) due to the smaller P c (4450) + fit fraction. Nevertheless, the separation from F t (∆(−2 ln L)|H 0 ) is clear; thus, if this amplitude model is a good representation of the data, the H 0 hypothesis is expected to essentially always be rejected.
The value of the ∆(−2 ln L) test variable obtained from the data is significantly above the F t (∆(−2 ln L)|H 0 ) distribution (see the inset of Fig. 5). To estimate a p-value the simulated F t (∆(−2 ln L)|H 0 ) distribution is fitted with a bifurcated Gaussian function (asymmetric widths); the significance of the H 0 rejection is 10.1 σ standard deviations.
To test the sensitivity of the result to possible biases from the background subtraction, either the left or the right sideband is exclusively used, and the weakest obtained rejection of H 0 is 9.8 σ. As a further check, the sideband subtraction is performed with the sPlot technique [16], in which the w i weights are obtained from the fit to the m J/ψ pK distribution for candidates in the entire fit range. This increases the significance of the H 0 rejection to 10.4 σ. Loosening the cut on the boosted decision tree variable discussed in Ref. [3] increases the signal efficiency by 14%, while doubling the background fraction β, and causes the significance of the H 0 rejection to increase to 11.1 σ. Replacing the uniform generation of the Ω a angles in the H 0 pseudoexperiments with that of the amplitude model without the P c (4380) + and P c (4450) + states, but generating (m Kp , cos θ Λ * ) in the model-independent way, results in a 9.9 σ H 0 rejection.  Figure 4 indicates that the rejection of the H 0 hypothesis has to do with a narrow peak in the data near 4450 MeV. Determination of any P + c parameters is not possible without a model-dependent analysis, because P + c states feed into the numerical representation of H 0 in an intractable manner.
The H 0 testing is repeated using m J/ψ K instead of m J/ψ p . The m J/ψ K distribution, with F(m J/ψ K |H 0 ) and F(m J/ψ K |H 1 ) superimposed, is shown in Fig. 6. The ∆(−2 ln L) test gives a 5.3 σ rejection of H 0 , which is lower than the rejection obtained using m J/ψ p , thus providing model-independent evidence that non-Λ * contributions are more likely of the P + c → J/ψ p type. Further, in the model-dependent amplitude analysis [3], it was seen that the P c states reflected into the m J/ψ K distribution in the region in which F(m J/ψ K |H 0 ) disagrees with the data.
In summary, it has been demonstrated at more than 9 standard deviations that the Λ 0 b → J/ψ pK − decays cannot all be attributed to K − p resonant or nonresonant contributions. The analysis requires only minimal assumptions on the mass and spin of the K − p contributions; no assumptions on their number, their resonant or nonresonant nature, or their lineshapes have been made. Non-K − p contributions, which must be present in the data, can be either of the exotic hadron type, or due to rescattering effects among ordinary hadrons. This result supports the amplitude model-dependent observation of the J/ψ p resonances presented previously [3].
We express our gratitude to our colleagues in the CERN accelerator departments for the excellent performance of the LHC. We thank the technical and administrative staff at the LHCb institutes. We acknowledge support from CERN and from the national agencies:

Appendix: Supplemental material 1 Data sample
The definition of the signal and sideband regions is illustrated in Fig. 7. The backgroundsubtracted and efficiency-corrected distribution of the data on the rectangular Dalitz plane (m Kp , cos θ Λ * ) is shown in Fig. 8.  Figure 7: Distribution of m J/ψ pK in the data with the fit of signal and background components superimposed [3]. The fit is used to determine the background fraction β in the ±2σ signal region around the Λ 0 b peak (shown by the vertical red bars). The sidebands used in the background subtraction are also shown.

Simulations based on amplitude models
The rectangular Dalitz plane (m Kp , cos θ Λ * ) distributions for the large statistics pseudosamples generated from the amplitude model with only the Λ * resonances and from the amplitude model with only the P c (4380) + and P c (4450) + resonances are shown in Figs. 9 and 10, respectively. Parameters of the models, without and with the P + c states, were determined by fitting the amplitude models to the data as described in Ref. [3].
The Legendre moments of cos θ Λ * distributions ( P U l k ) in various bins of m Kp are compared between these two simulated pseudo-samples in Fig. 11. The l ≤ l max (m Kp ) filter, used in forming a numerical representation of the hypothesis that only K − p contributions are present (H 0 ), is also illustrated in Fig. 11: moments in the shaded regions (l > l max (m Kp )) are neglected. The pentaquark resonances can induce significant values of the moments in these regions, as illustrated with the example amplitude model containing only P + c states. The P + c states also contribute significantly to the unshaded l ≤ l max (m Kp ) regions, thus feeding into the numerical representation of the H 0 hypothesis, and decreasing the sensitivity of the model-independent approach to exotic hadron contributions. This is especially true for wide resonances, which contribute very little to high moments, as illustrated for the P c (4380) + state in Fig. 12. The example amplitude model with only Λ * resonances contributes to the unshaded regions only, as expected.