Probing axion-like particles with γγ final states from vector boson fusion processes at the LHC

We perform a feasibility study to search for axion-like particles (ALPs) using vector boson fusion (VBF) processes at the LHC. We work in an effective field theory framework with cutoff scale Λ and ALP mass $m_a$, and assume that ALPs couple to photons with strength ∝ 1/Λ. Assuming proton-proton collisions at $\sqrt{s} = 13$ TeV, we present the total VBF ALP production cross sections, ALP decay widths and lifetimes, and relevant kinematic distributions as a function of $m_a$ and Λ. We consider the a → γγ decay mode to show that the requirement of an energetic diphoton pair combined with two forward jets with large dijet mass and pseudorapidity separation can significantly reduce the Standard Model backgrounds, leading to a 5σ discovery reach for 10 MeV ≲ $m_a$ ≲ 1 TeV with Λ ≲ 2 TeV, assuming an integrated luminosity of 3000 fb$^{-1}$. In particular, this extends the LHC sensitivity to a previously unstudied region of the ALP parameter space.

Laboratory-based searches for ALPs coupling to photons fall into several distinct categories depending on the regions of parameter space they are sensitive to. Light-shining-through-wall experiments [10], which rely on ALP-photon conversions, probe smaller ALP masses $m_a \lesssim 10^{-3}$ eV; beam dumps that rely on ALP decay typically probe larger masses, ∼ O(1 MeV - 1 GeV); while hybrid proposals like PASSAT probe an intermediate regime $m_a \lesssim 100$ eV.
High-energy colliders are sensitive to a large swathe of the ALP mass and ALP-photon coupling parameter space. Theoretical studies of ALPs at the LHC and future colliders arising from the on-shell decays h → aa, h → Za and Z → γa have been performed by several authors [29][30][31][32]. Constraints from LEP arise from associated production of ALPs via $e^+e^- \to \gamma a \to 3\gamma$ and $e^+e^- \to Z \to \gamma a \to 3\gamma$. On the other hand, exotic decays of the Higgs and the Z form the basis of many LHC searches, for example via pp → h → Za → Zγγ.
The purpose of this paper is to perform a careful investigation of ALPs at the LHC arising from photon fusion processes utilizing the vector boson fusion (VBF) topology, assuming that ALPs couple to SM photons. The relevant Feynman diagram is shown in Fig. 1. An early study in this direction was performed by the authors of [33] using LHC data from 2011 and 2012. In the mass window 100 GeV < $m_a$ < 160 GeV, ATLAS VBF Higgs searches [34,35] were used to establish upper limits on the allowed signal cross section in each $m_a$ bin. For higher masses, the constraints were obtained directly by comparing the observed number of events in the diphoton mass spectrum with the expected background distribution, while for lower masses down to $m_a \sim 50$ GeV, ATLAS measurements of photon pair production were used [36].
In our work, we perform an updated study of ALPs using the VBF topology, down to ALP masses at the MeV scale, below which they decay outside the detector. The VBF topology has been proposed as an effective tool for a variety of beyond-SM searches, such as dark matter [37][38][39], supersymmetry [40][41][42][43][44][45][46], Z′ [47], heavy neutrinos [48] and heavy spin-2 resonances [49]. As we will show, it is particularly effective for probing ALPs. Our results are summarized in Fig. 13, where we see that VBF enables sensitivity to a regime of parameter space that is not covered by any other experiment.

II. SAMPLES AND SIMULATION
In the case of the axion signal samples, the model files were generated using the FeynRules package [50] and obtained from Ref. [...]. The effective Lagrangian of this model represents the interactions of $a$ with the photon (γ) and the Z boson. The Wilson coefficients $C_{\gamma\gamma}$, $C_{\gamma Z}$, and $C_{ZZ}$ govern the a → γγ, a → γZ, and a → ZZ decays, respectively. In this Lagrangian, $s_w$ and $c_w$ are the sine and cosine of the weak mixing angle, $F_{\mu\nu}$ and $Z_{\mu\nu}$ are the field strength tensors of γ and Z, and Λ is the symmetry breaking scale. We produced several signal samples considering various values of $A \equiv C_{ij}/\Lambda^{2}$. For the purpose of the studies shown in this paper, we set the value of the coefficients $C_{ij}$ to unity, but scenarios with different values can be derived by appropriately rescaling the production cross sections.
The signal samples were produced for a variety of axion masses, ranging from 1 MeV to several TeV. The value of Λ was varied between 1000 GeV and 4000 GeV for every ALP mass point generated. Pure electroweak production of an ALP and two additional jets (i.e. pp → ajj with suppressed QCD coupling, $\alpha_{\rm QCD}^{0}$) was considered. At MadGraph level, jets were required to have $p_T > 20$ GeV, $|\eta| < 5$, a pseudorapidity gap of $|\Delta\eta_{jj}| > 2.4$, and a reconstructed dijet mass of $m_{jj} > 120$ GeV. The parton-level $\Delta\eta_{jj}$ and $m_{jj}$ requirements reduce the contributions from s-channel gluon-gluon fusion (gg → a) and associated ALP production diagrams (e.g., $q\bar{q} \to Z^{*} \to Za \to jja$), which can result in a similar final state, in order to optimize the VBF ajj statistics in our samples. Figure 2 shows the ajj production cross section, with the parton-level requirements described above, as a function of $m_a$ for varying values of Λ. Photons were required to have transverse momentum greater than 10 GeV and to be located in the central region of the ATLAS and CMS detectors ($|\eta(\gamma)| < 2.5$). Photon pairs were also required to be separated in η-φ space.
We note that the resonant ALP production cross section via VBF scales as $\sigma_{\rm VBF} \propto m_a^{2}/\Lambda^{2}$, and is thus suppressed for ALP masses that are small relative to the symmetry breaking scales considered in these studies. For this reason, non-resonant ALP production dominates the cross section in a large part of the $m_a$ phase space considered, a property observed and exploited by the authors of [55]. Similarly, the ALP decay width Γ is suppressed by $m_a$ over the new physics scale Λ, and thus ALPs with small $m_a$ can be long-lived and decay outside of the detector. To determine the range in $m_a$ at which the long lifetime becomes important, we compute the ALP decay length perpendicular to the proton-proton beam axis, which has the form
$$L_{a,\perp} = \frac{\sqrt{\gamma_a^{2} - 1}}{\Gamma}\,\sin\theta.$$
In this equation, θ is the scattering angle relative to the beam axis and $\gamma_a$ is the relativistic boost factor. This quantity is calculated per simulated signal event by utilizing the ALP pseudorapidity distribution and conservatively assuming the ALP moves at the speed of light. Since our focus is on the a → γγ decay channel, events with $L_{a,\perp}$ values corresponding to an ALP decay beyond the CMS electromagnetic calorimeter (ECAL) cannot be used, so we neglect regions of the ALP parameter space where this happens to a non-trivial extent (see Figs. 11 and 12). Figure 3 shows the fraction of events which decay inside the detector and leave a signature in the CMS ECAL, as a function of $m_a$ and Λ. For Λ = 1 TeV (4 TeV), a large fraction of the events are lost when $m_a < 5$ MeV ($m_a < 15$ MeV).
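As a rough illustration of the per-event decay-length estimate described above, the following Python sketch evaluates $\sqrt{\gamma_a^2-1}/\Gamma \cdot \sin\theta$ in metres. It is only a sketch under stated assumptions: the decay width is passed in as an input rather than computed from the model, $\sin\theta$ is obtained from the ALP pseudorapidity via $\sin\theta = 1/\cosh\eta$, and the function names, the event tuple layout, and the acceptance radius are hypothetical and not taken from the analysis code.

```python
import math

HBAR_C_GEV_M = 1.973269804e-16  # hbar*c in GeV*m, converts 1/GeV to metres


def transverse_decay_length(energy_gev, mass_gev, eta, width_gev):
    """Mean transverse decay length in metres: sqrt(gamma^2 - 1) / Gamma * sin(theta)."""
    gamma = energy_gev / mass_gev             # relativistic boost factor of the ALP
    beta_gamma = math.sqrt(gamma**2 - 1.0)
    sin_theta = 1.0 / math.cosh(eta)          # sin(theta) written in terms of pseudorapidity
    return HBAR_C_GEV_M * beta_gamma / width_gev * sin_theta


def fraction_inside(events, width_gev, radius_m):
    """Fraction of events whose decay length is below radius_m.

    `events` is a list of (energy_gev, mass_gev, eta) tuples; `radius_m` is a
    hypothetical stand-in for the transverse reach of the calorimeter.
    """
    inside = sum(
        transverse_decay_length(e, m, eta, width_gev) < radius_m
        for (e, m, eta) in events
    )
    return inside / len(events)
```

Because the width enters as $1/\Gamma$, a smaller width at fixed boost lengthens the decay path proportionally, which is why the lost fraction grows quickly at low $m_a$.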
The dominant source of SM background is the production of photon pairs with associated jets, referred to as γγ+jets. In the proposed search region (defined in Section III), the associated jets are mainly from initial state radiation (i.e. pp → γγjj at order $\alpha_{\rm QCD}^{2}$) or SM VBF processes (i.e. pp → γγjj at order $\alpha_{\rm QCD}^{0}$). Therefore, the background samples are split into two categories: (i) non-VBF γγ+jets events with up to four associated jets, inclusive in the electroweak coupling ($\alpha_{\rm EWK}$) and $\alpha_{\rm QCD}$; and (ii) pure electroweak γγjj. The production of γ+jets and multijet events with jets misidentified as photons has been checked to provide a negligible contribution to the proposed search region due to the effectiveness of the VBF selection criteria.
The MLM algorithm [56] was used for jet matching and jet merging. The xqcut and qcut variables of the MLM algorithm, related to the minimal distance between partons and the energy spread of the clustered jets, were set to 30 and 45 as a result of an optimization process requiring the continuity of the differential jet rate as a function of jet multiplicity.

III. EVENT SELECTION CRITERIA
We focus on a final state with exactly two well-identified photons and two jets consistent with the characteristics of the photon-photon fusion process. Stringent requirements are placed on the $p_T$ of photons, and on the kinematic properties of the VBF dijet system, in order to suppress SM backgrounds.
To study the important differences between signal and background processes, we select events with at least two γ candidates with $|\eta^{\gamma}| < 2.5$ and $p_T^{\gamma} > 10$ GeV, and present various kinematic distributions. The γ with the highest $p_T$ is referred to as the leading γ. Figure 4 shows the leading photon transverse momentum distribution, $p_T^{\gamma_1}$, for two signal benchmark samples and the main associated backgrounds, normalized to the area under the curve (unity). The signal stands out above the backgrounds for $p_T^{\gamma_1} > 200$ GeV, but the exact cut value of $p_T^{\gamma_1} > 300$ GeV is determined through an optimization process aimed at maximizing discovery potential. The optimization of all cut values was performed using the statistical figure of merit $N_S / \sqrt{N_S + N_B + (0.25 \times (N_B + N_S))^{2}}$, where $N_S$ and $N_B$ represent the expected number of signal and background events, and the term $0.25 \times (N_B + N_S)$ corresponds to the associated systematic uncertainty on the background plus signal prediction, which is a realistic uncertainty based on VBF searches at ATLAS and CMS [39,44,45]. We note that this particular definition of signal significance is only used for the purpose of optimizing the selections. The final discovery reach is determined with a shape-based analysis (described later) using the full diphoton mass or dijet mass spectrum.
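The figure of merit used for the cut optimization can be written as a short Python function. This is a minimal sketch rather than the analysis code: the function names and the cut-scan helper are hypothetical, and the yields in the usage note are made-up numbers chosen only to show the call.

```python
import math


def optimization_significance(n_sig, n_bkg, rel_syst=0.25):
    """N_S / sqrt(N_S + N_B + (rel_syst * (N_B + N_S))^2)."""
    total = n_sig + n_bkg
    if total <= 0:
        return 0.0                       # no events expected: nothing to optimize
    syst = rel_syst * total              # systematic uncertainty on signal plus background
    return n_sig / math.sqrt(total + syst**2)


def best_cut(scan):
    """Return the cut value with the largest figure of merit.

    `scan` is a list of (cut_value, n_sig, n_bkg) tuples.
    """
    return max(scan, key=lambda row: optimization_significance(row[1], row[2]))[0]
```

For example, `best_cut([(200, 50, 400), (300, 40, 60), (400, 5, 5)])` picks the threshold whose yields balance signal efficiency against the statistical and 25% systematic terms. With a large relative systematic, tighter cuts that reduce the total yield are favoured until the signal itself becomes too small.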
For low $m_a$ values, the relatively large photon $p_T$ is a key feature attributed to the kinematically boosted topology facilitated by the VBF process. This kinematic feature provides a useful handle to reconstruct and identify low-$m_a$ signal events amongst the large SM backgrounds. Figure 5 shows the reconstructed mass of the photon pair, $m_{\gamma\gamma}$, normalized to unity, for the SM backgrounds and two signal benchmark points. In the case of non-resonant low-mass ALP production, the diphoton mass values scale as $m_{\gamma\gamma} \approx p_T^{\gamma_1} + p_T^{\gamma_2}$. Therefore, the high-$p_T$ signal photons produce a broad $m_{\gamma\gamma}$ distribution that overtakes the SM backgrounds at several hundred GeV. Since $m_{\gamma\gamma}$ in signal and background events depends on the $p_T$ of photons and their angular correlations, we perform a two-dimensional optimization of the $m_{\gamma\gamma}$ and $p_T^{\gamma_1}$ cut values. Figure 8 shows the signal significance for $p_T^{\gamma_1}$ as a function of $m_{\gamma\gamma}$, for a benchmark point with $m_a = 1$ MeV and Λ = 1 TeV. We select events with $m_{\gamma\gamma} > 500$ GeV. These results were obtained after optimizing the VBF dijet selections (discussed below) in order to account for the correlation to the boosted kinematics.
VBF events are characterized by two forward jets with high $p_T$, residing in opposite hemispheres of the detector volume ($\eta_{j_1} \times \eta_{j_2} < 0$), with a large separation in pseudorapidity, $|\Delta\eta_{jj}|$, and a large reconstructed dijet mass ($m_{jj}$). For a particle collider such as the LHC, the energy of a jet is very high with respect to the mass of its associated parton, allowing us to approximate the dijet mass as $m_{jj} \approx \sqrt{2\, p_T^{j_1} p_T^{j_2} \cosh(\Delta\eta_{jj})}$. Since the reconstructed $p_T$ and η values of jets inside the ATLAS and CMS experiments are limited by the performance and geometry of their detectors, the VBF kinematic distributions are studied with a pre-selection of at least two jets with $|\eta| < 5.0$ and $p_T^{j} > 30$ GeV.
These jets are required to be well separated from photons, by imposing a $\Delta R_{\gamma j} = \sqrt{(\Delta\phi_{\gamma j})^{2} + (\Delta\eta_{\gamma j})^{2}} > 0.4$ requirement. Figure 6 shows the $\Delta\eta_{jj}$ distribution for signal and background, normalized to unity, while Fig. 7 shows the corresponding $m_{jj}$ distribution. For events where there are more than two well reconstructed and identified jet candidates, the dijet pair with the larger value of $m_{jj}$ is used in Fig. 7. The s-channel γγ fusion production of signal events results in events with larger $\Delta\eta_{jj}$ separation with respect to background events, and consequently a larger dijet mass.
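The jet pre-selection, the photon-jet separation, and the choice of the dijet pair with the largest mass can be sketched in Python as follows. This is an illustrative sketch under assumptions: jets and photons are represented as hypothetical `(pt, eta, phi)` tuples, the function names are invented for this example, and the dijet mass uses the massless-jet approximation $\sqrt{2 p_T^{j_1} p_T^{j_2} \cosh\Delta\eta_{jj}}$ quoted in the text rather than a full four-vector calculation.

```python
import math
from itertools import combinations


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def delta_r(eta1, phi1, eta2, phi2):
    """Separation in eta-phi space: sqrt(d_eta^2 + d_phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def mjj_approx(pt1, pt2, delta_eta):
    """Approximate dijet mass for massless jets."""
    return math.sqrt(2.0 * pt1 * pt2 * math.cosh(delta_eta))


def select_vbf_pair(jets, photons, pt_min=30.0, eta_max=5.0, dr_min=0.4):
    """Return the pair of pre-selected jets with the largest approximate m_jj, or None."""
    clean = [
        j for j in jets
        if j[0] > pt_min and abs(j[1]) < eta_max
        and all(delta_r(j[1], j[2], g[1], g[2]) > dr_min for g in photons)
    ]
    return max(
        combinations(clean, 2),
        key=lambda p: mjj_approx(p[0][0], p[1][0], p[0][1] - p[1][1]),
        default=None,
    )
```

The approximation drops the $\cos\Delta\phi_{jj}$ term of the exact massless expression, which is a small correction when $|\Delta\eta_{jj}|$ is large, as it is for VBF jets.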
Similar to the optimization of the $p_T^{\gamma_1}$ and $m_{\gamma\gamma}$ requirements, we account for the correlation between $|\Delta\eta_{jj}|$ and $m_{jj}$ by performing a two-dimensional optimization of the $|\Delta\eta_{jj}|$ and $m_{jj}$ cut values utilizing the same signal significance. To reduce non-VBF signal processes such as gluon-gluon initiated production or associated ALP production such as Za → Zγγ → jjγγ, we pre-select events with $|\Delta\eta_{jj}| > 3.6$ and $m_{jj} > 750$ GeV. These requirements result in > 95% purity of genuine VBF signal events. Fig. 9 shows the signal significance as a function of $|\Delta\eta_{jj}|$ and $m_{jj}$.
Finally, to completely eliminate other smaller SM backgrounds with top quarks and heavy vector bosons, we impose b-jet and lepton veto requirements. Events are rejected if a jet with $p_T > 30$ GeV and $|\eta| < 2.4$ is identified as a bottom quark (b). Events are also rejected if they contain isolated electrons or muons with $p_T > 10$ GeV and $|\eta| < 2.5$. These requirements are > 95% efficient for VBF ALP signal events. The final optimized event selection criteria are summarized in Table I. Figure 10 shows the expected background and signal yields in bins of $m_{jj}$. Various signal benchmark points are considered and the yields are normalized to cross section times integrated luminosity of 3000 fb$^{-1}$. The background distributions are stacked on top of each other, while the signal distributions are overlaid on the background.
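The selection described in this section can be collected into a single predicate. The sketch below uses only the thresholds quoted in the text; the event representation (a dictionary with the hypothetical keys shown) is an assumption made for illustration, and Table I remains the authoritative list of criteria.

```python
def passes_selection(event):
    """Apply the photon, VBF dijet, and veto requirements quoted in the text.

    `event` is a dict with hypothetical keys:
      photon_pts : list of photon pT values in GeV (photons already within |eta| < 2.5)
      m_gg       : diphoton mass in GeV
      eta_j1, eta_j2 : pseudorapidities of the two VBF jet candidates
      m_jj       : dijet mass in GeV
      n_bjets    : number of b-tagged jets with pT > 30 GeV and |eta| < 2.4
      n_leptons  : number of isolated electrons or muons with pT > 10 GeV and |eta| < 2.5
    """
    photon_pts = sorted(event["photon_pts"], reverse=True)
    return (
        len(photon_pts) == 2                                  # exactly two photons
        and photon_pts[0] > 300.0                             # leading photon pT
        and event["m_gg"] > 500.0                             # diphoton mass
        and event["eta_j1"] * event["eta_j2"] < 0             # opposite hemispheres
        and abs(event["eta_j1"] - event["eta_j2"]) > 3.6      # pseudorapidity gap
        and event["m_jj"] > 750.0                             # dijet mass
        and event["n_bjets"] == 0                             # b-jet veto
        and event["n_leptons"] == 0                           # lepton veto
    )
```

Keeping the cuts in one function makes it straightforward to vary a single threshold when scanning the figure of merit.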

IV. RESULTS
To assess the expected experimental sensitivity of this search at the LHC, we followed a profile binned likelihood test statistic approach, using the expected bin-by-bin yields in the reconstructed $m_{\gamma\gamma}$ and $m_{jj}$ distributions.
Under this approach, the signal significance is defined using the local p-value, understood as the probability that a statistical fluctuation of the background-only hypothesis yields a test statistic at least as large as the one obtained under the signal-plus-background hypothesis. The signal significance S then corresponds to the point at which the integral of a standard Gaussian distribution between S and ∞ equals the local p-value. The sensitivity was calculated considering the integrated luminosity already collected by the ATLAS and CMS experiments during the so-called Run II phase, 150 fb$^{-1}$, and the 3000 fb$^{-1}$ expected by the end of the LHC era. The estimation of this shape-based signal significance was performed using the RooFit [57] toolkit, developed at CERN.
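The conversion between a local p-value and a significance described above is the one-sided Gaussian tail relation, which can be sketched with the Python standard library. This only illustrates the conversion step, not the profile likelihood fit itself (which the text says was done with RooFit); the function names are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard Gaussian: mean 0, width 1


def significance_from_p_value(p_value):
    """Return S such that the standard Gaussian integral from S to infinity equals p_value."""
    # By symmetry the upper-tail point is minus the lower-tail quantile.
    return -_STD_NORMAL.inv_cdf(p_value)


def p_value_from_significance(significance):
    """One-sided upper-tail probability of a standard Gaussian beyond `significance`."""
    return _STD_NORMAL.cdf(-significance)
```

Under this convention a 5σ discovery corresponds to a local p-value of roughly $2.9 \times 10^{-7}$, as `p_value_from_significance(5.0)` shows.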
The calculation considers various sources of systematic uncertainties, based upon experimental and theoretical constraints. These uncertainties were incorporated in the test statistic as nuisance parameters. We considered experimental systematic uncertainties on γ identification and on reconstruction and identification of jets. For γ identification, a conservative 15% was assumed, following results reported in Refs. [58,59]. The uncertainties between the two photons, and between signal and background processes, were considered to be fully correlated. For experimental uncertainties related to the tagging of VBF jets, a 20% value was included (independent of $m_{jj}$ or $m_{\gamma\gamma}$), following the experimental results from Refs. [39,44]. In addition, theoretical uncertainties were included in order to account for the set of parton distribution functions (PDF) used to produce the simulated signal and background samples. The PDF uncertainty was calculated following the PDF4LHC prescription [60], and results in a 5-12% systematic uncertainty, depending on the process. The effect of the chosen PDF set on the shape of the $m_{jj}$ and $m_{\gamma\gamma}$ distributions is negligible. Figures 11 and 12 show the expected signal significance for different Λ and $m_a$ scenarios, specifically focusing on the lower $m_a$ range below 100 GeV. The dashed line delimits the discovery region.
For the 150 fb$^{-1}$ scenario, it is feasible to probe ALP masses 10 MeV ≲ $m_a$ ≲ 100 GeV for Λ ≲ 1.8-2.2 TeV, with the latter bound for Λ varying with $m_a$. The grey band on the plot shows the scenarios in which ALPs decay outside the CMS detector volume, so no detection is possible.
Similarly, for the 3000 fb$^{-1}$ scenario, the discovery reach includes 10 MeV ≲ $m_a$ ≲ 100 GeV for Λ ≲ 2.0-2.3 TeV, the Λ bound again depending upon $m_a$. For this scenario, the expected signal significance was calculated by interpolating discrete data points as a function of $m_a$ and Λ, and the grey dashed line in the corresponding figure encloses the region with 5σ discovery potential. The expected discovery reach using the VBF topology includes sensitivity to a regime of the ALP parameter space that is not covered by any other experiment. This feature is further explained in the following section.

V. DISCUSSION
We have presented a feasibility study for the detection of axion-like particles with strong coupling to photons, a → γγ, produced through VBF processes at the CERN LHC. The expected experimental sensitivity of the search was presented for two different luminosity scenarios: 150 fb$^{-1}$, the current integrated luminosity collected by the ATLAS and CMS experiments, and the 3000 fb$^{-1}$ expected by the end of the LHC era. The signal model was developed under an effective field theory approach, considering the symmetry breaking scale, Λ, and the ALP mass as free parameters. The expected signal significance for the 150 fb$^{-1}$ scenario allows the ATLAS and CMS experiments to probe ALP masses from 10.0 MeV to 100.0 GeV, for values of Λ up to 1.8-2.2 TeV, depending on $m_a$. For the 3000 fb$^{-1}$ scenario, the discovery reach goes from 10.0 MeV to 100.0 GeV, for values of Λ up to 2.0-2.2 TeV, depending on $m_a$. For Λ values below a few TeV, the sensitivity to $m_a$ extends to TeV scale values (see Fig. 13, discussed below).
Figure 13 shows the comparison of our 5σ discovery reach at 3000 fb$^{-1}$ to existing constraints on ALP parameter space (grey). The constraints shown in Fig. 13 are taken from Fig. 4 of [31] and correspond to LEP (light blue and blue), CDF (purple), and the LHC (associated production and Z decays (orange), photon fusion (light orange), and heavy-ion collisions (green)). The results from previous collider searches show a gap in sensitivity in the ALP mass range 10 MeV ≲ $m_a$ ≲ 100 GeV, which is primarily due to: (i) low resonant ALP production cross sections at TeV scale values of Λ; and (ii) the low-$p_T$ photon kinematics arising from low-mass ALP decays in the traditional searches without a boosted topology, which suffer from large SM backgrounds. It is clear that the proposed methodology using a boosted VBF topology can probe regions of parameter space that are currently unconstrained by other searches (red).

VI. ACKNOWLEDGEMENTS
We acknowledge the constant and enduring financial support received for this project from the Faculty of Science at Universidad de los Andes (Bogotá, Colombia), the Administrative Department of Science, Technology and Innovation of Colombia (COLCIENCIAS), the Physics & Astronomy department at Vanderbilt University, and the US National Science Foundation. This work is supported in part by NSF Award PHY-1806612. KS is supported by DOE Grant DE-SC0009956.
FIG. 13. The 5σ discovery reach at 3000 fb$^{-1}$ obtained using our search methodology is depicted on ALP parameter space, with an emphasis placed on the subset that has previously not been experimentally probed. The other constraints shown are taken from Figure 4 of [31].