New Collider Searches for Axion-like Particles Coupling to Gluons

Axion-like particles (ALPs) are pseudo Nambu-Goldstone bosons associated with spontaneously broken global symmetries in many extensions of the Standard Model. Assuming the most general effective Lagrangian up to dimension-5 operators for an ALP interacting with the SM fields, we investigate for the first time the sensitivity of LHC13 to the ALP production in association with two jets. This study is focused on light ALPs which appear as invisible particles at the detector. Performing a realistic detector simulation and deploying a multivariate technique to best discriminate the signal from backgrounds, we set expected upper bounds on the ALP coupling to gluons. A comprehensive set of backgrounds is considered, and it is shown that, this process provides significant sensitivity to the ALP-gluon coupling and the resulting bound is complementary to those already obtained. We further extend our analysis to obtain limits on the ALP-gluon coupling utilizing the associated production of an ALP and a single jet. We show that, using a multivariate analysis with a set of best-discriminating variables, this process yields an upper limit significantly stronger than limits obtained in earlier studies. We also present prospects for HE-LHC27 and FCC-hh100 for both of the mono-jet and di-jet signals and show that these future colliders are able to improve the limits from the LHC by roughly one order of magnitude.


Introduction
The Standard Model (SM) of particle physics provides remarkable predictions which have been successfully verified in plenty of experimental observations. However, it suffers from some theoretical and experimental shortcomings. An extensive effort has been therefore made towards the development of theories beyond the Standard Model (BSM) to address some of the open questions. However, no success in observation of these BSMs has been achieved yet.
A long time ago, Peccei and Quinn [1] showed in their proposed Axion model that incorporating the anomalous global U (1) PQ symmetry in the SM Lagrangian can lead to a chiral solution to the strong CP problem [2,3]. Over time, different aspects of their idea were studied and different variants of their original model were developed [4,5,6]. In the Axion model, spontaneous breaking of the anomalous PQ symmetry yields a new particle called QCD Axion with a mass stringently related to the scale of global symmetry breaking f a . Setting the strict relation between the Axion mass and f a aside, the idea of Axion-like particles (ALPs) [8,7] with wide-ranging masses and couplings arose. Unlike the QCD Axion, the mass and couplings of the ALP are considered as independent parameters. In general, these pseudo Nambu-Goldstone bosons, ALPs, exist in any model with a spontaneously broken global symmetry. They are gauge-singlet, CP-odd and naturally light because of a (softly broken) shift symmetry.
In recent decades, there has been increasing interest in ALPs due to their various beneficial properties. ALPs display various applications depending on the region in parameter space spanned by their masses and couplings. For instance, they might make good candidates for non-thermal Dark Matter (DM) [9]. They can play an essential role in baryogenesis providing a possible explanation for the observed imbalance of matter and antimatter [10,11]. ALPs might help solve the neutrino mass problem by an ALP-neutrino connection (coupling) through which neutrinos acquire mass [12,13,14]. They can be used to explain the anomalous magnetic dipole moment of the muon [15]. The excess events observed in the rare K mesons searches recently reported by the KOTO experiment can be explained by ALPs [16]. Furthermore, ALPs can explain the anomalous decays of the excited Beryllium 8 Be * producing an excess of electron-positron pairs which has been recently reported [17,18].
Numerous applications ALPs found to concrete problems provide the motivation to devote considerable attention to these new degrees of freedom. A significant region of the parameter space in terms of the ALP mass and its couplings has been already explored or will be accessible by low-energy experiments, cosmological observations, collider searches, etc. [19,20,21,22,23,24]. ALPs favor derivative couplings to SM particles due to an (approximate) shift symmetry, and the strength of their couplings is proportional to the inverse of global symmetry breaking scale. ALPs with masses below the electron pair production threshold can only decay into a pair of photons, with a decay rate varying as the third power of the ALP mass. Heavier ALPs can also experience the decays into charged leptons and hadrons depending on their masses and couplings. For light ALPs, the decay rate is usually so small that the ALP can travel a long distance before it decays. This motivates searches for solar ALPs (produced by the Primakoff process) using helioscopes such as SUMICO [25] and CAST [26]. The future helioscope experiment IAXO which is a proposed 4th generation axion helioscope is sensitive to ALPs with masses below the GeV scale [27]. In collider searches, long-lived ALPs appear as invisible particles and manifest themselves as missing energy since they decay outside the detector and don't interact with the detecting devices. Searching for missing energy signals, collider experiments probe long-lived ALPs. It has been shown that mono-γ and mono-jet plus missing energy are promising signals to probe ALP masses above the KeV scale [7]. Another study has reported mono-W, mono-Z and W+γ plus missing energy as promising signals to probe the mass of the ALP and its couplings to electroweak gauge fields [8]. Besides the missing energy signals, signals from ALPs decaying inside the detector have also been explored extensively. Searches for the exotic Higgs decays h → aZ and h → aa with subsequent ALP decays a → γγ and a → + − at the LHC have been performed providing significant sensitivities in large regions of unprobed parameter space [28,29,30,31]. These channels display reasonable sensitivity to the ALP couplings to photons and leptons for ALP masses above a few MeV. The LEP data on the decay Z → γγ provides unprecedented access to the ALP mass range MeV to 90 GeV [32]. It is also shown that the tri-γ channel (with two photons originating from a prompt ALP decay) provides the possibility to constrain the mass of the ALP and its coupling to photons [32,7,29]. The mono-Higgs signal pp → ah is investigated in Ref. [8]. The same reference also provides the prospects for bounds on the ALP coupling to fermions using the associated production of an ALP with a pair of top quarks. The ultra-peripheral heavy-ion collisions at the LHC can also be used to probe ALPs albeit these collisions are not optimized for BSM searches. The ALP is produced exclusively from the photon-photon luminosity induced by the lead ions, and decays into a pair of back-to-back photons. Utilizing this process, the ALP mass range 100 MeV to 100 GeV can be probed [33]. Besides the collider experiments, the search for ALPs has also been extensively performed in proton beam-dump experiments. Long-lived ALPs with masses in the MeV-GeV range have been probed in beam-dump experiments providing new constraints on ALPs [34]. Employing a data-driven method, the ALP-gluon coupling is carefully studied for QCD-scale ALP masses in Ref. [35].
In this work, we present for the first time collider searches for the production of an ALP in association with two jets (pp → a + jj). This process provides spectacular signatures, and therefore warrants thorough consideration. We simulate the detector response for the signal and background events based on the CMS detector at the LHC. A complete set of background processes is considered in our analysis and it is seen that the backgrounds are well under control. The multijet production which is the dominant background process displays properties that help to achieve an efficient signal-background discrimination using a multivariate analysis. The discriminating variables are carefully chosen so that the best discrimination is obtained. Utilizing this signal process, we set expected upper limits on the ALP coupling to gluons c GG at the LHC (13 TeV) for 1 MeV ALPs. It is shown that the obtained limit is complementary to the limits already obtained at the LHC. We also present prospects for HE-LHC (27 TeV) and FCC-hh (100 TeV). Besides, we derive the c GG upper limits corresponding to the previously studied mono-jet signal (pp → a + j) analyzing all relevant backgrounds and utilizing a multivariate technique to discriminate the signal against backgrounds. Expected bounds at present and future colliders are computed and it is shown that the new approach employed in this analysis significantly improves the limit on the c GG coupling obtained from the mono-jet analysis when compared with the results of earlier studies. This paper is structured as follows: In section 2, we review the effective Lagrangian for an ALP interacting with SM gauge bosons and matter fields. In section 3, we summarize the most prominent decays an ALP can experience, and then we discuss the stability of ALPs in the detector. In section 4, we present collider searches for the ALP production in association with one or two jets and set upper bounds on the ALP coupling to gluons.

Effective Lagrangian for ALPs
It has long been advocated that incorporating an anomalous global U (1) symmetry (which is spontaneously broken) in the SM full Lagrangian provides the possibility of resolving the socalled strong CP problem by dynamically driving the parameterθ (which controls the size of CP violation in the QCD sector) to zero [1]. The pseudo Nambu-Goldstone boson associated with the spontaneously broken symmetry is called QCD axion. There is a strict relation between the mass and couplings of the axion which limits the parameter space of the axion model. Axion-like particles or ALPs are similar to QCD axions in many respects. They, however, benefit from wide-ranging masses and couplings as they avoid the stringent relation connecting the mass of the particle to its couplings. The ALP is a scalar, singlet under the SM charges and odd under the CP transformation. Neglecting ALP couplings to fermions (of any type) for the sake of simplicity, the most general bosonic effective Lagrangian up to dimension-5 operators which describes an ALP interacting with SM fields is given by [8] where a and Φ denote the ALP field and the Higgs boson doublet, respectively and f a is the scale associated with the breakdown of the global U (1) symmetry. In this equation, G A µν , W A µν and B µν are the gauge field strength tensors corresponding to SU (3) c , SU (2) L and U (1) Y , respectively, the dual field strength tensorsX µν are defined asX µν ≡ 1 2 µνρσ X ρσ , where µνρσ is the Levi-Civita Apparently, this Lagrangian is CP-conserving and invariant under the SM gauge transformations.
Under a U (1) transformation, the ALP field undergoes the translation a(x) → a(x) + α, with α constant, implying that if the effective Lagrangian is to be U (1) invariant, it must also respect the symmetry under the ALP field translation (shift symmetry). The ALP field must thus only show up with a derivative in the Lagrangian. This, however, is not the case because of the chiral anomaly in this model. In the L eff of Eq. 1, the last operator, which couples the ALP field directly to the gluon density GG, induces a modification to the action when the ALP field translates. This operator is a total divergence a fa G a µνG aµν ≡ a fa ∂ µ K µ , and thus induces the total divergence contribution α fa ∂ µ K µ to δL eff under the a field translation. However, the integral of this contribution doesn't vanish because of the QCD vacuum structure and instanton effects. The action is thus altered albeit a discrete version of the shift symmetry is preserved. In addition to the chiral anomaly, the shift symmetry is also affected by the explicit symmetry breaking ALP mass term present in the Lagrangian. The shift symmetry is softly broken and the ALP acquires more mass than is obtained from non-perturbative dynamics as a result of the ALP mass term. In the case of the original QCD axion [1], where no mass term is present in the Lagrangian, the axion receives its mass solely via spontaneous breaking of the global U (1) PQ symmetry. Therefore, the axion mass is stringently related to the characteristic scale of global symmetry breaking f a via where V eff represents the effective potential for the axion field provided by the aGG operator in the Lagrangian. The ALP mass, however, doesn't suffer from such a strict relation by virtue of the presence of the soft symmetry breaking mass term m 2 a,0 in L eff , which adds a mass to the dynamically generated mass: m 2 a = m 2 a,0 + m 2 a, dyn , and thus m a and f a are considered as independent parameters. This freedom opens up a wide range of possibilities and thus a rich phenomenology of ALPs.
After electroweak symmetry breaking (EWSB), a product term of the form Z µ ∂ µ a which stems from the c aΦ i∂ µ a fa (Φ † ← → D µ Φ) operator appears in L eff . Accordingly, it is understood that besides the SM would-be Nambu-Goldstone boson, the a field contributes to the longitudinal component of the Z boson too. As, in this situation, the direct interpretation of the Lagrangian leads to difficulties, a field redefinition of the general form where α Φ is a real constant, α ψL,R are 3 × 3 hermitian matrices in flavor space, ψ L = {Q L , L L } and ψ R = {u R , d R , e R } is applied so as to illustrate the physical consequences of the induced a-Z vertex. The parameters α Φ and α ψL,R can be conveniently chosen so that the a-Z vertex is eliminated in favour of fermionic couplings. The a-Z vertex is redefined away and replaced with fermionic couplings by the choice α Φ = c aΦ , and tuning the parameters α ψL,R provides the possibility to control the structure of the induced fermionic couplings (which can be Yukawa-like, vector-axial or a combination of them) [8]. We choose to use the field redefinition of Eq. 3 with α Φ = c aΦ and α ψL,R = 0, which, when acts on L SM , yields where Y U , Y D and Y E represent 3 × 3 matrices in flavor space containing Yukawa couplings for up-type quarks, down-type quarks and charged leptons, respectively, and whereΦ = iσ 2 Φ * . In Eq. 4, the last operator stems from the Higgs kinetic energy term in L SM while the Yukawa-like ALP-fermion operators in the first line stem from the SM Yukawa terms. The ALP-fermion operators in this equation can be conveniently written as where σ 3 acts on weak isospin space, , e R } and the block matrices Φ and Y ψ are given by Applying the field redefinition, the c aΦ which induces the a-Z vertex on EWSB, is entirely canceled out by the same operator with opposite sign in Eq. 4. Moreover, the fermionic operators of Eq. 5 are added to L eff . The effective Lagrangian up to dimension-5 operators is thus recast as [8] After EWSB, the operators aBB and aWW in Eq. 7 induce the ALP couplings to the photon and the Z boson as where F µν is the electromagnetic field strength tensor, Z µν is the corresponding field strength tensor for the Z boson and the Wilson coefficients c γγ , c γZ and c ZZ are given by with s θ (c θ ) the sine (cosine) of the weak mixing angle.

ALP decay and stability at colliders
The leading interactions of the ALP described by the effective Lagrangian, Eq. 7, govern ALP decays into pairs of SM particles. In this section, we summarize the most prominent decay modes of the ALP. To keep the discussion pertinent to the scope of the paper, only decays relevant to MeV-scale ALPs are considered. In this mass region, the decays into photons, charged leptons and light hadrons are of great importance. The di-photon decay mode plays a crucial role in many scenarios in the case of light ALPs. Particularly in the case of ALP masses below the on-shell electron pair production threshold m a < 2m e ≈ 1.022 MeV, as in the present work, where it is the only possible decay mode. At leading order, the decay rate of an ALP into a photon pair in terms of the relevant Wilson coefficient is given by Beyond the tree-level, other Wilson coefficients in the effective Lagrangian are also involved in the decay rate. For the decay rate of Eq. 10 to include higher-order corrections, it proves convenient to substitute c γγ in this equation with the effective Wilson coefficient c eff γγ in which the loop-induced effects are encoded [29]. At the one-loop level, fermion loops and electroweak loops involving all the Wilson coefficients in the Lagrangian 7 except for c GG contribute to the corrections. Corrections due to the ALP-gluon coupling first appear at two-loop level and can be significant because of a logarithmic dependence on the global symmetry breaking scale. Even if c γγ vanishes, contributions from other Wilson coefficients to c eff γγ can result in a non-zero decay rate. The effective coefficient c eff γγ is also dependent on the ALP mass. This dependence is, however, so mild that the decay rate varies as m 3 a to an excellent approximation (see Eq. 10). As m a decreases, the decay rate decreases substantially leading to so large lifetimes for very light ALPs with m a < 2m e .
The leptonic decay a → + − (with = e, µ, τ ) becomes available for ALP masses larger than 2m . Depending on the Wilson coefficients, leptonic decays can be significant or not. To gain some insight, it's worth mentioning that in the case where the Wilson coefficients share the same magnitudes and the ALP mass does not exceed the pion mass m π (assuming the ALP couples to quarks and gluons as well), the leptonic modes are dominant in the vicinity of pair production thresholds 2m [29]. In other regions of parameter space, however, the ALP decays predominantly into photons.
The hadronic decays of the ALP start to appear when the ALP mass exceeds m π . Such processes begin with the ALP decay into colored particles at the partonic level, i.e. a → gg and a → qq. In the mass range m a < 1 GeV, the triple pion decay modes a → 3π 0 and a → π 0 π − π + are the dominant hadronic modes. Other decay modes possible in this range are substantially suppressed. For instance, the modes a → ππγ, a → π 0 γγ and a → π 0 e − e + , although allowed, are substantially suppressed by phase space and powers of the fine-structure constant [29].
After the brief preliminary discussion on the ALP decay modes, we now turn to the important subject of stability of ALPs at colliders. If the total decay width Γ a is sufficiently large, the ALP undergoes a prompt or displaced decay inside the detector after production. It can be then reconstructed by its decay products. On the other hand, in the case where the total decay width is small enough, the ALP escapes the detector before its decay and manifests itself as missing transverse energy / E T . Apart from the total decay width, the Lorentz boost factor γ plays an important role in detectability of ALPs. It, however, depends on the employed experimental setup. The average decay length of the ALP perpendicular to the beam axis L ⊥ a and the probability that the ALP decays inside the detector P det a are given by where θ is the angle of the ALP's flight direction with respect to the beam axis, β is the ALP speed, τ is the proper lifetime and L det is the transverse distance of the calorimeter from the collision point. To get a sense of the size of ALP decay length, we find an approximate lower limit of it and compare the result with the size of LHC detectors (∼ 10 m). For an ALP of mass 1 MeV, which will be considered later in this work, the di-photon decay is the most significant decay mode affecting our evaluation. The ALP decay length in the laboratory frame can be therefore estimated as where p a is the three-momentum of the ALP. In the last step, we have used Eq. 10. Setting m a to 1 MeV and using the current experimental upper limit on c γγ /f a [28,29], Eq. 12 implies The ALP momentum | p a |, which is closely related to the missing transverse energy of the event, varies with the experiment conditions. It depends on the center-of-mass energy of the collider, the imposed minimum / E T cut, technical specifications of the calorimetry system, etc. Assuming that the order of magnitude of | p a | is roughly > O(100) GeV, one finds L a 4 × 10 10 m according to Eq. 13. This lower limit implies that, on average, an ALP of mass 1 MeV travels a distance substantially larger than the LHC detectors radius before it decays. Such ALPs are, therefore, very likely to appear as invisible particles and the missing transverse energy reveals their existence. Apart from the di-photon decay mode, the modes a → νννν and a → γνν are also available for the ALP mass considered here. However, they do not alter our conclusion. The a → νννν decay mode cannot be detected at the LHC and thus an ALP decaying through this channel appears invisible even if it decays inside the detector. The a → γνν mode, which could be mediated by the a-Z-γ interaction (see Eq. 8), would not affect the lower limit 13 significantly if it was also taken into account in Eq. 12 in addition to the di-photon decay mode. The reason is that the upper limit on Γ(a → γνν) (which is inferred from the experimental constraint on the Wilson coefficient c γZ [8]) is many orders of magnitude smaller than that of Γ(a → γγ) and thus the total decay width can be safely restricted to the two-photon decay width.
It can be concluded that a large fraction of ALPs under consideration in this work, with m a = 1 MeV, are stable at the detector and manifest themselves as missing transverse energy. This leads to a spectacular ALP signature which will benefit the analyses in the rest of the paper. The ALPs which decay inside the detector produce a small effect which will be also considered in our analyses.

Associated production of an ALP with one or two jets
The ALP can be produced through various processes at a proton-proton collider. In this work, we consider the ALP production in association with one or two jets separately. Studying these processes, we set upper bounds on the ALP-gluon coupling c GG for 1 MeV ALPs described by the effective Lagrangian 7. We implement this Lagrangian into FeynRules [36] and pass the generated Universal FeynRules Output (UFO) [37] 4 model, which is based on the study in Ref. [8], to MadGraph5_aMC@NLO [38] for event generation and cross section computation. In what follows in this section, we provide a detailed analysis of the two production processes as well as the constraints obtained in each case. In addition to the constraints obtained for the LHC, we also present prospects for future colliders to gain an impression of the capability of future colliders to probe the parameter space of the present model.

Signal and background processes
In this work, the two signal processes pp → a + j and pp → a + jj are assumed and will be analyzed separately. These processes display sensitivity to the ALP-gluon and ALP-fermion couplings, i.e. c GG and c aΦ . Using MadGraph5_aMC@NLO to compute the cross section and considering the effect of one operator at a time, the leading order cross sections of the assumed mono-jet and di-jet processes at the center-of-mass energy of 13 TeV (as an example) can be expressed as where the unit of f a is TeV. These results have been obtained using NNPDF23 [39] as the parton distribution function (PDF) of the proton. The obtained cross sections show that the processes are much more sensitive to c GG than c aΦ . One of the reasons behind this is that the ALP coupling to gluons is involved in more sub-processes than the coupling to fermions. This is more obvious in the case of mono-jet signal where the c GG coupling is involved in sub-processes with any initial state, i.e. gg, gq(q) and qq while c aΦ is only involved in sub-processes with gq(q) and qq initial states. Another reason refers to the fact that these processes are sensitive to the gluon PDF, particularly at small x. The gluon PDF rises steeply as x decreases at small x. This behavior leads to a substantial enhancement of the cross section of sub-processes with one or two gluons in the initial state in which the c aΦ coupling is less involved. Because of negligibility of the cross section dependence on c aΦ , we only concentrate on the c GG coupling in this work and examine the potential of the assumed processes to constrain it. Fig. 1 shows the representative leading order Feynman diagrams contributing to these processes which only involve the ALP coupling to gluons.
Besides the LHC, we also examine the capability of the High-Energy Large Hadron Collider (HE-LHC) [40] and Future Circular Collider (FCC) [41] to probe the present model. These colliders are proposed as future steps to be taken after the LHC. Benchmark specifications of these future collider proposals are provided in Tab. 1. It will be shown that the improvements in the center-of-mass energy and the integrated luminosity L of these colliders provide the opportunity to probe larger regions of the parameter space.
Assuming the ALP-gluon coupling to be the only non-vanishing ALP coupling and computing the leading order cross section of the signal processes at the center-of-mass energies of 13, 27 and These results correspond to the ALP mass of 1 MeV and have been obtained using the same packages and assumptions as used for obtaining the cross sections of Eq. 14. The obtained cross sections are depicted in Fig. 2. In this figure, the dashed (solid) lines show the cross sections of the a + j (a + jj) process as functions of c GG /f a at the considered center-of-mass energies. As is obvious from the cross sections, at the same center-of-mass energies and c GG /f a values, the a + jj signal has a smaller production rate than a + j. It will be seen in the following that this signal provides spectacular signatures which compensate for its weaker production when the processes are analyzed to obtain the limits on the ALP-gluon coupling. Performing a scan over the ALP mass, it is observed that the cross section doesn't display a significant mass dependence for ALP masses below few GeVs. This is not surprising though because of the smallness of the ALP mass in this range when compared with the process energy scale. With further increasing the ALP mass, the cross section will eventually decrease and become small for heavy ALPs. This may produce negative impacts on observability of scenarios with large masses (which are out of the scope of this paper). However, since the lifetime decreases with mass, a large enough ALP mass can partially compensate for the reduction in the interaction rate by providing signatures due to the ALP decay inside the detector.
The dominant SM backgrounds relevant to the two assumed signal processes are as follows: • Multijet production.
• Production of two massive gauge bosons V V (V = W ± , Z), where the W and Z bosons can decay via both hadronic and leptonic decay modes.
• W +jets, Z+jets and γ+jets in all possible final states.
• Top quark pair production (tt) and single top quark production, where all canonical channels (s-, t-and tW -channel) are considered in the single top production and all final states are assumed.
These background processes can produce missing transverse energy and final state jets (which resemble the signal signatures) in various ways. The sources of the missing energy include the neutrino(s) coming from the leptonic decays of the W ± and Z bosons, W ± → ± ν ( = e, µ, τ ) and Z → νν, the leptonic decays of hadrons inside the jets, uncertainties in the jet energy measurement, electronic noise of the detector, etc.

Event generation
Generation of signal and background events is performed as follows. Hard (parton-level) events are generated using MadGraph5_aMC@NLO and then are passed to Pythia 8.2.43 [42] to perform parton showering, hadronization and sequential decays of unstable particles. The output is internally passed to Delphes 3.4.2 [43] to simulate the detector effect assuming an upgraded CMS detector [45,44]. Jet reconstruction is performed by FastJet [46] using the anti-k t algorithm [47] with a jet cone size of 0.4. All objects reconstructed in events are required to satisfy the transverse momentum and pseudorapidity thresholds p T > 30 GeV and |η| < 2.5. In generation of signal events, the ALP mass m a is assumed to be 1 MeV. Moreover, it is assumed that c GG is the only non-vanishing coupling of the ALP.
According to the discussion of section 3, a very small fraction of ALPs produced in protonproton collisions decay inside the detector and thus don't contribute to the missing transverse energy (except for those rare ALPs with invisible particles in their decay final states, e.g. the ALPs experiencing the decay mode a → νννν). Using the probability that an ALP decays inside the detector P det a which is defined in Eq. 11, we take this effect into account event-by-event.

Validity of the effective Lagrangian
Validity of the effective Lagrangian 7 can be ensured by requiring the suppression scale, f a in the present model, to be much larger than the typical energy scale √ŝ of the processes under consideration. The condition √ŝ < f a should be therefore imposed on the events. This is, however, impossible since the processes under study have an undetectable ALP in their final states and thus √ŝ cannot be experimentally measured. The above requirement can only be imposed naively. To do so, one can use the missing transverse energy of the event (which is a measurable quantity) to make a comparison. In analyses in this work, we impose the condition 2 / E T < f a (= 1 TeV) on all generated events to ensure the validity of the effective description.

Signal-background discrimination
As for preselection, the number of jets in events is required to be 1 or 2 (2 or 3) for the mono-jet (di-jet) analysis. Moreover, events with isolated photons or leptons (electrons and muons) are vetoed. Isolated objects are selected with the help of the isolation variable I rel . The isolation variable is defined as the ratio of the sum of transverse momenta of all particles in a cone of radius 0.5 around the photon or lepton candidate (excluding the candidate) to the transverse momentum of the candidate. I rel is required to be < 0.12 for photons and electrons, and < 0.25 for muons. The minimum allowed missing transverse energy for events is set to 30 GeV. Imposing all the conditions, event preselection efficiencies presented in Tab. 2 are obtained for signal and background processes. We observe a sizable suppression of the overwhelming multijet and γ+jets backgrounds as a result of the preselection cuts. It is also observed that the efficiencies grow with the center-of-mass energy which is expected as higher center-of-mass energies produce harder jets and larger missing transverse energies.
To better discriminate between the signal and background events, we employ a multivariate technique [48] in this analysis. The preselected events are analyzed exploiting the Boosted Decision Trees (BDT) algorithm. The BDT is fed with a proper set of discriminating variables needed to perform the training process. The variables used in the mono-jet and di-jet analyses are: • Missing transverse energy / E T .
• Scalar transverse energy sum H T which is defined by where the index i runs over all particles reconstructed in the detector.
• Azimuthal separation between the missing transverse energy and the hardest (with the highest transverse momentum) jet ∆φ( / E T , j).
• Transverse momentum and pseudorapidity of the hardest jet p j T , η j .
only di-jet analysis • Azimuthal separation between the two hardest jets ∆φ(j, j ).  Table 2: Event preselection total efficiencies ( tot ) for signal and background processes at the center-of-mass energies of 13, 27 and 100 TeV corresponding to a) a + j and b) a + jj processes.
• Transverse momentum and pseudorapidity of the second hardest jet p j T , η j .
All of the variables used for mono-jet and di-jet analyses are the same except for ∆φ(j, j ), p j T and η j which are only used in the case of di-jet channel. Distributions of some of the discriminating variables obtained from the preselected events at the center-of-mass energy of 13 TeV (as an example) are presented in Figs. 3 and 4.
The missing transverse energy plays a significant role in separation of signal and background events for both signals. As shown in section 3, a large fraction of produced ALPs in pp collisions cover a distance many orders of magnitude larger than the detectors size before they decay. As a result, a missing transverse energy due to the undetectable ALPs is expected in the final state. The signal missing transverse energy is therefore expected to be significantly larger than that of the background processes since the source of missing transverse energy in background processes is limited to final state neutrinos.
The nature of the a + jj process and its relevant backgrounds provide a striking signature which substantially improves the signal-background discrimination. The jets in the di-jet events, which constitute the majority of the events of multijet background (the dominant background process), are likely to be back-to-back distributed because of the momentum conservation. Their azimuthal separation is therefore expected to be close to π radians. This is, however, not the case in signal events as the signal has a three-body final state with jets distributed randomly in the azimuth plane. Consequently, the azimuthal separation between the two hardest jets ∆φ(j, j ) has a high discriminating power and can compensate for the signal's smaller cross section (compared with the a + j signal process).
The BDT training is performed with the use of the aforementioned set of variables. All the background processes take part in the training process according to their corresponding weights. Applying the trained model to events, the BDT classifier produces a response for each event. As an example, the distributions of the BDT response for signal and background events assuming the center-of-mass energy of 13 TeV are provided in Fig. 5. As seen, the BDT performs well at separating the signal from the background in both cases. It is also seen that the separation is somewhat better in the di-jet case. This was expected because of the more discriminating power the training variables provide in this case. Particularly, the ∆φ(j, j ) variable plays a significant  role as discussed before.

Constraints on ALP coupling to gluons
Using the BDT response distributions of the signal and background processes, expected upper limits on the ALP coupling to gluons from mono-jet and di-jet signals at the center-of-mass energies of 13, 27 and 100 TeV are computed. The 95% confidence level (CL) upper limits on c GG /f a are plotted against the integrated luminosity in Fig. 6. In this figure, the dashed (solid) lines correspond to the mono-jet (di-jet) signal and different colors correspond to different center-of-mass   energies. The presented limits have been obtained assuming m a = 1 MeV. As seen in Fig. 6, the obtained bounds from the mono-jet and di-jet events are approximately of the same order of magnitude. The bounds derived from the di-jet channel are, however, slightly stronger compared with the bounds from the mono-jet signal even though the production rate of the di-jet channel is smaller (see Eq. 15). This can be attributed to the better performance of the BDT classifier in the di-jet case as mentioned before.
To better compare the results, the limits obtained for the LHC13, HE-LHC27 and FCC-hh100 at the integrated luminosities of 3, 15 and 20 ab −1 are presented in Tab. 3. As can be seen, the strongest obtained constraints on c GG /f a at the LHC13, HE-LHC27 and FCC-hh100 are 0.0103,   Figure 6: Expected upper limits on c GG /f a at 95% CL from mono-jet (dashed lines) and di-jet (solid lines) events as a function of the integrated luminosity assuming the LHC 13 TeV (red), HE-LHC 27 TeV (green) and FCC-hh 100 TeV (blue). 0.0056 and 0.0035 TeV −1 , respectively. Mono-jet analyses using 19.6 fb −1 of data at the LHC (8 TeV) experiments result in the 95% CL upper limit 0.025 TeV −1 on c GG /f a [7]. A comparison shows that the limits obtained in this work are complementary to that of the LHC experiments and the examined signal processes can be used to improve the constraint on the ALP coupling to gluons.

Summary and conclusions
Axion-like particles (ALPs) are CP-odd scalars resulting from spontaneously broken global U (1) symmetries incorporated in the Standard Model. Motivated by the possibilities such particles can provide to address some of the SM problems, e.g. the dark matter problem, strong CP problem, baryon asymmetry problem, neutrino mass problem, etc., we examine the potential of the production processes pp → a + j and pp → a + jj to search for light ALPs and probe the parameter space of the model at the present and future proton-proton colliders. Light ALPs are often so feebly coupled to the SM particles that their long lifetime doesn't let them decay inside the detector. They, therefore, manifest themselves as missing energy in the final state providing an interesting signature. Assuming the ALP mass to be 1 MeV, we present expected 95% CL upper limits on the ALP coupling to gluons at the LHC (13 TeV), HE-LHC (27 TeV) and FCC-hh (100 TeV). It is seen that the limits derived from the mono-jet and di-jet channels are approximately of the same order of magnitude. However, the limits from the di-jet channel are slightly stronger because of the better signal-background discrimination achieved in this case. The upper limits on c GG /f a from the mono-jet (di-jet) signal at the LHC13, HE-LHC27 and FCC-hh100 are found to be 0.0112 (0.0103), 0.0065 (0.0056) and 0.0048 (0.0035) TeV −1 , respectively. These limits correspond to the maximum integrated luminosities the colliders will eventually operate at according to their benchmark specifications. The limits obtained for the LHC13 in this study are comparable to the limit |c GG /f a | < 0.025 TeV −1 which has been already derived from mono-jet events at the LHC8 experiments using 19.6 fb −1 of data. Therefore, the present analysis could be complementary to the mono-jet analyses at the LHC. Based on our prospects for bounds at future colliders, HE-LHC27 and FCC-hh100 are expected to provide reasonable sensitivities to the ALP coupling to gluons down to |c GG /f a | < 0.0056 and 0.0035 TeV −1 , respectively. They are, therefore, capable of improving the present bound by roughly one order of magnitude. It is concluded that the present analysis can serve experimentalists well in probing the light ALP physics since the ALP-gluon coupling is reachable through this analysis in a significant portion of the parameter space.