The High Quality QCD Axion and the LHC

The QCD axion provides an elegant solution to the Strong CP Problem. While the minimal realization is vulnerable to the so-called "Axion Quality Problem", we will consider a more robust realization in the presence of a mirror sector related to the Standard Model by a (softly broken) $\mathbb{Z}_2$ symmetry. We point out that the resulting "heavy" axion, while satisfying all theoretical and observational constraints, has a large and uncharted parameter space which allows it to be probed at the LHC as a Long-Lived Particle (LLP). The small defining axionic coupling to gluons results in a challenging hadronic decay signal, which we argue can be distinguished from the background in such a long-lived regime; and yet the same coupling allows for sufficient production at hadron colliders thanks to the large gluon parton luminosity. Our study opens up a new window towards accelerator-observable axions and, more generally, singly produced LLPs.

Introduction.-The Strong CP problem is the puzzle of why the strong interactions are CP symmetric even though the Standard Model (SM) as a whole is not. Technically, the question centers on the vanishingly small value of the one CP-violating coupling of QCD, θ. An elegant solution is provided by elevating θ to a fully dynamical pseudoscalar "axion" field. If the axion gets its potential entirely through QCD effects, then remarkably its ground state automatically corresponds to θ = 0 [1][2][3][4].
Despite this bottom-up simplicity, the QCD axion mechanism has a top-down flaw: the axion quality problem [5][6][7][8]. This arises because there can be other UV contributions to the axion potential that can push the minimum away from θ = 0. The QCD-induced potential is so shallow that even higher-dimensional interactions suppressed by UV scales, all the way up to the Planck mass, are sufficient to spoil the axion mechanism. One approach to this problem is to very strongly suppress these UV contributions with special UV structure for the axion, such as compositeness [9,10], extra dimensions [11] or string theory [12]. While such structures do ameliorate the quality problem, it is not clear to what extent they fully solve it. In this paper, we present an alternative resolution. We consider a mirror sector in the UV related to the SM by a $\mathbb{Z}_2$ symmetry [13][14][15][16][17], coupled to the same axion, such that its contribution to the axion potential is much larger than QCD's but aligned with it in having its minimum at θ = 0. This results in a vastly higher quality axion mechanism, in that it is much more robust against other uncorrelated UV effects. See Refs. [18][19][20][21][22][23][24][25] for earlier work, which also discusses other ways of obtaining a heavy QCD axion.
Enormous efforts are currently being devoted to the search for a standard QCD axion in the form of an extremely light and extremely weakly coupled field in small-scale experiments, astroparticle experiments, and astrophysical observations. For reviews, see Refs. [26,27]. By contrast, we find our general solution to the quality problem places us in a very different and interesting region in axion mass-coupling parameter space, in which the axion can be probed as a quantum particle at cutting-edge collider experiments.
The phenomenology of the high quality QCD axion should be distinguished from that of other types of light pseudoscalar particles, often dubbed axion-like-particles (ALPs) (for recent discussions of ALP effective theories see e.g. Refs. [28,29]). We expect that true axions will have their coupling to QCD not much weaker than other couplings to the SM. Many ALP searches/models, however, consider a strong connection and couplings to the electroweak sector, and are, therefore, not directly relevant for a true QCD axion (e.g. see Refs. [28][29][30][31][32][33][34][35][36][37][38][39]). Some ALP searches do conform to QCD axion expectations and yet couple strongly enough to the SM that axion production and decay can be detected at colliders by traditional means, e.g., see Refs. [40][41][42][43][44][45][46][47]. But we will show that a high quality axion has a significant parameter space where its couplings to QCD and the SM are so weak that its production and decays might be expected to be buried under SM background. And yet, remarkably, for sufficiently small coupling the axion becomes a long-lived particle (LLP), spatially separable from the dominant backgrounds, and is produced in sufficient numbers because the small coupling is offset by the immense gluonic content of the colliding proton beams. We will show that this high quality axion search presents a theoretically motivated and experimentally novel but challenging target for singly produced LLPs at the LHC main detectors. For other related on-going and proposed LLP ALP searches see Refs. [48][49][50][51][52][53][54][55][56][57][58].
We begin by reviewing the axion mechanism and its quality problem, and then present our two-sector solution, which puts the axion within collider reach. We discuss the details of the LHC phenomenology, focusing on axion production, its long-lived regime, and the suppression of the dominant background from fake-tracks. Finally, we discuss the results and provide our outlook.

arXiv:1911.12364v1 [hep-ph] 27 Nov 2019
Axion mechanism and the quality problem.-The QCD axion field, $a(x)$, is coupled to QCD by promoting $\bar\theta \to \bar\theta + a(x)/f_a$:

$$\mathcal{L} \supset \frac{\alpha_3}{8\pi}\left(\bar\theta + \frac{a}{f_a}\right)G^{a,\mu\nu}\tilde{G}^a_{\mu\nu}. \qquad (1)$$

In the absence of the axion, $\bar\theta$ represents the CP-odd gauge-invariant QCD coupling, constrained by bounds on the neutron electric dipole moment to be $|\bar\theta| < 10^{-10}$ [59]. In the above, $\alpha_3 = g_s^2/4\pi$ is given in terms of the QCD gauge coupling $g_s$; $\tilde{G}$ denotes the dual of the gluon field strength, $\tilde{G}^{a,\mu\nu} = \frac{1}{2}\epsilon^{\mu\nu\rho\sigma}G^a_{\rho\sigma}$ with $\epsilon^{0123} = 1$; and $f_a$ denotes the axion "decay constant". $\bar\theta$ is the effective $\theta$-parameter obtained after diagonalizing the Yukawa matrices via chiral rotations, and is given by $\bar\theta = \theta + \arg\det(Y_u Y_d)$, where $Y_{u,d}$ are the Yukawa matrices with complex entries and $\theta$ is a bare Lagrangian parameter.
The non-perturbative QCD axion potential resulting from Eq. (1) can be calculated using chiral perturbation theory [3,60],

$$V(a) \approx -m_\pi^2 f_\pi^2 \sqrt{1 - \frac{4 m_u m_d}{(m_u + m_d)^2}\sin^2\!\left(\frac{\bar\theta + a/f_a}{2}\right)}, \qquad (2)$$

where $m_\pi$ and $f_\pi$ are respectively the mass and the decay constant of the pion, and $m_{u(d)}$ is the up (down) quark mass. Given the potential in Eq. (2) (and its refinements), the axion acquires a vacuum expectation value (VEV), $\langle a \rangle = -f_a \bar\theta$. Plugging this into Eq. (1), we see that at low energies the CP violation in QCD is eliminated, solving the Strong CP Problem. The mass of the axion, $m_a$, can then be obtained from Eq. (2) as

$$m_a = \frac{\sqrt{m_u m_d}}{m_u + m_d}\,\frac{m_\pi f_\pi}{f_a}. \qquad (3)$$

Clearly, if there is any other contribution to the axion potential from beyond QCD, the resulting axion ground state need no longer screen the QCD CP violation. One can realize the axion as a Nambu-Goldstone boson of a U(1) Peccei-Quinn (PQ) symmetry, $e^{ia/f_a} \to e^{i\phi} e^{ia/f_a}$, which would forbid any axion potential. But such a symmetry cannot be exact, because it is broken by QCD (chiral anomaly) effects, and further, in the far UV, quantum gravity is expected to explicitly break all global symmetries [61,62]. At best, PQ must be an accidental symmetry of the leading couplings. For example, the UV violation of PQ symmetry may take the form of a higher-dimensional composite operator, $\mathcal{O}$, with scaling dimension $\Delta > 4$ and PQ charge $q$,

$$\mathcal{L} \supset \frac{\mathcal{O}}{M_p^{\Delta - 4}} + \text{h.c.}, \qquad (4)$$

where $M_p \sim 10^{19}$ GeV is the Planck scale. Naively, one would expect that Planck-suppressed PQ violation would have negligible effects on the axion mechanism. However, given the experimental constraint $f_a > 10^9$ GeV [27] and the very delicate QCD potential in Eq. (2), the axion mechanism is spoilt (for $q \sim 1$) out to very high scaling dimension, unless $\Delta \geq 9$. This extreme fragility of the axion mechanism is the so-called "quality problem" [5][6][7][8].
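As a numerical check, the tree-level mass formula $m_a = \sqrt{m_u m_d}/(m_u + m_d)\, m_\pi f_\pi/f_a$ can be evaluated directly; a minimal sketch (the quark and pion input values are approximate and are our own, not taken from this paper):

```python
import math

# approximate light-quark and pion inputs (GeV); illustrative values only
M_U, M_D = 2.2e-3, 4.7e-3    # up and down quark masses
M_PI, F_PI = 0.135, 0.092    # pion mass and decay constant

def axion_mass_ev(f_a_gev):
    """Tree-level QCD axion mass (in eV) for decay constant f_a in GeV."""
    ma_gev = math.sqrt(M_U * M_D) / (M_U + M_D) * M_PI * F_PI / f_a_gev
    return ma_gev * 1e9  # GeV -> eV

m = axion_mass_ev(1e12)  # standard invisible-axion benchmark f_a = 10^12 GeV
```

For $f_a = 10^{12}$ GeV this returns $m_a \approx 6\ \mu$eV, the familiar ballpark for the standard QCD axion, and it scales as $1/f_a$.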
While it is possible, but demanding, for the UV structure to strongly suppress PQ violation at this level, in this paper we study a different approach: we strengthen the IR axion mechanism itself. We begin by noting that the quality problem is so severe because the QCD-induced potential realizing the axion mechanism is set by the relatively small hadronic scales, ∼ 100 MeV. This implies even smaller axion masses, which could only have escaped discovery so far if the coupling $1/f_a$ to the SM is extremely small. We will introduce a mirror sector to the SM which reinforces the axion mechanism and mass with much larger scales, and consequently, larger couplings to the SM are experimentally allowed.
Model construction.-Here we will develop the mirror-sector structure for the axion in a manner that resolves the quality problem with a minimum of other theoretical bias, leaving the broadest axion parameter space to be explored or constrained by experiments. We consider a $\mathbb{Z}_2$ symmetry which exchanges the fields of the SM with matching fields of the mirror sector. This replication includes the SM gauge structure, so that the entire mirror sector carries no SM charges and vice versa. All the marginal (dimensionless) couplings of the two sectors are identical, including $\bar\theta$. However, $\mathbb{Z}_2$ is softly broken in the one relevant operator, by having two distinct tachyonic Higgs mass terms,

$$V \supset -\mu^2 |H|^2 - \mu'^2 |H'|^2. \qquad (5)$$

We will consider $M_p^2 \gg \mu'^2 \gg \mu^2$. One can view $\mu'^2 - \mu^2$ as arising from the VEV of a $\mathbb{Z}_2$-odd scalar coupled to the Higgs fields in the far UV. It is then plausible that the $\mathbb{Z}_2$ breaking in the marginal couplings is as small as $\sim \mu'^2/M_p^2$. Let us address the theoretical plausibility of this kind of replication of SM structure from a UV perspective. For example, within a broadly string-theoretic framework, it is possible that the SM gauge structure is realized at the intersection of some "branes" localized in an extra-dimensional manifold, see e.g. Ref. [63]. SM couplings and fields could represent the ground state and low-energy fluctuations/degeneracies of the brane arrangement in the extra dimensions. This is analogous to a molecule in ordinary space, with its ground state and (near-)degeneracies. Of course, instead of one, there can readily be two such molecules in space, made of the same atoms, and they will have identical ground states and degeneracies. There may be an environmental bias, say one molecule sitting in a background magnetic field, in which case the two molecules are not identical. But if this bias is very weak, only the most fragile properties of the molecules will differ significantly.
Similarly, the SM brane arrangement may be replicated in the extra dimensions by a mirror sector with nearly identical couplings and fields. But the most significant difference is expected in the most fragile property of the SM. This is given by the mass-squared of the Higgs doublet, the very root of the Hierarchy Problem. It is plausible that the Anthropic Principle requires that one of these sectors has a small electroweak scale, but not both. This would yield the structure we have proposed above.
We now introduce a $\mathbb{Z}_2$-invariant QCD axion coupling,

$$\mathcal{L} \supset \frac{\alpha_3}{8\pi}\frac{a}{f_a}\left(G\tilde{G} + G'\tilde{G}'\right). \qquad (6)$$

We assume a $\mathbb{Z}_2$-symmetric UV completion of this non-renormalizable axionic coupling at scales $\sim f_a$, paralleling any choice of UV completion in minimal axion models without a mirror sector. The equality of the strong couplings indicated above holds in the far UV, by $\mathbb{Z}_2$, but the two couplings can run differently below $\sim \mu'$. For $\mu' > 100$ TeV, $\Lambda_{\rm QCD} \ll \Lambda'_{\rm QCD} < m_{q'}$ for all mirror quarks $q'$. We can estimate the mirror strong-coupling scale $\Lambda'_{\rm QCD}$ given $\Lambda_{\rm QCD}$ using the 1-loop renormalization group. The differential running at 1-loop depends on the quark and mirror-quark masses in terms of $\mu$ and $\mu'$, but is insensitive to the new model-dependent thresholds involving colored degrees of freedom needed to UV-complete the non-renormalizable axionic couplings, since these do not get mass through electroweak symmetry breaking.
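The 1-loop estimate described above can be sketched as follows: run the SM coupling up from $M_Z$, impose $\mathbb{Z}_2$ equality at a high matching scale, then run the mirror coupling back down, decoupling mirror quarks (masses scaled by $r = \mu'/\mu$) at their thresholds. A minimal sketch, with rough $\overline{\rm MS}$ quark masses and a matching scale of $10^{16}$ GeV as our own illustrative choices:

```python
import math

MZ, ALPHA_S_MZ = 91.19, 0.118                     # Z mass (GeV), alpha_s(MZ)
MQ = [0.0022, 0.0047, 0.095, 1.27, 4.18, 173.0]   # rough quark masses (GeV)

def b0(nf):
    """One-loop QCD beta-function coefficient with nf light flavours."""
    return 11.0 - 2.0 * nf / 3.0

def run(inv_alpha, q_from, q_to, nf):
    """Evolve 1/alpha_s from q_from to q_to at one loop with nf flavours."""
    return inv_alpha + b0(nf) / (2.0 * math.pi) * math.log(q_to / q_from)

def mirror_lambda(r, m_match=1e16):
    """Estimate Lambda'_QCD when mirror-quark masses are r times the SM ones."""
    # SM coupling run up from MZ: nf = 5 up to m_t, nf = 6 above
    inv_a = run(1.0 / ALPHA_S_MZ, MZ, MQ[-1], 5)
    inv_a = run(inv_a, MQ[-1], m_match, 6)
    # Z2 sets the mirror coupling equal at m_match; run down through the
    # mirror-quark thresholds r * m_q, removing one flavour at each step
    q = m_match
    for i, m in enumerate(sorted((r * mq for mq in MQ), reverse=True)):
        inv_a = run(inv_a, q, m, 6 - i)
        q = m
    # below the lightest mirror quark (pure glue): 1/alpha vanishes at Lambda'
    return q * math.exp(-2.0 * math.pi * inv_a / b0(0))

lam = mirror_lambda(1e9)  # mu'/mu ~ 1e9, i.e. mu' ~ 10^11 GeV
```

For $r \sim 10^9$ (i.e. $\mu' \sim 10^{11}$ GeV) this returns $\Lambda'_{\rm QCD}$ of a few hundred GeV, in line with the $\sim 100$ GeV scale invoked later for GeV-mass axions; larger $r$ gives larger $\Lambda'_{\rm QCD}$.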
In the above regime the non-perturbative mirror QCD (pure glue) contribution to the axion potential near its minimum is given, using lattice computation [64] and continuum ($\overline{\rm MS}$) matching [65], by

$$V'(a) \simeq \frac{1}{2}\,\chi'\left(\bar\theta + \frac{a}{f_a}\right)^2, \qquad \chi' \sim \Lambda'^4_{\rm QCD}, \qquad (7)$$

where the precise proportionality constant encodes the model-dependent conversion between the 1-loop and 2-loop estimates of the strong-coupling scales. The resulting axion mass is

$$m_a = \frac{\sqrt{\chi'}}{f_a} \sim \frac{\Lambda'^2_{\rm QCD}}{f_a}. \qquad (8)$$
Considering Eq. (7), we immediately see that the single QCD axion $a$ solves the Strong CP problems of both sectors at the same time, through the VEV $\langle a \rangle = -f_a \bar\theta$.
Although the two values of $\bar\theta$ are identical for the two sectors in the UV, the breaking of the $\mathbb{Z}_2$ symmetry can make the two $\bar\theta$'s different below $\mu'$. However, RG running of $\bar\theta$ occurs at seven loops, and contributions from threshold corrections arise at four loops [66]. Thus both of these effects, arising from renormalizable operators in the SM and SM', are too small to be significant even given the tight EDM constraints.

FIG. 1. The preferred model parameter regions for our high quality axion model. We require that no new colored particles exist with a mass below 2 TeV and that higher-dimensional operators involving the axion or the Higgs do not reintroduce the Strong CP problem, as well as several astrophysical, cosmological and collider constraints. We assume an axionic coupling to the different gauge groups weighted by their respective fine structure constants (see Appendix A).
However, higher-dimensional operators suppressed by the Planck scale can make the two angles different. For example, the interactions

$$\frac{\alpha_3}{8\pi}\frac{|H|^2}{M_p^2}\,G\tilde{G} + \frac{\alpha_3}{8\pi}\frac{|H'|^2}{M_p^2}\,G'\tilde{G}' \qquad (9)$$

can give $\bar\theta \neq \bar\theta'$ upon the breaking of the $\mathbb{Z}_2$ symmetry. Requiring $|\bar\theta - \bar\theta'| < 10^{-10}$ gives the constraint $\mu' < 10^{14}$ GeV, and thus there is a maximal amount by which $a$ can be made heavier. Furthermore, as discussed above, dimensionless couplings such as $\bar\theta$ can directly get a small $\mathbb{Z}_2$-breaking correction $\sim \mu'^2/M_p^2$, which is again sufficiently suppressed for $\mu' < 10^{14}$ GeV.
From Eq. (8), we see that the resulting axion mass is much larger than in the SM alone (for a given $f_a$), so that it can be heavier than $\Lambda_{\rm QCD}$. This significantly weakens the existing experimental constraints and allows stronger couplings, $1/f_a$, to the SM. The raising of the axion mass and lowering of $f_a$ clearly reduce the severity of the quality problem. This opens up a strongly motivated and experimentally testable new regime for the QCD axion, which we identify now.
Constraints on parameter space.-In Fig. 1 we show the preferred parameter space for our model. We begin with the quality problem. We choose as a benchmark a composite axion model for which PQ symmetry holds at the renormalizable level, but can be violated at $\Delta = 6$. Given Eq. (4), this reintroduces the Strong CP Problem in the region labeled "PQ Quality Problem" in Fig. 1. We cannot populate the area labeled "Below the QCD Axion line", defined by Eq. (3), as our mechanism can only make the axion heavier, not lighter. In the area labeled "Higgs VEV Quality Problem", $\langle H' \rangle \sim \mu' > 10^{14}$ GeV and Planck-suppressed operators spoil the axion mechanism, as explained around Eq. (9). Our EFT is only valid if $f_a > \Lambda'_{\rm QCD}$, which excludes the region shaded in cyan in Fig. 1. We are agnostic about the origin of the axion coupling to QCD. Typically, the coupling is generated by integrating out colored fundamental fermions which get a mass from a coupling $y f_a e^{ia/f_a}$. Requiring that the Yukawa coupling, $y$, of these fermions is smaller than $4\pi$, we have new colored particles below $4\pi f_a$. Requiring that these colored fermions satisfy LHC constraints [67][68][69][70] and are heavier than $\sim 2$ TeV disfavors the region shown in Fig. 1. The region labeled "Astro+Cosmo+Beam Dump" is ruled out by a variety of supernova, stellar cooling, beam dump, and cosmology constraints. The current collider coverage is shown in the red regions labeled "Collider Searches". These constraints, along with original references, can be found in Appendix B and Ref. [29]. Constraints shown without sharp outlines involve some degree of theoretical uncertainty, as discussed in the text.
We see then that the most favored region is given by m a ∼ GeV-TeV and f a ∼ TeV-PeV, ripe for collider exploration!
Phenomenology.-We present a search strategy and discuss in detail its feasibility for massive axions in the GeV to tens-of-GeV range, with decay constants ranging between 100 TeV and a PeV, thereby covering a sizable portion of the open regime seen in Fig. 1. We first note from Eq. (8) that, for the axion to have $m_a \gtrsim$ GeV and $f_a \gtrsim 100$ TeV, we need $\Lambda'_{\rm QCD} \gtrsim 100$ GeV $\gg \Lambda_{\rm QCD}$. For this to happen in our (softly broken) $\mathbb{Z}_2$-symmetric set-up, we need $\mu' > 10^{11}$ GeV.
Three crucial ingredients make a massive-axion search at the LHC plausible. First, the signal is displaced, giving us a powerful discriminator against hadronic backgrounds. Second, the signal still has a sizable production rate in the displaced parameter regime. This is non-trivial for a GeV-scale axion, or in general for singly produced long-lived particles, given that the same small coupling controls both the production rate and the proper lifetime. The lever arm that offsets the small production coupling is the immense gluon parton luminosity at the LHC and other proton accelerator experiments. The third ingredient is the possibility of a low-level Displaced Track Trigger [71,72] that can be configured for our low-mass signal, where traditional high-$p_T$ triggers would fail.
Let us see the quantitative connection between the production rate and the lifetime. The axion production cross section scales as

$$\sigma(pp \to a + \text{jets}) \propto \frac{1}{f_a^2}, \qquad (10)$$

with the overall normalization set by the large gluon parton luminosity. Here we have imposed an $H_T > 100$ GeV cut, and hence the axion mass ($\lesssim 20$ GeV) does not significantly affect the cross section. Similarly, if one instead requires a minimal leading-jet $p_T$ cut of 30 GeV, the cross section increases by around a factor of 3. The lifetime of the axion can be approximated using the leading-order $a \to gg$ width,

$$c\tau_a \simeq \frac{\hbar c}{\Gamma(a \to gg)}, \qquad \Gamma(a \to gg) \simeq \frac{\alpha_3^2}{32\pi^3}\frac{m_a^3}{f_a^2}. \qquad (11)$$

The detailed discussion of the axion production and decays can be found in Appendices A 1 and A 2, respectively. We note here that the axion dominantly decays into hadronic final states; hence, with the production mode we consider, the signal will be a displaced jet recoiling against prompt jet(s). The challenges and opportunities for this signal reside both at trigger level and in the post-trigger analysis. A good signal trigger efficiency is critical given the rate in Eq. (10), and can be achieved using the Displaced Track Trigger discussed in Refs. [71,72]. We follow this construction with conservative modifications to accommodate our signal. In detail, we require:
• At least three tracks (within an L1 jet) with $p_T > 2$ GeV;
• Amongst the above, at least three tracks with transverse impact parameter $d_0 > 1$ mm;
• The pseudo-rapidity of the tracks to be $|\eta| < 2.4$;
• The signal decay location in the transverse plane, $d_T < 35$ cm, to have enough hits in the tracker outer layers;
• The $H_T$ of the event to be greater than 100 GeV.
These requirements are sufficient to pass the Level-1 (L1) trigger with affordable rates below 10 kHz, dominated by backgrounds from fake tracks [71,72]. The requirement of three or more tracks is passed quite easily for axion masses $\gtrsim$ few GeV. For example, at a fixed proper lifetime of 3 cm, 17% of axion decays produce three or more tracks for $m_a = 2$ GeV, while for $m_a \gtrsim 4$ GeV more than 90% of the decays do.
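To make the lifetime scaling above concrete, here is a minimal numerical sketch using the leading-order $a \to gg$ width, $\Gamma = \alpha_3^2\, m_a^3 / (32\pi^3 f_a^2)$; the fixed $\alpha_s = 0.1$ and the bare LO formula are our own illustrative simplifications of the more refined treatment in the Appendices:

```python
import math

HBAR_C = 1.9733e-16  # hbar * c in GeV * m

def ctau_m(m_a, f_a, alpha_s=0.1):
    """Proper decay length (metres) from the LO width a -> gg,
    Gamma = alpha_s^2 m_a^3 / (32 pi^3 f_a^2); masses in GeV."""
    gamma = alpha_s**2 * m_a**3 / (32.0 * math.pi**3 * f_a**2)
    return HBAR_C / gamma

# the decay length falls as 1/m_a^3 and grows as f_a^2
ct = ctau_m(5.0, 1e5)  # m_a = 5 GeV, f_a = 100 TeV: millimetre scale
```

The millimetre-to-metre decay lengths obtained across the $m_a \sim$ GeV, $f_a \sim 100$ TeV-PeV range are exactly the displaced-but-still-in-detector regime exploited by the search.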
More details about the kinematic distributions of tracks coming from axion decay can be found in Fig. 7 in the Appendix. The SM background consists mainly of metastable SM particles, such as B mesons, kaons, pions, and tau leptons. Kaons and pions have rather long lifetimes but only produce two displaced tracks; our requirement of three displaced tracks vetoes these backgrounds very effectively. B mesons and tau leptons have proper decay lengths ($c\tau$) of around 500 microns and 87 microns, respectively, producing tracks with low impact parameters; our trigger requirement of $d_0 > 1$ mm and the requirement of a reconstructed vertex displaced by more than 5 mm veto them effectively. For the detailed simulation and the background counting, see the Appendix.
The fake-track background is one of the most crucial backgrounds for displaced-vertex searches at the LHC, at every stage of the experimental analysis. It arises from mis-connections of tracker hits or from instrumental noise.
The key feature of these fake tracks is that they allow for a much larger reconstructed impact parameter than the SM background. Empirically, one can model them as tracks with flat distributions over finite ranges in the relevant track parameters, such as the track impact parameter. Nevertheless, the fake-track backgrounds that pass the above L1 triggers are huge, amounting to about

$$10\ \text{kHz} \times 10^8\ \text{s} = 10^{12} \qquad (12)$$

triples of fake tracks at the HL-LHC. At the higher-level triggers and in the analysis, one needs to suppress this background much further, to below a Hz, while maintaining a high signal efficiency. Beyond all existing studies, we not only consider the issue of triggering on our signal, but also demonstrate that it is possible to suppress these backgrounds using a 2D-4D displaced-vertexing selection at high level. The 2D-4D vertexing is defined as follows. We first solve for a 2D vertex by finding the best-fit point that minimizes the distances between the vertex and the tracks. Then we construct the 4D vertex of the system by extrapolating the 2D point in the transverse plane to the z direction and the time direction by propagating the tracks [5,6]. Our 2D-4D displaced-vertexing selection is defined as follows:
1. The 2D common vertex has a minimal distance to the interaction point of 0.5 cm and a maximal distance of 35 cm, $0.5\ \text{cm} < d_T < 35\ \text{cm}$;
2. The 2D tracks fit a common vertex with standard deviation $\Delta d_T < 1$ cm;
3. The 2D common vertex is sufficiently displaced from the interaction point, $d_T/\Delta d_T > 5$;
4. The corresponding 4D vertex has a standard deviation in the z direction $\Delta d_z < 5$ cm;
5. The corresponding 4D vertex has a z-direction location $d_z < 20$ cm;
6. The corresponding 4D vertex has a standard deviation in time $\Delta d_t < 500$ ps;
7. The corresponding 4D vertex has a time $d_t < 1000$ ps;
8. The tracks are within 0.4 in pseudorapidity of the reconstructed displaced-jet direction, $|\eta_i - \eta_V| < 0.4$, for all three tracks;
9. The tracks are within 0.4 in azimuthal angle of the reconstructed displaced-jet direction, $|\phi_i - \phi_V| < 0.4$, for all three tracks.
[5] We conservatively assume that the track momentum direction follows the direction that minimizes the flight time between the 2D impact point and the fitted vertex.
[6] Note that this is not directly a 4D vertex fit that minimizes the $\chi^2$ of the 4D track information for a common vertex; this definition reduces the computation needed for the vertex finding. While we show below that our 2D-4D vertexing suppresses the background to a minimal level, a full 4D fit would certainly make the result more robust, providing higher background suppression.
The definitions of these quantities are shown in the schematic drawing in Fig. 2. Following the empirical model of the fake-track distribution discussed above, we find that the combination of the transverse-plane vertex fitting (Cuts 1-3) provides a suppression factor of $8.2 \times 10^{-2}$. The z-direction consistency (Cuts 4-5) and t-direction consistency (Cuts 6-7) provide $4.9 \times 10^{-2}$ and $3.0 \times 10^{-3}$ suppression, respectively. Furthermore, the requirement that the displaced tracks point back to the primary vertex (Cuts 8-9) provides $4.9 \times 10^{-4}$ suppression. After taking into account the correlations between the selection cuts, the resulting overall suppression factor of the fake-track background from this 2D-4D vertex-fitting procedure is $2.9 \times 10^{-9}$. This reduces the background to $10^{12} \times 2.9 \times 10^{-9} = 2900$.
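The first step of the 2D-4D vertexing, finding the transverse-plane point that minimizes the summed squared distances to the tracks, is a small linear least-squares problem. A minimal self-contained sketch with toy straight-line tracks (our own illustration, not the experiments' actual fitters):

```python
import math

def fit_vertex_2d(tracks):
    """Least-squares 2D vertex: the point v minimizing
    sum_i |(v - p_i) - ((v - p_i) . d_i) d_i|^2 over straight tracks,
    each given as (point (x, y) on the track, direction angle phi)."""
    # normal equations: A v = b, A = sum(I - d d^T), b = sum((I - d d^T) p)
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (px, py), phi in tracks:
        dx, dy = math.cos(phi), math.sin(phi)
        m11, m12, m22 = 1.0 - dx * dx, -dx * dy, 1.0 - dy * dy
        a11 += m11; a12 += m12; a22 += m22
        b1 += m11 * px + m12 * py
        b2 += m12 * px + m22 * py
    det = a11 * a22 - a12 * a12  # non-degenerate for >= 2 distinct directions
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)

# toy example: three tracks emanating from a common displaced point (3, 1) cm
tracks = [((3.0 + 2.0 * math.cos(p), 1.0 + 2.0 * math.sin(p)), p)
          for p in (0.3, 1.1, -0.5)]
vx, vy = fit_vertex_2d(tracks)
d_T = math.hypot(vx, vy)  # transverse displacement entering Cut 1
```

For tracks that truly share a vertex the fit recovers it exactly; for random fake-track triples the fitted $d_T$ and its spread $\Delta d_T$ rarely satisfy Cuts 1-3 simultaneously, which is the origin of the suppression factor quoted above.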
A crucial consideration on top of the above background estimation is that, so far, it uses only the outer layers of the tracking information. For the signal, there would be consistent energy deposition in the Electromagnetic Calorimeter (ECal) and Hadronic Calorimeter (HCal), as well as inner-tracker information, which will improve the spatial resolution of the displaced tracks and constitute a powerful consistency check. If one further requires the matching of the information between the different subdetectors for all the tracks, as well as the neutral hadrons, each of the three tracks should provide at least one order of magnitude of fake-track background suppression, reducing our background estimate to $2900 \times (10^{-1})^3 \approx 3$.
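For bookkeeping, the suppression factors and counts quoted above combine as follows (the naive uncorrelated product is shown alongside the quoted correlated value):

```python
# per-stage fake-track suppression factors quoted in the text
transverse = 8.2e-2   # Cuts 1-3: transverse-plane vertex fit
z_cons     = 4.9e-2   # Cuts 4-5: z-direction consistency
t_cons     = 3.0e-3   # Cuts 6-7: timing consistency
pointing   = 4.9e-4   # Cuts 8-9: tracks point back along the displaced jet

naive = transverse * z_cons * t_cons * pointing  # ~ 5.9e-9 (no correlations)
quoted = 2.9e-9   # overall factor quoted after accounting for correlations

n_l1 = 1e12                       # fake-track triples passing L1, Eq. (12)
after_vertexing = n_l1 * quoted   # ~ 2900 events after 2D-4D vertexing
# one extra order of magnitude per track from subdetector matching:
after_matching = after_vertexing * (1e-1) ** 3   # ~ 3 events
```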
Hence, it is plausible that the fake-track background can be suppressed to negligible levels. Admittedly, there is large uncertainty in the fake-track background estimation. For instance, depending on the detector performance at the HL-LHC, one can have 10-30 fake tracks per collision, and the ranges in the fake-track distribution model may vary. The study in Ref. [72] showed that, by requiring two "high quality" fake tracks, one can achieve $\sim 10^{-1}$ background rate suppression, while the $H_T$ cut provides $\sim 10^{-2}$ suppression, so the overall rate at L1 is around 10 kHz. Our evaluation here conservatively assumes that the same level of suppression can be achieved by requiring three "high quality" fake tracks. As shown in the cross-section discussion in Appendix A 1, such an $H_T$ cut reduces the inclusive cross section by more than two orders of magnitude for the axion mass regions of interest. To show what one can achieve with a slightly less conservative trigger, in the next section we also consider a trigger with a leading-jet $p_T$ of 30 GeV plus three "high quality" displaced tracks. We assume that the same level of L1 rate and background can be maintained using advanced trigger developments, such as matching information between different subdetectors. Additionally, there are backgrounds from two fake tracks plus one real track, or one fake track plus two real tracks. These can be vetoed effectively by requiring the real tracks to have sizable impact parameters, which is already part of our preselection and trigger considerations.
Although it was not included in the 2D-4D vertexing selection above, a requirement on the vertex mass would certainly help to further reduce the SM background for axion masses above $\sim$5-10 GeV. For smaller axion masses, a simple mass cut would cut away signal, so a more careful treatment might be needed. There is another crucial machine-related background: secondary vertices created by SM particles interacting with detector material. Naively, these backgrounds are very similar to our signal, since they feature a true displaced vertex with a small mass whose momentum also points back to the primary vertex with high probability. Special treatments, such as removing displaced vertices that originate from material-dense areas, have been applied in various displaced-vertex searches at the LHC [74][75][76][77]. It is not yet clear how efficient this removal procedure would be for low-mass searches such as ours, but it is important to remove the backgrounds from material interactions. This challenge deserves further study.
Lastly, there may be another approach to start looking for our signals, which is through a modification of the hadronic tau tagger to exploit the large displacement. Note the similarities between our signal and the three-prong tau decays: both of them are hadronic, involve GeV scale masses, and are displaced. One could start conducting our search by imposing a further displacement requirement for a three-prong tau-like object, after removing the invariant mass cut. It would be an interesting and practical approach to start exploring our signals.
Results.-After these considerations, we show the estimated sensitivities for our signals at the HL-LHC with 3000 fb$^{-1}$ of integrated luminosity. The typical signal efficiencies with our selection cuts are between $10^{-3}$ and $10^{-1}$, as can be seen from Fig. 6. Given the large uncertainties in the background estimation, we show in Fig. 3 the reach of the proposed search under two different assumptions on the number of signal events, namely 3 and 10 signal events, corresponding to 0 and 25 background events, respectively. While these background numbers are just our crude estimates as described above, the LHC experiments would have refined background estimates based on full simulations and data-driven methods.
From Fig. 3 we can see that with a minimal $H_T$ cut of 100 GeV (shown in the green shaded region), our estimates imply a search for a displaced hadronic vertex would probe axion masses of GeV $\lesssim m_a \lesssim 12$ GeV, with decay constants from 100 TeV up to the PeV scale. There are several unique features of the coverage plot in Fig. 3 that differ from the analogous results for more common LLP searches in which LLPs are pair-produced. Understanding them will help in designing and optimizing future searches for GeV-scale axions and, more generally, singly produced LLPs. First, the coverage has a strong dependence on the number of signal events needed, as is clearly shown by comparing the coverage between the shaded regions of different colors. This strong dependence comes from our trigger requirement of $H_T > 100$ GeV or leading-jet $p_T > 30$ GeV: since the axion mass we are probing is much smaller than these scales, the production cross section remains approximately constant, as discussed around Eq. (10) and shown clearly in Fig. 4 in the Appendix. Second, the lower edge of the coverage is determined by the minimal displacement requirement, below which the efficiency becomes too low for a GeV axion to be sufficiently displaced. Third, unlike most LLPs, which are produced by stronger interactions than those involved in their decays, the upper edge of our search is limited by the production rate of the LLPs. These features echo the challenge of the search: one needs good background control to reach the GeV axion. We have to reduce the background as much as possible while maintaining a good signal-selection efficiency. Of course, alternative signal trigger and selection strategies that do not penalize low-mass, small-displacement axions could potentially enhance the coverage. This deserves further study.
The allowed region for the high quality QCD axion in Fig. 1 also includes a regime of relatively low decay constants, $\sim 100$ GeV, which allows for copious production and prompt decays. A significant number of ongoing ALP searches at various accelerator facilities cover this regime. The coverage of these searches, after translating into our axion EFT normalization,

$$\frac{a}{8\pi f_a}\left(c_3\,\alpha_3\, G\tilde{G} + c_2\,\alpha_2\, W\tilde{W} + c_1\,\alpha_1\, B\tilde{B}\right),$$

is typically confined to $f_a \lesssim 100$ GeV. The best projected sensitivity for $m_a \sim 10$ GeV masses is the HL-LHC prompt diphoton resonance search [42], which reaches $f_a \sim$ TeV.
We also show our result in logarithmic scale in the lower panel of Fig. 3 to compare with these existing and other proposed prompt searches, whose details can be found in the Appendix B.
In the regime $f_a \lesssim 100$ GeV, the non-renormalizable EFT is expected to break down at energies $\sim 4\pi f_a$, and new SM-charged UV degrees of freedom should appear roughly within range of the LHC, providing alternative search channels, with current constraints schematically indicated in Fig. 3. But for the high decay constants we have studied, it may well be that axion production/detection is the only channel available at collider energies.
Discussion and Outlook.-The quest for axions is pressing. We have put forward a general theoretical structure involving a mirror sector in which a true QCD axion solves the Strong CP problem while being very robust against the axion quality problem. We find that ∼ GeV axions with ∼ PeV decay constants lie at the heart of the motivated parameter space (see Fig. 1).
Cosmologically, the mirror sector is constrained because any massive stable mirror states would constitute a component of dark matter and must have consistent properties and abundance, while the massless mirror photon is subject to $\Delta N_{\rm eff}$ constraints on relativistic species. We will take a simple attitude here and assume that the reheating temperature $T_{RH}$ after cosmic inflation is below $f_a \sim$ PeV, so that all massive stable mirror states are not reheated. Mirror glueballs are the lightest massive mirror states and may be reheated, but they decay promptly into SM gluons via off-shell axion exchange. The mirror photon equilibrates with the $\sim 100$ degrees of freedom of the SM through axion exchange, which implies $\Delta N_{\rm eff} \sim 0.05$, below the current constraint of $\Delta N_{\rm eff} \lesssim 0.2$ [78] but testable in future experiments. For Majorana neutrinos in both sectors, originating from (even roughly) $\mathbb{Z}_2$-symmetric dimension-5 couplings to the (mirror) Higgs, $m_{\nu'} \sim (\mu'^2/\mu^2)\, m_\nu > 10^8$ GeV for $\mu' > 10^{11}$ GeV, and therefore the mirror neutrinos are not reheated and pose no cosmological problem.
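The $\Delta N_{\rm eff} \sim 0.05$ estimate follows from standard entropy bookkeeping: after the mirror photon decouples, the annihilation of the $\sim 106.75$ SM degrees of freedom heats the SM plasma relative to it. A minimal sketch (assuming, as an illustration, decoupling above the electroweak scale):

```python
G_NU_DEC = 10.75   # SM entropy d.o.f. at neutrino decoupling
G_HIGH = 106.75    # SM entropy d.o.f. well above the EW scale

def delta_neff():
    """Delta N_eff of one massless hidden photon (2 bosonic d.o.f.)
    that decoupled while all SM species were still relativistic."""
    # energy ratio to one neutrino species: (8/7) * (T'/T_nu)^4, with
    # T'/T_nu = (g_*s(nu dec) / g_*s(decoupling))^(1/3) from entropy conservation
    return (8.0 / 7.0) * (G_NU_DEC / G_HIGH) ** (4.0 / 3.0)

dn = delta_neff()  # ~ 0.05
```

This reproduces the quoted $\Delta N_{\rm eff} \sim 0.05$, safely below the current bound but within reach of future CMB experiments.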
For ∼ GeV axions with ∼ PeV decay constants, the axion can be produced and detected at the LHC as a long-lived particle, though the search is challenging. The production rate is highly suppressed by the same small coupling that leads to the displaced decay, implying that for reasonable rates most decays will take place inside the LHC main detectors. Furthermore, the signal contains only one low-mass displaced vertex per event, whereas most existing searches target pair production of massive long-lived particles. We explored the dominant background of fake tracks and proposed a three-displaced-track strategy with 2D-4D displaced vertex reconstruction, demonstrating that the background can be feasibly suppressed. We believe this makes a strong case for experimental exploration.
While we have mostly focused on QCD axion motivations and modeling in this paper, there is a second general and important motivation for our proposed LLP search, following from the principle of electroweak Naturalness. This broad principle gives strong reasons to expect substantial new TeV-scale physics beyond the SM in order to stabilize the Higgs sector under quantum effects. However, we have not yet seen any evidence for such new physics, either in direct collider searches or in indirect tests of flavor physics and fundamental symmetries. This suggests that at best Naturalness is "frustrated" by other UV or anthropic considerations or mechanisms which postpone the new physics to scales beyond the reach of our most sensitive existing probes. It is entirely plausible that the new physics scale of frustrated Naturalness is in the 100 TeV − PeV range, beyond our current capacity to directly explore. But in any rich spectrum with typical masses in this range, it is also plausible that there will be a few much lighter particles that fall within our grasp; see, e.g., the discussion in Ref. [79]. Pseudo-Goldstone ALPs are particularly robust examples of these. From this perspective, we would expect an ALP to have all its non-renormalizable couplings to the SM characterized by these high scales, including its decay constant f_a that sets its couplings to gauge bosons. Almost all such highly suppressed couplings would be phenomenologically irrelevant. But the coupling to gluons, Eq. (1), is exceptional, as we have seen in this paper, in allowing ALPs to be produced in sufficient numbers at the LHC given the large effective luminosity of gluons. Such weakly coupled ALPs will be LLPs, plausibly dominated by hadronic decays. This is the same signal structure for which our proposed QCD axion search is designed.
An LLP search along the lines described in this paper will require creative designs of the triggers and analysis at the LHC main detectors. New axion production modes can also be considered, including hadron decays and gluon splittings during parton showering. For lighter axions, new decay modes can also be considered, ranging from exclusive three-pion modes to diphotons. These new production and decay considerations lead to rich phenomenology and further motivate our explorations and searches for GeV-scale long-lived particles at accelerator facilities [32, 39, 42-44, 50, 52, 54, 80-83].
Acknowledgments.-We would like to thank Matthew Daniel Citron, Jared Evans, Yuri Gershtein, Simon Knapen and Diego Redigolo for very useful comments on the draft. We would also like to thank Prateek Agrawal, Evan Berkowitz, Zohreh Davoudi and Simone Pagan Griso for helpful discussions. This research was supported in part by the NSF grants PHY-1620074, PHY-1914480 and PHY-1914731, and by the Maryland Center for Fundamental Physics (MCFP). AH, ZL and RS acknowledge the hospitality of the Kavli Institute for Theoretical Physics, UC Santa Barbara, during the "Origin of the Vacuum Energy and Electroweak Scales" workshop, and the support by the NSF grant PHY-1748958. AH and ZL would also like to thank PITT-PACC, MIAPP, and the Aspen Center for Physics (supported by NSF grant PHY-1607611) for support from their programs and for providing the environment for collaboration during various stages of this work. The codes for this study are available at Axion@LHC.

Appendix A: Axion Couplings

We parametrize the coupling of the axion to the SM gauge fields as

\mathcal{L} \supset \frac{a}{f_a}\left( c_3\,\frac{\alpha_3}{8\pi}\, G^a_{\mu\nu}\tilde{G}^{a\,\mu\nu} + c_2\,\frac{\alpha_2}{8\pi}\, W^i_{\mu\nu}\tilde{W}^{i\,\mu\nu} + c_1\,\frac{\alpha_1}{8\pi}\, B_{\mu\nu}\tilde{B}^{\mu\nu} \right), \quad (A1)

where α_i = g_i²/(4π) denote the SM gauge couplings and α_1 is related to the hypercharge gauge coupling as α_1 = (5/3) α_Y. Below the scale of electroweak symmetry breaking, the axion-photon coupling can be written as

\mathcal{L} \supset c_\gamma\,\frac{\alpha_{\rm EM}}{8\pi}\,\frac{a}{f_a}\, F_{\mu\nu}\tilde{F}^{\mu\nu}, \quad (A2)

where c_γ = c_2 + (5/3) c_1 and α_EM = e²/4π is the electromagnetic fine-structure constant.
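As a quick numerical illustration of this parametrization (a sketch, using the benchmark c_1 = c_2 = c_3 = 1 and f_a = 1 PeV appearing elsewhere in the text; the conversion to the conventional g_aγγ normalization is the standard one and is assumed here):

```python
import math

# Sketch of the effective axion-photon coupling implied by the parametrization
# L ⊃ (c_gamma * alpha_EM / 8 pi) (a / f_a) F Fdual, for c1 = c2 = c3 = 1.
alpha_em = 1.0 / 137.036
c1, c2 = 1.0, 1.0
c_gamma = c2 + (5.0 / 3.0) * c1          # = 8/3 for this benchmark

f_a = 1e6                                # decay constant in GeV (1 PeV)
# Conventional coupling g_agamgam, assuming L = (g/4) a F Fdual normalization:
g_agamgam = c_gamma * alpha_em / (2.0 * math.pi * f_a)   # in GeV^-1
print(f"c_gamma = {c_gamma:.3f}, g_agamgam = {g_agamgam:.2e} GeV^-1")
```

For PeV decay constants the photon coupling is of order 10⁻⁹ GeV⁻¹, far below typical helioscope and beam-dump sensitivities, which is why the gluon coupling drives the phenomenology.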

Production cross section
The calculation of the production cross section for a GeV-scale axion, dominated by the gg → a process at leading order, is subtle. It samples the low-x regime and is subject to scale dependence from the running of α_3 as well as from the choices of factorization and renormalization scales. Without proper resummation, this is likely to give unphysical results in the low-mass regime. Here we carry out a one-jet matched rate calculation followed by detector simulation. In Fig. 4, we show the number of axions produced at the 13 TeV LHC with 3 ab⁻¹ of integrated luminosity as a function of the axion mass in a one-jet matched cross section calculation, for a fixed axion decay constant of 100 TeV. The blue, red, and green lines represent matching scale choices of 7.5 GeV, 15 GeV, and 30 GeV, respectively. The matched calculation of the inclusive and exclusive cross sections is relatively stable against the matching scale choice for axion masses above 5 GeV; below 5 GeV, the cross section decreases unphysically. Similar results were found in Ref. [42]. The dotted line shows the inclusive cross section, while the solid, dot-dashed, and dashed lines impose, respectively, a leading-jet minimum p_T cut of 30 GeV and minimum H_T cuts of 60 GeV and 100 GeV. Imposing the leading-jet p_T cut reduces the signal rate by one to two orders of magnitude, and the H_T cuts of 60 GeV and 100 GeV further reduce the rate by factors of ∼ 1.5 and ∼ 3.5, respectively. The shaded bands indicate the scale uncertainty for the matching scale choice of 7.5 GeV, both for the inclusive rate and for the rate with the 30 GeV leading-jet p_T cut; it is around ±25% over a broad range of axion masses.
The cross section calculation is carried out using MadGraph5_aMC@NLO [84] with a modified model file based on the heft model, with parton showering through Pythia8 [85,86] and a detector simulation of the jet clustering and H_T response through Delphes3 [87]. We also apply a rough estimate of the K-factor based on Ref. [41]. Details of the cross section calculation can be found in our public repository Axion@LHC.

Axion decays and lifetime
For m_a < 3m_π, the axion predominantly decays into diphotons via the coupling aF F̃ coming from Eq. (A1), with the rate

\Gamma(a \to \gamma\gamma) = \frac{c_\gamma^2\,\alpha_{\rm EM}^2\, m_a^3}{256\,\pi^3 f_a^2},

while the decays a → π⁰γ and a → ππ are forbidden by C and CP invariance, respectively. For m_a ≳ 2 GeV, the axion predominantly decays into hadrons. The corresponding rate can be calculated from the decay rate into gluons via the coupling aG G̃ in Eq. (A1),

\Gamma(a \to gg) = K_{\rm NLO}(m_a)\,\frac{c_3^2\,\alpha_3^2\, m_a^3}{32\,\pi^3 f_a^2},

where K_NLO(m_a) is the K-factor at next-to-leading order and is given by

K_{\rm NLO}(m_a) = 1 + \frac{\alpha_3(m_a)}{\pi}\left(\frac{97}{4} - \frac{7\, n_f}{6}\right),

with n_f the number of active quark flavors. The decay rate into gluons dominates over that into diphotons due to the smallness of the fine-structure constant α_EM in comparison with α_3, and due to the color factors of the former. For 3m_π < m_a ≲ 2 GeV, the axion decay patterns are more complex and can be understood from the mixing between the axion and the SM pseudoscalar mesons [44]. New search strategies, taking into account additional SM backgrounds such as those from kaons, need to be developed. In Fig. 5 we show the GeV axion partial decay widths into diphotons (red) and hadrons (blue) as a function of its mass. We choose the axion decay constant to be 1 PeV and show the corresponding inverse proper decay length on the right-hand y-axis. Here, in addition to the direct calculation above, we have followed Ref. [44] in its treatment of the hadronic axion decay modes for m_a ≲ 2 GeV; this treatment generates the non-trivial behavior below 2 GeV relative to the a → gg calculation. The seemingly abrupt transition near 2 GeV should be interpreted as a smooth crossover between the chiral perturbation theory treatment and the first-principles matrix-element calculation. The GeV axion mainly decays into hadrons, and the diphoton branching fraction is typically of order 10⁻³ to 10⁻⁴, assuming Wilson coefficients c_1 = c_2 = c_3. This property renders searches for axions decaying into diphotons rather insensitive to the signal, especially in the long-lived regime.
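The lifetime scales quoted in the text can be sketched numerically from these partial widths (a sketch under stated assumptions: the standard ALP width formulas above, a rough α_3(2 GeV) ≈ 0.3, n_f = 3, and the benchmark c_1 = c_2 = c_3 = 1, f_a = 1 PeV; this is not the chiral-Lagrangian treatment used below 2 GeV):

```python
import math

# Illustrative partial widths for a GeV-scale axion with f_a = 1 PeV.
# alpha3 and n_f are rough inputs, not fitted values.
HBARC_GEV_M = 1.973e-16   # hbar*c in GeV*m

def gamma_gg(m_a, f_a, c3=1.0, alpha3=0.3, nf=3):
    """a -> gg width in GeV, including the NLO K-factor quoted in the text."""
    k_nlo = 1.0 + (alpha3 / math.pi) * (97.0 / 4.0 - 7.0 * nf / 6.0)
    return k_nlo * (c3 * alpha3) ** 2 * m_a ** 3 / (32.0 * math.pi ** 3 * f_a ** 2)

def gamma_aa(m_a, f_a, c_gamma=8.0 / 3.0, alpha_em=1.0 / 137.036):
    """a -> gamma gamma width in GeV, with c_gamma = c2 + (5/3) c1 = 8/3."""
    return (c_gamma * alpha_em) ** 2 * m_a ** 3 / (256.0 * math.pi ** 3 * f_a ** 2)

m_a, f_a = 2.0, 1e6        # GeV; f_a = 1 PeV benchmark
total = gamma_gg(m_a, f_a) + gamma_aa(m_a, f_a)
ctau_m = HBARC_GEV_M / total                 # proper decay length in meters
br_gamgam = gamma_aa(m_a, f_a) / total
print(f"c*tau ~ {ctau_m:.2f} m, BR(a -> gamma gamma) ~ {br_gamgam:.1e}")
```

The output lands in the centimeter-to-meter range for cτ and in the 10⁻³-10⁻⁴ range for the diphoton branching fraction, consistent with the LLP regime and the suppressed diphoton signal discussed in the text.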

Appendix B: Current Search Coverage of GeV Axions
LHCb diphoton search-Using 80 pb⁻¹ of diphoton data, Ref. [42] sets a bound on σ(pp → Xa(γγ)) for 4.9 GeV < m_γγ < 6.3 GeV. A projection for the HL phase with 300 fb⁻¹ is also given for an extended mass range reaching down to m_γγ = 3 GeV.

There are also many proposals and auxiliary experiments at the LHC that can be sensitive to sub-GeV axions. For instance, FASER, an ultra-forward detector at the LHC, would cover axions below a GeV [52,54]. More details and comparisons of the future proposals can be found in Refs. [43,80]. Ref. [80] has proposed that the CODEX-b [51] and MATHUSLA [49] detectors can also probe some of the long-lived regime of the parameter space.
BaBar and Belle-II-BaBar carried out a search [82] for Υ(2S, 3S) → γa, with a decaying hadronically, for the mass range 2m_π < m_a < 7 GeV. Ref. [42] recast this for the axion model under consideration, which gives f_a ≳ 3.5 GeV for c_1 = c_2 = c_3 = 1. A Belle-II projection was also given for the same search, assuming an increase in the number of produced Υ(3S) by a factor of 100 compared to BaBar and statistical uncertainties only, which yields a projected sensitivity of f_a ≳ 10 GeV. Assuming a future sensitivity of BR(B → K(*) a(γγ)) = 10⁻⁸ at future B-factories, we recast the bound of Ref. [32] for our benchmark point c_1 = c_2 = c_3 = 1. The resulting sensitivities are shown in magenta (B → K* a) and purple (B → Ka) in the lower panel of Fig. 3. The bounds from B± → K± a(ηπ⁺π⁻) and other hadronic decay modes of the ALP, as discussed in [44], cover only a small part of the parameter space in the lower panel of Fig. 3 and are not shown. We also do not show bounds from other flavor searches [34,[36][37][38]88], as they do not cover a significant part of the parameter space in Fig. 3. The most recent proposal at Belle II searches for invisible axions and for the photonic decays of visible axions radiated off photons from the decaying mesons [50], covering the sub-GeV region. Since at or beyond a GeV the hadronic decays of the axion dominate, new studies of these modes would be very promising, given the clean environment of the B-factories. For a recent study on the use of LHCb to probe signatures of hidden valleys, see Ref. [89].
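The quoted Belle-II projection follows from a simple scaling argument, sketched below (illustrative arithmetic, not the recast itself): the Υ → γa signal rate scales as 1/f_a², and with 100× more Υ(3S) and statistics-only uncertainties the branching-ratio limit improves by √100 = 10.

```python
import math

# Back-of-envelope scaling for the Belle-II projection quoted in the text.
f_a_babar = 3.5                      # BaBar recast sensitivity from the text
n_scale = 100.0                      # assumed increase in produced Upsilon(3S)
br_improvement = math.sqrt(n_scale)  # statistics-only limit scaling
# Signal BR ~ 1/f_a^2, so the f_a reach improves by sqrt(br_improvement):
f_a_belle2 = f_a_babar * math.sqrt(br_improvement)
print(f"projected Belle-II reach ~ {f_a_belle2:.1f}")
```

The result, ≈ 11, is consistent with the f_a ≳ 10 projection quoted above.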
Diphoton searches from the LHC and Tevatron-Using the measured diphoton cross sections at the LHC [90][91][92], Ref. [41] sets a bound on σ(pp → Xa(γγ)) and projects sensitivities at the HL-LHC. We show the associated limits (solid) and projections (dashed) for c_1 = c_2 = c_3 = 1 in brown in the lower panel of Fig. 3.
Appendix C: Analysis Details

Fig. 6 shows the selection efficiency as a function of axion lifetime for different axion masses. The typical efficiency of our proposed search is ∼ 10⁻¹ for 4 GeV and 10 GeV axions. For lighter masses, e.g., the 2 GeV case shown in the blue curve, the maximum efficiency is ∼ 10⁻². For heavier axions, the lifetime at which the efficiency peaks shifts towards higher values because of their smaller boost factor. To understand why lower masses have lower efficiencies, we show the effects of the various cuts in Fig. 7. Fig. 7(a) shows the number of hadronic objects produced in the axion decay: heavier axions produce more tracks on average, so it is much easier for a heavier axion to pass our requirement of three displaced tracks. Fig. 7(b) shows the H_T distribution of jets in the entire event. The requirement of an ISR jet with p_T > 30 GeV at the generation level makes the resulting H_T distribution largely independent of the comparatively smaller axion mass; this requirement also causes the H_T distribution to peak around ∼ 200 GeV. Note that whereas we rescaled the cross section using the reconstructed H_T in Fig. 4, the H_T variable used in Fig. 7(b) is not the reconstructed one, but rather the sum of all hadronic activity. Fig. 7(c) shows the p_T distribution of tracks coming from the axion decay. Since heavier axions produce more tracks, these are also somewhat softer on average than tracks from lighter axions; the effect of the p_T > 2 GeV requirement is noticeable but not very significant. Figs. 7(d) and 7(e) show the η and impact parameter distributions of the tracks, respectively; the generation-level ISR-jet requirement again makes the results mostly independent of the axion mass. Lastly, Fig. 7(f), showing the distribution of the decay location d_T in the transverse plane, confirms that lighter axions are more boosted on average than heavier ones, as expected.
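The interplay between boost and decay position described above can be illustrated with a toy Monte Carlo (the detector geometry, axion momentum, and sample size here are illustrative assumptions, not the simulation of this work):

```python
import random

# Toy Monte Carlo: sample lab-frame decay lengths d_T ~ Exp(mean = beta*gamma*c*tau)
# and count decays inside a nominal displaced fiducial region, illustrating how
# lighter (more boosted) axions decay farther for the same proper lifetime.
random.seed(1)

def efficiency(m_a, ctau_m, p_gev=100.0, r_min=0.01, r_max=1.0, n=20000):
    boost = p_gev / m_a                 # beta*gamma in the limit p >> m
    mean_flight = boost * ctau_m        # mean lab-frame decay length in meters
    hits = 0
    for _ in range(n):
        d_t = random.expovariate(1.0 / mean_flight)
        if r_min < d_t < r_max:         # decay inside the fiducial region
            hits += 1
    return hits / n

# At a fixed proper decay length, the heavier (less boosted) axion decays
# closer to the beamline, so its geometric acceptance peaks at larger c*tau.
eff_light = efficiency(m_a=2.0, ctau_m=0.05)
eff_heavy = efficiency(m_a=10.0, ctau_m=0.05)
print(eff_light, eff_heavy)
```

For this choice of cτ, the heavier axion's mean flight length matches the fiducial region better, reproducing the shift of the peak-efficiency lifetime with mass.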

Appendix D: Vertexing efficiencies and correlations
We show a few event displays of the fake-track vertexing in the transverse plane in Fig. 9. The four panels represent four random instances out of the one million fake-track background events simulated, following our assumptions about fake-track behavior discussed in the main text. The axes of these displays extend to the edge of the tracking volume. To obtain a more quantitative understanding of the 2D-vertex reconstruction selection, we show the distribution of the three-track vertices in the ∆d_T-d_T plane in Fig. 8. The shaded region is excluded by our Cut1 and Cut3, while Cut2 removes points with ∆d_T > 1 cm. The fake tracks, due to the smallness of the impact parameter (d_0 < 15 cm) and the high-p_T requirement of p_T > 2 GeV, do have a sizable probability of forming a displaced vertex with small uncertainty in the fitted vertex location: the three-track vertex probability with ∆d_T < 1 cm is 8.2%. The 2D-cut efficiency is dominated by Cut2, which enforces the quality of the 2D vertex fit; Cut1 and Cut3 are chosen to ensure that the selection also effectively vetoes SM prompt backgrounds (displaced only by smearing) and mildly displaced backgrounds.
We further describe the correlations between the various cuts considered in this study for the fake-track vertex background. Given computational limitations, we studied one million vertices formed by the fake tracks. A subset of the selection cuts employed in this paper already ensures that none of the simulated fake vertices pass. However, the HL-LHC requires a much stronger suppression factor than 10⁻⁶; hence, a study of the independence of the selection cuts, and of whether their suppression factors can be multiplied, is needed. In Table I, we show the individual cut efficiencies and the correlations between the different cuts employed in this study. The correlations are defined as

\rho_{AB} \equiv \frac{\epsilon_{AB}}{\epsilon_A\,\epsilon_B} - 1, \quad (D1)

where ε_A (ε_B) is the cut efficiency of applying a set A (a set B) of cuts on the background, and ε_AB is the cut efficiency of applying the joint set of cuts A ∩ B. In this study, we simulated one million fake-track background events and performed vertex reconstruction on them. Because of the limited background sample, it is important to check the independence of the cuts when attempting to obtain a combined cut efficiency smaller than the inverse of the number of simulated background events. If ρ_AB = 0, the two sets of cuts are completely independent. For ρ_AB > 0, the two sets of cuts are correlated: directly multiplying ε_A and ε_B under-estimates the combined cut efficiency, i.e. over-estimates the background suppression. Similarly, for ρ_AB < 0, the two sets of cuts are anti-correlated: directly multiplying ε_A and ε_B over-estimates the combined cut efficiency, giving a conservative estimate of the suppression. In the first row of the table, we label the abbreviated cut selections, and the second row reports the efficiency of each individual cut; the remaining entries give the correlation factor ρ_AB for the given sets of cuts.

TABLE I. Cut efficiencies on the fake-track backgrounds and their correlations.
The row of efficiencies ε_x lists the cut efficiency for each individual cut (columns 1-9) and for the sets of cuts in the remaining columns to the right (1+2, 123, 4+5, 6+7, 8+9, 4567, 1245, 1289, 124567); we abbreviate ×10⁻ʸ by a superscript −y. Rows 3-11 show the correlations between individual cuts and cut sets, defined as in Eq. (D1). Positive (negative) numbers indicate correlated (anti-correlated) cuts. Entries with no correlation are omitted, and "-" represents entries where we run out of statistics.

We can see from Table I several important correlations used to derive the overall efficiency. First, although Cut3 alone provides a 2.3×10⁻¹ reduction of the fake-track background, it is highly correlated with Cut2 and improves the suppression only marginally, from 8.4% to 8.2%. Cut3 is intended mainly to reject SM QCD background that is accidentally displaced by finite detector resolution effects, and we therefore ignore it in computing the overall background suppression of the cut chain. The remaining vertex-reconstruction cuts in general have sizable anti-correlations, with at most mild positive correlations: +0.3 between Cut4 and Cut5, and +0.7 between Cut7 and Cut8. The correlation between Cut4 and Cut5 reflects the consistency of the 4D vertexing information in the z-direction and can be eliminated by taking the combination Cut4+Cut5. The correlation between Cut7 and Cut8 is offset by the anti-correlation of −0.7 between Cut7 and Cut9, resulting in an anti-correlation of −0.2 between Cut7 and the combination of Cut8 and Cut9. As the right-hand side of the table shows for the correlations between combinations of cuts, although some entries are "-" due to lack of statistics, it is not hard to infer that many cut combinations are uncorrelated, which allows us to derive the overall cut efficiency.
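The correlation measure of Eq. (D1) can be sketched in a few lines on synthetic pass/fail masks (toy data, not the fake-track sample of this work), illustrating that cuts acting on the same variable give ρ_AB > 0 while independent cuts give ρ_AB ≈ 0:

```python
import random

# Minimal sketch of the cut-correlation measure rho_AB = eps_AB/(eps_A*eps_B) - 1
# evaluated on synthetic boolean pass/fail masks.
random.seed(7)

def rho(mask_a, mask_b):
    n = len(mask_a)
    eps_a = sum(mask_a) / n
    eps_b = sum(mask_b) / n
    eps_ab = sum(a and b for a, b in zip(mask_a, mask_b)) / n
    return eps_ab / (eps_a * eps_b) - 1.0

n = 200_000
x = [random.random() for _ in range(n)]
indep_a = [random.random() < 0.3 for _ in range(n)]   # independent cuts
indep_b = [random.random() < 0.4 for _ in range(n)]
corr_a = [xi < 0.3 for xi in x]                       # both cut on the same variable
corr_b = [xi < 0.4 for xi in x]

print(f"independent: rho = {rho(indep_a, indep_b):+.3f}")   # ~ 0
print(f"correlated:  rho = {rho(corr_a, corr_b):+.3f}")     # > 0
```

In the correlated case eps_AB equals the tighter of the two efficiencies, so the naive product under-estimates the combined efficiency, exactly the effect the table is designed to diagnose.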
In this study, we take the cut combination Cut1+Cut2+Cut8+Cut9, with a background suppression of 1.8 × 10⁻⁵, and Cut4+Cut5+Cut6+Cut7, with a background suppression of 1.6 × 10⁻⁴, to reach an overall estimated background suppression of 2.9 × 10⁻⁹. Note that this choice is still conservative: Cut1+Cut2+Cut8+Cut9 mildly anti-correlates with Cut5 and Cut7, Cut4 shows a strong anti-correlation of −1.0 with Cut8+Cut9 and a mild anti-correlation of −0.1 with Cut1+Cut2, and only Cut6 has a comparatively small positive correlation of +0.4 with Cut8+Cut9. While we have considered 2D-4D vertexing here, a full 4D fit would undoubtedly make our result more robust while providing higher background suppression.
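As a sanity check on the quoted numbers, multiplying the two suppression factors (treating the two cut sets as uncorrelated, which the anti-correlations in Table I suggest is conservative) reproduces the overall estimate:

```python
# Arithmetic check of the combined background suppression quoted in the text.
eps_1289 = 1.8e-5    # Cut1+Cut2+Cut8+Cut9
eps_4567 = 1.6e-4    # Cut4+Cut5+Cut6+Cut7
combined = eps_1289 * eps_4567
print(f"combined suppression ~ {combined:.1e}")   # ~ 2.9e-09
```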