ALPs at FASER: The LHC as a Photon Beam Dump

The goal of FASER, ForwArd Search ExpeRiment at the LHC, is to discover light, weakly-interacting particles with a small and inexpensive detector placed in the far-forward region of ATLAS or CMS. A promising location in an unused service tunnel 480 m downstream of the ATLAS interaction point (IP) has been identified. Previous studies have found that FASER has significant discovery potential for new particles produced at the IP, including dark photons, dark Higgs bosons, and heavy neutral leptons. In this study, we explore a qualitatively different, `beam dump' capability of FASER, in which the new particles are produced not at the IP, but through collisions in detector elements further downstream. In particular, we consider the discovery prospects for axion-like particles (ALPs) that couple to the standard model through the $a \gamma \gamma$ interaction. TeV-scale photons produced at the IP collide with the TAN neutral particle absorber 130 m downstream, producing ALPs through the Primakoff process, and the ALPs then decay to two photons in FASER. We show that FASER can discover ALPs with masses $m_a \sim 30 - 400~\text{MeV}$ and couplings $g_{a\gamma\gamma} \sim 10^{-6} - 10^{-3}~\text{GeV}^{-1}$, and we discuss the ALP signal characteristics and detector requirements.


I. INTRODUCTION
FIG. 1. [...] the IP with small angles $\theta_\gamma$ relative to the beamline. These photons then collide with the neutral particle absorber (TAN or TAXN), producing ALPs $a$ at similarly small angles $\theta_a$ relative to the beamline. Note the extreme difference in horizontal and vertical scales. Lower right panel: The ALPs then travel $\sim 350$ m further downstream and decay through $a \to \gamma\gamma$ to two highly collinear, high-energy photons in FASER, which is located in the side tunnel TI18 close to the UJ18 hall.
[...] generated between ALPs and SM gauge bosons, $(1/f)\, a V_{\mu\nu} \tilde{V}^{\mu\nu}$ [28]. As in the case of the QCD axion, a shift symmetry $a \to a + c$ can naturally keep the ALP mass low. On the other hand, for generalized ALPs, one typically introduces a small mass term $\frac{1}{2} m_a^2 a^2$ that softly breaks the shift symmetry and allows $m_a$ to be an independent parameter of the model. Low-mass ALPs with suppressed couplings to the SM are then long-lived particles (LLPs) that can be sensitively probed by FASER.
An interesting possibility, which leads to a phenomenology that is qualitatively different from the models considered in our previous studies, is that ALPs are predominantly coupled to two photons [29]. In this case, ALPs can be produced in $pp$ collisions at the IP through, e.g., photon fusion or rare decays of neutral pions. However, as we show below, for high-energy forward-going ALPs that can reach FASER, the dominant production process is one in which photons produced at the IP collide with elements of the LHC infrastructure $\sim 130$ m downstream, producing ALPs through the Primakoff process γN → aN X [30,31]. The ALPs then travel another $\sim 350$ m and decay to two photons in FASER. This process, through which the LHC can be thought of as a high-energy photon beam dump experiment, is depicted in Fig. 1.
In this study, we evaluate the prospects for FASER to discover ALPs that are produced through their di-photon coupling and decay through a → γγ. This work is structured as follows. In Sec. II we review the basic properties of ALPs and their di-photon coupling. In Sec. III we discuss ALP production and decay. The discovery reach of FASER for ALPs is presented in Sec. IV. In Sec. V, we discuss the detector requirements for detecting the ALP signal in FASER. Our conclusions are collected in Sec. VI. Details of the kinematics of the Primakoff process, ALP production in rare meson decays, and the angular acceptance function for FASER are given in Appendices A, B, and C, respectively.

II. PROPERTIES OF AXION-LIKE PARTICLES
We consider a low-energy effective theory in which an ALP $a$ couples to vector bosons through the dimension-5 interactions
$$\mathcal{L} \supset -\frac{1}{4} g_{aBB}\, a B^{\mu\nu} \tilde{B}_{\mu\nu} - \frac{1}{4} g_{aWW}\, a W^{A\mu\nu} \tilde{W}^{A}_{\mu\nu} , \tag{1}$$
where $B_{\mu\nu}$ and $W^A_{\mu\nu}$ are the U(1)$_Y$ and SU(2)$_L$ field strength tensors, respectively, and $g_{aBB}$ and $g_{aWW}$ are the corresponding coupling constants with dimension GeV$^{-1}$. If such interactions are generated by physics with coupling $\alpha$ that was integrated out at some heavy scale $f$, one expects
$$g_{aBB},\, g_{aWW} \sim \frac{\alpha}{2\pi f} . \tag{2}$$
This is the case for axions [23-26], for example, and more generally for pseudo-Goldstone bosons with non-vanishing axial anomalies.

After electroweak symmetry breaking, the couplings $g_{aBB}$ and $g_{aWW}$ induce couplings of the ALP to $\gamma\gamma$, $\gamma Z$, $ZZ$, and $W^+W^-$. In this study, we focus on the $\gamma\gamma$ coupling and neglect the others. The other couplings typically have a small effect on our signal; for example, the subdominant production process of ALPs from meson decays can be enhanced by $W^+W^-$ couplings [32]. More important are their effects on other observables. For example, a non-vanishing $\gamma Z$ coupling induces the exotic decay $Z \to a\gamma$. Although this process does not contribute significantly to the ALP production rate in the far forward region, it can be searched for in high-$p_T$ experiments at the LHC. (See Refs. [27,33] for some future projections.)

ALPs may also couple through dimension-5 operators to gluons and fermions, as well as through dimension-6 couplings to the Higgs boson. For a recent review see, e.g., Ref. [27]. In this study, we will assume that the effects of these other couplings on our signal processes are negligible. This is the case when these couplings are relatively small, or, for example, when the couplings are to heavy particles, such as third-generation fermions, and so their impact on ALP production and decay is suppressed.
It is important to note, however, that gluon and fermion couplings generate di-photon couplings through loops and vice versa, and so to analyze a specific underlying ALP model in detail, one would in general have to include all of these couplings in a unified way. Here, we take a more model-independent, phenomenological approach.
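With these simplifying assumptions, we therefore focus on the ALP effective Lagrangian
$$\mathcal{L} \supset \frac{1}{2} \partial^\mu a\, \partial_\mu a - \frac{1}{2} m_a^2 a^2 - \frac{1}{4} g_{a\gamma\gamma}\, a F^{\mu\nu} \tilde{F}_{\mu\nu} , \tag{3}$$
where $F_{\mu\nu}$ is the field strength tensor of electromagnetism. The resulting parameter space is very simple, as it is spanned by two parameters: the ALP mass $m_a$ and the di-photon coupling $g_{a\gamma\gamma}$.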
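With this Lagrangian, the ALP decay width is
$$\Gamma_a = \frac{g_{a\gamma\gamma}^2 m_a^3}{64\pi} . \tag{4}$$
The cubic dependence on $m_a$, resulting from the fact that the decay is mediated by a dimension-5 operator, implies that, as the ALP mass decreases, the ALP lifetime increases rapidly. The ALP decay length is
$$\bar{d}_a = \gamma_a \beta_a c \tau_a = \frac{p_a}{m_a} \frac{\hbar c}{\Gamma_a} \approx 630~\text{m} \left[ \frac{10^{-4}~\text{GeV}^{-1}}{g_{a\gamma\gamma}} \right]^2 \left[ \frac{p_a}{\text{TeV}} \right] \left[ \frac{50~\text{MeV}}{m_a} \right]^4 , \tag{5}$$
where we have normalized to currently viable values of $g_{a\gamma\gamma}$ and $m_a$. For these values and ALP momenta $p_a \sim$ TeV, the ALP decay length is naturally hundreds of meters, i.e., in the range relevant for FASER searches for LLPs.

Although the ALP decays primarily into pairs of photons, it is possible that one of the photons converts into an electron-positron pair, leading to the decay $a \to e^+ e^- \gamma$. The differential branching fraction for this decay is [34]
$$\frac{dB(a \to e^+ e^- \gamma)}{dq^2} = \frac{2\alpha}{3\pi q^2} \left( 1 - \frac{q^2}{m_a^2} \right)^3 \sqrt{1 - \frac{4 m_e^2}{q^2}} \left( 1 + \frac{2 m_e^2}{q^2} \right) |F(q^2)|^2 , \tag{6}$$
where $q^2$ is the squared invariant mass of the electron-positron pair, and $F(q^2) \approx 1$. For ALP masses between $m_a = 10$ MeV and 1 GeV, the branching fraction ranges from $B(a \to e^+ e^- \gamma) = 0.4\%$ to $1.7\%$. Note that this branching fraction peaks at low $q^2$, implying that most of the ALP energy will be carried by the photon, while the electron and positron will typically be softer.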
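As a rough numerical illustration of the lifetime scaling discussed above, the following sketch evaluates the decay length for a given coupling, mass, and momentum. It assumes the standard di-photon width $\Gamma_a = g_{a\gamma\gamma}^2 m_a^3 / (64\pi)$ for a coupling normalized as $-\frac{1}{4} g_{a\gamma\gamma}\, a F \tilde{F}$; the function names and the benchmark point are illustrative choices, not part of the analysis.

```python
import math

HBAR_C_GEV_M = 1.973269804e-16  # hbar * c in GeV * m


def alp_width_gev(g_agg, m_a):
    """Di-photon width in GeV (assumed form: g^2 m^3 / (64 pi)).

    g_agg is the coupling in GeV^-1, m_a the ALP mass in GeV.
    """
    return g_agg**2 * m_a**3 / (64.0 * math.pi)


def alp_decay_length_m(g_agg, m_a, p_a):
    """Lab-frame decay length in metres: (p_a / m_a) * hbar c / Gamma."""
    return (p_a / m_a) * HBAR_C_GEV_M / alp_width_gev(g_agg, m_a)


if __name__ == "__main__":
    # Illustrative benchmark: g = 1e-4 GeV^-1, m_a = 50 MeV, p_a = 1 TeV
    print(f"{alp_decay_length_m(1e-4, 0.05, 1000.0):.0f} m")
```

For the benchmark shown, the result is a few hundred meters, and the $g_{a\gamma\gamma}^{-2} m_a^{-4} p_a$ scaling can be checked by varying one argument at a time.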

A. Mechanisms for ALP Production in the Forward Region
In the dominant ALP-photon coupling scenario, ALPs can be produced in any process involving photons by radiating an ALP off a photon line. However, for FASER, we are primarily interested in the production of highly energetic ALPs in the very forward region. The dominant production mechanism is then the Primakoff process, in which a photon converts into an ALP when colliding with a nucleus. This can happen when photons produced at the LHC collide with the forward LHC infrastructure, as illustrated in Fig. 1. The corresponding Feynman diagram is shown in the left panel of Fig. 2.
Photons are produced at the IP mainly through $\pi^0$ decay. They then propagate in the beam pipe until they hit the material of the LHC accelerator, as illustrated in the bottom left panel of Fig. 1. Very forward photons collide with the neutral particle absorber, which is designed to protect the magnets behind it. In the current LHC, this is the TAN, a $\sim 3.5$ m thick metal block placed along the beam collision axis at a distance of 140 m from the IP. At the high-luminosity LHC (HL-LHC) this absorber is planned to be upgraded to the TAXN and shifted to a new position about 130 m away from the IP [35]. In the following we will use the details of the upgraded absorber TAXN. The reach of FASER is only mildly sensitive to the precise properties and location of the absorber.

Very forward ALPs may also arise from the exotic decays of light mesons, shown in the right panel of Fig. 2, which are abundantly produced at the IP with very high forward-going momenta. However, as we discuss in more detail in Appendix B, such rare meson decays typically give a subdominant contribution to the FASER signal relative to the Primakoff process. In the rest of this section we therefore focus primarily on the Primakoff process. When presenting FASER's sensitivity reach in Sec. IV, however, we include the dominant exotic meson decays, $\pi^0 \to a\gamma\gamma$ and $\eta \to a\gamma\gamma$.
Last, we note that ALPs may also be produced at the LHC through other processes, e.g., through exotic $Z$ decays or photon fusion. As with rare meson decays, however, these processes do not typically produce large numbers of boosted forward-going ALPs, and they are therefore subdominant contributions to the FASER signal.

B. Primakoff Process in the TAXN
Forward high-energy photons that eventually hit the TAXN are copiously produced in $pp$ collisions at the IP, primarily in meson decays. To estimate the FASER event yield of ALPs produced by such photons, a reliable estimate of the forward photon spectrum is required. In the left panel of Fig. 3 we show the estimated photon spectrum in the $(\theta_\gamma, p_\gamma)$ plane, where $\theta_\gamma$ and $p_\gamma$ are the photon's angle with respect to the beam axis and its momentum, respectively. The spectrum was simulated using the CRMC package [38], applying the EPOS-LHC model [39]. This includes photons produced in the decays of all light mesons. The dominant contribution comes from the decays $\pi^0, \eta \to \gamma\gamma$; decays of heavier mesons provide only a small correction. As can be seen, in the log-log plot the events cluster around the line defined by $p_\gamma \theta_\gamma \approx p_T = \Lambda_{\rm QCD} \approx 0.25~\text{GeV}$. This is indicative of the characteristic momentum transfer scale for light meson production. As discussed in Ref. [10], the results are consistent with other Monte Carlo simulations, such as QGSJET-II-04 [40] and SIBYLL 2.3 [41,42], indicating a good understanding of the forward photon spectrum at the LHC. This is not surprising, since all three of the simulations have been tuned to the LHC data collected by the LHCf Collaboration [43,44].
As noted above, we assume that the TAXN will be located at a distance $L_{\rm TAXN} = 130$ m from the IP [35]. Photons produced at the IP may collide with the TAXN at transverse distances up to the radius $R_{\rm TAXN} = 12.5$ cm from the beam collision axis, or line of sight (LOS). Within a transverse distance of $R_{\rm TAXN}$ from the beamline, the TAXN has two holes to let the beams through. Following Ref. [45], and as illustrated in the bottom left panel of Fig. 1, we assume these holes are circles with radii 4.25 cm and center-to-center separation 14.8 cm. We take this into account when implementing the TAXN geometry and estimating the number of photon-ALP conversions in the TAXN.

FIG. 3. [...] and ALPs decaying at FASER's position within the range $(L_{\min}, L_{\max})$ (right) in the $(\theta, p)$ plane, where $\theta$ is the particle's angle with respect to the beam axis, and $p$ is the particle's momentum. We assume the HL-LHC integrated luminosity 3 ab$^{-1}$ and a TAXN radius of $R_{\rm TAXN} = 12.5$ cm, which implies that ALPs produced at the TAXN typically have $\theta_a < R_{\rm TAXN}/L_{\rm TAXN} \approx 1$ mrad.
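The TAXN geometry described above can be sketched as a simple hit test. The dimensions are those quoted in the text; the placement of the two holes symmetrically about the LOS along the horizontal axis is an assumption made here for illustration, as are the function and constant names.

```python
import math

L_TAXN = 130.0     # distance from the IP to the TAXN [m]
R_TAXN = 0.125     # outer radius of the absorber around the LOS [m]
R_HOLE = 0.0425    # radius of each beam hole [m]
HOLE_SEP = 0.148   # center-to-center separation of the holes [m]


def hits_taxn(theta, phi):
    """Return True if a photon leaving the IP at polar angle theta [rad]
    and azimuth phi [rad] strikes absorber material.

    Assumption: the two beam holes sit at x = +-HOLE_SEP / 2, y = 0,
    i.e. symmetrically about the LOS.
    """
    r = L_TAXN * math.tan(theta)
    if r >= R_TAXN:
        return False  # misses the absorber entirely
    x, y = r * math.cos(phi), r * math.sin(phi)
    for x_hole in (-HOLE_SEP / 2.0, HOLE_SEP / 2.0):
        if math.hypot(x - x_hole, y) < R_HOLE:
            return False  # passes through a beam hole
    return True


if __name__ == "__main__":
    print("max polar angle [mrad]:", 1e3 * R_TAXN / L_TAXN)
    print("on-axis photon hits material:", hits_taxn(0.0, 0.0))
```

With this layout a photon exactly on the LOS hits material, since the hole edges lie 3.15 cm from the axis, while a photon aimed at a hole center passes through.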
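The differential cross section for the Primakoff process is [31]
$$\frac{d\sigma}{d\theta_{a\gamma}} = \frac{1}{4} g_{a\gamma\gamma}^2\, \alpha Z^2 F(t)^2\, \frac{p_a^4 \sin^3\theta_{a\gamma}}{t^2} , \tag{7}$$
where $\theta_{a\gamma}$ is the lab-frame ALP-photon opening angle, $p_a$ is the lab-frame ALP momentum, $t = -q^2 = -(p_a - p_\gamma)^2$ is the squared momentum transfer, and $Z$ and $F(t)$ are the target atomic number and form factor, respectively. We use the elastic form factors of the atom and the coherent one for the nucleus, and we checked that contributions from inelastic and incoherent processes can be safely neglected in our case. Following Refs. [46,47], we parametrize the form factors as
$$F_{\rm atom}(t) = \frac{a^2 t}{1 + a^2 t} , \qquad F_{\rm nuc}(t) = \frac{1}{1 + t/d} , \tag{8}$$
where $a = 111\, Z_{\rm nuc}^{-1/3} / m_e$, $d = 0.164\, A_{\rm nuc}^{-2/3}~\text{GeV}^2$, $Z_{\rm nuc}$ and $A_{\rm nuc}$ are the atomic and mass numbers for the nucleus, and $m_e = 511$ keV is the electron mass.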
The precise value of the cross section depends on the target material. Although the main inner absorber of the TAXN will be made of copper, it will additionally be surrounded by steel outer shielding [48]. For simplicity, we use the atomic and mass numbers for iron, $Z_{\rm nuc} = 26$, $A_{\rm nuc} = 56$, when evaluating Eq. (7). This approximation is further justified by the fact that, to a good approximation, the dependence on form factors cancels out in the ratio between the Primakoff and pair-production cross sections, which is what ultimately determines the rate of ALP production.
The Primakoff production process competes with the other photon-matter interactions. At photon energies higher than $\sim$ MeV, photon conversion to an $e^+e^-$ pair in the nuclear fields dominates over all other processes. The relevant pair-production cross section in iron for the photon energies of interest is of the order of $\sigma_{\rm conv} \simeq 5$ barn [47]. The probability of a photon to convert into an ALP, $P_{\gamma \to a}$, is then given by
$$P_{\gamma \to a} \simeq \frac{\sigma_{\rm Prim}}{\sigma_{\rm conv}} . \tag{9}$$
To be conservative, we neglect the scatterings of secondary photons produced in the electromagnetic showers inside the TAXN, which could also produce ALPs. An accurate modeling of this contribution would require dedicated simulation tools, e.g., FLUKA [49] or Geant4 [50], to study shower development in the TAXN, taking into account its precise geometry and composition. This is beyond the scope of the current analysis, but we note that such secondary photons will have lower energies than those produced at the IP, and they will also be less collimated along the LOS. We therefore do not expect secondary photons to drastically improve the reach of FASER.
Importantly, the nuclear form factor typically suppresses large momentum transfers between the projectile photon and the target nucleus; that is, the target nucleus does not recoil much. As a result, ALPs tend to carry most of the photon initial momentum and follow the direction of the incoming photon. Consequently, the angles of the ALP and parent photon relative to the LOS are very similar, and $\theta_a \approx \theta_\gamma < R_{\rm TAXN}/L_{\rm TAXN} \approx 1$ mrad. This can be seen in the central panel of Fig. 3. Only a few ALPs, mostly at lower energies, are produced in processes with large enough momentum transfers to produce larger $\theta_a$. Also, ALPs produced by photons that collide with other parts of the infrastructure besides the TAXN do not typically end up in FASER. As a result, to a good approximation, the number of ALPs going towards FASER is given simply by rescaling the number of photons incident on the TAXN by the integrated probability of the Primakoff process to occur, $\sigma_{\rm Prim}/\sigma_{\rm conv}$, and their resulting momenta are determined by the collinear approximation $p_a = p_\gamma$. A more detailed discussion of the kinematics of the Primakoff process is given in Appendix A.
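The rescaling by $\sigma_{\rm Prim}/\sigma_{\rm conv}$ can be illustrated with a minimal numerical sketch. It rests on several assumptions that are not taken from the analysis itself: the small-recoil Primakoff form $d\sigma/d\theta = \frac{1}{4} g_{a\gamma\gamma}^2 \alpha Z^2 F(t)^2 p_a^4 \sin^3\theta / t^2$, a form factor given by the product of an atomic screening factor $a^2 t/(1 + a^2 t)$ and a nuclear factor $1/(1 + t/d)$ with $a = 111\, Z^{-1/3}/m_e$ and $d = 0.164\, A^{-2/3}~\text{GeV}^2$, a recoilless target so that $E_a = E_\gamma$, and a constant $\sigma_{\rm conv} = 5$ barn. All names are illustrative.

```python
import math

ALPHA_EM = 1.0 / 137.036
M_E = 0.511e-3              # electron mass [GeV]
GEV_M2_TO_BARN = 3.894e-4   # 1 GeV^-2 expressed in barn
SIGMA_CONV_BARN = 5.0       # pair-production cross section in iron [barn]


def form_factor(t, z_nuc=26, a_nuc=56):
    """Assumed product of atomic-screening and nuclear form factors."""
    a = 111.0 * z_nuc ** (-1.0 / 3.0) / M_E   # [GeV^-1]
    d = 0.164 * a_nuc ** (-2.0 / 3.0)         # [GeV^2]
    return (a * a * t / (1.0 + a * a * t)) / (1.0 + t / d)


def primakoff_xsec_barn(g_agg, m_a, e_gamma, z_nuc=26, a_nuc=56, n_steps=4000):
    """Integrate d(sigma)/d(theta) over theta on a logarithmic grid."""
    p_a = math.sqrt(e_gamma**2 - m_a**2)       # recoilless target: E_a = E_gamma
    dp = m_a**2 / (e_gamma + p_a)              # E_gamma - p_a, numerically stable
    prefactor = 0.25 * g_agg**2 * ALPHA_EM * z_nuc**2
    ln_lo, ln_hi = math.log(1e-10), math.log(0.1)
    h = (ln_hi - ln_lo) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        theta = math.exp(ln_lo + i * h)
        # t = |p_gamma - p_a|^2 for three-momenta at opening angle theta
        t = dp**2 + 4.0 * e_gamma * p_a * math.sin(0.5 * theta) ** 2
        f = form_factor(t, z_nuc, a_nuc)
        dsigma = prefactor * f * f * p_a**4 * math.sin(theta) ** 3 / t**2
        weight = 0.5 if i in (0, n_steps) else 1.0   # trapezoid rule in ln(theta)
        total += weight * dsigma * theta * h
    return total * GEV_M2_TO_BARN


def conversion_probability(g_agg, m_a, e_gamma):
    """Probability for a photon to convert to an ALP: sigma_Prim / sigma_conv."""
    return primakoff_xsec_barn(g_agg, m_a, e_gamma) / SIGMA_CONV_BARN


if __name__ == "__main__":
    print(conversion_probability(1e-4, 0.1, 1000.0))
```

In the collinear approximation, the ALP spectrum is then obtained by weighting each photon that strikes TAXN material by this probability and setting $p_a = p_\gamma$.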

C. ALP Decays in FASER
Once produced, the ALPs decay into two photons after traveling a distance $\sim \bar{d}_a$. As can be seen in Eq. (5), for typical ALP momenta $p_a \sim$ TeV, ALP masses $m_a \sim 50$ MeV, and coupling constants $g_{a\gamma\gamma} \sim 10^{-4}~\text{GeV}^{-1}$, $\bar{d}_a$ is of the order of a few hundred meters, motivating a search for ALPs at FASER.
Following Refs. [10-12], we assume that FASER has a cylindrical detector volume of radius $R$, which is concentric with the beam collision axis, and has a depth $\Delta = L_{\max} - L_{\min}$, where $L_{\max}$ and $L_{\min}$ are the distances of the far and near edges of the detector to the IP. We will show results for the detector parameters of Eq. (10). For the parameters of interest, the event rate is linearly proportional to $\Delta$. For reasons explained below, reducing $\Delta$ by a factor of 2 or 3 makes very little difference to the sensitivity reach in ALP parameter space. As mentioned above, FASER will be stationed in a side tunnel, after the curving of the main LHC tunnel containing the beam pipe. We assume that a high granularity electromagnetic calorimeter is positioned at the back of the detector, after the tracking system, and detects photons with high efficiency (see Sec. V).

The probability $P_{\rm det}$ that an ALP produced at the TAXN subsequently decays within FASER is
$$P_{\rm det} = \left( e^{-(L_{\min} - L_{\rm TAXN})/\bar{d}_a} - e^{-(L_{\max} - L_{\rm TAXN})/\bar{d}_a} \right) A_{\rm ang}(\theta_{a\gamma}, \theta_\gamma) , \tag{11}$$
where $L_{\rm TAXN} = 130$ m, and the detector angular acceptance $A_{\rm ang}(\theta_{a\gamma}, \theta_\gamma)$ is the probability that an ALP produced by a photon at the TAXN has a trajectory that passes through FASER, given the scattering and photon polar angles $(\theta_{a\gamma}, \theta_\gamma)$. The scattering angle $\theta_{a\gamma}$ is defined as the ALP's angle relative to the photon direction, and the photon polar angle $\theta_\gamma$ is defined relative to the beam collision axis. In the aforementioned collinear approximation, $p_a = p_\gamma$, the angular acceptance can simply be written as $A_{\rm ang}(\theta_{a\gamma}, \theta_\gamma) = \Theta(R - L_{\max} \tan\theta_\gamma)$, where $\Theta(x)$ is the Heaviside theta function, so that the acceptance is unity when the ALP's transverse displacement at $L_{\max}$ is smaller than $R$. In the special case of a cylindrical detector it is even possible to obtain an analytic solution, which is presented in Appendix C. In practice, we obtain a more accurate description for the angular acceptance function through Monte Carlo simulation. The spectrum of ALPs decaying within a distance $(L_{\min}, L_{\max})$ from the IP is shown in the right panel of Fig. 3.
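A minimal sketch of the detection probability in the collinear approximation is given below. It assumes the exponential decay law between the TAXN and the near and far edges of the detector, and a step-function acceptance that is unity when the ALP's transverse displacement at $L_{\max}$ is below $R$. The default values $L_{\max} = 480$ m, $\Delta = 10$ m, and $R = 20$ cm are illustrative choices made here, and the function names are hypothetical.

```python
import math


def decay_probability(d_bar, l_min, l_max, l_taxn=130.0):
    """Probability that an ALP produced at l_taxn decays between l_min and l_max.

    All lengths in metres; d_bar is the lab-frame decay length.
    """
    return math.exp(-(l_min - l_taxn) / d_bar) - math.exp(-(l_max - l_taxn) / d_bar)


def collinear_acceptance(theta_gamma, l_max, radius):
    """Step-function acceptance: 1 if the ALP stays within the radius at l_max."""
    return 1.0 if l_max * math.tan(theta_gamma) < radius else 0.0


def p_det(d_bar, theta_gamma, l_min=470.0, l_max=480.0, radius=0.2):
    """Detection probability in the collinear approximation (illustrative defaults)."""
    return decay_probability(d_bar, l_min, l_max) * collinear_acceptance(
        theta_gamma, l_max, radius
    )


if __name__ == "__main__":
    # Decay length of a few hundred metres, photon angle of 0.1 mrad
    print(p_det(635.0, 1e-4))
```

For $\bar{d}_a$ much larger than the detector depth, the first factor reduces to approximately $(\Delta/\bar{d}_a)\, e^{-(L_{\min} - L_{\rm TAXN})/\bar{d}_a}$, which makes the linear dependence on $\Delta$ explicit.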
As we can see, only very forward ALPs with $\theta_a < R_{\rm TAXN}/L_{\rm TAXN}$ contribute to the signal. This allows us to have a relatively small calorimeter with radius $\sim \mathcal{O}(10)$ cm, which can detect almost all available ALPs. In the following we will assume that the radius of the calorimeter is $R = 20$ cm and coincides with the radius of FASER's decay volume.

IV. SENSITIVITY REACH OF FASER
The ALP-induced signal in FASER typically consists of two high-energy photons coming from the ALP decay inside the detector. In the left panel of Fig. 4 we show the expected signal event yield in FASER in the $(m_a, g_{a\gamma\gamma})$ plane, assuming an integrated luminosity of 3 ab$^{-1}$. The gray shaded regions, which are adapted from Ref. [51], represent the parameter space that is already excluded by previous experiments. The colored contour lines correspond to the number of ALP decays within FASER's decay volume, for the ALP production mechanisms indicated. As can be seen, the Primakoff process indeed provides the leading contribution, while meson decays only add an $\mathcal{O}(10\%)$ correction for most parts of parameter space.
Notably, up to $\sim 10^5$ signal events can be expected in still unconstrained regions of parameter space. Note that FASER mainly probes the region of parameter space in which ALPs are required to be highly boosted to reach the detector. This is exactly the regime in which FASER has been shown to have significant discovery potential, comparable to the reach of SHiP for the case of dark photons [10]. Note also that in the upper part of the region covered by FASER, the lines with constant number of signal events are very tightly spaced. In this regime, the decay length is significantly smaller than the distance to the detector, $\bar{d}_a \ll L_{\min}$, resulting in a strong exponential suppression of the number of events once the decay length drops further, as discussed in Ref. [10]. This limits the ability to probe higher $g_{a\gamma\gamma}$, but also implies that the reach in this region of parameter space is highly insensitive to the number of background events and to the signal detection efficiency.
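The tight spacing of the contours at large coupling can be made explicit with a short estimate. It assumes the exponential decay law for ALPs produced at the TAXN and the scaling of the decay length that follows from a width proportional to $g_{a\gamma\gamma}^2 m_a^3$; the symbol $N_a$ for the number of ALPs aimed at the detector is introduced here for illustration only.

```latex
\begin{align}
N_{\rm sig} &= N_a \left[ e^{-(L_{\min}-L_{\rm TAXN})/\bar d_a}
                        - e^{-(L_{\max}-L_{\rm TAXN})/\bar d_a} \right]
             = N_a \, e^{-(L_{\min}-L_{\rm TAXN})/\bar d_a}
               \left[ 1 - e^{-\Delta/\bar d_a} \right] , \\
\bar d_a &\propto \frac{p_a}{g_{a\gamma\gamma}^2 m_a^4}
\quad\Longrightarrow\quad
\ln \frac{N_{\rm sig}}{N_a}
  = -\frac{L_{\min}-L_{\rm TAXN}}{\bar d_a}
    + \ln\!\left[ 1 - e^{-\Delta/\bar d_a} \right] ,
\qquad
\frac{L_{\min}-L_{\rm TAXN}}{\bar d_a} \propto \frac{g_{a\gamma\gamma}^2 m_a^4}{p_a} .
\end{align}
```

For $\bar{d}_a \ll L_{\min} - L_{\rm TAXN}$ the first term dominates, so the logarithm of the event number falls quadratically with $g_{a\gamma\gamma}$. A change of several orders of magnitude in the required number of signal events therefore shifts the upper edge of the reach only slightly.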
In the right panel of Fig. 4, we show FASER's projected total sensitivity in the ALP parameter space. Here we assume that backgrounds can be reduced to a negligible level and that the signal acceptance is 100%. A more detailed discussion is postponed until Sec. V. We note that even the subdominant decay channel $a \to e^+e^-\gamma$, which only has a branching fraction of $\sim 1\%$, may also be able to cover unprobed parameter space.

FIG. 4. [...] [51]. The reach for NA62 assumes $\sim 3.9 \times 10^{17}$ protons on target (POT) while running in a beam dump mode that is being considered for LHC Run 3 [29]. The SeaQuest reach assumes $\sim 1.44 \times 10^{18}$ POT, which could be obtained in two years of parasitic data taking and additionally requires the installation of a calorimeter [18]. The reach for the proposed beam dump experiment SHiP assumes $\sim 2 \times 10^{20}$ POT collected in 5 years of operation [29].
For comparison, we also show future projections of the sensitivity reach for Belle-II [51], as well as for the beam dump experiments NA62 [29], SeaQuest [18], and SHiP [29]. In the parameter space with $g_{a\gamma\gamma} \sim 3 \times 10^{-6} - 3 \times 10^{-2}~\text{GeV}^{-1}$, FASER's reach is comparable to or better than the projected future sensitivities of these other experiments. As discussed above, in the regime of $\bar{d}_a \ll L_{\min}$, the contours with fixed number of signal events are very close to each other. As a result, the sensitivity reaches of FASER and the other experiments are similar, despite significant differences in luminosity. The effect of the increase in luminosity can, however, be observed at low values of the coupling $g_{a\gamma\gamma}$, which correspond to longer lifetimes. In this regime, ALPs can be less boosted and still reach FASER. However, such less energetic ALPs are typically characterized by larger $\theta_a$, and they miss FASER. This disadvantage is less pronounced for a much larger detector like SHiP.
In the left panel of Fig. 5, we show the number of signal events as a function of FASER's radius $R$ for several benchmark values of $m_a$ and $g_{a\gamma\gamma}$. In the right panel, the sensitivity reach in the $(m_a, g_{a\gamma\gamma})$ plane is shown for several values of the radius $R$. As can be seen, even a very small detector with $R = 2$ cm can probe unconstrained regions of ALP parameter space. Increasing $R$ above $\sim 10$ cm has a very mild impact on the reach for larger values of $g_{a\gamma\gamma}$.

V. DETECTION OF A DI-PHOTON SIGNAL IN FASER
The ALP-induced signal in FASER typically consists of two highly collimated, highly energetic ($\sim$ TeV) photons that point back to the IP. Given a detector consisting of several layers of tracker followed by an EM calorimeter, the ALP signal could be detected in the EM calorimeter or, if the photons convert into $e^+e^-$ pairs, in the tracking system.
Given the shielding of the detector from the main LHC tunnel, one does not expect high-energy electromagnetic particles produced in beam-induced collisions in the beam pipe to reach FASER. Also, hadronic particles that could induce partly electromagnetic showers when interacting in the concrete or rock before FASER or inside the detector are expected to first effectively lose their energy. This suppresses the SM background for energies of the incoming ALP that are above a certain threshold. To determine the remaining background, a detailed FLUKA simulation for FASER is currently being performed, and there are plans to validate these simulations with in situ measurements.
If the number of ALP-induced high-energy signal events significantly surpasses the expected number of background events, the detection of ALP events as a single high-energy EM shower without accompanying tracks might be sufficient to indicate the discovery of new physics. On the other hand, if the background to such "single photon" events is significant, distinguishing the two photons produced in ALP decays, that is, detecting the ALP signal as genuine di-photon events, may be required. The background to high-energy di-photon events from the direction of the IP is essentially negligible.
In the lab frame the distribution of the opening angle between the two photons, $\theta_{\gamma\gamma}$, is strongly peaked towards its minimal value, $\theta_{\gamma\gamma} \simeq 2/\gamma$, where $\gamma = E_a/m_a$ is the ALP's boost factor. For typical ALP energies $E_a \sim$ TeV and $m_a \sim 100$ MeV, the opening angle is $\theta_{\gamma\gamma} \sim 200~\mu$rad. After 1 meter, this leads to a small separation between the two photons of order $d_{\gamma\gamma} \sim 200~\mu$m, which makes it challenging to resolve them in a calorimeter.
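The numbers quoted above follow from simple small-angle arithmetic, sketched below under the assumption that the opening angle sits at its minimal value $2 m_a / E_a$. The function names are illustrative.

```python
def min_opening_angle(e_a, m_a):
    """Minimal di-photon opening angle 2 / gamma = 2 m_a / E_a [rad]."""
    return 2.0 * m_a / e_a


def photon_separation(e_a, m_a, flight_distance):
    """Transverse separation of the two photons after flight_distance [m]."""
    return min_opening_angle(e_a, m_a) * flight_distance


def distance_to_resolve(e_a, m_a, delta):
    """Flight distance needed for the photons to be separated by delta [m]."""
    return delta / min_opening_angle(e_a, m_a)


if __name__ == "__main__":
    e_a, m_a = 1000.0, 0.1  # GeV: E_a = 1 TeV, m_a = 100 MeV
    print("opening angle [rad]:", min_opening_angle(e_a, m_a))
    print("separation after 1 m [m]:", photon_separation(e_a, m_a, 1.0))
    print("distance for 1 mm separation [m]:", distance_to_resolve(e_a, m_a, 1e-3))
```

Because the opening angle scales as $m_a / E_a$, lighter or more energetic ALPs require a proportionally longer flight path before the photons can be resolved.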
Remarkably, however, the required resolution might be achieved by employing existing calorimeter technology. In particular, such resolutions are already achieved in the calorimeters used by the LHCf collaboration [52]. These consist of 16 layers of plastic scintillators interleaved with tungsten converters, which increase the depth of the detector in radiation lengths and minimize the shower leakage effects. The total longitudinal size of such a calorimeter "tower" is about 23 cm, small enough to fit in the FASER location without sacrificing much tracker volume. The calorimeter's energy resolution is roughly 5% for $E_\gamma > 100$ GeV [53], and it is employed to study photons originating from neutral pion decays with energies up to several TeV [43]. The spatial resolution for the initial position of the photon entering the calorimeter can be better than $200~\mu$m [53]. This is achieved with four layers of microstrip silicon sensors that are placed within the calorimeter towers. Most important for the present context, events with two photons that develop two distinct peaks in the lateral shape of the shower can be distinguished with more than 90% accuracy provided that the peaks are 1 mm apart from each other [54]. This can be done assuming that the lower energy photon carries at least 5% of the energy of the more energetic one, which is almost always the case in ALP decay.
The possibility of distinguishing two nearby photons has also been studied for high-$p_T$ searches at ATLAS and CMS. For example, at ATLAS, for energy deposited in a calorimeter, a variable $w_{s3}$ is defined, which corresponds to the ratio of energy deposited in the two strips adjacent to the central one relative to the total energy deposited in all three strips. This has been used in Ref. [55] to differentiate di-photon and single-photon events and translated into a limit on the difference in pseudorapidity, below which two photons are indistinguishable. Such a limit corresponds to about a half of the strip size in the first layer of the electromagnetic calorimeter, which roughly leads to a spatial separation $\delta \sim 1 - 2$ mm [56,57]. Other techniques involving more sophisticated photon-jet substructure analyses can also be used for a better discrimination [58-60].
Requiring that the two photons produced in the ALP decay be separated by at least the calorimeter spatial resolution $\delta$, so that they can be resolved as two photons, effectively reduces the depth of the detector. For the example above with $\theta_{\gamma\gamma} \sim 200~\mu$rad, requiring a spatial separation $\delta \sim 1$ mm typically requires that the photons travel a distance $\sim 5$ m in the detector. Of the ALPs that decay in the detector, then, the number that have photon separations greater than $\delta$ is effectively given by the number that decay in the reduced depth $\Delta_{\rm red} = \Delta - \delta/\theta_{\gamma\gamma}$. Since the number of events depends on the depth as shown in Eq. (11), this reduces the number of signal events. The efficiency for di-photon detection, that is, the fraction of ALP decays in the detector that can be resolved as di-photon events, given a detector resolution $\delta$, is
$$\epsilon = \left\langle \frac{e^{-(L_{\min} - L_{\rm TAXN})/\bar{d}_a} - e^{-(L_{\max} - \delta/\theta_{\gamma\gamma} - L_{\rm TAXN})/\bar{d}_a}}{e^{-(L_{\min} - L_{\rm TAXN})/\bar{d}_a} - e^{-(L_{\max} - L_{\rm TAXN})/\bar{d}_a}} \right\rangle \approx 1 - \frac{\delta}{\Delta} \frac{E_a}{2 m_a} , \tag{12}$$
where $\langle \cdots \rangle$ denotes the average over the distribution of opening angles $\theta_{\gamma\gamma}$. In the last step, to provide a rough, but simple, approximation, we set the photon-photon opening angle to the fixed value $\theta_{\gamma\gamma} = 2 m_a/E_a$, and assume $\Delta \ll \bar{d}_a$ and $\delta/\bar{d}_a \ll \theta_{\gamma\gamma}$. The approximation is quite accurate when $\delta/\Delta \ll 2 m_a/E_a$ and the deviation of $\epsilon$ from 1 is small, but it breaks down for $\delta/\Delta \gtrsim 2 m_a/E_a$, when the full simulation result must be used. In our numerical results we employ the exact $\theta_{\gamma\gamma}$ distribution. In the left panel of Fig. 6, we show di-photon efficiencies as a function of the ALP energy for fixed $\Delta = 10$ m and some representative choices of ALP mass and detector resolution $\delta$.
The precise value of $\delta$ will depend on the final detector setup and technology, but we note that for the aforementioned case with $\delta \simeq 1$ mm and for the detector dimensions given in Eq. (10), we obtain $\delta/\Delta = 10^{-4}$, and the typical suppression factor for detecting the di-photon signal from ALP decays in FASER is about 50%. In the right panel of Fig. 6 we present the impact this can have on the sensitivity reach. The reach for several other choices of $\delta/\Delta$ is also shown for comparison. In particular, the case with $\delta/\Delta = 0$ corresponds to the scenario with negligible background when even an effectively single-photon signal is enough for discovery.
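The quoted suppression of about 50% can be reproduced with a small sketch of the reduced-depth argument. Two versions are shown, both at the fixed opening angle $\theta_{\gamma\gamma} = 2 m_a / E_a$: the simple linear estimate $1 - (\delta/\Delta)\, E_a/(2 m_a)$, and a ratio of decay probabilities that assumes the exponential decay law for ALPs produced at $L_{\rm TAXN}$. The function names and the detector edges used in the example are illustrative assumptions.

```python
import math


def diphoton_efficiency_approx(e_a, m_a, delta, depth):
    """Linear estimate 1 - (delta / depth) * E_a / (2 m_a), floored at zero."""
    return max(0.0, 1.0 - (delta / depth) * e_a / (2.0 * m_a))


def diphoton_efficiency_fixed_angle(e_a, m_a, delta, d_bar, l_min, l_max, l_taxn=130.0):
    """Fraction of in-detector decays with photon separation above delta,
    for a fixed opening angle 2 m_a / E_a and decay length d_bar [m]."""
    theta = 2.0 * m_a / e_a
    l_red = l_max - delta / theta        # last point where decays are resolvable
    if l_red <= l_min:
        return 0.0
    start = math.exp(-(l_min - l_taxn) / d_bar)
    resolved = start - math.exp(-(l_red - l_taxn) / d_bar)
    total = start - math.exp(-(l_max - l_taxn) / d_bar)
    return resolved / total


if __name__ == "__main__":
    # E_a = 1 TeV, m_a = 100 MeV, delta = 1 mm, depth = 10 m
    print(diphoton_efficiency_approx(1000.0, 0.1, 1e-3, 10.0))
    print(diphoton_efficiency_fixed_angle(1000.0, 0.1, 1e-3, 1e3, 470.0, 480.0))
```

The two estimates agree when the decay length is much larger than the detector depth; neither replaces the average over the full opening-angle distribution.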
If an ALP-like signal is observed in the calorimeter, further improvement of the analysis requires proper particle identification. In particular, FASER is also sensitive to many models for physics beyond the SM that lead to a signal that consists of high-energy electron-positron pairs. These could be disentangled from di-photon events by the use of a tracker and by employing a sufficiently strong magnetic field. On the other hand, hadronic neutral particles depositing their energy in calorimeters could be differentiated from photons based on their distinct shower development. Examples of such analyses are given in Refs. [43,61], where hadronic neutral showers and EM showers are differentiated based on parameters $L_{20\%}$ and $L_{90\%}$, where $L_{n\%}$ denotes the length after which $n\%$ of the shower energy has been deposited in the calorimeter.
Photon conversion into $e^+e^-$ pairs inside the detector, in particular in the tracking system, can also allow one to disentangle single- and di-photon events; see, e.g., Ref. [62] for a recent discussion. In the ATLAS detector such a conversion can occur in $10 - 50\%$ of the cases [63], depending on the pseudorapidity, making it an important search strategy. However, for FASER a dedicated analysis is needed to determine whether this approach can be used, given that the trackers will only constitute a small fraction of the total decay volume. In the case of conversion of one of the photons, one expects a signal in the calorimeter that consists of three simultaneous electromagnetic showers, one from the second photon and two from the $e^+$ and $e^-$ deflected by the magnetic field.

VI. CONCLUSIONS
Searches for new light, weakly-coupled particles could provide the first evidence of physics beyond the standard model, with wide-ranging implications for particle physics and cosmology. This possibility has stimulated a variety of proposals for experiments that could discover these new particles, and it motivates studies to determine the reach and promise of these proposed experiments.
In this study, we have considered FASER, the ForwArd Search ExpeRiment at the LHC. Previous studies have shown that FASER can harness the currently "wasted" large forward cross section in $pp$ collisions at the LHC to search for new light particles with renormalizable couplings to the SM, such as dark photons, dark Higgs bosons, and heavy neutral leptons [10-15]. Even a small $\sim 1~\text{m}^3$ detector that takes data concurrently with the ongoing high-$p_T$ experiments can achieve world-leading sensitivities to these types of new particles.
Here we have determined the reach of FASER to a qualitatively different form of new physics: ALPs, which couple dominantly through non-renormalizable di-photon interactions. Such ALPs are dominantly produced not at the IP, but by TeV-energy photons from the IP that collide with the neutral particle absorber (TAN or TAXN) ∼ 130 m downstream. These interactions produce high-energy ALPs through the Primakoff process, and these ALPs propagate through matter without interacting and mainly decay to two photons in FASER. This process exploits FASER's capability as a high-energy photon beam dump experiment.
Our results show that ALPs produced in this way are highly collimated. At FASER's location 480 m from the IP, for most underlying model parameters, most of the ALP signal is contained within $\sim 10 - 20$ cm of the beam collision axis. In this way, the ALP signal is similar to the dark photon signal, and both are more collimated than the dark Higgs and HNL signals. With a detector spanning this area and $\sim 3 - 10$ m deep, we have shown that FASER could detect as many as $\sim 10^5$ ALP events at the HL-LHC and have sensitivity comparable to or better than other proposed experiments.
Of course, another important way in which ALPs differ from other dark sector candidates is that their signal is not two charged tracks, but two photons with ∼ TeV energies that originate from the direction of the IP in time with bunch crossings. FASER's sensitivity therefore depends on its calorimeter capabilities and the relevant EM shower backgrounds. If the background of ∼ TeV EM showers with the required direction and timing is negligible, all ALP decays may be taken as a background-free signal. Alternatively, if the EM shower background is non-negligible, the ALP signal of two photons can still be background-free, provided the two photons can be differentiated from each other. Because the photons are highly collimated, this requires a calorimeter that can differentiate showers separated by ∼ 1 mm. Remarkably, calorimeters with resolution δ ∼ 1 mm already exist, as discussed in Sec. V. We have shown how the ALP reach depends on δ. With the existing technology, the efficiency for detecting di-photon signals can still be as large as ∼ 50%, and the reach in ALP parameter space is degraded only slightly. Further progress depends on background simulations and in situ measurements that are currently underway.
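To illustrate why a shower separation of order 1 mm is the relevant scale, one can use the standard two-body result that the minimum opening angle in $a \to \gamma\gamma$ is $2\arcsin(m_a/E_a) \approx 2 m_a/E_a$. The sketch below is only an order-of-magnitude estimate: the function name, the 100 MeV mass, the 1 TeV energy, and the 5 m distance between decay point and calorimeter are illustrative assumptions, not values taken from the analysis.

```python
import math


def min_photon_separation_mm(m_a_gev, e_a_gev, lever_arm_m):
    """Transverse separation (mm) of the two photons for a symmetric decay.

    Assumes the minimum opening angle 2*asin(m_a/E_a) and a decay that
    happens lever_arm_m metres upstream of the calorimeter (hypothetical).
    """
    opening_angle = 2.0 * math.asin(m_a_gev / e_a_gev)
    # Each photon is deflected by half the opening angle from the ALP direction.
    return 2.0 * lever_arm_m * math.tan(opening_angle / 2.0) * 1.0e3


# Illustrative inputs (assumptions): m_a = 100 MeV, E_a = 1 TeV, 5 m lever arm.
separation = min_photon_separation_mm(0.1, 1000.0, 5.0)
print(f"minimum photon separation: {separation:.3f} mm")
```

For these assumed inputs the separation comes out at roughly 1 mm; it grows linearly with $m_a$ and with the lever arm and falls as $1/E_a$.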
In this work we have considered the case of axion-like particles coupling to photons, with both the coupling and mass as free parameters of the model. Within this framework, probably the most motivated model is the QCD axion, for which the coupling to photons is $g_{a\gamma\gamma} = c_a \alpha_{\text{EM}} m_a / (2\pi m_\pi f_\pi)$, where the range for the prefactor $c_a$ is typically taken to encompass the values $\sim -4$ (KSVZ) to $\sim 1.5$ (DFSZ) [64]. For a QCD axion with mass $m_a = 100~\text{MeV}$, this implies $g_{a\gamma\gamma} \sim 0.01~\text{GeV}^{-1}$, which is excluded by beam dump experiments. However, recent work [65] has shown that it is possible to construct a viable QCD axion at the MeV scale by coupling it to first-generation fermions, while keeping its mixing with the neutral pion suppressed. The di-photon coupling of such an axion can also be kept below current bounds. The possibility of ALPs with dominantly di-photon couplings but also other couplings is quite general [66], and it is interesting to note that these models can also be probed by FASER in the way described here.
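The size of the QCD axion coupling quoted above can be checked numerically. The sketch below evaluates the formula $c_a \alpha_{\text{EM}} m_a / (2\pi m_\pi f_\pi)$; the numerical inputs are assumptions not stated in the text, in particular the use of the neutral pion mass and the normalization $f_\pi \approx 92.4$ MeV, and the function name is hypothetical.

```python
import math

ALPHA_EM = 1.0 / 137.036   # fine-structure constant
M_PI_GEV = 0.13498         # neutral pion mass in GeV (assumed choice of m_pi)
F_PI_GEV = 0.0924          # pion decay constant in GeV (assumed normalization)


def qcd_axion_photon_coupling(m_a_gev, c_a):
    """Return g_agammagamma in GeV^-1 from c_a * alpha * m_a / (2 pi m_pi f_pi)."""
    return c_a * ALPHA_EM * m_a_gev / (2.0 * math.pi * M_PI_GEV * F_PI_GEV)


# Evaluate at m_a = 100 MeV for the two ends of the quoted c_a range and c_a = 1.
for c_a in (-4.0, 1.0, 1.5):
    g = qcd_axion_photon_coupling(0.1, c_a)
    print(f"c_a = {c_a:+.1f}: g = {g:+.4f} GeV^-1")
```

With these assumed inputs, $|g_{a\gamma\gamma}|$ at $m_a = 100$ MeV lies at the $10^{-2}~\text{GeV}^{-1}$ level across the quoted range of $c_a$; a different convention for $f_\pi$ shifts the result by an order-one factor.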
Finally, it is important to note that ALPs may also couple dominantly to other SM particles, such as gluons or fermions, and these couplings in fact induce each other at the loop level. These alternative couplings may alter the di-photon signal and rate, allow ALPs to be produced through other processes, such as the rare decays of heavier mesons, or induce other ALP signals in FASER, such as the two charged track signals already analyzed for other dark sector candidates.
we can approximate the ALP momentum as
The momentum transfer between the photon and target is
The left panel of Fig. 7 shows the momentum transfer $q = \sqrt{t}$ as a function of the scattering angle for various ALP masses and photon energies. At angles much smaller than $\theta_* \equiv m_a^2 / (\sqrt{2} E_\gamma^2)$, $t$ approaches a constant value $t_{\min} = m_a^4 / (2 E_\gamma^2)$, while, for larger angles, it scales as $t \approx E_\gamma^2 \theta_{a\gamma}^2$. The horizontal gray dashed lines show typical scales of the momentum transfer for the form factors defined in Eq. (8): the atomic form factor scale $q_{\text{atom}} = 1/a = Z_{\text{nuc}}^{1/3} m_e / 111$, the atomic-nuclear form factor crossover scale $q_{\text{eq}} = 2.71\, m_e$, and the nuclear form factor scale $q_{\text{nuc}} = \sqrt{d} = 0.4\, A_{\text{nuc}}^{-1/3}~\text{GeV}$. Note that depending on $m_a$ and $p_a$, the atomic form factor and its cutoff scale might or might not be relevant for ALP production.
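The momentum-transfer scales quoted above are easy to evaluate. The following sketch computes $q_{\text{atom}}$, $q_{\text{eq}}$, $q_{\text{nuc}}$, $\theta_*$ and $t_{\min}$ from the expressions in the text. The choice of iron ($Z_{\text{nuc}} = 26$, $A_{\text{nuc}} = 56$) as the target, the sample values $m_a = 100$ MeV and $E_\gamma = 1$ TeV, and all function names are illustrative assumptions.

```python
import math

M_E_GEV = 0.000510999  # electron mass in GeV


def form_factor_scales(z_nuc, a_nuc):
    """Return (q_atom, q_eq, q_nuc) in GeV using the expressions in the text."""
    q_atom = z_nuc ** (1.0 / 3.0) * M_E_GEV / 111.0
    q_eq = 2.71 * M_E_GEV
    q_nuc = 0.4 * a_nuc ** (-1.0 / 3.0)
    return q_atom, q_eq, q_nuc


def theta_star(m_a_gev, e_gamma_gev):
    """Transition angle theta_* = m_a^2 / (sqrt(2) E_gamma^2), in radians."""
    return m_a_gev ** 2 / (math.sqrt(2.0) * e_gamma_gev ** 2)


def t_min(m_a_gev, e_gamma_gev):
    """Minimum momentum transfer squared, m_a^4 / (2 E_gamma^2), in GeV^2."""
    return m_a_gev ** 4 / (2.0 * e_gamma_gev ** 2)


# Assumed target: iron. Assumed kinematics: m_a = 100 MeV, E_gamma = 1 TeV.
q_atom, q_eq, q_nuc = form_factor_scales(26, 56)
print(f"q_atom = {q_atom:.3e} GeV, q_eq = {q_eq:.3e} GeV, q_nuc = {q_nuc:.3e} GeV")
print(f"theta_* = {theta_star(0.1, 1000.0):.3e} rad")
print(f"sqrt(t_min) = {math.sqrt(t_min(0.1, 1000.0)):.3e} GeV")
```

Note that the two regimes match at the transition: $E_\gamma^2 \theta_*^2 = t_{\min}$ follows directly from the definitions used here.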
The form factor $F(t)$ is shown in the central panel of Fig. 7. We see that (1) for $|q| \gg q_{\text{nuc}}$, the form factor decreases as $F(q) \sim q_{\text{nuc}}^2 / q^2$, thus suppressing large-angle $\theta_{a\gamma}$ (large momentum transfer) scattering; (2) for $q_{\text{atom}} \ll |q| \ll q_{\text{nuc}}$, the form factor is approximately equal to unity; and (3) if $|q| < q_{\text{atom}}$ is kinematically accessible, the form factor approaches $F(t_{\min})$ as $\theta_{a\gamma}$ decreases.
For scattering angles $\theta_{a\gamma} \gg \theta_*$ the momentum transfer scales like $t \approx E_\gamma^2 \theta_{a\gamma}^2$, and we can approximate the differential Primakoff cross section in Eq. (7) as
This accounts for the steep descent at large angles for which $t \gg q_{\text{nuc}}^2$, and the constant plateau region, where the function $F^2(t)$ is approximately constant. As $\theta_{a\gamma}$ approaches $\theta_*$, $t$ decreases, and the form factor approaches a constant value $F(t_{\min})$. In this kinematic region the differential cross section is given by
which decreases as $\sim \theta_{a\gamma}^4$. These results, normalized to the conversion cross section of photons in iron, are shown in the right panel of Fig. 7. Note that the transition at $\theta_{a\gamma} = \theta_*$ depends on the ratio $m_a / E_\gamma$. For FASER energies and masses of interest, this transition typically occurs when the form factor is above the atomic scale cutoff $q_{\text{atom}}$.
The total cross section for photon conversion into an ALP via the Primakoff process relative to the photon conversion cross section into electrons is shown in the left panel of Fig. 8 for different ALP masses and energies. For large $E_\gamma / m_a$, the Primakoff cross section approaches a constant maximum. In this case, a fraction of $\mathcal{O}(0.1\%) \times [g_{a\gamma\gamma} / \text{GeV}^{-1}]^2$ of the photons convert into ALPs.
In summary, we have seen that the momentum transfer in the Primakoff process is typically small due to a cutoff of the nuclear form factor at $t > q_{\text{nuc}}^2$. For high energy photons, this implies small scattering angles of the ALP with respect to the photon direction, $\theta_{a\gamma} < q_{\text{nuc}} / E_\gamma$. Therefore, the ALP momenta are almost collinear with the photon momenta. Furthermore, the ALPs carry almost the entire energy of the photon, $E_a \approx E_\gamma$. Hence the collinear approximation, $p_a \approx p_\gamma$, gives an excellent estimate for the final sensitivity reach discussed in the main text.
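The quality of the collinear approximation can be gauged with a quick estimate of the bound $\theta_{a\gamma} < q_{\text{nuc}} / E_\gamma$. The sketch below assumes an iron target ($A_{\text{nuc}} = 56$), a 1 TeV photon, and the $\sim 350$ m flight distance from the absorber to FASER; the function names and sample values are illustrative assumptions.

```python
def max_scattering_angle(e_gamma_gev, a_nuc=56):
    """Upper bound q_nuc / E_gamma on the ALP-photon angle, in radians.

    Uses q_nuc = 0.4 * A^(-1/3) GeV; A = 56 (iron) is an assumed default.
    """
    q_nuc = 0.4 * a_nuc ** (-1.0 / 3.0)
    return q_nuc / e_gamma_gev


def transverse_offset_cm(theta_rad, distance_m=350.0):
    """Small-angle transverse displacement (cm) after distance_m metres."""
    return theta_rad * distance_m * 100.0


# Assumed kinematics: a 1 TeV photon hitting the absorber.
theta_max = max_scattering_angle(1000.0)
print(f"theta_max = {theta_max:.2e} rad")
print(f"offset at FASER = {transverse_offset_cm(theta_max):.1f} cm")
```

Under these assumptions the extra angle from Primakoff scattering is of order $10^{-4}$ rad, which corresponds to a displacement of a few centimeters at FASER.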

Appendix B: Production of Axion-like Particles in Pion Decay
If the ALP is light enough, it can also be produced in the decays of neutral pions $\pi^0$. To calculate the decay width $\Gamma(\pi^0 \to a\gamma\gamma)$, let us consider the interaction Lagrangian in the effective theory
where $g_{\pi^0\gamma\gamma} = 2.512 \times 10^{-2}~\text{GeV}^{-1}$ is the pion decay constant and we conform to the convention that $\tilde{F}^{\mu\nu} = \frac{1}{2} \epsilon^{\alpha\beta\mu\nu} F_{\alpha\beta}$ is the dual field strength tensor. Choosing a momentum assignment $\pi^0(p) \to a(q)\,\gamma(k_1)\,\gamma(k_2)$, the decay amplitude is
$$\mathcal{M} = -g_{\pi^0\gamma\gamma}\, g_{a\gamma\gamma}\, \epsilon^{\alpha\beta\gamma\delta} \epsilon^{\mu\nu\rho\sigma}\, p_\alpha q_\mu g_{\beta\nu} \left[ \frac{k_{1\gamma}\, \varepsilon_{1\delta}\, k_{2\rho}\, \varepsilon_{2\sigma}}{(p - k_1)^2} + \frac{k_{2\gamma}\, \varepsilon_{2\delta}\, k_{1\rho}\, \varepsilon_{1\sigma}}{(p - k_2)^2} \right]. \tag{B2}$$
Let us consider this process in the pion's rest frame, where the particle momenta are $p = (M, 0, 0, 0)$, $k_1 = E_1 (1, 0, 0, 1)$ and $k_2 = E_2 (1, \sin\theta_{12}, 0, \cos\theta_{12})$. Here $M$ denotes the pion mass. The ALP momentum is given by $q = p - k_1 - k_2$ with $q^2 = m^2$, where $m$ is the ALP mass. Energy-momentum conservation implies
The differential decay width is, then, $d\Gamma(\pi^0 \to a\gamma\gamma)$ (B7)
Similar results can be obtained for the $\eta$ meson, where one can use $g_{\pi^0\gamma\gamma} \approx g_{\eta\gamma\gamma}$ (up to an $\mathcal{O}(10^{-4})$ correction). The branching fractions for both $\pi^0$ and $\eta$ decays as functions of the ALP mass are given in Fig. 8 for $g_{a\gamma\gamma} = 1~\text{GeV}^{-1}$. As one can see, the ALP production rate in rare $\pi^0$ and $\eta$ decays is typically suppressed compared to the Primakoff process. (Cf. the right panel of Fig. 2.) Note also that ALPs from 3-body meson decays are typically less boosted than ALPs produced in the Primakoff process. As a result, meson decays are less significant for FASER event rates throughout parameter space, as can be seen in the left panel of Fig. 4.
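For orientation, the kinematic constraint that follows from energy-momentum conservation can be worked out directly from the rest-frame momenta given in this appendix. The lines below are a sketch derived only from those definitions ($p^2 = M^2$, $k_1^2 = k_2^2 = 0$, $q^2 = m^2$); they are not necessarily in the form or with the numbering of the original equations, and the symbol $E_a$ for the ALP energy is introduced here for illustration.

```latex
\begin{align}
m^2 = q^2 &= (p - k_1 - k_2)^2 \nonumber \\
          &= M^2 - 2M(E_1 + E_2) + 2 E_1 E_2 \,(1 - \cos\theta_{12}) , \\
\cos\theta_{12} &= 1 - \frac{m^2 - M^2 + 2M(E_1 + E_2)}{2 E_1 E_2} , \\
E_a &= M - E_1 - E_2 .
\end{align}
```

The second line uses $p \cdot k_i = M E_i$ and $k_1 \cdot k_2 = E_1 E_2 (1 - \cos\theta_{12})$, so for fixed photon energies the opening angle $\theta_{12}$ is fully determined by the masses.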
shown in the right panel of Fig. 9: it is the ratio of the arc-length from the ALP circle overlapping with FASER, to the circumference of the ALP circle. The angular acceptance can therefore be written as $A_{\text{ang}} = 1$ for $d + r < R$, $A_{\text{ang}} = 0$ for $d - r > R$, and otherwise.
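The arc-length ratio described here can be sketched with elementary geometry. The code below assumes that $r$ is the radius of the ALP circle, $d$ is the distance between the center of that circle and the center of FASER, and $R$ is the radius of a circular FASER cross section; these interpretations are inferred from the two limiting cases and are assumptions. The intermediate case uses the law of cosines, which is a standard geometric result and not necessarily the expression used in the original; the function name is hypothetical.

```python
import math


def angular_acceptance(d, r, R):
    """Fraction of a circle of radius r lying inside a disc of radius R.

    d is the distance between the two centers (all lengths in the same unit).
    """
    if d + r < R:
        return 1.0  # ALP circle entirely inside the detector disc
    if abs(d - r) > R:
        # d - r > R: circle entirely outside the disc (case given in the text);
        # r - d > R: circle encloses the disc without touching it (added here).
        return 0.0
    if d == 0.0 or r == 0.0:
        # Degenerate boundary cases where the law-of-cosines ratio is undefined.
        return 1.0 if max(d, r) <= R else 0.0
    # A point on the circle at angle phi from the line joining the centers is
    # inside the disc when d^2 + r^2 - 2 d r cos(phi) <= R^2.
    cos_phi = (d * d + r * r - R * R) / (2.0 * d * r)
    cos_phi = max(-1.0, min(1.0, cos_phi))  # guard against rounding
    return math.acos(cos_phi) / math.pi


print(angular_acceptance(0.05, 0.02, 0.10))  # fully contained
print(angular_acceptance(0.50, 0.10, 0.10))  # no overlap
print(angular_acceptance(0.10, 0.10, 0.10))  # partial overlap
```

The clamp on `cos_phi` only protects against floating-point rounding at the boundaries; the two early returns already handle the fully contained and non-overlapping configurations.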