Measurement of the Iron Spectrum in Cosmic Rays from 10 GeV$/n$ to 2.0 TeV$/n$ with the Calorimetric Electron Telescope on the International Space Station

The Calorimetric Electron Telescope (CALET), in operation on the International Space Station since 2015, collected a large sample of cosmic-ray iron over a wide energy interval. In this Letter a measurement of the iron spectrum is presented in the range of kinetic energy per nucleon from 10 GeV$/n$ to 2.0 TeV$/n$ allowing the inclusion of iron in the list of elements studied with unprecedented precision by space-borne instruments. The measurement is based on observations carried out from January 2016 to May 2020. The CALET instrument can identify individual nuclear species via a measurement of their electric charge with a dynamic range extending far beyond iron (up to atomic number $Z$ = 40). The energy is measured by a homogeneous calorimeter with a total equivalent thickness of 1.2 proton interaction lengths preceded by a thin (3 radiation lengths) imaging section providing tracking and energy sampling. The analysis of the data and the detailed assessment of systematic uncertainties are described and results are compared with the findings of previous experiments. The observed differential spectrum is consistent within the errors with previous experiments. In the region from 50 GeV$/n$ to 2 TeV$/n$ our present data are compatible with a single power law with spectral index -2.60 $\pm$ 0.03.


INTRODUCTION
Direct measurements of the energy spectra of charged cosmic rays (CR) have recently achieved a level of unprecedented precision with long term observations of individual elements. The new data, provided by magnetic spectrometers up to their maximum detectable rigidity and by space-based and balloon-borne calorimeters (as well as transition radiation and Cherenkov detectors), revealed unexpected spectral features, most notably the onset of a progressive hardening of proton and He spectra at a few hundred GeV/n [1][2][3][4][5][6][7] which has also been observed for heavier nuclei [8][9][10][11][12]. The emergence of this new scenario prompted a number of theoretical interpretations [13][14][15][16][17][18][19][20][21][22][23][24][25] ranging from an anomalous diffusive regime near the sources (e.g., [16]) to the dominance of one (or more) nearby supernova remnant (SNR) (e.g., [25]) in the framework of specific models of confinement and gradual release from the source. In order to discriminate among different interpretations, a precision measurement of the iron spectrum is of particular interest as iron provides favourable conditions for observations, not only because of its largest relative abundance among the heavy elements, but also for a negligible contamination from spallation of higher mass elements. At the time of writing, a compilation of direct measurements of iron in space includes the satellite experiments HEAO3-C2 [26] (at low energy), CRN [27] (on Spacelab2 aboard the Challenger Space Shuttle) and NUCLEON [28]. Balloon measurements include data from Ichimura, M. et al. [29] (hereafter referred to as the Sanriku experiment), providing an energy estimate using Earth's magnetic field via an accurate angular measurement with nuclear emulsions, and from balloon experiments with electronic instrumen-tation such as ATIC, TRACER and CREAM [30][31][32][33]. Recently published are also data from the spectrometer AMS-02 [34].
CALET is a space-based instrument [35][36][37] optimized for the measurement of the all-electron spectrum [38,39], but also designed to study individual chemical elements in CR from proton to iron and above, exploring particle energies up to the PeV scale. This can be achieved thanks to its large dynamic range, adequate calorimetric depth, accurate tracking and excellent charge identification capabilities. In the hadronic sector, CALET already provided accurate spectral measurements of protons to 10 TeV [7] and of carbon and oxygen nuclei to 2.2 TeV/n [12].
In this Letter, we present a new measurement of the iron flux from 10 GeV/n to 2.0 TeV/n, based on the data collected by CALET from January 1, 2016 to May 31, 2020 aboard the International Space Station (ISS).

CALET INSTRUMENT
CALET measures the particle energy with the TASC (Total AbSorption Calorimeter), a lead-tungstate homogeneous calorimeter (27 radiation lengths (r.l.), 1.2 proton interaction lengths) preceded by a thin (3 r.l.) preshower IMaging Calorimeter (IMC), both covering a very large dynamic range. Charge identification is carried out by the CHarge Detector (CHD), a two-layered hodoscope of plastic scintillator paddles placed on top of CALET. The CHD can resolve individual elements from Z = 1 to Z = 40 with excellent charge resolution. The IMC, with 16 layers of thin scintillating fibers (read out individually), provides fine-grained tracking and an independent charge measurement, via multiple samples of specific energy loss (dE/dx) in each fiber, up to the onset of saturation which occurs for ions more highly charged than silicon. Details on the instrument layout and the trigger system can be found in the Supplemental Material (SM) of Ref. [38].
CALET was launched on August 19, 2015 and installed on the Japanese Experiment Module Exposure Facility of the ISS. The on-orbit commissioning phase was successfully completed in the first days of October 2015. Calibration and test of the instrument took place at the CERN-SPS during five campaigns between 2010 and 2015 with beams of electrons, protons and relativistic ions [40][41][42].

DATA ANALYSIS
The sample of flight data (FD) used in the present analysis covers a period of 1613 days of CALET operation. The total observation live time for the high-energy (HE) shower trigger is T ∼ 3.3×10 4 hours, corresponding to 85.8% of total observation time.
A dedicated trigger mode [42,43] allows the selection of penetrating protons and He particles for the individual on-orbit calibration of all channels. First, raw data are corrected for gain differences among the channels, light output non-uniformity, and any residual dependence on time and temperature. After calibration, a track is reconstructed for each event with an associated estimate of its charge and energy.
Physics processes and interactions in the apparatus are simulated by Monte Carlo (MC) techniques, based on the EPICS package [44,45] which implements the hadronic interaction model DPMJET-III [46]. The instrument configuration and detector response are detailed in the simulation code which provides digitized signals from all channels. An independent analysis based on FLUKA [47,48] is also performed to assess the systematic uncertainties. The particle's direction and entrance point are reconstructed and fit by a tracking algorithm based on a combinatorial Kalman filter fed with the coordinates provided by the scintillating fibers in the IMC. It identifies the incident track in the presence of background hits generated by secondary radiation backscattered from the TASC [49]. The angular resolution is ∼ 0.08 • for Fe and the spatial resolution for the impact point on the CHD is ∼180 µm.
The particle's charge Z is reconstructed by measuring the ionization deposits in the CHD. The dE/dx samples are extracted from the signals of the CHD paddles traversed by the incident particle and properly corrected for path length. Either CHD layer provides an independent dE/dx measurement. In order to correct for the reduction of the scintillator's light yield due to the quenching effect, a "halo" model [50] has been used to fit FD samples for each nuclear species as a function of Z 2 . The resulting curves are then used to reconstruct a charge value in either layer (Z CHDX , Z CHDY ) on an event-by-event basis [12]. Differently from the case of lighter nuclei, an independent charge measurement with the IMC fibers is not possible for iron due to the saturation of signals occurring for Z 14 in the upstream layers and Z 22 in the last four layers. The presence of an increasing amount of backscatters from the TASC at higher energy generates additional energy deposits in the CHD that add up to the primary particle ionization signal and may induce a wrong charge identification. This effect causes a systematic displacement of the CHDX and CHDY charge peaks to higher values (up to 0.8 charge units) with respect to the nominal charge position. Therefore it is necessary to restore the iron peak position to its nominal value, Z = 26, by an energy dependent charge correction applied separately to the FD and the MC data. A charge distribution obtained by averaging Z CHDX and Z CHDY is shown in Fig. 1. The CHD charge resolution σ Z for iron is ∼ 0.35 (charge units).  For each event, the shower energy E TASC is calculated as the sum of the energy deposits of all TASC logs, after merging the gain ranges of each channel [43]. The energy response derived from the MC simulations was tuned using the results of a beam test carried out at the CERN-SPS in 2015 [40] with beams of accelerated ion fragments of 13, 19 and 150 GeV/c/n momentum per nucleon (as described in the SM of Ref. [12]). Correction factors are 6.7% for E TASC < 45 GeV and 3.5% for E TASC > 350 GeV, respectively. A linear interpolation is used to determine the correction factor for intermediate energies.
The onboard HE shower trigger, based on the coincidence of the summed dynode signals of the last four IMC layers and the top TASC layer (TASCX1) is fully efficient for elements heavier than oxygen. Therefore, an offline trigger confirmation, as required for the analysis of lower charge elements [7,12], is not necessary for iron, because the HE trigger threshold is far below the signal amplitude expected from a particle at minimum ionization (MIP) and the trigger efficiency is close to 100%. However, in order to select interacting particles, a deposit larger than 2 sigmas of the MIP peak is required in at least one of the first four layers of the TASC.
Events with one well-fitted track crossing the whole detector from the top of the CHD to the TASC bottom layer (and clear from the edges of TASCX1 and of the bottom TASC layer by at least 2 cm) are selected. The fiducial geometrical factor for this category of events is SΩ ∼ 416 cm 2 sr, corresponding to about 40% of CALET total acceptance.
Particles undergoing a charge-changing nuclear interaction in the upper part of the instrument are removed by requiring that the difference between the charges from either layer of the CHD is less than 1.5 charge units. Iron events are selected within an ellipse centered at Z = 26, with 1.25 σ x and 1.25 σ y wide semiaxes for Z CHDX and Z CHDY , respectively, and rotated clockwise by 45 degrees as shown in the cross plot of the CHDY vs CHDX charge in Fig. S2 of the SM [51]. Event selections are identical for the MC data and the FD.
For nuclei with Z > 10, the TASC crystals undergo a light quenching phenomenon which is not reproduced by the MC simulations. Therefore, it is necessary to extract from the data a quenching correction to be applied a posteriori to the MC energy deposits generated in the TASC logs by non-interacting primary particles, as shown in Fig. S3 of the SM [51].
Distributions of E TASC for Fe selected candidates are shown in Fig. S7 of the SM [51], with a sample of 4.0×10 4 events. In order to take into account the relatively limited calorimetric energy resolution for hadrons (of the order of ∼30%) energy unfolding is applied to correct for bin-to-bin migration effects. In this analysis, we used the Bayesian approach [52] implemented within the RooUnfold package [53] of the ROOT analysis framework [54]. Each element of the response matrix represents the probability that a primary nucleus in a given energy interval of the CR spectrum produces an energy deposit falling into a given bin of E TASC . The response matrix is derived using the MC simulation after applying the same selection procedure as for flight data and it is shown in Fig. S8 [51] of the SM.
The energy spectrum is obtained from the unfolded energy distribution as follows: where ∆E denotes the energy bin width, E is the geometric mean of the lower and upper bounds of the bin [55], N (E) the bin content in the unfolded distribution, ε(E) the total selection efficiency ( Fig. S4 of the SM [51]), U the unfolding procedure operator, N obs (E TASC ) the bin content of observed energy distribution (including background), and N bg (E TASC ) the bin content of background events in the observed energy distribution. Background contamination from different nuclear species misidentified as Fe is shown in Fig. S7 of the SM [51]. A contamination fraction N bg /N obs < 1% is found in the energy range between 10 2 GeV and 10 3 GeV of E TASC increasing up to ∼2% at E T ASC ∼ 10 4 GeV.

SYSTEMATIC UNCERTAINTIES
Dominant sources of systematics uncertainties in the iron analysis include (1) event selection, (2) energy response, (3) unfolding procedure and (4) MC model. The systematic error related to charge identification (1) was studied by varying the semi-axes of the elliptical selection up to ±15%. The result was an (energy bin dependent) flux variation lower than a few percent below 600 GeV/n increasing to ∼10% at 1 TeV/n.
The uncertainty on the energy scale correction (2) is ±2% and depends on the accuracy of the beam test calibration. It causes a rigid shift of the measured energies, affecting the absolute flux normalization by +3.3% −3.2% , but not the spectral shape. As the beam test model was not identical to the instrument now in orbit, the difference in the spectrum obtained with either configuration was modeled and included in the systematic error.
The uncertainties due to the unfolding procedure (3) were evaluated with different response matrices computed by varying the spectral index (between -2.9 and -2.2) of the MC generation spectrum, or by using the Singular Value Deconvolution method, instead of the Bayesian approach, in the RooUnfold procedure [53].
A comparison between different MC simulations (4) is in order as it is not possible to validate the MC simulations with beam test data at high energy. A comparative study of key distributions was carried out with EPICS and FLUKA showing that the respective total selection efficiencies for Fe are in agreement within 2% over the whole energy range (Figs. S4 and S5 of the SM [51]). However, the energy response matrices differ significantly in the low and high energy regions. The resulting fluxes show a maximum discrepancy around 10% below 40 GeV/n, a few percent in the 100 GeV/n region and less than 5% up to 1 TeV/n. This turns out to be the dominant source of known systematic uncertainties at low energy.
As the trigger threshold is much smaller than the energy of a noninteracting iron event, the HE trigger efficiency is close to 100% in the whole energy range with a negligible contribution to the systematic error. The fraction of interactions (Fig. S6 of the SM [51]) in the CHD, and above it, was checked by comparing the MC data and the FD as explained in the SM. The contribution due to a shower event cut, rejecting non interacting particles (5% below 30 GeV and < 1% above), was evaluated and included in the systematic uncertainties.
Possible inaccuracy of track reconstruction could affect the determination of the geometrical acceptance. The contamination due to off-acceptance events that are erroneously reconstructed inside the fiducial acceptance was estimated by MC simulation to be ∼ 1% at 10 GeV/n while decreasing to less than 0.1% above 60 GeV/n. The systematic uncertainty on the tracking efficiency is negligible [12]. A different tracking procedure, described in Ref. [56], was also used to study possible systematic uncertainties in tracking efficiency. The result is well consistent with the Kalman filter algorithm.
Additional energy-independent systematic uncertainties affecting the flux normalization include live time (3.4%), long-term stability (< 2%) and geometrical factor (∼ 1.6%), as detailed in the SM of Ref. [38]. The flux normalization remains stable within 1% when varying the background contamination fraction up to ±40%. The energy dependence of all systematic errors for iron analysis is shown in Fig. S10 of the SM [51]. The total systematic error is computed as the quadrature sum of all the sources of systematics in each energy bin.

RESULTS
The iron differential spectrum in kinetic energy per nucleon measured by CALET from 10 GeV/n to 2.0 TeV/n is shown in Fig. 2, where current uncertainties including statistical and systematic errors are bounded within a green band. The CALET spectrum is compared with the results from space-based (AMS 02 [34], HEAO3-C2 [26], CRN [27], NUCLEON [28]) and balloon-borne experiments (Sanriku [29], ATIC-02 [30], TRACER [32], CREAM-II [33]), as well as ground-based observations (H.E.S.S. [57]). The CALET iron flux measurements are tabulated in Table I of the SM [51] where statistical and systematic errors are also shown. Our spectrum is consistent with ATIC 02 and TRACER at low energy and with CNR and HESS at high energy. CALET and NUCLEON iron spectra have similar shapes while they differ in the absolute normalization of the flux. The latter turns out to be higher for CALET than for CRN by ∼10% on average, while it is lower by 14% with respect to Sanriku. CALET and AMS-02 iron spectra have a very similar shape (Fig. S12 of the SM [51]), but differ in the absolute normalization of the flux by ∼ 20%. Figure 3 shows a fit to the CALET iron flux with a   that the iron flux, above 50 GeV/n, is compatible within the errors with a single power law. The experimental limitations of the present measurement (i.e. low statistics as well as large systematic errors for the highest energy bins) do not allow yet to test theoretical interpretations predicting spectral shapes different from a single power law. As a matter of fact, current expectations (e.g., [16,25]) for a detectable spectral hardening of iron are still under debate.

CONCLUSION
From its privileged observation point on the ISS, CALET is carrying out direct measurements of CR fluxes extending the available spectral data on electrons and cosmic-ray nuclei to higher energies. In this Letter (4.4 years of observations) we report a measurement of the energy spectrum of iron from 10 GeV/n to 2.0 TeV/n with a significantly better precision than most of the existing measurements. Taking into account the average size of the large systematic errors reported in the literature, our data turn out to be consistent with most of the previous measurements within the uncertainty error band, both in spectral shape and normalization. Below 50 GeV/n the iron spectral shape is similar to the one observed for primaries lighter than iron. Above the same energy, our present observations are consistent with the hypothesis of an SPL spectrum up to 2 TeV/n. Beyond this limit, the uncertainties given by our present statistics and large systematics do not allow us to draw a significant conclusion on a possible deviation from a single power law. An SPL fit in this region yields a spectral index value γ = −2.60 ± 0.03. An extended data set, as expected beyond the 5 year period of continuous observations accomplished so far, will not only improve the dominant statistical limitations of the present measurement, but also our understanding of the instrument response in view of a further reduction of systematic uncertainties.  FIG. S1. Charge distributions from the combined charge measurement of the two CHD layers in the elemental region between Ca and Ge. Events are selected with 100 < ETASC < 125 GeV in (a) and 501 < ETASC < 630 GeV in (b). Flight data, represented by black dots, are compared with Monte Carlo samples including chromium, manganese, iron, cobalt and nickel. Titanium and vanadium are not included in MC because their contamination to iron data is negligible.
In Fig. S2, obtained with flight data, a cross-plot of the independent charge measurements provided by the two CHD layers shows a clear separation of iron candidates from the less abundant neighbor elements.

LIGHT QUENCHING IN THE TASC CRYSTALS
For nuclei with Z > 10, the TASC crystals undergo a light quenching phenomenon as is clearly visible in Fig. S3. In the same figure the position of the peak for a minimum-ionizing particle (MIP), generated by a non-interacting primary particle crossing the first TASC layer, is plotted as a function of Z 2 for nine elements ranging from O to Ni and selected from flight data. A "halo" model is used in the fit to parameterize the non-linearity of the scintillator's response due to light quenching. The applied corrections on the signals from the plastic scintillators are based on the same model, as explained in [S1].

ADDITIONAL INFORMATION ON THE ANALYSIS
Efficiencies. The total efficiency and relative efficiencies (i.e., the efficiency of a given cut normalized to the previous cut) were studied extensively over the whole energy range covered by the iron flux measurement. The total selection efficiency from EPICS (blue open circles) and FLUKA (red filled circles) are shown in Fig. S4 as a function of total particle kinetic energy per nucleon. The above efficiencies were validated by comparing distributions relevant to the selection of events, and obtained from flight data, with the same distributions generated by EPICS or FLUKA. An example is given in Fig. S5 where the total energy observed in each layer of the TASC (black points) were plotted using the final sample of iron candidates, marginally contaminated by a residual background (estimated around 1%) due to elements with atomic number close to iron. Compared with pure iron samples simulated by EPICS (blue) or FLUKA (red), the distributions from the two MC were found to be consistent with each other and in fair agreement with flight data.
Interactions in the instrument. The amount of instrument material above the CHD is very small and well known. The largest significant contribution is limited to a 2 mm thick Al cover placed on top of the CHD. This amounts to ∼ 2.2% radiation length and 5×10 −3 λ I . The material description in the MC is very accurate and derived directly from the CAD model. As CALET is sitting on the JEM external platform of the ISS, no extra material external to CALET is normally present within the acceptance adopted for the flux measurement. However, occasional obstructions caused by the ISS robotic arm operations may temporarily affect the FOV. Those rare periods are well identified and events discarded accordingly (a continuous monitoring is routinely done for gamma-ray analyses).
MC simulations were used to evaluate the iron survival probability after traversing both layers of the CHD and the material above. In order to check its consistency with flight data, the ratio R = (CHDX & CHDY) / CHDX (i.e. the fraction of iron candidates tagged by both CHD layers among those detected by the top charge detector) was plotted, as a function of the TASC energy, for selected iron candidates with measured charge in the range 25.5 -26.5.
In the upper panel of Fig. S6, R is shown in 15 bins of the TASC energy for both MC (R MC ) and FD (R F D ) with an average value around 90% and a flat energy dependence. The R MC /R F D double ratio (lower plot) shows a good level of consistency between the MC and flight data, within the errors. The total loss (∼ 10%) of iron events interacting in the upper part of the instrument was taken into account in the total efficiency and its uncertainty included in the systematic error. Background contamination. Background contamination of iron from neighbor elements is relatively small as shown in Fig. S7 as a function of the TASC energy. It is subtracted from the flux as explained in the main body of the paper. Calorimetric energy, bin size, and unfolding. The energy response of the TASC was studied with MC simulations and compared with the results of measurements of the total particle energy vs beam momentum carried out at CERN. During one of the beam test campaigns of CALET at the SPS with an extracted primary beam of 40 Ar (150 GeV/c/n), beam fragments were generated from an internal target and guided toward the instrument along a magnetic beam spectrometer that provided an accurate selection of their rigidity and A/Z ratio. The relation between the observed TASC energy and the primary energy was measured for several nuclei up to the highest available energy (6 TeV total particle energy in the case of 40 Ar). After an offline rejection of a very small amount of beam contaminants from the data, the shape of the TASC total energy was found to be consistent with a Gaussian distribution [S2].
The correlation matrix used for the unfolding was derived from the simulations, using two different MC codes EPICS and FLUKA, and applying the same selection cuts as in the FD analysis. Three different binning schemes (4, 5, 10 equal log-bins /decade) have been applied. Two normalized unfolding matrices (with 4 and 10 bins/decade) derived from EPICS are shown in Fig. S8.
The aforementioned three binning schemes were applied to the iron flux analysis with the result shown in Fig. S9 where only statistical errors are shown. Within the errors, no statistically significant difference was found among the three fluxes.
Energy dependent systematic errors. A breakdown of energy dependent systematic errors stemming from several sources (as explained in the main body of the paper) and including selection cuts, charge identification, MC model, energy scale correction, energy unfolding, beam test configuration and shower event shape is shown in Fig. S10 as a function of kinetic energy per nucleon.
Iron flux normalization and spectral shape. The CALET iron flux and a compilation of available data, including  the recent AMS-02 measurement [S3], are shown in Fig. S11, as an enlarged version of Fig. 2 in the main body of the paper.
Normalization issues among cosmic-ray flux measurements have a long historical record and unfortunately they seem to persist, in a few cases, also among more recent precision measurements. If we focus on the last 15 years, we notice: (1) an inconsistency of the DAMPE electron flux with AMS-02 in the energy interval from ∼ 50 GeV to 1 TeV, whereby the DAMPE electron data are found to be significantly higher than AMS-02 and CALET, the latter being in excellent agreement with each other; (2) an inconsistency of the DAMPE proton flux with AMS-02 (interval from ∼ 300 GeV to ∼ 1 TeV) and with CALET (from 300 GeV to ∼ 10 TeV) where the DAMPE proton flux is significantly higher than the other two experiments; (3) a tension in the AMS-02 normalization (by more than 20%) for carbon flux with respect to PAMELA, CALET and other previous experiments; a similar problem for oxygen with CALET (no oxygen data are available from PAMELA), while preliminary C, O results from DAMPE appear to  FIG. S10. Energy dependence (in GeV/nucleon) of systematic uncertainties (relative errors) for iron. The band bounded by the red lines represents the statistical error. The shaded band within the green lines shows the sum in quadrature of all the sources of systematics including energy independent ones. A detailed breakdown of systematic energy dependent errors, stemming from charge identification, MC model, energy scale correction, energy unfolding, beam test configuration and shower event shape is shown. The blue lines represent the sum in quadrature of statistical and total systematic uncertainties. be consistent in normalization with those of AMS-02 (the spectral shape is consistent within the errors for all three experiments); (4) a tension in normalization (by about 20%) between AMS-02 and CALET iron data.
A thorough investigation on possible unaccounted systematic effects related to the normalization of the fluxes has been carried out in the framework of different analyses of CALET with electron, proton, helium, carbon, oxygen, heavier nuclei, and iron data. We partially or totally rule out specific sources of uncertainty. The presence of a significant systematic issue on the live time normalization of the flux is considered unlikely, given the consistency in normalization of the CALET electron, (as well as proton) flux with AMS-02 (and PAMELA) within their respective rigidity ranges. Possible areas where systematic effects could be further investigated by CALET, and by other collaborations as well, include the impact of simulations (specifically on the hadronic cross sections). While EPICS and FLUKA differences were taken into account in the assessment of systematic errors, a comparative study with GEANT4 is not available for CALET at present. Another area under scrutiny is the CALET trigger efficiency. It was extensively studied using ratios of different trigger modes vs the minimum bias trigger and an excellent agreement was found with the MC simulations (see [S12]). The possibility of implementing additional trigger types, and dedicated orbital run modes, to further investigate this aspect is under study. In figure Fig. S12 the recently published AMS-02 flux is compared with the CALET iron data (with the same multiplicative flux factor E 2.7 ), after increasing the latter's overall normalization by 20%. Taking the data at facevalue, we notice that the data points of the two experiments not only share a very similar spectral shape, but also have comparable errors.   of the CALET differential spectrum in kinetic energy per nucleon of cosmic-ray iron. The first, second, and third error in the flux are representative of the statistical uncertainties, systematic uncertainties in normalization, and energy dependent systematic uncertainties, respectively.