Modeling early-universe energy injection with Dense Neural Networks

We show that Dense Neural Networks can be used to accurately model the cooling of high-energy particles in the early universe, in the context of the public code package DarkHistory. DarkHistory self-consistently computes the temperature and ionization history of the early universe in the presence of exotic energy injections, such as might arise from the annihilation or decay of dark matter. The original version of DarkHistory uses large pre-computed transfer function tables to evolve photon and electron spectra in redshift steps, which require a significant amount of memory and storage space. We present a light version of DarkHistory that makes use of simple Dense Neural Networks to store and interpolate the transfer functions, which performs well on small computers without heavy memory or storage usage. This method anticipates future expansion with additional parametric dependence in the transfer functions without requiring exponentially larger data tables.


I. INTRODUCTION
Dark matter (DM) constitutes 84% of the matter content in the universe [1] and plays an important role in the evolution of the early universe. It has so far eluded detection in all channels other than gravitational interactions. DM annihilation or decay could inject energy in the form of Standard Model particles, modifying the temperature and ionization of the intergalactic medium (IGM) and the anisotropies of the cosmic microwave background (CMB); studies of these observables have placed strong constraints on such energy injections (e.g. [2][3][4][5][6][7][8][9][10][11][12][13][14][15]).
DarkHistory [16] is a Python package developed to calculate the evolution of the IGM temperature and ionization in the early universe in the presence of such exotic energy injections. For an injected spectrum of Standard Model (SM) particles, it calculates the particle cascade by computing (1) the production of photons and electrons/positrons by the decay of the originally injected SM particles; (2) the subsequent secondary particle cascade and energy deposition arising from this exotic injection of photons/electrons/positrons, due to interaction with the IGM and the photon bath; (3) modifications to the IGM's temperature and ionization from the secondary particles and their energy deposition, using a simple Three-Level Atom (TLA) model.
These calculations are carried out in redshift steps, starting prior to recombination (at redshift 1 + z = 3000 by default) and ending well after reionization near the present day (1 + z = 4). In particular, the particle cascade in step (2) is evaluated using precomputed transfer functions, which are matrices that take an input spectrum and output the spectrum of secondary particles (for a given redshift step). DarkHistory includes the backreaction effects of changes to the ionization level of matter, which means the transfer functions themselves are functions of the gas ionization levels, as well as redshift. In the previous version of DarkHistory, this dependence is realized by interpolating tables of transfer function matrices on a grid of values for the hydrogen and helium ionization fractions, as well as a grid of redshift values.
At around 1.5 GB per table, with 12 tables around this size, the transfer functions take up significant storage space as well as memory during runtime, since they are all loaded in a standard run. They will also be difficult to scale up to include additional parametric dependence, as the expected size scales exponentially with the number of added parameters. Intuitively, storing the transfer functions as tables is an over-representation of the information content within: the transfer functions, despite not being smooth globally, can be divided into multiple regions that are relatively smooth, each of which could plausibly be fitted with analytical functions. In practice, however, finding such a solution is quite non-trivial, and even if a solution were found by ad hoc methods, it would be quite specific to individual transfer functions and potentially difficult to maintain under changes to the code. In this work, we present a general solution to these issues by replacing the transfer function tables with trained Dense Neural Networks.
Recently, machine learning, and especially Neural Networks (NNs), have seen many applications in high energy physics and astrophysics [17][18][19][20]. NNs are often used to capture highly nonlinear and complex relations between inputs and outputs; in particular, Dense Neural Networks (DNNs), also known as fully connected Neural Networks, are general function approximators given a sufficient number of neurons [21].
In the updated version of DarkHistory we present in this work, we use lightweight DNNs to store and automatically interpolate DarkHistory's transfer functions. A transfer function is essentially a multi-dimensional table, with 2 dimensions corresponding to the input and output particle energy, and the rest corresponding to physical parameters which in this case will be a subset of redshift z, ionized hydrogen fraction x HII ≡ n HII /n H , and singly ionized helium fraction x HeII ≡ n HeII /n H . (As a matter of convenience, we define the singly ionized helium fraction with the hydrogen number density as the denominator, so that x HeII and x HII can be easily summed.) The DNN-based transfer functions are shown in schematic form in Fig. 1. We let a DNN take in all of the relevant parameters on equal footing and predict (the natural logarithm of) the transfer function value P .
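The matrix assembly from a scalar network can be sketched in a few lines. This is a minimal illustration with hypothetical names, not DarkHistory's actual API: `model` stands in for any trained regressor mapping the five features (log input energy, log output energy, log(1+z), x_HII, x_HeII) to log P.

```python
import numpy as np

def build_transfer_matrix(model, eng_in, eng_out, z, x_HII, x_HeII):
    """Evaluate a scalar DNN on the full grid of input/output energies
    to assemble the transfer matrix P[i, j] (names hypothetical)."""
    li, lo = np.meshgrid(np.log(eng_in), np.log(eng_out), indexing="ij")
    feats = np.stack([li.ravel(), lo.ravel(),
                      np.full(li.size, np.log(1 + z)),
                      np.full(li.size, x_HII),
                      np.full(li.size, x_HeII)], axis=-1)
    logP = model(feats)                    # shape (N_in * N_out,)
    return np.exp(logP).reshape(li.shape)  # transfer matrix P[i, j]
```

Evaluating the network on the energy abscissa in a single batched call keeps the per-step overhead small compared to looping over matrix entries.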
Using DNNs as transfer functions has several benefits:

• The DNNs we use are smaller in size compared to the stored transfer function tables, by a factor of ∼400.
• The computed matter temperature history and ionization history match those calculated using the transfer function tables to within a few percent relative difference (with sub-percent relative differences in regions where the species in question are more than 10% ionized), while the spectral distortion due to upscattered CMB photons (see Sec. II for the precise definition) matches to below the 10 percent level. These errors are small compared to current experimental uncertainties.
• We expect the DNNs to scale better in size than the original tables when additional parameters are added. (Including a smooth dependence on an additional parameter might result in an O(1) increase in the number of DNN neurons needed to reach similar accuracy, due to the increased information content. By contrast, adding an additional dimension to a data table multiplies its size by the number of bins in the new parameter.)

• The DNNs automatically interpolate to any of the physical parameter values and input/output particle energies within the trained range. This allows the use of flexible binning in DarkHistory, and will also allow us to perform interpolation on sparse training data [22, 23]. The latter may become necessary in future extensions of DarkHistory, when probing dependence on an increasing number of physical parameters and generating dense grids of training data becomes computationally expensive.
• The DNNs predict transfer functions quickly, taking a similar amount of time to the rest of the evolution routine for injected particles. Thus, compared to retrieving tabular data from memory on a personal computer, the use of DNNs results in only an O(1) increase in total runtime.
• Open source building and training tools for NNs, and especially for simple architectures like DNNs, are readily available. (In this work we use TensorFlow 2.0 [24] with Keras [25].)

In Sec. II, we introduce DarkHistory and the roles of transfer functions. In Sec. III, we detail the training and implementation of the DNN transfer functions. In Sec. IV we present test runs and discuss the performance of DNN transfer functions compared to baseline DarkHistory. Finally, in Sec. V we summarize our results, briefly discuss other possible approaches, and outline some ideas for future expansion with DNN transfer functions in DarkHistory.

II. TRANSFER FUNCTIONS IN DARKHISTORY
To better illustrate the role of transfer functions, we briefly introduce the procedure followed in DarkHistory, sketched in Fig. 2 (modified from a flow chart in Ref. [16]). In Fig. 2, boxed quantities represent particle spectra, and arrows represent transfer functions, which are functions acting on spectra. DarkHistory stores the free streaming photon spectrum, the IGM temperature, and the IGM's ionized hydrogen fraction and singly ionized helium fraction at each redshift step (these quantities are assumed to be homogeneous). For each redshift step, DarkHistory:

1. Converts injected SM particles at that redshift to injected photons N^γ_inj and electrons/positrons N^e_inj (hereafter referred to collectively as electrons). In DarkHistory, high energy (> 3 keV) positrons are treated as electrons, since their dominant energy-loss process, Inverse Compton Scattering (ICS) on the CMB, does not depend on the particle charge. Lower-energy positrons are assumed to annihilate, and the resulting photon spectrum is tracked; their kinetic energy is approximated as following the same pattern of energy deposition as that of the electrons (see Ref. [16] for a more in-depth discussion).

2. Computes any injected electron spectrum's energy deposition into ionization, excitation, or heating by applying the transfer function R_c. Computes secondary photon and electron spectra produced from injected electrons due to ICS, positronium formation and decay, and atomic processes, by applying the ICS transfer function T_ICS and the secondary electron transfer function T_e, and evaluating the spectrum of gamma rays produced from positron annihilation, N^γ_pos. These secondary photon spectra are added to the spectrum of photons N^γ propagated from the previous step, plus any injected photon spectrum.
3. Computes the secondary particles and energy deposition produced by a propagating photon spectrum N^γ, due to a variety of processes including photon-photon scattering, Compton scattering, pair production, photoionization, and redshifting. The production of secondary electrons/positrons in the same redshift step, and their subsequent production of photons via ICS on the CMB or positron annihilation, are also included. The propagating photon spectrum for the next redshift step is obtained by applying the high energy photon transfer function P^γ to N^γ. The low energy photon spectrum, which stores photons below 3 keV that either photoionize within the redshift step or lie below 13.6 eV, is obtained by applying the low energy photon transfer function D^γ. The low energy electron spectrum, which stores electrons with kinetic energy below 3 keV (where atomic cooling dominates over ICS and is treated separately in the electron cooling module), is obtained by applying the low energy electron transfer function D^e. Finally, the photons' energy deposition into ionization, excitation, and heating is obtained by applying the high energy deposition transfer function D^high_c.

4. Computes the change to the IGM temperature and ionization by first calculating the energy deposition fractions f_c, from the low-energy electron/photon spectra and the direct energy deposition by higher-energy particles, and then performing the TLA integration (see Ref. [16] for details).
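The photon part of the steps above reduces, schematically, to a few matrix products per redshift step. The following is a heavily simplified sketch with hypothetical names; the real main.evolve also handles injection conversion, electron cooling, and the TLA integration.

```python
import numpy as np

def step(N_prop, N_inj_phot, tfs):
    """One simplified redshift step: spectra are row vectors on a fixed
    energy abscissa, transfer functions are matrices (names hypothetical)."""
    N_gamma = N_prop + N_inj_phot            # add newly injected photons
    N_prop_next = N_gamma @ tfs["P_gamma"]   # propagated high-energy photons
    N_low_phot  = N_gamma @ tfs["D_gamma"]   # low-energy photon deposition
    N_low_elec  = N_gamma @ tfs["D_e"]       # low-energy electron deposition
    E_high_dep  = N_gamma @ tfs["D_high_c"]  # direct high-energy deposition
    return N_prop_next, N_low_phot, N_low_elec, E_high_dep
```

This makes explicit why the transfer functions dominate storage: each spectral transfer function is a dense matrix over the full input/output energy grid.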
In general, transfer functions mapping an input spectrum to another spectrum (such as T_ICS, P^γ) take up much more space than those outputting an energy value (such as D^high_c) or those that are diagonal (such as D^γ). For this version of DarkHistory, we focus on replacing the largest spectral transfer functions with DNNs, but similar procedures can be applied to the lower-dimension transfer functions in the future. In the following, we introduce in more detail the two major types of transfer functions we will replace.

A. ICS transfer functions
As discussed above, the ICS transfer functions describe the spectrum of scattered photons from the complete cooling of injected electrons. Unlike the transfer functions applied to the photon spectrum, the ICS transfer functions T_ICS, T_e and R_c are not directly interpolated from tables, but are reconstructed from reference ICS transfer function tables. Sec. III.D and Appendix A of Ref. [16] describe in detail how this is achieved, and we only provide a brief summary here. Since an electron quickly deposits all of its energy within one of our redshift steps, in order to obtain the total secondary spectrum or energy deposition from an electron, we need to consider multiple interaction events (via ICS and atomic processes). This can be done recursively: one can reconstruct the full ICS secondary spectrum and energy deposition for an electron of energy E knowing the same information for all electrons with E′ < E. With a discretized energy abscissa, the full ICS-induced photon spectrum and energy output can be solved recursively starting from the lowest energy bin. As a result, at each redshift, one can solve for the full electron ICS transfer functions using transfer functions describing a single ICS scattering event (as well as functions describing the interaction rates due to atomic processes).
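The recursion can be sketched in a few lines. This is a simplified illustration with hypothetical names; the real calculation also folds in atomic interaction rates and energy deposition (see Ref. [16]).

```python
import numpy as np

def full_ics_spectrum(single_scatter, photon_yield):
    """Hypothetical sketch of the recursion: single_scatter[i, j] is the
    probability for an electron in bin i to land in electron bin j < i after
    one scattering; photon_yield[i] is the photon spectrum emitted in that
    event. The fully-cooled photon output F[i] is solved bin by bin,
    starting from the lowest energy bin."""
    n = single_scatter.shape[0]
    F = np.zeros((n, photon_yield.shape[1]))
    for i in range(n):
        F[i] = photon_yield[i]
        for j in range(i):               # electrons only lose energy, j < i
            F[i] += single_scatter[i, j] * F[j]
    return F
```

Because electrons only move down in energy, the system is lower-triangular and a single forward sweep suffices.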
The transfer functions for a single ICS scattering event on the CMB have simple (approximate) scaling relations with respect to the CMB temperature T, as described in detail in Appendix A of Ref. [16]. As such, one can derive ICS transfer functions at different redshifts from a single transfer function at a fixed redshift, assuming the CMB is the dominant radiation background (in DarkHistory, 1 + z = 400 is used). These reference transfer functions are interpolated from the tables ics_thomson, ics_rel, and ics_engloss, corresponding respectively to the secondary photon spectra of nonrelativistic electrons, the secondary photon spectra of relativistic electrons, and the relativistic electron energy loss, in a single ICS scattering event on the CMB. It is these tables that we will fit with DNNs.

B. Photon transfer function
The high energy photon transfer function P^γ, low energy photon transfer function D^γ, and low energy electron transfer function D^e are interpolated from the corresponding tables highengphot, lowengphot, and lowengelec. They in general depend on the CMB temperature through the redshift, and on the matter ionization levels (the ionized hydrogen fraction and singly ionized helium fraction). For each combination of these physical parameters, lowengphot can be represented as a 1-D array (with the one dimension being the input/output energy) and is a factor of 500 smaller than the other transfer functions, so it is at present not replaced with a DNN. We also found that the numerical calculation of Compton scattering used in the previous version of DarkHistory was inaccurate in some parts of parameter space (in particular, populating kinematically forbidden regions), and so we have updated the relevant tabular transfer functions to ensure sufficient accuracy.

We now describe some special features of the photon transfer functions:

a. Redshift regimes and matter ionization dependence. All photon transfer functions depend on the redshift, as the photon cooling processes involve interactions with the redshift-dependent photon background and/or intergalactic medium. For late redshifts (z < 40) encompassing the epoch of reionization, the transfer functions are allowed to vary with both the ionized hydrogen fraction x_HII and the singly ionized helium fraction x_HeII, which can be altered by exotic energy injections. For redshifts between helium recombination and reionization (40 < z < 1600), the ionized helium fraction can be safely approximated as zero [16]; the transfer functions are precomputed assuming no helium ionization, but can depend on the ionized hydrogen fraction.
Before helium recombination (z > 1600), exotic energy injections consistent with current experimental bounds have little impact on the thermal equilibrium determining the hydrogen and helium ionization levels [16], so DarkHistory uses RECFAST [26] ionization fractions as a baseline to pre-compute the transfer functions, which then depend only on redshift.
For the DNN implementation, the flexibility of the network allows one DNN to be trained on the entire redshift range 4 < z < 3000 with a learned dependence on the ionization levels, for each transfer function. However, using different networks for different redshift regimes gives slightly better accuracy, and the latter approach is chosen in this version of DarkHistory.
b. Redshift step coarsening and energy conservation. In the previous version of DarkHistory, photon transfer functions are computed with a redshift step ∆ log(1 + z) = 0.001. One can choose to increase the (log) step size to multiples of 0.001 to speed up computation. To combine multiple redshift steps, DarkHistory pre-composes multiple propagation transfer functions P^γ and applies them appropriately to the deposition transfer functions D^γ, D^e, and D^high_c. Since the DNN implementation of the transfer functions introduces a small amount of error that can accumulate over many redshift steps, in order to increase numerical stability we train the DNNs to reproduce the pre-composed transfer functions, and require the use of a fixed log redshift step of ∆ log(1 + z) = 0.012. To further decrease numerical error, we store the total fraction of the injected energy entering each type of secondary particle spectrum, for an injected photon of any given energy, and use these data (which are ∼1% of the size of the transfer functions) to ensure energy conservation while accounting for energy loss to redshifting. We discuss the precise procedure of transfer function pre-composition and the imposition of energy conservation in Appendix A.

III. TRANSFER FUNCTIONS FROM NEURAL NETWORKS
As described earlier and as indicated in Fig. 1, we replace the largest transfer function tables (ics_thomson, ics_rel, ics_engloss, highengphot, and lowengelec) with DNNs that take in the input and output particle energies, the redshift, and possibly (depending on the redshift) the ionized hydrogen fraction and singly ionized helium fraction, to produce the transfer function value P. In general, P can vary across many orders of magnitude. (For example, after recombination the probability for a 10 keV photon to free stream, losing energy only through redshifting, is substantial, while the chance of it directly producing secondary photons of 1 keV is very close to 0, since this outcome is not kinematically allowed in a single Compton scattering event, nor is the scattered electron produced by Compton scattering or photoionization able to up-scatter other photons to 1 keV.) As such, we train the networks to output the natural logarithm of the transfer function value, log P. Similarly, since the input/output energy abscissa and our redshift steps are also binned in log space by default, the networks take the log values of these as inputs. The ionized hydrogen fraction and singly ionized helium fraction are linearly scaled to match the spread of the other parameters before being fed into the DNN.
For high energy photons, the redshift-coarsened transfer functions (see Sec. II B) are fitted. Note also that the transfer function value P can be negative: by convention, the CMB spectrum is subtracted from the transfer function, and the negative values (together with the positive values at higher energies) represent CMB photons being upscattered. In this case log |P| is predicted by the DNNs, and the negative-value region is recovered in a post-processing step that identifies local minima of log |P| (which takes a negligible amount of time compared to the rest of the DarkHistory routine). Additionally, the transfer function values near the diagonal in the input/output energy dimensions (corresponding to the free-streaming photons) are adjusted to enforce energy conservation up to redshifting. For details please refer to Appendix A.
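The sign-recovery idea can be sketched as follows. This is a minimal, hypothetical illustration: near the zero crossing |P| is smallest, so the deepest minimum of log |P| along the output-energy axis marks the sign flip. DarkHistory's actual post-processing step is more careful about identifying the crossing.

```python
import numpy as np

def recover_signs(log_absP):
    """Recover signed P from log|P| along the output-energy axis, assuming
    a single sign flip from negative (low energies) to positive (high
    energies) located near the deepest minimum of log|P| (a sketch)."""
    i_min = np.argmin(log_absP)      # |P| is smallest near the zero crossing
    signs = np.ones_like(log_absP)
    signs[:i_min] = -1.0             # below the crossing: upscattered-away CMB
    return signs * np.exp(log_absP)
```

The bin containing the crossing itself is ambiguous at the level of this sketch; since |P| is tiny there, the induced error is small.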
After adjusting hyperparameters, we find that for all transfer functions in question, it is sufficient to use DNNs with 7 hidden layers of 400 neurons each, making the number of parameters per DNN ∼ 400^2 × (7 − 1) ≈ 9.7 × 10^5, and ≈ 2.9 × 10^6 for each transfer function built from 3 such networks.
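The parameter count can be checked with a few lines of arithmetic (assuming 5 scalar inputs and a single scalar output; the exact input dimension depends on the redshift regime):

```python
def dnn_param_count(n_in, width, n_hidden, n_out=1):
    """Weights + biases of a fully connected network:
    input layer, (n_hidden - 1) hidden-to-hidden layers, output layer."""
    total = n_in * width + width                       # input -> hidden 1
    total += (n_hidden - 1) * (width * width + width)  # hidden -> hidden
    total += width * n_out + n_out                     # hidden -> output
    return total

# 5 inputs (log E_in, log E_out, log(1+z), x_HII, x_HeII), 7 layers of 400
n_params = dnn_param_count(5, 400, 7)
```

The total is dominated by the 6 hidden-to-hidden weight matrices of size 400 × 400, matching the ∼ 400^2 × (7 − 1) estimate in the text.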
Training is done with TensorFlow 2.0 [24] and Keras [25]; Adagrad [27] is used as the optimizer, with the mean squared error of log P as the loss function. Each DNN is trained on 2 V100 GPUs for O(10) hours or equivalent. For each epoch, training data are generated by interpolating the multi-dimensional transfer function table at uniformly random sampled inputs. Since training data are not reused across epochs, there is no concern of overfitting to a fixed subset of the full data set. Training is terminated when the evaluation loss after each iteration stops improving significantly. To check for systematic offsets between the tabulated transfer functions and the NNs, we trained multiple NNs with random initial values and randomly sampled training data. We found no obvious systematic offsets between the ensemble of NNs and the tables; e.g. at any given point, the different NNs both underpredicted and overpredicted the table data.
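The fresh-samples-per-epoch strategy can be sketched as follows (hypothetical names; `interpolator` stands in for the multi-dimensional table interpolation that labels the samples):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_batch(interpolator, bounds, n=1024):
    """Draw fresh uniform samples of the network inputs (e.g. log E_in,
    log E_out, log(1+z), x_HII, x_HeII) within the table bounds and label
    them by interpolating the table. Because every epoch sees new samples,
    no fixed subset of the data can be overfit (names hypothetical)."""
    lo, hi = bounds
    X = rng.uniform(lo, hi, size=(n, len(lo)))
    y = interpolator(X)          # log P labels from table interpolation
    return X, y

# Sketch of the training loop (model assumed compiled with Adagrad + MSE):
# for epoch in range(n_epochs):
#     X, y = sample_batch(table_interp, (lower, upper))
#     model.fit(X, y, epochs=1, verbose=0)
```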
The code related to the DNN transfer functions is stored in the nntf module under DarkHistory. A new example file, "Example 12: Using Neural Network transfer functions.ipynb", is provided to demonstrate the use of the DNN transfer functions and a comparison with baseline DarkHistory (the latter will be available if the appropriate data tables are present).

IV. PERFORMANCE
In this section, we describe the accuracy and speed of generation of the DNN transfer functions, as well as the accuracy of full runs over a range of simple DM injection scenarios using the DNN transfer functions.

A. Transfer function value prediction
In Fig. 3, a high energy photon transfer function generated by a DNN is compared against one interpolated from the tables. As one can see, the errors in the raw output, i.e. the logarithm of the transfer function values, are concentrated near distinct physical features in the transfer function, such as the output photon energy of ∼0.5 MeV corresponding to positronium decay. The absolute values of the logarithmic errors ∆ log10 |P| can be interpreted as relative errors in |P| (up to a ln 10 factor). In this particular slice through the high energy photon transfer function, the average ∆ log10 |P| where |P| > 10^−20 is 0.017, corresponding to a relative error of ∼4%. (log10 |P| ranges from about −45 to 6.) The overall ∆ log10 |P| is comparable to this value for all DNN transfer functions. A summary of the errors can be found in Tab. I.
To see the impact of these errors on a DarkHistory evolution run, it is also useful to look at errors in energy (per bin) transition rates, in addition to particle number transition rates. The energy transfer functions de-emphasize errors at low energies, where many particles can be produced carrying only a small fraction of the original particle's energy, so that such errors have a small effect on heating and ionization. Since the standard photon transfer functions are maps between particle number spectra N_i, the transfer function values have the physical meaning of number transition rates. The particle energy spectrum is related to the number spectrum by

N^E_i = E_i N_i,

where N_i is the number of particles in the i-th bin, and E_i its central energy. (Note that the energy bins are log-spaced.) For a particle number transfer function P, the corresponding energy transfer function P^E is defined by

P^E_ij = (E_j / E_i) P_ij.

In the last panel of Fig. 3, we show the relative error in the energy transfer function for high energy photons. As expected, the errors are concentrated at the highest output energy for a given input energy. The relative errors are generally sub-percent. Fig. 4 shows that the DNNs interpolate sensibly between the fixed abscissa values in the transfer function tables, taking the high energy photon transfer function in the lowest redshift regime (4 < z < 40) as an example. The errors in the transfer function interpolation are consistent with the average values shown in Tab. I.
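The conversion from a number transfer matrix to the corresponding energy transfer matrix is a simple rescaling (a sketch assuming P[i, j] maps input bin i to output bin j):

```python
import numpy as np

def energy_transfer(P, E_in, E_out):
    """Convert a number transfer matrix P[i, j] into the energy transfer
    matrix P_E[i, j] = P[i, j] * E_out[j] / E_in[i], so that it maps
    energy-per-bin spectra N_E,i = E_i N_i to energy-per-bin spectra."""
    return P * E_out[np.newaxis, :] / E_in[:, np.newaxis]
```

Weighting by the output-to-input energy ratio suppresses entries where many low-energy secondaries carry little total energy, which is exactly why the energy-weighted error is the more relevant diagnostic for heating and ionization.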
The time it takes to generate a photon transfer function is about 1 second or less on an 8-CPU personal computer. The ICS transfer functions, which are only generated once per run, take only slightly longer, at about 3 s. The accuracy and prediction time for all other transfer functions can be seen in Tab. I.

Fig. 5 shows the evolution of integrated variables in one particular setting: 0.1 GeV DM particles decaying into electron-positron pairs (the same example used in Fig. 4 of Ref. [16]). In this example, the error introduced by the DNN in the matter temperature history and hydrogen ionization history is consistently sub-percent, while the error in the singly ionized helium fraction is sub-percent when n_HeII/n_He > 10^−3. For a scenario where DM decays into electron-positron pairs, over a range of DM rest masses, the maximum relative error in temperature and ionized hydrogen fraction over the entire redshift range is consistently below 2%, as shown in Fig. 6. Taking into account injection scenarios with DM decaying to photons, and also undergoing s-wave annihilation into e+e− or photons, the relative error is always below 8%.

DarkHistory also computes the partial photon spectral distortion from high energy photon and electron processes, mostly from ICS of electrons or positrons on the CMB. This distortion is stored in the low energy photon spectrum. Note that spectral distortions arising from atomic transitions, due to photons and electrons below 3 keV, are not included in this spectrum (which is why we label it as "partial" or "incomplete"). The DNN transfer functions introduce a small amount of error in this spectral distortion, as shown in Fig. 7 and Fig. 6. While the shape of the spectral distortion is generally correct, the error in the location of the distortion zero can cause errors with a magnitude up to 10% of the distortion magnitude.
In a future update of DarkHistory, we anticipate including the correct treatment of the complete photon spectral distortion, including contributions from atomic transitions. The small photon spectral error demonstrated here should allow the DNN transfer functions to be used simultaneously with these updates.

B. Performance over a range of scenarios
In Appendix B, we include the errors in some other exotic injection scenarios, including DM decaying to photon pairs, and DM annihilating to photon or e+e− pairs. Although these examples cover only a few simple injection scenarios, they serve to test the whole range of DarkHistory's dependence on transfer functions, since all exotic energy injections are converted to either e+e− or photon injections. We expect exotic energy injections with a more complicated injection spectrum (e.g. annihilation to quarks) to have similar errors associated with using the DNN transfer functions.

V. CONCLUSION
In this work, we have made use of simple Dense Neural Networks to approximate complex and multi-dimensional transfer functions in DarkHistory to reduce storage and

FIG. 6. Relative errors in temperature, ionization levels, and low energy photon distortion across a range of DM masses, in a scenario with DM decaying to e+e−. The left panel shows the maximum relative error at any point in the evolution over 4 < z < 3000 of the matter temperature, ionized hydrogen fraction, and singly ionized helium fraction, across a range of DM exotic electron injection energies.* The right panel shows the relative error in the (partial) spectral distortion. (*Due to its very small absolute value, the relative error in the singly ionized helium fraction x_HeII when n_HeII/n_He < 10^−3 is not included in this plot. Since n_HeII/n_He changes rapidly between O(1) and < 10^−3, we are essentially only counting its relative error when its value is order unity.) The relative errors are generally below 5%.

FIG. 7.
Example partial low-energy photon spectral distortion at the present day. The two panels show two examples of low energy photon spectral distortion outputs (see definition in Sec. II) from two different runs: 0.1 GeV and 10 GeV DM decaying to e−e+ pairs. They represent the extremes of large and small relative errors for runs over the full range of DM masses we consider, as shown in Fig. 6. The photon number density (per bin) is normalized against the baryon number density. The black lines represent outputs generated using the tabulated transfer functions, while the red dashed lines represent those using the DNN transfer functions. The blue line shows the difference between the two. Note that the relatively large errors shown in the right panel are partly due to the error in the location of the distortion zero.
memory usage, as well as to enable the possibility of adding more parameter dependence to the transfer functions. The DNN transfer functions achieve good accuracy in computing the evolution history of the matter temperature and ionization, as well as the partial CMB spectral distortion evaluated by the current version of DarkHistory; typical errors are at the few percent level, comparable to or smaller than estimates of systematic uncertainties in previous studies of constraints on energy injection [9, 10, 16]. The DNN-based functionality is available in the DarkHistory Github repository at https://github.com/hongwanliu/DarkHistory, and the necessary data files (one can choose to download the large tables, the DNN and auxiliary files, or both) are hosted on Zenodo (see the Github repository for details). While the use of DNNs offers one solution to this challenge, there may well be other viable solutions. The information in the DNNs still seems likely to be an over-representation of the piecewise-smooth transfer functions. We have briefly explored some alternative methods, including fitting to conventional functions directly,
and with the assistance of symbolic regression techniques [28]; however, DNNs stand out as the best solution (so far) in terms of fitting accuracy and ease of implementation. There is ongoing work to expand the capabilities of DarkHistory, and we look forward to exploring NN-based and alternative techniques in this context.

A. Redshift step coarsening and energy conservation
In this Appendix, we describe how the photon transfer functions change with a coarsened redshift step (expanding on Sec. III.E.3 of Ref. [16]), and how energy conservation is implemented while correctly accounting for photon energy loss due to redshifting.

Transfer functions without coarsening
In DarkHistory's main.evolve function, the evolution is discretized into redshift steps with fixed logarithmic spacing d (dlnz in code), where the next redshift z′ is expressed in terms of z such that

1 + z′ = (1 + z) e^{−d}.

Following DarkHistory's flow described in Fig. 2, to obtain the propagating photon spectrum N^γ_prop, low energy photon spectrum N^γ_low, low energy electron spectrum N^e_low, and energy deposition array E^high_c, we apply the corresponding transfer functions P^γ, D^γ, D^e, and D^high_c. These transfer functions all take in the propagating photon spectrum N^γ_prop at redshift z and produce secondary spectra at z′ (D^high_c produces the energy deposition in this redshift step). They are implemented as

N^γ_prop(z′)_j = Σ_i N^γ_prop(z)_i P^γ_ij,
N^γ_low(z′)_j = Σ_i N^γ_prop(z)_i D^γ_ij,
N^e_low(z′)_j = Σ_i N^γ_prop(z)_i D^e_ij,
E^high_c(z′) = Σ_i N^γ_prop(z)_i D^high_c,i,

where i and j are indices of the discretized energy abscissa.
Energy conservation can be enforced straightforwardly: for an injection in any photon energy bin with central energy E_i, the total output energy on the right-hand side of the above equations should add up to E_i minus the loss to redshifting of the propagating photons. Photons below a certain energy E_relevant do not contribute to the redshift energy loss, because they either dump all of their energy efficiently within one redshift step, or free stream and no longer interact, in which case they are stored in an array recording the history of low energy photons N^γ_low(z′) and are not immediately redshifted.
For the propagating photons that are redshifted by P^γ, the energy lost over one step is approximately

|E_redshift| ≈ Σ_i N^γ_prop,i d E_i.

Let the energy abscissa (log-central energies of each bin) for photons and electrons be E^γ_i and E^e_i respectively. We can express the above as

|E_redshift| = Σ_i N^γ_prop,i d_i E^γ_i,

where d_i = d when the photon with energy E_i should be redshifted, and d_i = 0 otherwise. (This renders d_i a function of redshift and of the hydrogen and helium ionization levels, in general.) Then the energy conservation constraint can be written as

E^γ_i = Σ_j P^γ_ij E^γ_j + Σ_j D^γ_ij E^γ_j + Σ_j D^e_ij E^e_j + D^high_c,i + d_i E^γ_i.

This relation is imposed by shifting the near-diagonal (propagating) part of the high energy photon transfer function P^γ. Any energy non-conservation due to numerical errors, or errors from approximating the transfer functions with DNNs, can be absorbed into this shift.
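The diagonal shift can be sketched as follows. This is a simplified illustration with hypothetical names: here the deposition transfer functions D^γ, D^e, and D^high_c have been collapsed into per-input-bin output-energy totals, and any mismatch is absorbed into the diagonal of P^γ.

```python
import numpy as np

def enforce_energy_conservation(P_gamma, E_dep_gamma, E_dep_elec,
                                E_dep_high, E, d_i):
    """For each input bin i, shift the diagonal (free-streaming) entry of
    P_gamma so that the energy carried by propagated photons equals the
    injected E_i minus all depositions and the redshift loss d_i * E_i
    (a sketch; names hypothetical)."""
    out_energy = P_gamma @ E   # energy currently carried by propagated photons
    budget = E - E_dep_gamma - E_dep_elec - E_dep_high - d_i * E
    # absorb the mismatch into the diagonal entries
    np.fill_diagonal(P_gamma, np.diag(P_gamma) + (budget - out_energy) / E)
    return P_gamma
```

After the shift, `P_gamma @ E` reproduces the energy budget exactly, so small DNN prediction errors cannot accumulate as spurious heating or cooling over many steps.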

Coarsening
DarkHistory can enlarge the redshift step d to a multiple of a preset step d_0. Let the coarsening multiple be c; then the next redshift z′ is such that

1 + z′ = (1 + z) e^{−c d_0}.

The photon and electron transfer functions are built with log redshift step d_0, so we have to reconstruct the transfer functions for d. We first obtain the single-step transfer functions (and the d_i) at the current ionization levels and at a redshift value z_mid in the middle of the large redshift step d. Then, we approximate the large redshift step as consisting of c single redshift steps, each with the same transfer function applied. The compounded transfer functions can be expressed as

N^γ_prop(z′) = N^γ_prop(z) (P^γ)^c,
N^γ_low(z′) = N^γ_prop(z) [1 + P^γ + (P^γ)^2 + · · · + (P^γ)^{c−1}] D^γ,
N^e_low(z′) = N^γ_prop(z) [1 + P^γ + (P^γ)^2 + · · · + (P^γ)^{c−1}] D^e,
E^high_c(z′) = N^γ_prop(z) [1 + P^γ + (P^γ)^2 + · · · + (P^γ)^{c−1}] D^high_c, (12)

with all transfer functions evaluated at z_mid. In the DNN implementation, the compounded transfer functions (P^γ)^c, 1 + P^γ + (P^γ)^2 + · · · + (P^γ)^{c−1}, and D^e are learned as DNNs, and this step can be carried out without using the value of P^γ itself. (The low energy photon transfer function D^γ can be quickly reconstructed using the CMB energy loss information.) Imposing energy conservation is similar. At each d_0 step, the redshift energy loss is Σ_i N^γ_prop(z + n d_0)_i d_0,i E^γ_i, where N^γ_prop(z + n d_0) = N^γ_prop(z) (P^γ)^n. So the total redshift energy loss for an E_i photon is

|E_redshift,i| = Σ_j [1 + P^γ + (P^γ)^2 + · · · + (P^γ)^{c−1}]_ij d_0,j E^γ_j. (13)

Let S^γ,c = 1 + P^γ + (P^γ)^2 + · · · + (P^γ)^{c−1}. Then the energy conservation condition is

E^γ_i = Σ_j [(P^γ)^c]_ij E^γ_j + Σ_j [S^γ,c D^γ]_ij E^γ_j + Σ_j [S^γ,c D^e]_ij E^e_j + [S^γ,c D^high_c]_i + |E_redshift,i|.

Again, the propagating photon spectrum can be adjusted to account for energy non-conservation from numerical and DNN prediction errors.

This work made use of the … [29], Jupyter [30], matplotlib [31], NumPy [32], TensorFlow [24], Keras [25], and SciPy [33] software packages.
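The compounded operators in Eq. (12) can be built by repeated multiplication, accumulating the geometric sum S^γ,c alongside the power (a minimal sketch; `P` stands for the single-step matrix P^γ):

```python
import numpy as np

def compound(P, c):
    """Return ((P)^c, S = 1 + P + ... + P^(c-1)) for coarsening multiple c,
    by repeated multiplication (a sketch of Eq. (12))."""
    n = P.shape[0]
    Pc = np.eye(n)           # running power, ends at P^c
    S = np.zeros_like(P)
    for _ in range(c):
        S += Pc              # accumulate 1 + P + ... + P^(c-1)
        Pc = Pc @ P
    return Pc, S
```

The propagated spectrum is then `N @ Pc`, while the deposition spectra use `N @ S @ D` for each deposition transfer function D, matching the structure of Eq. (12).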