Application of Neural Networks for the Reconstruction of Supernova Neutrino Energy Spectra Following Fast Neutrino Flavor Conversions

Neutrinos can undergo fast flavor conversions (FFCs) within extremely dense astrophysical environments such as core-collapse supernovae (CCSNe) and neutron star mergers (NSMs). In this study, we explore FFCs in a \emph{multi-energy} neutrino gas, revealing that when the FFC growth rate significantly exceeds that of the vacuum Hamiltonian, all neutrinos (regardless of energy) share a common survival probability dictated by the energy-integrated neutrino spectrum. We then employ physics-informed neural networks (PINNs) to predict the asymptotic outcomes of FFCs within such a multi-energy neutrino gas. These predictions are based on the first two moments of neutrino angular distributions for each energy bin, typically available in state-of-the-art CCSN and NSM simulations. Our PINNs achieve errors as low as $\lesssim6\%$ and $\lesssim 18\%$ for predicting the number of neutrinos in the electron channel and the relative absolute error in the neutrino moments, respectively.

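One of the latest advancements in this field is the discovery of fast flavor conversions (FFCs), which can take place on scales significantly shorter than those expected in vacuum. A necessary and sufficient condition for the occurrence of FFCs is that the angular distribution of the neutrino lepton number, defined as
$$G(\mathbf{v}) = \sqrt{2}\,G_{\rm F} \int_0^\infty \frac{E_\nu^2\,\mathrm{d}E_\nu}{(2\pi)^3} \left[ \big(f_{\nu_e}(\mathbf{p}) - f_{\nu_x}(\mathbf{p})\big) - \big(f_{\bar\nu_e}(\mathbf{p}) - f_{\bar\nu_x}(\mathbf{p})\big) \right], \qquad (1)$$
crosses zero at some $\mathbf{v} = \mathbf{v}(\mu, \phi_\nu)$, with $\mu = \cos\theta_\nu$ [35].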
Here, $G_{\rm F}$ represents the Fermi coupling constant, and $E_\nu$, $\theta_\nu$, and $\phi_\nu$ are the neutrino energy and the zenith and azimuthal angles of the neutrino velocity, respectively. The $f_\nu$'s are the neutrino occupation numbers of different flavors, with $\nu_x$ and $\bar\nu_x$ denoting the heavy-lepton flavor of neutrinos and antineutrinos. In this study, as is also commonly observed in state-of-the-art CCSN and NSM simulations, we assume that $\nu_x$ and $\bar\nu_x$ have similar angular distributions. The expression in Eq. (1) then reduces to the conventional definition of the neutrino electron lepton number, $\nu$ELN.
The occurrence of FFCs on scales much shorter than those resolved in typical hydrodynamical simulations of CCSNe and NSMs makes their integration into such simulations a formidable task. One prospective approach is to perform short-scale simulations of FFCs and then extrapolate the insights gained to inform broader hydrodynamic simulations [30,46,58,59,66-68]. Accordingly, a body of research has assessed FFC outcomes in local dynamical simulations with periodic boundary conditions [45,49,52-55,69-74], where quasistationary flavor states have been observed in the neutrino gas. In particular, it has been demonstrated that such states can be accurately described by analytical formulations [74].
Despite this, incorporating FFCs into CCSN and NSM simulations is still challenging. The obstacle arises from the need for complete angular distributions of neutrinos, which are demanding to obtain in computationally intensive simulations.
As an alternative, advanced simulations often simplify neutrino transport by using a limited set of angular distribution moments [75-77]. In a multi-energy neutrino gas, one can define the radial moments (with a focus on axisymmetric crossings) for each energy bin as […], with $E_{\nu,i}$ and $\Delta E_{\nu,i}$ being the mean energy and the width of the $i$-th energy bin, where $I_{0,i} = n_{\nu,i}$ is the number of neutrinos in that specific bin. Following these moments allows for a computationally more manageable treatment of the neutrino transport. In practice, simulations typically provide only $I_{0,i}$ and $I_{1,i}$. The challenge is then to determine the final values of $I_{0,i}$ and $I_{1,i}$ following FFCs. Note that the energy-integrated $I$'s can be simply […] learning and performance of the NN can be enhanced with the utilization of domain knowledge [79-81].
Our findings demonstrated the efficacy of a single hidden layer PINN, which achieved remarkable accuracy in predicting the asymptotic values of $I_0$ and $I_1$ in a single-energy neutrino gas.
In this paper, we extend our prior research by considering a multi-energy neutrino gas, which is a more realistic scenario. We demonstrate that when the FFC growth rate significantly exceeds that of the vacuum Hamiltonian, all neutrinos, irrespective of energy, share a common survival probability dictated by the energy-integrated neutrino spectrum, consistent with the findings of Ref. [32].
To predict the asymptotic outcome of FFCs, we employ a PINN. This PINN takes as input the initial (anti)neutrino zeroth and first moments, for both the energy-integrated neutrino spectra and a specific neutrino energy bin. It then returns the corresponding moments of the asymptotic outcome of FFCs for that energy bin. Our findings highlight the effectiveness of a single hidden layer PINN, which achieves remarkable accuracy in predicting the asymptotic values of $I_0$ and $I_1$ for each neutrino energy bin.
This paper is organized as follows. In Sec. II, we begin by providing an overview of our simulations of FFCs in a multi-energy neutrino gas. We also elaborate on the assumptions made in deriving the outcomes of FFCs. In Sec. III, we describe the architecture of our NNs, elaborating on the necessary feature engineering and the implementation of a tailored loss function. We also present and discuss our results in that section. Finally, we summarize our findings and present our conclusions in Sec. IV.

II. FFCS IN A MULTI-ENERGY NEUTRINO GAS
In this section, we present the results of our simulations of FFCs in a multi-energy neutrino gas. Essentially, when the growth rate of FFCs, $\kappa$, significantly exceeds the vacuum frequency, $\omega$, i.e., $\kappa \gg \omega$, one expects the neutrino energy to become inconsequential to the flavor evolution. Here, $\omega \equiv \delta m^2/(4E_\nu)$, with $\delta m^2$ being the squared neutrino mass difference. This suggests that in such circumstances, all neutrinos should experience identical survival probabilities dictated solely by the energy-integrated neutrino spectrum.
The condition $\kappa \gg \omega$ can be expected to be met in a dense neutrino gas provided that $\lambda \gg \omega$ (footnote 1), where $\lambda = \sqrt{2} G_{\rm F} n_{\nu_e}$, with $n_{\nu_e}$ being here the initial $\nu_e$ number density of the neutrino gas. A crucial exception arises when the neutrino gas lepton asymmetry ratio, defined as $\alpha = n_{\bar\nu_e}/n_{\nu_e}$, is extremely close to unity, which already implies that flavor equipartition should occur on short scales in the neutrino gas, even in the absence of FFCs (see the discussion in Ref. [83]).
In the following, we first demonstrate that when $\lambda \gg \omega$, all neutrinos (with different energies) experience identical survival probabilities. Subsequently, we discuss an analytical representation of the survival probabilities, which will be useful for our PINN calculations.

A. Results of the simulations
We consider a multi-energy and multi-angle neutrino gas in a 1D box, extending the framework outlined in Ref. [53]. Our model assumes translation symmetry along the $x$ and $y$ axes and axial symmetry around the $z$ axis, and employs periodic boundary conditions in the $z$ direction. We also adopt the two-flavor approximation, neglect neutrino-matter forward scattering, and assume for simplicity that the system initially consists of (anti)neutrinos of electron flavor whose energy and angular distributions are spatially homogeneous. Under these assumptions, the evolution of the normalized neutrino and antineutrino density matrices, $\varrho(t, z, \omega, \mu)$ and $\bar\varrho(t, z, \omega, \mu)$, is governed by the following equations of motion,
$$(\partial_t + \mu\,\partial_z)\,\varrho(t, z, \omega, \mu) = -i\,[H(t, z, \omega, \mu), \varrho(t, z, \omega, \mu)],$$
$$(\partial_t + \mu\,\partial_z)\,\bar\varrho(t, z, \omega, \mu) = -i\,[\bar H(t, z, \omega, \mu), \bar\varrho(t, z, \omega, \mu)],$$
where $\mu$ here represents the neutrino velocity in the $z$ direction. The Hamiltonians $H(t, z, \omega, \mu)$ and $\bar H(t, z, \omega, \mu)$ are given by $H(t, z, \omega, \mu) = H_{\rm vac}(\omega) + H_{\nu\nu}(t, z, \mu)$ and […], where […] with $\theta_{\rm eff}$ the effective vacuum mixing angle, and […]. Here, the neutrino distribution functions are similar to those introduced in Eq. (1) except that […]
Footnote 1: In such situations, the growth rate of FFCs could be significantly suppressed so that $\kappa \gg \omega$ does not hold anymore [82].
FIG. 1. Range spanned by the spatially averaged survival probabilities over all $\omega$ values (grey-shaded area; see Eq. (10)), with the survival probability averaged over space and $\omega$ (black dashed curve; see Eq. (9)), at the final time of the simulation for a system with $\omega \ll \lambda$. Also shown is the analytical prescription for $\langle P_{\rm sur}(\mu, \omega)\rangle_f$ (blue dotted curve) described in Eq. (16). Note that the grey-shaded area essentially overlaps with the black dashed line, implying that the survival probabilities are nearly independent of the neutrino energy.
For the specific simulation discussed below, we take a 1D box of size $L = 1200\,\lambda^{-1}$ with $\alpha = 0.9$. The neutrino distribution function is parameterized by […], with $\sigma_\nu = 0.6$, $\sigma_{\bar\nu} = 0.5$, $\chi_\nu = 3.2$, $\chi_{\bar\nu} = 4.5$, $\langle E_\nu\rangle = 10$ MeV, and $\langle E_{\bar\nu}\rangle = 12$ MeV. For the vacuum mixing parameters, we set $\omega/\lambda = 10^{-4}$ for $E_\nu = 1$ MeV with $\theta_{\rm eff} = 10^{-5}$. In order to introduce a small inhomogeneity to the system, we follow Ref. [53] and assign perturbations to $\varrho$ and $\bar\varrho$ at $t = 0$ by […], where $\epsilon(z)$ is a real number randomly generated between 0 and 0.01. We discretize the simulation domain with 6000, 50, and 20 uniform grids in $-600 \leq z \leq 600\,\lambda^{-1}$, $-1 \leq \mu \leq 1$, and $0 \leq 4\omega/\delta m^2 \leq 0.5$ MeV$^{-1}$, and use the finite difference scheme of COSE$\nu$ [84] to conduct the simulation until $t = 2000\,\lambda^{-1}$, when the system has settled into the asymptotic state.
Fig. 1 shows the survival probability of electron neutrinos as a function of $\mu$, averaged over $z$ and $\omega$ (black dashed curve) at the end of the simulation, computed as […], as well as the range spanned by the spatially averaged survival probabilities (shaded grey area) over all different $\omega$ values, calculated by […]. This comparison clearly shows that $\langle P_{\rm sur}(\mu, \omega)\rangle_f$ is nearly independent of $\omega$, as the grey-shaded area essentially overlaps with the black dashed line. Also shown in the plot is the analytical prescription for $\langle P_{\rm sur}(\mu, \omega)\rangle_f$ (blue dotted curve), following Ref. [74], for the two-flavor scenario described in Eq. (16).
While $\omega/\lambda \lesssim 10^{-4}$ is a reasonable assumption for the SN neutrino decoupling region, we conducted additional calculations with $\omega/\lambda = 10^{-3}$. Although the range spanned by the spatially averaged survival probabilities was more noticeable and the analytical formula was less precise in that case, we found that an energy-independent survival probability remains a justified assumption also for $\omega/\lambda \gtrsim 10^{-3}$.

B. Survival probability function
To effectively train and evaluate our PINN, we require the survival probabilities derived from energy-integrated neutrino distributions. Our approach uses two parametric distributions for the initial neutrino angular distributions, as previously explored in our work [78]: the maximum entropy distribution and the Gaussian distribution [53,85-89], defined as […], respectively, where […]. Note that here $A$, $a$, and $\xi$ are arbitrary parameters which determine the overall neutrino number and the shape of the distributions. Allowing for two distinct forms of angular distributions takes into consideration potential deviations in the shape of neutrino angular distributions in realistic simulations, which can occur, e.g., due to the use of different closure relations.
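To illustrate how a parametric angular distribution maps onto the two moments used in this work, the following Python sketch evaluates $I_0$, $I_1$, and the flux factor $F = I_1/I_0$ by Gauss-Legendre quadrature. The functional forms $A\,e^{a\mu}$ and $A\,e^{-(1-\mu)^2/\xi}$ are assumptions made for illustration only (the normalization conventions of Eq. (11) may differ), and the function names are hypothetical.

```python
import numpy as np

def max_entropy(mu, A, a):
    # Assumed maximum-entropy form: f(mu) = A * exp(a * mu)
    return A * np.exp(a * mu)

def gaussian(mu, A, xi):
    # Assumed forward-peaked Gaussian form: f(mu) = A * exp(-(1 - mu)^2 / xi)
    return A * np.exp(-((1.0 - mu) ** 2) / xi)

def radial_moments(f, n_nodes=64):
    """Return (I0, I1) = (int f dmu, int mu f dmu) over mu in [-1, 1]."""
    mu, w = np.polynomial.legendre.leggauss(n_nodes)  # quadrature nodes and weights
    vals = f(mu)
    return np.sum(w * vals), np.sum(w * mu * vals)

# Illustrative parameter values (not taken from the paper)
I0_me, I1_me = radial_moments(lambda mu: max_entropy(mu, A=1.0, a=2.0))
I0_g, I1_g = radial_moments(lambda mu: gaussian(mu, A=1.0, xi=0.5))
print("max-entropy flux factor:", I1_me / I0_me)
print("Gaussian flux factor:   ", I1_g / I0_g)
```

Scanning the shape parameter ($a$ or $\xi$) at fixed $A$ is one way to cover the range of flux factors that a moment-based simulation would report.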
In our analytical treatment of the survival probability, we closely follow our recent works in Refs. [74,78]. We assume that $G(\mu)$ $(= \int_0^{2\pi} \mathrm{d}\phi_\nu\, G(\mathbf{v}))$ has only one zero crossing. In the three-flavor scenario, the survival probability can then be defined as […], with […], where $h(x) = (x^2 + 1)^{-1/2}$ and $\zeta$ can be found such that the survival probability function is continuous. Here, $\mu_<$ ($\mu_>$) is defined as the $\mu$ range over which the following integral is smaller (larger): […], where $\Theta$ is the Heaviside theta function. In the two-flavor scenario, the survival probability can be obtained using […]. We refer the interested reader to Ref. [74] for more details.
Using the neutrino angular distributions in Eq. (11) and the survival probability function defined in Eq. (13), one can obtain the asymptotic outcomes of FFCs given the initial distributions.
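The procedure described above (locate the crossing, decide which side reaches equipartition, and fix the width $\zeta$) can be sketched numerically as follows. This is only an illustrative reading and not the paper's Eq. (13): it assumes a three-flavor form equal to $1/3$ on the side with the smaller integrated $|G|$ and $1 - \tfrac{2}{3}h(|\mu-\mu_c|/\zeta)$ on the other side, and it fixes $\zeta$ by imposing $\int (1 - P_{\rm sur})\,G\,\mathrm{d}\mu = 0$, a lepton-number conservation condition that is an assumption of this sketch. All function names are hypothetical.

```python
import numpy as np

def h(x):
    # h(x) = (x^2 + 1)^(-1/2), as defined in the text
    return 1.0 / np.sqrt(x ** 2 + 1.0)

def survival_probability_3f(mu, G):
    """Sketch: survival probability for a G(mu) with a single zero crossing.

    mu is a uniform grid on [-1, 1]; G is the nuELN distribution on that grid.
    Returns (P_sur, mu_c, zeta).
    """
    dmu = mu[1] - mu[0]
    # Locate the single zero crossing by linear interpolation
    k = np.where(G[:-1] * G[1:] < 0)[0]
    if k.size != 1:
        raise ValueError("G(mu) must have exactly one zero crossing")
    k = k[0]
    mu_c = mu[k] - G[k] * (mu[k + 1] - mu[k]) / (G[k + 1] - G[k])

    # The side with the smaller |integral of G| is assumed to reach equipartition
    left = mu < mu_c
    S_left, S_right = np.sum(G[left]) * dmu, np.sum(G[~left]) * dmu
    small = left if abs(S_left) < abs(S_right) else ~left
    S_small = np.sum(G[small]) * dmu

    def residual(zeta):
        # Assumed conservation condition, up to an overall factor of 2/3
        x = np.abs(mu[~small] - mu_c) / zeta
        return S_small + np.sum(h(x) * G[~small]) * dmu

    # Geometric bisection for zeta (the residual changes sign between the limits)
    lo, hi = 1e-8, 1e8
    for _ in range(200):
        mid = np.sqrt(lo * hi)
        if residual(lo) * residual(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    zeta = np.sqrt(lo * hi)

    P = np.where(small, 1.0 / 3.0, 1.0 - (2.0 / 3.0) * h(np.abs(mu - mu_c) / zeta))
    return P, mu_c, zeta

# Toy example: a linear G(mu) with a crossing at mu = -0.3
mu = -1.0 + (np.arange(400) + 0.5) * (2.0 / 400)
G = mu + 0.3
P, mu_c, zeta = survival_probability_3f(mu, G)
```

Applying the resulting $P_{\rm sur}(\mu)$ to the angular distribution of a given energy bin and integrating over $\mu$ then gives the post-FFC moments of that bin.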

III. APPLICATIONS OF NEURAL NETWORKS
To effectively train our PINN, we require information on two fronts: the energy-integrated moments of the neutrino gas and the moments within a specific energy bin. The former implicitly contains the necessary information for the survival probability (dictated only by the energy-integrated quantities), while the latter supplies the bin-specific information to which the survival probability must be applied.
To prepare our datasets, we begin with the initial energy-integrated angular distributions of neutrinos, which can follow either a maximum entropy or a Gaussian distribution. With these distributions for $\nu_e$ and $\bar\nu_e$, we derive analytical survival probabilities. Next, we apply these analytical survival probabilities to the neutrino angular distributions within a particular energy bin (again either maximum entropy or Gaussian). This determines the final outcome of FFCs for that specific bin. By integrating over the neutrino angular distributions, we then obtain the initial and final values of the $I_0$'s and $I_1$'s for that specific energy bin.
Before discussing our findings, we emphasize that, to ensure high performance of our NN models on the test set, the dataset is divided into three distinct sets: a training set for foundational learning, a development set for optimizing hyper-parameters, and a test set for evaluating the model's generalization to novel data.
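A three-way split of this kind can be sketched as below. The 80/10/10 proportions and the function name are assumptions for illustration; the text does not state the fractions actually used.

```python
import numpy as np

def split_dataset(X, y, frac_train=0.8, frac_dev=0.1, seed=0):
    """Shuffle the samples and split them into training, development, and test sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_train = int(frac_train * len(X))
    n_dev = int(frac_dev * len(X))
    # The remaining samples after the training and development sets form the test set
    parts = np.split(idx, [n_train, n_train + n_dev])
    return [(X[p], y[p]) for p in parts]

# Toy data: 100 samples with one feature each
X_demo = np.arange(100, dtype=float).reshape(100, 1)
y_demo = np.arange(100)
train, dev, test = split_dataset(X_demo, y_demo)
```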

A. The architecture of NNs
For a given multi-energy neutrino gas, one is provided with the initial values of the energy-integrated $I_0$'s and $I_1$'s of $\nu_e$ and $\bar\nu_e$ (and also of $\nu_x$, which is irrelevant here since it has no effect on the survival probability). In addition, for each specific energy bin, one has the $I_0$'s and $I_1$'s of $\nu_e$, $\bar\nu_e$, and $\nu_x$. In this context, we assume that the initial distributions of $\bar\nu_x$ and $\nu_x$ are identical (though their final ones following FFCs could be different), a simplification that aligns with the majority of state-of-the-art CCSN and NSM simulations.
Though in total 10 $I$'s are available (which could, in principle, be the inputs of the NNs), we introduce a layer of feature engineering to enhance the performance of our NNs. Namely, we define the new features
$$\alpha,\ F_{\nu_e},\ F_{\bar\nu_e},\ n_{\nu_e,i},\ n_{\bar\nu_e,i},\ n_{\nu_x,i},\ F_{\nu_e,i},\ F_{\bar\nu_e,i},\ F_{\nu_x,i}, \qquad (18)$$
with $F_\nu = (I_1/I_0)_\nu$. Here, the quantities without/with subscript $i$ belong to the energy-integrated spectrum/specific energy bin. Note that the neutrino number densities in the particular energy bin must be smaller than the corresponding energy-integrated values.
The selection of these features offers explicit insights into the shape of the neutrino angular distributions, which plays a crucial role in determining the asymptotic outcome of FFCs. Furthermore, all quantities in this context are normalized by the initial energy-integrated $\nu_e$ number density, which allows the convenient choice $n^{\rm initial}_{\nu_e} = 1$. This simplification reduces the number of inputs to our NNs; notably, there is no input parameter related to $n_{\nu_e}$.
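The mapping from the ten raw moments to the nine engineered inputs can be sketched as follows. The dictionary layout and function names are hypothetical, and the definition $\alpha = n_{\bar\nu_e}/n_{\nu_e}$ is assumed here.

```python
import numpy as np

def flux_factor(I):
    # F = I1 / I0 for a pair of moments (I0, I1)
    I0, I1 = I
    return I1 / I0

def engineered_features(I_int, I_bin):
    """Build the nine inputs of the basic NN.

    I_int: {'nue': (I0, I1), 'nuebar': (I0, I1)}, energy-integrated moments.
    I_bin: {'nue': ..., 'nuebar': ..., 'nux': ...}, moments in one energy bin.
    """
    n_nue = I_int["nue"][0]                # normalization: initial n_nue
    alpha = I_int["nuebar"][0] / n_nue     # assumed definition of alpha
    return np.array([
        alpha,
        flux_factor(I_int["nue"]),
        flux_factor(I_int["nuebar"]),
        I_bin["nue"][0] / n_nue,           # bin number densities in units of n_nue
        I_bin["nuebar"][0] / n_nue,
        I_bin["nux"][0] / n_nue,
        flux_factor(I_bin["nue"]),
        flux_factor(I_bin["nuebar"]),
        flux_factor(I_bin["nux"]),
    ])

# Toy moments (illustrative values only)
I_int_demo = {"nue": (2.0, 1.0), "nuebar": (1.8, 1.08)}
I_bin_demo = {"nue": (0.2, 0.08), "nuebar": (0.1, 0.05), "nux": (0.05, 0.03)}
features = engineered_features(I_int_demo, I_bin_demo)
```

Because every density is divided by the initial $n_{\nu_e}$, no separate input for $n_{\nu_e}$ is needed.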
As we also discussed in our previous work [78], there is still the possibility of improving our NNs through more advanced feature engineering. By considering the shape of the neutrino survival probability, as expressed in Eq. (13), one observes that a significant amount of information about this shape can be derived by learning the position of $\mu_c$. Another crucial piece of information, given $\mu_c$, is the side on which equipartition happens. The behavior of the survival probability on the opposite side is regulated by conservation laws. This side is described by the quantity $E_{\rm RL}$, a binary value that equals 1 if equipartition happens for $\mu_c \leq \mu$ and 0 otherwise.
FIG. 2. Schematic architecture of our NNs. The green zone shows the implementation of the extra features, $\mu_c$ and $E_{\rm RL}$, which are obtained through an extra layer of regression, using linear and logistic regressions, respectively. Here, $\mu_c$ is the crossing direction and $E_{\rm RL}$ is a binary value, which is 1 if equipartition occurs for $\mu_c \leq \mu$, and 0 otherwise. The blue zone represents energy-integrated inputs, while the orange zone displays inputs for specific energy bins. Note that the neutrino number densities in the particular energy bin must be smaller than the corresponding energy-integrated values. In our basic NN, referred to as the NN with no extra features, the NN only takes the inputs highlighted in Eq. (18). In our PINN, however, we provide our NN with the extra features $\mu_c$ and $E_{\rm RL}$.
As in Ref. [78] and as illustrated in Fig. 2, we explore two distinct architectures in our NN framework. In the foundational architecture, we feed only $\alpha$, $F_{\nu_e}$, $F_{\bar\nu_e}$, $n_{\nu_e,i}$, $n_{\bar\nu_e,i}$, $n_{\nu_x,i}$, $F_{\nu_e,i}$, $F_{\bar\nu_e,i}$, and $F_{\nu_x,i}$ into our NN. Our alternative NN, on the other hand, also includes the information coming from $\mu_c$ and $E_{\rm RL}$. Our feedforward NN has a single hidden layer containing 150 neurons, as justified in Fig. 4 and the text around it.
Regarding the output layer, our NNs return $I_{0,i}$ and $I_{1,i}$ for $\nu_e$ and $\bar\nu_e$, employing a total of 4 neurons. The $I_i$'s for $\nu_x$ and $\bar\nu_x$ are then derived by applying the conservation of neutrino and antineutrino number density, as well as momentum conservation. Put simply, our NN respects the neutrino conservation laws by construction.
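For orientation, the two architectures differ only in the size of the input layer, as the following NumPy sketch of the forward pass shows. The ReLU activation, the random initialization, and the function names are assumptions for illustration; the text specifies only the single hidden layer of 150 neurons and the 4 outputs.

```python
import numpy as np

def init_params(n_in, n_hidden=150, n_out=4, seed=0):
    """Randomly initialize a single-hidden-layer feedforward network."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, np.sqrt(2.0 / n_in), (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, np.sqrt(1.0 / n_hidden), (n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def forward(params, x):
    """Map a batch of inputs to (I0, I1) of nu_e and nubar_e in one energy bin."""
    hidden = np.maximum(0.0, x @ params["W1"] + params["b1"])  # ReLU (assumed)
    return hidden @ params["W2"] + params["b2"]

basic_nn = init_params(n_in=9)    # the nine inputs of the basic NN
pinn = init_params(n_in=11)       # the same inputs plus mu_c and E_RL

batch = np.ones((5, 9))
outputs = forward(basic_nn, batch)   # shape (5, 4)
```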
Apart from the inputs, we also consider modifying the loss function of our NNs. In particular, we introduce an additional loss term in the optimization of the NN model with the extra features, defined as […], which penalizes any deviation in the number of neutrinos in the electron channel, i.e., $N_{\nu_e+\bar\nu_e} = n_{\nu_e,i} + n_{\bar\nu_e,i}$, a quantity of utmost significance in CCSNe and NSMs. Here $\Delta$, $N_{\rm sample}$, and $\Sigma_k$ denote the difference between the true and predicted values, the number of samples in the training set, and the summation over the training samples, respectively. The specific inclusion of the domain knowledge allows one to consider this particular NN architecture as a PINN [79-81]. The PINN should be compared with our basic NN, referred to as the NN with no extra features, for which the loss only includes the ordinary mean squared errors of the output parameters.
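A loss of this type can be sketched as the ordinary mean squared error plus a penalty on the error in $N_{\nu_e+\bar\nu_e}$. The squared form of the penalty, its relative weight, and the column ordering of the outputs are assumptions of this sketch and may differ from the paper's definition.

```python
import numpy as np

def pinn_loss(y_true, y_pred, weight=1.0):
    """MSE plus a penalty on the error in the electron-channel neutrino number.

    Columns are assumed to be [I0_nue, I1_nue, I0_nuebar, I1_nuebar] for one bin.
    """
    mse = np.mean((y_true - y_pred) ** 2)
    # N = n_nue,i + n_nuebar,i, with n = I0
    N_true = y_true[:, 0] + y_true[:, 2]
    N_pred = y_pred[:, 0] + y_pred[:, 2]
    extra = np.mean((N_true - N_pred) ** 2)   # assumed squared penalty
    return mse + weight * extra

y_true_demo = np.array([[0.10, 0.05, 0.08, 0.04]])
y_pred_demo = np.array([[0.12, 0.05, 0.08, 0.04]])
loss_demo = pinn_loss(y_true_demo, y_pred_demo)
```

Setting `weight=0.0` recovers the plain mean squared error used for the basic NN.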

B. The NN's performance
In this section, we present and discuss the performance of our NNs in predicting the asymptotic outcome of FFCs in a multi-energy three-flavor neutrino gas. For training and testing our NNs, we generate a dataset comprising a well-balanced combination of maximum entropy and Gaussian initial neutrino angular distributions. The final outcome of FFCs is determined through the three-flavor survival probability detailed in Eq. (13). We also set $\alpha \in (0, 2.5)$, $F_{\nu_x,(i)} \in (0, 1)$, $F_{\bar\nu_e,(i)} \in (0.4F_{\nu_x,(i)}, F_{\nu_x,(i)})$, and $F_{\nu_e,(i)} \in (0.4F_{\bar\nu_e,(i)}, F_{\bar\nu_e,(i)})$, which is consistent with the expected hierarchy $F_{\nu_e} \lesssim F_{\bar\nu_e} \lesssim F_{\nu_x}$. The $n_{\nu,i}$'s are drawn from a half-normal distribution with zero mean and a standard deviation of $0.1\,n_\nu$, with $n_\nu$ being the energy-integrated neutrino number density. This choice can enhance the performance of our NNs in the energy bins with fewer neutrinos. Also note that since our NNs process only a single energy bin at a time, the hierarchy of flux factors among different neutrino energies is irrelevant here.
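The sampling ranges quoted above can be sketched as follows. Uniform sampling within each interval, the clipping of the bin densities, and the energy-integrated $\nu_x$ density `n_nux` are assumptions of this illustration; the function name is hypothetical.

```python
import numpy as np

def sample_inputs(n_samples, n_nux=1.0, seed=0):
    """Draw alpha, flux factors, and bin number densities (in units of n_nue)."""
    rng = np.random.default_rng(seed)
    alpha = rng.uniform(0.0, 2.5, n_samples)

    def flux_factors():
        # Enforces the hierarchy F_nue <= F_nuebar <= F_nux
        F_nux = rng.uniform(0.0, 1.0, n_samples)
        F_nuebar = rng.uniform(0.4 * F_nux, F_nux)
        F_nue = rng.uniform(0.4 * F_nuebar, F_nuebar)
        return F_nue, F_nuebar, F_nux

    F_int = flux_factors()   # energy-integrated flux factors
    F_bin = flux_factors()   # flux factors in the energy bin

    def half_normal(n_total):
        # Half-normal draw with sigma = 0.1 * n_total, clipped to stay below n_total
        draw = np.abs(rng.normal(0.0, 0.1 * n_total, n_samples))
        return np.minimum(draw, n_total)

    # Bin densities of nu_e, nubar_e, nu_x; the integrated ones are 1, alpha, n_nux
    n_bin = (half_normal(1.0), half_normal(alpha), half_normal(n_nux))
    return alpha, F_int, F_bin, n_bin

alpha_s, F_int_s, F_bin_s, n_bin_s = sample_inputs(1000)
```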
In Fig. 3, we illustrate the performance of our PINN and the basic NN without extra features. The relative error in the electron neutrino number density within our PINN, quantified by $|\Delta(n_{\nu_e,i} + n_{\bar\nu_e,i})|/(n_{\nu_e,i} + n_{\bar\nu_e,i})$, reaches values as low as $\sim 6\%$. Additionally, the mean absolute relative error in the output variables, computed as the mean of $|\Delta I_i|/I_i$, attains values of $\sim 16\%$. In contrast, for the basic NN these errors increase to $\sim 12\%$ and $\sim 22\%$, respectively. The noticeable performance improvement of our PINN can be primarily attributed to the inclusion of the extra features, which provide additional information on the shape of the survival probability distribution.
Comparing the findings illustrated in Fig. 3 with those discussed in Ref. [78], one observes a substantial difference in the impact of employing a PINN. Specifically, the PINN yields a significantly more pronounced enhancement in performance in the former case. While the PINN can reduce the error by almost a factor of two in the multi-energy neutrino gas, its application to a single-energy neutrino gas only yields a modest $\lesssim 1\%$ improvement in the error. This difference can be attributed to the amount of input information in the two cases. In the multi-energy scenario, the volume of input information is notably larger, leading to a higher degree of degeneracy in the input data. The PINN is remarkably effective in mitigating this degeneracy and, consequently, in substantially reducing the error.
The computations conducted here have used a feedforward neural network with a single hidden layer of $n_h = 150$ neurons. The reasoning behind this specific number of neurons is illustrated in Fig. 4, which shows the errors for different NN architectures. The optimal performance on the validation set is observed when $n_h \gtrsim 150$.
In Fig. 5, we analyze our PINN's performance as a function of the training set size. The red curve indicates the absolute relative error in the PINN's output, while the blue curve shows the relative error in $N_{\nu_e+\bar\nu_e}$. As the training dataset expands to include several thousand data points, the errors rapidly decrease to $\sim 18\%$ and $\sim 6\%$, respectively, and the disparity between the validation and training set errors diminishes. These findings align with the results observed in single-energy calculations [78], and they indicate the minimum number of data points required for reliable calculations using NNs.
There remains a crucial aspect regarding the assessment of the absolute relative error that needs discussion. In our prior study of FFCs within a single-energy neutrino gas [78], we primarily regarded the absolute error as the relevant metric. For a multi-energy neutrino gas, however, the absolute error falls short. The problem arises because, despite normalizing all quantities by $n_{\nu_e}$, the values of the $I$'s within a specific energy bin are expected to be a small fraction of one. Hence, a low absolute error does not by itself guarantee an accurate prediction, given the small magnitudes involved.
For the multi-energy neutrino gas, we have therefore adopted the absolute relative error as the informative metric. Nonetheless, this choice comes with a notable drawback: extremely small $I$'s yield disproportionately large absolute relative errors. While these cases may not represent the most interesting part of the parameter space, their associated errors tend to dominate over the rest of the parameter space. To resolve this issue, in our performance evaluation on the test set we exclude data points with $|I_i| \leq 5 \times 10^{-3}$ (while retaining them in the training set). In Sec. III C, we come back to this problem and devise a solution to this challenge.
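The evaluation metric with the small-value cut described above can be written compactly as in the sketch below, where the function name is hypothetical and the threshold is the one quoted in the text.

```python
import numpy as np

def mean_abs_relative_error(I_true, I_pred, threshold=5e-3):
    """Mean of |Delta I| / |I| over entries with |I_true| above the threshold."""
    I_true = np.asarray(I_true, dtype=float)
    I_pred = np.asarray(I_pred, dtype=float)
    keep = np.abs(I_true) > threshold      # drop very small moments from the test metric
    return np.mean(np.abs(I_pred[keep] - I_true[keep]) / np.abs(I_true[keep]))

# The middle entry is below the threshold and does not enter the average
err_demo = mean_abs_relative_error([0.1, 0.001, 0.2], [0.11, 0.002, 0.18])
```

Without the cut, the middle entry alone would contribute a 100% relative error and dominate the average.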

C. Reconstruction of neutrino energy spectra
In the preceding part, we engineered our NNs to process the energy-integrated $I$'s and their values within a specific bin, the $I_i$'s. In other words, our methodology works bin by bin, focusing solely on predicting the final outcome within an individual energy bin. This approach avoids the unnecessary complexity of attempting to reconstruct the entire neutrino energy spectra following FFCs at once.
In this part, we explore the performance of our PINNs in reconstructing the complete neutrino energy spectra following FFCs. While this application of our NNs might appear straightforward, it presents significant challenges. Specifically, when our PINNs are applied to the tail of the energy spectrum, where the number of neutrinos is very low, we encounter a considerable obstacle. The prediction error in these instances might surpass the total values of $n_{\nu(\bar\nu)}$, potentially leading to a scenario where the conservation laws cannot be satisfied. In the high-energy tail of the spectra, where neutrino number densities can reach very low values, the predicted value of $n_{\nu_e(\bar\nu_e)}$ may even surpass the total (anti)neutrino number density. This is attributed to the large relative errors expected in the output of neural networks for such small values, as discussed before. Consequently, while the conservation law for the neutrino number remains valid in principle, it loses its practical significance, because adhering to it would imply a negative number density $n_{\nu_x(\bar\nu_x)}$, which is not physically meaningful.
To address this challenge, we simultaneously adjust the numbers of $\nu_e$ ($\bar\nu_e$) and $\nu_x$ ($\bar\nu_x$) while ensuring the conservation of the total neutrino numbers. Our approach uses two PINNs. The first PINN computes the $I_{0,i}$'s and $I_{1,i}$'s for $\nu_e$ and $\bar\nu_e$, following the standard architecture discussed in the previous part. The second PINN has a similar structure but deals with $\nu_x$ and $\bar\nu_x$; that is, its outputs are the $I_{0,i}$'s and $I_{1,i}$'s of $\nu_x$ and $\bar\nu_x$, while the input features are the same in the two PINNs. Importantly, in this second PINN, the $I$'s of $\nu_e$ and $\bar\nu_e$ are derived through the neutrino conservation laws.
In essence, the difference between these two PINNs lies in the information they provide: the first PINN provides the $\nu_e$ $I$'s, while the second one focuses on $\nu_x$, with the $\nu_e$ quantities retrieved from conservation principles. Given these two PINNs, the final neutrino number densities for each energy bin can be calculated as […], where $n^{\rm pred}_{\nu_\beta(\bar\nu_\beta),i}$ represents the corresponding PINN-predicted value for the neutrino species $\beta$. Here, $N^{\rm pred}_{\nu(\bar\nu),i}$ denotes the predicted total (anti)neutrino number density, and $N^{\rm ini}_{\nu(\bar\nu),i}$ is its initial value. The prediction for each neutrino species is made by the relevant PINN model discussed earlier, i.e., the former PINN is employed for the electron species, while the latter is used for the heavy-lepton flavors. The errors in the predictions of $n_\nu$ are now automatically adjusted so that the conservation laws are respected, which prevents negative number densities. Furthermore, electron and heavy-lepton flavors are treated on an equal footing, which prevents one from becoming unreasonably small when the error in the prediction of the other is unreasonably large.
In Fig. 6, we present the performance evaluation of our PINNs in reconstructing the neutrino energy spectra. The upper panels show the initial neutrino distributions, which are prepared as follows. We describe the energy-differential number flux of a specific neutrino species, $\nu_\beta$, as [14] […], with […] being the normalized $\nu_\beta$ spectrum, where $E_\nu$ is the neutrino energy. Here, $\eta_{\nu_\beta}$ and $\langle E_{\nu_\beta}\rangle$ are the pinching parameter and the neutrino average energy, which describe the normalized spectrum, and $T_{\nu_\beta} = \langle E_{\nu_\beta}\rangle/(1 + \eta_{\nu_\beta})$. In addition, $L_{\nu_\beta}$ is the neutrino luminosity. To be specific, we use the following values, which could be expected during the SN accretion phase [7]:
$$L_{\nu_e} : L_{\bar\nu_e} : L_{\nu_x} = 1 : 1 : 0.33, \quad \langle E_{\nu_e}\rangle : \langle E_{\bar\nu_e}\rangle : \langle E_{\nu_x}\rangle = 9 : 12 : 16.5, \quad \eta_{\nu_e} : \eta_{\bar\nu_e} : \eta_{\nu_x} = 3.2 : 4.5 : 2.3. \qquad (24)$$
Note that for the average energies and luminosities only the ratios matter. Moreover, we assume the following values for the energy-integrated flux factors: $F_{\nu_e} = 0.5$, $F_{\bar\nu_e} = 0.7$, and $F_{\nu_x} = 0.8$. We take 14 energy bins, and for each energy bin we adopt an assumption describing the relationship as […]. Here we have assumed that the spectra approach zero for $E_{\nu,i} \gtrsim 60$ MeV. Note that we anticipate a decrease in the flux factor as the neutrino energy increases. This reduction is expected to be nonlinear, owing to the nonlinear scaling of the neutrino scattering cross-section with matter in the SN environment. While our assumption about the energy dependence of the flux factor is speculative, it aligns with the expected conditions mentioned above.
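The initial spectra can be sketched with the commonly used pinched form $f(E) = (E/T)^{\eta} e^{-E/T} / [T\,\Gamma(1+\eta)]$, which is normalized to unity and has mean energy $T(1+\eta)$, in line with the relation for $T_{\nu_\beta}$ quoted above. The use of this exact form, the uniform bin spacing, the choice of MeV units for the average energies, and the function names are assumptions of this illustration.

```python
import math
import numpy as np

def pinched_spectrum(E, E_mean, eta):
    """Normalized spectrum f(E) = (E/T)^eta exp(-E/T) / (T Gamma(1+eta)), T = <E>/(1+eta)."""
    T = E_mean / (1.0 + eta)
    return (E / T) ** eta * np.exp(-E / T) / (T * math.gamma(1.0 + eta))

def binned_number(E_edges, L, E_mean, eta):
    """Relative neutrino number per bin, (L/<E>) f(E_i) dE_i, evaluated at bin centres."""
    E_c = 0.5 * (E_edges[:-1] + E_edges[1:])
    dE = np.diff(E_edges)
    return E_c, (L / E_mean) * pinched_spectrum(E_c, E_mean, eta) * dE

# 14 bins up to 60 MeV (uniform spacing assumed)
edges = np.linspace(0.0, 60.0, 15)
# (L, <E>, eta) per species, using the ratios of Eq. (24) with <E> taken in MeV
species = {"nue": (1.0, 9.0, 3.2), "nuebar": (1.0, 12.0, 4.5), "nux": (0.33, 16.5, 2.3)}
spectra = {name: binned_number(edges, *pars)[1] for name, pars in species.items()}
```

Each entry of `spectra` then provides the initial $n_{\nu,i}$ values (up to an overall normalization) that are fed, bin by bin, to the PINNs.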
In the lower panels of Fig. 6, we present the final neutrino energy spectra following FFCs. A notable difference in the spectral reconstruction error emerges between neutrinos and antineutrinos. This disparity can be quantified by an absolute spectral relative error, defined as […], where the prediction error for $n_{\nu,i}$ is weighted by the relative distribution across energy bins. For the results presented in the lower panels of Fig. 6, we find $\delta_{\nu_e} = 0.20$, $\delta_{\bar\nu_e} = 0.06$, $\delta_{\nu_x} = 0.23$, and $\delta_{\bar\nu_x} = 0.05$. Despite the clear difference between neutrinos and antineutrinos in this particular example, we have noticed that this observation depends notably on the specific example and can fluctuate across different calculations and models. This variability highlights the need for more sophisticated neural network architectures, such as Bayesian neural networks, which can provide uncertainty estimates for the predicted quantities.
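A weighted error of this kind can be sketched as below, where each bin's relative error is weighted by that bin's share of the exact final spectrum. This weighting is one possible reading of the description above and may differ in detail from the paper's definition; the function name is hypothetical.

```python
import numpy as np

def spectral_relative_error(n_true, n_pred):
    """Sum over bins of w_i * |Delta n_i| / n_i, with w_i = n_i / sum_j n_j."""
    n_true = np.asarray(n_true, dtype=float)
    n_pred = np.asarray(n_pred, dtype=float)
    keep = n_true > 0.0                        # empty bins carry no weight
    weights = n_true[keep] / n_true[keep].sum()
    rel_err = np.abs(n_pred[keep] - n_true[keep]) / n_true[keep]
    return np.sum(weights * rel_err)

delta_demo = spectral_relative_error([1.0, 2.0, 1.0], [1.1, 1.8, 1.0])
```

With this weighting, large relative errors in sparsely populated tail bins contribute little to the overall figure of merit.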

IV. DISCUSSION AND OUTLOOK
We have employed a single hidden layer Physics-Informed Neural Network (PINN) to predict the asymptotic outcome of FFCs within a three-flavor multi-energy neutrino gas. Our approach relies on the first two moments of the neutrino angular distributions, making our PINNs highly relevant to state-of-the-art CCSN and NSM simulations. We have demonstrated that our PINNs can achieve remarkable accuracy, with errors reaching $\lesssim 6\%$ for the number of neutrinos in the electron channel, and $\lesssim 18\%$ for the relative absolute error in the neutrino moments.
By conducting simulations of FFCs in a 1D box with periodic boundary conditions, we first demonstrated that when the FFC growth rate notably exceeds that of the vacuum Hamiltonian, all neutrinos experience a common survival probability, regardless of their energy. This common survival probability is solely determined by the energy-integrated neutrino spectrum.
In our PINNs, we incorporated novel features to effectively capture the shape of the expected neutrino survival probability distributions. These features are the position of the zero crossing in the distribution of $\nu$ELN, $\mu_c$, and the side of $\mu_c$ on which the expected equipartition occurs. We have shown that this advanced feature engineering significantly improves the performance of our PINN.
Moreover, we demonstrated that the discrepancy between the training and validation errors decreases significantly once the training set contains at least a few thousand data points. This underscores the necessity for datasets of (at least) this size when developing more realistic models based on simulation data in future studies.
We also highlighted a significant challenge in applying NNs to predict the whole neutrino energy spectrum. This challenge arises because the error in predicting the tail of the spectrum can be large enough to violate the neutrino conservation laws. To address this issue, we proposed the development of two separate models: one dedicated to predicting electron (anti)neutrino quantities and another for heavy-lepton flavor (anti)neutrino quantities. By scaling the predicted numbers in accordance with the conservation laws, we could overcome this challenge. Our approach showed that PINNs can reconstruct the entire neutrino spectrum with reasonable accuracy for a typical set of neutrino spectra during the SN accretion phase.
In summary, our research highlights the effectiveness of PINNs in predicting the asymptotic outcomes of FFCs within a multi-energy neutrino gas. Nevertheless, there are crucial avenues for further exploration. An important consideration is extending our study to more realistic neutrino gases characterized by non-axisymmetric distributions, where $\nu_x$ and $\bar\nu_x$ can also have different distributions. Such refinements will improve the feasibility of incorporating FFCs into CCSN and NSM simulations, thereby advancing our capacity to model and accurately predict these extreme astrophysical phenomena.

FIG. 3. Performance evaluation of the PINN and the basic NN with no extra features. We present the relative absolute error in the output parameters, along with the relative error in the total number of neutrinos within the electron channel, $N_{\nu_e+\bar\nu_e}$. It is evident that the PINN clearly outperforms the basic NN with no extra features. Here, an epoch refers to a single pass through the entire training dataset during the training phase.

FIG. 4. Performance evaluation of our PINN and the basic NN with no extra features (on the validation set) as a function of the number of neurons in the hidden layer. It is evident that the NNs achieve their best performance on the validation set once $n_h \gtrsim 150$. The labels and NN models are the same as those in Fig. 3.
FIG. 5. Absolute relative error in the output of our PINN (red curve) and the relative error in the number of neutrinos in the electron channel, i.e., $N_{\nu_e+\bar\nu_e}$ (blue curve). Including a few thousand data points in the training set removes the difference between the validation and training set errors. Note that we do not display the absolute relative error in the training set. This is due to the presence of small $I$'s in the training set (which are removed from the test set), causing a significantly larger relative error. Hence, any direct comparison between the absolute relative errors in the training and test sets would be unfair.

FIG. 6. Performance evaluation of our PINNs in reconstructing neutrino energy spectra. The upper panels show the characteristics of the initial spectra, including the initial neutrino energy spectra, the angular distributions of the energy-integrated neutrino spectra as a function of $\mu$, and the corresponding FFC survival probability. Here, we have assumed $F_{\nu_e} = 0.5$, $F_{\bar\nu_e} = 0.7$, and $F_{\nu_x} = 0.8$. The lower panels illustrate the final neutrino spectra after FFCs for $\nu_e$, $\bar\nu_e$, and $\nu_x$ ($\bar\nu_x$), respectively. The results are shown for both the PINN approach and an exact method that assumes access to the full neutrino angular distributions. As one can see, the prediction errors for antineutrinos are much smaller than those for neutrinos.