A Convolutional Neural Network for Multiple Particle Identification in the MicroBooNE Liquid Argon Time Projection Chamber

We present the multiple particle identification (MPID) network, a convolutional neural network (CNN) for multiple object classification, developed by MicroBooNE. MPID provides the probabilities that an interaction includes an e−, γ, µ−, π±, and protons in a liquid argon time projection chamber (LArTPC) single readout plane. The network extends the single particle identification network previously developed by MicroBooNE [1]. MPID takes as input an image either cropped around a reconstructed interaction vertex or containing only activity connected to a reconstructed vertex, therefore relieving the tool from inefficiencies in vertex finding and particle clustering. The network serves as an important component in MicroBooNE's deep learning based νe search analysis. In this paper, we present the network's design, training, and performance on simulation and data from the MicroBooNE detector.


I. INTRODUCTION
A series of liquid argon time projection chamber (LArTPC) detectors have been or are being deployed at Fermilab as part of the Short-Baseline Neutrino (SBN) program [2] along the Booster Neutrino Beamline (BNB) [3] and as part of the long-baseline program of the Deep Underground Neutrino Experiment (DUNE) [4]. The MicroBooNE experiment [5], part of the Fermilab SBN program, has been operating since 2015, collecting data during beam-on and beam-off periods.
MicroBooNE operates a 170 ton (85 ton active) LArTPC placed 470 m from the BNB target at Fermilab. The LArTPC is 10.4 m long, 2.6 m wide and 2.3 m high. The detector has three readout wire planes, with 2400 readout wires on each of the two induction planes and 3456 readout wires on the collection plane [6]. The wires of the two induction planes are oriented at ±60° with respect to the vertical collection plane wires, with a wire pitch of 3 mm. An array of 32 PMTs is installed behind the collection plane to detect the scintillation light from argon ionization caused by charged final state particles from neutrino interactions [7]. The TPC readout time window is 4.8 ms and is digitized into 9600 readout time ticks. Charged particles in liquid argon produce ionization electrons, which drift to the readout wire planes in an electric field of 273 V/cm. It takes 2.3 ms for an ionization electron to drift across the full width of the detector.
The MicroBooNE LArTPC continuously records the drifted charge and its arrival time on each wire. A software trigger, based on PMT signals, records an event triggered by the BNB beam spill if the interaction light detected by the PMT array is above a set threshold. Each event consists of data collected from 1.6 ms before the trigger to 3.2 ms after the trigger. Therefore, each event has three sets of TPC data for each wire on all three planes. The wire readout is truncated around the trigger, so that the two induction planes have resolutions of 2400 wires × 6048 readout ticks, while the collection plane has a resolution of 3456 wires × 6048 readout ticks. Wire and time data can be converted into an image format (charge on each wire versus drift time) using the software toolkits LArSoft [8] and LArCV [9] while maintaining high resolution in wire, time and charge amplitude space. These information-rich LArTPC images are suitable for applying deep learning tools. In consideration of computing resources, images for deep learning tools are compressed along the time tick axis by a factor of six. Pixel values are merged by a simple sum. Images become 2400 wires × 1008 ticks and 3456 wires × 1008 ticks for the induction and collection planes, respectively. This corresponds to an effective position resolution of 3.3 mm [10] and 3 mm [6] along the time tick and wire number directions, respectively.
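As a rough illustration of the compression step described above, the following sketch sums groups of six consecutive time ticks on each wire. The function name `compress_time_axis` and the list-of-lists image layout (one row per wire) are assumptions made for this example; this is not the LArCV implementation.

```python
def compress_time_axis(image, factor=6):
    """Sum every `factor` consecutive time ticks on each wire.

    `image` is a list of rows, one row per wire, each row a list of
    charge values ordered by time tick (an assumed layout).
    """
    compressed = []
    for wire in image:
        if len(wire) % factor != 0:
            raise ValueError("number of ticks must be divisible by factor")
        compressed.append(
            [sum(wire[i:i + factor]) for i in range(0, len(wire), factor)]
        )
    return compressed


# Toy example: 2 wires x 12 ticks -> 2 wires x 2 ticks.
toy = [list(range(12)), [1] * 12]
small = compress_time_axis(toy)

# Tick count quoted in the text: 6048 ticks compressed by six.
ticks_after = 6048 // 6
```

With 9600 ticks spanning 4.8 ms, one tick lasts 0.5 µs, so each compressed pixel covers 3 µs of drift time.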
Convolutional neural networks (CNNs), deep learning networks commonly applied to image processing applications, are currently used across neutrino and high energy physics experiments [11]. For accelerator neutrino experiments, NOvA has applied a CNN as a neutrino event classifier [12] in its νµ → νe oscillation measurement [13,14] and its neutral-current (NC) coherent π0 production measurement [15]. NOvA has also demonstrated a context-enriched particle identification network [16]. MINERvA has developed CNN tools to determine neutrino interaction vertices and study possible biases due to models used in the large simulated training sample [17]. The NEXT experiment has also used a CNN classifier to perform particle content studies at candidate neutrinoless double beta decay vertices [18].
A variety of deep learning techniques have been used in neutrino LArTPC experiments. In MicroBooNE, a CNN for assigning probabilities of particle identities for single particles in the MicroBooNE LArTPC has been demonstrated on simulated data in Ref. [1]. A semantic segmentation network for LArTPC data [19,20] has been used for π0 event reconstruction [21], vertex finding, and track reconstruction [22]. The DUNE experiment has recently presented an updated long-baseline neutrino oscillation sensitivity study incorporating a CNN for neutrino event selection and background rejection [23].
In this article, we present the development and application of a multiple particle identification (MPID) network in MicroBooNE, which is tasked with solving multiple binary logistic regression problems. It is the first demonstration of the performance of a CNN on LArTPC data including systematic uncertainties, and the first particle identification network applied to LArTPC datasets. The MPID network extends the functionality of MicroBooNE's previously-described single PID CNN [1]. It does not require pre-processing of image data to identify and filter selected pixels in an image assumed to be produced by a specific particle. The network provides simultaneous prediction scores for particle existence probabilities in the same image among five different particle species: electrons (e−), photons (γ), muons (µ−), charged pions (π±) and protons (p). The network is a particularly useful tool for data analysis of particle interactions in LArTPC detectors, since the region of an interaction vertex often contains many particles.
The MPID algorithm takes as input a LArTPC image with a fixed 512×512 pixel scale. A detailed description of the network design and training for MPID is given in Section II. When used in MicroBooNE's deep learning based low-energy excess νe (LEE 1e-1p) search analysis, the MPID network is primarily applied to images that contain candidate reconstructed neutrino interaction vertices as well as all reconstructed topologically connected activity. MPID predictions are derived from the full information of all energy depositions topologically connected to the vertex, particularly the first few centimeters of final-state particles' trajectories, which are critical for particle identification. In the νe search, the network is also applied to more inclusive images roughly cropped around the interaction vertex. This is a new feature compared with the single PID network, which takes as input only images containing filtered, reconstructed hits. Cropping around the interaction vertex allows re-evaluation of charge that is missing from the former topologically-connected image but is nonetheless present near the vertex, such as photon showers from final-state π0s. This feature of the MPID network can help MicroBooNE suppress important photon backgrounds to a LEE search, as observed by MiniBooNE [24]. We demonstrate this feature's robustness against the presence of LArTPC activity, such as cosmic ray tracks, that is uncorrelated with signal features of interest.
In this paper, we are not prepared to show full performance in the context of a physics analysis, but we can present some specific measures of network performance. Section III shows the efficiency of the different particle scores on idealized events containing e−, µ−, and p; Section IV shows data-simulation agreement on samples highly enriched in certain signal topologies; and Section V shows efficiency and background rejection performance for νe and some specific backgrounds.

A. Network design
The MPID network applies a typical CNN [25] structure for the task of multiple object classification, which is summarized in block diagram form in Fig. 1. Input images have a resolution of 512 × 512 pixels (1.5 m × 1.5 m), which generally matches the size of neutrino-induced activity in MicroBooNE. A series of ten convolutional layers is applied to the image to extract high-level features.
The first convolutional layer has a stride (shift unit of the convolution calculation) of two, with the goal of reducing the LArTPC images' sparsity and increasing feature abundance at the beginning of the algorithm. The following convolutional layers have a stride of one and a kernel size of three. The layers are arranged as a block of two convolutional layers followed by a pooling layer, and this block is repeated five times. An average pooling layer is thus applied after every other convolutional layer to contract the spatial dimension. Following the pooling layer is a rectifier activation function (ReLU) [26], which adds non-linearities to the network, as well as a group normalization operator [27] to avoid early overfitting.
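The spatial contraction described above can be traced with a short calculation. The sketch below assumes padded convolutions (so stride-one layers preserve the image size) and a pooling factor of two; neither is stated explicitly in the text, and the function name `trace_spatial_size` is hypothetical.

```python
def trace_spatial_size(input_size=512, first_stride=2, n_pool=5, pool_factor=2):
    """Return the feature-map edge length after each size-changing step.

    Assumes padded convolutions (stride-1 layers keep the size) and
    average pooling that divides each spatial dimension by `pool_factor`.
    """
    sizes = [input_size]
    size = input_size // first_stride   # first convolution, stride of two
    sizes.append(size)
    for _ in range(n_pool):             # one pooling layer per two-convolution block
        size //= pool_factor
        sizes.append(size)
    return sizes


sizes = trace_spatial_size()
```

Under these assumptions the final feature map is 8 × 8, which is consistent with the 192×8×8 node count quoted for the first fully connected layer.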
Two fully connected layers with 192×8×8 nodes and 192×8 nodes are applied to combine the features derived by the convolutional layers. The output of the fully connected layers is a vector of five floating point numbers, each representing a confidence score for a target particle type to be present in an image. The score is interpreted as a normalized probability after applying a sigmoid function [28]. The algorithm is optimized by minimizing the sum of binary cross entropy losses [29] across target particle types. In this way the prediction categories are not exclusive between particles.
Figure 2 shows one example of the input and output of the MPID network during inference. In this case, the input image has one e− and one p concatenated at the same vertex, a typical signal interaction topology for an interaction-channel-exclusive 1e-1p search, as implemented in the MicroBooNE deep learning based LEE analysis. The MPID network calculates as output the five floating point numbers described in the previous paragraph, or "particle scores," that correspond to the inferred probability to have each type of particle present in the image. In this example, high scores of 0.99 and 0.98 are given for p and e− in the image and low scores of 0.06, 0.01 and 0.02 are provided for γ, µ− and π±. Figure 3 shows another example of the input and output for the network during inference. The input image has one γ, one e− and one p produced at the same vertex, which would in principle be rejected in an exclusive 1e-1p search. Again, the MPID network calculates scores that correspond to the probability of each particle being present in the image. High scores of 0.89, 0.95 and 0.85 are found for p, e− and γ, and low scores of 0.02 and 0.08 are found for µ− and π±. We also note for clarity that the photon particle score is indicative not of the predicted total number of photons in the image, but rather of the probability that any photons are present in the image. The former judgement, as well as the capability to identify the particle content of specific sub-features within an image, is not within the scope of the MPID algorithm.
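To make the non-exclusive nature of the particle scores concrete, the following minimal sketch applies a sigmoid independently to five raw outputs. The logit values, the particle label strings and the 0.5 cut are hypothetical choices for illustration only; they are not taken from the MPID network.

```python
import math

PARTICLES = ("p", "e-", "gamma", "mu-", "pi+-")


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def particle_scores(logits):
    """Map five raw network outputs to independent scores in (0, 1)."""
    if len(logits) != len(PARTICLES):
        raise ValueError("expected one logit per particle type")
    return {name: sigmoid(v) for name, v in zip(PARTICLES, logits)}


# Hypothetical logits, chosen only for illustration.
scores = particle_scores([4.6, 3.9, -2.8, -4.6, -3.9])
present = [name for name, s in scores.items() if s > 0.5]
```

Because each score is computed separately, the five values need not sum to one, unlike the output of a softmax over mutually exclusive classes.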

B. Training and Test Samples
Training and test samples for the MPID CNN are produced with a customized event generator that uses LArSoft [8] and LArCV [9]. Detector processes are simulated with the GEANT4 [30][31][32] simulation tool.
The first generator step produces a 3D vertex uniformly distributed in the MicroBooNE LArTPC. The second step generates a random number of particles from e−, γ, µ−, π±, and p options. All particles are generated at the vertex from the first step with isotropic directions. The total number of particles allowed in each image is randomly distributed between two and four. The multiplicity for each particle type is allowed to vary randomly between zero and two. Such a configuration includes as a subset the final-state interaction vertex topologies that we are searching for or trying to reject in MicroBooNE analyses, such as 1e-1p, 1µ-1p and 1γ-1p, as well as non-signal ones, such as 2µ or 2e. This generation strategy purposefully does not rely on any of the standard neutrino final-state generators [33], to avoid possibly biasing the MPID network via inclusion of possibly-incorrect kinematic or multiplicity information provided by the generator. Moreover, this training model will produce a more robust particle identification tool capable of producing unbiased results for a much broader range of vertex-generating physics processes. Finally, the high multiplicity topologies generated in these randomized training samples help the network to activate more nodes and learn more parameters for classification.
[Table: MPID scores of p 0.89, e− 0.95, γ 0.85, µ− 0.06, π± 0.17.]
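The multiplicity rules above can be sketched as follows. The text does not say how the two constraints are combined, so rejection sampling is used here as one possible choice; the function name `sample_multiplicities` and the label strings are hypothetical.

```python
import random

PARTICLE_TYPES = ("e-", "gamma", "mu-", "pi+-", "p")


def sample_multiplicities(rng, min_total=2, max_total=4, max_per_type=2):
    """Draw per-type particle counts by rejection sampling.

    Each type gets a uniform count in [0, max_per_type]; the draw is
    repeated until the total lies in [min_total, max_total].
    """
    while True:
        counts = {t: rng.randint(0, max_per_type) for t in PARTICLE_TYPES}
        if min_total <= sum(counts.values()) <= max_total:
            return counts


rng = random.Random(0)
events = [sample_multiplicities(rng) for _ in range(1000)]
```

Note that this scheme does not make the total multiplicity uniform between two and four; a generator that first draws the total and then assigns particle types would behave differently.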
Each particle is generated with a single particle simulation package, where no neutrino interaction model kinematics are assumed. For 80% of the training and test samples, particles are simulated with kinetic energies between 60 MeV and 400 MeV for protons and between 30 MeV and 1 GeV for other particles. For the other 20% of the training and test samples, particles are simulated with kinetic energies between 40 MeV and 100 MeV for protons and between 30 MeV and 100 MeV for other particles. Particles are generated with a flat energy distribution. Energy ranges are chosen based on the BNB neutrino energy distribution and the analysis priority towards the lower energy range. We generated 60,000 simulated events for training and 20,000 for testing. Cosmic rays are intentionally not overlaid on the simulated images, in order to retain separation capabilities for µ−. Images used for training, testing and inference are from the better performing collection plane only [34], similar to the networks described in Ref. [1] and Ref. [19]. This choice serves to reduce the network's reliance on upstream reconstruction steps, such as the matching of pixels from different wire planes.
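A minimal sketch of the kinetic energy sampling described above is given below. It assumes the 80%/20% split is decided once per event and that energies are drawn uniformly within each range; the names `RANGES` and `sample_event_energies` are hypothetical.

```python
import random

# (low, high) kinetic energy ranges in MeV, taken from the text.
RANGES = {
    "wide": {"p": (60.0, 400.0), "other": (30.0, 1000.0)},
    "low": {"p": (40.0, 100.0), "other": (30.0, 100.0)},
}


def sample_event_energies(rng, particles, low_energy_fraction=0.2):
    """Pick one energy configuration per event, then draw flat energies."""
    config = "low" if rng.random() < low_energy_fraction else "wide"
    energies = []
    for particle in particles:
        lo, hi = RANGES[config]["p" if particle == "p" else "other"]
        energies.append(rng.uniform(lo, hi))
    return config, energies


rng = random.Random(1)
config, energies = sample_event_energies(rng, ["e-", "p"])
```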

C. Network Training
The loss of the network is defined using the BCEWithLogitsLoss [29] function in PyTorch, taking the output layer (five floating point numbers) as input. The BCEWithLogitsLoss function combines a sigmoid [28] operator with the binary cross entropy calculation. During training, we applied an initial learning rate of 0.001. Batch sizes of 32 and 16 are chosen for the training and test processing, respectively. Training is performed on a single NVIDIA 1080 Ti graphics card. The regularization methods of dropout [35] and group normalization [27] are applied to avoid early overfitting during training.
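For readers unfamiliar with this loss, the sketch below writes out a binary cross entropy on raw logits in plain Python, using the standard numerically stable form. It is an illustration of the mathematics rather than the PyTorch implementation, and the function names and example values are hypothetical.

```python
import math


def bce_with_logits(logit, target):
    """Binary cross entropy on a raw logit, in a numerically stable form.

    Equivalent to -[t*log(s) + (1-t)*log(1-s)] with s = sigmoid(logit).
    """
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def summed_particle_loss(logits, targets):
    """Sum the per-particle binary cross entropies for one image."""
    return sum(bce_with_logits(x, t) for x, t in zip(logits, targets))


# One image with five outputs; truth says the first two particle types are present.
loss = summed_particle_loss([2.0, 1.0, -1.0, -3.0, 0.0], [1, 1, 0, 0, 0])
```

PyTorch's `BCEWithLogitsLoss` averages over elements by default (`reduction='mean'`), which differs from the sum shown here only by a constant factor for a fixed number of outputs.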

FIG. 4. Losses of training and test events during training (top). Accuracies of training and test events during training (bottom).
An accuracy is calculated while the training is monitored for loss. Accuracy is defined as the fraction of predicted labels matching the truth labels with a threshold value of 0. [...] weights around epoch 29 and selected the one with best accuracy on the test sample.

D. MPID Occlusion Analysis
We applied an occlusion analysis [36] to determine whether the MPID network has calculated its predictions using image features associated with underlying physics, for example dE/dx at the first pixels of a particle (referred to as the trunk region of a particle), as opposed to other extraneous features in the image. The strategy is to feed the network a partially masked image and check how the MPID network responds. The occlusion analysis places a 9×9 pixel box in the top left corner of the image and masks all pixels in the occlusion box with zero values. With this box placed, we then apply the MPID network to the masked image and plot the resulting score value at the center pixel of the box. This process is then repeated for each pixel as the occlusion box scans along the x and y axes of the image. Figure 5 shows an example of the occlusion box placed on the image. After scanning the whole image, we obtain score maps showing the MPID responses to each occlusion box placement location. A lowered score for a particular pixel in occlusion images indicates that the masked region contains topological information valuable for determining the identity of that particular particle.
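The scanning procedure described above can be sketched as follows. The scorer `toy_score` is a hypothetical stand-in for one MPID output, and keeping the box fully inside the image is an assumed edge treatment; neither reflects the actual analysis code.

```python
def occlusion_score_map(image, score_fn, box=9):
    """Slide a zero-valued box over `image` and record the score at each box centre.

    `image` is a list of equal-length rows; `score_fn` maps an image to a
    single score. Box centres are restricted so the box never crosses the
    image border (an assumption); other pixels are left as None.
    """
    half = box // 2
    n_rows, n_cols = len(image), len(image[0])
    score_map = [[None] * n_cols for _ in range(n_rows)]
    for r in range(half, n_rows - half):
        for c in range(half, n_cols - half):
            masked = [row[:] for row in image]      # copy, then zero the box
            for rr in range(r - half, r + half + 1):
                for cc in range(c - half, c + half + 1):
                    masked[rr][cc] = 0.0
            score_map[r][c] = score_fn(masked)
    return score_map


def toy_score(image):
    """Hypothetical scorer: fraction of charge left in a fixed 4 x 4 'key' region."""
    key = sum(image[r][c] for r in range(10, 14) for c in range(10, 14))
    return key / 16.0


image = [[1.0] * 24 for _ in range(24)]
score_map = occlusion_score_map(image, toy_score)
```

In this toy case the score stays at one when the box is far from the key region and drops to zero when the box covers it, mirroring how a lowered score flags the pixels the network relies on.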
A simulated interaction image with one e−, one γ and one p at the same vertex, shown in Fig. 5, is chosen to demonstrate the occlusion analysis. The bottom left panel in Fig. 5 shows the p scores from the occlusion study on the input image. The p score drops significantly as the proton track's Bragg peak region, where strong p dQ/dx features exist, is masked. This indicates that the MPID network is properly identifying and leveraging features associated with the p's unique energy deposition density profile. It can also be seen that a few pixels in the circle with very high pixel values in the pictured shower are mildly misinterpreted as p-like features.
The bottom right panel in Fig. 5 shows the γ scores from the occlusion analysis of the same input image. From the occlusion image, it is clear that a few key physics features of γ-containing images have been properly learned by the MPID network. There are two critical features in the particle trunk region for e−/γ separation: the projected trunk region dE/dx difference and the presence or absence of a gap between the trunk and the interaction vertex. One can see that the γ score drops to near 0.3 when the trunk region of the γ (rather than the gap region between the γ and the vertex) is masked. We also applied an occlusion analysis to single-γ images to confirm that γ scores drop and e− scores increase as the γ trunk region is masked. We observe in this example that the γ score also increases to near one when nearby pixels connecting the p and e− are masked, since this produces more gaps between different particles. The e− score does not change much as we move the occlusion box around, since there are overwhelming e−-like features in the image from both the e− and the γ. This observation indicates that consideration of both e− and γ scores is likely important in attaining good e− and γ separation with the MPID network.

III. PERFORMANCE ON SIMULATION
To provide a first look at the capabilities of the trained MPID algorithm, we present particle score results returned from test images generated using the same method applied in producing training images. This section is divided into discussions of individual final state vertex and particle topologies of interest to MicroBooNE physics analyses, with occasional reference to a larger set of complementary final state particle combinations located in Appendix A.
We primarily focus on two generated test samples with particles 1µ-1p and 1e-1p in the final state, which are not used in training. Each sample contains 10,000 generated events. These samples are generated with the same customized event generator described in Section II B. Vertices are uniform in the detector with one proton and one corresponding lepton. Kinetic energies of the protons are between 50 MeV and 400 MeV, while kinetic energies of leptons are generated between 50 MeV and 1 GeV. The 1e-1p dataset has a final state similar to that of the target events of MicroBooNE's deep learning based LEE 1e-1p analysis. The 1µ-1p dataset has a final state similar to that of a MicroBooNE νµ selection analysis, described in Section IV A, that will be used to constrain the beam-intrinsic backgrounds in the LEE search. For the complementary final state particle combinations located in Appendix A, protons, muons, and electrons are generated with requirements similar to those given above, while pions and gammas follow requirements similar to those of muons and electrons, respectively. For completeness, Appendix A includes descriptions of MPID performance on all combinations of the five considered final state particle types, excepting the 1µ-1p and 1e-1p sets described in this section.

A. 1µ-1p Simulated Sample
Figure 6 shows stacked MPID scores of the five particle hypotheses for the 1µ-1p simulated test dataset. A similar plot showing a complementary inverted final-state configuration (Ne-Nγ-0µ-Nπ-0p) is shown in Fig. 35 in Appendix A. Comparing Fig. 6 and Fig. 35, one can see that the MPID network provides good separation between track-like and shower-like particles, with p and µ− scores concentrated near one and e− and γ scores piled up near zero, and vice versa in the complementary sample.
The plot also shows a good separation between µ− and π± using MPID, with a low score distribution for π±. Separation between µ− and π± comes from the fact that π± have higher rates of nuclear scattering than µ−, and the π± can have a kink point where they decay, as noted in Ref. [1]. The network is likely keying primarily off of visible kinks in a particle's trajectory in order to identify π±, and off of the absence of visible kinks to identify µ−. By hand scanning images from a 1π±-1p sample and checking the MPID output, we notice that MPID predicts a high π± score and a low µ− score when the kink is visible, and vice versa when the kink is not visible. Fig. 7 shows examples of predicting a high µ− score for a 1π±-1p event where no kink is present and predicting a high π± score for a 1µ-1p event where the muon scatters and has a kink on its track trajectory.
To perform particle identification as part of a neutrino event selection analysis, a set of selections is usually applied to particle score variables; these cuts will have associated impacts on total signal selection efficiencies. Figure 8 shows the passing fractions for track-like particles in the 1µ-1p dataset. Similar plots of the complementary configuration (Ne-Nγ-0µ-Nπ-0p) are shown in Fig. 35 in Appendix A. Passing fraction is defined as the percentage of events with an MPID particle score above a specified value; a tested set of events will have a passing fraction calculated for each particle type. The cut value for each particle score is varied between 0 and 1 with a step size of 0.01. For example, the blue dotted line shows the passing fraction of p in the image at each p score cut value. Figure 8 [...]ple. Fig. 35 shows low passing fractions for µ− and p and high passing fractions for the other three particles in images with the final state of Ne-Nγ-0µ-Nπ-0p.
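The passing fraction definition above translates directly into a short scan over cut values. In the sketch below, the function name `passing_fractions` and the five example scores are hypothetical, and the fraction is returned as a number between 0 and 1 rather than as a percentage.

```python
def passing_fractions(scores, step=0.01):
    """Return (cut, fraction of events with score above cut) for cuts in [0, 1]."""
    if not scores:
        raise ValueError("need at least one event")
    n_steps = int(round(1.0 / step))
    curve = []
    for i in range(n_steps + 1):
        cut = i * step
        passed = sum(1 for s in scores if s > cut)
        curve.append((cut, passed / len(scores)))
    return curve


# Hypothetical proton scores for five events.
p_scores = [0.99, 0.97, 0.80, 0.40, 0.05]
curve = passing_fractions(p_scores)
```

One such curve would be computed per particle type, giving five curves for each tested set of events.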
Figure 9 shows the correlation between µ−/π± scores and the µ− kinetic energy using the same 1µ-1p simulation of 10,000 events. One can see that when the µ− particles have low kinetic energy and produce fewer µ−-like pixels in the image, the µ− score is decreased. Meanwhile, π± scores for the same dataset appear to be comparatively low across all tested muon energies.
The passing fractions over MPID scores for track-like particles in the 1e-1p dataset are given in Fig. 11. Similar plots of the complementary configuration (0e-Nγ-Nµ-Nπ-0p) are shown in Fig. 30 in Appendix A. The passing fraction for p in the image is much higher than the fractions for µ− or π±. The capability to discriminate between p and µ− appears to be particularly high, while p/π± separation also remains high. This difference in performance between µ− and π± should not be too surprising given the level of π±-µ− passing fractions demonstrated in the previous section. Figure 11 [...] passing fractions for the shower-like particles in the 1e-1p dataset. Fig. 30 shows low passing fractions for e− and p and high passing fractions for the other three particles in images with the final state of 0e-Nγ-Nµ-Nπ-0p.
Figure 12 shows the correlation between e−/γ scores and e− kinetic energy. One can see that the MPID network has an overall high e− score until the e− kinetic energy approaches its critical energy in liquid argon and the e− becomes less shower-like. In a related sense, µ− scores for low energy 1e-1p interactions are found to be slightly higher than those for high energy ones. Meanwhile, the γ score for these events has a positive correlation with e− kinetic energy, since high energy e− are more likely to experience substantial amounts of radiative energy loss.

IV. COMPARISON OF DATA/SIMULATION PERFORMANCE
We prepared two different MicroBooNE LArTPC data samples to validate the performance of the MPID network on data. The MPID network was not employed in the selection of these data samples. The first data sample is a 1µ-1p enriched selection that uses a hybrid selection of a series of reconstruction algorithms [22] and MicroBooNE's semantic segmentation network [19]. This dataset is intended to be used in a MicroBooNE LEE 1e-1p analysis to provide a data-based constraint on the BNB neutrino beam's intrinsic νe contamination. The second sample contains νµ charged current interactions with a final-state π0 (νµ CCπ0) as defined in Ref. [21]. In this section we demonstrate that the MPID network works well on real LArTPC images. We show good agreement in MPID scores between data and simulation for the selected datasets.
To enable data/simulation comparisons for these two event classes, we simulate neutrino interactions using the GENIE v3.0.6 [33] neutrino Monte Carlo generator. To accurately include the on-surface cosmogenic backgrounds present in all MicroBooNE LArTPC images, beam-off data containing only cosmic rays is overlaid on simulated neutrino interaction images. Beam-off data is taken with cosmic ray triggers. An overlay sample is a combination of GENIE simulated beam events and cosmic data events. This ensures that the reported particle score distributions for data and simulated images will be equally affected by the presence of cosmic rays.
In the study of the 1µ-1p dataset, we apply the MPID network to processed images containing only wire signal activity associated with particles reconstructed at a candidate neutrino interaction vertex. In the study of the νµ CCπ0 dataset, we instead apply the MPID network to images made with all pixels near the reconstructed vertex; in this case, particle scores are completely independent of any previous reconstruction. We show that the network can purify the desired particle content while maintaining good data-simulation agreement in both the 'cleaned' (input images containing only the reconstructed interactions) and potentially 'polluted' (input images also containing cosmic rays) input images.

A. 1µ-1p Enriched Data
The 1µ-1p enriched dataset is selected from a set of MicroBooNE beam-on data corresponding to 4.4 × 10^19 protons on target (POT) in the BNB beam. These events consist of exactly two reconstructed particles (ideally one p and one µ−) at the candidate interaction vertex. The selection consists of two steps. The first step involves a set of preliminary cuts based on optical information and interaction topology. Candidate 1µ-1p interactions are required to have more than a threshold number of photo-electrons recorded in the beam trigger window to be signal. Interaction topology selections require candidates to be located inside the TPC with exactly two fully-contained reconstructed tracks. Topology selections also require an opening angle greater than 0.5 radians. The second step involves two boosted decision trees (BDT) to make a final 1µ-1p selection. The first BDT is trained to separate 1µ-1p from the cosmic backgrounds using a simulated νµ sample and a beam-off cosmic ray only dataset. The second BDT is trained to separate 1µ-1p from non-signal neutrino interactions (i.e., νµ interactions that are not charged current quasi-elastic (CCQE), off-vertex νµ interactions, and interactions missing more than 20% of their energy in reconstruction) using a simulated νµ sample. Details of the preliminary selection and BDT selections will be documented in future publications. The selection of the dataset described above produces 478 data and 466 simulated input images for processing by the MPID network. In the simulated dataset, 94% of these images contain true neutrino interactions. Among these, 314 (67% of total images) events contain solely one re- [...]
We produce the input images in three steps. First, the interaction vertex is located and any associated track-like particles are reconstructed using algorithms described in Ref. [22]; two and only two reconstructed tracks are required. Next, a 512×512 image is produced, centered at the pixel-weighted center of the reconstructed 1µ-1p event, using a flat weight for non-zero pixels. Finally, to address noise-related features, a threshold is placed on the images with a minimum and maximum pixel value of 10 and 500, respectively. This procedure removes effects from pixels from unrelated interactions near the neutrino interaction vertex. Figure 13 shows an example of a 1µ-1p image fed into the MPID network. The image is from the collection plane.
The top image of Fig. 14 shows the p score for the selected candidate 1µ-1p interactions, broken down into the true physics process of each imaged vertex. The simulation predicts that true 1µ-1p charged-current neutrino interactions should cluster at high p score, with background processes (particularly cosmic processes) more evenly distributed across the score axis. In the data, a distinct peak is present at high p score, providing a strong indication of proton(s) being present in most of the images.
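The image preparation steps described earlier in this subsection (a flat-weighted center of non-zero pixels, then a pixel value threshold) can be sketched as below. The text does not spell out how the minimum and maximum are applied, so zeroing values below 10 and capping values above 500 is an assumed reading; both function names are hypothetical.

```python
def pixel_weighted_center(image):
    """Centre of all non-zero pixels, each counted with equal (flat) weight."""
    row_sum = col_sum = count = 0
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            if value != 0:
                row_sum += r
                col_sum += c
                count += 1
    if count == 0:
        raise ValueError("image has no non-zero pixels")
    return round(row_sum / count), round(col_sum / count)


def apply_pixel_threshold(image, low=10.0, high=500.0):
    """Zero pixels below `low` and cap pixels above `high` (assumed reading)."""
    return [[0.0 if v < low else min(v, high) for v in row] for row in image]


# Toy 3 x 3 image with one column of charge.
toy = [[0, 5, 0], [0, 50, 0], [0, 700, 0]]
center = pixel_weighted_center(toy)
cleaned = apply_pixel_threshold(toy)
```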
The bottom sub-panel of this sub-figure shows the ratio of data to simulation versus the p score. We note that, as we are primarily concerned with understanding the agreement in the distribution of scores from 0 to 1, discussion of the level of absolute agreement in normalization between data and simulation is beyond the scope of this study. For each point, the data's statistical uncertainty is shown, along with the systematic uncertainty associated with flux and cross-section uncertainties. Beam flux uncertainties are evaluated by re-weighting events according to the properties of the hadrons that decay to produce the neutrinos. Cross section uncertainties are evaluated by re-weighting events according to the properties of the neutrino's interaction with an argon nucleus. Detector uncertainties are in development and are not expected to have a dominant systematic effect on MPID scores for 1e-1p events. Good agreement is found between the data and simulation across the full range of p scores with flux and cross section uncertainties. This level of agreement was quantified by calculating the χ² between the data and simulation distributions in the top panel of Fig. 14. This χ² includes both statistical and systematic uncertainties in the data and simulation. A χ²/NDF of 32.4/20 is found, indicating a comparable performance of the MPID network on both data and simulation. Figure 14 also shows the µ− score distribution for the same selected 1µ-1p interactions. A majority of events are found in the higher score region, indicating that the MPID algorithm has correctly identified the presence of µ− in these images. The bottom panel again shows the ratio of data to simulation in the µ− score distribution; systematic error bars are defined as for the simulated p score distribution. A χ²/NDF of 9.9/20 is found between the two distributions, indicating good MPID data-simulation agreement for the µ− score.
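As a simplified illustration of such a comparison, the sketch below computes a χ² between two binned distributions while treating the bins as uncorrelated and adding statistical and systematic uncertainties in quadrature. The analysis described above may well use a full covariance matrix; the function name and the three-bin inputs are hypothetical.

```python
def chi2_uncorrelated(data, sim, sigma_stat, sigma_syst):
    """Return (chi2, n_bins) treating bins as uncorrelated.

    Statistical and systematic uncertainties are added in quadrature;
    bin-to-bin correlations are ignored in this sketch.
    """
    chi2 = 0.0
    for d, s, e_stat, e_syst in zip(data, sim, sigma_stat, sigma_syst):
        variance = e_stat ** 2 + e_syst ** 2
        if variance > 0:
            chi2 += (d - s) ** 2 / variance
    return chi2, len(data)


# Hypothetical three-bin histograms and uncertainties.
chi2, n_bins = chi2_uncorrelated([10, 20, 30], [12, 18, 30], [3, 4, 5], [1, 2, 2])
```

For reference, the quoted values of 32.4/20 and 9.9/20 correspond to roughly 1.6 and 0.5 per degree of freedom.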
Figure 15 shows the score distributions for particle types expected to be absent from or contained in limited quantities in the selected 1µ-1p dataset: π±, e−, and γ. For γ and e−, the score distributions are peaked very close to zero, since input images have only track-like particles, and because, as demonstrated in Section III, discrimination between track-like µ− and p particles and shower-like γ and e− particles is expected to be high. Scores for track-like π± particles are also clustered towards zero, but with a broader overall width; this result also matches the expectations of Section III. The χ²/NDF values of 22.0/20, 27.0/20, and 15.8/20 for data/simulation comparisons for γ, π±, and e− indicate comparable performance of MPID on data and simulation.
The MPID network appears to provide similar performance on both data and simulated neutrino interaction images containing primarily track-like final-state particles. This similarity in performance is achieved despite the input image's reliance on other reconstruction algorithms to 'remove' pixel content not related to final-state particles connected to the candidate neutrino interaction vertex. This indicates that not only the MPID algorithm, but also the upstream reconstruction algorithms, treat data and simulated LArTPC images on an equal footing.

B. νµCCπ 0 Enriched Data
A study of π 0 -producing charged current ν µ (ν µ CCπ 0 ) interactions is useful in providing a similar data/simulation agreement validation for images that also contain shower-like objects, as is expected from charged-current ν e interactions. For this study, we select events from the same dataset used in MicroBooNE's previous ν µ CCπ 0 measurement [21]. The primary reconstruction toolkits used to develop selection metrics for these events are Pandora [37] and SSNet [19]. Selected events are primarily required to have two showers close to the interaction vertex. This requirement makes this dataset distinct from a 1e-1p selection, where one and only one shower is allowed, which must be directly attached to the vertex. In this way, in studying MPID performance on the ν µ CCπ 0 data sample, we demonstrate not only data/simulation performance, but also show how the network can help to reduce a major intrinsic background to the ν e channel: π 0 -producing interactions.
Input images from ν µ CCπ 0 candidates are generated by cropping a 512 × 512 square image centered at the reconstructed interaction vertex, rather than at the image's pixel-weighted center as in the 1µ-1p images. To ensure that π 0 decay γs are not scrubbed from the image, no additional pixel 'cleaning' is applied. This means that cosmic rays and other interactions unrelated to the vertex remain in input images, presenting an additional challenge to the MPID network's performance. The same noise filtering metric, as described for the 1µ-1p dataset, is applied to the images. Figure 16 shows an example of a ν µ CCπ 0 -containing image fed into the MPID network. The image is from the collection plane. The selection and dataset described above produce 2051 data and 2011 simulated input images for processing by the MPID network. According to the simulation, 41% of total events have ν µ CCπ 0 interactions and 60% of events contain π 0 -including interactions (including the ν µ CCπ 0 interactions).
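The two centering choices mentioned above (reconstructed vertex versus pixel-weighted center) can be sketched as follows. This is a minimal illustration only: the function names, the (row, column) image layout, and the zero-padding used when the crop window extends beyond the image are assumptions, not a description of the MicroBooNE image-making code.

```python
import numpy as np


def crop_around(image, center_row, center_col, size=512):
    """Return a size x size crop centered on (center_row, center_col).

    Regions of the crop window that fall outside the image are left at
    zero (an assumed edge-handling choice).
    """
    half = size // 2
    out = np.zeros((size, size), dtype=image.dtype)
    r0 = int(round(center_row)) - half
    c0 = int(round(center_col)) - half
    # Overlap between the requested window and the available image.
    rs, re = max(r0, 0), min(r0 + size, image.shape[0])
    cs, ce = max(c0, 0), min(c0 + size, image.shape[1])
    if rs < re and cs < ce:
        out[rs - r0:re - r0, cs - c0:ce - c0] = image[rs:re, cs:ce]
    return out


def pixel_weighted_center(image):
    """Intensity-weighted mean (row, col) of the image pixels."""
    total = image.sum()
    if total <= 0:
        raise ValueError("image has no positive pixel content")
    rows, cols = np.indices(image.shape)
    return (rows * image).sum() / total, (cols * image).sum() / total
```

For a vertex-centered image one would call `crop_around(image, vertex_row, vertex_col)`; for a pixel-weighted image, `crop_around(image, *pixel_weighted_center(image))`.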
Figure 17 shows the score distribution for having any e − in the images cropped from the ν µ CCπ 0 sample. The score indicates a generally low probability of having e − -like features in the data and simulated images. As a comparison, Fig. 17 also shows the γ score distribution for the same images, and Fig. 18 shows the µ − score distribution. Systematic uncertainties are also included in the same manner as described for the 1µ-1p dataset. Comparable performance can be seen between data and simulation, with χ 2 /NDFs of 43.6/39 for e − score, 42.8/39 for γ score and 24.0/39 for µ − score. Thus, this study demonstrates that, for a subset of π 0 -containing neutrino interactions, the MPID algorithm can reliably identify shower-related particle content in images without introducing biases between neutrino data and simulation predictions. This is achieved despite the presence of additional incidental pixel activity in interaction candidate images.
FIG. 18. Muon score distribution for selected νµCCπ 0 interactions. Score distributions for data and simulation agree well using the νµCCπ 0 selection. The χ 2 calculation includes systematic and statistical uncertainties.

V. USE OF MPID IN A LOW ENERGY EXCESS MEASUREMENT
In the two previous sections, we have demonstrated the MPID network's utility in particle identification for both track and shower topologies in LArTPC images, as well as its equivalent performance on both data and simulated events. We will now apply the trained MPID network to simulated BNB ν e and ν µ interactions overlaid with beam-off cosmic event images to demonstrate the ability of the MPID network to aid in event selection for MicroBooNE's deep learning-based 1e-1p low-energy excess search.

A. Simulated Intrinsic νe vs. νµCCQE and νµπ 0
We generated simulated neutrino events to evaluate the performance of MPID in the 1e-1p selection in identifying beam-intrinsic backgrounds originating from ν µ CCQE and neutrino interactions with one or more π 0 s in the final state (ν µ π 0 ). Samples for these three datasets are produced using the standard GENIE v3.0.6 [33] neutrino interaction generator and filtered using truth-level information. In these samples, we require the lepton kinetic energy to be greater than 35 MeV and the p kinetic energy to be greater than 60 MeV. The minimum kinetic energy thresholds were set in order to choose events whose lepton and p trajectories are long enough to be reconstructed by our deep learning based vertex finding and particle reconstruction algorithms [22]. Samples are then processed using the reconstruction algorithms to identify candidate interaction vertices and nearby related particles. Finally, input images are generated with pixels from only the reconstructed interaction final-state particles; each interaction is required to have two particles at this stage.
Images are centered at the pixel-weighted center of reconstructed interactions. No other selection cuts beyond the truth-level filtration described above are applied to the samples.
FIG. 19. Electron score of νe intrinsic events and νµπ 0 events. Both datasets are generated with the GENIE neutrino generator and filtered using truth level information. Presented events have a reconstructed vertex.
Figure 19 shows the e − score distribution of reconstructed events from ν e and ν µ π 0 datasets. A good separation is visible between these two event classes. For example, with only an e − cut score of 0.5, 83% of ν µ NCπ 0 and 86% of ν µ CCπ 0 events are rejected, while 81% of true 1e-1p events are selected. It seems likely that further gains in background rejection could be achieved by also considering scores for other particles and by using differing input pixel image inclusion settings.
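The selected and rejected fractions quoted for a single score cut can be computed as in the sketch below. The function names and the convention that an event is selected when its score is strictly above the cut are assumptions for illustration; the score arrays would come from running MPID over the signal and background samples.

```python
import numpy as np


def fraction_above(scores, cut):
    """Fraction of events whose score is strictly above the cut."""
    scores = np.asarray(scores, dtype=float)
    return float(np.mean(scores > cut))


def efficiency_and_rejection(signal_scores, background_scores, cut):
    """Signal selection efficiency and background rejection at one cut."""
    efficiency = fraction_above(signal_scores, cut)
    rejection = 1.0 - fraction_above(background_scores, cut)
    return efficiency, rejection
```

Evaluating `efficiency_and_rejection` with the e − scores of the ν e sample as signal and those of a ν µ π 0 sample as background, at a cut of 0.5, would yield the kind of numbers quoted in the text.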
Previous discussion from the occlusion analysis presented in Section III provides some level of insight into the causes of the substantial discrimination shown in Fig. 19. In particular, ν e interactions will contain a shower-like object with a trunk directly connected to another particle, a feature that was clearly noticed by the MPID network. This is not the case for most γ rays present in ν µ π 0 interactions. Another critical parameter for separating e − - and π 0 -including events is the energy deposition density, dE/dx, along this vertex-connected shower trunk; the trunk region information is usually well-reconstructed, since it is almost always directly attached to the neutrino candidate vertex. Some of the discrimination in Fig. 19 may thus also arise from the network's ability to discriminate a high trunk dE/dx for vertex-connected showers from quickly-converting π 0 γ rays.
The e − score can also be applied to separate 1e-1p and 1µ-1p events. The separation is shown in Fig. 20. The ν e and ν µ CCQE events are well separated using the e − score calculated by the MPID network. For example, with only an e − cut score of 0.2, 91% of true 1e-1p events are selected, while 95% of ν µ CCQE events are rejected. This discrimination ability almost certainly arises from the lack of shower-like topologies in the ν µ CCQE interaction images.
FIG. 20. Electron score between νe intrinsic events and νµ CCQE events. Both datasets are generated with the GENIE neutrino generator and filtered using truth level information. Presented events have a reconstructed vertex.

B. Simulated Intrinsic νe vs. Cosmic Event
Due to the lack of substantial overburden and the long readout time, cosmic rays could provide a substantial background to a BNB-based 1e-1p ν e measurement in MicroBooNE. As most of this cosmic ray activity is induced by µ − , it is expected that the presence of a p in the signal's final state will aid in distinguishing the two categories. To test the MPID network's ability to discriminate the signal's p particle content, we generated a simulated intrinsic ν e dataset with cosmic data overlay, in addition to another event set consisting purely of beam-off cosmic triggers. For both datasets, we applied the vertex finding and particle reconstruction algorithms developed for two-track events, as described in Ref. [22]; in particular, each image is required to have exactly two reconstructed particles connected to the candidate neutrino interaction vertex. As in the sub-section above, no selection cuts are applied beyond truth-level event filtration.
Figure 21 shows the p score distributions on images from the intrinsic ν e dataset with cosmic overlay and the pure cosmics dataset. One can see that the majority of pure cosmic dataset events reconstructed as two-particle signal events have p scores below 0.2. Meanwhile, the majority of reconstructed ν e intrinsic events have p scores near 1. For example, with only a p cut score of 0.5, 81% of true 1e-1p events are selected, while 79% of cosmic events are rejected. Investigation of information from prior reconstruction stages and hand-scanning of event displays indicates that the small peak in p score close to zero in the ν e intrinsic dataset is due to inefficiencies in p reconstruction, as shown in Fig. 21(a) of Ref. [22]. Similar investigations show that the small peak of p score close to one in the cosmic sample is introduced by cosmics with small incident angles relative to the collection plane; these non-p tracks are often topologically compressed by reconstruction algorithms, giving them the appearance of short tracks with a proton-like Bragg peak. Thus, future improvements in lower-level signal processing and particle reconstruction are likely to further improve the cosmic discrimination shown in Fig. 21.

VI. CONCLUSION
We have developed a CNN-based multiple particle identification network, MPID, and applied it to images of event interactions in MicroBooNE data. This is the first demonstration of the performance of a CNN that incorporates systematic uncertainties in LArTPC data, and the first use of CNNs to perform particle identification on real LArTPC data. The network takes a 512×512 LArTPC image and calculates the probability scores for any particle in the image as p, e − , γ, µ − , and π ± . The training images are generated with a customized event generator that concatenates particles at the same vertex.
The code for building the network and generating the training sample is made available in MPID [38] and LArSoft [8].
10,000 1e-1p and 1µ-1p images are used to benchmark the network performance on simulated interactions. Passing fractions for particles present in the images are found to exceed those for particles not present in the input images.
Satisfactory agreement in all score distributions is found between data and simulation despite the many complexities of the MicroBooNE liquid argon TPC response, including inactive wire regions [39], electronics noise [39], signal processing [40,41], and space charge effects [42].
We also demonstrated the metrics and performance of applying the MPID network on BNB beam data from MicroBooNE, which also illustrated the MPID network's clear capabilities in particle discrimination. When we take reconstructed vertex activity as input in filtered 1µ-1p candidate event images, MPID score distributions are indeed high for p and µ − , and low for e − , γ and π ± . When we instead take all pixel activity as input in filtered images containing π 0 -produced γ rays, we see large differences between obtained e − and γ scores. By applying these demonstrated particle identification capabilities to simulated BNB ν e and ν µ interactions, we have shown that this validated tool can play an important role in achieving a successful low-energy electron-like excess measurement in MicroBooNE.

There are 32 different combinations of the five considered particle types. However, cases involving none of the particle types, as well as all five particle types, were not included in the training or test samples. The remaining 30 combinations are presented in this paper. For each combination we present a stacked distribution similar to Fig. 6 and a passing fraction plot similar to Fig. 8 for each of the five types of particles. We present the 30 combinations in 15 pairs, with each pair having two complementary configurations, for example the network performances over the final states of N e-0γ-0µ-0π-0p and 0e-N γ-N µ-N π-N p as shown in Fig. 22. The data is generated using the same configuration as for the test sample described in Section II B. For 80% of the sample, particles are simulated with kinetic energies between 60 MeV and 400 MeV for protons and between 30 MeV and 1 GeV for other particles. For the other 20% of the sample, particles are simulated with kinetic energies between 40 MeV and 100 MeV for protons and between 30 MeV and 100 MeV for other particles. Particles are generated with a flat energy distribution.
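The counting above (2^5 = 32 presence/absence patterns, 30 after removing the empty and the all-five cases, grouped into 15 complementary pairs) can be checked with a short enumeration. The particle labels and function names below are illustrative choices, not identifiers from the MPID code.

```python
from itertools import product

# Labels for the five particle types (illustrative strings only).
PARTICLES = ("e-", "gamma", "mu-", "pi+-", "p")


def presence_combinations():
    """All presence (1) / absence (0) patterns except 'none' and 'all five'."""
    n = len(PARTICLES)
    return [c for c in product((0, 1), repeat=n) if 0 < sum(c) < n]


def complementary_pairs(combos):
    """Group combinations into pairs in which every flag is inverted."""
    pairs, seen = [], set()
    for combo in combos:
        if combo in seen:
            continue
        complement = tuple(1 - flag for flag in combo)
        pairs.append((combo, complement))
        seen.update((combo, complement))
    return pairs
```

Each pair corresponds to one figure in this appendix; for instance, the pattern with only e − present is paired with the pattern in which e − is the only particle type absent.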

FIG. 1. MPID network scheme. The output has five numbers. Each of the values is between 0 and 1, representing the probabilities of corresponding particles in the given LArTPC image.
FIG. 2. MPID example of a 1e-1p topology with a tabulated output of particle scores. This image is generated by concatenating a p and an e − at the same vertex. Scores indicate high probabilities of having a p and e − in the image. The image applied to MPID has 512 × 512 pixels. A zoom-in image of 250 × 250 pixels is shown for visualization.

FIG. 3. MPID example of a 1e-1γ-1p topology with a tabulated output of particle scores. This image is generated by concatenating three particles at the same vertex. Scores indicate higher probabilities of having p, e − and γ in the image. The image applied to MPID has 512 × 512 pixels. A zoom-in image of 250 × 250 pixels is shown for visualization.
FIG. 5. Simulated 1e-1γ-1p final state event example (top). p score map (bottom left): p scores decrease as the occluded region crosses the p pixels. γ score map (bottom right): γ scores decrease as pixels in the trunk region of the γ shower are occluded and increase as the trunk region of the e − shower is occluded.
Figure 6 shows stacked MPID scores of five particle hypotheses for the 1µ-1p simulated test dataset. A similar plot showing a complementary inverted final-state configuration (N e-N γ-0µ-N π-0p) is shown in Fig. 35 in appendix A. Comparing Fig. 6 and Fig. 35, one can see that the MPID network provides good separation between track-like and shower-like particles, with p and µ − scores concentrated near one and e − and γ scores piled up near zero, and vice versa in the complementary sample. The plot also shows a good separation between µ − and π ± using MPID, with a low score distribution for π ± . Separation between µ − and π ± comes from the fact that π ± have higher rates of nuclear scattering than the µ, and the π ± can have a kink point where they decay, as noted in Ref. [1]. The network is likely keying primarily off of visible kinks in a particle's trajectory in order to identify π ± and the absence of visible kinks in a particle trajectory to identify µ − . By checking MPID scores against a hand scan of images from a 1π ± -1p sample, we notice that MPID predicts a high π ± score and a low µ − score when the kink is visible, and vice versa when the kink is not visible. Fig. 7 shows examples of predicting a high µ − score for a 1π ± -1p event where no kink is present and predicting a high π ± score for a 1µ-1p event where the muon scatters and has a kink on its track trajectory. To perform particle identification as part of a neutrino event selection analysis, a set of selections is usually applied to particle score variables; these cuts will have associated impacts on total signal selection efficiencies. Figure 8 shows the passing fractions for track-like particles in the 1µ-1p dataset. Similar plots of the complementary configuration (N e-N γ-0µ-N π-0p) are shown in Fig. 35 in appendix A.
Passing fraction is defined as the percentage of events with an MPID particle score above a specified value; a tested set of events will have a passing fraction calculated for each particle type. The cut value for each particle score is varied between 0 and 1 with a step size of 0.01. For example, the blue dotted line shows the passing fraction of p in the image at each p score cut value. Figure 8 also shows the passing fractions for shower-like particles in the 1µ-1p dataset. The passing fractions are extremely low for either in the 1µ-1p sam-
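The passing fraction definition given above (percentage of events with a score above a cut, scanned from 0 to 1 in steps of 0.01) can be written as the following minimal sketch. The function name, the strict "greater than" comparison, and the assumption of a non-empty score array are choices made for illustration.

```python
import numpy as np


def passing_fraction_curve(scores, step=0.01):
    """Percentage of events with score above each cut value in [0, 1].

    Assumes `scores` is a non-empty sequence of MPID scores for one
    particle type. Returns the cut values and the passing fractions.
    """
    scores = np.asarray(scores, dtype=float)
    # Rounding removes floating-point drift in the cut grid (0.00 ... 1.00).
    cuts = np.round(np.arange(0.0, 1.0 + step / 2, step), 10)
    fractions = np.array([100.0 * np.mean(scores > cut) for cut in cuts])
    return cuts, fractions
```

Calling this once per particle type on the same set of events gives the five curves of a passing fraction plot; each curve starts at or below 100% at a cut of 0 and can only decrease as the cut is raised.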

Figure 10 shows stacked MPID score distributions for the simulated 1e-1p dataset. A similar plot for a complementary configuration (0e-N γ-N µ-N π-0p) is shown in Fig. 30 in appendix A. MPID correctly calculates high scores for signal particles of p and e − . Comparing Fig. 10 and Fig. 30, one can see that the network shows good separation between track particles in deriving low scores for µ − and π ± . The MPID CNN also shows good separation between shower-like particles when e − 's are present in the image: derived scores for γ are clustered close to zero, while e − -like scores are clustered around unity. The passing fractions over MPID scores for track-like particles in the 1e-1p dataset are given in Fig. 11. Similar plots of the complementary configuration (0e-N γ-N µ-N π-0p) are shown in Fig. 30 in appendix A. The passing fraction for p in the image is much higher than the fractions for µ − or π ± . The capability to discriminate between p and µ − appears to be particularly high, while p/π ± separation also remains high. This difference in performance between µ − and π ± should not be too surprising given the level of π ± -µ − passing fractions demonstrated in the previous section. Figure 11 also shows the

FIG. 9. Muon score vs. muon kinetic energy (top) and charged pion score vs. muon kinetic energy (bottom) for the 1µ-1p simulation. Red dots indicate the average score in the vertical bin.

FIG. 12. Electron score vs. electron kinetic energy (top) and photon score vs. electron kinetic energy (bottom) for the 1e-1p simulation. Red dots indicate the average score in the vertical bin.

FIG. 13. Example of the input data image from the 1µ-1p selection. The image is centered at the non-zero pixel weight center. The image has 512×512 pixels. A zoom-in image of 250 × 250 pixels is shown for visualization.

FIG. 14. MPID proton score distribution (top) and muon score distribution (bottom) for selected 1µ-1p interactions. Simulation-predicted score distributions show satisfactory agreement with those realized in the 1µ-1p selection applied to MicroBooNE data. Plot error bars indicate data statistical errors, while hatched bands indicate statistical and/or systematic uncertainties in the simulated dataset. The χ 2 calculation incorporates contributions from systematic and statistical uncertainties. The breakdown of interaction type is based on the predicted event classification for the initial neutrino interaction.

FIG. 16. Example of the input data image from the νµCCπ 0 selection. The image is centered at the reconstructed vertex. The image has 512×512 pixels. A zoom-in image of 250×250 pixels is shown for visualization.
FIG. 17. Electron score distribution (top) and photon score distribution (bottom) for selected νµCCπ 0 interactions. Score distributions for data and simulation agree well using the νµCCπ 0 selection. The χ 2 calculation includes systematic and statistical uncertainties.

FIG. 21. Proton score of νe intrinsic events and beam-off cosmic data. The νe dataset is generated with the GENIE neutrino generator and filtered using truth level information. Presented events have a reconstructed vertex.

FIG. 27. MPID score distributions and MPID passing fractions on a complementary set of N e-N γ-0µ-0π-0p and 0e-0γ-N µ-N π-N p. N is randomly one or two in each event.