Accelerating Science with Generative Adversarial Networks: An Application to 3D Particle Showers in Multi-Layer Calorimeters

Physicists at the Large Hadron Collider (LHC) rely on detailed simulations of particle collisions to build expectations of what experimental data may look like under different theory modeling assumptions. Petabytes of simulated data are needed to develop analysis techniques, though they are expensive to generate using existing algorithms and computing resources. The modeling of detectors and the precise description of particle cascades as they interact with the material in the calorimeter are the most computationally demanding steps in the simulation pipeline. We therefore introduce a deep neural network-based generative model to enable high-fidelity, fast, electromagnetic calorimeter simulation. There are still challenges for achieving precision across the entire phase space, but our current solution can reproduce a variety of particle shower properties while achieving speed-up factors of up to 100,000$\times$. This opens the door to a new era of fast simulation that could save significant computing time and disk space, while extending the reach of physics searches and precision measurements at the LHC and beyond.


INTRODUCTION
High-precision modeling of the interactions of particles with media is important across many physical sciences, enabling and accelerating new findings. Similar to complex weather or cosmological modeling, the detailed simulation of subatomic particle collisions and interactions, as captured by detectors at the LHC, is a computationally demanding task, which annually requires billions of CPU hours, constituting more than half of the LHC experiments' computing resources [1][2][3].
The Nobel-prize-winning Higgs boson discovery [4,5] would not have been possible without extensive simulation. Before its experimental observation, its fundamental properties, such as its mass, were unknown, but synthetic particle collisions could be generated to simulate the outcome of various measurements under different model assumptions.
Today, as several questions remain unanswered about the nature of known particles (such as neutrinos) and hypothetical ones (such as the supersymmetric partners of the Standard Model particles), modern nuclear and particle physics research continues to strongly depend on detailed simulations for developing analysis techniques, interpreting results, and designing new experiments.
Cutting-edge software libraries such as Geant4 [6] provide the backbone to construct complex detector geometries and accurately model physical processes and interactions happening at distance scales as small as $10^{-20}$ m.
The shortcoming of this method is its computational footprint. The high-precision description of electromagnetic and nuclear processes that govern the evolution of particle showers in calorimeters can require minutes per event on modern computing platforms [7,8], making this the most computationally expensive step in the simulation pipeline. Because simulation is so expensive, significant resources are also invested in storing generated data sets, which can occupy petabytes of disk space.
This bottleneck becomes apparent at the scale at which events need to be simulated to enable physics analyses at the high-luminosity phase of the LHC (HL-LHC). The ATLAS and CMS experiments are expected to observe about $10^{8}$ Higgs boson events [9], buried in $\sim 10^{17}$ background events [10,11]. Hundreds of billions of simulated collisions will be required to reduce the Monte Carlo uncertainty and measure some of the Higgs boson's as yet unprobed properties.
Full detector simulations are too slow to meet the growing analysis demands; current fast simulations are not precise enough to serve the entire physics program. We therefore introduce a Deep Learning model, named CaloGAN, for high-fidelity fast simulation of particle showers in electromagnetic calorimeters. Its goal is to be both quick and precise, by significantly reducing the accuracy cost incurred with increased speed-up. A fast simulation technique of this kind also addresses the issue of data storage and transfer, as the gained generation simplicity and speedup make real-time, on-demand simulation a possibility.
Similar techniques have been tested in Cosmology [16,17], Condensed Matter Physics [18], and Oncology [19]. However, the sparsity, high dynamic range, and highly location-dependent features present in this application make it uniquely challenging. In addition to enabling physics analysis at the LHC, an approach similar to the CaloGAN may be useful for other applications in particle and nuclear physics, nuclear medicine, and space science that require detailed modeling of particle interactions with matter.

METHOD
To alleviate the computational burden of simulating electromagnetic showers, we introduce a method based on Generative Adversarial Networks (GANs) [20] in order to directly simulate component read-outs in electromagnetic calorimeters. GANs are an increasingly popular approach to learning a generative model using deep neural networks, and have shown great promise in generating clear samples from natural images [21].
Though the GAN formulation, by design, does not admit an explicit probability density or explicit likelihood, we gain the ability to sample from the learned generative model in an efficient manner. The GAN training uses a minimax game-theoretic framework, and admits a function $g$ as an artifact that maps a $d$-dimensional latent vector $z \sim p_z(z)$, $z \in \mathbb{R}^d$, to a point in the space of realistic samples. We would like the implicit density learned by $g$ to be close to the distribution $f$ that governs the simulated data. Since $g$ is a neural network, a forward pass to generate new samples is highly efficient on modern computing platforms [22].
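For reference, the minimax game of the original GAN formulation [20] can be written as below. Here $D$ denotes the discriminator network, a symbol introduced only for this illustration, while $g$, $f$, and $p_z$ are as defined above. This is the generic objective only; the CaloGAN training adds further terms and should not be read off from this expression.

```latex
\min_{g}\,\max_{D}\; V(D, g) =
  \mathbb{E}_{x \sim f}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D\left(g(z)\right)\right)\right]
```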
Previous work [23] investigated GAN-based methods for jet images [24], which are similar to one-layer calorimeters with square pixels (except that jet generators such as Pythia [25] are much faster than Geant4). This work addresses the complexity introduced by modeling a realistic sampling detector with heterogeneous longitudinal and transverse segmentation. We exploit the location specificity of the calorimeter, and utilize weight locality at the model level. We also follow the guidelines outlined in [23] in order to deal with both high dynamic range and sparsity levels. Our neural network architecture per calorimeter layer is a function of the read-out grid dimensionality, and is augmented with an attentional component [26] that provides a mechanism to carry information from layer to layer [27]. This allows the CaloGAN to model the physical sequential dependence among the calorimeter layers.
To ensure the realism of the CaloGAN setup, we impose an additional constraint to encourage the generator to produce a shower of a given energy. That is, the implicit density learned by $g$ needs to converge to the data-generating distribution $f$ for any initial nominal energy. To encourage this to be well modeled, a physics-specific loss component is introduced to penalize the absolute deviation between the nominal energy $E_0$ and the reconstructed energy $E$. A noteworthy subtlety is that this penalization scheme, coupled with minibatch discrimination [28], invites the network to learn the distribution of $|E_0 - E|$, a desirable characteristic for a readily applicable practical system to augment fast simulation. Such a formulation also encourages conservation of energy through the generation process. The simulation only includes models of energy deposition, not digitization (a non-linear effect that can violate reconstructed energy conservation). The energy per layer includes the contribution from inactive material (see below). Therefore, aside from leakage beyond the calorimeter (relevant mostly for charged pions), energy must be conserved, which provides a useful constraint on the generation.
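The energy penalty described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `energy_penalty`, the batch layout, and the use of a plain mean over the batch are hypothetical choices, and cell values are assumed to be in the same units as the nominal energy. The actual loss weighting used in training is not specified here.

```python
import numpy as np

def energy_penalty(layers, nominal_energy):
    """Mean absolute deviation |E_0 - E| over a batch (illustrative only).

    layers: sequence of arrays, each of shape (batch, rows, cols), holding
            per-cell energy deposits for one calorimeter layer.
    nominal_energy: array-like of shape (batch,) with the requested energy E_0.
    """
    # Reconstructed energy E: sum of every cell in every layer, per shower.
    reconstructed = sum(
        layer.reshape(layer.shape[0], -1).sum(axis=1) for layer in layers
    )
    return float(np.mean(np.abs(np.asarray(nominal_energy) - reconstructed)))

# Toy batch of two showers using the three layer shapes quoted in this letter.
layer_shapes = [(3, 96), (12, 12), (12, 6)]
batch = [np.zeros((2, rows, cols)) for rows, cols in layer_shapes]
batch[0][0, 1, 48] = 6.0    # shower 0: 6 units in the first layer
batch[1][0, 6, 6] = 4.0     # shower 0: 4 units in the second layer -> E = 10
batch[1][1, 5, 5] = 18.0    # shower 1: 18 units in the second layer -> E = 18
penalty = energy_penalty(batch, nominal_energy=[10.0, 20.0])
```

In this toy batch the first shower matches its nominal energy exactly and the second falls short by 2 units, so the penalty is the average of 0 and 2.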

EXPERIMENTAL RESULTS
From a series of simulated showers, the CaloGAN is tasked with learning the simulated data distributions of $\gamma$, $e^+$, and $\pi^+$ showers generated by Geant4 with a uniform energy spectrum in [1, 100] GeV, incident perpendicular to the center of a three-layer, heterogeneously segmented, liquid argon (LAr) calorimeter cube of side length 480 mm. The training dataset [29] is represented in image format by three images of dimensions 3 × 96, 12 × 12, and 12 × 6, each representing the shower energy depositions per pixel in one calorimeter layer. The energy per layer includes the active and inactive contributions. For applications such as calorimeter calibration [30], it is important to have the inactive component; in the future one could add separate layers for the inactive component or add a second step for dividing the energy per layer into the two components. The flexible CaloGAN architecture allows for a straightforward extension to related detector geometries that have more sampling layers or different cell sizes per layer [31].
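To make the data layout concrete, the following sketch flattens one shower (three layer images) into a single vector and back. The three grids contain 288 + 144 + 72 = 504 cells, which matches the 504-dimensional pixel space used in the classification study later in this letter. The helper names `flatten_shower` and `unflatten_shower` are hypothetical, and the layer ordering (front to back) is an assumption.

```python
import numpy as np

# Per-layer read-out grids quoted in the text (assumed front-to-back order).
LAYER_SHAPES = [(3, 96), (12, 12), (12, 6)]

def flatten_shower(layers):
    """Concatenate the three layer images of one shower into a single vector."""
    for layer, shape in zip(layers, LAYER_SHAPES):
        if layer.shape != shape:
            raise ValueError(f"expected layer shape {shape}, got {layer.shape}")
    return np.concatenate([layer.ravel() for layer in layers])

def unflatten_shower(vector):
    """Inverse of flatten_shower: split a flat vector back into layer images."""
    sizes = [rows * cols for rows, cols in LAYER_SHAPES]
    offsets = np.cumsum(sizes)[:-1]          # split points: 288 and 432
    parts = np.split(vector, offsets)
    return [part.reshape(shape) for part, shape in zip(parts, LAYER_SHAPES)]

rng = np.random.default_rng(0)
shower = [rng.random(shape) for shape in LAYER_SHAPES]   # random stand-in data
flat = flatten_shower(shower)
restored = unflatten_shower(flat)
```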
Our analysis establishes that it is possible to generate three-dimensional electromagnetic showers in a multilayer sampling LAr calorimeter with uneven spatial segmentation, while attempting to preserve spatio-temporal relationships among layers.
For performance evaluation, we choose application-driven methods focused on sample quality. A first qualitative assessment is accompanied by a quantitative evaluation based on physics-driven similarity metrics. The choice reflects the domain-specific procedure for Monte Carlo-data comparisons. However, it is also important to examine high-dimensional behavior because CaloGAN is not anchored by parameterized models the way traditional fast simulators are. While the adversarial classifier provides some high-dimensional validation, we also use particle classification performance. Visualization and validation are still key challenges for multi-dimensional generators parameterized by a neural network.

Qualitative Evaluation
The average calorimeter deposition per voxel (Fig. 1) suggests that the learned generative models of $\gamma$, $e^+$, and $\pi^+$ showers capture aspects of the underlying physical processes. For photon showers, for instance, the mean per-layer cell variations show discrepancies of only $\sim 4\%$ and $\sim 1\%$ in the first two layers, where most energy is deposited for $e/\gamma$. This level of agreement is promising, but it is important to analyze more than the mean energy pattern to fully study the strengths and weaknesses of the proposed approach. The CaloGAN-generated samples are checked for adequate diversity and lack of direct memorization of the Geant4 samples used for training. The nearest (by Euclidean distance) Geant4 image is found for each of a random selection of CaloGAN images in order to verify the desired characteristics (Fig. 2). The samples show strong inter- and intra-class diversity and no evidence of memorization, since the closest images do not look exactly the same.
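The nearest-neighbor memorization check can be sketched as follows, assuming each shower has been flattened to a vector of cell energies. The function name and the random stand-in data are hypothetical. A generated shower that is an exact copy of a training shower shows up with distance zero, while a genuinely new shower has a strictly positive distance to its nearest neighbor.

```python
import numpy as np

def nearest_training_shower(generated, training):
    """Return, per generated shower, the index of and distance to the closest
    training shower under the Euclidean metric.

    generated: array of shape (n_gen, n_pixels)
    training:  array of shape (n_train, n_pixels)
    """
    # Pairwise differences via broadcasting: shape (n_gen, n_train, n_pixels).
    diffs = generated[:, None, :] - training[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diffs, diffs)   # squared distances
    idx = dist2.argmin(axis=1)
    return idx, np.sqrt(dist2[np.arange(len(generated)), idx])

rng = np.random.default_rng(1)
train = rng.random((50, 504))                     # stand-in for Geant4 showers
# Row 0 is a deliberately memorized copy of train[7]; row 1 is a fresh sample.
gen = np.vstack([train[7], rng.random(504)])
idx, dist = nearest_training_shower(gen, train)
```

For large data sets the full pairwise difference tensor becomes memory-hungry, so in practice one would process the generated showers in chunks or use a spatial index.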

Shower Shape Description
Geometrically and physically motivated shower shape variables [32] are used as further validation and introspection into the capabilities of the CaloGAN to adequately model and capture non-linear functional representations of the simulated data distribution (Fig. 3). In fact, it is desirable for the CaloGAN to recover the target distribution of these 1D statistics.
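As an illustration of what such 1D statistics look like, the sketch below computes a few simple summaries from the three layer images of one shower. These particular quantities (per-layer energy, layer energy fractions, an energy-weighted mean layer index, and per-layer occupancy) are hypothetical examples chosen for simplicity; they are not claimed to match the exact definitions of the variables in [32] or Fig. 3.

```python
import numpy as np

def shower_shapes(layers):
    """Compute a few illustrative 1D summary statistics for one shower.

    layers: list of 2D arrays of per-cell energy, ordered front to back.
    Assumes the shower has non-zero total energy.
    """
    layer_energy = np.array([layer.sum() for layer in layers])
    total = layer_energy.sum()
    return {
        "layer_energy": layer_energy,
        "total_energy": float(total),
        "energy_fraction": layer_energy / total,
        # Energy-weighted mean layer index (0 = front layer).
        "depth": float(np.dot(np.arange(len(layers)), layer_energy) / total),
        # Fraction of cells with a non-zero deposit, per layer.
        "occupancy": np.array(
            [np.count_nonzero(layer) / layer.size for layer in layers]
        ),
    }

# Toy shower with one hit per layer.
layers = [np.zeros((3, 96)), np.zeros((12, 12)), np.zeros((12, 6))]
layers[0][1, 48] = 2.0
layers[1][6, 6] = 7.0
layers[2][6, 3] = 1.0
shapes = shower_shapes(layers)
```

Histogramming any of these quantities separately for Geant4 and generated showers gives the kind of 1D comparison discussed here.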
The network is not shown any shower shape variable (only pixel values) at training time; it is therefore encouraging to note that the CaloGAN recovers the simulated data distribution for a variety of shower shapes across the three particle types. However, certain features of some distributions are not well described. This is a challenge for the future and will likely require improvements to the architecture and training procedure. Longer trainings of higher-capacity architectures have shown promise in rectifying some of these issues. Examining 1D statistics does not probe correlations between shower shapes or higher-dimensional aspects of the probability distribution. One way to examine the full shower phase space is to study classification performance, as described in the next section.

Classification as a Performance Proxy
When training a six-layer, fully-connected classification model on the 504-dimensional pixel space of the concatenated representation of shower energy depositions across all calorimeter layers, no major classification degradation is observed for out-of-domain learning when trained on the full simulation, i.e. when the network is trained on Geant4 samples but evaluated on CaloGAN samples. Specifically, the classification accuracy always reaches 99% when evaluating performance on CaloGAN showers, which points to an over-differentiation among particle types in the CaloGAN dataset. Nevertheless, in both the $e^+$-$\gamma$ and $e^+$-$\pi^+$ discrimination tasks, the evaluation of the network trained on Geant4 images results in no accuracy decrease in the former task ($\sim 70\%$) and only a 2% decrease in the latter ($\sim 97\%$ versus 99% accuracy) when compared to the classifier tested on CaloGAN samples. The stability of the accuracy metric implies that the CaloGAN succeeds at representing at least as much variation among showers initiated by different particles as is necessary to classify them using the same features in Geant4. Training on CaloGAN and testing on Geant4 does show significant degradation, indicating that the GAN is inventing new class-dependent features or underrepresenting class-independent features. While percent-level variations may be important for some applications, using classification as a generator diagnostic is an important tool for exposing the modeling of interclass shower variations.
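The cross-domain protocol (train on one simulator, evaluate on both) can be sketched with a deliberately simple stand-in. The nearest-centroid classifier below is not the six-layer network used in this letter, and the two toy data sets are hypothetical Gaussian blobs rather than showers; only the train/evaluate bookkeeping is the point.

```python
import numpy as np

class NearestCentroid:
    """Tiny stand-in classifier (NOT the six-layer network used in this letter)."""

    def fit(self, x, y):
        self.labels_ = np.unique(y)
        self.centroids_ = np.stack(
            [x[y == label].mean(axis=0) for label in self.labels_]
        )
        return self

    def predict(self, x):
        # Distance from every sample to every class centroid.
        d = np.linalg.norm(x[:, None, :] - self.centroids_[None, :, :], axis=2)
        return self.labels_[d.argmin(axis=1)]

def accuracy(model, x, y):
    return float(np.mean(model.predict(x) == y))

def toy_domain(rng, shift, n=200, dim=10):
    """Hypothetical two-class data set; `shift` separates the class means."""
    y = np.repeat([0, 1], n)
    x = rng.normal(size=(2 * n, dim)) + shift * y[:, None]
    return x, y

rng = np.random.default_rng(0)
x_full, y_full = toy_domain(rng, shift=5.0)   # plays the role of Geant4
x_fast, y_fast = toy_domain(rng, shift=5.0)   # plays the role of the generator

# Train on the "full simulation" domain, then evaluate in and out of domain.
model = NearestCentroid().fit(x_full, y_full)
in_domain = accuracy(model, x_full, y_full)
cross_domain = accuracy(model, x_fast, y_fast)
```

Because both toy domains are drawn from the same distribution, the two accuracies agree. A generator that exaggerates or misses class-dependent features would instead show a gap between them, and swapping the roles of the two domains probes the opposite direction.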

Computational Performance
Directly generating deposited energy per calorimeter cell rather than particle dynamics renders the model's time complexity invariant to nominal energy, whereas Geant4 shower simulation runtime increases significantly with higher energy. Therefore, the CaloGAN affords sizable simulation-time speed-ups compared to Geant4. All benchmarks are performed on Intel Xeon® 2.6 GHz processors for CPU time and a single NVIDIA® K80 for GPU time. When simulating a single $e^+$ in a uniform energy range between 1 GeV and 100 GeV, CaloGAN is $\mathcal{O}(10^2)$ times faster than Geant4 on both CPU and GPU. However, when batching is utilized, the CaloGAN throughput improves significantly: with a batch size of 1024 (not unrealistic given the embarrassingly parallel nature of EM showering), per-$e^+$ generation is $\mathcal{O}(10^3)$ times faster on CPU and $\mathcal{O}(10^5)$ times faster on GPU.
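The reason the cost does not depend on the requested energy, and why batching helps, can be seen in a toy forward pass. The sketch below uses random, untrained weights and a hypothetical two-layer fully-connected network with an assumed latent size; it is not the CaloGAN architecture. The same fixed sequence of matrix products is executed whatever the value of $E_0$, and a batch of 1024 showers is produced by one call.

```python
import numpy as np

LATENT_DIM = 32      # hypothetical latent size d
N_CELLS = 504        # 3*96 + 12*12 + 12*6 read-out cells

weight_rng = np.random.default_rng(0)
# Random, untrained weights: stand-ins for a trained generator g.
W1 = weight_rng.normal(size=(LATENT_DIM + 1, 128))
W2 = weight_rng.normal(size=(128, N_CELLS))

def generate(nominal_energy, rng):
    """Map a batch of requested energies to a batch of toy 'showers' in one pass."""
    e0 = np.asarray(nominal_energy, dtype=float).reshape(-1, 1)
    z = rng.normal(size=(e0.shape[0], LATENT_DIM))        # z ~ p_z
    # Condition on E_0 by appending it (scaled by the 100 GeV upper bound).
    hidden = np.tanh(np.hstack([z, e0 / 100.0]) @ W1)
    return np.maximum(hidden @ W2, 0.0)                   # clip deposits at zero

rng = np.random.default_rng(1)
single = generate([50.0], rng)                                # batch of 1
batch = generate(rng.uniform(1.0, 100.0, size=1024), rng)     # batch of 1024
```

Timing `generate` for the two calls on a given machine shows how the per-shower cost drops with batch size; the actual factors depend on hardware and are not reproduced by this toy.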

OUTLOOK AND FUTURE WORK
This letter demonstrates that the Generative Adversarial Network technology represents a powerful new tool for efficient simulation. Our ability to infuse Physics domain knowledge into the neural network documents the flexibility and extensibility of the method for field-specific applications and explicit mismodeling mitigation.
Prior to this work, the prospect of a GAN-based calorimeter simulation had generated considerable excitement within the high energy physics community. The availability and performance of the CaloGAN have attracted further interest as a concrete and publicly available demonstration of the power and drawbacks of a GAN-based calorimeter simulation. In addition to the applicability within individual experiments, variations of the CaloGAN are also being studied as a generic tool for future Geant software versions. While the CaloGAN is currently structured as a fast simulation tool, in the future it could also be trained on testbeam data to replace or augment a full simulation tool.
Future work will focus on incorporating the most recent cutting-edge innovations from the GAN literature to stabilize the training procedure and improve convergence to optimal solutions [33][34][35][36]. While our primary effort will be to improve and maintain this technique for event simulation at the LHC, this neural-network approach retains generalization power to other fields in which computationally expensive simulation inhibits result productivity.