Towards a Quantum Computing Algorithm for Helicity Amplitudes and Parton Showers

The interpretation of measurements of high-energy particle collisions relies heavily on the performance of full event generators. By far the largest amount of time to predict the kinematics of multi-particle final states is dedicated to the calculation of the hard process and the subsequent parton shower step. With the continuous improvement of quantum devices, dedicated algorithms are needed to exploit the potential quantum computers can provide. We propose general and extendable algorithms for quantum gate computers to facilitate calculations of helicity amplitudes and the parton shower process. The helicity amplitude calculation exploits the equivalence between spinors and qubits and the unique features of a quantum computer to compute the helicities of each particle involved simultaneously, thus fully utilising the quantum nature of the computation. This advantage over classical computers is further exploited by the simultaneous computation of s and t-channel amplitudes for a 2$\rightarrow$2 process. The parton shower algorithm simulates collinear emission for a two-step, discrete parton shower. In contrast to classical implementations, the quantum algorithm constructs a wavefunction with a superposition of all shower histories for the whole parton shower process, thus removing the need to explicitly keep track of individual shower histories. Both algorithms utilise the quantum computer's ability to remain in a quantum state throughout the computation and represent a first step towards a quantum computing algorithm to describe the full collision event at the LHC.


Introduction
Modern collider experiments such as the Large Hadron Collider (LHC) at CERN depend heavily on the modelling of particle collisions and simulations of detector response to examine physics processes within the experiments. This modelling is used to construct different possible outcomes from particle collisions, used both for the identification of certain physical processes, and for the construction of event backgrounds. Consequently, such simulations play a crucial role in modern high energy physics, and are usually carried out by Monte Carlo event generators such as Pythia [1], Herwig [2] and Sherpa [3].
The theoretical description of LHC events can be highly complex. In a typical event, hundreds of particles are produced as a result of the evolution of an event from the collision of two protons to the formation of long-lived hadrons, leptons and photons. The collision process can be separated into several stages. The protons consist of many partons, each carrying a fraction of the total proton energy. When protons collide, two of their partons can interact with each other via a large momentum transfer, thereby giving rise to the so-called hard interaction. In this part of the collision, large interaction scales are probed, possibly accessing new physics. However, if color-charged particles are produced during the hard interaction process, they are likely to emit further partons. This results in a parton shower, providing a mechanism that evolves the process from the hard interaction scale down to the hadronisation scale O(Λ QCD ), where non-perturbative processes rearrange the partons into colour-neutral hadrons.
The hard interaction and the parton shower are the two parts of the event evolution that can be described perturbatively and largely independently of non-perturbative processes, as a result of the factorisation theorem [4]. In addition, they are by far the most time-consuming parts of the event simulation and pose, therefore, the bottleneck in the generation of pseudo-data for ongoing measurements at the LHC.
While a speed improvement in calculating the hard process and the parton shower is crucial for the interpretation of high-energy collision experiments, the conceptual methods used to calculate either of these two parts are distinctly different. For a mathematical description of the hard interaction, scattering matrix elements are calculated, which nowadays rely on helicity amplitude methods to cope with the ever-increasing complexity of the partonic scattering process [5,6]. Instead, the parton shower is technically implemented through a Markov chain algorithm ordered in some measure of showering time t, where splitting functions define the probability for a parton to branch into two partons and Sudakov factors [7] determine the probability for the system not to change between two shower times * t in and t end . Recent developments in combining helicity amplitudes with the parton shower have shown to improve the theoretical description of scattering events including multiple jets [9][10][11][12][13][14], in hypothesis testing [15,16] and in particular in the construction of spin-dependent parton showers [17].
With practical quantum computers becoming available, there has been growing interest in harnessing the power and advantages that these machines may provide. This interest extends to applying the abilities of quantum computers to describe processes in field theories, with the hope of exploiting the intrinsic 'quantumness' of these novel machines to calculate quantum phenomena efficiently. Current quantum computers are divided into two classes: quantum annealers and universal gate quantum computers (GQC). The former is based on the adiabatic theorem of quantum mechanics to find the ground state of a complex system. Quantum annealers perform continuous-time quantum computations and are therefore well-suited to study the dynamics of quantum systems, even quantum field theories [18,19], and in solving optimisation problems, e.g. applied to Higgs phenomenology [20]. However, they are not universal. Despite their severe limitation due to the relatively small number of qubits of current machines, GQC are a popular choice for the implementation of algorithms to calculate multi-particle processes [21][22][23][24][25][26][27][28][29][30][31], often with field theories mapped onto a discrete quantum walk [32][33][34][35] or a combined hybrid classical/quantum approach [36][37][38][39].
Here, we aim to provide a first step towards a generic implementation of quantum algorithms, applicable to QGC devices, for the most time-consuming parts of the event generation in high-energy collisions, i.e. the calculation of the hard process in terms of helicity amplitudes and the simulation of the parton shower † .
As depicted in Fig. 1, QC calculations proceed in general in three stages: i) encoding of the initial state, i.e. an initial wavefunction, using a specific representation of the problem, ii) applying unitary operations on this state, which on a GQC is realised through circuits, and iii) measuring a specific property of interest, i.e. a projection onto the final state vector. Following this structure, we will elucidate how the calculation of a hard process in terms of helicity amplitudes or the parton shower can be performed using a GQC. Specifically, we use the IBM Q Experience [40], which provides access to a range of public access quantum computers and a 32-qubit Quantum Simulator [41]. We have designed the circuits with a focus on limiting the number of qubits needed to perform the calculations. While our code can be run on a real quantum device, the current quantum machines cannot outperform classical computers. The quantum circuits presented here, therefore, serve as a template and nucleus for future developments. This paper is organised as follows: in Sec. 2, we motivate and detail the implementation of our QC algorithm for helicity amplitudes for a 1→2 process and a 2→2 scattering process, Sec. 3 contains the description of our 2-step parton shower algorithm and Sec. 4 offers a summary and conclusions.

Helicity amplitude algorithm
Scattering processes are calculated using conventional techniques by squaring the scattering amplitude and then performing a sum of all possible helicity processes using trace techniques. For a process with N possible Feynman diagrams, this results in N 2 terms in the squared amplitude. Therefore, for processes with a large number of Feynman diagrams, such calculations become extremely complicated. In contrast, helicity amplitude calculations provide a more efficient way of calculating such processes, as one calculates the amplitude for a specific helicity setup. The different helicity combinations do not interfere, and therefore the full amplitude can be obtained by summing the squares of all possible helicity amplitudes.
Helicity amplitude calculations are based on the manipulation of helicity spinors. As the Lorentz group Lie algebra can be written as the direct sum of two SU (2) sub-algebras, i.e. so(3, 1) = su(2) ⊕ su (2), there are two specific complex representations each specified by two degrees of freedom which solve the massless Weyl equation: a right-handed Weyl spinor, associated with the representation ( 1 2 , 0), and a left-handed Weyl spinor, associated with the representation (0, 1 2 ). Consequently and for concreteness, the helicity spinor |p ȧ for a massless state can be chosen to be expressed as associated with momentum p µ and energy E, such that p µ p µ = −m 2 using the η µν =diag(-1, +1, +1, +1) metric convention. This spinor is parametrised by the angles θ and φ, where the other spinors p|ȧ, |p] a and [p| a are related by p aḃ = −|p] a p|˙b and pȧ b = −|p ȧ [p| b . The correspondence between the two-dimensional helicity spinors and four-component Dirac spinors associated with Feynman rules is demonstrated in Appendix B.
To facilitate and implement such calculations on a GQC, we use qubits, the quantum analogue of the bit for classical computation. The state of the qubit is defined on a twodimensional complex vector space with states |0 and |1 forming the orthonormal basis for this space. A qubit can thus be formed by a linear superposition of these orthonormal basis states. By considering a general qubit parametrized by two angles we can represent the qubit on a three-dimensional unit sphere called the Bloch sphere. Performing unitary operations on qubit states corresponds to rotating states in the Bloch sphere.
Remarkably, comparing Eqs. (2.1) and (2.2), helicity spinors can be represented through a qubit, modulo an overall normalisation factor √ 2E, and the calculation of helicity amplitudes follows the identical structure shown in Fig. 1, i.e. quantum operators act on an initial state to eventually perform the projection onto a final state. In contrast to classical -4 -computers, where all numerical quantities are converted into a binary system representation, on which an algorithm is applied, and then transformed back into quantities that can be understood in terms of a numerical result, in a quantum computing algorithm, the helicity spinor is a faithful representation of the object the circuit directly operates on. The spinors can be directly represented as vectors on the Bloch sphere, which provides the most efficient encoding of the state on which the algorithm operates. This indicates that GQC provide an ideal framework for the calculation of helicity amplitudes.
Consequently, we will exploit that the spinors used to calculate helicity amplitudes naturally live in the same representation space as qubits. This motivates the manipulation of the direct correspondence of the θ and φ variables of the qubit states and helicity spinors to represent the spinors on a quantum circuit. We further encode operators acting on spinors as quantum circuits of unitary operations. These can be applied to qubits (rotating vectors on the Bloch sphere) to calculate helicity amplitudes. The helicity spinors |p ȧ ,( p|ȧ) T , |p] a and ([p| a ) T are visualised for θ = π/4, φ = π/2, E = 1/2, as vectors on the Bloch sphere in Fig. 2, in direct analogy to their respective qubit representation.
This study aims to create the basic building blocks to encode spinor helicity calculations on a quantum circuit. These basic building blocks are then used to construct quantum algorithms for two simple examples of helicity calculations: i) the contraction of an external polarisation vector corresponding to a g → qq vertex and ii) the construction of s and tchannel amplitudes for a qq → qq process with identical initial and final quark flavours. 'Helicity registers' are crucially introduced into these circuits to control the helicity of each particle involved. In addition, we introduce a superposition state between the helicity qubits of |+ = |1 and |− = |0 by applying Hadamard gates to the helicity registers. In doing so, we can calculate both helicities of each particle involved simultaneously, thus fully utilising the quantum nature of the computation. This advantage is further exploited by the simultaneous computation of s and t-channel amplitudes for the qq → qq process.
This section is organised as follows: a description of the quantum circuit for the 1→2 process of g → qq is given in Sec. 2.2, together with a comparison of the results of the algorithm as run on a real machine and a simulator, the quantum circuit and the results for the 2→2 process of qq → qq are given in Sec. 2.3, and a brief discussion of the generalisation of the algorithm to 2→ n processes follows in Sec. 2.4. -5 -

Constructing helicity spinors and scalar products on the Bloch sphere
The helicity spinors have been implemented on the quantum circuit by constructing Bloch sphere representations, like the ones shown in Fig. 2. The helicity spinor decompositions are outlined in detail in Appendix C. They utilise the Qiskit U 3 (θ, φ, λ) gate, which applies a rotation to a single qubit. The rotation is defined by, A simple U 3 gate acting on a |0 state has been used to create the |q ȧ spinor, where θ and φ variables of the U 3 gate corresponded to the θ and φ variables of the helicity spinor.
The |q] a spinor has been created by sequentially applying a U † 3 rotation and a N OT gate, where here the θ and λ variables of the U 3 gate corresponded to the θ and φ variables of the |q] a spinor.
To construct the scalar products pq or [pq] on a quantum computer, 2 × 2 unitary gates U p and U [p were created such that, when they act on the |q ȧ and |q] a spinors respectively, the scalar product values correspond to the first component of the final qubit state, i.e. the complex coefficient associated with the |0 state. It should be noted that the factors of √ 2E in the definition of the helicity spinors have not been accounted for such that the spinor-qubit states are normalized to one on the quantum register. As a consequence, these factors must be added after the results have been obtained from the quantum computer.

1→2 amplitude calculation
A simple application of the helicity amplitude approach is the calculation of a 1→2 process. Here we will consider the process of q → gq by calculating the gqq vertex, where p f and p f are the momenta associated with the fermon and anti-fermion respectively. The gluon polarisation vectors are defined as [42], (2.5) From this, it is possible to create a circuit where each four-vector present in the amplitude, i.e. the fermion anti-fermion vertex and polarisation vector, is calculated individually on a series of 4 qubits. This is done by using the corresponding Pauli gates for each fourvector component on each qubit. However, this will lead to a large circuit depth due to the number of gates required to do such a calculation. Therefore it is useful to simplify the expression for the amplitude using the Fierz identity, p|σ µ |q] k|σ µ |l] = 2 pk [ql]. (2.6) With this, the amplitude for the gqq vertex becomes As a consequence of this simplification, the number of qubits needed to calculate the amplitude on the quantum computer can be reduced from 10 to 4. The circuit for calculating this amplitude is shown in Fig. 3. The three q i qubits calculate the three scalar products from Eq. (2.7) using the gate decompositions outlined in Appendix C. These rotation gates are controlled from the helicity register, h. If h is in the |1 state, then the helicity is positive and the M + amplitude is calculated; if h is in the |0 state, then the helicity is negative and the M − amplitude is calculated. The three calculation qubits, q i , are then measured by the quantum machine.
h H Figure 3: gqq vertex circuit. The amplitude for the process is calculated on the q i qubits, which are controlled from the helicity register. The q i qubits are then measured by the quantum computer.
negative and the M amplitude is calculated. The three calculation qubits, q i , are then measured by the quantum machine. Figure 4 shows the results of the algorithm for a random selection of small scattering angles, with runs on the IBM Q 32-qubit Quantum Simulator [42] and the IBM Q 5qubit Santiago Quantum Computer [44]; both of which have been compared to theoretical predictions of the probability distributions extrapolated directly from analytic calculations of the helicity amplitude, calculated using the S@M software [45]. The simulator has been run without a noise profile for 10,000 shots, and has been shown to agree within 1 of the theoretically predicted values. From these distributions, one can determine the helicity setup of the process and consequently reconstruct the helicity amplitudes of the process.
The Santiago machine has been run on the maximum shot setting of 8192 for 100 runs, leading to a total of 819,200 shots of the algorithm. From Fig. 4, it is clear that the quantum computer's performance does not match that of a perfect machine. Although the helicity of the process which has been calculated can be identified from the distinct probability distributions, one cannot determine the explicit amplitude from the real machine. However, it should be noted that a comparison to a perfect machine may not be a fair comparison for modern quantum computers. Therefore a comparison between a simulator run with the Santiago device's noise profile and the quantum computer results is shown in Appendix D. Section 2.4 explores the future of quantum computers for precise helicity amplitude calculations.
The results from the quantum computer, shown in Fig. 4, have been achieved by isolating the individual helicity processes on the quantum circuit, and removing the superposition between the positive and negative processes. The full amplitude is achieved Figure 3: gqq vertex circuit. The amplitude for the process is calculated on the q i qubits, which are controlled from the helicity register. The q i qubits are then measured by the quantum computer. Figure 4 shows the results of the algorithm for a random selection of small scattering angles, with runs on the IBM Q 32-qubit Quantum Simulator [41] and the IBM Q 5qubit Santiago Quantum Computer [43]; both of which have been compared to theoretical predictions of the probability distributions extrapolated directly from analytic calculations of the helicity amplitude, calculated using the S@M software [44]. The simulator has been run without a noise profile for 10,000 shots. The results agree well with theoretically predicted values, to within 1σ. From these distributions, one can determine the helicity setup of the process and consequently reconstruct the helicity amplitudes.
The Santiago machine has been run on the maximum shot setting of 8192 for 100 runs, leading to a total of 819,200 shots of the algorithm. Figure 4 shows that the quantum computer's performance does not match that of a perfect machine, as expected. Therefore, the simulator is rerun with the noise profile of the Santiago device and a comparison between this and the quantum computer is shown and discussed in Appendix D.
The results from the quantum computer, shown in Fig. 4, have been achieved by isolating the individual helicity processes on the quantum circuit, and removing the superposition between the positive and negative processes. The full amplitude is achieved through the implementation of a Hadamard gate on the helicity qubit, which puts the system into a superposition state of the positive and negative processes. The helicity of the process is then determined by measuring the helicity register. The qubit setup chosen here has been used in order to best reduce the CNOT qubit errors and limits the number of SWAP operations needed in the algorithm. The Santiago machine is a 5-qubit quantum computer, with all qubits connected inline to their adjacent qubit. The helicity qubit, h, from Fig. 3 has been assigned to qubit 4 on the Santiago machine, with the q i qubits on the 2nd, 3rd and 5th qubits of the Santiago machine. The optimum qubit setup would have the h qubit fully connected to the q i qubits, thus fully minimising the SWAP operation errors. However, the available machines with such a qubit mapping on the public IBM Q experience have a lower quantum volume than the Santiago machine, which reports a quantum volume of 32. Consequently, the trade of ideal qubit mapping for a better quantum volume has been made.  Figure 4: Results for the q → gq helicity amplitude calculation. Comparison between theoretically calculated probability distribution, quantum simulator and real quantum computer.
One of the key sources of error in the quantum computer is readout noise. Error mitigation methods have been used to optimise the output from the quantum computer and reduce readout noise effects. This has been done using the Qiskit Ignis software [40], which provides tools for noise characterisation and error correction based on noise models -8 -of the quantum machines. The method involves testing simple qubit states on a series of calibration circuits, which are run using the quantum simulator with the noise profile of the Santiago machine. The response matrix created from this is shown in Fig. 5. This response matrix is calculated immediately before running the algorithm and then applied to the machine results to obtain the error corrected results, as shown in Fig. 4.

2→2 amplitude calculation
Extending from the 1 → 2 case in Sec. 2.2, the implementation of a full helicity amplitude calculation for the s and t-channels of a 2 → 2 scattering process is presented here ‡ . As an example, we consider a qq → qq process. The initial state quark and antiquark are labelled as particles 1 and 2 respectively and the final state quark and antiquark as 3 and 4. In total, there are only 4 non-zero helicity configurations possible for each s and t-channel process. The relevant amplitudes are, where the +/-signs denote the helicity of the outgoing-particles 1, 2, 3 and 4 and The other non-zero amplitudes are obtained by complex conjugation. The calculation is performed in the Centre-of-Mass (CM) frame and the momenta of individual particles is defined such that the only dependent input variable is the angle, ‡ Note, for the calculation of the 1 → 3 case only minor modifications are needed. θ, through which the quark (and antiquark) is scattered. In the CM frame, the overall magnitude of energy, E, associated with the momenta of each particle also drops out of the final helicity amplitude and is therefore not considered in this example.
In the 'all-outgoing' convention of spinor-helicity formalism [42], the momenta of incoming particles are flipped so that the incoming quark (1) (antiquark (2)) is mapped to an outgoing antiquark (quark) with opposite helicity. In the quantum algorithm, each quark-antiquark vertex is calculated on a 4-qubit quantum register, q i . The outgoing antifermion spinor, q /q], is implemented on the vertex quantum register, q j i , followed by the two dimensional representation of the gamma matrices, σ µ /σ µ , and then finally the vertex is closed with the opposite helicity outgoing fermion spinor, [q/ q. A single qubit, s, is used to calculate the denominator of the gluon propagator. The calculation is controlled both from the helicity registers, h i , which determine what helicity configuration the particles are in, and the amplitude qubit, p, which controls whether the s or t-channel process is calculated. A schematic of the quantum circuit is shown in Fig. 6. Through this implementation, each component of the helicity amplitude can be calculated and extracted from the machine.
where the +/-signs denote the helicity of the outgoing-particles 1, 2, 3 and 4 and (2.10) The other non-zero amplitudes are obtained by complex conjugation. The calculation is performed in the Centre-of-Mass (CM) frame and the momenta of individual particles are defined such that the only dependent input variable is the angle, ✓, through which the quark (and antiquark) is scattered. In the CM frame, the overall magnitude of energy, E, associated with the momenta of each particle also drops out of the final helicity amplitude and is therefore not considered in this example.
In the 'all-outgoing' convention of spinor-helicity formalism [? ], the momenta of incoming particles is flipped so that the incoming quark (1) (antiquark (2)) is mapped to an outgoing antiquark (quark) with opposite helicity. In the quantum algorithm, each quark-antiquark vertex is calculated on a 4-qubit quantum register, q i . The outgoing antifermion spinor, qi/q], is implemented on the vertex quantum register, q j i , followed by the two dimensional representation of the gamma matrices, µ /¯ µ , and then finally the vertex is closed with the opposite helicity outgoing fermion spinor, [q/hq. A single qubit, s, is used to calculate the denominator of the gluon propagator. The calculation is controlled both from the helicity registers, h i , which determine what helicity configuration the particles are in, and the amplitude qubit, p, which controls whether the s or t-channel process is calculated. A schematic of the quantum circuit is shown in Fig. ??. Through this implementation, each component of the helicity amplitude can be calculated and extracted from the machine. Figure 6: Circuit for the qq ! qq process helicity amplitude calculation. The q j i registers are used to calculate the qq vertices, and these are controlled from the helicity registers, h i , which dictate the helicity configuration of the process.
This method is powerful as it allows for each component of the calculation to be extracted, however it leads to a complicated circuit, especially if one implements a method -9 - Figure 6: Circuit for the qq → qq process helicity amplitude calculation. The q j i registers are used to calculate the qq vertices, and these are controlled from the helicity registers, h i , which dictate the helicity configuration of the process.
This method is powerful as it allows for each component of the calculation to be extracted, however it leads to a complicated circuit, especially if one implements a method of dealing with incorrect helicity setups. As in Sec. 2.2, the circuit can be simplified by directly calculating the scalar products required for the final amplitudes. The amplitudes given in Eqs. (2.8) and (2.9) can be simplified using Eq. (2.6) (and that [p|σ µ |q = q|σ µ |p]) to give the final forms, 12 [21] , M s (+−−+) = 2 23 [41] 12 [21] (2.11) -10 -and M t (++−−) = 2 34 [21] 13 [31] , M t (+−−+) = 2 32 [41] 13 [31] . (2.12) Using these expressions, the number of qubits needed for the circuit is reduced from 17 to 12 qubits. Another advantage is that the machine now only has to read out 3 qubits, where previously 8 qubits were read out per run. On these three qubits, each of the scalar products is calculated. The quark-antiquark vertex scalar products from the numerator are calculated on the first two qubits, and the denominator of the gluon propagator is calculated on the third qubit. Only one scalar product needs to be calculated for the denominator since [42], therefore the second scalar product can be determined from the same qubit. This simplified circuit is run on the IBM Q 32-qubit Quantum Simulator [41] for 10,000 runs and compared to theoretically calculated probability distributions, extrapolated directly from analytic calculations of the helicity amplitude, calculated using the S@M software [44]. Using the equivalence between helicity spinors and orthogonal pure state qubits, these theoretical predictions have been obtained from the probabilities of each of the qubits to be in the |0 or |1 state, which correspond to the magnitude squared of the upper and lower components of the helicity spinor respectively. The results from the quantum simulator show that the output of the quantum circuit lies within 1σ of the theoretically predicted probability distribution and are shown in Fig. 7 for both the s and t-channel in a specific helicity configuration.

Generalisation to 2 → n amplitude calculations
It can be shown, using the BCFW recursion formula [45,46] and the relations in Eq. (2.5), that scattering amplitudes for massless partons can be reduced to a combination of scalar -11 -products between helicity spinors § . Consequently, the algorithm presented in Secs. 2.2 and 2.3 can be generalised to multi-particle amplitudes straightforwardly, as the tools are already created, namely the circuit decompositions of the helicity spinors from Appendix C. The number of calculation qubits, q i , and the number of helicity qubits, h i , needed in the algorithm both scale linearly with the number of final state particles, n. As the number of helicity qubits, h i , scales linearly, then so does the number of work qubits needed in the algorithm. Each scalar product calculation requires two spinor operations, and so the algorithm can be easily extended without adding disproportionate complexity. The circuit depth scales linearly with an increase in the number of scalar products, calculated on the q i qubits, and the number of helicity qubits, h i , added to the circuit. It is interesting and practical to consider the extension of the simple helicity amplitude algorithms presented here to more complicated processes that are likely to be present in high energy collisions, such as those studied at the LHC. As we have seen in Sec. 2.2, modern public access quantum computers do not perform to a standard where one could extrapolate accurate calculations of helicity amplitudes, even for a single vertex. However, the performance of public access computers is well below that of state of the art machines, such as the IBM 53-qubit machine and the Honeywell machine. The latter, in unpublished work, claims to have the world's best Quantum Volume of 64 [47]. Such computers do not have the same restrictions as the smaller, less capable public access machines. The more powerful machines offer more choice for qubit setup and mapping, and the ability to perform more operations before decoherence in the machine starts to affect the circuit output. We can speculate that the algorithms presented here would be very accurate on these machines, especially the vertex calculation, which comprises a maximum of only 33 operations across 4 qubits.
The main difficulty of extending such algorithms for helicity amplitude calculations on quantum computers comes not only from limitations due to the number of qubits, but also the machine's fault tolerance. The more complicated the helicity amplitude calculation, the more operations are needed to calculate it. Therefore, a machine needs not only sufficient qubits but also the ability to implement many operations without excess noise. For the algorithm proposed, the immediate challenge is not the number of qubits available, but the number of operations that can be reliably implemented on the circuit. With advancements in the Quantum Volume of quantum computers [48], this limitation will likely be overcome on current hardware. It is possible that near-future computers will have the ability to perform accurate and precise calculations and also have a large number of qubits. IBM recently announced their roadmap for the future and the goal of having machines with the number of qubits exceeding 1,000 by 2023 [49]. Therefore, it is highly likely that these nearfuture devices will be able to perform precise helicity amplitude calculations for processes § A well-known example is the Parke-Taylor formula for a 2 → n gluon scattering process, where the gluons i and j have helicity (-) and all other gluons have helicity (+). Then the formula provides the following expression for the amplitude An, . (2.14) -12 -with a large number of particles.

Parton shower algorithm
After the hard process is calculated, the next step in simulating a scattering event at a high-energy collider experiment is the parton shower stage. The parton shower evolves the scattering process from the hard interaction scale down to the hadronisation scale. We propose an algorithm for simulating a QCD parton shower using IBM Quantum Experience [40] software and hardware. The quantum circuit has been implemented to simulate a 2-step QCD parton shower with collinear splittings only. Section 3.1 provides the theoretical outline for the shower algorithm and discusses the splitting functions and probability calculations implemented in the quantum circuit. A brief overview of the quantum circuit is given in Sec. 3.2, and a comparison between the results of the algorithm and theoretically calculated probability distributions are discussed in Sec. 3.3. A glossary of quantum logic gates is given in Appendix A and a detailed overview of the quantum circuit for the algorithm in Appendix E.

Theoretical outline of shower algorithm
We present a parton shower algorithm with the ability to simulate a general, discrete QCD parton shower, harnessing the quantum computer's ability to remain in a quantum state throughout the algorithm. In contrast to classical methods, the algorithm does not need to explicitly keep track of individual shower histories. Instead, our algorithm constructs and maintains a wavefunction that consists of a superposition of all possible shower histories, with the final measurement projecting out a specific quantity of the final state. Consequently, the algorithm presented inherently simulates the quantum interference between all possible final states, without the need for extensive computational logic present in current classical algorithms. In a classical algorithm, a physically meaningful quantity can only be extracted from a parton shower calculation after summing over all possible shower histories, requiring them to be stored on a physical memory device. Our quantum algorithm avoids the need for such an intermediate step, as the measurement is performed on the superposition of all shower histories directly. The goal is to create the foundation for constructing a general quantum algorithm that can simulate a full QCD parton shower. To comply with the current capabilities of public access quantum computers and simulators provided by IBM Quantum Experience [40], the algorithm presented here uses a simplified model consisting of one flavour of quark and a gluon. This reduces the number of qubits needed, and the algorithm can be run on the IBM Q 32-qubit Quantum Simulator [41]. To further reduce the number of required qubits, only collinear splittings are considered within the model. By neglecting the soft-limit, there is no need to keep track of the detailed kinematics of the particles in the shower history.
Collinear emission occurs when a parton splits into two massless particles which have parallel 4-momenta, such that the total momentum, P , is distributed between the particles as thus, (p i + p j ) 2 = P 2 = 0 [50]. The emission probabilities in the algorithm are calculated using the collinear splitting functions outlined in [51][52][53][54]. A consequence of the collinear limit being a semi-classical interpretation with 1-to-2 splittings leads to the presence of a diagonal colour charge in the splitting functions, C ii . The splitting for a quark to a gluon and a quark, with momentum fractions z and 1 − z respectively, is described at Leading Order (LO) by with C F = 4/3. The gluon splitting can be divided into two parts, with the first describing the splitting of a gluon to a quark-antiquark pair and the second describing the splitting of a gluon to two gluons, where C A = 3 and T R = 1/2. Here, n f is the number of massless quark flavours, and T R is the colour factor. It should be noted that both splitting functions have a soft singularity at z = 0; the hard-collinear limit only takes into account finite z.
Further to calculating the splitting functions, the Sudakov factors have been used to determine whether an emission occurred in the step. The Sudakov factors for a QCD process are given by [7] and are used to calculate the non-emission probability. The running of the strong coupling, α s , is not simulated in this algorithm and for ease has been set to 1. For any given step N , there are N possible particles present, and so the probability that none of the particles split is given by Finally, the probability of a certain splitting is therefore obtained from To implement the algorithm efficiently, preference has been given to gluons splitting to a quark-antiquark pair. This splitting preference implementation is explained in depth in Appendix E, but, for definiteness, the probability of a gluon splitting to two gluons is calculated as For the energy scale considered here, this should have a small affect on the results as P g→qq (z) P g→gg (z).

Implementation on quantum circuit
A quantum circuit has been constructed to simulate a parton shower with collinear splittings. The circuit comprises of particle registers, emission registers and history registers and uses a total of 31 qubits. The algorithm is discretised into individual steps. An emission can occur in each step, and the probabilities are calculated from the splitting functions and Sudakov factors. To meet the 32 qubit limit of the IBM Q Quantum Simulator [41], the algorithm has been limited to two steps, but it is generally extendable. Figure 8 shows the circuit diagram for a single step.
gluons is calculated as For the energy scale considered here, this should have a small a↵ect on the results as P g!qq (z) ⌧ P g!gg (z).

Implementation on Quantum Circuit
A quantum circuit has been constructed to simulate a parton shower with collinear splittings. The circuit comprises of particle registers, emission registers and history registers and uses a total of 31 qubits. The algorithm is discretised into individual steps. An emission can occur in each step, and the probabilities are calculated from the splitting functions and Sudakov factors. In order to meet the 32 qubit limit of the IBM Q Quantum Simulator [? ], the algorithm has been limited to two steps, but it is generally extendable. Figure ?? shows the circuit diagram for a single step.   The algorithm follows a similar method to that described in [26], first counting the particles present in the simulation, determining whether an emission has occurred and if so, assessing which splitting did occur, then finally updating the particle content of the simulation. In contrast to the method shown by [26], the algorithm presented here has the ability to simulate a QCD process with splittings for both gluons and quarks implemented using the splitting functions outlined in Eqs. (3.2) and (3.3). The addition of such splitting functions leads to significant changes to the algorithm compared to that presented in [26], specifically in the History and Update gates of the algorithm, shown in Fig. 8. The implementation of these gates is outlined in detail in Appendix E. Unlike the algorithm presented in [26], we have chosen not to introduce flavour mixing at the start of the algorithm. Instead, the superposition and interference between the possible output states are introduced in the tailored History and Update gates. With the ability to simulate gluon and quark splittings, the algorithm is thus well suited to hadronic parton shower simulation and provides the foundations for a general parton shower algorithm for use on a GQC.
The parton shower algorithm is designed to operate on the public access IBM Q 32qubit Quantum Simulator [41], which allows for a total of two steps to be simulated on the machine. As the machine is a simulator, it does not suffer from noise or a limit on the number of operations due to qubit decoherence effects, therefore giving a simulation of a perfect machine. As a consequence, error checking is easily done with direct comparison to theoretically predicted probability distributions, and this is discussed in Sec. 3.3.
One of the main benefits of using a quantum computing (QC) algorithm for the simulation of QCD parton showers over classical methods is the computational simplicity of the algorithm. When dealing with interference of different splittings in the shower process, the algorithm presented here offers a much less computationally complex approach than that provided by modern Monte Carlo event generators. This is achieved by utilising the unique ability to maintain the quantum computer in a fully quantum state throughout the algorithm, and only collapse to a classical circuit by measurement at the end of the process. This allows for the system to account for all possible parton shower histories simultaneously. In contrast, modern Monte Carlo methods must manually keep track of the particle splitting histories to consider all possible contributions to a specific final state. For a two-step, discrete parton shower, this is a relatively easy task for a modern Monte Carlo generator. However, the quantum computing field is still in its infancy; the true potential of quantum computing for simulating QCD parton showers will become apparent with the advancement of quantum technologies. With more available qubits and machines with improved hardware, the algorithm presented here will have the ability to simulate quantum effects, without the extensive and complex computational logic that a classical computer would need. Therefore, quantum computers offer an avenue to explore processes that contain a large number of shower particles, thus requiring complicated parton histories and computing power, not currently achievable with modern classical techniques. Beyond QCD parton showers, this feature of a quantum computing algorithm can be of particular interest for cosmic-ray air showers, where millions of long-lived particles are simulated [55,56].

Results of parton shower
A comparison of the output from the parton shower algorithm and theoretical predictions of the splitting probabilities is made, and the results are shown in Fig. 9. The algorithm was run for 10,000 shots using the IBM Q 32-qubit Quantum Simulator [41], with a momentum interval of z lower = 0.3 to z upper = 0.5, and no noise simulation. Here the theoretical predictions have been calculated using the collinear splitting functions from Eqs. (3.2) and (3.3), using the method outlined in Sec. 3.1. The z value used for the particle splitting probabilities from Eq. (3.7) is the mid-point of the momentum interval used in the algorithm. The results are in agreement with the theoretically calculated probabilities to within 1σ.
A consequence of running the algorithm on a quantum simulator is that there will be no noise in the results, unlike a real quantum computer. Therefore, problems with the algorithm can be identified through direct comparison with the theoretical calculations. In the future, if the algorithm can be run on a real quantum computer with enough qubits, then IBM Q offers a range of noise reducing schemes for its devices through the Qiskit software [40]. Another advantage of using the quantum simulator is that it automatically chooses an optimum qubit setup. In a real quantum computer, the user has to select a -16 -qubit mapping in order to optimise the operation of the computer. For future use of the algorithm, this can be done using the calibration data provided by IBM Q.  : Results from the quantum circuit compared to theoretical predictions for two steps of the parton shower with momentum interval of z lower = 0.3 to z upper = 0.5 and the initial state particle of (a) gluon, (b) quark and (c) antiquark.
-17 - The accurate modelling of the complexity of collisions at experiments, such as the Large Hadron Collider, relies on theoretical calculations of multi-particle processes. Such calculations can be factorised into: the hard interaction, which models the large momentum transfer, and the parton shower, which models the evolution from the hard interaction scale down to the hadronisation scale. We present general and extendable quantum computing algorithms that provide calculations of the hard interaction process and the parton shower, as a first step towards a quantum computing algorithm to describe the full collision event at the LHC.
The hard interaction calculation uses helicity amplitudes by exploiting the equivalence of spinors and qubits, and encoding operators as a series of unitary operations in the quantum circuit, thus demonstrating an excellent use case of quantum computers to model the intrinsic quantum behaviour of the system. A quantum algorithm is constructed for two simple examples of helicity calculations; a gqq vertex and the qq → qq process. By applying Hadamard gates to helicity registers, we introduce a superposition state between the helicity qubits and can therefore calculate the positive and negative helicities of each particle involved simultaneously. This is further exploited in the simultaneous computation of s and t-channel amplitudes for the qq → qq process, thus fully utilising the quantum nature of the computation. A comparison between the theoretical predictions and the output of the quantum algorithm shows very good agreement. Furthermore, the successful implementation of the gqq vertex algorithm on a real machine is also demonstrated by comparing results from the machine with a simulator.
We also present a quantum algorithm for simulating collinear emission in a two-step, discrete parton shower with a maximum of three final state particles, utilising the quantum computer's ability to remain in a quantum state throughout the simulation. In contrast to classical implementations of parton showers, where individual shower histories have to be stored on a physical memory device, our quantum computing algorithm constructs a wavefunction for the whole parton shower process, which contains a superposition of all shower histories. As a result, we do not need to keep track of individual shower histories explicitly, and a physical quantity of the shower process can be obtained through a measurement of the wavefunction. The results from the algorithm, as performed on the IBM Q 32-qubit Quantum Simulator [41], show good agreement with theoretical predictions. The algorithm builds on previous work [26] by including a vector boson and boson splittings, which leads to significant changes in its implementation. The ability to simulate gluon and quark splittings makes the algorithm presented here well suited to hadronic parton shower simulation and provides the foundations for developing a general parton shower algorithm. With advancements in quantum technologies, this algorithm can be extended to include all flavours of quarks without adding disproportionate computational complexity.
With IBM recently setting their goal of exceeding 1,000 qubits by 2023 [49] and advancements in the development of devices with better Quantum Volume [48], we are on the brink of a quantum revolution. These developments would allow the algorithms presented in this paper to be extended to reflect the processes seen in experiments such as the -18 -LHC. The consequence of such advancements would be algorithms that can fully model the dynamics of quantum field theories to provide accurate and precise helicity amplitude calculations and simulations of parton showers.

A Quantum logic gate definitions
• NOT gate -a NOT gate is a single qubit operation which flips the state of the qubit.
The circuit representation of a NOT gate is: in their work and for sharing their preliminary codes. M.S would like to thank Steve Abel and Daniel Maitre for helpful discussions. K.B and M.S are supported by the STFC under grant ST/P001246/1. S.M and S.W are supported by a grant from the Royal Society.

A Quantum Logic Gate Definitions
• NOT Gate -a NOT Gate is a single qubit operation which flips the state of the qubit.
The circuit representation of a NOT Gate is: • CNOT Gate -a controlled -NOT (CNOT) Gate is a two qubit operation which flips the state of a target qubit dependent on the state of a control qubit.
Here, the first qubit is the control. The circuit representation of a CNOT Gate is: • To↵oli Gate (CCNOT) -A To↵oli Gate is a three qubit operation, which is just a further extension of the NOT gate with two control qubits.
-18 - • CNOT gate -a controlled -NOT (CNOT) gate is a two qubit operation which flips the state of a target qubit dependent on the state of a control qubit.
Here, the first qubit is the control. The circuit representation of a CNOT gate is:

A Quantum Logic Gate Definitions
• NOT Gate -a NOT Gate is a single qubit operation which flips the state of the qubit.
The circuit representation of a NOT Gate is: • CNOT Gate -a controlled -NOT (CNOT) Gate is a two qubit operation which flips the state of a target qubit dependent on the state of a control qubit.
Here, the first qubit is the control. The circuit representation of a CNOT Gate is: • To↵oli Gate (CCNOT) -A To↵oli Gate is a three qubit operation, which is just a further extension of the NOT gate with two control qubits.

• Toffoli gate (CCNOT)
-A Toffoli gate is a three qubit operation, which is just a further extension of the NOT gate with two control qubits.
The circuit representation of a Toffoli gate is: The circuit representation of a To↵oli Gate is: -20 - • Hadamard gate -a Hadamard gate is a purely quantum logic gate and does not have a classical logic gate equivalent. A Hadamard gate is a single qubit operation which puts a qubit into a superposition.
The Hadamard gate can be controlled, and so is only applied depending on the state of the control qubit. The circuit representation of a Hadamard gate is: • Hadamard Gate -a Hadamard gate is a purely quantum logic gate and does not have a classical logic gate equivalent. A Hadamard gate is a single qubit operation which puts a qubit into a superposition.
The Hadamard gate can be controlled, and so is only applied depending on the state of the control qubit. The circuit representation of a Hadamard Gate is:

H B Dirac and Helicity Spinor correspondence
The following demonstration of the correspondence between Dirac spinors and Helicity spinors can be seen in Chapter 2 of [? ]. Fermion and anti-fermion spinors satisfy the Dirac equations such that, where both equations have independent solutions which can be labelled by subscripts s = ±.
One can move to a basis where the ± denotes spin up/down along the z-axis, by ensuring that spinors u ± and ⌫ ± are eigenstates of the z-component of the spin-matrix in the rest frame. For massless fermions, ± denotes the helicity; the projection of the spin along the momentum of the particle. These spinors are also associated with the conventional Feynman rules for external fermions, e.g. ⌫ ± (p) for an outgoing anti-fermion andū ± (p) for an outgoing fermion. For the massless case, the Dirac equations reduce to where ⌫ ± (p) and u ± (p) are the wave functions associated with outgoing anti-fermions and fermions respectively. For this case the wavefunctions are related as u ± = ⌫ ⌥ and⌫ ± =ū ⌥ . The two independent solutions of the Dirac equations can be written as where the angle and square spinors are 2-component spinors that satisfy the massless Weyl equation.

B Dirac and helicity spinor correspondence
The following demonstration of the correspondence between Dirac spinors and Helicity spinors can be seen in Chapter 2 of [42]. Fermion and anti-fermion spinors satisfy the Dirac equations such that, where both equations have independent solutions which can be labelled by subscripts s = ±.
One can move to a basis where the ± denotes spin up/down along the z-axis, by ensuring that spinors u ± and ν ± are eigenstates of the z-component of the spin-matrix in the rest frame. For massless fermions, ± denotes the helicity, the projection of the spin along the momentum of the particle. These spinors are also associated with the conventional Feynman rules for external fermions, e.g. ν ± (p) for an outgoing anti-fermion andū ± (p) for an outgoing fermion. For the massless case, the Dirac equations reduce to where ν ± (p) and u ± (p) are the wavefunctions associated with outgoing anti-fermions and fermions respectively. For this case the wavefunctions are related as u ± = ν ∓ andν ± =ū ∓ . The two independent solutions of the Dirac equations can be written as where the angle and square spinors are 2-component spinors that satisfy the massless Weyl equation.

C Helcity amplitude gate decompositions
• U a gate -The U a takes the form of a conventional U 3 rotation gate, Therefore, the circuit representation is just a qiskit U 3 rotation,

C Helcity Amplitude Gate Decompositions
• U ai Gate -The U ai takes the form of a conventional U 3 rotation gate, Therefore, the circuit representation is just a qiskit U 3 rotation, -The U a] has the matrix form, Therefore, this gate has the circuit representation, = U a] U 3(✓, ⇡ ,⇡ ) • U hb Gate -The U hb has the matrix form, Therefore, this gate has the circuit representation, -The U a] has the matrix form, Therefore, this gate has the circuit representation,

C Helcity Amplitude Gate Decompositions
• U ai Gate -The U ai takes the form of a conventional U 3 rotation gate, Therefore, the circuit representation is just a qiskit U 3 rotation, -The U a] has the matrix form, Therefore, this gate has the circuit representation, = U a] U 3(✓, ⇡ ,⇡ ) • U hb Gate -The U hb has the matrix form, Therefore, this gate has the circuit representation, The U b has the matrix form, Therefore, this gate has the circuit representation,

C Helcity Amplitude Gate Decompositions
• U ai Gate -The U ai takes the form of a conventional U 3 rotation gate, Therefore, the circuit representation is just a qiskit U 3 rotation, -The U a] has the matrix form, Therefore, this gate has the circuit representation, • U hb Gate -The U hb has the matrix form, Therefore, this gate has the circuit representation, Therefore, this gate has the circuit representation, Therefore, this gate has the circuit representation,

D.1 1 ! 2 Amplitude Calculation
Here we present the detailed circuit diagram for the q ! gq process, shown in Fig. ??, which is implemented using the helicity amplitude gate decompositions outlined in Appendix ??. This demonstrates the simplification achieved by using fully contracted helicity amplitudes in the calculation, with a scalar product calculated on each qubit. The first slice in the circuit diagram calculates the positive helicity, controlling from the h register in the |1i state. The second slice controls from the h register in the |0i state and calculates the negative helicity process. A superposition of both the positive and negative processes, and thus the full amplitude, is achieved by implementing a Hadamard gate on the helicity qubit, h. Figure 10: Explicit circuit for q ! gq helicity amplitude calculation, using gate decompositions outlined in Sec. ??. The amplitude for the process is calculated on the q i qubits, which are controlled from the helicity register. The q i qubits are then measured by the quantum computer.
In Fig. ?? Here we present the detailed circuit diagram for the q → gq process, shown in Fig. 10, which is implemented using the helicity amplitude gate decompositions outlined in Appendix C. This demonstrates the simplification achieved by using fully contracted helicity amplitudes in the calculation, with a scalar product calculated on each qubit. The first slice in the circuit diagram (to the left of the vertical dashed line) calculates the positive helicity, controlling from the h register in the |1 state and the second slice (to the left of the vertical dashed line) controls from the h register in the |0 state and calculates the negative helicity process. A superposition of both the positive and negative processes, and thus the full amplitude, is achieved by implementing a Hadamard gate on the helicity qubit, h.
Therefore, this gate has the circuit representation,

D.1 1 ! 2 Amplitude Calculation
Here we present the detailed circuit diagram for the q ! gq process, shown in Fig. 10, which is implemented using the helicity amplitude gate decompositions outlined in Appendix C. This demonstrates the simplification achieved by using fully contracted helicity amplitudes in the calculation, with a scalar product calculated on each qubit. The first slice in the circuit diagram calculates the positive helicity, controlling from the h register in the |1i state. The second slice controls from the h register in the |0i state and calculates the negative helicity process. A superposition of both the positive and negative processes, and thus the full amplitude, is achieved by implementing a Hadamard gate on the helicity qubit, h. Figure 10: Explicit circuit for q ! gq helicity amplitude calculation, using gate decompositions outlined in Sec. C. The amplitude for the process is calculated on the q i qubits, which are controlled from the helicity register. The q i qubits are then measured by the quantum computer.
In Fig. 11, a comparison between the output of the IBM Q Santiago 5-qubit Quantum Computer [44] and the IBM Q 32-qubit Quantum Simulator [42] run with the Santiago device's noise profile is presented. The quantum computer has been run for 100 runs of 8192 shots, giving a total of 819,200 shots on the circuit and the simulator has been run for 10,000 shots. Here we see more reasonable agreement between the noisy simulator -22 - Figure 10: Detailed circuit diagram for the q → gq helicity amplitude calculation, using gate decompositions outlined in Sec. C. The amplitude for the process is calculated on the q i qubits, which are controlled from the helicity register. The q i qubits are then measured by the quantum computer.
In Fig. 11, a comparison between the output of the IBM Q Santiago 5-qubit Quantum Computer [43] and the IBM Q 32-qubit Quantum Simulator [41] run with the Santiago device's noise profile is presented. The quantum computer has been run for 100 runs of 8192 shots, giving a total of 819,200 shots on the circuit and the simulator has been run -23 -for 10,000 shots. Here we see a more reasonable agreement between the noisy simulator and the output from the quantum computer than the comparison to the perfect machine simulator from Sec. 2.2. However, it should be noted that the noise profile used in the noisy simulation is only an approximation of the real quantum computer errors. Noise profiles are built from a limited number of parameters and are based on average measurements of qubit errors [40]. As a result, some discrepancies are present between the quantum simulator with the Santiago device's noise profile and the real device. These can be attributed to noise not accounted for in the quantum computer.  Figure 11: Results for the q → gq helicity amplitude calculation. Comparison between the results from a quantum simulator with the relevant machine noise profile, the results from the Santiago quantum computer and the error mitigated results from the quantum computer.
While obtaining the results, we noticed a discrepancy between separate runs on the negative helicity case. By changing the qubit mapping in the measurement process, this was identified as a tuning error on the entangling gate between qubits 2 and 3 of the Santiago machine. This error was later fixed, and the results shown were obtained from runs with a fully functioning machine. To further validate the results, a series of runs were performed on the IBM Q Valencia 5-qubit machine [57], which has a Quantum Volume of 16. The results confirmed that the Santiago machine was working correctly.

E Detailed quantum circuit for collinear parton shower algorithm
The algorithm presented here follows a similar method to that outlined in [26]. In contrast, the algorithm does not introduce flavour mixing, but does simulate a vector boson with the possibility of boson splittings. As a result, the algorithm presented here includes tailored History and Update gates to deal with the increased splitting channels. Shown in Fig. 8, the circuit comprises of four tailored gate operations: Count, Emission, History, and Update gate. The particle identity is encoded using a 3-qubit base, and the following qubit combinations have been chosen for each type of particle: Using a 3-qubit base, it is possible to simulate 7 different types of particle and 1 null state. Therefore, the algorithm could be easily extended to accommodate more quark flavours if more qubits were available.

E.1 Count gate
The count gate comprises of three individual counting mechanisms for each type of particle, and is applied to each particle register individually. The algorithm utilises a series of NOT, controlled-NOT (CNOT ) and Toffoli (CCNOT ) gates to update the count registers, n i , depending on the type of particle represented in the particle register. Fig. 12 shows the counting mechanism for a gluon, controlling only from the gluon state outlined in E.1.

E Detailed Quantum Circuit for Collinear Parton Shower Algorithm
The algorithm presented here follows a similar method to that outlined by Bauer et al. [? ]. In contrast, the algorithm does not introduce flavour mixing, but does simulate a vector boson with the possibility of boson splittings. As a result, the algorithm presented here includes tailored History and Update gates to deal with the increased splitting channels. Shown in Fig. ??, the circuit comprises of four tailored gate operations: Count, Emission, History and Update gate. The particle identity is encoded using a three qubit base, and the following qubit combinations have been chosen for each type of particle: gluon quark antiquark Using a 3-qubit base, it is possible to simulate 7 di↵erent types of particle and 1 null state. Therefore, the algorithm could be easily extended to accommodate more quark flavours if more qubits were available.

E.1 Count Gate
The count gate is made up from three individual counting mechanisms for each type of particle, and is applied to each particle register individually. The algorithm utilises a series of NOT, controlled-NOT (CNOT ) and To↵oli (CCNOT ) gates to update the count registers, n i , depending on the type of particle represented in the particle register. Fig. ?? shows the counting mechanism for a gluon, controlling only from the gluon state outlined in ??.  Figure 12: Count Gate circuit decomposition for counting a gluon in the particle register. To complete the count gate, this is repeated for all other possible particle types by applying di↵erent combinations of NOT gates.
-24 - Figure 12: Count gate circuit decomposition for counting a gluon in the particle register. To complete the count gate, this is repeated for all other possible particle types by applying different combinations of NOT gates.
-25 - The total number of count registers, n i , used in the algorithm is 4. As the particle count registers are updated at the beginning of a step, the maximum number of gluons that can be present is 2, and the maximum number of quarks/antiquarks is 1. Therefore, for this algorithm, only 2 gluon count registers and 1 quark/antiquark count register are required. Ideally, one would have the same number of count registers for each particle type, which would be equal to the step number. However, due to the limited number of available qubits, this has not been possible here.

E.2 Emission gate
The emission gate implements the Sudakov factors from Eq. (3.5) by defining a U 3 rotation that can be applied to the emission register, e. The structure of this rotation takes the same form as that presented in [26], The total number of count registers, n i , used in the algorithm is 4. As the particle count registers are updated at the beginning of a step, the maximum number of gluons that can be present is 2 and the maximum number of quarks/antiquarks is 1. Therefore, for this algorithm only 2 gluon count registers and 1 quark/antiquark count register are required. Ideally, one would have the same number of count registers for each of the particle types, and this would be equal to the step number. However, due to the limitation on the number of available qubits, this has not been possible here.

E.2 Emission Gate
The emission gate implements the Sudakov factors from Eq. (??) by defining a U 3 rotation that can be applied to the emission register, e. The structure of this rotation takes the same form as that presented by Bauer et al. in Reference [? ],  Figure 13: Emission Gate for a single gluon in the first particle register. Here the U e is a U 3 rotation is used to implement the Sudakov Factors.
Similarly to the Count Gate, the Emission Gate is constructed from a series of NOT gates which determine the target state, and a series of CCNOT gates which implement the operation if the target state is present. Here, the emission is determined by controlling from -25 - Figure 13: Emission gate for a single gluon in the first particle register. Here the U e is a U 3 rotation is used to implement the Sudakov factors.
Similarly to the Count gate, the Emission gate is constructed from a series of NOT gates which determine the target state, and a series of CCNOT gates which implement the -26 -operation if the target state is present. Here, the emission is determined by controlling from the particle count gates. If the desired particles are present, then the emission rotation from Eq. (E.2) is applied to the emission register. As only one emission can occur in a single step, then only one emission qubit is needed per step.

E.3 History gate
The history gate is the most complicated implementation in the algorithm. This is largely due to the fact that a gluon can split to either a gluon pair, or a quark-antiquark pair. As a consequence this requires two calculations of splitting probabilities for a gluon, as outlined in Eq. (3.7). These probabilities are implemented by controlling from present particles and applying a rotation to the relevant history register; again taking a form similar to the one presented in [26], where P tot is defined as, P tot (z) = n g (P g→qq + P g→gg ) + n q P q→qg + n q P q→qg . (E.5) Here the non-splitting probabilities are used in the diagonal elements due to the definition of the qubit states outlined in Eq. (E.3).
the particle count gates. If the desired particles are present, then the emission rotation from Eq. (??) is applied to the emission register. As only one emission can occur in a single step, then only one emission qubit is needed per step.

E.3 History Gate
The history gate is the most complicated implementation in the algorithm. This is largely due to the fact that a gluon can split to either a gluon pair, or a quark-antiquark pair. As a consequence this requires two calculations of splitting probabilities for a gluon, as outlined in Eq. (??). These probabilities are implemented by controlling from present particles and applying a rotation to the relevant history register; again taking a form similar to the one presented by Bauer et al. [? ], where P tot is defined as, P tot (z) = n g (P g!qq + P g!gg ) + n q P q!qg + n q P q!qg . (E.5) Here the non-splitting probabilities are used in the diagonal elements due to the definition of the qubit states outlined in Eq. (??).  Figure 14: History Gate for a single gluon in the first step. Here the U h gate is a U 3 rotation used to implement the splitting probabilities.
The history gate used in this algorithm di↵ers from [? ], such that it controls from the particle registers and not the count registers. This is to reduce the number of count -26 - Figure 14: History gate for a single gluon in the first step. Here the U h gate is a U 3 rotation used to implement the splitting probabilities.
The history gate used in this algorithm differs from [26], such that it controls from the particle registers and not the count registers. This is to reduce the number of count -27 -registers needed in the algorithm. For this algorithm, the history rotation needs to know which particle is being considered and which particle register it is in so that the correct rotation can be applied to the correct history qubit. This could be done by increasing the count registers by one qubit per particle type every step, to have a count register for each possible particle in a specific particle register. However, this need for more counting qubits can be reduced by simply controlling from the particle registers themselves; this is shown for a gluon in Fig. 14. This can be done without impacting the rest of the algorithm, as long as there are enough count qubits to count the number of present particles correctly. This is because the emission gate does not need to know what specific particle is present in which particle register, just how many particles are present.
Once the particle content of the simulation has been assessed, the history rotations, U h , from Eq. (E.4) are applied to the relevant history registers. The first, labelled g 1 , is for the g → qq splitting, and the second, labelled g 2 , is for the g → gg. Note that both of these rotations could result in a splitting, and thus rotate the history qubit to the |1 state. Therefore, they are applied to different history registers. In general, one could have a condition on the second rotation, such that it is not applied if the first rotation yields a |1 , but in this algorithm, this condition was carried forward to the update gate, see Sec. E.4. As a result of these different splittings, the required number of qubits needed for the history register in each step is 3N , where N is the step number. Figure 14 shows the history gate for the first step, thus 3 qubits are needed for the history register: two for the gluon splittings, and one for the quark/antiquark splittings.

E.4 Update gate
The final gate in the algorithm is the update gate, which, if an emission has occurred, changes the particle content of the particle registers accordingly. Figure 15 shows the update gate from the first step, which is the simplest update gate in the algorithm. The circuit shown is sliced into individual updates. The first slice on the left shows the addition of a new gluon to the particle register. This is controlled from the quark/antiquark history gate, and so corresponds to the (−) q → (−) q g process. The middle slice in Fig. 15 shows the update of a gluon splitting to a quark-antiquark pair controlled from the g → qq history register. The first three CNOT gates of this slice put the particle registers into a state of two quarks. The update gate then utilises a controlled-Hadamard gate, putting the p j 1 qubit in a superposition of |0 and |1 states. The final gate of the slice controls from the history register, but also controls on a |0 state on the p j 1 . At the point of measurement, if p j 1 is measured as a |0 state, then the p k register represents an antiquark and p j represents a quark. If p j 1 is measured in the |1 state, then the p k register represents a quark and the p j register represents an antiquark. In a sense, this controlled-Hadamard gate and subsequent CCNOT gate put the particle registers p j and p k into a superposition of quark-antiquark and antiquark-quark states.
The final slice on the right of the circuit diagram in Fig. 15 shows the update gate corresponding to the g → gg process. This simple update changes the p j 0 qubit to a |1 state controlled from the history register, like the quark/antiquark update gate. However, this is where the algorithm adds a preference to g → qq process over the g → gg process.
-28 - Figure 15: Update gate for the first step of the algorithm. Each slice is a different update mechanism: far left slice updates q → qg splittings, centre slice updates g → qq and the far right slice updates g → gg.
The CCNOT gate for the final slice in Fig. 15 also controls from a |0 state on the g → qq history qubit. Therefore a gluon can only split to a gluon pair if the history gate for a gluon splitting to a quark-antiquark pair is in the |0 state. This is an acceptable approximation because the splitting probabilities for g → qq are a lot less than for g → gg. Consequently, there is only a small probability that they are both in the |1 state at any one time. However, it is possible that this may be a limitation in comparison to current classical parton shower algorithms provided by packages such as Pythia [1], Herwig [2], and Sherpa [3], as these give more complex weightings to the different splitting channels.