Quantum clustering and jet reconstruction at the LHC

Clustering is one of the most frequent problems in many domains, in particular in particle physics, where jet reconstruction is central to experimental analyses. Jet clustering at CERN's Large Hadron Collider (LHC) is computationally expensive, and the difficulty of this task will increase with the upcoming High-Luminosity LHC (HL-LHC). In this paper, we study the case in which quantum computing algorithms might improve jet clustering by considering two novel quantum algorithms which may speed up the classical jet clustering algorithms. The first one is a quantum subroutine to compute a Minkowski-based distance between two data points, whereas the second one consists of a quantum circuit to find the maximum in a list of unsorted data. The latter algorithm could be of value beyond particle physics, for instance in statistics. When one or both of these algorithms are implemented into the classical versions of well-known clustering algorithms (K-means, Affinity Propagation and $k_T$-jet) we obtain efficiencies comparable to those of their classical counterparts. Moreover, an exponential speed-up in data dimensionality and data length could be achieved in the first two algorithms when the distance algorithm or the maximum searching algorithm is applied.


Introduction
Quantum computing devices, which are based on the laws of quantum mechanics, offer the possibility to efficiently solve specific problems that become very complex or even unreachable for classical computers since they scale either exponentially or super-polynomially. Algorithms used in quantum computers [1] exploit the quantum principles of superposition and entanglement to achieve a speed-up over their classical counterparts. Two examples of these quantum algorithms are the well-known cases of Grover's database querying [2] and Shor's factoring of integers into primes [3]. These two quantum methods showed, for the first time in the 1990s, a clear potential advantage over their corresponding classical analogues. In recent years, we have witnessed an impressively fast development of quantum computing algorithms, ranging from optimization problems such as portfolios in fintech [4], applications in quantum chemistry [5], nuclear physics and Monte Carlo simulation [6][7][8], and combinatorial optimization [9], to state diagonalization [10,11].
In the present paper we address the problem of clustering and jet reconstruction from collision data, which is a nontrivial and computationally expensive task, as it often involves performing optimizations over potentially large numbers of final-state particles. To give a rough idea of how demanding this activity is, the state-of-the-art algorithm in jet clustering needs a few months to cluster all the particles in the data of interest produced at the LHC in just one year [41]. Moreover, with the upcoming HL-LHC, the number of events will be up to an order of magnitude larger than in earlier runs [42], and the pile-up (simultaneous proton-proton collisions per bunch crossing) will also increase by a factor of 5 [43]. Therefore, the state-of-the-art algorithm will require roughly 50 times the computational time needed now, that is, a few tens of years to process the data of interest generated in just one year. This shows the need to develop fast and effective jet clustering algorithms.
With this in mind, we consider the possibility of using quantum algorithms to speed up jet identification. Here we focus on three well-known classical algorithms: the K-means clustering [44,45], the Affinity Propagation (AP) algorithm [46] and the $k_T$-jet clustering method in all its variants [47][48][49][50][51]. We propose the corresponding quantum versions of these algorithms: quantum K-means clustering, quantum AP algorithm and quantum $k_T$-based algorithms.
Clustering is one of the most frequent classic problems in machine learning and computational geometry. It is a major data analysis tool used in such domains as marketing research, data mining, bioinformatics, image processing, pattern recognition and also in HEP. The popular K-means formulation [44,45], which is a method of vector quantization originally proposed for signal processing, involves the partition of n observations into K clusters in which each observation belongs to the cluster with the nearest mean (cluster center or cluster centroid), serving as a prototype of the cluster. Solving this problem exactly is NP-hard (Non-deterministic Polynomial-time hardness), even with just two clusters [52]. Forty years ago, Lloyd [53] proposed a local search solution that is still very widely used today. Usually referred to simply as K-means, Lloyd's algorithm begins with K arbitrary centers, typically chosen uniformly at random from the data points. Each point is then assigned to the nearest center, and each center is recomputed as the center of all points assigned to it. These two steps (assignment and center calculation) are repeated until the process stabilizes.
The improved version of the K-means method, the K-means++ algorithm [54], initializes the K-means algorithm by choosing random starting centers with very specific probabilities. This strategy outperforms K-means in terms of both accuracy and speed, often by a substantial margin [54]. K-means is a method of cluster analysis using a pre-specified number of clusters. It requires advance (a priori) knowledge of K and belongs to the group of the so-called partitional clustering algorithms. The classical K-means algorithm has already been used in high-energy physics in Refs. [55][56][57][58]. For example, in Ref. [55], the use of K-means led to 25% and 40% improvements of the top quark and W boson mass resolution, respectively, compared to the $k_T$ (Durham) algorithm, and reduced the systematic uncertainty in the measured peak positions. As a drawback, K-means was roughly three times slower than the Durham algorithm, hence the interest in exploring potential speed-ups. In Ref. [56], the tagging performance of N-subjettiness for boosted top quarks was improved through minimization using a variant of K-means. The XCone jet algorithm introduced in Ref. [57] is closely related to the traditional K-means and its variants. Finally, K-means has been used in Ref. [58] to identify minijets at low $p_T$.
The AP algorithm is a clustering method that identifies representative examples (exemplars) within a given dataset by exchanging messages between all data points. Points are then grouped with their most representative exemplar to give the final set of clusters. The AP algorithm has been successfully applied to a wide range of problems including face recognition, gene identification, putative exons using microarray data [59][60][61] and astrophysics [62]. In high-energy physics, it has been used to cluster replicas of parton densities [63]. In Ref. [46], it was shown that AP might be faster and more accurate than the K-means [44,45] clustering algorithm in solving certain problems. The AP algorithm is solid and well understood, and the number of clusters does not need to be pre-specified. Among its disadvantages, its high time complexity makes it unsuitable for very large datasets, and the clustering result is typically sensitive to the parameters involved in the AP algorithm. Our motivation for using it for jet clustering comes from the fact that it does not need the number of clusters to be defined beforehand.
Hierarchical clustering, also known as hierarchical cluster analysis (HCA), is a method of cluster analysis that seeks to build a hierarchy of clusters without having an a priori fixed number of clusters. The $k_T$-based algorithms [64] belong to the hierarchical category, which needs a linkage function that defines the distance between any two subsets (and relies on the base distance between elements). They are the most widely used jet clustering algorithms in the LHC experiments.
The quantum K-means clustering algorithm was presented in Refs. [19,65] for HEP. An earlier study of the quantum K-means can be found in Ref. [66]. Both implementations make use of the Euclidean distance to perform the clustering of particles. In this paper, we present for the first time a version of the quantum K-means clustering algorithm which is based on the definition of a Minkowskian distance at the quantum level. The quantum version of the AP algorithm uses the invariant sum squared as a metric in the similarity matrix and calculates it through a quantum subroutine, with a procedure similar to that of the quantum K-means implementation. Regarding the quantum $k_T$-based algorithms, to our knowledge this is the first time they have been presented in the literature. In addition, the search for the maximum distance used in our implementation is performed with a new quantum algorithm. This new quantum method is presented in a general way, and we comment on its reach regarding other areas of interest. Beyond the specific application to jet clustering, the quantum algorithms presented in this paper are of interest to the particle physics and quantum computing communities.
This paper is organized as follows. In Section 2 we introduce our notation and we define the Euclidean and Minkowskian quantum distances. In Section 3 we present our new quantum algorithm to search for the maximum in a set of a given number of elements. We consider the quantum version of the K-means clustering, AP and $k_T$-based algorithms in Section 4. In Section 5 we present our results considering the quantum simulations of these algorithms and a proof-of-concept implementation with Gaussian datasets as well as with simulated LHC physical events. We also compare their performance in detail. We discuss their differences and conceptual similarities and we compare them with their classical counterparts. A brief summary of our results is presented in Section 6.

Quantum distances
In quantum computing, it is essential to have the ability to measure quantum entanglement between two states, as in many cases it determines the possibility of obtaining a quantum advantage [67]. We rely on the SwapTest method [68] (see Appendix A for more details) in order to probe the entanglement between two given states. The definition of the quantum distances (Euclidean distance or Minkowski invariant sum squared) presented in this section makes use of the SwapTest procedure.

Euclidean quantum distance
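We start by considering $N$ data points or vectors in a Euclidean $d$-dimensional space, $\{\mathbf{x}_i\}_{i=1,\ldots,N}$, which are encoded as quantum states of the form
$$|\mathbf{x}_i\rangle = \frac{1}{|\mathbf{x}_i|} \sum_{\mu=1}^{d} x_{i,\mu}\, |\mu\rangle , \tag{1}$$
where $|\mathbf{x}_i| = \sqrt{\sum_{\mu=1}^{d} (x_{i,\mu})^2}$ is the modulus of the vector $\mathbf{x}_i$, and $x_{i,\mu}$ are its components. Each vector requires $n \geq \log_2 d$ qubits to be encoded, e.g. for $d = 3$ we need two entangled qubits, one of whose basis states remains free and is not used. The Euclidean distance between two vectors $\mathbf{x}_i$ and $\mathbf{x}_j$ is defined classically as
$$d_E^{(C)}(\mathbf{x}_i, \mathbf{x}_j) = |\mathbf{x}_i - \mathbf{x}_j| = \sqrt{\sum_{\mu=1}^{d} (x_{i,\mu} - x_{j,\mu})^2} , \tag{2}$$
where the subscript $E$ stands for Euclidean and the superscript $C$ denotes that it corresponds to the classical version. The quantum analogue of Eq. (2) is obtained by using the controlled SwapTest method. In order to define the Euclidean quantum distance between the $d$-dimensional vectors $\mathbf{x}_i$ and $\mathbf{x}_j$, we entangle the corresponding associated quantum states $|\mathbf{x}_i\rangle$ and $|\mathbf{x}_j\rangle$, and define the following subsidiary states
$$|\psi_1\rangle = \frac{1}{\sqrt{2}} \left( |0, \mathbf{x}_i\rangle + |1, \mathbf{x}_j\rangle \right) , \qquad |\psi_2\rangle = \frac{1}{\sqrt{Z_{ij}}} \left( |\mathbf{x}_i|\, |0\rangle - |\mathbf{x}_j|\, |1\rangle \right) , \tag{3}$$
where $Z_{ij} = |\mathbf{x}_i|^2 + |\mathbf{x}_j|^2$ is a normalization factor and $|0\rangle$ and $|1\rangle$ are the states of an ancillary qubit. It is also convenient to define the swapped state
$$|\psi_1'\rangle = \frac{1}{\sqrt{2}} \left( |\mathbf{x}_i, 0\rangle + |\mathbf{x}_j, 1\rangle \right) . \tag{4}$$
The inner products between the quantum states defined in Eqs. (3) and (4) are written as follows
$$\langle \psi_2 | \psi_1 \rangle = \langle \psi_2 | \psi_1' \rangle = \frac{1}{\sqrt{2 Z_{ij}}} \left( |\mathbf{x}_i|\, |\mathbf{x}_i\rangle - |\mathbf{x}_j|\, |\mathbf{x}_j\rangle \right) , \tag{5}$$
from where
$$|\langle \psi_1 | \psi_2 \rangle|^2 = \frac{1}{2 Z_{ij}}\, |\mathbf{x}_i - \mathbf{x}_j|^2 . \tag{6}$$
Therefore (see Eq. (24) in Appendix A), the Euclidean quantum distance is
$$d_E^{(Q)}(\mathbf{x}_i, \mathbf{x}_j) = \sqrt{2 Z_{ij} \left( 2 P_{\Psi_3}(|0\rangle) - 1 \right)} , \tag{7}$$
where the superscript $Q$ refers to the quantum version of the distance $d_E$, and the subscript $\Psi_3$ in the probability $P$ means that it is the probability of measuring the ancillary qubit in the state $|0\rangle$ in the last of the three steps of the SwapTest procedure.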
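The relation between the SwapTest probability and the Euclidean distance can be cross-checked classically. The following is a minimal NumPy sketch, not the Qiskit circuit used in this work. It assumes the standard SwapTest relation $P(|0\rangle) = \frac{1}{2} + \frac{1}{2}|\langle\psi_1|\psi_2\rangle|^2$, the subsidiary states $|\psi_1\rangle = (|0,\mathbf{x}_i\rangle + |1,\mathbf{x}_j\rangle)/\sqrt{2}$ and $|\psi_2\rangle = (|\mathbf{x}_i||0\rangle - |\mathbf{x}_j||1\rangle)/\sqrt{Z_{ij}}$, non-zero input vectors, and exact (infinite-shot) probabilities. The function names are hypothetical.

```python
import numpy as np

def swap_test_probability(xi, xj, sign=-1.0):
    """Exact P(|0>) of the SwapTest for the states psi_1, psi_2 (illustrative)."""
    xi = np.asarray(xi, dtype=float)
    xj = np.asarray(xj, dtype=float)
    ni, nj = np.linalg.norm(xi), np.linalg.norm(xj)
    z_ij = ni**2 + nj**2
    # psi_1 as a (2, d) array: row 0 <-> ancilla |0>, row 1 <-> ancilla |1>.
    psi1 = np.vstack([xi / ni, xj / nj]) / np.sqrt(2.0)
    # psi_2 lives on the ancilla only; sign=-1 gives the Euclidean case.
    psi2 = np.array([ni, sign * nj]) / np.sqrt(z_ij)
    # Partial inner product over the ancilla leaves a vector in data space.
    v = psi2 @ psi1
    overlap_sq = float(v @ v)
    return 0.5 + 0.5 * overlap_sq, z_ij

def euclidean_quantum_distance(xi, xj):
    """Distance recovered from the SwapTest probability."""
    p0, z_ij = swap_test_probability(xi, xj)
    # Guard against tiny negative values caused by rounding.
    return np.sqrt(max(2.0 * z_ij * (2.0 * p0 - 1.0), 0.0))
```

On a real device $P(|0\rangle)$ would be estimated from a finite number of shots, so the recovered distance would carry a statistical uncertainty that this sketch does not model.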

Quantum invariant sum squared in Minkowski space
Vectors in high-energy physics are defined in a four-dimensional space-time with Minkowski metric. They have the form
$$x_i = (x_{0,i}, \mathbf{x}_i) , \tag{8}$$
where $x_{0,i}$ is the temporal component and $\mathbf{x}_i$ represents the three spatial components. In the following, we assume that the dimension of the space-time is $d$, where $d - 1$ is the number of spatial components. We shall define the analogue of the Euclidean classical distance in the Minkowski space corresponding to the invariant sum squared $s_{ij}$, which is commonly called invariant mass squared when vectors are particle four-momenta,
$$s_{ij} = (x_{0,i} + x_{0,j})^2 - |\mathbf{x}_i + \mathbf{x}_j|^2 . \tag{9}$$
This quantity, which is Lorentz invariant, can be used as a test distance to measure similarity between particle momenta. It is also equivalent to the distance used in some of the traditional jet-clustering algorithms at $e^+e^-$ colliders [69][70][71]. To compute the Minkowski-type distance through a quantum algorithm, the SwapTest subroutine (presented in Appendix A) has to be applied twice: once for the spatial and once for the temporal components.
The spatial distance is computed through the procedure explained in the previous section with a slight modification with respect to Eq. (5) (change of sign in the term proportional to qubit $|1\rangle$), whereas the temporal distance is computed as a result of the overlap of the following states:
$$|\varphi_1\rangle = \frac{1}{\sqrt{2}} \left( |0\rangle + |1\rangle \right) , \qquad |\varphi_2\rangle = \frac{1}{\sqrt{Z_0}} \left( x_{0,i}\, |0\rangle + x_{0,j}\, |1\rangle \right) , \tag{10}$$
where $Z_0 = x_{0,i}^2 + x_{0,j}^2$. Then, applying the SwapTest to these states one gets the relation:
$$P_{\Phi_3}(|0\rangle) = \frac{1}{2} + \frac{1}{2}\, |\langle \varphi_1 | \varphi_2 \rangle|^2 , \tag{11}$$
where the subscript $\Phi_3$ labels the last step of the SwapTest applied to the temporal states, and the overlap $|\langle \varphi_1 | \varphi_2 \rangle|^2$ is trivially given by
$$|\langle \varphi_1 | \varphi_2 \rangle|^2 = \frac{(x_{0,i} + x_{0,j})^2}{2 Z_0} . \tag{12}$$
Therefore:
$$(x_{0,i} + x_{0,j})^2 = 2 Z_0 \left( 2 P_{\Phi_3}(|0\rangle) - 1 \right) . \tag{13}$$
At this point, the quantum version of the invariant sum squared follows from the combination of results from Eq. (7) and Eq. (13):
$$s_{ij}^{(Q)} = 2 \left[ Z_0 \left( 2 P_{\Phi_3}(|0\rangle) - 1 \right) - Z_{ij} \left( 2 P_{\Psi_3}(|0\rangle) - 1 \right) \right] . \tag{14}$$
The quantum circuit used to implement the invariant sum-squared distance is shown in Fig. 1.
In the first three wires, the SwapTest is applied to the spatial components, where we assume that the states $\psi_1$, $\psi_2$ have been loaded from a quantum Random Access Memory (qRAM) in $O(\log(d - 1))$, since the state $\psi_1$ is encoded in $\log_2(d - 1)$ qubits. On the other hand, from the fourth wire onward, the SwapTest is applied to the temporal components. In this case, it takes $O(1)$, since we only have single-qubit states.
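The combination of the two SwapTest results can be illustrated with a short classical emulation. This is a sketch under stated assumptions rather than the circuit of Fig. 1: it uses exact probabilities $P(|0\rangle) = \frac{1}{2} + \frac{1}{2}|\langle\cdot|\cdot\rangle|^2$, a relative plus sign in the spatial ancilla state so that the overlap encodes $|\mathbf{x}_i + \mathbf{x}_j|^2$, and non-zero spatial parts. All function names are hypothetical.

```python
import numpy as np

def p0_spatial(pi, pj):
    """Exact SwapTest P(|0>) for the spatial parts (relative plus sign)."""
    ni, nj = np.linalg.norm(pi), np.linalg.norm(pj)
    z_ij = ni**2 + nj**2
    psi1 = np.vstack([pi / ni, pj / nj]) / np.sqrt(2.0)  # rows: ancilla 0, 1
    psi2 = np.array([ni, nj]) / np.sqrt(z_ij)            # plus sign here
    v = psi2 @ psi1
    return 0.5 + 0.5 * float(v @ v), z_ij

def p0_temporal(x0i, x0j):
    """Exact SwapTest P(|0>) for the single-qubit temporal states."""
    z_0 = x0i**2 + x0j**2
    phi1 = np.array([1.0, 1.0]) / np.sqrt(2.0)
    phi2 = np.array([x0i, x0j]) / np.sqrt(z_0)
    return 0.5 + 0.5 * float(phi1 @ phi2) ** 2, z_0

def invariant_sum_squared(xi, xj):
    """s_ij = (x0_i + x0_j)^2 - |x_i + x_j|^2 from the two probabilities."""
    xi = np.asarray(xi, dtype=float)
    xj = np.asarray(xj, dtype=float)
    p_t, z_0 = p0_temporal(xi[0], xj[0])
    p_s, z_ij = p0_spatial(xi[1:], xj[1:])
    return 2.0 * (z_0 * (2.0 * p_t - 1.0) - z_ij * (2.0 * p_s - 1.0))
```

For two massless four-momenta $(5, 3, 4, 0)$ and $(3, 1, 2, 2)$ the classical value is $8^2 - |(4, 6, 2)|^2 = 64 - 56 = 8$, which the emulation reproduces up to rounding.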

Quantum maximum search by amplitude encoding
Finding a particular member belonging to a dataset is a recurring problem in data analysis, and a computationally very expensive one. However, quantum computing offers suitable tools to solve data queries in a shorter computational time. In particular, the quadratic speed-up exhibited by Grover's algorithm [2] is well known. In this paper, we present a considerably simpler algorithm that is used exclusively to find the maximum in a list of values. This algorithm, although very elementary, is sufficiently accurate for the applications that we will present in Sections 5.1 and 5.3. To our knowledge, this is the first time it has been presented in the literature.
Let $L[0, \ldots, N - 1]$ be an unsorted list of $N$ items. Solving the maximum searching problem is to find the index $y$ such that $L[y]$ is the maximum. The quantum algorithm to solve that problem using amplitude encoding proceeds in two steps:
1. The list of $N$ elements is encoded into a $\log_2(N)$-qubit state as follows:
$$|\Psi\rangle = \frac{1}{\sqrt{L_{\rm sum}}} \sum_{j=0}^{N-1} L[j]\, |j\rangle , \tag{15}$$
where $L_{\rm sum} = \sum_{j=0}^{N-1} L[j]^2$ is a normalization constant. This amplitude encoding is achieved using qRAM.
2. The final state is measured. This step is rerun several times to reduce the statistical uncertainty. Once done, the most repeated state gives us the maximum.
The graphical representation of the algorithm is shown in Fig. 2. The bottleneck of this procedure lies in encoding data into a quantum state. Assuming data is stored in a qRAM, as would be the case on a true universal quantum computer, encoding takes $O(\log_2(N))$ steps [72][73][74][75][76][77][78][79]. The corresponding classical algorithms typically used to obtain the maximum or minimum of an unsorted list of $N$ items are of order $O(N)$. Therefore, with the assumptions considered, the improvement introduced by this quantum algorithm is exponential.
The well-known quantum minimum searching algorithm proposed by Dürr and Høyer [80] is $O(\sqrt{N})$. After their theoretical paper [80], the algorithm was studied and implemented in a quantum simulator (see Ref. [81]). In summary, previous implementations [81] of the Dürr and Høyer algorithm suggest that it could be improved, given the excessive number of qubits needed to implement the method, the impracticality of hard-coding a different oracle for each element, the large number of shots required and (in some cases) the poor performance obtained. Addressing these challenges is the aim of the new quantum maximum searching algorithm by amplitude encoding through qRAM presented here.
Nevertheless, the new algorithm presented in this paper and the Dürr and Høyer quantum method share common features that could lead to misidentification of the respective absolute maximum and minimum. Such cases, in which the list typically has a very low standard deviation (or the largest/smallest values are very close to each other), could be difficult because the probabilities of measuring several candidates would be almost identical.
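The two steps of the maximum search can be emulated classically by sampling from the Born-rule probabilities $L[j]^2 / L_{\rm sum}$. The sketch below is only an illustration: it replaces the qRAM loading and the measurement by NumPy random sampling, it assumes non-negative list entries (a large negative entry would also have a large probability), and the function name, `shots` and `seed` parameters are hypothetical.

```python
import numpy as np

def quantum_max_search(values, shots=1024, seed=None):
    """Emulate the amplitude-encoding maximum search by sampling (illustrative)."""
    values = np.asarray(values, dtype=float)
    # Step 1: amplitudes L[j] / sqrt(L_sum) give probabilities L[j]^2 / L_sum.
    probs = values**2 / np.sum(values**2)
    # Step 2: repeated measurements of the computational basis state.
    rng = np.random.default_rng(seed)
    outcomes = rng.choice(len(values), size=shots, p=probs)
    counts = np.bincount(outcomes, minlength=len(values))
    # The most frequently measured index is taken as the position of the maximum.
    return int(np.argmax(counts)), counts
```

For the list `[1, 2, 3, 10]` the maximum is measured with probability $100/114 \approx 0.88$ per shot, so a modest number of shots suffices. When the two largest entries are close, for example `[9, 10]` with probabilities $81/181$ and $100/181$, many more shots are needed to separate them.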
Regarding the practical implementation of the quantum algorithm presented in this paper, the results shown in Section 5 reveal that these potential difficulties do not manifest strongly in the context of jet clustering.
Beyond the jet clustering procedure in HEP, there are other fields where our quantum algorithm could be of value, for instance the so-called Extreme Value Theory (EVT) [82]. According to Gumbel (1958) [83], this field studies the probability distribution of the data by focusing on the outliers, with the ultimate goal of being able to predict them in the future. It is precisely in this estimation of the extreme values where our algorithm could be useful, since predictive models require historical data to be analysed and therefore extreme values have to be searched for in large data lists. This means that our algorithm could be applied to the statistical analysis of extreme data in fields including actuarial and financial sciences, meteorology, material sciences, engineering, environmental sciences, climatology, geology, hydrology and highway traffic analysis [84][85][86].

Quantum clustering algorithms

K-means algorithm
K-means is an unsupervised machine learning algorithm that classifies the elements of a dataset into K groups called clusters [44,45]. The data points within each cluster have to be as similar (near) as possible, whereas the clusters themselves have to be as different (far) as possible from each other. The input for this algorithm is a set of $N$ data points or vectors in $d$ dimensions as well as the number of clusters $K$, with $K \leq N$, and its output is a set of $K$ centroids, calculated by averaging the position of the data points corresponding to each group, thus defining $K$ clusters. The flow chart of this algorithm is the following:
1. K initial centroids within the data points are generated. They can be generated randomly or through a specific method such as K-means++ [54].
for obtaining the minimum distance of each data point with respect to the K centroids, which is achieved by Dürr and Høyer's algorithm [80].
In this paper, we focus on a new quantum version of the K-means algorithm, where the distances are computed with a quantum subroutine and the minimum distance of each data point to the centroids is obtained with the quantum maximum searching algorithm explained in Section 3. Other quantum versions of the K-means algorithm have been studied in Refs. [19,65] and [66], where a Euclidean distance was used to separate the particles from each other. In this paper, we analyse for the first time an implementation of the K-means algorithm with a Minkowski-type quantum distance, as defined in Section 2.2.
The time complexity of this algorithm is estimated by analysing the time complexity of its components. The distances that have to be calculated are $O(N)$, the search of a minimum distance for every data point with respect to the centroids would be $O(\log K)$, and the calculation of each distance itself would require $O(\log(d - 1))$ qubits assuming the data is stored in a qRAM. This results in a speed-up from $O(N K d)$ in the classical version to $O(N \log K \log(d - 1))$ in our quantum version. Therefore, an exponential speed-up in the number of clusters and in the vector dimensionality would be achieved. A quantum simulation of the quantum K-means algorithm is presented in Section 5.1.
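The structure of the hybrid K-means loop can be sketched classically. In the minimal NumPy example below, the invariant sum squared and the nearest-centroid search are evaluated classically; these are the two places where the quantum distance subroutine and the quantum maximum search (applied to inverse distances) would be plugged in. The sketch assumes massless, on-shell four-momenta and Lloyd-style iterations; the function names and the choice of initial centroids by index are hypothetical and made only for reproducibility.

```python
import numpy as np

def four_momenta(p3):
    """Massless, on-shell four-momenta (E, px, py, pz) from 3-momenta."""
    p3 = np.asarray(p3, dtype=float)
    energy = np.linalg.norm(p3, axis=1, keepdims=True)
    return np.hstack([energy, p3])

def invariant_sum_squared(points, centroids):
    """s = (E_i + E_c)^2 - |p_i + p_c|^2 for every point/centroid pair."""
    total = points[:, None, :] + centroids[None, :, :]
    return total[..., 0] ** 2 - np.sum(total[..., 1:] ** 2, axis=-1)

def kmeans_minkowski(points, init_idx, n_iter=5):
    """Lloyd-style K-means with a Minkowski-type distance (classical emulation)."""
    centroids = points[list(init_idx)].copy()
    for _ in range(n_iter):
        # Assignment step: nearest centroid (quantum distance + search here).
        labels = np.argmin(invariant_sum_squared(points, centroids), axis=1)
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = points[labels == k]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[k] = members.mean(axis=0)
    return labels, centroids
```

Random or K-means++ seeding can be used instead of fixed indices by passing a different `init_idx`.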

Affinity Propagation algorithm
Although K-means is a successful algorithm capable of clustering data in a satisfactory manner, it needs the number of clusters K to be defined beforehand, which is not typically the case in HEP applications. The Affinity Propagation (AP) algorithm [46], which is an unsupervised machine learning algorithm, does not need the number of clusters as an input. AP only takes as input the data points that have to be classified. Let $x_1, \ldots, x_N$ be a set of data points. A function $s$ that quantifies the similarity between points is then computed, in such a way that $s(i, j) \geq s(i, k)$ if and only if $x_i$ is more similar to $x_j$ than to $x_k$. The most common metric to measure the similarity is the negative squared distance between the two points being compared: $s(i, j) = -|x_i - x_j|^2$. The diagonal $s(i, i)$ of the matrix $s$ is especially relevant since it stores values referred to as "preferences", which are related to how likely a particular instance is to become an exemplar, i.e., a cluster. Most metrics give $s(i, i) = 0$ for all $i \leq N$, although it can be different from 0. Hence, on the first iteration, every element $s(i, i)$ is set to the same value, which is typically the median similarity of all pairs of inputs. Next, two matrices are calculated that are related to the concept of message exchanging between data points [46]. First, there is the responsibility matrix $R$. This matrix contains the values $r(i, k)$ that quantify the suitability of point $k$ to serve as the exemplar for point $i$, compared to other candidate exemplars for $i$. Then comes the availability matrix $A$, whose elements $a(i, k)$ reflect how appropriate it would be for point $i$ to select point $k$ as its exemplar, relative to the preferences of other points for $k$ as an exemplar. As they have been described, both matrices could be viewed as log-probability ratios. Then, the AP flow chart reads:
1. The matrices $R$ and $A$ are initialized to zero.
4. Steps 2 and 3 are repeated until either the cluster boundaries remain unchanged for several iterations, or a predetermined number of iterations is reached.
Once convergence has been reached, the exemplars, i.e., the clusters, are obtained from the final matrices as those points for which $r(i, i) + a(i, i) > 0$. This algorithm takes $O(N^2)$ steps to fill the similarity matrix, and computing each element takes $O(d)$, since a distance between two $d$-dimensional points has to be calculated. Moreover, steps 2 and 3 are repeated a number $T$ of times, so the final time complexity of this algorithm is $O(N^2 T d)$.
Here, a quantum (hybrid) algorithm is presented which uses the invariant sum squared as a metric in the similarity matrix and calculates it through a quantum subroutine, as in the K-means algorithm described in subsection 4.1. A speed-up would then be achieved, since computing the distances only requires $O(\log(d - 1))$ qubits. So, the quantum AP algorithm, which is as far as we know completely original, would have a time complexity of $O(N^2 T \log(d - 1))$.
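For orientation, the message-passing loop can be sketched as follows. This is not the implementation used in this work: it assumes the standard responsibility and availability updates of Ref. [46], a damping factor of 0.5, a fixed number of iterations instead of a convergence check, and a Euclidean negative squared distance as a stand-in for the quantum invariant sum squared. All function and parameter names are hypothetical.

```python
import numpy as np

def similarity_matrix(points):
    """Negative squared distances, with the median as preference on the diagonal."""
    diff = points[:, None, :] - points[None, :, :]
    S = -np.sum(diff**2, axis=-1)
    off_diag = S[~np.eye(len(points), dtype=bool)]
    np.fill_diagonal(S, np.median(off_diag))
    return S

def affinity_propagation(S, n_iter=200, damping=0.5):
    """Standard AP updates (assumed from Ref. [46]); needs at least 2 points."""
    S = np.array(S, dtype=float)
    n = S.shape[0]
    R = np.zeros((n, n))
    A = np.zeros((n, n))
    idx = np.arange(n)
    for _ in range(n_iter):
        # r(i,k) = s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
        AS = A + S
        best = np.argmax(AS, axis=1)
        first = AS[idx, best]
        AS[idx, best] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[idx, best] = S[idx, best] - second
        R = damping * R + (1.0 - damping) * R_new
        # a(i,k) = min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        # a(k,k) = sum_{i' != k} max(0, r(i',k))
        Rp = np.maximum(R, 0.0)
        Rp[idx, idx] = R[idx, idx]
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new[idx, idx].copy()
        A_new = np.minimum(A_new, 0.0)
        A_new[idx, idx] = diag
        A = damping * A + (1.0 - damping) * A_new
    exemplars = np.flatnonzero(np.diag(R) + np.diag(A) > 0)
    if exemplars.size == 0:
        return exemplars, np.full(n, -1)
    # Each point joins the exemplar it is most similar to.
    labels = exemplars[np.argmax(S[:, exemplars], axis=1)]
    labels[exemplars] = exemplars
    return exemplars, labels
```

In the hybrid version described above, only `similarity_matrix` would change: each entry would be obtained from the quantum subroutine for the invariant sum squared.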

Generalised k T -jet algorithm
The inclusive variant of the generalised $k_T$-jet algorithm is formulated as follows [64]:
1. For each pair of partons $i$, $j$ the following distance is computed:
$$d_{ij} = \min\left( p_{T,i}^{2p}, p_{T,j}^{2p} \right) \frac{\Delta R_{ij}^2}{R^2} , \tag{19}$$
with $\Delta R_{ij}^2 = (y_i - y_j)^2 + (\phi_i - \phi_j)^2$, where $p_{T,i}$, $y_i$ and $\phi_i$ are the transverse momentum (with respect to the beam direction), rapidity and azimuth of particle $i$. $R$ is a jet-radius parameter usually taken of order 1. For each particle $i$ the beam distance is $d_{iB} = p_{T,i}^{2p}$.
2. Find the minimum $d_{\min}$ amongst all the distances $d_{ij}$, $d_{iB}$. If $d_{\min}$ is a $d_{ij}$, the particles $i$ and $j$ are merged into a single particle summing their four-momenta (this is the E-scheme recombination); if $d_{\min}$ is a $d_{iB}$ then the particle $i$ is declared as a final jet and it is removed from the list.
3. Repeat from step 1 until there are no particles left.
Note that for specific values of $p$ in Eq. (19), the generalised $k_T$ algorithm reduces to the algorithms $k_T$ ($p = 1$), Cambridge/Aachen ($p = 0$) and anti-$k_T$ ($p = -1$). As stated in Ref. [88], this classical version of the $k_T$-jet algorithm is $O(N^3)$, since the bottleneck of the algorithm is scanning the $O(N^2)$ table with all the distances $d_{ij}$, $d_{iB}$, and this has to be done $N$ times. Nevertheless, the FastJet algorithm is able to reduce the complexity to $O(N^2)$. This is achieved by identifying each particle's geometrical nearest neighbour, so that it is not necessary to construct a size-$N^2$ table of $d_{ij}$, but only the size-$N$ array $d_{iG_i}$, where $G_i$ is $i$'s geometrical nearest neighbour. Furthermore, the FastJet algorithm can be optimized further using the so-called Voronoi diagrams, achieving a reduction in the time complexity from $O(N^2)$ to $O(N \log N)$.
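The three steps of the inclusive generalised $k_T$ algorithm can be written as a naive classical loop. The sketch below is the $O(N^3)$ textbook procedure, not FastJet and not the quantum version. It assumes four-momenta stored as $(E, p_x, p_y, p_z)$ with non-zero transverse momentum, E-scheme recombination, and an azimuthal difference wrapped into $[0, \pi]$; the function names are hypothetical.

```python
import numpy as np

def from_pt_y_phi(pt, y, phi):
    """Massless four-momentum (E, px, py, pz) from pT, rapidity and azimuth."""
    return np.array([pt * np.cosh(y), pt * np.cos(phi),
                     pt * np.sin(phi), pt * np.sinh(y)])

def pt_y_phi(q):
    """Transverse momentum, rapidity and azimuth of a four-momentum."""
    energy, px, py, pz = q
    return (np.hypot(px, py),
            0.5 * np.log((energy + pz) / (energy - pz)),
            np.arctan2(py, px))

def generalised_kt(particles, p=-1, R=1.0):
    """Naive inclusive generalised-kT clustering; returns the list of jets."""
    particles = [np.asarray(q, dtype=float) for q in particles]
    jets = []
    while particles:
        kin = [pt_y_phi(q) for q in particles]
        best = (np.inf, None, None)  # (distance, i, j); j is None for d_iB
        for i, (pti, yi, phii) in enumerate(kin):
            d_iB = pti ** (2 * p)
            if d_iB < best[0]:
                best = (d_iB, i, None)
            for j in range(i + 1, len(kin)):
                ptj, yj, phij = kin[j]
                dphi = abs(phii - phij)
                if dphi > np.pi:
                    dphi = 2.0 * np.pi - dphi
                dR2 = (yi - yj) ** 2 + dphi**2
                d_ij = min(pti ** (2 * p), ptj ** (2 * p)) * dR2 / R**2
                if d_ij < best[0]:
                    best = (d_ij, i, j)
        _, i, j = best
        if j is None:
            jets.append(particles.pop(i))      # particle i becomes a final jet
        else:
            merged = particles[i] + particles[j]  # E-scheme recombination
            particles.pop(j)                      # j > i, so index i stays valid
            particles[i] = merged
    return jets
```

Setting `p=1`, `p=0` or `p=-1` selects the $k_T$, Cambridge/Aachen or anti-$k_T$ variant, respectively. The inner search for the smallest distance is the step that the quantum maximum search is meant to replace.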
Regarding the quantum version of this algorithm, the distance $\Delta R_{ij}^2$ is computed classically, whereas the minimum is obtained through a quantum algorithm. This is because the speed-up achieved by obtaining the minimum with a quantum subroutine is the dominant one. What is used here is the new algorithm to obtain the maximum of a list of values (see Section 3), so obtaining the minimum amongst all the distances $d_{ij}$, $d_{iB}$ becomes obtaining the maximum of their inverses, $d_{ij}^{-1}$, $d_{iB}^{-1}$. These inverse distances are computed directly for each pair $i$, $j$, since computing the distances first and their inverses afterwards would require an additional traversal of the whole list of distances. Computing them has complexity $O(N^2)$. With that in mind, one may also directly compute $d_{ij}^{-a}$, $d_{iB}^{-a}$, with $a \in \mathbb{N}$, to increase the separation among the data, which makes the maximum more likely to be measured. This does not increase the overall time complexity of the algorithm either. In Section 5 we compare the results obtained when applying the algorithm with different values of $a$.
The quantum maximum searching algorithm presented above can be applied successfully to the $k_T$-jet algorithm because accuracy is not critical. Even if our quantum algorithm fails to obtain the absolute maximum in one of the multiple iterations, this may not affect the overall jet clustering process, since an error in finding the maximum flips the order in which two particles merge, and the final result will in many cases be independent of this permutation.
As a final remark, notice that the $k_T$-jet quantum algorithm would be $O(N^2 \log(N))$, since computing all the distances takes $O(N^2)$ and finding the minimum would be $O(\log(N))$, in comparison with the $O(N^3)$ required by its classical analogue [88]. Furthermore, the quantum minimum searching could also be implemented in the FastJet algorithm of complexity $O(N^2)$. In this case, the resulting quantum algorithm would be $O(N \log(N))$, which is of the same order as the FastJet algorithm version with Voronoi diagrams, which is the most efficient clustering algorithm known to date. This quantum FastJet algorithm is tested in Section 5.3 with LHC physical datasets.
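The effect of the exponent $a$ can be made concrete with a few lines of NumPy. The sketch below assumes that the amplitudes of the encoded state are proportional to $d^{-a}$, so that each index is measured with probability $d^{-2a} / \sum d^{-2a}$, and that all distances are strictly positive; the function name is hypothetical.

```python
import numpy as np

def probability_of_minimum(distances, a=1):
    """Single-shot probability of measuring the index of the smallest distance."""
    d = np.asarray(distances, dtype=float)
    amplitudes = d ** (-a)                      # encode inverse powers d^(-a)
    probs = amplitudes**2 / np.sum(amplitudes**2)
    return float(probs[np.argmin(d)])
```

For the distances `[1.0, 1.5, 2.0, 4.0]` the probability of measuring the true minimum is about 0.57 for $a = 1$ and about 0.79 for $a = 2$, which illustrates why larger values of $a$ reduce the number of shots needed.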

Quantum simulations
The implementation of the quantum algorithms has been performed with the open-source IBMQ software. In particular, the Python module Qiskit developed by IBMQ has been used to build the quantum circuit that calculates the invariant sum squared as described in Section 2.2 for the K-means and the AP algorithm, as well as to build the quantum circuit for finding the minimum distance in the K-means and the $k_T$-jet algorithm. Afterward, these quantum subroutines have been introduced into their respective classical algorithms, substituting the classical part they are speeding up. The Qiskit module can also execute circuits on real quantum devices. Nevertheless, in previous studies such as [66] and [89], it has been found that the experimental error associated with the quantum devices provided by IBMQ is not yet sufficiently small to extract significant results. Hence, the algorithms presented here have been executed on a quantum simulator that offers an unrestricted and noise-free environment. A quantum implementation on an existing quantum device taking advantage of the claimed maximum speed-up is also not possible, as a qRAM architecture does not exist yet. Nonetheless, the quantum simulations in IBMQ presented in this section show a satisfactory performance and clustering efficiencies comparable to those of their classical counterparts.

Quantum K-means with Minkowski-type distance
At this point we present our implementation of the K-means algorithm with the invariant sum squared as a distance as well as a maximum searching algorithm, and compare its performance with its classical analogue. To this end, we have generated 15 Gaussian clustered datasets of N = 300 three-dimensional vectors 4 with different levels of noise and clustering using the Scikit−learn function make blobs, which gives us the true labels 5 of the generated data. These true labels of the data points are used to calculate the true efficiencies, ε t , of the algorithms when analysing Gaussian datasets. The efficiency ε t is obtained as the ratio of the number of particles classified by the algorithm in the same way as the true labels to the total number of particles. We then applied the hybrid and classical versions of the K-means algorithm to each dataset. Note that the data we are analysing represent the particle four-momenta in such a way that the three-dimensional vectors correspond to the spatial components, while the temporal components are calculated assuming that all particles are massless and on shell. Results are shown in Figs. 3 and 4.  In different colors, clusters identified after 5 iterations by the classical and quantum versions of the K-means algorithm in a Gaussian dataset generated with a random seed and a standard deviation of 2.0 from the cluster centroids. Note that clusterization has been performed using a Minkowski-type distance assuming that all particles are massless and on shell and the efficiencies of both algorithms are ε t = 1.00.
Regarding Fig. 3 one can see at a glance that both classical and quantum versions perform the clustering in the same way in the three-dimensional space of transverse momentum (p T ), rapidity (y) and azimuth (φ). Fig. 4 shows the efficiency in the reconstruction of the clusters as a function of the standard deviations used to generate the data, namely we check whether clustering occurs as expected. It is evident that for small values of the standard deviation both algorithms perform really well, with efficiencies close to one, while for larger values of the standard deviation (i.e. highly noisy data) both efficiencies drop. Furthermore, we can compare the performances of the K-means algorithm      when the seed of the centroids is chosen randomly (see Fig. 4a), with respect to the case when the seed centroids are carefully selected to be as far as possible from each other, according to the K-means++ prescription (see Fig. 4b). The random seed variant in Fig. 4a, has a linear decrease with respect to the standard deviation, and the performances of classical and quantum versions are very similar. On the other hand, the K-means++ variant, Fig. 4b, presents a different behaviour. The quantum version outperforms, in the majority of the cases, the classical one from a standard deviation of 4 onward. Furthermore, in this variant both performances show a dropoff from 4 standard deviations to 7, and then a slight rise from 7 to 8. Finally, comparing both variants it is observed that the K-means++ method outperforms the random seed case for small values of the standard deviation (< 4). However, for larger values of the standard deviation the random seed prescription presents higher efficiencies. In the following, we will apply our quantum K-means method to LHC physical events. To do so we first have processed the data to avoid the following problem: a negative vector −x represents the same quantum state |x as its positive analogue x up to a global phase. 
This data processing consists of rescaling the data to be analysed to the interval [1, 10], i.e. every component of every data point is rescaled to that interval, so that all components are positive. Moreover, when analysing LHC physical events we no longer have the true labels, so we cannot calculate $\varepsilon_t$. Instead, we define the efficiency $\varepsilon_c$ as the ratio of the number of particles clustered in the same way as by the classical counterpart to the total number of particles to be classified.
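Both steps can be sketched in a few lines. The rescaling below assumes a single min-max map applied to all components of the dataset (a per-component map would be an equally plausible reading), and the efficiency assumes that the cluster labels of the two versions have already been aligned; the function names are hypothetical.

```python
def rescale(data, low=1.0, high=10.0):
    """Min-max rescale every component of every point into [low, high]."""
    flat = [c for point in data for c in point]
    lo, hi = min(flat), max(flat)          # assumes hi > lo
    scale = (high - low) / (hi - lo)
    return [[low + (c - lo) * scale for c in point] for point in data]

def efficiency(labels, reference):
    """Fraction of particles whose label agrees with the reference labelling."""
    matches = sum(1 for a, b in zip(labels, reference) if a == b)
    return matches / len(reference)

data = [[-4.0, 0.0, 2.0], [5.0, -1.0, 3.0]]
scaled = rescale(data)                            # all components now in [1, 10]
eps_c = efficiency([0, 0, 1, 2], [0, 0, 1, 1])    # 3 of 4 particles agree
```

The same `efficiency` function gives $\varepsilon_t$ when the reference is the set of true labels and $\varepsilon_c$ when it is the classical clustering.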
We consider the generation of a physical n-particle event produced at the LHC. We use a private implementation of an n-particle phase-space event generator (n can be of the order of tens of thousands). This C++ code, which is based on ROOT [90], generates n-particle events in which the final-state particles can be massive or massless in any combination chosen by the user. This allows the user to generate final states in which all the particles are massless QCD partons, massless QCD partons associated with photons, massive vector bosons, top-quarks, etc.
The precision of the generated final-state event is verified on an event-by-event basis by checking the kinematical constraint between the initial state and the n-particle final state. The required precision is always better than $10^{-2}$. Each generated event is then analysed with the classical versions of the $k_T$-jet algorithms (as implemented in FastJet [64]) and with our quantum versions of the corresponding jet algorithms.
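One way such an event-by-event check could look is sketched below. The precise definition of the precision is not given here, so the measure used (largest four-momentum mismatch relative to the initial energy) and the function name are assumptions made for illustration only.

```python
def momentum_imbalance(initial, final_state):
    """Largest component of (sum of final momenta - initial), relative to the initial energy."""
    total = [sum(p[mu] for p in final_state) for mu in range(4)]
    return max(abs(t - i) for t, i in zip(total, initial)) / initial[0]

sqrt_s = 14000.0                         # GeV
initial = (sqrt_s, 0.0, 0.0, 0.0)        # (E, px, py, pz) in the centre-of-mass frame
# Toy final state: two back-to-back massless particles.
final_state = [(7000.0, 0.0, 0.0, 7000.0), (7000.0, 0.0, 0.0, -7000.0)]

imbalance = momentum_imbalance(initial, final_state)     # 0.0: event accepted
broken = momentum_imbalance(initial, final_state[:1])    # 0.5: event rejected
accepted = imbalance < 1e-2
```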
In this paper we consider n-particle massless final-state production in proton-proton collisions at a centre-of-mass energy of $\sqrt{s} = 14$ TeV, with n = 128 massless particles in the final state. Jets are selected with the $k_T$-jet algorithms using a radius R = 1 and requiring a minimum transverse momentum of the resulting jets of $p_T^{\min} \geq 10$ GeV.
The application of the quantum K-means++ method to LHC physical events is displayed in Fig. 5. Even though we choose K = 8 beforehand, Fig. 5 shows that the algorithms clearly distinguish only 3 or 4 clusters (jets). There is a simple explanation: although the algorithm starts with K centroids, it may converge to a local minimum in which fewer than K clusters are populated, leaving the remaining clusters completely empty.
Fig. 5 shows graphically that both algorithms classify the data in much the same way, and the efficiency of the quantum algorithm is close to one. The results of this quantum version on physical data may therefore be considered satisfactory.
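The empty-cluster behaviour noted above for K = 8 can be reproduced with a toy classical example. The sketch uses plain Lloyd iterations in one dimension with an absolute-difference distance rather than the Minkowski-type distance of the paper, and the function name is hypothetical.

```python
def kmeans_1d(points, centroids, iterations=5):
    """Plain Lloyd iterations in one dimension; empty clusters keep their centroid."""
    centroids = list(centroids)
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        labels = [min(range(len(centroids)), key=lambda k: abs(x - centroids[k]))
                  for x in points]
        # Update step: mean of the members, left unchanged if the cluster is empty.
        for k in range(len(centroids)):
            members = [x for x, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = sum(members) / len(members)
    return labels, centroids

points = [0.0, 0.1, 0.2, 10.0, 10.1, 10.2]
labels, centroids = kmeans_1d(points, centroids=[0.1, 10.1, 100.0])
occupied = len(set(labels))   # only 2 of the K = 3 clusters receive points
```

The third centroid starts far from all the data, never attracts a point and therefore never moves, so the algorithm converges with fewer populated clusters than K.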

Quantum Affinity Propagation algorithm
In this subsection, a simulation of the quantum AP algorithm is presented. First, we apply this algorithm to Gaussian datasets with different numbers of clusters, generated with a standard deviation of 0.6; this value has been chosen arbitrarily, for convenience. The resulting efficiencies for the classical and quantum versions are shown in Table 1, which shows that the classical AP algorithm and its quantum counterpart cluster the low-noise Gaussian datasets successfully.
Next, we apply this algorithm to the physical dataset described in Section 5.1, which was preprocessed for the reasons explained in the same section. The results are shown in Fig. 6. Exactly the same clustering is obtained in Fig. 6b as in Fig. 6a (the efficiency of the quantum version is $\varepsilon_c = 1.00$). Nonetheless, this algorithm finds only 2 clusters, in contrast to the 3 or 4 clusters found by the K-means algorithm (see Fig. 5).

Moreover, both algorithms (K-means and AP) correctly identify the most energetic jets of the event (the blue and the orange ones), while the majority of the remaining particles are not classified in the same way, probably because they are soft particles.
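For orientation, the message-passing step of classical Affinity Propagation that involves a maximum search can be sketched as follows. This is the standard responsibility update, $r(i,k) = s(i,k) - \max_{k' \neq k}[a(i,k') + s(i,k')]$, applied to a toy similarity matrix; the similarities here are negative squared separations on a line rather than the quantum-computed similarity, and the function name is hypothetical.

```python
def update_responsibilities(similarity, availability):
    """r(i,k) = s(i,k) - max over k' != k of [a(i,k') + s(i,k')]."""
    n = len(similarity)
    responsibility = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            # Maximum search over an unsorted list of competing candidates.
            best_rival = max(availability[i][kp] + similarity[i][kp]
                             for kp in range(n) if kp != k)
            responsibility[i][k] = similarity[i][k] - best_rival
    return responsibility

# Toy similarities for points at 0, 1 and 5 on a line (negative squared separations),
# with the diagonal "preference" set to -8 by hand.
similarity = [[-8.0, -1.0, -25.0],
              [-1.0, -8.0, -16.0],
              [-25.0, -16.0, -8.0]]
availability = [[0.0] * 3 for _ in range(3)]   # first iteration: all zeros
r = update_responsibilities(similarity, availability)
```

After this first update the isolated point at 5 has a positive self-responsibility, while the two nearby points prefer each other, which already hints at a two-cluster solution.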

Quantum $k_T$-jet algorithm
In this section, we apply the quantum version of the $k_T$-jet algorithm to the same LHC physical events described in Section 5.1 in order to compare the three clustering algorithms.
In Fig. 7 we show the performance of the classical and quantum $k_T$-jet algorithms. It depicts the jet clustering carried out by each version of the $k_T$ algorithm, i.e. anti-$k_T$, $k_T$ and Cambridge/Aachen. The classical and quantum versions perform the same jet clustering. Comparing Figs. 5, 6 and 7, the latter shows a cleaner clustering with a larger number of jets. This is a visual effect: the jet clustering is represented graphically in three dimensions, which coincides with the dimensionality of the $k_T$ metrics, while K-means and AP use a four-dimensional Minkowski distance.
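The three-dimensional nature of the $k_T$ metrics can be made explicit with a short classical sketch. It uses the standard generalised-$k_T$ measures, $d_{ij} = \min(p_{T,i}^{2p}, p_{T,j}^{2p})\,\Delta R_{ij}^2 / R^2$ and $d_{iB} = p_{T,i}^{2p}$, with $p = 1, 0, -1$ for $k_T$, Cambridge/Aachen and anti-$k_T$; the function and dictionary names are hypothetical.

```python
import math

def kinematics(p):
    """Return (pT, rapidity, azimuth) for a four-momentum (E, px, py, pz)."""
    e, px, py, pz = p
    pt = math.hypot(px, py)
    y = 0.5 * math.log((e + pz) / (e - pz))   # undefined along the beam axis
    phi = math.atan2(py, px)
    return pt, y, phi

def pair_distance(p_i, p_j, power, radius=1.0):
    """d_ij = min(pT_i^(2p), pT_j^(2p)) * DeltaR_ij^2 / R^2."""
    pt_i, y_i, phi_i = kinematics(p_i)
    pt_j, y_j, phi_j = kinematics(p_j)
    dphi = abs(phi_i - phi_j)
    if dphi > math.pi:                        # wrap the azimuthal difference
        dphi = 2.0 * math.pi - dphi
    delta_r2 = (y_i - y_j) ** 2 + dphi ** 2
    return min(pt_i ** (2 * power), pt_j ** (2 * power)) * delta_r2 / radius ** 2

def beam_distance(p_i, power):
    """d_iB = pT_i^(2p)."""
    return kinematics(p_i)[0] ** (2 * power)

POWER = {"kt": 1, "cambridge_aachen": 0, "anti_kt": -1}

hard = (50.0, 30.0, 40.0, 0.0)   # massless, pT = 50, y = 0
soft = (5.0, 4.0, 3.0, 0.0)      # massless, pT = 5,  y = 0
d_kt = pair_distance(hard, soft, POWER["kt"])                 # weighted by the softer pT^2
d_ca = pair_distance(hard, soft, POWER["cambridge_aachen"])   # purely geometrical
d_akt = pair_distance(hard, soft, POWER["anti_kt"])           # weighted by the harder 1/pT^2
```

Only $(p_T, y, \phi)$ enter these distances, whereas the Minkowski-type distance used for K-means and AP depends on all four components of the momenta.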
To conclude this section, we analyse the efficiencies and the number of shots required for all the quantum versions as a function of the parameter a (see Section 4.3). These are shown in Table 2. The efficiencies of the quantum algorithms are close to one, i.e. they classify particles almost identically to their classical counterparts. Furthermore, the larger the parameter a, the smaller the number of shots required to achieve a satisfactory efficiency. In this case, increasing a to 5 is sufficient to achieve the desired efficiencies with at most 10 shots. In other problems (with a larger dataset), a value greater than a = 5 can be used to separate the data points and achieve the highest possible efficiency with the smallest number of shots.
Table 2: Efficiencies and number of shots of the different quantum $k_T$-jet algorithms as a function of the parameter a.

Conclusions
In this paper, we have considered quantum versions of the well-known K-means, Affinity Propagation and $k_T$-jet clustering algorithms. These quantum versions are based on two novel quantum procedures. The first is a quantum subroutine that computes distances in the Minkowski metric, whereas the second is a quantum circuit that finds the maximum in a list of unsorted data.
In the case of the K-means clustering algorithm, the quantum version is based on the standard classical algorithm with a quantum procedure to compute distances in Minkowski space and an additional quantum procedure to assign each particle to the nearest centroid. We found that the K-means quantum algorithm has a clustering efficiency as good as its classical counterpart while it would show an exponential speed-up in computational time in the vector dimensionality d, as well as in the number of clusters K on a quantum device with qRAM.
Second, we have considered a quantum version of the Affinity Propagation method, an unsupervised machine learning algorithm, in which the similarity is computed with the same quantum procedure as in the K-means case. It would therefore lead to an exponential speed-up with respect to its classical counterpart in the vector dimensionality d while maintaining the clustering efficiency.
Finally, we have presented quantum versions of the well-known $k_T$-jet clustering algorithms. On a true universal quantum device, the implementation of these algorithms would exhibit an exponential speed-up in finding the minimum distance. Therefore, while the classical version has a computational cost of $O(N^3)$, where N is the number of particles to cluster, the quantum counterpart would only require $O(N^2 \log N)$. Note that this comparison is made between the non-optimized classical version and its quantum analogue. Further improvements can be obtained by applying to the quantum algorithm the geometrical nearest-neighbour optimization that is also used in FastJet. In this way, we would obtain a quantum version of order $O(N \log N)$, the same order as the fully optimized version of FastJet.
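The cubic cost of the non-optimized classical version can be made concrete by counting the distances a brute-force sequential recombination inspects. The sketch assumes that all pair and beam distances are recomputed and scanned at every step; the function name is hypothetical.

```python
def naive_distance_evaluations(n_particles):
    """Distances inspected by a brute-force sequential recombination.

    With n objects left there are n(n-1)/2 pair distances and n beam
    distances, and every step removes exactly one object (a merge or a
    promotion to a final jet).
    """
    return sum(n * (n - 1) // 2 + n for n in range(1, n_particles + 1))

count_128 = naive_distance_evaluations(128)   # N(N+1)(N+2)/6 for N = 128
```

The closed form $N(N+1)(N+2)/6$ grows as $N^3/6$; for the n = 128 events considered here this is 357760 distance evaluations.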
For all the clustering algorithms considered, the quantum simulations presented in this paper show excellent clustering efficiencies. Furthermore, the comparison with their classical counterparts shows that the two classifications of the simulated LHC data are in close agreement.