Quantum Simulation of Dissipative Collective Effects on Noisy Quantum Computers

Dissipative collective eﬀects are ubiquitous in quantum physics and their relevance ranges from the study of entanglement in biological systems to noise mitigation in quantum computers. Here, we put forward the ﬁrst fully quantum simulation of dissipative collective phenomena on a real quantum computer, based on the recently introduced multipartite-collision model. First, we theoretically study the accuracy of this algorithm on near-term quantum computers with noisy gates and we derive some rigorous error bounds that depend on the time step of the collision model and on the gate errors. These bounds can be employed to estimate the necessary resources for the eﬃcient quantum simulation of the collective dynamics. Then, we implement the algorithm on some IBM quantum computers to simulate superradiance and subradiance between a pair of qubits. Our experimental results successfully display the emergence of collective eﬀects in the quantum simulation. In addition, we analyze the noise properties of the gates that we employ in the algorithm by means of full process tomography, with the aim of improving our understanding of the errors in the near-term devices that are currently accessible to worldwide researchers. We obtain the values of the average gate ﬁdelity, unitarity, incoherence


I. INTRODUCTION
Quantum simulation, i.e., the groundbreaking idea of simulating complex quantum systems on a controllable physical platform following the laws of quantum mechanics [1], is probably the most promising application of quantum computers in the near future [2], owing to its exponential quantum advantage [3] which would lead to crucial achievements in both fundamental and applied science [2].Quantum simulation of unitary manybody systems has been studied and experimentally implemented on several physical platforms [2,4], including superconducting quantum circuits [5][6][7][8], trapped ions [9][10][11], photonic systems [12] and cold atoms [13,14].A relatively less studied problem is the quantum simulation of open quantum systems, whose evolution is not unitary due to the action of an external environment [15][16][17].Different protocols for open system quantum simulation have been introduced in the past twenty years [18][19][20][21][22][23][24][25][26][27][28][29][30][31][32][33], and careful studies of the resources they require are available in the literature [22,24,25,28,29,31].On * marco.cattaneo@helsinki.fi the experimental side, the main achievements include the simulation of a quantum map through quantum gates between trapped ions [34,35], of different single-and twoqubit quantum channels on a near-term quantum computer [36], of a single-qubit master equation via Trotterization in a superconducting quantum circuit [37], of the Hubbard model with local dissipation on a near-term device [38], of local fermionic reservoirs on non-interacting lattices [39,40], and of local master equations via quantum imaginary time evolution on a near-term computer [33].In this work, we present the first fully quantum digital 1 simulation of dissipative collective effects on a real quantum computer, reproducing the dynamics driven by a global master equation [41][42][43] due to a common environment acting on the whole system.The experiments have been run on near-term superconducting quantum computers available through the IBM cloud [44].
Quantum dissipation is generally considered detrimental for quantum technologies, because it inevitably induces decoherence on the system, hindering the realization of accurate quantum algorithms [45].Therefore, experimentalists usually try to reduce or counter dissipative processes in quantum computers.Here, our focus is different: we aim to engineer the most general coherent processes that occur in a dissipative global dynamics.The importance of engineered collective dissipation is broad for both fundamental physics and quantum technologies.Simulating collective dissipation in a qubit platform paves the way for the experimental study of cutting-edge physical phenomena such as dissipative phase transitions [46], quantum synchronization [47][48][49] and dissipative time crystals [50].Moreover, such quantum simulations are an ideal experimental test-bed for quantum thermodynamics, owing to the importance that global (i.e., collective) master equations have in this field [41][42][43].Indeed, it has been shown that local master equations may break the second law of thermodynamics [51], or modify the related energy contributions [52].Engineered dissipation can also be used to build a different model of universal quantum computation [53], or to generate exotic entangled states by means of a common environment [54][55][56].In addition, collective phenomena are believed to play a major role in the dynamics of light-harvesting complexes [57], therefore engineering global master equations may shed new light on the relation between dissipative quantum physics and biological systems.Last but not least, simulating collective dissipation would help us to detect and understand cross-talks in quantum computers [58,59], which may be one of the major sources of noise therein.
In this article, we investigate theoretically and experimentally one of the most well-known dissipative collective processes, namely, the emergence of superradiance [60] and subradiance [61] between qubits emitting simultaneously into a common environment.For this purpose, we use a quantum algorithm recently introduced by some of us, namely the multipartite collision model (MCM) [31].The algorithm reproduces the global emission by means of repeated interactions between the system qubits and a single ancillary qubit that mediates the collective decay.Collision (or repeated interactions) models are an important tool in the theory of open quantum systems [62][63][64], with fundamental applications in quantum thermodynamics [52,65,66] and in the study of non-Markovianity [67,68].Our work is the first implementation of the MCM on a real quantum computer, and one of the first experimental realizations of a collision model [36,69].
Current quantum computers are inevitably noisy [70], limited by non-negligible gate errors and short coherence times, which imposes strong constraints on the depth of quantum circuits.Hence, it is critical to analyze their limitations, identify possible improvements, and characterize their errors [71].In particular, near-term quantum computers are available on the cloud to a community of quantum physicists who aim to test their theories experimentally without having access to the hardware.Inevitably, this means that this community has limited control over the characterization of the device and over the errors therein.It is our experience that the data about the noise features on the quantum computers available on the cloud are sometimes not sufficient to understand the errors we may face when running simulations on these devices.For instance, bad experimental results on the IBM quantum computers may not be justified by the gate errors provided by IBM [44].Starting from these considerations, we have decided to devote a considerable part of our work to studying, both theoretically and experimentally, the errors arising from noisy gates.
On the theoretical side, we estimate a rigorous error bound for the precision of the quantum algorithm we have used in order to simulate dissipative collective effects in the presence of noise.On the experimental side, we perform process tomography [45,72,73] of all the gates employed in the algorithm, and state tomography of the system qubits at each step of the quantum simulation.This allows us to better understand the type of noise affecting the quantum devices used in the experiments, to build a noise model based on the experimental gate process tomography, and to propose possible countermeasures.
A crucial aspect of our analysis is the choice of the figures of merit to estimate the noise properties.Gate process tomography allows us to compute the average gate fidelity [74] of all the CNOTs we employ in the algorithm and their diamond distance [75,76] from the ideal gate.These values capture the features of the error in each individual gate, but are also subject to the so-called SPAM errors, i.e., errors arising from incorrect state preparation and measurement [77].In contrast, the gate error provided by IBM is obtained through a protocol called randomized benchmarking [78][79][80][81][82][83], which estimates a sort of average error for single-and two-qubit gates on some selected qubits, but it is not subject to SPAM errors.
We compare the experimental average gate infidelity with the IBM gate error, showing that in some cases the two values can be remarkably different.Two different noise models can be built based on these quantities, and we observe that their performance is similar.However, the noise model that makes use of the experimental process tomography may give better predictions in the presence of some highly noisy gates.
The value of the diamond distance between ideal and experimental CNOT gates is also of interest to us for two main reasons.First, it directly relates to the theoretical error bound we estimate for the MCM with noisy gates.Secondly, it is the most used figure of merit to estimate the gate error thresholds for fault-tolerant quantum computation [45], given that the average gate fidelity is not reliable for this purpose [84].Having a grasp on the magnitude of the diamond distance is of particular importance as the first experiments aimed at proving quantum advantage have been recently presented [85,86].Our results suggest that the errors in the near-term devices we have employed may still be orders of magnitude away from the strictest fault-tolerance thresholds.
The paper is structured as follows.In Sec.II we in-troduce the most important theoretical tools to estimate the distance between quantum channels and the errors on a quantum computer.We also discuss the properties of the figures of merit used in our study and compare them.In Sec.III we briefly recall the multipartite collision model and the dissipative quantum dynamics we aim to simulate.Sec.IV is devoted to the discussion of the new theoretical error bounds for the quantum simulation of open systems with noisy gates.In Sec.V we present in detail our experimental results and the noise analysis.Finally, in Sec.VI we draw some concluding remarks and discuss the importance and significance of our results.

II. ERROR ESTIMATION IN QUANTUM SIMULATION
The goal of quantum simulation is to reproduce any quantum dynamics by means of a suitable composition of quantum channels that we can easily implement on a quantum computer.Therefore, errors in quantum simulation algorithms arise from the difference between such ideal quantum channels and the physical ones implemented in practice.In this section, we introduce some measures to quantify the distance between two quantum channels, in order to provide a precise estimate of the error of a quantum simulation algorithm.Specific emphasis is given to the distance between the ideal quantum gate of our theoretical algorithm and its noisy version, i.e., the actual gate implemented on the quantum computer.
We consider quantum channels T : B(H) → B(H), with B(H) the space of bounded operators on the Hilbert space of the qubits.According to their usual definition, quantum channels are completely positive, linear, trace preserving maps [15,45,76].

A. Distances between quantum channels and their properties
In this section we introduce the figures of merit for estimating the distance between two quantum channels.These are based either on the "average" or on the "worstcase" scenario [87].Throughout the paper we will also employ the trace norm • 1 and the infinity norm • ∞ , which are recalled in Appendix A.
Definition 1 (Average gate fidelity).If U g is the (ideal) unitary superoperator associated with a quantum gate, and T is its noisy implementation, the average gate fidelity [74] is defined as: where dµ(ψ) is the Haar measure over the pure states of the Hilbert space and F is the fidelity between two quantum states introduced in Appendix A. Correspondingly, we introduce the average gate infidelity as r(U g , T ) = 1 − ϕ(U g , T ).Note that if we average over the mixed states instead of pure states we obtain the same result, given the linearity in the definition of a density matrix [87] and the convexity of the infidelity [76].
From an abstract perspective, obtaining the exact value of the average gate fidelity requires perfect knowledge on the channel T , and this is obtained through standard quantum process tomography [45].The latter, however, has the drawback of being subject to errors in the state preparation and measurement in the circuits for the experimental process tomography [77], and it is computationally feasible only for few-qubit gates.For this reason, a procedure called randomized benchmarking has been introduced [78][79][80][81][82][83], which does not require full-state tomography, nor a precise control over the preparation and/or measurement errors.The core idea of the randomized benchmarking protocols is to apply a sequence of gates randomly drawn from the Clifford group and to compose it with its conjugate transpose on the selected qubits, so that, if the gates were ideal, the outcome channel would be equal to the identity.The "survival probability" of an initial state under this kind of evolution is then computed as a function of the number of gates in the sequence, and an average gate error is estimated by properly fitting the results (we refer the reader to extensive discussions in the literature for a more rigorous definition of what randomized benchmarking is actually measuring [81,[88][89][90]).Randomized benchmarking is gauge free and it is robust both to noise fluctuations on the "twirling gates" (the gates that form the Clifford group action) and to SPAM errors [89,90].However, it does not estimate the average fidelity of each individual gate [81].Individual gate fidelities may be estimated through interleaved randomized benchmarking [91], although some assumptions must be made for this protocol to be reliable [92].
To the best of our knowledge, the single-qubit and twoqubit gate errors provided by IBM are obtained through standard randomized benchmarking [44].In the experimental part of this work, we will estimate the two-qubit average gate fidelity in Eq. ( 1) via process tomography and we will reduce the SPAM errors through readout error mitigation.A "third way" between process tomography and randomized benchmarking may be gate-set tomography [77,[93][94][95], which is a type of calibrationfree tomography.However, it suffers from some so-called "gauge issues" [94,96], so we will not consider it here and leave it for future experimental studies.
Definition 2 (Induced superoperator norm).The 1 → 1 superoperator norm of a quantum channel T is defined as: The 1 → 1 superoperator norm has the drawback of not behaving well with respect to the tensor product.
For this reason, a different norm is usually employed in quantum error correction algorithms: Definition 3 (Diamond norm).The diamond norm of a quantum channel T is defined as [76,97]: where I A is the identity superoperator over a copy of the space B(H).
Note that both the 1 → 1 and the diamond norms are sub-multiplicative.Furthermore, the diamond norm satisfies the fundamental stability property (see for instance Refs.[87,98]), i.e., T ⊗ I B ♦ = T ♦ , where I B is the identity superoperator over a generic space.Specifically, it can be shown that the diamond norm defined with the identity superoperator over a copy of the space B(H) is the maximal one (it is lower for lower dimensions, and it has the same values for higher dimensions).The diamond norm can be computed either through convex optimization procedures [99] or by a semidefinite program [100,101].
Definition 4 (Diamond distance).The diamond distance between an ideal gate with channel U g and its noisy realization T is defined as: The sub-multiplicativity of the diamond norm implies that: and analogously for the 1 → 1 superoperator norm.This is the sub-additivity property, which is crucial to guarantee that the error in the gate composition scales at most linearly as a function of the length of the gate sequence.Note that both the 1 → 1 and the diamond norm do not have a closed form as a function of the Choi matrix of the quantum channel we are interested in, but must be computed through numerical maximization.Moreover, to obtain their values we must typically rely on quantum process tomography, which is sensitive to SPAM errors (while, for instance, the average gate fidelity may be estimated through a SPAM-free randomized benchmarking protocol).
In addition, we introduce a figure of merit to estimate the coherence of noise [102].That is, how close a given quantum channel T is to a unitary one.This will be crucial to understand whether the error of a noisy quantum gate will be mostly due to the presence of decoherence or to the fact that we are performing a gate which is different from the target gate (say, we perform a qubit rotation driven by σ 0.99 x instead of the desired σ x or, more drastically, a rotation driven by σ z instead of σ x ).
Definition 5 (Unitarity).The unitarity of a quantum channel T is defined as the average purity of output states, with the identity components subtracted: where d S is the dimension of the system Hilbert space.
Equivalently, the unitarity can be defined as [92,102] where T u is the unital block [92], defined by the following Liouvillian representation of the trace-preserving channel: The orthonormal basis in which we are representing the matrix is For instance, for qubit systems we will choose the basis given by the tensor products of Pauli matrices, including the identity.T n is the non-unital vector [92].
The unitarity of a general quantum channel satisfies u(T ) ≤ 1, being one only if T is unitary.Moreover, u(VT U) = u(T ) for all U, V unitary.The unitarity satisfies the following lower bound in terms of the average gate infidelity: 2 for any U g (clearly, the closer U g is to T the tighter the bound [102]).
Finally, we will make use of the incoherence [103,104], which is a quantity related to the unitarity as follows: Definition 6 (Incoherence).The incoherence of a channel T is defined as: Using the lower bound for the unitarity, we readily obtain 0 ≤ ω(T ) ≤ r(U g , T ) for any U g .The value of the incoherence can be thought of as the minimum infidelity that may be achieved with perfect unitary control over the system [104], i.e., the "contribution" to the infidelity due to purely dissipative errors.A different measure of the "non-unitarity" of a quantum channel has also been introduced in the literature [105], but for simplicity it won't be considered in this study.

B. Comparing different figures of merit
It is a well-known fact that the fidelity is a deceptive measure to compare quantum states.For instance, the fidelity between states with very different properties, e.g., one entangled and the other separable, can be larger than 0.95 [106].This is a consequence of the infidelity not being a proper mathematical distance.If we compare it to a well-defined distance such as the trace norm, a common inequality reads [76]: where, crucially, the trace norm is bounded by the square root of the infidelity.This means that states with fidelity 0.99 may have a trace distance equal to 0.1.We can expect similar considerations also when the fidelity is employed to estimate the distance between quantum channels, as in Eq. (1).Indeed, the most important and rigorous theorems guaranteeing fault-tolerant quantum computation do not rely on the fidelity, and instead make use of the diamond distance to bound the maximal error which can be cast away by means of quantum error correction protocols [75,84,107,108].The diamond norm is preferred to the 1 → 1 superoperator norm because of its stability property, discussed in the previous section.Moreover, addressing the worst case scenario (i.e., performing a maximization as in Eq. ( 3)) is necessary to estimate a proper fault-tolerant error threshold, while one cannot rely on the average figure of merit in Eq. (1).
As a possible alternative to average gate fidelity, the entanglement fidelity [109][110][111][112] of a quantum channel has been proposed to estimate the noise properties of a quantum channel.
Definition 7 (Entanglement fidelity).The entanglement fidelity of a quantum channel T is defined as: where I A is the identity superoperator acting on a copy of the Hilbert space of the system, while φ represents a maximally entangled state in the extended Hilbert space H ⊗ H A .
Entanglement fidelity captures how well entanglement between the quantum system of interest and other systems is preserved under the local application of the channel T .It has been shown that the entanglement fidelity can be directly connected to the average gate fidelity as [74,113]: where d S is the dimension of the system Hilbert space.However, the entanglement fidelity is not a measure of the worst case error either.An upper bound for the diamond distance as a function of the system dimension d S and the average gate infidelity r(U g , T ) is given by [82]: Once again, we see that the square root of the average gate infidelity appears in the bound.Moreover, the dependence on the system dimension makes its scaling even worse than in Eq. (10).The tightness of this bound has not been proven, but some results indicate that the bound is asymptotically tight as a function of ϕ and d S [84].They also show how, in general, the average gate fidelity is not reliable for assessing fault-tolerant quantum computation: a value of ϕ(U g , T ) = 99% for a two-qubit gate can still lead to an error around = 0.13, which is very far from the fault-tolerant thresholds against general noise (typically ≈ 10 −3 -10 −4 [93,95,[114][115][116]).Therefore, the diamond norm is the only reliable figure of merit for estimating the distance between quantum gates with the aim of assessing fault-tolerance of the quantum computation, while the value of the average gate fidelity, which is very useful to estimate the average error we will run into during the implementation of a quantum gate [74,87], must be taken with a grain of salt when speaking about fault-tolerance (more discussions can be found in Refs.[82,84,102,108,117]).
A tighter error bound for the diamond distance can be employed if we know the unitarity of the quantum channel [117] (a similar bound can be found in Ref. [108]): with Finally, the unitarity can be used to estimate how the average gate fidelity of composed channels behaves as a function of the average gate fidelity of each component [92], or equivalently, how the average gate fidelity scales as a function of the number of gates in the algorithm.Without getting into the details discussed in Ref. [92], if we consider, for simplicity, a circuit made of m identical gates with average gate fidelity ϕ, then the average gate infidelity of the whole circuit will be upper bounded by m2 (1 − ϕ), plus some lower-order terms that scale as m 4 (1−ϕ) 2 2 .This bound is tight for all unitary channels with unitarity u = 1.If, on the contrary, one considers highly incoherent channels, such as the depolarizing one, the bound can be improved as m(1−ϕ), plus terms of the order of O(m 2 (1 − ϕ) 2 ).To summarize, lower unitarity implies a better scaling of the infidelity as a function of the channel length.

III. MULTIPARTITE COLLISION MODEL FOR DISSIPATIVE COLLECTIVE EFFECTS A. Introduction to the algorithm
The multipartite collision model (MCM) has been recently introduced [31] as a repeated interaction model [62,63] able to reproduce any Markovian evolution of a multipartite open quantum system (that is, an open system made of multiple subsystems) by elementary collisions between each subsystem and some environment particles, termed as "ancillas".Collision models such as the MCM are naturally suited to implement digital quantum simulations of open quantum dynamics.Specifically, we consider a quantum system living in a Hilbert space H composed of M subsystems, which, without loss of generality, we consider identical and of dimension d: We are interested in the most general Markovian quantum evolution of the state of the system at time t, ρ S (t).Specifically, starting from a generic ρ S (0), we aim to simulate the dynamics on a quantum computer, with L is the so-called Liouvillian superoperator, i.e., the generator of the quantum dynamical semigroup driving the dynamics.The most general structure of the Liouvillian is expressed by the celebrated Gorini-Kossakowski-Sudarshan-Lindblad (GKLS) master equation [15]: where we are working with units such that = 1.H S is a generic Hermitian operator of the system, to which we refer as the "effective Hamiltonian", while is the dissipator, describing the non-unitary system dynamics.Γ k ≥ 0 are non-negative decay rates, while {L k } J k=1 are the Lindblad operators, whose number J is bounded as The multipartite collision model provides a way to simulate Eq. ( 16) on a quantum computer by expressing the action of the dissipator of Eq. ( 18) in terms of elementary quantum gates between each subsystem, i.e., a subset of qubits, and a collection of ancillary qubits, representing the environment ancillas.Dissipative collective effects, due to terms in Eq. ( 18) that couple two or more subsystems, are implemented via a suitable sequence of quantum gates between a single environment ancilla and two or more system qubits.More precisely, collective terms are treated through a second-order Suzuki-Trotter decomposition [118].A detailed presentation of the MCM algorithm is summarized in Appendix B.
The fundamental result introduced in [31] states that the MCM simulates the exact open system dynamics driven by L (Eq. ( 16)) in the limit of small timestep ∆t: where φ ∆t is the quantum channel reproducing a single step of the MCM (see Appendix B for details), which here has been applied n times (i.e., n repeated collisions until time t = n∆t).
On a real quantum computer, obviously one can not employ an infinitesimal timestep ∆t → 0 + to implement the gates in Eq. (B2), and has to choose a small but finite ∆t.This implies that the multipartite collision model is able to simulate the dynamical semigroup as in Eq. (19) up to an error that depends on ∆t.Let us term this global error as g , following the original paper [31]: In the above equation we are using the 1 → 1 superoperator norm despite its stability issues discussed in Sec.II because this figure of merit was studied in Ref. [31].This definition will be extended through the use of the diamond norm in Sec.IV and Appendix D.
The global error can be trivially bounded by the sum of the errors in a single timestep: g ≤ n s , where Ref. [31] estimates a very general bound on s , showing that its scaling is optimal for collision models.This allows for the efficient implementation of the quantum simulation algorithm.The complete expression of this bound is quite cumbersome and can be found in the Supplemental Material of Ref. [31].Here, we just write it as: where (the latter term being the Hamiltonian driving a single collision, as explained in Appendix B), and pol 1 and pol 2 are polynomial functions of Λ, the number of subsystems M , and the maximum number of jump operators J.The scaling of the global error is Finally, note that J scales exponentially with the number of subsystems.However, under the very common assumption of k-locality [3,22], i.e., the jump operators and the effective Hamiltonian can be decomposed as sums of k-local terms that act non-trivially on k subsystems only, the bound in Eq. ( 22) scales polynomially as a function of M [22,31].This guarantees that the multipartite collision model is efficiently simulable on a quantum computer.

B. Simulation of super-and sub-radiance
We will now show how the multipartite collision model can be applied to the simulation of topical collective effects, such as super-and sub-radiance.The simplest model where these phenomena arise consists of two atoms that emit coherently into a common environment.Specifically, superradiance emerges when the atoms lose their energy through a quick emission, the intensity of which is enhanced with respect to the local incoherent decay that each atom would experience in the absence of the other [60].In contrast, subradiance, which is a complementary effect due to the same cause (namely, the action of a common bath), can be identified as the presence of a slowly decaying, metastable mode of the atomic emission, which persists for a time way longer than the usual relaxation time [61] (T 1 in the language of cavity or circuit quantum electrodynamics [8]).
In this work we address super-and sub-radiance between two two-level atoms, i.e., two qubits of a quantum computer.Despite its simplicity, this model displays a rich landscape of collective phenomena [119] that may also bring new insights on the presence of entangling noise in the experimental platform [120].Remarkably, we will address a scenario where subradiance emerges through the presence of a decoherence-free subspace, that is, a subspace of the system Hilbert space where the dynamics is unitary and there is no dissipation.The relevance of decoherence-free subspaces for quantum computation is crucial, given that noiseless computation may be possible by restricting the Hilbert space over which we run the quantum algorithms to the decoherence-free subspace [121].More specifically, we aim to simulate the following generator of the open system dynamics: with , and the latter are the lowering operators of respectively qubit 1 and 2. In Eq. ( 23) γ is a constant decay rate that defines the magnitude of the dissipation and, specifically, it is the spontaneous emission coefficient of each atom in the vacuum.
The master equation ( 23) can be derived from a microscopic model with two identical two-level atoms dissipatively coupled in a symmetric way to the same common bath at zero temperature [43].In a reference frame rotating with the frequency of the qubits, the free Hamiltonian of the atoms can then be neglected, as in Eq. ( 23).The analytical solution of the dynamics driven by Eq.( 23) can be found in Appendix C.
Superradiance is observed when the open evolution of the two qubits starts from the "superradiant state" , where |g and |e are respectively the ground and the excited state of each qubit.Indeed, introducing ρ sup = |ψ sup ψ sup | and the emission power [60] where is the system's energy, we easily find (see Appendix C for details): We observe a clear enhancement of the atomic decay, driven by a decay rate equal to 2γ instead of the standard spontaneous emission rate γ.
In contrast, subradiance emerges when the system dynamics starts from the "subradiant state" . Defining ρ sub = |ψ sub ψ sub |, we observe with zero emitted power.In other words, the subradiant state is a steady state of the dynamics that does not "feel" the presence of dissipation acting on the qubits.
Let us now expose how the multipartite collision model can simulate the dynamics in Eq. ( 23).Since there is a single Lindblad operator (J = 1), we only need a single ancillary qubit initialized in the ground state for each timestep ∆t.Let us term it as the qubit E.Then, the two-qubit quantum gates we will employ in the algorithm are: These gates will be composed as in Eq. (B2).
Let us also consider the analogous quantum evolution in the presence of two separate baths, i.e., without collective effects between the qubits.Then, the master equation driving this dynamics, assuming once again that the frequencies of the qubits and their decay rates are the same, is equivalent to Eq. ( 23) without cross terms with j = k in the summation (see Appendix C and Eq.(C10) for further details).In this scenario, we have two local Lindblad operators, namely σ − 1 and σ − 2 .Therefore, the multipartite collision model to reproduce the local dynamics requires two ancillas, say respectively qubit E 1 and qubit E 2 , and the same gates as in Eq. ( 27), both lasting for a time ∆t.Indeed, in the absence of collective effects we do not need to employ the second-order Suzuki-Trotter decomposition as in Eq. (B2), while we can just rely on its first-order analogous [31,122].
Let us now fix the values of γ and ∆t.We set γ = 1 (after choosing some suitable units of measurement), which is the only physical timescale of the problem.This means that thermalization is reached at a time t R ∼ O(1) [43].The time evolution of the emission power of Eq. ( 24) is plotted in Fig. 1(a) for different initial states (superadiant, subradiant, and the excited state |ee ), and compared to the case of local dissipation for an initial subradiant state.
For the quantum simulation, we choose ∆t = 0.1 and consider the dynamics until t = 1, requiring n = 10 timesteps.The upper bound on the single-step error, given by Eq. (IV), is plotted in Fig. 1(b) as a function of the timestep ∆t.These high values of the single-step error are due to the fact that the bound in Eq. ( 22) reflects the worst-case scenario (the maximization is performed over all the initial states).This is necessary to guarantee the efficiency of the execution of the algorithm, but quite often of little use for practical purposes.Therefore, in Fig. 1(a) (inset) we plot the trace distance between ρ S (t), as obtained from Eq. ( 23), and the state simulated by the MCM with ∆t = 0.1, as a function of the number of timesteps, i.e., for n = 1, . . ., 10.Note that ideal (n) is different from the global error at the nth timestep g introduced in Eq. (20).Indeed, in Eq. ( 28) there is no maximization over all the possible initial states, since we focus on a specific initial state.As a consequence, ideal (n) is much lower than g .In fact, in Fig. 1(a) (inset) we observe that, even with a large timestep such as ∆t = 0.1, the MCM is able to simulate the state of the dynamics at time t with high accuracy.

IV. ERROR BOUND FOR THE NOISY SIMULATION
In Ref. [31] an error bound for the ideal case of the MCM based on the 1 → 1 superoperator norm is computed.For a single step of the MCM, it is expressed as in Eq. ( 22).Here, we estimate an analogous upper bound for the case of noisy gates.We use the diamond norm instead of the 1 → 1 superoperator norm to obtain bounds expressed in terms of error values that can be employed to guarantee fault-tolerant quantum computation.
Let us first replace the ideal quantum map φ ∆t , defined in Eq. (B5), with a noisy one, which we term as φ * ∆t .The error we want to estimate is: Thanks to the triangle inequality, we have: where we have introduced the errors Let us focus on these norms for a single application of the MCM.We first address the bound for the ideal case based on the diamond norm ♦ s , and then the diamond norm * m between the ideal and noisy MCM maps.We give here only the final result, while the derivation of the bounds can be found in Appendix D. Our first statement is: Proposition 1.The ideal error bound B 1→1 evaluated through the 1 → 1 superoperator norm as in Eq. ( 22) is valid also for the diamond norm: where the expression B 1→1 was introduced in Eq. (22).
This means that all the results of Ref. [31] about the scaling of the error can be trivially extended to a scenario where the diamond norm is employed.In particular, the efficient quantum simulation of the MCM is guaranteed also through the diamond norm.Note that, for instance, the behavior of the upper bound shown in Fig. 1(b) holds also for ♦ s .Let us now focus on * m .Our aim is to estimate "how far" the noisy implementation of the MCM is from its ideal analog.To do so, we may assume that each ancillary qubit is not prepared in the ideal initial state ρ E,i , but in the noisy ρ * where G i is a known quantum channel characterizing the noise on the state preparation of the ith ancilla in the actual device.The quantum channel for the total unitary evolution of the system+ancillary qubits of the ideal MCM for a single timestep ∆t is U sim (∆t) = U sim (∆t) • U † sim (∆t) (see Eq. (B4)).The unitary evolution U sim (∆t) can also be decomposed as a composition of many quantum gates on a quantum computer, such as U sim (∆t) = j U j .These gates typically act on both the system and ancillary qubits.Let us now suppose that, on a real platform, each of these gates is not ideal but noisy, and can be represented by the quantum channel E j .Then, we find: where G i and E j are the noisy channels introduced above, while the diamond distance d ♦ is defined in Eq. (4).
That is to say, even though the MCM map acts on the state of the system only, we can estimate an upper error bound for the noisy map that is equal to the sum of the individual errors for each quantum gate between the system qubits and the ancillas (including the preparation of the initial states of the ancillary qubits).Note that the above error bound is valid also for modified versions of the MCM, where, for instance, the ancillary qubits can be prepared in an initial entangled state.Moreover, the above bound is robust even under more general sources of error, e.g., if the initial state of system qubits+ancillas is accidentally entangled.Indeed, the diamond distance E j − U j ♦ involves a maximization over all the possible initial states of the overall system3 , including the entangled ones.In such a scenario, the noisy preparation given by the set of G i can then be extended to act on both system qubits and ancillas, yielding an entangled state instead of an initial state that is separable between system and ancillary qubits.
To summarize, the results stated in Proposition 1 and Proposition 2 can be employed to estimate a general upper bound for the global error of the noisy MCM map, according to Eq. ( 30).More specifically, Eq. ( 32) expresses the error due to the finite (and not infinitesimal) timestep ∆t in the ideal algorithm.Clearly, the lower ∆t the better, as shown in Fig. 1(b).Instead, Eq. (33) states that the error due to a noisy MCM protocol can be decomposed into the sum of the individual errors for each quantum gate employed in the algorithm, including the state preparation of the ancillary qubits.The diamond distances in Eq. ( 33) can be estimated experimentally, and we will show their values for some CNOT gates employed to simulate the MCM on the IBM quantum computers (see the discussion in Sec.V and Appendix G 2).
We stress again that we are using the diamond distance, which is employed to prove the formal theorems about fault-tolerant quantum computation.This means that the distances between noisy and ideal gates that appear in Eq. ( 33) express exactly the errors that we must keep below a certain threshold to guarantee faulttolerant computation.Moreover, recall that the diamond norm addresses the worst-case scenario, so the upper error bound can be higher than the actual error in a single implementation of the algorithm (compare Figs. 1(a) and (b)).

V. EXPERIMENTAL DEMONSTRATION ON IBM QUANTUM COMPUTERS A. Introduction to the experimental results
In this section, we present the implementation of the multipartite collision model on a near-term superconducting IBM quantum computer available on the cloud, programmed through the library Qiskit [44].In particular, we will show some experimental results obtained on the limited-access 16-qubit ibmq guadalupe, while more results on the 27-qubit ibmq toronto are discussed in Appendix F 2.
Our aim is twofold.On the one hand, we want to show how even current near-term devices can display non-trivial dissipative collective effects in a quantum simulation by means of the multipartite collision model.On the other hand, we intend to investigate the properties of noise in these platforms via process tomography, and to relate them with the accuracy of the simulation of the MCM.By doing this, we aim to investigate whether the quantum physicists who make use of near-term quantum computers on the cloud can infer the accuracy of the simulations they would like to run without having access to the hardware.We will see that gate process tomography can provide us with useful information about the precision of the quantum simulation despite its drawbacks that are discussed in the literature [77,95], namely its sensitivity to preparation and measurement (SPAM) errors.
In this paper we present two different sets of results.The first one (discussed in the present section) shows clear signatures of collective effects, and the accuracy of the simulation is reasonably good.The second one is presented in Appendix F 1 and displays noisier results, and the simulated MCM yields results that are quite far from their expected values.We will show, however, that this can be traced back to some highly noisy gates that are repetitively employed in the algorithm, the noise prop-erties of which are captured by our analysis based on process tomography.Whether these high levels of noise are due to an incorrect application of the quantum gate itself or to SPAM errors may be a matter of debate.However, their signatures are clearly evident in the results of the quantum computation, which, ultimately, is what we are interested in when we run algorithms for quantum simulation on near-term quantum computers.
The experimental details and the platform schemes can be found in Appendix E. Shortly, the protocols that, for the above-mentioned purposes, we have to run on the devices are (note that process tomography is not needed for the MCM, but only to assess errors): 1. Algorithm implementing the MCM until the step n = 5 for the three different initial states explored in Sec.III B and for the local dynamics, and performing measurements in the computational basis.
2. State tomography of the system at each timestep of the algorithm.
3. Process tomography [45] of the CNOT gates employed in the algorithm, in order to estimate the noise properties.We do not address the properties of the single-qubit gates, because their error is usually a couple of orders of magnitude lower than the one of two-qubit gates (see Appendix E for further details).
4. Readout error mitigation on the qubits measured in both the algorithm and the gate process tomography.This is a standard procedure that can be implemented on the IBM near-term devices through the Qiskit library [44].It detects possible systematic errors in the outcomes of the measurements on the set of qubits of interest, and allows for readout error mitigation.All the outcomes we will show in the manuscript have been obtained after applying the proper readout error mitigation procedure.
Note that we will simulate n = 5 steps of the MCM, using the same model parameters that have been employed in Sec.III B (namely, decay rate γ = 1, timestep ∆t = 0.1).The specifics of the near-term devices we are employing and the noise level do not allow for more collisions between the system qubits and the ancillas.In any case, we will see that 5 steps are already enough to observe collective effects in the dynamics.
It is worth stressing that a crucial challenge in the simulation of the MCM is getting a new ancilla initialized in the ground state after each collision, as required by the steps of the algorithm discussed in Appendix B. In general, in the topology of a quantum computer only one or two ancillary qubits are directly connected to (i.e., can perform operations with) both the system qubits.Therefore, we have generated a new refreshed ancilla after each timestep by swapping the state of the common ancillary qubit with the one of the nearby qubits prepared in the ground state.To do this, we need more and more swap gates as the number of collisions increases, giving rise to a "train of ancillas" that must be subsequently swapped to get to interact with the system qubits.More details can be found in Appendix E. A different solution may consist in employing a reset gate to reinitialize the state of the common ancillary qubit in the ground state after each collision.However, this did not work on the platforms we have used due to decoherence effects, as shown in Appendix F 3.

B. Experimental MCM outcomes
A first set of results and measurement outcomes in the computational basis of the quantum computer is shown in Fig. 2 as a function of the number of steps in the MCM, from n = 0 (state preparation) to n = 5.The computational basis "00", "01", "10", "11" on the IBM quantum computer [44] corresponds to the physical basis |gg , |ge , |eg , |ee of the two qubits.
The results display a clear evidence of dissipative collective effects in the system dynamics.To see this, let us compare the behavior of the subradiant and superradiant evolution with the one of the local dynamics.The ideal local decay of the subradiant state (solid black lines) can be considered as a benchmark to discriminate between collective and non-collective dynamics.Indeed, as shown also in Fig. 1(a), it corresponds to an exponential decay driven by the local dissipation rate γ (see Appendix C for the analytical solution of the dynamics).Then, the emergence of collective effects is detected by either the presence of a slower decaying eigenspace of the dynamics ("subradiance") or by a much faster decay ("superradiance"), depending on the initial state of the system.This is exactly what we observe in the experimental data depicted in Fig. 2: if we start in the subradiant state (dash-dotted light blue lines), the decay of the excited population levels (|ge and |eg ) into the ground state (|gg ) is slower than the one given by the ideal local dynamics, i.e., they are decaying with a rate that is smaller than γ.This can be possible only in the presence of collective effects between the two qubits.Note that, although the light blue markers are below the black solid line in the evolution of |ge , this is not due to a faster, local decay of the subradiant state, but to an imprecise state preparation at time t = 0 for which the initial population of |ge is lower than 0.5.Its decay, however, is again slower than e −γt .The same considerations apply for the evolution of the superradiant state (dash-dotted magenta lines): the decay of the initially excited populations (|ge and |eg ) into the ground state (|gg ) is clearly faster than the local decay and, at least during the first timesteps of the dynamics, it follows the two times faster decay driven by e −2γt (solid magenta lines).The experimental local decay of the subradiant state (black markers) approximately follows the ideal local dynamics, and it decays as e −γt , as we were expecting.Finally, the collective evolution of the state |ee (orange markers) is a linear combination of subradiant and superradiant dynamics (see Appendix C), and it well captures its ideal behavior (orange solide lines).Moreover, it does not run into a large error in the state preparation, since preparing |ee (starting from |gg ) requires only singlequbit gates.In contrast, the initialization of the superand sub-radiant states involves two-qubit gates, which are much noisier than single-qubit operations, and this is why their state preparation is more imprecise.
It is worth noting, however, that the subradiant state is decaying, while it should be stationary according to the master equation (23).In other words, there are leakages out of the decoherence-free subspace.This is due to the errors in the gates we employed in the algorithm, which are inevitably noisy in near-term devices.Another source of leakages is due to the finite Trotter timestep ∆t of the algorithm.However, as one can infer from the inset of Fig. 1(a), the discrepancy between the true dynamics and the ideal simulation is very small for the subradiant state, therefore this source of leakages is basically negligible in the present scenario.The interested reader can check in Appendix G 1 a quantitative comparison between the error in the experimental results of Fig. 2 due to noisy gates and the ideal error of the algorithm.For the subradiant state, the ratio between ideal and experimental error is always smaller than 1%.
We stress that, despite the dissipation due to the gate errors, we have proven that the multipartite collision model produces a more robust subradiant state than in the presence of local decay only.Therefore, our algorithm is able to preserve the populations of an entangled state of the qubits for a longer time than in the absence of collective effects, even on near-term devices with a considerable level of decoherence.

C. Gate error analysis
Let us now focus on the gate analysis displayed in Fig. 3.We compare the experimental average gate infidelity of the CNOTs in the algorithm, computed by reconstructing the Choi matrix of the gate [123] through gate process tomography, with the gate error provided by IBM and extrapolated through a randomized benchmarking protocol [44,80].We also plot the value of the incoherence ω(T ) given by Eq. ( 9).The experimental values have been resampled 100 times via bootstrapping to estimate the standard deviations of the samples (depicted in Fig. 3 as error bars).We have also computed the values of the diamond distances between ideal and 0.6 0.7 0.8 0.9 1.0 ( )/r( g , )  3: Error analysis of the set of CNOT gates in ibmq guadalupe employed for the simulation of the first set of results in Fig. 2. The tick 0-1 on the x axis corresponds to the CNOT gate where "qubit 0" in ibmq guadalupe is the control qubit and "qubit 1" is the target.Lower plot: gate error provide by IBM (magenta), experimental average gate infidelity r(U g , T ) via full process tomography (light blue), and experimental incoherence ω(T ) (yellow), as defined in Eq. ( 9).The error bars on the values of r(U g , T ) and ω(T ) are the standard deviations of 100 realizations of a random sample of the experimental data via bootstrapping.Upper plot: ratio between the incoherence and the experimental average gate infidelity of each CNOT gate.experimental gates, and a discussion about them can be found in Appendix G 2.
Note that the CNOTs between qubit "10" and "12" are not taken into account due to an error in the experimental outputs.However, these gates are employed only a single time during the whole protocol to switch the fourth and fifth collision ancillas, so their features are of little interest for our analysis.
We first make two major remarks: (i) As shown in the upper plot of Fig. 3, the ratio between the incoherence and the average gate infidelity is larger than 0.9 for almost all gates.This means that the "coherence of noise" [102]  hardware is usually quite low, and the dominant source of error is incoherent.This agrees with previous studies on the noise properties of the gate pulses on the IBM quantum computers [124].
(ii) The experimental average gate infidelity is of the same order of magnitude as the IBM gate error, but it can quite remarkably differ from the latter (e.g., see the CNOT 2-1 or 1-4).So, we are observing a discrepancy between the average gate fidelity via process tomography and the value provided by IBM, which is typically obtained through randomized benchmarking.Similar remarks on the IBM quantum devices can be found in the literature [125,126].For instance, the full process tomography shows how the average gate infidelity of a CNOT gate between the same pair of qubits can differ when we switch the control and target qubits, while this is not captured by the IBM error.
Overall, the discrepancy between the experimental average gate infidelity and the IBM error may be explained as follows: 1) The IBM error is describing the average infidelity of a noise channel obtained as, following standard randomized benchmarking, the average channel between the twirling gates of the Clifford group (see, e.g., Refs.[81,89,90] for more rigorous details); this is providing us with a useful measure to benchmark the average noise on the selected pair of qubits, but it is different from the average gate infidelity of a specific CNOT gate.2) Even if readout error mitigation has been applied, some residual SPAM errors might be present in the characterization of the average gate infidelities, while the IBM gate error is SPAM-free.
Even if we cannot rule out the presence of SPAM errors in the values of the average gate infidelities (this is a well-known drawback of process tomography [77]), we point out that these errors have been mitigated through the procedure for readout error mitigation.Indeed, the experimental values of the average gate infidelities without error mitigation are huge (typically around 5 times larger than the values we plotted in Fig. 3).Moreover, the gates for state preparation and state measurement in the circuits for the full process tomography are only single-qubit gates, whose gate error is typically two orders of magnitude smaller than the two-qubit gate error [44]; for instance, the state preparation of |ee , which is the only one not involving two-qubit gates, is much better than the preparation of the subradiant or superradiant states.So, it may be reasonable to posit that the state preparation error in the characterization of the average gate infidelities is not as relevant as the error due to the application of the gate itself.

D. Scaling of the experimental infidelity
After computing the Choi matrices of the quantum gates we employ in the experiment via full process tomography and the figures of merit depicted in Fig. 3, we can try to employ them to better understand the noise we will face during an execution of the algorithm.A possible way to do so is to simulate a noisy version of the algorithm run on a "noisy circuit", which we construct by replacing all the CNOT gates of the ideal circuit for the MCM with their corresponding non-ideal quantum channels described by the Choi matrices we have estimated experimentally.Then, we can compare the state simulated on this noisy circuit with the experimental state that we have reconstructed through state tomography (the data for the state tomography have been resampled 100 times through bootstrapping and the error bars are within the markers in Fig. 4).Yet another noise model we may test is the standard noise model from the backend provided by IBM [44].This is built as follows: i) for each single-qubit gate we add a depolarizing channel with error rate equal to the IBM gate error, followed by a thermal relaxation through pure dephasing [15] with rate T 2 and through pure dissipation with rate T 1 (these values are available on the IBM website and typically, for ibmq guadalupe, they are of the order of 100 µs, while the qubit frequencies are of the order of 2π × 5GHz) ii) for each two-qubit gate we apply a two-qubit depolarizing channel with error rate equal to the IBM gate error given in Fig. 3, followed by a local thermal relaxation as for single-qubit gates.The IBM noise model takes into account single-qubit gate errors and the "natural" dissipation acting on each qubit, while the noise model based on the noisy circuit we reconstructed through the experimental Choi matrices focuses on the two-qubit gate errors only.
In Fig. 4 we compare the states simulated according to the noise models presented above with the ideal and experimental states.The solid violet line depicts the infidelity between the ideal and experimental state as a function of the number of CNOT gates in the protocol N g .The markers indicate the state at each step of the algorithm, from n = 0 (state preparation) to n = 5.Note that the number of gates to implement the nth step is higher than for the step n − 1, as explained in Appendix E. We can compare its scaling with the linear sum (as a function of N g ) of the average gate infidelities of all the CNOTs employed in the algorithm, taking into account their repetitions, and following the order in which they appear in the circuit (i.e., N g = 1 on the x axis denotes the first gate we employ in the algorithm, and so on).We plot this linear scaling for both the experimental average gate infidelity obtained through gate process tomography and the IBM gate error, which are displayed in Fig. 3.Note that these curves are not exactly straight lines, because the average gate fidelities of the gates employed in the algorithm are in general different.According to the discussion in Sec.II B, we may expect the experimental state infidelity to scale linearly when the noise source is mostly incoherent [92,108], at least during the first steps of the algorithm (as soon as the number of CNOT gates N g increases, it is reasonable to assume that the experimental errors will drive the state of the system towards some sort of mixed state, so that the experimental infidelity will saturate at a value that is not captured by the linear scaling in Fig. 4).We indeed observe a linear scaling, or also sublinear, as a function of the number of gates.The linear sum of the experimental average gate infidelities captures the first stages of the algorithm, but then overestimates the error, as we may expect due to some sort of saturation toward a mixed state.This may suggest that, in the platform we are considering, the average gate fidelity of composed channels [92] scales in a favorable way, which may also be sublinear.The scaling of the trace distance between experimental and ideal state as a function of N g is also linear or sublinear (we refer the interested reader to the discussion in Appendix G 1).
Let us now focus on the solid orange and yellow lines, which depict the infidelity between the experimental state and respectively the state simulated through the noisy circuit based on experimental Choi matrices and the state simulated through the IBM noise model.Remarkably, both error models provide us with a good prediction for the experimental state, as the infidelity between the latter and the simulated states never exceeds 0.1, and is usually lower than 0.05.For the noisy circuit noise model, the discrepancy between the experimental and simulated states may be explained through the presence of crosstalks or correlated measurement errors on the backend, i.e., the quantum channel associated with a gate when operated on its own may be different from the corresponding quantum channel when the same gate is applied during a more complex algorithm with multiple quantum operations on different qubits at the same time.Another reason for this discrepancy may be residual SPAM errors in the characterization of the Choi matrices of the gates.
Interestingly, both noise models have a quite similar performance, although the noise model based on experimental Choi matrices may outperform the IBM noise model in the presence of very noisy gates (see the second set of results in Appendix F 1).This means that a noise model that only addresses two-qubit errors can be as good as a more complete noise model if the CNOT errors are properly characterized, highlighting how the latter are the main source of noise in near-term computers.Finally, note that, despite the two noise models having a similar performance, as captured by the state infidelity between experimental and simulated states, the states they simulate may actually be quite different.Indeed, Fig. 5 shows the absolute value of the population difference between the experimental (results in Fig. 2) and simulated states for both noise models.We observe that one model outperforms the other for some timesteps and for different observables, but overall they display a simi-lar performance, despite predicting quite different quantum states.A possible improvement of our noise model may consist in including a depolarizing channel for each single-qubit gate, where the decay rate of the channel is obtained through randomized benchmarking (e.g., the IBM-provided error values can be used for this).We leave this possibility for future works.

VI. CONCLUDING REMARKS AND PERSPECTIVES
We have presented the first fully quantum digital simulation of dissipative collective effects on a quantum computer, and we have analyzed both theoretically and experimentally the impact of noisy gates on the algorithm.
State-of-the-art universal quantum computers do not allow for fault-tolerant computation yet, while they are subject to a considerable level of noise.However, it is remarkable that the algorithm we employed, namely the multipartite collision model (MCM) [31], simulates the superradiant and subradiant dynamics of two qubits colliding with a common ancilla with a good degree of accuracy (Fig. 2 and Fig. 8 in Appendix F), and that a noise model based on our experimental noise analysis is able to estimate with reasonable precision the distance between the ideal and experimental states (Fig. 4 and Fig. 10 in Appendix F 1).
More specifically, we have simulated on a quantum computer the enhanced decay of the two-qubit state prepared in (|eg + |ge )/ √ 2, which is a paradigmatic signature of superradiance, and the very slow decay of the state (|eg − |ge )/ √ 2, which displays the emergence of subradiance.As a benchmark for these collective phenomena, we have also simulated the two-qubit dynamics in the presence of local decay only, and the results follow the theoretical scaling.
We have stressed the importance of the rigorous analysis of the gate errors when dealing with current quantum devices, and here we have addressed this issue also theoretically.In Ref. [31], an error bound for the ideal multipartite collision model was derived using the 1 → 1 superoperator norm.This error is essentially due to the inevitable choice of a small but finite algorithm timestep ∆t, while the second-order Suzuki-Trotter decomposition of the MCM simulates the exact dynamics in the limit of infinitesimal timestep only.In Sec.IV, we have presented a refined error bound that takes into account the possibility of employing noisy gates in the practical implementation of the MCM.Specifically, Proposition 1 generalizes the bound of Ref. [31] by making use of the diamond norm, which is a more precise norm for the distance between quantum channels and, crucially, is employed in the estimation of a rigorous error threshold for quantum fault-tolerant computation.Moreover, Proposition 2 estimates an upper bound based on the diamond norm for the error we are incurring by using imperfect gates.While the quantum map associated with the MCM acts on the state of the system only, Proposition 2 expresses the error bound by taking into account the action of each gate on both the system and the ancillary qubits, and decomposes it into the sum of the individual errors of each quantum gate.
On the experimental side, the results of the noise analysis are depicted in Figs. 3 and 9 in Appendix F 1 for the average gate infidelity and the incoherence, while Fig. 14 in Appendix G 2 shows the experimental diamond distance between the employed CNOTs and their ideal counterparts.To obtain these figures of merit, we have performed the full process tomography of all the CNOT gates employed in the algorithm.To reduce the SPAM errors that are the major drawback of two-qubit process tomography, we have applied readout error mitigation.However, the reader must be aware that residual SPAM errors may be present in the results of our experimental noise analysis.Then, we have compared these results with the gate errors provided by IBM, which are obtained through randomized benchmarking.
Our findings indicate that the ratio between incoherence and average gate infidelity is almost always larger than 0.9, therefore the major source of error is dissipative, i.e., the "coherence of noise" [102] is low.We have also observed that the experimental average gate infidelity computed through full process tomography can sometimes remarkably differ from the IBM gate error (see Figs. 3 and 9 in Appendix F 1).Moreover, the scaling of the infidelity between exact and simulated states as a function of the number of gates is linear or sublinear (see Figs. 4 and 10), i.e., much better than the worst-case scenario with quadratic scaling.This may be due to the fact that the major source of errors is dissipative and not coherent.Finally, the diamond errors of the CNOT gates we have obtained are of the order of 5 × 10 −2 ÷ 10 −1 , therefore, if we put these numbers into the expression for the theoretical bound in Proposition 2, the latter exceeds 1 after a few timesteps.Furthermore, these values suggest that the near-term devices we have employed are still orders of magnitude away from the strictest quantum fault-tolerant thresholds, at least if we assume, as it may be reasonable, that SPAM errors do not affect the diamond distances by several orders of magnitudes.
In addition, we have employed the experimental results of the noise analysis to build a noise model that (classically) simulates the state of the collision model at timestep n by replacing all the CNOT gates in the algorithm with the noisy channels obtained through process tomography.We have found that this noise model predicts the experimental state with a reasonable accuracy (the infidelity between experimental and simulated state is typically lower than 0.1, while the one between experimental and ideal state is between five and ten times higher, see Fig. 4).Moreover, we have compared it with the built-in noise model provided by IBM that considers also single-qubit gates and qubit relaxation, and discovered that their performance is quite similar.However, our noise model may outperform the IBM one in the pres-ence of very noisy gates (see Fig. 10 in Appendix F 1), and this might also suggest that the high values of average gate infidelities displayed in Fig. 9 in Appendix F 1 are more informative than the more optimistic IBM gate errors.
A crucial issue of open system simulation via interactions with ancillary qubits is the need for a new ancilla at each timestep.In this study, we have employed a train of ancillas that are swapped after each collision, so that a single ancillary qubit is finally interacting with the system qubits (see the discussion in Appendix E).A different solution consists in employing a reset gate to refresh the state of a single ancillary qubit at every timestep.However, our preliminary results based on the reset gate have shown a very quick emergence of decoherence (Fig. 12 in Appendix F).Reducing the gate time of the reset gate would therefore be a remarkable improvement for any simulation algorithm making use of ancillas.If this is not possible, enriching the topology of the near-term devices will crucially reduce the number of swap gates necessary to refresh the state of the ancilla for the nth collision.Yet another possibility for enhancing the accuracy of the quantum simulation would be working at the pulse level of the near-term devices [127].This would basically correspond to performing an analog quantum simulation instead of a digital one, and has very recently led to improved results on the quantum simulation of many-body unitary systems [128].
To conclude, our experimental outcomes highlight how near-term quantum computers, even if far from being ideal, can already give useful and meaningful results on the simulation of collective quantum dynamics.Our findings are a proof-of-principle demonstration of the potential of the MCM algorithm for the exploration of topical and groundbreaking phenomena such as dissipative quantum phase transitions, quantum synchronization, and dissipative time crystals.Finally, we have demonstrated that gate process tomography, despite its potential bias due to residual SPAM errors, can give valuable information about the noise features of near-term quantum computers and about the accuracy of the experimentally simulated state.We have shown how this can help us to understand the device limitations, and hence to engineer possible countermeasures.
4. Follow the same procedure for each ancilla k = 1, . . ., J, and insert each gate sequence in a total unitary operator, where their order of execution does not matter: 5. Introduce a sequence of gates that simulate the effective Hamiltonian H S , which is a free system Hamiltonian, during the timestep ∆t: with U S (∆t) = exp[−iH S ∆t].Note that the latter step amounts to simulating the closed-system dynamics driven by the Hamiltonian H S , which is a well-known task in quantum computing since the seminal paper by Lloyd [3].
6. Initialize the ancillary qubits in the ground state expressed by ρ E = J k=1 |0 k 0|.Initialize the system qubits in the initial state of the open dynamics ρ S (0), as introduced in Eq. ( 16).
7. Implement a single timestep ∆t of the algorithm by making the system and ancillary qubits evolve through the sequence of gates contained in the operator U sim (∆t).Then, to obtain the information on the state of the system only, trace out the degrees of freedom of the ancillas (the environment): φ ∆t is the quantum map associated with a single timestep of the multipartite collision model.8. To implement n timesteps of the algorithm, apply the quantum map φ ∆t n times.That is, repeat the sequence of gates contained in U sim (∆t) for n times using the same system qubits and a new set of fresh ancillas, initialized in the ground state ρ E , for each timestep.The simulation of the dynamics until time t requires n = t/∆t repetitions, where the timestep of the algorithm ∆t should be chosen as small as the experimental conditions allow for.
A straightforward calculation yields L[ρ sub ] = 0, therefore i.e., ρ sub is a steady state of the dynamics and lives in a decoherence-free subspace.
As for the superradiant state, we observe: where ρ gg = |gg gg| is the ground state, which is stationary because the master equation ( 23) is at zero temperature.Therefore, the operator ρ sup − ρ gg is an eigenvector of the Liouvillian with eigenvalue −2γ.We finally obtain: exp Using the above result, we immediately find the formula for the intensity of the superradiant emission in Eq. ( 25).
Computing the evolution of ρ ee is slightly more involved.First of all, we calculate: Then, we can merge the results of the above equation and Eq.(C2) into a single system of linear differential equations written as: where One way to tackle this problem is to solve each differential equation starting from the trivial one for the steady state, and then to insert this solution into the following equation, which will now be independent and affine, and so on.Another instructive (and more general) way to find the solution of the dynamics is to compute v(t) = exp(M t)v(0) by getting the Jordan-Chevalley decomposition of M [129].Indeed, M has the eigenvalue 0 with eigenvector (1, 1, 1) T , and the eigenvalue −2 with multiplicity 2 and a one-dimensional eigenspace spanned by (0, 0, 1) T .That is, M is not diagonalizable.So, we need to obtain its Jordan decomposition as M = P J M P −1 , with: Then, the Jordan-Chevalley decomposition is trivially expressed by J M = D M + N M , where D M is the diagonal matrix with the same elements of J M on the diagonal, while N M is a matrix whose only non-zero element is the off-diagonal 1 in J M .N M is nilpotent (N 2 M = 0), and [N M , D M ] = 0.Then, after some simple matrix algebra we find e M t = e P J M P −1 t = P e D M t e N M t P −1 Finally, we obtain the evolution of ρ ee : Once again, the enhanced decay rate 2γ is a signature of collective effects in the dynamics.
Let us now compute the local incoherent evolution of ρ sub .The Liouvillian driving this type of dynamics is: where, for simplicity, we have taken the same decay rate γ for both qubits.Straightforwardly, As expected, the subradiant state decays toward the ground with the standard incoherent rate γ.
Appendix D: Proofs of the results in Sec.IV 1. Diamond distance ♦ s for the ideal case We want to prove the bound To check its validity, note that the bounds we need to estimate are, according to the Supplemental Material (SM) of Ref. [31], D2) where R c is a remainder of the expansion of the MCM unitaries (see the original paper [31] for details).For our purposes, it is sufficient to assume that it is a polynomial function of operators.
Let us first show that the bound on the 1 → 1 norm of the Liouvillian found is valid also for the diamond norm.This result is also stated in Ref. [130] (SM).The crucial point is that, despite the 1 → 1 having the undesirable property of not scaling well in the presence of a tensor product, by means of the Hölder's inequality introduced in Sec.II, the error bound can be written as a function of infinity norms only, which are stable.Therefore, estimating the same bound with the diamond norm leads to the same equations.Let us show this.
In the following, I A will be the identity superoperator on a copy of the space of the bounded operators on the Hilbert space of the system B(H), according to the definition of the diamond norm in Eq. (3).We evaluate: under the assumption that ρ 1 = 1.We have used the Hölder's inequality and the multiplicativity of the infinity norm.The above bound is the same as for the 1 → 1 norm.Indeed, evaluating everything on a basis of the extended Hilbert space we observe: and we can write ρ as Therefore, showing that Eq. (D3) is correct.This is basically due to the linearity of quantum channels.
Equivalently, for the remainder: Therefore, when evaluating the trace norm of the above expression, we can still find a bound that only depends on U (∆t) ⊗ I A ∞ , that is, on U (∆t) ∞ .This is the same bound found in Ref. [31], expressed in Eq. ( 22) of the main text.

Diamond distance *
m between ideal and noisy MCM map The estimation of an upper bound for * m can be performed as follows: Note that we are employing the subscript "A" to denote states and operators that belong or act on a copy of the Hilbert space of the system, while "B" denotes states and operators that belong or act on a copy of the Hilbert space of the ancillary qubits of the collision model, according to the definition of the diamond norm in Eq. ( 3).In contrast, the subscripts "S" and "E" indicate respectively the "original" Hilbert space of the system and the Hilbert space of the ancillary qubits.To obtain the final result in Eq. (D8), we have employed the fact that , the sub-multiplicativity of the diamond norm, as in Eq. ( 5), and the fact that the diamond norm defined in Eq. ( 3) with the tensor product of the identity operator over a copy of the Hilbert space (in our case of system + ancillary qubits of the MCM) is the maximal one [76].
Finally, note that the diamond distance between the map G i and the identity does not assume that the ancilla is ideally initialized in the ground state.So, Eq. (D8) is valid also for scenarios with different choices of initial states.Moreover, if the quantum channel describing the noise for the state preparation on the backend is not known, one may tighten the bound by minimizing G i − I E ♦ over all the channels G i that map the ideal initial state of the ancilla into the observed noisy state.
Appendix E: Experimental scheme and methods

Topology of the backend
The IBM quantum computers are often recalibrated to optimize their performance.That is, their qubit and gate parameters are often modified at the level of the hardware.As a consequence, the gate and readout errors change after every calibration.For our purposes, we want to run the above-listed protocols with a fixed set of experimental parameters.So, we need to employ a nearterm device with a sufficient time interval between two calibration procedures (running all the necessary protocols listed in Sec.V takes roughly 3 hours).The 16-qubits backend ibmq guadalupe is very stable in this sense (it is usually re-calibrated once per day), and so we have chosen it for running our algorithm.
When we write the (Qiskit) code to run a quantum algorithm on a near-term device, we must always take into FIG.6: Topology of ibmq guadalupe with 16 qubits.
The links denote the possibility of implementing a direct two-qubit gate between a pair of qubits.The red (green) qubits are employed as system (ancillary) qubits in the simulation of dissipative collective effects.
account the topology of the latter, because it constraints the operations we can actually perform on the platform: only CNOTs between nearby qubits can be directly implemented.The topology of ibmq guadalupe is depicted in Fig. 6.
For the experiments presented in Sec.V, we have used the following sets of qubits on ibmq guadalupe: the system qubits are (0, 2).The ancillary qubits for the collective dynamics are (1,4,7,10,12).We have employed the same set of ancillary qubits for the local dissipative dynamics on the system qubit 0, while the ancillas for the local decay of the system qubit 2 are (3,5,8,11,14).
Note that the ancillary qubit 1 (and 3 for the local decay of the system qubit 2) is the only one that is directly linked to both the system qubits.So, we are only allowed to implement the CNOTs 0-1, 1-0, 1-2 and 2-1 to evolve the state of the system.To make the system qubits interact with the remaining ancillas, we need to swap the states of the ancillary qubit 1 with the one of the ancillary qubit 4, then 4 with 7 and again 1 with 4, then 7 with 10, and so on.This requires an additional number of CNOT gates (three per each swap) that increases at each timestep.Indeed, given a train of ancillary qubits from 1 (which is directly linked to the system qubits) to n, we need exactly n−1 swaps to make the n-th ancilla interact with the system.This is a well-known issue of near-term devices with limited connectivity [36].In the next subsection, we will show how we have been able to optimize the number of necessary gates by employing, under certain circumstances, two CNOTs only to perform a single swap.
Possible ways to tackle this problem in the near future may be: i) Adding more qubit-qubit links to the topology of the backend, so as to reduce the number of swaps necessary to connect a distant ancilla with the system qubits; ii) Implementing a fast and efficient reset gate to be applied on the closest ancillary qubit (qubit 1 in Fig. 6) at every timestep.In fact, by employing a reset gate, we would need a single ancillary qubit only to be re-initialized on the ground state after each collision, or equivalently, thinking of more complex physical problems, one ancillary qubit for each Lindblad operator in the dissipator in Eq. ( 18) [31], according to the discussion in Sec.III.The reset gate is already available on ibmq guadalupe.However, it is a very slow gate and decoherence rapidly emerges after a couple of applications thereof, as we will show in Appendix F. Therefore, improvements on the reset gate time would be extremely beneficial for the quantum simulation of open systems via MCM.

Algorithm implementation and optimization
The circuit scheme to implement the algorithm discussed in Sec.III B on ibmq guadalupe is shown in Fig. 7.The unitary interaction between a single system qubit and the ancilla, i.e., Eq. (B1) of the MCM, or equivalently U (i) (∆t) in Eq. ( 27) for the simulation of super-and subradiance, is depicted in Fig. 7(a).To implement it on the hardware, two CNOT gates and some single-qubit rotations are required.Since the gate error of single-qubit rotations is one or two orders of magnitude lower than the CNOT error [44] (typically, the infidelity of 1-qubit gates is of the order of 10 −4 ), the CNOTs in the circuit bring the largest contribution to the overall error of the algorithm.Fig. 7(b) represents a single collision between the system qubits and the ancilla, which is composed of the three applications of the gates described in Fig. 7(a), according to the second-order Suzuki-Trotter decomposition of the MCM in Eq. (B2).Finally, Fig. 7(c) shows the implementation of the algorithm on ibmq guadalupe, starting from the subradiant state and up to n = 4 collisions.The preparation of the subradiant and superradiant states requires one CNOT and some single-qubit gates [44,45].In addition, due to the topology of the backend in Fig. 6, there is no direct link between the system qubits 0 and 2. Therefore, we need to employ the ancillary qubit 1 to prepare a Bell state between qubits 0 and 1, and then to swap the state of the qubit 1 with the one of qubit 2. In contrast, the preparation of |ee only needs two local X gates, and is therefore way less noisy.
As shown in Fig. 7(c), after the state preparation we are ready to implement a single timestep of the MCM by applying the routine in Fig. 7(b) to the system qubits and the ancilla.To simulate further collisions, we need to reinitialize the state of the ancillary qubit 1 by swapping it with the fresh ancillas in the train (4,7,10,12).We finally measure the state of the system qubits to reconstruct the statistics of the outcomes after the nth collision.
The state preparation, the interactions for every timestep and the swaps in the train of ancillas require a  27) on the IBM quantum computer, applied on a system qubit s 0 and an ancillary qubit e 0 .The gates U 3 are single-qubit rotations defined by the three Euler angles given in the figure, and are elementary gates available on the IBM quantum computers [44].Two CNOT gates are also necessary to implement the system-ancilla interaction.(b): Scheme of a single collision of the MCM with ∆t = 0.1 on the IBM quantum computer.The gates U (1) (0.1) and U (2) (0.1/2) are defined in Eq. ( 27), and their decomposition into elementary gates is given in subfigure (a).(c): Experimental circuit scheme to implement four collisions of the MCM on ibmq guadalupe, starting from the subradiant state.All the qubits are initialized in the ground state, and the system qubits are then prepared in the subradiant state using the environment qubit 1 as an ancilla (there is no direct interaction between them).After each collision, as given by the decomposition in subfigure (b), a new fresh state of the ancillary qubit 1 is prepared by swapping the latter with the remaining ancillas of the train.
large number of noisy CNOT gates.However, we can optimize this number by noticing that either the first or the last CNOT of the usual three-CNOT constructed swap gate is the identity gate when one of the qubits is in the ground state: if C 01 is the CNOT with qubit 0 as control and qubit 1 as target, while S 01 is the swap gate between these qubits, then S 01 |0 ⊗ |ψ = C 01 C 10 |0 ⊗ |ψ .That is, we need two CNOTs only to implement the swap gate if one of the qubits starts in the ground state.Ideally, all the swap gates in the algorithm act at least on one qubit in the ground state.However, we have verified that the 2-CNOT swap is reliable only when the qubit in |0 is a fresh ancilla that has not been manipulated before.
Otherwise, the error coming from the fact that this qubit is not exactly in |0 (due to noise and imperfections in the algorithm implementation) jeopardizes the advantage of using one CNOT less.As a consequence, we can remove only 5 CNOT gates from the actual protocol (one for each ancillary qubit).Finally, note that employing a reset gate would drastically reduce the number of two-qubit gates in the algorithm.Indeed, the train of ancillas at the nth collision requires (n − 1) × (3n − 1) CNOTs.

Methods to estimate the figures of merit
The outcome probabilities displayed in Figs. 2 and 8 have been obtained through standard projective measurements in the computational basis, which are available on the IBM quantum computer, as in the circuit scheme of Fig. 7(c).The results have been computed as averages over 37 realizations of the protocol, and each realization had 8192 shots (i.e., repetitions of the algorithm).We have found that the standard deviation over the 37 realizations always leads to small error bars that are within the markers shown in the plots.
The state tomography after each collision (Figs. 4 and 10) and the process tomography of the CNOT gates (Figs. 3 and 9) have been computed by running the proper tomographic circuits on the back- end, which are available on qiskit.ignis[44].
Then, the results have been fitted to reconstruct either the density matrix of the system through the Qiskit class StateTomographyFitter, or the Choi matrix of the process through the Qiskit class ProcessTomographyFitter.The results have then been resampled 100 times via bootstrapping, after which we have computed the standard deviations we used for the error bars.
The trace distance and the fidelity between ideal and simulated states at each timestep have been computed through the built-in functions in Qiskit and Qutip [132].Starting from the Choi matrix of the ideal and simulated gates, the diamond distance and the average gate fidelity can be evaluated through the Qiskit functions average gate fidelity and diamond norm.In particular, the latter makes use of the semidefinite program developed in Ref. [100].Finally, we have computed the unitarity of each gate by employing both definitions in Eqs. ( 6) and (7).In particular, we have introduced two new functions in Qiskit to obtain these values starting from the Choi matrix of the process.We have checked that the values computed according to each definition coincide.
Readout error mitigation has been performed for all the results shown in the paper.The calibration filters have been obtained through the class CompleteMeasFitter and the function complete meas cal in Qiskit.
Appendix F: Further results

Detrimental effects of noise
In this section, we will focus on the second set of results on the implementation of the MCM on ibmq guadalupe.We will observe how a pair of very noisy gates can jeopardize the simulation of collective effects on near-term devices.
The outcomes of the measurements in the computational basis at each step of the algorithm are shown in Fig. 8.It is quite evident that the results are very noisy, and almost no signature of collective effects can be extrapolated from them.In particular, the experimental evolution of the subradiant state following the collective dynamics (light blue markers) shows no slower decay than in the case of local dissipation only (solid black lines).The superradiant state (magenta markers) does display a fast (although not always accurate) decay dur-

IBM gate error Average gate infidelity Incoherence
FIG. 9: Error analysis of the set of CNOT gates in ibmq guadalupe employed for the simulation of the second set of results in Fig. 8.The tick 0-1 on the x axis corresponds to the CNOT gate where "qubit 0" in ibmq guadalupe is the control qubit and "qubit 1" is the target.Lower plot: gate error provide by IBM (magenta), experimental average gate infidelity r(U g , T ) via full process tomography (light blue), and incoherence ω(T ) (yellow), as defined in Eq. ( 9).The error bars on the values of r(U g , T ) and ω(T ) are the standard deviations of 100 realizations of a random sample of the experimental data via bootstrapping.Upper plot: ratio between the incoherence and the experimental average gate infidelity of each CNOT gate.ing the first two collisions, but decoherence gains the upper hand starting from the third one, and the dynamics stabilizes at a wrong value.The concavity of the evolution of |ee (orange markers) is often erroneous as well.The gate analysis in Fig. 9 may shed light on the origin of these noisy results.We have found a very large value of the experimental average gate infidelities of the CNOTs 1-2 and 2-1, which are repetitively employed in the algorithm, as explained in Appendix E. The corresponding IBM values are not as high.These gates may be responsible for the noisy dynamics in Fig. 8. Indeed, we can once again compare the experimental state with the states simulated through the "noisy circuit" noise model and the IBM noise model introduced in Sec.V D. This is done in Fig. 10.We observe that, especially for the subradiant dynamics, the noise model based on the noisy circuit built through the experimental Choi matrices predicts a slightly better state than the IBM noise model.We have found that this is due to the fact that the IBM noise model is too optimistic, as it makes use of a relatively not-so-noisy CNOT gates 1-2 and 2-1.So, in the presence of some very noisy gates the full process tomography may actually be an improved tool to understand the errors in the quantum simulation.
It is worth stressing that the gate analysis for the second set of results confirms the noise properties we previously found for the first set.In particular, the source of noise is mostly incoherent, since the ratio between incoherence and gate infidelity is almost always larger than 0.9, as depicted in the upper plot of Fig. 9.Moreover, the scaling of the state infidelity as a function of the channel length is again linear or sublinear.Finally, the diamond distance is of the same order of magnitude as for the first set of results, as discussed in Appendix G 2.

Subradiant dynamics on ibmq toronto
The validity of the experimental results presented in Sec.V is not restricted to ibmq guadalupe, or to a specific choice of system and ancillary qubits.Indeed, in Fig. 11 we show the experimental outcomes for the MCM implemented on ibmq toronto, which is a different IBM backend with 27 qubits.We have run the MCM to simulate the collective subradiant dynamics starting from different pairs of system qubits, each of which was connected to a suitable train of ancillas.The findings plotted in Fig. 11 confirm the emergence of collective effects in the quantum simulation, as discussed in Sec.V and Fig. 2. Once again, the gate errors break the decoherence-free subspace of the ideal dynamics, but still, comparing Fig. 11 with Fig. 2, we observe a slower decay of the subradiant state than in the scenario with local dissipation only.In some cases, as for the system pair (9,11) (we refer the reader to the Qiskit documentation for the topology of the backends [44]), the subradiance on ibmq toronto is actually enhanced with respect to the best results obtained on ibmq guadalupe.
Due to the fact that ibmq toronto is re-calibrated much more often than ibmq guadalupe, it has not been possible to perform the process tomographies necessary for the gate analysis we introduced in Sec.V.However, the crucial readout error mitigation has been correctly applied to the results in Fig. 11.

Refreshing the ancilla through the reset gate
As discussed in Appendix E, a possible way to optimize the necessary resources for the quantum simulation of the MCM is represented by the reset gate available on some IBM computers [44], which re-initializes the state of the target qubit in the ground state |0 .With a reset gate at our disposal, we need a single ancillary qubit (two ancillary qubits in the case of the local decay) to simulate collective effects, since we can re-initialize it after each collision and avoid using a whole train of ancillas, which must be swapped.It goes without saying that this would represent a huge improvement in the number of necessary qubits and gates.Moreover, the ability to reset qubits in parallel with unitary gates would be beneficial for quantum error correction schemes, as it would provide a mechanism for flushing entropy out of system qubits.
We have run a protocol based on the MCM with the reset gate on ibmq guadalupe, and the results are shown in Fig. 12.The of using always the same ancilla allows us to implement more collisions (here, we have chosen n = 10).However, the accuracy of the simulation is clearly much worse than for the MCM via trains of ancillas.Indeed, after one or two collisions, decoherence arises and the state of the system qubits reaches what looks like a thermal stationary state with no coherences at all.This is due to the fact that the running time of the reset gate is much longer than the one of standard single-and two-qubit gates [44] (the reset gate length at the time of the experiment was around 7.34 µs, while the T 1 time of the ancillary qubit was around 80.0 µs, so 10 applications of the reset gate would already lead to complete dissipation).That is, implementing a dissipative channel on the quantum platform takes more time than performing coherent operations.Therefore, employing the reset gate even just a couple of times is enough for decoherence to occur.This being said, we point out that no dynamical decoupling scheme [133] has been performed during the application of the long-duration reset gate.Dynamical decoupling may therefore improve the results in Fig. 12.
In conclusion, at present (the results in Fig. 12 have been obtained during December 2021), the reset gate is not a feasible solution to reduce the number of necessary resources for the quantum simulation of the MCM.Similar results have been obtained on different IBM computers, including ibmq toronto and ibmq mumbai.
Appendix G: Additional material on the noise analysis 1. Trace distance between experimental, ideal and simulated states Fig. 13 shows the trace distance between the experimental state and the ideal state of the algorithm and between the experimental state and the states simulated through the two noise models introduced in Sec.V D, as a function of the number of gates N g and for each timestep of the algorithm (markers).The first row refers to the first set of results in Fig. 2, while the second one to the second set of results in Fig. 8.We observe that for the first set of results the trace distances between experimental and simulated states have a similar behavior for the two different noise models; the IBM noise models outperforms the noisy circuit one for the subradiant dynamics, but it has a worse behavior for the superradiant and |ee initial states.In contrast, the noisy circuit noise model is always better than the IBM noise model for the second set of results.Anyway, both noise models provide a good prediction for the experimental state of the dynamics.The trace distance has a larger value than the infidelity between the same states, depicted in Figs. 4 and 10, because of the square root dependence between these quantites, expressed by Eq. (10).The scaling of the trace distance between experimental and ideal states is linear or sublinear as a function of the number of gates.Finally, in Fig. 13 we also plot the quantity ideal (n) given by Eq. (28), that is, the trace distance between the ideal state of the multipartite collision model at the nth timestep and the "physical" state ρ S (n∆t) generated by the master equation (23).As we may also deduce from the inset in Fig. 1(a), the ratio between the ideal error ideal (n) and the "experimental error" due to noisy gates (solid violet line) is very small (around 10 −2 ) apart from the case where the initial state is |ee .In this case, the ratio is around 10 −1 , so that the experimental errors are still playing a major role in the discrepancy between the physical quantum dynamics we aim to simulate (ρ S (n∆t)) and its experimental implementation via MCM.

Diamond distance of the CNOT gates
In Fig. 14 we show the diamond distance between ideal and experimental gate for the CNOTs employed in the algorithm.The distances have been computed by reconstructing the Choi matrices of the experimental gates via process tomography, and the experimental values have been resampled 100 times via bootstrapping to estimate the standard deviations of the samples (error bars in Fig. 14).We observe that the experimental diamond distances for the noisy CNOTs are mostly between 5 × 10 −2 and 10 −1 .As the threshold values for fault- tolerant quantum computation against general noise are tipically ≈ 10 −3 -10 −4 [93,95,[114][115][116], our experimental results seem to indicate that current near-terms quantum computers are still orders of magnitude away from these thresholds.Note, however, that the values in Fig. 14 may still be biased due to residual SPAM errors [77], which anyway have been mitigated as described in Sec.V C. The values of the diamond distance in Fig. 14 are interesting for our purposes also because they are the diamond distances appearing in the bound in Proposition 2, Eq. ( 33), when we restrict ourselves to the errors on twoqubit gates only.This makes sense because, as stated before, the errors on single-qubit gates are typically around a couple of orders of magnitude smaller than on two-qubit gates.Moreover, the state preparation of the ancillas in the ground state at the beginning of the algorithm is very accurate on IBM quantum computers, whereas the swap gates to bring the new ancillas in contact with the system qubits are taken into account by the distances in the plot.Therefore, we can use the values in Fig. 14 to estimate the upper bound on the simulation error of the MCM due to its noisy implementation via imperfect gates.As the diamond errors are quite large and we are using tens of CNOT gates in the algorithm, the bound in Eq. ( 33) quickly exceeds 1.
For completeness, in Fig. 14 we plot also the upper bound on the diamond distance based on the infidelity and unitary, as given by Eq. ( 14).We observe that the actual value of the diamond distance is always quite far from this upper bound (typically between three and five times smaller). .The tick 0-1 on the x axis corresponds to the CNOT gate where "qubit 0" in ibmq guadalupe is the control qubit and "qubit 1" is the target.We plot the diamond distance between ideal and noisy CNOT according to Eq. (4) (violet), and the corresponding upper bound based on the average gate fidelity and unitarity (orange), as defined in Eq. ( 14).The error bars in the plots are given by the standard deviations of 100 realizations of a random sample of the experimental data via bootstrapping.

FIG. 1 :
FIG. 1: (a): Emission power as a function of time, when the initial state is the subradiant state (dotted light blue line), superradiant state (dash-dotted magenta line), |ee (dashed orange line), and when the dynamics is local starting from the subradiant state (solid black line).Inset in (a): Trace distance between real and simulated state ideal (n) as a function of the number of timesteps n, with the same initial states as in the main figure.(b): Upper bound B 1→1 for the single-step error s as a function of the timestep ∆t.Inset in (b): Upper bound for the global error g with ∆t = 0.1, equal to n times B 1→1 , where n is the number of timesteps.

FIG. 2 :
FIG.2: Probability of finding the states |gg , |ge , |eg , |ee through a projective measurement in the computational basis as a function of the number of steps in the MCM (first set of results), when the initial state is the subradiant state (light blue), superradiant state (magenta), |ee (orange), and when the dynamics is local starting from the subradiant state (black).Solid lines: theoretical prediction based on the master equation.Markers of the dash-dotted lines: experimental values.The results have been obtained as averages over 37 realizations of the protocol, and the error bars are within the markers.

FIG. 4 :
FIG.4: Scaling of the error as a function of the number of CNOT gates in the algorithm N g during the simulation of the collective dynamics displayed in Fig.2.The markers indicate the steps of the collision model.Dotted light blue line and dotted magenta line: linear scaling of respectively the IBM gate error and the experimental average gate infidelity, given in Fig.3.Solid violet line: infidelity between the experimental state and the ideal one.The errors bars for the latter quantity have been obtained as the standard deviations of 100 realizations of a random sample of the state tomography data via bootstrapping, and they are within the markers.Solid orange line: infidelity between the experimental state and the state simulated on the noisy circuit.Solid yellow line: infidelity between the experimental state and the state simulated via the IBM noise model.

FIG. 7 :
FIG.7:(a): Decomposition of the gate U(1) (0.1) defined in Eq. (27) on the IBM quantum computer, applied on a system qubit s 0 and an ancillary qubit e 0 .The gates U 3 are single-qubit rotations defined by the three Euler angles given in the figure, and are elementary gates available on the IBM quantum computers[44].Two CNOT gates are also necessary to implement the system-ancilla interaction.(b): Scheme of a single collision of the MCM with ∆t = 0.1 on the IBM quantum computer.The gates U (1) (0.1) and U (2) (0.1/2) are defined in Eq. (27), and their decomposition into elementary gates is given in subfigure (a).(c): Experimental circuit scheme to implement four collisions of the MCM on ibmq guadalupe, starting from the subradiant state.All the qubits are initialized in the ground state, and the system qubits are then prepared in the subradiant state using the environment qubit 1 as an ancilla (there is no direct interaction between them).After each collision, as given by the decomposition in subfigure (b), a new fresh state of the ancillary qubit 1 is prepared by swapping the latter with the remaining ancillas of the train.

FIG. 8 :
FIG.8: Probability of finding the states |gg , |ge , |eg , |ee through a projective measurement in the computational basis as a function of the number of steps in the MCM (second set of results), when the initial state is the subradiant state (light blue), superradiant state (magenta), |ee (orange), and when the dynamics is local starting from the subradiant state (black).Solid lines: theoretical prediction based on the master equation.Markers of the dash-dotted lines: experimental values.The results have been obtained as averages over 37 realizations of the protocol, and the error bars are within the markers.

FIG. 10 :
FIG.10: Scaling of the error as a function of the number of CNOT gates in the algorithm N g during the simulation of the collective dynamics displayed in Fig.8.The markers indicate the steps of the collision model.Dotted light blue line and dotted magenta line: linear scaling of respectively the IBM gate error and the experimental average gate infidelity, given in Fig.9.Solid violet line: infidelity between the experimental state and the ideal one.The errors bars for the latter quantity have been obtained as the standard deviations of 100 realizations of a random sample of the state tomography data via bootstrapping, and they are within the markers.Solid orange line: infidelity between the experimental state and the state simulated on the noisy circuit.Solid yellow line: infidelity between the experimental state and the state simulated via the IBM noise model.

FIG. 11 :FIG. 12 :
FIG.11: Probability of finding the states |gg , |ge , |eg , |ee (from left to right) through a projective measurement in the computational basis as a function of the number of steps in the MCM on ibmq toronto, when the initial state is the subradiant state.Solid black lines: theoretical prediction based on the master equation.Markers of the dash-dotted lines: experimental values.Different colors denote different initial pairs of system qubits in the topology of ibmq toronto, each of which interacts with a corresponding train of ancillas.

FIG. 13 :
FIG.13: Trace distance between the experimental state and the ideal state of the algorithm (solid violet line), the simulated state through the noisy circuit (solid orange line) and the simulated state through the IBM noise model (solid yellow line), as a function of the number of gates in the algorithm N g , and for the first set of results in Fig.2(first row) and for the second set of results in Fig.8(second row).The markers indicate the steps of the collision model.The dotted light blue line depicts the trace distance ideal (n) between the state of the multipartite collision model at each timestep of the dynamics and the ideal physical state driven by the master equation, according to Eq. (28).

FIG. 14 :
FIG.14: Error analysis of the set of CNOT gates in ibmq guadalupe employed for the simulation of the first set of results in Fig.2(upper plot) and for the second set of results in Fig.8(lower plot)