Witnessing latent time correlations with a single quantum particle

When a noisy communication channel is used multiple times, the errors occurring at different times generally exhibit correlations. Classically, these correlations do not affect the evolution of individual particles: a single classical particle can only traverse the channel at a definite moment of time, and its evolution is insensitive to the correlations between subsequent uses of the channel. In stark contrast, here we show that a single quantum particle can sense the correlations between multiple uses of a channel at different moments of time. In an extreme example, we show that a channel that outputs white noise when the particle is sent at a definite time can exhibit correlations that enable a perfect transmission of classical bits when the particle is sent at a superposition of two times. In contrast, we show that, in the lack of correlations, a single particle sent at a superposition of two times undergoes an effective channel with classical capacity of at most 0.16 bits. When multiple transmission lines are available, time correlations can be used to simulate the application of quantum channels in a coherent superposition of alternative causal orders, and even to provide communication advantages that are not accessible through the superposition of causal orders.


I. INTRODUCTION
Quantum communication enables new possibilities that were unthinkable in the classical world, notably including secure key distribution [1,2]. The main hurdle to the implementation of quantum communication, however, is the fragility of quantum states to noise. To tackle this problem, quantum error correction schemes encode information into multiple quantum particles, using redundancy to mitigate the effects of noise [3][4][5].
When the same communication channel is used multiple times, the noisy processes experienced by particles sent at different times are generally correlated [6][7][8][9]. For example, photons transmitted through an optical fibre are subject to random changes in their polarisation [10], and since such changes happen on a finite timescale, photons sent at nearby times experience approximately the same noisy processes. A similar situation arises in satellite quantum communication, where the satellite's motion induces dynamical mismatches of reference frame with respect to the ground station [11].
The presence of correlations is both a threat and an opportunity for communication. On the one hand, it can undermine the effectiveness of standard error correcting schemes, which assume independent errors on the transmitted particles. On the other hand, tailored codes that exploit the correlations among different particles can enhance the transmission of information [6,8,12][13][14][15][16][17][18][19][20][21][22][23][24][25][26].
Like most error correcting schemes, the existing codes for correlated noise use multiple physical particles to encode a single logical message. Classically, the use of multiple particles is essential: since a single classical particle can only traverse a communication channel at a definite moment of time, correlations between different uses of the channel do not affect the particle's evolution. The same conclusion holds even if the moment of transmission is chosen at random: in this case, the resulting evolution is simply the average of the evolutions associated to each individual moment of time, and the overall evolution is independent of the time correlations.
* giulio.chiribella@cs.ox.ac.uk
In stark contrast, here we show that a single quantum particle can sense the correlations between multiple uses of the same quantum communication channel. At the fundamental level, this effect is made possible by the ability of quantum particles to experience a coherent superposition of multiple time-evolutions [27][28][29][30][31][32][33][34]. In particular, we will consider the situation in which the particle is in a superposition of travelling at different moments of time, as illustrated in Figure 1. Taking advantage of the time correlations in the noise, we show that it is possible to enhance the amount of information that a single particle can carry from a sender to a receiver, beating the ultimate limit achievable in the lack of correlations.
We demonstrate this effect with an extreme example, in which a single quantum particle carries one bit of classical information through a transmission line that completely erases information at every definite time step. This phenomenon witnesses the presence of correlations between different uses of the transmission line: in the lack of correlations, we show that the number of bits that can be reliably transmitted by sending a single particle at a superposition of two different times does not exceed 0.16. It is worth stressing that the above advantage is not specific to time correlations, but applies more generally to spatial correlations, or to other types of correlations: as long as two different uses of a channel are correlated, one may take advantage of the correlations by sending a quantum particle in a superposition of going through one use or the other.
FIG. 1. A single quantum particle can travel through a transmission line at a superposition of two different moments of time $t_1$ (red) and $t_2$ (blue). Along the way, the particle experiences errors (yellow region), and the errors occurring at time $t_1$ are generally correlated with the errors occurring at time $t_2$. By taking advantage of these correlations, the errors can be mitigated or even completely removed.
FIG. 2. A single particle can travel on a superposition of two different paths (red and blue), which traverse two transmission lines (top and bottom) at two moments of time $t_1$ and $t_2$. The errors occurring on successive uses of the same transmission line are correlated (yellow lines), so the particle experiences correlated errors across the two branches (red and blue) of the superposition. These time correlations are a resource that can be used to mimic the use of quantum channels in a superposition of orders, and even to achieve larger communication advantages.
Time-correlated channels are also interesting for foundational reasons. Recently, they have been proposed as a way to reproduce the use of quantum channels in a superposition of different causal orders [32,35]. In particular, they have been used to reproduce the action of the quantum SWITCH [36,37], a higher-order operation that combines two variable quantum channels in a superposition of two alternative orders. In practice, time-correlated channels underlie all the existing experimental setups inspired by the quantum SWITCH [38][39][40][41][42][43][44].
The quantum SWITCH is known to offer a number of advantages in quantum communication [42,[45][46][47][48][49][50]. Here we show that (1) time correlations are essential in order to reproduce the advantages of the quantum SWITCH, and (2) the access to time-correlated channels is an even more powerful resource than the ability to combine ordinary quantum channels in a superposition of alternative orders.
To make the above points, we consider the scenario illustrated in Figure 2, where a single particle is sent on a superposition of two paths, traversing two independent channels, each with the property that different uses of the same channel at different moments of time are correlated, while the action of the channel at any given time is completely depolarising. When the noise is perfectly correlated, the network in Figure 2 reproduces the quantum SWITCH of two completely depolarising channels, which is known to achieve a communication capacity of 0.049 bits [45,50]. In contrast, we show that in the lack of time correlations the maximum capacity achieved by sending a particle on a superposition of paths is at most 0.024 bits. This result proves that, in this scenario, the physical origin of the communication advantage of the quantum SWITCH is not merely the superposition of paths, but rather the interplay between the superposition of paths and the time correlations in the noise.
Remarkably, we also find that the time correlations that reproduce the action of the quantum SWITCH are not the most favourable for the transmission of classical information: while the quantum SWITCH of two completely depolarising channels can at most yield 0.049 bits of classical communication [45,50], a more sophisticated pattern of time correlations yields the communication of at least 0.31 bits. The gap between these two values further highlights the power of time correlations, which are not only capable of reproducing the benefits of the superposition of causal orders, but also of surpassing them.
The remainder of the paper is structured as follows. In Section II we describe the formalism of time-correlated channels and derive the effective evolution experienced by a single particle upon entering a time-correlated channel at a superposition of times. In Section III, we consider the transmission of a single particle at a superposition of times, as in Figure 1, and we demonstrate that the correlations between different uses of the channels offer a communication advantage over all communication scenarios where the channels are uncorrelated. In Section IV, we consider the network scenario of Figure 2, and we show that time correlations are necessary to reproduce the advantages of the quantum SWITCH, and that certain time correlations can even offer higher advantages. Finally, we discuss the effects of noise on the control degree of freedom in Section V and conclude in Section VI.

II. TRANSMISSION OF A SINGLE PARTICLE AT A SUPERPOSITION OF DIFFERENT TIMES
A. Time-correlated channels
A transmission line that can be accessed at $k$ different times is described by a correlated quantum channel [6][7][8]. Mathematically, the correlated channel is a linear map transforming density matrices of the composite system $S_1 \otimes \cdots \otimes S_k$, where $S_j$ denotes the system sent at the $j$-th time. Note that, in general, the $k$ systems sent at $k$ different times can be initially prepared in an arbitrary entangled state.
Correlated quantum channels are also known as quantum memory channels [7,8,20], quantum combs [51,52], or non-Markovian quantum processes [9,53]. In the following we will focus on the $k = 2$ case, corresponding to a transmission line that can be accessed at two different time steps, hereafter denoted by $t_1$ and $t_2$. We consider random unitary channels of the form
$$\mathcal{R}(\rho_{12}) = \sum_{m,n} p(m,n)\, (U_m \otimes U_n)\, \rho_{12}\, (U_m \otimes U_n)^\dagger , \tag{1}$$
where $U_m$ and $U_n$ are unitary gates in a given set, and $p(m,n)$ is a joint probability distribution. Here, the system sent at time $t_1$ experiences the unitary gate $U_m$, while the system sent at time $t_2$ experiences the gate $U_n$. The density matrix $\rho_{12}$ represents the joint state of the two systems sent at the two times $t_1$ and $t_2$, that is, $\rho_{12}$ is a density matrix on the Hilbert space of the composite system $S_1 \otimes S_2$. The probability distribution $p(m,n)$ specifies the correlations between the random unitary evolutions experienced by system $S_1$ and system $S_2$.
Note that, while in this paper we will focus on time correlations, the correlations in Eq. (1) are not specific to time. The same expression can be used also to describe correlated channels acting on two spatially separated systems, or on any other type of independently addressable systems.
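As a minimal numerical sketch of Eq. (1), the snippet below applies a correlated random unitary channel to a two-qubit input. The Pauli unitaries, the singlet input, and the name `correlated_channel` are illustrative assumptions rather than choices made in the text, and the vacuum sector introduced later is ignored here.

```python
import numpy as np

# Pauli matrices, used here as an assumed set of unitaries {U_m}.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = [I2, X, Y, Z]

def correlated_channel(p, rho12, unitaries=PAULIS):
    """Apply sum_{m,n} p(m,n) (U_m x U_n) rho12 (U_m x U_n)^dagger."""
    out = np.zeros_like(rho12, dtype=complex)
    for m, Um in enumerate(unitaries):
        for n, Un in enumerate(unitaries):
            U = np.kron(Um, Un)
            out += p[m, n] * U @ rho12 @ U.conj().T
    return out

# Two joint distributions with the same (uniform) marginals.
p_corr = np.eye(4) / 4           # same unitary at t1 and t2
p_unc = np.full((4, 4), 1 / 16)  # independent unitaries at t1 and t2

# Singlet state (|01> - |10>)/sqrt(2) as a two-system input.
psi = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)
singlet = np.outer(psi, psi.conj())

out_corr = correlated_channel(p_corr, singlet)  # singlet is left unchanged
out_unc = correlated_channel(p_unc, singlet)    # maximally mixed two-qubit state
```

In this toy case the two distributions have identical marginals, yet the two-system input distinguishes them: the perfectly correlated noise leaves the singlet invariant, while the uncorrelated noise maps it to the maximally mixed state.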
Physically, a time-correlated random unitary channel of the form (1) can arise in a photonic setup where the systems $S_1$ and $S_2$ are modes of the electromagnetic field associated to two different time bins [54][55][56][57]. The noisy channel can correspond e.g. to the action of an optical fibre, where the random unitary changes of the photon polarisation arise from random fluctuations in the birefringence. Correlations between the unitaries at different times can arise when the time difference $t_2 - t_1$ between successive uses of the channels is smaller than the timescale on which the birefringence fluctuates.
B. Sending a single particle through a time-correlated channel
Consider now the situation where the input of the correlated channel (1) is a single particle, carrying information in its internal degrees of freedom. Classically, the particle must be sent either at time $t_1$, or at time $t_2$, or at some random mixture of $t_1$ and $t_2$. When the particle is sent at time $t_1$, its evolution is given by the reduced channel $\mathcal{R}_1(\rho) := \sum_m p_1(m)\, U_m \rho U_m^\dagger$, where $p_1(m) := \sum_n p(m,n)$ is the marginal probability distribution of the unitaries at time $t_1$. Similarly, if the particle is sent at time $t_2$, its evolution is given by the channel $\mathcal{R}_2(\rho) := \sum_n p_2(n)\, U_n \rho U_n^\dagger$, with $p_2(n) := \sum_m p(m,n)$. A random choice of transmission times then results in a random mixture of the evolutions corresponding to channels $\mathcal{R}_1$ and $\mathcal{R}_2$. Crucially, the evolution of the particle is independent of any correlation that may be present in the probability distribution $p(m,n)$, that is, of any correlation between the first and the second use of the transmission line.
FIG. 3. A single particle is sent at a superposition of two times (red and blue dashed lines), through the same transmission line (green ovals). The green dotted line represents the correlations between random unitary processes $U_m$ and $U_n$ taking place with probability $p(m,n)$ at the two subsequent uses of the transmission line, respectively.
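The insensitivity of the definite-time evolution to correlations can be checked numerically. The sketch below, assuming Pauli unitaries and an arbitrary non-uniform marginal `q` (both illustrative choices), compares the reduced channels obtained from a perfectly correlated and from a product distribution with the same marginals.

```python
import numpy as np

# Assumed set of unitaries {U_m}: the four Pauli matrices.
PAULIS = [
    np.eye(2, dtype=complex),
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def reduced_channel(p, rho, time, unitaries=PAULIS):
    """Evolution of a particle sent at a definite time (1 or 2)."""
    # Marginal over the index of the time that is NOT used.
    marginal = p.sum(axis=1) if time == 1 else p.sum(axis=0)
    return sum(w * U @ rho @ U.conj().T for w, U in zip(marginal, unitaries))

q = np.array([0.4, 0.3, 0.2, 0.1])  # hypothetical marginal distribution
p_corr = np.diag(q)                 # perfectly correlated, marginals q
p_unc = np.outer(q, q)              # uncorrelated, marginals q

rho = np.array([[0.8, 0.3], [0.3, 0.2]], dtype=complex)  # a valid qubit state

r1_corr = reduced_channel(p_corr, rho, time=1)
r1_unc = reduced_channel(p_unc, rho, time=1)
r2_corr = reduced_channel(p_corr, rho, time=2)
r2_unc = reduced_channel(p_unc, rho, time=2)
```

Both pairs of outputs coincide, since only the marginals of $p(m,n)$ enter the definite-time evolution.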
In contrast, quantum mechanics allows one to transmit a single particle in a way that is sensitive to the correlations between noisy processes at different times. The key idea is that the time when the particle is transmitted can be indefinite, as the particle could be sent through the transmission line at a coherent superposition of times $t_1$ and $t_2$ (see illustration in Figure 3). The superposition of transmission times could be achieved by adding an interferometric setup before the transmission line, letting the particle travel on a coherent superposition of two paths, one of which includes a delay [58]. This results in a time-bin qubit, described by a superposition of amplitudes corresponding to localisation at two different points in time, separated by a time difference much greater than a photon's coherence time [59].
Before developing the general theory of single particle transmission through time-correlated channels, it is instructive to look at a concrete example. Consider the case of a single photon, and denote by $H_1$ and $V_1$ ($H_2$ and $V_2$) the horizontal and vertical polarisation modes in the first (second) time bin. Here we take the polarisation state to be the same on both paths, so that the only role of the interferometric setup is to coherently control the moment of transmission. The result is a linear combination of states of the form $(\alpha|1\rangle_{H_1}|0\rangle_{V_1} + \beta|0\rangle_{H_1}|1\rangle_{V_1}) \otimes |0\rangle_{H_2}|0\rangle_{V_2}$ and states of the form $|0\rangle_{H_1}|0\rangle_{V_1} \otimes (\alpha|1\rangle_{H_2}|0\rangle_{V_2} + \beta|0\rangle_{H_2}|1\rangle_{V_2})$. The composite system of the two modes in the first (second) time bin can be regarded as system $S_1$ ($S_2$) in Eq. (1). The states produced by the interferometric setup can then be written as a linear combination of states of the form $|\psi\rangle_1 \otimes |\mathrm{vac}\rangle_2$ and states of the form $|\mathrm{vac}\rangle_1 \otimes |\psi\rangle_2$, where, for $i \in \{1, 2\}$, $|\mathrm{vac}\rangle_i := |0\rangle_{H_i}|0\rangle_{V_i}$ is the vacuum state of the modes in system $S_i$, and $|\psi\rangle_i := \alpha|1\rangle_{H_i}|0\rangle_{V_i} + \beta|0\rangle_{H_i}|1\rangle_{V_i}$ is a single-photon polarisation state. The change in the particle's state upon the transmission is then computed by applying the channel (1) to the appropriate state.
Generalising the above example, we model the transmission of a single particle through channel (1) by interpreting systems $S_1$ and $S_2$ as abstract modes, each of which can contain a variable number of particles equipped with an internal degree of freedom, such as the photon's polarisation. For $i \in \{1, 2\}$, the Hilbert space of system $S_i$ has two orthogonal subspaces: a one-particle subspace, denoted by $A^{(i)}$, and a vacuum subspace, denoted by $\mathrm{Vac}^{(i)}$. We assume that the dimension of the one-particle subspace is the same for both $S_1$ and $S_2$, as in the example of the single-photon polarisation. Under this assumption, we have $A^{(1)} \simeq A^{(2)} \simeq M$, where $M$ is the internal degree of freedom of the particle. Also, we assume that each vacuum subspace is one-dimensional, and is spanned by a vacuum state $|\mathrm{vac}\rangle_i$, $i \in \{1, 2\}$, as in our motivating example.
A single particle sent at a superposition of two moments of time will then be described by states of the form $\alpha\,|\psi\rangle_1 \otimes |\mathrm{vac}\rangle_2 + \beta\,|\mathrm{vac}\rangle_1 \otimes |\psi\rangle_2$, where $|\psi\rangle \in M$ is the state of the particle's internal degree of freedom. For the transmission of the particle, we will consider channels that conserve the number of particles, i.e. that map states of a given sector into states of the same sector. This is the case, for example, for linear optical elements, which preserve the photon number. For the channel (1), preservation of the particle number means that the operators $U_m$ have the form
$$U_m = V_m \oplus e^{i\phi_m}\, |\mathrm{vac}\rangle\langle \mathrm{vac}| , \tag{2}$$
where $V_m$ is a unitary acting in the one-particle sector $M$, and $\phi_m \in [0, 2\pi)$ is a phase. Physically, $\phi_m$ corresponds to the phase difference between states in the one-particle sector and the vacuum state. In the quantum optical example, each unitary $U_m$ can be realised by a Hamiltonian acting on the two polarisation modes associated to system $S_i$, $i \in \{1, 2\}$. For example, the unitary $Z \oplus e^{i\phi}\,|\mathrm{vac}\rangle\langle \mathrm{vac}|$ can be generated by the Hamiltonian $H = [(\xi + \theta/2)\, a_H^\dagger a_H + (\xi - \theta/2)\, a_V^\dagger a_V]$, where $a_H$ ($a_V$) are the annihilation operators for the appropriate modes with horizontal (vertical) polarisation, in suitable units.

C. Effective evolution with a control system
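The representation of a single particle in terms of abstract modes is equivalent to a representation in terms of a composite system $MC$, consisting of a message-carrying system $M$ and a control system $C$, which determines the particle's time of transmission. The change of representation is described by the mapping
$$|\psi\rangle_1 \otimes |\mathrm{vac}\rangle_2 \mapsto |\psi\rangle_M \otimes |0\rangle_C , \qquad |\mathrm{vac}\rangle_1 \otimes |\psi\rangle_2 \mapsto |\psi\rangle_M \otimes |1\rangle_C , \tag{3}$$
where $|\psi\rangle$ is an arbitrary state in the one-particle subspace. If the control is in state $|0\rangle$, then the message is sent through the first application of the channel, with the vacuum in the second application; vice versa if the control is in state $|1\rangle$. If the control is in a generic state $\omega$, the overall evolution is described by an effective channel $\mathcal{C}_\omega$, which transforms a generic state $\rho$ of the message into the state
$$\mathcal{C}_\omega(\rho) = \sum_{m,n} p(m,n)\, W_{mn}\, (\rho \otimes \omega)\, W_{mn}^\dagger , \tag{4}$$
where $W_{mn}$ is the unitary $W_{mn} := V_m\, e^{i\phi_n} \otimes |0\rangle\langle 0| + e^{i\phi_m}\, V_n \otimes |1\rangle\langle 1|$. The derivation of Eq. (4) is provided in Appendix A. When the probability distribution $p(m,n)$ is symmetric (that is, when $p(m,n) = p(n,m)$ for every $m$ and $n$), the effective channel has the simple expression
$$\mathcal{C}_\omega(\rho) = \frac{\mathcal{C}(\rho) + \mathcal{G}(\rho)}{2} \otimes \omega + \frac{\mathcal{C}(\rho) - \mathcal{G}(\rho)}{2} \otimes Z\omega Z , \tag{5}$$
with $Z := |0\rangle\langle 0| - |1\rangle\langle 1|$,
$$\mathcal{C}(\rho) := \sum_m p_1(m)\, V_m \rho V_m^\dagger , \tag{6}$$
and
$$\mathcal{G}(\rho) := \sum_{m,n} p(m,n)\, e^{i(\phi_n - \phi_m)}\, V_m \rho V_n^\dagger . \tag{7}$$
(See Appendix A for the derivation.) Here, the map $\mathcal{C}$ is the quantum channel representing the evolution of the message when it is sent at a definite time (either $t_1$ or $t_2$). The channel $\mathcal{C}$ depends only on the marginal probability distribution $p_1(m) := \sum_n p(m,n)$, and it is independent of the correlations. Instead, the map $\mathcal{G}$ can generally depend on the correlations between the evolution of the particle at two mutually exclusive moments of time. We call $\mathcal{G}$ the interference term.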
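The effective channel can be sketched numerically by summing over the unitaries $W_{mn}$ defined above. The snippet also checks, for a symmetric $p(m,n)$, that the output splits into a definite-time part and an interference part. The explicit expressions coded in `definite_time_term` and `interference_term` are obtained here by expanding $W_{mn}$ block by block and should be read as an assumption of this sketch, as should the Pauli unitaries and all function names.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = [I2, X, Y, Z]              # assumed unitaries V_m on the message
P0 = np.diag([1, 0]).astype(complex)  # |0><0| on the control
P1 = np.diag([0, 1]).astype(complex)  # |1><1| on the control

def effective_channel(p, phases, rho, omega, V=PAULIS):
    """Sum over W_mn = V_m e^{i phi_n} x |0><0| + e^{i phi_m} V_n x |1><1|."""
    out = np.zeros((4, 4), dtype=complex)
    for m in range(len(V)):
        for n in range(len(V)):
            W = (np.exp(1j * phases[n]) * np.kron(V[m], P0)
                 + np.exp(1j * phases[m]) * np.kron(V[n], P1))
            out += p[m, n] * W @ np.kron(rho, omega) @ W.conj().T
    return out

def definite_time_term(p, rho, V=PAULIS):
    p1 = p.sum(axis=1)  # marginal distribution at time t1
    return sum(p1[m] * V[m] @ rho @ V[m].conj().T for m in range(len(V)))

def interference_term(p, phases, rho, V=PAULIS):
    return sum(p[m, n] * np.exp(1j * (phases[n] - phases[m]))
               * V[m] @ rho @ V[n].conj().T
               for m in range(len(V)) for n in range(len(V)))

def random_state(rng):
    g = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    s = g @ g.conj().T
    return s / np.trace(s)

rng = np.random.default_rng(0)
a = rng.random((4, 4))
p = (a + a.T) / (2 * a.sum())          # random symmetric distribution
phases = rng.uniform(0, 2 * np.pi, 4)
rho, omega = random_state(rng), random_state(rng)

lhs = effective_channel(p, phases, rho, omega)
C, G = definite_time_term(p, rho), interference_term(p, phases, rho)
rhs = np.kron((C + G) / 2, omega) + np.kron((C - G) / 2, Z @ omega @ Z)
```

The message is placed in the first tensor factor and the control in the second, so `np.kron(rho, omega)` matches the ordering of $W_{mn}$.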

A. Correlated white noise
Consider the case where the evolution at any definite time step is completely depolarising on the message-carrying sector $M$, that is,
$$\mathcal{C}_{|j\rangle\langle j|}(\rho) = \frac{I}{d} \otimes |j\rangle\langle j| \qquad \forall \rho,\ \forall j \in \{0, 1\} , \tag{8}$$
where $\mathcal{C}_{|j\rangle\langle j|}$ is the quantum channel obtained by plugging $\omega = |j\rangle\langle j|$ into Eq. (4). Eq. (8) implies that, whenever the particle is sent at a definite moment of time, the message is replaced by white noise. Accordingly, the channel $\mathcal{C}$ in Eq. (6) is completely depolarising. When the probability distribution $p(m,n)$ is symmetric, Eq. (5) becomes
$$\mathcal{C}_\omega(\rho) = \frac{I/d + \mathcal{G}(\rho)}{2} \otimes \omega + \frac{I/d - \mathcal{G}(\rho)}{2} \otimes Z\omega Z . \tag{9}$$
In the realisation of the random unitary channel, we will take the unitaries $\{V_m\}$ to be an orthogonal basis for the space of $d \times d$ matrices. Accordingly, the set $\{V_m\}$ will contain $d^2$ unitaries, labelled by integers from $0$ to $d^2 - 1$. For qubits, we will take $\{V_m\}$ to be the four Pauli matrices $\{I, X, Y, Z\}$, labelled as $V_0 = I$, $V_1 = X$, $V_2 = Y$, $V_3 = Z$. In terms of the probability distribution $p(m,n)$, the condition (8) amounts to requiring that the marginal probability distributions $p_1(m)$ and $p_2(n)$ be uniform, that is
$$p_1(m) = p_2(n) = \frac{1}{d^2} \qquad \forall m, n . \tag{10}$$
The probability distributions $p(m,n)$ satisfying Eq. (10) form a convex polytope whose extreme points are probability distributions of the form $p(m,n) = \delta_{n,\sigma(m)}/d^2$, where $\sigma$ is a permutation of the index set $\{0, \dots, d^2 - 1\}$. For the identity permutation, satisfying $\sigma(m) = m$ for all values of $m$, the probability distribution $p(m,n)$ is symmetric, and the interference term (7) is the completely depolarising channel $\mathcal{G}(\rho) = I/d$ $\forall \rho$. Hence, the channel $\mathcal{C}_\omega$ in Eq. (9) is completely depolarising, and no information can be transmitted through it, no matter what state $\omega$ is used. In the following, we will show that, instead, other types of permutations enable a perfect transmission of classical information.
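The statement about the identity permutation can be checked with a few lines of NumPy. This is only a sketch for qubits: the function `effective_channel` re-implements the sum over the unitaries $W_{mn}$, and the random phases and input states are arbitrary illustrative choices.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = [I2, X, Y, Z]
P0 = np.diag([1, 0]).astype(complex)
P1 = np.diag([0, 1]).astype(complex)

def effective_channel(p, phases, rho, omega, V=PAULIS):
    """Message (first factor) and control (second factor) after the channel."""
    out = np.zeros((4, 4), dtype=complex)
    for m in range(4):
        for n in range(4):
            W = (np.exp(1j * phases[n]) * np.kron(V[m], P0)
                 + np.exp(1j * phases[m]) * np.kron(V[n], P1))
            out += p[m, n] * W @ np.kron(rho, omega) @ W.conj().T
    return out

def pure(v):
    v = np.asarray(v, dtype=complex)
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

p_identity = np.eye(4) / 4  # p(m,n) = delta_{m,n} / 4
phases = np.random.default_rng(1).uniform(0, 2 * np.pi, 4)
omega = pure([1, 1])        # control in |+>

out_a = effective_channel(p_identity, phases, pure([1, 0]), omega)
out_b = effective_channel(p_identity, phases, pure([1, 1j]), omega)
expected = np.kron(I2 / 2, omega)  # white noise on the message, control untouched
```

Both inputs give the same output, so no information about the message survives, independently of the phases.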

B. Perfect communication through correlated completely depolarising channels
Here we focus on the case where the message is a qubit ($d = 2$). Let $\sigma$ be a permutation that swaps two pairs of indices, for example mapping $(0, 1, 2, 3)$ into $(1, 0, 3, 2)$. In this case, the probability distribution $p(m,n) = \delta_{m,\sigma(n)}/4$ is symmetric, and the interference term is
$$\mathcal{G}(\rho) = \frac{1}{4}\left[ e^{i(\phi_1 - \phi_0)}\, \rho X + e^{i(\phi_3 - \phi_2)}\, Y \rho Z + \mathrm{h.c.} \right] , \tag{11}$$
where h.c. denotes the Hermitian conjugate of the preceding matrices. Note that $\mathcal{G}(\rho)$ depends only on the differences $\phi_1 - \phi_0$ and $\phi_3 - \phi_2$. We now show that, by suitably choosing the differences $\phi_1 - \phi_0$ and $\phi_3 - \phi_2$, and the state $\omega$, it is possible to achieve a perfect transmission of classical information. When $\phi_1 - \phi_0 = 0$ and $\phi_3 - \phi_2 = \pi/2$, the interference term becomes
$$\mathcal{G}(\rho) = \frac{1}{4}\, \{X,\ \rho - Z\rho Z\} , \tag{12}$$
where $\{A, B\} = AB + BA$ denotes the anticommutator of two generic operators $A$ and $B$. In particular, choosing $\rho = |\pm\rangle\langle\pm|$, with $|\pm\rangle := (|0\rangle \pm |1\rangle)/\sqrt{2}$, we obtain
$$\mathcal{G}(|\pm\rangle\langle\pm|) = \pm \frac{I}{2} . \tag{13}$$
Combining this relation with the depolarising condition $\mathcal{C}(|\pm\rangle\langle\pm|) = I/2$, and inserting these two relations into Eq. (5), we obtain
$$\mathcal{C}_\omega(|\pm\rangle\langle\pm|) = \frac{I}{2} \otimes \omega_\pm , \tag{14}$$
with $\omega_+ := \omega$ and $\omega_- := Z\omega Z$. In other words, the net effect of the superposition of correlated depolarising channels is to transfer information from the message to the output state of the control. Putting the control in the state $\omega = |+\rangle\langle +|$, one obtains the orthogonal output states $\omega_\pm = |\pm\rangle\langle\pm|$. Hence, a sender can encode a bit into the states $|\pm\rangle$, and a receiver will be able to decode the bit in principle without error, by measuring the control system in the basis $\{|+\rangle, |-\rangle\}$.
In summary, there exist time-correlated channels that look completely depolarising when the message is sent at any definite moment of time, and yet allow for a perfect transmission of classical information by sending messages at a coherent superposition of different times.
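The perfect-communication example can be reproduced numerically. The sketch below assumes the labelling of the Pauli matrices as $(I, X, Y, Z) \to (0, 1, 2, 3)$ and the specific phases $(\phi_0, \phi_1, \phi_2, \phi_3) = (0, 0, 0, \pi/2)$, which is one way to realise the phase differences used in this subsection; function and variable names are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = [I2, X, Y, Z]
P0 = np.diag([1, 0]).astype(complex)
P1 = np.diag([0, 1]).astype(complex)

def effective_channel(p, phases, rho, omega, V=PAULIS):
    out = np.zeros((4, 4), dtype=complex)
    for m in range(4):
        for n in range(4):
            W = (np.exp(1j * phases[n]) * np.kron(V[m], P0)
                 + np.exp(1j * phases[m]) * np.kron(V[n], P1))
            out += p[m, n] * W @ np.kron(rho, omega) @ W.conj().T
    return out

def pure(v):
    v = np.asarray(v, dtype=complex)
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

# Permutation swapping 0 <-> 1 and 2 <-> 3: p(m,n) = delta_{m,sigma(n)} / 4.
sigma = [1, 0, 3, 2]
p = np.zeros((4, 4))
for n in range(4):
    p[sigma[n], n] = 0.25

phases = [0.0, 0.0, 0.0, np.pi / 2]  # phi_1 - phi_0 = 0, phi_3 - phi_2 = pi/2
plus, minus = pure([1, 1]), pure([1, -1])

out_plus = effective_channel(p, phases, plus, omega=plus)
out_minus = effective_channel(p, phases, minus, omega=plus)
# Definite time of transmission: control in |0>, message fully depolarised.
out_definite = effective_channel(p, phases, plus, omega=P0)
```

With the control prepared in $|+\rangle$, the two message states end up flagged by orthogonal control states, while the message itself is left maximally mixed.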

C. Maximum capacity in the lack of correlations
We now show that correlations in the probability distribution $p(m,n)$ are essential in order to achieve the perfect communication task discussed in the previous subsection. Specifically, we prove that no perfect communication is possible in the lack of correlations, that is, when the probability distribution factorises as $p(m,n) = p_1(m)\, p_2(n) = 1/d^4$ (cf. Eq. (10)). For qubit messages ($d = 2$), we show that, in the lack of correlations,
1. the classical capacity of the channel $\mathcal{C}_\omega$ is upper bounded by 0.5 bits, meaning that it is impossible to transmit more than 0.5 bits per use of the channel,
2. the maximum classical capacity of the channel $\mathcal{C}_\omega$ over arbitrary states $\omega$ of the control system and over arbitrary (not necessarily random-unitary) realisations of the completely depolarising channel is equal to 0.16 bits.
The first result follows from an analytical upper bound on the classical capacity, while the second result follows from numerical optimisation.

Analytical bound on the classical capacity
The derivation of the bound consists of three steps, whose details are provided in Appendix B.
The first step is to prove that, in the lack of correlations and for message dimension $d = 2$, the channel $\mathcal{C}_\omega$ is entanglement-breaking [61], i.e. it transforms all entangled states into separable states. For entanglement-breaking channels, it is known that the classical capacity coincides with the Holevo capacity [62]. For a generic quantum channel $\mathcal{E}$, the Holevo capacity is
$$\chi(\mathcal{E}) := \max_{\{p_x, \rho_x\}} \left[ H\Big( \sum_x p_x\, \mathcal{E}(\rho_x) \Big) - \sum_x p_x\, H\big( \mathcal{E}(\rho_x) \big) \right] ,$$
where the maximum is over all possible ensembles $\{p_x, \rho_x\}$ consisting of a probability distribution $\{p_x\}$ and a set of density matrices $\{\rho_x\}$, and $H(\rho) := -\mathrm{Tr}[\rho \log \rho]$ is the von Neumann entropy of a generic state $\rho$, $\log$ denoting the logarithm in base 2.
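For a fixed input ensemble, the quantity inside the maximum (the Holevo information) is straightforward to evaluate. The following sketch shows one possible implementation; the helper names and the two test channels (identity and completely depolarising qubit channels) are illustrative assumptions, and no optimisation over ensembles is attempted.

```python
import numpy as np

def von_neumann_entropy(rho):
    """H(rho) = -Tr[rho log2 rho], computed from the eigenvalues."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]  # drop numerically zero eigenvalues
    return float(-np.sum(evals * np.log2(evals)))

def holevo_information(channel, probs, states):
    """Entropy of the average output minus the average output entropy."""
    outputs = [channel(s) for s in states]
    average = sum(p * out for p, out in zip(probs, outputs))
    return von_neumann_entropy(average) - sum(
        p * von_neumann_entropy(out) for p, out in zip(probs, outputs))

zero = np.array([[1, 0], [0, 0]], dtype=complex)
one = np.array([[0, 0], [0, 1]], dtype=complex)

def identity_channel(rho):
    return rho

def depolarising_channel(rho):
    return np.trace(rho) * np.eye(2, dtype=complex) / 2

chi_identity = holevo_information(identity_channel, [0.5, 0.5], [zero, one])
chi_depolarising = holevo_information(depolarising_channel, [0.5, 0.5], [zero, one])
```

The noiseless qubit channel yields one bit for this ensemble, while the completely depolarising channel yields zero, as expected.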
The second step is to observe that the state of the control that maximises the Holevo capacity of the channel $\mathcal{C}_\omega$ is $\omega = |+\rangle\langle +|$. This result holds for arbitrary message dimension $d \geq 2$, and, in fact, it holds even in the presence of correlations, as long as the probability distribution $p(m,n)$ is symmetric.
Finally, the third step is to show that, in the lack of correlations and for arbitrary message dimension $d \geq 2$, the Holevo capacity of the channel $\mathcal{C}_{|+\rangle\langle +|}$ is upper bounded by $1/d$.
Putting the three steps together, we obtain that, in the lack of correlations and for qubit messages, the classical capacity of the channel $\mathcal{C}_\omega$ is upper bounded by $1/2$ for every possible state $\omega$. Hence, the perfect transmission of 1 bit achieved in Subsection III B is impossible in the lack of correlations.

Numerical evaluation of the capacity
The evaluation of the Holevo capacity involves an optimisation over all possible input ensembles. For quantum channels with $d$-dimensional input, the optimisation can be restricted to ensembles with up to $d^2$ linearly independent pure states [63]. In practice, however, the optimisation is often hard to carry out even in dimension $d = 2$. To make the optimisation feasible, we first show that in our case the optimisation can be reduced to an optimisation over ensembles that depend only on three real parameters $q, p_0, p_1 \in [0, 1]$. The proof of this result is provided in Appendix C.
Building on the above results, we can numerically evaluate the largest value of the Holevo capacity, and therefore the classical capacity, for all possible qubit channels (i.e. $d = 2$) of the form (4) with $p(m,n) = 1/16$. We set the state of the control to $\omega = |+\rangle\langle +|$, which we know to guarantee the maximum Holevo information (cf. Lemma 3 in Appendix B).
The resulting value of the Holevo capacity is a function of the phases $\{\phi_m\}_{m \in \{0,1,2,3\}}$ in Eq. (7). One phase, say $\phi_0$, can be set to 0 without loss of generality, as it represents a global phase. In Figure 4b, we provide a 3-dimensional plot showing the exact values of the Holevo information, and therefore, by the arguments above, the classical capacity, for all possible values of the phases $\phi_1$, $\phi_2$, and $\phi_3$. The maximum over all possible choices of phases is 0.16 bits.
In Appendix C we also show that 0.16 bits is the maximum capacity achievable with arbitrary (not necessarily random unitary) channels that reduce to the depolarising channel in the one-particle subspace sector. The value 0.16 was previously found to be a lower bound to the classical capacity [31], and our result shows that the lower bound is actually tight: 0.16 is the best classical capacity one can obtain by sending a single particle through a superposition of paths traversing two identical, independent channels that are completely depolarising in the one-particle subspace.
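To give a feel for this kind of numerical evaluation, the sketch below performs a much cruder search than the optimisation described above: it scans a coarse grid of phases and only three equiprobable orthogonal ensembles (the eigenbases of $Z$, $X$ and $Y$). These restrictions, the grid resolution, and all names are assumptions made for brevity, so the result is only a lower estimate of the capacity and is not meant to reproduce the value 0.16.

```python
import itertools
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = [I2, X, Y, Z]
P0 = np.diag([1, 0]).astype(complex)
P1 = np.diag([0, 1]).astype(complex)

def effective_channel(p, phases, rho, omega, V=PAULIS):
    out = np.zeros((4, 4), dtype=complex)
    for m in range(4):
        for n in range(4):
            W = (np.exp(1j * phases[n]) * np.kron(V[m], P0)
                 + np.exp(1j * phases[m]) * np.kron(V[n], P1))
            out += p[m, n] * W @ np.kron(rho, omega) @ W.conj().T
    return out

def entropy(rho):
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def pure(v):
    v = np.asarray(v, dtype=complex)
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

# Restricted family of input ensembles (an assumption of this sketch).
BASES = [(pure([1, 0]), pure([0, 1])),
         (pure([1, 1]), pure([1, -1])),
         (pure([1, 1j]), pure([1, -1j]))]

p_unc = np.full((4, 4), 1 / 16)  # no correlations
omega = pure([1, 1])             # control in |+>
grid = np.linspace(0, 2 * np.pi, 4, endpoint=False)

best = 0.0
for phi1, phi2, phi3 in itertools.product(grid, repeat=3):
    phases = [0.0, phi1, phi2, phi3]  # phi_0 fixed to 0
    for s0, s1 in BASES:
        o0 = effective_channel(p_unc, phases, s0, omega)
        o1 = effective_channel(p_unc, phases, s1, omega)
        chi = entropy((o0 + o1) / 2) - (entropy(o0) + entropy(o1)) / 2
        best = max(best, chi)
```

By the analytical bound of the previous subsection, `best` cannot exceed 0.5 bits; a finer grid and a full optimisation over ensembles would be needed to approach the maximum reported in the text.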

D. Lower bound to the classical capacity in the presence of correlations
In the correlated case, we do not have a proof that the classical capacity coincides with the Holevo capacity. On top of that, the evaluation of the Holevo capacity generally requires an optimisation over all possible ensembles of $d^2$ linearly independent pure states, which is computationally challenging. Here, we circumvent this problem by computing a lower bound to the Holevo capacity, obtained by restricting the optimisation to the set of all orthogonal ensembles, that is, input ensembles consisting of two orthogonal qubit states. In general, this lower bound may not be tight [64][65][66], but it is nevertheless interesting as it quantifies the maximum performance of a natural set of encoding strategies. Since the Holevo capacity is always a lower bound to the classical capacity, the above lower bound is also a lower bound to the classical capacity.
Here, we evaluate the lower bound for the correlated channel with $p(m,n) = \delta_{n,\sigma(m)}/4$, where $\sigma$ is the permutation that exchanges 0 with 1, and 2 with 3. This particular choice is interesting because, as we have seen in Subsection III B, it can reach the maximum capacity of 1 bit. We now inspect how the lower bound depends on the phases.
Since the interference term (11) depends only on the differences $\phi_1 - \phi_0$ and $\phi_3 - \phi_2$, we set $\phi_0 = \phi_2 = 0$ and scan the possible values of $\phi_1$ and $\phi_3$. For the state of the control system, we choose again $\omega = |+\rangle\langle +|$, as it maximises the Holevo capacity (cf. Lemma 3 in Appendix B). The lower bound to the Holevo capacity is shown in Figure 4b for all values of $\phi_1$ and $\phi_3$.
FIG. 5. A single particle is sent through a superposition of two paths (orange and blue dashed lines), each traversing two independent channels (green and red ovals), each of which exhibits time correlations between successive uses. The green and red dotted lines represent the correlations between the two subsequent uses of the same channel.
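A scan of this kind can be sketched as follows. For brevity the snippet fixes the input ensemble to the equiprobable states $|\pm\rangle$ instead of optimising over all orthogonal ensembles, so it only gives a lower estimate of the bound discussed here; the grid resolution and all names are illustrative assumptions.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = [I2, X, Y, Z]
P0 = np.diag([1, 0]).astype(complex)
P1 = np.diag([0, 1]).astype(complex)

def effective_channel(p, phases, rho, omega, V=PAULIS):
    out = np.zeros((4, 4), dtype=complex)
    for m in range(4):
        for n in range(4):
            W = (np.exp(1j * phases[n]) * np.kron(V[m], P0)
                 + np.exp(1j * phases[m]) * np.kron(V[n], P1))
            out += p[m, n] * W @ np.kron(rho, omega) @ W.conj().T
    return out

def entropy(rho):
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def pure(v):
    v = np.asarray(v, dtype=complex)
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

# Correlated distribution for the permutation exchanging 0 <-> 1 and 2 <-> 3.
sigma = [1, 0, 3, 2]
p = np.zeros((4, 4))
for m in range(4):
    p[m, sigma[m]] = 0.25

plus, minus = pure([1, 1]), pure([1, -1])
grid = np.linspace(0, 2 * np.pi, 8, endpoint=False)  # multiples of pi/4

chi_grid = np.zeros((len(grid), len(grid)))
for i, phi1 in enumerate(grid):
    for j, phi3 in enumerate(grid):
        phases = [0.0, phi1, 0.0, phi3]  # phi_0 = phi_2 = 0
        o_plus = effective_channel(p, phases, plus, omega=plus)
        o_minus = effective_channel(p, phases, minus, omega=plus)
        chi_grid[i, j] = (entropy((o_plus + o_minus) / 2)
                          - (entropy(o_plus) + entropy(o_minus)) / 2)
```

The entry `chi_grid[0, 2]` corresponds to $\phi_1 = 0$ and $\phi_3 = \pi/2$, the choice of Subsection III B, where this ensemble reaches one bit.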

IV. COMMUNICATION THROUGH MULTIPLE TIME-CORRELATED CHANNELS
Time-correlated channels can be used to mimic the use of ordinary quantum channels in a superposition of different causal orders [32,35]. In this section we show that time correlations are a necessary resource for reproducing the benefits of the superposition of orders in quantum communication, and that, in fact, time correlations are an even more powerful resource than the ability to combine channels in a superposition of orders.

A. A network of time-correlated channels
Suppose that two time-correlated channels R A and R B , each of the form (1), are arranged as in Figure 5, and that a single particle is sent through a superposition of two alternative paths visiting each of the two channels exactly once. When the control system is initialised in the state ω, the overall evolution of the message and the control is described by the effective channel E ω defined as with  (1) and (2). The derivation of Eq. (15) is provided in Appendix D.
An interesting special case occurs when the probability distributions $p_A(m,n)$ and $p_B(k,l)$ are perfectly correlated, that is
$$p_A(m,n) = p_{1A}(m)\, \delta_{m,n} , \qquad p_B(k,l) = p_{1B}(k)\, \delta_{k,l} , \tag{17}$$
where $p_{1A}(m)$ and $p_{1B}(k)$ are the marginal probability distributions of $p_A(m,n)$ and $p_B(k,l)$, respectively. Under this condition, the network in Figure 5 reproduces the action of two random unitary channels in a superposition of two alternative orders [32]. Mathematically, the operation of putting two quantum channels in a superposition of orders is described by the quantum SWITCH [36,37], a higher-order transformation that takes as inputs two generic channels $\mathcal{A}$ and $\mathcal{B}$ (with $d$-dimensional input and output systems) and produces as output a new quantum channel $\mathcal{S}(\mathcal{A}, \mathcal{B})$ with Kraus operators
$$S_{mk} := A_m B_k \otimes |0\rangle\langle 0| + B_k A_m \otimes |1\rangle\langle 1| , \tag{18}$$
where $\{A_m\}$ ($\{B_k\}$) are Kraus operators of $\mathcal{A}$ ($\mathcal{B}$), and $\{|0\rangle, |1\rangle\}$ is a basis for a control qubit that determines the relative order between $\mathcal{A}$ and $\mathcal{B}$. Notably, the overall channel $\mathcal{S}(\mathcal{A}, \mathcal{B})$ is independent of the choice of Kraus representations for the input channels $\mathcal{A}$ and $\mathcal{B}$.
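When the control qubit is put in a fixed state $\omega$, the quantum SWITCH of channels $\mathcal{A}$ and $\mathcal{B}$ yields the effective channel
$$\mathcal{S}_\omega(\rho) := \sum_{m,k} S_{mk}\, (\rho \otimes \omega)\, S_{mk}^\dagger , \tag{19}$$
with $S_{mk}$ as in Eq. (18). In particular, here we are interested in the case where the channels $\mathcal{A}$ and $\mathcal{B}$ are random unitary, with Kraus operators given by the unitaries of the respective channel weighted by the square roots of the marginal probabilities $p_{1A}(m)$ and $p_{1B}(k)$. With this choice, the channel $\mathcal{S}_\omega$ in Eq. (19) coincides with the channel $\mathcal{E}_\omega$ in Eq. (15) under the condition that the probability distributions $p_A(m,n)$ and $p_B(k,l)$ are perfectly correlated (cf. Eq. (17)).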
When the channels $\mathcal{A}$ and $\mathcal{B}$ are completely depolarising, Ref. [45] showed that the channel $\mathcal{S}_\omega$ resulting from the quantum SWITCH can transmit 0.049 bits of classical information, provided that the control is initialised in the state $\omega = |+\rangle\langle +|$. Later, the value 0.049 was proven to be exactly equal to the classical capacity [50]. Since the channels $\mathcal{E}_\omega$ and $\mathcal{S}_\omega$ coincide, we conclude that the time-correlated network in Figure 5 can achieve a capacity of 0.049 bits.
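The value of 0.049 bits can be recovered with a short numerical sketch. The snippet assumes the standard Kraus form of the quantum SWITCH, $A_m B_k \otimes |0\rangle\langle 0| + B_k A_m \otimes |1\rangle\langle 1|$, Pauli Kraus operators $P_m/2$ for the completely depolarising qubit channel, and the equiprobable input ensemble $\{|0\rangle, |1\rangle\}$; the names used are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
P0 = np.diag([1, 0]).astype(complex)
P1 = np.diag([0, 1]).astype(complex)

# Kraus operators of the completely depolarising qubit channel.
DEPOLARISING = [P / 2 for P in (I2, X, Y, Z)]

def switch_channel(kraus_a, kraus_b, rho, omega):
    """Quantum SWITCH of two channels with the control fixed in state omega."""
    out = np.zeros((4, 4), dtype=complex)
    for A in kraus_a:
        for B in kraus_b:
            S = np.kron(A @ B, P0) + np.kron(B @ A, P1)
            out += S @ np.kron(rho, omega) @ S.conj().T
    return out

def entropy(rho):
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

omega_plus = np.full((2, 2), 0.5, dtype=complex)  # control in |+><+|
out0 = switch_channel(DEPOLARISING, DEPOLARISING, P0, omega_plus)
out1 = switch_channel(DEPOLARISING, DEPOLARISING, P1, omega_plus)

# Holevo information of the equiprobable ensemble {|0>, |1>}.
chi_switch = entropy((out0 + out1) / 2) - (entropy(out0) + entropy(out1)) / 2
print(round(chi_switch, 3))  # approximately 0.049 bits
```

Because the two channels are identical here, swapping the roles of the two branches in the Kraus operators does not change the result.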
In the following, we provide two new results:
1. We show that time correlations are strictly necessary in order to achieve the quantum SWITCH capacity of 0.049 bits. Specifically, we show numerically that the maximum classical capacity in the uncorrelated case is 0.018 bits for random-unitary realisations of the completely depolarising channel, and 0.024 bits for arbitrary realisations. This result shows that, when the quantum SWITCH is reproduced by the network in Figure 5, the origin of the communication enhancement is not just the interference of paths, but rather the combined effect of the interference of paths and of the time correlations.
2. We show that there exist time correlations that achieve a classical capacity of at least 0.31 bits. This result shows that the access to time correlations is generally a stronger resource than the ability to combine ordinary channels in a superposition of orders.

B. Maximum capacity in the lack of correlations
Here we evaluate the maximum amount of classical information that can be transmitted through the network in Figure 5 when the channels are completely depolarising and no correlation is present, that is, when p A (m, n) = p B (k, l) = 1/16 ∀m, n, k, l ∈ {0, 1, 2, 3}.
The evaluation of the maximum capacity follows the same steps as in Subsection III C. The main observations are:
1. in the lack of correlations, the channel E ω in Eq. (15) is entanglement-breaking, and therefore its classical capacity coincides with the Holevo capacity;
2. the control state ω that maximises the Holevo capacity of the channel E ω is ω = |+⟩⟨+|;
3. without loss of generality, the maximisation of the Holevo information can be reduced to ensembles that depend only on three real parameters, among them q and p 0 .
The derivation of these results is provided in Appendix E. Building on the above observations, we evaluate the capacity of the channel E ω in Eq. (15) by scanning all possible values of the phases $\{\phi_m\}_{m=0}^{3}$. The result is the plot shown in Figure 6a. The largest classical capacity over all random unitary realisations is 0.018 bits, which is strictly smaller than the value 0.049 bits achieved by the superposition of orders.
Furthermore, we also extend the optimisation from random unitary realisations to arbitrary realisations of the completely depolarising channel. For this broader class of realisations, we numerically obtain that the maximum capacity is 0.024 bits.
Summarising, the best classical capacity one can obtain by sending a single particle through the network in Figure 5, in the lack of correlations between the two paths, is 0.018 bits, and the capacity can be increased to 0.024 bits by replacing the random unitary channels with more general realisations of the completely depolarising channel.
Note that both values 0.018 and 0.024 are below the 0.049 bits of classical capacity achieved by the quantum SWITCH. This result shows that, when the quantum SWITCH is reproduced by the correlated network in Figure 5, it offers a communication advantage over all communication protocols where a single particle travels in a superposition of two paths on which it experiences uncorrelated noisy processes. Hence, we conclude that, in this scenario, the origin of the communication advantages of the quantum SWITCH is not merely the superposition of paths, but rather the non-trivial interplay between the superposition of paths and the time correlations in the noise.
Our results also imply a caveat about terminology. The quantum SWITCH of two channels A and B is sometimes described informally as a "superposition of channels AB and BA." While this expression may be formally correct (at least according to a broad notion of superposition [32]), it can be misleading if taken at face value, because it does not mention explicitly the requirement of correlations between the channels A and B in the two branches of the superposition.

C. Time correlations surpassing the quantum SWITCH capacity
We now show that the classical capacity of 0.049 bits, achieved by the quantum SWITCH, can be surpassed using more general time correlations. We prove this result explicitly, by exhibiting a pair of time-correlated channels that achieve a capacity of at least 0.31 bits.
Our choice of channels corresponds to $p_A(m,n) = p_B(m,n) = \delta_{n,\sigma(m)}/4$, where σ is the permutation that exchanges 0 with 1, and 2 with 3. This choice is motivated by the fact that the permutation σ guarantees the maximum communication capacity in the case where a single time-correlated channel is used (cf. Subsection III B).
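To make the three correlation patterns discussed in this section concrete, the sketch below tabulates the joint distributions for the perfectly correlated case δ m,n /4, the permuted case δ n,σ(m) /4, and the uncorrelated case 1/16. It is only an illustration: the helper names are hypothetical, and the mutual information is used here merely as a convenient classical measure of the correlations, not as a quantity computed in the paper.

```python
import numpy as np


def correlated_distribution(sigma):
    """Joint distribution p(m, n) = delta_{n, sigma(m)} / 4 on {0,1,2,3}^2."""
    p = np.zeros((4, 4))
    for m, n in enumerate(sigma):
        p[m, n] = 0.25
    return p


def mutual_information_bits(p):
    """Classical mutual information between the indices m and n, in bits."""
    px = p.sum(axis=1, keepdims=True)
    py = p.sum(axis=0, keepdims=True)
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / (px * py)[mask])))


p_switch = correlated_distribution([0, 1, 2, 3])  # perfect correlations
p_sigma = correlated_distribution([1, 0, 3, 2])   # sigma swaps 0<->1 and 2<->3
p_uncorr = np.full((4, 4), 1 / 16)                # no correlations

marginals = [p.sum(axis=1) for p in (p_switch, p_sigma, p_uncorr)]
infos = [mutual_information_bits(p) for p in (p_switch, p_sigma, p_uncorr)]
```

All three distributions have uniform marginals, so a particle sent at a definite time cannot tell them apart; they differ only in how the two noise indices are correlated.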
With the above choice, the effective channel describing the transmission of the message is with The derivation of this formula is provided in Appendix D. Note that the channel E ω depends only on the phase differences φ 1 −φ 0 and φ 3 −φ 2 , via Eq. (21).
We now provide a lower bound to the classical capacity of the channel E ω . As we did earlier in the paper, we lower bound the classical capacity by the Holevo capacity, and, in turn, we lower bound the Holevo capacity by restricting the maximisation to orthogonal input ensembles. For the state of the control qubit, we pick ω = |+⟩⟨+|, which is the choice that maximises the Holevo capacity (cf. Lemma 3 in Appendix B).
The lower bound to the classical capacity is shown in Figure 6b for all possible values of the phase differences φ 1 − φ 0 and φ 3 − φ 2 . The highest lower bound over all combinations of phases $\{\phi_m\}_{m=0}^{3}$ is 0.31 bits. This value is larger than the classical capacity of 0.049 bits achieved by the quantum SWITCH, corresponding to perfect correlations $p_A(m,n) = p_B(m,n) = \delta_{m,n}/4$. This result implies that not only can time correlations reproduce the superposition of causal orders, but they can also surpass its advantages.

V. NOISE ON THE CONTROL DEGREE OF FREEDOM
So far we have assumed that the message-carrying degree of freedom of the particle undergoes noise during transmission, while the control degree of freedom is noiseless. However, in practical scenarios, this will only be an approximation to the actual physics. We now briefly discuss the effect of noise on the control system, focussing in particular on dephasing noise, of the form

$$\omega \mapsto (1-s)\,\omega + s\,Z\omega Z,$$

where s ∈ [0, 1/2] is a probability and ω is the initial state of the control. For a more detailed investigation into the effects of noise on the control system, we refer the reader to a recent related work [67]. For simplicity, here we focus on the communication scenario involving a single transmission line, as in Figure 1. In this setting, the evolution experienced by a single particle is described by the channel obtained by dephasing the control system at the output of the channel C ω in Eq. (5). By inserting the expression (5) into this definition, it is immediate to see that the effect of dephasing is to dampen the interference term G in the effective channel (5): specifically, the interference term changes from G to (1 − 2s) G.

[Figure 7, caption only partly recovered: ..., where σ is the permutation that exchanges 0 with 1, and 2 with 3, and φ0 = φ1 = φ2 = 0, φ3 = π/2.]
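The damping factor (1 − 2s) can be checked with a few lines of linear algebra. The sketch below assumes that the dephasing acts on the control as ω ↦ (1 − s) ω + s ZωZ, and that the output of the effective channel has the generic block form "R ⊗ diagonal part of ω + G ⊗ off-diagonal part of ω"; the matrices `R` and `G` are arbitrary placeholders rather than the operators of Eq. (5).

```python
import numpy as np

Z = np.diag([1.0, -1.0]).astype(complex)
I2 = np.eye(2, dtype=complex)


def dephase_control(joint, s):
    """Apply omega -> (1-s) omega + s Z omega Z on the control (second) factor."""
    iz = np.kron(I2, Z)
    return (1 - s) * joint + s * iz @ joint @ iz


def block_form(R, G, omega):
    """R (x) diag(omega) + G (x) offdiag(omega), ordering message (x) control."""
    diag_part = np.diag(np.diag(omega))
    return np.kron(R, diag_part) + np.kron(G, omega - diag_part)


# Placeholder message operators (hypothetical values, for illustration only).
R = np.array([[0.5, 0.0], [0.0, 0.5]], dtype=complex)
G = np.array([[0.1, 0.05j], [-0.05j, 0.2]], dtype=complex)
plus = np.full((2, 2), 0.5, dtype=complex)   # control state |+><+|

s = 0.2
before = block_form(R, G, plus)
after = dephase_control(before, s)
expected = block_form(R, (1 - 2 * s) * G, plus)
```

With these assumptions `after` coincides with `expected`: the diagonal blocks are untouched, and the interference term is rescaled by 1 − 2s, vanishing only at s = 1/2.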
In the case of completely depolarising channels on the message degree of freedom, the presence of a non-zero interference term means that, as long as the dephasing of the control is not complete (s ≠ 1/2), the superposition of evolutions can still allow for a non-zero amount of classical information to be transmitted, thereby offering an advantage over the transmission at a definite moment of time. Figure 7 shows the behaviour of the classical capacity as a function of the dephasing parameter s. The figure shows that correlations between two uses of the channel offer an enhancement of the classical capacity. To make this point, we first evaluate numerically the maximum capacity achievable in the lack of correlations, with arbitrary realisations of the completely depolarising channel (blue curve). Notably, the capacity for every fixed value of s is achieved by the same realisation of the completely depolarising channel that achieves the maximum capacity in the ideal s = 0 case. We then show that a higher capacity can be achieved with the correlated channel described in Subsection III B. To this purpose, we numerically evaluate a lower bound to the Holevo capacity (and therefore to the classical capacity), obtained by restricting the maximisation to orthogonal input ensembles (orange curve). Note that both the blue and orange curves are above 0 for every non-maximal amount of dephasing (s ≠ 1/2), meaning that the single-particle transmission at a coherent superposition of times offers an advantage over the transmission at a definite time.

VI. CONCLUSIONS
We have shown that a single quantum particle can sense the correlations between noisy processes at different moments of time. By sending the particle at a superposition of different times, one can take advantage of these correlations and boost the communication rate to values that would be impossible if the moment of transmission were a classical, well-defined variable.
An important avenue for future research is the experimental realisation of our protocols, as well as the experimental exploration of their robustness to timing errors and to decoherence between the two different modes used to create the superposition. On the theoretical side, it is interesting to apply our framework for single-particle communication to more complex scenarios, e.g. involving the transmission of a single particle at more than two times, or even in continuous time. It is also interesting to analyse other communication tasks, such as the two-way communication proposed in Ref. [68]. Moreover, the extension from single-particle communication to other communication protocols with a finite number of particles is a natural next step of this research.
At the foundational level, time-correlated channels provide an insight into the resources used by the existing experiments on the superposition of causal order. We analysed a basic setup that reproduces the overall result of the quantum SWITCH by sending a single particle in a superposition of paths through time-correlated channels. In this setup, we showed that time correlations are a necessary resource to reproduce the communication advantages of the quantum SWITCH. Moreover, we observed that, with more elaborate patterns of correlations, one can achieve an even greater enhancement than the one found for the superposition of orders. This result establishes time-correlated channels as an appealing resource, which can be used as a testbed for foundational results on causal order, and, at the same time, as a building block for new communication protocols.

Costa, Daniel Ebler, Sina Salek, and Carlo Sparaciari. The numerical simulations presented in this paper were written using the Python software package QuTiP and the circuit diagrams were drawn using TikZiT. This work is supported by the National Natural Science

Figure 8. The left-hand side depicts a 2-step correlated quantum channel B taking two input states on systems S (1) and S (2) , in succession. The right-hand side shows the physical implementation of the 2-step channel via two unitary channels W1 and W2 [51,52] where the memory between the two uses of the channel is realised by an environment E, which is inaccessible to the communicating parties.

Here we provide a mathematical framework for describing the transmission of a single particle at a superposition of different times, and, more generally, for describing the transmission of the particle on a superposition of different trajectories, each passing through one of the ports of a multiport quantum device.

Multiport quantum devices and their vacuum extensions
A transmission line with a single input port is described by a quantum channel, that is, a completely positive trace-preserving map transforming density matrices on the particle's Hilbert space. In the following we will denote by Chan(S → S′) the set of quantum channels with input system S and (possibly different) output system S′. When S = S′ we will use the shorthand Chan(S). The action of a quantum channel A on a density matrix ρ can be conveniently written in the Kraus representation

$$\mathcal A(\rho) = \sum_{i=0}^{r-1} A_i\,\rho\,A_i^\dagger, \qquad \sum_{i=0}^{r-1} A_i^\dagger A_i = I.$$

A device with k ports is described by a k-partite quantum channel, with k input-output pairs $(S^{(i)}, S'^{(i)})_{i=1}^{k}$. A transmission line that can be used k times in succession is described by a k-step quantum channel [6] (also known as a quantum k-comb [51,52]). A k-step quantum channel is a special type of k-partite channel B with the additional property that no signal propagates from an input $S^{(i)}$ to any group of outputs $S'^{(j)}$ with j < i [51]. We will denote the set of k-step quantum channels as $\mathrm{Chan}(S^{(1)} \to S'^{(1)}, \dots, S^{(k)} \to S'^{(k)})$, or simply $\mathrm{Chan}(S^{(1)}, \dots, S^{(k)})$ when the input and output of each pair coincide. For k = 2, an example of a 2-step quantum channel is illustrated in Figure 8.

The possibility that no particle is sent through a port of a device can be described using the notion of vacuum extension [32]. Consider first a single-port device, described by an ordinary quantum channel A ∈ Chan(S). When no particle is sent through the device, we describe the input as the vacuum state |vac⟩, that is, a state in a vacuum sector Vac [32,33,69,70], which is orthogonal to the one-particle sector S. Overall, the device acts on an extended system $\widetilde S := S \oplus \mathrm{Vac}$, which is associated with the Hilbert space $\mathcal H_S \oplus \mathcal H_{\mathrm{Vac}}$, where $\mathcal H_{\mathrm{Vac}}$ is the vacuum Hilbert space, here assumed to be one-dimensional.
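Given a quantum channel $\mathcal A$, a vacuum extension $\widetilde{\mathcal A} \in \mathrm{Chan}(\widetilde S)$ of $\mathcal A$ is any channel which acts as $\mathcal A$ (respectively, $\mathcal I_{\mathrm{Vac}}$) when the input is a state in sector S (respectively, Vac).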

The Kraus operators of $\widetilde{\mathcal A}$ are of the form

$$\widetilde A_i = A_i \oplus \alpha_i\,|\mathrm{vac}\rangle\langle\mathrm{vac}|, \qquad i = 0, \dots, r-1,$$

where $\{A_i\}_{i=0}^{r-1}$ is a Kraus representation of $\mathcal A$, and $\{\alpha_i\}_{i=0}^{r-1}$ are vacuum amplitudes satisfying $\sum_i |\alpha_i|^2 = 1$.

A given channel has infinitely many possible vacuum extensions. In an actual communication scenario, the vacuum extension can be determined by probing the action of the channel on superpositions of the vacuum and one-particle states. Physically, the choice of vacuum extension is determined by the Hamiltonian of the field describing the vacuum and the one-particle sector.
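The following sketch illustrates how a vacuum extension can be represented numerically and how it is probed by a superposition of vacuum and one-particle states. It assumes Kraus operators of the form A_i ⊕ α_i |vac⟩⟨vac| with Σ|α_i|² = 1, a Pauli realisation of the completely depolarising qubit channel, and the vacuum as the last basis vector; the phase values and the helper name `vacuum_extend` are arbitrary choices made for illustration.

```python
import numpy as np

PAULIS = [np.eye(2, dtype=complex),
          np.array([[0, 1], [1, 0]], dtype=complex),
          np.array([[0, -1j], [1j, 0]], dtype=complex),
          np.array([[1, 0], [0, -1]], dtype=complex)]


def vacuum_extend(kraus, alphas):
    """Return the operators A_i (+) alpha_i |vac><vac| on S (+) Vac."""
    d = kraus[0].shape[0]
    extended = []
    for a, alpha in zip(kraus, alphas):
        ext = np.zeros((d + 1, d + 1), dtype=complex)
        ext[:d, :d] = a          # action on the one-particle sector
        ext[d, d] = alpha        # vacuum amplitude
        extended.append(ext)
    return extended


phases = [0.0, 0.4, 1.3, 2.1]                       # arbitrary example phases
kraus = [v / 2 for v in PAULIS]                     # depolarising channel
alphas = [np.exp(1j * phi) / 2 for phi in phases]   # sum |alpha_i|^2 = 1
ext = vacuum_extend(kraus, alphas)

# Trace preservation on the extended space.
completeness = sum(k.conj().T @ k for k in ext)

# Probe with the coherence |psi><vac| between a one-particle state and the vacuum.
psi = np.array([0.6, 0.8j, 0.0])
vac = np.array([0.0, 0.0, 1.0])
coherence_out = sum(k @ np.outer(psi, vac.conj()) @ k.conj().T for k in ext)

# The surviving coherence is F|psi><vac| with F = sum_i conj(alpha_i) A_i.
F = sum(np.conj(al) * k for al, k in zip(alphas, kraus))
```

The operator F obtained in the last line is what distinguishes different vacuum extensions of the same channel: the one-particle block of the output is always maximally mixed here, whereas the particle-vacuum coherence depends on the chosen phases.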
The notion of vacuum extension can be easily extended to the case of k-partite channels, which include k-step channels as a special case. For simplicity, we focus on the k = 2 case, but the extension to k ≥ 2 is straightforward. Consider a transmission line described by a bipartite channel $\mathcal B \in \mathrm{Chan}(S^{(1)} \otimes S^{(2)})$. A vacuum extension of the channel $\mathcal B$ is another bipartite channel $\widetilde{\mathcal B} \in \mathrm{Chan}(\widetilde S^{(1)} \otimes \widetilde S^{(2)})$, acting on the extended systems $\widetilde S^{(1)} := S^{(1)} \oplus \mathrm{Vac}^{(1)}$ and $\widetilde S^{(2)} := S^{(2)} \oplus \mathrm{Vac}^{(2)}$. In general, the systems $S^{(1)}$, $S^{(2)}$ can represent the systems accessible at the same location at two consecutive moments of time, or they can represent the systems accessible at different locations at the same time (as considered in Refs. [31,32]), or, more generally, they can represent any pair of independently addressable systems, representing the input/output ports of our multiport device.

A single particle travelling through multiple ports
In order to be able to send the same quantum particle to either of the ports of the device, we require the isomorphism S (1) ∼ = S (2) ∼ = M , where M is the message-carrying degree of freedom of the particle. In this case, the tensor product S (1) ⊗ S (2) contains a no-particle sector Vac (1) ⊗ Vac (2) , a one-particle sector (S (1) ⊗ Vac (2) ) ⊕ (Vac (1) ⊗ S (2) ), and a two-particle sector S (1) ⊗ S (2) . The one-particle sector is isomorphic to M ⊗ C, where C is a qubit system, representing the degree of freedom of the particle that controls its time of transmission. When the control is in state |0 , the message is sent through the first application of the channel and the vacuum is sent in the second application; vice versa for the control in state |1 .
We now define the situation in which a single particle is sent at a superposition of two different ports. We call the process experienced by the particle the superposition channel $\mathcal S(\widetilde{\mathcal B})$, and define it as the restriction of $\widetilde{\mathcal B}$ to the one-particle sector, regarded as isomorphic to the composite system "message + control." Explicitly, the action of the superposition channel is defined as

$$\mathcal S(\widetilde{\mathcal B}) := \mathcal U^\dagger \circ \widetilde{\mathcal B} \circ \mathcal U, \qquad (A1)$$

where $\mathcal U(\cdot) := U(\cdot)U^\dagger$ is the isomorphism between M ⊗ C and the one-particle sector $(S^{(1)} \otimes \mathrm{Vac}) \oplus (\mathrm{Vac} \otimes S^{(2)})$, with

$$U(|\psi\rangle\otimes|0\rangle) := |\psi\rangle\otimes|\mathrm{vac}\rangle, \qquad U(|\psi\rangle\otimes|1\rangle) := |\mathrm{vac}\rangle\otimes|\psi\rangle. \qquad (A2)$$

Mathematically, the transformation $\mathcal S : \mathrm{Chan}(\widetilde S^{(1)} \otimes \widetilde S^{(2)}) \to \mathrm{Chan}(M \otimes C)$ is a quantum supermap, that is, a transformation from quantum channels to quantum channels satisfying appropriate consistency requirements [37,52,71]. An illustration of the supermap $\mathcal S$ is provided in subfigure 9a. Note that definition (A1) can be applied in particular to k-step quantum channels, which are a special case of k-partite channels. The illustration of the supermap $\mathcal S$ in this special case is provided in subfigure 9b.
The same definition can be adopted for the transmission of a single particle through a k-partite multiport device. In this case, the device is represented by a k-partite quantum channel $\mathcal B \in \mathrm{Chan}(S^{(1)} \otimes \cdots \otimes S^{(k)})$, with $S^{(1)} \cong S^{(2)} \cong \cdots \cong S^{(k)}$, and with vacuum extension $\widetilde{\mathcal B} \in \mathrm{Chan}(\widetilde S^{(1)} \otimes \cdots \otimes \widetilde S^{(k)})$. The superposition channel is then defined as the restriction of $\widetilde{\mathcal B}$ to the one-particle sector, which is isomorphic to M ⊗ C, where C is now a k-dimensional control system.
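The construction of the superposition channel can be sketched numerically for two independent uses of a vacuum-extended channel. The code below is an illustration under stated assumptions: qubit messages, a Pauli realisation of the completely depolarising channel with arbitrary example phases, the vacuum as the last basis vector, and an isometry U that sends |ψ⟩|0⟩ to |ψ⟩|vac⟩ and |ψ⟩|1⟩ to |vac⟩|ψ⟩. The function names are hypothetical. The result is compared with the block form "(A + F·F†)/2 ⊗ ω + (A − F·F†)/2 ⊗ ZωZ", where F = Σ conj(α_i) A_i.

```python
import numpy as np

d = 2
PAULIS = [np.eye(2, dtype=complex),
          np.array([[0, 1], [1, 0]], dtype=complex),
          np.array([[0, -1j], [1j, 0]], dtype=complex),
          np.array([[1, 0], [0, -1]], dtype=complex)]
Z = PAULIS[3]


def vacuum_extend(kraus, alphas):
    """Operators A_i (+) alpha_i |vac><vac| on the (d+1)-dimensional space."""
    out = []
    for a, alpha in zip(kraus, alphas):
        ext = np.zeros((d + 1, d + 1), dtype=complex)
        ext[:d, :d] = a
        ext[d, d] = alpha
        out.append(ext)
    return out


def one_particle_isometry():
    """Isometry from M (x) C into the one-particle sector of two extended ports."""
    U = np.zeros(((d + 1) ** 2, 2 * d), dtype=complex)
    for j in range(d):
        U[j * (d + 1) + d, 2 * j + 0] = 1   # |j>|0> -> |j> (x) |vac>
        U[d * (d + 1) + j, 2 * j + 1] = 1   # |j>|1> -> |vac> (x) |j>
    return U


def superposition_channel(ext1, ext2, state):
    """Restrict the product of two extended channels to the one-particle sector."""
    U = one_particle_isometry()
    out = np.zeros((2 * d, 2 * d), dtype=complex)
    for a in ext1:
        for b in ext2:
            c = U.conj().T @ np.kron(a, b) @ U
            out += c @ state @ c.conj().T
    return out


phases = np.array([0.0, 0.4, 1.3, 2.1])      # arbitrary example phases
kraus = [v / 2 for v in PAULIS]
alphas = np.exp(1j * phases) / 2
ext = vacuum_extend(kraus, alphas)
F = sum(np.conj(al) * k for al, k in zip(alphas, kraus))

rho = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)
omega = np.full((2, 2), 0.5, dtype=complex)  # control state |+><+|
lhs = superposition_channel(ext, ext, np.kron(rho, omega))
interference = F @ rho @ F.conj().T
rhs = (np.kron((np.eye(2) / d + interference) / 2, omega)
       + np.kron((np.eye(2) / d - interference) / 2, Z @ omega @ Z))
```

In this sketch `lhs` and `rhs` agree, which shows that the only trace of the input state in the output is the interference term F ρ F†, sitting in the off-diagonal blocks of the control.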

Derivation of Eq. (4) in the main text
We now specialise to the case of correlated channels of the random unitary form

$$\mathcal R = \sum_{m,n} p(m,n)\,\mathcal V_m \otimes \mathcal V_n,$$

where $\mathcal V_m(\cdot) := V_m(\cdot)V_m^\dagger$ is a unitary channel, $\{V_m\}$ is a set of unitary gates, and p(m, n) is a joint probability distribution. The vacuum extension of each unitary $V_m$ is taken to be another unitary $\widetilde V_m$, which we write as

$$\widetilde V_m = V_m \oplus e^{i\phi_m}\,|\mathrm{vac}\rangle\langle\mathrm{vac}|, \qquad (A5)$$

where the vacuum amplitude is given by a complex phase, representing the coherent action of each possible noisy process on the one-particle and vacuum sectors. This leads to the vacuum extension

$$\widetilde{\mathcal R} = \sum_{m,n} p(m,n)\,\widetilde{\mathcal V}_m \otimes \widetilde{\mathcal V}_n,$$

with $\widetilde{\mathcal V}_m(\cdot) := \widetilde V_m(\cdot)\widetilde V_m^\dagger$, which is equivalent to Equation (1) in the main text, with $U_m = \widetilde V_m$.

Figure 9. (a) Transmission of a single particle through a bipartite quantum channel B (green). (b) Transmission of a single particle through a 2-step quantum channel B (green). In both cases, the particle is represented by a composite system M ⊗ C, where M represents the degrees of freedom used as the message, and C represents the degrees of freedom used as the control. The isomorphism U converts the composite system M ⊗ C into the one-particle sector (S (1) ⊗ Vac) ⊕ (Vac ⊗ S (2) ) of the two extended systems. The inverse map U † converts the output state back into M ⊗ C. For the applications in this paper, we take the input of the control system C to be fixed in the state ω whilst the message system M is accessible to the sender.
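The use of the channel $\mathcal R$, specified by the vacuum extension $\widetilde{\mathcal R}$, at a superposition of times is given by

$$\mathcal S(\widetilde{\mathcal R}) = \mathcal U^\dagger \circ \widetilde{\mathcal R} \circ \mathcal U.$$

Explicitly, we have the expression

$$\mathcal S(\widetilde{\mathcal R})(\rho\otimes\omega) = \sum_{m,n} C_{mn}\,(\rho\otimes\omega)\,C_{mn}^\dagger, \qquad (A8)$$

where ρ (respectively, ω) is an arbitrary state of the message (respectively, control), and

$$C_{mn} := \sqrt{p(m,n)}\,\left(e^{i\phi_n}\,V_m\otimes|0\rangle\langle 0| + e^{i\phi_m}\,V_n\otimes|1\rangle\langle 1|\right),$$

$e^{i\phi_m}$ being the vacuum amplitude in Eq. (A5). Eq. (A8) coincides with Equation (4) in the main text, with $\mathcal C := \mathcal S(\widetilde{\mathcal R})$ and $W_{mn} := C_{mn}/\sqrt{p(m,n)}$.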

Derivation of Eq. (5)-(7) in the main text
It is useful to consider the case where the probability distribution p(m, n) is symmetric, that is, p(m, n) = p(n, m) for every m and n. In this case, the superposition channel has the simple expression where Z is the unitary channel associated to the Pauli matrix Z, R 1 is the reduced channel defined by This section refers to the scenario where the message is transmitted at a superposition of two possible times, experiencing independent noisy processes that are completely depolarising in the one-particle subspace. This section makes use of the notation introduced in Appendix A.

Proof that the superposition of uncorrelated completely depolarising channels is entanglement-breaking
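Let $\mathcal A(\cdot) = \sum_{m=0}^{r-1} A_m(\cdot)A_m^\dagger \in \mathrm{Chan}(S)$ be a generic quantum channel, and let $\widetilde{\mathcal A} \in \mathrm{Chan}(\widetilde S)$ be a vacuum extension of $\mathcal A$. Using Eq. (A1), we obtain

$$\mathcal S(\widetilde{\mathcal A}\otimes\widetilde{\mathcal A}) = \frac{\mathcal A + F\cdot F^\dagger}{2}\otimes\mathcal I + \frac{\mathcal A - F\cdot F^\dagger}{2}\otimes\mathcal Z, \qquad (B1)$$

where $\mathcal I$ (respectively, $\mathcal Z$) is the identity channel (respectively, the Pauli channel corresponding to the Pauli matrix Z), and

$$F := \sum_{m=0}^{r-1} \overline{\alpha}_m\,A_m \qquad (B2)$$

is the vacuum interference operator defined in Ref. [32]. Now, let $\mathcal A$ be the completely depolarising channel $\mathcal D : \rho \mapsto I/d$, with vacuum extension $\widetilde{\mathcal D}$. For a fixed state ω of the control system, consider the effective channel defined by

$$\mathcal C_{\omega,F}(\rho) := \mathcal S(\widetilde{\mathcal D}\otimes\widetilde{\mathcal D})(\rho\otimes\omega) = \frac{I/d + F\rho F^\dagger}{2}\otimes\omega + \frac{I/d - F\rho F^\dagger}{2}\otimes Z\omega Z. \qquad (B3)$$

For d = 2, we have the following result:

Proposition 1. For d = 2, the channel $\mathcal C_{\omega,F}$ is entanglement-breaking.

The proof uses the following lemma:

Lemma 2. For every pair of unit vectors |v⟩ and |w⟩, the vacuum interference operator satisfies the bound $|\langle v|F|w\rangle| \le \sqrt{\langle v|\mathcal A(|w\rangle\langle w|)|v\rangle}$.

If $\mathcal D$ is the completely depolarising channel, then $\mathcal D(|w\rangle\langle w|) = I/d$ and therefore the bound becomes $|\langle v|F|w\rangle| \le 1/\sqrt d$, which implies $\|F\|_\infty \le 1/\sqrt d$.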
We are now ready to provide the proof of Proposition 1.
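Proof of Proposition 1. To prove that a channel is entanglement-breaking, it is sufficient to show that it transforms a maximally entangled state into a separable state [61]. Let $|\Phi^+\rangle = \sum_{k=0}^{d-1} |k\rangle\otimes|k\rangle/\sqrt d$ be the canonical maximally entangled state. When the channel $\mathcal C_{\omega,F}$ is applied to the first system, the output state is

$$\frac{1}{2}\left(\frac{I\otimes I}{d^2} + G_F\right)\otimes\omega + \frac{1}{2}\left(\frac{I\otimes I}{d^2} - G_F\right)\otimes Z\omega Z, \qquad (B6)$$

with $G_F := (F\otimes I)(|\Phi^+\rangle\langle\Phi^+|)(F\otimes I)^\dagger$.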
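We now show that the operators $\frac{I\otimes I}{d^2} \pm G_F$ are proportional to states with positive partial transpose. To this purpose, note that the partial transpose of $G_F$ on the second space is

$$G_F^{T_2} = \frac{(F\otimes I)\,\mathrm{SWAP}\,(F^\dagger\otimes I)}{d}.$$

Hence, for every unit vector |Ψ⟩ we have the bound

$$\left|\langle\Psi|G_F^{T_2}|\Psi\rangle\right| \le \frac{\|(F^\dagger\otimes I)|\Psi\rangle\|^2}{d} \le \frac{\|F\|_\infty^2}{d} \le \frac{1}{d^2}, \qquad (B8)$$

where the first inequality follows from Schwarz' inequality, and the last inequality follows from Lemma 2. Using Eq. (B8), we obtain the relation

$$\langle\Psi|\left(\frac{I\otimes I}{d^2} \pm G_F^{T_2}\right)|\Psi\rangle \ge 0.$$

Since |Ψ⟩ is an arbitrary vector, we conclude that the operator $\frac{I\otimes I}{d^2} \pm G_F$ has positive partial transpose. For d = 2, the Peres-Horodecki criterion [72, 73] guarantees that $\frac{I\otimes I}{4} \pm G_F$ is proportional to a separable state. Hence, the whole output state (B6) is separable.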
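The positivity argument above can be checked numerically for a specific vacuum interference operator. The sketch below is illustrative only: it assumes d = 2, a Pauli realisation with F = Σ e^{−iφ_m} V_m / 4, and arbitrarily chosen example phases; `partial_transpose_second` is a hypothetical helper that transposes the second tensor factor.

```python
import numpy as np

d = 2
PAULIS = [np.eye(2, dtype=complex),
          np.array([[0, 1], [1, 0]], dtype=complex),
          np.array([[0, -1j], [1j, 0]], dtype=complex),
          np.array([[1, 0], [0, -1]], dtype=complex)]

phases = (0.0, 0.4, 1.3, 2.1)  # arbitrary example phases
F = sum(np.exp(-1j * phi) * v for phi, v in zip(phases, PAULIS)) / 4

# Canonical maximally entangled state and the operator G_F.
phi_plus = np.eye(d).reshape(d * d) / np.sqrt(d)
FI = np.kron(F, np.eye(d))
G = FI @ np.outer(phi_plus, phi_plus.conj()) @ FI.conj().T


def partial_transpose_second(op):
    """Transpose the second tensor factor of an operator on C^d (x) C^d."""
    return op.reshape(d, d, d, d).transpose(0, 3, 2, 1).reshape(d * d, d * d)


G_pt = partial_transpose_second(G)
scaled_identity = np.eye(d * d) / d ** 2

# Smallest eigenvalues of I/d^2 +/- G_F and of their partial transposes.
min_eigs = [np.linalg.eigvalsh(scaled_identity + sign * op).min()
            for sign in (+1, -1) for op in (G, G_pt)]
norm_F = np.linalg.norm(F, 2)   # operator norm, bounded by 1/sqrt(d)
```

All four minimal eigenvalues are non-negative up to rounding, in line with the claim that both operators are positive and have positive partial transpose.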

Optimal control state for maximizing the Holevo capacity
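Proposition 1 implies that the classical capacity of the channel $\mathcal C_{\omega,F}$ is equal to its Holevo capacity (see [62]). Here we show that the Holevo capacity is maximised by the state ω = |+⟩⟨+|. In fact, we prove a more general result:

Lemma 3. Let $\mathcal C_\omega$ be an arbitrary channel of the form

$$\mathcal C_\omega(\rho) = \mathcal L_+(\rho)\otimes\omega + \mathcal L_-(\rho)\otimes Z\omega Z, \qquad (B10)$$

where $\mathcal L_\pm$ are arbitrary linear maps. Then, for every density matrix ω, the Holevo capacity satisfies the bound

$$\chi(\mathcal C_\omega) \le \chi(\mathcal C_{|+\rangle\langle +|}).$$

Proof. The Holevo capacity is known to be monotonically decreasing under the action of quantum channels, namely $\chi(\mathcal E) \ge \chi(\mathcal F\circ\mathcal E)$ for every pair of channels $\mathcal E$ and $\mathcal F$. For every channel $\mathcal C_\omega$ of the form (B10), we have the relation

$$\mathcal C_\omega = (\mathcal I\otimes\mathcal P_\omega)\circ\mathcal C_{|+\rangle\langle +|},$$

where $\mathcal P_\omega$ is the quantum channel defined by

$$\mathcal P_\omega(\gamma) := \langle +|\gamma|+\rangle\,\omega + \langle -|\gamma|-\rangle\,Z\omega Z$$

for an arbitrary state γ. Hence, we have $\chi(\mathcal C_\omega) = \chi((\mathcal I\otimes\mathcal P_\omega)\circ\mathcal C_{|+\rangle\langle +|}) \le \chi(\mathcal C_{|+\rangle\langle +|})$.

where F is the vacuum interference operator defined in Eq. (B2).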
Proof. For a fixed vacuum extension, and therefore for a fixed vacuum interference operator F , the Holevo capacity of the channel C ω is upper bounded by the Holevo capacity of the channel C |+ +|,F (Lemma 3). Hence, it is enough to prove the bound for the channel C |+ +| .
Note that the output of channel C |+ +|,F has dimension 2d. For a generic channel E with (2d)-dimensional output, the Holevo capacity is upper bounded as [74] where H(ρ) := − Tr[ρ log ρ] is the von Neumann entropy, and the minimisation can be restricted without loss of generality to pure states. We now upper bound the right-hand-side of Eq. (B14) for E = C |+ +|,F . The action of the channel C |+ +|,F on a generic input state ρ is as one can deduce from Eqs. (B3) and (B1).
In the case of a pure state ρ = |ψ ψ|, we write F |ψ = k |ϕ , where |ϕ is a unit vector and k is a normalisation constant. With this notation, we obtain with P ⊥ := I − |ϕ ϕ|. The von Neumann entropy of this state is Now, note that one has where the last inequality follows from Lemma 2. The expression (B17) is monotonically decreasing for k in the Hence, one has the lower bound Inserting this expression into Eq. (B14) with E = C |+ +|,F , we then obtain Eq. (B13).
Proof. Immediate from the fact that the right-handside of Eq. (B13) is monotonically decreasing with F ∞ , and that F ∞ is upper bounded by 1/ √ d (Lemma 2).
Appendix C: Maximisation of the Holevo information for the superposition of independent depolarising channels

Here we prove a series of results that enable a complete numerical maximisation of the Holevo information of the channel (B3) over all input ensembles, over all states of the control system, and over all vacuum extensions of the completely depolarising channel. This Appendix makes use of notation introduced in the previous appendices.
Let us start from the maximisation over the vacuum extensions, which are in one-to-one correspondence with the possible operators F.

Lemma 6. Without loss of generality, the operator F that maximises the Holevo information of the channel $\mathcal C_{\omega,F}$ can be taken to be of the form F = a |0⟩⟨0| + b |1⟩⟨1|, with a² + b² ≤ 1/d, a, b ≥ 0.

Proof. Using the singular value decomposition, F can be written as F = U F′ V, where U and V are suitable unitary matrices, and F′ is diagonal in the basis {|0⟩, |1⟩}. Now the capacity of the channel $\mathcal C_{\omega,F}$ is equal to the capacity of the channel $\mathcal C_{\omega,F'} = (\mathcal U\otimes\mathcal I_C)^\dagger\circ\mathcal C_{\omega,F}\circ\mathcal V^\dagger$, where $\mathcal U^\dagger$ and $\mathcal V^\dagger$ are the inverses of the unitary channels associated to the unitary matrices U and V, respectively, and $\mathcal I_C$ is the identity channel on the control system. Notice that F′ is also a vacuum interference operator associated to the completely depolarising channel. Hence, the maximisation of the Holevo capacity can be restricted to channels with diagonal vacuum interference operator.
Next, we note that, for a vacuum extension of the completely depolarising channel, the vacuum interference operator F must satisfy the condition Tr[F†F] ≤ 1/d [31]. For an operator of the form F = a |0⟩⟨0| + b |1⟩⟨1|, this implies the inequality |a|² + |b|² ≤ 1/d. Finally, we show that a, b can be restricted to non-negative numbers. Let W = a′ |0⟩⟨0| + b′ |1⟩⟨1|, where a′ = ā/|a| and b′ = b̄/|b|. Then F″ := W F = F W = |a| |0⟩⟨0| + |b| |1⟩⟨1|. The capacity of the channel $\mathcal C_{\omega,F''} = (\mathcal W\otimes\mathcal I_C)\circ\mathcal C_{\omega,F}$ (where $\mathcal W$ is the unitary channel associated with the unitary W) is equal to the capacity of the channel $\mathcal C_{\omega,F}$. Therefore, the maximisation of the Holevo capacity can be restricted to vacuum interference operators with non-negative coefficients in the computational basis.
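The reduction to a diagonal operator with non-negative entries can be illustrated with a singular value decomposition. The sketch below assumes a Pauli realisation of the qubit depolarising channel, F = Σ e^{−iφ_m} V_m / 4, with arbitrary example phases; it only checks the algebraic facts used in the proof and does not compute any capacity.

```python
import numpy as np

d = 2
PAULIS = [np.eye(2, dtype=complex),
          np.array([[0, 1], [1, 0]], dtype=complex),
          np.array([[0, -1j], [1j, 0]], dtype=complex),
          np.array([[1, 0], [0, -1]], dtype=complex)]

phases = (0.0, 0.4, 1.3, 2.1)  # arbitrary example phases
F = sum(np.exp(-1j * phi) * v for phi, v in zip(phases, PAULIS)) / 4

# F = U diag(a, b) V with unitary U, V and singular values a >= b >= 0.
U, singular_values, V = np.linalg.svd(F)
F_diag = np.diag(singular_values)

# Constraint on the diagonal form: a^2 + b^2 = Tr[F^dagger F] <= 1/d.
frobenius_sq = float(np.sum(singular_values ** 2))

# Conjugating by the unitaries maps F rho F^dagger to F' rho' F'^dagger.
rho = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)
rho_rotated = V @ rho @ V.conj().T
lhs = U.conj().T @ (F @ rho @ F.conj().T) @ U
rhs = F_diag @ rho_rotated @ F_diag.conj().T
```

For this Pauli realisation the constraint is saturated (a² + b² = 1/2), and `lhs` equals `rhs`, which is the statement that the channels with F and with its diagonal form differ only by unitary pre- and post-processing.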
Let us consider now the maximisation over all possible ensembles. The key result here is that the maximisation can be reduced to the optimisation of d vectors with positive coefficients in the computational basis.
Lemma 7. When the operator F is diagonal in the computational basis, the input ensemble that maximises the Holevo information after application of the channel C ω,F can be chosen without loss of generality to be of the form m=0 ω m |m m|, ω := e 2πi/d , and |ψ x is a unit vector with positive coefficients in the computational basis {|m } d−1 m=0 . Proof. When F is diagonal, the channel C ω,F has the covariance property where θ = (θ 0 , θ 1 , . . . , θ d−1 ) is a vector of d phases, and U θ is the unitary channel associated to the unitary matrix valid for every vector |ψ .
For qubit messages (d = 2), we finally obtain an upper bound on the classical capacity: Theorem 9. For every vacuum extension of the completely depolarising channel and for every state of the control qubit, the classical capacity of the channel resulting from the superposition of two independent depolarising qubit channels is upper bounded as Proof. For d = 2, Proposition 1 guarantees that the channel C ω,F is entanglement breaking, and therefore its classical capacity is equal to the Holevo capacity. Lemma 3 guarantees that the maximum of the Holevo capacity is attained by the state ω = |+ +|. Then, Lemma 7 guarantees the maximum of the Holevo capacity of the channel C |+ +|,F can be obtained with a diagonal operator F = a |0 0| + b |1 1|, a, b ≥ 0. The Holevo capacity of C |+ +|,F can be computed explicitly using Corollary 8, with Finally, an upper bound is obtained by relaxing the constraint on a and b to a 2 + b 2 ≤ 1/d (Lemma 7).
Appendix D: Transmission of a single particle through a network of two-step channels

In the following we will use the notation introduced in Appendix A.
We now connect the 2-step channels A and B in such a way that the output of the first use of each channel is fed into the input of the second use of the other channel, as in Figure 10. This particular composition of two 2-step channels is described by a supermap $\mathcal Z$ that maps pairs of channels in $\mathrm{Chan}(\widetilde A^{(1)}, \widetilde A^{(2)}) \times \mathrm{Chan}(\widetilde B^{(1)}, \widetilde B^{(2)})$ into bipartite channels in $\mathrm{Chan}(\widetilde A^{(1)} \otimes \widetilde B^{(1)} \to \widetilde B^{(2)} \otimes \widetilde A^{(2)})$.
We can now consider the scenario in which a single particle is sent in a superposition of going through the A-port and the B-port of the channel $\mathcal Z(\widetilde{\mathcal R}_A, \widetilde{\mathcal R}_B)$. Following Eq. (A1), the evolution of the particle is described by the superposition channel $\mathcal S(\mathcal Z(\widetilde{\mathcal R}_A, \widetilde{\mathcal R}_B))$.

Figure 12. Green: a plot of the classical capacity against $\|F\|_\infty$ for the channel $\mathcal C_{\omega,F}$. Red: a plot of the classical capacity against $\|F^2\|_\infty$ for the channel $\mathcal C_{\omega,F^2}$. In both cases $F = \sum_{m=0}^{3} \frac14 e^{-i\phi_m} V_m$ is sampled over the phase parameters {φ1, φ2, φ3} with a numerical precision of π/8 for each parameter. We set φ0 = 0 without loss of generality, as F ρ F† is invariant under the phase group U(1). The classical capacity is here equal to the Holevo capacity (see Appendix B) and the Holevo capacity was calculated using the methods outlined in Appendix C.

Appendix E: Proofs of the statements in Subsection IV B
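Here we consider the scenario of Figure 11, in the special case where the 2-step channels $\widetilde{\mathcal A}$ and $\widetilde{\mathcal B}$ are of the product form $\widetilde{\mathcal A} = \widetilde{\mathcal A}_1 \otimes \widetilde{\mathcal A}_2$ and $\widetilde{\mathcal B} = \widetilde{\mathcal B}_1 \otimes \widetilde{\mathcal B}_2$, respectively. In this case, the combination of the channels in the network of Figure 10 gives the bipartite channel

$$\mathcal Z(\widetilde{\mathcal A}\otimes\widetilde{\mathcal B}) = \widetilde{\mathcal B}_2\widetilde{\mathcal A}_1 \otimes \widetilde{\mathcal A}_2\widetilde{\mathcal B}_1. \qquad (E1)$$

When a single particle is sent into one of the two ports of this channel, the resulting evolution is described by the superposition channel

$$\mathcal S(\widetilde{\mathcal B}_2\widetilde{\mathcal A}_1 \otimes \widetilde{\mathcal A}_2\widetilde{\mathcal B}_1),$$

where $\mathcal S$ is the supermap defined in Eq. (A1).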
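We now restrict our attention to the case where the channels $\widetilde{\mathcal A}_1$, $\widetilde{\mathcal A}_2$, $\widetilde{\mathcal B}_1$, and $\widetilde{\mathcal B}_2$ are all equal to each other, and are all equal to $\widetilde{\mathcal D}$, a vacuum extension of the completely depolarising channel. In this case, the action of the superposition channel on a generic product state ρ ⊗ ω is

$$\mathcal S(\widetilde{\mathcal D}^2\otimes\widetilde{\mathcal D}^2)(\rho\otimes\omega) = \frac{I/d + F^2\rho\,(F^2)^\dagger}{2}\otimes\omega + \frac{I/d - F^2\rho\,(F^2)^\dagger}{2}\otimes Z\omega Z,$$

where F is the vacuum interference operator associated to channel $\widetilde{\mathcal D}$. The above equation follows from Eq. (B3) and from the observation that the vacuum interference operator of $\widetilde{\mathcal D}^2$ is $F^2$.
Note that one has the equality $\mathcal S(\widetilde{\mathcal D}^2\otimes\widetilde{\mathcal D}^2)(\rho\otimes\omega) = \mathcal C_{\omega,F^2}(\rho)$, using the notation of Eq. (B3). That is, in the lack of correlations the configuration of channels depicted in Figure 11 gives rise to the effective channel in Equation (B1), with F replaced by $F^2$. This means that all of the results in Appendices B-C apply to this scenario as well, with F replaced by $F^2$. In particular, the classical capacity can be determined numerically using Theorem 9, with the maximisation constraint now being that, for the vacuum interference operator $F^2 = g\,|0\rangle\langle 0| + h\,|1\rangle\langle 1|$, one has g + h ≤ 1/d, where g, h ≥ 0.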
The classical capacity of the channels $\mathcal C_{\omega,F}$ and $\mathcal C_{\omega,F^2}$ can be evaluated numerically. For the cases where each completely depolarising channel is implemented by a random unitary channel (cf. Eqs. (4) and (15), respectively, in the main text), Figure 12 shows a scatter plot with the capacities of both channels in the same graph against the norm of the corresponding vacuum interference operator, F or $F^2$, for the same combinations of phases φ 1 , φ 2 , φ 3 as shown in Figs. 4 and 6.
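The horizontal axis of such a scatter plot can be reproduced with a short scan over the phases. The sketch below is an assumption-laden illustration: it takes the Pauli matrices as the unitaries V m , interprets "precision of π/8" as a uniform grid with step π/8 on [0, 2π), fixes φ0 = 0, and computes only the operator norms of F and F². The Holevo maximisation needed for the vertical axis is deliberately omitted.

```python
import itertools

import numpy as np

PAULIS = [np.eye(2, dtype=complex),
          np.array([[0, 1], [1, 0]], dtype=complex),
          np.array([[0, -1j], [1j, 0]], dtype=complex),
          np.array([[1, 0], [0, -1]], dtype=complex)]


def interference_operator(phases):
    """F = sum_m exp(-i phi_m) V_m / 4 for a Pauli realisation (assumed)."""
    return sum(np.exp(-1j * phi) * v for phi, v in zip(phases, PAULIS)) / 4


grid = np.arange(16) * np.pi / 8     # assumed grid: 0, pi/8, ..., 15 pi/8
records = []
for phi1, phi2, phi3 in itertools.product(grid, repeat=3):
    F = interference_operator((0.0, phi1, phi2, phi3))
    records.append((np.linalg.norm(F, 2), np.linalg.norm(F @ F, 2)))
norms = np.array(records)            # columns: ||F||, ||F^2|| (operator norms)

largest_norm_F = norms[:, 0].max()
norm_at_special_point = np.linalg.norm(
    interference_operator((0.0, 0.0, 0.0, np.pi / 2)), 2)
```

Under these assumptions the largest value of ‖F‖ on the grid equals the upper limit 1/√2 allowed by Lemma 2, and it is reached, for instance, at φ0 = φ1 = φ2 = 0, φ3 = π/2, where F has rank one.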