Improved coherent one-way quantum key distribution for high-loss channels

Coherent one-way (COW) quantum key distribution (QKD) is a highly practical quantum communication protocol that is currently deployed in off-the-shelf products. However, despite its simplicity and widespread use, the security of COW-QKD is still an open problem. This is largely due to its unique security feature based on inter-signal phase distribution, which makes it very difficult to analyze using standard security proof techniques. Here, to overcome this problem, we present a simple variant of COW-QKD and prove its security in the infinite-key limit. The proposed modifications only involve an additional vacuum tail signal following every encoded signal and a balanced beam-splitter for passive measurement basis choice. Remarkably, the resulting key rate of our protocol is comparable with both the existing upper bound on the COW-QKD key rate and the secure key rate of the coherent-state BB84 protocol. Our findings therefore suggest that the secure deployment of COW-QKD systems in high-loss optical networks is indeed feasible with minimal adaptations to their hardware and software.


I. INTRODUCTION
Quantum key distribution (QKD) is a promising application of quantum communications where two users, Alice and Bob, exchange quantum signals to establish a common secret key [1,2]. The original ideas of QKD were first presented using the transmission of single-photon states [3], but the field has since evolved to include more practical communication systems based on coherent states. One prime example is the coherent one-way (COW)-QKD protocol [4], which uses a sequence of randomly modulated coherent states with a fixed reference phase to distribute the secret key. In this protocol, each secret bit is encoded into the time-of-arrival of a single light pulse and security is evaluated by checking the optical coherence of consecutive light pulses. The basic idea is to check whether the optical coherence between consecutive light pulses has been disturbed: if an eavesdropper tries to measure the position of the light pulse and learn the secret bit, the optical coherence between adjacent non-vacuum light pulses will be broken. This security feature was originally designed to deter the so-called photon-number-splitting (PNS) attack, which raised serious security concerns when it was first discovered in 2001 [5].
However, the idea of using inter-signal correlation to detect PNS attacks creates a new problem. In particular, it puts the protocol in an unorthodox situation involving the analysis of sequential trains of pulses, which is fundamentally different from the standard setting based on repeated rounds of quantum communications [2]. Consequently, none of the QKD proof techniques developed to date can be applied to COW-QKD. In fact, at the moment, the general security of COW-QKD remains an open problem.
That said, significant progress has been made towards demystifying the security of COW-QKD. Initially, upper bounds on the achievable secret key rate were derived using a specific class of collective attacks, which suggested a secret key rate that is linear in the channel transmittance [6,7]. However, recent studies found tighter upper bounds which feature quadratic scaling [8,9]. These results are significant because they indicate that COW-QKD may not be suitable for ultra-long-distance QKD, in line with what Ref. [10] found based on unambiguous state discrimination (USD) attacks assuming zero-error statistics. While these upper bounds provide a clear idea of what COW-QKD could theoretically achieve with lossy channels, it is not obvious whether lower bounds with quadratic scaling could be obtained. We note that certain variants of COW-QKD have achieved quadratic scaling with more sophisticated optical receivers based on active switching [11,12], but these increase the complexity of the implementation. To better illustrate the current security status of COW-QKD, we plot some of the known upper and lower bounds in Fig. 1 assuming zero-error statistics.
* emilien.lavie@u.nus.edu
In this work, we show that COW-QKD can reach quadratic scaling, close to the bound established in Ref. [9], with only a slight modification of the original protocol. In particular, the proposed protocol is the same as the original protocol, except for (1) an additional vacuum tail signal that is needed to keep the protocol in the standard (iid) setting and (2) a balanced beam-splitter that is used to passively decide the measurement basis. To analyse the security of the protocol, which is based on practical photon-counting detectors, we use a generalised form of universal squashing to map the Hilbert space of the detectors to a two-dimensional Hilbert space [16]. Then, we calculate the achievable secret key rate using the standard Shor-Preskill formalism for qubit channels by optimising the phase error rate given the expected channel statistics [11].
The rest of the paper is organised as follows. In Section II we first provide a detailed model of the protocol implementation (II A) and then show (II B) how to use the universal squashing framework to estimate the statistics of a virtual single-photon protocol based on the expected statistics of the actual protocol. We conclude the security analysis in Section II C where we use a numerical method to estimate a lower bound on the secure key rate of the single-photon protocol. Finally, we present simulated results and discuss their relevance in Section III.
FIG. 1. We compare existing results on the asymptotic security of coherent one-way type protocols. For the sake of comparison, we indicate another popular type of protocol based on phase-encoded coherent states [14]. Ref. [13] is a fundamental upper limit for point-to-point communication. Ref. [9] is a recent result providing an upper bound that scales only quadratically with the channel transmittance. Ref. [12] provides a lower bound on the secret key rate for a modified version of the protocol using an active switch instead of a passive interferometer and an optimised intensity for the test state $|\beta\rangle|\beta\rangle$. The protocol in this paper uses a passive interferometer for Bob as in the original design, and prepared states similar to Ref. [12] with an extra vacuum pulse sent by Alice. See more details in Section II. We also analyse the performance of one of the countermeasures proposed in Ref. [15] using a fourth state composed of vacuum pulses only. Curves shown: Secret Key Capacity [13]; Sequential attack [9]; Coherent states BB84 [14]; COW+vacuum [15]; Active switch [12], $\beta = \alpha/2$; This paper, $\beta = \alpha/2$; Active switch [12], $\beta = \alpha$; This paper, $\beta = \alpha$.

A. Modeling
The protocol we consider here is based on the preparation and measurement of coherent states in three consecutive temporal modes labeled $c_0$, $c_1$ and $c_2$. The global phase information of the laser used to prepare the states is public and known to the adversary. Any other degree of freedom is assumed to be random and does not carry any useful information about the random inputs. Here, the transmitter, Alice, prepares and sends the state $|\varphi_i\rangle$ with probability $p_i$ chosen from a predefined set:
$$|\varphi_0\rangle = |\alpha\rangle_{c_0}|\mathrm{vac}\rangle_{c_1}|\mathrm{vac}\rangle_{c_2}, \quad |\varphi_1\rangle = |\mathrm{vac}\rangle_{c_0}|\alpha\rangle_{c_1}|\mathrm{vac}\rangle_{c_2}, \quad |\varphi_2\rangle = |\beta\rangle_{c_0}|\beta\rangle_{c_1}|\mathrm{vac}\rangle_{c_2}.$$
Note that Alice always sets the third temporal mode $c_2$ to the vacuum state; this is needed to ensure the protocol can be treated in the standard quantum communication setting.
The receiver, Bob, performs the decoding measurement using a passive basis choice setup, which is implemented using a balanced beam-splitter leading to two possible detection lines. The first line is a direct time-of-arrival detection line (key basis labeled Z): a threshold detector measures the presence or absence of photons in each temporal mode. The second line is a monitoring line (test basis labeled X): a Mach-Zehnder interferometer that interferes consecutive pulses to check for good optical coherence.
Bob's monitoring line is such that only the middle temporal mode $c_1$ arriving on the detectors will contain relevant (conclusive) information. The first temporal mode contains only one half of the first pulse sent by Alice that did not interfere with anything (since there is no residual light in the interferometer initially), and the third temporal mode contains half of the middle pulse sent by Alice. Therefore, the conclusive rate in the monitoring line is only half of the one in the direct line. The overall setup is presented in Fig. 2.
FIG. 2. The sequence $|\mathrm{vac}\rangle_{c_2}|\mathrm{vac}\rangle_{c_1}|\alpha\rangle_{c_0}$ represents bit 0, the sequence $|\mathrm{vac}\rangle_{c_2}|\alpha\rangle_{c_1}|\mathrm{vac}\rangle_{c_0}$ represents bit 1 and the sequence $|\mathrm{vac}\rangle_{c_2}|\beta\rangle_{c_1}|\beta\rangle_{c_0}$ is used to test the channel. The last temporal mode $c_2$ is always set to vacuum by Alice to ensure the symmetry of the protocol. Bob uses a beamsplitter to passively choose between a direct line and a monitoring line composed of a Mach-Zehnder interferometer. Here, the coherence is only monitored between temporal modes $c_0$ and $c_1$ within the same train of pulses, and any coherence with another train is ignored. This is to guarantee a symmetry between the rounds of the protocol and ensure that an optimal collective attack is also an optimal coherent attack [17]. There are two minor differences with the original setup presented in Ref. [4]: an additional vacuum state is enforced at the end of each train, and we allow a lower intensity $\beta \leq \alpha$ for the test state $|\varphi_2\rangle$.
There are two minor differences from the original COW protocol [4]. First, the use of a vacuum state at the end of each sequence breaks the coherence between two trains of pulses. Therefore it is only possible to monitor the coherence within a single train of pulses. In the original setup, the coherence between any two non-empty subsequent pulses is monitored. This additional vacuum state makes the security analysis simpler since now an optimal collective attack is also an optimal coherent attack by virtue of symmetries [17]. Second, we allow the possibility of using a lower intensity for the test sequence $|\varphi_2\rangle$, similar to Ref. [12]. We expect that these modifications only require minor changes to the original design: only an additional intensity level is used at the transmitter and no additional phase modulator is required.
Now we provide a more detailed notation to describe the transmitter and the receiver. We label the spatial mode corresponding to the only detector arm in the direct line $a_0$ and the two spatial modes corresponding to the two detector arms in the monitoring line $b_0$, $b_1$ (see Fig. 2).
Additionally, we define nine modes $d_0, \dots, d_8$, one for each combination of a spatial mode and a temporal mode.
We use here the notation $(a_0, c_0)$ to represent the spatio-temporal mode of the direct line $a_0$ during the first time detection window $c_0$, and similarly for the others. We use this notation to avoid confusion later when we use the creation/annihilation operators for the modes. Indeed, populating mode $d_0$ with one photon amounts to considering $d_0^\dagger|\mathrm{vac}\rangle = |1\rangle_{d_0}$, which is very different from the product $a_0^\dagger c_0^\dagger|\mathrm{vac}\rangle_{a_0}|\mathrm{vac}\rangle_{c_0} = |1\rangle_{a_0}|1\rangle_{c_0}$ (one photon in each mode).
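Alice only has access to the spatial mode $a_0$, which is the physical channel (e.g. optical fiber or free space) over which she is sending the train of coherent states. The two spatial modes $b_0$, $b_1$ come from the empty ports of Bob's beamsplitters: one for the basis choice, one for the first beamsplitter in the interferometer. Alice (and Eve) have no access to these and they can only affect modes $d_0$, $d_3$, $d_6$ instead. We rephrase Alice's prepared states:
$$|\varphi_0\rangle = |\alpha\rangle_{d_0}|\mathrm{vac}\rangle_{d_3}|\mathrm{vac}\rangle_{d_6}, \quad |\varphi_1\rangle = |\mathrm{vac}\rangle_{d_0}|\alpha\rangle_{d_3}|\mathrm{vac}\rangle_{d_6}, \quad |\varphi_2\rangle = |\beta\rangle_{d_0}|\beta\rangle_{d_3}|\mathrm{vac}\rangle_{d_6}.$$
We assume all the other unspecified modes are trusted and populated with vacuum states.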
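To make the mode bookkeeping concrete, the sketch below indexes the nine spatio-temporal modes as $d_{3t+s}$, with $t$ the temporal mode and $s$ the spatial mode. This layout is an assumption inferred from the text (modes $d_0$, $d_3$, $d_6$ carry the channel mode $a_0$, and Table I assigns detectors 0, 3, 6 to basis Z); in particular, the ordering of $b_0$ and $b_1$ within each time window and the helper names are hypothetical.

```python
# Hypothetical indexing, inferred from the text: d_j with j = 3*t + s,
# where t labels the temporal mode and s the spatial mode.
SPATIAL = ("a0", "b0", "b1")   # direct line, then the two monitoring-line arms (assumed order)
TEMPORAL = ("c0", "c1", "c2")  # three consecutive detection windows


def mode_index(spatial: str, temporal: str) -> int:
    """Return j such that d_j = (spatial, temporal) under the assumed layout."""
    return 3 * TEMPORAL.index(temporal) + SPATIAL.index(spatial)


def mode_label(j: int) -> tuple[str, str]:
    """Inverse map: d_j -> (spatial mode, temporal mode)."""
    t, s = divmod(j, 3)
    return SPATIAL[s], TEMPORAL[t]


# Modes reachable by Alice (and Eve): the channel mode a0 in each time window.
Z_MODES = [mode_index("a0", t) for t in TEMPORAL]
# Remaining modes belong to the monitoring line.
X_MODES = [j for j in range(9) if j not in Z_MODES]
```

Under this assumed layout, `Z_MODES` reproduces the set $\{d_0, d_3, d_6\}$ quoted in the text.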
Let us now analyse Bob's measurement apparatus. The beamsplitters are operating on two spatial modes at each time-window. We find for the basis choice beamsplitter: and similarly for the others. The delay line can be seen as a shift between the temporal modes: the first temporal mode becomes the second, the second temporal mode becomes the third. The third temporal mode is actually never populated by Alice or Bob and it is simply an artifact of the computation. We can always assume that the delay line is effectively transforming it into the first temporal mode. All in all we have for the only spatial mode that is delayed: The other modes remain unchanged. We represent the overall transformation performed by Bob on the input state (before the detectors) with the circuit in Fig. 3 representing a unitary transformation $U^\dagger$, since its inverse is given by the reverse circuit.
Finally, we describe Bob's POVM. We assume here that the threshold detectors are ideal (no dark counts and 100% efficiency), i.e. each detector will click (event denoted by "C") if and only if there is one photon or more in the associated mode. The complementary event is labelled "N". The operators describing the threshold detector in mode $j$ are: Then the overall POVM is given by: Here, there are $2^9$ possible detection events corresponding to any combination of the detectors clicking or not clicking; we represent them using strings of C and N of length 9. We further classify these events by assigning a basis value and an outcome value to each event according to the rules in Table I. The outcome $\perp$ represents an event that carries no relevant information; we call it an inconclusive outcome, not to be confused with a no-click outcome $\emptyset$.
For a given input state, it is required that the clicking probabilities in each basis are equal, but the conclusive probabilities in each basis might be different. This is the case here since the inconclusive probability in basis Z is essentially 0, but in basis X half of the total clicking probability is actually inconclusive due to the interference of only half of the signal in the middle temporal mode. This can also be understood as a basis-dependent trusted erasure channel operating after a basis-independent filter.
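As a small numerical illustration of the ideal threshold detector described above, the sketch below evaluates the click probability for a coherent state, using the standard fact that a coherent state of mean photon number $\mu$ contains no photon with probability $e^{-\mu}$. The function names and the loss-only channel model (transmittance `eta`, followed by the balanced basis-choice beamsplitter that halves the intensity) are assumptions made for illustration, not the paper's simulation code.

```python
import math


def click_probability(mean_photons: float) -> float:
    """P(click) of an ideal threshold detector (no dark counts, unit
    efficiency) illuminated by a coherent state with the given mean photon number."""
    return 1.0 - math.exp(-mean_photons)


def direct_line_click(alpha: float, eta: float) -> float:
    """Click probability in the occupied time bin of the direct (Z) line.

    Assumes a loss-only channel of transmittance eta and a balanced
    basis-choice beamsplitter sending half of the intensity to the direct line.
    """
    return click_probability(eta * alpha**2 / 2.0)


if __name__ == "__main__":
    for eta in (1.0, 0.1, 0.01):
        print(eta, direct_line_click(alpha=0.5, eta=eta))
```

For small `eta` the click probability is approximately `eta * alpha**2 / 2`, i.e. linear in the transmittance.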
TABLE I. Mapping the detection pattern to basis value and outcome value. If no detector clicked, then the basis is chosen randomly. Any event containing at least one click C0, C3 or C6 contributes to a measurement in basis Z, while any event containing at least one click C1, C2, C4, C5, C7, or C8 contributes to a measurement in basis X. If the detection event contains clicks from the two bases, then the conflict is resolved by choosing one basis or the other at random and ignoring all the detector clicks from the other basis. If at least two different clicks from the same basis occurred, then we assign the special symbol "d" representing a double click.
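The rules stated in the caption of Table I can be sketched as a small classification routine. This is only an illustration of the basis assignment and double-click rule as described in the caption; the mapping from a single click to a bit value or to the inconclusive outcome is not reproduced here, and the function name and return convention are hypothetical.

```python
import random

Z_DETECTORS = {0, 3, 6}            # direct line (basis Z), as listed in Table I
X_DETECTORS = {1, 2, 4, 5, 7, 8}   # monitoring line (basis X), as listed in Table I


def classify(pattern: str, rng: random.Random):
    """Map a 9-character string of 'C'/'N' to (basis, label).

    label is None for a no-click event, 'd' for a double click, and the
    index of the clicking detector for a single click.
    """
    assert len(pattern) == 9 and set(pattern) <= {"C", "N"}
    clicks = {j for j, ch in enumerate(pattern) if ch == "C"}
    z_clicks = clicks & Z_DETECTORS
    x_clicks = clicks & X_DETECTORS
    if not clicks:
        # No detector clicked: the basis is chosen at random.
        return rng.choice("ZX"), None
    if z_clicks and x_clicks:
        # Conflict between bases: pick one at random, ignore the other clicks.
        basis = rng.choice("ZX")
    else:
        basis = "Z" if z_clicks else "X"
    kept = z_clicks if basis == "Z" else x_clicks
    if len(kept) >= 2:
        return basis, "d"
    return basis, next(iter(kept))
```

For example, `classify("CNNCNNNNN", random.Random(0))` contains two clicks in the direct line and is therefore labelled as a double click in basis Z.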

B. Universal squashing
In the previous subsection, we modeled the receiver and defined a few relevant outcomes. Here, we show how to deal with the double clicks to derive the statistics of a virtual single photon protocol whose security will be analysed in the next subsection.
We rely on a generalisation of the universal squashing result first proposed in Ref. [16]. The method proposed in that paper comprises two steps: an equivalence theorem stating that two virtual situations are equivalent and an estimation technique to compute the statistics of one of the two virtual situations using the statistics of the actual protocol.
First, the equivalence theorem (Theorem 1) states that under certain assumptions, the following two situations are equivalent. In Situation 1, Bob receives an $n$-photon single-mode input state and only keeps one photon. The single photon evolves through the unitary operation describing the receiver and is measured in only one output arm $j$ using a threshold detector. In Situation 2, Bob keeps all $n$ photons to interfere through the unitary operation describing the receiver and measures the number of photons in each output arm (let it be $l_j$ for the arm labelled $j$) with a photon-number resolving detector. Bob finally outputs the outcome $j$ with probability $l_j/n$.
Then, the estimation technique tries to estimate the statistics of Situation 2. The general method is simple: if only one detector clicked in the real protocol, then there were one or more photons in that particular arm and the same outcome would have been announced in Situation 2. If multiple detectors clicked in the real protocol (double click), then there is no way to resolve the conflict as any outcome could have been announced in Situation 2. Therefore, the probability of each outcome in Situation 2 is bounded by the statistics of the real protocol: it is lower bounded by the single click probability and upper bounded by the sum of the single click and double click probabilities; the lower the double click rate, the tighter the bounds.
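The estimation step described above can be sketched in a few lines: each outcome probability of the virtual situation is bracketed between the single-click probability and the single-click plus double-click probability. The function name and the dictionary-based interface are assumptions made for illustration.

```python
def squashed_bounds(p_single: dict[str, float], p_double: float) -> dict[str, tuple[float, float]]:
    """Bracket each outcome probability of the virtual (squashed) situation.

    p_single[o] is the observed probability that only the detector for
    outcome o clicked; p_double is the observed double-click probability.
    Returns, for each outcome, the pair (lower bound, upper bound).
    """
    return {
        outcome: (p, min(1.0, p + p_double))
        for outcome, p in p_single.items()
    }


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the paper.
    print(squashed_bounds({"0": 0.40, "1": 0.10}, p_double=0.05))
```

The width of each interval equals the double-click probability, which is why the bounds tighten as the channel loss increases.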
In our paper, we will use the estimation technique exactly as proposed in Ref. [16].
However, our application raises two issues that we need to address in order to properly apply the equivalence theorem. First, the theorem was stated and proved in the single-mode case only. In the protocol under consideration here, Alice has to prepare non-trivial states over several modes (e.g. $|\varphi_2\rangle = |\beta\rangle_{d_0}|\beta\rangle_{d_3}|\mathrm{vac}\rangle_{d_6}$). Second, the theorem holds for any state having a fixed number of photons. It is straightforward to generalise to any classical mixture of photon-number states, but here we use coherent states that have coherence between different photon-number states and it is not clear if the theorem still holds.
We addressed the first concern with a generalisation of the equivalence theorem in the multi mode case and the second one with an additional argument based on a virtual photon number measurement. More details are provided in Appendix A.
With this generalised result, we are able to upper and lower bound the statistics of a virtual single-photon protocol from the statistics of the actual implementation using the universal squashing framework. While the main application here is QKD, the universal squashing framework is a general quantum optics result that could have other applications in single-photon quantum information processing. For instance, Ref. [16] proposed applications to qubit state tomography.

C. Security analysis
The point of this subsection is to provide a lower bound on the asymptotic key rate of the protocol we described in Subsection II A. Here, we restrict the analysis to a qubit protocol since we obtained qubit statistics (or rather upper and lower bounds on them) in the previous subsection II B.
We choose to use numerical methods to estimate the information leakage to the adversary since they are very practical and often provide better results than analytical methods. In our case, it is possible to use the method proposed either in Ref. [18] or in Ref. [11]. Both rely on convex optimisation techniques [19] and we find that both are giving similar results, the latter being substantially faster though. The main difference lies in the objective function: Ref. [18] is minimising the quantum relative entropy which is a convex non-linear function while Ref. [11] is maximising the phase error rate which is linear.
We consider here the phase error method of Ref. [11] for the simulation and we indicate below the main steps to implement it.
We use the entanglement replacement scheme for the transmitter, so instead of preparing the state $|\varphi_i\rangle$ with probability $p_i$, Alice is preparing the bipartite state:
$$|\Psi\rangle_{AA'} = \sum_i \sqrt{p_i}\,|i\rangle_A|\varphi_i\rangle_{A'}.$$
She sends the system $A'$ over the channel to Bob and measures the other half $A$ (a qutrit) in the computational basis $\{|i\rangle\langle i|_A\}$. Alice records her outcome value in a register $\bar{A}$ and her basis choice (announcement) in a classical register $\tilde{A}$ according to Table II. Alice's partial state $\rho_A$ is characterised by the overlap of the prepared states and by the preparation probabilities:
$$\rho_A = \sum_{i,j} \sqrt{p_i p_j}\,\langle\varphi_j|\varphi_i\rangle\,|i\rangle\langle j|_A.$$
The information about this state is included in the programme by considering that Alice could perform a tomography of her partial state. There are four mutually unbiased bases in dimension 3 and we include the statistics of each of the corresponding operators.
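As an illustration of how Alice's partial state depends only on the overlaps of the prepared states and on the preparation probabilities, the sketch below builds the $3 \times 3$ matrix with entries $\sqrt{p_i p_j}\,\langle\varphi_j|\varphi_i\rangle$. It assumes the standard source-replacement form of the bipartite state and the standard coherent-state overlap $\langle a|b\rangle = \exp(-|a|^2/2 - |b|^2/2 + a^* b)$; the function names and the numerical values in the usage example are hypothetical.

```python
import cmath
import math


def coherent_overlap(a: complex, b: complex) -> complex:
    """<a|b> for two single-mode coherent states with amplitudes a and b."""
    a, b = complex(a), complex(b)
    return cmath.exp(-abs(a) ** 2 / 2 - abs(b) ** 2 / 2 + a.conjugate() * b)


def state_overlap(s, t) -> complex:
    """<s|t> for product coherent states given as tuples of amplitudes."""
    result = 1 + 0j
    for a, b in zip(s, t):
        result *= coherent_overlap(a, b)
    return result


def alice_marginal(alpha: float, beta: float, p):
    """rho_A[i][j] = sqrt(p_i p_j) <phi_j|phi_i> for the three prepared states.

    Amplitudes are listed in the temporal modes (c0, c1, c2); the last mode
    is always vacuum, as in the protocol description.
    """
    states = [(alpha, 0, 0), (0, alpha, 0), (beta, beta, 0)]
    return [
        [math.sqrt(p[i] * p[j]) * state_overlap(states[j], states[i]) for j in range(3)]
        for i in range(3)
    ]


if __name__ == "__main__":
    # Illustrative parameters: key states sent 99% of the time, beta = alpha / 2.
    for row in alice_marginal(0.5, 0.25, [0.495, 0.495, 0.01]):
        print([round(x.real, 6) for x in row])
```

The diagonal entries are the preparation probabilities, and the off-diagonal entry between the two key states is $\sqrt{p_0 p_1}\,e^{-\alpha^2}$ for real $\alpha$.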
For the receiver, while we could in principle write down the whole unitary (in dimension 9), we choose to simplify it to dimension 3 instead to speed up the programme. We consider an active measurement instead, so that only either the direct line or the monitoring line is operating at once on a small-dimension system. As in Ref. [18], we consider that Bob is manipulating a qutrit where the first two dimensions correspond to a qubit and the third dimension represents the no-click outcome $\emptyset$. We find that Bob's measurement operators are: After performing a measurement in either of the bases, Bob will register an announcement value in a classical register $\tilde{B}$ and an outcome value in a register $\bar{B}$. Here the classical register $\tilde{B}$ corresponds to the public announcement Bob will make about his results, but it is not equivalent to a typical basis choice. The announcement is different for no-click, conclusive (including basis choice) or inconclusive outcomes. Bob will register his outcome value and announcement value according to Table II.
We only consider the announcements $(\tilde{a} = 0, \tilde{b} = 0)$ for key generation and we use a few more operators to define certain errors and detection probabilities: Then it is easy to define: We note that the bit error in the basis X is related to the usual visibility parameter $V$ with the relation $e_X = (1 - V)/2$. Finally, the optimisation problem can be cast as a Semi-Definite Programme (SDP): for certain measurement operators $\Pi_k$ and lower and upper bounds $p_k^{\downarrow}$ and $p_k^{\uparrow}$ and the key rate is given by:

III. RESULTS AND DISCUSSION
We perform a simulation to visualise the expected performance of our proposed protocol. We consider two possible values for the test intensity: $\beta = \alpha$ (same intensity as the key states) or $\beta = \alpha/2$ (one quarter of the intensity of the key state). We choose a highly biased state preparation where the key states are prepared most of the time: $p_0 + p_1 = 99\%$. The detectors are assumed to have no dark counts and 100% efficiency.
In Fig. 1, we consider a loss-only channel, and in Fig. 4 we consider a noisy channel with a fixed error rate in both bases $e_Z = e_X = 1\%$.
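For orientation, the noise parameters can be converted between error rate and interference visibility, assuming the usual relation $e_X = (1 - V)/2$ (consistent with the caption of Fig. 4, where a 1% error rate is equated with 98% visibility). The binary entropy $h(e)$ is included because it is the standard Shannon-limit cost of correcting a bit error rate $e$; the paper's actual key rate comes from the numerical optimisation and is not reproduced by this sketch. All function names are hypothetical.

```python
import math


def error_rate_from_visibility(v: float) -> float:
    """Bit error rate in the test basis for interference visibility v."""
    return (1.0 - v) / 2.0


def visibility_from_error_rate(e: float) -> float:
    """Inverse relation: visibility corresponding to an error rate e."""
    return 1.0 - 2.0 * e


def binary_entropy(e: float) -> float:
    """Binary Shannon entropy h(e) in bits, with h(0) = h(1) = 0."""
    if e in (0.0, 1.0):
        return 0.0
    return -e * math.log2(e) - (1.0 - e) * math.log2(1.0 - e)


if __name__ == "__main__":
    e = 0.01
    print(visibility_from_error_rate(e), binary_entropy(e))
```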
Our study reveals that both in the loss-only and in the noisy situation, the original encoding $\beta = \alpha$ cannot guarantee an optimal quadratic scaling, but a proper modulation of $\beta$ can achieve it. Moreover, any more advanced security analysis involving the inter-signal phase information would at best only improve the performance marginally, since our results (in the loss-only situation) lie close to the upper bound derived in Ref. [9].
We also analyse one countermeasure proposed in Ref. [15]. If a fourth test state $|\mathrm{vac}\rangle_{c_0}|\mathrm{vac}\rangle_{c_1}|\mathrm{vac}\rangle_{c_2}$ with vacuum pulses only is used along with the original encoding $\beta = \alpha$ for the test state, a quadratic scaling is also achievable.
Surprisingly, those two possible modifications give performances that are similar to a phase-encoded BB84 implementation requiring the preparation of four coherent states with a phase modulation [14], thus offering a viable alternative with only limited modifications to the intensity modulator and without a phase modulator.
We also notice that our analysis gives results similar to those reported in Ref. [12] with an active switch and a basis-independent filter. We think that the difference in the low-loss regime comes from the penalty caused by the use of the universal squashing method: the double click rate is non-negligible in the low-loss regime and becomes smaller with a higher channel loss. Hence it seems that the performance of the protocol is preserved as long as the inconclusive rate (i.e. the clicks in the monitoring line outside of the middle temporal mode, which have been enforced to be zero in Ref. [12] using active switching) corresponds to a trusted erasure channel operating after a basis-independent filter.

IV. CONCLUSION AND OUTLOOK
We presented the security analysis of a coherent-state-based quantum key distribution protocol against collective attacks in the asymptotic regime. Our approach relies on the application of the universal squashing framework to bound the single-photon statistics, and then the single-photon security analysis is performed using numerical methods. Our simulated results illustrate that our analysis can only establish poor lower bounds on the secure key rate for the original design when not using the inter-signal phase information, but this bound can be significantly improved using minor modifications. Indeed, we have shown that modulating the intensity of the test state $|\beta\rangle|\beta\rangle$ to a lower value $\beta \leq \alpha$, or sending an additional test state $|\mathrm{vac}\rangle|\mathrm{vac}\rangle$, can guarantee that the key rate scales quadratically with the channel transmittance. Interestingly, it seems that the performance of this upgraded scheme is comparable to another popular design based on phase-modulated coherent states [14].
Our results also highlight that the universal squashing framework might have been overlooked when it was initially proposed, while it is actually of considerable interest for applications where it is challenging to obtain an exact squashing model.
FIG. 4. We simulate a noisy situation with a quantum bit error rate in both bases $e_Z = e_X = 1\%$ (or equivalently visibility 98%), and ideal threshold detectors with 100% efficiency and no dark counts. Two possible modifications of the original design offer performances close to another popular design based on phase-encoded coherent states [14]. It is possible either to include an additional test state with vacuum pulses as suggested in Ref. [15] or to modulate the intensity of the test state as mentioned in Ref. [12] to achieve a quadratic scaling. Curves shown: Secret Key Capacity [13]; Sequential attack [9]; Coherent states BB84 [14]; COW+vacuum [15]; Active switch [12], $\beta = \alpha/2$; This paper, $\beta = \alpha/2$; Active switch [12], $\beta = \alpha$; This paper, $\beta = \alpha$.
The security analysis in the non-asymptotic regime and against general attacks is left to further work.
We provide additional details to show that the universal squashing framework is applicable for the security analysis of the COW protocol.
We revisit the equivalence result of Ref. [16] (Theorem 1) using a description with optical modes. They consider a natural squashing operation that keeps only 1 photon at random out of a pulse of $n$ photons in a single mode. They show that a protocol implementing this squashing operation has the same statistics as one keeping all $n$ incoming photons, measuring the number $l_j$ of photons in each output arm and announcing the outcome $j$ with probability $l_j/n$. We propose a different derivation for their result in the single-mode case in Section A 1, and then we generalise it to the multimode case in Section A 2. Finally, in Section A 3 we discuss an additional argument to apply the equivalence theorem to coherent states instead of states with a fixed number of photons.

Single mode case
We consider here the case in dimension 2 to keep equations short but the derivation can be extended to any dimension $d \geq 2$. We denote the measurement modes $a_0^\dagger, a_1^\dagger$ and the input modes $b_0^\dagger, b_1^\dagger$, and we assume they are related by a unitary transformation:
$$b_i^\dagger = \sum_j u_{i,j}\, a_j^\dagger.$$
We assume that the input state is a population of $n$ photons in a single mode, say $b_0$ for instance, i.e. the input state is $\frac{1}{\sqrt{n!}}\,(b_0^\dagger)^n|\mathrm{vac}\rangle$. After the squashing operation, the single photon is found in arm $j$ with probability $|u_{0,j}|^2$.
Using the other protocol instead, we find the probability of photons in the various arms to be: Let us write $X_0 = |u_{0,0}|^2$ and $X_1 = |u_{0,1}|^2$, then the probability of outcome $j$ in this protocol is: Using this approach, it is easy to see that the photon number probability in each arm as in Eq. (A2) is naturally a 0-th moment (sum to 1, i.e. normalisation) and the classical postprocessing probability in Eq. (A3) is a 1st moment. We generalise this property to any number of input modes in the next section.
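The 0-th and 1st moment statements can be checked numerically in the single-mode case. The sketch assumes that $n$ photons entering through one input mode are distributed over the two arms according to the binomial law $\binom{n}{l_0} X_0^{l_0} X_1^{n-l_0}$ with $X_1 = 1 - X_0$, which is what the transformation $(b_0^\dagger)^n \to (u_{0,0} a_0^\dagger + u_{0,1} a_1^\dagger)^n$ gives; the function names are hypothetical.

```python
import math


def arm_distribution(n: int, x0: float) -> dict[int, float]:
    """P(l0 photons in arm 0, n - l0 in arm 1) for n photons in one input mode.

    x0 = |u_00|^2 is the single-photon probability of reaching arm 0.
    """
    x1 = 1.0 - x0
    return {l0: math.comb(n, l0) * x0**l0 * x1 ** (n - l0) for l0 in range(n + 1)}


def moments(n: int, x0: float) -> tuple[float, float]:
    """Return (0-th moment, 1st moment) of the arm distribution.

    The 0-th moment is the normalisation; the 1st moment is the probability
    of announcing outcome 0 when outcome j is output with probability l_j / n.
    """
    dist = arm_distribution(n, x0)
    zeroth = sum(dist.values())
    first = sum(l0 / n * p for l0, p in dist.items())
    return zeroth, first


if __name__ == "__main__":
    print(moments(5, 0.3))
```

The first moment coincides with `x0`, i.e. with the probability of finding the single squashed photon in arm 0.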

Multimode case
We show that the equivalence still holds when $d$ modes $b_0^\dagger, b_1^\dagger, \dots, b_{d-1}^\dagger$ are populated with $k_0, k_1, \dots, k_{d-1}$ photons, with $k_0 + k_1 + \dots + k_{d-1} = n$. In this case, the outcome probability for the squashing protocol is: The probability for the second protocol is: The equality between Eqs. (A7) and (A8) is proved below in a slightly more general case (non-normalised vectors) and follows three major steps. First we establish a few results on combinatorics. Next we show that a particular matrix transformation is a group homomorphism (Claim 1) and show the 0-th moment property (i.e. normalisation). Finally, we revisit the derivation of Claim 1 to compute the first moment instead, which gives directly the equivalence result in the multimode case.
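The multimode equivalence can be checked by brute force for $d = 2$. The sketch below is not the combinatorial proof of the paper: it expands $\prod_i (u_{i,0} a_0^\dagger + u_{i,1} a_1^\dagger)^{k_i}$ as a polynomial, converts the coefficients into Fock-state probabilities, and compares the first moment with the squashing probability. It assumes the convention $b_i^\dagger = \sum_j u_{i,j} a_j^\dagger$ and that the squashing operation keeps each of the $n$ photons with equal probability, so that a photon from input mode $i$ is kept with probability $k_i/n$; all function names are hypothetical.

```python
import cmath
import math


def poly_mul(p, q):
    """Multiply two polynomials given as ascending coefficient lists."""
    out = [0j] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def output_distribution(u, k0: int, k1: int):
    """P(l0) of finding l0 photons in arm 0 (and n - l0 in arm 1)
    for the Fock input |k0, k1> in the modes b_0, b_1."""
    poly = [1 + 0j]  # polynomial in x, with a_0^dagger -> x and a_1^dagger -> 1
    for row, k in ((u[0], k0), (u[1], k1)):
        for _ in range(k):
            poly = poly_mul(poly, [row[1], row[0]])
    n = k0 + k1
    norm = math.factorial(k0) * math.factorial(k1)
    return [
        abs(c) ** 2 * math.factorial(l0) * math.factorial(n - l0) / norm
        for l0, c in enumerate(poly)
    ]


def squash_probability(u, k0: int, k1: int) -> float:
    """Probability that one photon kept at random is found in arm 0."""
    n = k0 + k1
    return (k0 * abs(u[0][0]) ** 2 + k1 * abs(u[1][0]) ** 2) / n


def example_unitary(theta: float, phi: float):
    """A 2x2 unitary used only for the numerical check."""
    c, s = math.cos(theta), math.sin(theta)
    return [[complex(c), cmath.exp(1j * phi) * s],
            [-cmath.exp(-1j * phi) * s, complex(c)]]
```

With `u = example_unitary(0.7, 0.3)`, the quantity `sum(l0 / 3 * p for l0, p in enumerate(output_distribution(u, 2, 1)))` agrees with `squash_probability(u, 2, 1)` up to rounding.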

a. Notations
We start by defining some quantities that are useful to simplify the notations later on. $L(n)$ has $N = \binom{n+d-1}{n,\,d-1}$ elements, so we can index its elements using an index in $\{0, \dots, N-1\}$. We can identify a line with its associated index that we also label $l$ or $l(n)$ if we need to specify the total weight to avoid confusion. In the following the indices $l$, $k$ are reserved for elements of $L$, and we use subscripts to indicate the components of the solution, e.g. $l_0$ is the 0-th component of $l(n)$, which is the $l$-th line of total weight $n$ in $L(n)$, and similarly for $k$.
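The counting of lines can be illustrated with a short enumeration. The sketch assumes that $L(n)$ denotes the set of $d$-tuples of non-negative integers whose components sum to $n$, as the surrounding text suggests; the function name and the lexicographic ordering used for the index are hypothetical choices.

```python
import math


def lines(n: int, d: int) -> list[tuple[int, ...]]:
    """All d-tuples of non-negative integers of total weight n,
    in lexicographic order (position in the list = index of the line)."""
    if d == 1:
        return [(n,)]
    return [
        (first,) + rest
        for first in range(n + 1)
        for rest in lines(n - first, d - 1)
    ]


if __name__ == "__main__":
    n, d = 4, 3
    all_lines = lines(n, d)
    # The number of lines matches the binomial coefficient (n + d - 1 choose d - 1).
    print(len(all_lines), math.comb(n + d - 1, d - 1))
```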
We denote M (n) the ensemble of squares of total weight n. If we are given two lines k(n) and l(n), it is also possible to construct more constrained squares where we add the additional constraints: and we denote M (k, l) the ensemble of such squares. Similarly, we identify a square with its index in M (n) or M (k, l) that we label again m or m(n) or m(k, l) depending on the context. We also occasionally use the following array notation to visualise the sum over rows and columns (here in dimension 2): We label P (n) the ensemble of cubes of total weight n. Again, if we are given three squares m(k(n),l(n)), m(k(n), l(n)) andm(l(n),l(n)), we can constraint more

b. Preliminary results
We introduce here three lemmas. Lemma 1 is a well-known relation and is not directly used to prove our result, but its generalisation will be the main ingredient of Lemma 2. Lemma 3 is the key ingredient to prove the result in the next subsection and its derivation relies on Lemma 2.
The results are given for $d = 2$ to keep equations short, but the results hold for any $d \geq 2$.
Lemma 1. (Vandermonde's identity) Let $n$, $m$ be any two non-negative integers and $l(n) \in L(n)$, then:
$$\sum_{j=0}^{m} \binom{l_0}{j}\binom{l_1}{m-j} = \binom{n}{m}.$$
Proof. We consider the expansion of $(1 + X)^n = (1 + X)^{l_0 + l_1}$. We find: Then the result follows from the identification of the coefficient in $X^m$.
Proof. We consider here the expansion of $(X_0 + X_1)^n = (X_0 + X_1)^{l_0 + l_1}$. Then we take the product and group the monomials: We read out the coefficient in $X_0^{l_0} X_1^{l_1}$ and we find the result.
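Vandermonde's identity, in its standard form $\sum_j \binom{l_0}{j}\binom{l_1}{m-j} = \binom{l_0 + l_1}{m}$, can be verified numerically for small values. This is only a sanity check of the identity named in Lemma 1, written with a hypothetical helper name; it is not part of the paper's proof.

```python
import math


def vandermonde_sum(l0: int, l1: int, m: int) -> int:
    """Sum over j of C(l0, j) * C(l1, m - j).

    math.comb returns 0 when the lower index exceeds the upper one, so the
    out-of-range terms vanish automatically.
    """
    return sum(math.comb(l0, j) * math.comb(l1, m - j) for j in range(m + 1))


if __name__ == "__main__":
    l0, l1, m = 3, 4, 2
    # Both sides count the ways to choose m items out of l0 + l1.
    print(vandermonde_sum(l0, l1, m), math.comb(l0 + l1, m))
```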
Lemma 3. Let $n$ be any non-negative integer, three lines $k, l, \bar{l} \in L(n)$ and two squares $\bar{m}(k, l)$, $\tilde{m}(l, \bar{l})$, then:
Proof. We compute the square of the left hand side of Lemma 3: And we can find a closed formula for each independent summation appearing in the right hand side of Eq. (A25).
In the summation over $p$, only two squares are constraining the ensemble: $\bar{m}(k, l)$ and $\tilde{m}(l, \bar{l})$. The idea is to split the cube into $d$ independent squares along the axis of the square that is not constrained (see Fig. 5). Then we apply Lemma 2 to each square and we get: In the same way, we find for the other summation:
We take a fixed non-negative integer $n$ and positive integer $d$. In the following, we take $d = 2$ to have concise equations, but the result is easily generalised to any $d \geq 2$.
Let us consider any square complex matrix $A$ of size $d$. We denote its elements by $\alpha_{ab}$, $\forall (a, b)$
Claim 1. $f$ is preserving the usual matrix multiplication, i.e. for any two square complex matrices $A$, $B$, we have:
Proof. Let us take two complex matrices $A$, $B$ whose elements are respectively $\alpha_{ab}$ and $\beta_{ab}$. We take $C = A \cdot B$ and we label its elements $\gamma_{ab} = \sum_c \alpha_{ac}\beta_{cb}$. Then: We notice that $\bar{m}_{00} + \bar{m}_{10} = \tilde{m}_{00} + \tilde{m}_{01}$, and similarly $\bar{m}_{01} + \bar{m}_{11} = \tilde{m}_{10} + \tilde{m}_{11}$. We label these two quantities respectively $l_0$ and $l_1$. We also notice that If the matrix $A$ is unitary, we find that it is 1, i.e. the resulting basis is normalised.
with $w(l) = l_0$ or $l_1$. It should be possible to compute any moment the same way, but we are only concerned about the first moment here.
The idea is to revisit the derivation of Claim 1 with this added weight to find a closed formula. Let us consider the example $w(l) = l_0$.

Squashing coherent states
The generalised result we proved applies for any fixed n and any population of the input modes k(n). However in practice Bob will never receive an input state with a fixed number of photons since Alice is preparing coherent states that can have an arbitrary number of photons. It is easy to generalise the result to a classical mixture of photon number states, but here Alice's states are coherent and there is coherence between photon number states of the same mode, hence it is not clear if the result can still apply.
To clarify this point, we use a simple trick. Since eventually the detectors are photon-number sensitive, we can always assume that there is a virtual non-destructive measurement of the global photon number across all the modes preceding the actual photon number measurement in each of the arms. Here it is important to highlight two points. First, this measurement is not performed in practice, but it commutes with the actual measurement, so the statistics will be unchanged and we can always assume it was performed. Second, this total photon number measurement is a global measurement acting on all $d$ modes at once, and it gives no information about the particular photon-number distribution in each of the arms.
As a result, it is mode basis-independent and coherence-nonbreaking (within the n photon number subspace). Since the unitary transformation representing the receiver is equivalent to changing basis for the modes, it is equivalent to measure the number of photons before the circuit or after, so we can always assume that the photon number measurement was performed first before the actual receiver transformation. Now the virtual total photon number measurement will project the input state onto a quantum state (possibly mixed) with a fixed number of photons n and the theorem directly applies.
To see that it is equivalent to measure the total photon number before or after the circuit, let us consider a simple example: two input modes $a_0$, $a_1$ and two output modes $b_0$, $b_1$ are related by a unitary transformation $U$ such that:
$$b_0 = u_{00}\, a_0 + u_{01}\, a_1, \qquad b_1 = u_{10}\, a_0 + u_{11}\, a_1. \tag{A47}$$
Then, if we denote by $\hat{n}_a = a_0^\dagger a_0 + a_1^\dagger a_1$ and $\hat{n}_b = b_0^\dagger b_0 + b_1^\dagger b_1$ the observables of the total number of photons in the two input modes and two output modes respectively, it is easy to check that $\hat{n}_a = \hat{n}_b$ using Eq. (A47) and $U U^\dagger = U^\dagger U = I$. The general case for any dimension $d$ is derived in a similar way.
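The check mentioned above can be made explicit numerically. Substituting Eq. (A47) into $\hat{n}_b$ gives $\hat{n}_b = \sum_{i,k} (U^\dagger U)_{ik}\, a_i^\dagger a_k$, so the two total number operators coincide exactly when the coefficient matrix is the identity. The sketch below computes this coefficient matrix for an example unitary; the parametrisation and the function names are hypothetical.

```python
import cmath
import math


def example_unitary(theta: float, phi: float):
    """A 2x2 unitary with entries u[j][i], used only for illustration."""
    c, s = math.cos(theta), math.sin(theta)
    return [[complex(c), cmath.exp(1j * phi) * s],
            [-cmath.exp(-1j * phi) * s, complex(c)]]


def number_operator_coefficients(u):
    """Coefficients M[i][k] of a_i^dagger a_k in n_b = sum_j b_j^dagger b_j,
    where b_j = sum_i u[j][i] a_i. This is the matrix U^dagger U."""
    d = len(u)
    return [
        [sum(u[j][i].conjugate() * u[j][k] for j in range(d)) for k in range(d)]
        for i in range(d)
    ]


if __name__ == "__main__":
    m = number_operator_coefficients(example_unitary(0.4, 1.1))
    # Expect the identity matrix, i.e. n_b = a_0^dagger a_0 + a_1^dagger a_1.
    print([[round(abs(x), 12) for x in row] for row in m])
```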