A model for optimizing quantum key distribution with continuous-wave pumped entangled-photon sources

Quantum Key Distribution (QKD) allows unconditionally secure communication based on the laws of quantum mechanics rather than assumptions about computational hardness. Optimizing the operation parameters of a given QKD implementation is indispensable for achieving high secure key rates. So far, there exists no model that accurately describes entanglement-based QKD with continuous-wave pump lasers. For the first time, we analyze the underlying mechanisms for QKD with temporally uniform pair-creation probabilities and develop a simple but accurate model to calculate optimal trade-offs for maximal secure key rates. In particular, we find an optimization strategy for the source brightness at given losses and detection-time resolution. All experimental parameters utilized by the model can be inferred directly in standard QKD implementations, and no additional assessment of device performance is required. Comparison with experimental data shows the validity of our model. Our results yield a tool to determine optimal operation parameters for existing QKD systems, to plan a full QKD implementation from scratch, and to determine fundamental key-rate and distance limits of given connections.


I. INTRODUCTION
Quantum key distribution (QKD) is a method of creating a secret and random one-time pad for two remote users, usable for unconditionally secure encryption of messages [1,2]. Since its first proposal in 1984 [3], intense research has pushed QKD ever closer to real-life realizations. It has been demonstrated via free-space links on the ground [4][5][6] and from space [7], as well as over long-distance fiber links [8] and in network configurations [9,10]. Many different schemes have been proposed in recent decades, such as entanglement-based protocols (E91 [11] and BBM92 [12]), twin-field QKD [13] and decoy-state prepare-and-send implementations [14]. Unlike prepare-and-measure protocols, entanglement-based implementations have the advantage of creating their quantum states in a single coherent process based, for example, on spontaneous parametric down-conversion (SPDC). Therefore, no quantum random number generators or other electronic inputs are required, and provably no information about the individual photon state exists before the actual measurement. In this sense, entanglement-based protocols exploit the quantum nature of the correlations necessary for QKD on the most fundamental level and can be extended to device-independent QKD [15]. QKD with entangled photons also allows quantum network configurations in which many users share one and the same sending apparatus, an entangled-photon source (henceforth simply referred to as "source") [10]. There are two fundamentally different ways to operate such a source: by creating the photon pairs with a continuous-wave (CW) or a pulsed pump laser. Up to now, no in-depth model exists for the prediction of key rates and the calculation of optimal source brightness for CW sources. A model describing sources pumped with a pulsed laser was published in 2007 [16] and has been the state of the art ever since.
In such pulsed schemes, all photon pairs are found in discrete and evenly spaced time modes determined by the laser's repetition rate. This rate can be tuned independently of the pulse intensity, allowing the photon-creation rate and the multi-pair emission to be addressed individually. Due to the broad frequency spectra in a pulsed-pump scheme, dispersion effects in the optics have to be accounted for, especially in the nonlinear crystals where the entangled photons are created.
This model of pulsed operation can be applied to CW-pumped sources with limited accuracy only, as will be shown below. CW pumping has several advantages over pulsed-pump schemes, especially in the context of fiber-based QKD: firstly, the spectrum of the down-converted photons is narrower, thus reducing dispersion effects in both source and transmission channels [17]. Secondly, additional high-precision time synchronization is not needed, since the temporal correlation peak can be determined precisely using a delay histogram. And thirdly, damage to the source optics due to high-intensity pulses is avoided.
In this work, we present for the first time a model that accurately describes CW-pumped entanglement-based QKD systems. Importantly, all necessary inputs to the model can be read directly from experimentally available data, without the need for any additional assumptions. Our approach allows the calculation of optimal brightness values and coincidence-window lengths, as well as the resulting final key rate. Hence, the present results are of particular importance for state-of-the-art entanglement-based QKD applications. Comparison with experimental data demonstrates the validity of our model. Although we focus here on polarization-encoded BBM92 implementations, our approach can be extended to other degrees of freedom, which is, however, outside the scope of this work.
The paper is structured as follows: in Sec. II, we explain the basic working principle of polarization-encoded BBM92. We then develop our model in Sec. III by first introducing parameters for an idealized model (Sec. III A), modifying them to account for experimental imperfections (Sec. III B) and then combining them into the final model to calculate the expected secure key rates (Sec. III C). We optimize the key rate with regard to pair creation rate and temporal detection tolerance and compare our model with experimental data (Sec. IV). Concluding, in Sec. V we discuss our findings and present optimal parameters to maximize key rates.

II. WORKING PRINCIPLE OF ENTANGLEMENT-BASED QKD
Entanglement-based QKD protocols such as BBM92 [12] rely on entanglement between distant physical systems, in our case specifically in the polarization degree of freedom of a photon pair. In an idealized scenario, one can create maximally entangled photon pairs which form a so-called Bell state, e.g.,

|Φ+⟩ = (1/√2) (|H⟩_A |H⟩_B + |V⟩_A |V⟩_B),  (1)

where H (V) denotes horizontal (vertical) polarization and the subscripts signify the recipient of the single photon, traditionally called Alice (A) and Bob (B). We choose this state because it is correlated in the mutually unbiased linear polarization bases HV and DA (diagonal/antidiagonal), where |D⟩ = (1/√2)(|H⟩ + |V⟩) and |A⟩ = (1/√2)(|H⟩ − |V⟩). The following model can, however, be used for any Bell state if the correlations are adapted accordingly.
Alice and Bob measure their photons randomly and independently of each other in either the HV or the DA basis. The basis choice can in practice be realized actively or passively. Active choice means that Alice and Bob switch their measurement bases depending on the outputs of a quantum random number generator. A QKD implementation with passive basis choice uses probabilistic beamsplitters to direct the photons to either an HV or a DA measurement, both of which are realized simultaneously. In the course of this paper, we will assume active basis choice unless noted otherwise. In any case, Alice and Bob record the outcome (H, D = 0 and V, A = 1) and the measurement basis for each event. By communicating about their measurement bases only, Alice and Bob can discard those recorded events where they measured in different bases and therefore see no correlation between their bit outcomes ("sifting"). For the other events, they can expect perfect correlation and thus use their sifted bit strings for key creation. By checking a randomly chosen subset of their sifted measurement outcomes to make sure that correlations have not degraded, Alice and Bob can rule out the existence of an eavesdropper.
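The sifting step described above can be illustrated with a toy simulation. The snippet below is purely illustrative (the function name and the idealized, noise-free assumptions are ours, not part of the protocol specification): both parties pick a basis uniformly at random, and only rounds with matching bases survive, so on average the fraction q = 1/2 of rounds is kept.

```python
import random

def bbm92_sift(n_rounds, seed=0):
    """Toy sifting simulation for idealized BBM92 rounds (illustrative only).

    Alice and Bob each pick the HV or DA basis uniformly at random; for the
    Bell state of Eq. (1), matching bases give perfectly correlated bits.
    """
    rng = random.Random(seed)
    sifted = []
    for _ in range(n_rounds):
        basis_a = rng.choice(["HV", "DA"])
        basis_b = rng.choice(["HV", "DA"])
        if basis_a == basis_b:          # bases announced publicly, round kept
            bit = rng.getrandbits(1)    # outcome is random but shared
            sifted.append(bit)
    return sifted

key = bbm92_sift(1000)
# On average, half of all rounds survive sifting (q = 1/2)
print(len(key))
```

In a real implementation the kept rounds would additionally be checked for correlation degradation, as described above.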
In a real experiment, however, perfect Bell states such as in Eq. (1) do not exist. The polarization correlations are degraded through optical imperfections of the source and the detectors, which result in bit and/or phase flips. Also, in practice it is not possible to distinguish each and every consecutively emitted entangled pair from one another, due to imperfections in temporal detection, as discussed below. We call such temporally irresolvable emissions "multipairs". Multipairs degrade the quantum correlations necessary to create a secure key, since detection of a multipair photon at Alice does not unambiguously herald the detection of its entangled (and therefore perfectly correlated) partner photon at Bob, and vice versa. Instead, with a certain probability, the photon is wrongly identified as being correlated with a photon from another pair, which leads to errors. Based on these considerations, in what follows we define the parameters necessary to calculate the performance of a CW-QKD system. All of these parameters can easily be obtained from experimental detection results, making our model ideally suited for direct implementation in real-world applications.

III. MODELING QKD WITH CW-PUMPED SOURCES
For developing the model, we will start out with an idealized polarization-encoded CW-QKD protocol introducing the basic parameters (Sec. III A). In Sec. III B, we will extend this consideration by taking into account noise counts and multipair effects. We then use the experimental quantities defined in this way to calculate error rate and secure key rate (Sec. III C).

A. Idealized CW-QKD system
The most general CW-pumped source setup uses a photon source creating an average number of entangled photon pairs per time unit. This quantity is called the brightness B, for which we use the unit counts per second (cps) instead of hertz to emphasize the random nature of the emission process. We assume the probability of photon-pair creation to be uniformly distributed in time, as is justified in the case of CW pumping [18].
The entangled photons are spatially separated and sent to communication partners Alice and Bob, where they are detected with overall channel probabilities η A and η B , respectively. Although these probabilities are composed of the source's intrinsic heralding efficiency [19], the channel and coupling losses, the detection optics' transmission and the detectors' deadtimes and efficiencies, we will consider each η i as one single entity in the following calculations, sometimes referred to as system efficiency. This is because isolating individual loss effects is difficult in a real experiment and not required for our model.
As a result of these definitions, the average local photon detection rate of Alice and Bob, respectively, the so-called single counts, can be written as

S_i^t = B · η_i,  i ∈ {A, B},  (2)

where we ignore noise counts for now. Note also that deadtime-induced losses, unlike other effects contributing to the η_i, are a function of the detector count rates S_i^t and therefore of the brightness B, which has to be taken into account for low-loss scenarios (see Appendix B 1).
Naturally, both photons of a pair must be detected in order to observe their polarization correlation, i.e., to use them for generating a cryptographic key. The rate of such two-photon events, which we call "true coincident counts" or "true coincidences", is given as

CC_t = B · η_A · η_B,  (3)

where we again preliminarily ignore noise counts. Using Eqs. (2) and (3), the η_i can be calculated as [19]

η_A = CC_t / S_B^t,   η_B = CC_t / S_A^t.  (4)

The η_i are sometimes also called "heralding efficiencies", since they give the probability that the detection of one photon in one arm announces, or "heralds", the detection of a photon in the other arm. One can also define a total heralding efficiency η = √(η_A · η_B).
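As a minimal numerical sketch of Eqs. (2)-(4), the following snippet recovers the η_i from noise-free singles and true-coincidence rates; the function name and all numerical values are illustrative assumptions, not measured data:

```python
import math

def heralding_efficiencies(s_a_true, s_b_true, cc_true):
    """Eq. (4): eta_A = CC_t / S_B^t, eta_B = CC_t / S_A^t.

    A detection at Bob heralds a photon at Alice with probability eta_A,
    and vice versa. Also returns the total efficiency sqrt(eta_A * eta_B).
    """
    eta_a = cc_true / s_b_true
    eta_b = cc_true / s_a_true
    return eta_a, eta_b, math.sqrt(eta_a * eta_b)

# Illustrative values: B = 1e6 cps, eta_A = 0.10, eta_B = 0.20
B, eta_a, eta_b = 1e6, 0.10, 0.20
s_a, s_b = B * eta_a, B * eta_b     # Eq. (2): S_i^t = B * eta_i
cc_t = B * eta_a * eta_b            # Eq. (3): CC_t = B * eta_A * eta_B
print(heralding_efficiencies(s_a, s_b, cc_t))  # recovers eta_A = 0.1, eta_B = 0.2
```

Note that the brightness B cancels out of Eq. (4), which is why the efficiencies can be read off the measured rates directly.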
Imperfections of the source, the polarization compensation and the optical detection system lead to erroneous polarization measurement outcomes, i.e., two-photon events which do not comply with the expected Bell state. We call the probability of such an erroneous measurement e_pol. It consists of contributions of the individual polarization error probabilities e_A^pol and e_B^pol of Alice and Bob, respectively:

e_pol = e_A^pol (1 − e_B^pol) + e_B^pol (1 − e_A^pol).  (5)

It should be noted that measuring the wrong bit value at both Alice and Bob still counts as a valid measurement, since it is impossible in principle for the experimenter to distinguish such an event from a correctly measured true coincidence. In most practical implementations, it is more convenient to read e_pol directly from the experimental data instead of quantifying the e_i^pol individually (see Appendix A).

B. Noise-afflicted CW-QKD system
In a real-world entanglement-based QKD implementation, the crucial source of error is not e_pol, which can be kept below 1% in modern applications [20], but the unavoidable registration of uncorrelated multipair photons which have lost their partner, and/or of noise counts, as coincidences. Such erroneous coincidences are called "accidental coincidence counts". To calculate the accidental coincidence rate for BBM92 with a CW pump, one first needs to modify Eq. (2) to account for dark counts DC_i in the detectors:

S_i^m = B · η_i + DC_i,  (6)

where the S_i^m are the actually measured count rates. Note that stray light, residual pump laser light, intrinsic detector dark counts and any other clicks which do not originate from source photons all have the same effect for our purposes. Therefore, we include all such clicks in the DC_i. In a real experiment, Alice and Bob require at least two detectors each to be capable of distinguishing orthogonal quantum states. In Eq. (6), we assume that Alice and Bob each own identical detectors whose photon and dark count rates can simply be added; for the case of non-identical detectors and polarization-dependent detection efficiency, see Appendix B 3.
Alice and Bob identify coincidences by looking for simultaneous detection times (accounting for a certain constant delay t_D caused by different photon travel times and electronic delays). There are three main effects that can degrade the fidelity of this identification: the detection system's finite timing precision, the coherence length of the photons, and chromatic dispersion in fiber, which delays photons of different wavelengths with respect to each other [17]. These effects cause a spread of the photons' temporal correlation function, whose full width at half maximum (FWHM) we call t_∆. Because in any real experiment t_∆ > 0, Alice and Bob need to define a so-called "coincidence window" t_CC. It can be understood as the temporal tolerance allowed for the difference in detection time of two correlated photons.
It follows that there is a possibility of confusing uncorrelated detector clicks with true coincidences. This possibility can be calculated, since it depends on t_CC and the S_i^m. Assuming independent Poissonian photon statistics at Alice and Bob, one can define the mean number of clicks at Alice and Bob, respectively, per coincidence window as

µ_i^S = S_i^m · t_CC.  (7)

Most single-photon detectors used today are not photon-number resolving. Therefore, the chance of an accidental coincidence being registered can be approximated by the probability of at least one detection event taking place at each receiver:

P_acc = (1 − e^(−µ_A^S)) (1 − e^(−µ_B^S)),  (8)

where we use the fact that the click probability is given by (1 − e^(−µ_i^S)); cf. [21,22]. This expression for P_acc provides a good estimate of the accidental coincident-count probability in high-loss regimes. For low-loss scenarios it needs to be adapted, as it overestimates the probability of accidental coincidence counts by also counting true coincidences as accidental (see Appendix B 2). For µ_i^S ≪ 1, Eq. (8) can be simplified to

P_acc ≈ µ_A^S · µ_B^S.  (9)

The rate of accidental coincidences per second is therefore

CC_acc = P_acc / t_CC ≈ S_A^m · S_B^m · t_CC.  (10)

Note that since we assume at least one detector click per receiver for an accidental count to happen, we take into account the fact that in a real experiment with several detectors, there can be more than one click per coincidence window (cf. Appendix B 2). In that case, a random bit value has to be assigned [23,24], which has the same error probability as an accidental count and can therefore be seen as part of Eq. (10). Also note that CC_acc depends quadratically on B, but CC_t only linearly. Thus, noise increases faster than the desired signal when increasing B, which gives an intuitive understanding of why simply pumping the source with higher power can enhance the key rate only up to a certain degree (see Sec. IV). It is not only the accidental coincidences which depend on the choice of t_CC.
If it is chosen on the order of the timing imprecision t_∆, true coincidences will be cut off and lost due to the Gaussian shape of the g^(2) intensity correlation with FWHM t_∆ between Alice's and Bob's detectors (see Fig. 1).
This g^(2) function can be modeled as a normal distribution centered at the delay t_D; t_∆ is the resulting timing imprecision between Alice's and Bob's measurements, i.e., the convolution of detector jitter, chromatic dispersion and coherence time of the photons at both Alice and Bob. To arrive at the loss which true coincidences suffer due to the coincidence window, one can integrate this distribution over the window, which yields

η_tCC = erf( √(ln 2) · t_CC / t_∆ ).  (11)

Here, η_tCC is the proportion of true coincidences which fall into the chosen coincidence window t_CC and are thus identified as coincidences in the experiment. In this sense, η_tCC can be interpreted as a coincidence-window-dependent detection efficiency. Now we can define the actually measured coincidences as

CC_m = η_tCC · CC_t + CC_acc.  (12)

This is the total number of detector events per second that Alice and Bob use to create their key. But obviously, a subset of these events, occurring with rate CC_err, actually does not show correlations in accordance with Eq. (1): firstly, all those correlated photons which are measured erroneously; and secondly, on average half of all accidental coincidence counts:

CC_err = η_tCC · CC_t · e_pol + (1/2) CC_acc.

Fig. 1: Number of coincidences per time unit for different relative measurement times. t_D is the delay between Alice and Bob and t_∆ is the FWHM of the temporal distribution, both of which are constant. The magnitude of the freely selectable coincidence window t_CC not only determines the number of total coincidences CC_m, but also the QBER E, i.e., the ratio of erroneous (η_tCC · CC_t · e_pol) plus half of all accidental ((1/2) CC_acc) coincidence counts to CC_m.
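The coincidence bookkeeping of Eqs. (7)-(12) can be sketched numerically as follows; the function names and the high-loss parameter values are illustrative assumptions:

```python
import math

def accidental_rate(s_m_a, s_m_b, t_cc):
    """CC_acc from Eqs. (7)-(10) for non-photon-number-resolving detectors."""
    mu_a, mu_b = s_m_a * t_cc, s_m_b * t_cc                    # Eq. (7)
    p_acc = (1 - math.exp(-mu_a)) * (1 - math.exp(-mu_b))      # Eq. (8)
    return p_acc / t_cc  # accidental coincidences per second, Eq. (10)

def window_efficiency(t_cc, t_delta):
    """Eq. (11): fraction of the Gaussian g2 peak (FWHM t_delta) inside t_CC."""
    return math.erf(math.sqrt(math.log(2)) * t_cc / t_delta)

# Illustrative high-loss scenario: mu_i << 1, so Eq. (9) applies
s_m, t_cc, t_delta, cc_t = 5e4, 1e-10, 1e-10, 100.0
cc_acc = accidental_rate(s_m, s_m, t_cc)   # ~ S_A^m * S_B^m * t_CC
cc_m = window_efficiency(t_cc, t_delta) * cc_t + cc_acc   # Eq. (12)
print(cc_acc, cc_m)
```

Note that for t_CC = 3 · t_∆ the window efficiency is already above 99.9%, which is the rule of thumb quoted in Appendix A.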
C. Error rate and secure key rate

From the quantities defined above, one can now calculate the quantum bit error rate (QBER) E, i.e., the ratio of erroneous coincidences to total coincidences:

E = CC_err / CC_m = (η_tCC · CC_t · e_pol + (1/2) CC_acc) / CC_m.  (16)

As a side remark, the commonly used parameter "visibility" V relates to E as V = 1 − 2E [1]. Fig. 1 shows a geometrical interpretation of Eq. (16). Coincidences correspond to different areas under the graphs, which are restricted by the chosen coincidence window. On the one hand, it is desirable to increase the ratio of the light blue area to the combined dark blue and orange ones, which is equivalent to decreasing E. This can be done by decreasing t_CC, since the Gaussian-shaped CC_m (dark blue curve) scales more favorably in this case than the uniformly distributed accidental coincidence counts CC_acc. On the other hand, reducing t_CC means that η_tCC reduces the total number of coincidences which can be used for key creation.
In order to evaluate the trade-off between these two effects, we will analyze the secret key rate in the limit of infinitely many rounds, the so-called asymptotic key rate. Alice and Bob choose randomly between measurement settings in the HV and DA bases. Let us denote the probability that Alice and Bob measure in the same basis as q. Only in this case are the polarization measurement outcomes at Alice and Bob correlated; all other coincidences have to be discarded. Therefore, the rate of coincidence rounds left for post-processing is equal to q·CC_m. Subsequently, Alice and Bob reveal a small fraction of measurement outcomes in both bases to estimate the error. Now we can finally evaluate the amount of achievable key per second as [16]

R_s = q · CC_m · [1 − f(E_bit) · H_2(E_bit) − H_2(E_ph)],  (17)

where H_2 is the binary entropy function defined as

H_2(x) = −x log_2(x) − (1 − x) log_2(1 − x).  (18)

E_bit and E_ph are the bit and phase error rates, i.e., the measurement-basis-dependent rates of measurement outcomes incompatible with the maximally entangled state described in Eq. (1). f(E_bit) is the bidirectional error-correction efficiency, which takes into account how much of the key has to be sacrificed due to the fact that post-processing is performed in finite blocks. In order to assess the validity of our model against an actual experiment, both the sifting rate q and the efficiency f(E_bit) need to be defined. We assume that the measurement settings of Alice and Bob are chosen uniformly, and thus q = 1/2. Further, we choose a realistic value of f(E_bit) = 1.1 [25]. Finally, since in our model the noise parameters are independent of the measurement settings, we can set E_bit = E_ph = E. With these choices, the key rate formula becomes

R_s = (1/2) · CC_m · [1 − 2.1 · H_2(E)].  (19)

It follows immediately from Eq. (19) that there is a fundamental limit E_max ≈ 0.102, above which no key creation is possible. In the following section, we maximize R_s depending on the parameters discussed up to now.
Importantly, all parameters used in this optimization can be directly determined in real-life experiments, which is explained in detail in Appendix A. Finally, note that the key rate formula can be adjusted using Eq. (17) to take into account measurement setting dependent losses as well; cf. Appendix B 4 for details.
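The QBER and asymptotic key rate of Eqs. (16)-(19) can be sketched as follows; the input numbers are illustrative, not experimental values:

```python
import math

def h2(x):
    """Binary entropy H_2, Eq. (18)."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def qber(cc_true_in_window, cc_acc, e_pol):
    """Eq. (16): erroneous coincidences over total coincidences.

    cc_true_in_window corresponds to eta_tCC * CC_t.
    """
    cc_m = cc_true_in_window + cc_acc
    return (cc_true_in_window * e_pol + 0.5 * cc_acc) / cc_m

def secure_key_rate(cc_m, e, q=0.5, f_ec=1.1):
    """Eq. (19) with q = 1/2 and f = 1.1: R_s = q*CC_m*(1 - 2.1*H_2(E))."""
    return max(0.0, q * cc_m * (1 - (1 + f_ec) * h2(e)))

e = qber(1000.0, 50.0, 0.01)          # = 35/1050, roughly 3.3% QBER
print(e, secure_key_rate(1050.0, e))
# Above E_max ~ 0.102 no key can be distilled:
print(secure_key_rate(1050.0, 0.11))  # 0.0
```

The `max(0.0, ...)` clamp reflects that a negative key rate simply means no secure key can be extracted.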

IV. COMPARISON TO EXPERIMENTAL DATA
For realistic applications, the η_i, the optical error e_pol, the dark counts DC_i and the temporal imprecision t_∆ cannot be modified freely. Two important parameters, however, can be chosen by the experimenter: the brightness B and the coincidence window t_CC. The experimenter can vary B up to a certain level by changing the laser pump power in the source. With laser powers of many hundreds of milliwatts, brightness values of up to 10^10 cps are feasible with current state-of-the-art sources [20]. The coincidence window t_CC can in principle be chosen at will. It follows that for each QKD scenario, there is an optimal choice of B and t_CC which maximizes R_s of Eq. (19). Fig. 2 shows a comparison of our model and experimental values, where t_CC has been numerically optimized for the highest obtainable key rate and then kept constant along each curve.
The data were collected using a Sagnac-type source of polarization-entangled photons in the telecom C-band. For a detailed description of such a source's working principle, we refer the reader to Ref. [20]. The photon pairs passed wavelength division multiplexing (WDM) filters before detection. All model parameters were determined by using count rates, coincidence rates and temporal histograms of the single-photon detections only, with no need for additional "external" characterization (cf. Appendix A). Since the timing jitter of nanowire detectors strongly depends on the count rates they measure, linear fits of the jitter as a function of brightness have been included in the model. The data show excellent agreement with our model's predictions. The losses introduced in the measurements range from 40 to 80 dB in total, with different distributions along the channels. Note that the two loss scenarios with equal total loss of 60 dB (orange and turquoise curves) perform very differently. Assuming DC_A = DC_B, symmetric loss is preferable to asymmetric loss, because the probability of a partnerless photon matching with a dark count is reduced in this case. In Fig. 2, this effect on the two 60 dB curves is, however, exaggerated due to different polarization errors e_pol, which we set via a manual polarization controller (MPC) to show the model's validity in different parameter regions. The total losses are equivalent to in-fiber distances between 200 and 400 km. Nevertheless, our model can be applied to all kinds of quantum channels, including, e.g., free-space satellite connections, where variation of the channel attenuation [26,27] can be integrated into our model in a straightforward manner.
We want to emphasize that in any case, our optimization strategy works exclusively with experimentally measurable quantities that can be inferred directly from the actual QKD implementation (see Appendix A). Furthermore, the presented model can be used during the planning phase of an experiment to devise optimal working parameters based on specification sheets. While several calculations in our model are approximations, it shows excellent agreement with the experimental data, demonstrating its usefulness over a wide range of experimental parameters. For a more extensive treatment of phenomena that may become relevant in certain parameter regimes, such as deadtime effects, low-loss channels and non-identical detectors, we refer the reader to Appendix B.

Fig. 3: Lower t_∆ allows both for higher key rates and longer maximum distances, since CC_acc, the main source of errors, is directly proportional to t_∆. Note that the dotted green curve (t_∆ = 10^−10 s) is the same curve as the equally colored one in Fig. 4.

V. OPTIMIZATION OF QKD WITH CW-PUMPED SOURCES

Now that we have shown the validity of our model in different parameter scenarios, we use it to illustrate the limits and potential of CW-QKD. To this end, we numerically maximize R_s with respect to both B and t_CC for every point on the curves in Figs. 3 and 4, i.e., the condition ∂R_s/∂B = ∂R_s/∂t_CC = 0 is fulfilled continuously. Fig. 3 shows the maximum obtainable key rate assuming symmetric loss for different jitter values. Lower jitter allows for a smaller coincidence window, which in turn allows for higher brightness values and thus key rates. Note that regardless of the jitter value, the key rate drops abruptly to zero after a certain amount of loss. This is because dark counts inevitably induce a minimum accidental coincidence count value CC_acc^min = DC_A · DC_B · t_CC. In a regime of high loss, this constant floor can mask true coincidences if η_tCC · CC_t ≲ 10 · CC_acc^min. In this case, key creation is frustrated. Fig. 4 shows the corresponding behavior for different dark count values.

Fig. 4: In the case of no dark counts (dark blue curve), there exists no distance limit, since t_CC can in principle be set arbitrarily small, thus keeping the error rate below E_max for any loss. Note that the dotted green curve (DC = 250) is the same curve as the equally colored one in Fig. 3.
In the hypothetical case of DC_i = 0, the accidental coincidences CC_acc can be decreased to arbitrarily low values by reducing the brightness B. Although this also decreases the maximum key rates beyond the point of usefulness, they never drop to zero, as indicated by the dark blue curve. When comparing Figs. 3 and 4, it becomes apparent that in a real-world scenario, reducing the timing imprecision t_∆ is more important than reducing the dark counts. This is because lower DC_i can only increase the maximum distance in high-loss regimes, where key rates are extremely low already. To increase the key rate at a given loss, it is more favorable to lower t_∆ in most cases. We would also like to emphasize that wrongly using the model for pulsed-source BBM92 by Ma et al. [16] to estimate key rates for a CW-pumped implementation yields erroneous results, even when trying to adapt it. One could attempt such an adaptation by replacing the mean photon number per pulse 2λ with the average photon number per coincidence window µ = B · t_CC and changing the multipair probability of Eq. (5) in Ref. [16] to a Poissonian distribution. Since doing so ignores any effects of temporal uncertainty, the results differ strongly, as can be seen in Fig. 5.
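The joint maximization over B and t_CC described in this section can be sketched as a simple grid search over the model of Sec. III. All parameter values below (efficiencies, dark counts, jitter, polarization error) and the function names are assumptions for illustration only:

```python
import math

def h2(x):
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def model_key_rate(B, t_cc, eta_a, eta_b, dc, t_delta, e_pol):
    """Secure key rate R_s of Eq. (19) built from the model of Sec. III."""
    s_a, s_b = B * eta_a + dc, B * eta_b + dc                     # Eq. (6)
    cc_acc = s_a * s_b * t_cc                                     # Eq. (10)
    eta_win = math.erf(math.sqrt(math.log(2)) * t_cc / t_delta)   # Eq. (11)
    cc_true = eta_win * B * eta_a * eta_b
    cc_m = cc_true + cc_acc                                       # Eq. (12)
    e = (cc_true * e_pol + 0.5 * cc_acc) / cc_m                   # Eq. (16)
    return max(0.0, 0.5 * cc_m * (1 - 2.1 * h2(e)))               # Eq. (19)

# Coarse log-spaced grid over the two free parameters
args = dict(eta_a=0.01, eta_b=0.01, dc=250.0, t_delta=1e-10, e_pol=0.01)
best = max(
    (model_key_rate(10**b, 10**t, **args), 10**b, 10**t)
    for b in [x / 4 for x in range(24, 45)]    # B from 1e6 to 1e11 cps
    for t in [x / 4 for x in range(-44, -35)]  # t_CC from 1e-11 to 1e-9 s
)
print(best)  # (optimal R_s, optimal B, optimal t_CC)
```

In practice one would refine the grid (or use a dedicated optimizer) around the coarse optimum, but the quadratic-versus-linear scaling of CC_acc versus CC_t already makes the existence of an interior optimum plain.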

VI. CONCLUSION
To the best of our knowledge, we have presented the first comprehensive and accurate model of continuous-wave entanglement-based quantum key distribution. Our model makes it possible to estimate and optimize the performance of any given CW-QKD system by extracting experimental parameters from the recorded detections only, without the need to perform any additional characterization of the experiment. It also allows different devices to be compared in order to find the optimal solution for a given quantum link. For a given QKD setup, the model can accurately estimate the optimal settings of brightness and coincidence window to extract the maximal possible key and thus enhance the performance of the implementation. Furthermore, the presented approach is readily extendable to BBM92 based on entanglement in other degrees of freedom. We are confident that our easy-to-implement model will serve as an important design and optimization tool for CW-QKD links.

Appendix A: Determination of model parameters

There are numerous ways to estimate the parameters discussed in this work. When planning a QKD link from scratch, one has to rely on data sheets and fiber loss measurements. However, one can also estimate all parameters with the same QKD equipment used for the experiment, if already available.
Directly accessible parameters for the experimenter are t_CC (since it is a free variable chosen by the experimenter), the S_i^m and CC_m. The delay t_D between Alice's and Bob's detection times can be determined by calculating a delay histogram of the single counts at Alice and Bob and locating the histogram peak (see Fig. 6). From the same histogram, the (total) timing imprecision t_∆ can be read from the peak's FWHM (less CC_acc).

Fig. 6: Delay histogram of the single counts at Alice and Bob. The orange curve's small peak around 0 corresponds to erroneous polarization measurements, while the noise floor is equivalent to accidental coincidence counts CC_acc (cf. Fig. 1).

It should be mentioned that SNSPD jitter depends on both the detector's bias current and its count rate, and exhibits the lowest specified values only for high bias current and low count rates. This dependency has been included in the model of Fig. 2 by using a linear fit of t_∆ vs. B rather than a constant jitter value. The dark counts DC_i can be determined by blocking the source of photons and observing the S_i^m, which are equal to the DC_i for B = 0 [see Eqs. (2) and (6)]. Note, however, that stray light from the pump beam cannot be observed with this method. To account for it, one either needs filters that block only the SPDC wavelength, or the possibility to frustrate SPDC without blocking or misdirecting the laser, e.g. by changing the crystal temperature. Especially for long-distance single-mode-fiber links designed for the SPDC wavelength, it is safe to assume that pump light is sufficiently suppressed at the detectors.
For the following calculations, it is necessary to determine CC_t (for a certain brightness). Especially in the case of low loss and low jitter, this can be done experimentally by lowering the brightness to a value where CC_acc → 0 and therefore CC_m → CC_t. Alternatively, CC_acc can be subtracted from CC_m: either by calculation using Eq. (10), or experimentally by changing t_D to a value far from the actual coincidence peak while keeping t_CC constant; in the absence of CC_t, the measured CC_m become equal to CC_acc. For all these approaches, it is important to choose t_CC large enough that η_tCC → 1; as a rule of thumb, t_CC = 3 · t_∆ is sufficient. To determine the optical error e_pol, one can use the methods just described to eliminate CC_acc in Eq. (16), such that E ≈ e_pol.
The heralding efficiencies or transmission factors η_i can be calculated using Eq. (4), where again, CC_t and the S_i^t have to be determined in advance by subtracting CC_acc and DC_i.
Finally, the brightness B can also be calculated using CC_t and the S_i^t via

B = S_A^t · S_B^t / CC_t.

Note that for this calculation of the η_i and B, deadtime effects have not been taken into account. Thus, even if the CC_acc are simply measured and subtracted, one should take care to operate the source at sufficiently low pump power (see Appendix B 1).
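The brightness estimate above can be checked numerically, since the η_i cancel: S_A^t · S_B^t / CC_t = (Bη_A)(Bη_B)/(Bη_Aη_B) = B. The snippet below verifies this self-consistency with assumed, illustrative values:

```python
def brightness_from_rates(s_a_true, s_b_true, cc_true):
    """B = S_A^t * S_B^t / CC_t; the heralding efficiencies cancel out."""
    return s_a_true * s_b_true / cc_true

# Consistency check with assumed values B = 2e6 cps, eta_A = 0.25, eta_B = 0.125
B, eta_a, eta_b = 2e6, 0.25, 0.125
print(brightness_from_rates(B * eta_a, B * eta_b, B * eta_a * eta_b))  # -> 2000000.0
```

In a real measurement, S_i^t and CC_t would of course be obtained from S_i^m and CC_m by subtracting DC_i and CC_acc as described above.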
Deadtime effects

If it should be necessary to incorporate deadtime effects, the most efficient way to determine the deadtime t† is to calculate a temporal auto-correlation histogram of each detector channel while subjecting it to photons with Poissonian emission statistics. The temporal stretch over which no correlations are found is the detector channel's deadtime.
The resulting effective transmission can be modeled as

η_i^T = 1 / (1 + B · η_i · t† / d),

where d is the number of (identical, cf. Appendix B 3) detectors deployed per communication partner. This effective loss cannot simply be considered a constant contribution to η_i, since it is a function of S_i^m and therefore of B. For B · η_i · t†/d < 0.02, η_i^T ≈ 1 holds. Note that the estimation of B can be compromised if this assumption is not justified due to low loss, high brightness and/or long detector deadtime.
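A short numerical sketch of this deadtime correction follows. The non-paralyzable form 1/(1 + B·η·t†/d) is an assumption chosen to be consistent with the small-correction condition quoted above, and the function name and parameter values are illustrative:

```python
def deadtime_transmission(B, eta, t_dead, d=2):
    """Assumed non-paralyzable deadtime factor eta_T = 1 / (1 + B*eta*t_dead/d).

    d identical detectors per communication partner share the incoming
    photon rate B * eta, so each sees B * eta / d.
    """
    return 1.0 / (1.0 + B * eta * t_dead / d)

# At the boundary B*eta*t_dead/d = 0.02 the factor deviates from 1 by ~2%
print(deadtime_transmission(1e6, 0.1, 4e-7, d=2))  # 1/1.02 ~ 0.98
```

For small arguments this reduces to η_T ≈ 1 − B·η·t†/d, which is why the correction is negligible below the 2% threshold stated in the text.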
Another consequence of deadtime loss is that the definition of the µ_{S_i} in Eq. (7) needs to be modified, since photons arriving at a detector during its deadtime do not contribute to S^m_i. The CC_acc in Eq. (10) therefore need to be modified accordingly, where we assume DC_i ≪ S^t_i η_i; this is reasonable in the high single-count regime where deadtime effects become important.

Accidental coincidence probability
Equation (8) slightly overestimates the probability of accidental coincidence counts: since it assumes completely independent photon statistics at Alice and Bob, every photon contributes to CC_acc, regardless of whether it has lost its partner or not. Here we therefore give a more extensive description P^ext_acc, which is well approximated by P_acc in Eq. (9) for η_i ≪ 1. We start by defining the probability of a coincidence happening per coincidence window, P^t_CC, where µ = B·t_CC is the average number of photon pairs created per coincidence window before any loss, and P_{DC_i} = DC_i·t_CC are the probabilities of a noise count occurring at Alice resp. Bob per coincidence window. This formula takes into account the Poissonian emission and dark-count statistics. Multi-pair emissions can still yield a valid measurement if photons are lost in such a way that two correlated photons reach the detectors before all others (first factor inside the square brackets). However, if photons emitted after the true pair, but inside the coincidence window, are detected as well, they can in some cases eliminate a true coincidence (second line). The divisions by 2 stem from the fact that if the later photon detection occurs in the same detector as the true photon detections, the event cannot be distinguished from a true coincidence; only if it clicks in the other detector must a random bit value be assigned, i.e. only this case counts as an accidental. Dark counts can also occur in the presence of a true pair, eliminating a valid coincidence in the same way as photons arriving later, which gives rise to the factors in the third line. As a side remark, in the case of passive basis choice using beamsplitters, 4 instead of 2 detectors are deployed; accordingly, the factor 1/2 has to be replaced by 3/4.
Using P^t_CC, the actual probability of detecting an accidental coincidence per coincidence window is given by Eq. (B4). The formula can be understood as follows: the accidental coincidence probability P^cor_acc comprises all those two-click events that did not originate from a true pair. We proceed by subtracting from probability 1 all events which are not accidental coincidences.
Thus, in the first line, we subtract the probability that no photon pair is emitted, corrected for the case of two dark counts producing a coincidence. We also subtract all correct coincidences according to Eq. (B3). Then we subtract the sum over all remaining pair-emission probabilities which correspond neither to the vacuum state, nor to a true coincidence, nor to an accidental count. In the second line, we count those cases where no accidental coincidence happens because in at least one arm no click occurs. Since the possibility of both detectors not clicking is included in both (1 − η_A)^n and (1 − η_B)^n, it has to be subtracted once; this avoids counting the case of all photons being lost twice.
In lines three and four of Eq. (B4), we have to re-add the cases where dark counts cause an accidental coincidence by "replacing" a photon. All other dark-count cases are already included in the first line of the equation, either as part of P^t_CC or in the 1, since a dark count happening when an accidental coincidence would have occurred anyway does not change the statistics.
For η_i ≪ 1, one can approximate P^cor_acc with P_acc from Eq. (9), which actually constitutes an upper bound for Eq. (8).
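A simple Monte-Carlo check makes the small-η approximation plausible. The sketch below uses a deliberately simplified toy model (Poissonian pair emission, independent per-photon losses, one dark-count opportunity per side and window; the spoilage terms of Eq. (B4) are ignored) and compares the simulated accidental probability against the independent-statistics product in the spirit of Eq. (9). All parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 2_000_000          # simulated coincidence windows
mu = 0.1               # mean pairs per window, mu = B * t_CC
eta_a, eta_b = 0.05, 0.05
p_dc = 1e-3            # dark-count probability per window and side

n = rng.poisson(mu, N)                 # pairs emitted per window
# Split each window's pairs into detection outcomes with sequential
# conditional binomials (equivalent to a multinomial draw)
p_both = eta_a * eta_b
k_both = rng.binomial(n, p_both)       # both photons of one pair seen
rest = n - k_both
k_a = rng.binomial(rest, eta_a * (1 - eta_b) / (1 - p_both))
rest = rest - k_a
# conditional probability of "Bob only" simplifies to eta_b
k_b = rng.binomial(rest, eta_b)

click_a = (k_both + k_a > 0) | (rng.random(N) < p_dc)
click_b = (k_both + k_b > 0) | (rng.random(N) < p_dc)
# two clicks without any fully detected pair = accidental coincidence
accidental = click_a & click_b & (k_both == 0)

p_acc_sim = accidental.mean()
# Independent-statistics approximation, cf. Eq. (9)
p_acc_approx = (mu * eta_a + p_dc) * (mu * eta_b + p_dc)
```

For these parameters the simulated value falls slightly below the product formula, consistent with the statement that the independent-statistics expression overestimates the accidentals.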

Non-identical detectors
In our model, we assume that Alice resp. Bob use identical detectors for their orthogonal polarization measurements. It has recently been shown [29] that vast differences in detector performance do not necessarily degrade the security of a QKD protocol. However, different detection efficiencies lead to asymmetric single-count rates and therefore to different accidental coincidence rates for the different polarization correlations. On top of this, different detector jitters lead to a different η_{t_CC} for each correlation. Such asymmetries of the deployed detectors can lead to deviations from the reported model.
To account for such imbalances, one has to define two heralding efficiencies per communication partner, which we denote by η_{Aj} and η_{Bk}, where j and k index the detectors. Following Eq. (3), one can now differentiate the true coincidence values per detector pair, CC^t_{jk} = CC^m_{jk} − S^m_{Aj} · S^m_{Bk} · t_CC. (B10) To take different detector jitters into account, one arrives at different values of t_{Δjk}, which require an adaptation of the coincidence-window loss of Eq. (13) and corresponding generalizations of Eqs. (14) and (15). Here we assume a correlated Bell state (φ±) in the respective basis; for anticorrelated ones (ψ±), the indices to be summed over have to be replaced by j ≠ k.
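The per-detector-pair bookkeeping of Eq. (B10) amounts to a small matrix computation, sketched below with hypothetical rates for two detectors per party:

```python
import numpy as np

t_cc = 2e-9  # coincidence window (s)

# Hypothetical measured rates for two detectors per party (per second)
s_m_a = np.array([400_000.0, 500_000.0])   # Alice's detectors j = 0, 1
s_m_b = np.array([450_000.0, 350_000.0])   # Bob's detectors k = 0, 1
cc_m = np.array([[2200.0, 300.0],
                 [480.0, 1800.0]])         # measured coincidences CC_m[j, k]

# Accidentals per detector pair: CC_acc[j, k] = S_m_Aj * S_m_Bk * t_CC
cc_acc = np.outer(s_m_a, s_m_b) * t_cc
cc_t = cc_m - cc_acc                       # true coincidences per pair

# For a correlated Bell state the signal sits on the diagonal (j = k);
# for an anticorrelated state one would sum the off-diagonal entries
signal = cc_t[0, 0] + cc_t[1, 1]
```

The asymmetric singles produce a different accidental level for each of the four detector pairings, which is exactly why a single global subtraction is insufficient for non-identical detectors.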

Key-rate-formula adjustments
Following the above considerations, in a realistic experiment one might additionally expect one of the polarization measurement settings used in the BBM92 protocol to be more prone to errors than the other. Let us assume that this is due to the optical error e_pol depending on the measurement basis. For example, the HV basis often shows higher fidelity than the superposition bases as a result of the source design, which relies on polarizing beam splitters defining H and V with high extinction ratio (1:1000 or better). Because of this, we obtain two values of the QBER [see Eq. (16)], one for each measurement setting, which we denote by E_HV and E_DA. If coincidences obtained in the HV basis are used to derive the key, then in Eq. (17) we can set E_bit = E_HV and E_ph = E_DA; similarly, for a key derived from coincidences in the DA basis, we set E_bit = E_DA and E_ph = E_HV. If both Alice and Bob choose the HV setting with probability p and the DA setting with probability (1 − p), they obtain two key rates, one for each basis. The total key rate is then the sum of these two key rates, and the total compatible-basis-choice probability from Eq. (17) is q = p² + (1 − p)².
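This weighting can be sketched numerically. The snippet below uses the standard Shor–Preskill-type secret-key fraction 1 − f·h(E_bit) − h(E_ph) with an error-correction inefficiency f as a stand-in for Eq. (17); the overall rate prefactor is omitted, and the QBER values are hypothetical.

```python
from math import log2

def h(x):
    """Binary entropy function."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)

def key_fraction(e_bit, e_ph, f=1.1):
    """Schematic secret-key fraction 1 - f*h(e_bit) - h(e_ph)."""
    return max(0.0, 1.0 - f * h(e_bit) - h(e_ph))

e_hv, e_da = 0.01, 0.03   # hypothetical basis-dependent QBERs
p = 0.5                   # probability of choosing the HV basis

# Key from HV coincidences (weight p^2) plus key from DA coincidences
# (weight (1 - p)^2); the weights sum to q = p^2 + (1 - p)^2
r_total = (p**2 * key_fraction(e_hv, e_da)
           + (1 - p)**2 * key_fraction(e_da, e_hv))
```

Note that the bit- and phase-error roles swap between the two terms, reflecting that the basis used for the key determines which QBER enters error correction and which enters privacy amplification.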
Another common technique is to use predominantly one of the basis settings, and the other only with very low probability, to obtain an estimate of E_ph. This is often referred to as the "efficient BB84 protocol" [30]. In the asymptotic setting, one can therefore assume that the probability p of choosing the HV basis approaches unity, so the final key rate is that of Eq. (17) with q → 1, E_bit = E_HV, and E_ph = E_DA. Additionally, some works assume that in the asymptotic setting the block length also approaches infinity and therefore f(E_bit) approaches unity [31,32]. Last but not least, even in the case of different error rates, one can in practice use the average error E = (E_HV + E_DA)/2 with Eq. (19) to obtain a lower bound on the secret key rate [10,33], since, by concavity of the binary entropy h, h(E_HV) + h(E_DA) ≤ 2·h(E). (B17)