Sampled sub-block hashing for large input randomness extraction

Randomness extraction is an essential post-processing step in practical quantum cryptography systems. When statistical fluctuations are taken into consideration, the requirement of large input data size could heavily penalise the speed and resource consumption of the randomness extraction process, thereby limiting the overall system performance. In this work, we propose a sampled sub-block hashing approach to circumvent this problem by randomly dividing the large input block into multiple sub-blocks and processing them individually. Through simulations and experiments, we demonstrate that our method achieves an order-of-magnitude improvement in system throughput while keeping the resource utilisation low. Furthermore, our proposed approach is applicable to a generic class of quantum cryptographic protocols that satisfy the generalised entropy accumulation framework, presenting a highly promising and general solution for high-speed post-processing in quantum cryptographic applications such as quantum key distribution and quantum random number generation.


I. INTRODUCTION
Randomness extractors are the cornerstone of quantum cryptography. They are an essential step in both quantum key distribution (QKD) [1][2][3] and quantum random number generation (QRNG) [4], which are two of the most mature quantum technologies to date. In general, after execution of a quantum cryptographic protocol, the raw data collected is a weakly random source [5], on which the adversary, Eve, might have some side-information. Randomness extractors, as their name suggests, are functions that, with the help of a short random seed, transform the weakly random source into an (almost) perfectly random source, even from Eve's perspective.
Often, Toeplitz hashing [6] is the function chosen for randomness extraction. This is because Toeplitz hashing, which belongs to a family of universal hash functions [7], is optimal in terms of extractable length [8,9]. In addition, Toeplitz hashing also has a straightforward construction and implementation. Toeplitz hashing utilises a Toeplitz matrix T, which is a diagonal-constant matrix, constructed using a uniform random seed. The numbers of rows and columns of T correspond to the sizes of the output and input of the extractor, respectively. Expressing the input data as a column vector A′, the entire extraction process can be simply expressed as K = T·A′, where K is the extracted output.
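As a software illustration of the product K = T·A′ over GF(2) (a minimal sketch in plain Python; the function name and packing of the seed are our own conventions, and this is not the FPGA implementation described later):

```python
def toeplitz_hash(seed_bits, input_bits, out_len):
    """Multiply a Toeplitz matrix by the input vector over GF(2).

    The m x n Toeplitz (diagonal-constant) matrix is fully determined by
    its first column and first row, packed here into `seed_bits` of
    length m + n - 1: entry T[i][j] equals seed_bits[i - j + n - 1].
    """
    n = len(input_bits)
    assert len(seed_bits) == out_len + n - 1
    output = []
    for i in range(out_len):
        # Output bit i is the XOR of input bits selected by row i of T
        bit = 0
        for j in range(n):
            bit ^= seed_bits[i - j + n - 1] & input_bits[j]
        output.append(bit)
    return output

# Toy usage: extract 2 bits from a 4-bit input with a 5-bit seed
key = toeplitz_hash(seed_bits=[1, 0, 1, 1, 0], input_bits=[1, 1, 0, 1], out_len=2)
```

Because the map is linear over GF(2), hashing the XOR of two inputs equals the XOR of their hashes, a property one can use as a quick sanity check of any implementation.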
Field programmable gate arrays (FPGAs) are an excellent choice of implementation platform for Toeplitz hashing because computations can be executed in parallel, compared to sequentially on CPUs. In addition, FPGAs consume much less power [10] and are much smaller than CPUs. An FPGA implementation also eases the transition to application-specific integrated circuits (ASICs) compared to a CPU implementation. These factors are all crucial for future downstream engineering work to achieve chip-based implementations of QKD and QRNG systems, which are of keen interest to both academia and industry. However, as FPGAs are platforms with limited resources, active steps have to be taken to keep the resource utilisation level low.
On high-level platforms like CPUs and GPUs, the process of matrix-vector multiplication can be sped up using the fast Fourier transform (FFT) algorithm. While it is possible to implement FFT-based Toeplitz hashing on FPGA as well [11], this is often not the chosen method, because FFT requires floating-point precision, which would drive up the complexity and resource consumption of the implementation. In addition, it is also hard to determine whether FFT will improve the speed of Toeplitz hashing on FPGA, since the architectures of CPUs and FPGAs are different. Hence, in this paper, we only consider a direct hashing approach implemented on FPGA.
All real implementations of quantum cryptographic protocols are run for a finite duration, and therefore have finite resources. Due to statistical fluctuations in finite-size data, randomness extraction has to be performed on large input sizes for practical implementations of quantum cryptographic protocols [12]. For example, in QKD, finite-size analysis requires a block size of at least 10^6 bits [13,14]. When considering protocols with weaker assumptions, such as the measurement-device-independent (MDI) setting [15][16][17] and the device-independent (DI) setting [18][19][20], the block size required is even larger, around 10^11 bits. Taking DI-QKD as an example, despite improvements in both theoretical [21,22] and experimental [23,24] aspects, the required block size is still larger than the input sizes of state-of-the-art implementations of Toeplitz hashing on FPGA [11,25,26]. This is because the hashing process becomes increasingly impractical as the input size increases. This can be attributed to two scenarios: a decrease in throughput, or an increase in resources required. As the input size of hashing increases, the output speed naturally decreases, as a larger number of bits has to be processed to produce even a single secure bit. Alternatively, to prevent the throughput of hashing from decreasing, the FPGA can create duplicates of certain hardware blocks and execute them in parallel. However, this leads to a large increase in resource utilisation, which restricts the quantum cryptographic system to the use of high-end FPGAs.
To ensure that the hashing process remains practical while accounting for finite-size effects, we propose a sampled sub-block hashing approach. The main idea is to randomly split (via sampling) the large input data into sub-blocks, and provide a lower bound on the conditional smooth min-entropy of each sub-block. With this bound, we then perform randomness extraction on each sub-block, and concatenate the outputs together. If our proposed method is applied to a protocol satisfying certain properties, whose security can be analysed with the recent generalised entropy accumulation theorem (GEAT), we can demonstrate that our method only introduces a small (linear) loss in security while maintaining the same number of secure bits obtained.
The sample-and-hash method by Konig and Renner [27], and the sample-then-extract method by Dupuis et al. [28], contain features that closely resemble those of our method. However, they incur high entropy losses, for two reasons. First, the sampling penalty of our method is lower because Refs. [27,28] make fewer assumptions about the characterisation of the entropy source; in exchange, our method considers a more restricted class of protocols (though not too restrictive in general). Secondly, Refs. [27,28] also do not account for the correlation between different sampled sub-blocks. Therefore, after performing the sampling, the rest of the unsampled bits are discarded. The act of discarding the other bits effectively introduces a large loss of entropy. Using our method, we can account for the correlation between different sampled sub-blocks. Thus, we are able to perform hashing on all the sub-blocks and concatenate the outputs, while still preserving the overall security.
We employ our method on a simulated standard BBM92 [3] QKD protocol as a proof-of-concept demonstration. We show that with our method, we are able to reduce the execution time of Toeplitz hashing by almost twenty-fold. It should be noted that the improvement can be increased further by optimising the sampling parameters.
The rest of the paper is organised as follows. In section II, we provide a description of the theoretical framework, GEAT, that we use in our analysis. In section III, the theoretical analysis for our method is explained in detail. The implementation and results of our proposed method are given in section IV. Finally, some discussions and a conclusion are given in section V.

FIG. 1. Illustration of the entropy accumulation process [29], where an initial state ρ_{R_0 E_0} undergoes a series of EAT channels, M_i, to generate the final state ρ_{X^N C^N R^N E^N}.

II. THEORETICAL FRAMEWORK
Our work makes use of the generalised entropy accumulation theorem [29], and we provide a brief description of it here. Suppose at the end of some protocol, we have the output quantum state ρ_{X^N C^N R^N E^N}, which can be described as being generated by applying a sequence of channels to some initial state ρ_{R_0 E_0} (see Fig. 1). In the output state, X^N typically refers to the output (e.g., raw keys in QKD), C^N refers to the statistics used to select an event (e.g., whether the protocol should be aborted), R_i is the side-information carried by Alice and Bob, while E_i is Eve's side-information. Generalised EAT provides a tight bound on the conditional smooth min-entropy of the output X^N conditioned on the side-information E^N, H_min^ε(X^N | E^N)_{ρ|Ω}, where Ω ⊂ C^N is a set of events on which the state is conditioned. This bound is extremely useful as it is directly related to the final key length and the security parameter in quantum cryptographic protocols.
Generalised EAT states that the bound holds if the following two conditions are satisfied: 1. Non-signalling: One requires that any side-information that Eve can obtain about X^i does not depend on the honest parties' memory register R_{i−1}; formally, for each M_i there exists a channel R′_i : E_{i−1} → E_i such that Tr_{X_i C_i R_i} ∘ M_i = R′_i ∘ Tr_{R_{i−1}}, where ∘ represents the composition of the channels.

2. Projective reconstructability: Statistics C^N should be reconstructable by performing a projective measurement on X^N and E^N, and applying a function on the outcomes. More formally, there exists a channel T performing this reconstruction. If both conditions are satisfied, then generalised EAT provides a lower bound on the smooth min-entropy (original form presented in Appendix A), with variables ξ, γ and β, which are independent of f, and Var(f), Max(f) and Min(f) as the variance, maximum value, and minimum value of some affine min-tradeoff function f(q). The min-tradeoff function is an affine function f(q) that, for all i = 1, …, N, lower-bounds the conditional von Neumann entropy H(X_i | C_i Ẽ_{i−1}) over all states ν compatible with the channels and satisfying ν_{C_i} = q, with Ẽ_{i−1} as a purifying system of R_{i−1}E_{i−1}.

III. THEORETICAL ANALYSIS
The current methods to analyse sampling techniques [27,28] provide bounds which are less than ideal due to their large penalty term, as explained in section I. Here, we propose a different method of sampling, which is applicable to a general class of protocols that we term GEAT-analysable protocols. By capitalising on the properties of this class of protocols, we demonstrate that the penalty due to sampling is greatly reduced relative to current techniques. Therefore, for bit strings that are generated by this class of protocols, users can simply switch the randomness extraction step to the sample and hash method proposed here to achieve improved performance.

A. Generic GEAT-analysable protocol
Here, we introduce briefly the class of protocols that are amenable to the sample and hash method, termed GEAT-analysable protocols. Most critically, such a protocol can be proven secure using the generalised EAT framework. Many QKD and QRNG protocols can be analysed with this framework, with Metger et al. [30] providing a set of instructions on how this analysis can be performed on a generic QKD protocol.
Consider a generic N-round protocol, with honest parties (or a single honest party) and an adversary. At the end of the protocol, the adversary would have the information E′_N = T^N I^N (I_test)^N E_N, along with Y and the hash function F_RE.
After randomness extraction, one ideally wants the hashed output K to be secret from the adversary. Using the composable security framework [31,32], the secrecy of bit string K can be defined by requiring that the trace distance between the real output state and an ideal state, in which K is uniform and independent of the adversary's systems, is at most the secrecy parameter, where ∆_t(ρ, σ) is the trace distance between states ρ and σ. Using the quantum leftover hash lemma [33,34] and the chain rule of smooth min-entropy [35,36], proving the security can be reduced to finding a lower bound on the smooth min-entropy of the raw bit string conditioned on the adversary's information. Since the protocol can be analysed with GEAT, one expects that the min-entropy and thus security can be computed directly using GEAT, as expressed in the form of Eq. (1), providing a secrecy level with an additional penalty of log_2 |Y|, where log_2 |Y| is the length of Y. We note here that tighter bounds on the penalty due to the announcement of additional information Y can be provided, for instance using tighter min-entropy chain rules, but the analysis would be protocol dependent.
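The quantum leftover hash lemma invoked here can be stated as follows (notation adapted; this is the standard form from the cited literature [33,34], not a formula reproduced from this paper — hashing an input A down to ℓ bits with a 2-universal family F yields an output K close to uniform conditioned on the side-information E and the hash choice F):

```latex
% Quantum leftover hash lemma (standard form, notation adapted):
\Delta_t\!\left(\rho_{KFE},\; \omega_K \otimes \rho_{FE}\right)
  \;\le\; 2\varepsilon
  \;+\; \tfrac{1}{2}\sqrt{2^{\,\ell - H_{\min}^{\varepsilon}(A|E)_{\rho}}}
```

Setting ℓ slightly below the smooth min-entropy thus drives the trace distance, and hence the secrecy parameter, exponentially small.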

B. Sub-block hashing protocol
In this section, we present our sampled sub-block hashing protocol, which is meant to replace the sifting and randomness extraction steps of the generic protocol. One begins with the raw bit string A^N = (A_1, …, A_N) that is supposed to undergo sifting and randomness extraction. Let p_S be the sampling probability and N_S = 1/p_S be the number of sampled sub-blocks. Assume, for simplicity, that p_S is chosen such that we have an integer number of sub-blocks, N_S ∈ ℕ.
1. Sampling: For each round i ∈ [1, N], the honest parties generate a uniform random variable V_i, which can take values v_i ∈ [1, N_S] (i.e., V_i can take on values between 1 and N_S, each with probability p_S). If V_i = j, we say that bit A_i is "sampled" into the j-th sub-block, which we denote as A_{S_j}, where the set of indices is S_j = {i : V_i = j}. The value of V_i is then announced to the rest of the honest parties (if necessary).
2. Sifting: For each sub-block S_j, sifting can be performed to discard rounds which are inconclusive or part of the test rounds, leaving us with sifted sub-blocks A′_{S_j}. In general, the lengths of these bit strings are not fixed. Therefore, to prevent an excessively long bit string from slowing the Toeplitz hashing, one can choose to abort if the size of any sifted sub-block A′_{S_j} exceeds a pre-determined threshold L_S^UB. In practice, this aborting probability would be very small.

3. Randomness extraction: If the protocol does not abort, randomness extraction is performed on all the sub-blocks independently to yield outputs K_j = f_{RE,j}(A′_{S_j}), where the f_{RE,j} are random hash functions chosen independently from a family of 2-universal hash functions. For simplicity, we take the length of output K_j to be the same for every j.

4. Concatenation: The final output, denoted by K = K_1 K_2 ⋯ K_{N_S}, is obtained by concatenating all the individual outputs of randomness extraction on each sub-block.
When the sifting and randomness extraction is replaced with sampled sub-block hashing, V_i is additionally announced, along with the use of multiple hash functions, and these would be accessible to the adversary.
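Steps 1-4 above can be sketched in software as follows (a hedged illustration under our own naming; the placeholder hash below is a toy stand-in, whereas the paper uses seeded Toeplitz hashing, and sifting of inconclusive rounds is omitted for brevity):

```python
import random

def toy_hash(bits, out_len, rng):
    # Placeholder for 2-universal hashing: each output bit is the XOR
    # (parity) of a freshly chosen random subset of the input bits.
    return [sum(b for b in bits if rng.random() < 0.5) % 2 for _ in range(out_len)]

def sampled_subblock_extract(raw_bits, N_S, out_len_per_block, L_UB, rng):
    """Sketch of the sampled sub-block pipeline (steps 1-4 above)."""
    # Step 1 (sampling): V_i uniform on [1, N_S] assigns each round to a sub-block
    subblocks = [[] for _ in range(N_S)]
    for bit in raw_bits:
        subblocks[rng.randrange(N_S)].append(bit)
    # Step 2 (abort check): abort if any sub-block exceeds the threshold L_S^UB
    if any(len(b) > L_UB for b in subblocks):
        return None  # protocol aborts
    # Step 3 (randomness extraction): hash each sub-block independently
    outputs = [toy_hash(b, out_len_per_block, rng) for b in subblocks]
    # Step 4 (concatenation): K = K_1 K_2 ... K_{N_S}
    return [bit for out in outputs for bit in out]

rng = random.Random(7)
key = sampled_subblock_extract([1] * 1000, N_S=10, out_len_per_block=5,
                               L_UB=200, rng=rng)
```

The independence of the hash functions across sub-blocks mirrors the requirement that each f_{RE,j} be drawn independently from the 2-universal family.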

C. Security of sub-block hashing
Replacing the sifting and randomness extraction with sampled sub-block hashing, one remains concerned about the security of the bit string K, now formed from a concatenation of sub-strings. The security of the protocol is closely tied to the security of each sub-string K_j, which one could consider as being generated from a sub-protocol. The security of this sub-protocol, as we claim below, can be easily computed based on the security analysis of the original GEAT-analysable protocol. We note that while the security claim provided here deals with the ease of computing the secrecy of the protocol when adapted with sub-block hashing, it makes no assertion on whether this would cause an increase in performance.
Claim 1. Replacing the sifting and randomness extraction with the sampled sub-block hashing steps, the security of the protocol can be computed from the security of the sub-protocols, ε = N_S ε′. The sub-protocol security ε′ can be computed based on its associated smooth min-entropy, with the simple replacements of the min-tradeoff function f(q) → (1/N_S) f(q) and sub-string size l → l̄ = l/N_S, i.e.
where h′, v′_0 and v′_1 are computed with the min-tradeoff function replacement.
Here, we provide a brief idea of the security analysis demonstrating the claim, with the full proof provided in Appendix B. For each sub-protocol, the associated smooth min-entropy is conditioned on V^N E′_N, which accounts for correlations between the sampled raw bit strings. Since the original protocol is GEAT-analysable, one is able to write down EAT channels M_i and apply the EAT theorem by defining some min-tradeoff function f(q). To construct the min-tradeoff function for the sub-protocol, one can adapt the EAT channel to additionally output the other sub-blocks' bits A′_{S_k,i} (for k ≠ j) and V_i to Eve. Since only a p_S fraction of the bits A′_i are output as A′_{S_j,i}, the entropy and thus the min-tradeoff function gains a factor of p_S = 1/N_S, giving the modification in the claim.
We note again that the penalty due to the announcement of additional information exchanged, Y, could be bounded more tightly, depending on the protocol. For instance, if Y is the syndrome sent for error correction, adapting the error correction to apply to each sub-block could reduce the penalty, since each sub-block A′_{S_j} would only face a penalty due to the error correction information for that sub-block itself, Y_j. This could in principle reduce the error correction penalty on each sub-block by a factor of N_S. We also note that for protocols with additional parameter checks, such as the key verification step in QKD to ensure correctness, the additional conditioning on the event that these checks pass, Ω_KV, can be removed by a simplification of the smooth min-entropy.

D. Excessive sub-block size
The sampled sub-blocks would in general be shorter in length (by a factor of N_S) and thus enjoy a speed-up during the hashing process. However, due to the nature of bit-wise sampling, there remains a small probability that the length of one sub-block could be excessively long. To properly account for these events in our analysis, we set a block size limit, L_S^UB, and if any sub-block exceeds it, the protocol aborts. As such, one can be sure that if the protocol succeeds, the speed-up is guaranteed by the fact that the hashing process has an input block size of at most L_S^UB.
Aborting when the bit strings are excessively long incurs a small penalty to the secrecy, ε_Ω′, the probability that these rare events occur, i.e., the probability that any sub-block exceeds the limit. One could in principle select the block size limit with any concentration bound, and here we demonstrate, with a concentration bound on the binomial distribution [37], that one can choose the block size limit to keep this probability below ε_Ω′, where p_sift is the probability that a protocol would output a bit from A_i into A′_{S_j} (the round has to be conclusive, not be a test round, and be sampled into block S_j), Φ(x) is the normal cumulative distribution function, and H(x, p) = x ln(x/p) + (1−x) ln[(1−x)/(1−p)]. Including this aborting chance would result in an additional ε_Ω′ penalty to the security condition, since the trace distance between the states including and excluding this aborting part is bounded by ε_Ω′.
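Numerically, the threshold can be found by scanning upward from the mean sub-block size. The sketch below uses a simple Chernoff-type tail bound with the same relative-entropy function H(x, p), together with a union bound over the N_S sub-blocks; this is our own simplification (the paper's bound [37], involving Φ, is tighter), and all parameter values are illustrative:

```python
import math

def rel_entropy(x, p):
    # Binary relative entropy H(x, p) = x ln(x/p) + (1-x) ln((1-x)/(1-p))
    return x * math.log(x / p) + (1 - x) * math.log((1 - x) / (1 - p))

def block_size_limit(N, p_sift, N_S, eps):
    """Smallest L such that, by a Chernoff bound on Binomial(N, p_sift)
    and a union bound over the N_S sub-blocks, the probability that any
    sub-block exceeds L is at most N_S * exp(-N * H(L/N, p_sift)) <= eps."""
    L = math.ceil(N * p_sift) + 1
    while N_S * math.exp(-N * rel_entropy(L / N, p_sift)) > eps:
        L += 1
    return L
```

For instance, with N = 10^6 rounds, p_sift = 0.01 and ε_Ω′ = 10^-8, the resulting limit sits only a few percent above the mean sub-block size, so the abort event is indeed rare in practice.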

IV. RESULTS
We demonstrate our method by employing it on a simulated standard BBM92 [3] QKD protocol with ε_Ω′ = 1 × 10^-8, e_ph = 0.82%, and e_bit = 5.8%. The error rates are chosen based on previously reported experimental results [38]. We then use GEAT to optimise the testing probability p_X under different simulation parameters. We note that the different p_X values can be easily achievable experimentally using passive beam splitters with fixed splitting ratio, or using optical switches for active basis choice.
We first design a module on FPGA to carry out Toeplitz hashing. The schematic of the hardware module on FPGA is shown in Fig. 2. We explain the process for sampling of the first sub-block, and note that the implementation for the rest of the sub-blocks is largely similar, with some minor changes. The hashing module receives one bit of input data every clock cycle and runs a PRNG algorithm. If the output of the PRNG is within a certain range, the module stores the bit in external DDR memory (i.e., the bit is sampled). Once the sampling is done, the module receives one bit of seed data from the processing subsystem (PS) and one bit of raw data from the DDR RAM every computation cycle. The seed bits are accumulated as a seed string and the string is used for matrix-vector multiplication. At the end of the computation cycle, a shift register is used to shift the accumulated seed string by one position. The string will then be ready to receive the new seed bit that is incoming in the next computation cycle.
The module also splits the large Toeplitz matrix into smaller matrices of size m′ × n′ and performs the matrix-vector multiplication iteratively. We use a pipelined implementation, where the interval between every computation cycle is set to one clock cycle. With this, the number of clock cycles required to perform hashing on one sub-block is approximately the number of m′ × n′ tiles, ⌈m/m′⌉ × ⌈n/n′⌉, where m and n are the output and input sizes of the sub-block. For our implementation, we set m′ = 2000 and n′ = 1.
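Under this tiled-pipeline model (one clock cycle per m′ × n′ tile; the closed form below is our reading of the implementation, not a formula quoted verbatim), the speed-up from sub-block hashing can be estimated as follows, where each of the N_S sub-blocks carries roughly 1/N_S of the input and output:

```python
import math

def hash_cycles(n, m, n_prime=1, m_prime=2000):
    # One pipelined computation cycle per m' x n' tile of the Toeplitz matrix
    return math.ceil(m / m_prime) * math.ceil(n / n_prime)

# Illustrative comparison: direct hashing vs. N_S sub-blocks
n, m, N_S = 10**9, 5 * 10**8, 17   # input bits, output bits, sub-blocks
direct = hash_cycles(n, m)
sub = N_S * hash_cycles(n // N_S, m // N_S)
```

Since the cycle count scales with the product n·m, splitting into N_S sub-blocks reduces the total cycles by roughly a factor of N_S, consistent with the linear speed-up reported below.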

A. Simulation with fixed testing probability
Using the key length presented in Appendix C and the timing information, we simulate the key rate for three scenarios: (1) hashing the sifted key directly, (2) using sampled sub-block hashing, and (3) running the QKD protocol for a smaller number of rounds (N/N_S) for N_S times, totalling N rounds, and performing parameter estimation and hashing for each of the N_S blocks separately. The main result of this paper is the comparison between scenarios one and two. However, as the last scenario is largely similar to our method, we will make additional comparisons between scenarios two and three throughout the paper as well.
To illustrate the effects of the sampled sub-block hashing method, we consider a BBM92 protocol with N = 10^9 rounds, and a secrecy parameter of ε_sec = 1 × 10^-6. The optimised p_X value for direct hashing is 0.0176, which we round to 0.02. This method of choosing p_X corresponds to the scenario where an optimised setup has already been achieved, and one would like to modify the post-processing step to improve the throughput. From Fig. 3, it is clear that using our method, there is a small loss in key rate per signal sent, but we obtain a linear speed-up. The loss in key rate from sampling, as shown in Fig. 3a, is expected, and is a result of the penalty from sampling. Interestingly, despite the loss, there is an advantage over starting with a smaller block. This may be due to the fact that our method uses the overall statistics of N rounds to estimate the min-entropy of each sub-protocol, whereas the small block scenario only uses each protocol's statistics of N/N_S rounds. In addition, as the time required scales more favourably for lower input size (based on the clock cycle equation we obtained), a shorter sub-block would improve the generation rate despite a loss in key length, as seen in Fig. 3b.

Now, suppose that the protocol is run with the parameters presented in Fig. 3, and the user requires a key of length 5 × 10^8 bits. This corresponds to a value of 0.5 for the key rate per signal, which is indicated by the dashed line in Fig. 3a. In this scenario, N_S = 4 is the optimal value for the small block implementation. On the other hand, if our method is utilised, then the optimal value would now be N_S = 17. From the plot shown in Fig. 3b, according to the dashed lines, both methods would result in an improvement in throughput compared to direct hashing. However, our method has a throughput that is higher than the small block implementation by more than a factor of 4. This shows the advantage of our method over the small block implementation.
To study the behaviour of the secrecy parameter ε_sec, we fix the output length to l = 4.3 × 10^8 bits, N = 10^9, and p_X = 0.02. We optimise the secrecy parameter for all three scenarios under different values of N_S and plot the results in Fig. 4. From the results, we can see that the sampled sub-block hashing method incurs a small linear loss in terms of secrecy, but allows for a linear improvement in terms of throughput.
We observe that our method is able to achieve positive key rates in parameter regimes that were previously not possible. As an example, Fig. 5 contains the calculated key rate per signal result when N = 3 × 10^7, p_X = 0.02, and ε_sec = 10^-6. Under the small block implementation, hashing has to be performed on a minimum block size of approximately 1.3 × 10^6 bits to obtain a positive key rate. Using our method, we are able to perform hashing on an even smaller block size of 1 × 10^6 bits and obtain a key rate per signal that is higher by more than an order of magnitude.

B. Simulation with optimised testing probability
Next, we consider the case where the testing probability p_X is not fixed (e.g., the setup is still in the design optimisation stage, or p_X can be easily changed with an active beam splitter). We again present the simulated key rate results for the same three scenarios mentioned in section IV A. For this case, we optimise p_X over all values of N_S, for all three scenarios. The simulated results for N = 1 × 10^9 rounds and ε_sec = 1 × 10^-6 are shown in Fig. 6. Comparing Figs. 3a and 6a, our method achieves similar key rate per signal for both cases. On the other hand, for the case of small block implementation, optimising the value of p_X provides an improvement to the key rate per signal. This is because the optimal p_X increases with respect to N_S, as shown in Fig. 6c, which allows for a larger parameter estimation set. Hence, there is less penalty from statistical fluctuations. However, our method is still able to achieve a higher key rate than the small block implementation in this parameter regime.
Similarly, we optimised the p_X values for N = 3 × 10^7 and ε_sec = 1 × 10^-6, and plot the simulated key rate results in Fig. 7. We see that in this case, the small block implementation will outperform our method when the value of N_S is greater than 13, as shown in Fig. 7a. Comparing Figs. 6c and 7b, the optimal p_X increases with N_S for the small block implementation for both cases. As explained, this is to allow for a larger parameter estimation set, and therefore reduce the penalty from statistical fluctuations. For our method, since the overall string is utilised for parameter estimation, the effect of statistical fluctuation remains relatively unchanged with increasing number of sub-blocks, resulting in similar optimal p_X values. The decrease in key rate is thus primarily due to the penalty from the sampling method, which does not appear to be sensitive to p_X.

C. Hardware implementation
We implemented large input Toeplitz hashing on FPGA with and without our method, using simulated datasets. For our method, although the sampling procedure is executed in full, we perform privacy amplification on only the first three sub-blocks independently, before concatenating the hashed outputs together as a proof-of-concept demonstration. We choose p_S = 0.05, because at that point, the loss in extractable key length is small, but the increase in execution speed is large. We used a Xilinx ZCU111 evaluation board as our implementation platform and tabulated our results in Table I. As expected, when the input size of privacy amplification increases, the speed of the privacy amplification decreases linearly. This is because to generate even a single bit of output, a larger number of bits needs to be processed first. When comparing between direct hashing (traditional) and our method, it is clear that privacy amplification with our method has an execution speed that is faster by almost 20 times. This shows that our method allows one to satisfy the finite-size effect without too heavy a penalty on speed. In particular, for the cases where the input size is 960.40 Mbits and 1920 Mbits, Toeplitz hashing without our method would have taken approximately 413 hours and 1695 hours respectively, but our method reduced the timing to 20.73 hours and 84.99 hours, therefore greatly improving the practicality of large input hashing.
For the final implementation, the concatenated output of 152.56 Mbit is fed into the NIST 800-22 statistical test suite [39] to test for uniformity. In Fig. 8a, we show the P-value results of each individual test in the test suite. A P-value greater than 10^-4 indicates that the sequence under test passes that particular test for uniformity. Additionally, in Fig. 8b, we show the proportion result of each individual test. The proportion result shows the proportion of samples from the sequence under test that pass that particular test. A proportion value greater than 0.9657 [40], illustrated by the horizontal orange line in Fig. 8b, indicates that the sequence under test passes that particular test for uniformity. The combined results show that our concatenated output passes all the tests in the test suite. In addition, as the security is guaranteed by our theoretical model, we can thus confidently certify that the output is uniform and secret, even from the adversary.
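The 0.9657 threshold follows the SP 800-22 rule for the acceptable proportion of passing sequences, p̂ − 3√(p̂(1−p̂)/n), with p̂ = 1 − α = 0.99 at significance level α = 0.01. The sample count n = 151 below is our back-calculated assumption (it reproduces the quoted 0.9657) and is not stated in the text:

```python
import math

def proportion_threshold(p_hat=0.99, n=151):
    """NIST SP 800-22 acceptable-proportion lower bound:
    p_hat - 3 * sqrt(p_hat * (1 - p_hat) / n),
    where n is the number of tested binary sequences (assumed here)."""
    return p_hat - 3.0 * math.sqrt(p_hat * (1.0 - p_hat) / n)
```

A test is deemed passed when the observed fraction of passing sequences exceeds this bound.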
Finally, in Table II, we present the resource utilisation results from the implementation of our method on the ZCU111 evaluation kit. The utilisation of look-up tables (LUT), LUTRAM, flip-flops (FF), block RAM (BRAM), and digital signal processing (DSP) blocks is low. The low utilisation percentage also indicates that the entire project can fit on platforms which are smaller and have lower cost.

V. DISCUSSION AND CONCLUSION
In this paper, we propose a sampled sub-block hashing method for large block size randomness extraction with a low resource utilisation level. Our simulation and experimental results, based on the BBM92 QKD protocol, demonstrate an improvement in key rate per unit time (throughput) compared to both the direct hashing approach and the small block implementation approach, within certain parameter regimes of p_X and N. We attribute this improvement to the parameter estimation advantage of our scheme: parameter estimation is conducted based on the entire N rounds rather than individual small blocks, thereby introducing a smaller penalty term compared to that introduced by sampling. As a result, we conjecture that our method would exhibit a more distinct advantage for protocols requiring more extensive parameter estimation. We also note that although this work only considers direct Toeplitz hashing on FPGA, it is highly plausible that our method would also improve the throughput when applied to other implementations on other platforms (e.g., Toeplitz hashing accelerated by FFT on CPU). We leave these explorations for future work.
In summary, the proposed randomness extraction method has a straightforward operation and can be applied to a generic class of quantum cryptographic protocols. Most importantly, it offers a remarkable order-of-magnitude improvement in throughput while utilising system resources efficiently. Consequently, our method presents a highly promising solution for achieving high-speed randomness extraction under stringent resource constraints. This feature holds tremendous importance for practical and ultra-high-speed realisations of quantum cryptographic protocols, particularly in the context of on-chip deployments.
Here, p_Ω is the probability that the event Ω occurs and d_X is the dimension of X_i.
One could perform a simplification of the generalised EAT [29] to obtain the form in the main text. Since sub-block hashing computes each output string K_j = f_{RE,j}(A′_{S_j}), one can apply the quantum leftover hash lemma [33,34] on each sub-block. Therefore, one could set the output length of each sub-block to achieve the required security level, where the min-entropy is evaluated conditioned on the adversary's information and F_RE^{N_S\j} refers to the set of hash functions used except f_{RE,j}.
To determine the error, the important quantity to compute is the min-entropy. One can begin by lower bounding the min-entropy using the data-processing inequality, where the second inequality stems from the fact that F_RE^{N_S\j} is independent of all other terms in the min-entropy.
To analyse the min-entropy with generalised EAT, one has to remove the conditioning on Y, which, by definition, cannot be expressed as being generated round-by-round. Depending on the nature of Y, there are many ways to remove the conditioning with a min-entropy chain rule. Here, we highlight one possible efficient method, if Y is generated by blocks. Suppose Y_j is generated from the information in rounds S_j only, i.e., Y_j = g(A′_{S_j}, I_{S_j}); then one could remove the conditioning, where S_{\j} = ∪_{i≠j} S_i. The second inequality makes use of the property that conditioning does not increase min-entropy, the third inequality uses the data-processing inequality on Y_i for i ≠ j, since A′_{S_i} and I_{S_i} (in E′_N) are present, and the final inequality uses the chain rule. For consistency, however, we shall use the min-entropy expansion in the main text in the following analysis (which incurs a penalty of log_2 |Y|), but it is straightforward to generalise the result to the more efficient case highlighted here.
We shall now focus on the min-entropy term H_min^{ε_sm,j}(A'_{S_j} | V^N E'^N), which is now amenable to generalised EAT analysis. In particular, one can define the channel M̃_i^j:

2. Generate a uniformly random variable V_i (step 1 of sample and hash).

3. Define

4. Announce the values of A'_{S_k,i} for k = j+1, ..., N_S and V_i, i.e. they will be known to the adversary.

5. Trace out

with E'^N being known to the adversary. To show that this is a valid EAT channel, one must check the two conditions of projective reconstructability and non-signalling.
The statistics generated in {M̃_i^j}_i, C^N, are no different from those in {M_i}_i. Moreover, since C_i is generated from (I_test)^N, which includes A_i by assumption, the same generation method can be utilised in analysing M̃_i^j. Therefore, using T from the projective reconstructability of the M_i EAT channel, one can create T', which acts on the expanded system. It is clear that

and thus {M̃_i^j}_i satisfies projective reconstructability.
The proof of non-signalling relies on the assumption that the original EAT channel M_i is non-signalling and that A'_i is generated independently of any memory, i.e. R_{i-1}, of the EAT channel. Due to this independence, one could in general split the EAT channel M_i into two parts. The first part, E^1_i, generates A'_i, while the second part, E^2_i, with modifications to form Ẽ_i and possible intermediate systems, includes the use of R_{i-1} to arrive at the final output systems of the M_i channel. Since the first part does not depend on R_{i-1}, the non-signalling condition would only depend on the second part, i.e. that there exists R'_i such that

Consider the channel M̃_i^j described above, which can be written as a series of channels, M̃^j ∘ M_i, where M̃^j describes the generation of V_i and the sampling step. It can be noted that the order of the maps M̃^j and E^2_i are interchangeable, as neither acts to alter the systems that are inputs to the other channel. As such, we have that

where the second equality traces out the output A'_{S_j,i}, the fourth equality applies the non-signalling property, and the final equality swaps the order of the trace.

With M̃_i^j as a valid EAT channel, one can now consider the min-tradeoff function to use before generalised EAT is applied. In this case, the min-tradeoff function has to satisfy

The von Neumann entropy can be simplified by expanding in terms of the different v_i values. When v_i ≠ j, A'_{S_j,i} = ⊥, which has entropy 0 since it is a fixed value. When v_i = j, the announced systems in M̃_i^j do not give any information on A'_i and we can reduce the conditioning. Moreover, since the generation of A'_i and E'_i in M̃_i^j is fully determined by the channel M_i, the states to optimise over reduce to Σ_i(q), thus yielding the requirement on the min-tradeoff function

where the p_S factor is due to the probability of v_i = j.
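The expansion over the v_i values can be made explicit as follows (a sketch, with the conditioning registers simplified; p_S = Pr[V_i = j] as in the surrounding text):

```latex
H\!\left(A'_{S_j,i} \,\middle|\, V_i\, E'_i\right)
= \Pr[V_i \neq j]\,
  \underbrace{H\!\left(\perp \,\middle|\, E'_i\right)}_{=\,0}
+ \Pr[V_i = j]\, H\!\left(A'_i \,\middle|\, E'_i\right)
= p_S\, H\!\left(A'_i \,\middle|\, E'_i\right)
```

The first term vanishes because ⊥ is a fixed value, leaving exactly the p_S prefactor noted above.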
From the GEAT-analysable original protocol, there exists a min-tradeoff function f(q) that lower bounds the RHS without the factor p_S. Therefore, f'(q) = p_S f(q) is a valid min-tradeoff function.
Applying GEAT using this min-tradeoff function, along with the EAT channels M̃_i^j, one arrives at the result in the main text. Since this result applies to all min-entropy terms in Eqn. (B3), we have that

If the smoothing parameters are set to be equal for all rounds, ε_sm,j = ε̄_sm, the error reduces to

Using the quantum leftover hash lemma [33,34], the error can be expressed as

where the min-entropy is computed for ρ_{A'^N E'^N V EC | Ω}. Since V and EC are not generated in a round-by-round manner (playing the role of Y in the generic GEAT-analysable protocol), they are removed using the chain rule of smooth min-entropy [35,36], where for simplicity we assume the standard expression for the error-correction leakage, with e_b the bit error rate and f_EC the efficiency of the error correction step.
We now focus on simplifying H_min^{ε̄_sm}(A'^N | E'^N) using generalised EAT. In the most general case, Eve's attack can be performed at the start, preparing the quantum state ρ_{Q_A^N Q_B^N E} before sending the sub-systems to Alice and Bob.
Therefore, one can craft the EAT channel with initial state ρ_{Q_A^N Q_B^N E}. To achieve the same statistics, C_i is announced by the two parties, making the statistics accessible to Eve. Therefore, it is clear that C^N can be recovered through E'^N alone, and the channel satisfies projective reconstructability. Since there is no R_i in the protocol, non-signalling is trivially satisfied by R_i = Tr_{A_i} ∘ M_i, demonstrating that M_i is a valid EAT channel.
Before applying GEAT, one has to define a valid min-tradeoff function, f(q), which lower bounds H(A'_i | E'_i) for any state that can be generated from the channel M_i and has statistics C_i ∈ {0, 1, 2, ⊥} satisfying some probability distribution q = (q_0, q_1, q_2, q_⊥), i.e. states in the set Σ_i(q) = {ρ | ρ_{A'_i C_i R_i E_i Ẽ_{i-1}} = M_i(ω_{R_{i-1} E_{i-1} Ẽ_{i-1}}), ρ_{C_i} = q}. Expanding the entropy to remove cases where A'_i = ⊥, i.e. when Alice and Bob did not both choose the Z basis or when either party registers no detection,

where A_i^{det,ZZ} are the outcomes of rounds where both parties chose the Z basis and registered a detection, and p_det is the detection probability. Note that fair sampling is assumed here, p_{det|XX} = p_{det|ZZ} = p_det, so the detection probability is basis independent. Since these rounds have guaranteed detection and Z-basis measurement, one can apply the entropic uncertainty relation (for conditional von Neumann entropy) [41,42] to show

where the phase error rate can be computed from e_ph = q_1/(q_0 + q_1) and the detection probability from p_det = (q_0 + q_1)/p_X^2 = 1 − q_2/p_X^2 for statistics q. To define an affine min-tradeoff function, noting that binary entropy is concave, one could upper bound the expression above by its tangent,

where e'_ph ∈ (0, 0.5) is the tangent point, which could be optimised to provide a tight entropy bound. Simplifying the RHS of the equation, one arrives at an affine min-tradeoff function, using an expansion similar to those in Lemma V.5 of Ref. [43] for the variance computation. Applying GEAT, the error can now be expressed as in Eqn. (C8). The key length at a given secrecy level can then be expressed as
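The tangent-line bound used here relies only on the concavity of the binary entropy h. A small numerical check (the tangent point plays the role of e'_ph; values are illustrative):

```python
import numpy as np

def h(p):
    """Binary entropy in bits."""
    p = np.clip(p, 1e-15, 1 - 1e-15)
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def h_tangent(p, p0):
    """Tangent to h at p0; since h is concave, h(p) <= h_tangent(p, p0).

    The slope is h'(p0) = log2((1 - p0) / p0).
    """
    return h(p0) + (p - p0) * np.log2((1 - p0) / p0)
```

Optimising over the tangent point then tightens the resulting affine min-tradeoff function, as noted above.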

FIG. 3. Simulated key rate results for a standard BBM92 QKD protocol, where N = 1×10^9, e_ph = 0.82%, e_bit = 5.8%, p_X = 0.02, and ε_sec = 1×10^−6. "Full" refers to hashing the sifted key directly, "splitting" refers to using our sampled sub-block hashing method, and "small block" refers to running the QKD protocol for (N/N_S) rounds N_S times, performing parameter estimation and hashing for each of the N_S blocks separately. (a) Plot of key rate per signal sent against the number of sampled sub-blocks. The dashed line represents a target key rate per signal of 0.5. (b) Plot of key rate per unit time against the number of sampled sub-blocks. The dashed lines represent the optimal N_S value, and its corresponding throughput, that achieves the target key rate per signal for the splitting (N_S = 17) and the small block (N_S = 4) methods.

FIG. 6. Simulated results for a standard BBM92 QKD protocol, where N = 1×10^9, e_ph = 0.82%, e_bit = 5.8%, and ε_sec = 1×10^−6. The sampling probability p_X is left as a free parameter to be optimised. (a) Plot of key rate per signal sent against the number of sampled sub-blocks. (b) Plot of key rate per unit time against the number of sampled sub-blocks. (c) Plot of optimal p_X against the number of sampled sub-blocks.

FIG. 7. Simulated results for a standard BBM92 QKD protocol, where N = 3×10^7, e_ph = 0.82%, e_bit = 5.8%, and ε_sec = 1×10^−6. The sampling probability p_X is left as a free parameter to be optimised. (a) Plot of key rate per signal sent against the number of sampled sub-blocks. (b) Plot of optimal p_X against the number of sampled sub-blocks.

FIG. 8. Results obtained from the NIST 800-22 statistical test suite for the concatenated output of 152.56 Mbit. For each test in the test suite, we plot (a) the P-value and (b) the proportion of samples that passed. If there were multiple runs of the same test, the minimum value is taken.
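For reference, the pass-proportion criterion used by NIST SP 800-22 (Section 4.2.1): with m sequences tested at significance level α, the proportion of passing sequences should fall within p̂ ± 3√(p̂(1−p̂)/m), where p̂ = 1 − α. A minimal sketch (the value of m is illustrative):

```python
import math

def nist_proportion_range(m, alpha=0.01):
    """Acceptance interval for the proportion of passing sequences,
    following the rule of thumb in NIST SP 800-22 Sec. 4.2.1."""
    p_hat = 1.0 - alpha
    half_width = 3.0 * math.sqrt(p_hat * (1.0 - p_hat) / m)
    return p_hat - half_width, p_hat + half_width
```

A plotted proportion below the lower end of this interval indicates a failed test at the chosen significance level.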

Given the min-tradeoff function, one could compute the quantities relevant to GEAT,

Max(f) = p_Z^2 [1 + log_2(1 − e'_ph)],
Min_Σ(f) = p_Z^2 [1 + log_2 e'_ph],
Var(f) ≤ p_Z^2 [log_2((1 − e'_ph)/e'_ph)]^2,

together with the key-rate term η_tol [1 − h(e'_ph) − (Q_tol − e'_ph) log_2((1 − e'_ph)/e'_ph)].

Schematic of Toeplitz hashing on FPGA using our proposed method. To perform the test without our method, the PRNG & Toeplitz hashing module is switched out to another module that performs only Toeplitz hashing.
one can formulate the EAT channel as follows:

1. Randomly select D_i ∈ {X, Z} and D'_i with probabilities p_X and p_Z.
2. Measure subsystems Q_{A,i} and Q_{B,i} in the chosen bases and obtain outcomes A_i and B_i.
3. Compute the detection results W_i, W'_i, the test round choice T_i, the test round outcomes A'_{test,i} and the statistics C_i.
4. Trace out A_i, B_i and B'_i.
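The four steps above can be mimicked by a toy classical simulation of a single round's public bookkeeping (not the quantum channel itself; the probabilities, error model, and encoding of C_i below are illustrative assumptions, chosen only to match the statistics q = (q_0, q_1, q_2, q_⊥) used earlier):

```python
import random

def simulate_round(p_x=0.02, p_det=0.5, e_ph=0.0082, rng=None):
    """Toy sketch of one round's public statistic C_i in {0, 1, 2, 'perp'}.

    0/1:  both chose X and a detection occurred, without/with a phase error;
    2:    both chose X but no detection;
    perp: every other round (key rounds and mismatched bases).
    """
    rng = rng or random.Random()
    d_a = 'X' if rng.random() < p_x else 'Z'    # step 1: basis choices
    d_b = 'X' if rng.random() < p_x else 'Z'
    detected = rng.random() < p_det             # detection result W_i
    if d_a == 'X' and d_b == 'X':               # test round
        if not detected:
            return 2
        return 1 if rng.random() < e_ph else 0  # phase error statistic
    return 'perp'                               # outcomes traced out (step 4)
```

Averaging C_i over many rounds recovers e_ph ≈ q_1/(q_0 + q_1) and p_det ≈ (q_0 + q_1)/p_X^2, matching the expressions given in the min-tradeoff discussion above.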