Limitations in quantum computing from resource constraints

Fault-tolerant schemes can use error correction to make a quantum computation arbitrarily accurate, provided that errors per physical component are smaller than a certain threshold and independent of the computer size. However, in current experiments, physical resource limitations like energy, volume or available bandwidth induce error rates that typically grow as the computer grows. Taking these constraints into account, we show that the amount of error correction can be optimized, leading to a maximum attainable computational accuracy. We find this maximum for generic situations where noise is scale-dependent. By inverting the logic, we provide experimenters with a tool for finding the minimum resources required to run an algorithm with a given computational accuracy. When combined with a full-stack quantum computing model, this provides the basis for energetic estimates of future large-scale quantum computers.


I. INTRODUCTION
With the advent of small-scale quantum computing devices from companies like IBM, and the myriad software and hardware quantum startups, interest in building quantum computers is at an all-time high. The latest declaration of quantum supremacy by Google [1] raises the question: how do we make our quantum computers more powerful? The answer is, of course, to have larger quantum computers. But larger also usually means noisier, with more fragile quantum components that can go wrong, leading to more computational errors. The way out of this conundrum is fault-tolerant quantum computation (FTQC), the only known route to scaling up quantum computers while keeping errors in check.
FTQC schemes have been known since the early days of the field [2-8], and are widely reviewed [9-12]. They remain an active field of research, especially in the context of surface codes; see, e.g., Refs. [13-15] or Ref. [16] for an older review. Underlying all FTQC schemes are basic assumptions about the nature of the quantum devices and the noise afflicting them. Many of these assumptions were laid down long before experimental devices came about. As we learn more about the shape of quantum computers to come, it is important to revisit those assumptions and update them to properly describe real devices, so that the schemes remain relevant to our progress towards large-scale, useful quantum computers.
FTQC tells us how to scale up the quantum computer, to accommodate larger problem sizes and improve computational accuracy, by increasing the physical resources spent on implementing the computation. Every known FTQC scheme relies on quantum error correction (QEC) codes to remove errors, using more and more powerful codes to remove more and more errors, accompanied by a prescription to avoid uncontrolled spread of errors as the computer grows. One key assumption is that the physical error probability η (the maximum probability that an error occurs in a physical qubit or gate) remains constant as the computer scales up. Then, so long as the error is below a certain threshold (typically an error probability per gate of less than 10^-4), one can perform more accurate calculations by investing more physical resources to scale up the computer's size (adding more qubits, gates, etc.). In principle, this can be repeated until computational errors are arbitrarily rare.
However, because of resource limitations, η is observed to grow with scale in current quantum devices. For instance, in ion-trap experiments, the gate fidelity drops rapidly as more and more ions are put into the same trap; this volume constraint is the motivation behind the push for networked ion traps and flying qubits to communicate between traps (see, for example, [17]). Another example is provided by qubits that are coherently controlled by resonantly addressing their transition. Here, a constraint on the total energy available to perform gates results in lower gate fidelity [18]. Finally, a constraint on the available bandwidth pushes the qubit transition frequencies closer and closer together as the computer size grows, causing more and more crosstalk between qubits when performing gates [1]. These three typical examples lead to scale-dependent noise.
If the physical error probability η is scale-dependent, growing as the computer scales up, we cannot expect quantum error correction to keep up with the rapid accumulation of errors, so it should come as no surprise that the usual accuracy threshold theorem no longer applies.

FIG. 1. How resource constraints can lead to increased computational errors: each gate operation on a physical qubit (blob) requires a certain amount of physical resource (fence) for good control. If the number of physical qubits increases as the computer grows, without a proportionate increase in control resource, errors will increase.
In this work, we examine the consequences for FTQC of a physical error probability η that grows as the computer scales up. The absence of a threshold means that arbitrarily accurate computation is unattainable, but it does not mean that quantum error correction is useless. We find generic situations where a certain amount of error correction is good, but too much is bad. Hence, the amount of error correction should be optimized, leading to a maximal achievable computational accuracy. We provide experimenters with a methodology to estimate this maximum for a given scale-dependent noise, and show the importance of adjusting the experimental design to control this scaling. Inverting the perspective allows us to estimate the minimum resource cost required to perform a computation with a given accuracy.
After recalling the basics of FTQC, we present our general strategy. We exemplify it with a toy model that captures the main features of FTQC in the presence of scale-dependent noise. We then focus on three physically motivated situations where resource constraints like energy, volume or bandwidth lead to scale-dependent noise, and examine the feasibility of FTQC in the limit of large quantum computers. We finally provide first methodological steps towards minimizing the energetic cost of running an algorithm with a given accuracy. This suggests the possibility of a detailed energetic analysis for a full-stack quantum computer, which, however, goes beyond the scope of this paper.

II. ACCURATE QUANTUM COMPUTING
To be concrete, we examine the FTQC scheme of Ref. [7], built on the idea, put forth in earlier works, of concatenating a QEC code. This formed the foundation of many subsequent FTQC proposals; our results are hence applicable to those based on concatenated codes. Such schemes have more well-established and complete theoretical analyses than some of the more recent developments like surface codes. They are hence a good starting point for our investigation here.
Universal quantum computation in the scheme of Ref. [7] is built upon the 7-qubit code [30], using seven physical qubits to encode one (logical) qubit of information. We refer to the seven physical qubits used to encode the logical qubit as a "code block", and to gates on the logical qubit as "encoded gates". At the lowest level of protection against errors, which we refer to as "level-1 concatenation", each logical qubit is encoded using the 7-qubit code into one code block, and every computational gate is done as an encoded gate on the code blocks. Every encoded gate is immediately followed by a QEC box, comprising syndrome measurements to (attempt to) correct errors in the preceding gate. Faults can occur in any of the physical components (physical qubits and gates), including those in the QEC boxes, so the error correction may not always successfully remove the errors. Faulty components in the QEC box may even add errors to the computer. A critical part of the construction of Ref. [7] is to ensure that the QEC boxes, even when faulty, do not cause or spread errors on the physical qubits in an uncontrolled manner, provided not too many faults occur; this is a realization of the notion of fault tolerance.
At level-1 concatenation, the ability of the code to remove errors is limited: the 7-qubit code ideally removes errors in at most one of the seven physical qubits in the code block. To increase the QEC power, we raise the concatenation level of the circuit: every physical qubit in the lower concatenation level is encoded into seven physical qubits, and every physical gate is replaced by its 7-physical-qubit encoded version, followed by a QEC box. In this manner, level-k concatenation is promoted to level-(k + 1) concatenation, for k = 0, 1, 2, .... The QEC ability of each level of concatenation increases in a hierarchical manner. For example, at level-2 concatenation, every logical qubit is stored in 7^2 = 49 physical qubits: the topmost layer comprises seven blocks of seven physical qubits each. Each block of seven physical qubits is protected using the 7-qubit code, and the seven blocks are themselves protected by QEC in the second layer. This logic extends to higher levels of concatenation.
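The qubit-count bookkeeping above can be checked in two lines (an illustrative sketch, not code from the paper): each extra level of concatenation multiplies the number of physical qubits per logical qubit by seven.

```python
# Physical qubits per logical qubit for the concatenated 7-qubit code:
# one qubit at level 0, seven at level 1, 49 at level 2, and so on.
def physical_qubits_per_logical(k):
    return 7 ** k

print([physical_qubits_per_logical(k) for k in range(4)])  # [1, 7, 49, 343]
```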
The concatenation endows the overall computational circuit with a recursive structure (see Fig. 2), a crucial ingredient in the proof of the quantum accuracy threshold theorem. The increase in computational resource as the concatenation level grows is beneficial only if the increased noise due to the larger circuit is less than the increased ability to remove errors. This leads to the concept of a fault-tolerance threshold condition. The quantum accuracy threshold theorem gives a prescription for increasing the accuracy of quantum computation with no more than a polynomial increase in resources, provided the physical error probability is below a threshold level. Specifically, for the FTQC scheme of [7], the error probability per logical gate at level-k concatenation is upper-bounded by

p^(k) ≤ (1/B) (Bη)^(2^k).    (1)

Here, η is the physical error probability, so that p^(0) = η. B is a numerical constant, determined by the fault-tolerance scheme, that captures the increase in complexity (number of physical components) of the circuit used to implement a single logical gate as one increases k for increased protection. Eq. (1) expresses quantitatively the idea of the accuracy threshold theorem: p^(k) decreases as k increases as long as

η < η_thres ≡ 1/B.    (2)

Eq. (2) is the threshold condition, i.e., the physical error probability η in the quantum computer has to be below the threshold level η_thres for FTQC to work. The number of physical gates in the circuit that implements the level-k logical gate is G^(k), determined by integers A and A′ given by circuit details; the well-known scheme of Ref. [7] has A = 575 and A′ = 291 [31]. From A and A′, one counts the number of fault locations, B. A simple over-estimate of the integer B is A^2 ∼ 10^5, with a more careful counting giving an improved value of B ∼ 10^4 [7].
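As a numerical illustration of Eq. (1) (our own sketch, with an illustrative η; B = 10^4 follows the careful count quoted above), the closed form is equivalent to the standard level-by-level recursion p → B p^2, and below threshold the logical error falls double-exponentially:

```python
# Sketch (not the paper's code): level-k logical error from Eq. (1),
# p(k) = (1/B) * (B*eta)**(2**k), versus the equivalent recursion p -> B*p**2.
B = 1.0e4       # fault locations per logical gate, order of magnitude from the text
eta = 1.0e-5    # illustrative physical error, below the threshold 1/B (B*eta = 0.1)

def p_closed(k):
    return (B * eta) ** (2 ** k) / B

def p_recursive(k):
    p = eta
    for _ in range(k):
        p = B * p * p   # one more concatenation level squares the error (times B)
    return p

for k in range(5):
    assert abs(p_closed(k) - p_recursive(k)) < 1e-12 * p_closed(k)

print([p_closed(k) for k in range(4)])  # ≈ 1e-05, 1e-06, 1e-08, 1e-12
```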
The quantum accuracy threshold theorem [2-8] shows that a double-exponential decrease in p^(k) with k can be achieved with only an exponential increase in resources, giving the no-more-than-polynomial increase in resource costs. This theorem assumes that the value of η, the physical error probability, remains constant even as the level of concatenation k increases. However, as mentioned, current experiments suggest that η also increases as k increases and the physical size of the computer grows. Our goal here is thus to examine how the conclusions on quantum accuracy are modified if η grows with k. It is intuitively clear that as k is increased in an attempt to reduce the logical error probability, the underlying noise per physical component increases to thwart that reduction. We will see that there is a maximum k beyond which further concatenation only serves to worsen the computational accuracy.

III. EFFECT OF SCALE-DEPENDENT NOISE
We examine the consequences of a k-dependent physical error probability η^(k), illustrating it first with a toy model, before analyzing the more realistic situation where a constraint on the total resource available for the computation leads to a shrinking amount of resource per physical gate as the computer scales up. The general effect of a k-dependent physical error probability is summarized in the schematic Fig. 3. We also show how to obtain the maximum computational accuracy available for a given model of scale-dependent error.

A. Toy model
We first illustrate this with a simple model in which η^(k) = η^(0)(1 + ck), for k = 0, 1, 2, ..., where η^(k) is the physical error probability per gate in a computer large enough to perform level-k concatenation. Here, c ≥ 0 and η^(0) ≥ 0 are constants governed by the physical system in question. Although this is a toy model, one can think of it as the affine approximation of any η^(k) function with weak k dependence, expanded about k = 0, where η^(k) = η^(0) (see also the long-range noise with z = d in Table I). For this η^(k), Eq. (1) gives

p^(k) = (1/B) [Bη^(0)(1 + ck)]^(2^k).    (3)

FIG. 3. The black dotted curves are a schematic of the conventional situation, where the physical error probability η is independent of the scale (the concatenation level k) of the computer. The red solid curves are a schematic of this work's consideration, where η grows with k (each red curve is for a different value of p^(0) ≡ η). If η is k-independent, standard FTQC analysis says that the error per logical gate p^(k) can be brought as close to 0 as desired by increasing k, provided one starts below the threshold (solid horizontal line) at k = 0. If η depends on k, even if one starts below the threshold, p^(k) eventually turns around for large enough k; p^(k) cannot reach 0, there is a maximum concatenation level, and further increase in k only increases the logical error. All examples in this work have at most one minimum in each red curve, but in general a curve can have multiple minima.
Here, we define

p_0^(k) ≡ (1/B) (Bη^(0))^(2^k),    (4)

which would be the value of the error probability per logical gate if the error probability per physical gate were scale-independent and equal to η^(0). This p_0^(k) decreases with k as long as η^(0) < 1/B, as in Eq. (2); if c > 0, the multiplicative factor (1 + ck) grows with k, so, eventually, p^(k+1) > p^(k) for k beyond some value k_max. Figure 4(a) shows an example of how p^(k) varies as k increases, for different c values. As long as c > 0, p^(k) decreases (if at all) up to this k_max value, before rising again above it.
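A short numerical sketch of the toy model p^(k) = (1/B)[Bη^(0)(1 + ck)]^(2^k) (our own illustration, with hypothetical parameter values): working in log space to avoid floating-point underflow, one can locate the best concatenation level by direct search and see it shrink as the scale dependence c grows.

```python
import math

# log10 of the toy-model logical error p(k) = (1/B)*(B*eta0*(1+c*k))**(2**k),
# computed in log space so that large k does not underflow to zero.
def log10_p(k, eta0, c, B=1.0e4):
    return (2 ** k) * math.log10(B * eta0 * (1.0 + c * k)) - math.log10(B)

def k_max(eta0, c, B=1.0e4, k_cap=20):
    # best concatenation level up to k_cap (argmin of p(k))
    return min(range(k_cap + 1), key=lambda k: log10_p(k, eta0, c, B))

# Scale-independent noise (c = 0): concatenating always helps (hits the cap).
print(k_max(1.0e-5, 0.0))   # 20
# Scale-dependent noise: a finite optimum appears, shrinking as c grows.
print(k_max(1.0e-5, 0.5), k_max(1.0e-5, 2.0))
```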
Corresponding to this maximum useful level of concatenation k_max is the minimum attainable error probability, p_min ≡ p^(k_max), giving the limit to computational accuracy attainable for given values of c and Bη^(0), quantities that capture the noise scaling and the fault-tolerance overheads. Figure 4(b) shows the k_max values for different c and Bη^(0) values (cf. Fig. 3b). Clearly, k_max decreases as c grows (stronger k dependence). Current experiments have Bη^(0) ≳ 1; for example, the IBM Quantum Experience system has η^(0) ∼ 10^-3, giving Bη^(0) ∼ 10 for B = 10^4. In near- to middle-term experiments, we expect Bη^(0) to not be far below 1, i.e., the error probability is just below the c = 0 threshold value [see Eq. (2)]. In this case, Fig. 4(b) suggests that one quickly loses the advantage of concatenating to higher levels even for small c values. In fact, for encoding to be helpful at all, i.e., for k = 1 to give p^(1) < p^(0), we must have Bη^(0)(1 + c)^2 < 1. If Bη^(0) = 0.8, say, this amounts to the requirement that c ≲ 0.1, so that a very weak dependence on k is necessary for even one level of encoding to help at all in reducing the error probability.

B. General case
Consider the physical error probability growing as a monotonic function of k: η^(k) = η^(0) f(k), with f(k) ≥ 1 growing monotonically with k, and f(0) = 1. Then, the error probability per logical gate is

p^(k) = p_0^(k) f(k)^(2^k),    (5)

where Eq. (4) gives p_0^(k). Treating k as a continuous variable, let us first assume there is only one minimum, at a point we define as k = k_st (with "st" for stationary point). Then the minimal attainable error occurs at an integer k_max that is one of the two integers nearest to k_st, so the minimal error occurs at k_max ≤ k_st + 1. If there are multiple minima, we define k_st as the minimum with the largest k. A priori, we do not know which minimum will be the best, but we still know that k_max ≤ k_st + 1. Combining this with a little algebra for the stationary point (setting d ln p^(k)/dk = 0 in Eq. (5)) yields

ln[Bη^(0) f(k_st)] = −f′(k_st) / [f(k_st) ln 2].    (6)

However, in many cases (such as the above toy model and the example in the following section), one has f(k → ∞) → ∞. Then the minimum of p^(k) occurs at finite k_max, no matter how small η^(0) is; one can never attain arbitrarily small logical error probability by concatenating further.
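The k_st bookkeeping above can be checked numerically (our own sketch, with an illustrative monotone f(k) not taken from the paper): locate the continuous minimizer of p^(k) = p_0^(k) f(k)^(2^k) on a fine grid and verify that the best integer level obeys k_max ≤ k_st + 1.

```python
import math

B, eta0 = 1.0e4, 1.0e-5
f = lambda k: math.exp(0.4 * k)   # hypothetical monotone noise growth, f(0) = 1

def log10_p(k):
    # log10 of p(k) = (1/B) * (B*eta0*f(k))**(2**k), valid for real-valued k
    return (2.0 ** k) * math.log10(B * eta0 * f(k)) - math.log10(B)

grid = [i / 1000.0 for i in range(20001)]            # k in [0, 20], step 0.001
k_st = min(grid, key=log10_p)                        # continuous stationary point
k_best = min(range(21), key=lambda k: log10_p(k))    # best integer level

assert k_best <= k_st + 1
print(k_st, k_best)
```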

IV. EXAMPLES OF RESOURCE CONSTRAINTS
We now give examples of how specific physical resource constraints can lead to the scale-dependent noise discussed in the previous section. In the first example, the constraints lead to a scale-dependent local noise on each physical component, so the above theory applies directly. In the second example, the constraints lead to scale-dependent crosstalk between qubits, which can be mapped to the above theory using a mapping in Refs. [20, 25].

A. Resource constraints affecting local noise
This section considers the local noise η^(k) on each qubit scaling with the total number of physical components (gates, qubits, or similar) N^(k), which grows exponentially with k. Let us assume that adding a level of concatenation involves replacing each physical component by D physical components, so N^(k) = D^k. For the noise, we take η^(k) ∝ (N^(k))^β for some positive constant exponent β, so

η^(k) = η^(0) D^(βk).    (7)

There could be various origins for such a scaling; however, a common one will be a constraint on total resources. One expects the resources needed to maintain a given quality of physical gate operations to scale with N, so a constraint on the total available resource will result in a fall in the resource per physical component as the computer scales up. This gives a consequential drop in the quality of the gates or, equivalently, a rise in the physical error probability η.
The error probability per logical gate is then p^(k) = p_0^(k) D^(βk 2^k), where Eq. (4) gives p_0^(k). Going from (k − 1) to k levels of concatenation reduces the logical error probability when p^(k)/p^(k−1) < 1; for the model of Eq. (7), this is satisfied only when k ≤ k_max, where k_max is the largest positive integer satisfying (see Appendix A)

Bη^(0) D^(β(k_max + 1)) < 1.    (8)

If no positive integer satisfies this inequality, then k_max = 0 and concatenation is not useful at all. This is because concatenation is only useful if p^(1) < η^(0), which requires

η^(0) < B^(−1) D^(−2β).    (9)

This is often a much more stringent condition than η^(0) < η_thres in Eq. (2). For example, if the noise scales with the number of gates in a concatenated FTQC scheme, we can take N^(k) = G^(k) (see the paragraph following Eq. (2) above). This means we set D = A′, where A′ = 291 as in Ref. [7]; then one sees for β = 1 that concatenation is only useful if η^(0) < B^(−1) A′^(−2β) ∼ 10^(−9), which is 10^5 times smaller than the usual threshold, η_thres. This condition is so stringent because D is so large. If the noise scales with a different physical parameter (number of qubits, number of wires, or similar), the value of D will be different, but it will typically still be large. Eq. (9) then makes it clear that the larger a given parameter's D is, the more important it is to minimize the noise's scaling with that parameter (i.e., to minimize β).
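A numerical cross-check of this bookkeeping (our own sketch; the improvement criterion Bη^(0)D^(β(k+1)) < 1 follows from demanding p^(k)/p^(k−1) < 1, and reproduces the η^(0) < B^(−1)D^(−2β) condition quoted above; D = 291 and B = 10^4 follow the counts in the text, the η^(0) values are illustrative):

```python
import math

B, D, beta = 1.0e4, 291.0, 1.0

def log10_p(k, eta0):
    # log10 of p(k) = (1/B) * (B * eta0 * D**(beta*k))**(2**k), in log space
    return (2 ** k) * math.log10(B * eta0 * D ** (beta * k)) - math.log10(B)

def k_max_criterion(eta0, k_cap=10):
    # largest positive integer k with B*eta0*D**(beta*(k+1)) < 1, else 0
    useful = [k for k in range(1, k_cap) if B * eta0 * D ** (beta * (k + 1)) < 1.0]
    return max(useful, default=0)

# The criterion should agree with a brute-force minimisation of p(k).
for eta0 in (1.0e-10, 1.0e-12, 1.0e-14):
    direct = min(range(10), key=lambda k: log10_p(k, eta0))
    assert direct == k_max_criterion(eta0)
```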
The minimal attainable error probability per logical gate is given by taking k = k_max in the above formula for p^(k). For fixed system parameters (D, B, η^(0), β), this p^(k_max) is easily found by computing p^(k) for different integers k to see which is smallest. However, to see its dependence on those parameters, Appendix A gives algebraic formulas for upper and lower bounds on p^(k_max).

B. Resource constraints affecting crosstalk
A common problem in existing prototype quantum computers is crosstalk between qubits. This is an example of the more general problem of non-local, non-Markovian noise, usually called long-range correlated noise. To treat this, we follow Refs. [20, 25], and define H_ij as the arbitrary (and potentially noisy) unwanted interaction between physical qubits i and j. This interaction could be direct, or it could be mediated by other degrees of freedom (which one traces out). In the latter case, it could be non-Markovian, meaning it can account for interactions mediated by sub-Ohmic, Ohmic or super-Ohmic baths [32]. One then defines the error strength ∆ for a computer containing N physical qubits. Refs. [20, 25] showed that t_0 ∆ is a good measure of the error per gate, where t_0 is the duration of the slowest physical gate, although it should not be interpreted directly as the error probability per gate; see, e.g., Ref. [33]. They then showed that fault tolerance occurs when t_0 ∆ < (2 e^(2+1/e) B^2)^(−1) ∼ 10^(−9) for N → ∞.
Here, in contrast, we consider cases where ∆ diverges for N → ∞, violating the condition for fault tolerance in Refs. [20, 25]. This growth of ∆ with N will often occur due to resource constraints. A simple example would be a constraint on the physical volume of the quantum computer, which is a current limitation in qubit technologies based on ion traps [34]. Then the density of qubits must scale like N. If each qubit has unwanted interactions with all other qubits within a given radius, one would have ∆ ∝ N. A second example, relevant to multiple technologies, is a bandwidth constraint, i.e., a limit on the available range of transition frequencies of the qubits. This is particularly a problem for existing superconducting and ion-trap qubit technologies. There, each two-qubit gate corresponds to a different transition frequency, and a given gate is performed by sending a driving signal (typically a microwave signal) into the quantum computer at the frequency of that gate. This means that the driving signal for a two-qubit gate between qubits n and m will also cause an unwanted interaction H_ij for pairs of qubits i and j with frequencies too close to those of n and m [35]. In some technologies, all qubits feel the driving signal; then the number of qubits feeling this unwanted interaction grows with the number of qubits in any given window of transition frequencies, which grows like N. In this case ∆ ∝ N; however, clever engineering may well reduce ∆'s scaling with N, so we prefer to consider ∆ ∝ N^β with arbitrary β [36]. Now we study how the physics depends on the scaling of ∆ with N. Taking ∆ ∝ N^β, we have ∆ ∝ D^(βk) for k levels of concatenation, where the number of physical qubits increases by a factor of D with each level of concatenated error correction. We then use the method of Refs. [20, 25], which involves taking all results in Sec. III above and replacing η^(k) by e^(1+1/(2e)) √2 t_0 ∆, where t_0 ∆ ∝ D^(βk) (see Appendix B). Defining ∆_L^(k) as the upper bound on the effective
long-range correlated noise between logical qubits performing a given algorithm with k levels of concatenated error correction, Appendix B shows that

t_0 ∆_L^(k) = (2 e^(2+1/e) B^2)^(−1) [2 e^(2+1/e) B^2 t_0 ∆^(0) D^(βk)]^(2^k),    (11)

where ∆^(0) is the magnitude of the long-range noise between physical qubits performing the same algorithm without error correction (so its logical qubits are its physical qubits). Eq. (11) gives an over-estimate of the true error, but no better bound exists at present. Thus the k that minimizes this bound gives the best existing bound on the achievable accuracy in the presence of such noise. For example, if ∆^(0) is the crosstalk between physical qubits in a computer performing a given calculation without error correction, then ∆_L^(k) is an upper bound on the crosstalk between logical qubits in a computer performing the same calculation with k levels of error correction. Since Eq. (11) has the same k-dependence as in Sec. IV A, all results there and in Appendix A hold for this long-range noise, upon replacing B by 2 e^(2+1/e) B^2. A quick estimate of D is that it equals the number of fault locations in a "Rec", so D ∼ A′ = 291; a more precise calculation [37] confirms that D is of order A′.
The good news is that this shows that error correction can reduce the errors to a certain extent, even for noise that is too long-ranged to have a fault-tolerance threshold. The bad news is that one requires t_0 ∆^(0) < (2 e^(2+1/e) B^2 D^(2β))^(−1) for error correction to be useful in reducing crosstalk between qubits. This is always tiny: it is of order 10^(−9) for small β, and of order 10^(−13) for β = 1. If one achieves noise as weak as this, one can already do huge quantum calculations without worrying about errors.
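The prefactor (2e^(2+1/e) B^2)^(−1) quoted above is straightforward to evaluate (an order-of-magnitude check only, with B = 10^4 as before):

```python
import math

B = 1.0e4
# Fault-tolerance condition on the dimensionless error strength t0*Delta:
base = 1.0 / (2.0 * math.exp(2.0 + 1.0 / math.e) * B ** 2)
assert 1.0e-10 < base < 1.0e-8   # a few times 1e-10, i.e. of order 1e-9 as quoted
print(base)
```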
In this section, we treated scale-dependent crosstalk induced by resource constraints, but our conclusions apply to long-range correlated noise of any origin. One example is an unwanted long-range interaction which decays with the distance r between qubits i and j, so H_ij ∝ (1/r)^z. Ref. [20] considered this example in a d-dimensional lattice of qubits for z > d, but we treat longer-ranged noise (z ≤ d) in Appendix B, for which ∆ ∝ N^(1−z/d). Then this example is the same as those treated above with β = 1 − z/d.

C. Energy constraint for resonant gates performing Shor's algorithm
As a concrete example of the situation described in Sec. IV A (with β = 1), we examine a resource constraint in a specific type of quantum gate implementation, performing a specific quantum algorithm. We take the gate implementation to be resonant qubit gates, and assume there are limited energy resources available to perform a given computation (see also a related early analysis in Ref. [29]). We take the desired computation to be Shor's algorithm, and investigate how big a computation can be performed (with a given calculational accuracy) when energy resources are limited.
We consider qubits embedded in waveguides, i.e., a continuum of electromagnetic modes prepared at zero temperature. Gates are activated by resonant propagating light pulses with a well-defined average energy or, equivalently, an average photon number n_g [see Fig. 5(a)]. This describes the situation in superconducting circuits [38, 39] and integrated photonics [40]. It is also the paradigm of quantum networks and light-matter interfaces, with successful implementations in atomic qubits [41]. Here, we maximize the accuracy of a computation for a given energy constraint. Then, by inverting the logic, we use this to find the minimal energy budget necessary to realize a specific computation with a desired accuracy. For our illustrative goals, we treat only single-qubit gates subjected to noise from spontaneous emission. In doing so, we neglect dephasing noise. While this is a fair approximation for atomic qubits, it is more demanding for solid-state qubits, but within eventual reach of superconducting circuits and spin qubits.
Note that the Rabi frequency Ω and the spontaneous emission rate γ are not independent, as is typical of waveguide quantum electrodynamics, where the driving and the relaxation take place through the same one-dimensional electromagnetic channel. Spontaneous emission events while the driving Hamiltonian is turned on cause errors in the gate implementation. Their impact is reduced if the qubit is driven faster, i.e., if the Rabi frequency is larger. Conversely, the Rabi frequency is related to the mean number of photons inside the driving pulse through Ω = 4γn_g/θ for a rotation by angle θ (see Appendix C). In principle, pulses containing more photons induce better gates, with perfect gates for an infinite number of photons. However, if one designs the gates to work within the rotating-wave approximation (RWA), to avoid the complicated pulse-shaping issues that come with finite counter-rotating terms, then there cannot be too many photons: n_g ≪ ω_0/γ. The remnant noise, for a θ = π gate, then has physical error probability (see Appendix C) η = (π^2/16)(1/n_g), with a minimal noise of order γ/ω_0. We now assume a constraint on the total number of photons n_tot available to run the whole computation. As we show below, taking this constraint into account allows us to minimize the resources needed for a target level of tolerable computational error. At level-k concatenation, the number of physical gates needed to implement a computation with L (logical) gates is L G^(k) = L A^k. Assuming a distinct pulse for each gate, the number of photons available per physical gate, given the total energetic constraint, is n_g = n_tot (L A^k)^(−1). Thus, the physical error probability for the θ = π gate acquires an exponential k dependence,

η^(k) = (π^2/16) (L A^k / n_tot).    (12)

This is thus a concrete example corresponding to the β = 1 case in Eq. (7) above.
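The scaling η^(k) = (π^2/16) L A^k/n_tot can be sketched in a few lines (our own illustration; the error form η = π^2/(16 n_g) is the one assumed above, while L and n_tot are hypothetical values):

```python
import math

A = 575   # physical gates per logical gate per concatenation level (count from Ref. [7])

def eta_level(k, L, n_tot):
    # Photons per physical gate shrink as n_g = n_tot/(L*A**k), so the
    # theta = pi gate error eta = (pi**2/16)/n_g grows as A**k: beta = 1 scaling.
    n_g = n_tot / (L * A ** k)
    return (math.pi ** 2 / 16.0) / n_g

L, n_tot = 1.0e6, 1.0e15   # hypothetical gate count and total photon budget
ratio = eta_level(1, L, n_tot) / eta_level(0, L, n_tot)
assert abs(ratio - A) < 1e-9 * A   # one extra level multiplies eta by A
```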
To better grasp the consequences of this k-dependent η, which we take as the generic behavior for all gates, we consider carrying out Shor's factoring algorithm [42]. Shor's factoring algorithm is touted as the reason the RSA public-key encryption system will become insecure when large-scale quantum computers become available. The current RSA key length is R = 2048 bits. The exponential speedup of Shor's algorithm over known classical methods comes from the fact that we can do the discrete Fourier transform on an R-bit string using O(R^2) gates on a quantum computer (see, for example, Ref. [10]), compared to O(R 2^R) gates on a classical computer. The discrete Fourier transform gives a period-finding routine within the factoring algorithm, the only step that cannot be done efficiently classically. Thus, to run Shor's algorithm, one needs L ∼ R^2 non-identity logical quantum gates for the discrete Fourier transform. The exact number of computational gates, including the identity gate operations (which can be noisy), for the full Shor's algorithm depends on the chosen circuit design and architecture. We will take the lower limit of L ∼ R^2 in what follows. The concatenation values we find below are thus likely optimistic estimates.
A standard strategy is to demand that the computation runs correctly with probability P_target > 1/2; once this is true, the computation can be repeated to exponentially increase the success probability towards 1. For P_target = 2/3, with L logical gates, each with error probability p_err (≪ 1), we require (1 − p_err)^L > P_target = 2/3, giving a target error probability per logical gate of p_err ≲ (3L)^(−1).
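The bookkeeping behind p_err ≲ (3L)^(−1) is a one-liner to check (our own sketch, with an illustrative L): since (1 − p)^L ≈ exp(−Lp), choosing p_err = 1/(3L) gives a success probability of about e^(−1/3) ≈ 0.72, just above 2/3.

```python
L = 10 ** 6                    # illustrative number of logical gates
p_err = 1.0 / (3 * L)          # target error probability per logical gate
success = (1.0 - p_err) ** L   # probability that all L gates are error-free
assert success > 2.0 / 3.0     # ~0.717, above the P_target = 2/3 demanded above
```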
Fig. 5(b) presents k_max, the maximum concatenation level, as an increasing function of the photon budget per logical gate, n_L ≡ n_tot/L. For fixed total resource (in this case, the photon budget per logical gate), as the concatenation level increases, the available photon count per physical component falls, and we recover the behavior observed in earlier sections, giving a finite k_max and, consequently, a limit to the computational accuracy. The solid bold black line in Fig. 5(c) gives the corresponding minimum attainable error per logical gate as a function of n_L.

V. MINIMIZING THE RESOURCE COSTS OF AN ALGORITHM
One can turn our results for resource constraints around, to answer the following question: what are the minimum resources needed to reach a target computational accuracy, sufficient for a given problem?
As an illustration, we answer this question for Shor's algorithm for an R-bit string, as in Sec. IV C. The number of gates in the algorithm grows with R, requiring a smaller p_err for the algorithm to be successful. This demands a larger photon budget to implement the algorithm using resonant gates. For the parameters in Fig. 5(c), R = 10^3 requires no concatenation, and the minimum photon budget is n_L = 10^6; for R = 10^5, we need k = 1 and n_L = 10^9; for R = 10^7, we need k = 2 and n_L = 10^11. Recall that the gates are assumed to be designed within the RWA, hence ω_0 ≫ γn_g. For R = 10^3, this translates to the condition ω_0 ≫ γn_g = γn_L/A^0 = 10^7 γ; for R = 10^7, we need ω_0 ≫ 10^6 γ for A = 575 (this A is from Ref. [7]). These conditions are attained for atomic qubits. They are within reach of future generations of superconducting qubits, where γ ∼ 10 Hz for qubit frequency ω_0 ∼ 10 GHz. Today, the best coherence times for superconducting qubits are in the millisecond range: γ ∼ 1 kHz [43, 44].
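The inversion described above can be sketched as a small search (a hypothetical implementation of our own, not the paper's code: it combines the β = 1 resonant-gate noise model of Sec. IV C with the concatenated-code error formula, and only scans decade photon budgets, so its numbers are indicative rather than the figure's):

```python
import math

A, B = 575, 1.0e4   # gate-count and fault-location constants quoted earlier

def log10_p(k, n_L):
    # log10 of the level-k logical error for a photon budget n_L per logical gate,
    # using eta(k) = (pi**2/16) * A**k / n_L and p(k) = (1/B)*(B*eta(k))**(2**k)
    eta_k = (math.pi ** 2 / 16.0) * A ** k / n_L
    return (2 ** k) * math.log10(B * eta_k) - math.log10(B)

def min_budget(R, k_levels=range(4), decades=range(3, 20)):
    # smallest decade budget n_L (with its concatenation level k) such that
    # the logical error meets the Shor target p_err <= 1/(3L), with L ~ R**2
    L = R ** 2
    target = math.log10(1.0 / (3.0 * L))
    for e in decades:                       # scan n_L = 1e3, 1e4, ...
        for k in k_levels:
            if log10_p(k, 10.0 ** e) <= target:
                return k, 10.0 ** e
    return None

print(min_budget(10 ** 3))
print(min_budget(10 ** 5))
```

For these assumptions, the search shows the same trend as the text: no concatenation suffices at R = 10^3, while a larger R forces both a higher concatenation level and a larger photon budget.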
Our analysis also provides an estimate of the energy needed to run the gates involved in Shor's algorithm, namely, E_tot ∼ ℏω_0 L n_L(R). For R = 10^3, with the above photon budget of 10^6, this translates into E_tot ∼ 1 pJ. Taking into account the parallelization of the computation (see Appendix D), this corresponds to a typical power consumption of about 1 pW, while the R = 10^7 case requires only 10 nW of power.
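The quoted 1 pJ is straightforward to reproduce (an order-of-magnitude check; we take ω_0 ∼ 10^10 s^(−1) as the qubit frequency scale):

```python
hbar = 1.054571817e-34       # reduced Planck constant, J*s
omega0 = 1.0e10              # qubit (angular) frequency scale, ~10 GHz
L, n_L = 1.0e6, 1.0e6        # R = 1e3: L ~ R**2 logical gates, photon budget n_L

E_tot = hbar * omega0 * L * n_L   # E_tot ~ hbar * omega0 * L * n_L from the text
assert 1.0e-13 < E_tot < 1.0e-11  # ~1 pJ, as quoted
```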
Thus, for a realistic constraint on the photon power available to perform quantum gate operations, quantum error correction would allow one to perform large quantum computations. This is a surprising and positive conclusion, given that the constraint causes scale-dependent errors for which there is no fault-tolerance threshold. It clearly shows that the absence of a threshold is not necessarily a significant impediment to using error correction in quantum computing.
Here, we have calculated the energy of the photons arriving at the qubits for a gate operation. However, that energy is only a fraction of the energy required for signal generation, because there is an attenuator between the signal generator and the qubit. This attenuator's job is to absorb thermal photons that would otherwise perturb the qubit, but this means it also absorbs most of the photons in the signal sent to perform the gate operation. To calculate the signal-generation energy from the above results, one must multiply by the attenuation. Unfortunately, the attenuation depends on design choices beyond the discussion here (such as the temperature at which the signal generation occurs). Furthermore, there is a large cryogenic energy budget for keeping the qubits and attenuators cool, which depends on the photon energy absorbed by the attenuator. Elsewhere [37] we will perform a full energetic optimization of all of these interlinked components (along with other critical components of a quantum computer) in a full-stack analysis of a large-scale quantum computer. Nevertheless, the above example does show that the fundamental ingredient in the minimization of energy (or any other resource) is the calculation of the scale-dependence of the noise that occurs when that resource is constrained. To go from this to a full-stack analysis is mostly an issue of optimizing the cryogenics, the control circuitry (including signal generation), and the quantum algorithm for the calculation in question.

VI. CONCLUSIONS
Many quantum computing technologies currently exhibit physical gate errors that grow with the size and complexity of the quantum computer. Then there is no fault-tolerance threshold. Despite this, we show that a certain amount of error correction can increase calculational accuracy, but that this accuracy decreases again with too much error correction. We show how to find the amount of error correction that optimizes this accuracy. For concreteness, we considered concatenated 7-qubit codes here. However, our approach could be applied to other fault-tolerance schemes (surface codes, measurement-based schemes, etc.) [45], where we also expect that the scale-dependence of the noise on physical components can lead to situations where a little error correction is good, but too much is bad.
We explored the optimization of calculational accuracy at increasing levels of practical relevance, from a simple toy example to physical qubits in waveguides. We identified some cases, such as reasonable energy constraints for gate operations, where optimization gives a maximum accuracy good enough for large quantum algorithms. In other cases, such as volume or bandwidth constraints causing long-range crosstalk between qubits, error correction is only useful against such crosstalk when the error strength per physical gate is already so small (ranging from 10^-9 to 10^-13) that one could perform huge quantum calculations without any error correction.
Our analysis suggests three priorities for experimenters working towards useful quantum computers: (1) they should try to characterize the scale-dependence of the errors for their technology; (2) they should strive to make this scale-dependence as weak as possible; (3) they should reduce the physical error probability significantly below the standard threshold. Point (3) is good for making standard fault tolerance work well, but it becomes critical when errors are scale-dependent. In this context, the optimization in this work will enable experimenters to see the size of quantum computation that can be treated with their error magnitude and scale-dependence. Experimenters will be aided in addressing these points by a full-stack model of a quantum computer [37], built from the theory presented here.

Appendix B: Long-range noise

We note that 2e^(2+1/e) ∼ 21. Refs. [20,25] considered ∆ to be finite and independent of the number of physical qubits, N, in the limit of large N. Then Eq. (B1) gives a fault-tolerance threshold at t_0∆ = (2e^(2+1/e) B^2)^-1 ∼ 10^-9. In contrast, we consider ∆ that grows with N, and so grows with k like N ≈ D^k N_0, where N_0 is the number of qubits required to perform the algorithm without any error correction. Here D is defined by saying that each additional level of concatenation replaces each logical qubit with D logical qubits. A very rough estimate of D shows that it is of order the number of gates in a "Rec", so D ∼ A = 291. A more detailed calculation of D confirms that it is of the same order of magnitude as A [37].
In principle, one could imagine an arbitrary dependence of ∆ on N, and hence on k. Then the optimal amount of error correction for such long-range noise would be that given above in Sec. III B, with B replaced by 2e^(2+1/e) B^2. If ∆ ∝ N^β, one substitutes ∆ = ∆^(0) D^(βk) into the right-hand side of Eq. (B1), where ∆^(0) is the magnitude of the long-range noise when there is no error correction, for which there are only N_0 physical qubits. This gives Eq. (11), which has the same k-dependence as in Sec. IV A. Hence, all results in Sec. IV A and Appendix A hold for long-range noise, so long as one replaces B by 2e^(2+1/e) B^2.
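The finite optimum can be found by a direct scan over k. The sketch below is our own illustration, assuming the logical error takes the standard concatenated form p^(k) = B^-1 (B η^(k))^(2^k) with a scale-dependent physical error η^(k) = η^(0) D^(βk) (Eq. (11)-type behavior); for long-range noise one would replace B by 2e^(2+1/e) B^2. Parameter values are of the kind used in Fig. 6(b), with η^(0) below B^-1 D^-2β so that some concatenation is useful.

```python
def logical_error(k, eta0, B, D, beta):
    """Concatenated logical error p^(k) = (B*eta_k)**(2**k) / B, assuming a
    scale-dependent physical error eta_k = eta0 * D**(beta*k).
    (For long-range noise, replace B by 2*e^(2+1/e)*B^2.)"""
    eta_k = eta0 * D**(beta * k)
    x = B * eta_k
    if x >= 1.0:       # effective error above threshold: no improvement
        return 1.0
    return x**(2**k) / B

# Fig. 6(b)-style parameters (eta0 chosen below B^-1 D^-2beta):
B, D, beta, eta0 = 1e4, 291, 0.5, 1e-8
errors = [logical_error(k, eta0, B, D, beta) for k in range(8)]
k_max = min(range(8), key=lambda k: errors[k])
print(k_max, errors[k_max])   # a finite optimum; larger k only hurts
```

For these parameters the error first falls with k and then rises again, giving a finite maximum useful concatenation level, exactly the "a little error correction is good, too much is bad" behavior described in the text.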
We then use the results in Appendix A to plot the minimum attainable logical error, p^(k_max), for this long-range noise in Fig. 6(c). While we took the rough estimate of D given above (D = 291) for the plots, we observed that the form of the curves in Figs. 6(b) and 6(c) is rather insensitive to the exact value of D.
We now turn to an example given in Ref. [20], in which it was assumed that the qubits are placed on a d-dimensional lattice, with the unwanted interaction between qubits at positions r_i and r_j given by Eq. (B2). Ref. [20] considered this model when the noise is not too long-ranged (z > d), so that ∆ remains finite as N → ∞. However, in many designs of quantum computer, one has circuit elements that perform two-qubit gates between physically distant qubits. Noise in such circuit elements could generate even longer-range noise than considered in Ref. [20]. Such noise may not always be a simple function of (r_i − r_j), but we can get a feel for such very long-range noise by taking Eq. (B2) with z ≤ d. If we assume that N ≫ 1 and that the nature of the lattice (i.e., its dimensionality, aspect ratio, etc.) is unchanged as we increase N, then ∆ ∝ N^(1−z/d) for z < d. This then gives Eq. (11) with β = (1 − z/d).
To calculate ∆^(0) from δ, one must take a concrete example, such as a chain of N_0 qubits in one dimension, or a √N_0 × √N_0 square lattice in two dimensions. Then ∆^(0) is given by the central qubit in the lattice (since it has the largest unwanted coupling to the other qubits), and in the limit N_0 ≫ 1 one gets the results in Table I. For z < d this has the scaling discussed in Sec. IV B with β = (1 − z/d).
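This scaling is easy to verify numerically. The sketch below (our own illustration, assuming the pair coupling in Eq. (B2) has the power-law form δ/|r_i − r_j|^z) sums the couplings seen by the central qubit of a 1D chain and checks that ∆ grows like N^(1−z/d) for z < d.

```python
def delta_central(N, z, delta=1.0):
    """Total unwanted coupling on the central qubit of a 1D chain of N
    qubits, assuming a power-law pair coupling delta / |i - j|**z
    (our stand-in for Eq. (B2))."""
    c = N // 2
    return sum(delta / abs(i - c)**z for i in range(N) if i != c)

z, d = 0.5, 1   # very long-ranged case, z < d
ratio = delta_central(40000, z) / delta_central(10000, z)
# Delta ~ N^(1 - z/d): quadrupling N should multiply Delta by 4**(1-z/d) = 2.
print(ratio)    # ~ 2
```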
In the special case of z = d, one has a k-dependence that coincides with that of the toy model in Sec. III A.

TABLE I.
Parameter dependence for the long-range interaction model with z ≤ d in Eq. (B2), for a d-dimensional array of qubits. We assume large N_0, so the sum over j in ∆ can be approximated by an integral. The z-dependent constant for the square lattice is C_z = ∫_0^(π/4) cos^(z−2)θ dθ. For z = d, there are order-one constants, κ_1 and κ_2, which can be neglected for large N_0. These constants come from the short-distance cutoff of the integral, and a precise calculation of them would require not approximating the sum by an integral.

Appendix C: Resonant qubit gates
We consider a two-level system (the qubit) with bare Hamiltonian H_0 = −(1/2)ω_0σ_z, embedded into a waveguide for light at the resonant frequency ω_0, for implementing gate operations on the qubit. Assuming the system is at 0 K, and neglecting pure dephasing, the physics is described by the optical Bloch equations, in which the only decoherence is spontaneous emission. The driving Hamiltonian is H_D(t) ≡ Ω h(t) cos(ω_0 t) σ_x, writable in the form given in the main text, H_D(t) → (1/2)Ω(t)(|0⟩⟨1|e^(iω_0 t) + |1⟩⟨0|e^(−iω_0 t)), under the RWA. The overall qubit dynamics follows the Lindblad equation given in the main text: ρ̇ = −i[H(t), ρ] + D(ρ), with H(t) ≡ H_0 + H_D(t) and the dissipator D(ρ) = γ(σ_− ρ σ_+ − (1/2){ρ, σ_+σ_−}). The gate on the qubit is accomplished by an incoming coherent light pulse of power P_in = ω_0 Ṅ_in, where Ṅ_in is the rate of incoming photons. The Rabi frequency induced by the pulse is Ω = 2(γṄ_in)^(1/2) [46]. It increases with γ, the spontaneous-emission rate, as both quantities measure the strength of the coupling between the qubit and the modes of the waveguide, which provide both the decay and driving channels. As we are considering energetic constraints, i.e., a limit on the total number of photons available to do gates, it is useful to express Ω in terms of the photon number n_g available for that gate. For H_D(t) describing a square pulse of duration τ with constant power and n_g available photons, Ṅ_in = n_g/τ.
In addition, to induce a rotation angle θ, we require Ωτ = θ, so that τ = θ/Ω. We thus have Ω = 4γn_g/θ and τ = θ^2/(4γn_g) when expressed in terms of the given n_g and θ. Observe that, for a target θ, larger input energy, i.e., larger n_g, enables faster gate operation. The Lindblad equation, together with the expressions for Ω and τ in terms of θ and n_g, describes the noisy implementation of a rotation of the qubit state by angle θ about the x axis of the Bloch ball, using the given energy ω_0 n_g. The noisy gate operation, G̃, obtained by integrating the Lindblad equation over the gate duration τ, is a linear map that takes the input qubit state ρ(t = 0) to the (noisy) output state ρ(t = τ). It can be written in terms of the ideal gate G as G̃ = G ∘ E, with E(ρ) = Σ_{α,β} χ_{αβ} σ_α ρ σ_β, where the χ_{αβ} are scalar coefficients. The coefficients χ_11 ≡ p_x, χ_22 ≡ p_y, and χ_33 ≡ p_z give the probabilities of X, Y, and Z errors, respectively, relevant for the 7-qubit code used in our discussion (the χ_{αβ} with α ≠ β do not affect the code performance; see, for example, Ref. [10]). A straightforward calculation gives (1/2)Tr{σ_α E(σ_α)} = χ_00 + χ_αα − Σ_{β≠0,α} χ_ββ, for α = 0, 1, 2, 3. We obtain p_x, p_y, and p_z by inverting these relations; θ = π corresponds to the commonly used gate G = X(·)X.

For the total energy of the gates in Shor's algorithm we find E_tot ∼ ω_0 R^2 n_L(R), where, as in the main text, n_L(R) is the number of photons required per logical gate operation for a given R (and hence a given target logical error probability p_err; see main text). n_L(R) can be read off Fig. 5(c) in the main article. We can also estimate the average power cost by assuming that all the logical gates in Shor's algorithm are run sequentially. Each logical gate is assumed to take M clock cycles per concatenation level; M = 3 for the scheme of Ref.
[7]. The duration of a logical gate with k levels of concatenation is thus τ_L = M^k τ_g, with the clock interval τ_g = π^2/(4γn_g) taken to be the duration of the π-pulse gate analyzed above. The power P associated with the energy E_tot can hence be estimated as E_tot/(Lτ_L). Some energetic numbers (orders of magnitude only) for the scheme of Ref. [7], with γ = 10 Hz and ω_0 = 10 GHz, are given in Table II.
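As a sanity check on these expressions, one can integrate the rotating-frame Lindblad equation directly. The sketch below is our own illustration (a dedicated solver such as QuTiP's mesolve would do the same): it evolves ρ under H = (Ω/2)σ_x with spontaneous emission at rate γ, using Ω = 4γn_g/θ and τ = θ^2/(4γn_g) for a π-pulse, and checks that the resulting gate error scales as 1/n_g.

```python
import numpy as np

def pi_pulse_error(gamma, n_g, steps=5000):
    """Gate error of a resonant pi-pulse driven with n_g photons.

    RK4 integration of the rotating-frame Lindblad equation
        drho/dt = -i[H, rho] + gamma*(sm rho sp - {sp sm, rho}/2)
    with H = (Omega/2) sigma_x, Omega = 4*gamma*n_g/theta, and
    tau = theta/Omega = theta**2/(4*gamma*n_g), for theta = pi.
    Basis: index 0 = ground, index 1 = excited."""
    theta = np.pi
    Omega = 4.0 * gamma * n_g / theta
    tau = theta / Omega
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sm = np.array([[0, 1], [0, 0]], dtype=complex)   # |g><e|
    sp = sm.conj().T
    n_op = sp @ sm                                   # |e><e|
    H = 0.5 * Omega * sx

    def rhs(rho):
        return (-1j * (H @ rho - rho @ H)
                + gamma * (sm @ rho @ sp - 0.5 * (n_op @ rho + rho @ n_op)))

    rho = np.array([[1, 0], [0, 0]], dtype=complex)  # start in |g>
    dt = tau / steps
    for _ in range(steps):                           # 4th-order Runge-Kutta
        k1 = rhs(rho)
        k2 = rhs(rho + 0.5 * dt * k1)
        k3 = rhs(rho + 0.5 * dt * k2)
        k4 = rhs(rho + dt * k3)
        rho = rho + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    # An ideal pi-pulse sends |g> to |e>; the error is the missing
    # excited-state population.
    return 1.0 - rho[1, 1].real

e100 = pi_pulse_error(gamma=1.0, n_g=100)
e400 = pi_pulse_error(gamma=1.0, n_g=400)
print(e100, e100 / e400)   # error ~ 1/n_g: the ratio is close to 4
```

The quadrupling of n_g reduces the error by roughly a factor of 4, consistent with the error probabilities being of order 1/n_g at leading order, as used in the main text.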
FIG. 3. The black dotted curves are a schematic of the conventional situation, where the physical error probability η is independent of the scale (the concatenation level k) of the computer. The red solid curves are a schematic of the situation considered in this work, where η grows with k (each red curve is for a different value of p^(0) ≡ η). If η is k-independent, standard FTQC analysis says that the error per logical gate p^(k) can be brought as close to 0 as desired by increasing k, provided one starts below the threshold (solid horizontal line) at k = 0. If η depends on k, then even if one starts below the threshold, p^(k) eventually turns around for large enough k; p^(k) cannot reach 0, there is a maximum concatenation level, and further increases in k only increase the logical error. All examples in this work have at most one minimum in each red curve, but in general a curve can have multiple minima.

FIG. 6. (a) Sketch of p^(k) (black curve), with integer k marked by black filled circles. Then p^(k_max) must lie between p^(k_st) and p^(k̃) (the gray region). (b) A plot of p^(k_max) with D = 291 and B = 10^4, for β = 0.1 (black), β = 0.5 (red), β = 1 (green), and β = 2 (blue). The filled circles mark η^(0) = B^-1 D^-2β for each β. For η^(0) ≥ B^-1 D^-2β, concatenation is useless, and p^(k_max) = η^(0) (solid lines). For η^(0) < B^-1 D^-2β, concatenation is useful, and p^(k_max) lies in the shaded region between the two dashed curves of the same color, the upper being p^(k̃) and the lower being p^(k_st). The two curves are close enough to give a good estimate of p^(k_max); indeed, for β < 0.5 the two curves are almost indistinguishable. (c) The same as (b), but for the long-range noise given by Eq. (B1). The curves look similar to (b), but the magnitudes on the axes are much smaller.

TABLE II .
Energetic bill for carrying out Shor's algorithm in our qubit-in-waveguide example, for R = 10^3, R = 10^5, and R = 10^7.

The error probabilities p_x, p_y, and p_z are obtained to leading order in 1/n_g; the largest of these, namely p_x, is what we set as η in the main text.