A high gain travelling-wave parametric amplifier based on three-wave mixing

We extend the theory for a Josephson junction travelling wave parametric amplifier (TWPA) operating in the three-wave mixing regime and we propose a scheme for achieving high gain. The continuous three-mode model [P. K. Tien, J. Appl. Phys. 29, 1347 (1958)] is on one hand extended to describe a discrete chain of Josephson junctions at high frequencies close to the spectral cutoff where there is no up-conversion. On the other hand, we also develop a continuous multimode theory for the low-frequency region where the frequency dispersion is close to linear. We find that in both cases the gain is significantly reduced compared to the prediction by the continuous three-mode model as the result of increasingly strong dispersion at the high frequencies and generation of up-converted modes at the low frequencies. The developed theory is in quantitative agreement with experimental observations. To recover the high gain, we propose to engineer a chain with dispersive features to form a two-band frequency spectrum and to place the pump frequency within the upper band close to the spectral cutoff. We prove that there exists a sweet spot, where the signal and the pump are phase matched, while the up-conversion is inhibited. This results in a high gain which grows exponentially with the length of the TWPA.


I. INTRODUCTION
Quantum limited parametric amplifiers [1] are important tools for measuring and monitoring states of superconducting qubits. Built with nonlinear, superconducting lumped element oscillators or transmission line resonators and demonstrating high gain and small added noise [2][3][4][5][6], the parametric amplifiers became an essential part of the circuit Quantum Electrodynamics (cQED) [7] toolbox.
To build a large-scale multiqubit quantum processor, an optimisation of qubit readout by multiplexing is desirable, which requires amplifiers with a large bandwidth, high gain and low added noise. Such a capability is provided by travelling wave parametric amplifiers (TWPA). During the last few years, the interest has rapidly grown in the development and investigation of the properties of different types of TWPA.
The amplification principle of the TWPA is based on the nonlinear interaction of a weak propagating signal with an intense co-propagating wave (pump), which under a phase-matching condition results in exponential spatial growth of the signal amplitude [8][9][10]. In the quantum regime, the TWPA is capable of generating signal squeezing and photon entanglement [11,12].
Amplifiers based on three-wave mixing (3WM) employ the lowest-order, cubic, nonlinearity of the inductive energy, which is similar to the $\chi^{(2)}$ nonlinearity in optical crystals. Such a nonlinearity is associated with broken time-reversal symmetry, which can be introduced by applying a dc current bias or a magnetic flux bias. The amplification occurs due to a down-conversion process, which is capable of providing efficient amplification within a large bandwidth in a weakly dispersive medium already at relatively small pump intensity [10]. An important property of this regime is the separation of the amplification band from the pump, as well as the possibility of both phase-preserving and phase-sensitive amplification.
In practice, however, the amplification performance of 3WM devices with weak frequency dispersion is compromised by the generation of pump harmonics [33] as well as signal and idler up-conversion [15,34].
Amplifiers based on four-wave mixing (4WM) employ the next-order, quartic, nonlinearity of the inductive energy, which is similar to the $\chi^{(3)}$ nonlinearity in optical fibers. Amplification in this regime is less efficient since it is a higher-order effect with respect to the pumping strength, and it also suffers from dephasing due to the Kerr effect, which makes exponential amplification impossible without dispersion engineering [23][24][25]. Furthermore, the pump position in the middle of the gain band is undesirable for certain applications.
Our quest in this paper is to investigate whether it is possible to realise in practice full exponential amplification in a TWPA using 3WM by avoiding the poisoning effect of up-converted modes.
At first glance, the discreteness of the Josephson junction chain offers a solution: placing the pump frequency close to the spectral cutoff, inherent in discrete chains, eliminates the up-converted modes. Our analysis based on the exact solution for the discrete chain shows that this is in principle possible. However, to overcome the effect of dispersion, which becomes increasingly strong in the vicinity of the spectral cutoff, a rather strong pump is required, which is unlikely to be realised in practice.
We propose to solve this difficulty by engineering a two-band frequency spectrum of the TWPA and placing the pump within the upper band close to the spectral cutoff. In this case, as we prove, there exists a sweet spot where the pump and the signal belong to different bands and are exactly phase-matched, while the generation of up-converted modes is inhibited since the pump is close enough to the cutoff. In the vicinity of this sweet spot, a rather broad window opens where strong exponential amplification occurs. The width of this window is limited by the dispersion and up-conversion effects (cf. the experimental observation in a kinetic inductance TWPA in [19]).
The structure of the paper is the following. In Section II we derive universal dynamic equations for three different kinds of TWPAs, that use either current biased junctions, or magnetic flux biased radio-frequency superconducting quantum interference devices (rf-SQUIDs) [29], or magnetic flux biased superconducting nonlinear asymmetric inductive elements (SNAILs) [31].
In Section III we derive the exact solution to the model containing only three modes involved in the downconversion, while neglecting up-converted modes, and we evaluate the exponential gain and the frequency region where the exponential gain exists. This part is a generalisation of the solution for a continuous medium in Ref. [10]. Here we also evaluate the pumping strength required for this model to be valid.
In Section IV we investigate the low frequency region with weak frequency dispersion and show that generation of up-converted modes makes it impractically hard to achieve high exponential gain (cf. [34]). We also compare our theoretical results with experimental data obtained on a SNAIL-based TWPA, and find a very good quantitative agreement.
In Section V we pursue the strategy of boosting the gain by engineering a two-band spectrum of the TWPA. We identify the sweet spots, where the high gain is achieved, for two TWPA designs: adding resonators to the unit cells [23][24][25], and periodically modulating the TWPA parameters [27]. The obtained results are summarised in Section VI.

II. TWPA DYNAMICAL EQUATIONS
The travelling-wave parametric amplifier we study is a chain of identical cells, each consisting of a block of Josephson elements, $L_J$, and a capacitor, $C$, as depicted in Fig. 1. The starting point for the description of the TWPA is the Lagrangian [35], Eq. (1), where $\phi_n(t)$ is the dynamical variable, the superconducting phase at node $n$, $\dot\phi_n(t)$ is its time derivative, $\Phi_0 = h/(2e)$ is the magnetic flux quantum, $\mathcal{L}_J$ is the Lagrangian of the Josephson junction block, and $\theta_n = \phi_n - \phi_{n-1}$ is the superconducting phase difference across the block. We consider three flavours of Josephson junction blocks suitable for 3WM. The simplest one consists of a single Josephson junction, Fig. 1b, with the Lagrangian in Eq. (2), where $C_J$ and $L_J = \hbar/(2eI_c)$ are the Josephson junction capacitance and inductance, respectively. To provide the 3WM mechanism, the Josephson junction has to be biased with a dc current, $I_{dc}$, which induces a constant shift of the phase difference across each cell, $\theta_0$, defined by the equation $\sin\theta_0 = I_{dc}/I_c$ (3). After including this constant shift in the phase difference, $\theta_n \to \theta_0 + \theta_n(t)$, a dynamical equation for the TWPA, Eq. (4), is derived by varying the Lagrangian with respect to the dynamical variable $\phi_n(t)$; it contains the resonance frequency $\omega_0$ of the current-biased $L_JC$ oscillator.
The linear terms with respect to $\theta_n$ in Eq. (4) define the spectral properties of propagating waves in the TWPA, while the nonlinear terms are responsible for the mixing processes. In the absence of the biasing dc current the 3WM term vanishes.
The TWPA dispersion relation is derived by assuming the solution to a linearised version of Eq. (4) to be a discrete analogue of a propagating wave with quasi-wave vector $\kappa$, $\phi_n(t) \propto e^{i(\kappa n - \omega t)}$, giving the relation $\omega = 2\omega_0\sin(\kappa/2)/\sqrt{1 + 4(C_J/C)\sin^2(\kappa/2)}$ (6). The dispersion relation has a cutoff at $\kappa_c = \pi$, where the frequency reaches the maximum value $\omega_c = 2\omega_0/\sqrt{1 + 4C_J/C}$ (7). The cutoff is related to the discrete nature of the TWPA, but it is also affected by the Josephson junction capacitance. In practice, however, the latter is usually small, $C_J \ll C$, and we will neglect it for most of the theoretical analysis throughout the paper, but keep it when comparing with experiments in Section IV D.
Assuming a small amplitude of the phase oscillation, $|\theta_n| \ll 1$, we expand the trigonometric functions in Eq. (4) up to the second order, thus retaining the 3WM term but omitting the higher-order 4WM term, to get Eq. (8).
An alternative solution for a 3WM TWPA is to use an rf-SQUID instead of a Josephson junction [29], as shown in Fig. 1c. An rf-SQUID includes an inductance $L$ in parallel with the Josephson junction, which allows replacing the dc current bias with a dc magnetic flux bias to achieve 3WM. The Lagrangian $\mathcal{L}_J$ in this case has the form of Eq. (9), where $F = 2\pi\Phi/\Phi_0$ is the normalised magnetic flux. The dc phase shift $\theta_0$ is now defined, in the absence of a net dc current through the rf-SQUID, by Eq. (10). The dynamical equation, Eq. (11), then takes a form similar to Eq. (8).
Another alternative design is to replace the rf-SQUID with a SNAIL circuit [31], as shown in Fig. 1d. This device can be viewed as a modification of a dc-SQUID with several series-connected junctions placed in one of the arms. The corresponding Lagrangian is given by Eq. (12), where $N$ refers to the number of identical junctions in the top arm in Fig. 1d. The biasing phase difference is defined, in the absence of a net dc current through the SNAIL, by Eq. (13). Proceeding with the derivation in the same way as for the single-junction TWPA, we get the dynamical equation for the SNAIL-TWPA, Eq. (14), with the coefficients defined in Eq. (15). It is worth noting at this point that it is not possible to employ an asymmetric dc-SQUID for 3WM instead of the SNAIL: in this case $N = 1$ in Eq. (15), and the nonlinear term in Eq. (14) vanishes by virtue of Eq. (13).
Comparing Eqs. (4), (11) and (14), we find that they all have a similar structure and can be written in a universal form, Eq. (16), where $\omega_0$ is the resonance frequency of the cell and $\chi_3$ is the 3WM nonlinear coefficient. Particular values of these quantities for the different TWPA designs are presented in Table I.
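The statement that an asymmetric dc-SQUID ($N = 1$) gives no 3WM can be checked numerically. The sketch below is only illustrative and rests on assumptions that are not taken from the equations of this section: it assumes a SNAIL potential of the form $U(\theta) = -\alpha\cos\theta - N\cos((\theta - F)/N)$ in units of the Josephson energy of the large junctions, with $\alpha$ the ratio of critical currents, and it takes the 3WM coefficient to be $|U'''/U''|$ at the zero-current bias point. The function name `snail_chi3` and the sign convention for $F$ are hypothetical.

```python
import math

def snail_chi3(alpha, n_junctions, flux_phase):
    """|U'''/U''| at the zero-current bias point of an ASSUMED SNAIL potential.

    Assumed potential (units of the large-junction Josephson energy):
        U(theta) = -alpha*cos(theta) - n*cos((theta - F)/n)
    alpha      : ratio of small to large critical current (assumption)
    flux_phase : F = 2*pi*Phi/Phi_0
    """
    n = n_junctions

    def current(theta):
        # U'(theta): net dc current through the element, which must vanish
        return alpha * math.sin(theta) + math.sin((theta - flux_phase) / n)

    lo, hi = 0.0, math.pi
    if current(lo) * current(hi) > 0:
        raise ValueError("no sign change of the current on [0, pi]")
    for _ in range(200):  # plain bisection for the bias phase theta_0
        mid = 0.5 * (lo + hi)
        if current(lo) * current(mid) <= 0:
            hi = mid
        else:
            lo = mid
    theta0 = 0.5 * (lo + hi)

    arm = (theta0 - flux_phase) / n
    u2 = alpha * math.cos(theta0) + math.cos(arm) / n        # U''
    u3 = -alpha * math.sin(theta0) - math.sin(arm) / n ** 2  # U'''
    return abs(u3 / u2)

if __name__ == "__main__":
    flux = 2 * math.pi * 0.4  # Phi = 0.4 Phi_0
    print(snail_chi3(0.8 / 3.0, 1, flux))  # asymmetric dc-SQUID: vanishes
    print(snail_chi3(0.8 / 3.0, 3, flux))  # three-junction SNAIL: finite
```

For $N = 1$ the third derivative is minus the net current itself, so it vanishes identically at the bias point; for $N > 1$ the zero-current condition leaves a residual factor $1 - 1/N^2$. With the critical currents and flux quoted in Section IV D, this assumed model gives a coefficient close to the value of 0.82 stated there.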

III. 3-MODE MODEL
The amplification in a TWPA results from the process of resonant down-conversion, when the frequencies of the three interacting waves, i.e. pump, signal, and idler, obey the resonance condition, $\omega_p = \omega_s + \omega_i$. Besides this process, which is necessary for amplification, the nonlinear term in Eq. (16) generates a large set of up-converted modes of all the waves involved in the amplification and their combinations. These up-conversion processes significantly degrade the performance of the amplifier, as we will show in the next section.
To reveal the full amplification potential of the 3WM mechanism, we consider an ideal model where only the three waves participating in the down-conversion are taken into account, while all the up-converted modes are neglected. In this model, the field in the TWPA chain consists of a linear combination of three partial harmonic tones, Eq. (17), whose amplitudes satisfy an equation that follows from Eq. (16), namely Eq. (18), where the integration is done over the period of the corresponding mode. For the pump amplitude, the right-hand side of Eq. (18) only contains the resonant products, $A_{s,n}A_{i,n}$, which are responsible for the pump depletion. For the amplifiers employed for qubit measurements, an input signal is typically $\sim 1$ nA, which is two orders of magnitude smaller than the pump current, which is typically a considerable fraction of the critical current, $I_c \sim 1$ µA. Thus, for a power gain up to the order of 40 dB the signal and idler remain weak compared to the pump within the whole TWPA chain, $A_{s,n}, A_{i,n} \ll A_{p,n}$, and we will neglect their effect on the pump. As a result, Eq. (18) for the pump becomes linear and has a free propagating wave solution, $A_{p,n} = A_p e^{-i\kappa_p n}$, with the dispersion relation $\omega_p = 2\omega_0\sin(\kappa_p/2)$ (19). The dispersion relation has a cutoff at $\omega_c = 2\omega_0$, $\kappa_c = \pi$, and is strongly dispersive in the vicinity of the cutoff. In the long-wave limit, $\kappa_p \ll 1$, the dispersion relation becomes linear, $\omega_p \approx \omega_0\kappa_p$.
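The cutoff and the long-wave limit quoted above follow directly from the free-wave dispersion relation with the junction capacitance neglected, $\kappa = 2\arcsin(\omega/(2\omega_0))$, the same form used in Section IV A. The short sketch below evaluates it; the function names are hypothetical, and the cubic correction $(\omega/\omega_0)^3/24$ is simply the next term of the Taylor expansion of the arcsine.

```python
import math

def quasi_wave_vector(omega, omega0=1.0):
    """kappa(omega) for the discrete chain with C_J neglected."""
    if not 0.0 <= omega <= 2.0 * omega0:
        raise ValueError("frequency outside the band [0, 2*omega0]")
    return 2.0 * math.asin(omega / (2.0 * omega0))

def quasi_wave_vector_cubic(omega, omega0=1.0):
    """Low-frequency expansion: linear term plus the cubic correction."""
    w = omega / omega0
    return w + w ** 3 / 24.0

if __name__ == "__main__":
    print(quasi_wave_vector(2.0) / math.pi)  # cutoff: kappa_c = pi
    print(quasi_wave_vector(0.2), quasi_wave_vector_cubic(0.2))
```

At $\omega = 0.2\,\omega_0$ the cubic expansion already agrees with the exact expression to better than $10^{-5}$, which is why the quasilinear treatment of Section IV can keep only the lowest-order dispersive correction.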
Proceeding to the equations for the signal and idler, we find that the only resonant contributions come from the products $A_{p,n}A^*_{i,n}$ for the signal and $A_{p,n}A^*_{s,n}$ for the idler. This implies that the equations form a linear set. The solution ansatz, Eq. (20), contains an unknown quasi-wave vector $\tilde\kappa$. The corresponding equations in Eq. (18) then reduce to an algebraic equation, Eq. (21), for the spatio-temporally independent amplitude $A_s$, and a similar equation for $A_i$ obtained by the corresponding replacements. The quasi-wave vector $\tilde\kappa$ is found from the solvability condition for Eq. (21) and the equation for $A_i$, i.e. from the condition that the determinant of the system equals zero. After some algebra, the resulting equation can be presented in the form of Eq. (22). A solution to this equation generally has an imaginary part that describes the amplification effect. We quantify the amplification with the gain coefficient $g = \mathrm{Im}(\tilde\kappa/2)$, which characterises the amplitude gain per unit cell and is related to the power gain of the TWPA as $G = e^{2gN}$ (23), where $N$ is the number of unit cells. The exact numerical solution for $g$ at the degeneracy point, $\omega_s = \omega_i = \omega_p/2$, is presented in Fig. 2a for different pumping strengths. The pumping strength is characterised by the quantity $\varepsilon = |\chi_3|\theta_p$ (24), where $\theta_p$ is the amplitude of oscillation of the phase difference across the cell associated with the pump amplitude at the node, $A_p$. At small frequencies, $\omega_p \ll \omega_c$, $\kappa_p \ll 1$, the gain coefficient is small and grows linearly with frequency. It reaches a maximum and then sharply drops to zero at the gain cutoff frequency, $\omega_p = \Omega_c$. The maximum gain and $\Omega_c$ depend on the pumping strength as illustrated in Fig. 3. As seen from Fig. 2a, the maximum gain at large frequencies is quite large: for pumping strength $\varepsilon \approx 0.4$, the gain coefficient reaches the value $g \approx 0.06$, which translates to the power gain $G \approx 26$ dB for a chain with $N = 50$ cells.
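The conversion from the gain coefficient per cell to a power gain in decibels can be checked with a one-line calculation. The sketch below assumes that the power gain grows as $e^{2gN}$; this form is an assumption chosen because it reproduces the figure of about 26 dB quoted for $g \approx 0.06$ and $N = 50$. The function name is hypothetical.

```python
import math

def power_gain_db(gain_per_cell, n_cells):
    """Power gain in dB, ASSUMING G = exp(2 * g * N)."""
    return 10.0 * math.log10(math.exp(2.0 * gain_per_cell * n_cells))

if __name__ == "__main__":
    # Values quoted in the text: g ~ 0.06 per cell, N = 50 cells
    print(round(power_gain_db(0.06, 50), 1))
```

Under this assumption every increment of $gN$ by one adds about 8.7 dB, so a target of 20 dB corresponds to $gN \approx 2.3$.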
A numerical solution of Eq. (22) for the gain coefficient as a function of detuning is presented in Fig. 2b with solid lines. The plots are made for the pump frequency values corresponding to maxima of the gain coefficients in Fig. 2a for the same pump intensities.
To better understand the gain properties, we derive an approximate analytical solution to Eq. (22). To this end we cast $\tilde\kappa$ in the form of Eq. (25), where the quasi-wave vectors $\kappa_{s,i}$ are related to the signal and idler frequencies via a dispersion relation similar to Eq. (19). We also define the detuning from the degeneracy point, $\delta$, and the phase mismatch, $\Delta(\delta)$. At weak coupling, $\varepsilon \ll 1$, the gain coefficient is small, as is also seen in Fig. 2a, in comparison with the wave vectors $\kappa_p \sim \kappa_s \sim \kappa_i$, which are of order unity at high frequencies, as given by the dispersion relation in Eq. (6). On the other hand, the phase mismatch that is sufficient to suppress the gain is also small. Therefore, both $g$ and $\Delta(\delta)$ are small additive corrections to $\kappa_{p,s,i}$ and can be omitted from the corresponding terms in Eq. (22). This approximation yields the explicit solution for $g$, Eq. (28). The obtained solution is a generalisation, to a discrete chain and arbitrary frequency, of the result obtained in [10] for a continuous medium. One can see from Eq. (28) that it is the competition between the nonlinear coupling controlled by the intensity of the pump (the second term on the right) and the phase mismatch (the first term on the right) that determines whether $g$ is real or imaginary, i.e., whether exponential amplification occurs or not. The sharp drop of the gain is explained by the increasingly strong dispersion near the cutoff frequency. At zero detuning, Eq. (28) reduces to the form of Eq. (29), where $\Delta = \kappa(\omega_p) - 2\kappa(\omega_p/2)$ is the phase mismatch between the frequency points $\omega_p$ and $\omega_p/2$. This solution is represented in Fig. 2 with dashed lines.
Equation (22) and its approximate analytical solution, Eq. (28), together with the numerical solution presented in Fig. 2, constitute the first main result of this paper.
The three-mode model is justified only when up-conversion is absent. For the pump, this requires that the second pump harmonic falls above the cutoff, $2\omega_p > \omega_c = 2\omega_0$. For the signal and idler, the lowest bound is established by the condition that the up-converted signal at zero detuning falls above the cutoff, $\omega_p/2 + \omega_p > 2\omega_0$. This yields a more stringent constraint, $\omega_p > \Omega_{th} = 4\omega_0/3$. At pump frequencies larger than this threshold value, the detuned signal and idler are not up-converted within the band defined by Eq. (31). This situation is illustrated in Fig. 4. Therefore we conclude that the three-mode model considered is justified when the gain cutoff frequency exceeds the threshold for no up-conversion, $\Omega_c(\varepsilon) > \Omega_{th}$, and within the bandwidth in Eq. (31). This condition imposes the lowest bound for the required pumping strength. An accurate estimate of the lowest bound is extracted from the numerical solution to Eq. (22), giving the bound in Eq. (32). To obtain an analytical estimate we set $g = 0$ in Eq. (29), which gives Eq. (33). At $\omega_p = \Omega_{th}$, $\kappa_p \approx 0.46\pi$, and for zero detuning, $\delta = 0$, we have $\kappa_s \approx 0.22\pi$, giving $\Delta \approx 0.03\pi$. This results in the bound for the pumping strength, $\chi_3|A_p| > 0.20$, or $\varepsilon \gtrsim 0.27$, which slightly overestimates the exact result from Eq. (32).
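The numbers quoted for the no-up-conversion threshold can be reproduced from the dispersion relation alone. The sketch below makes two assumptions that are not spelled out in the surviving text: the phase mismatch at zero detuning is taken as $\kappa_p - 2\kappa_s$, and the pump phase difference across a cell is taken as $\theta_p = 2|A_p|\sin(\kappa_p/2)$, so that $\varepsilon = \chi_3\theta_p$. All names are hypothetical.

```python
import math

OMEGA0 = 1.0  # frequencies are measured in units of omega_0

def kappa(omega):
    """Free-wave quasi-wave vector, C_J neglected."""
    return 2.0 * math.asin(omega / (2.0 * OMEGA0))

def threshold_numbers(chi3_amp=0.20):
    """Quantities at the threshold pump frequency Omega_th = 4*omega_0/3."""
    omega_th = 4.0 * OMEGA0 / 3.0          # from 3*omega_p/2 > 2*omega_0
    kappa_p = kappa(omega_th)
    kappa_s = kappa(omega_th / 2.0)        # degenerate signal and idler
    mismatch = kappa_p - 2.0 * kappa_s     # ASSUMED definition of Delta
    # ASSUMED link between node amplitude and phase difference across a cell
    epsilon = chi3_amp * 2.0 * math.sin(kappa_p / 2.0)
    return {"kappa_p": kappa_p, "kappa_s": kappa_s,
            "mismatch": mismatch, "epsilon": epsilon}

if __name__ == "__main__":
    for name, value in threshold_numbers().items():
        print(name, value if name == "epsilon" else value / math.pi)
```

With these assumptions the sketch returns $\kappa_p \approx 0.46\pi$, $\kappa_s \approx 0.22\pi$, $\Delta \approx 0.03\pi$ and $\varepsilon \approx 0.27$, in line with the values in the text.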
The crucial question is now whether the required coupling strength can be experimentally achieved with a feasible pump intensity. Let us first consider the dc current biased TWPA. The pumping strength here is limited by the switching of the Josephson junctions to the resistive branch. In the quantum limit, the maximum dc supercurrent that the junction can sustain corresponds to the disappearance of the last quantized energy level from the well of the tilted Josephson potential. The maximum dc supercurrent can be crudely estimated from the relation $\hbar\omega_{pl}/2 \sim \Delta U$, where $\omega_{pl} = \sqrt{\cos\theta_0/(L_JC_\Sigma)}$ is the effective plasma frequency of the junction, $C_\Sigma = C/2 + C_J \approx C/2$, and $\Delta U \approx (2E_J/3)\cos^3\theta_0$ is the depth of the well of the tilted Josephson potential. Impedance matching of the TWPA with the transmission line, $\sqrt{L_J/(C\cos\theta_0)} = Z_0$, gives for the maximum current $\cos^2\theta_0 \sim 3\pi\sqrt{2}Z_0/R_q$, where $R_q = h/(2e^2)$ is the quantum resistance; this corresponds to $I_{dc} \sim 0.97I_c$. Assuming that the quasi-classical tunnelling rate, $\Gamma \propto \exp(-7.2\Delta U/\hbar\omega_{pl})$ [36], is valid for two to three quantized energy levels in the well [37], and also taking into account the experimental observations, e.g. in Ref. [38], we may safely assume for the switching current the value $I_{sw} \approx 0.9I_c$. When the pump is on and has a small frequency, $\omega_p \ll \omega_{pl}$, the instantaneous adiabatic current consists of the dc biasing current, $I_{dc}$, and the pump ac current, $I_p$, and their sum should not exceed the switching current, $I_{dc} + I_p < 0.9I_c$. The maximum pumping strength under this constraint is $\varepsilon = 0.28$, which is achieved at $I_{dc} = 0.63I_c$ and $I_p = 0.27I_c$. Although this pumping strength is above the bound in Eq. (32), the frequency window where the model is valid is very small, $\Omega_c - \Omega_{th} \approx 0.06\omega_0$, Fig. 4, making the amplification bandwidth unacceptably narrow.
Furthermore, the corresponding pumping current is too large given the theory constraint, $I_p \ll I_c$; lower current values, $I_p \sim 0.1I_c$, would be more feasible. In addition, a spread of the junction parameters in a real TWPA would also reduce the estimated pumping strength.
More importantly, however, the relevant ac regime is non-adiabatic: the pump frequencies above the no-up-conversion threshold, $\omega_p > \Omega_{th} \approx 1.33\omega_0$, are close to and even higher than the plasma frequency, $\omega_{pl} = \sqrt{2}\omega_0 \approx 1.41\omega_0$. In this regime, resonant excitation facilitates tunnelling (especially due to multiphoton processes at large pump amplitude); therefore the biasing current should be even smaller than in the adiabatic regime, and hence the pumping strength would be further reduced.
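The two current estimates for the dc-biased junction above can be reproduced with a short calculation. The sketch rests on assumptions that are not given explicitly in the surviving text: a line impedance $Z_0 = 50\ \Omega$, the 3WM coefficient of the current-biased junction taken as $\chi_3 = \tan\theta_0$ with $\sin\theta_0 = I_{dc}/I_c$, and the pump phase amplitude taken as $\theta_p = I_p/(I_c\cos\theta_0)$, so that $\varepsilon = \chi_3\theta_p$. The function names are hypothetical.

```python
import math

PLANCK_H = 6.62607015e-34           # J s
ELEMENTARY_CHARGE = 1.602176634e-19  # C

def max_bias_fraction(z0_ohm=50.0):
    """I_dc/I_c from cos^2(theta_0) ~ 3*pi*sqrt(2)*Z_0/R_q (Z_0 assumed)."""
    r_q = PLANCK_H / (2.0 * ELEMENTARY_CHARGE ** 2)
    cos_sq = 3.0 * math.pi * math.sqrt(2.0) * z0_ohm / r_q
    return math.sqrt(1.0 - cos_sq)   # sin(theta_0)

def best_pumping_strength(i_switch=0.9, n_grid=9000):
    """Scan I_dc (units of I_c) with I_dc + I_p = i_switch; maximise epsilon.

    ASSUMED model: epsilon = tan(theta_0) * I_p / (I_c * cos(theta_0)),
    which equals i_dc * i_p / (1 - i_dc**2) when sin(theta_0) = i_dc.
    """
    best = (0.0, 0.0, 0.0)
    for j in range(1, n_grid):
        i_dc = i_switch * j / n_grid
        i_p = i_switch - i_dc
        eps = i_dc * i_p / (1.0 - i_dc ** 2)
        if eps > best[0]:
            best = (eps, i_dc, i_p)
    return best  # (epsilon, I_dc/I_c, I_p/I_c)

if __name__ == "__main__":
    print(max_bias_fraction())
    print(best_pumping_strength())
```

Under these assumptions the scan gives a maximum close to $\varepsilon = 0.28$ at $I_{dc} \approx 0.63I_c$ and $I_p \approx 0.27I_c$, and the impedance-matching estimate gives $I_{dc} \approx 0.97I_c$, consistent with the values in the text.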
In the case of an rf-SQUID TWPA, the maximum nonlinear coupling is achieved at $\theta_0 = \pi/2$ [29], when $\chi_3 = L/L_J$. For a non-hysteretic regime, $L/L_J < 1$, the combination of this constraint with a small amplitude of the phase oscillation, $\theta_p < 0.1$, results in a pumping strength $\varepsilon < 0.1$, which is below the threshold, Eq. (32). A similar argument also applies to the SNAIL-TWPA.
To summarise, we conclude that the 3-mode amplification regime, which would provide high exponential gain at high frequencies, $\omega_p \sim \omega_0$, cannot be realised in practice with any of the TWPA designs considered here. The desired condition, $\Omega_c(\varepsilon) > \Omega_{th}$, cannot be fulfilled because of the small values of the nonlinear 3WM coefficients in realistic devices and the limited pumping current.
To go beyond the studied 3-mode model, two strategies can be followed. One is to consider lower frequencies and include the up-converted modes in the model, while the other is to keep the 3-mode model but consider dispersion engineering at high frequencies. In the next section we will discuss the first option in detail.

IV. QUASILINEAR DISPERSION REGIME
In this section we analyse TWPA performance in the low-frequency region, $\omega \ll \omega_0$, $\kappa \ll \pi$, where the dispersion relation is quasi-linear. The limit of the continuous medium is natural for the kinetic inductance TWPA, but it is also considered in most publications devoted to Josephson junction TWPAs.
In this limit the discrete chain of Josephson junctions is described with a continuous variable, $an \to x$, $\phi_n \to \phi(x)$, where $a$ is the physical length of the unit cell. The difference equation, Eq. (16), then turns into a differential equation, Eq. (34), where $\hat k = -i\partial_x$. Keeping the lowest-order terms with respect to $a$ in the expansion of the interaction term, we write Eq. (34) in the form of Eq. (35). This dynamical equation (and a similar one for 4WM) is a standard object of study in the TWPA literature.
To analyse the resonant wave dynamics described by this equation one has to take into account, in addition to the down-conversion discussed in the previous section, the processes which are efficient in a weakly dispersive medium: (i) generation of pump harmonics with frequencies $n\omega_p$ [33], and (ii) up-conversion of the signal and the down-converted idler by the pump tone and its harmonics, $\omega_{s,i} + n\omega_p$ [34]. The processes that can be neglected for a weak signal include pump depletion and the generation of signal and idler harmonics and their intermodulation products.

A. Pump harmonics
Let us first consider pump harmonic generation. When a pump tone is injected, the field in the TWPA has the form of Eq. (36), where $k$ is a wave vector related to $\omega$ via the free-wave dispersion relation, $ka = 2\arcsin(\omega/(2\omega_0))$, and $M$ is the number of harmonics included in the computation (see below). A slow variation of the amplitudes of the harmonics accounts for the effect of the nonlinear interaction. Substituting Eq. (36) in Eq. (35), we get a set of $M$ coupled nonlinear equations for the pump harmonics, Eq. (37), where the prime signifies the spatial derivative, $A' = dA/dx$, $\Delta_{n,m} = k_{np} - k_{mp} - k_{(n-m)p}$, and $m \in \{1, ..., M\}$ is the harmonic number. In this equation, the first sum describes coupling to the higher harmonics, while the second sum describes coupling to the lower harmonics, as illustrated in Fig. 5. The factor 1/2 in the second sum accounts for the double counting in the sum, e.g. $n = 1$ and $n = m - 1$.
When the number of harmonics is restricted to $M = 2$, an analytical solution is available [33], which shows that for a linear dispersion the injected tone is fully converted to the second harmonic over a length inversely proportional to the amplitude of the injected signal. In the presence of dispersion, both harmonics exhibit oscillatory behaviour (see the blue curve in Fig. 7) while preserving the quantity $|A_p(x)|^2 + 4|A_{2p}(x)|^2 = \text{constant}$. Analytical solutions are also available for a larger number of harmonics, but their behaviour becomes rather complex, so we resort to numerics. We perform a numerical study under the assumption that all relevant harmonics have frequencies well below the cutoff. For a weak dispersion the pre-exponential factors in Eq. (37) can be approximated with linear functions, $k_{mp}a = m\omega_p/\omega_0$, while the exponential dephasing factors are approximated with the lowest-order (cubic) corrections. The latter can be expressed through the dephasing $\Delta \approx k_p^3a^2/32$ defined in Eq. (26). To compute and analyse the solutions to Eq. (37), it is convenient to introduce a dimensionless spatial coordinate, $\xi$, and rescaled harmonic amplitudes, Eq. (39), which involve the pumping strength $\varepsilon$ defined in Eq. (24), taken in its low-frequency limit. In the resulting equations, Eq. (40), the spatial behaviour of all harmonics is described with a single scaling parameter, $\mu$, Eq. (41). This parameter is proportional to the ratio of the dephasing and the nonlinear pumping strength, and has a clear physical meaning: it is the interplay between the dispersion and the nonlinear coupling (the pumping strength) that defines the behaviour of the harmonics. The differential equations in Eq. (40) are solved numerically for different values of $\mu$ and $M$ using the MATLAB function ode45, which implements an adaptive Runge-Kutta method of 4th and 5th order. The spatial dependence of the solutions truncated at $M = 5$ is shown in Fig. 6 for $\mu = 2$.
All the harmonics oscillate, but while the amplitudes of the first and second harmonics are substantial, the amplitudes of the higher harmonics, $m = 3, 4, 5$, are progressively smaller. Correspondingly, the effect of the latter on the main pump tone is small, as illustrated in Fig. 7 for $\mu = 2$ and different values of $M$. Here the solution for the main pump tone coupled to three harmonics clearly differs from the one coupled to two harmonics, but the more harmonics are included, the smaller the effect they have on the solution for the main pump tone. The solution for the main pump tone appears to converge at $M \gtrsim 5$.
Our numerical studies show that the number of large-amplitude harmonics depends on the value of $\mu$: it increases when $\mu$ decreases, which corresponds to a weaker dispersion or stronger pumping. Furthermore, for a given $\mu$, the amplitudes of harmonics with numbers exceeding a certain critical value, $m > M_c$, become negligible on a given length, as illustrated in Fig. 8. One can hence truncate Eq. (40) at $M = M_c(\mu)$ to accurately compute the solution for the main pump tone. The result of such a study is presented in Fig. 9, where we solve Eq. (40) for selected values of $\mu$, truncating at the values of $M_c$ indicated in Fig. 8. As seen in Fig. 9, the pump behaviour can in general be summarised as follows: the pump oscillates between full transmission and some lower bound. The larger the $\mu$, the smaller the oscillation amplitude and period.
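The qualitative behaviour of the two-harmonic case ($M = 2$) can be illustrated with a generic coupled-mode model. The sketch below is not the paper's Eq. (37) or Eq. (40), which are not reproduced here; it uses the simplest pair of second-harmonic-generation equations whose coefficients are chosen, by assumption, so that $|a_1|^2 + 4|a_2|^2$ is conserved, as stated above. The parameter `mismatch` is a dimensionless phase mismatch in this toy normalisation and is not claimed to equal the scaling parameter $\mu$. Integration uses a fixed-step fourth-order Runge-Kutta scheme rather than ode45.

```python
import cmath

def shg_rhs(xi, a1, a2, mismatch):
    """Toy two-mode equations (ASSUMED form, conserves |a1|^2 + 4|a2|^2)."""
    phase = cmath.exp(1j * mismatch * xi)
    da1 = -a1.conjugate() * a2 * phase        # pump feeds the 2nd harmonic
    da2 = 0.25 * a1 * a1 * phase.conjugate()  # 2nd harmonic driven by pump^2
    return da1, da2

def integrate_shg(mismatch, xi_max=20.0, step=0.01):
    """Fixed-step RK4 from a1 = 1, a2 = 0; returns [(xi, a1, a2), ...]."""
    a1, a2 = 1.0 + 0.0j, 0.0 + 0.0j
    history = [(0.0, a1, a2)]
    for i in range(int(round(xi_max / step))):
        xi = i * step
        k1 = shg_rhs(xi, a1, a2, mismatch)
        k2 = shg_rhs(xi + step / 2, a1 + step / 2 * k1[0],
                     a2 + step / 2 * k1[1], mismatch)
        k3 = shg_rhs(xi + step / 2, a1 + step / 2 * k2[0],
                     a2 + step / 2 * k2[1], mismatch)
        k4 = shg_rhs(xi + step, a1 + step * k3[0],
                     a2 + step * k3[1], mismatch)
        a1 += step / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        a2 += step / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        history.append((xi + step, a1, a2))
    return history

def invariant(a1, a2):
    """Quantity expected to stay constant along the line."""
    return abs(a1) ** 2 + 4 * abs(a2) ** 2

if __name__ == "__main__":
    for mismatch in (0.0, 3.0):
        run = integrate_shg(mismatch)
        print(mismatch, min(abs(a1) for _, a1, _ in run))
```

Without mismatch the pump in this toy model decays monotonically, following $a_1 = \mathrm{sech}(\xi/2)$, which is the full-conversion behaviour described for linear dispersion. With a finite mismatch the pump amplitude only oscillates slightly below its input value.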

B. Comparison with four-wave mixing
It is instructive to compare the pump harmonic generation studied in the previous section to the pump harmonic generation by 4WM, where the phase mismatch introduced by the Kerr effect prevents exponential amplification. One should anticipate that a similar mechanism would suppress the generation of high pump harmonics [33]. We test this assumption by computing the third harmonic of the pump using the developed scaling method.
The equations for the pump harmonics are derived in a similar way as for 3WM. Restricting to the third harmonic, we obtain a pair of coupled equations in which $\chi_4$ is the fourth-order nonlinearity coefficient, derived in a similar way to $\chi_3$ in Section II. The value of $\chi_4$ is 1/2 for a junction TWPA, while it is typically smaller for an rf-SQUID or a SNAIL TWPA at zero bias. With rescalings similar to those in Section IV A, we write these equations in a dimensionless form, in which the spatial behaviour of both harmonics is defined by a single scaling parameter, $\mu$. We solve Eq. (42) numerically for different values of $\mu$; the result is presented in Fig. 10. In contrast to 3WM, where the first harmonic is completely depleted in the absence of frequency dispersion, $\mu = 0$ (blue line), it oscillates for 4WM (yellow line), similar to 3WM at a considerable dispersion, $\mu = 3$ (orange line). Thus we conclude that the Kerr effect, caused by the $\chi_4$ term, not only prevents exponential amplification but also suppresses the generation of higher harmonics, which makes up-conversion processes less dangerous for 4WM amplification compared to 3WM.

C. Full multimode model
Now we proceed with the discussion of the amplification in the 3WM regime and assume that a weak signal tone is injected in addition to the strong pump tone, $A_s(0) \ll A_p(0)$. We perform our analysis under the same assumption as in Section III, namely that the amplitudes of the signal and idler remain small compared to the pump amplitude, $A_p(x)$, within the whole TWPA. This assumption allows us to neglect the back-action of the signal and idler on the pump (pump depletion), and also to neglect the generation of signal and idler harmonics while allowing the up-conversion of the signal and idler by the pump harmonics. The latter assumption implies a linearisation of the equations with respect to all signal and idler modes.
With the adopted approximations, the field in the TWPA consists of a linear combination of the pump and pump harmonics, $m\omega_p$, the signal, $\omega_s$, the down-converted idler, $\omega_i = \omega_p - \omega_s$, and all their modes up-converted by the pump and pump harmonics, $\omega_{s+mp} = \omega_s + m\omega_p$, $\omega_{i+mp} = \omega_i + m\omega_p$, see Fig. 11. The ansatz is therefore a sum over all these modes, with $m \in [0, M-1]$. In the resulting equations for the signal modes, Eq. (47), the numerical factors in the exponents are derived using the cubic approximation of the dispersion relation and are expressed through $\Delta$, defined in Eq. (27). The equations for the idler modes are similar, but with the replacements $s \leftrightarrow i$ and $\delta \leftrightarrow -\delta$.
For $M = 1$, Eq. (47) reduces to the continuous analogue of the 3-mode model in Section III. The solution has an exponential form, $a_s, a_i \propto e^{g\xi}$, with the gain coefficient of Ref. [10], Eq. (50), which coincides with the long-wavelength asymptote of Eq. (28). The exponential gain occurs only for small values of the scaling parameter, $\mu < 8/\sqrt{1 - \delta^2}$. We solve Eq. (47) numerically in the same way as for the pump harmonics, including the spatial dependence of the pump harmonics. The results for the power gain as a function of length, $G(\xi) = |a_s(\xi)/a_{s0}|^2$, are shown in Figs. 12 and 13 for $\mu = 1$ and $\mu = 10$ at zero detuning ($\delta = 0$). These results demonstrate two distinctly different amplification regimes. For small values of $\mu$, the gain grows on average with the TWPA length, but much more slowly than expected from the 3-mode model, Eq. (50). The gain suppression is the result of up-conversion, with many up-converted modes affecting the signal. In Fig. 12, the gain reaches the value $G \sim 20$ dB at scaled length $\xi \approx 30$. In the opposite case of large $\mu$, the gain profile converges quickly, and the number of modes included in the simulation is small. Correspondingly, the gain spatial profile becomes oscillatory and has a relatively small amplitude, Fig. 13. This reduction of the gain is the effect of the phase mismatch, in accord with the criterion of non-exponential amplification in the three-mode model, $\mu > 8$.
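The three-mode criterion quoted above, $\mu < 8/\sqrt{1 - \delta^2}$, is easy to evaluate for the two cases discussed. The sketch below only encodes that inequality and its algebraic inversion; the function names are hypothetical, and the detuning is assumed to be normalised so that $|\delta| < 1$.

```python
import math

def mu_threshold(delta):
    """Largest mu with exponential gain in the 3-mode criterion (|delta| < 1)."""
    if abs(delta) >= 1.0:
        raise ValueError("detuning must satisfy |delta| < 1")
    return 8.0 / math.sqrt(1.0 - delta ** 2)

def is_exponential(mu, delta=0.0):
    """True if the 3-mode criterion predicts exponential gain."""
    return mu < mu_threshold(delta)

def gain_band_edge(mu):
    """Smallest |delta| satisfying the criterion (inverting the inequality)."""
    if mu < 8.0:
        return 0.0  # criterion already satisfied at zero detuning
    return math.sqrt(1.0 - 64.0 / mu ** 2)

if __name__ == "__main__":
    print(is_exponential(1.0), is_exponential(10.0))
    print(gain_band_edge(10.0))
```

For $\mu = 1$ the criterion is satisfied at zero detuning, while for $\mu = 10$ it is not, which matches the oscillatory, low-gain profile described for Fig. 13. Inverting the inequality for $\mu = 10$ shows that it would only be met for $|\delta| > 0.6$.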
While one should not expect the gain larger than few dB for TWPA with any length when µ is large, for small µ the gain grows with the length and interesting question is what gain can be achieved for realistic Josephson junction TWPA. The limitation is imposed by the necessity to accommodate all relevant modes within the spectral range. For the ten pump harmonics involved, as in Fig. 12, the pump frequency should be limited, ω p < 0.1ω c , which corresponds to ka < 0.2. From Eqs. (39) and (41), we then deduce the real space length, x ∼ 500aξ = 15000 unit cells. Such a long TWPA is unpractical.
Summarising, we formulate the second important result of this paper: in the weakly dispersive spectral region of low frequencies, a gain above 20 dB is hard to achieve. For small $\mu$, the gain is reduced by a strong up-conversion effect, while for large $\mu$, the gain is small and oscillatory because of a large effective phase mismatch.

D. Comparison with experiment
In this section we compare our theoretical predictions with experiments performed on a SNAIL-TWPA, Fig. 1d. The device consists of $N = 440$ unit cells; each unit cell contains one junction with $I_{c1} = 0.8$ µA and N = 3 identical junctions with $I_{c2} = 3$ µA, Eq. (12). The biasing magnetic flux is applied at $\Phi \approx 0.4\Phi_0$, where $\chi_3(\Phi) \approx 0.82$ (recall Eq. (15)) and the four-wave mixing vanishes. The pump frequency is placed at $f_p = 8.5$ GHz, and the transmission ($S_{21}$) is measured using a VNA while sweeping the signal frequency within the band 4-12 GHz. We determine the gain by comparing the transmission of the signal with the pump on and with the pump off. The data are presented in Figs. 14 and 15 with black lines for the expected pump powers $P_p^{\rm exp} \approx -99$ dBm and $P_p^{\rm exp} \approx -94$ dBm, respectively. The red lines represent the theoretical fit. The fitting is done using Eqs. (40) and (47). These equations contain two parameters, the scaling parameter $\mu$ and the signal detuning $\delta$. We generate the solution $a_s(\xi; \delta, \mu)$ for chosen $\delta$ and $\mu$ and compute the gain at the end of the chain for a range of detunings, $G(\xi_{\max}, \delta, \mu)$. Then we sweep the two parameters, $\xi_{\max}$ and $\mu$, to obtain the best fit to the data. Equivalently, one can use the parameters $k_p a$, the normalised pump wave vector, and $\varepsilon$, the coupling strength, which are related to $\xi_{\max}$ and $\mu$ by virtue of Eq. (39) for the given length of the chain and Eq. (41), in which we use the exact dispersion relation, Eq. (6), including the junction capacitances.
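The procedure described above (gain from pump-on versus pump-off transmission, followed by a two-parameter sweep) can be sketched generically. This is an illustration only: `gain_db`, `grid_fit` and `toy_model` are hypothetical names, and `toy_model` is an arbitrary stand-in used to exercise the sweep. It is not the solution of Eqs. (40) and (47), which would have to be supplied as the `model` argument in a real fit.

```python
import math

def gain_db(s21_on_db, s21_off_db):
    """Gain in dB as the difference of pump-on and pump-off transmission."""
    return [on - off for on, off in zip(s21_on_db, s21_off_db)]

def grid_fit(model, deltas, measured_db, xi_grid, mu_grid):
    """Least-squares sweep over (xi_max, mu); returns (cost, xi_max, mu)."""
    best = None
    for xi_max in xi_grid:
        for mu in mu_grid:
            predicted = model(deltas, xi_max, mu)
            cost = sum((p - m) ** 2 for p, m in zip(predicted, measured_db))
            if best is None or cost < best[0]:
                best = (cost, xi_max, mu)
    return best

def toy_model(deltas, xi_max, mu):
    """Placeholder gain curve (hypothetical), only for testing the sweep."""
    return [xi_max * math.exp(-mu * d ** 2) for d in deltas]

# Synthetic "measurement" generated from the placeholder model itself.
deltas = [-0.4, -0.2, 0.0, 0.2, 0.4]
measured = toy_model(deltas, 20.0, 3.0)
best = grid_fit(toy_model, deltas, measured, [10.0, 20.0, 30.0], [1.0, 3.0, 5.0])
```

A brute-force grid is chosen here only because there are just two fitting parameters; the text does not specify which optimisation method was actually used.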
The parameter values extracted from the fitting are presented in Table II. The values of $\varepsilon$ differ between the two datasets, which agrees quantitatively with the difference in the pump powers. At the same time, the value of $k_p a$ is the same in both cases, as expected. The fitted value of the pump wave vector, $k_p a = 0.51$, however, differs from the theoretical value, $k_p a = 0.42$, computed from the dispersion relation, Eq. (6), using the SNAIL parameters $\omega_S = 2\pi \cdot 20.6$ GHz, $C = 154$ fF and $C_J = 17.9$ fF. We attribute this discrepancy to a nonuniform magnetic flux bias. The on-chip pump current, $I_p$, is determined by computing the pump-induced phase difference, $|\theta_p(0)| = \varepsilon/|\chi_3|$, Eq. (24), and then connecting it to the current using Eq. (13). The resulting values of the on-chip pump power are consistent with the expected values, as the estimated loss of the line is $\sim 84$ dB.
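The theoretical value $k_p a = 0.42$ can be cross-checked numerically. The sketch below assumes that Eq. (6), which is not reproduced in this section, has the standard form for a junction chain with ground capacitance $C$ and junction capacitance $C_J$, namely $\omega^2 = \omega_S^2 \, s / (1 + (C_J/C)\, s)$ with $s = 4\sin^2(ka/2)$. The second function assumes that $\varepsilon$ scales linearly with the pump current amplitude, i.e. with the square root of the pump power; both are assumptions made for illustration, and the function names are hypothetical.

```python
import math

def kpa_from_dispersion(f_p, f_S, C, C_J):
    """Invert the assumed dispersion relation for the normalised wave vector ka."""
    r = (f_p / f_S) ** 2                 # (omega_p / omega_S)^2
    s = r / (1.0 - r * C_J / C)          # s = 4 sin^2(ka / 2)
    return 2.0 * math.asin(math.sqrt(s) / 2.0)

def amplitude_ratio(p1_dbm, p2_dbm):
    """Ratio of current (amplitude) values for two powers given in dBm."""
    return 10.0 ** ((p2_dbm - p1_dbm) / 20.0)

# Parameters quoted in the text.
kpa = kpa_from_dispersion(f_p=8.5e9, f_S=20.6e9, C=154e-15, C_J=17.9e-15)
eps_ratio = amplitude_ratio(-99.0, -94.0)

print(round(kpa, 2))        # 0.42
print(round(eps_ratio, 2))  # 1.78
```

With the quoted SNAIL parameters the assumed relation returns $k_p a \approx 0.42$, matching the theoretical value in the text, and the 5 dB difference in pump power corresponds to a factor of about 1.78 in amplitude.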
Summarising, the theory reproduces very well the measured frequency dependence of the gain, $G(\omega_s)$, even though only two fitting parameters are available to describe the intricate interplay of the pump, signal, idler and their up-converted modes. Furthermore, our analysis reveals that the measurements were done in the regime of non-exponential amplification at large $\mu$, Fig. 13, which explains the small measured values of the gain.
In this section we revisit the 3-mode model of Section III and consider the possibility of reducing the dispersion at high frequencies, in the no-up-conversion region, $\omega_p > \Omega_{\rm th} = 4\omega_0/3$, to maintain a high gain. The idea is to create a sweet spot in the TWPA frequency spectrum where the signal injected at the degeneracy point, $\omega_s = \omega_p/2$, is exactly phase matched with the pump, $\kappa_s = \kappa_p/2$. Such a possibility does not exist for the TWPA studied in Section III. However, as we prove theoretically in this section, it appears for a TWPA with a two-band frequency spectrum (cf. [19]). At such a sweet spot a strong exponential amplification is predicted to occur and, moreover, it persists within a wide frequency band.
A common way to create a gap in the TWPA spectrum is to periodically modulate the device parameters along the propagation direction. This is routinely done for kinetic inductance TWPAs by modulating the geometry, and thereby the impedance, of the transmission line [13]. For Josephson junction TWPAs another method is used: adding linear LC-oscillators to the TWPA cells. In this case, a spectral gap opens at the resonance frequency of the oscillators. This method of dispersion engineering is used in four-wave mixing devices to mitigate the Kerr effect [23][24][25].
Let us consider the TWPA with LC-oscillators. The corresponding circuit is presented in Fig. 16a. The derivation of the dynamical equation is straightforward: we add the oscillator circuit variables to the Lagrangian, Eq. (1), and eliminate the oscillator variables from the dynamic equations. Specifically, we consider the dc current biased TWPA. In this case, Eq. (21) for the chain without oscillators remains valid, the only difference being that the factor $\omega^2/\omega_0^2$ in the first term is replaced with a frequency-dependent factor that includes the oscillator response. Solving the resulting dispersion equation for $\omega$ yields the spectrum depicted in Fig. 17; it consists of two bands separated by a gap.
To identify the sweet spot we assume the pump frequency to lie within the upper band and the signal frequency within the lower band, and solve the equation $\omega_-(\kappa_p/2) = \omega_+(\kappa_p)/2$. Converted for the pump frequency, this equation can be written in explicit form. One can check by direct calculation that the solution indeed possesses the property $\kappa_s = \kappa_p/2$, as illustrated in Fig. 17: the pump and signal points are located on a straight line. Moreover, the pump frequency is located in the no-up-conversion frequency region (recall Fig. 4). The equations for the gain coefficient, Eqs. (28) and (29), derived in Section III do not change their form in the present case; however, the dependence of the gain coefficient on the frequency is now different due to the different dispersion relation. The dependence of the gain coefficient at the signal degeneracy on the pump frequency is illustrated in Fig. 18. As long as the pump is placed within the lower frequency band, the gain coefficient behaves similarly to the one in Fig. 2a for a TWPA without oscillators. However, when the pump is placed within the upper band at the sweet spot, the amplification dramatically increases, up to $g \approx 0.014$ for a rather weak pumping strength, $\varepsilon = 0.06$, about seven times smaller than the maximum pumping strength in Fig. 2a. For a detuned signal, large amplification persists within a quite wide frequency band, $\sim 0.5\omega_p$, which could be of the order of a few GHz, as shown in Fig. 19. This is due to the relatively weak dispersion in the low-frequency region.
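The sweet-spot condition $\omega_-(\kappa_p/2) = \omega_+(\kappa_p)/2$ can be illustrated numerically on a toy two-band spectrum. The sketch below is not the dispersion relation of the present paper: it assumes a generic chain in which each ground capacitance is shunted by a series LC branch, giving $4\sin^2(\kappa/2) = (\omega^2/\omega_0^2)\,[1 + \gamma/(1 - \omega^2/\omega_r^2)]$, and the parameter values ($\omega_0 = 1$, $\omega_r = 1.2$, $\gamma = 0.05$) are arbitrary choices for illustration. All function names are hypothetical.

```python
import math

def bands(kappa, omega0=1.0, omega_r=1.2, gamma=0.05):
    """Lower and upper band frequencies of the assumed two-band dispersion.

    The assumed relation is quadratic in u = omega^2:
    u^2 - u * (omega_r^2 * (1 + gamma) + S) + S * omega_r^2 = 0,
    with S = 4 * omega0^2 * sin^2(kappa / 2).
    """
    S = 4.0 * omega0 ** 2 * math.sin(kappa / 2.0) ** 2
    b = omega_r ** 2 * (1.0 + gamma) + S
    c = S * omega_r ** 2
    root = math.sqrt(b * b - 4.0 * c)
    return math.sqrt((b - root) / 2.0), math.sqrt((b + root) / 2.0)

def mismatch(kappa_p, **params):
    """omega_-(kappa_p / 2) - omega_+(kappa_p) / 2; zero at the sweet spot."""
    omega_s = bands(kappa_p / 2.0, **params)[0]   # signal on the lower band
    omega_p = bands(kappa_p, **params)[1]         # pump on the upper band
    return omega_s - omega_p / 2.0

def find_sweet_spot(lo=1e-6, hi=math.pi, tol=1e-12, **params):
    """Bisection for a root of mismatch() on [lo, hi]."""
    f_lo = mismatch(lo, **params)
    if f_lo * mismatch(hi, **params) > 0:
        raise ValueError("no sign change: no sweet spot bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = mismatch(mid, **params)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

kappa_p = find_sweet_spot()
omega_s = bands(kappa_p / 2.0)[0]
omega_p = bands(kappa_p)[1]
cutoff = bands(math.pi)[1]
print(f"kappa_p = {kappa_p:.3f}, omega_p = {omega_p:.3f}, omega_s = {omega_s:.3f}")
```

For these illustrative parameters the root places the signal in the lower band and the pump in the upper band with $\omega_s = \omega_p/2$ and $\kappa_s = \kappa_p/2$, and the sum frequency $\omega_p + \omega_s$ lies above the toy model's cutoff, so the qualitative picture described in the text is reproduced. Whether a sweet spot exists depends on the parameter choice; the bisection raises an error when none is bracketed.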

B. Sweet spot in periodically modulated chain
Here we examine a periodically modulated TWPA and prove that there also exists a sweet spot. Experimental realisation of such a device will be reported elsewhere [39]. The circuit is presented in Fig. 16b; here each unit cell consists of two subcells with different Josephson junction parameters. Consider the dc current biased TWPA with different Josephson inductances in the subcells. The Lagrangian can be written for odd and even circuit nodes in analogy with Eqs. (1) and (2), $\ldots \cos(\theta_{01} + \theta_{2n}) + \sum_{2n+1} \frac{1}{L_{J2}} \cos(\theta_{02} + \theta_{2n+1})$.

VI. CONCLUSION
In this paper, we propose a method for achieving a high gain in a lumped-element TWPA operating in the 3WM regime. The simple model of amplification in a weakly dispersive medium [9,10], relevant for kinetic inductance TWPAs and for Josephson junction TWPAs at low frequencies, predicts a gain of up to 40 dB for a chain with 100 unit cells and a reasonable pump intensity. In practice, however, such a gain has never been demonstrated. This was explained by the parasitic effect of generation of high harmonics and up-conversion processes [34]. We performed a detailed theoretical analysis of this regime, including multiple pump harmonics and signal and idler up-converted modes. We identified a scaling parameter $\mu$ that controls the gain and quantifies the interplay between the dispersion and the nonlinear wave interaction, and found that the gain is strongly reduced for both small and large values of $\mu$, although for different reasons. When the dispersion is weak in relation to the interaction, i.e. for small $\mu$, the generation of up-converted modes is prominent. In the opposite limit of strong dispersion in relation to the interaction, i.e. for large $\mu$, the phase mismatch becomes the dominant effect. This finding is supported by the experimental observations on a SNAIL-TWPA, and the data are in quantitative agreement with the theoretical simulations.
Our proposal concerns a different operation regime in which the cutoff frequency of the TWPA plays the central role. We proposed to place the pump close to the cutoff such that generation of up-converted modes is inhibited. Then, by solving the difference equations for the discrete Josephson junction chain, we found that there is a sweet spot where the pump and the signal are exactly phase matched. The sweet spot was proven to exist when the TWPA frequency spectrum consists of two bands separated by a gap. Studying different ways of engineering the two-band spectrum, by adding LC-oscillators or by periodically modulating the chain parameters, we predicted that the gain at the sweet spot may reach values of the order of 25 dB within an amplification bandwidth of a few GHz for a chain with $\sim 200$ unit cells and moderate pump intensities.