Runaway Transition in Irreversible Polymer Condensation with Cyclisation

The process of polymer condensation, i.e. the formation of bonds between reactive end-groups, is ubiquitous in both industry and biology. Here we study generic systems undergoing polymer condensation in competition with cyclisation. Using a generalised Smoluchowski theory, molecular dynamics simulations and experiments using DNA and T4 ligase, we find that this system displays a transition, from a regime with finite-length chains at infinite time and dominated by rings to one dominated by linear polymers that grow in time. Finally, we show that fluids prepared close to the transition may have profoundly different compositions and rheology at large condensation times.


I. INTRODUCTION
Linear polymer condensation is the process by which two polymeric end groups react to form a bond.Beyond its relevance to industry [1], and biotechnology [2], it underpins the biophysics of DNA repair and cloning [3].In the absence of loop formation, polymer condensation will yield linear chains with average length ⟨l⟩ = 1/(1−p) where p is the extent of the condensation reaction [1,4].However, looping, or cyclisation, is expected to be favourable in certain conditions [5][6][7].Several theories on reversible polymer condensation and experiments have, over the last decades, attempted to reach a consensus on whether the polymers in such systems will all eventually convert into rings or whether there always be a linear population at a large-time scale [8][9][10][11][12][13][14][15].Despite this, the polymer physics and chemistry communities have not yet reached a consensus [15][16][17].Additionally, there is little literature on irreversible polymer condensation, which we also refer to as "ligation" henceforth in analogy with the biological process of connecting DNA segments by the enzyme ligase.
Here we study irreversible linear polymer condensation using a combination of theory, simulations, and experiments.First, we show that irreversible polymer condensation is well captured by a modified Smoluchowski coagulation equation [18,19] with an additional sink term that captures ring formation.By spanning a range of monomer concentrations c, we discover that above a critical c † ≃ 0.1c * there is a "runaway" transition characterised by a population of chains that permanently escape cyclisation.Here c * = l 0 /(4/3πR 3 g ) denotes the overlap concentration of polymers with l 0 and R g the initial polymer length and radius of gyration respectively.This transition separates a regime (c < c † ) in which all the chains are converted into rings at infinite time, from one (c > c † ) in which the length of the linear chains diverges in time.
The consequence of this runaway transition is that systems prepared close to c † and driven out-of-equilibrium by irreversible condensation will display markedly different architectural and rheological features at large enough times.
Our work differs from classic and also more recent papers on polymer condensation and cyclisation [8,11,16,20] because it deals with irreversible condensation while implementing subdiffusive search and cyclisation in a Smoluchowski framework and because it suggests through theory, simulations and experiments, that a runaway transition is expected beyond a critical concentration.We also argue that DNA is particularly suitable to test these theories as we can readily visualise the products of ligation reactions by gel electrophoresis and distinguish linear and circular forms by treating the samples with exonuclease, as described below.We conclude our paper by discussing the implications of our findings in the design of soft materials and DNA cloning.

A. Molecular Dynamics Simulations
We model a 6,500 bp-long linear DNA molecule as a bead-spring polymer made of l 0 = 174 beads.The total number of polymer chains is N c = 200.The polymers are modelled via the Kremer-Grest model [21].Each bead has a diameter σ = 13 nm (or ∼ 38 bp), modelled as a for r < r c = 2 1/6 σ and 0 otherwise.Here r represents the distance between beads and ϵ = 1.0 (in LJ units) parametrises the strength of the potential.The diameter of the bead, σ, defines the length units in our system.Consecutive beads are connected through a permanent Finite Extensible Non-linear Elastic (FENE) bond with K = 30ϵ/σ 2 and R 0 = 1.5σ, which is summed to a WLC potential to yield an equilibrium bond length around 0.9σ.The bending stiffness of the polymer is controlled by a Kratky-Porod interaction which constrains the angle (θ) defined by the two tangent vectors connecting three consecutive beads along the polymer.Here, l p = 4σ = 150 bp is the persistence length of DNA.We note that as l 0 ≫ l p , we are always in the flexible chain regime.The solvent is simulated implicitly using a Langevin thermostat so that the time evolution of our system is governed by the stochastic partial differential equations where r is the position of a particle, ζ its friction, m its mass, U the sum of the interaction potentials discussed above and δ white noise with unit variance.The diffusion timescale is τ B = ζσ 2 /k B T .The integration of the Langevin equation is done with a velocity-Verlet algorithm, using a time step ∆t = 0.01τ B in LAMMPS [22].
Various monomer densities were considered, ranging from 10 −2 c * to 1c * , where c * = 0.012σ −3 is the monomer concentration at which the polymers start to overlap.The overlap concentration c * was measured by computing the radius of gyration R g of the polymers in equilibrium at infinite dilution.All the systems were equilibrated for a sufficient amount of time to ensure that the polymer chains have moved at least a distance equal to R g .
After the equilibration step, 40 replicas of production runs were started for each number density considered.The ligation is performed stochastically and is attempted every t l = τ B between two end beads that are closer than R c = 1.1σ using the fix bond/create LAMMPS command.The choice of the time in between ligation attempts, t l , was made so that it was much shorter than the relaxation time of the chains; in this way, the condensation process is diffusion-limited.The distance threshold R c was chosen so that the new bond created is a FENE with cutoff 1.5 σ and to avoid unstable simulations.The probability of successful ligation (i.e., bond formation) is set to p l = 0.1.This value was chosen to avoid "granularity" in the stochastic condensation reaction.If this parameter was set to 1, all the ends that can react would do so in a single time step introducing granular events in our simulations.Setting p l < 1 introduces some randomness that simply maps to a smaller average condensation rate.We have tested slightly different choices of these parameters and we found that the main results and qualitative behaviour of our results are not affected.In particular, we have tested that the reactions remain diffusion-limited even with our choice of p l .A schematic representation of the simulation process is shown in Fig. 1.
Once ligated, the bond formed between the polymers is irreversible and cannot be broken, therefore accounting for the formation of a covalent bond between the DNA fragments.During the ligation process snapshots of the system are taken every 10 6 time steps on both the 3D coordinates of the beads and the bond list at those time steps.From the bond list we can, later on, reconstruct the topology of the individual polymers, i.e. if fused with others to form linear chains or if circularised.
For the topology reconstruction, the trajectories and bond lists were analysed using our Python code (https: //git.ecdf.ed.ac.uk/taplab/dna-ligation.git).The description of the algorithm can be found in Appendix A.

B. The DSMC algorithm
The modified Smoluchowski equation proposed here, (see below Eq. ( 5)), can only be solved analytically for certain forms of the condensation rate k 1 (i, j) and of the cyclisation rate k 0 (l).As our Molecular Dynamics simulations are practically limited to systems of hundreds of chains, to characterize the behaviour of larger systems we solve the Smoluchowski equation numerically employ-ing the Direct Simulation Monte Carlo (DSMC) algorithm [23][24][25][26].DSMC is a powerful stochastic method to solve differential equations such as Eq. ( 5), and which samples the correct ligation kinetics in the limit of large system sizes.The algorithm employed here is similar to the one described in Ref. [26], with the difference that here we do not include fragmentation, but instead, we include ring formation.The description of the algorithm also follows Ref. [26].The starting point for the Monte Carlo algorithm is an array m of length N c , each element i of which contains a number m i which represents the mass/length of the chain i: A value of 0 corresponds to the absence of a certain chain.Moreover, to satisfy mass conservation we ensure that Nc i=1 m i = N c is true at any time during the simulation.Here, N c denotes the total number of polymer chains.We will also consider an analogous array r of length N c (initially empty), where we save the masses of the rings.
For an initial monodisperse condition, we set m 0 = (1, 1, . . ., 1).After the array m is initialized, we run the DSMC simulation, which consists of repeating a large number of times a Monte Carlo step (described in detail in Appendix B).The execution is terminated when the system has reached a state in which there is a single linear chain and several non-reactive rings, where the only possible reaction is the cyclisation of the remaining linear chain.

Ligation Reactions with DNA
We perform irreversible condensation on linear DNA using T4 ligase New England Biolabs (NEB).This enzyme consumes ATP to form a covalent bond between two proximal and complementary double-stranded DNA ends.More specifically, we perform irreversible condensation on a monodisperse solution of linear, l 0 = 6, 500 bp-long plasmid (referred to as "1288" plasmid here) which is converted into a linear form by using a restriction enzyme (XhoI).This linearisation step is checked on gel electrophoresis.The equilibrium radius of gyration of this linear DNA molecule is about R g ≃ l p l 0 /3l p ≃ 0.2 µm (in agreement with diffusion data from Ref. [27]).This yields an overlap concentration c * = 3l 0 M w /(4N A πR 3 g ) ≃ 0.2 µg/µl with M w = 650 g/mol the molecular weight of a DNA basepair and N A the Avogadro number.For the low DNA concentration experiments we set the sample at 0.01c * , i.e. c = 2 µg/ml.To perform ligation we use T4 ligase (NEB, M0202L, 1U corresponds to 0.5 ng or 0.00735 pmoles of protein according to Ref. [28]), and work at 1x T4 ligase reaction buffer concentration, which contains 1 mM ATP.To classify the topology of the DNA under ligation, we perform time-resolved gel electrophoresis.We prepare a master solution of DNA at the desired concentration, 1x ligase buffer and 2 U/µl T4 ligase.
After adding T4 ligase, we draw aliquots at time intervals and heat-inactivate the reaction by heating the aliquot at 65 • C for 15 minutes.We then split the aliquot and treat one of the two sub-aliquots using exonuclease (RecBCD, Lucigen), an enzyme that digests linear, but not circular, DNA.Finally, we treat all aliquots with Nb.BbvCI Nickase (NEB, R0631L) to relax the supercoiled population [29].The resulting aliquots are run on a gel: we load 20ng of DNA from each aliquot onto a 1% agarose gel prepared using 1x TAE buffer.A standard λDNA -HindIII digest (NEB, N3012S) marker is also loaded.The gel is run at ∼ 2.5V/cm for 5 hours and post-stained with SybrGold (ThermoFisher) for 30 minutes.A Syngene G-box and Genesys software is used to image the gels.
The combination of nickase (relaxing the DNA supercoiling) and exonuclease (fully digesting linear DNA molecules) allowed the topology of the DNA in each band to be unambiguously identified.Further, the λDNA -HindIII digest marker confirmed the bands were of the correct size for monomer and dimer lengths.Here the terms "monomer" and "dimer" refer to a single DNA molecule and two molecules ligated, respectively.To extract the relative amount of molecules in each lane we compute, using ImageJ, the intensity of each lane and account for the fact that the band with dimers has chains that are twice as long.We then normalise against the sum of the three bands to obtain the relative fraction of chains in each population.

Microrheology
The viscosity of the systems is measured using particle tracking microrheology.Solutions are made by mixing 8 µl of 1288 linearised plasmid at different concentrations to a final concentration in the range 2ng/µl-500ng/µl with 1 µl of 40 U/ul T4 ligase and 1 µl of T4 ligase reaction buffer.Control solutions are prepared at the same time and in the same manner substituting additional TE for the T4 ligase.The samples are kept at room temperature on a roller for several days.The samples are then spiked with a = 800 nm PVP-coated polystyrene beads, pipetted and sealed onto a slide and imaged using an inverted microscope.We take a 30minute movie and we analyse the movies using a particle tracking algorithm (trackpy [30]) and extract the trajectories and mean squared displacements(MSD) of the tracers ⟨∆r 2 ⟩.Diffusion coefficients are extracted by fitting to the MSDs via MSD= 2Dt.The viscosity is obtained using the Stokes-Einstein relation [31], η = k B T /(3πDa).

A. Smoluchowski equation with cyclisation
In this result section, we first propose a modified Smoluchowski equation [18,19] describing polymers undergoing irreversible condensation (ligation) and cyclisation.Linear polymers undergo irreversible ligation with rate k 1 (i, j), with i, j the polymerisation indexes of the reactants, and cyclisation with rate k 0 (q).The concentrations of linear polymers of polymerisation index q at time t, n q (t), and of rings, n r q (t), are thus governed by the following equations: Once a linear chain undergoes cyclisation, it becomes a ring and cannot undergo ligation anymore, as the reactions are assumed to be irreversible.The kinetics is also constrained by the requirement that the total mass is conserved: where M is the total number of monomers and V is the system's volume.Assuming that the reaction takes place on a time scale larger than the Rouse relaxation time, the length-dependence of the annealing rate is [32,33] where l = il 0 is the length of a polymer with a degree of polymerisation i and l 0 is the initial polymer length, so that the chain's radius of gyration is R i = l 0 i ν .In Eq. ( 8), κ1 is a dimensionless constant and κ 1 is a constant that depends on temperature and the viscous friction of the solvent ζ.For example, in the Rouse model [34] This condensation rate captures the diffusioncontrolled search process [32,33].The cyclisation rate is taken to be k 0 (q) = κ 0 q µ , where µ = −4ν.Note that this is different from the classic Shimada-Yamakawa theory [20,35] which would predict µ = −3ν at lengths larger than l p because we (i) are out-of-equilibrium and (ii) account for the subdiffusion of the polymer end within the volume of the coil.
In equilibrium, the looping probability of a chain is given by the Shimada-Yamakawa formula [20,36].For l ≫ l p the looping probability of a polymer decays as P (l) ∼ l µ with µ = −3ν.This looping probability also holds for an irreversible, non-equilibrium scenario if the process is reaction-limited.This is because the chain ends would have the time to explore many conformations and to diffuse the whole volume of the chain, V ∼ l 3ν , before reacting (as it would happen in equilibrium).In a diffusion-limited, irreversible ligation process, one should instead compute the time it takes for an end to diffuse over a certain distance ξ.The dynamics of the end is described by the Rouse model [34] , where b is the size of a Kuhn monomer.Then, setting ξ = R (the size of the polymer coil) one obtains (R/b) 4 = k B T t/(ζb 2 ), which implies k 0 ∼ t −1 ∼ R −4 ∼ l −4ν .So considering µ = −4ν effectively takes into account the fact that the chain ends are performing a sub-diffusive search process within the polymer coil, as expected for Rouse dynamics.
We have verified that the rate of cyclisation scales as the length of the chain to the power −4ν by measuring the rate at which rings are produced for different lengths of the linear chains (Fig. 2a-b).We have done this by changing the initial length l 0 and by running short simulations, in turn assuming that the system has had no time to create dimers, trimers, etc. and by measuring the number of rings formed.We have observed that the rate of ring formation at early times Ṅrings ∼ l −2.6 0 which is close to the expected µ = −4ν = 2.4 with ν = 0.588.Thus, both theory and simulations suggest that the diffusion limited, irreversible looping probability of a polymer scales with its length as l −4ν .
To validate the functional form used for the condensation rate k 1 (i, j) (Eq.( 8)), we solve the Smoluchowski equation in the limit of small concentration and short times, where only monomer, dimer and monomer ring populations are assumed to be present (see next Section).In Fig. 2c we plot the condensation rate k 1 as a function of different initial polymer lengths obtained by fitting the analytical solution of Eq. (10b) (see below) to the monomer chains population omitting the second term since no rings were present in these conditions and at early times.From this quantity, we fit a power law l ν−α with ν = 0.588 and find α ≃ 1 yields a good fit to the simulated data.This validates de Gennes' hypothesis for the functional form of the condensation rate (Eq.( 8)) and our choices for ν and α.
In our experiments and simulations, we typically track the mean length of the polymers as a function of time and we fit this observable with the numerical solution of the full Smoluchowski equation, Eqs.(5a)-(5b)).This is practically implemented in a MATLAB code.The numerical evaluation of the system is iterated to find the best free parameters κ 0 and κ 1 that fit the mean length ⟨l(t)⟩ obtained from simulations or experiments.The fit is done using the nonlinear least squares MATLAB function lsqcurvefit.The rate of ring formation κ 0 and the rate of linear chains formation κ 1 are extracted from this fit by considering 40 independent replicas, allowing us to obtain the error on the fit parameters.For experiments, we typically average over 3 independent replicas.The numerical and fitting algorithms are described in detail in Appendix C.

Time-dependence of the mean length: dilute regime
At short times and in the dilute regime, we can assume that the formation of rings and short n-mers is more favourable.This assumption is valid in the experiments whenever only linear monomers, dimers and monomer rings are visible in the gel electrophoresis after ligation.In more dense solutions the presence of rings consisting of more than two monomer chains will be present and is observed in our simulations.Under very dilute conditions, we can thus assume that only monomers, dimers and monomer rings are present.Denoting the number density of monomer rings, linear monomers and dimers as n r 1 , n 1 and n 2 , respectively, the Smoluchowski equations describing the system take the form We solve Eq. (10b) neglecting the second term as n 2 1 ≪ 1 in the infinite dilution limit: The concentration of monomer rings is thus which yields Substituting in Eq. (10c), we get from which one obtains Assuming these three are the only contributions to the system, the mean length is then given by the following relation In denser solutions, where the population is more polydisperse, the Smoluchowski equation cannot be solved analytically and we refer to the next Section for a scaling prediction and to Sec.III C 1 for a perturbative approach in the limit of small cyclisation rate.
As mentioned above, we validate de Gennes' equation for the condensation rate (Eq.( 8)) by running short simulations at very high dilution.We then fitted the change in number of ring monomers with the closed solutions Eqs.((13)) for different values of initial polymer length l 0 .Similarly, we fit the solution of Eq. (10b) without the ring term to the population of monomers.From these data, we validate the scaling of the rates k 0 (l 0 ) = κ 0 l −µ 0 and k 1 (l 0 , l 0 ) = κ 1 l α−ν 0 as a function of length l 0 (Fig. 2bc).

Time-dependence of the mean length: concentrated regime
Here we give scaling arguments for the solution of the Smoluchowski equation in the concentrated limit, with the assumption that ring formation is negligible.At the mean-field level, we can make the simplifying assumption that the system can be described by a single characteristic length scale l [37].Under this assumption, the annealing rate scales as The total polymer density n thus follows ṅ = −k 1 (l)n 2 , so that from the dimensional analysis the time evolution of the characteristic length is [38,39] with λ = ν − α.For Rouse dynamics, one has α = 1, whereas α = 2 for reptation [40].The Flory exponent has value ν = 1/2 for ideal chains and ν = 0.588 for self-avoiding chains [40].Assuming concentrations above overlap but still far from the melt concentration (for which one would have ideal chain statistics and α = 1/2), we can assume ν = 0.588, so that γ ≃ 0.7 if the system is unentangled and γ ≃ 0.4 in the presence of entanglement.We note, however, that using Eq. ( 17) in the presence of entanglements is only valid for times longer than the reptation time τ R ∼ l 3 [41].
B. Linear DNA condensation

Simulations
We first simulate linear condensation using Molecular Dynamics.As detailed in the Methods section, we simulate polymers with N = 174 beads of size σ ∼ 38 bp and persistence length l p = 4σ = 150 bp.These polymers are thus designed to coarse-grain 6.5 kb-long DNA plasmids which will be employed in experiments (see next section).During the simulation, we take snapshots of the system and record the list of bonds to reconstruct the topology of the polymers (see Fig. 1).Over the simulation time, the number of initial linear chains decreases due to the formation of (i) longer linear polymers or (ii) circular chains (Fig. 3a).Additionally, lower monomer concentrations c promote the formation of more rings at large times and a slower decrease of the linear species.We also note that (i) the number fraction of rings converges to a finite value at large time, and that (ii) while the number of linear chains appears to go to zero, their mean length increases (Fig. 3b).Accordingly, the (number) average length of polymers grows more quickly for larger c (Fig. 3b).Thus, we conclude that loop formation competes with the growth of the chains, and that cyclisation is dominant in dilute systems.Interestingly, the curves of the mean length ⟨l(t)⟩ can be fitted extremely well by the numerical solution of the Smoluchowski equation Eq. ( 5) (Fig. 3b).

Experiments
As described in the Methods Section, we can perform DNA condensation using solutions of linearised DNA plasmids, mixed with ATP and DNA T4 ligase.We then perform a time-resolved experiment, where we draw aliquots from a master reaction at given time points from the addition of the T4 ligase.By running the aliquots on agarose gels we can visualise and compute the fraction of molecules in the linear and ring, monomeric, dimeric, etc. states.Fig. 4a reports a picture of one such gel, displaying a single band of monomeric linear DNA (as it disappears after exonuclease treatment) at t = 0, evolving into three bands, one of which is exonuclease resistant (a monomer ring) at larger times.In Fig. 4b we plot the relative abundance of these populations, from which we obtain the number average molecular length ⟨l(t)⟩ (Fig. 4c).

Dimensionless topological parameter
Since we initialise our simulations and experiments below entanglement conditions we fix α = 1 as expected for Rouse dynamics and ν = 0.588 as expected for self-avoiding polymers [34] (we verified these exponents through direct MD simulations in Fig. 2a-c).In general, the Smoluchowski coagulation equation (Eq.( 5)) is then solved numerically to fit the data of mean length versus time, ⟨l(t)⟩, obtained in simulations and experiments via the free parameters κ 1 and κ 0 .A key number in our system is the ratio of the rates at which polymers are condensed κ 1 , and the one at which rings are formed κ 0 .We thus define a dimensionless "topological parame- ter" κ ≡ 2κ 0 /(n 0 κ 1 ), where n 0 is the number density of monomeric chains of length l 0 at the start of the simulation or experiment.Albeit related to the classic j-factor employed in DNA looping [6,20], our topological parameter is more naturally interpreted as the number of rings formed for every two linear chains that are fused together.Intuitively, this number determines the final topological composition of the system.At κ ≫ 1, we expect the final state of the system to be dominated by rings, while for κ ≪ 1 to be dominated by linear chains.Importantly, since k 0 ∼ ⟨l(t)⟩ −4ν the probability of ring formation decreases in time as the average length of the linear chains increases.Accordingly, and even though our system has a ring-only irreversible absorbing state, we conjecture that the strongly decreasing looping probability may effectively yield a very long time-transient in which the system is dominated by entangled linear chains with circular contaminants (see below for more simulations on this).
Importantly, we expect the Smoluchowski equation to be valid only in the limit in which three-body interactions are negligible, the values of κ 0 and κ 1 should be independent on concentration only when c is small enough.By plotting κ ≡ 2κ 0 /(n 0 κ 1 ) as a function of c/c * (where c * is computed at the beginning of the simulation or experiment) we show that κ scales as n simulations and experiments until c ≃ c * where it starts to deviate (Fig. 5); this confirms that the Smoluchowski approximation is valid in this range of concentrations.Importantly, in Fig. 5 we also identify the crossover value κ = 1 (at which the initial cyclisation rate is larger than the dimerisation rate) around c/c * ≃ 0.1 − 0.2.We note that the agreement between simulations and experiments is excellent for small c/c * .However, quantitative analysis of gel electrophoresis images at larger c/c * is challenging due to the poor separation of multimeric bands.

C. Runaway Transition
The results in Fig. 3 suggest that at large c/c * the chains tend to grow longer, and cyclisation is suppressed; at the same time, the density of reactive ends and the speed of spatial exploration of the chains become smaller, thus suppressing dimerisation.Due to this kinetic competition, we ask whether the system can truly display a "runaway" phase, defined as a regime where at least one chain permanently escapes cyclisation and its length diverges in time.One way to address this question is to look at the number of chains that belong to the longest chain in the system, and how this quantity changes in time.By using a graph representation of our simulations (see Fig. 6a-b) we can compute the fraction of chains (nodes) that belong to the giant connected component (GCC), i.e. the largest cluster of connected monomer chains (Fig. 6b).In Fig. 6a one can visually appreciate that at large reaction time, rings (blue) are abundant at low c/c * while linear chains (grey) are more abundant at large c/c * .These systems display a qualitatively dif- ferent graph topology (Fig. 6b).At small c/c * (large κ) the network of monomers is mostly disconnected; accordingly, even when the fraction of unreacted bonds goes to 0 at t → ∞, the average length of the polymers does not diverge.On the contrary, at larger c/c * we observe only a few rings and some very long chains that are connecting most of the nodes in the system.Overall, the graph appears much more connected and approaching percolation, i.e.where most of the nodes belong to the GCC, whose size grows with the size of the system (see Fig. 6c).

Calculation of mass converted into rings at infinite time
Although our simulations support the notion that small values of κ will result in linear chains of increasing lengths and vanishing cyclisation rate, they are fundamentally limited to finite-size systems where the cyclisation rate of the largest chain never rigorously goes to 0. To estimate the amount of mass that is converted into rings at long times, we do a perturbative calculation valid in the limit of small κ.We start from the continuum Smoluchowski equation: We define K ≡ k 1 /κ 1 which is thus a scaling function such that K(ai, al) ∼ a λ k 1 (i, l) where λ = ν − α [42].We now treat κ 0 perturbatively, starting with κ 0 = 0.In this case, there is no mass lost into rings and we can thus write a conservation law Even for κ non-zero, we assume the loss of mass to cyclisation remains finite and of order κ.We will check the self-consistency of this assumption below.Using the mass conservation and Eq. ( 19) we can write the following scaling relations: l 2 n = 1, nt −1 = l 1+λ n 2 .We therefore obtain: which is the same as Eq. ( 18).Note that we must have λ < 1 for the average length of polymers to increase over time.We can also write the density distribution as which in the limit of long times or large lengths may be written as where G is a scaling function that only depends on the ratio l/t 1/(1−λ) .
We now introduce the ring length distribution n r l (t) and its evolution equation as Since at time t = 0 there are no rings, we can then write We can plug in the result we obtained for the distribution of length of linear chains Eq. ( 23) to yield (26) where we defined x = l/t 1/(1−λ) .Thus, the number density of polymers that are converted into rings over infinite time is Since λ < 1 and assuming the G(x) = O(1) when x → 0, the integral converges at 0. For convergence of this integral at ∞ we also require that the scaling function decays faster than x λ−1 .Assuming this functional form for the distribution of ring lengths at infinite time, we now compute the total average mass transformed into rings at infinite time as The convergence of this integral requires that λ − µ > 1 and in this case we get From this equation, we see that the fraction of mass in rings at infinite time M ∞ rings /M 0 converges to a finite value proportional to κ 0 (and hence < 1 at small κ 0 ).
With this calculation, we have thus shown that at small enough but non-zero κ 0 , the fraction of mass turning into rings is finite if λ − µ = ν − α + 4ν > 1 or ν > α/5 which is valid for any type of polymer in the non-entangled (α = 1) regime.This implies that in this regime we expect the cyclisation probability to decay fast enough and cannot prevent the runaway of the M 0 − M ∞ rings mass into linear chains that keep growing in time.
Consistently with this, in both MD and MC simulations, we never observe the formation of rings larger than 10 initial monomers.As shown here using asymptotic theory the mass fraction of linear polymers goes to a finite limit at t → ∞ in a thermodynamic system.We find that the key condition to ensure the existence of runaway transition is that the cyclisation rate k 0 = κ 0 l µ decays strongly enough.More specifically, we require the exponent µ to be µ = −4ν < −4α/5 or ν > α/5.This condition is always met in the Rouse unentangled (α = 1) regime, provided that the polymers are not fully collapsed (ν = 1/3).This argument establishes the existence of a runaway transition in the limit of large time and at large enough concentrations c/c * .

Direct Simulation Monte Carlo simulations of irreversible condensation
To formally address the existence of a true runaway transition in the thermodynamic limit, we compute the fraction of monomers belonging to linear species in systems of increasing size.To perform this calculation, we employ Direct Simulation Monte Carlo [23][24][25][26] to solve the Smoluchowski equation in systems with up to 10 5 chains.We run the DSMC code until it has reacted all ends apart from 2 and compute the average length of the linear population of chains, ⟨l lin (t ≫ 1)⟩.As shown in Fig. 6c, our MD simulations show that at κ = 1 the GCC displays a change in scaling, growing as GCC ∼ κ −1 ∼ c/c * as κ → 0 suggesting that a qualitative change in behaviour takes place around κ ≃ 1.In Fig. 6d, we also plot the number averaged chain length at an arbitrarily large time when the DSMC code has evolved the system as long as possible and has generated only a single linear chain.Fig. 6d suggests that the linear-dominated regime (κ < 1) displays an average polymer length at a large reaction time that scales as ⟨l(t ≫ 1)⟩/l 0 ∼ κ −1 ∼ c/c * .Additionally, the fraction of mass "lost" in forming rings grows as M rings ∼ κ and is thus negligible for small enough κ (Fig. 6e).Finally, as shown in Fig. 6f, the mean length of the linear chains ⟨l lin (t ≫ 1)⟩ displays a plateau for κ ≲ 1, which grows with the system size, strongly indicating a true runaway transition at the critical value κ ≃ 1 or c/c * ≃ 0.1 − 0.2.

D. Dynamics and Rheology
To test the consequences of the runaway transition on the dynamics and rheology of the system, we perform microrheology experiments and compute dynamics in MD simulations.DNA microrheology is well established and the effects of DNA concentration, length and topology on microrheology have been studied in the past [43][44][45][46][47][48][49].Here, we perform microrheology by tracking 800 nm PVP-coated polystyrene beads added in a solution of DNA that has been treated with either 40U T4 ligase for a week (and thus to full extent of reaction) or with buffer for a week (control) at different initial concentrations.We ran a small aliquot of the samples in a gel and observed that indeed at c/c * ≃ 0.1 the fraction of linear chains overcome the rings at large times (Fig. 7a-b).
At low concentrations, our microrheology shows that the MSD of the tracer particles is unaffected by DNA ligation (Fig. 7c).On the contrary, for c/c * ≥ 0.1, we find that the MSDs of the tracers in the ligated systems are much slower and display a stronger subdiffusive behaviour than the control (Fig. 7c).From the MSD, we extract the large-time diffusion coefficient D of the tracers and the effective viscosity of the sample via the Stokes-Einstein equation [31].The plot of the normalised viscosity (Fig. 7d) suggests that a dynamical transition takes place around c/c * = 0.1 − 0.2 (or κ ≃ 1 − 2) which matches the structural runaway transition seen before (Fig. 6).After the transition, the viscosity increases exponentially with the concentration (see inset of Fig. 7d).This suggests a relaxation process dominated by end-retraction [34], possibly due to the threading of very long linear chains through small rings [50][51][52][53][54] or pseudo-knotted parts of their own extremely long contour [55,56].We note that, especially at large c/c * , the ligated solution is extremely elastic and the passive tracers do not display a freely diffusive behaviour even after a lag time of ten minutes.We thus argue that the reported η/η 0 may be lower bounds at large c/c * , which would render the transition even more dramatic.All this implies that, intriguingly, near the transition c/c * ≃ 0.1, systems prepared at similar concentrations may display extremely different rheology at large condensation times.To further support the existence of a qualitative change in the dynamics, we compute the values of viscosity obtained in MD simulations through the diffusion coefficient of the centre of mass of chains that have been ligated for long time at different initial concentrations (see red circles in Fig. 7d).One can appreciate that our simulations also suggest a qualitative difference in dynamics for c/c * ≥ 0.1, albeit the transition appears less dramatic than in experiments; we argue that this may be due to finite size effects present in MD simulations.

IV. CONCLUSION
We have studied a system of linear polymers undergoing irreversible condensation in competition with cyclisation.We have shown that the key adimensional parameter controlling growth kinetics is κ = 2κ 0 /(n 0 κ 1 ); naturally interpreted as the number of rings formed for any one dimerisation.At large concentrations (or κ < 1) dimerisation is kinetically favoured and drives the growth of linear chains.While growth disfavours cyclisation, it also reduces the number of available reactive ends and the annealing rate of the chains (see Eq. ( 8)), disfavouring further growth.Despite this, we discover that the net result of this kinetic competition is a runaway transition for κ < 1 if the cyclisation rate decays strongly enough with polymer length, i.e. with ν > α/5, with ν the metric exponent (typically 1/2 for random walks and 0.588 for self-avoiding walks) and α the dynamics exponent (typically 1 for Rouse and 2 for reptative dynamics).In these conditions, the fraction of monomers transformed into rings is finite, thus leaving the rest of the monomers available to form a permanently growing linear chain which then drive a runaway reaction.
We also discover that the runaway transition has deep consequences on the rheology, and triggers an exponential increase for κ < 1 (or c/c * > 0.1).Our results suggest that it may be possible to tune the final topological composition of ligated systems by judiciously choosing c/c * .For instance, the most likely regime to form large rings and ring-linear blends [51,52] is near the transition c/c * ≃ 0.1.Mixing polymer families with different reactive ends further enhances the designability as it introduces different c * for each family.Our results can be used to optimise the conditions for DNA engineering, e.g., transfection vectors [2] ought to be ligated at c/c * < 0.1 whereas synthetic chromosomes assemblies [57] at large c/c * .Finally, it may be possible to couple dissipative DNA breakage reactions [48,58,59] with ATP-consuming ligation to create dense solutions of self-sustained topologically active viscoelastic fluids which would be an interesting active fluid to investigate in the future.
To obtain the mean number of cyclisations of a generic l−mer we need to multiply this quantity by V n l , i.e., the volume fraction of filaments of length l.Equating this quantity to the expected number of rings formed in a time interval ∆t we obtain By equating the two expressions for ∆t, Eq. (B4) and Eq.(B7), we find Eq.(B1).We have thus proven that the latter is the correct expression of p, which gives the correct number of cyclisation and ligation events per unit time and unit volume, as required by the Smoluchowski equation.
Finally, we will prove below that the constants A and 1 − A introduced when calculating the waiting time increments are consistent with Eq. (B4) and Eq.(B7).To show this, it is sufficient to observe that the total time

FIG. 2 .
FIG. 2.Validation of cyclization and annealing rates.a. Number of rings Nrings as a function of time for systems initialised to N = 400 chains with l0 = 87 monomers each (5 independent replicas).The black line represents the numerical derivative of the average ⟨Nrings⟩ in the limit t → 0. b.The numerical derivative dNrings/dt displays a power law decay with initial polymer length and with exponent −2.6, close to the value −4ν = 2.4 for ν = 0.588 (as predicted, see text Sec.III A). c. Value of k1(L) obtained from fitting the analytical solution of Eq. (10b) neglecting the second term, in the limit t → 0. The exponent of the power law is close to the predicted value ν − α (see text Sec.III A) with ν = 0.588 and α = 1.

FIG. 3 .
FIG. 3. Results from MD simulations.a. Number fraction of linear (dotted line) and ring (solid line) polymers during the condensation process for different concentrations c (averaged across 40 independent replicas).b.Number averaged polymer length for the simulations in a as a function of time (symbols), and fitted (solid line) with a numerical solution of Eq. (5) (see Sec. III A for details).

FIG. 4 .
FIG. 4. Experiments of DNA condensation at low concentration.a. Time-resolved gel electrophoresis during ligation of a 2 ng/µl (c/c * = 0.01) solution of 6.5 kb linearised plasmid.The lanes marked with 'e' are treated with exonuclease to remove linear DNA; 'RM' indicates ring monomers, 'LD' linear dimers and 'LM' linear monomers.The term monomer here refers to a single DNA molecule.The last column is λ-HindIII marker as a reference of DNA fragment lengths.The numbers at the top represent the minutes for which the solution was incubated with ligase enzyme.b.Number fraction of polymers in ring and linear (monomer and dimer) topologies obtained from image analysis of the gel in a. c.Number averaged length calculated from 3 independent gels at 2 ng/µl (symbols) and associated fit (solid line) using Eq.(5) numerically solved as described in Sec.III A.

1 FIG. 5 .
FIG. 5.The dimensionless topological parameter.Dimensionless topological (also "cyclisation") parameter κ = 2κ0/(n0κ1) as a function of c/c * .This is obtained by fitting simulations and experimental curves ⟨l(t)⟩ with the numerical solution of the Smoluchowski equation with κ0 and κ1 as free parameters.There are no other free parameters in our model.The scaling κ ∼ (c/c * ) −1 is consistent with κ0 and κ1 being independent of concentration, and with this assumption breaking down near c/c * ≃ 1 where three-body interactions become important.

FIG. 6 .
FIG. 6. Runaway Transition.a. Snapshots of MD simulations with initially Nc = 200 chains: the blue polymers represent rings of any length, grey polymers linear chains.b.In the corresponding graph representations, each circle represents a single chain and its colour represents the number of chains connected to it: 0 (grey), 1 (cyan), or 2 (blue).For example: a linear monomer chain is grey, a ring monomer is blue, a linear dimer has two cyan nodes, and a dimer ring has two blue nodes.c.Fraction of monomers in the giant connected component of the system as a function of κ. d.Average polymer length (including both linear and ring) at a large time as a function of κ. e.Average mass of rings Mrings = Nrings × ⟨lrings(t ≫ 1)⟩ at large simulation time.Note that some simulations may not display any rings, thus bringing the average Mrings below 1. f.Average mass of linear chains at large simulation time as a function of κ.MC = "Monte Carlo"; MD="Molecular Dynamics".

FIG. 7 .
FIG. 7.Rheological consequences of the runaway transition.a. Time-resolved gel electrophoresis during ligation of c/c * = 0.01, 0.1, 0.25, 1 and 2.5 solution of 6.5 kb linearised plasmid.The lanes marked with 'e' are treated with exonuclease to remove linear DNA; 'RM' indicates ring monomers, 'LM' linear monomers and 'Multi' various lengths of linear structures.The numbers at the top represent the concentrations of DNA.b.Fraction of linear and ring molecules of any size as a function of overlap parameter c/c * .c. Mean squared displacements (MSDs) of 800 nm PVPcoated polystyrene beads diffusing in solutions of DNA.We compare the MSDs in solutions treated for 1 week either with 40U T4 ligase or buffer (control).d.Viscosity of the ligated solutions η, normalised by the viscosity of the control η0 vs η/η0 of the ligated solutions in simulations.The inset shows the same plot in a log-linear scale to highlight the exponential increase.