Understanding repulsively mediated superconductivity of correlated electrons via massively parallel DMRG

The so-called minimal models of unconventional superconductivity are lattice models of interacting electrons derived from materials in which electron pairing arises from purely repulsive interactions. Showing unambiguously that a minimal model actually can have a superconducting ground state remains a challenge at nonperturbative interactions. We make a significant step in this direction by computing ground states of the 2D \mbox{U-V} Hubbard model - the minimal model of the quasi-1D superconductors - by parallelized DMRG, which allows for systematic control of any bias and that is sign-problem-free. Using distributed-memory supercomputers and leveraging the advantages of the \mbox{U-V} model, we can treat unprecedented sizes of 2D strips and extrapolate their spin gap both to zero approximation error and the thermodynamic limit. We show that the behaviour of the spin gap is only compatible with a spin excitation spectrum that is either fully gapped or has zeros only in discrete points, ruling out a Fermi liquid ground state. Coupled with the enhancement to short-range correlations that we find exclusively in the $d_{xy}$ pairing-channel, this allows us to build a strong indirect case for the ground state of this model having superconducting order in the full 2D limit, and ruling out the other main possible phases, magnetic orders and Fermi liquids.


I. INTRODUCTION
Understanding the impact of interparticle repulsions in fermionic quantum systems remains one of the most challenging problems in physics. As was shown by Landau, electron-electron repulsions in a solid may often be treated by redefining excitations as quasiparticles. The resulting physics is thus essentially of free particles (the so-called Landau quasiparticles), with interactions mostly leading to a renormalization of some physical parameters such as the mass. This Fermi liquid (FL) approach has seen very broad success and allowed to understand the effects caused by perturbations on top of the Fermi liquid, such as instabilities towards, e.g., a Bardeen-Cooper-Schrieffer (BCS) superconducting (SC) state, a magnetically ordered (MO) state or a charge density wave (CDW) state [1].
There are however situations where the FL approach breaks down, leading to so-called "non-Fermi liquid" behavior. At high spatial dimensionality, this can happen if the interactions are particularly strong, or if the filling of the systems is commensurate with the lattice, with, e.g., one particle per site leading to a Mott (MI) state. Reduced dimensionality of the system can further enhance the effect of interactions. In one dimension in particular, the interactions always have drastic effects and lead to physical properties very different from the ones of a Fermi liquid. The FL is replaced by another set of universal features, called the Tomonaga-Luttinger liquid (TLL), where the elementary excitations are the collective excitations of charge and spin in the system. Such systems are critical and at zero temperature T = 0 possess various competing quasi-long-range orders, ranging from antiferromagnetism to superconductivity [2].
Non-FLs have thus been prime candidates to search for novel physical phases, in particular unconventionally superconducting (USC) phases mediated by excitations other than the electron-phonon one, of which the high-temperature (high-T c ) superconductors [3,4] are just one example. The observed proximity between USC and MO phases has made fluctuations around the MO state prominent candidates for such pairing mechanisms. On the experimental side, many systems ranging from heavy fermions [5,6], organic superconductors [7,8], high-T c superconductors, and pnictides [9][10][11] show USC phases. Yet, obtaining any widely accepted theory on the nature of such exotic pairing for any of these materialgroups has proven to be difficult; even the simplest physical models, the so-called minimal models, which are abstracted from the materials' full physical structure are fundamentally hard to solve: the 2D U-V model at half filling (organics), and the weakly doped 2D Hubbard model (cuprates). So far, any solution of these minimal models has almost always necessitated additional technical approximations that introduce errors of unknown magnitude. It is thus difficult to determine whether phases predicted for such models in the framework of a given approximation scheme are truly present or an artefact of the approximation, or whether to explain the experiments, different, more extended models are required, such as, e.g., in the three-band models of the high-T c cuprates [12], or in those approaches that investigate the role of phonons in USC materials [13].
Even when moving to the more tractable regime of weak repulsion, SC and MO instabilities interfere at all orders of the perturbative renormalization group (RG) treatments applied so far [14][15][16], necessitating complex numerical methods [17] or sophisticated, and still approximate, functional RG procedures [18][19][20][21][22]. More uncertain yet is the situation in the most important regime, actually relevant to the models' underlying materials: when repulsion becomes large and competes with kinetic energy. For this regime, numerical methods are a route of choice. However, in two or more spatial dimensions these too are faced with steep challenges. Quantum Monte Carlo (QMC) treatments suffer from the so-called sign problem for fermions leading to an exponential degradation of the signal to noise ratio as, e.g., the temperature is lowered [23]. And while innovative techniques like diagrammatic QMC have delivered intriguing insights into USC pairing for the 2D Hubbard model, these are still constrained to intermediate interactions at most and doping very far from unity [17].
Our numerical approach here is different, and rests on two observations. (1) The fact that for the density matrix renormalization group (DMRG) technique, and other tensornetwork methods such as PEPS, any possible bias can be controlled against via extrapolation, and that these methods are free of the sign problem [24,25]. (2) That the one minimal 2D model in which a USC ground state from strong electron repulsions can unambiguously be shown to exist, is comprised of quasi-1D electrons (doped 2-leg Hubbard ladders) weakly coupled in parallel [26,27]. This minimal model however, as yet, does not correspond to any extant material.
These two observations have led us to conclude that the minimal 2D U-V model of the organic superconductorswhich has an analogous structure, of many 1D systems coupled weakly in parallel to each other-offers an unique opportunity for insight on the strongly interacting ground state of a candidate model for USC pairing. This would be done by swapping out the perturbative theory used on the Hubbard ladders in the low-energy field-theory limit, which would not be applicable to the U-V model setting, with quantitatively reliable DMRG.
Yet, using DMRG in this particular setting comes with issues to resolve first. While DMRG delivers highly accurate results both for the statics and the dynamical correlations in 1D systems while using only modest computational resources, the 2D setting of the U-V model is different. On such lattices, DMRG is known tho require resources that increase quasiexponentially with lattice width when trying to maintain any preset accuracy [25]. Tractable sizes for systems of itinerant electrons so far usually have been on the order of about 100 sites, depending on the specific problem [23,[28][29][30][31][32]-pushing towards around 200 sites, while done, is generally challenging and goes along with a decrease in accuracy. As the sum of the evidence indicates that states with SC order are in close energetic competition with MO ones, even changes in geometry (i.e., boundary effects) could easily result in different orders winning out for lattices of such size. As a result, no unbiased approach has so far been able to show unambiguously that a ground state of either of the two minimal models may exhibit USC order when repulsion is nonperturbatively large.
In the present work, we take a significant step in that direction. We build on our development of the numerical parallelized density matrix renormalization group (pDMRG), a distributed-memory version of one of the standard DMRG formulations [24] capable of exploiting modern supercomputer architectures. With this method we investigate the properties of the 2D U-V Hamiltonian at half filling of the bands. We consider an array of parallel chains of strongly correlated 1D electrons, each described by such a U-V Hamiltonian and weakly coupled to each other by transverse tunneling. In addition to being the model Hamiltonian for organic superconductors [33], and thus potentially containing the unconventionally superconducting phase observed in this system, this Hamiltonian has two critical advantages for numerical study compared to the doped 2D Hubbard model [23]: (1) it is trivial to maintain fixed ratios of electrons to lattice sites and thus to extrapolate to infinite system size here, since no doping is required. The pDMRG then allows doing this with the accuracy required to extrapolate energies of large systems, such as to enable the reliable calculation of energy differences, as appearing in, e.g., the most important observable we use, the spin gap. Underlying this accuracy is that pDMRG can exploit the strong anisotropy of electron tunneling in the 2D U-V model by spreading out a single ground-state calculation across many supercomputer nodes. This anisotropy ties directly into the U-V models' second advantage: (2) the strip or cylinder geometry, that DMRG is especially good at handling intrinsically, suits the U-V model naturally if aligned with the strongtunneling direction, as correlations across the strips' width will be naturally weak due to the small perpendicular tunneling, which is not the case for the doped 2D Hubbard model.
We show that in the thermodynamic limit this systems' averaged spin gap s is incompatible with either a FL or an MO ground state, for strips of finite width. The behavior of s as L → ∞ appears most compatible with s > 0 at infinite length, and being either nondecaying or even increasing with the strips' width, i.e., pointing towards a fully gapped spin excitation spectrum. But we also consider the alternate possibility that s might ultimately scale to zero beyond the system sizes we can access. However, we show this alternative is still at most compatible with a spin excitation spectrum with isolated zero-energy points, i.e., in line with weak-coupling theories of USC systems. However, either scenario is incompatible with either antiferromagnetic order or Fermi liquid-like ground state. Together with the strong enhancement exclusive to the d xy channel of correlations that we find with increasing strip width, our results make the case for SC singlet pairing with d xy order being the dominant component in the ground state of the U-V model at half filling when repulsion is competitive with kinetic energy.
The structure of the paper is as follows. In Sec. II, we outline the U-V model and our approach to analyzing the system and choosing the model parameters; in Sec. III, we describe the key features of pDMRG relevant to this work; in Sec. IV, we detail our scaling procedure; in Sec. V, we analyze the behavior of the spin gap and the correlation functions 1. (a) Overview of the 2D U-V model. Conduction electrons tunnel along effective 1D chains with amplitude t, where electrons are subject to strong on-site repulsion U and a weaker nearestneighbor repulsion V . Interchain tunneling t ⊥ is much weaker than t. (b) Schematic parameter space of the 2D U-V model at half filling, for fixed U = 4t. The only limit in which all possible ground states and the transition between them is understood is the 1D limit, i.e., at t ⊥ = 0, when the chains decouple, and are either in a TLL or MI phase, depending on whether V < V c or V > V c . The red dots show the parameters at which we work in this present paper.
as the width of the 2D U-V strips is increased, making the case for a SC ground state with d xy order and against any MO or FL competitor order; in Sec. VI, we discuss these results and how they connect to experiment; in Appendix A, we provide details for the used scaling procedure; and finally, in Appendix B, we provide a broader technical overview on pDMRG.

II. MODEL AND PHYSICAL OBSERVABLES
As depicted in Fig. 1(a), the U-V model in 2D is formed by a parallel array of N ch 1D chains of lattice electrons, each of length L, with interchain tunneling. Its lattice Hamiltonian iŝ Here,ĉ † i,n,σ (ĉ i,n,σ ) denotes the electron creation (annihilation) operator on site i of chain n of spin σ =↑, ↓, andn i,n,σ := c † i,n,σĉ i,n,σ ,n i,n :=n i,n,↑ +n i,n,↓ denote local electron density for spin σ and total electron density operators respectively. The tunneling amplitude along the chains is given by t, while in-between adjacent chains it is given by t ⊥ , and generally t t ⊥ . We point out that the t ⊥ term above obeys open boundary conditions, just as the t term does. We have made this choice because obtaining ground-state energies accurate enough to compute reliable differences between them is an indispensable requirement to extrapolate this systems spin gaps in the limit L → ∞. Cylindrical 2D lattices however are well known to immediately suffer from lower accuracy, and are thus avoided here. Finally, the onsite Coulomb repulsion is given by U , while intrachain nearest-neighbor repulsion is V , with V < U/2. The entire system is at half filling, i.e., N ↑ = N ↓ = LN ch /4, and thus k ↑ , where a is the lattice spacing.
This minimal model originally arose from the study of the organic Bechgaard and Fabre salts ("the organics"), the first materials discovered to support an USC phase [7,34]. These compounds show a very rich phase diagram, and especially a change in effective dimensionality with temperature-1D TLL-like at high temperature, then 2D-like and finally 3Dlike at near zero-as quantum coherence can increasingly be established along the three orthogonal and successively weaker directions for electron tunneling. Their most striking feature is the phase transition from a magnetically ordered to a unconventionally superconducting phase at low temperature, as the proximity, and thus the interchain tunneling t ⊥ , between the strong-coupling chains of the U-V model is increased [35]. To understand this competition, and whether the minimal U-V model (1) of the organics can actually capture it at least in a weak-interaction scenario (which does not correspond to the actual materials), perturbative RG has been repeatedly applied [18][19][20][21]. While beset with the same technical limitations this approach has for the doped 2D Hubbard-model of the cuprates (cf. Sec. I), it does suggest that the minimal 2D U-V model, might indeed support a USC phase transition from a MO phase when extended with, e.g., a next-nearest-neighbor tunneling perpendicular to the chains, and that pairing would be in the d x 2 −y 2 -channel (gapless superconductivity would be compatible with experiments on the organics [36,37]). However, so far there has been no complementary method to either validate these results, and especially no numerics that could deal with the U-V model (1) for the experimentally relevant case of strong interactions.
In order to be able to address the case t ⊥ = 0, N ch > 2 for this Hamiltonian with reliable numerics, we have developed pDMRG to exploit the distributed-memory architectures of modern supercomputers. Based on the tensor-network formulation widely used in modern DMRG, the code has major advantages over other quantitative numerical approaches for this problem: it inherits the lack of sign-problem and the ability to use extrapolation in the known error to filter out possible implicit bias towards any particular order (which is inherent, e.g., to any wave-function Ansatz of variational quantum Monte Carlo) from conventional DMRG. And further, as described in Sec. III and Appendix B, the parallelized nature of pDMRG makes the necessary ground-state calculations tractable in the first place, by distributing the many nonlocal tunneling terms of the model across different nodes of the supercomputer. As laid out in Sec. III, the large number of such terms becomes unavoidable when keeping the amount of bipartite entanglement at manageable levels, which is performance-critical for any 2D model.
As detailed in Sec. II A, the spin gap is the most important observable for our study of this system, as its scaling as L → ∞ allows the crucial elimination of possibilities for ground-state ordering in the thermodynamic limit that has so far proven elusive for any minimal model of USC systems. This is because ground states with either MO order or in a conventional FL state will show a linear vanishing of any spin 075138-3 gap. In contrast, standard analytical theories of USC systems combine weak-coupling RG, for identifying the leading instability and its symmetry, with mean-field descriptions of the ordered phase. For these theories, the unavoidable zero nodes of the order parameter result in sublinear scaling to zero for a spin gap [36,[38][39][40][41][42], while experiments on the actually strongly coupled regime in USC materials have found finite spin gaps (cf. Sec. VI). It is therefore vital that the known maximal value of the approximation error of the DMRG technique allows us to extrapolate the computed spin gaps to zero truncation error for any given lattice size, as described in Sec. IV B.
To outline our strategy of analysis, we now discuss in Sec. II A the concrete ground-state observables we considered, providing the road map for ruling out the main competitor states to any superconducting ground states, the FL and the MO states. In Sec. II B, we then discuss our choice of parameters for Hamiltonian (1), and why these should reduce the competition with charge-ordered ground states.

A. Observables
One of our key priorities is to establish whether any set of parameters for Hamiltonian (1) could or could not result in one of the main competitor orders to a USC ground state in the 2D limit, a MO or FL state, being realized. One quantity that would allow making these distinctions is the spin susceptibility at zero temperature where E L,N ch GS denotes the ground-state energy of (1) at half filling, and S z = N ↑ − N ↓ is the difference between spin populations. Here, s can be any spin number, as long as it is negligible compared to a macroscopic number of spins (proportional to L). Typically, s is chosen so that other quantum numbers in the system can be kept identical between s and s = 0. The usual choice is s = 1, corresponding to the spin flip of a single spin. However, as discussed in Sec. IV A, the choice s = N ch has a crucial technical advantage for extrapolating to infinite length L → ∞.
When L → ∞, χ s (L, N ch ) becomes the thermodynamic spin susceptibility. In order to get a finite susceptibility, as expected for a FL or a MO state, one would thus need the quantity to scale as 1/L. Other scalings of this quantity thus directly signal a non-FL, non-MO behavior for the spin susceptibility. The simplest case is that the averaged spin gap remains finite when L → ∞, where s [L, N ch ] is the generalization of the standard spin gap Then the spin susceptibility would scale as (L s [L → ∞, N ch ]) −1 , leading to zero susceptibility in the thermodynamic limit, as expected for fully gapped systems. For SC systems with unconventional electron pairing, such fully gapped spin excitations have been actually been found experimentally (see Sec. VI).
More complex behavior can be expected if (3) has a different scaling with the size of the system such as, e.g., 1/L α with 0 < α < 1. The thermodynamic spin susceptibility remains zero in this case, yet the systems' spin excitations are not fully gapped. In mean-field theories of USC systems, all this stems from the line or point nodes of the SC gap [36,[38][39][40][41][42]. At finite temperatures, the scaling with L would be converted into a nontrivial temperature dependence of the spin susceptibility corresponding to a pseudogap behavior [43,44].
Establishing the L dependence of s [L → ∞, N ch ] through our numerical results therefore allows to differentiate between three possibilities for the ground state of the twodimensional system in the T = 0 limit: FL/MO, fully spin gapped USC, or mostly spin-gapped USC-to the extent that we can extrapolate trends as N ch is increased.
The use of pDMRG allows meaningful extrapolation of s [L → ∞, N ch ] in 1/L for the first time, enabling the calculation of ground states of multiple coupled, long U-V chains (N ch 8, L 64) with quantitatively accurate numerics (cf. Sec. III) [45]. As discussed in Sec. V A, this allows us to make a strong case that for our choice of Hamiltonian parameters any magnetic order or Fermi liquid behavior is suppressed in the ground state of the 2D U-V model, due to the way s [L, N ch ] behaves with L → ∞, and showing this bevior to be robust as N ch grows.
Our point of reference for this are 2D arrays of coupled spin-chains, which demonstrate the opposite scenario, of the spin gap decaying with N ch . If infinitely many 1D Heisenbergchains were to be coupled in parallel, the result is a gapless 2D Heisenberg model-but when only two such chains are coupled this system exhibits a finite spin gap [46,47]. Multiple techniques have since shown how these two extremes are bridged: as the number of coupled Heisenberg chains increases, the spin gap remains finite (for an even numbers of chains) but continuously decreases as N ch grows, scaling to zero in the limit of infinitely many chains [48,49].
The information gained from spin gaps can be supplemented by studying the change in the short-range behavior of certain correlation functions as N ch is increased. When magnetic orders can be ruled out, the functions of interest here are those of the two other main competitor groups: SC ordering on the one hand-with the possibility of pairing in the d x 2 −y 2 -, d xy -, and extended s-wave channels-and CDW ordering on the other hand. The corresponding correlation functions considered for this are Green shaded area: single chain enters MI regime. Red shaded area: two coupled chains can enter MI regime [50]. To be certain we are outside the regime where this counter-intuitive increase in parameter space for the MI insulator takes place in the few-chain regime, we work within the blue-shaded area of parameter space. For reference, the yellow shaded area shows the parameter regime realized in the Bechgaard and Fabre salts. Black lines show lines of constant K ρ for the single U-V chain (the data and figure reprinted from Ref. [51]).
whereD denotes the nearest-neighbor spin-singlet operatorŝ D i,n, j,m :=ĉ i,n,↑ĉi+ j,n+m,↓ −ĉ i,n,↓ĉi+ j,n+m,↑ . As illustrated in Fig The basic results serving as background to any choice of Hamiltonian parameters are summarized in Fig. 1(b). In the limit t ⊥ = 0, the chains decouple and are described by TLL theory for sufficiently small V . As long as U is fixed to any value 2t, there exists a value V c above which a charge gap opens in the system, turning it into a Mott insulator (MI) [51,52]. The phase boundary at t ⊥ = 0 is shown as a green line in Fig. 2. Now, in order to maximize the chance of finding SC order, we attempt to target a parameter regime in which the decoupled chains would be in the TLL phase at the same time as the coupled chains (t ⊥ > 0) would remain in a delocalized regime. Ultimately, only the computational results can indicate whether one succeeds with the latter condition. However, given the limited supercomputing time at hand, we try to maximize our chances for this to happen from the outset, as described within the rest of this section.
A look at the minimal U-V model of the organic superconductors (Bechgaard and Fabre salts) is instructive here. Experimental probes of these show that here the 1D chains of the U-V model have a Luttinger-liquid parameter K ρ (which characterizes interactions and correlations of the charge mode of an isolated chain) in the vicinity of 0.25 [33]. When K ρ = 0.25, theory predicts the single chain at half filling to develop a charge gap and become a MI. In terms of the microscopic parameters, this vicinity to K ρ = 0.25 for the single chain would correspond to U/t = 10 and V/t ≈ 2-3, cf. yellow area in Fig. 2. When the ratio t ⊥ /t is small enough, experiments show the physics of the single chain extending to the whole material, i.e., the system is an MI as well, and magnetically ordered on top of it. Only as t ⊥ /t grows and t ⊥ becomes competitive with the Mott gap of the single chain is superconductivity found in experiments, typically at t ⊥ /t ≈ 0.1.
It may be tempting to try to straightforwardly replicating this setting for our calculations. However, there has recently been new insight on the two-chain limit by two co-authors of the present work [50], which cautions against such an approach. Combining analytical RG treatment of the bosonized chains with DMRG, we find that finite t ⊥ for two chains increases the U-V parameter space in which the system develops a Mott gap (red line in Fig. 2). While early exploratory work indicates that this parameter space might shrink again for N ch = 4, it is beyond the scope of this work or of Ref. [50] for definite statements how exactly the few-chain regime evolves in this regard.
In the following, we therefore focus on a regime in which these complications of few-chain physics are unlikely to apply, at U/t = 4 and V/t = 0.5, 1, and with t ⊥ /t = 0.1 (blue shaded region in Fig. 2). This regime realizes the essential physics present in the superconductivity of the organics, i.e., a regime where interchain tunneling is not effectively suppressed by a single-chain gap. A priori, this regime thus appears to offer the highest chance of finding a superconducting ground state due to repulsively mediated pairing. We have targeted two values of V/t to be certain our results will be representative for a finite area of parameter space.

III. THE METHOD: PARALLEL DMRG
As stated in Sec. II, exploiting the physical structure of the 2D U-V model is essential for being able to handle the bipartite entanglement of this system when mapping the physical lattice to the DMRG chain. Being comprised of many strongly correlated 1D chains in parallel with weak coupling in-between, it would seem intuitive that performing the mapping along the direction of strong tunneling [cf. Fig. 3(a)] delivers lower overall bipartite entanglement in the 1D DMRG chain. This argument is complemented by theoretically estimating the performance of the alternate mapping along the t ⊥ direction that is used conventionally when 075138-5 applying DMRG to what is effectively a long and narrow 2D strip. The bond dimension, denoted as χ in the following, is the central quantity that controls the accuracy of DMRG. It denotes the number of optimally chosen basis states in which any bipartition of the system is represented. If χ chain was required to represent a single U-V chain of a given length with truncation error , achieving the same accuracy for N ch chains would require χ = χ N ch chain -even in case χ chain = 10, this scaling would be hopeless beyond N ch = 2. A simple test comparing the two mappings for a representative system of free fermions, summarized in Fig. 3(b), confirms the total disparity between the performance of these mappings. Shown is the energy versus the number of DMRG iterations for L = 20, N ch = 5, N ↑ = N ↓ = 50, t ⊥ /t = 0.2, U = V = 0, χ = 5000, once with mapping along t (blue line), once with mapping along t ⊥ (green line). The t-mapping converges quickly to the exact ED solution, while the t ⊥ mapping flounders even for this small system.
The price paid for the t-direction mapping, necessary to have manageable χ in the first place, is the very large number of long-range tunneling terms. In the two-site DMRG, we use, for every adjacent pair of sites (i, i + 1) on the DMRG chain for which the tensors are jointly optimized, the Hamiltonian is represented as a matrix product operator (MPO) [24]: Here, each sum runs from 1 to D, the MPO bond-dimension, which for the t-direction mapping scales as D ≈ 4L. For the range of system lengths we treat, L = 20 to 64, this amounts to D ≈ 80 to 256, implying an equal number of Hamiltonian . These describe the action of the Hamiltonian to the left (right) of the sites i, i + 1, is a purely local operator on site i (i + 1). Combined with the fact that we utilize χ = 10 000 to 18 000, this large number of Hamiltonian contributions result in memory requirements on the order of several TB for any single ground-state calculation and a commensurate amount of computational effort.
Parallelized DMRG was developed to make calculations of such magnitude tractable [53]. We provide a more comprehensive overview of its technical features in Appendix B, and highlight here its first parallelization layer, which is the most relevant to this work, shown schematically in Fig. 3(c): the largest objects of DMRG are the boundary terms. These are the expressions ofĤ left [b l ],Ĥ right [b r ] in the χ optimal basis states of the system to the left or right of sites i, i + 1, respectively. Denoting these basis states as |α l , |α r respectively, the left boundary is , and the right boundary R is defined analogously usinĝ H right [b r ] and |α r . The parallelization layer distributes each of the D elements of both these vectors (each element being a block-sparse matrix due to the use of conserved quantum numbers to enhance performance), to nodes of a parallel supercomputer. For this project, the computer cluster was the "Piz Daint" system of the Swiss National Supercomputing Center (CSCS). In this manner, we have spread the calculations out to up to many dozens of nodes (for the largest lattices).

IV. SCALING ANALYSIS
As discussed in Sec. II, our indirect strategy for arguing that a ground state of the 2D U-V model has SC order, by eliminating competitor phases, relies on obtaining the correct behavior of the systems spin gap as L → ∞. In the following, we discuss the two necessary steps required in service of this goal: sharply reducing the oscillations of the spin gap as 1/L decreases, and extrapolating ground states energy to zero error in the DMRG approximation.

A. Removing oscillations of s [L]
The standard definition of the spin gap, Eq. (5), is based on the difference between two ground-state energies. However, due to the small value of t ⊥ /t in the regime of interest, one encounters oscillations of s [L] with L, which presents a complication for obtaining clean extrapolation of s [L] in L, as shown in Fig. 4(a). That weak interchain coupling is evidently the cause is demonstrated by studying the U-V Based on this analysis, clearly one should smooth this finite-size effect by using a more general definition of the spin gap. The natural generalization, given by Eq. (4), is the one that averages over all the N ch bands close to the Fermi level in the noninteracting limit, and that applies unchanged to the interacting regime. As depicted in Figs. 4(a) and 4(b), we find this definition removes the oscillatory behavior to a large degree for all our data. As discussed in Appendix A, this definition of the averaged spin gap results in a linear polynomial in 1/L as the only consistent method for extrapolating s [L, N ch ] to L −1 = 0; linear extrapolation can recover physically expected behavior correctly, while the oscillatory remnant overwhelms any vestiges of quadratic dependency in 1/L that might remain after the averaging.
We further note that this procedure is similar to adding two particles instead of one when computing the compressibility of spinless particles, to avoid the unwanted oscillations pro-voked by the breaking of k → −k symmetry that adding a single particle would entail. Analogously, one usually adds four particles to stay in the sector of total spin zero for spinful systems.

B. Extrapolating in the DMRG truncation error
In DMRG, one attempts to compute the ground-state wave function |ψ GS of a lattice Hamiltonian, by mapping the physical lattice to a 1D chain. At every bipartition of this DMRG chain, the algorithm achieves the required reduction of the exponentially scaling full Hilbert space by approximating |ψ GS through two sets of χ optimally chosen basis states, yielding a wave function |ψ χ GS . A given χ corresponds to a particular truncation error := δψ|δψ , |δψ := |ψ GS − |ψ χ GS , where is known in principle in DMRG.
The controlled accuracy and known error enables local quantities, and thus energies, to be extrapolated to zero error [54]. Specifically for energies, it is known that the approximation error in the energy, δE , is ∝ . As we obtain s [L, N ch ] from the difference between the S z = N ch and S z = 0 groundstate energies, wherever we have at least two ground states with different χ we use this technique to extrapolate the energy to zero before computing s . In this, we always employ max , the maximal truncation error committed in the final sweep as a proxy for . For V/t = 1 and N ch = 4, we have obtained multiple energies, by computing ground states for χ = 10 000 and 18 000 independently, then obtained χ = 12 000, 14 000, and 16 000 using the χ = 18 000 ground state as an initial state and applying two complete sweeps at the lower χ . An example is of this is shown in Fig. 4(c). As illustrated in Fig. 4(d), the change in s [L → ∞, N ch ] relative to values at finite χ is typically small anyway, on the orders of a few percent at most.
Outside these parameters, we have aimed to produce both χ = 10 000 and 18 000 ground states in independent simulations wherever possible, but particularly for V/t = 0.5 available computing resources proved ultimately insufficient. Thus, many results for this parameter are based partly or fully on a single χ = 10 000 ground state.
In order to obtain any particular value of s [L, N ch ] as the basis for an extrapolation to L → ∞ at fixed N ch , we therefore pursue two separate protocols.
Straight extrapolation. Here, we extrapolate both S z = 0 and S z = N ch ground-state energies to zero max when possible. If scaling is only possible in one of the spin sectors, we form s from the two lowest-energy ground states with comparable .
Extrapolation plus estimates. Whenever a ground state is available at χ = 10 000 exclusively, we estimate the correction heuristically. We have done this based on the following observations, made on states where extrapolations were possible: (i) the fractions by which energies further decrease upon extrapolation seem to roughly double going from one L to the next, (ii) they also seem to, roughly double with every increase of N ch , and (iii) for the same parameters, fractions seem to be about 1.3 larger in the S z = N ch sector compared to the S z = 0 sector. We then assume an uncertainty of 25% in these estimated correction fractions.  Table III of Appendix A.

V. EXTRAPOLATED SPIN GAPS AND INFERRING SC ORDER FROM CORRELATION FUNCTIONS
The exact behavior of s [L, N ch ] as 1/L scales to zero is critical to our approach. In Sec. V A, we take the data as they appear to be, linearly increasing in 1/L, with only weak remnants of the oscillatory behavior that s was defined to eliminate (cf. Appendix A for details). It then follows that the spin gap is finite for the infinitely long strips, and nondecaying or even increasing in N ch . In Sec. V B, we consider the possibility that what appear to be remnants of oscillations is actually the onset of a scaling of s [L, N ch ] to zero as 1/L decreases. In Sec. V C, we then analyze the corrollary information that correlation functions offer.  Table I and shown in Figs. 6(a) and 6(b).

A. The spin gap as a function of N ch
Using the straight extrapolation protocol, there seems to be little room for doubt that s [N ch ] is at least a nondecaying function beyond N ch = 2 for V/t = 0.5, while it is unambiguously a monotonically increasing function at V/t = 1. At this latter value of the nearest-neighbor repulsion, the extrapolation plus estimates protocol is benefitting from relatively small uncertainty, as here we have a greater collection of states that we could extrapolate to max = 0, and thus heuristic extrapolations were only necessary for a few states. With these smaller uncertainties, this second protocol supports the straight extrapolation protocol. While the nondecaying nature of s [N ch ] beyond N ch = 6 cannot be guaranteed, a scenario in which the spin gap suddenly reverses at larger N ch and starts decaying towards zero seems highly unlikely and we are not aware of any mechanism that would support such a reversal. We find further evidence for this view with the significant enhancement of short-range d xy correlations as N ch increases, which we discuss further in Sec. V C.
For V/t = 0.5, the situation is possibly more uncertain when comparing the s [L → ∞, N ch ] resulting from the two different extrapolation protocols. The deviations are TABLE I. Summary of spin gaps at L → ∞ for different N ch , assuming the 1/L scaling of s [L, N] we appear to observe continues to hold beyond the strip lengths we computed. "Extr. + estimates" is based on estimating ground-state energies at zero truncation error ( = 0) for cases where ground states cannot be systematically extrapolated to = 0, because converged simulations were only available for a single χ value (see text for details).  noteworthy for N ch = 6, but it is very easily possible for our heuristic model of the corrections (described in Sec. IV B) to be just off in this case. But we will observe that neither protocol is compatible with a steady decrease of the spin gap with increasing N ch . And, as for V/t = 1, a significant and consistent enhancement of short-range d xy correlations at N ch > 2 (see below) further supports this reading of the spin gap behavior We thus can summarize that for V/t = 1 the extended U-V model is highly likely to have a finite spin gap for N ch → ∞, while for V/t = 0.5 this likelihood, while reduced, remains high.

B. Testing the alternative: could s [L → ∞, N ch ] scale to zero with N ch ?
As discussed, the results for s [L, N ch ] summarized in Figs. 5(a) and 5(b) appear to be most congruent with a linear scaling in 1/L to a finite intersect at 1/L = 0, with weak oscillations surviving the averaging inherent to definition (4) (cf. also Appendix A).
Returning to the discussion of Sec II A, the main follow-up question then must be: could there be alternate fits that make physical sense, associated with either nodal superconductivity, or with a Fermi liquid, or a magnetically ordered state? The first possibility would be marked by s [L, N ch ] scaling to zero as A/L α , α < 1 in the large-N ch regime, while for the latter two possibilities s [L, N ch ] should also scale to zero in the large-N ch regime, but as B/L + C/L 2 . We allow for a quadratic correction term to the ideal linear vanishing of s [L, N ch ] expected for a FL or MO state (cf. Sec. II A) because our results from Sec. V A show that a purely linear scaling to zero is impossible given our data.
Analyzing our data under these alternative scenarios requires recognizing that at N ch = 2 we start in a regime where we are guaranteed a finite spin gap [2] (cf. also Appendix A). That makes it extremely likely that some finite spin gap could survive for at least some larger N ch -our linear-polynomial Ansatz of the previous sections yields just that.
For this reason, one cannot just blindly fit the data in Table III with these two alternate scaling Ansätze; these Ansätze will always "fit" any data like in those tables, appearing linear within the fitting region via parameter tuning, then bending down to zero outside of it. They easily "fit" even the N ch = 2 data, a nonsensical result considering these systems are known to have a finite spin gap. Thus, the only meaningful question is whether or not these fits converge on a physically consistent scenario as 1/N ch decreases. For this purpose, we plot the fitting parameters of both Ansätze to the raw data of Table III  The linear-polynomial Ansatz was already treated in Sec. V A, yielding a finite spin gap. There is a motivated alternative however. A near-linear Ansatz A[N ch ]/L α[N ch ] with α<1, that scales to zero gap would be suggested by earlier theory in contradiction to our result. These prior works were based on analytical perturbative RG and mean-field theory; for our 2D U-V model here, a d x 2 −y 2 order has been predicted [18][19][20][21]. Any order other than isotropic s-wave results in nodal lines intersecting with the Fermi surface (cf. Fig. 9). The intersects result in gap nodes, isolated points in the Brillouin zone where the SC gap closes. At these nodes zero-energy quasiparticles may thus be produced, according to the mean-field treatment of these systems, and these theories thus predict an ungapped spin excitation spectrum. This nodal structure results in the 075138-9 spin susceptibility, Eq. (2), scaling as 1/L α , with α < 1 [36,[38][39][40][41][42].
The results of fitting A[N ch ]/L α[N ch ] to our spin-gap data, depicted in Figs. 7(c) and 7(d), then show a difference between V/t = 0.5 and V/t = 1 as N ch increases. In the latter case, both coefficients saturate quickly to nearly stable values. FIG. 9. Fermi surface of the noninteracting lattice electrons for the 2D U-V model (solid black lines) and possible symmetry of the order parameter, shown as a generic mixture between d xy and g xy(x 2 −y 2 ) (see text for details). This symmetry minimizes, but does not eliminate intersections of nodal lines and Fermi surface (red circles), where the gap of USC order would exhibit zero nodes in standard mean-field treatments. From this, these theories predict a nongapped spectrum for spin excitation, i.e., zero spin gap.
For α[N ch ], this is crucially well below 1. A ground state with nodal USC in the 2D regime could not be completely ruled out on this basis. To what extent this is an artefact of our data being confined to smaller L values at larger N ch must remain as a key question to be answered in future work.
For V/t = 0.5 on the other hand, at N ch 4, we cannot find saturation of the growing α[N ch ] within the accessible N ch . Taken together with the linear polynomial fitting done in Sec. V A and the behavior of C[N ch ] in the FL/MOscaling Ansatz above [ Fig. 7(b)], it may not be possible to rule out that the V/t = 0.5 data could move towards a fully linear scaling to zero in the large N ch limit. Three conjoined caveats apply however: firstly, N ch , is limited to 8 here, it is clearly possible that α[N ch ] stabilizes below 1 at larger N ch . Secondly, the behavior of α[N ch ] with N ch could easily be an artefact of the oscillatory remnant in 1/L that s [L, N ch ] retains despite the averaging, analogous to the discussion in Appendix A for the case of a full second-order polynomial fit to the raw data. Thirdly, the results of the straight extrapolation protocol for the spin gap at L → ∞ [ Table I, Fig. 6(a)] would point in the opposite direction.
We thus conclude that our data most strongly support a fully gapped spin spectrum. As laid out in Appendix A, the averaged spin gap s [L, N ch ] appears to result in a linear structure in 1/L with a superimposed weak oscillatory remnant, of unknown analytical form, with its amplitude slowly decreasing with 1/L. It is those weak downwards oscillations that the two alternate scaling Ansätze we now studied are, erroneously, susceptible to. A conservative reading of our results would still mandate that at V/t = 1, we cannot completely rule out a spin excitation spectrum with nodal zeros, which would be compatible with the nodal USC systems described by mean-field theories. At the same time, within the confines of the N ch values that we have treated in this work, there is no consistent scenario for a FL or MO state at V/t = 1 due to the apparent instability of the quadratic correction as shown in Fig. 7(b).
For V/t = 0.5 the situation is less clear. The upward growth of α[N ch ] parametrizing the fit to a nodal USC regime, combined with the instability of the quadratic correction term C[N ch ] in the fit to the FL/MO regime opens up the possibility that there may not be enough data to reliably exclude any of the three regimes for this parameter, when viewed in conjunction with the outcome of the extrapolation plus estimates protocol for the linear-polynomial extrapolation [ Fig. 6(a) and Table I]. However, viewed together with the enhancement exclusive to the d xy correlation channel discussed in the next section, it turns out that either the nodal USC scenario or the fully spin-gapped one should be favored for V/t = 0.5 as well.

C. Correlation functions: pairing in the d xy channel as likely candidate order in the 2D limit
From the ground states, any desired correlation function between site r 1 and site r 2 in the lattice can in principle be computed. With the form factor of the lattices we consider, strips of length L and width N ch , where L N ch , generally a crossover of behaviors is to be expected with growing r := |r 1 − r 2 |. As N ch increases, at short distances the physics will become increasingly dominated by that of the 2D limit 075138-10 TABLE II. Decay of correlations at short range, r ≈ r s = 4a, in the effective 2D regime [cf. main text and example shown in Fig. 8(a)]. The strong enhancement in the d xy channel compared to all other channels for N ch 4 is highlighted in red. (N ch → ∞). Conversely, at sufficiently large distances, any correlation function will exhibit the typical behavior of a 1D system, i.e., it will decay algebraically as a function of r-as long as no discrete symmetry breaks spontaneously [55]. This behavior is generic to the ground state of any 1D quantum system, even one of finite width [2]. The length scale at which the cross-over between the two regimes takes place will increase as N ch is raised, with the 1D regime starting at larger and larger distances, until it disappears completely when N ch → ∞.
Even with pDMRG we cannot access the full 2D regime for the time being, but we can get new and valuable insight into how the various potential channels for ordering of the U-V model in the 2D limit respond to the systematic increase of N ch . As we already evaluate the possibility of magnetic order or a Fermi liquid based on the scaling of s [L, N ch ], we limit ourselves to the main competitor orders, the SC d x 2 −y 2 (r), d xy (r), s(r), as well as the charge-density-wave channel. We compute the correlation functions (6)-(8) connected to each of these, as defined in Sec. II A.
We then specifically study the dependency of the shortrange behavior, r r s := π/k F = 4a of the above correlation functions on N ch . An example that is highly representative of the general behavior for N ch 4 is shown in Fig. 8: when each correlation function is normalized to its value at r = 1, the first maximum within the short-distance quasi-2D regime decays strongly in all channels except in the d xy channel. Table II, which summarizes the short-range decay for all studied N ch values, confirms this short-range enhancement as exclusive to the d xy channel. As is to be expected, once r increases beyond the short-range regime, d xy (r), like all the other correlation functions, behaves according to 1D physics, i.e., it decays algebraically.
We do not take the above short-range behavior of these systems at N ch 8 to constitute definite proof of d xy ordering in the full 2D-limit. At the same time however, we observe that it would align with a simple order-of-magnitude estimate, namely that the diagonal singlet pairing of d xy order is energetically advantageous at strong coupling-naive perturbative analysis suggests that the gain in energy should be O(t 2 ⊥ /V ) across a diagonal, while it is O(t 2 ⊥ /U ) for singlet pairing across a rung in d x 2 −y 2 order. And of course, this would be in line with the high likelihood for a finite spin gap and the attendant elimination of competing MO or FL phases in these systems that we discussed in Secs. V A and V B; the open questions raised by a system having a finite spin gap jointly with SC d xy order we will discuss now.

VI. DISCUSSION AND CONNECTIONS TO EXPERIMENT
Our results show a fully gapped spin excitation spectrum for finite-width strips of the 2D U-V model at the given parameters, as evidenced by the finite s [L → ∞, N ch ] we find.
We then show that the behavior of the gap with the strip width is most consistent with a fully-gapped spin spectrum also in the 2D limit when taking the apparent structure of s [L, N ch ] as consisting of a linear part with a superimposed weak oscillatory remnant into account (cf. Sec. V B and Appendix A). At the same time, the behavior of the various correlation functions at short range with increasing width, indicative of the behavior of the 2D system, lends support to pairing happening predominantly in the d xy channel, rather than in the extended s or d x 2 −y 2 channel.
Such a conclusion however is at variance with existing analytical descriptions of USC systems, such as from combining the results of perturbative functional RG with meanfield descriptions. Besides these methods predicting d x 2 −y 2 symmetry for the order parameter instead [18][19][20][21], the more far-reaching difference to our findings lies in the effects these theories predict from the nodal lines, which any plausible order parameter will exhibit (these originate from the systems attempt to minimize the Coulomb repulsion). Specifically, they forecast that node lines' intersections with the Fermi surface will always create point nodes for gapless quasiparticle excitations (cf. Fig. 9). In the mean-field theory, these point nodes in turn necessitate an ungapped spin excitation spectrum.
There are two ways to look at the apparent discrepancy between this mean-field treatment and our findings of a finite spin gap in Sec. V A. We considered the first way in Sec. V B, observing that disregarding the apparent structure of s [L, N ch ] (linear with oscillatory remnant, cf. Appendix A) allows for sublinear scaling of s [L, N ch ] to be a physically consistent fit within the accessible N ch range, at least for V/t = 1. As explained there, this scaling of s [L, N ch ] taken together with the boost to the d xy correlations from Sec. V C would be consistent with the prediction of nodal USC derived from mean-field theory just the exact symmetry of the order parameter would be different.
The second way is to consider the structure of the s [L, N ch ] data as linear + weak oscillations. Then one has to reconcile the finite spin gap in the 2D limit that ultimately results from our numerics at strong coupling with the weakcoupling analytical theory. Taking stock of the experimental situation allows just such a reconciliation. Firstly, we notice that our result of dominant d xy correlations in the 2D regime of the strips' ground state would be in line with the general argument that it is energetically advantageous for the system to minimize the number of nodes, which d xy would achieve better than a d x 2 −y 2 order given the Fermi-surface topology (cf. Fig. 9). Secondly, measurements on actual USC materials, which are certainly in the strongly interacting regime, provide evidence for USC order with zero-gap point nodes, yet which exhibit fully gapped spin excitations at the same time. This has been demonstrated for LSCO [56] close to and away from optimal doping, as well as YBCO [57] over a range of doping. To our knowledge, this fundamental discrepancy between the standard mean-field descriptions of USC models with zerogap nodes and actual measurements so far has neither been theoretically explained nor even replicated. Thirdly, strongly underdoped yet still superconducting LSCO has been shown to have not just gapped spin excitations but to be fully gapped overall [58]. The possibilities discussed for this observation are either another phase coexisting with USC order or topological superconductivity of the chiral d x 2 −y 2 ± id xy type, which the authors show to fit the measured gap. Fourthly, the loss of D 4 symmetry in the anisotropic 2D lattice results in a mixing of orbitals, such as d xy and g xy(x 2 −y 2 ) (cf. Fig.  9) [59]. When viewed in light of the numerous proposals to explain USC phases at low or zero temperature via mixing of different order symmetries, it cannot be ruled out that the remaining nodes resulting from a pure d xy + αg xy(x 2 −y 2 ) orbital, sketched in Fig. 9, are eliminated when admixed with yet other orders, even without invoking the possibility of topological superconductivity.
Given the available data we cannot currently distinguish among these possibilities, but provide them to make it plain that the most consistent interpretation of our results, a fully spin-gapped USC order (with dominant d xy symmetry), has clear precedents, the predictions of weak-coupling analytics notwithstanding.

VII. CONCLUSION
We have performed large-scale calculations with parallel DMRG to obtain ground states of the 2D U-V Hubbard model, the presumed minimal model of unconventional superconductivity in the organic Bechgaard and Fabre salts. The use of parallel DMRG here allows to study large systems at sufficient accuracies to reliably compute ground-state energy differences between different quantum number sectors, i.e., the spin gap.
Considering strips of increasing width up to eight chains, we have computed this models' spin gap as a function of strip length and width, averaged so as to strongly reduce oscillations that would preclude extrapolations to infinite strip length. After extrapolating the ground-state energies in the discarded weight, we arrived at data consistent with a finite spin gap at the thermodynamic limit (i.e., at infinite length). Taking the apparent structure of the computed spin gaps as a function of inverse length into account, we conclude that the spin gap of the infinite strips is nondecreasing or even increasing with the strips width. A finite spin gap in the full 2D limit is therefore the most likely possibility, thus  Fig. 5 and for Fig. 7. making the case for singlet pairing. Disregarding key features of the computed spin gaps (i.e., its decomposition into linear and weak oscillatory remnant) still yields scaling results at most consistent with nodal unconventional superconductivity, where the spin excitation spectrum would touch zero in isolated points. Even then however our results do not admit a physically consistent scaling scaling solution in line with a Fermi liquid or magnetically ordered states.
As we steer away from model parameters that could result in charge density wave order of the ground state, the remaining candidate by exclusion is unconventional (i.e., repulsively mediated) superconducting order of singlet type. This reading is further supported by the enhancement of short range correlations (i.e., where the strips' resemble a 2D system) with the strips' width to take place only in the d xy channel.
These results suggest several lines for further inquiry. One would be to test for the possibility of topological d x 2 −y 2 ± id xy superconductivity that is raised by the presence of the spin gap at strong coupling. Another would be to move more closely to the parameter regime of the Bechgaard and Fabre salts, i.e., to increase both U/t and V/t further yet. This will have to entail a careful study of the enlarged parameter space for the onset of the Mott-insulating state that we discussed in Sec. II B for the case N ch = 2, and the behavior of which at N ch > 2 is currently unclear. In this section, we provide background on the linear scaling procedure we have used in Fig. 5 to obtain the infinite-sized gaps summarized in Table I and shown in Fig. 6. We also list all s [L, N ch ] values that entered into performing those scalings, in Table III. There are three related reasons why we perform the fitting of s [L, N ch ] in Sec. V A with a linear polynomial in 1/L: (i) we find that in the noninteracting regime, U = V = 0, the averaged spin gap s [L, N ch ] shows behavior congruent with purely linear scaling with weak superimposed oscillations, while a fitting with a quadratic polynomial yields physically nonsensical answers. (ii) Mirroring this effect for the interacting systems, we find that the amplitude of the oscillatory remnant that survives in the averaged spin gap seems to decrease so slowly with 1/L that the oscillations swamp any weak quadratic dependency, should there be any. For this reason, fitting the limited number of available data points with a quadratic polynomial yields unstable results, while a linear fit works stably. (iii) We know by comparing to the existing theory of the N ch = 2 ladders that a linear polynomial 075138-13 fit yields physically consistent results that will, at minimum, provide correct trends for the behavior of s [L → ∞, N ch ] with N ch . To take each of the points (i)-(iii) in turn. (i) Consider the noninteracting version of our model Hamiltonian, U = V = 0, at half filling and t ⊥ /t = 0.1. To take N ch = 8 as an arbitrary example, analyzing the averaged spin gap s [L, 8] as a function of 1/L reveals an universal pattern in these tunneling-anisotropic systems: while much reduced over the nonaveraged gap s [L, 8], weak oscillations persist in s , as shown for this example in Fig. 10(a). We find these data to be naturally decomposing into a subset of points obeying perfect linear scaling (blue crosses in Fig. 10(a), linear fit superimposed as dash-dotted line), and the complementary subset of points that is affected by the downwards oscillatory deviations from this scaling [red crosses in Fig. 10(a)]. That the seemingly linear subset of points is actually a linear one becomes apparent if one tries to fit it with a second order polynomial C 0 + C 1 /L + C 2 /L 2 . In Fig. 10(b), we plot the resulting C 2 as a function of the maximal upper fitting range, L max . It is apparent that C 2 cannot be any underlying physical property of the system like, e.g., a remnant of a band curvature that survives the averaging, in which case it should converge towards a constant value with increasing L max . Rather, C 2 continues to drop even at the large L max shown here, because that is the only way a second-order polynomial fit can maintain optimal approximation to a purely linear distribution of data points as its fitting range grows. If any dependence on 1/L 2 remains within s [L, N ch ] in the noninteracting regime, the oscillatory remnant clearly dominates it completely.
(ii) Just as for the noninteracting system, remnants of oscillatory behavior remain visible as we compute s [L, N ch ] at U/t = 4 and V/t = 0.5, 1. However, unlike the noninteracting regime, each data point now comes with substantial computational cost. The limited number of available data points compared to the noninteracting regime means that they cannot be reliably divided into a purely linear subset and an oscillatory remnant. As a consequence, the impact of the weak oscillatory remnant on any scaling L → ∞ must be considered, especially as the amplitude of oscillations appears to decay only weakly with 1/L, as in the noninteracting case (i). In Figs. 10(c) and 10(d), we contrast the instability of any second-order polynomial fit depending on the fitting range with the respective stability of a linear fit over the same fitting ranges. This illustrates again, in a way different from (i), how the oscillatory remnant overwhelms any quadratic component in 1/L, if such were to exist in the interacting regime (as opposed to the noninteracting one, where we can rule out its existence within the linear subset).
(iii) Complementing the argument of (i) and (ii) so far, that s [L, N ch ] is dominated by linear behavior in 1/L with weak oscillatory remnants that overshadow a potential quadratic contribution, is the physical consistency of linear extrapolation. For this, we can turn to the N ch = 2 case, the two-leg U-V ladder. The bosonized field theory that describes this ladder at low energy predicts that both its spin sectors are gapped [2]. It also shows that both spin gaps must grow with both U/t and V/t (Eqs. (8.23), (8.24), and (8.26) in Ref. [2]). This effect is clearly recovered by our linear extrapolation, as shown in Table I. This further supports that a linear extrapolation, even one that necessarily averages over the weak oscillatory FIG. 11. Overview of pDMRG: (a) The physical system, e.g., a M-site 2D lattice, is mapped to the 1D DMRG chain. Every one of the possible M + 1 bipartitions, carries left and right boundaries L n , R n (cf. Sec. III), typically by far the largest objects in any simulation that has sizable long-range terms in the Hamiltonian. (b) For every bipartition, the Hamiltonian consists of the (small) local MPOs W n , W n+1 on sites to the left/right of the bipartition, together with the boundaries L n−1 , R n+1 . (c) The three layers of parallelization our pDMRG offers. In layer 1, each element L n [b] of L n is dispatched to a MPI group (each of which may be split into G subgroups). Every L[b l ] is a block-sparse matrix. In layer 2, every one of the dense matrices making up L[b] is distributed round-robin over the G split-off subgroups held by a global MPI group. In layer 3, the tiles making up a dense block can are distributed among the physical compute nodes held by each split subgroup. remnant, correctly captures the spin gaps behavior in the thermodynamic limit and in particular its change with the systems' parameters.

APPENDIX B: HANDLING LARGE χ AND LARGE MPO BOND DIMENSION SIMULTANEOUSLY THROUGH pDMRG
To make DMRG capable of handling simulations that would exhaust the RAM of any single compute node, and in order to bring the many CPU-cores of modern parallel supercomputers to bear on such large-scale problems, the groups of T. Giamarchi (U. Geneva) and M. Troyer (ETHZ) have developed pDMRG [53], with financial support from the Swiss government under its' HP2C initiative, with a focus on easy extendability and optimizability. Key features are (i) a clean tensor network framework for development of new algorithms; (ii) use of a matrixlike data storage type for direct use of BLAS/LAPACK routines; (iii) implementation of an arbitrary number of Abelian symmetries, reducing the matrix complexity by use of block-sparse matrices in models with conserved quantum numbers; and (iv) bindings to the ALPS library [60,61] for a generic model description. Parallelization uses the Message Passing Interface (MPI) standard for distributed memory and Intel ® Cilk TM Plus tasks for multithreading. The shared memory version of the code has 075138-14 been already published [62], and is used extensively in 1D and 2D condensed matter physics [63][64][65][66][67]. The resulting parallelism of pDMRG is fully scalable and can be run on anything from a single CPU core to hundreds of compute nodes.
The stages of pDMRG have been summarized in Fig. 11. As in regular DMRG, the physical problem is exactly mapped onto a 1D chain. In any realistic problem requiring the use of pDMRG, there will be many long-range interaction and/or tunneling terms along the chain. As a result, the left-/rightboundary pairs L n , R n at the n = 0, . . . , M bipartitions of the DMRG-chain will become objects requiring very large amounts of memory for any problem with a substantial bond dimension χ [ Fig. 11(a)]. For every bipartition of the DMRGchain the DMRG-Hamiltonian will be a set of one left and one right boundary, together with the (usually small) local MPOs W n , W n+1 for the sites n and n + 1 to the left/right of the nth bipartition [cf. Fig. 11(b)]. The technical challenge for any DMRG implementation aiming to parallelize the repeated application of any of these Hamiltonians to a tensor-decomposed wave function, which is the linear-algebra operation at the heart of any ground-state DMRG, is to distribute the boundaries across the nodes of a parallel supercomputer. Our pDMRG implementation offers three layers for this, summarized in Fig. 11(c), making extensive use of the Message Passing Interface (MPI), the de facto standard for parallel supercomputing.
(i) Layer 1 exploits that the boundaries are effectively vectors of block-sparse matrices [cf. Eq. (9) and following, and Fig. 3(c)]. It distributes the D elements of both these vectors, each element being a block-sparse matrix, in a sizeordered, round-robin fashion within the global group of MPI groups. Each of the MPI groups within the global group may encompass several nodes of the compute cluster.
(ii) Inside each MPI group, layer 2 then may further distribute the dense matrices comprising each boundary element among MPI subgroups into which the main group can be split.
(iii) As dense matrices can themselves become large, they are always stored in tiled format. For layer 3, these tiles may be distributed to different compute nodes within a subgroup.
In the Jacobi-Davidson eigensolver of pDMRG, the tensor contractions become dot products of vectors of matrices. For this, MPI groups communicate via MPI_Allreduce. When shifting to the next pair of sites, the number of boundaries can change, which results in a redistribution of elements among the MPI processes, using asynchronous point-to-point transfers. On each node, operations are executed in two steps. A dry run collects all linear algebra operations (mostly GEMM and AXPY). In the second step, the directed acyclic operation graph is executed in parallel on several threads. This abstract approach optimizes performance, autonomously exploiting hidden operational parallelism. The basic linear algebra operations are then explicitly unrolled between tiles of the dense matrix and dispatched to the underlying BLAS library.