Disorder in Twisted Bilayer Graphene

We develop a theory for a qualitatively new type of disorder in condensed matter systems arising from local twist-angle fluctuations in two strongly coupled van der Waals monolayers twisted with respect to each other to create a flat band moir\'e superlattice. The new paradigm of 'twist angle disorder' arises from the currently ongoing intense research activity in the physics of twisted bilayer graphene. In experimental samples of pristine twisted bilayer graphene, which are nominally free of impurities and defects, the main source of disorder is believed to arise from the unavoidable and uncontrollable non-uniformity of the twist angle across the sample. To address this new physics of twist-angle disorder, we develop a real-space, microscopic model of twisted bilayer graphene where the angle enters as a free parameter. In particular, we focus on the size of single-particle energy gaps separating the miniband from the rest of the spectrum, the Van Hove peaks, the renormalized Dirac cone velocity near charge neutrality, and the minibandwidth. We find that the energy gaps and minibandwidth are strongly affected by disorder while the renormalized velocity remains virtually unchanged. We discuss the implications of our results for the ongoing experiments on twisted bilayer graphene. Our theory is readily generalized to future studies of twist angle disorder effects on all electronic properties of moir\'e superlattices created by twisting two coupled van der Waals materials with respect to each other.


I. INTRODUCTION
The ability to isolate and characterize single sheets of graphene [1] has lead to a significant amount of control over van der Waals heterostructures [2]. This spectacular materials engineering feat has led not only to relatively clean, high mobility graphene samples, but also to the ability to place two coupled sheets of graphene on top of each other with a relative "twist" angle between them [3]. The introduction of the 'twist angle' as a new experimental parameter to tune the electronic properties of 'twisted' van der Waals heterostructures has led to a new paradigm in condensed matter systems where one can now study materials properties not only as a function of temperature, carrier density, magnetic field, gate voltage, applied pressure or strain, etc., but also as a function of the twist angle between the two layers controlling the electronic band structure in a radical manner, which is a completely new tool in the laboratory. This new tool of a variable twist angle is revolutionizing condensed matter physics, leading to many new discoveries in twisted bilayer graphene every month. Since the initial experiments on twisted bilayer graphene establishing the fabrication technique [4], recent experiments have shown that the system can support purportedly correlated insulating states and superconductivity [5][6][7][8]. While these bilayers are clean and relatively disorder free, the twist angle can vary across samples, leading to a new source of disorder. Thus, even if the two starting monolayers are completely clean (i.e. no impurities or defects), the very fact of creating the twisted bilayer system introduces an inherent (and a new type of) disorder by virtue of local fluctuations in the twist angle throughout the macroscopic sample. This 'twist angle disorder', which has no analogy in usual condensed matter systems and has never before been studied in the literature, is thought to be the main disorder controlling the quality of the currently available twisted graphene systems.
In single-layer graphene, the most dominant effect of disorder near the Dirac point has been attributed to charge disorder (arising from unintentional quenched random charged impurities in the system) inducing "puddles" of unequal charge density that locally dope the Dirac cones [9]. This issue has recently been circumvented by using an all van der Waals device geometry, and the absence of any significant charge inhomogeneities in such ultra-clean samples has enabled the observation of exotic many-body states [10] akin to what has been seen in clean suspended graphene [11]. As a result, the current graphene sample quality is rather remarkable and for most practical purposes, both charge inhomogeneities as well as any extrinsic disorder due to vacancies or defects has been greatly suppressed, if not almost eliminated except perhaps for experiments using very low (< 10 10 cm −2 ) carrier densities.
With these capabilities, very clean samples of twisted bilayer graphene (TBG) near the magic-angle (where the nominal band structure becomes completely flat suppressing the Dirac velocity to zero) have recently been observed to develop insulating states at integer filling fractions of the moiré miniband near the Dirac points [5,7]. Upon gating (i.e. doping) away from the insulating phases, nearby superconducting phases have been observed [6][7][8]. To achieve an accurate choice (and rather small, ∼ 1 • ) of the twist angle, the "tear and stack" mechanical approach places two sheets of graphene on top of each other with a great deal of precision in the twist angle [12]. Only after such a mechanical procedure of creating the twisted bilayer sample with a carefully chosen twist angle, the sample is transferred to the cryostat for electrical measurements. To study the electronic properties as a function of the twist angle, the whole procedure has to be repeated for a different sample with a different twist angle. In practice however, this procedure does not produce a single twist angle across the entire sample: Scanning tunneling microscopy has observed different twist angles across separate regions of the sample [3,[13][14][15][16][17][18]. Moreover, signatures of the nonuniformity of the twist angle have also been observed in conductance measurements that have a strong dependence on where the leads are placed on the device [7]. In addition, two different samples with nominally identical twist angles typically manifest quite different electronic properties in transport and STM measurements, again reflecting that some inherent variations in the twist angle invariably exist in the system. Thus, in any given high-quality (i.e. low extrinsic impurity and defect concentration) sample the main source of disorder comes in the form of a varying twist angle across the sample. The nature of this variation is not unique: some samples have hard domain walls separating regimes with different twist angles, whereas some samples have a much smoother change in the twist across the sample [19]. Currently, the qualitative effect of such forms of disorder on the single-particle spectrum near the magic angle is unknown and to the best of our knowledge, there has not been any attempt to describe this in a precise fashion. Twist angle disorder is a radically new type of intrinsic disorder in condensed matter systems whose study is, quite apart from its singular importance in determining the twisted graphene bilayer properties, of fundamental conceptual significance.
The numerical study of twist-angle disorder is difficult with the current models available in the literature. First, the usual continuum model is built as a hexagonal lattice in momentum-space [20,21] where disorder enters the Hamiltonian in a highly non-local way. Second, current real-space models rely on both a uniform and commensurate twist angle [22,23]. To circumvent this problem, we build a new real space model where the twist is built directly into the interlayer hopping in such a way that it can be continuously tuned, and can vary spatially while the model remains local in real space. The model exactly reproduces the continuum model as written by Bistritzer and MacDonald [21] near the K and K Dirac points in the Brillouin zone. The version of this model presented here preserves C 2 T symmetry (i.e. the combined operation of a 180 • rotation and a time reversal operation) and hence preserves the Dirac nodes. Further, it qualitatively preserves the spatial structure of AA and AB tunneling; however, it explicitly breaks C 3 symmetry. While there is no obstruction to building the model with C 3 symmetry (a version of which will appear in Ref. [24]), real experiments introduce strain which also explicitly breaks C 3 [25], so we do not require this of our real space model. So, in some sense, our disorder model incorporates both the twist-angle disorder (in a controlled manner) and strain  1. (color online) The density of states ρ(E) as a function of energy E for the lattice model of twisted bilayer graphene at a twist angle θ = 1.05 • , and a weak breaking in the interlayer tunneling between AA and AB sites (w0/w1 = 0.75, w1 = w), which captures lattice relaxation effects and it opens a hard gap on both sides of the semimetal miniband. We note that at small angles, a single parameter controls the physics: w/[2vF kD sin(θ/2)], so lowering the angle is equivalent to increasing w1. Therefore, one can read the plots of smaller w1 as at an angle larger than 1.05 • . This density of states has a number of features relevant to the physics: Van Hove peaks, gaps, and the velocity (as determined by the scaling of the density of states). Dark (light) blue lines give the calculated density of states for finite (zero) values of the parameter w as shown in the inset of the figure.
effects (in an uncontrolled manner through the explicit breaking of C 3 symmetry).
Once we have a viable real-space model, we are able to study the effects of disorder on twisted bilayer graphene at the single-particle level. Not only does twisting two sheets of graphene create flat bands near the magic-angle (∼ 1-1.1 • ), it also induces gaps that separate the miniband, which has Van Hove singularities in the density of states [21,26,27], from the rest of the spectra as seen in Fig. 1. These miniband insulating gaps arising from the single-particle band structure of the twisted system are simply the moiré superlattice band gaps due to the interference between the two graphene bands in the combined bilayer heterostructure. We are interested in how all of these single-particle, superlattice, miniband features are affected or even destroyed due to randomness in the twist angle. Recently, it has been demonstrated that many of these features can be captured using much simpler models with Dirac points perturbed by a quasiperiodic potential that mimics the twist [24,28]. These effective models are rather natural as most twist angles are not commensurate, and hence, a quasiperiodic incommensurate background potential should have effects very similar to the moiré potential induced by the twist angle. In fact, twisted bilayer graphene at a large twist angle (∼ 30 • ) has recently been used to form quasicrystals [29,30], and renormalized but stable low-energy Dirac excitations have been observed [30], supporting the idea of an incommensurate quasiperiodic potential mimicking the twistangle moiré superlattice. These simpler quasiperiodic models exhibit a similar magic-angle condition where the velocity of the Dirac cone vanishes continuously. In addition, the miniband effects are pronounced with large gaps and the velocity can be seen to clearly vanish without having to resort to very large system sizes as in the case of twisted bilayer graphene. Therefore, we supplement our calculations on twisted bilayer graphene with similar disorder calculations on a quasiperiodic "toy" model to determine how our choice to model twist disorder impacts our results (see Appendix A). The two models produce similar results on disorder effects.
We focus on various features of the low energy density of states and the miniband structure to determine how the single-particle spectrum is modified as a result of randomness in the twist angle. We demonstrate that disorder smooths the non-analyticities in the density of states, fills in the band-gaps, broadens the minibandwidth, and smears out the Van Hove peaks. We compare this with the size of the gap isolating the low-energy miniband, the renormalized Dirac velocity, and the size of the minibandwidth. Surprisingly, we find that the Dirac cone velocity is remarkably robust to twist disorder, whereas other miniband characteristics are systematically broadened. The essential complete protection of the miniband Dirac velocity (at low energy, where the Dirac cone approximation holds) in the twisted bilayer graphene (TBG) against the twist-angle disorder is a rather unexpected finding of our nonperturbatve calculations, particularly since all other aspects of the miniband electronic structure are strongly affected by the twist-angle randomness.
This paper is organized as follows: In Sec. II we build an approximate lattice model for twisted bilayer graphene and use it to introduce real-space disorder in the twist angle. In Sec. III we discuss the results of the numerical calculations, and in Sec. IV, we discuss our approximations and the implications of these results for ongoing experiments. Finally, we conclude in Sec. V with a summary of our results. Throughout, we take the lattice spacing between neighboring carbon atoms to be unity, which serves as our unit of length. In Appendix A we analyze a simpler model with similar magic-angle phenomena using a deterministic quasiperiodic potential and in Appendix B we provide details on the perturbation theory of the twisted bilayer graphene lattice model.

II. MODEL AND APPROACH
The model (see Fig. 2) we primarily focus on is a lattice model that is an approximation of twisted bilayer graphene which captures the low energy limit of the continuum model [20,21]. However, this particular ultraviolet (UV) completion of the continuum model does not respect the underlying C 3 symmetry of the microscopic lattice. As a result, the velocity does not strictly vanish at the magic angle, but becomes very small due to the Dirac points not being pinned to high-symmetry points in the Brillouin zone (we show this explicitly using perturbation theory in Appendix B). However, the band structure that results is still qualitatively similar, and so we expect that effects arising from this approximation are not relevant to understand the qualitative effects of disorder. In any case, it is unclear that a strict magic angle with vanishing velocity can ever be achieved in any laboratory samples, so our approximation of a finite, but very small, velocity should not be a practical problem in any sense.
Setting the hopping between sites in graphene to t = 2.8 eV [31], we first write down the graphene Hamiltonian for both layers without any interlayer coupling where r labels points on the triangular lattice, c r, = (c r,A, , c r,B, ) T is a vector of annihilation operators at triangular lattice site r and layer = 1, 2 whose first and second components represent the A and B sublattices, respectively. The lattice vectors a 1 and a 2 are shown in Fig. 2 (a) where the lattice site r is the central hexagon.
We couple these layers with an interlayer tunneling term to arrive at the full twisted bilayer graphene Hamiltonian The second line of of Eq. (2) represents interlayer hopping to the nearest neighbors on the triangular lattice, summed over all a n , as depicted in Fig. 2 (a), and the interlayer hopping matrices are given by w 0 cos(q j · r + φ j − θ/2) w 1 cos(q j · r − 2π(j−1) 3 + φ j ) w 1 cos(q j · r + 2π(j−1) 3 + φ j ) w 0 cos(q j · r + φ j + θ/2) AA AB BA AA where w 0 represents AA tunneling, w 1 is the AB tunneling (commonly, if we refer to w, we are referring to w 1 and a fixed w 0 /w 1 ratio), θ is the twist angle, φ j are random phases which represent the center of rotation, q 1 = k θ (0, −1), q 2 = k θ ( √ 3/2, 1/2), and q 3 = k θ (− √ 3/2, 1/2). The value of the twisted wavevector k θ is given by . The effect of varying w for a fixed twist angle θ = 1.05 • is shown in Fig. 3 (a), which demonstrates the formation of a semimetal miniband and shrinking minibandwidth.
If we go to the crystal momentum basis and expand about the K point, we obtain the continuum model [21] up to a unitary transformation The unitary transformation here is equivalent to "unrotating" the layers, which operates purely in pseudospin space (using the properties of Dirac cones, one can replace the full angular momentum operator L z with σ z /2). If we compare the low-energy continuum model to the actual lattice model itself, we find remarkable agreement in the calculated density of states [defined in Eq. (5)] as shown in Fig. 3(b,c,d) for three representative sets of parameters.
Some comments are in order. First, while it reproduces the continuum model at the K (and K ) point, this particular UV-completion explicitly breaks the C 3 symmetry present in the original model (this symmetry is just weakly broken near the K and K points). To see this explicitly, we can consider the pattern of AA, AB, and BA tunnelings our model exhibits. This can be entirely deter-mined by the form of T 0 and T 1 in Eq. (3): If an electron is on an A site on layer 1 and wants to hop to an A site on layer 2, then the sum of the squares of the hoppings give that P AA (r) = |[T 0 (r)] AA | 2 + 6|[T 1 (r)] AA | 2 (and similarly for P AB (r) and P BA (r)). Comparing which term [P AA (r), P AB (r), or P BA (r)] is largest gives us Fig. 2 where we can explicitly see how C 3 is broken for this model. As a result of this symmetry breaking, the Dirac points are not pinned to the high-symmetry points and are free to move around the Brillouin zone, yet since the model preserves the C 2 T symmetry they do not gap out. Numerically, we find that the Van Hove peaks never fully merge [ Fig. 3(d)] unlike the continuum model, and further, perturbation theory can be used at second order in the interlayer tunneling strength to see that the velocity never fully vanishes either (see Appendix B). For ideal theoretical calculations this might pose a problem, however for our present study, disorder already breaks this C 3 symmetry. Furthermore, in the experimental samples strain from the substrate explicitly breaks the C 3 symmetry [25], which is a natural single-particle source for a nonvanishing velocity and further justifies the use of this model (not shown). Thus, the weak breaking of C 3 symmetry in our model is not a problem at all in understanding the physics of real twisted bilayer graphene systems. Second, while the model still has the spatial structure of AA, AB, and BA tunneling, it is slightly distorted, as seen in Fig. 2(b). Consequently, the usual TBG moiré unit cell is larger than the unit cell considered in this model. In fact, the mini-Brillioun zone is folded more than in actual TBG (it is smaller by a factor of 3). This introduces small indirect gaps in the mini-Brillioun zone of TBG as seen near the Van Hove peaks in Fig. 3(c)(inset). Last, we can emulate the effects of strain and lattice relaxation, similar to Ref. [32], by setting the Comparisons between our lattice model and the continuum theory near E = 0 for w = 80, w = 100 and w = 110 meV respectively, we find remarkable agreement. The insets show the details of the miniband. At θ = 1.05 • and w = 100 meV, inset of (c), we see a splitting of the Van Hove peaks that is missing from the continuum model associated with additional zone folding in this model. This is seen clearly in the right inset; the left inset shows how the gap of the lattice model here and in the continuum model also match rather well. In (d) at the magic angle θ = 1.05 • and w = 110 meV, we see that the Van Hove peaks never clearly merge as they do in the continuum model. Again, this is clearly seen in the right inset. The continuum model data here includes 338 bands and has NC = 2 13 or 2 14 whereas the lattice model has L = 569 and NC = 2 17 . Overall, the agreement with the continuum TBG model is quite excellent.
ratio of AA to AB/BA tunneling to w 0 /w 1 = 0.75 and w 1 = w, acknowledging the empirical fact that Bernalstacked graphene is the energetically favored stacking arrangement in untwisted bilayer graphene.
We compare the lattice model with the continuum model in Fig. 3(b,c,d). We find good agreement between the two models over a rather broad energy range even beyond the low-energy miniband. In particular, we find that the TBG gap and Dirac velocity are well-produced by the lattice model, see the insets in Fig. 3(c,d) . However, a direct comparison at the magic angle condition (w = 0.11meV and θ = 1.05 • ) reveals that the minibandwidth is slightly overestimated within the lattice model. We further notice that beyond the "magic-angle" (i.e. smaller angle θ or larger interlayer tunneling w 1 ), the lack of symmetries leads to disagreement with the continuum model. As a result, we restrict ourselves to the regime where the dimensionless parameter w 1 /(k θ v F ) is below or at the "magic" value where the discrepancy between the continuum model and the effective lattice model is minimized. For this θ > θ Magic , our model captures the TBG electronic structure very well and should be a quantitatively reliable model. This is also the regime of current experimental interest.
In the following, we model the effect of a non-uniform twist by breaking the system up into four equal sections, each having their own twist angle θ, depicted in Fig. 2(c). We first choose random phases φ j in the interlayer coupling (this reflects different centers of origin for the twist). In what follows, we take a uniform random phase in the TBG calculations as this seems to be the most physically sensible starting point provided the twist angle is not sufficiently small, which would induce significant lattice relaxation. The θ in each patch is sampled from a box distribution around a central value where we express W R as a percentage and we fix θ 0 = 1.05 • . For twist angles that are small and near the "magic-angle," the moiré unit cell includes roughly 10,000 atoms in each layer. Numerically, we can reach on the order of 36-49 unit cells containing up to 500,000 atoms. This should suffice for our purpose of studying random twist angle disorder effects since the disorder is essentially local in nature. However, to confirm these disorder calculations we consider a related model in Appendix A: a model which can numerically include an order of magnitude more unit cells. That model has the same features as TBG (the formation of a semimetal miniband and a vanishing velocity at a critical potential strength), confirming the picture presented here. It is gratifying that we get very similar results in the two models (Appendix A), thus justifying our investigation of twist-angle TBG disorder.
We focus on the density of states (DOS), that is defined as where [. . . ] denotes an average over disorder, phases, and twists in the boundary condition. In what follows we average over 100 disorder samples. In order to reach large system sizes we use the kernel polynomial method (KPM) to compute the density of states through an expansion in terms of Chebyshev polynomials and we use the Jackson Kernel to filter out oscillations due to truncating this expansion to an order N C [33]. In the following, we focus on a linear system of L = 569 and a KPM expansion order ranging from N C = 2 13 up to 2 17 . This should give us an essentially exact nonperturbative evaluation of the TBG DOS in the presence of twist disorder.
From the density of states we extract an estimate of the renormalized velocity of the Dirac cones, using the scaling for two-dimensional Dirac cones with velocity v, (e,f) The minibandwidth DMB for interlayer tunneling w and disorder WR. Note that for larger disorder strength (WR = 6% or above) in (e) the bandwidth appears to plateau; this is just an artifact arising from disorder completely filling out the gap at this point. While the gap and bandwidth are strongly affected by disorder, the velocity remains unchanged.
near the Dirac nodal energy E D and we extract an estimate of v through a fit of the low-energy density of states. We mention that the Dirac cone approximation is only valid at low TBG energies well below the Van Hove singularities, and hence our extracted Dirac cone velocity applies only at low energies. Despite the expectation that disorder will induce a small but nonzero density of states at E D , we can still use the scaling in Eq. (6) to provide an estimate of the renormalized velocity. To quantify the effect of disorder on the Van Hove peaks in the DOS we make a qualitative estimate of the "BCS superconducting transition temperature" from the DOS through where E vH is the location of the Van Hove (vH) peak in energy, we take an electron-phonon coupling g = 1, and T c is in units of eV for the TBG model. We stress that we by no means are claiming electron-phonon interaction is the origin of supeconductivity in twisted bilayer graphene (although we do not rule out this possibility either), we are only using Eq. (7) as a qualitative measure of how disorder smears out the Van Hove peaks, which reduces the largest possible mean-field critical temperature in the miniband within BCS theory. One should think of the effective T c in Eq. (7) as a measure of the effective nonperturbative coupling induced by the vH singularity, and Eq. (7) is a simple quantitative approxima-tion to estimate the effect of twist angle disorder on the vH singularity expressed in units of energy (i.e. coupling strength). The fact that this formula coincides with the BCS formula for the superconducting transition temperature is a matter of convenience in this respect. Any other such formula should provide the same qualitative results although the quantitative details will depend on the specific form of the chosen formula.

III. RESULTS
To begin, we first discuss the effects of a random twist angle in the effective TBG lattice model. Since the twist shows up explicitly in the interlayer tunneling term, randomness appears solely in this part of the Hamiltonian. However, interlayer tunneling either occurs between equivalent sites or nearest neighbors (on the triangular Bravais lattice) between the two layers. This is due to T 0 and T 1 terms in Eq. (2), and thus, randomness in the twist angle will induce contributions from both of these terms.
The miniband that is formed due to the twist can be characterized by the following independent and complementary quantities: (1) the size of the energy gaps (mostly at 'higher' energies at the miniband edges) separating it from the rest of the states, (2) the effective low-energy velocity of the Dirac cones in the minizone, (3) the minibandwidth, and (4) the size and shape of the Van Hove peaks (which are strongly enhanced due to the formation of the miniband itself before disorder is taken into account). These features are all summarized in Fig. 1.
First, as shown in Fig. 4, disorder destabilizes the integrity of the miniband that is created due to the twist. The gaps form at energies ∼ v F k D sin(θ/2) and their size is controlled by w 1 = w. As the figure shows, the gaps become soft due to averaging together different patches of random twist angles. We extract the miniband (MB) gap ∆ MB for various values of interlayer tunneling (w 1 ) and disorder strength, as shown in Fig. 5(a,b). Increasing the interlayer tunneling and approaching the magicangle condition makes the semimetal miniband more pronounced and stable by increasing the size of the gap, which is maximal near w = 0.1 eV. Introducing finite disorder makes these gaps soft and the average band gap fills in monotonically with increasing disorder. Eventually, the gap is filled in completely, which we find occurs roughly for W R = 6% of the clean twist angle, and there is no longer a clear separation between the miniband and the rest of the states. This effect is clearly visible in experiments, as the band insulating gap is destroyed (e.g. as seen in Ref. [7]). The sensitivity of the gap to disorder in the twist angle is rather intuitive, as the location of the gap is dictated by the scattering between the Dirac nodes of equal chirality but different layers, and the energies that mix to open a gap are determined by θ, whereas the size of this gap is dictated by w. But the fact that the primary insulating gap at the full filling of the moiré miniband may be completely suppressed by a twist-angle disorder as small as just < 10% is non-obvious-naively on perturbative grounds one may expect a relative disorder of the order of unity (i.e. 100%) in order to completely suppress the gap. Clearly, the miniband insulator is very sensitive to twist-angle disorder, and this may be the reason why the measured gaps vary quite a bit from sample to sample even for nominally fixed twist angles.
Second, we discuss to the features of the miniband which presumably drive strong correlation effects, namely the renormalized Dirac cone velocity v [ Fig. 5(c,d)] and the size of the minibandwidth D MB [Fig. 5(e,f)]. Surprisingly, we find that the Dirac velocity is remarkably robust to disorder and while it is strongly suppressed for increasing w (as expected since this is an effective decrease of the twist angle), increasing disorder enough even to fill in the band gaps and suppress T c completely is not sufficient to modify the effective velocity which maintains its clean value in a robust manner even in the highly disordered situation. As shown in Figs. 5(c,d), the effective velocity extracted from Eq. (6) does not renormalize until the disorder is very large; in particular, Fig. 4 demonstrates that the low-energy scaling of the DOS ρ(E) ∼ |E − E D | remains robust for a range of disorder with an unmodified slope. Close to the magic-angle regime (w ≈ 0.11 eV), the vanishing of the velocity is becoming rounded out; however to see this develop for a large disorder range is challenging as we are limited by the energy resolution needed and therefore we only present results for disorder strengths where the scaling in Eq. (6) is clearly visible. In any case, close to θ Magic , the whole concept of a velocity becomes dubious as the TBG basically is a completely flatband system with essentially no energy regime available for the Dirac cone approximation to apply.
The minibandwidth D MB is similarly substantially reduced as we approach the magic-angle regime, as shown in Figs. 5(e,f). However, disorder both fills in the band gaps [ Figs. 5(a,b)] and also broadens the minibandwidth which we we are able to track provided the band gaps have not completely filled in [that we mark with a red dashed line in Fig. 5(f)]. The effect of disorder on the minibandwidth is much stronger then the effect on the velocity, and we expect disorder may reduce the strength of correlations by broadening the size of the miniband. It will not, however, have a very large effect on the Dirac velocity for weak disorder. We believe that such effects of disorder strongly suppressing correlation effects in the system (by effectively broadening the minibandwidth) are already apparent in the experimental samples since the insulating gaps (i.e. the correlated insulator phase) at commensurate fractional filling of the miniband often do not show up in many samples, and when they do, the typical correlated insulating gap energies are often rather small and vary strongly from sample to sample.
While the gap and hence minibandwidth are strongly affected, disorder also has an effect on the finer features of the minibands. The effects of twist disorder on the Van Hove peaks are captured quantitatively in Fig. 6. Van Hove singularities in 2D have a logarithmic singularity and thus should diverge with system size ρ(E vH ) ∼ log L. However, in our KPM calculations, we expect that the finite expansion order (N C ) produces a larger finite size effect than the system size. Therefore, we study the scaling of the Van Hove peaks with the KPM expansion order in Fig. 6(a). This clearly demonstrates that the 2D logarithmic vH singularity, manifesting the scaling ρ(E vH ) ∼ log N C in the clean limit, becomes rounded out due to disorder and no longer diverges with increasing N C . Interestingly, however, the location and separation of the Van Hove peaks is very insensitive to disorder as shown in Fig. 6(b). Despite the average location of the Van Hove peaks remaining fixed, disorder broadens them out as we show in Fig. 6(c) by computing the full width at half maximum (FWHM). Not only does this figure demonstrate that the FWHM of the vH peaks strongly decreases with increasing w it also shows that the effects of disorder on the vH peaks are much stronger for smaller w away from the magic-angle regime. This subtle effect of twist angle disorder on the vH peaks is rather nonobvious.
To study disorder effects on the Van Hove peaks in more detail we extract an estimate of the mean-field BCS superconducting transition temperature from Eq. (7) due to the DOS at the Van Hove peak energy. We show the effects of interlayer tunneling and disorder on T c in  Since the Van Hove peaks are strongly effected by w, we normalize T c by its value in the clean limit to compare our disordered results for each value of w. In the absence of randomness, shown in Fig. 3(a), as we increase w the minibandwidth shrinks, pushing the same number of states down to a lower energy scale, which in turn enhances the Van Hove peaks considerably. Upon introducing the twist disorder, T c is suppressed, and this effect, rather unexpectedly, is most pronounced for weak interlayer tunneling strengths, whereas for w close to the magic-angle condition (w = 0.11 eV for θ ≈ 1.05 • ) we find T c is not as strongly affected by weak disorder in comparison.
As W R increases, the Van Hove peaks smear out completely and for w < 0.11 eV this clearly produces an antilocalization peak near zero energy, as shown in Fig. 4  (b). When w 0 = 0, the twist disordered Hamiltonian is of the BDI Altland and Zirnbauer symmetry class [34] and thus should have an antilocalization peak. When we turn on finite w 0 , we expect this peak to broaden as we see in Fig. 4.

IV. DISCUSSION
First, we discuss our approximations in incorporating twist-angle disorder effects in the otherwise defect and impurity free clean twisted bilayer graphene. Using an effective model for twisted bilayer graphene we have the-oretically investigated effects of twist angle disorder nonperturbatively by breaking the system into four separate equally sized squares each with a random twist angle around a mean value of θ 0 = 1.05 • close to the magic angle. To understand the effects of our choice of modeling twist disorder with four equal sized squares, in Appendix A we analyze a simpler model to determine the effects of this patching scheme. By breaking the system into randomly sized rectangles with each having a different twist value we show that our qualitative results are robust. Increasing the patch number as well as changing the size and shape introduces more randomness into the system and increases the effective disorder strength overall. Therefore, the amount of randomness in each sample is a function of both the random distribution and the number of patches. Here, we want to ensure that each patch has enough sites in it to host a well defined lowenergy semimetallic miniband at the magic-angle regime (w = 0.11 eV and θ ≈ 1.05 • ) and therefore have focused on 4 squares and total linear system size L = 569 (in terms of Bravais lattice sites). Increasing the number of squares or modifying the shape will only introduce more randomness into the system.
We have introduced an effective lattice model of twisted bilayer graphene that is local, only requiring nearest neighbor (on the triangular Bravais lattice) interlayer hopping terms which already captures many of the features of the continuum model [20,21], such as the miniband gaps, Van Hove peaks, as well as the velocity and minibandwidth renormalization. The model we have used maintains the C 2 T symmetry but breaks the C 3 symmetry. As a result, the velocity and minibandwidth renormalization are affected in the magic-angle regime, which leads to a non-vanishing velocity (see Appendix B) and an overestimate of the minibandwidth. Moreover, the model also introduces fine structure into the Van Hove peaks that we attribute to additional zone folding that appears in the lattice model. Despite these shortcomings, this lattice model does capture the qualitative behavior of the low energy miniband very well while remaining local and easy to work with numerically. It is possible to construct an effective lattice model that preserves the C 3 symmetry and more accurately reproduces the continuum model in the magic-angle regime with a true vanishing velocity. However, this requires a more non-local interlayer hopping model keeping up to third nearest neighbor tunneling terms on the triangular lattice, which will appear in Ref. [24] (our conclusions change little using this more sophisticated model). In experiments on twisted bilayer graphene, the encapsulating substrate as well as other forms of disorder break the C 3 symmetry explicitly. Therefore, we do not expect that the weak breaking of this symmetry in our effective lattice model affects our conclusions on the qualitative experimental implications of disorder in the twist angle. Now we briefly summarize our main findings. Our results clearly demonstrate that the low-energy scaling of the semimetal miniband ρ(E) ∼ v −2 |E − E D | and the ef-fective Dirac cone velocity (v) are remarkably robust to disorder in the twist angle. While this robustness slightly weakens in the magic-angle regime due to disorder eventually rounding out the velocity minimum, we find that v is essentially disorder independent for less then 15% of randomness in the twist angle. This result suggests that the semimetallic scaling near the magic-angle regime should be clearly visible in transport experiments that average over the whole sample. Indeed, our findings are consistent with the experimental observations on twisted bilayer graphene that have observed a robust "V-shaped" conductance minimum at charge neutrality [7,35] that signifies that the semimetallic low-energy scaling persists in spite of the inevitable presence of twist angle fluctuations in the sample. The existence of a low-energy Dirac cone is protected against twist angle disorder. It is interesting to note that twisted bilayer graphene samples that are "massaged" to remove bubbles that have formed in the "tear and stack" approach exhibit an insulating phase at charge neutrality [8]. Presumably, this procedure eliminates some of the twist disorder in the sample and, as a result, domains of twist angle that still possess the semimetallic density of states no longer contribute to the density of states near E = 0. Thus, the suppression of twist disorder comes with the price of a strong modification of the observed density of states at low energies.
On the other hand, the minibandwidth is much more strongly affected by disorder, and D MB monotonically increases for increasing disorder strength until the gap is completely filled in and the integrity of the miniband is lost. Similarly, we have found that the insulating gap that separates the miniband from the rest of the states is completely filled in at weak disorder strengths (∼ 6% of the clean twist angle). This strong sensitivity of the single-particle gaps to twist disorder has been observed in Ref. [7] by placing leads at different places in the sample and finding very strong variations in the gap energies. We suspect that twist disorder will have an even stronger effect on the gaps at the correlated insulator filling fractions. In particular, the increase of the effective minibandwidth by twist disorder entails an effective lowering of the dimensionless correlation strength (i.e. the effective U/t value in the Hubbard-type models) since the Coulomb interaction energy (i.e. the effective U ) should not be affected by the disorder whereas the minibandwidth (i.e. the typical t) increases. These combined results imply that disorder will reduce the strength of many-body correlations by increasing the bandwidth of the miniband but will not affect the flatness of the Dirac cones. This interesting subtle prediction of our nonperturbative theory may already have support in the existing experiments since many otherwise high-quality TBG samples (i.e. made from extreme high-mobility graphene sheets) often manifest correlated insulating phases that are very weak, and it is unclear why the correlated insulator phase at commensurate fractional fillings is not universally seen in all TBG samples of nominally same quality at the same twist angle. We propose that the twist angle randomness is responsible for causing sample to sample variations in the TBG physics for the same average twist angles.
Last, the Van Hove peaks are a clear signature of the miniband in twisted bilayer graphene experiments [3,4,[13][14][15][16][17][18]. Our results demonstrate that the location of the Van Hove peaks of the miniband as well as their separation in energy, which is minimized in the magic-angle regime, are essentially unaffected by twist-angle disorder. Twist-angle randomness smears out the logarithmic Van Hove singularity without affecting their locations in energy. As a result, the density of states becomes an analytic function of energy and system size at the Van Hove peaks in the presence of twist-angle disorder. We have qualitatively assessed the impact of disorder on the mean-field BCS superconducting transition temperature in the miniband by considering a Fermi energy at a Van Hove peak. We have found that twist disorder strongly suppresses T c [as it is defined in Eq. (7)]. If the superconductivity in twisted bilayer graphene is BCS like then our results suggest that samples with large amounts of disorder in the twist angle will likely not superconduct. This is again consistent with experimental observations where not all samples with similar twist angles manifest superconductivity, and we speculate that this nonuniversality is connected with the presence of variable twist-angle disorder in different samples.

V. CONCLUSION
In this work we construct an effective lattice model of twisted bilayer graphene which we use to study the effects of disorder in the twist angle within a nonperturbative essentially exact theory. We investigate how our choice of modeling disorder affects our results through a detailed investigation of a related but simpler model in Appendix A. We demonstrate how randomness in the twist angle affects various properties of the low energy miniband through numerically exact calculations of the density of states using the kernel polynomial method. Remarkably, we show that the velocity of the Dirac cone is robust to disorder, whereas the other features of the miniband are rather sensitive to randomness in the twist angle. Last, we also discuss how the implications of our theory might already been observed out in existing experimental data and have given guidance for how these disorder effects can be used to help understand the putative strongly correlated effects seen in experiments.  In addition to the lattice model of twisted bilayer graphene (described in the main text), the second disordered TBG-like model we study is a two-dimensional tight-binding model with spin-orbit coupling (SOC) in the presence of a quasiperiodic potential, which is defined as [24] (A1) where t is the hopping strength, χ r denotes a two component spinor of annihilation operators, and σ µ are the Pauli operators. We mimic the effect of a twist through a quasiperiodic potential of strength W , an incommensurate (or quasiperiodic) wave-vector Q, and φ µ is a random phase sampled between 0 and 2π. We average over twisted boundary conditions to reduce the finite size effects. The goal of using this second model is to test the universality of the conclusions we reached in the main text using the TBG lattice model. The effect of the quasiperiodic potential on Dirac points is similar to twisting two layers of graphene [24]. As shown in Fig. 7(a) for a large enough value of W , a semimetal miniband forms with a renormalized velocity, sharp Van Hove peaks, and hard gaps separating it from the rest of the spectrum. Importantly, the SOC model has the great advantage that the formation of minibands, a hard gap, and flat bands are clearly visible on much smaller system sizes compared with the effective lattice model of TBG. Here, it is sufficient to consider system sizes of L = 144 or larger to see these effects, whereas in the TBG model the minimum number of sites required to form a clear miniband is at least a linear system size of L = 300.
In the calculations of the TBG model we broke the system into four squares of equal size, which was for simplicity of modeling while being able to correctly capture the formation of the miniband by keeping each patch sufficiently large. We now investigate the effects of making the size and shape of these regions random as well as increasing the number of random patches n P , something that the computational demands of the lattice TBG model did not allow us to do. We divide the L × L lattice into (n P ) 2 domains, by cutting it through n P − 1 vertical and n P − 1 horizontal lines which are randomly located. Each domain i is given a quasiperiodic wavevector and phase [Q(i), φ µ (i)], as illustrated in Fig. 7 (b). We introduce randomness in Q in a similar way as in the main text, such that Q(i) = Q 0 +δQ i where Q 0 = 2πF n−2 /F n , F n is the nth Fibbonnaci number, and we take the system size L = F n so that Q 0 is a rational approximant to the irrational number 2π(2/[ √ 5 + 1]) 2 . In each domain (or patch) δQ i is taken from a uniform distribution around and W Q is expressed as a percent (similar to the random disorder W R in the main text). For the results on the SOC model we average over 300 disorder samples.
In order to understand the role of taking a uniform phase [φ j in Eq. (3)] in the TBG calculations we consider choosing the phase in each patch φ µ (i) in two distinct ways, which are: (A) One global phase φ µ (i) = φ, which temperature Tc with random phases φµ(i) in each block but without randomness in Q, as function of number of patches n 2 p , and normalized by Tc with only one patch. (Bottom right) The gap size as function of randomness. Comparing to the suppression of Tc, the gap is filled in for WQ ≈ 0.5%, which is much smaller than the critical WQ (∼ 10%) needed for Van Hove peaks to be smeared out.
is equivalent to our set up in the TBG model. (B) In each patch, the phases φ µ (i) are independently picked from a uniform distribution [−π, π], which amounts to a disorder potential even for a fixed wavevector across the sample. Option (A) has no discontinuity in the phase across the boundaries of each patch. Note that because of the variation in Q, even fixing φ µ (i) = φ to be one global phase does not enforce a continuous boundary condition across patches. The most random choice we can make is through option (B), which means any phase can be chosen on each patch, with no restriction. In this case there is a sharp jump of the potential across all of the patches. In particular, when the number of domains approaches the number of sites, the quasiperiodic potential turns into random disorder potential. The "randomness" of option (B) is clearly the strongest and is not controlled by the parameter W Q . This is demonstrated in Fig. 8, which shows that randomness in the phase smears out the fine features of the density of states and fills the gaps in more easily and is qualitatively similar to the case with a fixed phase. Thus, randomness in the phase is not essential to include to study disorder, and in the following we will mainly focus on keeping the phase fixed throughout the sample.
To understand the effects of a finite number of patches we present results in the semimetal (W ≈ 0.35t) and magic-angle (W ≈ 0.54t) regime of the SOC model (see Ref. [24]) in Figs. 9 and 10. A clear trend in all of the results is that increasing the patch number enhances the randomness, which effectively increases the strength of disorder. This is demonstrated in Fig. 11 by the gaps becoming soft for weaker disorder strength, as well as an increased rounding of the Van Hove peaks as we increase the number of random patches. Eventually, at large enough disorder in the wavevector, any remnant of the semimetal scaling regime is destroyed, as shown in Fig. 9. In the magic-angle regime as shown in Fig. 10, which has a small miniband and a large density of states at the Dirac node energy, we find that disorder systematically broadens the size of the minibandwidth while also smearing out the structure of the DOS at finite energy. Similarly, increasing the number of random patches effectively increases the strength of disorder.
We capture the effects of disorder on the Van Hove peaks through T c [see Eq. (7) in the main text for the definition of T c , which is simply an effective coupling constant inspired by the BCS theory], which is shown in Fig. 11 for wavevector disorder. We find that disorder reduces T c monotonically, however when compared to the main insulating gap isolating the miniband we find that the Van Hove peaks are relatively much more robust than the main miniband gap. This features is distinct from what we saw in the case of TBG in the lattice model (main text), where T c was suppressed more strongly than the gap. Given this dichotomy, we believe that the lattice model should be trusted more in capturing the Van Hove physics of real TBG, and thus, T c is likely to be suppressed more than the main insulating gap in the presence of TBG twist disorder.
We now turn to the effects of wavevector disorder on properties of the Dirac velocity and the minibandwidth, as shown in Fig. 12. The velocity that vanishes in the magic angle regime is rounded out and remains finite due to the finite disorder strength. Away from the magic angle regime, the effects of disorder on the velocity remain weak. Moreover, the minibandwidth broadens with both increasing disorder strength and the number of patches until the gaps are completely filled. This is consistent with the behavior of the TBG model in the main text, namely that twist disorder weakly effects the velocity and increases the size of the minibandwidth, and the latter effects weakens the strength of correlations in the miniband. Thus, both models predict a universally ro- (a) Effective velocity of the Dirac cone and how it is rounded out due to randomness in the wavevector. The finite velocity in the magic-angle regime for WQ = 0 is just a finite size effect [24]. (b) Minibandwidth as a function of disorder in the quasiperiodic wavevector, which monotonically broadens for increasing disorder until the gap is filled in and the miniband is no longer separated from the rest of the band (marked as dashed lines). We include both W = 0.35 for semimetallic phase and W = 0.54 for the magic-angle regime. Note that we have set t = 1 here.
bust disorder-resistant Dirac cone velocity at low energies and a considerable disorder-induced broadening of the minibandwidth, thus weakening the correlated insulator phase.
We then want to isolate the K point in order to perform perturbation theory around the Dirac cone. We first note that the bare velocity is v 0 = 3 2 t for this cone, and we do the perturbation theory by using Dyson's equation for the Green function At second order in perturbation theory, the self energy Σ(ω, k) is given by It is important to notice that while there are six terms here in contrast to the continuum model which has only three. Those corresponding three have support near K while the rest will give small or negligible curvature corrections (which we will nonetheless account for). In order to do the perturbative expansion, we identify the small parameters controlling the expansion for the continuum model, these are where v F = 3 2 t. In Ref. [21] α 1 = α 0 = α while we will keep them arbitrary to account for lattice relaxation. To identify curvature corrections, we can further expand in k θ , so we will have terms that go as α j , α j k θ , and α j k 2 θ . Expanding Σ(ω, k) for small ω and k and for small curvature, we obtain the following Σ(ω, k)/t ≈ −3(α 2 0 + α 2 1 )ω/t WFcn Renorm. + 9 4 α 0 α 1 k 2 θ − 9 4 (α 2 0 + α 2 1 )k 2 θ σ x Shift cone + 9 4 α 0 α 1 − 3 2 k 2 θ k x − 2k θ k y Tilt cone − 9 2 α 2 1 − 27 16 α 2 0 k 2 θ k x σ x + 9 4 α 2 1 k θ k x σ * y − 9 2 α 2 1 − 81 16 α 2 0 k 2 θ k y σ * y + 9 4 (2α 2 0 − α 2 1 )k θ k y σ x Velocity renormalization . (B7) The first term is labeled "WFcn Renorm." for "wavefunction renormalization." The next term labeled "Shift cone" is second order in curvature and it shifts both the position of the cone in k-space (recall that C 3 is broken, so this term is expected) and in energy. At first-order in the curvature, we obtain the next term labeled "Tilt cone," that acts as a Galilean boost to the cone, tilting it over in k-space. And finally, we obtain corrections labeled "Velocity renormalization" since it directly modifies the v 0 k · σ * term in the Hamiltonian near the K point. To obtain the effective Hamiltonian, we put the Green's function in the form where Z = [1 + 3(α 2 0 + α 2 1 )] −1 is the quasiparticle residue, and from this we find near the K point where we have defined h 0 = − 9 8 t α 0 α 1 k θ 1 + 3(α 2 0 + α 2 1 ) (3k θ , 4), (B10) To find the renormalized velocity, consider the velocity operatorv and if we take the expectation value with respect to eigenvalues of H eff , we obtain σ * = w for a normalized vec- tor w = (cos ϑ, sin ϑ). This allows us to define a velocity v(ϑ) = | v | = |V T w(ϑ) + h 0 |.
We can define from this a maximum and minimum velocity v min = min In the limit where we neglect curvature corrections, v min = v max and is given exactly by the renormalized result given in Ref. [21] for the continuum model. Comparison of the velocity renormalization with and without the curvature corrections in this model is given in Fig. 13.