Ground State and Hidden Symmetry of Magic Angle Graphene at Even Integer Filling

In magic angle twisted bilayer graphene, electron-electron interactions play a central role resulting in correlated insulating states at certain integer fillings. Identifying the nature of these insulators is a central question and potentially linked to the relatively high temperature superconductivity observed in the same devices. Here we address this question using analytical arguments and a comprehensive Hartree-Fock numerical calculation which includes the effect of remote bands. The ground state we obtain at charge neutrality is an unusual ordered state which we call the Kramers intervalley-coherent (K-IVC) insulator. In its simplest form, the K-IVC exhibits a pattern of alternating circulating currents which triples the graphene unit cell leading to an"orbital magnetization density wave". Although translation and time reversal symmetry are broken, a combined `Kramers' time reversal symmetry is preserved. Our analytic arguments are built on first identifying an approximate U(4)$\times$U(4) symmetry, resulting from the remarkable properties of the tBG band structure, which helps select a low energy manifold of states, which are further split to favor the K-IVC. This low energy manifold is also found in the Hartree-Fock numerical calculation. We show that symmetry lowering perturbations can stabilize other insulators and the semi-metallic state, and discuss the ground state at half filling and a comparison with experiments.

In magic angle twisted bilayer graphene, electron-electron interactions play a central role resulting in correlated insulating states at certain integer fillings. Identifying the nature of these insulators is a central question and potentially linked to the relatively high temperature superconductivity observed in the same devices. Here we address this question using analytical arguments and a comprehensive Hartree-Fock numerical calculation which includes the effect of remote bands. The ground state we obtain at charge neutrality is an unusual ordered state which we call the Kramers intervalley-coherent (K-IVC) insulator. In its simplest form, the K-IVC exhibits a pattern of alternating circulating currents which triples the graphene unit cell leading to an "orbital magnetization density wave". Although translation and time reversal symmetry are broken, a combined 'Kramers' time reversal symmetry is preserved. Our analytic arguments are built on first identifying an approximate U(4) × U(4) symmetry, resulting from the remarkable properties of the tBG band structure, which helps select a low energy manifold of states, which are further split to favor the K-IVC. This low energy manifold is also found in the Hartree-Fock numerical calculation. We show that symmetry lowering perturbations can stabilize other insulators and the semi-metallic state, and discuss the ground state at half filling and a comparison with experiments.
Several aspects of the physics of tBG are reminiscent of multi-component quantum Hall systems (e.g. with spin, valley, or layer) where correlated insulators also arise at integer fillings. The driving force there is the exchange interaction that spontaneously polarizes the electrons into a subset of the components. The Landau-level form of the single particle wavefunctions, which quenches the kinetic energy while preserving their spatial overlap, plays a key role in stabilizing these ferromagnets. However, the addition of the time reversal symmetry present in tBG, particularly when combined with 180-degree in-plane rotation symmetry (C 2 ) that effectively enforces time reversal in each valley, opens the door to different orders, including superconductivity, that are absent in the quantum Hall setting. Indeed tBG is one of the few Moiré materials that retains C 2 symmetry, which leads to special properties such as unremovable band touchings that double the number of low energy modes. Symmetry-lowering perturbations such as an aligned h-BN substrate or weak magnetic fields, are known to induce an integer quantum Hall (IQH) insulator in certain cases [15,16].
In the other canonical model of strong coupling physics, the Mott-Hubbard model, symmetry breaking in the correlated (Mott) insulator is governed by anti-ferromagnetic superexchange. A pivotal question is whether the single particle subspace defined by tBG leads to insulators that parallel the quantum Hall case, with a cascade of polarized states, or more closely resembles that in the Hubbard model. We answer this question by considering the structure of Coulomb interactions projected directly into the k-space continuum model of tBG, including several of the remote bands [10,12,13]. While Mott-Hubbard representation [7,8,17] are complicated by the topology of the nearly-flat bands [6,11,[18][19][20][21], one can work directly in the space of the continuum wavefunctions. Here, careful analysis reveals some generic features of the Coulomb matrix elements which arise from the symmetry and topology of the flat bands. This analysis allows us to identify both an enlarged U(4) × U(4) approximate symmetry group and an intervalley-coherent order at neutrality, missed in previous approaches.
This "hidden" symmetry of the model has important phenomenological consequences. Experimentally, many of the basic phenomena, such as the existence of correlated insulators at integer fillings, the location of superconducting domes, and the presence of anomalous Hall effects, differ from sample to sample. Since the energetics may depend on parameters like the precise twist angle, alignment with the h-BN substrate, and strain, this leads to the sinking feeling that the search for a "unified" theory of tBG will become mired in a swamp of microscopic details. However, in this work we identify a hierarchy of energy scales in tBG which can naturally unify many of these findings. Due to the remarkable properties of the tBG band structure, we show that the largest energy scales (15 − 30 meV) preserve the approximate U(4) × U(4) symmetry which relates a small number of competing symmetry-breaking orders. Smaller effects (0.2 − 5 meV) then choose between these orders, and we identify several concrete mechanisms, such as strain or substrate align-FIG. 1. Circulating currents and magnetization of the Kramers intervalley-coherent state (K-IVC) Similar to a Kekule distortion, spontaneous intervalley-coherence between the K − K points of the graphene triples the graphene unit cell. The amplitude of the circulating current slowly modulates over the Moiré unit cell, shown here as the magnetization density m(r), while preserving the Moiré superlattice translations. We show the contribution from a single spin species summed over the two layers; the other spin carries either identical or reversed currents if the K-IVC is a spin singlet or spin 'triplet' respectively. Lower-left inset shows an example of the circulating current pattern which retains C2T symmetry, at the scale of the graphene lattice, in the AA-region of Moiré unit cell. ment, which can tilt the balance between them.
The primary focus of this work is to understand the implications of this hierarchy at charge neutrality (ν = 0). In certain samples with low twist-angle disorder, an insulating state is observed in transport at ν = 0, even in the absence of apparent hBN alignment [5]. Scanning tunneling microscopy also finds that the density of states reconstructs at ν = 0, where a gap ∼ 15-30 meV opens up [11,[22][23][24]. We identify this phase as a new "Kramers intervalley-coherent" (K-IVC) state. In the K-IVC phase ( Fig. 1), time-reversal is spontaneously broken in each spin component and a pattern of alternating circulating currents develop which triple the graphene unit cell (the Moiré unit cell is unchanged). The K-IVC does not have a net magnetization, but is rather a "magnetization density wave" at the wavevector K of graphene's Dirac point. Like an antiferromagnet, the K-IVC preserves a modified time-reversal symmetry T combining the regular (spinless) time reversal T with a π shift in the IVC phase. The new time reversal has the remarkable property that (T ) 2 = −1, i.e. it is a Kramers time-reversal symmetry arising from valley rather than spin. The presence of T leads to Kramers pairing in the spectrum, independent of spin, and may have important implications for the nature of superconductivity when the K-IVC at ν = 0 is doped. Furthermore, restricting to each spin, the K-IVC is a topological insulator, though the protecting T -symmetry may be strongly broken by the edge (due to broken translation symmetry).
Before detailing the Hamiltonian, let us briefly summarize the origin of the approximate U(4) × U(4) symmetry. The eight flat bands are labeled by spin s, valley τ , and a two-fold "band" index σ. Since the bands are quite flat, there is no particular reason that σ should label the single-particle eigen-basis. Instead, it turns out the two bands can be decomposed into a Chern C = 1 band and a C = −1 band related by C 2 T symmetry, leading to a total of four C = 1 and four C = −1 bands. Remarkably, the wavefunctions in the Chernbasis have a substantial sublattice polarization, i.e. they have a larger projection on one sublattice compared to the other. Thus, we can label them by σ z = A/B = ±1 with the Chern number C = σ z τ z . Due to this sublattice polarization, the slowly-varying part of the charge density decouples, to a good approximation, into the two Chern components: n(r) = n C=1 (r) + n C=−1 (r) (otherwise there would be large cross-terms). The four C = 1 (C = −1) wavefunctions are almost identical up to a permutation of spin and sublattice, so n(r), and hence the interaction, is invariant under separate U(4) rotations acting on the C = 1/ − 1 components. The single-particle dispersion and other perturbations then weakly break this symmetry down to the physical one.
This story is in fact highly reminiscent of the QH effect in the zeroth Landau-level (ZLL) of monolayer graphene, which also has a sublattice-valley locking σ z τ z = sgn(B) which leads to an approximate U(4) symmetry. Indeed, tBG is, in essence, two time-reversed copies of the ZLL of MLG: σ z τ z = C = ±1, with the tBG flat-band dispersion mapping onto weak tunneling between the two copies. This explains why, in the absence of dispersion, and with full sublattice polarization there is then a U(4) × U(4) symmetry coming from each "ZLL". Thus, much intuition from the theory of U(4) quantum-Hall ferromagnetism in MLG [25] can be translated to tBG, albeit with the novel twist of time-reversal symmetry: unlike a single ZLL, unfrustrated Cooper pairs can form from one electron in each copy.
This doubled-ZLL picture also brings us back to the tension between the QH and Hubbard paradigms. In the end, tBG is a novel hybrid of both: within each copy of the ZLL, the electrons prefer to polarize into a subset of the four components by direct analogy to U(4) QH ferromagnetism. However, the tunneling-induced coupling between the two ZLLs couples their order-parameters via an anti-ferromagnetic "t 2 /U " super-exchange. This picks out a submanifold of states comprising of the K-IVC and the valley Hall state. Finally, taking into account the finite sublattice polarization, the K-IVC which remains a 'generalized ferromagnet' is favored relative to the valley Hall state.
Hamiltonian and symmetries-Our starting point is the Bistritzer-Macdonald (BM) [1,2] model of twisted bilayer graphene which considers two graphene layers with a relative twist angle θ coupled via a slowly varying Moiré potential. The interlayer Moiré potential is specified by two parameters w 0 and w 1 denoting intra-and intersublattice coupling, respectively. The ratio w 0 /w 1 , which was taken to be 1 in the original BM model, is reduced in realistic samples to about 0.75 due to lattice relaxation effects, which shrink the AA stacking regions relative to the AB regions [26,27]. In the extreme limit where w 0 = 0, an extra chiral symmetry is present which leads to several interesting features including perfectly flat bands at the magic angle [28].
Let us now define an extended BM Hamiltonian which includes interactions. The interaction is taken to be double-gate screened Coulomb interaction with V q = 2π tanh(|q|d)/ |q| where d is the distance to the gate and a dielectric constant (similar results are also obtained for the single-gate screened case). Next, we choose a subset of bands of the BM Hamiltonian h BM near charge neutrality labeled by the band index N − ≤ n ≤ N + and assume that all states with n > N + (n < N − ) are empty (full). The projected Hamiltonian has the form where c(k) is a vector of annihilation operators in the combined index α, β, . . . containing spin s =↑, ↓, valley τ = K, K and band n = N − , . . . , N + indices, and u α (k) are the eigenstates of the BM Hamiltonian. A is the area and h(k) is the single-particle Hamiltonian which includes the BM Hamiltonian as well as band renormalization effects due to the exchange interaction with the filled remote bands (see supplemental material for details) [10,12,29]. We neglect electron-phonon interactions as well as the short-distance Coulomb scattering V K−K between the Dirac points, both of which are suppressed by powers of the lattice-to-Moiré scale a/L M 1. We will refer to these neglected terms as the "intervalley-Hunds" terms.
Since the competing ν = 0 states are distinguished by their broken symmetries, let us review the symmetries of the extended BM Hamiltonian. Letting σ z , τ z denote sublattice (A/B) and valley (K/K ), H eff has the following symmetries: (i) C 2 = σ x τ x and (ii) T = τ x K which relate the two valleys, (iii) C 3 = e − 2πi 3 σzτz which acts within each valley and (iv) valley charge conservation, and SU(2) K,K represent independent spin rotations in the K and K valleys. In addition, the BM Hamiltonian has an approximate (v) particlehole symmetry P = iσ x µ y K at small angles, where µ i are the Pauli matrices acting on the layer index [20,30].
The intervalley Hunds terms, whose magnitude is of the order J H ∼ 0.2 − 0.5 meV, break the independent spin rotations in each valley down to the physical global spin rotation symmetry: SU(2) K × SU(2) K → SU(2). This effect occurs at order a/L M ∝ θ. Furthermore, umklapp processes which scatter three electrons between the two valleys (either due to phonons, or higher-order Coulomb scattering) break U V (1) down to Z 3 , and are suppressed by a further factor of θ 2 [31,32].
Hartree-Fock mean-field-In the Hartree-Fock (HF) method, we solve for the set of self-consistent ground state Slater determinant states characterized by the one-electron density matrices P α,β (k) = c † α (k)c β (k) . Similar to Refs. [10,11,13], we take both the flat bands and a range of remote bands around charge neutrality into account. However, in contrast to previous studies [10][11][12][13], we allow for coherence between the two valleys which spontaneously breaks the U V (1) symmetry (see also Ref. [6] for an early suggestion of a differ-  ent IVC order motivated on phenomenological grounds). Further details of our procedure are provided in the supplemental material.
The numerical results at CN (ν = 0) are given in Fig. 2 for fixed θ = 1.05 o , = 7, 12, and w 0 = 35, 75 meV as a function of w 1 . Since the magic angle condition depends on the ratio w 1 /θ [2], this is approximately equivalent to changing θ. We exploit this fact to plot the HF energies as a function of an "effective" angleθ ≡ 1.05 • × (101.8 /w 1 [meV]), where w 1 = 101.8 meV is the magic angle condition for the parameters we've used. From comparison with ab-initio methods, the magnitude of the inter-layer tunneling terms are estimated to be w 1 ∼ 110 meV and w 0 ∼ 80 meV [2,26,27]. Here, we consider a range of values of w 0/1 which can be far from these estimates as this provides valuable information when comparing numerical results with our analytical findings below.
Depending on the initial condition or which symmetries are explicitly enforced, we find several self-consistent solutions which can be grouped into three categories: (i) a semimetallic (SM) state which preserves C 2 , T , and U V (1) but may break C 3 (this state can be understood as a renormalized version of the BM semi-metallic band structure); (ii) a quantum hall (QH) insulator with Chern number ±4 which breaks T but preserves C 2 and U V (1); and (iii) several insulating states with Chern number 0, including valley-Hall (VH) state, which breaks C 2 but preserves T and U V (1), valley-polarized (VP) state [33], which breaks T and C 2 but preserves C 2 T and U V (1), and an intervalley coherent (IVC) state which breaks T and U V (1) but preserves the combination T = τ y K which acts as a spinless Kramers time-reversal symmetry between valleys. Unlike previously studied IVC states in TBG [34] and related Moiré materials [35,36], this Kramers IVC (K-IVC) takes place between wavefunctions which have the same Chern number, thus evading the energy penalty associated with vortices in the order parameter [34]. The competition between the VH, VP, QH, and SM states, which were all found in previous mean field studies [10][11][12], is very sensitive to the values of (w 0 , w 1 ). This explains why these studies, all of which assumed unbroken U V (1) symmetry, did not agree on the nature of the ground state. On the other hand, the U V (1)-breaking K-IVC state is always the lowest energy state regardless of the values of w 0 , w 1 and . Another salient feature is that the competition between the K-IVC, QH, and VH is closest when w 0 → 0, but is lifted in favor of the K-IVC for larger w 0 . The reason will become clear from our analysis of the approximate symmetries. The HF numerics shown in Fig. 2 were obtained by keeping six bands per spin and valley, but more generally we find that mixing between the flat and remote bands has only a quantitative effect over the range of parameters considered. In particular, the K-IVC remains the ground state as more bands are included, and the magnitude of the IVC order parameter remains almost unchanged (Fig. 3), indicating the symmetrybreaking occurs predominantly in the flat bands. The charge gap decreases quantitatively as more bands are included, but saturates at a value of ∼ 15.8 meV when sixteen bands per spin and valley are taken into account. As a result, our numerical results can be reproduced to a good degree of accuracy within the two-band projection of Ref. [12], where the effect of the remote bands is incorporated only via the exchangerenormalization of h(k).
A better intuition for the symmetry-breaking phases in where P (k) is an 8 by 8 matrix which we parameterize as P (k) = 1 2 (1 + Q(k)), with Q(k) 2 = 1 and tr Q(k) = 2ν. Furthermore, rather than working in the basis which diagonalizes h BM , it is convenient to work in the sublattice-polarized basis which diagonalizes the sublattice operator σ mn (k) = u n (k)|σ z |u m (k) , with n, m ∈ {1, 2} restricted to the two flat bands. This basis is well-defined as long as the eigenvalues of the matrix σ(k) are non-zero, indicating finite sublattice polarization. In the supplemental material we check that this is indeed the case. The 8 flat bands are then labeled by s z =↑ / ↓, τ z = K/K , σ z = A/B. A crucial feature of this basis is that each band carries a quantized Chern number C = τ z σ z [28,34,37,38].
With this basis in hand, we can concisely summarize the competing insulators: Q QH = σ z τ z = C (which explains its net Hall conductance); Q VP = τ z ; Q VH = σ z = τ z C (which explains its valley-Hall conductance); and finally which was found to be the ground state at charge neutrality for the entire parameter range that was studied. Under (graphene-scale) lattice translations, the K-IVC order parameter transforms as θ IVC → θ IVC + 2π 3 , while under spinless T , θ IVC → θ IVC + π. In addition to the spin-singlet variant of the K-IVC state discussed here, there are other K-IVC states with different spin structures which are all degenerate on the level of H eff . These will be discussed below in the sections containing our analytical results.
Enlarged U(4) × U(4) symmetry-Below, we will show how a large U(4) × U(4) symmetry appears in the pure interaction model (i.e. with no dispersion) in the chiral limit. We will begin by showing that even away from the chiral limit, the flat-band-projected interaction term has an enhanced U(4) symmetry. Next we will then show that the chiral model also has a different enhanced U(4) symmetry, even when dispersion is included. Combining these we will obtain a large U(4) × U(4) symmetry for the chiral model in the absence of dispersion.
Motivated by the numerical result, we are going to restrict ourselves in the following to the two flat bands (per spin and valley) and rewrite the interacting Hamiltonian (1) as where the interaction term differs from (1) by some singleparticle terms which are absorbed intoh. We have made this rearrangement so thatρ q is exactly odd under particle-hole, and henceh and the interaction are separately particle-hole symmetric ( G,k δ G,q tr Λ G (k) is the total charge-density of the flat bands). Let us first consider the limit where sublattice polarization is not saturated, i.e. chiral symmetry is not present w 0 = 0. Now, the particle-hole symmetry of the projected Hamiltonian (5) has important consequences. This follows from the observation that a PT symmetry (which flips energy but not momentum) is equivalent, within a perfectly flat band (i.e on ignoring the single particle dispersion), to a single particle unitary symmetry since it leaves the space of eigenstates invariant. In our model, the gauge can be chosen such that the PT symmetry has the following simple form in the flat band projected basis (see supplemental material) PT acts locally in space and momentum but exchanges valley and sublattice, relating flat-bands with the same Chern number C = τ z σ z . Thus, if we neglect the dispersion termh, we find that the U(2) K × U(2) K of the Hamiltonian is enlarged to a U(4) PT symmetry whose generators are {t a , t a σ y τ y } where t a are the 8 (sublattice and valley diagonal) generators This unitary symmetry is broken by the dispersion termh which anticommutes with the extra generators t a σ y τ y . Another limit where the symmetry of the Hamiltonian is enhanced is the chiral limit w 0 = 0 [28,39], where the BM Hamiltonian has an extra chiral symmetry S = σ z , {S, H BM } = 0, leading to complete sublattice polarization. In this case, we can combine PT symmetry with S to obtain a Z 2 unitary symmetry R given by Similar to PT , R acts locally in space and momentum but exchanges valley and sublattice, relating bands with the same Chern number C = τ z σ z . Its existence enlarges the symmetry of the model to U(4) R whose generators are {t a , t a R}. It is important to notice that this U(4) R symmetry is different from the U(4) PT symmetry discussed earlier. In addition, the U(4) R symmetry is preserved on including the dispersioñ h and does not rely on the flat band projection, i.e. it is a symmetry of the full Hamiltonian in the chiral limit.
Combining the two previous discussions, we find that the interaction in the chiral limit has a large U(4) × U(4) sym-metry whose generators are {t a , t a τ y σ x , t a τ y σ y , t a σ z }. An intuitive understanding of this result is obtained by observing that in the chiral limit, the form factor Λ q (k) has the remarkably simple form where F q (k) and Φ q (k) are two real scalars whose properties are discussed in more detail in the supplemental material. As a result, the interaction is invariant under any unitary rotation which commutes with σ z τ z yielding the symmetry U(4) × U(4) corresponding to arbitrary unitary rotations which relate flat-bands with the same Chern number, as illustrated in Fig. 5. Hierarchy of energy scales-In the realistic case where w 0 = 0 andh are not negligible, we can estimate the strength of the U(4) × U(4) symmetry breaking by splitting the form factor Λ q (k) into components Λ ± q (k) which commute/anticommute with R. Using the remaining symmetries, one can show (supplemental material) that Λ + q (k) has the form given in Eq.
We can now write the density as We notice that the R-symmetric component of the densityρ + q acts within the same sublattice whereas the R nonsymmetric partρ − q acts between sublattices. This induces a splitting of the interaction into an intrasublattice part H + = 1 2A −q which has the full U(4) × U(4) symmetry and an intersublattice part with only a U(4) symmetry. Similarly, the form of the dispersionh is restricted by symmetries tõ with the R-symmetric/non-symmetric parts given by h x,y /h 0 (note that, unlike the interaction, the symmetric part acts between sublattices and the non-symmetric part acts within each sublattice). One crucial observation is that even though the realistic value of w 0 /w 1 is not small, the R-breaking terms H − , h 0 are smaller by a factor of 3-5 than their R-symmetric counterparts H + , h x,y as shown numerically in supplemental material and summarized in Fig. 5. Furthermore, even after accounting for the band renormalization effects, the dispersionh is on average smaller by a factor of 3-5 compared to the interaction. The previous discussion points to a hierarchy of energy scales associated with different symmetries. The largest scale is associated with the intrasublattice interaction H + which has the enlarged symmetry U(4) × U(4) implemented by unitary rotations which commute with σ z τ z . This symmetry is broken at lower energy scales by two different terms. First, the intersublattice h x,y breaks this down to a single U(4) R which commutes with σ x corresponding to the symmetry of the chiral model discussed earlier. Second, the intersublattice inter- action H − breaks it down to a different U(4) PT subgroup which commutes with σ x τ z . The presence of both terms thus reduce the symmetry down to U(2) K × U(2) K which is the intersection of the two U(4) subgroups. The intrasublattice dispersion h 0 is smaller in magnitude (∼ 0.5-1 meV) and does not break the symmetry any further so it can be neglected. Finally, the intervalley Hund's coupling breaks the symmetry down to U C (1) × U V (1) × SU(2) at smaller scales. Close to the magic angle all the scales are governed by the interaction, and depend crucially on the structure of the wavefunctions (via Λ q (k)) rather than the detailed q dependence of V q .
Energetics and ground state of the spinless model-To understand the competition between different states, it is instructive to start by considering the simpler problem of spinless electrons at half filling for which we simply need to replace U(4) → U(2) in the discussion above. Physically, this is equivalent to assuming a spin-unpolarized solution at CN or a spin-polarized solution at half-filling.
We take the strong coupling limit by assuming that the intrasublattice interaction scale is much larger than the other scales , i.e. H + H − , h, and subsequently solve for the ground states in this limit. For the realistic parameters, H + is only a factor of 3-5 larger than H − and h. However, as we will see, the results of the strong coupling analysis agree remarkably well with the Hartree-Fock numerics, providing an independent justification for the results beyond mean field. We will comment later on the validity of our results for intermediate coupling H + ∼ h.
We start by noting that H + is a non-negative definite operator for any repulsive interaction V q > 0, which implies that any state satisfyingρ + q |Ψ = 0 for q = 0 is a ground state [9,29,40]. Next, we note that the diagonal form of Λ + q (k) in sublattice and valley implies thatρ + q annihilates any sublattice or valley "ferromagnet" where two of the four sublattice/valley states shown in Fig. 6 are completely filled. For q which is not a reciprocal lattice vector, this follows by noting that the action ofρ + q changes an electron's momentum by q which is impossible in a completely filled or empty band. For reciprocal lattice vector q, the action of the first term in (9) on a completely filled/empty band is finite but cancels exactly against the second term at CN as shown in supplemental material. Simple states satisfying this condition are the QH σ z τ z , VH σ z and VP τ z state. More general states are obtained by acting with any U(2) × U(2) rotation which commutes with σ z τ z on these simple states yielding a manifold of Slater determinant states labelled by a k-independent Q satisfying [Q, σ z τ z ] = 0. They fall into two categories: (i) a U(2) × U(2) invariant QH state with a total Chern number ±2 obtained by filling two bands with the same Chern number and (ii) a manifold of zero Chern number states generated by the action of U(2)×U(2) on the VP state. This manifold includes the VH state as well as two distinct types of IVC orders which break U V (1): the Kramers IVC state σ y τ x,y discussed earlier and a T -symmetric IVC state with σ x τ x,y . Both IVC states hybridize bands with the same Chern number and, as a result, the order parameter can be uniform in k and evade the energy penalty due to vortices discussed in earlier works [34][35][36].
Including h x,y breaks the U(2) × U(2) down to U(2) R . It has the form of an intra-valley, inter-sublattice tunneling with amplitude h x (k) + ih y (k) connecting pairs of opposite Chern bands as shown in Fig. 6. Thus, a state in which all pairs of bands connected by h x,y are either both full or both empty is annihilated by h x,y since the tunneling processes are completely blocked. This is equivalent to [Q, σ x ] = 0. This can be seen by noting that commutation with both σ x and σ z τ z means that Q is proportional to the identity in the SU(2) pseudo-spin variable (σ x , σ y τ z , σ z τ z ) whose z-component is the Chern number and x, y components correspond to the tunneling h x,y , i.e Q describes to a state with zero total pseudospin which is annihilated by the pseudo-spin flip operators ∝ h x,y . For the remaining states, the action of h x,y creates an electron-hole (e-h) excitation between these pairs of bands. Since the electron and hole carry opposite Chern numbers, the electron-hole excitations always have a finite energy of the same order as H + as shown in the supplemental material. This can be understood by noting that the condensation of such electron-hole pairs is equivalent after a particle-hole transformation to superconducting pairing in a ±2 Chern band which is known to be energetically unfavorable [34]. The energy due the tunneling h x,y can be computed within second order perturbation theory leading to an energy reduction where E h and E + are the energy scales associated with h x,y and H + . This gain, which resembles antiferromagnetic "superexchange", is due to virtual tunneling processes between pairs of bands connected by h x,y which is maximized if only one band is filled in each pair. This is equivalent to the condition {Q, σ x } which is satisfied by two types of states:(i) a U(2)-invariant QH state with Chern number ±2 and (ii) a manifold of states with vanishing Chern number isomorphic to U(2)/U(1) × U(1) S 2 generated by the VH and K-IVC states which form a sphere (see Figure 7b).
The intersublattice part of the interaction H − breaks U(2)× U(2) to a different U(2) PT subgroup. Because the crosstermsρ − qρ + −q +h.c. ∈ H − are already guaranteed to vanish on 6. Schematic illustration of the symmetry reduction and ground state selection in the spinless model (top panel). Beginning with the U(2)×U(2) symmetric intrasublattice interaction H+, which allows for free rotations within the two C = 1 and two C = −1 levels, the symmetry is lowered by the dispersion hx,y (left) and the intersublattice interaction H− (right), which splits the degenerate states. The K-IVC is the unique state which is optimal for both perturbations. the ground-state manifold of H + , and the residualρ − q ρ − −q is positive definite, H − selects the submanifold of ground states annihilated byρ − q . Due to the structure of the intervalley form factor Λ − q (k) ∝ σ x τ z , σ y , these states satisfy the condition [Q, σ x τ z ] = 0 forming the manifold U(2)/U(1) × U(1) S 2 generated by the VP and K-IVC. The energies of the other states is increased by an amount of the order E − ∼ 1 meV.
Thus, in the presence of both h xy and H − , the K-IVC, which benefits from both perturbations, has the lowest energy followed by the VP and QH/VH (the latter two are degenerate) whose competition is determined by the relative strength of the intersublattice interaction E − and the energy reduction due to superexchange E 2 h /E + . This is consistent with the nu-merical results in Fig. 2, where the energies of the VP state and the QH/VH state cross as a function of w 1 which controls both h x,y and H − . At a fixed w 1 , decreasing w 0 whose main effect is decreasing H − clearly favors the VH/QH states and makes them closer in energy to the K-IVC ground state. The T -IVC state, which was not seen in the numerics, is disfavored by both and has the highest energy.
In the realistic magic angle parameter regime, h x,y (k) is only a factor of 3-5 smaller than the interaction and some states may become energetically competitive by optimizing this part first. Indeed, this eventually occurs away from the magic-angle when the dispersion becomes comparable to the interaction scale. The simplest such states are semimetallic (SM) solutions preserving both C 2 T and U V (1) [12], which are characterized by Q(k) = σ x e iφ(k)σzτz . Such SM states also break C 3 for realistic values of the parameters w 0 and w 1 [12]. Due to the topology of the bands, the phase φ(k) winds twice around the Brillouin zone which means it has at least two vortices (this assumes a smooth gauge choice). Another way to see this is by noting that this order parameter can be obtained by condensing electron-hole pairs discussed earlier, thus gaining energetically from h x,y (k) but paying an energy penalty ∼ E + . The competition between the SM and the insulating states is determined by the relative magnitudes of these two terms. Our numerical results suggest that close to the magic angle, the insulating ground state is always energetically favored, but the energy difference is relatively small ∼ 1-3 meV making the competition sensitive to small perturbations such as strain that favor the SM state [12].
Charge neutrality: Ground state and spin structure -Upon including spin, we can similarly study the manifold of ground states at CN starting with the states minimizing the intrasublattice interaction H + which satisfy [Q, σ z τ z ]. These are obtained by completely filling 4 of the 8 bands in Fig. 5. h xy selects states satisfying {Q, σ x } = 0. These states can be divided into three classes: (i) a spin-unpolarized QH state with Chern number ±4 obtained by filling all 4 bands with the same Chern number, (ii) a manifold of states with Chern number ±2 obtained by filling 3 bands with the same Chern number and one band with opposite Chern number, and (iii) a manifold of states obtained by filling 2 bands in each Chern number sector. The states in (ii) are mixed states corresponding, for instance, to a QH state in one spin species and a VH or IVC state in the other and they form the manifold U(4)/U(3) × U(1). States in (iii) include the spin-unpolarized versions of the spinless phases discussed earlier including the VH and K-IVC states, which form the manifold U(4)/U(2) × U(2). In contrast, the intersublattice interaction H − selects states satisfying [Q, σ x τ z ] = 0 which include spin/valley polarized states as well as spin-unpolarized K-IVC states. However, the spin/valley polarized states do not benefit from the dispersion. Thus, combining the effect of the dispersion and H − we are left with K-IVC as the unique state that is maximally stabilized by both perturbations.
Note that the spin-unpolarized K-IVC state is not invariant under the action of U(2) K × U(2) K rotations. Instead, this action generates a manifold of states which are degenerate with respect to H eff . This manifold can be parameter-ized by a single 2×2 unitary matrix V in spin space with To understand the structure of these states, we write the manifold as U (2) U(1) × SU(2) which can be parametrized as V = e iφ e i θ 2 n·s . Thus, a given K-IVC state is specified by choosing a spin quantization axis n on S 2 and specifying two U(1) K-IVC phases φ ± θ 2 for the up and down spins along n. Note however that the spin axis n loses meaning for the spin-singlet state θ = 0. The intervalley-Hunds coupling fixes the value of the relative phase θ between the K-IVC states for up and down spins. An antiferromagnetic coupling, perhaps driven by phonons [41], leads to θ = 0. As expected this is the spin singlet K-IVC state, where the orbital currents from opposite spins add. On the other hand, ferromagnetic Hunds coupling leads toθ = π, i.e. a spin 'triplet' K-IVC state. At this special value, the orbital currents of the oppositely directed spins cancel, leaving behind circulating spin currents (see Figure 1).
Half Filling: Ground State and Spin Structure-While we have largely focused on charge neutrality ν = 0, let us now briefly discuss half filling i.e. ν = ±2, leaving a more through discussion for the future. At half-filling ν = −2 (the case of ν = 2 can be deduced by performing a particle hole transformation on the conclusions below), the ground states of H + are obtained by filling 2 out of the 8 bands encoded by the condition [Q, σ z τ z ] = 0. In contrast to CN, these states are not completely annihilated by the operatorρ + G for reciprocal lattice vectors G. Instead, the action of H + on these states yields a constant energy that does not affect their energy competition. However, such contribution may affect the competition between the ν = ±2 insulating states and metallic or superconducting phases emerging from the ν = 0 state. We leave investigating such competition to future works. Within the manifold of groundstates of H + , states can gain energetically from tunneling if at most one out of each pair of bands coupled through h x,y is filled. The resulting states either have (i) Chern number ±2 such as valley and sublattice polarized or spin-polarized QH states (forming the manifold U(4)/U(2) × U(2)) or (ii) Chern number 0 such as the spin-polarized VH or K-IVC states (forming the manifold U(4)/U(2) × U(1) × U(1)). Again, the intersublattice interaction H − selects instead states satisfying [Q, σ x τ z ] = 0 which include spin and valley polarized states and spin-polarized K-IVC. The ground state manifold in the presence of both band dispersion and H − is the K-IVC state. The set of nearly degenerate K-IVC states is obtained by acting with U K (2) × U K (2) on the spin-polarized K-IVC state. The resulting manifold is isomorphic to U(1) × S 2 × S 2 denoting the K-IVC phase and the direction of the spin in each valley which can be chosen independently. Intervalley Hund's coupling locks the spin in the two valleys to be either parallel (J < 0 ferromagnetic Hunds coupling) or anti-parallel (J > 0 antiferromagnetic Hund's coupling). In both cases spatially varying orbital magnetization currents are present. A full Hartree-Fock numerical analysis of this case is left to future work but it is worth noting that band renormalization effects at half-filling are expected to be larger than at CN, resulting in smaller gaps.
Phenomenology of the K-IVC-We now comment on the phenomenological consequences of the K-IVC order: • Circulating currents. Fixing a spin species, the latticescale current j ij in the K-IVC ground state manifests a pattern of circulating currents which triples the unit cell, as shown in Fig. 1 ∼ 0.7µA obtained by assuming each electron in the flat band is circulating at velocity v F . In the spin-singlet K-IVC, the two spin-species carry the same current, and the state is thus an orbital-magnetization density wave. The spin-triplet K-IVC Q = n · s τ x/y σ y , however, is invariant under the usual spinful time-reversal operation TR = is y τ x K. Hence the two spin-species carry opposite current and the magnetization cancels -instead, there are circulating spin currents.
Nevertheless, both cases triple the unit cell. In the presence of umklapp scattering, this tripling will manifest as small bond distortions or topographic changes reminiscent of a Kekule pattern, which may be observable in atomically-resolved STM spectroscopy.
• Landau fan. Due the T Kramers degeneracy, the conduction (valence) bands of the K-IVC (Fig. 4) have a doubly degenerate band minimum (maxima) at the mini-Γ point. Per spin, they consist of a pair of bands, which we label Z = ±1, which disperse quadratically. Both bands carry trivial C 3 quantum number, and thus to leading order within a k.p approach the Hamiltonian for the conduction band-minima is where m * is the effective mass, ∇ × A = B is the external magnetic field, m Γ is the orbital magnetization of the bands at the Γ-point (which is odd under T ), and g s is the g-factor for spin. The low-field Landaulevel spectrum is thus N = B( e m * (N + 1 2 ) + m Γ Z + g s µ B sz 2 ) + · · · , with an analogous result for the valence band. Neglecting g s and the magnetization m Γ , the Landau-fan would thus have a ν = ±0, 4, 8, · · · degeneracy arising from spin and T -Kramers degeneracy. With m Γ , however, this degeneracy splits, ν = ±0, 2, 4, · · · , with the relative strength of the splitting depending on the ratio of e m * to m Γ . Experiments reporting a charge-gap at neutrality find oscillations at ν = ±0, 2, 4, 8, · · · [5], which seemingly combines the two. This may be because at higher N or B, the O(p 3 ) terms become important, making a full quantitative calculation a useful direction for future work.
• Z 2 -topology. Remarkably, when restricting to a spinspecies, the K-IVC is a topological insulator protected by Kramers time-reversal T and U(1) charge conservation. This is expected since it consists of two IVCs with opposite Chern number (|KA + |K B and |KB + |K A ) related by T . Note however this does not automatically imply edge states since the fractional translation τ z involved in T may be broken by a rough edge.
• Phase-transitions. Finally, on breaking various symmetries the K-IVC can be weakened or destroyed as discussed below.
Effect of single-particle perturbations-Due to the presence of an enlarged U(4) × U(4) symmetry which is only broken by relatively small terms which settle the energy competition among a few low energy states, we expect the ground state to be sensitive to symmetry lowering perturbations such as sublattice potential, strain and magnetic field. The presence of a sublattice potential ∆σ z is associated with alignment with hBN substrate which explicitly favors the VH state (Q = σ z ) over the K-IVC state. Assuming a fixed spin structure (Q ∝ s 0 or n · s), the two order parameters anticommute forming an O(3) vector living on S 2 as shown in Fig. 7. As ∆ is increased, this vector rotates towards the z-axis (VH) until it points completely along the z-direction restoring U V (1) symmetry as shown in Fig. 7. As a result, we do not expect this phase transition to be associated with a gap closing in the fermionic sector which is verified numerically in Fig. 7.
Next, we consider the effect of explicit C 3 symmetry breaking (e.g., strain) implemented following Refs. [12,42] by rescaling one of the Moiré hopping parameters by 1 + β. This mostly affects the dispersion term h x,y coupling linearly to the C 3 -breaking SM. In addition, the energies of both the VH and K-IVC states also decreases since they couple to the tunneling h x,y through creation of electron-hole pairs in opposite Chern bands in contrast to the "blocked" manifold descending from the VP phase. This contribution is quadratic in h x,y which is consistent with the quadratic decrease of the VH and K-IVC energies as a function of β seen in Fig. 7. With increasing β, the energy of the three orders approach each other whereas other states such as VP are not affected. Since the creation of electron-hole pairs is associated with a gapless order parameter due to vortices, the gap of the VH and IVC states goes down quadratically with β as shown in Fig. 7. It is worth noting that semimetallic behavior in transport can also emerge purely from disorder, even when the ground state of the clean system is insulating [43].
Finally, let us comment briefly on the effect of magnetic field. The Zeeman coupling depends on the spin structure and its effect on the gap depends non-trivially on the type of lowlying excitations [41]. On the other hand, the orbital effect of the magnetic field can be understood as follows. For inplane field, its main effect is to break C 3 symmetry, shifting the Dirac points away from the Moiré K and K points. In this regard, the effect is similar to the C 3 -breaking perturbation discussed above yielding a quadratic decrease of the gap with in-plane field which is consistent with the observation of Ref. [4]. On the other hand, an out-of-plane field is associated with a relatively large Chern-Zeeman effect ∼ σ z τ z which shifts the energies of the opposite Chern bands relative to each other. As a result, it is expected to drive a transition (b) Schematic illustration of the manifold of low energy states for finite ∆. Since the IVC and VH order parameters anticommute, the order parameter is a vector on S 2 which gradually rotates towards the z-axis as ∆t is increased. (c) IVC gap as a function of the C3breaking parameter β with the energies of the different states as a function of β given in (d). For comparison the results in (c) and (d) were generated using the subtraction scheme and parameters (θ = 1.05 o , w0 = 110 meV, w1 = 82.5 meV, = 7) in Ref. [12] .
to a QH state with Chern number ±4 at neutrality and ±2 at half-filling. We leave a more quantitative discussion for the effect of magnetic field to future works.
Conclusions-To summarize, based on both numerical and analytical arguments, we propose that the insulating state observed at charge neutrality in pristine MATBLG [5] is the K-IVC state, i.e. an inter-valley coherent state with an emergent spinless Kramers time-reversal symmetry T . Interestingly, modulo spin degeneracy, the K-IVC is a non-trivial topological insulator protected by T . The analytical arguments in favor of the K-IVC state are based on the presence of an approximate U(4) × U(4) symmetry. One consequence of this approximate symmetry is that small perturbations to the BM band spectrum coming from e.g. h-BN alignment or strain can destroy the K-IVC state and instead give rise to a valley-Hall or semi-metallic state at charge neutrality. It is therefore important to have an estimate of the magnitude of these effects in different devices. Our analysis has a natural generalization to doped systems with two additional electrons or holes per Moiré unit cell (ν = ±2), so we expect a spin-polarized version of the K-IVC state to occur at those fillings. At odd integer fillings the situation is different. Applying our construction to odd filling inevitably leads to anomalous Hall insulators, which is at odds with the present experimental data in tBG devices which are unaligned with the h-BN substrate. In fact, our analysis points to the possibility of different types of states at odd filling since, unlike the K-IVC states at even filling, no translationally symmetric Slater determinant state takes advantage of all the terms in the Hamiltonian. In addition, band renormalization effects are expected to play a bigger role, particularly at ν = ±3 where mixing with remote bands is more likely [10].
The K-IVC state has a very subtle type of symmetrybreaking order, leading to an interesting phenomenology. Depending on the spin texture of the K-IVC state, which is only determined by the small intervalley Hunds terms, we have put forward a physical interpretation of the K-IVC state as either an 'orbital-magnetization density wave' on the atomic scale, or a state with circulating spin currents. These types of order are presumably hard to directly detect experimentally, but leave their imprint on the electronic structure. Proposals for a smoking-gun experiment to identify the K-IVC state is left to future work.
Finally, let us comment briefly on the implications of our findings for superconducting states close to charge neutrality. The presence of the Kramers time-reversal symmetry T may have important implications for the nature of superconducting states which result from doping the K-IVC state. For example, recall that in conventional solids the Anderson theorem [44] protects pairing between Kramers time-reversal partners in the presence of non-magnetic impurities. Also, the K-IVC meanfield band structure suggests that small electron or hole doping will lead to concentric Fermi surfaces around the Γ point. A Fermi surface instability triggered by phonons and/or K-IVC order parameter fluctuations could then potentially give rise to the superconducting state. We leave a more detailed analysis of the nature of the superconducting states and their connection to the K-IVC for future work.
Acknowledgement.-We thank T. Senthil for helpful discussions. This work was partly supported by the Simons Collaboration on Ultra-Quantum Matter, which is a grant from the Simons Foundation Our starting point is the Bistritzer-MacDonald (BM) model of the TBG band structure [2], which we now briefly review. We begin with two layers of perfectly aligned (AA stacking) graphene sheets extended along the xy plane, and we choose the frame orientation such that the y-axis is parallel to some of the honeycomb lattice bonds. Now we choose an arbitrary atomic site and twist the top and bottom layers around that site by the counterclockwise angles θ/2 and −θ/2 (say θ > 0), respectively. When θ is very small, the lattice form a Moiré pattern with very large translation vectors; correspondingly, the Moiré Brillouin zone (MBZ) is very small compared to the monolayer graphene Brillouin zone (BZ). In this case, coupling between the two valleys can be neglected. If we focus on one of the two valleys, say K, then the effective Hamiltonian is given by: Here, l = t/b is the layer index, and f l (k) is the K-valley electron originated from layer l. The sublattice index σ is suppressed, thus each f l (k) operator is in fact a two-column vector. In the original BM model, h k (θ) is the linearized monolayer graphene K-valley Hamiltonian with twist angle θ: where v F is the Fermi velocity. In our numerics, we do not use the linearized dispersion (S2), but we replace it with the complete mono-layer graphene Hamiltonian. This means that h k (θ) is instead given by where R(θ) is the two-dimensional rotation matrix which rotates over an angle θ, g 0 (k) is given by and δ l are the three vectors connecting an A-sublattice site to its neighboring B-sublattice sites.
To define the second term in the BM Hamiltonian (S1) describing the inter-layer tunneling, we write the K-vector of layer l as K l . With this notation, q 1 is defined as Finally, the three matrices T i are given by Unless otherwise stated, we will use the values t 0 = 2.6 eV, w 0 = 80 meV, w 1 = 98 meV, θ = 1.05 o (S6) The single particle Hamiltonian within each valley H ± is invariant under the following symmetries where M y k = (k x , −k y ). In addition, the two valleys are related by time-reversal symmetry given by Here, σ, τ and µ denote the Pauli matrices in sublattice, valley and layer spaces, respectively. As a result, we can also write C 2 as In addition, at small angles, we can neglect the θ dependence of h k (θ). In this case, we have the extra unitary particle-hole symmetry given by In addition, we have U V (1) valley charge conservation given by For the first-quantized Hamiltonian, the symmetries can be written as illustrated in Table I. Here, we made the replacement M → iM whenever necessary to make all Z 2 unitary symmetries square to +1 and [T , P] = 0.
Symmetries of the BM model with non-vanishing/vanishing intrasublattice hopping in the original microscopic basis and the projected sublattice basis. In the latter, the gauge is chosen such that T = τxK and P = τzσyK and K. Here, τ , σ, µ, γ denote the Pauli matrices in the valley, sublattice, layer and band, respectively.

B. Chiral model
The chiral limit corresponds to taking the limit of vanishing intrasublattice Moire hopping w 0 = 0. In this case, the Hamiltonian has the extra anti-unitary chiral symmetry In this case, we can perform the gauge transformation f l,k → e i 2 lθσz f l,k to get rid of the θ dependence in the first term in the Hamiltonian. As a result, the particle-hole symmetry (S10) is exact at all angles. This means that we can combine S, T and P to get a Z 2 unitary symmetry R = SPT which flips layer, valley and sublattice. Combining this symmetry with the different symmetries of the model, we can generate different versions of the symmetries, for example new time-reversal symmetry T = T U acting as This time-reversal symmetry flips layer and sublattice indices but acts within the same valley. We can also define a new C 2 symmetry C 2 = C 2 U which leaves valley and sublattice index invariant In addition, we can combine T with C 2 to get a new C 2 T symmetry which leaves momentum invariant but interchanges valley and sublattice which satisfies (C 2 T ) 2 = −1.

C. Interaction and projection onto active bands
In the following, we derive the form of the interaction when projecting onto a set of active bands. Let c † α (k) be the creation operator for the energy eigenstate labelled by the combined index α = (µ, n) which includes the flavor index µ labelled by spin s =↑, ↓ and valley τ = ± and band index n. The fermion creation operator f † µ,a (r), with a denoting the layer and sublattice indices a = (l, σ), in the continuum model is defined by expanding the graphene lattice fermion creation operator close to the K and K' points as f † s,σ,l (r) = e iK l ·r f † (s,τ =+),a (r) + e −iK l ·r f † (s,τ =−),a (r). f † µ,a (q) is its Fourier transform in terms of the continuous momentum q which is not restricted within the first Moiré Brillouin zone. c † and f † are related to each other by the k-space wave functions as follows: where G is a Moiré reciprocal lattice vector and we used the fact that the wave functions are spin-independent. Once we choose a gauge of u τ,n;G,a (k) for all k in some MBZ, c † (k) are defined in terms of the f † (q) for those k. Due to the band topology, it is generally impossible to choose a symmetric, smooth and periodic gauge [6,[18][19][20]. We will generally always choose the gauge to be symmetric which means that it is either singular/discontinuous or not periodic. In general, this means that u τ,n;G,a (k + G 0 ) = m [U τ,G0 (k)] mn u τ,m;G+G0,a (k), U τ,G0 (k) † U τ,G0 (k) = 1 (S18) where U τ,G (k) = 1 for any periodic gauge. This, in turn, implies Note that the momentum argument for f † is unconstrained since we are using the continuum theory for monolayers of graphene and the normalization is chosen such that {f µ,a (q), f † µ ,a (q )} = δ µµ δ aa δ qq (suppose the system size is finite), and u τ,n (k)|u τ ,n (k) = δ τ τ δ nn , which imply {c µ,n (k), c † µ ,n (k )} = δ µµ δ nn δ kk when k, k are confined in the MBZ. For the purpose of projecting the interaction into these two bands, it is convenient to introduce the form factor matrix It follows from the definition that the form factor satisfies The interacting Hamiltonian is given by where A is the total area of the system and V (q) is the momentum space interaction potential, related to the real-space one by V (q) := d 2 rV (r)e −iq·r and h(k) includes both the single-particle BM Hamiltonian as well as band renormalization effects due to remote bands not included in the projection. Depending on the number of gates, V (q) takes the following form in the SI units: where the screening length d s is the distance from the graphene plane to the gate(s). Unless otherwise stated, we will use the double-gate-screened expression with = 9.5 and d s = 40nm.

II. HARTREE-FOCK
Here we detail our implementation of the Hartree-Fock method, in particular our "subtraction" scheme for avoiding a doublecounting of the mean-field interaction, and our prescription for projecting onto a finite number of bands. Modulo a soon-to-bediscussed correction, the Hamiltonian is where h BM is the BM-Hamiltonian, with eigenstates labelled by α. For the numerical calculations, we find it convenient to adopt a periodic gauge which means -using the notation of the previous appendix-that U G (k) = 1 (note that in our analytical discussions we sometimes use a different gauge, namely a symmetric one).
Given a Slater determinant with correlation matrix P αβ (k) = c † α,k c β,k , the corresponding Coulomb contribution to the Hartree-Fock Hamiltonian is so that the total energy of this state is given by However, as pointed out in Ref. [12], if the parameters of h BM are obtained by a method such as DFT, or by comparison with experiment, then h BM will already contain, to some extent, the effect of the interactions, and the above expression will double-count this contribution. Consider, for example, the case where the two layers are decoupled, so that h BM is two copies of graphene. If we take for P (k) the ground state of graphene at neutrality, then the Fock contribution to H C MF [P ] will lead to a logarithmically divergent renormalization of the Dirac velocity. However, the tight-binding parameters h BM are already chosen to replicate the measured Dirac velocity, so this renormalization will be unphysical.
To remedy this, it was suggested that the BM Hamiltonian should be replaced by where P 0 is a "reference" density matrix such that h BM is the full effective Hamiltonian when P = P 0 . The choice of P 0 then in principle depends on the method used to derive h BM . As in Ref. [10], we choose P 0 to be the density matrix of two decoupled graphene layers at neutrality. While it may be tempting to choose P 0 to be the density matrix of h BM at neutrality, in most ab-initio methods [45] the parameters in h BM are obtained without any reference to the twist angle θ, so it wouldn't make sense for P 0 to then depend on θ.
Having chosen P 0 , we must truncate to a finite number of bands for computational purposes. We truncate based on projection into the eigenbasis α of h BM , choosing 4N − ≤ α ≤ 4N + of the bands closest to the flat bands (N − and N + denote the number of bands per spin and valley). We assume that below / above 4N − /4N + the density matrix is empty / full, e.g. P αβ (k) = δ αβ for α, β < 4N − and P αβ (k) = 0 for α, β > 4N + , while for 4N − ≤ α, β ≤ 4N + , P αβ (k) is determined by HF. In principle, this implies the HF Hamiltonian includes a contribution from all the filled bands α < N − . However, there is also the corresponding subtraction of the reference density matrix P 0 αβ . Because of the small inter-layer tunneling w ∼ 100 meV, the contributions from P αα and P 0 αα cancel out for α corresponding to bands far away from the charge neutrality point. With the subtraction of the reference density matrix P 0 taken into account, the Hartree-Fock mean field Hamiltonian is given by The zero-temperature Hartree-Fock self-consistency condition states that the correlation matrix of the ground-state Slater determinant of H M F [P ] should be given by P (k). To numerically solve the self-consistency equation we used both the 'ODA' and 'EDIIS' algorithms, both of which are developed and explained in detail in Refs. [46,47].

III. FLAT BAND PROJECTED HAMILTONIAN
Motivated by the numerical observation that mixing between the two flat bands and the remaining bands is relatively small for symmetry-broken phases at CN, we will only keep these two bands in the following discussion. The effect of the other bands will be included only through renormalization effects of the single-particle Hamiltonian h(k) following the scheme of Ref. [12].

A. Sublattice-polarized basis
In the chiral limit w 0 = 0, the sublattice operator σ z anticommutes with the BM Hamiltonian leaving the space of states spanning the flat bands invariant. Thus, we can choose the flat band states to be eigenstates of the sublattice operator σ z . This basis is distinct from the band basis where the chiral symmetry operator is off-diagonal since it maps positive energy states to negative energy states. We note that the sublattice basis remains well-defined in the flat band limit for which the band basis is not well-defined.
Away from the chiral limit, we can still define the sublattice basis by diagonalizing the operator Γ nm (k) = u n (k)|σ z |u m (k) . This yields a well-defined basis as long as the eigenvalues of Γ(k) (which have equal magnitude and opposite sign due to C 2 T ) are non-zero, indicating a finite sublattice polarization. The value of the positive eigenvalue of Γ(k) for the realistic model parameters (S6) is given in Fig. S1 and we can see it never goes to zero. In this case, we can define the sublattice-polarized basis as the one where Γ(k) is diagonal. As in the main text, we will use the same Pauli matrices σ both sublattice index and the band index for sublattice-polarized wave-functions. It should be noted, however, that for w 0 = 0, the wavefunctions labelled by σ are only partially polarized on one of the sublattices i.e. they have amplitude on both sublattices. We note that for the chiral model at the magic angle, the sublattice-polarized wavefunctions have an explicit form in terms of theta functions as shown in Ref. [28].
To obtain the implementation of the different symmetries we start by noting that the eigenstates for a given spin can be labelled by their eigenvalues under τ z (valley index) and σ z (sublattice index). The phases of the four different wavefunctions can be chosen arbitrarily. Such choices will affect the form of the remaining symmetries in this basis. Once the phase of the wavefunction in valley K, sublattice A is fixed, we can use two of the three symmetries C 2 , T and P (or some combinations of them) to fix the phase for the other three wavefunctions. Since we will be mostly using time-reversal and particle-hole symmetries, we will choose to fix these as which are chosen such that T flips valley but not sublattice ({T , τ z } = 0, [T , σ z ] = 0) and P flips sublattice but not valley ({P, σ z } = 0, [P, τ z ] = 0). This leads to simple forms for PT and R symmetries Once these operators are fixed, we are not free to choose the form of C 2 , e.g. σ x τ x . This can be seen by noting that the modified C 2 symmetry C 2 = iτ z C 2 R which is diagonal in sublattice and valley cannot be k-independent. The reason is that the sublattice-polarized wavefunctions have a Chern number ±1 so the product of their parity eigenvalues over the four TRIMs (Γ, M , M and M ) has to be odd [48], which is incompatible with a k-independent representation of the symmetry. To obtain the most general form of C 2 , we start by noting that C 2 T is diagonal in valley space and off-diagonal in sublattice so its action on a state u σ,τ (k) labelled by sublattice σ and valley τ is This implies that we can write C 2k = diag(e iθ+(k) , e iθ−(k) ) τ σ x τ x . In this gauge, C 2 = diag(e iθ+(k) , e iθ−(k) ) τ . At any TRIM, (C 2 ) 2 = 1, yielding θ ± (k) = 0, π. Since each sublattice-polarized band has Chern number ±1, the sum of θ k over all TRIMs (Γ, M , M and M ) should be an odd multiple of π which means that θ k cannot be chosen to vanish everywhere. The symmetry representations in the sublatticepolarized basis in the chosen gauge are summarized in Table I.

B. Properties of the form factors
Let us start with decomposing the form factor into parts which commutes/anticommute with the chiral symmetry σ z In the chiral limit, the space of flat bands is invariant under chiral symmetry leading to vanishing Λ − q (k). Let us now see what is the most general form of Λ ± q (k) deduced from the other symmetries. U(2) K × U(2) K imply that Λ q (k) is diagonal in valley and independent of spin, limiting it to the terms τ 0,z σ 0,x,y,z . PT symmetry further restricts these to τ 0 σ 0,y and τ z σ x,z whereas C 2 T enforces the term multiplying τ z σ z to be purely imaginary and the remaining terms to be purely real, leading to In addition, T implies

C. Hierarchy of scales
The full Hamiltonian is given by (S22) where h(k) is the renormalized single-particle dispersion. That is, the dispersion when the flat bands are completely empty, i.e. ν = −4. This can be written as This expression assumes a periodic gauge such that Λ G (k) † = Λ −G (k). Here, h ν=0 (k) is the dispersion at ν = 0 which we can identify with the band projection of the BM Hamiltonian on the two flat bands to a very good approximation [12]. P 0 is the density-matrix obtained by filling the negative energy states of the BM Hamiltonian. U V (1), C 2 T and PT imply that h ν=0 has the form which leads to Substituting in (S35), we get where we used the fact that tr Q 0 (k)Λ * G (k) vanishes due to the form of Λ q (k) given in (S32) and (S33). Substituting in the interaction Hamiltonian, we get To reach this expression, we note that the normal-ordered interaction in (S22) differs from the density-density interaction in (S40) by a bilinear term which cancels exactly against the third term in (S38). In addition, we separated the terms in the sum over q corresponding to a reciprocal lattice vector G and combined it with the second term in (S38). Let us now writeρ which induces a splitting of the interaction term into a symmetric intrasublattice part (under the unitary symmetry R) and a symmetry breaking intersublattice part given by We note here that although ρ − ρ − does not by itself break the R symmetry, it is only non-vanishing if this symmetry is broken FIG. S2. The parameters of the renormalized dispersion defined in (S44). We notice that the chiral symmetric part hx,y(k) is much larger than the symmetry breaking part h0(k).
since otherwise the antisymmetric form factor Λ − q (k) vanishes. Similarly, we can splith(k) into a symmetric intersublattice part ∝ σ x , σ y τ z and a symmetry breaking intrasublattice part ∝ τ z as The values of h 0 (k) and |h x (k) + ih y (k)| (h x,y (k) are not separately gauge invariant) are shown in Fig. S2.
The different terms in the interaction can be estimated by replacing fermion bilinears with their maximum expectation value of 1 leading to Fig. S3 for the parameters in (S6). By averaging the plotted functions over k, we get the values of the bounds E +,m and E −,m to be 20, and 6 meV, respectively. This implies that the intrasublattice part of the interaction H + ∼ 20 meV is the largest scale in the problem. It is followed by the intersublattice part of the dispersion h x,y (k) and the intersublattice part of the interaction H − which are of the same order ∼ 5 meV which is about a factor of 4 smaller than H + . The non-symmetric part of the dispersion is much smaller ∼ 0.5 meV.
The largest scale in the problem H + is associated with an enlarged symmetry which can be seen by noting its invariance under any unitary transformation which yields the symmetry group U(4)×U(4) denoting independent rotations between states within a fixed Chern number sector.
The discussion of symmetry is simplified if we introduce the a new basisτ andσ such that 1 2 (1 ±τ z ) projects onto the space of bands with Chern number ±1 andσ labels the states with the same spin within this subspace. This means the subspace of bands withτ z = 1 consists of states polarized to sublattice A in the + valley and sublattice B in the − valley and vice versa for τ z = −1 withσ specifying the state within this 2-band subspace. More specifically, we definẽ In this basis, the symmetry of H + is given by This symmetry is broken in two different ways. The intersublattice dispersion h x,y (k) breaks it down to U(4) by requiring U to commute with σ x =τ x which is equivalent to the condition U 1 = U 2 in (S48) reducing the U(4) × U(4) symmetry of the interaction to U(4).
The other symmetry breaking effect comes from the intersublattice interaction H − which reduces the symmetry of the interaction from U(4) × U(4) to U(4) by enforcing the extra condition that U commutes with σ x τ z =σ zτx leading to U 1 =σ z U 2σz . This U(4) subgroup is different from the U(4) subgroup leaving h x,y (k) invariant. In the presence of both h x,y (k) and H − , the symmetry is obtained by intersecting the two U(4) subgroups leading to U 1 = U 2 =σ z U 1σz which implies This corresponds to U(2) K × U(2) K symmetry corresponding to independent U(2) rotation within each valley. We notice that the much smaller intrasublattice dispersion h 0 does not introduce any extra symmetry breaking.

IV. SYMMETRIC STRONG COUPLING LIMIT
Motivated by the discussion of the previous section, we will now consider a strong coupling approach to the problem by assuming that the intrasublattice interaction scale is much larger than both the intersublattice interaction H − and the intersublattice single-particle Hamiltonian h x,y , i.e. H + H − , h. Furthermore, we will neglect the intrasublattice dispersion h 0 since it is smaller than all other terms and does not break any symmetry that is unbroken at a higher energy scale.

A. Spinless model
For simplicity, let us first consider the simpler problem of spinless electrons at half-filling with the U(4) × U(4) symmetry replaced by U(2) × U(2). We start by noting that sinceρ + −q = [ρ + q ] † , the Hamiltonian H + is a non-negative definite operator. As a result, any many-body state annihilated by H + is a ground state. We further note that for any Slater determinant state |Ψ Q described by the density matrix P (k) = 1 2 (1 + Q) where Q commutes with σ z τ z , the action of the Hamiltonian is given by where C T is the total Chern number of the state. In a periodic gauge, (S21) implies Λ G (k) = Λ † −G (k) which, together with (S34), implies that Φ + G (k) = −Φ + −G (k). As a result, the summation over k in E C vanishes identically, leading to H + |Ψ Q = 0. Thus, we see that the states |Ψ Q with Q commuting with σ z τ z are exact ground states of H + .
These states can be understood using the schematic illustration of Fig. 3 in the main text. The figure contains four bands labelled by valley and sublattice indices and since the symmetric form factor Λ + q (k) is diagonal in both, the interaction H + is minimized by minimizing density fluctuations within each band which is achieved by completely filling two of the four bands. There are several possible ways to do this which can be grouped into two categories: (i) filling two bands with the same Chern number leading to a C T = ±2 QH states which is invariant under U(2) × U(2), or (ii) filling two states with opposite Chern number leading to a manifold of C T = 0 states generated by acting with U(2) × U(2) on the VP state Q = τ z .
1. Effect of hx,y Next, we consider the effect of the dispersion h x,y (k) on this manifold of states. This breaks the symmetry down to U(2) by coupling pairs of bands with opposite Chern number through tunneling with amplitude h x (k) + ih y (k). This can be seen by noting that the symmetric dispersion has the formτ x h x (k) +τ y h y (k), thus, in theτ ,σ basis, it connects bands with oppositẽ τ z = ±1 with the same value ofσ z .
We first note that, among the manifold of ground states of H + , those for which Q commutes with σ x are annihilated by h x,y . These correspond to states in which each pair of bands connected by the tunelling in Fig. 4 are both either filled or empty, forming the manifold U(2)/U(1) × U(1) S 2 spanned by the VP state Q = τ z and T -IVC. To understand the action of h x,y , let us first consider the case in which Q anticommutes with h x,y corresponding to states for which only one out of each pair of coupled bands in Fig. 4 is filled. The action of h x,y on such states creates an electron-hole pair by moving an electron at momentum k from a filled band to the same momentum at the corresponding empty band with opposite Chern number.
The energy associated with such process can be calculated within second order perturbation theory where an electron-hole pair is created then annihilated. Since this can be done for each pair of bands independently, we can limit ourselves to only one pair of bands with opposite Chern number coupled by h x,y which can be labelled with C = σ z τ z =τ z = ±1. We denote a state with an electron-hole pair with momentum k by where |Ψ + is the state where the C = +1 band is filled and the C = −1 band is empty. The energy correction due to the coupling h x (k) + ih y (k) between the two bands is then given by We note that operator H + conserves the number of electron-hole pairs and their momentum so it acts within the space of states |Ψ k,eh . Its action is given by This can be understood by noting that the action ofρ + q on an electron-hole pair either increases the momentum of the electron by q or decreases the momentum of the hole by q. As a result, the action ofρ + −qρ + q either returns the electron-hole pair to its initial state or shifts both momenta by ±q. The matrix H eh can be easily evaluated numerically leading to ∆E ≈ −1.85 meV. This agrees with the simple estimate which assumes that the typical value for the eigenvalues of H eh are of the same order as the interaction scale 15-20 meV, yielding ∆E ≈ 1-2 meV.
One important observation that is crucial for the previous argument is that the spectrum of the electron-hole pairs has a gap. This feature can be traced to the band topology as follows. We start by writing a general state with a single electron-hole pair We note that restricting this sum to the first Brillouin zone requires a k c † k,− c k,+ to be periodic in k. Due to band topology, it is impossible to choose a smooth and periodic gauge. In the following discussion, we will prefer to choose a smooth gauge for which the operators c k,± are not periodic in k. Instead, their phase winds by ±2π around the Brillouin zone. As a result, the phase of a k should wind by 4π around the Brillouin zone, i.e. a k has at least two vortices.
Substituting in (S53), we get The last term is a constant which vanishes in the thermodynamic limit and can be dropped. We now notice that to leading order in q, Φ + q (k) can be written as Φ + q (k) = −q · A k + O(q 3 ) where A k is the Berry connection To make further analytical progress, we assume that the magnitude of the form factor F + q (k) depends only on |q| and decays relatively quickly in q on the scale of the Brillouin zone leading to where c is the constant of the same order as the interaction scale. We note that (S57) has the same form as the Ginzburg-Landau free energy of a superconductor in a magnetic field in momentum space which is in the vortex lattice phase with two vortices per unit cell due to the non-trivial winding of a k . The free energy of such phase is always larger than zero which can be seen by employing similar arguments to those of Ref. [34] as shown below.
We start with the observation that, in a smooth non-singular gauge, A k is finite everywhere in the Brillouin zone |A k | ≤ |A k,max |. Without loss of generality, we can write a k = ρ(k x + ik y ) close to a vortex. If we choose a ball of radius |A k,max | −1 around the vortex, then its energy is given by c π ρ 2 2 [1 + O( 2 |A k,max | 2 )] which is always finite since ρ cannot be made arbitrarily small (this is a result of the normalization constraint (S54)). As a result, the energy expectation value in (S57) is always positive, i.e. the Hamiltonian H eh is gapped.
In summary, h x,y favors states where only one from each pair of bands connected by h x,y is filled, allowing for virtual hopping which lowers their energy by an amount ∼ E 2 h /E + . This is equivalent to the condition {Q, σ x } which selects a manifold of states consisting of two sectors: a U(2)-invariant QH state with Chern number ±2 and a manifold of states with vanishing Chern number corresponding to U(2)/U(1) × U(1) S 2 generated by the VH and K-IVC states.

Effect of H−
The intersublattice interaction H − , on the other hand, picks a different submanifold of ground states corresponding to the states annihilated by the non-symmetric density operatorρ − q . These are characterized by Q which commutes with σ y forming the manifold U(2)/U(1) × U(1) S 2 generated by the VP and K-IVC. The energies of the remaining states for which Q anticommutes with σ y (QH, VH, and T -IVC) are increased by an amount of the order Thus, in the presence of both h x,y and H − , the K-IVC, which benefits from both perturbations, has the lowest energy followed by the VP and QH/VH (the latter two are degenerate) whose competition is determined by the relative strength of the symmetrybreaking terms in the interaction ∆E − and the energy gain due to virtual tunneling E 2 h /E + . The T -IVC state, which was not seen in the numerics, is disfavored by both and has the highest energy. B. Spinful model

Charge neutrality
Upon including the spin, we can study the manifold of ground states at CN in a similar fashion starting with the states which minimize H + for which Q commutes with σ z τ z . These states are obtained by completely filling 4 of the 8 bands in Fig. 3. There are three sectors of such states with Chern numbers ±4, ±2 and 0. The manifold of Chern number ±4 consists of a single state Q = σ z τ z invariant under U(4) × U(4) rotations which is the analog of the QH state in the spinless case. The manifold of states with Chern number ±2 is obained by filling three bands with one Chern number and a single band with the opposite Chern number. This manifold is 12-dimensional and is isomorphic to [U(4)/U(3) × U(1)] 2 . The states within this manifold break several symmetries and are characterized by mixed orders where, for instance, one valley has a spin-polarized state whereas the other has a C 2 T -breaking sublattice polarized state. They necessarily involve some non-trivial spin structure and do not have analogs in the spinless problem. Finally, the manifold of zero Chern number states is 16-dimensional and is isomorhphic to [U(4)/U(2) × U(2)] 2 . This manifold includes all the zero Chern number states found in the spinless model (VH, VP, T -IVC, and K-IVC) in their spin-singlet Q ∝ s 0 or spin-triplet Q ∝ n · s variants. It also includes the purely spin-polarized (SP) state Q = n · s.
The dispersion h x,y selects for the states where only one band out of each pair of bands coupled by h x,y is filled which is equivalent to the requirement that Q anticommutes with σ x . This reduces the manifold of state with Chern number ±2 to a 6-dimensional manifold U(4)/U(3) × U(1) and that of Chern number 0 states to an 8-dimensional manifold U(4)/U(2) × U(2). The latter includes the VH and K-IVC states (both spin-singlet and triplet versions) but does not include the SP, VP or T -IVC states.
The non-symmetric interaction H − selects a different submanifold of states with the requirement that Q commutes with σ x τ z . The possible states satisfying this requirement has Chern numbers ±2 or 0 with the former forming the 6-dimensional manifold U(4)/U(3) × U(1) and the latter forming 8-dimensional manifold U(4)/U(2) × U(2). The zero Chern number states include the VP, SP, and K-IVC states.
Thus, the presence of both h xy and H − selects the K-IVC state as the unique ground state up to the action of U(2) K × U(2) K symmetry corresponding to independent spin rotation in each valley. The most general K-IVC state is given by where V is a 2 × 2 unitary matrix acting in the spin space. The gound state manifold is thus isomorphic to U(2). The action of U(2) × U(2) symmetry on the ground state is which generates the full U(2) group starting from any given state , e.g. V = 1. This means that any given state is invariant under a U(2) subgroup given by the condition U − = V † U + V which is consistent with the ground state manifold being U(2) U(2)×U(2) U (2) . This manifold which can be written as U(1) × SU(2) can be parametrized as e iφ e i θ 2 n·s , 0 ≤ φ < 2π, 0 ≤ θ ≤ π. Here, φ and θ can be associated with the total and relative angles of the IVC orders in the spin up and down sectors.
In the presence of intervalley Hund's term given by the U(2) K × U(2) K symmetry is broken down to U C (1) × U V (1) × SU (2). For the ground state Q parametrized by φ, θ and n, we note that the energy should remain independent on φ and n since they still transform non-trivially under the symmetry. Thus, we can fix their value to φ = 0 and n =ẑ. We now evaluate E J (θ) = Ψ θ |H J |Ψ θ which is nothing but the Hartree-Fock decoupling of H J . The Hartree term vanishes whereas the Fock term yields (up to a constant) E J (θ) ∝ −J a tr e i θ 2 sz s a e −i θ 2 sz s a = −2J(1 + 2 cos θ) For ferromagnetic Hund J < 0, the minimum is at θ = π while for antiferromagnetic Hund J > 0, the minimum is at θ = 0.