Moir\'{e} flat Chern bands and correlated quantum anomalous Hall states generated by spin-orbit couplings in twisted homobilayer MoS$_2$

We predict that in a twisted homobilayer of transition-metal dichalcogenide MoS$_2$, spin-orbit coupling in the conduction band states from $\pm K$ valleys can give rise to moir\'{e} flat bands with nonzero Chern numbers in each valley. The nontrivial band topology originates from a unique combination of angular twist and local mirror symmetry breaking in each individual layer, which results in unusual skyrmionic spin textures in momentum space with skyrmion number $\mathcal{S} = \pm 2$. Our Hartree-Fock analysis further suggests that density-density interactions generically drive the system at $1/2$-filling into a valley-polarized state, which realizes a correlated quantum anomalous Hall state with Chern number $\mathcal{C} = \pm 2$. Effects of displacement fields are discussed with comparison to nontrivial topology from layer-pseudospin magnetic fields.

Motivated by advances in twisted graphene systems, explorations into moiré superlattices formed by other two-dimensional materials, such as transition-metal dichalcogenides (TMDs) [29][30][31][32][33][34][35] and copper oxides [36,37], have also seen rapid progress recently. In particular, moiré flat bands in a twisted homobilayer TMD formed by the valence band (VB) states in each K-valley were shown to carry nonzero Chern numbers [29]. The nontrivial band topology arises from combined effects of moiré potential and interlayer coupling, which act together as a layer-pseudospin magnetic field and create skyrmionic pseudospin textures in superlattice cells. The strong spin-valley locking due to giant Ising-type spinorbit coupling (SOC) of order ∼ 100 meVs [38,39] further entails a possible quantum spin Hall state.
While flat bands from VB states have enjoyed wide interest, moiré physics arising from conduction band (CB) states in twisted TMDs remains largely unexplored. One possible complication lies in the relatively weak SOC in the conduction bands on the scale of a few to tens of meVs [39,40], which may cause crossings between spinup and spin-down moiré bands. Recent works on untwisted TMDs, on the other hand, revealed that the small spin-orbit splitting in CB states, when combined with Rashba SOC introduced by mirror symmetry breaking, can lead to nontrivial Berry phase effects [41][42][43]. Notably, upon assembling two identical TMD layers into a twisted bilayer, the mirror symmetry is broken locally in each layer ( Fig.1(a)-(b)) and Rashba SOC is expected to arise. How this SOC would affect the band topology of twisted TMDs is an outstanding question.
In this Letter we establish a novel topological phase generated by SOC in CB moiré bands in twisted TMDs, which is essentially different from those in twisted graphene [20][21][22][23][24][25] where SOC is negligible, and those in the VB moiré bands of twisted TMDs where SOC plays no essential role in creating nonzero Chern numbers [29]. Focusing on the specific case of a twisted homobilayer MoS 2 we predict that an interplay among angular twist, local symmetry breaking and spin-orbit coupling in the CB states creates moiré flat bands with nonzero Chern numbers C = ±2, and density-density interactions generically drive the system at 1/2-filling into a valley-polarized correlated QAH state. We further show that the Chern bands generated by SOC stay robust against displacement fields which would otherwise destroy the nontrivial topology from layer-pseudospins [29].
Continuum model for twisted homobilayer MoS 2 .-Stacking two monolayers of MoS 2 with a small relative arXiv:2107.01328v3 [cond-mat.mes-hall] 18 Mar 2022 twist angle θ 0 results in a moiré superlattice with lattice constant a M = a/θ 0 , where a is the lattice constant of each monolayer. The system has a symmetry group of D 3 ⊗ {U v (1), T }, where U v (1) stands for valley conservation, T for time-reversal symmetry, and the point group D 3 is generated by the three-fold rotation about the zaxis (C 3z ) and a two-fold rotation about the y-axis (C 2y ) ( Fig.1(a)). For an aligned bilayer at θ 0 = 0, the mirror symmetry M z about the horizontal 2D mirror plane, which is respected in a monolayer, is broken locally on each layer. This generates uniform electric fields of opposite signs in different layers, which remain effective upon a small angular twist with θ 0 ∼ 1 • ( Fig.1(b)). Based on models of monolayer MoS 2 with broken M z [41][42][43], the effective Hamiltonian for CB minima at ±K valleys of two twisted isolated layers can be written as: where ξ = ± denotes the valley index, l = 1(2) labels the top (bottom) layer, α, β label the spin indices with the σ-matrices representing the usual Pauli matrices for spins. K 1 ≡ K m,+ and K 2 ≡ K m,− are the shifted +Kpoints on layer 1 and layer 2 ( Fig.1(c)) due to the angular twist of θ 0 equivalent to rotating the top (bottom) layer by an angle θ 0 /2 (−θ 0 /2) about the z-axis. In Eq.1, m * ≈ 0.5m e is the CB effective mass at K, µ is the Fermi energy measured from band minima. The α so -term and β so -term account for the Rashba SOC and the Ising SOC, respectively. The Ising SOC has a valley-dependent sign due to time-reversal symmetry and causes a splitting of ∆ so = 2|β so | in the spin-up and spin-down CB states. The Rashba SOC has a layer-dependent sign imposed by C 2y which swaps the two layers. Spatial variations of α so are discussed in Supplemental Material (SM). Spatial modulations due to the formation of moiré pattern can be captured by introducing coupling terms between states at k and k + G M,j [23,29] ) are the reciprocal vectors of the moiré superlattice. The momentum-space moiré potential is given by where V l,G M,j = V M (cos ψ − i(−1) j l sin ψ) are complex coupling parameters with amplitude V M and phase ψ.
The effective inter-layer tunneling for states at Kvalleys is modeled following the general recipe of twocenter approximation [4,29], which can be written as . White arrows indicate the in-plane spinor field, which has opposite vorticities at Km,+ and Km,− and leads to a nontrivial skyrmion number S = +2. Parameters used are presented in Table I. where w 0 denotes the effective inter-layer tunneling amplitude near K, which is extrapolated to be w 0 ≈ 10 meV using a combined approach of empirical models and Slater-Koster methods (see the Supplemental Material [44] for details). The total non-interacting continuum model thus reads: H 0 = ξ=± l=1,2 H ξ 0,l + H ξ T + H M with parameters tabulated in Table I.
Moiré flat Chern bands and skyrmionic spin textures.-From general considerations, level spacings among the lowest moiré bands correspond roughly to the quantization energy ∆E ∼ 2 π 2 /(2m * a 2 M ) of an electron confined in a superlattice cell with size a 2 M = a 2 /θ 2 0 . For θ 0 ∼ 1 • − 2 • , ∆E ∼ 1 − 10 meV in twisted MoS 2 , which is comparable to the spin-orbit splitting ∆ so ≡ 2|β so | = 3 meV caused by the Ising SOC [39]. If spin is conserved (e.g., in the absence of Rashba SOC), spin-up and spindown moiré bands are expected to cross in general.
By setting α so = 0 in H 0 , we find that for ξ = +, the 1 st spin-down and the 2 nd spin-up moiré bands cross each other for θ 0 ∈ (1.25 • , 1.5 • ). As a specific example, spinresolved moiré bands at θ 0 = 1.4 • with α so = 0 are shown in Fig.2(a). With spin conservation in this case, the spin Chern number is well-defined, and the spin-up and spindown bands involved in the level crossing have Chern numbers (C 1,↓ = +1, C 2,↑ = −1), which originate from the layer-pseudospin magnetic fields as in moiré bands from VB states [29]. Upon turning on α so , crossing points are gapped out by the layer-dependent Rashba SOC µ αso βso VM ψ w0 0 meV 80 meV·Å -1.5 meV 10 meV -89.6 • 10 meV which mix the spin-up and spin-down species (Fig.2(b)). As a result of this mixing, the original C 1,↓ and C 2,↑ from layer pseudospins annihilate each other; meanwhile, a new pair of flat bands (2 nd and 3 rd bands in Fig.2 with Chern numbers C = ±2 emerge (see SM [44] for details on Chern number calculations). The nontrivial gap induced by Rashba SOCs is further signified by giant Berry curvatures of order 10 4Å2 in the MBZ (Fig.2(c)).
The unusual Chern numbers C = ±2 arise from a unique combination of angular twist and local mirrorsymmetry breaking: as moiré bands at opposite K m,+ and K m,− points originate from states in different layers ( Fig.1(c)), the layer-dependent Rashba SOCs (Eq.1) create an in-plane spinor field with opposite vorticities at K m,+ and K m,− in the MBZ. This unsual pattern causes the in-plane spin to wind twice as one goes around a loop enclosing the Γ m -point in the MBZ ( Fig.2(d)). As we confirm numerically, the spin textures of the 2 nd and 3 rd moiré bands are characterized by nonzero skyrmion numbers S = d 2 kn k · (∂ kxnk × ∂ kynk )/4π = ±2 (n k : spin orientation of state at k), and these two bands describe a two-level system with one-to-one correspondence between Chern numbers C and skyrmion number S [45].
Correlated QAH state at 1/2-filling.-Given the narrow bandwidth W ≈ 1 meV of moiré Chern bands ( Fig.2(b)), the characteristic Coulomb energy scale g C e 2 / a M ≈ 10 meV overwhelms the kinetic energy and correlation physics is expected to arise (assuming θ 0 = 1.4 • and dielectric constant = 10 [35]). Motivated by the observation that twisted graphene at 3/4−filling can behave as an SU(4)-quantum Hall ferromagnet [23,[26][27][28], and the fact that all CB moiré bands in Fig.2(b) involves only a two-fold valley degeneracy (note that spin SU(2)-symmetry is already broken by SOC), we examine correlation effects when the 2 nd moiré band is half-filled. For simplicity, we drop the band index n in the following.
Including density-density interactions, the low-energy effective Hamiltonian: is the non-interacting part with band energies E ξ (k), and the interacting Hamiltonian compatible with U v (1) and T reads where V (q) is the Fourier transform of the two-body interaction, A is the total area of the system, and Λ ξ (k + q, k) ≡ u ξ,k+q |u ξ,k (|u ξ,k : periodic part of the Bloch states) are the form factors arising from projecting the density-density interaction to active bands. The general trial Hartree-Fock ground state at 1/2-filling has the form: where |Φ 0 is the vaccuum state with all lower-lying bands filled, and w ξ (k) are the variational parameters.
Minimizing E Φ ≡ Φ|H eff |Φ with mean-field approximations for Fock exchange terms (see SM [44]) leads to a set of self-consistent equations: Here, J 0 and J 1 denote the mean-field intra-valley and inter-valley exchange coupling constants, ∆ ξ = J 0nξ (n ξ : mean occupancy for valley ξ) and ∆ +− are the intra-valley and inter-valley order parameters, which characterize the average energy gains from intravalley and inter-valley exchange interactions, respectively. δ(k) ≡ E + (k) − E − (k) denotes energy difference between bands from two valleys ξ = ±, D(k) = Solutions of Eq.S27 are divided into two classes: (i) the valley-polarized (VP) state [6,23], in which one out of the two nearly degenerate moiré bands from two valleys is fully filled, with spontaneous T -symmetry breaking: ∆ M = ξJ 0 , ∆ +− = 0, w ξ (k) = 1; (ii) the inter-valley coherent (IVC) state, in which the Slater determinant is formed by coherent superpositions of states from the two valleys, with spontaneous U v (1)-symmetry breaking [6,23,24]: ∆ M = 0, ∆ +− ≈ J 1 /2, w +,− (k) = 0. By comparing the energies of the VP and IVC states, we obtain the mean-field J 0 − J 1 phase diagram for the range 0.3g C ≤ J 0 , J 1 ≤ g C (Fig.3(a)). The IVC state is favored throughout the entire J 1 ≥ J 0 regime, while the VP phase takes up most of the J 1 < J 0 regime. By tuning J 0 /J 1 across the phase boundary (green dashed line in Fig.3(a)), a first-order transition occurs at J 0 J 1 (Fig.3(b)).
It is worth noting that given a general repulsive interaction V (q) > 0, the inequality 2|Λ + (k , k)||Λ − (k, k )| ≤ |Λ + (k , k)| 2 + |Λ − (k, k )| 2 together with T -symmetry leads to J 0 ≥ J 1 , regardless of details of the form factors and the microscopic interaction [44]. Thus, electron correlations most likely drive the system at 1/2-filling into a T -broken VP state, which realizes a correlated QAH state with C = ±2. The case of general filling away from 1/2 is discussed in SM [44].
Effects of displacement fields and comparison with Chern bands from layer pseudospins.-As shown in Fig.2(a), with α so = 0, nonzero Chern numbers also arise in CB moiré bands due to layer pseudospin magnetic fields, with the same origin as those in VB moiré bands [29]. Thus, two different mechanisms for moiré Chern bands, one from layer pseudospin and the other from SOCs, coexist in the CB moiré bands. In contrast, the Rashba SOC is unimportant in the VB moiré bands where spin-up and spin-down bands are separated by 100 − 400 meV [39], and the VB moiré band topology is governed by layer pseudospin alone.
The nontrivial topology from layer pseudopsin is shown to be prone to displacement fields which can destroy the skyrmionic pseudospin textures by polarizing the layerpseudospins [29]. On the other hand, the SOC mechanism has an independent origin in the avoided level crossings between spin-up and spin-down bands and does not require nontrivial layer pseudospin textures. Thus one would expect the nontrivial band topology in the CB moiré bands to be more robust against displacement fields than its VB counterpart.
To demonstrate the effects of displacement fields, we follow Refs.
[29] and introduce a layer-dependent potential V D,l = (−1) l V z /2, (l = 1, 2) in H 0 . The moiré bands at θ 0 = 1.4 • under finite V z are shown in Fig.4(a), where we set V z = 4 meV strong enough to destroy the nontrivial topology generated by layer pseudospins (Fig.4(b)). Clearly, the avoided level crossings still occur and a pair of Chern bands with Chern numbers (+1, −1) remain. The Chern number is reduced from C = ±2 to C = ±1 under V z because the displacement field drives a band inversion near K m,+ and mediates a Chern number exchange ∆C = ±1 between the 1 st and 2 nd moiré bands.
The topological phase diagram for CB moiré bands as a function of θ 0 and V z is shown in Fig.4(b). Due to the extra mechanism from SOCs, the critical displacement field for the nontrivial topological regime is enhanced to be V c2 z , which is 2-4 times of the critical V c1 z (yellow dashed line) needed to destroy the layer pseudospin mechanism for θ 0 ∼ 1 • (V c1 z obtained by setting α so = 0 in H 0 ). Im- portantly, there is a wide parameter regime in the phase diagram (depicted in orange) where the layer pseudospin mechanism proposed in Ref.
[29] fails while the CB moiré bands remain topological, which confirms that the CB states are more robust than VB states against displacement fields.
Conclusion and Discussions. -While on-going activities in twisted TMDs have focused mainly on VB moiré bands [29][30][31][32][33][34][35], our proposal of moiré Chern bands generated by SOC opens a new pathway into the largely unknown territory of CB moiré physics. When these Chern bands are half-filled, according to our predictions, electron interactions can lead to a spontaneously T -broken VP phase and realize a correlated QAH state, which will manifest itself through quantized Hall conductance in transport measurements.
To realize the topological moiré bands generated by SOC, any finite Rashba SOC which is allowed by the D 3 -symmetry of the system would be sufficient, in principle, to induce a nontrivial gap. Notably, as MoS 2 is intrinsically semiconducting (with Fermi level ∼ 0.8 eV below the conduction band edge) [38,39,46], it is necessary to gate the chemical potential such that the CB moiré bands are filled in the first place. With the dual-gate setup used widely in experiments [1,34], local mirror symmetry breaking can be further enhanced by interfacial contact with gating electrodes, which enhances the SOC gap in the non-interacting bands. Due to the correlated nature of the QAH state at 1/2-filling, the actual topological gap to be manifested experimentally is not solely determined by the SOC gap in the non-interacting bands; instead it should be largely governed by the intra-valley exchange coupling J 0 ∼ 10 meV for θ 0 ∼ 1 • [44]. This sizable correlation-induced topological gap will reduce complications of disorder and thermal effects in experimental detection of the predicted correlated QAH state. However, as θ 0 increases, correlation effects become weaker and the moiré bands more dispersive, thus the predicted correlated QAH phase would be less robust in the large twist angle regime. In the main text, we mention that an effective inter-layer tunneling parameter w 0 is extrapolated using a combined approach of empirical model and Slater-Koster method. Here, we present details on the calculation of w 0 .
The inter-layer tunneling between two MoS 2 monolayers with a small twist angle can be conveniently modeled by the two-center approximation following the general recipe introduced in previous studies on twisted bilayer graphene [S1] and twisted homobilayer MoTe 2 [S2]. Particularly, electronic states near the conduction band minimum of monolayer transition-metal dichalcogenides(TMDs) are predominantly from the 4d z 2 -orbitals of the transition metal atoms [S3] and the inter-layer tunneling is expected to be mediated by the σ-bonding between 4d z 2 -orbitals in each layer. To start with, we consider fermionic operators that create Bloch states of the form: Here, k, R and k , R denote the momentum and lattice vectors of layer 1 and 2, respectively. c † l,α (R) creates a localized Wannier orbital of d z 2 character with spin α at site R in layer l. In general, the inter-layer tunneling Hamiltonian is given by: where the tunneling matrix T reads: Here, t αβ (q) = d 2 rt αβ (r)e −iq·r is the Fourier transform of t αβ (r), A is the total area of the system, Ω = A/N is the area of the primitive unit cell. In the last step of the derivation above, we made use of the identity R e i(q−k)·R = N G δ q,k+G .
To evaluate the effective tunneling strength t(q = K) near the K-points, we use the following simple model for t(r) which incorporates the Slater-Koster σ-bonding between d z 2 -orbitals in the two different layers with an empirical exponential decay: Here, d = 0.7 nm is the inter-layer distance between two TMD monolayers [S4], n(r), l(r), m(r) are the directional cosines of the vector r connecting the two position vectors in the two layers, with n 2 (r) = d 2 d 2 + |r| 2 , l 2 (r) + m 2 (r) = |r| 2 d 2 + |r| 2 .
The value of λ is estimated based on the empirical fact that t ≈ 0.1t, where t and t are the next-nearest-neighbor and neareast-neighbor hopping amplitudes. Thus, t/t = e λa ≈ 10, and λa = ln(10) ≈ 2.3. Note that the form of t(r) in Eq.S6 is a radial function t(r) ≡ t(r), and its Fourier transform has the general form of: where J 0 (x) is the Bessel function of order zero. With the form of t(r) in Eq. S6, t(q) can be obtained by numerical integration with the only fitting parameter t 0 . To fix t 0 , we note that states near the Γ-point of the valence band states in MoS 2 also originate from 4d z 2 -orbitals [S3], and the inter-layer coupling causes an energy splitting of ∆E 2t(q = 0) ≈ 800 meV at the Γ-point in the valence bands in aligned bilayer MoS 2 [S5]. This implies t(q = 0) ≈ 400 meV and allows us to fix t 0 and obtain the values of t(q) for all q as shown in Fig.S1. The effective interlayer coupling strength at the K-point in our continuum model is then extrapolated to be w 0 ≈ 10 meV at qa = Ka = 4π/3 (dashed vertical line in black in Fig.S1).

A. Numerical method for computing Chern numbers
The Chern numbers of the CB moiré bands presented in Fig.2 and Fig.4 of the main text are calculated numerically using the standard formula introduced in Ref. [S6]. The procedures of the numerical method is outlined below. For the sake of simplicity, we consider a given CB moiré band and drop the band index in the following.
First, we construct a discretized moiré Brillouin zone (MBZ) with an N x × N y grid. For each momentum point k in the grid, we evaluate the following complex quantities: where |u(k) is the periodic part of the Bloch state at k,x,ŷ are unit vectors in k x and k y directions, ∆k x and ∆k y measure the spacings between neighboring grid points in k x and k y directions. Note that the phase factors of U x and U y essentially account for the Berry connections on each link between two neighboring grid points. The gauge-invariant Berry flux threading through one plaquette around k is given by γ k = arg W (k), where W (k) is the Wilson loop: The Chern number is then expressed as the sum over all Berry fluxes γ k through a total number of N x ×N y plaquettes: We note that to obtain accurate results for Chern numbers in the continuum model, the total number of grid points must be large enough such that | arg W (k)| < π for all k, and the total number M of the moiré Brillouin zones must also be large enough to ensure an accurate description of the low-energy physics at the exact M → ∞ limit. In our model, we find that N x = N y = 40 and M = 81 produce accurate numerical results for C in the CB moiré bands.

B. Effects of Berry phase in monolayer transition-metal dichalcogenides
It is known that a Berry phase with value close to π can appear in a local region around K-points in the Brillouin zone of a monolayer transition-metal dichalcogenide (TMD) [S7]. This Berry phase originates from the massive Dirac Hamiltonian involving both the conduction and valence band edges at +K-point (see black dashed rectangles in Fig.S2), which is essentially different from both the Berry curvatures generated by the layer degrees of freedom in VB moiré bands [S2] (green dashed rectangles in Fig.S2)) and the giant Berry curvatures (Fig.2(c) of the main text) generated by the spin and layer degrees of freedom in our continuum model (pink dashed rectangles in Fig.S2)). While by summing over both the spin-up and spin-down valence bands in a monolayer TMD at one of the K-valleys, one acquires a total Berry phase close to 2π and appears to correspond to a "Chern number" of 1 for this valley, such a "Chern number" is, strictly speaking, not well-defined. This is because the Berry curvatures in a monolayer are summed over two bands and integrated only over a local region around +K in the (large) monolayer Brillouin zone. The integral is not guaranteed to be quantized by any topological reasons. In contrast, the Chern numbers in both the CB moiré bands in the main text and the VB moiré bands in Ref. [S2] are well-defined quantities because the Berry curvatures of each single band are integrated over the entire (small) moiré Brillouin zone. The resultant Chern numbers of each band from valley ξ = + or ξ = − are guaranteed to be quantized by topology.
Importantly, we note that the Berry phase in monolayer TMDs do not affect the topology of CB moiré bands in a twisted homobilayer. This is because the Berry curvatures from the massive Dirac Hamiltonian is only on the order of ∼ 10Å 2 [S7], three orders of magnitude smaller than the Berry curvatures ∼ 10 4Å 2 in the moiré Chern bands ( Fig.2(c) of the main text) generated by spin and layer degrees of freedom. Thus, the topology of CB moiré bands is dictated by the spin and layer degrees of freedom, which justifies why the contributions from remote valence bands (∼ 1.6 eV away from conduction band states) can be neglected.

A. Hartree-Fock energy functional and Lagrange equations
When the Chern bands in a twisted bilayer MoS 2 are half-filled, the low-energy effective Hamiltonian is written as: where the non-interacting Hamiltonian is given by Here, ξ = ± is the valley index, E ξ (k) is the energy of moiré band from valley ξ. The effective interacting Hamiltonian is given by: where V (q) = d 2 rV (r)e iq·r is the Fourier transform of the two-body interaction V (r 1 − r 2 ), A is the total area of the system, Λ ξ (k + q, k) ≡ u ξ,k+q |u ξ,k are the form factors arising from projecting the general density-density interaction to the active bands, which end up as the overlap between the cell-periodic parts of Bloch states within the same band from valley ξ. Note that due to the U v (1)-symmetry, wave functions from different valleys do not overlap: As mentioned in the main text, the Coulomb energy scale g C ∼ 10 meV in twisted bilayer MoS 2 overhelms the band width W ∼ 1 meV in the Chern bands found in the continuum model. This motivates us to consider possible ground states at 1/2-filling as a general Hartree-Fock state of the form: where w ξ (k) and θ ξ (k) are real numbers and denote the amplitude and phase of the coefficients of valley ξ at momentum k. |Φ 0 is the vaccuum state with all bands below the active bands being fully filled. The variational parameters can be determined by minimizing the ground state energy subject to the normalization condition w 2 + (k) + w 2 − (k) = 1. The energy of |Φ is given by: where E Φ 0 , E H and E Φ F refer to the kinetic energy, Hartree energy, and the Fock exchange energy of |Φ 0 , respectively. It is straightforward to show that the Hartree term essentially accounts for the direct coupling: E H ∝ n 2 e , where n e is the electron density of the system, which is fixed at a given band filling. Thus, we neglect E H from now on and focus on E Φ 0 and E Φ F : Note that in order to minimize E Φ F , the phase factors would adjust themselves such that e −iθ ξ (k+q) e iθ ξ (k) Λ ξ (k+q, k) = |Λ ξ (k + q, k)|. This reduces E Φ F to the following form: where we have used the identity |Λ ξ (k, k + q)| = |Λ ξ (k + q, k) * | = |Λ ξ (k + q, k)|. To minimize E Φ 0 + E Φ F subject to the constraint w 2 + (k) + w 2 − (k) = 1, we introduce the Langrange multipliers λ k such that This leads to the following two equations: where we introduced the shorthand notation: As we seek for nontrivial solutions of Eq.S19, it is convenient to introduce the following two quantities: which allows us to reformulate Eq.S19 as an eigenvalue equation: Note that λ k has the physical interpretation as the energy contribution from k to the total energy of the Hartree-Fock state. This motivates us to focus on the lower branch of eigenvalues of the 2 × 2 symmetric matrix: The defining equations (Eq.S21) require that ∆ + (k), ∆ − (k), ∆ +− (k) satisfy the following self-consistent equations: Here, J ξξ (k, k ) are the Fock exchange functions: where Ω is the area of the real-space moiré unit cell, G M denotes the moiré reciprocal lattice vectors.

B. Mean-field approximation
To simplify Eq.S25, we employ the mean-field approximation which replaces the exchange coupling strengths J ξξ (k , k) by their averages: intravalley exchange coupling J 0 ≡ J ++ (k , k) = J −− (k , k) , and the intervalley exchange coupling J 1 ≡ J +− (k , k) . This reduces Eq.S25 to a coupled set of self-consistent equations involving only constant order parameters: cos(φ k ) = 1 2 By adding and subtracting the first two equations in Eq.S27, we end up with a compact form (Eq.5 of the main text): It is worth noting that there exist two possible solutions for the last equation in Eq.S29, which divides the whole set of solutions to Eq.S29 into two different classes: (i) ∆ +− = 0, and (ii) ∆ +− = 0. Particularly, in case (i) where ∆ +− = 0, we have D(k) = 1 2 |δ(k) − (∆ + − ∆ − )|, i.e., cos(φ k ) = ±1 for all k (or equivalently, either ∆ + = J 0 , ∆ − = 0 or ∆ + = 0, ∆ − = J 0 ). These two degenerate states are the valley-polarized (VP) solutions which spontaneously break time-reversal symmetry. The energy of the two degenerate VP states is given by In case (ii) where ∆ +− = 0, one can divide both sides of the 3 rd equation in Eq.S29 by ∆ +− , which simplifies the set of equations as below: where the three unknown mean-field order parameters ∆ + , ∆ − , ∆ +− that characterize this state need to be solved self-consistently. In particular, we note that ∆ +− = 0 implies a nontrivial inter-valley order associated with this Hatree-Fock state, in which the microscopic valley is no longer conserved (i.e., spontaneous breaking of valley-U v (1) symmetry). On the other hand, any solution with nonzero ∆ M ≡ ∆ + − ∆ − necessarily lifts the valley degeneracy and breaks time-reversal symmetry spontaneously. Thus, we call ∆ M as the valley magnetic order. As it turns out, under time-reversal symmetry (e.g., in the absence of any external magnetic fields), the only U v (1)-breaking solution must satisfy the condition ∆ + = ∆ − , i.e., there is no valley magnetic order ∆ M = ∆ + − ∆ − involved. This solution is known as the (generalized) inter-valley coherence (IVC) state [S8] (note: w + (k) = w − (−k) = 1/ √ 2 in general except at time-reversal-invariant momenta). According to Eq.S23, the energy of the IVC state is: The ground state can then be determined by comparing the two energies E IV C and E V P : where ∆ + , ∆ − , ∆ +− involved in D(k) are given by the self-consistent solutions of Eq.S31. The ground states obtained for each pair of (J 0 , J 1 ) map out a J 0 − J 1 mean-field phase diagram.
To develop some physical insights into the energetics of the system, we first consider some simple limits. First, we note that the flat band condition implies J 0 , J 1 |δ(k)|. This combined with the self-consistency condition 1 = J1 2N k 1 D(k) suggests D(k) ∼ J 1 /2. Thus, when the intravalley exchange coupling dominates over its intervalley counterpart: J 0 J 1 , we have ∆E ∼ N (J 0 − J 1 )/2 0. In this case, the VP state is favored. By a similar argument, when the intervalley exchange coupling dominates: J 1 J 0 , ∆E ∼ N (J 0 − J 1 )/2 0 and the IVC state always wins. In particular, in the entire J 1 ≥ J 0 region, using the harmonic mean-arithmetic mean (HM-AM) inequality: where we used the last equation in Eq.S31. This shows that for J 1 ≥ J 0 , ∆E ≤ N (J 0 − J 1 )/2 ≤ 0 always holds, and the system favors the IVC state with nonzero ∆ +− and ∆ M = ∆ + − ∆ − = 0. Next, we consider the J 0 > J 1 regime. As we discussed above, at J 0 = J 1 , the system always favors the IVC state, while in the limit J 0 J 1 the system should favor the VP state. This suggests that upon increasing the ratio J 0 /J 1 , a phase transition must happen somewhere in the J 0 > J 1 regime. By solving the self-consistent equations (Eq.S29) numerically, we obtain the J 0 − J 1 phase diagram in Fig.3 of the main text, which reveals that the VP state is favored almost immediately when the J 0 > J 1 regime is accessed, and a first-order transition from IVC to VP phase occurs at J 0 J 1 . This is due to the fact that the energetics of the system is essentially governed by (J 0 , J 1 ) under the condition J 0 , J 1 |δ(k)|. As a result, ∆E N (J 0 − J 1 )/2 almost holds all the time, and the VP state is favored almost in the entire regime with J 0 > J 1 . D. Physically accessible regimes of (J0, J1) under density-density repulsions We note that the discussions above are based on pure algebraic analysis -it is clearly desirable to examine the accessible regimes of (J 0 , J 1 ) on physical grounds. Indeed, given a generic density-density repulsive interaction which respects U v (1) and T , the relation between J 0 and J 1 can be derived. In particular, the intra-valley exchange coupling strength is Note that in the equations above we used the fact that V (q) = V (−q) and the condition Λ + (k , k) = Λ − (−k, −k ) imposed by time-reversal symmetry. For the inter-valley exchange coupling strength, Therefore, we identify the physically accessible points (J 0 , J 1 ) are located in the J 0 ≥ J 1 regime of the phase diagram.
Notably, the relation J 0 ≥ J 1 derived above holds regardless of details of the wave functions and the microscopic interaction V (q) as long as the interaction is repulsive in nature. Thus, as shown in Fig.3(a) of the main text, the VP state takes up a large portion of the physical regime with J 0 ≥ J 1 except a narrow strip with J 0 ≈ J 1 where the IVC state is favored.
E. Case of general filling away from 1/2 Here, we briefly discuss the physics for general fillings away from 1/2. For general fillings above 1/2, one valley is fully filled, while the other is partially filled. On the other hand, for fillings below 1/2, one valley is empty while the other is partially filled. In both cases, the system is metallic and one expects anomalous Hall effects with non-quantized Hall conductance due to the large Berry curvatures hosted in the Chern bands (Fig.2(c) of the main text).
Moreover, at full band filling (i.e., both valleys being fully filled), we expect time-reversal symmetry to be restored and the opposite Chern numbers in opposite valleys give rise to a pair of counter-propagating edge states. This effect can be termed as the quantum valley Hall effect, analogous to the quantum spin Hall effect in which opposite Chern numbers from opposite spin species give rise to a pair of counter-propagating edge states.
F. Correlated topological insulating gap at 1/2-filling As we mentioned in the Conclusion and Discussions section of the main text, under the flat band condition W g C (W : band width, g C : interaction strength) at small twist angles θ 0 ∼ 1 • , the actual topological insulating gap to be manifested experimentally at 1/2-filling is not solely determined by the gap induced by Rashba SOCs in the noninteracting bands. Instead, it is largely governed by the correlation-induced insulating gap for the valley-polarized (VP) state, which is approximately given by the intra-valley exchange coupling strength J 0 ∼ g C ≈ 10 meVs (see estimation of g C in the section "Correlated QAH state at 1/2-filling" in the main text).
Physically, for a VP state, the intra-valley exchange interactions among electrons in the fully filled moiré band lowers the energy of each individual electron in this valley by an amount of J 0 . On the other hand, the energy of the empty band in the other valley remains unaffected, as electron correlation cannot occur within an empty band. As a result, once the non-interacting bands acquire nonzero Chern numbers from any finite Rashba SOC, the correlated topological gap at 1/2-filling would be on the order of J 0 = 10 meVs. To explicitly demonstrate this, we use our Hartree-Fock formalism (see Eq.S30) to calculate the correlated band structures for both valley ξ = + and valley ξ = − when the pair of 2 nd moiré bands are half-filled (Fig.S3). As the spontaneous T -broken VP phase selects one out of the two valleys randomly, we assume without loss of generality that the 2 nd moiré band from ξ = + being fully filled, while its counterpart from ξ = − is empty. Clearly, the topological insulating gap (indicated by double arrow in Fig.S3(c)-(d)) corresponds to the scale of J 0 , instead of the gap induced by Rashba SOC in the non-interacting bands in Fig.S3(a)-(b).

IV. Estimate of spatial variation in Rashba SOC strength
As we discussed in the Conclusion and Discussion section in the main text, the formation of moiré pattern in principle leads to spatial variations of the Rashba SOC strength on the moiré scale (the amplitude of variations denoted by δα so in the following).
To estimate δα so , the key observation is that the spatial variation in Rashba strength is not independent of the moiré potential. In particular, the moiré potential causes a spatial variation in the inter-layer potential differences between the two constituent layers with its amplitude on the scale of V M ≈ 10 meVs (Table I of the main text). Given the inter-layer distance d = 7Å in bilayer MoS 2 , this corresponds to a spatially modulating electric field with the amplitude of variations given by δE z = 1.4mV ·Å −1 . Assuming a linear scaling relation δα so = λδE z , we obtain an estimated value of δα so ≈ 0.08meV ·Å by using a proportionality constant λ = 0.0625e ·Å 2 for MoS 2 extrapolated from Fig.7 of Refs. [S9]. Thus, the amplitude of the spatial variation in Rashba SOC is negligibly small compared to previous reports on gate-induced Rashba SOC with α so ∼ 30meV ·Å in each layer [S10]. This justifies the use of a uniform Rashba SOC in our continuum model as a good approximation.