Skyrmions in twisted bilayer graphene: stability, pairing, and crystallization

We study the excitations that emerge upon doping the translationally-invariant correlated insulating states in magic angle twisted bilayer graphene at various integer filling factors $\nu$. We identify parameter regimes where these are associated with skyrmion textures in the spin or pseudospin degrees of freedom, and explore both short-distance pairing effects and the formation of long-range ordered skyrmion crystals. We perform a comprehensive analysis of the pseudospin skyrmions that emerge upon doping insulators at even $\nu$, delineating the regime in parameter space where these are the lowest-energy charged excitations by means of self-consistent Hartree-Fock calculations on the interacting Bistritzer-MacDonald model. We explicitly demonstrate the purely electron-mediated pairing of skyrmions, a key ingredient behind a recent proposal of skyrmion superconductivity. Building upon this, we construct hopping models to extract the effective masses of paired skyrmions, and discuss our findings and their implications for skyrmion superconductivity in relation to experiments, focusing on the dome-shaped dependence of the transition temperature on the twist angle. We also investigate the properties of spin skyrmions about the quantized anomalous Hall insulator at $\nu=+3$. In both cases, we demonstrate the formation of robust spin/pseudospin skyrmion crystals upon doping to a finite density away from integer filling.


I. INTRODUCTION
The experimental observation of an array of interaction-driven electronic phases in magic-angle twisted bilayer graphene (TBG) [1][2][3][4] has stimulated intense theoretical efforts to formulate a unifying physical description of their emergence from the interplay of topology, symmetry, and correlations. One promising route to a solution has close ties to the now-classic paradigm of strongly interacting electrons in Landau levels (LLs). As shown in Ref. 5, this so-called 'strongcoupling' picture is rooted in the identification of an idealized limit with a U (4) × U (4) global symmetry that operates within the subspace of the eight central bands: the two U (4) symmetries rotate within quartets of topological Chern bands with Chern numbers C = ±1 respectively (see also Refs. [6][7][8][9]. Correlated insulating phases arise as a result of symmetry breaking within this (almost) degenerate subspace, in a manner analogous to the formation of quantum Hall ferromagnetism in integrally-filled LLs 10 . This perspective receives support from the detection of a quantized anomalous Hall insulator (QAHI) at ν = +3 [11][12][13] and various magnetic-field stabilized Chern insulators at other fillings 4, [14][15][16][17][18][19] . While a purely strong-coupling treatment does not completely resolve the experimental puzzles surrounding TBG, it serves as a natural starting point for considering how various physically realistic corrections influence the competition between ordered states. In this manner, it can aid in the identification of new and distinct brokensymmetry phases that are not accessible in any obvious way from the weakly-interacting limit, but can be viewed as descending from strong-coupling ferromagnetic orders in a natural, albeit non-trivial, fashion as the system is tuned towards the physically relevant intermediatecoupling regime 20,21 .
The underlying topology of TBG not only influences the selection of correlated states, but also leaves its fingerprint on the nature of their excitations -suggesting a tantalizing link between insulating behaviour at integer filling and superconducting behaviour away from it [22][23][24][25][26] . In this vein, neutral excitations of the QAHI have been predicted to inherit striking characteristics from the symmetry-breaking and topological properties of the parent insulator [27][28][29][30] . Given the intimate connection to QHFM, the strong-coupling insulators of TBG can also be expected to host charged excitations in the form of spin or pseudospin textures known as skyrmions or merons 22,[31][32][33][34][35][36][37][38] . For example, under certain conditions additional charge carriers can enter the QAHI as skyrmions rather than as single spin-flip quasiparticles. The situation is richer at even integer fillings since the intricate structure afforded by the approximate U (4)×U (4) symmetry and the associated anisotropies is more apparent here. Besides spin, the relevant textures now also have access to the valley and sublattice degrees of freedom (additional 'flavours'), leading to the emergence of pseudospin skyrmions. The importance of a detailed understanding of such topological objects is underscored by a recent proposal 22 of purely electronic superconductivity arising from bosonic 'paired skyrmions' 39,40 of charge 2e (see Fig. 1), whose stabilization arises from the particular features of the strong-coupling Hamiltonian. Such unconventional Cooper pairs may be relevant for understanding the various superconducting domes in TBG [2][3][4][41][42][43][44][45] , whose properties and origins remain the subject of intense debate.
To date, however, the investigation of such flavour textures has been limited to either analyses of the effective non-linear sigma model (NLSM) 22 or numerical studies of a simplified model mimicking the gross features of the central bands 23 . While such studies provide qualitative insight, a systematic exploration of these charged textures within a realistic microscopic model of TBG remains an outstanding challenge. There is no guarantee that topological textures are relevant once unavoidable complications such as momentum space structure, finite kinetic energy and the presence of a moiré lattice are accounted for. (A recent study of domain-wall textures in the ν = +3 QAHI, where microscopic energetics significantly alter properties of the topologically mandated electronic boundary modes, suggests that it is indeed important to consider such details 46 .) This is the lacuna which we address in this paper, by directly probing the nature of (pseudo)spin skyrmions using unbiased self-consistent Hartree-Fock (HF) calculations within the interacting Bistritzer-MacDonald (BM) model 47,48 . This requires us to consider completely unrestricted Slater determinants, allowing for full breaking of moiré translation symmetry. This is a challenging task even within the constrained variational space of the HF mean-field ansatz, but one that we pursue successfully below.
We have three main objectives in this work: first, to provide a proof-of-principle demonstration of skyrmion pairing in an appropriately-chosen limit of a microscopic model of TBG; second, to investigate whether such pairing can result in superconducting phenomenology consistent with that seen in experiment (most notably, the dome-shaped dependence of the superconducting transition temperature T c on the twist angle 41 ); third, to establish the regime in parameter space over which pairing survives away from the idealized limit. We also consider a subsidiary set of issues pertaining to the emergence of skyrmions at other fillings, and their formation of longrange ordered structures [49][50][51] when doped into TBG at a finite density.
Within a HF treatment of the interacting BM model, we find that pseudospin skyrmion pairing does indeed occur about even integer filling and at sufficiently small chiral ratio, and in qualitative agreement with the pre-dictions of the NLSM. Paired skyrmions are lower in energy than particle excitations for small chiral ratio, with relatively better energetics at charge neutrality than at |ν| = 2. For chiral ratios approaching the realistic values, they become energetically unfavorable and harder to stabilize within mean-field theory. External perturbations such as strain and substrate, as well as increased screening of interactions, tend to also have the same effect. We devise a effective skyrmion hopping model to estimate the effective masses, and find that the corresponding superconducting critical temperature T c 52 can be larger than typical scales observed experimentally. However, more detailed comparison of the twist-angle dependence is complicated by the sensitivity of the results to how electron interactions are incorporated (i.e. the 'subtraction scheme', to be explained in Sec. II A). We critically discuss our findings in the context of experiments, and conclude that, at least for some of the samples, skyrmion pairing is unlikely to be mechanism for superconductivity. For the QAHI at |ν| = 3, we demonstrate the formation of spin skyrmions for doping of single charges, and various spin texture lattices at larger dopings.
The balance of this paper is organized as follows. After reviewing the basics of the NLSM and strong-coupling picture of TBG in Sec. II, we focus on the pseudospin skyrmions about the spinless insulator at charge neutrality (ν = 0) in Sec. III. Via an effective skyrmion hopping model, we further estimate the dispersion and effective mass of delocalized skyrmions. We also briefly address the situation at |ν| = 2 in the spinful case. In Sec. IV, we discuss the properties of spin skyrmions and spin texture lattices at |ν| = 3 about the QAHI. We end with a discussion in Sec. V.

II. THEORETICAL CONSIDERATIONS
In this section, we review the key concepts that underpin the existence and properties of skyrmions about the strong-coupling insulators in TBG 5,8,53 .

A. Interacting BM model
The starting point is the continuum BM modelĤ BM which generates the single-particle moiré bands of TBG in the moiré Brillouin zone (mBZ) 47 . Details of the model are relegated to App. A, but we mention here the important parameters which will influence the skyrmions. The interlayer coupling is parameterized through two sublattice-dependent hopping constants w AA and w AB , reflecting corrugation effects in the superlattice. We fix w AB = 110 meV but consider variable w AA , thereby tuning the value of the chiral ratio κ ≡ w AA /w AB . Estimated values for κ in the literature range from 0.5 to 0.8 [54][55][56] . The kinetic bandwidth is minimal when the twist angle θ is close to the magic angle, which is weakly dependent on κ. While the physical relevance of θ as a tuning parameter is obvious, it will prove theoretically useful to vary w AA to control deviations from the fully symmetric theory, to be described below. (Note that it has been argued that the effective value of w AA may be susceptible to renormalization towards the chiral limit 57 , κ = 0, due to the effects of remote bands; thus, tuning w AA may be viewed as phenomenologically modeling this downward renormalization 9 .) Sometimes, we will also incorporate heterostrain 20,58,59 with strength and a sublattice potential of strength ∆ 60 . In the absence of such single-particle perturbations, the point group is D 6 , which includesĈ 2z ,Ĉ 3z andM (inplane rotation about the x-axis). Accounting for both valley and spin degrees of freedom, the BM model contains a U (2) K × U (2) K flavour symmetry which includes charge/valley conservation and independent SU (2) spinrotations within each valley. Naturally, a spinless timereversal symmetry (TRS)T = τ xK is also present, wherê K is the complex conjugation operator. Upon neglecting the weak relative rotation of the Dirac cones in the two layers, there is also a particle-hole symmetry (PHS) P [61][62][63] . Throughout this work we will focus solely on the central eight bands, and project out the remote bands which are separated by a large band gap. (Although, see the preceding comments on their possible role in adjusting w AA .) Repulsive interactions are implemented by augmenting the BM model with dual-gate screened density-density Coulomb interactions V (q) = e 2 2 0 r q tanh qd sc , where the relative permittivity r = 10 and the screening length d sc = 25 nm 5 . By neglecting subleading 'intervalley-Hund's couplings' which scatter electrons between valleys, we retain the global U (2) K × U (2) K symmetry.
Note that interactions may alter the one-body terms somewhat, leading to a total effective single-particle dis-persionĤ SP5,7,8,[63][64][65] . The precise correction depends on the choice of 'subtraction scheme' which arises as follows (a more detailed exposition can be found in App. A). Including only normal-ordered interactions within the central band subspace neglects contributions from the remote bands. Furthermore, some interactions are doublecounted because the hopping parameters of graphene (e.g. from density functional theory calculations) are obtained self-consistently with filled valence bands. A convenient parameterization to remedy this and restore PHS is in terms of a reference density matrix P 0 . There is no unique prescription for fixing P 0 and we will consider three choices that have been used in the literature: (i) the 'average', (ii) 'charge neutrality' (CN), and (iii) 'graphene' schemes. This nomenclature is motivated by how P 0 is constructed: (i) P 0 ∝ I, (ii) P 0 consists of the filled valence BM bands, and (iii) P 0 consists of the filled valence bands of decoupled layers of graphene. Most of the initial discussion on skyrmions will be based on the average scheme, which leads to no corrections when interactions are recast in the strong-coupling form that highlights the strong-coupling hierarchy of scales (see Sec. II B and App. A). The other two schemes will be analyzed in more detail in Sec. III D.

B. Strong-coupling hierarchy
Since the typical Coulomb scale exceeds the kinetic bandwidth, the BM band basis may not be the most natural one in which to understand interaction physics. A more appropriate choice, that reveals the underlying topological character of the bands, is furnished by the Chern basis obtained by diagonalizing the sublattice operator σ z 5 . Each band is predominantly polarized on one sublattice, which allows us to label the central bands with sublattice σ, valley τ , and spin s. Combining these into a single index α, we can write the Bloch functions as |ψ α (k) = e ik·r |u α (k) with associated creation operatorsd † α (k). Crucially, these bands have Chern number C = σ z τ z (Fig. 2a).
The link to QHFM is sharpened in the well-studied chiral limit κ = 0 57,66 , where the kinetic bandwidth becomes exactly zero at the magic angle and the Chern bands become completely sublattice polarized. In addition, the new chiral symmetry {σ z , H BM }, in concert withĈ 2zT andP, heavily constrains the form factors Λ α,β (k, q) ≡ u α (k)|u β (k + q) to be diagonal and to depend only the Chern number: this allows us to write Λ α,β (k, q) = Λ S (k, q)δ αβ , with 5 where the superscript 'S' denotes 'symmetric'. In the chiral-flat limit where the effective dispersionĤ SP is neglected, this implies a huge U (4) C=1 × U (4) C=−1 symmetry, as can be seen by checking the invariance of the as a function of twist angle θ and interlayer hopping parameter wAA, where AUC is the moiré unit cell area. The pseudospin stiffness ρps 1.5 meV, calculated in the isotropic limit, does not depend on wAA and subtraction scheme (see App. A), and is largely insensitive to θ. c) Strong-coupling energy scales US, UA, tS, tA as a function of θ and wAA. All quantities calculated using the average subtraction scheme. density operatorρ S (q) = kd † (k)Λ S (k, q)d(k + q) under independent Chern-number preserving rotations. Remarkably, at ν = 0 one can prove the existence of exact Slater determinant ground states, which simply involve uniformly polarizing any four orthogonal directions within the Chern quartets ('ferromagnets') 5,8 . Hence, the chiral-flat limit will also be referred to as the isotropic limit.
The utility of the chiral-flat limit lies in the existence of a hierarchy of scales which permits corrections to be treated perturbatively within the manifold of strongcoupling ferromagnets 5,8 . The energy scale of the full symmetry group is U S , and the main competing scales are the single-particle inter-Chern tunneling (t S ) and deviation from chirality (U A ), with intra-Chern dispersion (t A ) being subleading [see Fig. 2d and App. A]. Focusing for simplicity on the spinless insulator at neutrality such that the parent symmetry group is U (2) × U (2), the effects of the above perturbations can be captured via anisotropies with positive strengths J and λ respectively (i.e. energy scales E J = A UC J and E λ = A UC λ where A UC is the moiré unit cell area). It turns out that the states that get energetically selected by these terms have total C = 0. Defining an intra-Chern Pauli triplet η = (σ x τ x , σ x τ y , τ z ), we can parameterize the strongcoupling ferromagnets residing in the Chern-neutral sec-tor with two Chern-filtered pseudospins which are independent of k ( Fig. 2a). Inter-Chern tunneling generates superexchange, inducing an antiferromagnetic (AFM) coupling between the two pseudospins, while a finite κ manifests as a AFM coupling inplane, and an FM coupling out-of-plane. Both break the chiral-flat symmetry to distinct U (2) subgroups. The resulting ground state with n + = −n − (and hence valley-U (1) V degeneracy) is the Kramers intervalley-coherent insulator (KIVC), so called because it preserves a modified TRST = τ yK . An equivalent description of the spinless KIVC is via its density matrix P = 1 is parametrized through the IVC angle φ IVC .

C. Non-linear sigma model and pseudospin textures
To understand low-energy excitations and spatially non-uniform configurations, it is useful to consider a con-tinuum NLSM description purely in terms of the pseudospins 22 . The energy functional, familiar from QHFM, is where ρ ps is the pseudospin stiffness in the isotropic limit, and δρ = δρ + + δρ − consists of the topological charge (Pontryagin index) densities of the two Chern sectors The skyrmion number is given by where skyrmions and antiskyrmions have N top = +1 and N top = −1 respectively. The NLSM parameters J, λ, ρ ps are plotted in Fig. 2, with explicit expressions supplied in App. A. In the "average scheme" described there, the superexchange J has a prominent minimum and nearly vanishes around θ 1.08 • , coincident with where the bare BM bandwidth is smallest. λ is a monotonically increasing function of w AA and is largely insensitive to θ. The pseudospin stiffness 1.3 meV is a property of the maximally-symmetric manifold and has a very weak dependence on twist angle. While there are sizable regions where λ > J, this only occurs for the average scheme where J is directly tied to the bare BM scale, which is substantially smaller than the bandwidth obtained via STM measurements of the van Hove singularities in the density of states [67][68][69][70][71] . Therefore we will mostly be interested in the case J λ, that is conducive to skyrmion pairing as explained below.
Consider first the isotropic limit J = λ = 0 where the Chern sectors decouple. The ground state at ν = 0 is specified by free choices of uniform pseudospins n 0 ± . An additional charge enters as a 1e-skyrmion (in one of the Chern sectors) rather than a particle-like excitation if the energy for a well separated skyrmion-antiskrymion pair (8πρ ps ) exceeds the particle-hole gap (∆ ph ). We can use the infinite-size Polyakov solution 31 which minimizes the gradient energy because of the lack of any Zeemanlike term, meaning that the texture prefers to expand to minimize the Coulomb term.
Reintroducing dispersion leads to an effective AFM coupling J, constraining the parent ground state n 0 + = −n 0 − which now belongs to an SO(3) manifold. A skyrmion of radius R s in say n + now experiences a Zeeman penalty ∝ R 2 s arising from misalignment with n 0 − . The skyrmion becomes finite with a size determined from the competition between J and the interaction term. (Note that in order to avoid divergences with system size, the tails of the skyrmion profile must decay faster in space than the Polyakov ansatz 22,32,72 .) For large enough J, the skyrmion shrinks and crosses over to an ordinary particle-like excitation. Having instead a finite λ leads to similar conclusions, except with a different residual SU (2) manifold. Including both perturbations restricts the insulator to a KIVC parameterized by a U (1) valley angle.
Adding instead a net charge of |Q| = 2e to the system leads to substantially different conclusions. In the scenario with non-zero J, the AFM interaction leads to the pairing of a skyrmion in n + with an antiskyrmion in n − (the total charge is 2e, a crucial consequence of the opposite assignment of Chern numbers). To see this, note that the AFM inter-Chern coupling is completely satisfied if the two textures are centered at the same position with exactly the same profile such that n pair (r) ≡ n + (r) = −n − (r). In this case the resulting 'paired 2e-skyrmion' dilates without limit to avoid the Coulomb self-energy. Hence an n + -skyrmion and an n − -antiskyrmion will bind for any positive value of J, even though the underlying electron-interaction is purely repulsive. A paired skyrmion configuration preservesTsymmetry, which is a useful numerical diagnostic.
In the presence of an additional λ term, the paired skyrmion experiences an energy penalty ∝ R 2 s from regions where n pair (r) is not lying in-plane. This not only leads to a finite size, but also elongates the texture somewhat to reduce the area spent pointing out-of-plane. When λ is comparable to ρ ps , the texture deforms to resemble the topologically equivalent configuration of two paired 1e-merons separated by a finite distance. This can be understood as the λ-term shrinking the costly n pair ẑ regions such that they become the cores of the merons (see red arrows in Fig. 1). If λ/J is sufficiently large, pairing is no longer favorable.

D. Skyrmion superconductvity
We now assume that the microscopic parameters are chosen such that charges enter as skyrmions rather than particles. 1e-skyrmions from opposite Chern sectors attract to form paired skyrmions. Even if particle excitations are slightly lower in energy, this typically occurs in a small region of the mBZ centered at Γ M . Hence above a critical doping, additional charges are expected to form skyrmions 22 .
A non-zero superconducting T c requires a finite boson effective mass M pair , which can be motivated within the phenomenological picture of a pair of coupled quantum Hall ferromagnets 22 in equal and oppposite magnetic fields. A single 1e-skyrmion feels a net magnetic field and hence has a flat-band dispersion. On the other hand, the magnetic fields experienced by a paired skyrmion cancel out. To understand the generation of an effective mass in this case, imagine a paired skyrmion propagating at some velocity v. The Lorentz forces in the two sectors act to push the constituent 1e-skyrmions in op-posite directions, which is counteracted by a restoring force depending linearly on J and the separation (λ is assumed small). By balancing the two, one can deduce that M pair is inversely proportional to J. Therefore the paired skyrmions condense with a finite superfluid stiffness and associated Berezenskii-Kosterlitz-Thouless transition temperature 52 We define a corresponding filling-independent energy scale E c = k B T c /ν, which equals E J /2 in this framework. Above, ν should be taken to be the doping relative to the integer filling of interest. Note the sensitive dependence on the superexchange J, which potentially provides a litmus test for the applicability of this mechanism to realistic samples of TBG. Namely, experiments observe a superconducting dome in T c as a function of twist angle, which peaks around θ = 1.08 •41 . The value of J, and indeed the effective mass as computed independently in Sec. III D, are also affected by θ through the evolution of the bandwidth and character of the BM bands. However as we discuss later, the precise relationship is subtle and differs based on the choice of interaction. In particular, while the nature of the ordered normal state at ν = 0 is unambiguous, depending only on the signs of J and λ, finer details such as skyrmion effective masses are susceptible to the specifics of the subtraction scheme (App. A).

E. Numerical methods
Underpinning the discussion in the following sections are explicit mean-field solutions |φ for the (pseudo)spin skyrmions, constructed using unbiased self-consistent HF calculations. Since such configurations break translation symmetry and flavor rotation symmetries, the HF density matrix takes the most general form The mBZ momenta k take discrete values due to the finite system dimensions L 1 , L 2 measured in units of the moiré lattice vectors a M 1 , a M 2 . Throughout this work we take L = L 1 = L 2 so that a non-degenerate moiré band can accommodate L 2 electrons. The filling is fixed by the constraint TrP = (4 + ν)L 2 = (4 + ν 0 )L 2 + N , where ν 0 is the parent integer filling factor and N represents the number of additional doped electrons. We typically initialize the calculations with random initial projectors, and accelerate convergence using the optimal damping algorithm 73 . Near ν = 0, we neglect spin for simplicity and hence omit the index s when calculating both skyrmion and particle excitations.
For spin textures about the QAH state, the local order parameter is simply the local spin density where s is a Pauli triplet in spin space, and I = {1A, 1B, 2A, 2B} labels the types of electrons living on different layers/sublattices. Replacing s by the identity gives the local charge density. The real-space fermion creation operator is defined in terms of the Chern basis asψ † For pseudospin textures, the natural extension is the 'diagonal' Chern-filtered pseudospin density where spin has been dropped and η is the triplet of pseudospinor Paulis defined above Eq. 2. The net global Chern polarization is defined by the difference in fractional occupation in each Chern sector. Above, the Chern band-filtered real-space creation operator is where τ σ labels the Chern band. However, the in-plane components of Eq. 11 identically vanish in the chiral limit κ = 0, where the Chern basis becomes completely sublattice polarized. For example, for the C = +1 Chern sector, n +,x and n +,y should measure coherence between the KA and K B bands, but To remedy this, we occasionally consider an alternative definition that includes off-diagonal terms in the layer/sublattice degrees of freedom To address the question of the effective mass, we can perform variational calculations in the space of states obtained by translating a localized paired skyrmion |φ by all possible moiré lattice vectors. This defines an effective skyrmion hopping model leading to delocalized 'Bloch skyrmions' and a Bloch dispersion, from which M pair can be extracted. Note that this way of calculating the paired skyrmion mass is completely different from the classical calculation of M pair used to obtain the second equality of Eq. 7. The technical details of the skyrmion-plane wave calculations, including generalizations involving symmetry-related skyrmions, are outlined in App. B.

III. PSEUDOSPIN TEXTURES AT EVEN INTEGER FILLING
A. 1e-skyrmions  Relative energy ∆E = E 1e-skyr − E1e of 1e-skyrmions (compared to particle excitations) as a function of system size. Apart from the isotropic limit (black triangles), 1e-skyrmions are not favored. c) ∆E for different θ and wAA. Dispersion is always included. System size is 11 × 11. Missing data points indicate parameters where HF was unable to find skyrmion solutions. d) Local alignment of Chern pseudospins corresponding to the rightmost panel of a). All calculations used the average scheme (see App. A).
Chern couplings J and λ can be controlled by tuning the effective one-body term and the chiral ratio respectively. Since the Hamiltonian obeys particle-hole symmetry, adding holes instead of electrons leads to analogous results.
In the isotropic limit with J = λ = 0, the ground state can be any member of the U (2)×U (2) manifold described by independent choices of n + , n − . (Strictly speaking the |C| = 2 ferromagnets are also ground states, but these do not allow for textures). As verified numerically, the extra charge enters as a delocalized skyrmion solely in one of the Chern sectors, while the other sector remains unchanged. Hence its properties are qualitatively the same as for spin skyrmions about the ν = 3 QAH insulator (see Sec. IV). Consistent with the absence of anisotropies, the skyrmion expands as much as possible and is only limited by the finite simulation cell. This explains the slow convergence in Fig. 3b of the 1e-skyrmion energy (measured with respect to the 1e-particle excitation) with the linear system dimension L, compared with other cases. Fitting to a power law ∆E(L) = ∆E(∞) + αL −γ , we obtain ∆E(∞) = −4.1 meV, α = 18.3 meV and γ = 0.68. Note that the band gap at ν = 0 depends only weakly on L since there is a direct gap at Γ M . Hence in the NLSM picture, the L dependence is expected to predominantly arise from the system size constraint on R s which controls the interaction contribution to Eq. 4. Given that the gate screening length is comparable to the moiré length d sc a M , one expects γ > 1 in the continuum limit for R s a M . The discrepancy suggests that lattice corrections (the dotted patterns in Fig. 3a denote the AA-stacking regions) and finite size effects have a quantitative impact here.
In the presence of anisotropies, the 1e-skyrmions become finite-sized. For example in the chiral-nonflat limit with J = 0, λ = 0, the charge density is localized to within a few moiré lengths, leading to faster energy convergence with system size. This is consistent with the intuition from the NSLM, and the energetics of the skyrmion becomes significantly less favorable. The parent insulator is a member of the SU (2) family containing the KIVC and the valley Hall state (e.g. polarizing into KA and K A bands). The global net Chern polarization is less than 1/L 2 , consistent with the fact that the superexchange tunnels between opposite Chern sectors.
A similar story holds for the nonchiral-flat limit with J = 0, λ = 0, except the ν = 0 insulator now interpolates between the KIVC and the valley-polarized state. This time, the added skyrmion is perfectly Chern polarized.
Including both perturbations, which is the case for realistic TBG, the symmetry reduces to the U (1) family of KIVC insulators, and the Chern polarization of the skyrmion is imperfect again. Fig. 3d illustrates that the localized violation of Chern anti-alignment occurs at the same position as the charge density modulation, confirming the skyrmionic nature of the added charge. In Fig. 3c, we chart the relative energy of the skyrmions as a function of θ for different chiral ratios. For the average scheme, the 1e-skyrmions are actually most costly near the magic angle where J is supressed (compare with 2b). This is despite the fact that artificially turning off J while keeping other parameters fixed improves the skyrmion energy significantly (Fig. 3b). On the other hand, λ is largely constant as a function of θ. This suggests that the continuum description in terms of a small number of coupling parameters is not completely adequate.
The numerical results for ∆E in any plot such as Fig. 3c are generally expected to represent upper bounds for two reasons. First, our calculations are performed on finite system sizes L. In the thermodynamic limit L → ∞, the skyrmions will have some ideal radius R s (∞) set by the intrinsic properties of the BM model and interaction potential. Unless L R s (∞), the pseudospins in our calculation will experience some degree of frustration from the finite simulation cell, leading to an energy penalty. In addition, a larger L introduces a greater number of basis Bloch states which allows for smoother pseudospin rotations. Second, the restriction to Slater determinants in HF likely impacts skyrmions more than particle-like excitations. As shown in Sec. III D, skyrmions can gain a small delocalization energy by going beyond mean-field and restoring the translation symmetry.
B. 2e-skyrmions Fig. 4a shows the textured configurations when two electrons are added to the translation-invariant ν = 0 ground state. As before, the Chern couplings J, λ can be controlled through the effective one-body term and the chiral ratio respectively.
For J = λ = 0, the texture breaks moiré translation symmetry completely, but the charge density appears to spread throughout the system without a clear identification of one or two well-defined objects. Interestingly, the lowest energy solution consists of both additional charges entering the same Chern sector, meaning that from a symmetry standpoint the situation will again be qualitatively similar to adding two charges to the QAH state. As shown in Fig. 4b, the pseudospin alignment (note that one Chern sector has a constant pseudospin) rotates in a complicated fashion without reconstructing a new superlattice periodicity. However in analogy to the case of SU (2)-invariant spin textures in Sec. IV, the inclusion of more doped electrons can result in a so-called 'doubletetarton lattice' 51 with emergent periodicity if they all Chern polarize. The resulting pseudospins form an ordered pattern as shown in Fig. 12a-c. In the chiral non-flat limit with J = 0, λ = 0, the charge density consists of a single smooth modulation with approximate circular symmetry. This is precisely an explicit realization of a paired 2e-skyrmion: a skyrmion and an antiskyrmion with identical spatial profiles in the two Chern sectors exactly overlapping. This binding is induced by the superexchange J, as evidenced by the perfect anti-alignment of Chern pseudospins (and hencê T -symmetry). In the absence of other perturbations, the paired skyrmions spreads out and is limited only by the system size, leading to slow convergence in Fig. 4c. Hence the resulting physics controlling the texture is similar to that of the 1e-skyrmion in the isotropic limit.
In the nonchiral-flat limit with J = 0, λ = 0, one may naïvely expect a similar paired skyrmion where the pseudospins are locked instead as n +,z = n −,z and n +,xy = −n −,xy . However this does not work as it leads to an electrically neutral object. The numerics reveal that both charges predominantly go into the same Chern sector. For small system sizes, the state closely resembles the double-tetarton lattice of the isotropic limit, which is reflected in the energetic trends in Fig. 4c for small L. For larger sizes, the lattice deforms such that the texture is better described by a nearby pair of 1eskyrmions. While this may be considered pairing (actually since ∆E > 0, it is a metastable bound state that is unstable towards decaying into two particles), we reserve the term 'paired skyrmion' for theT -symmetric cases where a skyrmion and an antiskyrmion from opposite Chern sectors bind. It is noteworthy that ∆E increases as a function of L, bucking the trends of all other types of 1e-and 2e-skyrmions. This occurs because for L R s (∞), the constituent skyrmions are forced to strongly interact with each other in the confined system area , and may therefore form a delocalized configuration with better energetics.
It is clear from Fig. 4c that reintroducing dispersion to the achiral-flat limit is favourable to the textures, which are once again paired skyrmions. The faster convergence of the relative energy with L is a hint that these paired skyrmions are now finite in size. This is clear from Fig. 5a, which illustrates that increasing the chiral ratio not only reduces the skyrmion area, but also leads to an elliptical shape. As discussed in Sec. II C, this could be anticipated from the NLSM analysis which predicts that the λ-term penalizes the pseudospins when they anti-align out-of-plane. This anisotropy is apparent in Fig. 5b,c, where n +,z (r) has a much tighter profile than n +,xy (r). At mean-field level, the orientation of the n z lobes is very soft, with distinct HF solutions differing by 10 µeV. The system cell has been duplicated three times for presentation. c) Relative energy ∆E = E 2e-skyr − E2e of 2e-skyrmions (compared to adding two particle excitations) as a function of system size. ∆E < 0 indicates that the 2e-skyrmion is favored. d) ∆E for different θ and wAA. Dispersion is always included. Note that most of the solutions (indicated with crosses rather than squares) for 1.05 • < θ < 1.13 • are not T -symmetric paired skyrmions. System size is 11 × 11. All calculations used the average scheme (see App. A).
The energetic trends of the 2e-skyrmions as a function of θ and w AA are shown in Fig. 4d. Note that specific to the average scheme, λ significantly exceeds J for a finite region of θ around the flat-band point. Consequently in the range ∼ 1.05 • to ∼ 1.13 • , the best textured solution has finite Chern polarization and breaksT , and hence is not a paired skyrmion.
Strain and substrate alignment represent two singleparticle perturbations that are deleterious to the paired skyrmions, making them less energetically favorable and harder to find within HF. Strain significantly increases the kinetic bandwidth and is known to substantially degrade the gap of the KIVC at charge neutrality, eventually leading to a symmetric semimetal 59 . As shown in Fig. 6a, strain of strength 0.1 % is enough induce a positive relative energy ∆E for paired skyrmions. This should be interpreted in the context of STM studies which typically measure strains of 0.1 − 0.7 % 67,68,70 . A staggered sublattice potential ∼ ∆σ z breaksĈ 2z and acts as a constant easy-axis Zeeman field with opposite direction in the two Chern sectors, which deters smooth rotations in pseudospin space (Fig. 6b). For ∆ beyond a fraction of an meV, HF fails to stabilize paired skyrmions at all. In fact if the coupling is strong enough, the meanfield ground state becomes a valley Hall state with no intervalley coherence 20 . On the other hand, increasing the strength of interactions tends to favor the paired skyrmions. This can be achieved by increasing the gate screening distance d sc (Fig. 6c) or reducing the relative permittivity r (Fig. 6d).

C. Skyrmion composites and crystallization
By going beyond one or two particles, we can generate (metastable) higher skyrmion composites and skyrmion crystals. Fig. 7a shows a typical configuration obtained by adding 3 electrons on top of ν = 0, resulting in a 1e-skyrmion (in the C = +1 sector) and a paired 2eskyrmion that position themselves apart due to Coulomb repulsion. This identification can be straightforwardly made by looking at the pseudospin orientation. On the other hand, adding 4 electrons can lead to aTsymmetric clump of two paired skyrmions which arrange their n z lobes in order to minimize the gradient cost (Fig. 7b). For larger numbers of particles, the random initialization of the self-consistent HF loop tends to drive the system into local minima with complicated translation symmetry-breaking patterns (Fig. 7f). Such states are often well-characterized by clumps or 'trains' of paired skyrmions with suitably arranged pseudospin lobes, punctuated by 1e-skyrmions within the gaps. The prevalence of such motifs, even for large w AA where particle-hole excitations are favorable, points to the robustness of skyrmions as well-defined localized excitations far away from the perturbative strong-coupling regime.
For intermediate dopings, the skyrmions can lose their individual identities and order into textured crystals. This was already shown in the isotropic limit for two added particles in Fig. 4a,b, but can occur even with anisotropies if the inter-skyrmion spacing becomes comparable with R s (∞). For 6 extra particles, we can find a meron crystal where the lobes of the paired skyrmions lie on the honeycomb sites (Fig. 7c, compare with Fig. 12d).

D. Effective mass of paired skyrmions
A key prediction of the NLSM analysis is that paired skyrmions have a finite dispersion 22 , which is crucial for generating a finite BKT energy scale E c for superconductivity. In the NLSM framework, this generation of a finite mass is non-trivial, arising from the interplay between the superexchange J and the contrasting magnetic fields. An important question is how this picture holds up in the periodic moiré setting, where magnetic fields and flat bands are replaced by inhomogeneous Berry curavtures and interaction-renormalized dispersions. We address these issues using the effective hopping model described in App. B, paying close attention to the dependence on the twist angle θ and subtraction scheme (the role of the subtraction is explained in Sec. II A and App. A). Because of moiré translation invariance under T R , our method is in essence a variational calculation of 'Bloch skyrmions' |ψ q = 1 L R e iq·RT R |φ based on a starting localized skyrmion |φ with an ideal pseudospin structure at the mean-field level. The required inputs are the matrix elements φ|T R |φ , φ|ĤT R |φ of |φ and its translated images.
In Fig. 8a,b, we show the resulting bandstructure of 1e-skyrmions and paired skyrmions, measured relative to the energy of the starting texture. The 1e-skyrmion is characterized by a sharp peak at Γ M and shallow minima near the zone boundaries (the positions of these features may change for larger w AA , but the overall structure re-FIG. 7. Skyrmion composites and crystallization (spinless ν = 0). Charge density δρ, inter-Chern pseudospin alignment n+ · n−, and pseudospin orientation n+ for different numbers of electrons above charge neutrality ν = 0 in the spinless model. Emergent supercells for the double-tetarton lattice are highlighted in grey. All pseudospin densities are diagonal in microscopic layer/sublattice. System size is 16 × 16 with average scheme. mains the same). Usually a large number of overlaps and matrix elements beyond nearest neighbours needs to be computed to converge for a fixed set of parameters.
On the other hand, the paired skyrmions are robustly associated with a broad energy minimum at Γ M and peaks at the zone corners. Typically only matrix elements for distances 2a M need to be computed to accurately describe the properties of the full hopping model. The skyrmion mass M pair , and hence the BKT energy scale E c , can be estimated by fitting a parabola to the band minimum. This typically leads to T c of order 1 K. Fig. 8c plots the convergence of E c with system size L, which implies that the finite-size calculations here will typically underestimate the L → ∞ results. As can be verified by checking nearest neighbor overlaps, there is no 'orthogonality catastrophe' for moiré translationsthe spatially inhomogeneous part of a paired skyrmion is finite in size, and most regions of the HF state remain translation invariant.
Having established the basic properties of the hopping model for paired skyrmions, we now turn to details of the dependence of E c on various parameters. The main motivation is to touch base with experiments which have observed a T c dome as a function of twist angle 41 . However as noted in Sec. III B, there is a large window of θ centred at the magic angle where paired skyrmions cannot be found in the average scheme. The utility of this scheme is that J can be easily toggled by artificially turning off the kinetic Hamiltonian, but the drawback is that J therefore becomes substantially smaller than λ when the BM bands are flat, leading to a different type of 2eskyrmion. To address this issue, we consider now two alternative schemes which are not fine-tuned to have a vanishing superexchange. Fig. 8d charts the relative energy of the paired skyrmion as a function of θ and w AA for the CN scheme. In contrast to Fig. 4d, the paired skyrmions are energetically favored in an energy window centered around the magic angle. Again, ∆E has an inverted relation compared to J, which is plotted with dotted lines in the right panel. Note that paired skyrmions can be found up to w AA = 60 meV for L = 11, which should improve with increasing system size. Looking at the hopping model results, E c appears to qualitatively follow the same trends as J for the whole range of twist angles investigated, suggesting that the physical intuition from the NLSM maintains some level of validity in the lattice case. Notably, the CN scheme suggests that a skyrmion superconductor would have a T c dome around the magic angle.
The graphene scheme in Fig. 8e paints a somewhat different picture-E c monotonically increases in concert with J for decreasing twist angle. For smaller angles the applicability of this calculation will be cut off by the fact that the parent insulator is no longer the ground state. Similarly the paired skyrmions, at least for small chiral ratios, become less favorable compared to particle excitations. Hence there are still possibilities for the hopping model results in the graphene scheme to be consistent with a T c dome.
The stability of paired skyrmions in the CN, graphene and average schemes is summarized in Fig. 9. Energetically favorable skyrmions can be found up to w AA 50 meV for the right twist angles in the CN and graphene schemes (this is likely an underestimate when accounting for skyrmion delocalization and finite system size). Beyond this, metastable solutions are obtained for chiral ratios as large as w AA 65 meV. Closer to realistic values of w AA 80 meV, we find that HF is not able to converge to any paired skyrmion states. In the average scheme, the paired skyrmions are energetically unfavourable close to the magic angle.
It is notable that different subtraction schemes lead to radically different behaviors in E c . Indeed we believe that this is one of the few cases where such a choice impacts a physical quantity in a qualitative way. Many theoretical studies focus primarily on the type of symmetry-breaking order 5,6,8,20,64,65,[74][75][76][77][78][79][80][81][82][83] -all the schemes studied here lead to the KIVC which only requires that J, λ > 0. On the other hand, we are interested in the θ-dependence of E c which depends sensitively on the value of J itself.
The origin of this discrepancy can be understood by examining the effective dispersionĤ SP in more detail. As explained in App. A, its matrix elements take the form (14) J depends quadratically on the overall magnitude of h SP . In the expression above, P 0 = 1 2 (1 + Q 0 ) is the reference projector of the particular scheme. For the average scheme, Q 0 = 0 so that the only single-particle contributions (when the interaction terms are recast in strongcoupling form) arise from the kinetic piece h BM which gets heavily suppressed at the magic angle.
P 0 for the CN scheme is constructed by occupying the valence bands of the BM Hamiltonian, while P 0 for the graphene scheme is built by filling the decoupled Dirac cones of each layer to the charge neutrality point and projecting to the central bands. An important clue is that both schemes qualitatively agree for θ above the magic angle. We focus on the states in the vicinity of the Dirac points for simplicity. For large angles, the Dirac cones of TBG are simply renormalized versions of those in decou- pled graphene. In particular, we can calculate the behavior of the relative sublattice phase ω = arg(u 1A /u 1B ) for the valence band Bloch function as we traverse a small circle around a Dirac point. From a fixed starting point, this will wind by 2π with some initial offset that agrees for both TBG and decoupled layers. As θ is reduced, the BM bands distort until we reach the magic angle regime where the Dirac velocity vanishes and the bandwidth becomes tiny. Continuing past this point, the valence and conduction bands of TBG actually swap roles 84 , which is reflected in an additional π offset in ω.
The crucial insight is that the filling of the CN projector tracks this band reversal, while the graphene projector is oblivious to this physics. Note that the second term of Eq. 14 represents the subtraction of the exchange gain for the filled bands of P 0 . For the CN scheme, this is a positive contribution for the valence band of the BM model, which counteracts the negative contribution from the BM kinetic term itself. Therefore the two parts of Eq. 14 tend to cancel each other out, leading to a dome at the magic angle when h BM (k) becomes suppressed. For the graphene scheme, this destructive interference occurs above the magic angle. However when the BM bands swap roles, the graphene projector does not follow, and hence the two parts add constructively for smaller θ.
We reserve judgment on the matter of which scheme is most appropriate for capturing the physics in experimental TBG. Each choice has its own merits and justifications. The average scheme is the simplest and puts the strong coupling hierarchy front and center. The graphene scheme aims to prevent additional renormalizations of the Dirac cones that have already been accounted for in the input value for the bare Dirac velocity. The CN scheme provides a basepoint (i.e. charge neutrality of the BM model) at which the BM kinetic energy is precisely the mean-field band structure. However it is known to lead to incommensurate Kekulé spiral (IKS) order at extremely small strains for non-zero integer ν 20 . The microscopically correct answer is likely complicated and may differ based on the twist angle itself.
Since paired skyrmions also break TRS and all point group symmetries, a natural extension of the effective mass calculation is to include symmetry-related partners of the starting HF solution |φ in a 'hybridization + hopping' model (see App. B for details). By restoring the symmetries, this would shed light on the internal structure of the quantum skyrmion, including the angular momentum of pairing. However we find that an orthogonality catastrophe prevents feasibility of this method. Partners that are constructed from symmetries that do not leave the ν = 0 KIVC invariant, such asT , have effectively vanishing overlaps with |φ . This is not surprising since these operators have a non-trivial action on the pseudospins even in the bulk of |φ away from the localized paired skyrmion. However partners related by certain symmetries of the KIVC, such asĈ 3 , also have suppressed matrix elements and overlaps with |φ . Fundamentally the reason is that the skyrmion texture is really a many-body pseudospin rotation of the parent insulator involving many particles, as evidenced by Fig. 5b,c. Hence the Hamiltonian, which only involves few-body terms, cannot effectively couple the different partners.

E. Spinful KIVC at |ν| = 2
We now reintroduce spin and turn to filling factors near |ν| = 2 where the mean-field ground state is also a KIVC insulator (we only show results for ν = +2, but the situation at ν = −2 can be inferred from PHS). A representative member of the spin-degenerate manifold of states, arising from the approximate SU (2) K × SU (2) K symmetry, consists of fully filling the spin-↑ flat bands while forming a spinless KIVC in the (minority) spin-↓ subspace. We concentrate on pseudospin skyrmions by enforcing collinear spins.
Paired skyrmions can be constructed as before by treating the majority spin bands as spectator bands and performing non-trivial pseudospin rotations in the halffilled minority spin bands, but there are complications compared with the spinless neutrality case. First, the NLSM parameters J, λ reflect the relevant energy scales at the neutrality point of the strong-coupling Hamiltonian. However the starting insulator now contains additional majority carriers, which impacts the effective energetics of the minority subspace. Second, this additional interaction renormalization enters in a particlehole asymmetric way. Generally the bands away from neutrality have a significantly enhanced bandwidth with a prominent extremum at Γ M 85,86 (Fig. 10a), which disfavors skyrmions because they are built from momentum states throughout the mBZ. Third, the edge of the majority spin bands can lie inside the minority band gap which reduces the gap to particle-hole excitations. On the other hand, a pseudospin skyrmion cannot be formed by adding holes to a completely filled spin sector.
These considerations are reflected in the HF calculations in Fig. 10b at ν = +2, showing that paired 2eskyrmions on the electron-doped side are more expensive than paired 2h-skyrmions on the hole-doped side. Overall the relative energies ∆E are less favorable than the results at charge neutrality.

IV. SPIN TEXTURES AT |ν| = 3
The physical intuition and considerations behind skyrmions in Sec. III can also be applied to odd integer fillings. In this section, we briefly discuss textures near ν = +3 that arise from doping the Chern, spin and valley-polarized QAHI (the results at ν = −3 can be found from particle-hole symmetry about neutrality). The situation here is significantly simpler because of the SU (2) S spin-rotation symmetry that holds independent of the presence of dispersion or deviation from the chiral limit. Furthermore the starting insulator is easy-axis in valley space. The low-energy charged topological excitations are then expected to be spin skyrmions 36,37,51 in the partially filled Chern sector without any texturing in the fully filled Chern sector.
In Fig. 11, we compute the properties of 1h-and 1eskyrmions by relaxing the spin collinearity constraint in our HF calculations. In a similar fashion to isotropic 1e-skyrmions in Fig. 3a, the change in charge density is roughly circularly symmetric with moiré-periodic modulations on the AA-stacking regions. The local spin ori- FIG. 11. Spin skyrmions about QAH (spinful ν = +3). a) Charge density of a single hole about the ν = +3 QAH state. b) Topological density ρtop computed from the spatial spin configuration of a). c) Energy of single electron and hole skyrmions ∆E = E 1e(h)-skyr − E 1e(h) (relative to a particlelike electron or hole excitation) as a function of system size L. d) Local spin orientation S(r)/|S(r)| corresponding to the skyrmion in a). The spins have been rotated so that the state has a net polarization along z. System size is 16 × 16 with graphene scheme. entation is consistent with the hallmark features of an SU (2)-symmetric skyrmion. Indeed the calculation of the topological density ρ top (in spin space) confirms the topological origin of the excitation. Unlike in QHFM, there is no explicit Zeeman field, meaning that the skyrmion expands to fill the available system size to minimize the Coulomb energy. This is reflected in the slow convergence of ∆E with L. There is a clear asymmetry between adding holes and electrons because the interaction-renormalized band structure 85-90 at finite integer fillings strongly breaks particle-hole symmetry (see Fig. 10a).
Just as in Sec. III C, doping additional charges leads to the formation of various skyrmion crystals. The unit cell of the double-tetarton lattice 51 contains two charges with an intricate spin pattern illustrated in Fig. 12a,c. An alternative configuration is the meron crystal whose unit cell is associated with a charge of 3e (Fig. 12d). Computations on system sizes and dopings where both options can be stabilized are required to numerically determine the energetically preferred lattice. It has also been proposed that applying a Zeeman field will enrich

V. DISCUSSION
In closing, we consider various generalizations and extensions of the work presented above, comment on the possibility of experimentally observing skyrmion physics, and critically assess the implications of our work for the proposed skyrmion-driven mechanism for superconductivity in TBG.

A. More general skyrmions
In this work, motivated by the particular physics and questions relevant to TBG, we have imposed restrictions on the directions in flavor space that the skyrmions are allowed to rotate in. The strong-coupling insulators at different integer ν are all Chern ferromagnets, and starting from the fully symmetric limit, one can consider more general skyrmions where the only constraint is that locally the state is polarized within the Chern sectors. For instance, starting from the |ν| = +3 QAHI, it is possible to form a texture in both spin and pseudospin to create an 'entangled' skyrmion, akin to what happens at |ν| = 1 in the zeroth LL of monolayer graphene 91 . With the full Hamiltonian, pseudospin rotations will be gapped since the QAHI is easy-axis, while spin rotations remain lowenergy due to the SU (2) S -symmetry.
These considerations can be generalized to any integer filling. Starting from a generalized ferromagnetic insulator withν C filled bands in Chern sector C, the local flavor configuration is parameterized by two matrix spinors f C (r) living in Grassmannian projective spaces 92 whereν + +ν − = ν. The factors in the normal subgroup represent unitary rotations in within the filled bands, unfilled bands, and the phase difference between the filled and unfilled bands respectively. The presence of four flavors and two spinors leads to a large manifold of skyrmions. From the discussion in Sec. II B, the leading corrections from the U (4) × U (4) limit will be anistropic couplings between the spinors, while additional singleparticle perturbations such as substrate coupling may manifest as anisotropic fields. The relevant space of textures will be dictated by the energetics at a given filling. The ground state in the spinful model at charge neutrality is the spin-unpolarized KIVC 5 where V = e iφ e i ϕ 2 m·s and τ ± = 1 2 (τ x ± iτ y ). This represents two copies of the spinless version in Eq. 3 with spin projections ±m and IVC phases φ ± ϕ 2 (the relative phase ϕ is set by the intervalley Hund's terms). Equivalently we have four Chern pseudospins n C,s where s indicates the projection of the spin along m. Consider doping two electrons. Forming a paired skyrmion in one of the spin sectors in the usual way can be understood using the analysis of Sec. III. Attempting to rotate in spin space will either lead to Pauli blocking or the loss of antiferromagnetic exchange J, which justifies our restriction to pseudospin textures in Sec. III. Similar conclusions can be reached for the spin-polarized KIVC insulator at |ν| = 2, though we cannot rule out the majority spin being non-trivially involved if it is the closest filled band to the chemical potential.

B. Disorder
The effects of disorder on skyrmions have been treated previously in the quantum Hall context [93][94][95][96][97] . A strong and smooth disorder potential can drive the system into a spin glass. We note that all the strong-coupling insulators of interest in this work violate TRS, and hence ordinary impurities cannot directly couple to the symmetrybreaking order as a random field. Isolated charged impurities are expected to pin individual skyrmions as well as spin texture lattices, which may aid in their detection as discussed in the following subsection. TBG can also harbor more subtle forms of disorder. For the QAHI, modulations in the magnitude and sign of the sublattice coupling ∆ can lead to the enforced nucleation of domain walls, including ones which separate regions with opposite Chern number 46 . These would interrupt the propagation and crystallization of spin skyrmions. Twist angle inhomogeneity 98 , which is pervasive in experimental samples [67][68][69][70]99 , is notoriously difficult to model, and its theoretical impact on the correlated insulators at integer fillings is still beyond quantitative characterization. For sufficiently long-wavelength fluctuations, the properties of the skyrmions (e.g. the effective masses) are likely to depend on the local twist angle.

C. Detecting skyrmions
The presence of spin textures about the QAHI at ν = +3 could be detected by measuring the degradation of magnetization for small dopings, e.g. through NMR measurements 100 , NV center magnetometry, or using a superconducting quantum interference device (SQUID) 13 . Individual spin skyrmions involve a large number of spin flips, and the spin texture lattices of Fig. 12 are close to completely spin unpolarized. However there is an orbital contribution from the spontaneous Chern polarization which is likely larger 13 . A more direct probe would be spin-resolved STM near impurity sites that could pin a localized skyrmion.
Measuring pseudospin textures at even integer filling is trickier since experimental techniques are not able to directly couple to the valley degree of freedom. IVC generally leads to √ 3 × √ 3 spatial order at the microscopic graphene scale, but theT -symmetry of the KIVC insulator means that it does not exhibit a Kekulé density distortion (KD) 101,102 . Paired skyrmions preservê T and hence do not give rise to KD 102 . They may still leave a dipole-shaped fingerprint in sublattice polarization within regions where the state is locally in the valley Hall configuration, i.e. pseudospins anti-aligned and pointing along the z-axis (Fig. 5c), but this is likely a faint signature since away from the chiral limit the Chern basis is only partially sublattice polarized. On the other hand, 1e-skyrmions are tightly localized within a few moiré lengths and give rise to a spatially varying KD when pinned by charged impurities. KD has been observed in the related context of QHFM within the lowest LL of monolayer graphene [103][104][105] , including the imaging of an individual valley skyrmion 104 . We caution that KD in TBG has also been predicted for IKS order in the presence of a small amount of strain 20,102 .

D. Skyrmion superconductivity
We finally turn to the implications of our work for skyrmion superconductivity. First, note that two important factors controlling the feasibility of this proposed mechanism are the stability of paired skyrmions (i.e. relative energy ∆E compared to particle excitations) and their effective masses. In general, we can conclude that paired skyrmions are especially favored close to the strongly-interacting isotropic limit. The realistic value of w AA lies in the range 55 − 90 meV (i.e. κ 0.5 − 0.8) and we only find paired skyrmions in the lower range of these values (Fig. 9). We note that a mechanism has been proposed that might drive a downward renormalization of the chiral ratio κ 9 . We note also that large skyrmions, which are relatively classical and relevant for small w AA , are likely to be well captured in our mean-field treatment. However, for larger w AA , the paired skyrmions become smaller and quantization effects are more important, and fluctuations can be more significant. In this regime the mean-field result is really only an upper bound on the skyrmion energy, which could be lowered by fluctuations, which are not expected to substantially affect the singleparticle excitations.
Enhancing interactions by suppressing screening, either through increasing the gate distance d sc or decreasing the permittivity r , also favors skyrmions (Fig. 6). However, this observation makes the superconducting domes that persist in Refs. 42-45 upon reducing the interaction strength difficult to reconcile with a topological mechanism. In these experiments, the insulators at integer ν, from which the skyrmions would be seeded, disappear with increasing screening. The superconducting region can also straddle the integers where the BKT transition temperature from Eq. 7 seemingly vanishes. Substrate coupling rapidly destroys pseudospin skyrmions, consistent with the absence of superconductivity in aligned samples 11,12 . This topological mechanism would not be effective in other moiré platforms that lackĈ 2z -symmetry 22,[106][107][108][109] . Strain takes the system away from the strong-coupling regime and similarly disfavors skyrmions: for instance, the parent insulator has been predicted to give way to a symmetric semimetal at charge neutrality or an IKS at non-zero integer fillings 20,59 . Hence the general expectation from our work and from these experimental observations is is that skyrmion superconductivity is most likely to emerge in 'pristine' samples with minimal screening.
The question of effective masses (and hence T c via Eq. 7) is more subtle. We have demonstrated that the dependence of the BKT transition scale E c on θ is rather sensitive to precise details of how electron interactions are incorporated. Without further external inputs, e.g. from detailed ab initio studies or spectroscopic probes of the band structure over a range of twist angles, it is difficult to make quantitative contact with experiments such as Ref. 41 which show a dome in T c near the magic angle. Any comparison would also inevitably be complicated by the presence of confounding variables such as twist angle disorder 99 which are difficult to fully characterize, let alone control. However, what we can reliably distill from our numerical study is that paired skyrmions can in principle emerge with a sufficient mass to support an estimate of T c 5 K that is comparable to experimentally observed values. Both the CN and graphene schemes are able to support a non-vanishing superfluid velocity at the magic angle, and the fragility of the parent correlated insulators to deviations of θ will tend to also reduce the strength of skyrmion superconductivity away from this regime.
The discussion above focuses primarily on a "BEClike" limit of skyrmion superconductivity, where the binding is present already in the dilute limit; this is the regime primarily accessed in this work. An intriguing alternative possibility is a more "BCS-like" picture, where skyrmions while unstable at low density nevertheless become favored and paired at finite density. We note that Ref. 23 which studies skyrmion superconductivity in the Landau level limit found superconductivity in the doped case, even in the absence of a pairing gap at zero doping.
Another potential challenge to the applicability of the skyrmion mechanism to TBG lies in the fact that superconductivity is most frequently observed near ν = −2 on the side away from charge neutrality. On the other hand, our numerics show that paired skyrmions are relatively harder to stabilize at |ν| = 2 compared to ν = 0 (Fig. 10b). Furthermore, the skyrmions are more disfavored when doping in the direction away from charge neutrality. This latter observation can be explained by the increased dispersion due to the interaction renormalization from the extra filled bands (Fig. 10a), and has been argued to be consistent with the asymmetry of the Landau fans 85 .
We cannot rule out a scenario where skyrmion superconductivity is operative in only a subset of samples, for instance the device studied in Ref. 4 which is nominally non-aligned with hBN and exhibits an remarkably large number of correlated insulators and superconducting domes, including near neutrality. Another moiré material where the skyrmion mechanism may be a plausible explanation of superconductivity is mirror-symmetric magic-angle twisted trilayer graphene (TTG), which is closely related to TBG but has a somewhat larger value for the magic-angle 110 . Interestingly, the superconductor in TTG is observed to have a very short coherence length 111 , and an associated pseudogap regime 112 . Partly because of this, Refs. 111 and 112 suggested that part of the TTG superconducting dome is in the BEC regime. Skyrmion pairing is a natural way to get preformed charge-2e bosons, and is at least known to give rise to superconductivity in the chiral limit of TBG 23 . It is therefore worthwhile to investigate whether this mechanism can explain at least a subset of the experimental observations in TTG.
In summary, our work provides clear evidence that the formation and pairing of skyrmions can indeed occur in microscopically faithful treatments of TBG, thereby illustrating that a purely electronic "topological" Cooper pairing mechanism can operate away from the exactlysolvable limit without leveraging any approximate sigmamodel description. However, despite this in-principle demonstration of the feasibility of a novel pairing mechanism, we cannot on the basis of present evidence definitively attribute superconductivity in TBG to this mechanism. This is highlighted by the difficulty in reconciling the deleterious effect of variations in strain and interaction strength and deviation from the chiral limit on the stability of skyrmions with the apparent robustness of superconductivity to such effects. Despite this, the uncertainty of various microscopic parameters and the mean-field nature of our study leaves open a real possibility that a skyrmionic mechanism may ultimately survive these challenges. Further work, especially involving numerical approaches that can capture fluctuations beyond the mean-field level, is clearly warranted, as are the exploration of new regimes and systems where skyrmion pairing may be more favored and the identification of new probes that can directly interrogate the nature of the pairing 'glue' in TBG.

Note Added. A preprint by Eslam Khalaf and Ashvin
Vishwanath appearing in the same arXiv posting performs a complementary analysis of 1e-skyrmion formation in the limit of small skyrmion size. In this regime, skyrmions may be viewed as "spin polarons", consisting of a charged electron or hole bound to a single spin/pseudospin flip particle-hole pair excitation. Our results are broadly in agreement where they overlap.

Strong-coupling form
Consider first the normal-ordered interaction Hamiltonian in the projected subspace (this will later be augmented by interaction-induced corrections) where h BM are the BM kinetic matrix elements in the Chern basis. Consider decoupling the normal-ordered interaction term with the projector P αβ (k) = d † α (k)d β (k) To complete the Hamiltonian, we need to account for the interaction contribution of the remote bands, as well as the fact that some interactions have been double-counted (for example the normal-ordered interaction is not particlehole symmetric about neutrality due to the self direct and exchange interaction of the filled central bands). There is no unique prescription for doing this, but a convenient parameterization is in terms of some reference projector P 0 . Assuming that the net contributions arising from Bloch states in the remote bands cancel, the total Hamiltonian can be writtenĤ In this way at the reference density P = P 0 , the HF spectrum is the same as that of the BM model. We now recast the Hamiltonian in 'strong-coupling' form, which makes the hierarhcy of scales more transparent. Letting P 0 (k) = 1 2 [1 + Q 0 (k)], the expression for h(k) becomes where a term vanished because P 0 is assumed to be a neutral projector arising from aĈ 2zT andPT symmetric Hamiltonian. The normal-ordered interaction can be rewritten The second term of Eq. A18 cancels with the third term of Eq. A17. Collecting the rest of the terms, we obtain the strong-coupling form of the total Hamiltonian h SP (k) = h BM (k) + 1 2A q V q Λ q (k)Q 0 (k + q) T Λ † q (k) (A20) δρ q =ρ q −ρ q ,ρ q = 1 2 Gk δ G,q TrΛ G (k).
Dividing the effective single-particle term into inter-sublattice and (sub-dominant) intra-sublattice pieces h SP (k) = h 0 (k)τ z + h x (k)σ x + h y (k)σ y τ z , we can define the energy scales of the strong-coupling hierarchy Similarly, the NLSM parameters are The pseudospin stiffness ρ ps is a property of the fully symmetric limit.
where the superscripts in S A,B emphasize that the determinant is taken over the filled spaces of |A and |B (i.e. the entire matrix S).

Application to TBG -single-component hopping
We first discuss the case of a single type of localized skyrmion (e.g. we neglect partners related by point group symmetries in the case of paired skyrmions with substantial ellipticity). Recall how a localized skyrmion |φ 0 is constructed from the eigenvectors v n k,τ,a (0) of the HF Hamiltonian (the 0 denotes that the skyrmion is centered at some arbitrarily chosen origin moiré site) i.e. fill orbitals kτ a v n k,τ,a (0)d † kτ a up to some n.
We construct skyrmions |φ R at all other moiré sites R by boosting the eigenvectors v n k,τ,a (R) = e ikR v n k,τ,a (0). Treating the set of eigenvectors as a matrix [v(R)] α,n = v n α (R), where α = (k, τ, a), we obtain expressions for the