$\varphi_0$-Josephson junction in twisted bilayer graphene induced by a valley-polarized state

Recently, gate-defined Josephson junctions in magic angle twisted bilayer graphene (MATBG) were studied experimentally and highly unconventional Fraunhofer patterns were observed. In this work, we show that an interaction-driven valley-polarized state connecting two superconducting regions of MATBG would give rise to a long-sought-after purely electric controlled $\varphi_0$-junction in which the two superconductors acquire a finite phase difference $\varphi_0$ in the ground state. We point out that the emergence of the $\varphi_0$-junction stems from the valley-polarized state which breaks time-reversal symmetry and trigonal warping effects which break the intravalley inversion symmetry. Importantly, a spatially non-uniform valley polarization order parameter at the junction can explain the highly unconventional Fraunhofer patterns observed in the experiment. Our work explores the novel transport properties of the valley-polarized state and suggests that gate-defined MATBG Josephson junctions could realize the first purely electric controlled $\varphi_0$-junctions with applications in superconducting devices.


I. INTRODUCTION
The discovery of correlated insulating states and superconducting states in magic angle twisted bilayer graphene (MATBG) [1][2][3] motivated intense studies of moiré materials in recent years. The rich symmetry breaking states discovered in MATBG  enable the creation of novel quantum devices with various quantum phases on a single material platform. Recently, gate-defined Josephson junctions (JJs) were created on MATBG [39][40][41][42] when a non-superconducting (weak-link) region in a superconducting MATBG device was created by local gating. Interestingly, a highly unconventional Fraunhofer pattern was observed in Ref. [41] when the weak-link region was gated to near half-filling ν = −1/2 filling (two holes per moiré unit cell).
The observed unconventional Fraunhofer pattern motivated us to study the Josephson effects in a gate-defined superconductor/valley-polarized state/superconductor (SC/VP/SC) in MATBG, as schematically shown in Fig. 1(a). As the unconventional Fraunhofer pattern indicates time-reversal and inversion symmetry breaking at the weak-link of the Josephson junction [41], we choose the weak link to be a partially valley-polarized state. In this case, the energy degeneracy of moiré bands of the K and -K valleys is broken due to electron-electron interactions [ Fig. 1(b)]. Such a valley-polarized state is one of the possible energetically favourable states at half-filling from Hartree-Fock calculations [28][29][30][31][32][33][34][35][36]43], which also satisfies the symmetry requirements of the experiment. The choice of the valley-polarized state as the weak-link in the Josephson junction is further motived by the observation of anomalous Hall effect at half-filling in the recent experiment, in which the twist angle is slightly away from the magic angle [43]. This anomalous Hall effect can also be explained by the partially valley-polarized state.
In this work, we show that the current-phase relation induced by the interaction-driven valley-polarized state as the weak link of a Josephson junction is highly unconventional, which has the form I s = I c sin(φ − ϕ 0 ). Here, I c is the critical current, φ = φ L − φ R is the phase difference of the two superconductors with phases φ L and φ R respectively. Such Josephson junctions with general ϕ 0 are called ϕ 0 -Josephson junctions (ϕ 0 -JJs). We fur-ther point out that the valley polarization and the trigonal warping effects are the key ingredients for realizing ϕ 0 -JJs. Importantly, a spatially non-uniform valley polarization order parameter at the junction can provide a plausible explanation for the unconventional Fraunhofer patterns observed in the experiment [41].

II. MODEL FOR NUMERICAL CALCULATION
First, we introduce a microscopic model which describes a MATBG Josesphon junction as realized experimentally in Ref. [41] and schematically shown in Fig. 1(a). The relevant moiré bands near charge neutrality of MATBG can be captured by an effective twoorbital tight-binding model on a hexagonal lattice [4,5], which can be written as : Here , ξ labels the two p-wave-like orbitals p x + iξp y as a representation of two valleys τ = ±K , σ =↑ / ↓ denotes the spin indices, t 1 = 0.331 meV and t 2ξ = −0.01 + 0.097ξi meV denote the first-nearest neighbor and the fifth-nearest neighbor hopping. Note that the imaginary part of t 2ξ describes the warping effects. Moreover, the spatial dependent chemical potential is denoted by µ which is chosen such that the filling factor ν satisfies −1 < ν < −1/2 for the superconducting part of the junction and ν ≈ −1/2 at the weak-link region [41]. As shown in Ref. [4,5], H 0 captures the symmetries of the moiré bands of MATBG.
To include the effects of interactions, we introduce the superconducting order parameter on the left (L) and right (R) sides of the Josephson junction and the valley polarization order parameter to the weak link. The resulting effective tight-binding Hamiltonian is: Here, the second term characterizes the pairing potential on the left and right side of the Josephson junction with phases φ L and φ R respectively. To be specific, we set the spin-singlet pairing [10,71] amplitude ∆ s = 0.1 meV according to the experiments [3,41], which is roughly one order smaller than the moiré band width. It is important to note that other time-reversal invariant unconventional pairings have been proposed in MATBG [37,72]. For simplicity, a conventional spin-singlet pairing order parameter is assumed in the main text. The conclusions obtained here are still valid even if we assume other momentum independent pairings which involve both the spin and valley degrees of freedom of MATBG (Appendix E). The temperature effects on the pairing can be included by setting ∆ s (T ) = ∆ s tanh(1.74 (T c − T )/T ) (T c is the superconducting critical temperature) [73].
On the other hand, the third term with the Pauli matrix τ z characterizes the valley polarization in the weaklink (WL) region with valley-polarization order parameter ∆ vp . The order parameter ∆ vp can be seen from the Hamiltonian with Coulomb interactions under the Hartree-Fock mean-field approximation [see Appendix A]. More details about the tight-binding model can be found in the Appendix C.

III. UNCONVENTIONAL JOSEPHSON JUNCTION INDUCED BY THE VALLEY-POLARIZED STATE
To study the properties of the gate-defined MATBG Josephson junction, we first calculate the energy dispersion as a function of phase difference φ of the junction which is described by H ef f . Here, we set the length of the non-superconducting part of the junction be d = 10L M / √ 3 [41] (L M ≈14 nm is the moire lattice constant), and set the filling ν to be close to half-filling. To match the experimental situation in which the junction resistance is much smaller than the quantized resistance h/e 2 [41], we set the weak-link regime to be partially valley-polarized [31,43] such that the weak-link section is metallic as schematically illustrated in Fig. 1(b). The case of fully valley-polarized topological state is studied in the Appendix C.
Figures 2. (a) and (b) show a typical energy spectrum of the MATBG Josephson junction as a function of the phase difference φ = φ L − φ R , obtained by diagonalizing the junction Hamiltonian H ef f with ∆ vp /∆ s = 0 and ∆ vp /∆ s = 1, respectively. As expected, there is a large number of Andreev bound states within the superconducting gap. The energy-phase relations of a few Andreev bound states with large slopes are highlighted by red solid lines in Figs. 2(a) and 2(b). It can be seen that the in-gap Andreev bound states with large slopes ∂E ∂φ contributing mostly to the supercurrent exhibit a phase shift which is close to (but not equal to) π when the valley polarization ∆ vp /∆ s = 1. This phase shift gives the first indication that the valley polarization has nontrivial effects on the Josephson junction.
To study the ground state of the Josephson junction, we calculate the free energy as T/T c =0.05 where T is the temperature, the energy of the states E n (φ) is obtained by diagonalizing the Hamiltonian H ef f (φ). For convenience sake, we define the free energy of the Josephson junction per unit width to be where W J is the width of the junction. Therefore, the phase difference of the two superconductors at the ground state is determined by φ 0 such that F J (φ 0 ) = 0. In Fig. 2(c), we plot the free energy landscapes F J (φ) with temperature T = 0.3T c at various valley polarization strengths (∆ vp /∆ s = 0, 1, 3). As expected, without valley polarization, the junction is conventional so that the ground state appears at φ = 0. Interestingly, the ground state of the Josephson junction can appear at a finite φ in the presence of valley polarization. For example, in the case of ∆ vp /∆ s = 1, the ground state with F J (φ) = 0 appears at a phase difference close to (but not equal to) π. For a larger ∆ such that ∆ vp /∆ s = 3, the ground state appears at a phase further away from π.
To show the effect of valley polarization on the currentphase relation, the supercurrent density J s (in unit of nA· µm −1 ) as a function of φ is depicted in Fig. 2(d) for the case without valley polarization (∆ vp /∆ s = 0) and in Fig. 2(e) for the case with valley polarization (∆ vp /∆ s = 3). Here, the supercurrent density is obtained from the free energy of the Josephson junction as ∂φ . Without valley polarization, the junction has conventional current-phase relation at both the low and high temperature regimes [74,75]. However, in the case with finite valley polarization, the supercurrent can either exhibit a sign change or even display a generic phase shift [see Fig. 2(e)]. In particular, it can be seen that the curves with higher temperature [the red and green lines in Fig. 2(e)] follow a standard ϕ 0 -JJ currentphase relation of J s = J c sin(φ − ϕ 0 ). Our calculation thus clearly shows that the valley polarization can result in ϕ 0 -JJs in MATBG.
One important consequence of a ϕ 0 -JJ is that there is a supercurrent even at zero phase difference (φ = 0), called anomalous supercurrent [59,60]. The anomalous supercurrent density J s (φ = 0) for ∆ vp /∆ s = 3 can be seen in Fig. 2(e). The J s (φ = 0) as a function of valley polarization strength ∆ vp at various temperatures is shown in Fig. 2(f). We find that the anomalous supercurrent is generally finite with valley polarization. Moreover, when ∆ vp ∆ s , the anomalous current density at the low temperature range can be as large as tens of nA· µm −1 . As depicted in Fig. 1(c), we expect to see an anomalous current in a ring geometry when part of the superconducting ring is gated to the valley-polarized state. It is also important to note that unlike previously studied ϕ 0 -JJs, the MATBG ϕ 0 -JJs do not involve ferromagnetism or spin-orbit coupling, which calls for a new understanding about the underlying mechanism for the formation of ϕ 0 -JJs in the MATBG.

IV. UNDERLYING MECHANISM FOR ϕ0-JJS IN MATBG
Next, based on the scattering matrix method [74,75], we show analytically that the valley-polarization and the warping effects of moiré bands are crucial in realizing a ϕ 0 -JJ. At the junction, the states can be labelled by the transverse momentum k y . For illustration, we demonstrate how the Andreev bound state associated with the k y = 0 mode (normal incident states), is affected by valley polarization and the warping terms. The 1D Hamiltonian associated with the k y = 0 mode can be written as H 1D = τ α dxΨ † τ α (x)Ĥ τ α (x)Ψ τ α (x). Here, τ = +/− labels the valley index, α = +/− labels the incoming/outgoing normal states near Fermi energy, Ψ τ ν = (ψ τ α (x), ψ † −τ,−α (x)) T denotes the Nambu basis, and (4) Here, the linearized single-particle Hamiltonian where v s,τ α and v vp,τ α are the Fermi velocities for the superconducting region and the valley-polarized weak-link region, respectively. Notably, the warping term which breaks the intravalley inversion symmetry could lead to v vp,τ α = v vp,τ −α . The superconducting pairing potential is written as , and the valleypolarized order parameter is ∆ vp (x) = ∆ vp Θ(x)Θ(d−x).
With the effective one-dimensional Hamiltonian H 1D , we can solve the energies of the Andreev bound states τ analytically (τ is a good quantum number), which are given by [for more details see Appendix B] Here, β( τ ) = arccos τ ∆s , E T = v vp /d is the Thouless energy, E A = δv vp /d is an energy scale that reflects the intravalley asymmetry induced by the warping term, wherev vp and δv vp are defined byv vp = 4( τ ν v −1 vp,τ α ) −1 and δv vp = 2(v −1 vp, Importantly, many features of the numerical results as shown in Fig. 2 can be captured by Eq. (5). For example, we can calculate the Andreev bound state energies associated with the τ = +/− valleys by solving τ (φ) from Eq. (5) at ∆ vp /∆ s = 0 and ∆ vp /∆ s = 1 [see the gray lines and blue lines in Fig. 3(a), respectively]. Note that in Fig. 3(a), the valley degeneracy of Andreev bound states are lifted by the warping term and valley polarization. With the bound state energies τ , it is straightfor- ward to obtain the supercurrent I s (φ) by adopting the relation As an illustration, we plot the calculated I s (φ) at the low temperature limit in the cases of ∆ vp /∆ s = 0 and ∆ vp /∆ s = 3 [see Fig.3 Notably, the features are in agreement with the ones shown in Figs. 2(d) and 2(e).
In the short junction and at the high temperature limit, we can obtain an analytical form for the Josephson current: It can be seen that I s (φ) indeed significantly differs from the conventional form given by I s (φ) ∝ sin φ. Specifically, the supercurrent exhibits a phase shift ϕ 0 = ∆ vp /E A , which is determined by both the valley polarization and Fermi velocity asymmetry induced by warping Remarkably, the factor cos( 2∆vp E T ) indicates that the supercurrent oscillates periodically as a function of ∆ vp , being consistent with the numerical result in Fig. 2(f). It is worth noting that the results are similar to Eq. 7 for modes with small transverse momentum k y as well.

V. UNCONVENTIONAL FRAUNHOFER PATTERN
In practice, the sample inhomogeneity and the formations of valley polarization domain walls may lead to spatially non-uniform ∆ vp inside the weak-link region [76], which can affect the transport properties of the ϕ 0 -JJ. As an illustration, we calculate the Fraunhofer pattern for a simple geometry with two valley polarization domains (see the inset of Fig. 4), in which each domain generates a phase difference of ϕ 1 and ϕ 2 respectively at the junction. Such a ϕ 1 -ϕ 2 junction is a generalization of the previously studied 0-π junctions [77][78][79]. Here, we plot the resulting Fraunhofer patterns in Fig. 4 and the details can be found in the Appendix D. Interestingly, the Fraunhofer pattern in the ϕ 1 -ϕ 2 junction captures the main features found in the recent experiment [41] which exhibits a shift in the central peak, a large asymmetry with respect to the central peak, and a non-vanishing critical current as a function of magnetic fields. These features are not naturally expected by conventional Josephson junctions nor 0-π Josephson junctions [77][78][79]. It is worth noting that although our study provides a plausible mechanism for the unconventional Fraunhofer pattern seen in the experiment based on ϕ 0 -JJs, we cannot exclude other ways of generating such Fraunhofer patterns. To directly view the ϕ 0 -JJs in the experiment, it would be more straightforward by using the SQUID structures as previous experiments [67,68], which are discussed in detail in Appendix D.

VI. DISCUSSION
It is important to note that our model exhibits some similarities between the ϕ 0 -JJ model induced by Rashba spin-orbit coupling and exchange field [57,64] by regarding the valley as a pseudospin. Specifically, the trigonal warping term and valley polarization can play the role of spin-orbit coupling and spin polarization, respectively. The detailed mapping is illustrated in the Appendix F. This mapping provides a good insight about the appearance of the ϕ 0 Josephson junction in the MATBG platform, although in which the spin-orbit coupling is negligible. We expect ϕ 0 Josephson is only an example, while other profound physics that typically arises from the interplay of spin-orbit coupling and ferromagnetism can also be explored in the MATBG according to our framework.
In the main text, we focus on the Josephson junction in the ballistic limit. On the other hand, intervalley backscatterings can couple the two valleys and effectively weaken the valley polarization and reduce ϕ 0 , similar to the spin-relaxation effects in ϕ 0 -JJ with spin-orbit coupling [64,65,80]. However, as long as the valley polarization is finite, we still expect a ϕ 0 -JJ. Furthermore, our work can be easily extended to study the unconventional Josephson effects mediated by valley-polarized states in other moiré materials/superconductor heterostructures.

Moiré bands of MATBG and trigonal warping effects
In the main text, the trigonal warping impact of moiré bands and its important effects on creating ϕ 0 -Josephson junctions (JJs) in twisted bilayer graphene is highlighted. Here, we present the details of showing the trigonal warping effects using the continuum model of magic angle twisted bilayer graphene (MATBG) (which is depicted in Fig. 5(a)). The continuum model of MATBG (c.f. [1,4]) can be written as Here, the intra-layer moiré Hamiltonian is where v/a = 2.1354 eV. Here, l = t/b and τ = ± label the top/bottom layers and ± valleys respectively. The twist angle is denoted as θ, σ j represent the Pauli matrices defined in the AB sublattice space, K (l) τ labels the Dirac point at valley τ of the l-layer, and the rotational operator R(θ) = diag(e −il θ 2 , e il θ 2 ). The interlayer Hamiltonian T (r) can be written as ) with the moiré unit length L M = a/ sin θ ∼ 14 nm, and we adopt u = 0.0797 eV, u = 0.0975 eV according to Ref. [4]. The moiré bands can be obtained by diagonalizing the continuum Hamiltonian using the plane wave basis ψ k (r) = G C G e i(k+G)·r with G = n 1 G M 2 + n 2 G M 3 , where n 1 , n 2 are integers. Fig. 5(b) show the band structure of the lowest moiré bands near the charge neutrality point of MATBG [4].
To highlight the trigonal warping features of the moiré bands, a Fermi energy contour near half-filling of the τ = +1 valley (the black dashed line in Fig. 5(b)) is plotted in Fig. 5(c) (blue line). We can denote the warped Fermi energy contour as k f (ϕ). As k f (ϕ) = k f (ϕ + 2π 3 ) due to the C 3 symmetry and an emergent C 2x symmetry in each valley such that k f (ϕ) = k f (−ϕ). To the lowest order in ϕ, we can expand k f (ϕ) ≈ a + b cos 3ϕ. By using a = 2.315, b = −1.299 (in unit of L −1 M ), we find that k f (ϕ) can approximately fit the warped Fermi energy contour of the continuum model. The Fermi energy contour at the τ = −1 valley can be obtained by a time-reversal operation.
Such prominent warping behaviour is a direct consequence of the narrow bandwidth and the constraint of D 3 point group symmetry of MATBG [4,5]. Here, we present the symmetry transformation properties under the generators of D 3 which include a three-fold rotation along the z-axis and a two-fold rotation along the y-axis. Note that the moiré bands are assumed to be decoupled in the valley space so that only the terms that involve τ 0 and τ z are allowed. Without loss of generality, we consider the Fermi energy cuts the lower branch of the moiré bands only as shown in Fig. 5(b). In this case, we can construct a simple symmetry-invariant continuum model near Γ m point as Here, the first term is the kinetic energy term, µ denotes the chemical potential term, and the second term is the warping term which is opposite at the opposite valley to preserve the time-reversal symmetry T = τ x K and C 2y = τ x symmetry. The presence of the warping term breaks the intra-valley inversion symmety as As emphasized in the main text, the breaking of intra-valley inversion symmetry together with the valley polarization enables the generation of ϕ 0 -Josephson effect in MATBG even in the absence of the spin-orbit coupling.
One of the important consequences of the warping effects is to enable the velocity of incoming and outgoing states in the junction to be asymmetric, which plays a crucial role in creating a nontrivial ϕ 0 as as shown in later sections. To show the asymmetry of the Fermi velocity in the moiré bands, we plot the angular dependence of Fermi velocity v f (ϕ) (see the blue line in Fig. 5 is the Fermi momentum contour as shown in Fig. 5(c). In other words, v f (ϕ) is the Fermi velocity along the radical direction at each ϕ. For normal incident states, we can estimate the asymmetry of the Fermi velocity is given by M ≈ 2 × 10 4 m/s. Note that the Fermi velocity is two orders smaller than that of monolayer graphene due to the formation of flat bands under moiré superlattice potential.
with parameters λ 0 = 0.5347 and λ 1 = 0.0885 (see the dashed line in Fig. 5(d)). Although there is some deviation from the numerical one (in blue), all the symmetry features are captured. To obtain a closer fitting to the numerical results, one can expand it to higher order terms, which is not necessary for the purposes of this manuscript. Therefore, we have shown that the trigonal warping effects would result in anisotropic Fermi velocities. The warping term would induce an asymmetry for the velocities of the incoming and outgoing modes in our scattering matrix method calculations later.
2. An illustration of valley-polarized states from the Hartree-Fock mean-field approximation More detailed Hartree-Fock mean-field approximation for the moiré bands upon Coulomb interaction has been extensively studied in previous works. But for the sake of completeness and illustrate some features of the moiré bands upon the valley-polarization, we present the basic formalisms of the valley-polarized states from the Hartree-Fock approximation with a minimal interacting Hamiltonian: (A5) Here, A is the sample area, the density operator, s denotes the spin index, µ is the chemical potential for the single-particle moiré band, ρ q = k,k ,τ,s c k,τ,s |e iq·r |c k ,τ,s c † k,τ c k ,τ,s . Without loss of generality, we focus on the first valence moiré band and the Coulomb interaction is projected to this moiré band [71]. The singlet-particle wave function can be decomposed into the plane wave basis as |c k,τ,s = 1 √ A G a k+G,τ,s e i(k+G)·r with G being the moiré reciprocal lattice vector.
One can solve Eq. (A8) in a self-consistent way. Note that the doubly counted interacting energy should be subtracted after the mean-field approximation, which is given by Depending on the filling and the Coulomb interaction strength, various time-reversal breaking states with valley-polarization which satisfy the self-consistent Hartree-Fock equation can be obtained, including i) The fully valley-polarized, spin-unpolarized insulating or semi-metallic state. This state can appear when the Coulomb interaction strength is strong and the filling factor is near some integer fillings; ii) The partially valleypolarized, spin-unpolarized metallic states, which can appear when the Coulomb interactions are weaker [31]; and iii) The valley-polarized, spin-polarized states. A schematic illustration of the valley-polarized states is presented in the main text Fig. 1. As the focus of this work is the valley-polarized state, we neglect the spin polarization and rewrite the mean-field Hamiltonian as where˜ k,τ = k,τ + (∆ k,τ + ∆ k,−τ )/2 and ∆ vp,k = (∆ k,τ − ∆ k,−τ )/2, and ∆ vp τ z is the valley-polarized order parameter. Note that here we are directly taking the valley-polarized state as the ansatz state. It is actually difficult to determine which state is more energetically favorable based on Hartree-Fock consideration alone. Especially when the correlated state appears at the weak-link region, the coupling with the superconducting regions can also be important.
Appendix B: Analytical calculations of the current-phase relation using the scattering matrix method In the main text, we have used scattering matrix method to show that the studied MATBG Josephson junction is a ϕ 0 -JJ and find that the valley polarization and warping effects are crucial in giving rise to the observed ϕ 0 -JJ. We present the corresponding details in this Appendix section.

Model Hamiltonian
To gain some insight into the crucial features of the junction, we first look at the limit of ∆ s , ∆ vp µ, i.e., the bandwidth is the biggest energy scale. In this case, we can linearize the momentum near Fermi energy for a fixed transverse momentum k y and obtain a low-energy effective model as H = 1 2 τ α ky dxΨ † ky,τ α (x)Ĥ ky,τ α (x)Ψ ky,τ α (x).
(B1) Here, τ ± labels the ±K valley, α = +/− labels the incoming/outgoing normal states near Fermi energy, Ψ ky,τ α = (ψ ky,τ α (x), ψ † ky,−τ,−α (x)) T denotes the Nambu basis, and (note that here, we have assumed a uniform valley polarization for the sake of simplicity), the longitudinal Fermi velocity at a fixed k y of the superconducting part and junction part is given by Here, φ is the phase difference, d is the length of the junction, ∆ vp is the valley polarization strength, v s,τ α , v f,τ α are the longitudinal Fermi momentum along the current direction of the superconducting part and the junction part with valley polarization. One can verify that the whole HamiltonianĤ (dimension is eight by eight) preserves particle-hole symmetry PĤP −1 = −Ĥ but breaks time-reversal symmetry: TĤT −1 =Ĥ if ∆ vp is finite. Here,P = ρ x α x τ xK , T = α x τ xK ,K is complex conjugate, and α j ,K is com τ j , and ρ j are Pauli matrices defined in α = +/−, valley, and particle-hole space, respectively.
Note that in general v vp,τ α = v vp,−τ −α due to the breaking of time-reversal symmetry, but v vp,τ α ≈ v vp,−τ −α in the limit of ∆ vp E F . On the other hand, the warping term breaking intra-valley time-reversal symmetry could lead to v vp,τ α = v vp,τ −α , which plays a crucial role in giving rise to the ϕ 0 junction as shown later.
It is worth noting that the model Hamiltonian resembles that for an S/F/S junction if the valley is regarded as a pseudo-spin (flips sign under both time-reversal and inversion operation). As we will show later, the π junction, which was commonly explored in S/F/S junctions, can also be stabilized in S/VP/S junctions. But we would emphasize that the physics system in our case is very different, given that the polarization appears in valley degree of freedom rather than spin.

Scattering states and boundary conditions
The scattering states in the S region of the left(L) and right(R) side are obtained as with k 0 s,τ α being the Fermi momentum along the longitudinal direction for a fixed k y and the definitions The in-gap states ψ L/R s,τ α with ∆ s are superpositions of electron and hole with an exponential decay length κ −1 τ α into the left/right superconducting regions. One can verify that the states in the S region possess time-reversal symmetry: ψ s ( , −φ) =T ψ s ( , φ) with k 0 s,τ α = k 0 s,−τ −α . Here, k e,τ α and k h,τ α are the wave vectors for electronand hole-like states, respectively [see an illustration in Fig. 6], and N e(h),τ α are normalization factors to ensure that the scattering matrices are unitary. Up to the leading order, k e,τ α ≈ k 0 e,τ α + δk e,τ α , k h,τ α ≈ k 0 h,τ α + δk h,τ α with k 0 e,τ α = k 0 h,τ α = αk 0 vp,τ α , and Here, ψ vp,e,τ + , ψ vp,h,τ − are the states moving in the +x direction, while ψ vp,e,τ − , ψ vp,h,τ + are the states moving in the −x direction. The particle-hole symmetry requires ψ vp,h (− ) =P ψ vp,e ( ) so that δk h,τ α (− ) = −δk e,−τ −α ( ). The factors N e,τ α and N h,τ α are to ensure that the scattering matrix is unitary.

Andreev bound states in the case without normal reflections
We now solve the Andreev bound states using the scattering matrix method [74,75]. First, we need to work out the scattering matrices. We can define The scattering matrices in the scattering method are defined as and the transition matrices are defined as According to the scattering matrix method [74,75], the energies of Andreev bound states are given by The transmission matrices can be directly obtained according to the definitions Eqs. (B15) to (B18) and Eq. B20 as The form of scattering matrix S L(R) would depend on the interface at x = 0 and x = d. Let us first consider the case without chemical potential difference between the superconducting region and valley polarized region, i.e., µ = µ . In this case, v vp,τ ± = v s,τ ± , the factors in the scattering states can be simply taken as N e,τ α = N h,τ α = 1. Using the definitions Eqs. (B15) to (B18) and the boundary conditions Eqs. (B11) to (B14), one can easily obtain Here, β = acos τ ∆s for in-gap Andreev bound states, and only Andreev reflections in the scattering matrix are finite due to the absence of momentum mismatches. By inserting the scattering matrix back to Eq. (B21), we find that the energies of Andreev bound states are given by We can further define two energy scales: one is the Thouless energy E T = v vp /d, and the other one is E A = δv vp /d, which reflects the intra-valley asymmetry induced by the warping term. Then Eq. B25 is rewritten as It clearly shows that the phase φ is shifted asφ = φ − ϕ 0 with due to the combination of valley polarization and warping effects. As we will see later, ϕ 0 would manifest as the phase shift in a current-phase relation, which would result in the so-called ϕ 0 junction. In the short junction limit, E A , E T , we can actually obtain the energies of the bound states: 4. The angular dependence of the ϕ0 phase shift Next, we briefly comment on the angular dependence of the ϕ 0 phase shift. It is important to note that the magnitude of ϕ 0 in general would depend on the angle θ between the current direction and lattice orientation. As an illustration, we can evaluate the ϕ 0 phase shift with the approximated angular dependence of the Fermi velocity presented in Sec. I: v f (θ) = α 0 + β 0 τ cos 3θ, where τ is the valley index, α 0 captures the isotropic part, and β 0 captures the anisotropic part of the Fermi velocity. It is straightforward to obtain ϕ 0 according to the relation between E A and v f (θ), which gives .

Andreev bound states in the case with normal reflections
In general, the chemical potential between the superconducting region and valley-polarized region is different with µ = µ . To see the effects of such a difference in chemical potential, we solved the scattering matrices in the same way as Here, r A , r N are coefficients for Andreev reflections and normal reflections. Note that we have used N e,τ + = One can verify that the scattering matrix is unitary with |r A | 2 + |r N | 2 = 1 for the in-gap bound states with < ∆. Evidently, the normal reflections r N would be finite due to the momentum mismatches, i.e., v vp,τ ± = v s,τ ± induced by the difference in the chemical potential (µ = µ ). It can also be seen that the scattering matrix Eq. (B32) would return to the Eq. (B24) if there is no momentum mismatch.
Next, we solve the energies of Andreev bound states in the case of finite normal reflections. For the compact of notations, we rewrite the scattering matrix as: Here, r = |r N |, η = Arg[X −1 ]. By substituting the scattering back to Eq. (B21), we find that the Anreev bound states are given by As expected, the phase shift ϕ 0 = ∆vp E A would not be affected by the presence of normal reflections. Instead, the normal reflection would mainly weaken the magnitude of the supercurrent and thus is not essential for our study.

Free energy and Josephson currents
The free energy of a JJ can be written as where n is the eigenenergies of the BdG Hamiltonian of the Josephson junction, β = 1/k B T , U is an effective interaction strength. We neglect the U dependent term which is independent of φ. One can further subtract a constant normal state free energy F (∆ s = 0) to avoid the divergence at large energies and would not affect the current-phase relation I s (φ) [75]. The supercurrent through the JJ can be obtained from the free energy with Here, e is the charge of an electron. One can easily figure out the current units by using ≈ 6.581 × 10 −13 meV·s and e/s ≈ 1.6 × 10 −19 A (A is Ampere), i.e., 2e/ ≈ 486 nA/meV. By substituting the bound state energy Eq. (B30) into Eq. (B39), and at the high temperature limit ∆ s /k B T 1, we obtain Eq. (7) of the main text:

The scattering modes of different transverse momentum ky
In the previous sections, we have solved the 1D scattering matrix problem for each mode at a fixed k y . To obtain the total supercurrent through the junction, we need to insert different longitudinal Fermi momentum v f,τ α (k y ), and sum over different k y that are quantized by the finite width. Unfortunately, we could not do it analytically due to the complicated warping effects. For the completeness, we still present a brief discussion of the effects of k y here.
The total supercurrent through this Josephson junction is given by In the short junction and at the high temperature limit, the supercurrent at a phase difference φ: I s,ky (φ) carried by each mode can be obtained by replacing E T , E A in Eq. (B40) with the ones calculated from v f,τ α (k y ). As shown in Fig. 5(d), the value of longitudinal Fermi momentum v f,τ α (k y ) and its asymmetry near k y = 0 are similar so that the resulting current-phase relation is expected to be similar to Eq. (B40) for a small transverse momentum. However, the situation becomes complicated in the large transverse momentum k y . Because of the warping effects, there are multiple scattering modes near Fermi energy for a fixed k y [see Figs. 5(c) and 5(d)], which are not captured by Hamiltonian (B2) that only includes one incoming electron-or hole-dominant mode. Nevertheless, we expect the scattering modes with large momentum to carry less supercurrent and thus in the main text, we find that the 1D scattering Hamiltonian provides a good understanding of our numerical results, in which the current carried by all incoming modes are included.
Appendix C: More details for the MATBG Josephson junction using the tight-binding method In this section, we present more details about the numerical calculations, including the geometry details, the result in the case of turning off the warping effects, and the result in the case of the weak-link region being a half-filling valley-polarized Chern insulator with a Chern number two. As introduced in the main text, we adopt the following effective tight-binding model to capture MATBG Josephson junction:

Model and Geometry details
See the main text for the detailed definitions of the ingredients in Hamiltonian H ef f . Here, we depict the adopted geometry of the MATBG Josephson junction in Fig. 7. The superconducting order parameter ∆ s and valley-polarized order parameter ∆ vp are added in the green region and gray region of the top panel of Fig. 7, respectively. As shown in the bottom panel of Fig. 7, the lowest moiré bands near the charge neutrality are captured by hoppings on the two-orbital hexagonal lattice in each region, where t 1 represents the first-nearest hopping, t 2 represents the complex fifth-nearest hopping (giving rise to the warping effects). We note that the minimal tight-binding model proposed in Ref. [4] that is used to capture the moiré bands up to the lowest hopping is narrower than that from the continuum model shown in Fig. 5(b). This however would not affect our result as the presence of ϕ 0 -JJs is determined by the symmetries according to our main text analysis. The key length scales are also highlighted in Fig. 7. The lattice sites in Fig. 7 label the center of wannier orbitals so that the length of the nearest bonds is the moiré lattice constant L M . We thus measure the adopted junction length d and W in main text in units of L M .

Symmetry consideration
We now present a symmetry analysis to show why the valley polarization and the warping effects are crucial for the emergence of the ϕ 0 -JJ in MATBG. Without these two ingredients, the system would exhibit time-reversal symmetry which gives I s,τ (φ) → −I s,−τ (−φ) and an intravalley inverison symmetry which gives I s,τ (φ) → −I s,τ (−φ). As I s (φ) = τ I s,τ (φ), it can be seen that both symmetries would enforce the total supercurrent to satisfy the condition I s (φ) = −I s (−φ), so that I s (φ = 0) = 0. Therefore, according to our symmetry analysis, the conclusion that valley-polarized state induces ϕ 0 -JJ is general as long as time-reversal and intravelley inversion symmetries are broken. In the below, we present more cases to verify this symmetry analysis.

The case without warping effects
In the main text, the warping effects are naturally included in our calculations with the fifth-nearest hopping t 2ξ = 0 (c.f. [4,5]). As discussed in the main text, the warping term would lift the intra-valley inversion symmetry so that the minimal free energy of the junction is not necessary 0-or π-JJ, resulting in a ϕ 0 -JJ in general. To make a comparison, we now artificially turn off the warping term. As expected, we find that the junction is restricted to be 0-or π-JJ [ Fig. 8(a)]. We consistently find that the anomalous Josephson current, i.e.,J s (φ = 0), vanishes [ Fig. 8(b)]. It thus clearly shows that the warping effects are crucial for the ground state of MATBG Josephson junction to be ϕ 0 -JJ, which is in agreement with our symmetry analysis presented in the main text. Here we set the temperature T = 0.05Tc.
It was pointed out in the main text that the mechanism: valley-polarized state mediates unconventional Josephson junction is quite robust regardless of whether the state is topologically trivial or nontrivial. In this section, as a demonstration, we present the calculated current-phase relation [ Fig. 9] by setting the junction region to be half-filling (ν = −1/2) valley-polarized Chern insulating states with Chern number C = 2 [see a schematic illustration in the inset of Fig. 9]. One can add a Haldane term to the tight-binding Hamiltonian (C1) in order to make the junction region topological (c.f. Ref. [41]). In this case, as shown in Fig. 9, the curves of supercurrent J s (normalized by its maximal value) versus the phase difference φ would still display a finite phase shift, i.e., sin(φ − ϕ 0 ), for various junction lengths d. In other words, the junction would still behave a ϕ 0 -JJ. Note that in the topological case, the edge states that can mediate some supercurrents may play an additional role. Nevertheless, Fig. 9 clearly shows that our conclusion about the valley polarization causing ϕ 0 -JJ is not affected. It is understandable given that timereversal and intra-valley inversion symmetry are still broken by the valley-polarized Chern bands in this case. In this section, we show that the magnetic interference of a uniform ϕ 0 Josephson junction should be the standard Fraunhofer pattern. The gauge invariant phase difference across an extended junction is For an out of plane magnetic field, the gauge can be chosen as A = (−By, 0). The Josephson current is given by Assuming that the current follows the simplest sin(φ − ϕ 0 ) feature, we will obtain Here, we denote the width of the junction to be W J . In the case of a uniform current density j(x) = j b , then where the flux quantum Φ 0 = h/2e, Φ = B ×W J d. Thus, where the critical current at zero-field is denoted as I c = j 0 W J . Hence, for a uniform ϕ 0 JJ, our phenomenological calculation suggests that the critical current as a function of external fields I c (Φ) follows the standard Fraunhofer pattern.
2. Phenomenological theory for the magnetic interference of a ϕ1-ϕ2 Josephson junction As we have mentioned in the main text, the total current through this junction under external magnetic fields can be written as Note that ϕ 1 and ϕ 2 would be different as ∆ vp in two domain walls are different. If ϕ 1 = 0 and ϕ 2 = π, the scenario would reduce to the 0-π JJ studied in Ref. [77][78][79], where the Fraunhofer pattern exhibits a dip near the zero magnetic flux due to the cancellation of the supercurrent of the 0-JJ parts and π-JJ parts. In our case, the ϕ 1 and ϕ 2 can be a value ranging from 0 to 2π due to the formation of ϕ 0 -JJ. Hence, we call it ϕ 1 -ϕ 2 JJ. The Fraunhofer pattern of the ϕ 1 -ϕ 2 JJ can be obtained from Eq. (D5). Specifically, the total supercurrent is written as Here, I s1 = j b W J1 and I s2 = j b W J2 denote the current through the two domain walls, respectively, and the magnetic flux though the j-th domain wall is Φ j = BW j d.
For the sake of simplicity, we denote I s1 = 1 2 (1 + δ)I s , where I s = I s1 +I s2 is the total supercurrent through the junction, and Φ = B(W J1 + W J2 )d is the total magnetic flux. Using these notations, the where ϕ ± = ϕ 2 ± ϕ 1 . The critical current I c = max(I(φ)), given by the maximal value of I s (φ) within 0 ≤ φ ≤ 2π. For the 0-0 JJ, one can easily obtain the standard Fraunhofer pattern I c (Φ) = I s | sin(πΦ/Φ0) πΦ/Φ0 |. Due to the presence of the asymmetry parameter δ, beyond 0-0 JJ, we could only find analytical solutions of the critical current in some special cases , such as in the limit of δ = 0: It can be noted that the critical current is always zero if the magnetic flux reaches certain integer flux quantum Φ = 2nΦ 0 (n are finite integers). As there is no node in the Fraunhofer pattern of experiments, we remove these nodes by introducing a finite asymmetric parameter δ.
For example, at Φ = 2nΦ 0 , the critical current becomes |, which could be finite if δ = 0. It is worth noting that another key feature of the experimentally observed Fraunhofer pattern is to exhibit the I c (Φ) = I c (−Φ). In the cases of 0-0 JJ and 0-π JJ, we find that the resulting Fraunhofer patterns are always symmetric, regardless of the choice of δ. However, if we consider ϕ 0 -JJ, where ϕ ± can take a more generic value rather than 0 or π, we find that the resulting Fraunhofer pattern is asymmetric in general. In Fig. 4 of the main text, we plotted the Fraunhofer pattern for the 0-0 JJ, 0-π JJ, and ϕ 1 -ϕ 2 JJ with δ = 0.4, ϕ 1 = 0.2, ϕ 2 = π +0.8. The resulting Fraunhofer pattern arising from ϕ 1 -ϕ 2 JJ is quite consistent with that seen in the experiment. Our calculation thus suggests that the presence of ϕ 0 -JJs could provide a plausible explanation for such highly unconventional Fraunhofer patterns.
Here we further emphasize how the features of the unconventional pattern shown in the main text Fig.4 are related to the ϕ 1 -ϕ 2 JJ model, especially the model parameters ϕ 1 , ϕ 2 , δ: (i)The unconventional Fraunhofer pattern (red line), |I c (Φ) = I c (−Φ)|, which would indicate the time-reversal breaking. (ii) The unconventional Fraunhofer pattern exhibits a local minimal around zero flux. As a result, the central peak is shifted to a finite flux. It is sharply different from the conventional Fraunhofer pattern (gray line) which exhibits a maximal peak around zero flux. In the ϕ 1 -ϕ 2 JJ, the local minimal appears around zero flux when the difference between ϕ 1 and ϕ 2 exceeds π, which results in a cancellation of supercurrent through ϕ 1 and ϕ 2 junction part. (iii)The unconventional Fraunhofer pattern exhibits non-vanishing nodes at finite integer flux. According to the expression I c (Φ = 2nΦ 0 ), the non-vanishing nodes at Φ = 2nΦ 0 indicatesϕ 1 and ϕ 2 are different, W J1 and W J2 are different as δ = 0.

Detection of ϕ0-JJ with a superconducting MATBG SQUID
The SQUID can be used to identify the ϕ 0 -JJ behavior in the experiments [67]. For the sake of completeness, as shown in Fig. 10(a), here we propose a MATBG SQUID geometry to detect the ϕ 0 -JJ predicted by our theory. In this geometry, there are two weak-linked junction regions that are achieved by local gates-a,b. Without loss of generality, we consider one is the ϕ 0 −JJ with the junction gated into valley-polarized states (JJ-a), while the other one is the conventional JJ (JJ-b).
The total supercurrent through the SQUID under magnetic fields is written as Here, I a (ϕ a ) and I b (ϕ b ) are the supercurrent (phase difference) across the JJ-a, JJ-b, respectively. Φ is the magnetic flux through the SQUID. In a simple case where I a = I b = I 0 , we can obtain the critical current at each magnetic flux as Hence, the ϕ 0 would cause a phase shift in the SQUID pattern [see the top panel of Fig. 10(b)], where ∆ϕ = ϕ 0 /2π. In general, I a and I b are not equal. As a illustration, we plot the magnetic interference pattern with I b = 3I a in Fig. 10(c). In this case, it can be seen that although the critical currents no longer vanish at certain magnetic fields, the phase shift does not change. Therefore, the proposed MATBG SQUID provides a feasible way to directly measure the predicted ϕ 0 phase shift. Upon finishing our work, we noticed that the MATBG SQUID geometry has recently been successfully fabricated in the experiment [42]. Our work thus would motive experimentalists to further gate the junction region into valley-polarized states and study the proposed unconventional Josephson effects in the near future.
iτy ⊗ σx -(iτy ⊗ σz, iτy ⊗ σ0) In the main text, to be specific, we have adopted the spin-singlet pairing as the pairing order parameter for the superconducting part of the MATBG Josephson junction. In this section, we point out that the ϕ 0 -JJ can still persist even when the pairing is unconventional, such as various spin-triplet pairings. The pairings can be expanded in the space formed by spin and valley degrees of freedom. We first classify the possible pairings using irreducible representations of the D 3 crystal group of MATBG. For simplicity, we focus on all k-independent inter-valley pairings.
Specifically, the generators of D 3 point contains a three-fold rotation along the z-axis represented by C 3z = τ 0 ⊗ e −i π 3 σz , and a two-fold rotation along the y-axis represented by C 2y = τ x ⊗ iσ y . Here, σ and τ are Pauli matrices defined in spin-and valley-space. Note that C 2y would exchange the K and −K valley, while C 3 would not.
The pairing matrix transforms under the a point group symmetry operation as: where∆ s is defined in Nambu basis: (ψ +,↑ , ψ +,↓ , ψ −,↑ , ψ −,↓ ) T with +/− as valley index and ↑ / ↓ for spin up/spin down, U (g) is the matrix representation of the generator g in the spin-and valley-space. Note that∆ s = −∆ T s due to the Fermi statistics, and the representation of∆ s in the valley degree of freedom is restricted to be τ x and τ y , i.e., inter-valley nature. All the k-independent inter-valley pairings are summarized in Table I. There is one inter-valley spin-singlet A 1 pairing : ∆ A1,s = τ x ⊗ iσ y ,, and there are two inter-valley spintriplet pairings: one one-dimensional spin-triplet A 1 -pairing ∆ A1,t = iτ y ⊗ σ x , and one two-dimensional spintriplet E-pairing, which we label as E 1 -pairing and E 2pairing with (∆ E,1 , ∆ E,2 ) = (iτ y ⊗ σ z , iτ y ⊗ σ 0 ). Note that the pairings labeled by different irreducible representations do not mix, and the mixing of ∆ A1,s and ∆ A1,t is expected to be neglectable as the spin-orbit coupling in graphene is extremely small. It is also worth noting that the possible nematic pairings can be constructed using the pairing matrices in the two-dimensional E-pairing.
We can replace the order parameter of the superconducting part with the above unconventional momentum independent pairings in the previous tight-binding model calculation and evaluate the supercurrent in the same way. As shown in Figs. 11 (a)-(d), we find that the current-phase relation is unchanged in the cases of various spin-triplet pairings, which thus implies that our result is not sensitive to the spin configurations of Cooper pairs of the superconducting part. This observation is understandable as the appearance of ϕ 0 -JJ is mainly induced by the valley polarization of the junction region.
Appendix F: A comparison between our model and the ϕ0 JJ model arising from exchange fields and spin-orbit coupling In the main text, we pointed out that the warping term and valley polarization effectively, respectively, play the role of exchange fields and spin-orbit coupling in comparison with previous ϕ 0 JJ models with these terms. Here we illustrate this with more details. To map the trigonal warping term as a spin-orbit coupling, we can look at the k· p normal state Hamiltonian of our system before linearization as shown in Sec. I: Note that the normal part of the phenomenological model in the main text Eq. (3) is given by linearizing the momentum near Fermi energy.
If we regard the valley as a pseudospin, the trigonal warping term λ 1 k x (k 2 x − 3k 2 y )τ z indeed can be regarded as a spin-orbit coupling and ∆ vp τ z can be regarded as a polarization induced by an exchange field. When we consider the Josephson junction, k y can be fixed as a good number by setting the current direction to be the k x -direction. For simplicity, we set k y = 0 as we considered for the phenomenological model Eq. (3), and then H N,1D = λ 0 k 2 x + [λ 1 k 3 x + ∆ vp ]τ z . Next, we highlight the similarities between our model and the Rashba nanowire model, which can be written as H R = k 2 x /2m + αk x σ y + hσ y with the hσ y denoting the Zeeman coupling of the magnetic field with the spin of the electron in the y-direction, α is the strength of the Rashba spin-orbit coupling. It is known that the Rashba spin-orbit coupling combined with an in-plane spin polarization would results in the ϕ 0 JJ. Our H N,1D model (with k y =0) is equivalent to replacing the Rashba spinorbit coupling model with a cubic warping term, which can be seen by performing a unitary transformation mapping τ z to τ y in H N,1D : H N,1D = λ 0 k 2 x + λ 1 (k 3 x + ∆ vp )τ y . After this mapping, it is indeed understandable that we can get a ϕ 0 JJ.
However, it should be noted that the underlying physical system in our case is very different, given that the polarization appears in valley degrees of freedom rather than spin. Our proposal relies on the valley-polarized moiré bands, and it does not need to involve spin-orbit coupling or exchange fields. Only when we restrict the interaction-induced order parameter to the valley polarization ∆ vp τ z , our model can be mapped to the model with Rashba SOC in some limit. Indeed, the interactioninduced order parameter can be richer. Another difference we would like to highlight here is that the possessing of both spin and valley degree of freedom in moiré bands allows much more rich cases with various band alignment. For example, the spin-valley polarized state can also appear at very low temperatures in the experiment [41]. In this case, the order parameter can be written as ∆ vp τ z σ 0 +∆ sp τ 0 ⊗σ z . Both the spin polarization and valley polarization would contribute to the anomalous phase shift. However, this model cannot be generally mapped to a Rashba spin-orbit coupling model as we would deal with a four by four matrix with two kinds of polarization.