Universality of Abelian and non-Abelian Wannier functions in one dimension

Within a Dirac model in $1+1$ dimensions, a prototypical model to describe low-energy physics for a wide class of lattice models, we propose a field-theoretical version for the representation of Wannier functions, the Zak-Berry connection, and the geometric tensor. In two natural Abelian gauges we present universal scaling of the Dirac Wannier functions in terms of four fundamental scaling functions that depend only on the phase $\gamma$ of the gap parameter and the charge correlation length $\xi$ in an insulator. The two gauges allow for a universal low-energy formulation of the surface charge and surface fluctuation theorem, relating the boundary charge and its fluctuations to bulk properties. Our analysis describes the universal aspects of Wannier functions for the wide class of one-dimensional generalized Aubry-Andr\'e-Harper lattice models. In the low-energy regime of small gaps we demonstrate universal scaling of all lattice Wannier functions and their moments in the corresponding Abelian gauges. We also discuss non-Abelian lattice gauges and find that lattice Wannier functions of maximal localization show universal scaling and are uniquely related to the Dirac Wannier function of the lower band. Our results present evidence that universal aspects of Wannier functions and of the boundary charge are uniquely related and can be elegantly described within universal low-energy theories.


I. INTRODUCTION
Wannier functions [1] have matured to an invaluable tool in various fields of solid state physics. They provide a useful basis set of exponentially localized singleparticle states [2] integral to density functional theory [3], are used in the definition of tight-binding models with short-ranged hoppings, can characterize the topological properties of crystals [4][5][6], and are the basis for the modern theory of polarization and localization [7][8][9]. However, Wannier functions are gauge dependent, begging the question: which gauges are particularly useful? One of them has been identified as the gauge of maximally localized Wannier functions [10], which we call the ML gauge in the following. In this gauge the quadratic spread ∆x 2 1/2 of the Wannier function is related to the fluctuations of the bulk polarization and to the quasimomentum integral of the geometric tensor [10,11] quantifying important topological hallmarks of the system. Furthermore, the first moment of the Wannier function (its average) can be related to the Zak-Berry phase or the bulk polarization [9], a quantity widely used to determine Chern numbers in two-dimensional topological systems. * Email: dante.kennes@mpsd.mpg.de Complementary to the bulk polarization the macroscopic boundary (or surface) charge Q B for a half-infinite insulating system has been studied extensively in a series of recent works [12][13][14][15][16][17][18]. Interestingly, the boundary charge [19] and its fluctuations [18] can be related to the bulk polarization and its fluctuations by the so-called surface charge and surface fluctuation theorem. The surface charge theorem has first been discussed in Ref. [19] relating Q B to the Zak-Berry phase of all occupied bands, up to an unknown integer, for noninteracting, nondisordered systems. Recently, this theorem was further analysed in Ref. [15] for the wide class of generalized Aubry-André-Harper models [20][21][22][23][24][25][26]. Here even the unknown integer of the surface charge theorem was determined fixing the precise gauge in the Wannier functions needed to relate its first moment to the Zak-Berry phase. In the small gap limit a low-energy Dirac model in 1 + 1 dimensions [13,17,27] can be used to reveal a universal form of the boundary charge in terms of the phase of the gap parameter [17]. Complementing the surface charge theorem, recently the surface fluctuation theorem [18] was established as a relationship between the boundary charge fluctuations and the second moment of the density-density correlation function or, equivalently, the fluctuations of the bulk polarization.
These recent developments raise the important ques-arXiv:2105.07747v1 [cond-mat.mes-hall] 17 May 2021 tion whether Wannier functions themselves exhibit universal properties beyond their first two moments, in particularly fundamental gauges. To address this question we cover the wide class of generalized Aubry-André-Harper models. We do so by considering a 2-band Dirac model, which requires a field-theoretical generalization of Wannier functions, the Zak-Berry connection and the geometric tensor, which we provide. Any of such field theories must always be complemented by appropriate asymptotic conditions at high-energies to provide a welldefined relation to the lattice models falling into the universality class of the respective field theory. In addition to the above mentioned ML gauge, we define the asymptotically free (AF) gauge, which is fixed by requiring that the eigenfunctions turn into conventional plane waves at large momenta. Here, we show that, for a microscopic lattice model, this gauge corresponds precisely to the one used in Ref. [15] to fix the unknown integer in the surface charge theorem. As a result, the AF gauge is useful to formulate a universal field-theoretical version of the surface charge theorem, relating the boundary charge to the first moment of the Dirac Wannier function and to the phase γ of the gap parameter, which in turn is related to the field-theoretical version of the Zak-Berry phase. In contrast, we show that the fundamental ML gauge is useful for the universal formulation of the field theoretical version of the surface fluctuation theorem, relating the boundary charge fluctuations to the second moment of the Wannier function or the momentum integral over the field-theoretical version of the geometric tensor.
Returning to the question of universality of Wannier functions beyond the first and second moment we find that strikingly the entire line shape of Wannier functions exhibits universal behavior in the small gap limit. Surprisingly, we show that for the whole class of generalized Aubry-André-Harper models all lattice Wannier functions display universal scaling on the length scale ξ, which is the fundamental length scale of insulators on which charge correlations decay exponentially. ξ is proportional to the inverse gap and is defined here by the exponential decay length of the square of the Wannier functions. We show that all scaling functions can be related to four fundamental functions of the Dirac model which are fully characterized by the phase γ of the gap parameter.
In Figs. 1(a,b) we illustrate these four fundamental scaling functions, denoted by F A,B andF A,B in the following, for the lower band Wannier functions in the AF and ML gauge of the Rice-Mele model, a fundamental one-dimensional lattice model with a 2-site unit cell put forward in the context of topological insulators [28]. In Fig. 1(a) we show that for the AF gauge, all Wannier functions multiplied with the lattice site index m collapse upon scaling m → ma/ξ for different correlation lengths ξ (here, a denotes the lattice constant). For the ML gauge, as shown in Fig. 1(b), the same applies but it turns out to be important to include a shift of the lattice site index m by the first moment of the Wan-nier function. Surprisingly, the universal scaling holds even outside the strict range of low-energy theories, i.e., it persists for rather large gaps (where ξ ∼ O(a)) and also for rather small length scales |ma| ∼ O(a). The inset of Fig. 1(a) and the left inset of Fig. 1(b) show the universality of the surface charge and surface fluctuation theorem discussed above, respectively, which follow from the universality of the Wannier functions. We find that the first moment converges to the universal result C 1 /(Za) = γ 1 /(2π), where Za is the length of the unit cell and γ 1 is the Zak-Berry phase of the lower band, which is related to the phase of the gap parameter and to the boundary charge. For the quadratic spread in the ML gauge we find the universal result ∆x 2 /(Za) = ξ/8, which is in full agreement with the universal result for the boundary charge fluctuations [18]. Interestingly, we obtain the scaling ∆x 2 /(Za) ∼ O(ξ) in any gauge and for any band for arbitrary unit cell length Za. This shows that the spread is not an independent length scale but is universally linked to the exponential decay length ξ. In the small gap limit this solves a long-standing problem since in the previous literature only the inequality ∆x 2 Za ξ has been stated [8,11], leaving it open whether the so-called localization length λ = ∆x 2 1/2 defines an independent length scale [29].
Our analysis shows that universal aspects of Wannier functions and the boundary charge are ultimately related to generic effects arising at band anticrossing points with a small gap defining a corresponding universal length scale clearly separated from the lattice spacing. The reason why universal scaling was easily overlooked before is at least two-fold: On the one hand density functional theory calculations for specific materials are often applied to systems where the lattice spacing a and the exponential decay length ξ are not clearly separated. On the other hand, even in cases of clearly separated ξ a, the visual impression of the square |w(ma)| 2 of the Wannier function does not reveal any universal scaling. As shown in the right inset of Fig. 1(b), the visual impression is that of a rather boring Lorentzian with broadening ∼ a. This form is only important for the correct normalization of the Wannier function but it does not reveal any universal scaling on length scale ξ. Universal scaling is only visible when multiplying the Wannier function by the spacial coordinate and plotting the square |m w(ma)| 2 , such that the asymptotic (1/m) 2 -behavior of the Lorentzian is cancelled. As a consequence, it is roughly the product of a Lorentzian on scale a and a universal scaling function on scale ξ which characterizes the subtle and universal line shape of Wannier functions revealed in the present work.
Furthermore, we show that the fully universal scaling exemplified for the Rice-Mele model persists for the Wannier functions of all bands for any size Z of the unit cell. It turns out that the lattice Wannier functions of a given band with two gaps at the bottom and top of the band show universal scaling in terms of two different correlation length scales referring to these two energetically adjacent gaps. Additionally, we show that each Wannier gauge, respectively, for different correlation lengths ξ = ta/∆ determined by the gap size 2∆ and the average hopping t. The gap parameter is given by ∆e iγ = −δv − 2iδt, where v1,2 = ±δv and t1,2 = t ± δt are the alternating on-site energies and hopping amplitudes, respectively. The phase of the gap parameter is chosen as γ = 0.2 π. Main panels: 1 2 |m w(ma)| 2 (AF gauge) and 1 2 |mw(ma)| 2 (ML gauge; withm = m − γ1/π and γ1 = −γ + π sgn γ = 0.8π denoting the Zak-Berry phase) as function of ma/ξ andma/ξ for different ξ = 5a, 10a, 50a, 100a. All discrete lattice points fall on top of the fundamental universal scaling functions of the Dirac model, |F A/B (ma/ξ)| 2 (AF gauge) and |F A/B (ma/ξ)| 2 (ML gauge), for m odd (FA andFA) and m even (FB andFB) (black solid lines; the even points in the AF gauge are multiplied by a factor of 5 for visibility). Left  Right inset of (b): |w(ma)| 2 for ξ = 5a, 10a, 50a, 100a shown as function of m, giving rise to the misleading visual impression of a localized state with ξ-independent spread of order a. A similar form appears for |w(ma)| 2 (not shown).
function can be naturally split in two additive contributions, one corresponding to the upper and one to the lower half of the band, each of them scaling only with a single length scale. Moreover, we also discuss a non-Abelian gauge of maximal localization (NA-ML) mixing the Wannier functions of all bands including and below a given one, which has been proposed in Ref. 10 to maximally localize the total sum of the quadratic spreads of a certain set of Wannier functions. Interestingly, we find that all Wannier functions in the NA-ML gauge reveal universal scaling on the same length scale ξ referring to the band gap between that given band and the next. Additionally, these Wannier functions can be uniquely related to the lower band Wannier function of the Dirac model. Furthermore, we will present an explicit construction of the NA-ML gauge in terms of the Wilson propagator and show that the winding number associated with the unitary non-Abelian gauge transformation allows for a formulation of a bulk-boundary correspondence (BBC) determining the number of edge states up to the band chosen to define the NA-ML gauge via bulk properties. This form of the BBC is of particular interest since it is not restricted to zero-energy edge states or to particular symmetry properties of the system.
Our work is organized as follows. In Section II we start with the description of the lattice models under consideration and the relation of their eigenstates to the eigenstates of the Dirac model. The low-energy theory of the Zak-Berry connection, the Zak-Berry phase, and the geometric tensor, together with their relation to the corresponding lattice quantities is the subject of Section III. In this section we will also define the AF and ML gauge and formulate the low-energy version of the surface charge and surface fluctuation theorem in terms of the low-energy version of the Zak-Berry phase and the momentum integral of the geometric tensor, respectively. The definition of Wannier functions in Dirac theory and their precise relation to the lattice Wannier functions will be analysed in Section IV. Furthermore, we will describe in this section the relation of all moments of the Wannier functions in Dirac and lattice theory and explain how the surface charge and surface fluctuation theorem can be formulated in terms of the moments of the Dirac Wannier functions. The universal scaling of the Dirac and lattice Wannier functions is the central subject of Section V. We will calculate the Dirac Wannier functions together with their moments analytically in the AF and ML gauge, and state the definitions and properties of the fundamental scaling functions. The universal scaling of all lattice Wannier functions and their moments is then stated via their relation to the Dirac Wannier functions, and is demonstrated for two explicit examples Z = 2 and Z = 3. Finally, in Section VI we construct explicitly the NA-ML gauge and demonstrate that all non-Abelian Wannier functions show universal scaling corresponding to the Dirac theory of the highest gap defining the set of mixed bands. In addition, we set up an interesting bulkboundary correspondence relating the winding number of the determinant of the non-Abelian gauge transformation to the number of edge states present in all gaps up to a certain one. We close with a summary and outlook in Section VII. Some more involved technical details are presented in a series of Appendices.

II. THE MODEL AND EIGENSTATES
In this section we will state the class of lattice models under consideration together with their relation to the Dirac field theory. In particular we will show the representation of the Bloch eigenstates in lattice theory and the eigenfunctions of the Dirac Hamiltonian, together with their precise relationship. In summary it will turn out that the Dirac field theory is an elegant way for a formulation of degenerate perturbation theory in terms of slowly varying right and left movers, applicable to lattice models with arbitrary size of the unit cell.

A. Lattice model
We consider the following class of so-called generalized Aubry-André-Harper lattice models in one dimension, characterized by nearest-neighbor hopping and a single orbital per site, together with any periodic on-site and hopping modulation Here, m is the lattice site index, t > 0, and all δv m = δv m+Z , δt m = δt m+Z are periodic and real (possible phases of the hoppings can always be gauged away in one dimension). The number of sites in a unit cell is denoted by Z. We relate m = Z(n − 1) + j to the index n labelling the unit cells and the index j = 1, . . . , Z labelling the sites within a unit cell. We assume the condition |δv m |, |δt m | t , (2.4) such that H can be considered as a small perturbation.
Due to the presence of H there will be Z bands labelled by the band index α = 1, . . . , Z (from bottom to top), with Z − 1 gaps in between labelled by the gap index ν = 1, . . . , Z − 1, see the sketch of the band structure shown in Fig. 2.
In standard solid state notation of the reduced zone scheme, the Bloch eigenfunctions on the lattice are written as where α = 1, . . . , Z is the band index (labelled from bottom to top), ma denotes the lattice space position and a is the lattice spacing. The length Za defines the size of the unit cell. The quasimomentum k is defined within the first Brillouin zone − π Za < k < π Za and u kα (ma) = u kα (ma + Za) (2.6) is periodic with the unit cell size. We assume that the gauge is always chosen such that the total Bloch function is periodic in k Note that this means that u kα (ma) is not periodic in k but fulfils the condition At fixed k in the first Brillouin zone, the states u kα (ja) = j|u kα in unit cell space j = 1, . . . , Z are eigenfunctions of a Hamiltonian h k defined on the periodic continuation of the unit cell where t j = t + δt j , and we identify |Z + 1 ≡ |1 . In contrast to h k and |u kα , the dispersion relation is periodic in k: k+ 2π Za ,α = kα . We note the following normalization and completeness relations This corresponds to the following relations for the periodic Bloch part u kα (ja) at fixed quasimomentum k and j = 1, . . . , Z Z j=1 u kα (ja) * u kα (ja) = δ αα , Sketch of the band structure for Z = 5. We find Z − 1 = 4 gaps labelled by ν = 1, . . . , 4 opening up at k = 0 for ν = 2, 4 even, and at k = ± π Za for ν = 1, 3 odd. All bands labelled by α = 1, . . . , 5 are then naturally split in two halves, the upper and lower halves labelled by (ν, τ ) = (α, −) and (ν, τ ) = (α − 1, +), respectively. An exception are the lowest α = 1 and highest α = 5 band. They are not split in two halves and labelled by (ν, τ ) = (1, −) and (ν, τ ) = (4, +). The Dirac theory for gap ν is a good description for all eigenstates referring to a pair of band parts labelled by (ν, ±) (indicated by violet and green color for ν even or odd, respectively). The momentum q in Dirac theory is the difference of k to the point where the gap opens, i.e., k = q for ν even, and k = ± π Za + q for ν odd and sgn q = ∓. The cutoff for |q| is given by Λα = π 2Za for α = 2, 3, 4 and by Λα = π Za for α = 1, 5.
Introducing the following scalar product in unit cell space For a small perturbation H compared to the typical band width set by the average hopping t, the states in the middle of the bands are nearly untouched and only close to the gaps the eigenstates of H 0 are coupled significantly by H . We denote the single-particle eigenstates of H 0 in the extended zone scheme as with energy k = −2t cos(ka), and −π/a < k < π/a. Since H is periodic with the size Za of the unit cell, the gap openings will happen in general via higher-order processes close to the two Fermi points ±k As outlined in detail in Refs. 13, 17, and 27 the coupling of the low-energy states close to gap ν can be described via Brillouin-Wigner perturbation theory by the effective Hamiltonian where P (ν) projects on the low-energy space, Q (ν) = 1 − P (ν) , and (ν) is the Fermi energy where the gap opens. This coupling leads to a complex gap parameter ∆ (ν) e iγ (ν) defined by In standard convention of low-energy theories it is very useful to split off the strongly oscillating parts e ±ik (ν) F ma of the eigenfunctions (2.5) and parametrize the Bloch states ψ kα (ma) in terms of slowly varying right and left movers ψ (ν) qτ p (x) in continuum notation as Here, p = ± is the index for right/left movers, and τ = ± is the index describing the bands above/below the gap ν, see Thereby, the band part labelled by (ν, τ ) corresponds to the lower (upper) half of band α = ν + 1 (α = ν) for τ = ±, see Fig. 2. An exception are the lowest α = 1 and highest α = Z band which are not split in two halves and labelled by (ν, τ ) = (1, −) or (ν, τ ) = (Z − 1, +), respectively. The momentum q in Eq. (2.22) corresponds to the difference of the quasimomentum k in lattice theory to the quasimomentum defining the band bottom or band top at which the gap appears. This means that for even ν one expands around k = 0 such that q ≡ k, and for odd ν one expands around k = ± π Za such that q ≡ k ∓ π Za , see Fig. 2. In both cases the momentum q has to fulfil the condition for α = 2, . . . , Z − 1 π Za for α = 1, Z . (2.26) By linearizing the dispersion relation k of H 0 around ±k (ν) F , and using (2.21) one can set up an effective Dirac Hamiltonian H (ν) D corresponding to gap ν = 1, . . . , Z − 1 with gap parameter ∆ (ν) e iγ (ν) , which has the vectors where v is the Fermi velocity. The energy of the Dirac eigenstates are given by which provides a very good approximation to the true dispersion when |q| π 2Za . In contrast to the dispersion, we note that the Dirac model provides a very good approximation to the exact eigenstates for all |q| < Λ α . For large |q| ∼ 1/(Za) F , the eigenstates of the Dirac model are (up to a gauge factor) given by Inserting this form in (2.22-2.24), one obtains the correct eigenfunctions of the lattice model in the absence of a gap, see Appendix A. This means that for small gaps ∆ (ν) t, the Dirac theory is a useful approximation for the description of all eigenstates of the lattice model. Therefore, all Wannier functions can be fully described by Dirac theory on all length scales. In contrast, dynamical quantities like Green's and correlation functions can only be described for low energies since the energy dispersion enters into these quantities.
When comparing the Bloch eigenstates (2.5) with the parametrization (2.22-2.24) one finds straightforwardly the following relations between u kα (ja) and χ (ν) qτ p (note that α and (ν, τ ) are related via (2.25)) Since the Dirac theory explicitly exhibits the slowly varying parts ψ (ν) qτ p (x), one can interpret it as a continuum theory for all x on the real axis and not only for x = ma, where the lattice theory is reproduced. Therefore, although the Dirac theory has only a physical meaning for |q| < Λ α , one takes all values for the momentum q into account and writes the normalization and completeness relations as This is equivalent to taking the following relations for the parts χ Introducing the following scalar product in right/left space   (2.31), are consistent with the normalization condition (2.13) and the completeness relation (2.14) for u kα (ja), see Appendix A. The normalization condition follows exactly from (2.30) and (2.31), and is a special case of the useful relation (valid for any integers r, s ≥ 0) The completeness relation is more subtle and uses in addition the fact that the Dirac theory can reproduce all eigenstates of the original lattice model under the perturbative condition (2.4). The fact that the Dirac theory can capture all eigenstates of the lattice model leads to the following rigorous rule when considering any k-integral over a function f kα depending on the eigenfunctions of band α of the lattice theory (like projectors on Bloch states for the completeness relation, Zak-Berry connection for the Zak-Berry phase, geometric tensor for boundary charge fluctuations, or Bloch states for Wannier functions) qτ is the corresponding function in Dirac theory obtained by the replacements (2.22), (2.30) or (2.31). Thereby, for α = 1 or α = Z, we omit implicitly the terms with ν = 0 or ν = Z on the right hand side, respectively. The approximate sign in this relation is meant in a perturbative sense that all higher order corrections are of relative order O(∆ (ν) /t) and negligible for ∆ (ν) t. However we note that (2.41) is only valid when the gauges in lattice and Dirac theory have been chosen identical such that the relation (2.22) between the eigenfunctions holds globally for all quasimomenta in a certain band α.
Since a field theory does not know anything about scales of the order of the lattice spacing, the integrals over q on the right hand side of (2.41) are conveniently extended to infinity in the field theoretical version and properly regularized in case of divergences. Since the regime of large momenta refers to the physics in the absence of a gap their contribution can be analytically analysed quite easily. For the various physical quantities discussed in this work we will show that the field-theoretical contributions beyond the cutoff either vanish for each individual term on the right hand side of (2.41) (as for Zak-Berry connection and geometric tensor) or, after a proper regularisation, the contributions of high momenta cancel between the two terms (as for Wannier functions in certain gauges). In this way a full equivalence between lattice and Dirac theory can be set up, provided that the gaps are small compared to the typical band width. Furthermore, for the bands α = 2, . . . , Z the two terms on the right hand side will be shown to refer to the description of the universal behavior of the two halves of the bands, the upper one corresponding to the Dirac theory for ν = α and the lower one to ν = α − 1. However, for this interpretation to make sense we will see that nonuniversal terms arising from cutting the spectrum in the middle of the band have to be removed by extending the lattice theory for the two halves to infinity via an asymptotically free theory far away from the gap.
When eigenfunctions are compared between the lattice and Dirac theory and identified via (2.22) one always has to guarantee that the gauges are chosen in precisely the same way (at least locally for a certain region in quasimomentum space). Therefore, it will turn out to be important to state specific gauges of particular interest. One of them is the so-called asymptotically free (AF) gauge, which is characterized by a real and positive value of u kα (Za) Here we used (2.32) to formulate the equivalent condition in Dirac theory. This gauge has the particular advantage that it leads precisely to the asymptotic condition (2.29) without any further gauge factors. As explained later in Section III C 1 the AF gauge leads to a unique relation between the Zak-Berry phase and the boundary charge both in lattice and Dirac theory. Another gauge will be introduced in Sections III A and III B which is called the maximally localized (ML) gauge. In the ML gauge the Wannier functions in lattice and Dirac theory have minimal spread which can be related to the boundary charge fluctuations. It will turn out that the ML gauges in lattice and Dirac theory are not the same, such that the relation (2.22) involves additional phase factors depending on the quasimomentum. Finally, in Section VI we will also discuss non-Abelian lattice gauges where a mixing of the Bloch states from a set of bands α = 1, . . . , ν is involved, such that the non-Abelian Wannier functions turn out to have maximal localization. This gauge is called the non-Abelian gauge of maximal localization (NA-ML). It is a special lattice gauge which turns out to be constructed in such a way that the non-Abelian Wannier functions can be directly related to the Wannier function of the upper half of the highest band ν or, equivalently, to the Wannier function of the lower band of the Dirac model corresponding to the gap between the highest band ν and band ν + 1.

III. ZAK-BERRY CONNECTION, GEOMETRIC TENSOR AND BOUNDARY CHARGE
As a prerequisite for the analysis of universal aspects of Wannier functions we develop in this section the lowenergy theory for the Zak-Berry connection, the Zak-Berry phase, and the geometric tensor in Dirac theory, together with the relation to the corresponding objects in lattice theory. In this connection, we will also introduce the definition of the ML gauge in Dirac theory and relate it to the corresponding definition in lattice theory. Finally, we will show how an important physical observable, the boundary charge and its fluctuations, can be related via the surface charge and surface fluctuation theorem to the Zak-Berry phase and the momentum integral of the geometric tensor in low energy.
A. Zak-Berry connection and geometric tensor for lattice model For the lattice the Zak-Berry connection and geometric tensor are defined by where we made use of the normalization and completeness relations (2.16) and (2.17) to derive the last equality. We note that the diagonal component Q kα ≡ (Q k ) αα of the geometric tensor can also be written as Taking a gauge transformatioñ we find from (3.4) that the geometric tensor is gauge invariant and from (3.1) that the Zak-Berry connection is gauge invariant for α = β, whereas A kα ≡ (A k ) αα transforms asÃ Furthermore, denotes the Zak-Berry phase (which should be distinguished from the notation γ (ν) for the phase of the gap parameter in Dirac theory). According to (3.9), the Zak-Berry phase transform under a gauge transformation by the winding number of the phase ϕ kα γ α = γ α − ϕ π/Za,α + ϕ −π/Za,α . (3.11) In the AF gauge defined by (2.42) it is shown in Ref. 15 that (3.12) The ML gauge is defined by a constant Zak-Berry connection According to (3.9) this can be achieved by the gauge factor e iϕ kα with which fulfils where we used (3.12) for the last equality. From (3.16) we find that the winding number is zero leaving the Zak-Berry phase invariantγ Since the AF and ML gauge are the relevant gauges used in this work, we will implicitly indicate the AF gauge by symbols without a tilde and the ML gauge by a tilde symbol.
B. Zak-Berry connection and geometric tensor for Dirac model In the Dirac theory we define the Zak-Berry connection and the geometric tensor by where we definedτ = −τ and used δ τ τ − δτ τ = sgn(τ τ ) to get the last equation. This gives the important property q ) τ τ can also be written as Taking a gauge transformatioñ we find analog to the lattice that the geometric tensor and the nondiagonal components of the Zak-Berry connection are gauge invariant, whereas A (ν) In the following we allow only for gauges where the Zak-Berry connection vanishes for |q| → ∞ such that the Zak-Berry phase in Dirac theory, defined by is a well-defined quantity. We note that this is fulfilled for the AF gauge (2.42) where we get from (2.29) for momenta beyond the cutoff Λ c leading to a vanishing Zak-Berry connection for |q| > Λ c .
To get the relation for the Zak-Berry connection and the geometric tensor between the lattice and Dirac definition, we use the identity (2.40) and find (3.31) As discussed above this holds only when the Zak-Berry connection in Dirac theory vanishes beyond the cutoff Λ c which is fulfilled for the AF gauge. In the AF gauge we will furthermore show below via the explicit eigenstates of the Dirac model (see Section V A) that γ (ν) τ is related to the phase γ (ν) of the gap parameter ∆ (ν) e iγ (ν) by γ (ν) where we abbreviated the sign function by s γ = sgn γ , (3.33) and assumed −π < γ (ν) < π with periodic continuation to the other regimes.
In the following, we denote the eigenstates of lattice and Dirac theory in the AF gauge by ψ kα (ma) and ψ (ν) qτ (x), respectively, which are connected by the relation (2.22). Correspondingly we use the notation u kα and χ (ν) qτ p in this gauge which are related by (2.30) and (2.31). From (3.12) we note that we get in the AF gauge (3.34) The ML gauge in Dirac theory is defined by a vanishing Zak-Berry connection Dirac ML gauge :Ã (ν) such that the corresponding Zak-Berry phase is also zerõ According to (3.25) this corresponds to the choice of a gauge factor e iφ (ν) qτ with Note that this is not a contradiction to the corresponding condition (3.13) for the lattice theory where it is not possible to remove the Zak-Berry phase via a global gauge transformation. The Dirac model for a given gap ν describes only those states in k-space which belong to the two halves of the bands separated by gap ν. Therefore, a particular global gauge chosen in Dirac theory corresponds to a certain local gauge in lattice theory defined for all quasimomenta lying in one half of a given band. Such a gauge can not necessarily be extended to a global gauge for a certain band which is continuous and periodic in k.
In order to show the explicit difference between the ML gauges in lattice and Dirac theory we start from the AF gauge and define a gauge transformation to the ML gauge by the transformed quantities From the conditions (3.13), (3.14), (3.35), and (3.37), defining the ML gauge in lattice and Dirac theory, we get where we made use of (3.12) in the AF gauge to get the form (3.40). Using (3.29) we find from the two forms for ϕ kα that, for both the lower and upper half of band α, we get the following relation between the ML gauges of lattice and Dirac theory leading to the following relation between the gauge factors The last two factors on the right hand side define the difference between the ML gauges in lattice and Dirac theory, which have to be added to (2.30) and (2.31) to get the precise relation between the eigenstates of lattice and Dirac theory in the ML gauge. Furthermore, they show that the ML gauge in Dirac theory is not smooth when crossing over the middle of a band α since the indices (ντ ) and the relation between q and k change.
C. The boundary charge

The surface charge theorem
The charge accumulated at the boundary, or the boundary charge [12][13][14][15][16][17][18], is defined by restricting the tight-binding Hamiltonian (2.1) to the half-infinite space m > 0 and averaging the excess charge with a macroscopic envelope function f m via m is the average charge at lattice site m for a halfinfinite system if the chemical potential is located in gap ν, including all edge states up to this energy. The average charge per site in the bulk is denoted byρ (ν) = ν/Z. The macroscopic and probe specific envelope function is denoted by f m which must have certain properties. In particular, it has the value 1 in the first range of m, which far exceeds the localization length ξ, and then it smoothly crosses over to 0 over the second range of m, which is also ξ. The importance of including these features into the definition of f m for the universality of the boundary charge properties as well as their experimental relevance has been elucidated in Refs. 12-15. If only a single band α is occupied we denote the corresponding boundary charge by Q B,α , such that the total boundary charge can be decomposed as where Q (ν) E denotes the integer contribution from all occupied edge states up to the chemical potential, and Q E,α denotes the contribution of an edge state present in gap α. The latter is either unity or zero and we assume for the validity of (3.46) that, for the last gap α = ν, the chemical potential is located at the bottom of the conduction band α = ν + 1.
As shown in Ref. 15, one finds in the AF gauge of the lattice theory that the Zak-Berry phase is related to the boundary charge of a single band in the following way Inserting (3.31) and (3.32) in the expression (3.47) for the boundary charge we get Noting that an edge state appears in gap ν for 0 < γ (ν) < π (see Ref. 17), we find that the sum of the boundary charge of band α and the charge Q E,α of the edge state in gap α is given by Assuming that the chemical potential is located at the bottom of the conduction band, this gives the following result for the boundary charge Q (ν) B when the lowest ν bands are filled Here we have used the fact that the lower half of the lowest band gives a negligible contribution to the Zak-Berry phase since we can approximate the eigenstates by the free ones in this regime leading to a vanishing Zak-Berry connection (see Appendix A). The result (3.50) states the low-energy version of the surface charge theorem relating the boundary charge to the phase γ (ν) of the gap parameter which is related via (3.32) to the low-energy version of the Zak-Berry phase. We note that the result (3.50) is fully consistent with the direct calculation of the boundary charge for a half-infinite Dirac model as shown in Ref. 17. An essential point for the formulation of the surface charge theorem in low energy is the fact that the sum of the two Zak-Berry phases γ − of the top of the valence band alone. We note that the result τ γ (ν) τ = πmod(2π) can be shown on quite general principles without using the explicit forms of the eigenfunctions of the Dirac model, see Appendix B.

The surface fluctuation theorem
In lattice theory the k-integral over the geometric tensor is of particular interest since it can be related to the boundary charge fluctuations [18] B ) 2 are the fluctuations when the chemical potential lies in gap ν (the precise position is unimportant). Here, l p ξ is a length scale on which the charge measurement probe looses smoothly the contact to the sample, see Ref. 18 for details. Since the geometric tensor is gauge-invariant, this relation holds in any gauge, in particular in the AF and ML gauge. Summing the geometric tensor (Q k ) αβ over α or β up to some ν in the low-energy regime, we find from (3.30) and (3.22) that all contributions vanish where two bands are separated by a gap. Therefore, we obtain These relations are also independent of the gauge choice in Dirac theory since the geometric tensor is gauge invariant in Dirac theory as well. Using (2.41) and the fact that the geometric tensor vanishes in Dirac theory in the asymptotically free region |q| > Λ c , we obtain in the low-energy regime for the fluctuations where we made use of (3.21), (3.23), and (3.26) for the last two steps. The result (3.55) is the low-energy form of the surface fluctuation theorem relating the boundary charge fluctuations to the momentum integral of the geometric tensor. As shown only the geometric tensor of the valence band enters. This is physically intuitive since one expects no fluctuations of the charge from the low-lying bands. As will be shown below in Section V we obtain explicitly via the eigenfunctions of the Dirac model showing that the result (3.55) is fully consistent with the direct calculation of the boundary charge fluctuations within the Dirac model via the second momentum of the correlation function [18]. Most importantly, we note that the fluctuations of the total boundary charge can not be written as the sum over the fluctuations of the individual bands. If only band α is occupied the fluctuations are given by where R = Zan, with n integer, denotes a lattice vector. Using (2.11) and (2.12) one finds that the Wannier functions labelled by α and R form an orthonormal and complete set of states Defining by w α (ma) ≡ w 0,α (ma) the Wannier function centered at zero we find with the Bloch form (2.5) and the periodicity property (2.6) that the Wannier function w R,α (ma) follows from shifting w α (ma) by the lattice vector R Therefore, all properties of the Wannier functions follow from studying the properties of w α (ma). The Wannier functions depend on the gauge in a nontrivial way. If (4.5) is the Wannier function in the AF gauge, we obtain in the ML gaugẽ where the phase ϕ kα is given by (3.14). We note that due to the properties (3.12) and (3.17), both the Wannier functions in AF and ML gauge are real. For the bands α = 2, . . . , Z −1 we show in Appendix C that the Wannier functions in the AF or ML gauge can be split into two contributions corresponding to the upper and lower half of the band where w u/d,α (ma) denotes the part from the corresponding integration regions dk u kα (ma)e ikma for α even/odd π 2Za <|k|< π Za dk u kα (ma)e ikma for α odd/even , and δw α (ma) arises from the extension of the quasimomentum integrations to ±∞ taking the free solutions of u kα (ma) in the gapless case, see Appendix C for details.
In the AF and ML gauge one obtains explicitly where we defined the shifted variablẽ Whereas the corrections δw α (ma) cancel out when considering the total Wannier function w α (ma) of a band, we will see in Section V that the Wannier functions w u/d,α (ma) andw u/d,α (ma) show only universal scaling if the corrections ±δw α (ma) and ±δw α (ma) are taken into account. For the bands α = 1, Z the splitting in upper and lower part makes no sense since the lower/upper half of the band α = 1/Z is already described by a free theory for small gap. Therefore, we use the convention The moments of the Wannier functions are defined by A corresponding definition is used for the moments C u/d,rα of w u/d,α . Inserting (4.4) and (2.5) we find after some straightforward manipulations For r = 1 and r = 2 we find with (3.1) and (3.7) and for the quadratic spread Since the geometric tensor is gauge invariant, the condition for maximal localization is given by a constant Zak-Berry connection corresponding to the ML gauge (3.13).
Since, according to (3.18), the Zak-Berry phase does not change in this gauge, the first moment stays invariant In the ML gauge we get for the minimal quadratic spread ∆x 2 α,min from (4.20) and (3.52) the result ∆x 2 α,min = Za where l p ∆(Q B,α ) 2 are the fluctuations when only the band α is occupied. So in summary we can formulate the surface charge and surface fluctuation theorem for a singly occupied band in terms of the first and second moment of the Wannier function as where C rα refers to the AF gauge, andD rα are the moments in the ML gauge defined relative to the first mo- withm α defined in (4.14). These moments are related to the momentsC (α) r . E.g., for r = 1, 2, one obtains by using the normalization m |w α (ma)| 2 = 1 and (4.21) As discussed at the end of Section III C 2 the fluctuations l p ∆(Q B,α ) 2 of the individual bands are not sufficient to calculate the fluctuations l p (∆Q (ν) B ) 2 when all bands α = 1, . . . , ν are occupied. However, as we will see in the next two sections for the low-energy regime of small gaps, the surface charge and surface fluctuation theorem for ν occupied bands can be written entirely in terms of the moments of the upper component w u,ν andw u,ν of the Wannier functions for the highest valence band as This physically intuitive result reflects the fact that the boundary charge is a low-energy property only which is insensitive to the properties of low-lying and occupied bands. Alternatively, as we will discuss in Section VI, it is also possible to define maximally localized Wannier functions in a non-Abelian gauge where all occupied bands are mixed. As we will see these Wannier functions are closely related to the Wannier functionw u,ν in the Abelian ML gauge and shows precisely the same universal scaling as the Dirac Wannier function of the valence band.

B. Wannier functions for Dirac model
Within the Dirac theory we define the Wannier functions by where and Correspondingly, for the ML gauge, we definew In contrast to the lattice Wannier functions, the Dirac Wannier functions are complex quantities both in the AF and ML gauge.
Since the Wannier functions in the Dirac model contain a continuous shift y their dimension is given by inverse length and the normalization and completeness relation follow from (2.33) and (2.34) as The moments C (ν) rτ of the Dirac Wannier functions are defined by and, correspondingly, we denote the moments in the ML gauge byC rτ . Note that these moments have dimension (length) r−1 in contrast to the moments C rα defined within the lattice theory. This is due to the different normalization in the continuum Dirac theory. Nevertheless, we will see below that the first and second moment are related to the boundary charge and the fluctuations, respectively, in a similar way as in (4.23) and (4.24) but without the denominator Za, see below. Note that the first moment does not play the role of the position of the Wannier function in Dirac theory since it is dimensionless. A finite value of the first moment C Using the form (4.33) we find after some straightforward manipulations Using (3.27), (3.21) and (3.23) we can thus write for the first and second moment Since the first term on the right hand side of (4.40) is gauge invariant and the second one has a positive integrand it follows that the gauge of maximally localized Wannier functions is given by the ML gauge (3.35) where the Zak-Berry connection vanishes. We note that there is a fundamental difference to the lattice theory where the quadratic spread is defined by C 2α − (C 1α ) 2 . The analog formula is not possible in low energy since the dimension of C Taking (4.39) and (4.40) together with (3.50), (3.32), and (3.56), we can formulate the surface charge and surface fluctuation theorem in low energy via the first and second moment of the Dirac Wannier functions as Here, the moments C (ν) r− refer to the AF gauge, whereas C (ν) r− are the moments in the ML gauge. Note that, in contrast to (4.23) and (4.24) (where a single band has been considered), we consider here the total boundary charge and its fluctuations when the lowest ν bands are filled and the chemical potential is located at the bottom of the conduction band (which is only important to calculate the edge state contribution for the boundary charge in the last gap). This result shows that the boundary charge and its fluctuations are quantities probing only low energy features and, therefore, can be related in a universal way to the first and second moment of the Dirac Wannier functions corresponding to the top of the highest valence band.

C. The relation between Wannier functions in lattice and Dirac theory
We now consider the precise relation of the Dirac Wannier functions to the Wannier function defined within the lattice theory. We start with the AF gauge where the relation (2.22) between eigenfunctions of the lattice and the Dirac theory hold globally for all quasimomenta in a certain band α since the gauge is the same. For the ML gauge it is more subtle since there is difference between the gauges in lattice and Dirac theory, see Eq. (3.43). Therefore one has to add gauge factors depending on whether one considers the contribution of the upper or lower half of the band to the Wannier function.

The AF gauge
For the AF gauge, we can insert (2.22) in (4.4) and use (2.41) to get where as usual we omit the terms with ψ q− if α = 1 or α = Z, respectively. As shown in Appendix C the region |q| > Λ α contributes ±δw u/d,α (ma) to the first/second term on the right hand side, i.e., precisely the same contribution (4.12) we used to extend half of a band for α = 2, . . . , Z − 1 (for α = 1, Z there is no contribution from |q| > π Za ). Therefore, after inserting (2.23) and (4.33), we obtain the following relation between the Wannier functions in lattice and Dirac theory Thereby, we will show in Appendix C that the momentum integral to define the Wannier functions w (ν) τ p (x) via (4.33) has to be regularized in such a way that unphysical contributions ∼ δ(x) are absent, see the explicit calculation in Section V.
To understand the relation of the various moments in the AF gauge, we need to evaluate |w α (ma)| 2 using the Wannier function from the sum of (4.44) and (4.45). The simplest cases are α = 1 or α = Z, where only one term contributes: and a similar result for α = Z with w −,p (ma) over the unit cell, we find that the strongly oscillating terms of (4.46) can be neglected when averaging them over a unit cell. This leads to the result For the bands α = 2, . . . , Z − 1, both terms (4.44) and (4.45) contribute to the Wannier function. This leads to more oscillating terms occurring for the moments involving also e iπp(2α−1)m/Z and e iπpm/Z . Such terms give again a negligible contribution to the moments when averaging them over two unit cells, leading to the general result Inserting (4.50) in (4.41) we find (4.28), which states the surface charge theorem in terms of the upper component w u,ν of the lattice Wannier function for the highest valence band.
As discussed later in all detail in Section V E, we note that the divergence of the Dirac Wannier functions |w is not important for the calculation of the moments for r > 0 since the contributions from the region |m| ∼ O(1) can be neglected in the universal limit ξ a. The only exception is r = 0 where the normalization of the lattice Wannier function is fully determined by the region |m| ∼ O(1).

The ML gauge
In the ML gauge we have shown via the two last factors on the right hand side of (3.43) that there is a difference between the gauges of maximally localized Wannier functions in lattice and Dirac theory. The last factor leads to a trivial shift of the position of the Wannier function w Including this shift and the phase factor e −iφ (ν) 0,τ we have to modify (4.44) and (4.45) in the following way (see Appendix C for details) to get the correct relationship between the maximally localized Wannier functions in lattice and Dirac theorỹ wherem α is defined in (4.14). Using (4.52) and (4.53), we obtain analog to (4.49) from the definition (4.25) for the moments in the ML gaugeD Inserting (4.55) in (4.42) we find (4.29), which states the surface fluctuation theorem in terms of the upper componentw u,ν of the lattice Wannier function for the highest valence band.
As discussed in more detail in Section V E, the relations (4.54) and (4.55) should be used only for even values of r. As shown in Section V C the Dirac moments C (ν) rτ are exactly zero for all odd values of r. However, this does not mean that the lattice momentsD rα and D u/d,rα are exactly zero for odd values of r (except for D 1,α = 0 which is zero by definition). The smallness of the odd moments in lattice theory has to be understood by an order of magnitude analysis compared to the even moments. The odd moments are of subleading order O(a 2 ξ r−2 ) as compared to the even moments which are of O(a ξ r−1 ). Therefore, in the limit of small gaps ξ a, the odd moments are of no interest in the ML gauge and can be neglected.
We note that corresponding relations ofC rα orC u/d,rα to the Dirac theory are not possible since the Dirac Wannier functions have a divergence |w Therefore, it is important to define the moments of the Dirac Wannier functions around the reference point x = 0, otherwise they contain a divergence.

V. UNIVERSALITY OF WANNIER FUNCTIONS
This section is devoted to the most important result of this work that, in the case of small gaps, all lattice Wannier functions in AF or ML gauge show universal scaling in terms of a small set of universal scaling functions determining the shape of the Dirac Wannier functions in the corresponding gauges. To obtain this result we first calculate all eigenstates and the Wannier functions of the Dirac model analytically in Section V A. The fundamental universal scaling functions in the Dirac theory are then introduced in Section V B, together with a summary of all their symmetry properties and asymptotic forms. The moments of the Dirac Wannier functions are discussed analytically in Section V C, both in the AF and ML gauge. In Section V D we derive the universal scaling form of all lattice Wannier functions and their moments in terms of the fundamental scaling functions of the Dirac theory. The universal scaling is demonstrated explicitly for two examples with unit cells of size 2a and 3a in Sections V D 1 and V D 2, respectively. Finally, Section V E contains a qualitative discussion of the properties of lattice Wannier functions and the scaling of their moments on all length scales, in particular including the one at small scales of the order of the lattice spacing. This analysis shows clearly that the visual impression of the square of lattice Wannier functions reveals only the trivial gapless limit, leading to the misleading visual impression of a localized wave function with spread determined by the lattice spacing. In contrast, the whole universal scaling properties on length scale ξ are only visible when multiplying the lattice Wannier function with the spatial coordinate and then squaring it.

A. Explicit evaluation of Dirac Wannier functions
The eigenfunctions (2.23) of the Dirac model (2.27) in the AF and ML gauge are given by 2) and the gauge factor is given by We note the helpful properties As required for the AF gauge by (2.42), we find with We note that the gauge factor (5.3) allows for a unique and analytic definition of the phase φ Therefore, the phase φ (ν) qτ can be chosen to have the same sign as γ (ν) for all q (note that we take −π < γ (ν) < π). Using we get from We note that sin γ (ν) = 0 is a special point where the Dirac Zak-Berry connection is zero. For a halfinfinite system with x > 0 it can be shown [17] that an edge state is present in the gap for γ (ν) > 0 with energy . Therefore, at this special point, the edge state energy touches either the higher (for γ (ν) = ±π) or the lower band (for γ (ν) = 0), and the Wannier functions of the corresponding bands have very special properties, they do not decay exponentially (for w (ν) τ p (x) in AF gauge) or change discontinuously (for w (ν) τ p (x) in ML gauge). Therefore, we exclude the cases γ (ν) = 0, τ = − and γ (ν) = ±π, τ = + in the following and discuss them separately in Appendix G.
( 5.12) Inserting the forms (5.1) and (5.12) forχ qτ in (4.33), and introducing the dimensionless variables F /(2∆ (ν) ), we find for the Wannier functions the integral representation where We have included a convergence factor e −η|q| , withη ∼ a/ξ (ν) , since the integrals diverge for large |q|. This divergence occurs since a low-energy theory can only describe the universal regime |x| a. As shown in Appendix E, one can also find a more convenient integral representation by closing the integration contour for s x = sgn(x) = ± in the upper/lower half, respectively, leading to convergent integrals such that the limitη → 0 can be performed under the integral. This regularisation is possible for all x = 0 and removes an unphysical contribution ∼ δ(x) from the Dirac Wannier function, see Appendices C and D. In this way we obtain the universal representation where we defined s x = sgn(x), and a. However, as long as ξ (ν) a, the lattice Wannier functions at x = ma can be reproduced for all m (even for m = 0) from the Dirac Wannier functions via the sum of (4.44) and (4.45) (in the AF gauge) or the sum of (4.52) and (4.53) (in the ML gauge). As shown in Appendix C, this arises from the fact that the high-momentum region |q| > Λ α does not contribute to the total Wannier function of a certain band α.

B. The universal scaling functions
It is convenient to express the complex Dirac Wannier functions w which corresponds to the definitions The fundamental universal and dimensionless scaling functions are then defined by 29) in terms of which the moments (4.37) can be expressed as γ is covered by the above properties. A special case is cos γ = 0, where F B,τ (y; γ) = 0 for cos γ = 0 . (5.42) The asymptotic forms for small |y| 1 and large |y| 1 can be analysed analytically, see Appendix E, where we derive the corresponding asymptotic behavior of the Dirac Wannier functions in the AF and ML gauge for small |x| ξ (ν) and large |x| ξ (ν) . For y = 0 we get from (E.10) and (E.11) cos γ = 0 : As discussed after (5.10), we remind that the cases γ = 0, τ = − and γ = ±π, τ = + are excluded. At these points the scaling functions F A/B,τ (y; γ) are not exponentially decaying state touches the band edge, it is quite remarkable that the Wannier functions in the AF gauge stay continuous but show a non-exponential decay exactly at the touching point although the gap is still present. Up to our best knowledge this has not been reported before. In summary we find an exponential decay with a preexponential power law for |y| 1 (similar power-laws for the pre-exponential functions have been obtained in Ref. 30 for a variety of other localized single-particle wave functions) whereas, for |y| 1, we obtain the scaling This scaling together with the exponential behavior at large y is the essential reason why the moments scale as leading via (4.49) and (4.54) to the universal scaling aξ (with ξ ∼ ξ (α) , ξ (α−1) ) for the quadratic spread of the lattice Wannier functions. This is in contrast to the visual impression of the lattice Wannier functions leading to the incorrect scaling a 2 , as we will discuss in all detail in Section V E.

C. Moments of Dirac Wannier functions
The moments in the ML gauge are given bỹ Noticing that only even moments (r = 2l) are nonzero, we rewrite this expression in the symmetrized form In particular we find that the moments are independent of τ and depend on ν only via the decay length ξ (ν) . The scaling behavior ofC In particular, the second moment is recovered from At large l the sequence (5.66) is well approximated by (2l−2)! e , see Fig. 4.
In the AF gauge, the moments (4.38) are more complicated and can be expressed as such that In contrast to the ML gauge, the dimensionless coefficients c (ν) rτ depend on τ and ν in the AF gauge via the phase factors involving φ (ν) qτ . In particular, we find which agrees with (4.39) and (3.32), and One finds that C (ν) 2τ >C (ν) 2τ = ξ (ν) /8 as expected and a divergence when the first moment (5.71) jumps (for τ = + at γ (ν) = ±π, and for τ = − at γ (ν) = 0), compare with the discussion after (5.10).

D. Scaling properties of lattice Wannier functions
In this Section we reveal the scaling properties of the lattice Wannier functions as given by (4.44) where m = Z(n − 1) + j, we obtain where we leave out the second (first) term on the right hand side for α = 1 (α = Z). For α = 2, . . . , Z − 1 universal scaling appears with two different length scales corresponding to the gaps at the bottom and the top of the band. Therefore, to reveal universal scaling, one has to keep the ratio of the two length scales fixed. For the universal scaling of the moments we get from the above equations after neglecting the strongly oscillating terms dy y r−2 G + (y; γ (α−1) ) , (5.82) By using (5.35) and (5.37) we note thatG τ is a symmetric functionG As a result the right hand sides of (5.83) are exactly zero for odd values of r. As mentioned already at the end of Section IV C 2, this does not mean that the lattice momentsD u/d,rα (M a) are zero for odd values of r. It only means that by dividing them by the leading order aξ r−1 (with ξ ≡ ξ (α) forD u,rα (M a) and ξ ≡ ξ (α−1) for D d,rα (M a)), one gets zero in the limit ξ → ∞. This is in contrast to the even moments in ML gauge which stay finite in this limit after divided by this order. Therefore, the odd moments in ML gauge are negligible and are not considered in the following. In the AF gauge the function G τ (y; γ) is asymmetric due to the asymmetry of the scaling function F A,τ (y; γ), see (5.32) and Fig. 3, with a significant larger part for either positive or negative values of y. Therefore, in the AF gauge, all moments are of order a ξ r−1 and stay finite in the limit ξ → ∞ when divided by this order. The total moment for α = 2, . . . , Z −1 is obtained from the sum (neglecting strongly oscillating terms involving (−1) n ) where we leave out the second (first) term on the right hand side for α = 1 (α = Z). According to (4.49) and (4.54), we get for the asymp- In the following two subsections we will demonstrate the universal scaling of the lattice Wannier functions and their moments for two examples with Z = 2 and Z = 3. For simplicity we assume that the dominant contribution to the gap parameter (2.21) is the term H of the effective Hamiltonian (2.20). This means that all Fourier components The 2(Z − 1) lattice parameters δv j and δt j , with j = 1, . . . , Z and j δv j = j δt j = 0, are then fixed via the Z − 1 complex gap parameters ∆ (ν) e iγ (ν) , with ν = 1, . . . , Z − 1, by using (5.103) and the inverse Fourier transform of (5.100) and (5.101) The parameters are the same as in Fig. 1 and Fig. 10, respectively. For all ξ (α) used in these figures we find that the deviation from the approximation γ1/(2π) ≈ 0.4 (left panel, see (5.106)) and γ2 ≈ 7 20 = 0.35 (right panel, see (5.115)) is less than ≈ 0.02%.

Universal scaling for Z=2
For Z = 2 (the so-called Rice-Mele model [28,31] which, for δv 1/2 = 0, turns into the Su-Schrieffer-Heeger model [32]) we show in Fig. 1(a) and Fig. 1(b) the universal scaling of the Wannier functions w(ma) ≡ w 1 (ma) = w u,1 (ma) andw(ma) ≡w 1 (ma) =w u,1 (ma) of the lowest band α = 1, together with the scaling of their first and second moment, respectively. The lattice parameters are fixed by the complex gap parameter ∆e iγ ≡ ∆ (1) e iγ (1) . The lattice Wannier functions are calculated numerically from their definitions (4.5) and (4.7) via the Bloch states u k1 (ma) and the gauge phase ϕ k1 . The latter follows from (3.14) via the Zak-Berry connection A k1 . The Bloch states and the Zak-Berry connection are calculated analytically in Appendix H. The Zak-Berry phase γ 1 of the lowest band can either be calculated from its definition (3.10) via numerical integration over the Zak-Berry connection or from the approximate formulas (3.31) and (3.32) as As shown in Fig. 5 the two ways to calculate the Zak-Berry phase coincide quite well in the low-energy regime of small gaps. Therefore, it makes no visible difference in Fig. 1(b) which choice is taken to calculate the shift variablem 1 = m − γ 1 /π. With m = 2(n−1)+j and j = 1, 2, we find from (5.74) and (5.78) for odd/even values of m (i.e., for j = 1, 2) where we used the abbreviations y ≡ ma/ξ ,ỹ ≡ma/ξ , (5.109) withm ≡m 1 = m − γ 1 /π. As a consequence, the four fundamental scaling functions F A/B,− (y; γ) and F A/B,− (y; γ) show up naturally in the scaling behavior of the lower band (in AF or ML gauge, respectively) of the Rice-Mele model for odd (A) and even (B) sites. Similarly, for the upper band the corresponding scaling functions with τ = + appear, which are related via (5.38-5.41) to the scaling functions with τ = −. For larger values of Z > 2, the same scaling functions appear but in subtle combinations, as we will demonstrate in the next subsection for Z = 3.
In the main panels of Fig. 1(a) and Fig. 1(b) we reveal the universal scaling properties by plotting the left hand side of Eqs. (5.107) and (5.108) for different values of ξ as function of y = ma/ξ orỹ =ma/ξ, and find that all curves fall on top of the universal functions |F A/B,− (y; γ)| 2 and |F A/B,− (y; γ)| 2 (for m odd/even), respectively (note that we omitted the subindex τ = − and the superindex (1) for all quantities used in the caption of these figures). Although (5.107) and (5.108) are expected to hold only for large |m| 1 and for small gaps (where ξ a), it is remarkable that the coincidence is even quite well for |m| ∼ O(1) and for large gaps (where ξ ∼ O(a)). We will come back to this point in Section V E, where we discuss the behavior for small scales m ∼ O(1). All the universal scalings are in accordance with the symmetries and asymptotic behaviors discussed for the scaling functions in Section V B, compare with Fig. 3. For even m, the function |m w(ma)| 2 scales symmetrically with zero value at m = 0, all other cases are asymmetric and have a finite value at m = 0 (orm = 0), in accordance with (5.43-5.46). For large |m| 1 all functions are exponentially decaying with a pre-exponential power-law ∼ |ma/ξ| r exp(−|ma|/ξ). Since γ = 0.2π and τ = −, we get τ = − sgn(cos γ) = 0, leading via (5.49) and (5.50) to r = ±1 for |m w(ma)| 2 and odd m ≷ 0, and to r = −1 for |m w(ma)| 2 and even m. For |mw(ma)| 2 we get r = 0.5 for both m even or odd, see (5.53) and (5.54).
In the insets of Fig. 1(a) and Fig. 1(b) we show the universal scaling of the first moment C 1 (M a)/(2a) of w(ma) and of the quadratic spreadD 2 (M a)/(2aξ) ofw(ma) (with C 1 ≡ C 11 = C u,11 andD 2 ≡D 21 =D u,21 ) when plotted against M a/ξ for different ξ. As demonstrated, all discrete points fall on top of the universal curves (5.82) (with r = 1) and (5.83) (with r = 2), i.e., The last result gives the universal value for the quadratic spread ∆x 2 =D 2 = 2aξ/8 of the Wannier functioñ w(ma) in the ML gauge (for arbitrary size Za of the unit cell it turns into Zaξ/8 for the lowest band).

Universal scaling for Z=3
For Z = 3 the Bloch states and the Zak-Berry connection are calculated analytically in Appendix H. In contrast to the case Z = 2, two gap parameters ∆ (ν) e iγ (ν) , with ν = 1, 2, are needed to fix the lattice parameters. They correspond to the two gaps and contain different phases γ (ν) and different length scales ξ (ν) = v (ν) F /(2∆ (ν) ). For the Zak-Berry phases of the three bands we obtain approximately from (3.31) and (3.32) where, as shown in Fig. 5, it makes a negligible difference for small gaps whether one takes these approximate formulas to calculate the shift variablesm α = m − 3 2π γ α or the precise definition (3.10) by integrating over the lattice Zak-Berry connection.
where we defined s 2 = M a/ξ (2) and r 12 = ξ (1) /ξ (2) . As a result the ratio r 12 of the two length scales has to be kept fixed to see universal scaling by varying ξ (2) .

E. Properties of lattice Wannier functions on different scales
In this section we exhibit and compare the important properties of lattice Wannier functions on different scales, always taking the case of small gaps where a clear separation of length scales is present ξ a (here, ξ denotes some typical order of all ξ (ν) , ν = 1, . . . , Z − 1). We distinguish three different regimes, called (F) (for free or gapless case), (S) (for scaling region where the length scale ξ appears), and (E) (for exponentially decaying re- As a consequence, the Wannier functions are identical to the zero gap limit in this regime (although the phases γ (ν) from the gap parameter occur in the ML gauge due to the special choice of this gauge). As we will see in the following the regime (F) is only important for the correct normalization of the Wannier function but is of no significance for the scaling and provides a misleading visual impression of the Wannier function.
scales a and ξ roughly as follows (in AF or ML gauge for any band). Replacing ma by the continuous variable x and rescaling the Wannier function via the qualitative form can be stated as follows (omitting strongly oscillating terms on scale Za which contribute a negligible amount to the moments) where δ a (x) is a Lorentzian delta function on scale a, and f (y) is a dimensionless function of order O(1) The most important fact is that the delta function covers the complete normalization on scales whereas, due to the Lorentzian form of the delta function, the scaling of all moments is determined from the region |x| ξ as for all r ≥ 1. Note that the region |x| ∼ a contributes only the negligible amount a r a ξ r−1 to the moment for r ≥ 2. Strictly speaking, this argument is only rigorous for even values of r since in this case all terms of the integrand are positive. For odd values of r it can happen that the prefactor in front of the leading term is zero. This happens for the scaling of the odd moments D 2l+1,α (M a) andC 2l+1,α (M a) in the ML gauge, which do not show universality but are of no importance since their asymptotic value is of negligible order ∼ a 2 ξ 2l−1 for l > 0, see the detailed discussion after Eq. (5.87). In contrast, in the AF gauge the odd moments scale according to the estimate (5.149) for all r ≥ 1 and their order of magnitude is fully determined by the scaling region (S) (even for r = 1). They can be written as such that convergence is guaranteed at y = 0 even for l = 0. Importantly, in the AF gauge, the function f (y) ≈ f (−y) is nearly symmetric for |y| 1, see (5.139). Therefore, the region of small scales does not contribute and the odd moments show universal scaling from the regime (S) due to a significant asymmetry of the scaling function f (y) for |y| ∼ O(1), see the discussion after Eq. (5.87). E.g., it is quite remarkable that the first moment in the AF gauge is fully determined by the nearly invisible asymmetry of the Wannier function on large scales |ma| ∼ ξ a, whereas the visible impression of a symmetric shape on small scales |ma| ∼ O(a) would predict an incorrect vanishing first moment.
The scaling behavior of the moments is demonstrated numerically in the insets of Figs. 1(a), 1(b), 10, and 11, where it is shown that the first moments C 1α (M a) in the AF gauge and the second momentsD 2α (M a) in the ML gauge scale indeed smoothly to their universal asymptotic values on the scale M a ∼ ξ and obtain a negligible contribution from the small scale regime (F). Similar numerical evidence can be shown for all the other higher moments C rα (M a) with r ≥ 1 andD 2l,α (M a) with l ≥ 1.  As can be seen the normalization reaches unity already on very small scales M ∼ Z for large ξ (2) a.

VI. NON-ABELIAN WANNIER FUNCTIONS
In this final section we analyse another class of lattice Wannier functions which arise from a special non-Abelian gauge transformation which mixes the Bloch states from all bands α = 1, . . . , ν below a given one labelled by ν.
In particular the non-Abelian gauge of maximally localized Wannier functions has been proposed in the literature and the connection of their spread to the polarization fluctuations have been put forward. Here we analyse the universal scaling properties of the non-Abelian lattice Wannier functions and find the interesting result that they scale up to a surprisingly high precision according to the Dirac Wannier function of the lower band of the Dirac model corresponding to gap ν or, equivalently, similar to the upper part of the Wannier function of band ν as shown in Fig. 8. In Section VI A we summarize the general definition of the non-Abelian gauge of maximal localization and show our central result of its universal scaling. The technical part of the construction of this non-Abelian gauge is outlined in Section VI B, supplemented by three Appendices I, J and K. We show this construction in the main part of this work since it provides a very efficient way to analyse non-Abelian gauges analytically via the Wilson propagator which, up to our best knowledge, has not been reported before.

A. Non-Abelian lattice gauge and summary of results
So far we have discussed Abelian gauge transformations for a fixed band index α by allowing for a phase factor e iϕ kα to transform the Bloch state as |u kα → |ũ kα = e iϕ kα |u kα , (6.1) where ϕ kα = ϕ k+ 2π Za ,α is a periodic phase variable. In multi-band systems more general non-Abelian gauge transformations have been discussed, see Ref. 10 or reviews in Refs. 3 and 9. These gauge transformations mix the Bloch states of all band indices α = 1, . . . , ν up to a certain value ν (defining the valence band if a chemical potential is present in gap ν) via a k-dependent unitary transformation as Here,Û From the normalization (2.13) of the Bloch states and the unitarity ofÛ have interesting properties [10] which we summarize in the following. Defining the Zak-Berry connection and the geometric tensor in the non-Abelian case analog to (3.1) and (3.4) as and introducing the following definition for the non-Abelian Zak-Berry phase matrix together withγ αα for the diagonal components, one finds the following useful transformation propertieŝ where wn[detÛ Furthermore, defining the moments of the Wannier functions in the non-Abelian case analog to (4.16) aŝ ma) r |ŵ (ν) α (ma)| 2 (6.14) = Za 2π and introducing the following definitions for the sum over the positions and the quadratic spreads 1α , (6.16) we find from (6.15) and the above definitions after some straightforward algebra Since the first term on the right hand side of (6.19) is gauge invariant according to (6.13), and the last two terms are positive, the quadratic spread ∆x 2 (ν) is minimal in the non-Abelian gauge of maximal localization (NA-ML), defined by a k-independent and band-diagonal Zak-Berry connection As a consequence, using the surface fluctuation theorem (3.52), we find that the minimal quadratic spread in the NA-ML gauge is related to the boundary charge fluctuations as where we defined the momentŝ relative to the shift by the first moment, i.e., witĥ Most importantly, the surface fluctuation theorem (6.21) formulated in terms of the non-Abelian quadratic spread shows that the NA-ML gauge is unique in the sense that the boundary charge fluctuations can be written as the sum over the second moments of the Wannier functions of the individual bands. Within the Abelian ML gauge this is not possible, as already discussed after (3.58). For the boundary charge itself it seems to be similar at first sight to the Abelian AF and ML gauge, where it is also related to the sum over the Zak-Berry phases of the individual bands. However, as we have seen in Section III C 1, a subtle cancellation procedure happens such that the sum over the Zak-Berry phases of the individual bands α = 1, . . . , ν is related to the phase γ (ν) of the gap parameter of gap ν, see the central equation (3.51). As we will see in the following this is not needed in the non-Abelian gauge where all Zak-Berry phasesγ (ν) α are equally spaced and related to γ (ν) in the following universal waŷ such that −π <γ (ν) α < π. Consistent with the way the fluctuations are distributed among the bands in the non-Abelian case, we will find that all Wannier functionsŵ where e iθ (ν) α is some unimportant phase factor. Up to this phase factor, we find in comparison to (4.52) the central result that all Wannier functionsŵ As a consequence, defining the following scaling functions for the non-Abelian momentŝ we find from (5.83) and (5.96) the universal scalinĝ dy y r−2G − (y; γ (ν) ) , (6.28) with the following asymptotic value for the second mo-mentD The universal relation (6.25) of all non-Abelian Wannier functions α = 1, . . . , ν to the Dirac Wannier function of the lower band of the Dirac theory corresponding to gap ν is one of the most important results of this work. It shows that all non-Abelian Wannier functions scale independent of the band index in the same way and depend only on the low-energy properties of the model, provided that the condition of small gaps is fulfilled. Up to our best knowledge the non-Abelian Wannier functions have not been studied analytically in the literature and their universal scaling behavior for all generalized AAH models in terms of a single length scale ξ (ν) has not been reported before.
We note that also the winding number of the determinant detÛ (ν) k contains interesting information about the total number of edge states N (ν) E present in all gaps ν = 1, . . . , ν. According to (6.12), the winding number controls the change of the sum over all Zak-Berry phases from the bands α = 1, . . . , ν in the gauge transformation. On the other hand we can calculate the sum over the Zak-Berry phases in the AF and NA-ML gauge separately by using (3.31), (3.32) and (6.24) Since an edge state is present in gap ν if s γ (ν ) > 0, we get in addition for the total number of edge states the expression (6.32) Comparing these equations we find the following relation between the number of edge states and the winding number of detU which is obviously an integer number for both ν even or odd. This relation can be viewed as a bulk-boundary correspondence relating the total number of edge states up to a certain gap ν to a winding number defined in terms of the bulk states. In contrast to other topological invariants defined in one-dimensional systems, this relation holds independent of any symmetry constraints and includes all edge states, independent of whether they are at zero energy or not. Using (3.47) and (3.50), we finally find for the total boundary charge without or with including the edge states The last result states the surface charge theorem in terms of the non-Abelian Zak-Berry phasesγ where denotes the Wilson propagator along the path from k 2 → k 1 (here, P denotes the k-ordering operator analog to the time ordering operator), k 0 is an arbitrary reference point, and V (ν) is the unitary transformation which diagonalizes the Wilson loop operator Indeed, using the differential equation we find periodicity of the non-Abelian gauge transformation Here we used in the first step the group property (6.41), and in the second step the periodicity (6.42) together with the definition (6.38) of the transformation V (ν) . We note that the non-Abelian Zak-Berry phasesγ FIG. 14. Sketch of the typical band structure for the interval − π Za < k < 3π 2Za for Z = 5. The two subintervals (I) with − π Za < k < π 2Za and (II) with π Za < k < 3π 2Za are indicated in violet and green colour, respectively. The lowest/highest band in subinterval (I)/(II) (corresponding to ν = 0 and ν = Z, respectively) are characterized by k-independent Bloch states u kα . Four different k-values are shown, indicating the block structure (6.61) and (6.62) of the lattice Wilson propagator for ν even or odd in the two subintervals. Here, k1 corresponds to U (ν) (k1, − π 2Za ) for ν = 4 even in subinterval (I) (see the first equation of (6.61)), k2 to U (ν) (k2, − π 2Za ) for ν = 3 odd in subinterval (I) (see the first equation of (6.62)), k3 to U (ν) (k3, π 2Za ) for ν = 4 even in subinterval (II) (see the second equation of (6.61)), and k4 to U (ν) (k4, π 2Za ) for ν = 3 odd in subinterval (II) (see the second equation of (6.62)).
where we used the group property (6.41) and the periodicity (6.42). This means that the unitary operator (V (ν) ) with respect to the reference point k 0 is related to V (ν) by As a consequence, up to trivial phase factors to define the unitary transformation V (ν) , we find that the unitary transformationÛ (ν) k is unique and independent of k 0 . For convenience, we will choose in the following k 0 = −π/(2Za) as reference point and discuss the two intervals (I) k ∈ [−π/(2Za), π/(2Za) and (II) k ∈ [π/(2Za), 3π/(2Za)] separately since they are related to two different Dirac theories, see qτ p e ipk (ν) F ma , (6.47) where in both cases |q| < π 2Za . By convention we have added two trivial Dirac theories, one for α = 1 in subin-terval (I) with ν = 0 and τ = +, and another for α = Z in subinterval (II) with ν = Z and τ = −. In these two cases the Bloch states are given by the gapless ones and are independent of k, such that the Zak-Berry connection is zero.
The non-Abelian Wannier functions are then split aŝ where the transformed Bloch states follow from (6.2) and (6.36) aŝ It turns out to be useful to define the propagators when crossing the two subintervals as and use As outlined in detail in Appendix I one can extend the integration limits in (6.49) and (6.50) to ±∞ by using the k-independent values of the Bloch states and the propagator at the edges of the two subintervals. This means that the propagators for the two subintervals are extended as , (6.56) (6.57) The additional contributions are shown to cancel each other when taking the sum (6.48) up to trivial delta functions which are subtracted as usual both in lattice and Dirac theory.
Since the Zak-Berry connection has the block structure (3.29) in terms of the Dirac Zak-Berry connection, we obtain a corresponding block structure for the propagator denotes the Wilson propagator in Dirac theory with respect to reference point q = −∞ Here, A (ν) q denotes the 2 × 2-matrix of the Dirac Zak-Berry connection (3.19). According to Fig. 14 we then get the following block structure of the two propagators for ν even and odd in the corresponding subintervals (where we use k = q for subinterval (I) and k = π Za + q for subinterval (II)) ν even : ν odd : q− is the Abelian Dirac propagator for the upper part of the last band ν, which can be expressed with the help of (5.10) and (5.9) as of the Wilson propagator in Dirac theory is calculated in Appendix J. Most importantly, it is shown that this propagator transforms the Dirac states asχ i.e., leads to a q-independent contribution. This has the effect that, after inserting (6.46), (6.47), (6.51), (6.54), (6.61), and (6.62) in (6.49) and (6.50) (with k = q and k = π Za + q, respectively, and the q-integrals extending to ±∞), all propagators U (ν ) q give rise to integrands where the q-dependence of the Dirac states disappears and only a purely oscillating exponential remains in the integrand. This leads to unphysical delta function contributions which can be disregarded, see Appendix I. The only remaining terms are those from the Abelian propagators U (ν) q− , leading to the following result for the two subintervalŝ  In addition we will analyse in Appendix K the spectrum of the Wilson loop operator (6.55) and find indeed that all non-Abelian Zak-Berry phases are equidistantly distributed according to (6.24).

VII. SUMMARY AND OUTLOOK
The universal scaling properties of Wannier functions found in the present work are fundamental in several respects. Of particular interest is the high precision value of the universal scaling even for rather large gaps, as it appears most prominently in the non-Abelian gauge of maximal localization (NA-ML), see Fig. 13. The NA-ML gauge has been put forward in the literature [3,9,10] since in this gauge the total quadratic spread probes the fluctuations of the bulk polarization which, according to the surface fluctuation theorem [18], is related to the fluctuations of a directly measurable observable, the boundary charge Q (ν) B . Importantly, Q (ν) B is the total boundary charge corresponding to a chemical potential located in gap ν, in contrast to the boundary charge Q B,α of a single band α which is not measurable. The observable Q (ν) B probes essentially low-energy properties of the system since the low-lying states are occupied and contribute a fixed amount to the boundary charge which can not fluctuate. This characteristic feature of the boundary charge is directly linked to our result that the Wannier functions in the NA-ML gauge are probing low-energy properties and reveal universal scaling to high precision. This motivates further investigations of the NA-ML gauge in other multi-channel or higher-dimensional systems to find similar universal scaling assuming the small gap limit.
Our proposal of how to define Wannier functions for field-theoretical Dirac models opens up pathways to describe universal scaling properties of Wannier functions for a whole class of multi-band lattice models very efficiently via a simplified continuum model with a low number of bands only. We have exemplified this here for a 2-band Dirac model in 1 + 1-dimension, covering the whole class of one dimensional tight-binding models with nearest-neighbor hopping, one orbital per site, and generic periodic on-site potential and hopping modulation. These are special models in the sense that the gap opens up at a single point in quasimomentum space (in our case either at k = 0 or at k = ±π/(Za)). This gives rise to a field-theoretical low-energy model consisting of one pair of a slowly varying right-and left-mover. In other more general multi-channel cases or models with longer-ranged hopping several anticrossings can appear in quasimomentum space each of them characterized by the presence of several channels. The same will apply to higher-dimensional systems. However, also for these more general cases, the boundary charge and the Wannier functions in the NA-ML gauge are expected to probe only the low-energy properties of the system provided that all gap openings close to the Fermi level remain small. Therefore, based on our proposal of how to define Wannier functions within continuum field theories, it will be of high interest to study the Wannier functions of more general field-theoretical models, consisting of multichannel Dirac models or several Dirac models which are coupled to each other (each of them corresponding to one anticrossing point).
It is a quite remarkable result of our work that fieldtheoretical models also allow for the definition of the Zak-Berry connection and the geometric tensor, although these are quantities which are used in lattice models to characterize global topological properties of the whole band structure in a nonlocal way. In contrast, a field theory can only characterize the physics close to the Fermi level where the gap opens, i.e., it probes essentially local aspects of the band structure. However, this does not mean that a field theory is per se not capable of characterizing topological properties of lattice models. If the gaps at the bottom and the top of a given band are both small, it is possible to set up two independent field theories around the two gaps and supplement them by appropriate asymptotic conditions to connect them in a unique way. This allows to obtain full control over the whole Zak-Berry connection in the lattice theory via the Zak-Berry connection defined in the field theoretical models. For the lattice models studied in this work the band structure in the middle of the bands is characterized by free plane waves, leading naturally to the choice of asymptotically free plane waves in the corresponding field-theoretical models. For more general multi-channel models, as well as in the presence of spin-orbit interaction and Zeeman fields, generalizations have to be studied for small gaps by expanding around a different reference system (called H 0 in our work, see Eq. (2.2)). This reference system will again prescribe certain asymptotic conditions for the field-theory far away from the gap, which can be used to set up unique correlations between gaps opening up at different energies. Therefore, we expect that our way to describe universal properties of Wannier functions and the boundary charge via low-energy models will be also of interest for the description of global topological aspects in more general models. Here we show that the normalization and completeness relations (2.13) and (2.14) are consistent with the corresponding ones in the low-energy region, given by (2.35) and (2.36). To achieve this we first discuss the gapless case and show that (2.29) reproduces the correct result for the eigenstates of H 0 .
For the gapless case δv m = δt m = 0 the Hamiltonian h k given by (2.10) has the form One obtains straightforwardly for the dispersion and the eigenfunctions where n kα are Z integers in some interval of size Z. To achieve the ordering in the reduced zone scheme − π Za < k < π Za , we choose n k1 = 0, n k2 = − sgn(k), n k3 = sgn(k), n k4 = −2 sgn(k), n k5 = 2 sgn(k), etc., which gives for α even e i sgn(k) π Z (α−1)j for α odd . (A.5) According to (2.30) and (2.31) it is straightforward to show that this corresponds precisely to the choice χ (ν) qτ p = δ p,sgn(qτ ) in Dirac theory, as stated in (2.29).
To check the normalization (2.13) we prove (2.40) rigorously for any integers r, s ≥ 0 via the relations (2.30) and (2.31). Except for the lower half of the lowest band (i.e., for |k| < π 2Za and α = 1) or the upper half of the highest band (i.e., for |k| > π 2Za and α = Z), the gaps with even ν describe the region |k| < π 2Za , whereas the gaps with odd ν correspond to |k| > π 2Za . Therefore, in almost all cases, the parity of ν and ν must be the same for given k. In this case we get where we used δ pν,p ν = δ pp δ νν in the last step (since ν, ν > 0) and the fact that pν − p ν is always an even number (since ν and ν have the same parity). The special case ν = 1 and ν even is only possible for τ = −, p = sgn(k) = − sgn(q), and |k| < π 2Za . It leads to This expression must be zero since, by using p + sgn(q) = 0 and ν even, we conclude that p ν must be zero which is not possible. Finally, the special case ν = Z − 1 and ν having a different parity than Z − 1 can be treated in a similar way and leads also to a contradiction. To prove the completeness relationship (2.14) for given k we note that, for each given gap ν, one has to sum over both band indices τ = ± of Dirac theory, except for ν = 1 or ν = Z − 1 when |k| < π 2Za or |k| > π 2Za , respectively. In the latter case only the Dirac bands τ = − (for ν = 1 and |k| < π 2Za ) or τ = + (for ν = Z − 1 and |k| > π 2Za ) have to be taken, respectively. This means that only the eigenstates far away from the gap ν = 1 or ν = Z − 1 are involved for the lowest or highest lattice band, where we can take approximately the free solution (A.5). The same happens for the contribution from the pairs τ = ± since one finds from (2.30) and (2.31) that the expression τ u kα (ja)u kα (j a) * involves always the combination where we used the completeness relationship (2.36) of the Dirac states. Since the same result arises for the free Dirac states χ (ν) qτ p = δ p,sgn(qτ ) , we find that the completeness relationship in the presence of a gap is not changed compared to the one for the free Bloch states (A.5).

Appendix B: Cancellation of Zak-Berry phases
Here we show on quite general grounds that the sum of the Zak-Berry phases in the AF gauge of the two bands of the Dirac model is quantized in odd units of π. We first need that one can write (with χ qτ ≡ |χ qτ defining column vectors in p = ±) where we defined the unitary matrix This means that (up to multiples of 2π) the sum of the Zak-Berry phases of the two Dirac bands can be written as minus the phase difference of the determinant of M  Here we show how to split the lattice Wannier function for band α = 2, . . . , Z − 1 into the contributions corresponding to the upper and lower halves of the band and how to extend the momentum integration to infinity, see Eqs. (4.8), (4.9), (4.12) and (4.13). Since the eigenfunctions of the lattice and the Dirac model are related via (2.22), (2.23), (2.24), (2.30) and (2.31), we can perform this analysis directly within the Dirac model by calculating the contributions from the region |q| > Λ α of the Dirac Wannier functions occurring on the right hand side of (4.44), (4.45), (4.52) and (4.53). Denoting the contribution of the various Wannier functions from this region by δw u/d,α (ma) we get qτ p e iqx e −η|q| , (C.6) withχ (ν) qτ p . We have introduced a convenient regularization of the integrals via the convergence factor e −η|q| . For |q| > Λ α we can take the free form χ Using these relations we obtain for (C.5) and (C.6) Leaving out the contribution proportional to the delta function δ(x) (this is the regularization we use in Section V to define the Dirac Wannier function), we obtain after inserting these expressions into Eqs. (C.1-C.4) For α = 2, . . . , Z − 1 we can insert Λ α = π 2Za and γ α = γ where δw α (ma) and δw α (ma) are given by (4.12) and (4.13), respectively. In contrast, for α = 1, Z, we can use Λ 1/Z = π Za , γ 1 = γ In this Appendix we present the representation of the Dirac Wannier functions w pτ (x) together with the scaling functions F A/B,τ (y; γ) andF A/B,τ (y; γ) via convergent momentum integrals on the real axis. Starting from (5.14-5.17), we first subtract the large |q| 1 behavior of the integrands to get convergent integrals and to explicitly exhibit the part which remains forη → 0. Using (2.29) and (C.7) together with (C. 8) and (C.9) (with Λ α = 0) we find (D. 2) The first term on the right hand side of the Dirac Wannier functions contains an unphysical contribution ∼ δ(x) which arises from large momenta and has to be subtracted in the regularization procedure. In the scaling functions this part does anyhow not appear since one multiplies with x. In contrast, the principal part ∼ 1 x of the first term is an important contribution to the scaling functions. It should be contrasted to the contributions (C.8) and (C.9) which arise from extending the momentum integration to infinity and cancel in the lattice Wannier function of a certain band when adding the contributions from the upper and lower part of the band to the Wannier function, see Appendix C.
Using (5.11) and (3.32) we note the identities where we inserted the form (5.17) forχ (ν) τ p (q), and used the definition y = x/ξ (ν) = 2x. Inserting (5.16) and using (D.4) one can calculate the scaling functions from (5. 24-5.29) and finds after some straightforward manipulations the compact forms with¯ q = 1 +q 2 and s γ = sgn γ. These formulas can be straightforwardly evaluated numerically and have been used to produce the results shown in Fig. 3. To obtain (5.18) and (5.19), we use (5.14) and (5.15), and close the integration contour in the upper or lower half of the complex plane for s x = sgn(x) = ±, respectively. Choosing the branch cuts of the various square roots and of e −η|q| on the imaginary axis, we can close the integration contour around the branch cut, and obtain withq = is x κ and 0 < κ < ∞ the form and an analog equation forw τ p . Due to the exponentially decaying factor e −|x|κ , these integrals are convergent in the limitη → 0, and we obtain the results (5.18) and (5.19) in this limit.

(E.6)
For |x| 1, the integrals are dominated by the regime κ 1, where we get Inserting these forms in (5.18) and (5.19), we find straightforwardly the following result for small |x| ξ (ν) .
For Z = 2 this gives the explicit formulas where δv = δv 1 = −δv 2 , and t 1/2 = t ± δt. Using (5.103), We do not indicate the dependence of δv j ≡ δv jkα on k and α since it is clear which band index has to be taken for the Abelian case.
For the derivative of the dispersion we use the results obtained in Ref. 15 and for Z = 3 .

(H.25)
Appendix I: Splitting of non-Abelian Wannier functions In this Appendix we show that extending the momentum integrations in the representations (6.49) and (6.50) of the Wannier functions in NA-ML gauge for the two subintervals gives only rise to delta function contributions in the total sum (6.48) which can be disregarded. After inserting (6.51) we replace the propagators beyond the subintervals via (6.56), (6.57) and (6.54) as Obviously, the two contributions k > π 2Za from the extension of subinterval (I) and k < π 2Za from the extension of subinterval (II) lead to the same integrand, such that the sum contains dk e ikm (ν) α a = 2πδ(m (ν) α a). This are unphysical delta function contributions which we always disregard. Similar delta function contributions are generated from the high-momentum integrals in Dirac theory which cancel these terms.
For the two remaining contributions we first shift k → k + 2π Za for the contribution k > 3π 2Za from the extension of subinterval (II) to get the same starting point − 2π Za for the k-integration as the end point for the contribution k < − π 2Za from the extension of subinterval (I). This shift gives an additional e i 2π Zm (ν) α factor from the exponential e ikm (ν) α a in the integrand. The integrands for the two contributions are then again the same since such that only (V (ν) ) α α remains for the product of (I.6) and (I.7). Here, we have used the definitionm for the derivation of (I.6), and the definition (6.38) of the transformation V (ν) for the derivation of (I.7). As a result we get again a delta function from the two remaining contributions which we can disregard.

Appendix K: Spectrum of Wilson loop operator
The Wilson loop operator defining the unitary transformation V (ν) can be calculated from (6.55) as the product U (ν) II follow from the limit q → ∞ of (6.61) and (6.62) as ν even : U