Dulmage-Mendelsohn percolation: Geometry of maximally-packed dimer models and topologically-protected zero modes on site-diluted bipartite lattices

The classic combinatorial construct of {\em maximum matchings} probes the random geometry of regions with local sublattice imbalance in a site-diluted bipartite lattice. We demonstrate that these regions, which host the monomers of any maximum matching of the lattice, control the localization properties of a zero-energy quantum particle hopping on this lattice. The structure theory of Dulmage and Mendelsohn provides us a way of identifying a complete and non-overlapping set of such regions. This motivates our large-scale computational study of the Dulmage-Mendelsohn decomposition of site-diluted bipartite lattices in two and three dimensions. Our computations uncover an interesting universality class of percolation associated with the end-to-end connectivity of such monomer-carrying regions with local sublattice imbalance, which we dub {\em Dulmage-Mendelsohn percolation}. Our results imply the existence of a monomer percolation transition in the classical statistical mechanics of the associated maximally-packed dimer model and the existence of a phase with area-law entanglement entropy of arbitrary many-body eigenstates of the corresponding quantum dimer model. They also have striking implications for the nature of collective zero-energy Majorana fermion excitations of bipartite networks of Majorana modes localized on sites of diluted lattices, for the character of topologically-protected zero-energy wavefunctions of the bipartite random hopping problem on such lattices, and thence for the corresponding quantum percolation problem, and for the nature of low-energy magnetic excitations in bipartite quantum antiferromagnets diluted by a small density of nonmagnetic impurities.


I. INTRODUCTION
Static impurities are an important feature of many physical systems. This has long motivated studies of quenched disorder in the context of diffusion of fluids in disordered media, and electronic conduction and magnetism in disordered solids.
The random interconnectedness of a porous medium can affect the diffusion of a fluid through it in a strikingly crucial way if the end-to-end connectivity of a porous material is lost at a sharp porosity threshold. This key insight led Broadbent and Hammersley to initiate the theoretical study of percolation processes over sixty years ago as a paradigm for understanding such behaviour. Universal features of such threshold behavior are expected to be shared by the corresponding percolation transition of diluted Euclidean lattices. This transition represents an elementary and intuitively compelling geometric example of universality and scaling at a second-order critical point [1][2][3]. Percolation theory has therefore come to occupy center-stage at the confluence of probability theory, geometry, and statistical physics in recent years.
Unlike a classical fluid, quantum-mechanical particles do not simply diffuse through such a disordered medium; their wave-like character leads to interference phenomena that can render diffusion over large length scales impossible. This insight led to Anderson's nearly-concurrent development of the theory of localization for describing transport properties of the electron fluid in a dirty metal [4]. This general theory addresses the effects of random potentials and random hopping amplitudes in models of electronic conduction [5].
The literature on quantum percolation, dating back nearly fifty years [6], constitutes the closest point of contact between studies of percolation on the one hand, and localization on the other. Quantum percolation is a special case of the localization problem. It models electronic conduction in disordered binary alloys by studying the behaviour of a quantum particle hopping with nearestneighbor hopping amplitudes on a diluted lattice without any external random potentials [7][8][9][10][11][12][13][14].
In the simplest versions of interest to us, the constituent which does not contribute to the conduction band of the alloy is represented by missing sites of a site-diluted bipartite lattice (which admits a decomposition into two sublattices, labeled "A" and "B" in our discussion below, with sites of one sublattice only having neighbors on the opposite sublattice). Can such a system of noninteracting electrons support ohmic conduction at some specified Fermi energy when the underlying lattice has end-to-end connectivity in the thermodynamic limit? Conversely, and more informally: Is there a regime of dilution in which a classical fluid percolates, but a quantum fluid does not? These are the key questions that motivate this body of work.
The random geometry of such diluted lattices is also probed by the classic combinatorial problem of maximum matchings [15]. In a maximum matching of a lattice, one attempts to cover the maximum number of sites of the lattice by hard-core dimers that each occupy a single link of the lattice. In this way, one attempts to match as many sites as possible with one of their neighbours. While no FIG. 1. Topologically-protected zero modes of bipartite random hopping problems are expected to live in R-type regions of the lattice, with local sublattice imbalance and a site boundary consisting entirely of sites belonging to the sublattice that is locally in the minority [20]. See Sec. I, Sec. II, and Sec. III for a detailed discussion. two dimers are allowed to touch at any site, some sites may remain unmatched, i.e., host monomers, if the diluted lattice has no perfect matchings (equivalently, fullypacked dimer covers). This ensemble of maximum matchings, with some choice of positive weight for each maximum matching, defines a maximally-packed dimer model on the underlying lattice.
Like percolation and localization, the problem of maximum matchings also has a long and distinguished history, having seeded major developments in graph theory, combinatorics, and computer science. For instance, it was Edmonds's analysis of the computational complexity of his algorithm for finding maximum matchings [16] that led to the notion of the computational complexity class P that plays a fundamental role in theoretical computer science.
For such maximally-packed dimer models on sitediluted bipartite lattices such as the square and honeycomb lattices in two dimensions, and the cubic lattice in three dimensions, our computational study establishes the presence of a nonzero monomer density in the thermodynamic limit for any nonzero dilution probability n vac > 0. As we will see below, monomers of any maximum matching live in well-demarcated regions of the lattice. We also study the dilution dependence of the end-to-end connectivity of these monomer-carrying regions. For a range of n vac well within the geometricallypercolated phase of the diluted lattice, our computations reveal the striking presence of a localized phase of these monomers; this reflects the finite extent of each such monomer-carrying region.
As n vac is lowered further, our study uncovers interesting percolation phenomena displayed by the monomercarrying regions. In the three dimensional case, we find a sharply-defined transition associated with the percolation of these connected monomer-carrying regions. This occurs at a nonzero n crit vac that lies well within the geometrically-percolated phase of the underlying lattice. At an even lower dilution n low vac < n crit vac , we also observe a spontaneous sublattice symmetry-breaking transition of the monomer fluid, whereby the monomers break the statistical sublattice symmetry of the randomly-diluted bipartite lattice.
In two-dimensions, we find that these monomercarrying regions undergo an incipient percolation transition in the n vac → 0 limit of vanishing site dilution. In this limit, large-scale properties of these regions exhibit universal behaviour controlled by a diverging length-scale that represents their typical size. In three dimensions, large-scale properties in the vicinity of n crit vac are again controlled by the divergent typical size of such regions. Our detailed computational study allows us to identify the correlation length exponent ν that controls the powerlaw divergence of this length scale in the vicinity of the critical point in three dimensions, and in the low-dilution limit in two dimensions: ν 2d = 5.1 (9), ν 3d = 0.87 (10).
This universal critical behaviour represents an intrinsic property of the random geometry of the underlying site-diluted lattice in the following sense: Since every dimer in any maximum matching of a bipartite lattice pairs one A-sublattice site with an adjacent B-sublattice site, monomers in the corresponding maximally-packed dimer model are associated with regions of the diluted lattice with local sublattice imbalance. The critical exponent ν should thus be thought of as characterizing the random geometry of local sublattice imbalance in such site-diluted bipartite lattices.
Clearly, these results are of fundamental interest from the point of view of the statistical mechanics of such maximally-packed dimer models, and from the point of view of the graph theory of such random lattices, particularly percolation theory. However, at this point, it would be natural for any reader to wonder: When the theoretical literature already abounds with several other interesting variations on the theme of geometric percolation [17][18][19], what motivated our interest in yet another variant, this one associated with monomers of maximum matchings? Indeed, what is the motivation for a computational study of maximum matchings of such lattices in the first place? And what does any of this have to do with the questions of Anderson localization and quantum percolation introduced at the very outset?
It is useful to answer these natural questions before proceeding further. In the remainder of this Introduction, we provide a colloquium-level overview that answers them and sketches a number of other interesting connections. Readers who prefer to first understand things at a technical level may wish at this point to go to the more detailed and technical overview in Sec. II, and then circle back to the rest of this Introduction.
To proceed, we begin by noting that our original motivation came from two pieces of earlier work: In one of these, Ref. 20 noticed that the site-diluted honeycomblattice tight-binding model that describes low-energy carriers of undoped but diluted graphene had a nonzero density of single-particle states at exactly zero energy. These zero modes were argued to arise from so-called "R-type" regions of the lattice which have a local imbalance in the number of A-sublattice sites relative to B-sublattice sites (even when there is no such imbalance globally), and a boundary consisting entirely of sites belonging to the sublattice that is locally in the minority [20] (see Fig. 1).
Crucially, this proposed mechanism also predicted that these zero modes are topologically-protected, in the sense that their existence is unaffected by changes in the actual values of the nonzero hopping amplitudes on each nearest-neighbor link, so long as the pattern of nearest-neighbor connectivity in the lattice does not change [20]. Associated with this robustness is a topologically-protected localization property of the corresponding wavefunction: it lives entirely within the Rtype region [20].
In the other closely-related work, Ref. 21 studied the effects of non-magnetic impurities in a SU(2)-symmetric version [22] of Kitaev's model [23] for a Majorana spin liquid on the honeycomb lattice. Low-temperature properties of such systems are controlled by fermionic excitations, whose spectrum is obtained by solving a tightbinding model on the site-diluted honeycomb lattice with π flux attached [23][24][25] to each vacancy. The topologically-protected zero-energy states identified earlier in the context of diluted graphene are expected to survive this flux attachment, albeit with modified wavefunctions [20,21].
In a striking corroboration of this prediction, the density of zero modes obtained numerically in Ref. 21 in this problem with flux attachment matched the corresponding density in diluted graphene to within a few percent. This strongly suggests the zero mode density is dominated by such topologically-protected zero modes, with fragile modes (that depend on a specific pattern of values for the nonzero hopping amplitudes) contributing an insignificant fraction of the total.
If the vacancies are not permitted to get too close to each other, as was the case in the computations of Refs. 20 and 21, the smallest nontrivial example [20] of such topologically-protected zero modes on the sitediluted honeycomb lattice is a single mode associated with a R-type region with six vacancies at specific positions near each other. The expected density of these "R 6 " regions provides a simple but rigorous lower-bound on the density of topologically-protected zero modes [20]. However, the density of zero modes computed by multiprecision numerics far exceeds this bound in both cases, i.e., with and without flux attachment [20,21]. This raises the question: If R 6 regions are atypical, how does a generic R-type region "look", and how many linearlyindependent zero modes does it support?
This is an interesting question, both in the context of such Kitaev-like models, and in the context of Hubbardlike models for electron-electron interactions in diluted undoped graphene. In the Kitaev systems, the density of zero modes controls the coefficient of a Curie-like term that dominates the low-temperature susceptibility [21], while their wavefunction determines the spatial profile of the vacancy-induced magnetic moments that lead to this Curie term. Likewise, in the Hubbard model, Hartree-Fock mean field theory suggests [26] that the zero modes would be associated with local moment formation, with their wavefunctions again controlling the spatial profile of these local moments.
Since these topologically-protected modes are expected to live in R-type regions with local sublattice imbalance, it is natural to explore the possibility that progress on this question can be made by thinking in terms of maximum matchings of the site-diluted lattice. After all, as we have already discussed, every dimer pairs one Asublattice site with an adjacent B-sublattice site, while monomers of a maximally-packed dimer model are associated with local sublattice imbalance.
Indeed, a similar line of reasoning led Longuet-Higgins [27] nearly sixty years ago to a pair of insightful exact results that partially anticipated related developments in the graph theory literature [15]. Transcribed to the language used here, these results say the following: i) The number of monomers in any maximum matching of a bipartite lattice equals the number of linearly-independent topologically-protected zero-energy wavefunctions of a quantum-mechanical particle hopping on nearest-neighbor links of this lattice. ii) The spatial support of the topologically-protected zero-energy eigenspace of this hopping Hamiltonian is given by the set of all sites that can host a monomer in any maximum matching of the lattice.
The first of Longuet-Higgins's results provides a simple way to independently confirm [28] the relatively large density of topologically-protected zero modes [20], far in excess of the simple lower bound of Ref. [20]. While this confirmation [28] is reassuring, the initial impetus for our work came mainly from our attempts to go beyond Longuet-Higgins's second result in order to answer our earlier question: What is the spatial morphology of a "typical" R-type region of the diluted lattice?
This question can be restated more formally: Can we equip this eigenspace of zero modes with a "natural" basis that has a topologically-protected localization property? Clearly, answering this requires us to go beyond Longuet-Higgins's second result, which only provides a global characterization of the spatial support of the topologically-protected null space of the hopping matrix as a whole.
Our approach to this question brings into play a graphtheoretical tool, the Dulmage-Mendelsohn decomposition of a bipartite graph, which provides a crucial structural characterization of such diluted lattices via the combinatorial problem of maximum matchings [15,[29][30][31]. This allows us to use any one maximum matching of the diluted lattice to construct a set of non-overlapping connected regions of the lattice that host the monomers of any maximum matching.
Our key insight is simply stated: These connected monomer-carrying regions identified using Dulmage and Mendelsohn's structure theory provide a natural con-struction of a complete non-overlapping set of R-type regions of the lattice. Our construction shows that the number of linearly-independent topologically-protected zero modes localized within each such R-type region is precisely equal to the number of monomers hosted by it in any maximum matching.
Thus, we obtain an alternate local proof of both of Longuet-Higgins's results using the structure theory of Dulmage and Mendelsohn. This has the advantage that it also provides a prescription for constructing an orthonormal basis of zero modes with a topologicallyprotected localization property. The unusual monomer percolation phenomena advertised earlier correspond to the percolation transitions of these monomer-carrying components of the Dulmage-Mendelsohn decomposition. Clearly, this has consequences for the corresponding quantum-mechanical wavefunctions.
Indeed, our identification of these Dulmage-Mendelsohn percolation phenomena has interesting implications for the zero-energy quantum percolation problem on such site-diluted bipartite lattices: In the two-dimensional case, a zero-energy quantum particle is localized for any nonzero dilution, albeit with a localization length that diverges as one approaches the n vac → 0 limit of vanishing site dilution. There is thus no quantum percolation in this case, even deep within the classically percolated phase in which a classical fluid diffuses unhindered through the lattice. This settles a question about which there has been some controversy in the literature [7][8][9][10][11]. In the three-dimensional case, we conclude that the quantum percolation transition at zero energy is associated with the Dulmage-Mendelsohn percolation transition of the diluted lattice. A more detailed discussion of implications of our work for quantum percolation is in Sec. VIII B and Sec. IX.
The structure theory of Dulmage and Mendelsohn also provides a way of factorizing the partition function of such maximally-packed dimer models. Each factor is either associated with a monomer-carrying R-type region, or a region of the lattice which is always perfectly matched in any maximum matching. Remarkably, this factorized structure is valid not just for the simplest case of non-interacting dimer models (with Boltzmann probabilities determined entirely by weights of occupied links), but also in the presence of monomer and dimer interactions, so long as the dimer interaction terms only act on flippable (perfectly-matched) elementary plaquettes or on larger flippable loops, and the monomer interactions are sufficiently short-ranged. The corresponding quantum dimer models share these potential-energy terms with their classical analogs. Additionally, they have monomer-hopping terms, as well as kinetic-energy terms that change the perfect matching of a flippable loop. For such quantum dimer models, this factorization property implies that any many-body eigenstate has a tensor product structure, with one factor associated with each of these regions.
Our computational study of two-dimensional site-diluted square and honeycomb lattices and the sitediluted cubic lattice in three dimensions reveals that the perfectly-matched regions remain finite and small in extent in the thermodynamic limit at any nonzero dilution. Thus, when there is no percolation of the monomercarrying regions, both dimer and monomer correlations in the classical model remain strictly localized. For the analogous quantum dimer models in two and three dimensions, this also implies the existence of a phase with area-law entanglement entropy of eigenstates in the middle of the energy spectrum. This is discussed in more detail in Sec. VIII C and Sec. IX.
To reiterate the main message of all of the foregoing: Maximum matchings probe the random geometry of regions with local sublattice imbalance in a site-diluted bipartite lattice. These regions host the monomers of any maximum matching and control the localization properties of a zero-energy quantum particle hopping on this lattice. The structure theory of Dulmage and Mendelsohn provides us a way of identifying a complete and non-overlapping set of such regions, as well as regions of the lattice that are always perfectly matched in any maximum matching. This motivates our large-scale computational study of the Dulmage-Mendelsohn decomposition of site-diluted bipartite lattices in two and three dimensions. Our computations uncover an interesting universality class of Dulmage-Mendelsohn percolation phenomena associated with the end-to-end connectivity of such monomer-carrying regions (see Fig. 2 and Fig. 3), with striking implications for maximally-packed classical and quantum dimer models on such lattices, and for the quantum percolation problem.
Having conveyed this main message, we now conclude this section by describing two other contexts in which our results yield interesting and significant conclusions. The first of these refers to a network [32,33] whose nodes represent individual Majorana modes engineered to exist on some physical platforms (for instance, a topological superconductor device [34,35]). The links of this network correspond to the dominant mixing amplitudes that couple these modes to each other. A missing node represents the absence of the corresponding localized Majorana mode, perhaps due to physical device being in the "wrong" regime of parameters due to disorder effects.
Although we expect these dominant mixing amplitudes to lead to most of the localized Majorana modes being lifted away from zero energy, we may ask: Are there any collective zero-energy Majorana fermion excitations of the network as a whole? If this network can be described by a site-diluted bipartite lattice in two or three dimensions, our results yield an interesting answer via a mapping to an equivalent problem of a quantummechanical particle hopping on this bipartite lattice: Every R-type region with an odd number of linearly independent zero modes of this hopping problem generically hosts a single such collective zero-energy Majorana fermion excitation which is perturbatively stable to the introduction of further-neighbor couplings in this net- The left (middle) panel shows the boundary of the largest three RA-type (RB-type) regions of the Dulmage-Mendelsohn decomposition of a honeycomb lattice sample with L = 4000 and periodic boundary conditions, with vacancy density nvac = 0.08. The right panel shows the largest two RA-type and RB-type regions of the same sample. The color-coding of the boundaries has been changed in the right panel (relative to the other two panels) for greater visibility. See Sec. I, Sec. II and Sec. III for further details. the largest RA (dark blue bulk with light blue surface) region and the largest RB-type region (red bulk with yellow surface) of the Dulmage-Mendelsohn decomposition of a site-diluted cubic lattice both percolate, as shown in this L = 100 example from two different vantage points in these panels. Right two panels: For smaller nvac, in each sample, either the largest RA-type region is small and the largest RB-type region percolates or vice-versa, as is evident from individual snapshots of the largest RA (dark blue bulk with light blue surface) and the largest RB region (red bulk with yellow surface) in the example shown. For such low dilution, each sample thus spontaneously breaks the sublattice symmetry that characterizes the ensemble of site-diluted lattices we consider. See Sec. I, Sec. II and Sec. III for further details.
work. The Dulmage-Mendelsohn percolation phenomena studied here thus have implications for the spatial structure of collective zero-energy Majorana fermion excitations of these networks, as discussed further in Sec. VIII B and Sec. IX.
The second of these contexts has to do with diluted quantum antiferromagnets. In a series of papers largely motivated by experimental work on impurity effects in quantum antiferromagnets [36], Sandvik [37] and Wang [38,39] reported on a detailed quantum Monte Carlo (QMC) study of the S = 1/2 Heisenberg antiferromagnet on the site-diluted square lattice. The basic conclusion was that long range antiferromagnetic order persists in the ground state all the way up to the classical percolation threshold of the diluted lattice. Indeed, it was argued that the critical percolating cluster at the geometric percolation transition has long range antiferromagnetic order in the ground state. As a result, the antiferromagnetic transition is driven by the underlying geometric transition, and occurs right at the geometric site-percolation threshold of the square lattice [38,39].
These studies also explored the physics of the antiferromagnetically-ordered phase in the vicinity of this transition [38,39]. An interesting finding in this regime was the presence of anomalously low-energy triplet excitations that dominated the low-energy behaviour of the ordered phase and had an effect on the quantum-critical scaling. These triplet excitations were argued to arise from regions of the lattice with local sublattice imbalance [39], and were therefore modeled in terms of the spatial monomer distribution function of the corresponding dimer model. Motivated by this, Henley and collaborators studied analogous phenomena on the Bethe lattice at percolation, and developed a Schwinger-boson mean field approach to this physics [40,41].
Our work adds to this understanding in a direct and crucial way, since the R-type regions constructed using the structure theory of Dulmage and Mendelsohn constitute a complete set of non-overlapping regions that host such low-energy triplet excitations of this diluted antiferromagnet. Our identification of incipient Dulmage-Mendelsohn percolation phenomena in the n vac → 0 limit, deep in the geometrically-percolated phase of the site-diluted square lattice, then implies that these triplet excitations of the diluted S = 1/2 antiferromagnet have their own hitherto-unsuspected critical behaviour deep inside the antiferromagnetic phase; this is unconnected with the geometric percolation driven antiferromagnetic transition itself. It also opens up interesting avenues for further work, which are discussed in Sec. VIII D and Sec. IX.
As mentioned earlier, this introductory discussion sets the stage for a more technical overview in Sec. II of our principal ideas and results. This more technical overview can be read in conjunction with Sec. VIII, which provides additional theoretical perspective and a more detailed discussion of the various implications of our results, and Sec. IX, which discusses interesting questions thrown up by our work. The middle third of our article describes the precise theoretical arguments, the computational methods, and the detailed analyses of numerical data that lead to these results. A road map through this part of the paper, and through the Appendix which augments this material, is provided at the end of Sec. II.

II. TECHNICAL OVERVIEW: MODELS, APPROACH, AND RESULTS
We consider site-diluted bipartite lattices such as the square and honeycomb lattice in two dimensions, and the cubic lattice in three dimensions, with compensated dilution n vac , i.e., exactly equal numbers of surviving sites on the two sublattices. On the one hand, we are interested in the classical statistical mechanics of maximally-packed dimer models, i.e., of the ensemble of maximum matchings of such lattices, with partition function where n rr (C) is the dimer occupation number of the link rr in maximally-packed configuration C, and exp(w rr ) (with real w rr > 0) is the associated bond weight. In contrast to the undiluted case, this ensemble of maximally-packed dimer configurations may have a nonzero number of monomers associated with it if the disordered lattice has no perfect matchings. The ellipses refer to monomer and dimer interaction terms. Monomer interactions are restricted to be short-ranged in nature, only coupling two monomers if they are at next-nearest neighbor sites (there can be no additional nearest-neighbour monomer interactions since monomers of a maximum matching already obey a nearest-neighbor exclusion constraint). The dimer interaction terms are defined on flippable (perfectly-matched) elementary plaquettes or larger flippable loops on the lattice (for the simplest such terms, see the precise definitions given in Fig. 4). Additionally, we are interested in the many-body spectrum of the corresponding maximally-packed quantum dimer models. The simplest such quantum dimer model inherits its "potential energy" terms from the corresponding classical dimer model defined above. In addition, it has two kinds of "kinetic energy" terms: ringexchange terms that flip the state of flippable elementary plaquettes with amplitude Γ, and monomer kineticenergy terms that hop a monomer to a next-nearestneighbor location with amplitude Γ (see Fig. 4 for the precise form of the Hamiltonian in the simplest case). Additional kinetic-energy terms that change the perfect matching of larger flippable loops are also permitted, as are terms that represent longer-range hopping of monomers (along an alternating sequence of unmatched and matched links).
On the other hand, we are also interested in the zeroenergy wavefunctions of a quantum-mechanical particle hopping with possibly random hopping amplitudes on links of such a lattice. The corresponding Hamiltonian for a gas of non-interacting fermions at zero chemical potential µ = 0 reads: where c † r creates a canonical fermion at site r of this sitediluted lattice and t rr is the possibly random hopping amplitude on nearest-neighbor link rr of this lattice.
Additionally, we are interested in characterizing topologically-robust zero-energy Majorana fermion excitations of a bipartite network of localized Majorana modes η r described by the Majorana Hamiltonian where a rr is a real antisymmetric matrix that describes the mixing between Majorana modes located on neighbouring sites of a site-diluted bipartite lattice. In all these settings, we focus on topologicallyprotected aspects which depend only on the connectivity of the underlying lattice while being insensitive to the actual values of the corresponding quantum amplitudes or weights or interaction energies. To achieve this, we rely crucially on the Dulmage-Mendelsohn decomposition [15,[29][30][31] of a bipartite graph. Using this structure theory, which provides us a classification of sites of a bipartite lattice into three types, we construct two classes of non-overlapping connected regions R i (i = 1 . . . N R ) and P j (j = 1 . . . N P ) that together cover the diluted lattice. These have an intuitively appealing characterization from the point of view of a maximally-packed dimer model defined on the diluted lattice: monomers of this maximally-packed dimer model are all confined to live in these R-type regions, which are of two types: R A -type regions in which the monomers only live on A-sublattice sites, and R B -type regions in which the monomers only live on B-sublattice sites. In any maximum matching, a given R-type region R i hosts the same fixed nonzero number I i of monomers. On the other hand, P-type regions are parts of the lattice that are always perfectly matched in any maximum matching.
For the maximally-packed classical dimer models defined in Eq. 1, we establish a complete factorization of the partition function, with each R-type and P-type region independently contributing one factor to the partition function. As a result, correlations between monomers of the maximally-packed dimer model are strictly localized within individual R-type regions, and dimer correlations are likewise localized within individual R-type and P-type regions. For the corresponding quantum dimer models, we demonstrate that an arbitrary many-body eigenstate can be written as a tensor product of eigenstates of individual R-type and P-type regions. As a result, the entanglement entropy across a cut that partitions the system into two halves is controlled by the typical size of the R-type and P-type regions that such a cut passes through. This implies that many-body eigenstates even in the middle of the spectrum have area-law entanglement entropy whenever R-type and P-type regions remain finite in the thermodynamic limit.
For the bipartite random hopping problem, we provide an alternative "local" proof of the graphtheoretic identity [15,27] between the number of topologically-protected zero-energy states and the number of monomers in any maximum matching of the underlying lattice. This local proof establishes that the wavefunctions of such topologically-protected zeroenergy states can be chosen to live entirely within individual R-type regions. I i such linearly-independent zero modes coexist on the A-sublattice (B-sublattice) sites of each R A -type (R B -type) region R i .
Our construction of this orthonormal basis implies a topologically-protected localization property of the basis-independent zero-energy on-shell Green function ∆G(r, r ): ∆G(r, r ) is nonzero if and only if r and r belong to the same R-type region. This has interesting consequences for the zero-temperature, zero-magnetic-field conductivity tensor of H F . For the Majorana Hamiltonian H Majorana , we show that each R-type region with an odd value for I generically supports a single topologicallyprotected zero-energy Majorana fermion excitation that is perturbatively stable to additional mixing amplitudes between further neighbors.
All of this motivates our detailed computational study of the random geometry of these R-type and P-type regions of site-diluted bipartite lattices in two and three dimensions. The basic picture that emerges from our results is common to the two-dimensional and threedimensional cases, and can be summarized thus: The P-type regions remain finite in extent in the thermodynamic limit over the entire range of dilution studied. At any nonzero n vac in the thermodynamic limit, we find a nonzero number density n R (n P ) of R-type (P-type) regions in the thermodynamic limit (Fig. 5), as well as a nonzero number density of "odd" R-type regions with odd imbalance I (this implies a nonzero density of collective topologically-protected Majorana fermion excitations of Eq. 3). These R-type (P-type) regions contain a nonzero fraction m tot (m P ) of sites of the undiluted lattice in the thermodynamic limit (Fig. 5), with the Rtype regions hosting a nonzero monomer density w in the maximally-packed dimer model of Eq. 1 (equivalently, a density w of topologically-protected zero modes in the bipartite hopping problem of Eq. 2).
In the small-n vac limit, we find that most of the sites of the diluted lattice belong to R-type regions, with m tot → 1 − n vac , although w → 0 quite rapidly as n vac → 0 (Fig. 5). In the range of system sizes (L) accessible to us, the typical size of R-type regions is much larger than the mean spacing between zero modes (Fig. 6) as n vac → 0; in fact it appears to be only limited by the finite size L (Fig. 7). The dominant contribution to both m tot and w in this regime comes from such large R-type regions ( Fig. 8 and Fig. 9). Moreover, in this regime, the dominant contribution to m odd tot , the fraction of sites of the undiluted lattice belonging to odd R-type regions, also comes from the large system-size-limited odd R-type regions (see Figs. 17, 18 in Appendix). This basic picture is equally valid in both two and three dimensions.
Going beyond this basic picture, we find interesting differences between the two-dimensional and threedimensional cases: In two dimensions, both R-type and P-type regions remain localized at any nonzero n vac accessible to our numerical study. However, the n vac → 0 limit exhibits incipient percolation of R-type regions. As a result, large-scale aspects of the random geometry of R-type regions are universal in the small-n vac limit. We obtain a numerical estimate of ν 2d = 5.1±0.9 for the correlation length exponent that characterizes this universality class, and place a bound η 2d 0.06 on the value of the corresponding anomalous exponent η. A particularly interesting aspect of this incipient Dulmage-Mendelsohn percolation phenomenon is that this nontrivial behaviour is exhibited in the limit of vanishing vacancy density, raising the intriguing possibility of being accessible to a rigorous analysis in the n vac → 0 + limit (although existing results in the mathematical literature are considerably weaker in nature [42,43]).
In contrast, on the cubic lattice in three dimensions, we find that there is a nonzero dilution threshold n crit vac = 0.5956(5) below which (above which) R-type regions percolate through the lattice (remain bounded in extent). In the vicinity of this Dulmage-Mendelsohn percolation transition, the random geometry of R-type regions exhibits critical scaling behaviour. We obtain a numerical estimate of ν 3d = 0.87 ± 0.1 for the correlation length exponent of this critical point, and place an upper bound η 3d 0.03 on the corresponding anomalous exponent. Immediately below n crit vac in the thermodynamic limit, there are two infinite (percolating) clusters, one R A -type region and another R B -type region. This behaviour persists throughout this intermediate-n vac phase, until n vac is lowered below n low vac = 0.35 (1), which represents the upper-limit of a small-n vac phase with spontaneous sublattice symmetry breaking. In this low-n vac phase, each random sample has exactly one percolating R-type region, which can be either a R A -type or a R B -type region.
Importantly, all these interesting phenomena occur well inside the geometrically percolated phase of the underlying site-diluted lattice. Clearly, this Dulmage-Mendelsohn percolation, incipient or otherwise, is a monomer-percolation phenomenon exhibited by the ensemble of maximum matchings of such lattices. In addition, for the classical statistical mechanics of the maximally-packed dimer models of Eq. 1, our twodimensional results, in conjunction with the factorization property of Z, imply that monomer and dimer correlation functions remain strictly localized for arbitrarily small but nonzero dilution (for caveats and details, see Sec. IV A and Sec. VII A). This is, at least at first sight, a rather remarkable property of such two-dimensional dimer models. Moreover, our results suggest that the corresponding monomer correlation functions could in principle exhibit critical scaling in the n vac → 0 limit, possibly with an additional set of independent exponents that characterize the long-distance behaviour of these functions in this limit.
Similarly, in three dimensions, our results imply a monomer-percolation transition at n crit vac . This separates an intermediate-dilution percolated phase from a highdilution phase in which monomers are strictly localized. Additionally, we predict a sublattice symmetry-breaking transition of the monomer gas at n low vac ; this transition separates a low-dilution percolated phase with spontaneous sublattice symmetry breaking from the intermediate dilution phase of the monomer gas.
For the corresponding maximally-packed quantum dimer models, our results immediately imply that manybody eigenstates even in the middle of the spectrum have area-law entanglement entropy for any nonzero n vac in the two-dimensional case (for caveats and details, see Sec. IV B, Sec. VII A and Sec. VIII C). For the threedimensional cubic-lattice quantum dimer model, our results imply the existence of a phase with such area-law behavior for n vac > n crit vac . In this three-dimensional case, the Dulmage-Mendelsohn percolation transition at n crit vac is thus expected to correspond to an entanglement phase transition separating a n vac < n crit vac phase with more conventional volume-law entanglement entropy from a n vac > n crit vac area-law phase (for details and a caveat, see Sec. IV B and Sec. VIII C).
For systems described by H F , the understanding developed here sheds considerable light on some long-standing questions in the literature on quantum percolation. Since ∆G(r, r ) is only nonzero when r and r both belong to the same R-type region, our results in two dimensions imply that there is no delocalized phase and therefore no quantum percolation transition for any nonzero n vac when the chemical potential is set to the particle-hole symmetric value µ = 0. Instead, our results point to the existence of a different universality class of scaling behaviour associated with incipient wavefunction percolation in the n vac → 0 limit (for caveats and details, see Sec. IV D, Sec. VII A, and Sec. VIII B).
On the cubic lattice with chemical potential µ = 0, our results establish the existence of a localized phase for n vac > n crit vac . Thus, for n vac ∈ (n crit vac , n geom vac ) (where n geom vac is the threshold for ordinary site percolation on the cubic lattice), quantum percolation is forbidden although a classical fluid percolates. In conjunction with the earlier literature, our results also imply that there is a delocalization transition precisely at the Dulmage-Mendelsohn percolation threshold n crit vac (for details and a caveat, see Sec. IV D and Sec. VIII B). This provides an interesting perspective on the nature of the quantum percolation transition in three dimensions. Further, our results strongly suggest that this delocalized phase has a sublattice symmetry-breaking transition at n low vac . Our results also have interesting consequences for bipartite Majorana networks described by H Majorana . In two dimensions, our results establish the existence of a nonzero density of topologically-robust zero-energy Majorana fermion excitations which are localized for any nonzero n vac . Additionally, they suggest the existence of an incipient Majorana percolation phenomenon, whereby these collective zero-energy Majorana fermion excitations undergo a delocalization transition in the n vac → 0 limit (for caveats and details, see Sec. IV D, Sec. VII A, and Sec. VIII B). In the three-dimensional case, our results establish the presence of a nonzero density of strictlylocalized zero-energy Majorana fermion excitations for n vac > n crit vac , and strongly suggest the existence of two different delocalized phases below n crit vac , separated from each other by the sublattice symmetry-breaking transition at n low vac (for details and a caveat, see Sec. IV D and Sec. VIII B).
The precise arguments and detailed analyses that lead to these results are presented in the next few sections: In Sec. III, we introduce the Dulmage-Mendelsohn decomposition of a bipartite graph. In Sec. IV, we use this graph-theoretical tool to construct the connected components R i and P j of interest to us, and establish key properties of these R-type and P-type regions. In Sec. IV A, we use these to derive consequences for monomer and dimer correlations in maximally-packed classical dimer models. In Sec. IV B, we discuss consequences for the entanglement entropy of arbitrary manybody eigenstates of the corresponding quantum dimer models. In Sec. IV C, we use these graph-theoretical ideas to provide an alternate proof of the correspondence, alluded to earlier, between the number of zero modes and the number of monomers. In Sec. IV D we establish a topologically-protected localization property of the onshell zero-energy Green function of the hopping problem, and discuss consequences for the conductivity tensor of the free-fermion Hamiltonian H F (Eq. 2) in the limit of vanishing temperature and external magnetic field. This is followed in Sec. IV E by an analysis of implications for the perturbative stability of collective Majorana fermion excitations of bipartite Majorana networks. In Sec. V, we describe our computational methods and define the various geometric quantities of interest to us. This is followed in Sec. VI by our results for the basic properties of the random geometry of R-type and P-type regions in the thermodynamic limit. Next, we provide a detailed account of our finite-size scaling analysis of Dulmage-Mendelsohn percolation in Sec. VII. This includes a precise characterization of the spontaneous breaking of sublattice symmetry in three dimensions. As noted earlier, the last two sections provide additional theoretical perspective and a more detailed discussion of the implications of our results, as well as some suggestions for promising lines of further enquiry. Finally, the Appendix is devoted to a description of the morphology of the largest R-type regions that dominate the low dilution limit, and to a separate study of scaling properties of Rtype regions with odd monomer number I. This separate study is motivated by the fact that such regions control the spatial structure of perturbatively-stable zero-energy Majorana fermion excitations of the Majorana networks defined in Eq. 3.

III. THE DULMAGE-MENDELSOHN DECOMPOSITION
The structure theory of bipartite graphs developed by Dulmage and Mendelsohn [15,[29][30][31] is simple and elegant, but perhaps not well-known to physicists. Here, we attempt to remedy this with a self-contained account, which also sets up our notation. Our description below follows the treatment of Ref. [31].
Consider the maximally-packed dimer model on a bipartite graph, i.e., with the proviso that the number of monomers is restricted to the minimum possible value.
Dulmage and Mendelsohn showed that this ensemble of maximum matchings defines a unique structural decomposition of the underlying graph, independent of the maximum matching one starts with.
To establish this, start with any one maximum matching, which has monomers at lattice sites h k , with k = 1 . . . W . Here, W is the number of unmatched sites in any maximum matching. Now, consider alternating paths starting from unmatched vertices that host monomers: These are paths that begin at any unmatched vertex h k , traverse any one of the unmatched (unoccupied by dimers) links emanating from it and subsequently go along an alternating sequence of matched and unmatched links of the lattice without visiting any site more than once or traversing any link more than once. Maximum matchings are characterized by the absence of augmenting paths, i.e., the absence of alternating paths of odd length which start and end at unmatched sites [15]. Indeed, if there was such a path, one could have added one more dimer to the system by switching the occupied and unoccupied links of this path, and the original matching would not have maximum cardinality.
Using this, one can classify sites into three groups, odd (o-type), even (e-type) and unreachable (u-type): Unreachable sites cannot be reached by such an alternating path starting from any monomer. o-type sites can be reached by an alternating path of odd length starting from some monomer. And even sites are those that can be reached by an alternating path of even length starting from some monomer; this class also includes the unmatched sites themselves. These three groups of sites are disjoint. To see this, we first note that the definitions themselves imply that the set of u-type sites is disjoint from the other two sets. Additionally, in a bipartite lattice, the same site cannot be reached both by even-length and odd-length alternating paths. For if this were possible, one would either have an augmenting path connecting two monomers, or have an odd-length alternating cycle (simple loop). Neither can exist since we are working with a maximum matching of a bipartite graph.
Further, it is easy to check that this decomposition is unique, no matter which maximum matching one starts with. To see this, one notes that the overlap of two maximum matchings consists of edges that are matched (covered by dimers) by both maximum matchings, vertices that are left unmatched by both maximum matchings, overlap loops, and overlap paths. Here, overlap loops are cycles whose links are alternately covered by dimers belonging to the two maximum matchings that have been superposed. Similarly, overlap paths are paths that start at an unmatched site of one maximum matching and end at an unmatched site of the other maximum matching; this is guaranteed by the fact that both matchings are maximum. These paths also have links alternately covered by dimers belonging to the two maximum matchings.
Sites left unmatched by both maximum matchings, and sites at the ends of links covered by dimers in both maxi-mum matchings clearly have the same type label (e-type, o-type, or u-type) starting with either maximum matching. Sites on the overlap loops also have the same type label starting from either maximum matching. Again, this is because the lattice is bipartite and all such cycles have even length. For the paths, we have already noted that they must start at an unmatched site of one maximum matching and end at an unmatched site of the other maximum matching. They must therefore have an equal number of matched edges (dimers) from each maximum matching and be even in length. Therefore, sites on the paths also have the same type label starting from either maximum matching.
Thus, this classification into odd, even and unreachable represents a fundamental structural property of the lattice itself. Additionally, it is clear from the definitions that every o-type site must be paired with some e-type site by a dimer in any maximum matching. Likewise, in any maximum matching, every u-type site must always be matched to another u-type site by a dimer, while an etype site can either be unmatched or matched with some o-type site by a dimer.
Next we note that two e-type sites cannot be connected by a nearest-neighbor link of the lattice. For if such a link were present, it would either imply the existence of an odd cycle in the lattice, or the existence of an augmenting path starting from a monomer in our matching; since we are dealing with a maximum matching of a bipartite lattice, neither is possible.
Additionally, an e-type site cannot be a neighbor of a u-type site either. To see this one first recalls from the above that any link between a u-type site and e-type site cannot have a dimer on it, since the e-type site is reached from a monomer of a maximum matching by an alternating path that ends in a matched link (covered by a dimer), and is therefore matched with an o-type site. Given this, it is clear that the existence of such a link, which must necessarily be unmatched, would allow for this alternating path to be extended to reach the u-type site via this unmatched link, contradicting the fact that it is unreachable.

IV. IMPLICATIONS
We now present five arguments that use this structural decomposition to provide information on i) monomer and dimer correlation functions of the maximallypacked classical dimer models defined in Eq. 1, ii) entanglement entropy of eigenstates of the corresponding maximally-packed quantum dimer models, iii) topologically-protected zero mode wavefunctions in the free-fermion problem Eq. 2, iv) the corresponding onshell zero-energy Green function ∆G and the conductivity tensor for the free-fermion Hamiltonian Eq. 2, and v) the perturbative stability of zero-energy Majorana fermion excitations in the corresponding bipartite Majorana networks with Hamiltonian Eq. 3.
All our arguments rely on using the Dulmage-Mendelsohn classification of sites into o-type, e-type and u-type sites to define a further decomposition of the lattice into non-overlapping connected regions R i (i = 1 . . . N R ) and P j (j = 1 . . . N P ). We define these as follows: Color all o-type and e-type sites red. Also color red all links that connect any o-type site to any etype site. Color all u-type sites blue. Also color blue all links between two u-type sites. And delete all links between u-type sites and o-type sites (e-type sites are never neighbors of u-type sites), as well as all links between two o-type sites (two e-type sites never have a link between them). Our lattice now decomposes into N R + N P connected components (due to the deletion of links described above): Red components R i (i = 1 . . . N R ) consisting of connected R-type regions, and blue components P j (j = 1 . . . N P ) consisting of connected P-type regions. From the definitions, it is straightforward to see that no monomers live in the P-type regions, which thus represent parts of the lattice that are always perfectly matched by any maximum matching. It is also clear that monomers of any maximum matching always live on etype sites inside a R-type region.
Additionally, we note that the boundary sites of any R-type region, i.e., those sites which have neighbors belonging to other R-type or P-type regions, are always of o-type. Further, since all o-type sites of a R-type region are matched to an e-type site of the same region, and monomers live only on e-type sites, this implies that the number of e-type sites in any R-type region exceeds the number of o-type sites in spite of all boundary sites being o-type. Additionally, we note that these R-type regions come in two "flavours" R A and R B : R A -type ( R B -type) regions have all their e-type sites on the A (B) sublattice and all their o-type sites on the B (A) sublattice.
Finally, we note that each P-type region P j itself can be uniquely decomposed into subregions defined to ensure that each overlap loop (in the ensemble of overlap loops obtained by superimposing any two perfect matchings of P j ) lives entirely within a single subregion of P j . This is related to the so-called "fine decomposition" of Dulmage and Mendelsohn used by Pothen and Fan in their algorithm for block triangularization of matrices [29,30]. As we will see in our subsequent discussion, this additional "fine structure" of P-type regions can play a potentially crucial role in determining the dimer correlation length of the maximally-packed dimer model, especially if P-type regions are large in size.
A. Maximally-packed classical dimer models: Monomer and dimer correlations We now consider the statistical mechanics of the maximally-packed dimer model, Eq. 1. Our discussion relies on three crucial observations: i) Any alternating path starting from an unmatched site of a maximum matching lies entirely within a single R-type region. In other words, the number of monomers inside any given R-type region remains the same in all maximum matchings. ii) The links deleted during our construction (of connected R-type and P-type regions) never host a dimer in any maximum matching of the full lattice. To establish this, we simply note that the boundary sites of any R-type region are o-type sites, which must be matched to e-type sites within that region itself, and all sites of any P-type region are matched among themselves within that region. iii) Since monomers only live on e-type sites of a R-type region, and the boundary of R-type regions is made up entirely of o-type sites, two monomers on next-nearest neighbor sites must lie in the same R-type region.
The last observation implies that each monomer interaction term in the maximally-packed dimer models defined in Eq. 1 always acts entirely within any one Rtype region. On the other hand, the second observation implies that each flippable (perfectly matched) elementary plaquette or larger flippable loop must lie entirely within a single R-type or P-type region. As a result, each dimer interaction term in Eq. 1 also acts entirely within a single R-type or P-type region. Therefore, the ensemble of maximum matchings can be generated from any one maximum matching by making all possible rearrangements of dimers (keeping their numbers fixed) independently within each connected component R i and each connected component P j .
Indeed, the classical statistical mechanics of the corresponding maximally-packed dimer models defined in Eq. 1 can be studied by separately considering each connected component R i and P j . Each P j is guaranteed to independently host a fully-packed dimer model, with its sites always being perfectly matched amongst themselves in any maximum matching, and inter-dimer interactions acting entirely within the region. On the other hand, since each R-type region hosts the same fixed nonzero number of monomers in any maximum matching, one has an independent maximally-packed dimer model defined on each R-type region, again with dimer and monomer interaction terms acting entirely within each region.
The classical partition function Z of such maximallypacked dimer models thus factorizes: This factorization is topological in nature, in the sense that it is independent of numerical values of the nonzero bond fugacities assigned to each link and only depends on the pattern of nearest-neighbor links of the graph. It immediately implies that monomer and dimer correlations do not extend beyond individual connected components. The foregoing implies that the typical size of a R-type region places an upper bound on the length scale over which correlations can propagate in the gas of monomers. For dimer correlations, the situation is more complicated, for the average two-point function of dimers has two contributions that could be very different from each other: One arising from dimer correlations within each R-type region (which hosts a maximally-packed dimer model) and the other arising from perfectly-matched Ptype regions (each of which hosts a fully-packed dimer model). In the site-diluted lattices we study, it is not a priori obvious which of these dominates the long-distance behaviour of the average two-point dimer correlator. The issue is the following: Although the P-type regions turn out to be typically much smaller than R-type regions in the regimes of interest to us, there is no a priori guarantee that their contribution is negligible compared to that of R-type regions. This is because the dimer correlation length in the R-type regions can potentially be much smaller than the typical size of such a region, due to the presence of a coexisting gas of monomers.
Clearly, a detailed numerical study of this question would depend on lattice-level details and is somewhat removed from the focus of the present work. Here, we confine ourselves to noting that this question becomes tricky if the dimer correlation length in R-type regions is of the same order of magnitude as the typical size of P-type regions. In this case, the fine decomposition of Ptype regions [29,30] comes into play: As we have noted in the previous section, each P-type region itself can be decomposed into subregions that are defined to ensure that overlap loops between two perfect matchings of P live entirely within a single subregion. This implies a further factorization of each Z Pi in Eq. 4, and means that dimer correlations cannot extend beyond these individual subregions identified by the fine decomposition. In this case, it is the typical size of these subregions that potentially plays the key role in determining the dimer correlation length.
B. Maximally-packed quantum dimer models: Entanglement entropy of arbitrary eigenstates This analysis also leads to an interesting conclusion about entanglement properties of eigenstates of a natural quantum version of the maximally-packed dimer model on such diluted lattices, whose Hamiltonian has been displayed in Fig. 4 and defined in the accompanying discussion in Sec. II.
From the properties of R-type and P-type regions established in Sec. III, and the analysis of the classical case presented in the preceding section, it is clear that all the terms in the Hamiltonian of such a quantum dimer model also act entirely within a single R-type or P-type region. This includes dimer potential-energy and kinetic-energy terms defined on arbitrary flippable loops, next-nearest neighbor monomer interactions, and any monomer kinetic-energy terms that hop a monomer by two or more units along an alternating path starting at that unmatched site. An immediate consequence of this is that every eigenstate of such quantum dimer models can be written as a tensor product of some eigenstates of the individual P-type and R-type regions: If these regions remain finite in extent in the thermodynamic limit, this immediately implies that |Ψ cannot have volume-law entanglement entropy even if it is in the middle of the many-body spectrum. Indeed, in this case, the entanglement entropy of |Ψ across a cut that partitions the system into two halves must have area-law scaling due to its tensor product structure. The prefactor of this area-law scaling will clearly be determined by the typical sizes of the R-type and P-type regions through which this cut passes. This conclusion holds independent of the values of Γ, Γ and V , including in the presence of quenched randomness in their values. We emphasize again that this tensor product structure of arbitrary many-body eigenstates is not entirely fragile: Any ring-exchange or dimer potential-energy terms that act on any flippable loops preserve this structure, as do next-nearest-neighbor interactions between the monomers, and monomer kinetic-energy terms that move a monomer along an alternating path. As noted in the preceding section, this is because any such flippable loop or alternating paths must lie entirely within a single P-type or R-type region, and two monomers on nextnearest-neighbor sites must also be in the same R-type region. However, an interaction between monomers on next-next-nearest neighbor sites can couple a R A -type region to a neighboring R B -type region. Therefore, in presence of such extended inter-monomer interactions, many-body eigenstates are no longer guaranteed to have this tensor-product structure.
C. Bipartite random-hopping models: Topologically-protected zero modes We now discuss topologically-protected zero modes of the hopping problem. By topologically-protected zero modes, we mean zero modes whose existence is robust to changes in the actual values of the hopping amplitudes, and depends only on the connectivity of the graph, i.e., on whether a given hopping amplitude is zero or nonzero. Such modes are robust to bond disorder, since their existence is unaffected by randomness or modulations in the hopping amplitudes.
In Ref. [20], the nonzero density of zero modes of the tight-binding model for diluted graphene was argued to arise from the presence of so-called R A -type ( R B -type) regions of the site-diluted sample, having more A (B) sites than B (A) sites, but a boundary consisting of only B (A) sites. From the foregoing discussion, it is now clear that the connected components R i defined here provide a consistent and convenient (not to mention elegant) construction of these R-type regions of Ref. [20], thus justifying the commonality of nomenclature.
This allows us to now make a precise argument for the number and structure of such topologically-protected zero modes: Let the number of o-type sites in any one particular R-type region be N o and the number of e-type sites be N e , with their difference I = N e − N o being a positive number equal to the number of monomers hosted by this region in any maximum matching. We now establish that there are I linearly-independent topologicallyprotected solutions of the zero-energy Schrödinger equation with the property that the corresponding wavefunctions are only nonzero on e-type sites within this one region. To establish this, we write down the system of Schrödinger equations that must be satisfied by any zeroenergy wavefunction of this form, and show that this is a rectangular system with N e variables and N o equations.
To see this, first note that the Schrödinger equation on all e-type sites of this R-type region is trivially satisfied since the wavefunction is zero on all the neighbours of each of these e-type sites. This is true because e-type sites always have o-type neighbours. This is guaranteed by the fact that there are no e-type sites on the boundary of a R-type region (as we have seen earlier, the boundary consists entirely of o-type sites). Further, all e-type neighbours of each o-type site belonging to this region, including e-type neighbours of o-type sites on the boundary of this region, lie within this R-type region. The zero energy Schröndinger equations to be satisfied by our wavefunction therefore reduce to N o constraints (one on each of the o-type sites of this region) that must be satisfied by N e variables (corresponding to the wavefunction amplitudes on the N e e-type sites of this region).
The minimum number of linearly independent solutions of this system of equations equals N e − r max , where r max is the maximum rank of the corresponding rectangular matrix with N e columns and N o rows. Since N o < N e , r max ≤ N o , implying the existence at least N e −N o ≡ I linearly independent solutions to these equations, independent of the precise values of the nonzero hopping amplitudes. Thus, we have established that each R-type region R i constructed using the structure theory of Dulmage and Mendelsohn contributes exactly I i topologically-protected zero modes in the quantum mechanics of a particle hopping on the lattice. The total number of topologically-protected zero modes is thus W = i I i , i.e., exactly equal to the number of monomers in any maximum matching of the lattice.
This argument provides an alternate proof of the wellknown graph-theoretic identity between the number of topologically-protected zero modes W and the number of monomers in any maximum matching [15,27]. In contrast to the standard "global" approaches [15,27] that makes use of determinants to prove this, our "local" argument uses the structure of the connected components R i to provide a constructive proof. In the process, we uncover a crucial aspect of the topologically-protected zero-energy eigenspace, namely, the fact that it is possible to choose a basis such that each zero-energy wavefunction of this basis lives entirely within one R-type region, with I i such basis functions co-existing in region R i (for i = 1 . . . N R ). D. Quantum percolation: On-shell zero-energy Green function and conductivity tensor Next, we consider the on-shell zero-energy Green function: ∆G(r, r ) ≡ α ψ α (r)ψ * α (r ) + h.c., where the sum on α is over W orthonormal zero-energy eigenfunctions that make up any choice of basis for the zero-energy eigenspace, and r and r represent position coordinates of points on the lattice. By construction, ∆G defined in this manner is independent of this choice of basis. It is therefore particularly convenient to evaluate it using the basis described above, i.e., consisting of I i orthonormal wavefunctions that live entirely within each R-type region R i for i = 1 . . . N R . There is of course considerable residual freedom in choosing an orthonormal set to span this I i dimensional subspace of wavefunctions that live within a single region R i . But again, ∆G is of course independent of these choices.
Employing such a basis, we see that ∆G(r, r ) in only nonzero if r and r both belong to the same R-type region. Thus, our argument identifies a generic maximallylocalized choice of zero-energy eigenbasis whose localization property is topological in nature and implies a corresponding topologically-protected localization property of the on-shell zero-energy Green function. While the behaviour of ∆G within any given R i will depend on the strength of the disorder in the hopping amplitudes in that region, our argument shows that ξ G , the localization length of |∆G| 2 (r − r ) (where the angular brackets represent averaging over quenched disorder) must be bounded above by the typical size ξ of R-type regions: To appreciate the consequences of this for transport properties of Eq. 2 at the particle-hole symmetric chemical potential µ = 0, we recall the careful analysis in Ref. [47] of the conductivity tensor σ(r, r ) of free-fermion systems. In the limit of vanishing temperature and magnetic field, σ(r, r ) has an expression given entirely in terms of ∆G(r, r ). As a result, the localization length ξ loc that governs the spatial dependence of σ (r − r ) (where the angular brackets again denote averaging over quenched disorder) must also be bounded above by the typical size of R-type regions: ξ loc < ξ.
Thus, if R-type regions remain bounded in size, this implies that the corresponding fermionic system is guaranteed to be an insulator at µ = 0 in the zero temperature limit. On the other hand, if the typical size of R-type regions diverges, ξ loc and ξ G could still be finite due to the localized character of the zero mode wavefunctions inside the largest R-type regions in a sample. Although this cannot be ruled out, one expects [48,49] that zero energy wavefunctions of bipartite random hopping problems are not exponentially localized, suggesting that a phase with delocalized R-type regions likely corresponds to a delocalization transition in the transport properties of the free fermions as well.

E. Bipartite Majorana networks: Zero-energy Majorana fermion excitations
Our understanding of the spatial structure of topologically-protected zero mode wavefunctions also leads us to an interesting conclusion regarding the nature of collective zero-energy Majorana fermion excitations of bipartite Majorana networks [32,33] modeled by the Hamiltonian: where r, r represent the locations of individual Majorana modes, a rr is a real antisymmetric matrix that represents the mixing of nearest-neighbor Majorana modes in this bipartite network, and we have neglected quartic interaction terms whose effects [50,51] are assumed unimportant in our analysis below. Specifically, we demonstrate that "odd" R-type regions (with odd values of I) are expected to generically host a single topologicallyprotected zero-energy Majorana fermion excitation of the bipartite Majorana network described by Eq. 6. This mode survives the leading effects of additional nextnearest-neighbor couplings. In contrast, R-type regions with even I do not generically host such robust zeroenergy Majorana fermion excitations.
To see this, we first use the bipartite structure of the network to define a new basis, whose B-sublattice components have a phase exp(iπ/2) relative to the corresponding components of the original basis in which a rr is written. The A-sublattice components of the new basis are identical to those of the original basis. In this new basis, the matrix of unperturbed mixing amplitudes is now a real symmetric matrix t (0) rr that defines a bipartite random hopping problem. The collective zero-energy Majorana excitations of the bipartite network are now obtained from the zero modes of this real symmetric matrix, which has matrix elements t (0) Thus, for bipartite networks whose geometry is that of a diluted bipartite lattice, this maps to a special case of the bipartite random hopping problem on such lattices (with real hopping amplitudes) [52].
Next, we consider the perturbative effect of weak additional mixing amplitudes that couple nearby modes on the same sublattice. In other words, we write iã rr = ia rr + iδa rr , where the real antisymmetric matrix δa represents weak mixing between modes on nearby sites of the same sublattice of this network. In the new basis, we have a random hopping problem defined by the hermitean hopping amplitudes t rr = t 0 rr + δt rr with To understand the effects of this perturbation on the collective zero-energy Majorana fermion excitations to leading order in perturbation theory, we study the effect of δt on the zero modes of the unperturbed random hopping problem. In order to do this, we must project the perturbation δt into the zero-energy subspace of t (0) rr and diagonalize the resulting matrix. This is greatly facilitated by thinking in terms of the maximally-localized choice of basis described in the previous section, with each unperturbed zero-energy basis state living entirely within any one R-type region, and I i such basis states coexisting in region R i for i = 1 . . . N R . Further, since t (0) is a real symmetric matrix, these basis vectors are real in this particular case.
Consider first a perturbation δt which mixes nextnearest neighbor sites on the same sublattice, but does not have any other nonzero matrix elements. Observe that all e-type sites (in the language of Sec. III) of a R A (R B ) region belong to the A (B) sublattice, and boundary sites of R A (R B ) regions are all o-type, and belong to the B (A) sublattice. Additionally, recall that the basis states have nonzero amplitudes only on e-type sites. As a result of this structure, the projection of δt rr into the unperturbed zero-energy subspace has a blockdiagonal form in this basis if δt rr is zero beyond nextnearest neighbor sites (in the sense of connectivity, not geometric distance). Indeed, when δt rr does not extend beyond next-nearest neighbor sites, this projection decomposes into N R independent blocks corresponding to the R-type regions of the lattice, with each region R i contributing a block of size I i × I i for i = 1 . . . N R .
Further analysis depends crucially on the fact that each of these I i × I i blocks inherits the "chiral" symmetry of the original problem [52], which guarantees that a positive eigenvalue +λ always has a partner at −λ. To see this is true, note that δt is pure imaginary and antisymmetric, and all states in our chosen basis of unperturbed zero modes are real, implying that each of these blocks is a pure imaginary antisymmetric matrix. As a result of this, one can immediately conclude that every region R i with odd I i will host at least one collective Majorana zero mode that survives the leading-order effects of any perturbation δt rr that does not extend beyond nextnearest neighbors. This lends additional significance to R-type regions with odd imbalance I. These robust zeroenergy Majorana excitations hosted by such R-type regions with odd I are also potentially stable to the leading effects of additional longer-range but weak mixing terms. However, this depends on the detailed morphology of the R-type regions and the range of the perturbation.
Finally, we note that the argument sketched here is somewhat analogous to earlier results in the onedimensional case of Majorana wires (as exemplified by paradigmatic Kitaev chain [35]). In the thermodynamic limit of the original Kitaev chain, there is one Majorana mode at each end of the chain in the topological phase. Together, these two form a single complex fermion whose wavefunction is split between the two ends of the chain, and whose energy goes to zero exponentially in the thermodynamic limit. In multichannel generalizations of this situation, one finds that Majorana modes at one end of the wire can generically mix due to additional local mixing terms in the Hamiltonian, leading to a situation in which each end of the wire has either a single Majorana mode or none, depending on whether the number of original Majorana modes at each end was odd or even [53].

V. COMPUTATIONAL METHODS AND OBSERVABLES
The arguments presented in Sec. IV motivate our computational study of the random geometry of the Dulmage-Mendelsohn decomposition of site-diluted bipartite lattices. We focus on random dilution of L × L honeycomb (square) lattices consisting of N sites = 2L 2 (N sites = L 2 ) sites in two dimensions, and cubic lattices with N sites = L 3 sites in three dimensions, with periodic boundary conditions and even L. In all our studies, we restrict ourselves to samples with compensated dilution, so that each random sample in our ensemble has an equal number of A and B sublattice sites. This ensures that our dataset is not "contaminated" by spurious effects associated with a global imbalance in the number of A and B-sublattice sites.
We consider two disorder ensembles in the twodimensional case: In one of these, we have chosen to impose a nearest and next-nearest neighbour exclusion constraint on the position of the vacancies to largely eliminate the possibility of subregions of the lattice disconnecting completely from the rest of the graph at small values of the dilution n vac . In the other ensemble, each site can independently be replaced by a vacancy, with dilution probability n vac . With the global compensation constraint in place in both ensembles, we find that the universal aspects we focus on here are independent of the exclusion constraints. Therefore, in the two-dimensional case, we only present the results for the ensemble with exclusion constraints. Likewise, we confine ourselves to a study of the independently-diluted case in three dimensions, again with global compensation.
Our tests of the efficiency of various maximummatching algorithms described in Ref. [54] suggest that the Breadth-First-Search (BFS) algorithm with pruning of search branches outperforms the others (including the Hopcroft-Karp algorithm which has the theoretical advantage in terms of worst-case complexity) in the twodimensional case at small dilution. On the other hand, on the diluted cubic lattice, we find that the Pothen-Fan algorithm outperforms both the Hopcroft-Karp and the BFS algorithm for vacancy densities smaller than ≈ 0.2, with the BFS algorithm being the worst of the three in this regime. However, the performance of the Pothen-Fan algorithm deteriorates very rapidly and dramatically when the vacancy density is increased beyond ≈ 0.2. For these higher vacancy densities, we have to rely on either the Hopcroft-Karp algorithm or the BFS algorithm. For vacancy concentrations up to ≈ 0.35, the Hopcroft-Karp algorithm outperforms the BFS algorithm, while the BFS algorithm outperforms the Hopcroft-Karp algorithm by a small margin for vacancy densities above ≈ 0.35.
To increase the computational efficiency of our matching code, we choose to use the maximum matching at a lower (higher) vacancy concentration to obtain an initial condition for the matching algorithm at the next higher (lower) concentration, and work out way up (down) a grid of concentrations; the two-dimensional data with exclusion constraints is obtained by working up along an ascending sequence of n vac , while most of the data without exclusion constraints is obtained by going down along a descending sequence of n vac . This gives us one random sample at each concentration. This process is then repeated many times to generate our ensemble. For most of the data shown, we use an ensemble consisting of at least 3000 samples at the largest sizes, with smaller size data being obtained by averaging over a substantially larger number of samples. In both two and three dimensions, we are limited by the time needed to find a maximum matching to sizes that correspond to N sites 10 9 .
Once we have constructed a maximum matching of the diluted lattice corresponding to a given sample, we can construct alternating paths starting from the unmatched sites to obtain the connected R-type regions R i (i = 1 . . . N R ) that contain the monomers of any maximum matching. In practice, we find it convenient to first use these paths to label each site odd (o-type), even (e-type) or unreachable (u-type) to obtain the Dulmage-Mendelsohn decomposition defined in Sec. III. Based on this labeling, we then use a simple and efficient burning algorithm to construct the connected R-type and P-type regions defined in Sec. IV. With this in hand, we measure a number of statistical properties of the random geometry of these regions.
The most basic of these are M tot = i m i , the ensemble-averaged sum of the mass (number of sites) m i of R-type regions R i , the ensemble-averaged number N R ( N P ) of R-type (P-type) regions in a sample, and the ensemble-averaged total number of zero modes W = i I i in a sample. Anticipating our results, we also define the corresponding densities in the N sites → ∞ thermodynamic limit: n R ≡ N R /N sites , n P ≡ N P /N sites , m tot ≡ M tot /N sites and w ≡ W /N sites . As expected, these densities are found to be self-averaging for large enough system size L, with smaller n vac requiring a larger size L for the self-averaging property to hold.
A nonzero w defines a length scale l w ≡ 1/w 1/d where d is the spatial dimensionality. This length scale l w is a measure of the "typical distance" between zero modes if one thinks of them as localized objects. Note however that this interpretation of l w cannot be taken too literally if the wavefunctions of individual modes are spread out on a scale comparable to l w , or, more precisely, if the length-scale ξ G that controls the spatial dependence of ∆G 2 (r − r ), is comparable to l w . Independent of this caveat, we will find it illuminating in our subsequent discussion to compare our results for l w to another length scale l vac , which represents the typical distance between vacancies: l vac = 1/n 1/d vac .
We also find it useful to separately keep track of the mass m A max (m B max ) of the largest (by mass) R A -type (R B -type) region in each sample. For the larger (by mass) of these two regions, we also keep track of the imbalance I max , and the size B max of the boundary. This size B max is defined in terms of the number of surviving links to the rest of the lattice from o-type sites that make up the boundary of the largest R-type region in each sample.
In addition, we also keep track of the number of deleted links D w max (D nw max ) that would have connected a e-type site (o-type site) of this region to a neighboring site that now hosts a vacancy. The superscripts "w" and "nw" stand for the labels "wavefunction" and "nonwavefunction". This refers to the fact that the zero mode wavefunctions of a R-type region have amplitude only on the e-type sites of the R-type region. The underlying intuition that motivates this measurement is this: Roughly speaking, since I = N e − N o (see Sec. IV C), we expect vacancies adjacent to e-type sites to "seed" zero modes in this R-type region, while vacancies adjacent to o-type sites "eliminate" zero modes.
It is also useful to measure a suitably-defined length scale associated with the size of large R-type regions. Two different but related length scales come to mind. The first is R max , the sample-averaged radius of gyration of the largest (by mass) R-type region in each sample. The second length scale ξ admits a natural interpretation as a correlation length associated with a sampleaveraged geometric correlation function C(r − r ) which gets a contribution of +1 from a sample if both r and r are in the same R-type region in that sample, and zero otherwise [2,3]. In terms of C, we define In classical bond percolation, this correlation function and the associated correlation length map to corresponding properties of the q-state Potts model in the q → 1 limit [2,3]. By thinking in terms of contributions of each R-type region to the double summation in Eq. 8, we see that ξ 2 is also the mean-square radius of gyration of R-type regions, with the average taken to be weighted by m 2 , where m is the mass of a R-type region. Thus, we have the expression The mean number densities nR and nP of R-type and P-type regions of the Dulmage-Mendelsohn decomposition of site-diluted honeycomb, square and cubic lattices are nonzero in the thermodynamic limit and decrease rapidly as we approach the limit of vanishing dilution nvac → 0 in all three cases. c), d), f): The sample-averaged total mass density of R-type regions, mtot, is nonzero in the thermodynamic limit and appears to tend towards 1 − nvac in the limit of small dilution nvac → 0 on the honeycomb, square and cubic lattices. This density is normalized to the number of sites in the undiluted sample, so that nvac + mtot + mP = 1 in all three cases (here mP is the corresponding sample-averaged total mass density of P-type regions). The sample-averaged density of zero modes, w, is also nonzero in the thermodynamic limit and goes rapidly to zero in the small-nvac limit. Sec. VI for details.
where the angular brackets in the first expression indicate averaging over the ensemble of diluted samples, and N m and R 2 m in the second expression are defined respectively to be the mean number of R-type regions of mass m and the mean square radius of gyration of such regions in this ensemble.
The toroidal geometry of our samples introduces a subtlety in our actual measurements of these length-scales and causes the strict correspondence between the two definitions of ξ in Eq. 8 and Eq. 9 to break down. This is because a definition of |r − r | (in Eq. 8) that takes this geometry into account would involve computing the 0.00 0.04 0.08 n vac vac , the mean distance between zero modes, lw = 1/w 1/d (where d is the spatial dimension), the correlation length ξ associated with the sample-averaged correlation function C(r − r ), and Rmax, the sample-averaged radius of gyration of the largest R-type region in a sample, plotted as functions of the vacancy density nvac. Note that ξ and Rmax track each other and both are much larger than the other two length scales in the small-nvac limit, being limited only by the system size L in this regime in the range of sizes accessible to our numerical work. See Sec. VI for details.
shortest distance on the torus between r and r for each pair of points r and r that belong to a particular Rtype region. The corresponding computational cost of using Eq. 8 to obtain ξ scales as L 2d if there is even one such region that contains O(L d ) sites; this introduces a significant inefficiency in the computation.
Fortunately, there is a simple work-around. The idea is to use Eq. 9 instead of Eq. 8 and compute the radius of gyration of each cluster using a method borrowed from the image-processing literature [55]: This involves making two passes through each cluster. The first pass computes a particular definition of center of mass which is well-adapted to the toroidal geometry and provides a sensible location of the center of mass of the cluster. The second pass then computes the mean-square shortest distance of each point of the cluster from this center of mass. In all our work, we adopt this definition of the radius of gyration of each cluster, and use Eq. 9 to obtain ξ 2 . Although the strict equivalence with Eq. 8 is now lost, we have checked that this computationally convenient redefinition shares all the qualitative features of the length-scale originally defined in Eq. 8.
In addition, we also probe the statistics of the mass m of R-type regions by measuring the susceptibility χ, which is defined in terms of the sample-averaged geometric correlation function C(r − r ) in the following natural way: where d is the spatial dimensionality. Finally, with an eye towards the possibility of a percolation transition associated with the random geometry of R-type regions, we keep track of the wrapping probability P which is defined in both two and three dimensions as the probability that a sample has at least one R-type region that wraps around in at least one direction around the torus. In addition, in the three-dimensional case, we also count, for each sample, the number of R-type regions that wrap around the 3-torus simultaneously in three independent directions, as well as the number of odd Rtype regions that wrap around the 3-torus simultaneously in three independent directions. These measurements are all made using efficient techniques borrowed from the percolation theory literature [56][57][58].

VI. RANDOM GEOMETRY OF THE DULMAGE-MENDELSOHN DECOMPOSITION: BASIC PICTURE
We now present the basic picture that emerges from our computational study of the random geometry of The sample-averaged radius of gyration, Rmax, of the largest R-type region in a sample, as well as the correlation length ξ associated with the sample-averaged geometric correlation function both increase rapidly in the small-nvac limit to values that are set by the system size L for accessible system sizes in two and three dimensions. In two dimensions, curves of the dimensionless ratios Rmax/L and ξ/L corresponding to different L do not cross over the range of nvac accessible to our numerics. However, on the cubic lattice, there is a clear crossing in the vicinity of nvac ≈ 0.6 (insets of c), d)). See Sec. VI for details.
the Dulmage-Mendelsohn decomposition of site-diluted square, honeycomb and cubic lattices. Our initial focus is on thermodynamic densities, which converge rapidly to a nonzero thermodynamic limit at any nonzero dilution n vac . This is followed by a characterization of the important length-scales that control the statistical properties of this decomposition.

A. Thermodynamic densities
How do M tot , N R , N P , and W scale with system size L and vacancy concentration n vac in the small-n vac limit? Our answers to these questions are displayed in Fig. 5.
From the behaviour of n R and n P for the honeycomb and square lattices in two dimensions and the cubic lattice in three dimensions, we see that n R and n P both tend rapidly to a nonzero value in the thermodynamic limit, with finite-size corrections that are not readily visible at the sizes studied here. The same also holds (although we have not displayed it) for n odd R , the corresponding number density of odd R-type regions. From the n vac dependence of these quantities, we see that these densities decrease rapidly to zero as n vac → 0 (Fig. 5).
From Fig. 5, we also see that the density of zero modes w on the honeycomb and square lattices in two dimensions and the cubic lattice in three dimensions saturates to a nonzero value in the thermodynamic limit, with finite-size corrections that are not readily visible in the range of sizes studied here. As expected, we also see that w tends to zero as n vac → 0. From Fig. 5, we see that m tot saturates rapidly to a nonzero value in the thermodynamic limit in both two and three dimensions, with finite-size corrections that are not visible in the size range studied. Moreover, it is apparent that m tot tends towards the value m tot = 1 − n vac as n vac goes to zero in both two and three dimensions. In conjunction with the behaviour of n R , this implies that the entire sample is taken over by a few R-type regions in this limit. This is, at first sight, a surprising and counter-intuitive result, since n vac = 0 corresponds to the pure lattice, which admits a perfect matching. In other words, at n vac = 0, we have w = 0. In the language of the Dulmage-Mendelsohn decomposition, there are no Rtype regions, and the undiluted system consists of a single P-type region at n vac = 0. How are we to reconcile this counter-intuitive result for m tot with the existence of perfect matchings at n vac = 0? As we detail below, the answer has to do with the order in which the thermodynamic limit and the limit of zero dilution are taken.

B. Diverging length-scales
As noted earlier, a nonzero density of vacancies is naturally associated with a length scale l vac that corresponds to the typical distance between vacancies, while a nonzero w defines a second length scale l w which is a measure of the "typical distance" between zero modes if one thinks of them as localized objects. If we take the thermodynamic limit at fixed nonzero n vac and then take the limit n vac → 0, we are studying the limit L/l vac → ∞, l vac → ∞. In this case, we see from our computations that m tot → 1 in this n vac → 0 limit. On the other hand, if we first send n vac to zero while keeping L fixed, and then take the thermodynamic limit, we are studying the limit l vac /L → ∞, L → ∞. This corresponds to the limiting case of the pure system for which m tot → 0, since the entire sample has a perfect matching.
From the n vac -dependence of w in the small-n vac limit, we see that l w /l vac → ∞ in this limit in both two and three dimensions, since w goes to zero very rapidly in the limit of small dilution. Thus, these two length-scales have a parametrically large separation l w l vac at small n vac . In other words, zero modes in this limit are a cooperative effect of a very large number of vacancies, both in two and in three dimensions.
We now come to the key observation that motivates much of our subsequent analysis. In Fig. 6, we plot R max , ξ, l w and l vac as a function of n vac for the largest sizes studied both in two and three dimensions, while Fig. 7 shows R max and ξ for different sizes as a function of n vac . It is clear from the displayed results that R max and ξ more or less track each other, with both these length scales dominating over l vac and over l w in the small-n vac limit in the two-dimensional case. For values of L accessible to our computational method, we see that both these length scales appear to be limited only by the system size L in the small-n vac limit in two dimensions. The three-dimensional case also displays similar behaviours, which set in much "earlier", i.e., below a much larger threshold in the vicinity of n vac ≈ 0.6. Thus, in the thermodynamic limit, we conclude that the typical size of a R-type region grows without bound in the n vac → 0 limit in two dimensions. In three dimensions, the corresponding quantity exhibits the same behaviour below n vac ≈ 0.6. The presence of this diverging length scale suggests that the random geometry of Rtype regions may be independent of lattice-scale details in the corresponding regimes both in two and in three dimensions. Additionally, as is clear from the insets in Fig. 7, we see that curves of R max /L and ξ/L corresponding to cubic lattices of different linear sizes L appear to cross near this threshold when plotted as a function of n vac . This suggests that this threshold is associated with a critical point in the cubic-lattice case.

C. Dominance of large R-type regions
In two dimensions, additional support for this scenario comes from a study of the n vac dependence of m small /m tot , where m small is the contribution to m tot from "small" R-type regions of absolute mass m < m * (n vac ). For the cutoff value m * , we choose m * (n vac ) = V small /n vac to ensure that clusters of mass m * can be expected on average to be associated with some fixed "small" number of vacancies, V small . Another characterization of m * (n vac ) is that it corresponds to the expected number of sites in a region of linear size l small = √ V small × l vac in the pure system. The results of this study with V small = 10000, i.e., l small = 100 × l vac , are shown in Fig. 8. Analogous results for the n vac dependence of I small /W , where I small is the contribution to W from these small R-type regions of mass m < m * (n vac ), are also shown in Fig. 8. From the data displayed in this figure, it is clear that the smalln vac limit is dominated by the physics of large R-type regions whose size diverges even when measured in units of l vac .
In three dimensions, a similar picture emerges in a somewhat more direct way from a study of the n vac dependence of m max /M tot , where M tot is the total mass in all R-type regions of a sample, the angular brackets denote an ensemble average, and m max is the mass of the largest R-type region in a sample, i.e., the larger of m A max and m B max . As is clear from Fig. 9, the largest R-type region contains a nonzero fraction of the total mass as well as a nonzero fraction of the total number of zero modes once n vac is reduced below a threshold in the vicinity of n vac ≈ 0.6.
When the dilution falls below a lower threshold in the vicinity of n vac ≈ 0.35, we see that this mass fraction climbs rapidly to 1, suggesting that a single large Rtype region dominates in the low-dilution regime below this lower threshold. This lower threshold is also associated with a cusp-like feature in I max /W , suggesting an abrupt qualitative change in the morphology of the largest R-type regions below this threshold.
In the context of the quantum percolation problem or in the context of diluted quantum antiferromagnets and closely-related particle-hole symmetric Hubbard models on such diluted lattices, the behaviour of I small /W ( I max /W ) in two (three) dimensions implies that the low-energy physics of interest to us is completely domi- nated by the fact that the typical size of R-type regions diverges in the limit of low dilution.
In the Appendix, we establish that odd R-type regions behave similarly to their even cousins, i.e., they also have a divergent typical size in the low-dilution limit both in two and in three dimensions. Before proceeding further, it is however useful to note the following distinction: In contrast to the above, the implication of this divergent typical size of odd R-type regions for the physics of Majorana networks is somewhat different. This is because each odd R-type region only hosts a single stable zero-energy Majorana excitation of the network. As a result, although long-distance properties are completely controlled by the large odd R-type regions, the density of collective zero-energy Majorana fermion excitations is not dominated by this large-scale physics, since small odd R-type regions also contribute significantly to this density.
Finally, in both two and three dimensions, we consider the probability P that a sample with periodic boundary conditions has at least one R-type region that wraps around the torus in at least one direction (Fig. 10). In classical percolation theory, the analogous quantity provides a simple diagnostic of the percolation transition [59,60]: If one plots this wrapping probability as a function of concentration for various L, these curves cross each other at the critical concentration. In the percolated phase, this probability is closer to 1 for larger L, while the unpercolated phase is characterized by the opposite behaviour. In our two-dimensional case, it is clear that the corresponding curves for different L all tend to 1 as n vac → 0, but there is no crossing point at any nonzero n vac accessible to our numerics. However, on the cubic lattice, we see that there is a clear crossing point in the vicinity of the previously identified threshold near n vac ≈ 0.6, signalling the presence of a percolation transition at a nonzero critical dilution in this three-dimensional case.
Thus, our two-dimensional results strongly suggest On the site-diluted cubic lattice below nvac ≈ 0.6, the largest R-type region contributes a nonzero fraction of the total mass in R-type regions in the thermodynamic limit, as well as a nonzero fraction of the total number of zero modes in this limit, as is clear from the corresponding sample-averaged ratios. See Sec. VI for details.
that the n vac → 0 limit in two dimensions may be fruitfully thought of in terms of universality and critical scaling associated with an incipient percolation phenomenon corresponding to a critical point at n vac = 0. By the same token, our three-dimensional results motivate a scaling description in the vicinity of the threshold n vac ≈ 0.6, associated with a transition to a percolated phase. In addition, the results at lower dilution point towards the possibility of a second transition near n vac ≈ 0.35 within the percolated phase. In two dimensions, this tends towards P = 1 in the small-nvac limit, but curves for different sample sizes L never cross each other at any nonzero nvac accessible to our numerical work. In contrast, for the cubic lattice, this rises sharply to P ≈ 1 around nvac ≈ 0.6, with curves of different sizes crossing one another at a sharply-defined threshold value. Sec. VI for details.

VII. RANDOM GEOMETRY OF R-TYPE REGIONS
With this motivation, we now present finite-size scaling analyses of incipient percolation of R-type regions in the n vac → 0 limit in two dimensions, and of the corresponding percolation transition in the vicinity of n vac ≈ 0.6 in the three-dimensional cubic-lattice case. In the threedimensional case, we also analyze the geometry of the largest R-type regions in the vicinity of a second transition deep within the percolated phase, and show that this is a sublattice symmetry breaking transition. A study of the analogous scaling behaviour of just the odd R-type regions, i.e., with odd imbalance I, is also interesting, since each such region hosts a robust zero-energy Majorana fermion excitation of the corresponding bipartite Majorana network described by Eq. 3. This is presented in the Appendix, which also describes our detailed study of the morphology of the largest R-type regions as a function of the dilution.

A. Overview of analysis
Our computational resources do not allow a study of large-enough samples in the small-dilution regime with n vac < 0.04 (n vac < 0.06) on the honeycomb (square) lattice at present. As a result, we cannot definitively rule out the possibility that there is a percolation transition at a small nonzero n h vac 0.04 (n s vac 0.06) on the honeycomb (square) lattice, rather than a n vac = 0 critical point on both lattices. Nevertheless, Occam's razor dictates that we first explore the simpler descrip-tion in terms of a n vac = 0 critical point in two dimensions. On the cubic lattice, we are limited to n vac 0.2. Fortunately, both transitions we study on the cubic lattice occur at much higher dilutions, and this limitation is therefore not a serious constraint. Thus, the main caveat attached to the various rather interesting implications of our two-dimensional results is the possibility of a percolation threshold at an extremely small but nonzero n vac , both on the square and on the honeycomb lattice. This possibility would leave all our conclusions unaffected for n vac greater than this currently-inaccessible threshold value in two dimensions, while possibly introducing a new regime with volume-law entanglement entropy in the phase diagram of the quantum dimer models of Sec. II and a delocalized phase of the particle-hole symmetric quantum percolation problem.
Guided by the usual finite-size scaling ideas [61], and by an analogy to the scaling picture of the standard geometric percolation transition [2,3,[59][60][61], we ask if various observables in the vicinity of n vac = 0 (n crit vac ) in two dimensions (on the three-dimensional cubic lattice), when rescaled by a suitable power of the system size L, collapse onto scaling functions of the scaling variableδ s ≡ n vac L 1/ν (δ s ≡ (n vac − n crit vac )L 1/ν ) for a suitable dimension-specific choice of ν, and, in the threedimensional cubic case, a lattice-specific choice of n crit vac . As in any such finite-size scaling analysis, one can either identify a single best-fit set of exponents that simultaneously achieves a good scaling collapse of all critical properties, or one can find the best-fit exponents individually for each critical observable. In the former approach, the observed quality of the simultaneous scaling collapse serves to validate the underlying scaling hypothesis. In the latter approach, the spread in the values of best-fit exponents obtained from different observables contribute to the uncertainty in estimates of various exponents. We have explored both approaches. Here, we display scaling collapses corresponding to a single set of best-fit exponents to provide the reader with a direct visual confirmation of scaling behaviour, but quote conservative error bars that are dominated by the spread in their values within the second approach.

B. Universal scaling at the percolation threshold
We begin by studying the scaling behaviour of the wrapping probability P . In Fig. 11, we see that this scaling hypothesis gives a rather good account of our data for these wrapping probabilities. Note that these wrapping probabilities, being dimensionless variables, require no rescaling by any power of L. Next, we consider another dimensionless ratio R max /L. In Fig. 12, we again see that this scaling hypothesis gives an extremely good account of the data for R max /L. Since ξ represents the correlation length associated with the sample-averaged geometric correlation function, we expect ξ/L should also exhibit scaling behaviour analogous to R max /L. In Fig. 13, we see that this is also the case.
Next, we consider two dimensionful quantities, the susceptibility χ, and the mean mass of the largest R-type region, defined as m max ≡ max(m A max , m B max ) . For the susceptibility, we expect that our data for χ/L 2−η at various L collapses onto a single curve when plotted versus the scaling variableδ s for n vac in the vicinity of the putative critical point. Similarly, we expect that m max /L d/2+1−η/2 exhibits an analogous scaling collapse. In Fig. 14, we see that these expectations are both borne out by our data.
We have also studied the scaling behaviour of the analogous observables constructed by considering only the 'odd' R-type regions, with odd imbalance I. Our basic conclusion is that restricting attention to such odd Rtype regions does not change the scaling picture. Some illustrative examples of this scaling behaviour are presented in the Appendix. From this, we conclude that the spatial extent of topologically-robust zero-energy Majorana excitations also exhibits the same critical behaviour in the n vac → 0 limit. However, we caution that their density n odd R does not exhibit critical behaviour since it is dominated by small odd R-type regions.
Although we have set η = 0 in all the tests of scaling displayed here, we emphasize that scaling collapses of comparable quality can also be achieved for η 0.06 in two dimensions, and η 0.03 in the cubic case. Nonzero values of η are generically accompanied by slightly larger best-fit values for the correlation length exponent ν. Folding the results of this systematic analysis into our error estimates, we conclude that ν 2d = 5.1 ± 0.9, ν 3d = 0.87 ± 0.1, η 2d 0.06, η 3d 0.03. Additionally, we obtain a rather accurate determination of the Dulmage-Mendelsohn percolation threshold of the cubic lattice: n crit vac = 0.5956(5).

C. Sublattice symmetry breaking transition
As noted earlier several properties dominated by the geometry of the largest R-type regions display a distinctive feature in the vicinity of a dilution of n vac ≈ 0.35 on the cubic lattice. Here we argue that this signals an interesting sublattice symmetry-breaking transition at a sharply defined threshold n low vac = 0.35 (1). To this end, we first note that just below the percolation threshold of n crit vac in the limit of large L, we find that there are always exactly two clusters that both wrap in three independent directions around the torus, and no clusters that wrap in just one or two independent directions. A closer inspection of our data reveals that one of these is always a R A -type region while the other is a R B -type region.
To explore this further, we keep track of the number of R-type regions in a sample that wrap in three independent directions in the large-size limit and study the dilution dependence of the corresponding ensemble average N xyz in the low and intermediate dilution regime. From Fig. 15, we see that this number drops from 2 to 1 at the threshold value of n low vac = 0.35 (1), with this threshold becoming increasingly sharp as one approaches the thermodynamic limit. Unlike the critical scaling in the vicinity of the percolation transition at n crit vac , the behaviour in the vicinity of this threshold has a distinct first-order character, with curves of N xyz corresponding to different L showing no crossing point. The corresponding number for odd R-type regions also shows a sharp drop, from a mean value of 1 to a mean value close to 0.5. We also monitor the sample average of the ratio |m A max − m B max |/m tot as a function of the dilution (Fig. 16). In the thermodynamic limit, we find that this exhibits a sharply defined transition, becoming nonzero as soon as we cross into the low-dilution phase with n vac < n low vac . This nonzero value serves as the order parameter corresponding to the spontaneous breaking of sublattice symmetry in this low-dilution phase.

VIII. DISCUSSION
We have provided convincing evidence for a new universality class of percolation phenomena in site-diluted bipartite lattices in two and three dimensions. These phenomena are associated with the end-to-end connectivity of regions of the lattice with local sublattice imbalance, identified using the structure theory of Dulmage and Mendelsohn. This geometric criticality is expected to be associated with monomer percolation in the maximally-packed dimer model on these lattices, with a transition from area law to volume law in the entanglement entropy of arbitrary many-body eigenstates of the corresponding quantum dimer models, with the quantum percolation transition of a zero-energy particle hopping on such a lattice, and with a Majorana percolation transition governing the spatial extent of collective zero-energy Majorana fermion excitations of the corresponding bipartite network of localized Majorana modes. In addition, these results on the geometry of regions with local sublattice imbalance also shed light on the origin of low-energy triplet excitations in diluted quantum antiferromagnets.
Here, we first provide a heuristic reinterpretation of our geometric results, and then discuss in more detail their implications in various contexts. In addition, we describe how our results lead to a unified perspective on two apparently disparate strands of recent work, one of which deals with dimer models on the quasi-periodic Penrose tiling, while the other studies tight-binding models on the same lattice.

A. Heuristic reinterpretation of results
At the risk of oversimplification, we present here a heuristic interpretation of our results on the random geometry of the Dulmage-Mendelsohn decomposition of diluted bipartite lattices. This is based on the coarsegrained Coulomb phenomenology of fully-packed dimers on the parent bipartite lattices [62][63][64][65]. As is well-known, in this way of thinking, a fully-packed dimer configuration corresponds to a divergence-free configuration of a polarization field P. An unmatched site on the A (B) sublattice corresponds to the location of a unit positive (negative) electric charge in this description.
Consider now the undiluted square or honeycomb lattice with boundary conditions that allow a perfect matching. If we remove exactly one A site (say the site A 0 at the origin) and look for a maximum matching, it is clear that any such maximum matching leaves one B site of the diluted lattice unmatched. The coarse-grained picture for this ensemble of maximum matchings then has a static unit charge +1 at the origin A 0 , and a mobile unit negative charge that is attracted to the origin by an entropically-generated logarithmic "Coulomb potential" V (r) = η m ln(r) of strength η m ; η m is expected to be equal to 1/2 for the square and honeycomb lattice dimer models [66][67][68]. At this coarse-grained level of description, one would say that this mobile charge can diffuse anywhere on the lattice, with the probability for being at distance r from the origin falling off in equilibrium as exp(−V (r)). In the three-dimensional case, the corresponding potential is the usual three-dimensional Coulomb potential of an isolated charge, V (r) ∼ −1/r at large r. In other words, one would heuristically expect that there is a single Dulmage-Mendelsohn region R A0 that spans the whole lattice. For the square and honeycomb lattices, this can presumably be verified by a direct free-fermion (Pfaffian) calculation of the lattice level partition function with exactly two monomers fixed at two given locations.
What about a nonzero density of vacancies? One may again think of each vacancy as being a static charge with a sign given by the sublattice. But it is no longer clear if the "screening cloud" provided by a mobile monomer on the other sublattice extends over all space. Indeed, in this heuristic language of fluctuating electrostatics, our results on the nontrivial geometry of R-type regions of slightly-diluted samples corresponds to a phenomenon whereby groups of vacancies seed a "screening-cloud" of mobile charges that are all of like sign, and confined to a finite-region of the lattice (corresponding to an individual R-type region).
The typical size of this screening cloud grows as ξ ∼ n −ν 2d vac in the n vac → 0 limit in two dimensions; on the cubic lattice, it undergoes a percolation transition at a nonzero dilution threshold n crit vac = 0.5956 (5). A curious aspect of our computational results is that such a group of vacancies does not correspond to static charges which all have the same sign, although vacancies with positive (negative) charge do outnumber those with negative (positive) charge whenever the screening cloud is made up exclusively of monomers of negative (positive) charge. The three-dimensional cubic lattice throws up another interesting phenomenon: Below a lower threshold of n low vac = 0.35 (1), only one spontaneously chosen kind of polarization cloud percolates in each sample, either with positive screening charge, or with negative screening charge. Screening clouds of the opposite charge remain finite in extent. This spontaneously breaks the sublattice symmetry of the underlying statistical ensemble from which diluted samples are drawn. In contrast, for n low vac < n vac < n crit vac , screening clouds of both signs of charge percolate through the sample.
It is natural to ask if this restatement of our results in Coulomb language suggests a scaling argument for the scaling behavior at low density. However, any attempt at a low-density scaling argument must first deal with the following difficulty: A key ingredient of any such argument would have to be some kind of large deviation estimate or Griffiths argument for the likelihood of having regions of a given size with extensive local sublattice imbalance and a site boundary made up entirely of sites belonging to the sublattice that is locally in the minority. It is this constraint on the nature of the boundary that complicates any such argument. We have not been able to overcome this difficulty, which represents an interesting question for future work.

B. Implications for quantum percolation and Majorana percolation
As we have already remarked earlier, R-type regions remain localized in two dimensions at any nonzero dilution n vac , no matter how small, but are characterized by a growing length scale ξ ∼ n −ν 2d vac in the limit of n vac → 0. Our results thus imply that the corresponding quantum percolation problem does not have a delocalized phase at any nonzero n vac . Further our results suggest that the low-dilution limit is associated with interesting scaling behaviour of the localization length ξ loc (defined in Sec. IV D) which determines the conductivity tensor of the system at zero chemical potential. However, there is a subtlety associated with this, which we now highlight.
Strictly speaking, our results only tell us that ξ loc < ξ. However, since bipartite random hopping problems are characterized by a localization length that diverges as the band center = 0 is approached, we expect that ξ loc will also grow without bound whenever ξ diverges [48,49], although we emphasize that we have not studied this directly in our work here.
Likewise, in three-dimensions, on the cubic lattice at µ = 0, our identification of a Dulmage-Mendelsohn percolation threshold n crit vac = 0.5956(5) serves to establish the presence of a localized phase of the fermion system for n vac ∈ (n crit vac , n gp vac ), where n gp vac ≈ 0.688 [2,3] is the geometrical site percolation threshold of the cubic lat-tice. With the same caveat as above, the particle-hole symmetric fermionic system with µ = 0 is thus expected to undergo a metal-insulator transition at n crit vac , allowing us to identify the Dulmage-Mendelsohn percolation transition with the µ = 0 quantum percolation transition. What about the sublattice symmetry-breaking that sets in below n low vac on the site-diluted cubic lattice? While the implications of this for transport in the free Fermi system are not entirely clear, it seems likely that a small Hubbard interaction U in this regime will give rise to interesting magnetic properties associated with the formation of local moments whose spatial profile is controlled by ∆G(r, r ). Some closely-related possibilities for followup work are discussed in Sec. IX.
Further, since R-type regions with odd imbalance I exhibit the same signatures of Dulmage-Mendelsohn percolation as those with even I, the spatial structure of the collective zero-energy Majorana fermion excitations of the corresponding Majorana networks is also expected to exhibit critical behaviour, with the same caveat as above.
All of this provides a strong motivation for a detailed numerical study of the character of the zero-energy wavefunctions inside the largest R-type regions that dominate both the low-dilution limit in two dimensions, as well as the critical regime in the vicinity of Dulmage-Mendelsohn percolation transition in three dimensions.
C. Area-law scaling of entanglement entropy in maximally-packed quantum dimer models These computational results on the random geometry of R-type and P-type regions thus imply, via the argument presented in Sec. IV B for the tensor-product structure of arbitrary many-body eigenstates, the existence of a phase with area-law entanglement entropy [44] of all many-body eigenstates of the corresponding maximallypacked quantum dimer models in two and three dimensions.
In particular, we expect that any nonzero dilution leads to such behavior in such site-diluted quantum dimer models on the square and honeycomb lattices in two dimensions. The prefactor of this area-law entanglement entropy is expected to diverge in the n vac → 0 limit. Similarly, on the three-dimensional cubic lattice, the corresponding site-diluted quantum dimer models are expected to exhibit such area-law behavior for n vac > n crit vac , with the prefactor of the area law again expected to diverge as n vac → n crit vac . Since these phases exist at dilutions well below the geometric percolation threshold both in two and in three dimensions, they represent genuinely interesting and nontrivial examples of area-law behavior.
Likewise, the Dulmage-Mendelsohn percolation transition at n crit vac is expected to correspond to a genuine entanglement transition to area-law behavior in such cubiclattice quantum dimer models. Curiously enough, the only uncertainty in connection with this expectation has to do with the existence of a volume-law entanglement entropy in the presence of disordered couplings in such quantum dimer models when R-type regions percolate for n vac < n crit vac ; while our arguments conclusively establish area-law behavior for n vac > n crit vac , they do not say anything definite about the existence of a volume law for n vac < n crit vac in the presence of quenched disorder in the coupling strengths.
As noted in Sec. IV B, the tensor-product structure of many-body eigenstates is not entirely fragile, and remains unaffected by additional ring-exchange and potential-energy terms defined on larger flippable loops, as well as next-nearest-neighbor interactions between monomers. Nevertheless, since the three-dimensional transition at n crit vac on the cubic lattice is entirely driven by the underlying Dulmage-Mendelsohn percolation transition, a generic many-body-localization critical point in higher dimensions, if it exists, will most likely be in a different universality class. Loosely speaking, the situation is akin to the ferromagnetic quantum phase transition of the two-dimensional or three-dimensional transverse field Ising model on a diluted lattice, which is also driven entirely by the underlying percolation transition of the diluted lattice. In that case, the percolation-driven transition [45] does share some features with the strongdisorder fixed point [46] controlling the ferromagnetic quantum phase transition of the random transverse field Ising model in higher dimensions. It is not clear if this will be the case for the geometrically-driven transition identified here, since area-law entanglement entropy is only one of several different ways in which many-body localization manifests itself [44]. In Sec. IX, we outline a specific suggestion for follow-up work that could address a related question.

D. Implications for diluted quantum antiferromagnets
Turning to implications for diluted quantum antiferromagnets, our work strongly suggests that it would be extremely interesting in follow-up work to repeat the earlier QMC calculations [38,39], but at much smaller values of n vac , corresponding to the low-dilution limit of the site-diluted square lattice, rather than the vicinity of the geometric percolation threshold studied earlier [38,39].
The rationale for this suggestion is as follows: In the valence bond picture, the antiferromagnetically-ordered ground state can be thought as a wavefunction with a broad distribution of valence bond lengths, so that two spins far away from each other have a sizeable amplitude for freezing into a singlet state. On the other hand, valence bond solid and spin liquid ground states are expected to be described by a wavefunction in which the valence bonds are short ranged [69] In the extreme limit in which the valence bonds only form singlets between nearest neighbour pairs of spins, it becomes possible make a precise connection between monomers in the maximally-packed dimer model on the one hand and dangling free spins in the magnet; these dangling free spins are expected to lead to a Curie tail in the low-temperature susceptibility. Conversely, in an antiferromagnetically-ordered state, monomers of the maximally-packed dimer model are not expected to be related in any such a direct way to dangling free spins.
Nevertheless, the results of Ref. [39] provide clear evidence that anomalously low-energy triplet excitations in the antiferromagnetically-ordered phase of diluted square lattice antiferromagnets have a spatial distribution that is accounted for quite well by identifying regions of the lattice which can host monomers in the maximallypacked dimer model. These results of Ref. [39] are in a regime with fairly high levels of dilution, close to the geometric percolation threshold.
Our heuristic explanation for this close correspondence between low-energy triplet excitations and monomercaryring regions is as follows: As noted above, if valence bonds were strictly restricted to nearest-neighbour links, monomers would correspond to dangling free spins. From this point of view, the main effect of the longer range valence bonds in the antiferromagnetically ordered phase is then to form singlets between two such would-be dangling spins on opposite sublattices of the square lattice.
Since monomers in a R A -type (R B -type) region all live on the A (B) sublattice, such a longer range singlet between two dangling spins would need to extend from one R A -type region to a neighbouring R B -type region. The longer the range over which such a singlet is formed, the lower is the corresponding binding energy or singlettriplet gap. Thus, our picture is that the low-energy triplet excitations found in the antiferromagneticallyordered phase in Ref. [39] correspond to triplet excitations of these longer-range valence bonds that connect a A-sublattice site of a R A region with a B-sublattice site of an adjacent R B region. This is plausible because our results show that R-type regions tend to be rather small in extent in the highdilution regime studied in Ref. [39]. As a result, the longer-range valence bonds characteristic of the antiferromagnetically ordered ground state can readily form singlets between two spins in two different R-type regions, giving rise to low but nonzero energy triplet excitations.
However, in the n vac → 0 limit, our results show that R-type regions are very large in size. In this regime, valence bonds that straddle two adjacent R-type regions would then need to be extremely long. Based on the picture developed above, the corresponding triplet excitation energy would be extremely small. In other words, slightly-diluted samples with very small n vac may be reasonably expected to carry a signature of the diverging length scale ξ ∼ n −ν 2d vac in their triplet excitation spectrum.
The analogous system in three dimensions, say a diluted quantum antiferromagnet on the cubic lattice, presents interesting questions of its own: Does the Dulmage-Mendelsohn percolation transition identified here correspond to a qualitative change in the nature of these triplet excitations? Does the sublattice symmetry breaking found deep in the low-dilution phase give rise to observable effects in the triplet excitation spectrum of the antiferromagnet? It would be interesting to explore these questions via QMC simulations, although the large length scales involved pose a significant challenge. Other closely related suggestions for follow-up work are described in Sec. IX.
E. An aside: Tight-binding and dimer models on the Penrose tiling There is a fairly large body of work going back nearly four decades, whose focus has been the spectrum of tightbinding models defined on quasiperiodic lattices, most prominently the Penrose tiling made up of rhombi. For instance, Kohmoto and Sutherland [70] noted that the hopping Hamiltonian for a quantum mechanical particle hopping along links of the Penrose tiling had extensively degenerate zero-energy states with localized wavefunctions. In subsequent work, Arai and collaborators provided a partial characterization of these localized wavefunctions by identifying certain geometric motifs that supported such states [71]. Based on this, they also arrived at a conjecture for the density of such localized zero-energy states in the thermodynamic limit.
In an insightful analysis [26], Koga and Tsunetsugu exploited the self-similar nature of the Penrose tiling to arrive at an essentially complete characterization of the geometric motifs that support such localized zero modes, and proved the conjecture of Arai and co-authors. Recognizing that local sublattice imbalance was an essential feature of these geometric motifs, Koga and Tsunetsugu also obtained a detailed characterization of the antiferromagnetic order that develops for infinitesimal onsite repulsion in the Hubbard model on this lattice. This local imbalance associated with these geometric motifs was also emphasized in very recent work [72].
On the other hand, in the recent work of Flicker and collaborators [73], essentially the same geometric motifs seem to arise as a by-product of their analysis of the density of monomers in any maximum matching of the Penrose tiling. Their result for the monomer density also corresponds exactly to the previously obtained density of zero modes of the hopping problem. Moreover, their characterization of regions accessible to monomers bears an uncanny resemblance to the earlier characterizations of the localized zero mode wavefunctions of the hopping problem.
Using the perspective developed here, we see that this is no coincidence: Indeed, it becomes apparent that the results of Koga and Tsunetsugu amount to an essentially complete analytic characterization of the Dulmage-Mendelsohn decomposition of the Penrose tiling, which was independently rediscovered in the context of maximum matchings by Flicker and collaborators. Moreover, this suggests other natural questions that appear worth studying. These are discussed in Sec. IX.

IX. OUTLOOK
The foregoing results and their interpretation lead us to identify several natural lines of enquiry. We conclude by listing some of these as suggestions for potentially fruitful follow-up studies.
We begin with a question about the n vac → 0 limit in two dimensions. Our results imply that there is a hierarchy of growing length scales in this limit: l vac ∼ 1/ √ n vac , l w ∼ 1/ √ w, and ξ ∼ 1/n ν vac , with ν 2d = 5.1 ± 0.9 ensuring that l vac l w ξ. Thus, R-type regions in finite-size systems with l w L ξ "look" critical, in the sense that that standard finite-size scaling ideas strongly suggest that their properties would be controlled by the incipient critical point at n vac = 0. Given that the critical point of classical percolation in two dimensions has conformal invariance [59,60], is there some sense in which a similar enlarged symmetry governs the behaviour of such R-type regions?
The next two questions concern both this critical regime in the vicinity of n vac = 0 in d = 2, and the corresponding critical regime in the vicinity of n crit vac = 0.5956(5) on the cubic lattice: The localization length ξ loc that controls the conductivity tensor of the free Fermi gas (Eq. 2) obeys the bound ξ loc < ξ, but may be expected to also diverge whenever ξ diverges (since the localization length in the closely-related bipartite random hopping problem is expected to diverge as one approaches the band center [48,49]). Is the ratio ξ loc /ξ in the critical regime characterised by an independent scaling exponent, and does this depend on the presence of hopping disorder? A similar question arises naturally for the monomer and dimer correlation lengths ξ M and ξ D in the associated maximally-packed dimer model (Eq. 1): What is the behaviour of ξ M /ξ and ξ D /ξ in this critical regime, and how does it depend on the strength of the bond-disorder?
Fourth, it would clearly be interesting to ask if some of the universal aspects of our results extend to situations in which bonds are diluted randomly instead of sites, both in two and in three dimensions. Fifth, there is the natural question of generalizing to other diluted planar bipartite graphs, most notably hyperbolic graphs similar to those studied recently in the context of circuit quantum electrodynamics [74] and network theory [75]. The percolation theory of such graphs is a well-developed subject in the mathematical literature [76], and it would be interesting to explore the possible critical behaviour of Dulmage-Mendelsohn clusters in this setting. The next suggestion has to do with a natural generalization to random regular bipartite graphs. For such graphs, there is no notion of geometric distance between vertices, but the question of the distribution of sizes of the Dulmage-Mendelsohn clusters remains interesting. This is because recent work has already identified interesting algorithmic implications of the size of a maximum-matching for some problems in computer science [77]. Given our arguments about the factorization of the monomer-dimer partition function into factors associated with Dulmage-Mendelsohn clusters, it would also be interesting to study the size distribution of Dulmage-Mendelsohn clusters in this algorithmic context. Given that some results for such graphs can be obtained analytically, there is also the intriguing possibility of obtaining some exact results in this setting.
The seventh question is a natural one regarding the statistics of overlap loops and paths. The dimer model on the undiluted square and honeycomb lattices has a useful coarse-grained description in terms of a compact scalar height field with a Gaussian action [62,63,65,78]. Dimer correlations, and correlations of test-monomers are readily related to correlation functions of this Gaussian theory, which can be computed exactly. In a certain well-defined sense, long-distance behaviour of correlation functions in these dimer models exhibit conformal invariance [78]. The double dimer model [79] consists of two independent copies of the dimer model, with partition function given by the square of the dimer model partition function. In such a double-dimer model, the interesting observables are overlap loops built by tracing closed paths that alternately go along dimers in one copy and then the other. This defines an ensemble of loops, which is related to the contour lines of a Gaussian free field theory [80,81] and exhibits conformal invariance in the scaling limit [79]. In the diluted case, the analog of the double-dimer model involves two copies of the maximally-packed dimer model. This defines an ensemble of overlap loops and paths. The question then arises: What is the statistics of these loops and paths in the small-n vac critical regime? The overlap loop ensemble defined by the fully-packed dimer model on the cubic lattice is also interesting [82]. This motivates a study of the corresponding ensemble of loops and paths on the site-diluted cubic lattice, particularly in the vicinity of the Dulmage-Mendelsohn percolation transition.
The next set of questions have to do with the physics of SU(N) antiferromagnets in a certain large-N limit [83] that has played a key role in the subsequent conceptual development of our understanding of quantum disordered phases of magnets. In these SU(N) models, one sublattice carries the fundamental representation and the other has SU(N) spins that transform under the complex-conjugate of the fundamental. Thinking in terms of the corresponding Hubbard model, each site has N different fermion orbitals with the constraint that the total fermion number of A-sublattice sites is 1, while that of B-sublattice sites is N −1. In the large-N limit of this model, the physics is dominated by the subspace of singlet states spanned by any fully-packed configuration of nearest-neighbor SU(N) singlet bonds, and reduces to a quantum dimer model on the lattice at leading order in 1/N . At large but finite N , longer-range valence bonds come into play. These models are amenable to explicit computational study, for example using Quantum Monte Carlo (QMC) methods that work in the basis of bipartite (but not necessarily nearest-neighbor) valence bonds [84][85][86]. Such computational approaches have been used to study the physics of these systems as a function of N , finding a transition from quantum antiferromagnetism at N = 2 to a valencebond solid state above a threshold value of N [87].
Clearly, the analogous large-N limit of the diluted model will exhibit interesting effects associated with the presence of a nonzero density of monomers in the corresponding maximally-packed dimer model, since these monomers are expected to be associated with SU(N) spinon degrees of freedom that cannot be quenched by short-ranged singlet bonds; this would be particularly interesting when the typical size ξ of R-type regions becomes very large. It would thus be interesting in the site-diluted case to revisit this large-N limit, and to study corresponding finite-N behaviour using QMC simulations. It would also be interesting to study resonating nearest-neighbor valence-bond wavefunctions [88,89] for such SU(N) antiferromagnets: In the pure case, these are singlet wavefunctions which map to interesting loop ensembles that interpolate between the classical dimer model and the double-dimer model [90][91][92]. In the sitediluted case, their generalizations will describe degenerate ground states with a nonzero spinon number [93][94][95][96][97][98], and map on to an ensemble of loops and paths closely related to the double-dimer model on the diluted lattice. It would be interesting to study the spinon localization properties of these wavefunctions on such slightlydiluted lattices, especially given the divergent size of the R-regions studied here. It seems likely that these studies will add to our understanding of the physics of dangling spins and local moments studied earlier [37][38][39][40][41].
Direct analogs of these questions are also potentially interesting in the context of Penrose tilings, since the results of Ref. [26] and Ref. [73] provide us with an essentially complete analytic determination of the Dulmage-Mendelsohn decomposition of the Penrose tiling. Although there is no geometric criticality at play in this case, it would clearly be of interest to i) use these analytical results to explore the physics of the Read-Sachdev large-N limit of SU(N) antiferromagnets, ii) perform a QMC study of the related physics at large but finite N , iii) to understand the nature of the SU(N) nearestneighbor RVB wavefunctions mentioned above, and iv) to study the closely-related ensemble of overlap loops and paths defined by the double-dimer model.
Next, we note that our results read in conjunction with those of Ref. [26] suggest some interesting questions about local-moment formation in the particle-holesymmetric Hubbard model on site-diluted bipartite lattices. Localized states tied to the Fermi energy µ = 0 are expected to be intimately connected with the physics of local moment formation [99] in such situations. How is this physics affected by the large length scale ξ associated with the typical size of R-type regions at low dilution, and by the presence of a nonzero density of coexisting zero modes in each such region? Does the topologically-protected nature of the zero modes lead to these moments being relatively robust to perturbations that preserve the particle-hole symmetry? How is the spontaneous breaking of sublattice symmetry reflected in the magnetic properties of this Hubbard model for n vac < n low vac on the cubic lattice? Given that recent scanning tunneling microscopy experiments [100] have detected direct signatures of π-electron magnetism associated with vacancies in undoped graphene, these questions may be of particular interest in the context of vacancy defects in graphene.
Our work also throws up interesting questions about the thermodynamic susceptibility of Kitaev's honeycomb model with nonmagnetic impurities. From the detailed analysis of vacancy effects in Ref. [24], it is clear that a vacancy-induced pile-up of low-but-nonzero energy fermionic excitations is associated with a weak singularity in the low temperature susceptibility. By analogy with the results of Ref. [21] on a SU(2) symmetric version [22] of the Kitaev model, the topologically-protected zero-energy states studied here are expected to lead to a stronger Curie-like singularity χ(T ) ∼ C/T in the linear susceptibility. How does the Curie coefficient C scale in the small-n vac limit of a weakly-diluted honeycomb lattice? And does the topologically-protected nature of the zero-energy states endow this Curie term with some degree of protection against time-reversal invariant perturbations such as a Heisenberg exchange term in the spin Hamiltonian?
Finally, we mention what is perhaps the most computationally-challenging suggestion for follow-up work: As noted in Sec. VIII C, our identification of a phase with area-law entanglement entropy of arbitrary eigenstates in a class of maximally-packed quantum dimer models (Fig. 4 of Sec. II) on diluted bipartite lattices relies on the existence of a particular tensor-product structure (derived in Sec. IV B) of arbitrary many-body eigenstates. Since this structure is disrupted by the presence of additional next-next-nearest-neighbor interactions between monomers (see Sec. IV B), the question arises: Can the computational methods of Ref. [101] and Ref. [102] be extended to study the effect of this additional interaction on the area-law phases identified here? Does this area-law behavior survive when this additional interaction is weak but nonzero?
As one goes through this list of suggestions for followup studies, it is clear that our work opens up a number of interesting and potentially fruitful lines of enquiry. It also becomes obvious that the elephant in the room throughout has been the bipartite nature of the underlying lattice. Are there natural generalizations to the nonbipartite case of any of the geometric questions studied here? Are the corresponding results in the small-dilution limit interesting? The answer to the first question turns out to be in the affirmative [103], motivating follow-up studies aimed at addressing the second question.

X. ACKNOWLEDGEMENTS
We are grateful to T. Kavitha and A. Mondal for introducing us to the literature on graph decompositions, D. Sen for stimulating discussions on the robustness of Majorana modes in various contexts, S. Roy for a useful discussion on many-body localization, D. Dhar and Mahan Mj for pointers to the literature on percolation theory, and D. Dhar for helpful comments on a previous version of this paper. We also acknowledge useful discussions with S. Bera, S. Bhattacharjee, R. Dandekar, F. Evers Statement of author contributions: RB and SB contributed equally to this work. SB and MMI devised efficient implementations of the various maximum matching algorithms [54] used here. SB devised an efficient implementation of a burning algorithm to obtain all Rtype and P-type regions from a single maximum matching. RB and SB (RB) performed a study of the twodimensional (three-dimensional) case using these computational tools. RB, SB, and KD (RB and KD) analyzed the two-dimensional (three-dimensional) data. KD conceived and directed this project, and wrote the manuscript based on results from these data analyses. where m odd small is the contribution of "small" odd R-type regions (with odd imbalance and less than 10000 vacancies associated with them) decreases rapidly with nvac. This may be compared with results discussed in Sec. VI Appendix: Scaling of odd R-type regions and morphology of the largest R-type regions As noted in Sec. VI, the density n odd R of odd R-type regions is nonzero in the thermodynamic limit, signalling the presence of a thermodynamic density of collective zero-energy Majorana fermion excitations of bipartite Majorana networks described by Eq. 3. Here, we provide evidence of critical behavior of such excitations associated with large odd R-type regions. This is important from the perspective of perturbatively stable Majorana modes hosted by the odd R-type regions. Indeed, this scaling analysis confirms that these collective zeroenergy Majorana fermion excitations of bipartite Majorana networks defined on such diluted lattices also exhibit the Dulmage-Mendelsohn percolation phenomena that are the focus of our work.
In Fig. 17, we display the fraction of the total mass of "odd" R-type regions that is contained in small "odd" regions, with smallness defined as in the main text: small regions have absolute mass m < V /n vac (with V = 10000). Again, we see that most of the mass is in large odd regions. A similar conclusion follows more directly in the cubic-lattice case: In Fig. 18, we display the fraction of mass of odd R-type regions contained in the largest such region in the diluted cubic lattice. Again, we see that below n vac ≈ 0.6, the largest odd region contains a nonzero fraction of the total mass of such regions in the thermodynamic limit, with this fraction displaying another kink-like feature at n vac ≈ 0.35. As in the main text, we attribute this feature to the sublattice-symmetry breaking transition studied in Sec. VII C In Fig. 19, we display the scaling of ξ odd , the correlation length that characterizes the sample-averaged odd correlation function C odd (r − r ) (to which each sample contributes 1 if both r and r belong to the same odd R-type region in that sample, and zero otherwise). The scaling behaviour of the corresponding odd susceptibility On the cubic lattice, the largest odd R-type region contains a nonzero fraction of the total mass in such odd regions below a threshold value of nvac ≈ 0.6 in the thermodynamic limit, as is clear from the sample average of this fraction shown here. This may be compared with results discussed in Sec. VI χ odd is displayed in Fig. 20 and Fig. 21. The same figures also shows the scaling of the mass of the largest odd Rtype region. All these scaling analyses confirm that the odd R-type regions share the same critical behaviour as the other R-type regions.
Next we turn to the morphology of the largest Rtype region in a finite-size sample. This is interesting especially since very large finite-size-limited R-type regions dominate many observables at low dilution both in two and in three dimensions. A plausible estimate for the size B max of its boundary is as follows: In the classical theory of percolation, the analog of B max is expected to scale as m max . This is due to the fact that B max counts both the size of the internal boundary (due to voids or holes) and the actual size of the external boundary. As a result, the boundary B max of the critical cluster in classical percolation is dominated by the contribution of these voids. In our case, it is reasonable to assume that the analogue of voids are P-type regions since it is clear from our results that the size of the typical P-type regions remain very small over the entire range of n vac studied, both in two and three dimensions. We estimate that there are of order m max n P such voids, with each void contributing (m P /n P ) (d−1)/d on average to B max . Assembling this information, we arrive at the estimate B max ∼ m max n P × (m P /n P ) (d−1)/d in d spatial dimensions. Based on this argument, we expect B max /m max ∼ m (d−1)/d P n 1/d P . In Fig. 22 and Fig. 23, we see that this expectation is borne out by the data, with the corresponding ratio converging very quickly to its thermodynamic limit in two and three dimensions (indeed, at the sizes we study, no finite-size effects are readily discernible). In two dimensions, this ratio has a very b) The same quantity on three-dimensional L × L × L cubic lattices, plotted as a function of nvac. Note that curves corresponding to different L display a sharp crossing at a threshold n crit vac near nvac ≈ 0.6. Right panel: Data for ξ odd from L × L × L cubic lattices in the vicinity of this threshold collapses onto a single scaling curve when plotted as a function of the scaling variable (nvac − n crit vac )L 1/ν . This may be compared with results discussed in Sec. VII. mild and nonsingular dependence on n vac in the limit of small dilution. On the cubic lattice, we see a noticeable cusp-like feature near n vac ≈ 0.35, which we attribute to a second sublattice-symmmetry-breaking transition (see Sec. VII C).
In Fig. 22 and Fig. 23, we also see that the ratio I max (1 − n vac )/(wm max ) saturates quickly to the thermodynamic limit in both two and three dimensions, with no discernible finite-size corrections in the range of sizes studied here. In two dimensions, it has a O(1) value with a mild dependence on n vac . Thus, the largest R-type region has a zero-mode density not very different from the globally averaged density w of zero modes. On the cubic lattice however, this ratio decreases noticeably with n vac inside the percolated phase, suggesting that the percolated infinite cluster has a systematically smaller density of zero modes compared to the sample-wide average in FIG. 20. a) For appropriate choices of ν and η, the susceptibilities χ odd associated with the odd analog C odd (r − r ) of the sample-averaged geometric correlation function constructed from just odd R-type regions of two-dimensional L × L samples, when rescaled by L 2−η , collapse on to a single curve when plotted as a function of nvacL 1/ν for small nvac. b) The masses m odd max of the largest odd R-type region in twodimensional L×L samples, when rescaled by L d/2+1−η/2 with d = 2, show analogous scaling behaviour. This may be compared with results discussed in Sec. VII. the limit of low dilution.
Next, we consider D nw max + D w max , which is a measure of the number of vacancies adjacent to sites that belong to the largest R-type region (more accurately, it measures the number of links deleted due to these vacancies). If this region is not atypical in terms of the overall density of vacancies associated with it, one would expect (D nw max + D w max ) ∼ n vac m max /(1 − n vac ). In Fig. 25 and Fig. 24, we see that the corresponding ratio saturates very quickly to the thermodynamic limit and has a very mild nonsingular dependence on n vac , thus confirming this basic picture in both two and three dimensions.
Turning to D w max − D nw max , we begin by making more explicit our intuitive picture for the zero modes and monomers hosted by R-type regions: A local imbalance in the numbers of surviving sites on the two sublattices gives rise to these zero modes and mandates the existence of monomers in this region. Since the largest R-type re-gion hosts a nonzero monomer density, these regions must be atypical, in that D w max − D nw max must scale in the same way as D w max + D nw max , i.e., proportional to m max , rather than exhibiting the √ m max scaling that would characterize a truly random region of the diluted lattice. In Fig. 24 and Fig. 25, we see that this expectation is also borne out by the corresponding ratio, which converges rapidly to the thermodynamic limit, again with no discernible finite-size corrections for the range of sizes studied here. Moreover, the n vac dependence of this ratio broadly resembles that of the ratio I max (1 − n vac )/(wm max ). where mmax is the mass of this region and nP (mP ) is the total number (mass) density of P-type regions in the thermodynamic limit in d = 3 dimensions on the cubic lattice. The cusp in the ratio near nvac ≈ 0.35 is attributed to a sublattice symmetry breaking transition (Sec. VII C). From the sample average of the corresponding ratio, Imax, the number of zero modes hosted by the largest R-region scales with wmmax/(1 − nvac), with no discernible finite-size corrections. On the cubic lattice, the corresponding proportionality constant decreases rapidly in the low-dilution limit. As is evident on the cubic latice from the corresponding sample-averaged ratios, D nw max + D w max scales with nvacmmax/(1 − nvac), and D nw max − D w max scales with D nw max + D w max , again with no discernible finite-size corrections even in the low-dilution limit. The nvac dependence of (D nw max − D w max )/(D nw max + D w max ) broadly resembles that of Imax(1 − nvac)/wmmax in the low-dilution limit. As is evident on the square and honeycomb lattices from the corresponding sample-averaged ratios, D nw max + D w max scales with nvacmmax/(1 − nvac), and D nw max − D w max scales with D nw max + D w max , again with no discernible finite-size corrections even in the low-dilution limit. The nvac dependence of (D nw max −D w max )/(D nw max +D w max ) broadly resembles that of Imax(1 − nvac)/wmmax in the low-dilution limit.