Exploring the high-pressure materials genome

A thorough in situ characterization of materials at extreme conditions is challenging, and computational tools such as crystal structural search methods in combination with ab initio calculations are widely used to guide experiments by predicting the composition, structure, and properties of high-pressure compounds. However, such techniques are usually computationally expensive and not suitable for large-scale combinatorial exploration. On the other hand, data-driven computational approaches using large materials databases are useful for the analysis of energetics and stability of hundreds of thousands of compounds, but their utility for materials discovery is largely limited to idealized conditions of zero temperature and pressure. Here, we present a novel framework combining the two computational approaches, using a simple linear approximation to the enthalpy of a compound in conjunction with ambient-conditions data currently available in high-throughput databases of calculated materials properties. We demonstrate its utility by explaining the occurrence of phases in nature that are not ground states at ambient conditions and estimating the pressures at which such ambient-metastable phases become thermodynamically accessible, as well as guiding the exploration of ambient-immiscible binary systems via sophisticated structural search methods to discover new stable high-pressure phases.


I. INTRODUCTION
The laws of thermodynamics dictate that only compounds corresponding to global minima of the Gibbs free energy for a given set of external conditions are viable ground states with infinite lifetimes (1). For such materials, there always exists a synthetis route that follows an overall exothermic chemical reaction pathway, and all systems at finite temperature will ultimately attain a Boltzmann distribution with a high occupation of the ground state in thermodynamic equilibrium. In practice, however, materials in many industrially relevant applications are metastable, i.e., they have higher energies than the equilibrium ground states. Such metastable phases, or polymorphs, correspond to local minima on the energy landscape and are surrounded by sufficiently high barriers to render them kinetically persistent on a finite time scale (2,3).
Synthesizing metastable materials essentially requires finding, in some manner, a path in configurational space such that precursors undergo chemical reactions along a downhill trajectory with sufficiently low activation barriers, until the desired product is formed and quenched (4,5). A plethora of thermodynamic parameters can be tuned to design such a pathway, including temperature, pressure, electromagnetic fields, compositional variations, choosing specific precursor materials, etc. A special case of this design procedure is to choose a set of thermodynamic parameters such that the desired phase becomes the thermodynamic ground state at the chosen conditions, where it forms at equilibrium, and can be recovered as a metastable phase at ambient conditions if all transition barriers leading away from it are sufficiently high (6).
This problem of identifying the ground states for a given set of external conditions is commonly tackled in the computational materials discovery community through global optimization of a target fitness function, using advanced crystal structure prediction (CSP) methods (7). Ideally, this fitness function corresponds to the Gibbs free energy, but it is often approximated by the potential energy (at zero pressure, temperature) or the enthalpy (at zero temperature) or some other biased energy landscape, and is sampled in an unconstrained manner in the configurational space. Many novel materials and their structures have been resolved using CSP at high pressures (8)(9)(10)(11), using chemical pressure and thermal degassing (12,13), as 2-dimensional materials (14)(15)(16), or at surfaces and interfaces (17)(18)(19)(20)(21). However, CSP approaches are computationally demanding and their applications are therefore often limited to small subsets of chemical spaces.
On the other hand, data-driven approaches using large materials databases in conjunction with high-throughput (HT) density functional theory (DFT) calculations have become increasingly popular in materials science (22)(23)(24)(25)(26). Such HT databases usually contain DFT-calculated properties such as formation energy, equilibrium volume, and relaxed atomic coordinates for experimentally reported phases available in repositories such as the Inorganic Crystal Structure Database (ICSD) (27). These datasets are sometimes augmented with hypothetical compounds constructed by decorating common structural prototypes with elements in the periodic table. Subsequent phase stability analysis is often performed to identify stable phases in every chemical space. Although approaches using such HT-DFT databases are useful for efficient large-scale analysis of energetics across a wide range of chemistries, they lack the power to predict novel materials with unknown crystal structures, and phases beyond ambient conditions since all such databases currently contain only materials properties calculated at zero temperature and zero pressure.
In this work, we effectively combine big-data in HT-DFT databases with CSP methods to predict and discover novel materials stable at non-ambient conditions. Using the "implicitly available" high-pressure information in a HT-DFT database, the Open Quantum Materials Database (OQMD), together with a simple approximation to the formation enthalpy of a compound, we study the effect of pressure on the thermodynamic scale of stability/metastability of inorganic compounds. Our model correctly predicts most (75-80%) experimentally reported high-pressure elemental and binary phases to become thermodynamically stable at non-ambient pressures. In fact, our statistical analysis of the data in the OQMD shows a large fraction of ambient-metastable compounds to be thermodynamic ground states at nonzero pressures. From an experimental point of view, vast unexplored pressure-composition space is becoming widely accessible through diamond-anvil cell techniques (28), and improving predictive methods for highpressure phases is, e.g., relevant to geophysical studies of planetary interiors where there can be numerous polymorphs energetically in close proximity, even in relatively simple compositional systems (29). Here, we use our model to sample all binary intermetallic chemical spaces with no experimentally reported compound in the OQMD (∼1780 chemical spaces) and predict nearly 3800 new compounds to be stable at some finite pressure. Finally, we demonstrate the power of our predictive framework in guiding sophisticated CSP methods by explicitly exploring ten binary-immiscible systems, and discover that our model correctly predicts phase spaces containing novel high-pressure materials, which could be potentially recovered to ambient conditions as metastable compounds.
Let us introduce a model to approximate the enthalpy of a phase to efficiently evaluate the phase stability of hundreds of thousands of compounds in a large chemical space at arbitrary pressure.

A. Linear approximation to enthalpy
At zero temperature and pressure p, the Gibbs free energy for a given phase reduces to the enthalpy H = E + pV , where E is the internal energy and V is the volume of the phase. Expanding H as a function of p around the equilibrium pressure p 0 yields (30) where ∆p = (p − p 0 ), and B = 1 β is the bulk modulus of the phase, where β = − 1 V ∂V ∂p is its compressibility. If we neglect all terms higher than second order and consider all phases to be incompressible (i.e., B(p 0 ) → ∞), for equilibrium pressure p 0 = 0, we can approximate the enthalpy of a phase simply as where E 0 is the internal energy at the equilibrium volume V 0 . Conveniently, both E 0 and V 0 are quantities that are readily available for hundreds of thousands of phases in most HT-DFT materials databases such as the OQMD (22,23), Materials Project (24), and AFLOWlib (25). The above linear approximation to enthalpy (henceforth referred to as "LAE") is illustrated in an energyvolume diagram in Fig. 1a, where the ground state and two metastable states are each represented by their respective equation of states (EOS) E(V ), i.e., their energy as a function of volume, approximated by parabola. The negative slopes of the common tangents connecting the EOS of neighboring phases represent the pressure at which both phases are in equilibrium ("transition pressures", black solid lines). With our approximation of the bulk moduli B(p 0 ) → ∞, the EOS curve of each phase would have infinitely large curvature, reducing the parabola to a vertical line originating at the corresponding equilibrium volumes V 0 and energies E 0 . Essentially, all information of each phase is then contained in a single point at (V 0 , E 0 ), represented by filled points.
Although the LAE is rather crude, it is reasonably accurate up to pressures in the range of tens, or even hundreds of GPa. As we will show in the rest of this work, the LAE can be used as a powerful tool to enable quick analyses of phase stability of a large number of materials at non-ambient pressures. Note that we will hereafter use the terms "zero pressure" and "ambient pressure" interchangeably, since the pV contribution to the free energy at atmospheric pressure is insignificant for most inorganic compounds. E.g., at one atmosphere, which corresponds to roughly 1 bar, the energy contribution of pV in diamond silicon with a volume of ∼20Å 3 /atom is merely 0.012 meV/atom, far smaller than the error bars encountered in DFT calculations.

B. Thermodynamic stability: the convex hull
The thermodynamic stability of a phase at zero temperature can be determined by the construction of the so-called convex hull of all phases in the chemical space. At zero pressure, the convex hull is constructed from the composition and formation energy (composition-energy hull, or simply "N -E convex hull") of all the phases. By definition, a phase on the convex hull has a formation energy lower than that of any other phase (or linear combination of phases) at that composition, and is therefore thermodynamically stable. At non-ambient pressures, thermodynamic stability is determined by a convex hull which also takes into account the energy as a function  1. (a) A schematic energy-volume (V -E) diagram with three phases and their EOSs, each represented by a parabola. The negative slope of the common tangent to two adjacent EOS (solid black line) represents the pressure at which the two phases are in equilibrium. The LAE approximates the common tangent with a line connecting the ambient-condition equilibrium volumes/energies of two adjacent phases (solid grey line connecting filled grey circles). (b) A schematic N -V -E convex hull for a model binary system. Individual phases are represented by spheres, and convex hull boundaries are indicated with solid red and dotted black lines. On the left is the conventional zero-pressure N -E hull, a projection of the extended N -V -E convex hull on the right. Phases that are thermodynamically stable at zero pressure lie on the N -E convex hull (blue spheres). Metastable phases that are stable at some non-ambient pressure lie above the N -E hull but on the N -V -E convex hull (teal spheres). A phase that is truly unstable at any pressure lies above the N -V -E hull (orange sphere).
of volume of all phases, given by their respective EOS E(V ). The LAE introduced in Section I A allows us to simplify the construction of the convex hull by taking into account the ambient volume of each phase, in addition to their composition and formation energy (compositonvolume-energy hull, or simply "N -V -E convex hull"). A phase on the extended N -V -E hull has a formation energy lower than any other phase or combination of phases at that composition and volume, and is therefore thermodynamically stable at some pressure. Further, a tie line on the convex hull represents a two-phase equilibrium, a triangular facet represents a three-phase equilibrium, and so on-a facet with n vertices represents an n-phase equilibrium.
A schematic N -V -E convex hull is shown in Fig. 1b. A projection of the extended N -V -E convex hull taking into account only the energy and volumes leads to the N -E hull (indicated by solid red lines). Phases that lie above the N -E hull, but on the N -V -E hull, are metastable at zero pressure but thermodynamically stable at some finite pressure. For example, in Fig. 1b, only two elemental phases and one binary compound (blue spheres) lie on the N -E hull (solid red lines), i.e., are thermodynamically stable at zero pressure, and all other phases are metastable. However, all elemental phases and all binary compounds except one lie on the extended N -V -E hull (teal spheres connected by dotted black lines), i.e., are thermodynamically stable at some non-ambient pressure. Only one phase shown (orange sphere at composition 0.2) is truly unstable at all pressures.

C. Pressure range of stability
For a system in thermodynamic equilibrium at zero temperature, dE = −p dV + i µ i dN i , where dE, dV are infinitesimal changes in internal energy E, volume V of the system, respectively, and dN i is the infinitesimal change in the composition N i of species i. The equilibrium pressure is thus given by p = − ∂E ∂V Ni , i.e., the derivative of energy with respect to volume at constant composition. Hence, the pressure range of stability of a phase P with ambient equilibrium volume and energy of V 0 and E 0 , respectively, is governed by the phase equilibria at volumes (V 0 + dV ) and (V 0 − dV ) (31). In other words, the window of pressures [p − , p + ] where P is stable is given by E(V 0 ± dV ) can be calculated by minimizing the free energy of the system at the target composition and volume. Grand canonical linear programming (GCLP) (32) techniques using efficient linear solvers are routinely employed to calculate phase stabilities and equilibrium reaction pathways at 0 K and 0 GPa (33)(34)(35)(36)(37). In this work, in addition to the average composition of the system being constrained to that of P, the volume is constrained to V 0 ± dV during energy minimization. Thus, a pressure range of stability can be calculated for every phase that lies on the extended N -V -E convex hull. As discussed in Sec. I A, the negative slope of the common tangent to the EOS of two phases is the pressure at which the respective phases coexist, or in other words, one phase transforms into the other under the effect of pressure. In the LAE, the common tangent is reduced to a line connecting the local minima of the two phases (solid gray line connecting filled grey circles in Fig. 1a). The LAE introduces errors compared to the real transition pressure, which depend on the overall features of the energy landscape. If we assume that all phases are compressible with identical, finite bulk moduli, the LAE will consistently lead to an underestimation of the magnitude of the transition pressures. In practice, however, highpressure phases often exhibit shorter, stronger bonds that lead to higher bulk moduli. Hence, the LAE would lead to a better agreement with the real transition pressures for phases stable at very high pressures. On the other hand, if the bulk moduli significantly decreased with pressure, the LAE would lead to an overestimation of the magnitude of the transition pressures. We also note that transition pressures, based on the above definition, can be positive or negative (e.g., the common tangents connecting the ground state with metastable phases 1 and 2, respectively in Fig. 1a). A negative pressure can be physically interpreted as a tensile stress, leading to the expansion of a phase toward volumes exceeding its ambient ground state equilibrium volume.

A. Model validation
We first evaluate the accuracy of the linear approximation to enthalpy by investigating two elements and three binary systems in detail.

Elemental solids
We choose two elements whose high-pressure phase diagrams are among the most complex as well as the most well-studied: silicon and bismuth. Both elements have intricate energy landscapes with several high-pressure allotropes.

a. Silicon
The phase diagram of silicon has been well explored experimentally, partially due to its importance in the semiconductor industry. The ambient ground state is Si-I, which crystallizes in a cubic diamond structure (38). It transforms around 11 GPa to the Si-II phase, which has a β-Sn structure (39). This is followed by a transformation to Si-XI with Imma symmetry (40) at 13 GPa. Above 16 GPa, Si-V forms in the simple hexagonal structure (41), and at 38 GPa, Si-VI forms in an orthorhombic Cmcm structure (42). The hexagonal close-packed Si-VII forms above 42 GPa (43), and finally, cubic closepacked Si-X forms at pressures above 78 GPa (44).
We first compute the pressure range of stability of the various silicon allotropes using DFT calculations (see the top panel labeled "Exact" in Fig. 2(a)). For each phase, we calculate the enthalpy explicitly at various pressures at intervals of 2 GPa and 10 GPa in the range of 0-20 GPa and 20-100 GPa, respectively. The transition pressures are then computed by minimizing the interpolated formation enthalpies as a function of pressure. The experimentally reported sequence of formation and transition pressures of high-pressure Si allotropes are well reproduced, with the exception of Si-II, which is effectively degenerate in enthalpy to Si-XI. The discrepancy between experiment and theory for the transition from Si-I to Si-II has been well studied (45,46), and is attributed to the errors associated with the PBE approximation to the DFT exchange correlation potential. We then calculate the pressure range of stability of all the allotropes using only the respective equilibrium energies and volumes at 0 GPa, extrapolated linearly as described in Sections I A-I C (see the bottom panel labeled "LAE" in Fig. 2a). The agreement between the "DFT Exact" and "LAE" phase diagrams is remarkable: (a) the sequence of the phases is correctly reproduced, with the only exception of Si-X, which the linear approximation model predicts to be unstable even at 100 GPa, and (b) the overall errors in the transition pressures predicted by the approximate model are within around 10% of those calculated explicitly.

b. Bismuth
At ambient condition, bismuth crystallizes in a rhombohedral Bi-I phase with space group R3m. It transforms at a pressure of around 2.55 GPa to Bi-II with a C2/m structure (47,48) and a very narrow range of stability at low temperatures. Upon increasing the pressure, Bi-III forms in a complicated, incommensurate host-guest structure with P 4/ncc symmetry (49)(50)(51). A Bi-IV phase with space group P 21/n has been reported between 2.4 GPa and 5.3 GPa at temperatures above around 450 K (52). Finally, the Bi-V bcc phase is observed at pressures above 7.7 GPa (53).
Similar to the case of silicon, we first compute the pressure range of stability of the various bismuth allotropes using enthalpies calculated explicitly at various pressures at intervals of 1 GPa in the range of 0-20 GPa (see the top panel labeled "DFT Exact" in Fig. 2b). Although the experimentally reported sequence of allotropes formed is well reproduced, the transition pressures between Bi-III/Bi-IV and Bi-IV/Bi-V are severely overestimated. This behavior has been reported previously by Häussermann et al. (51), and corroborated in our recent work on Cu-Bi intermetallics (54,55).
The pressure range of stability of all allotropes calculated using the LAE reproduces the correct sequence of formation (see the bottom panel labeled "LAE" in Fig. 2b). However, the agreement between the transition pressures predicted by the approximate model and those calculated explicitly are worse than that for silicon allotropes. We attribute these larger errors to the strong changes in the chemical bonds between the different bismuth phases, especially since ambient Bi-I has a layered structure, in contrast to the high-pressure phases. Hence, our approximation of equal, infinitely large bulk moduli for every phase is perhaps less reasonable for elemental phases of bismuth.

Binary intermetallics
When compared to pure elements, the high-pressure phase space of binary/higher-order chemical systems have been experimentally relatively unexplored. Including composition and pressure as additional degrees of freedom significantly increases the complexity of the phase space. In this section, we focus on a unique subset of chemistries: intermetallic systems of elements that are not miscible at ambient conditions but form compounds under pressure.
Many of these so-called ambientimmiscible systems involve bismuth in combination with other elements. Recently, we investigated three such systems in detail, namely Fe-Bi (56), Cu-Bi (54,55), and Ni-Bi (57), by performing extensive global structure searches. Here, we use these three systems to further evaluate the performance of the linear approximation to enthalpy.
a. Fe-Bi Using the minima hopping crystal structure prediction method (MHM), we recently predicted a highpressure FeBi 2 phase with I4/mcm symmetry at pressures above 36 GPa (56), which was experimentally con-firmed through evidences found in the in-situ X-ray diffraction pattern at above 30 GPa (10). We note that the discovery of FeBi 2 resulted from extensive MHM structural searches performed at pressures of 0, 50 and 100 GPa. The most promising candidate structures were then relaxed at pressure intervals of 10 GPa to compute enthalpies, which were in turn used to calculate the pressure range of stability of various phases (see top panel in Fig. 3a). Besides the FeBi 2 I4/mcm phase, we find a FeBi 3 phase with the Cmcm symmetry to be stable in a very small pressure window slightly below 40 GPa. This phase has so far not been observed in experiment.
We now compare the pressure range of stability calculated explicitly above against that calculated using the linear approximation to enthalpy, using only the ambient equilibrium energies and volumes of the phases. The phase diagram predicted by the approximate model (bottom panel in Fig. 3a) is qualitatively similar to the exact one: the FeBi 2 I4/mcm phase becomes stable at comparable pressures. This finding can be conveniently exploited in structural searches: since the MHM samples many low-lying metastable structures at a fixed pressure p 0 , one could use the energies and volumes at p 0 of such phases within the LAE to quickly predict if any of the metastable phases become stable at a different pressure p = p 0 . Even for immiscible systems at p 0 , potential candidate structures are found if the simulation cells are sufficiently small to prevent phase segregation. This means that a structural search conducted solely at 0 GPa might have been sufficient to uncover the I4/mcm structure and correctly predict the experimentally observed FeBi 2 phase. The FeBi 3 phase, on the other hand, with the narrow pressure window of stability is predicted to be unstable at all pressures by the approximate model; this could be a result of errors associated with DFT calculations. The good agreement between the exact and approximate phase diagrams is rather surprising: FeBi 2 undergoes a series of magnetic transitions between 0 and 40 GPa, accompanied by abrupt changes in the unit cell volume (56), all of which are neglected in the linear approximation to enthalpy.

b. Cu-Bi
In the ambient-immiscible Cu-Bi system, at least two compounds, with compositions Cu 11 Bi 7 and CuBi, have been recently discovered in DAC experiments between 3 and 6 GPa (54,55). Both phases can be recovered to ambient conditions, and exhibit exciting superconducting and structural properties. For example, CuBi has a layered structure, rather uncommon for high-pressure phases, and is calculated to have an extremely low energy cost associated with exfoliation from bulk into single sheets (58). Further, more recent structural searches predict additional, dense Cu 2 Bi phases to become thermodynamically accessible at pressures of above 50 GPa (59).
The top panel in Fig. 3b shows the pressure range of stability of the various high-pressure Cu-Bi phases  computed using explicitly calculated enthalpies for each phase. The CuBi phase is not thermodynamically stable at any pressure at zero temperature, consistent with recent reports of vibrational entropy playing a crucial role in rendering this phase synthesizeable (55). The Cu 11 Bi 7 phase is thermodynamically accessible at high pressures up to around 60 GPa, when it starts to compete with two dense Cu 2 Bi phases (59). The bottom panel in Fig. 3b shows the Cu-Bi phase diagram computed from the LAE, using only the respective equilibrium energy and volume of each phase at 0 GPa. All phases are correctly predicted to be stable by the approximate model, consistent with the exact phase diagram. As expected, the transition pressures predicted by the approximate model are underestimated overall when compared to those calculated explicitly-a trend that is presumably increased due to the significant structural changes in elemental bismuth as a function of pressure (see Section II A 1). Nonetheless, it is striking that, using the simple linear approximation to enthalpy, we could have correctly predicted all the high-pressure phases in the Cu-Bi system from a structural search only at 0 GPa.

c. Ni-Bi
We tested for the first time the predictive power of our model by investigating the high-pressure phases in the Ni-Bi binary intermetallic system. Two compounds have been experimentally reported at ambient pressures: NiBi in the hexagonal NiAs structure (60), and NiBi 3 in the orthogonal RhBi 3 structure (61, 62). Both compounds are superconductors with transition temperatures of 4.25 K and 4.06 K in NiBi (63) and NiBi 3 (64, 65), respectively. To generate phase data to be used within the LAE to construct the convex hull and predict transition pressures, we used prototypes from our previous structural searches of the Fe-Bi and Cu-Bi systems, and substituted the Fe/Cu sites with Ni atoms, followed by structural relaxation at ambient pressures. Using this ambient-pressure dataset of energies and volumes, the LAE model predicted stable compounds at high-pressure for the compositions Ni 3 Bi and NiBi 2 . Based on this prediction, we performed a thorough investigation of the Ni-Bi system using MHM simulations at pressures of 10 and 50 GPa, which indeed revealed a number of high-pressure phases.
In particular, our calculations predict new compounds stable at high-pressure at compositions of the previously reported ambient-pressure phases, i.e., NiBi and NiBi 3 . The hexagonal α-NiBi phase undergoes a structural transition to a TlI-type structure with Cmcm symmetry at pressures above around 20 GPa. Similarly, the orthorhombic NiBi 3 phase is thermodynamically unstable above 7.5 GPa, and a Cmcm structure is stable above 62 GPa. Further, we discover additional stable phases at previously unexplored compositions. We find that a NiBi 2 phase with C2/m symmetry in the PdBi 2 structure type is in fact thermodynamically stable at ambient pressures, a finding that was reported earlier by Bachhuber et al. (66). At the same composition, a second C2/m phase becomes stable above 52 GPa, over a very small pressure window of less than 1 GPa, followed by a I4/mcm phase, isostructural to FeBi 2 . Finally, a Ni 3 Bi compound with P mmn symmetry, isostructural to Ni 3 Sb in the Cu 3 Ti structure type, is predicted to be stable at pressures above 25 GPa.
One of our predictions was very recently verified by compressing NiBi in a diamond anvil cell (DAC). Heating to temperatures above 700 • C at pressures above ≈ 28 GPa, the hexagonal α-NiAs transforms into β-NiAs in the predicted TlI structure type (57). The experimental transition pressure is somewhat higher than the calculated value of 20 GPa. This discrepancy could be attributed to the presence of high kinetic reaction barriers in the first-order phase transition, which requires heat to induce the phase change and inevitably leads to calculated transition pressures being lower than those observed in experiment. This hypothesis is supported by detectable evidences of the β-NiAs phase in the XRD pattern upon decompression: the β-NiAs is kinetically persistent as low as 11.62 GPa, hence the equilibrium pressure lies anywhere between 11.62 and 28.3 GPa. In addition, errors inherent to the approximations used in DFT calculations could also explain the difference in the observed and computed transition pressures. The approximations to the exchange correlation potential alone can make a noticeable difference. E.g., the PBE functional predicts that both the experimentally observed NiBi and NiBi 3 phases (in their reported structures) are not thermodynamically stable at 0 GPa and 0 K. However, we find that LDA correctly places the two experimental phases on the 0 GPa convex hull, and if we additionally take into account the vibrational entropy contributions to the free energy, NiBi 2 becomes unstable at elevated temperatures. A detailed investigation of the influence of different exchange correlation potentials and temperature effects on the calculated phase stability of Ni-Bi compounds and their properties will be reported elsewhere.
After exploring the high-pressure Ni-Bi system with the MHM, we a posteriori compare the phase diagram of Ni-Bi computed using the explicitly calculated enthalpies against that predicted from our LAE model (Fig. 3c), and find remarkable agreement. Most phases, and the sequence in which they form under pressure, are correctly predicted by the approximate model. The only exceptions are the Cmcm phase at the NiBi 3 composition and the second C2/m compound at the NiBi 2 composition at around 50 GPa. As discussed earlier, the latter phase has a very small pressure range of stability of <1 GPa, so its absence in the phase diagram predicted by the approximate model is not surprising. In fact, similar to the Cmcm FeBi 3 phase that was predicted to be stable in a narrow pressure window of less than 3 GPa but not yet observed experimentally, synthesis of the NiBi 2 phase is likely to be challenging, if possible at all. B. Large-scale analysis of phase stability at high pressure

Elements and binary compounds
The power of our linear enthalpy model lies in its capability to efficiently assess the pressure range of stability of hundreds of thousands of phases. Since the linear approximation requires only equilibrium energies and volumes of phases calculated at ambient pressure, it can be used to leverage the large materials datasets available in HT-DFT databases such as the OQMD (22,23), Materials Project (24), and AFLOWlib (25). Here, we present large-scale analysis and statistics of thermodynamic phase stability of materials at high pressure using ambient-pressure phase data calculated in the OQMD.
First, we focus on elemental high-pressure phases, and begin by compiling a "validation-dataset" of experimentally reported high-pressure elemental phases. The crystal structures of many high-pressure phases reported in the Inorganic Crystal Structure Database (ICSD) (27) have been calculated in the OQMD, albeit at ambient pressure. For every element, we filter all entries in the ICSD using the "External Conditions → Pressure" metadata available for each entry. Further, Tonkov et al. (67) compiled a comprehensive list of phase transformations under pressure for nearly 100 elements, on which we rely heavily as a second reference to cross-validate and augment the list of high-pressure phases calculated in the OQMD. Our final compiled dataset contains 132 distinct elemental high-pressure phases, and can be found in the Supplementary Materials (SM).
For each element in the periodic table, we use the ambient-pressure energy and volume data for all ICSD phases (i.e., not limited to high-pressure phases) calculated in the OQMD within the LAE model to predict (a) the number of phases from our validation-dataset that lie on the extended N -V -E convex hull, i.e., the number of phases stable at some finite pressure, and (b) the pressure range of stability of every phase that lies on the N -V -E hull. Fig. 4 shows a summary of this analysis in the form of a periodic table: for every element with at least one experimentally reported high-pressure phase, we indicate the number of high-pressure phases in our compiled dataset from the OQMD (bottom-left half) and the number of phases predicted by the linear enthalpy model to lie on the N -V -E hull (top-right half), represented on a color scale. That is, the number of phases reported experimentally and those predicted to be stable at some finite-pressure match exactly whenever the colors in both the left and right segments are identical. This is indeed the case for most elements, with a few exceptions. Overall, 75% of all experimentally reported high-pressure phases are predicted to lie on the N -V -E convex hull (see the top panel of Table I). In addition, for around 35% of the phases, the predicted pressure range of stability overlaps with the respective transition pressures reported in experiment. The low success rate in correctly predicting the transition pressure is somewhat expected following the model validation on Si and Bi in Section II A. We discuss the possible sources of discrepancy between predictions and experiment toward the end of this Section. Next, we perform a similar large-scale analysis for all experimentally reported binary phases. Using calculations of experimentally reported compounds in the OQMD, curated using pressure-related metadata in the ICSD (in a manner similar to that employed for elemental phases), we compile a dataset of 343 unique binary compounds in total as a validation-dataset (the entire list is available in Supplementary Materials). This number is significantly lower than that expected from a simple combinatorial estimation. For elemental solids, we found in average more than one high-pressure phase per element. If we extend this observation to binaries and assume that every binary system has in average more than one highpressure phase, the number of potential high-pressure phases considering 90 elements is 90 C 2 = 4005. We note that our estimation is very conservative, since binary A-B systems introduce an additional, compositional degree of freedom, which allows multiple high-pressure phases to exist at the same pressure, A p B q , as we have seen in Sec. II A 2. This indicates that the high-pressure phase diagrams of binary systems in general have been relatively underexplored. The linear enthalpy model performs equally well for binary compounds -80% of exper- imentally reported high-pressure binary phases are predicted to be stable at some finite pressure (see lower panel of Table I). For around 35% of the phases, the predicted pressure range of stability overlaps with the respective transition pressures reported experimentally.
Overall, our "crude" linear enthalpy model performs surprisingly well, with a success rate of 75-80%, in predicting the stability of both elemental and binary highpressure phases. We identify four potential sources of error that could explain the discrepancy between the number of high-pressure phases reported experimentally and that predicted by our approximate model: (a) The crystal structure reported experimentally for the phase is erroneous. Resolving the crystal structure, e.g., from in-situ XRD measurements, under high pressure is a difficult and tedious task that can lead to incomplete/incorrect structural characterization. A prominent example is the Bi-III phase, the crystal structure of which was experimentally resolved only after several failed attempts (49). In fact, Bi-III has an incommensurate host-guest structure and the reported structure is only a representative ordered model with P 4/nnc symmetry (51). A similar incommensurate structure has been reported for phase IV of phosphorus in the pressure range of 107-137 GPa (68).
(b) The high-pressure phase emerges via a phase transition of second order. In this case, the structural relaxations performed using DFT will inevitably transform the high-pressure phase to a lower-pressure structure. Therefore, our linear enthalpy model, which relies on the equilibrium energy E 0 and volume V 0 at ambient pressure of a high-pressure phase, will expectedly not capture its stability.
(c) Errors inherent to DFT calculations and numerical noise, e.g., due to the approximation to the exchange correlation potential, pseudization of core electrons (which might be important especially at high pressures), unconverged basis sets and sam-pling meshes, insufficient tolerances during structural relaxations, etc.
(d) Finally, there is the inherent error due to applying a linear approximation to the enthalpy of each phase (i.e., assuming all phases to be perfectly incompressible), which might be unreasonable for some materials at large values of pressure.

All experimentally reported compounds
We now use our linear enthalpy model to analyze the phase stability of all experimentally reported compounds calculated in the OQMD (not limited to high-pressure phases), a total of around 33,000 unique ordered compounds. As earlier, using the equilibrium energy and volume at ambient pressure of each phase in our dataset, we predict the number, and the pressure range of stability, of all phases that lie on the extended N -V -E convex hull (i.e., presumably thermodynamically stable at some finite-pressure).
First, we find that only around 55% of the 33,000 compounds in our dataset lie on the N -E convex hull, i.e., are thermodynamically stable at ambient pressure conditions, consistent with a previous report on a similar dataset from the OQMD (23). A recent study by Sun et al.(6) on a dataset of 29,900 experimentally reported compounds calculated in the Materials Project also finds around 50±4% of the phases to be ambientmetastable. In the latter study, it is proposed that the observed metastable compounds are generally remnants of thermodynamic conditions where they were once the stable ground states.
We next test this hypothesis of "remnant metastability" by using pressure as a thermodynamic handle and tracking the number of metastable phases that become stable with incremental increase/decrease in pressure, with respect to ambient conditions. Fig. 5a shows the fraction of metastable phases as a function of positive (compressive) or negative (tensile) pressure, separated into binary, ternary, quaternary and higher-component systems. We observe a range of trends based on our statistical analysis.
First of all, the number of metastable phases decreases with incremental application of both positive and negative pressures, relative to 0 GPa. In other words, a significant fraction of the ambient-metastable phases are in fact thermodynamically stable ground states at nonambient pressure conditions. For example, in the case of binary compounds (top left in Fig. 5a), the fraction of metastable phases decreases from around 0.45 at 0 GPa to around 0.30 at 100 GPa-15% of the ambientmetastable phases are rendered thermodynamically stable at some pressure p ∈ (0, 100] GPa. However, in each case, a sizeable fraction of metastable phases remain metastable at all pressures, i.e., they are not equilibrium ground states at any pressure (represented by horizontal  5. (a) Fraction of metastable phases that become thermodynamically stable with incremental increase/decrease in pressure, with respect to 0 GPa. The horizontal dashed lines indicate the fraction of metastable phases that do not lie on the N -V -E convex hull at any pressure. (b) Fraction of ambient-metastable phases that cannot be accessed thermodynamically at any pressure larger than pressure p, equivalent to 1− (fraction of phases that can be accessed at some pressure larger than pressure p).
dashed lines in Fig. 5a). For example, around 21% of all binary ambient-metastable phases cannot be accessed thermodynamically via pressure alone.
Second, the rate of decrease in the number of metastable phases (or increase in the number of metastable phases made stable) with pressure is maximum near zero and decays rapidly toward higher positive/negative pressures. This is most likely due to a bias toward small values of pressure in our compiled set of phases-after all, most compounds reported experimentally are likely observed in near-ambient conditions-but could be also due to a fundamental property of materials, namely, the density of stable ground states as a function of volume/pressure is maximum near zero pressure.
Third, we find considerable differences concerning the "character" of metastability in binary, ternary, and higher order compositional systems. We distinguish two subsets for each n-component dataset (n = 2, 3, 4, ≥5)-"polymorphs" and "phase separation"-depending on whether a given phase is metastable with respect to another phase at the same composition or a combination of phases, respectively, at ambient conditions. We note that the higher the number of components present in a metastable compound, the more likely it is to phaseseparate rather than transform into a polymorph, in agreement with previous observations (6). Further, the lower the number of components in a metastable compound, the more likely it is to be stabilized with pressure. Considering the subset of all metastable phases that phase-separate at ambient pressure, 58%, 47%, 42%, and 27% become thermodynamically stable at some finite positive/negative pressure in the case of binary, ternary, quaternary, and higher-component systems, respectively.
Additionally, we observe that the effects of positive and negative pressures on the metastability of phases are not symmetric about zero pressure: a much larger portion of ambient-metastable phases become thermodynamically stable under positive (compressive) pressure when compared to negative (tensile) pressure. A difference is perhaps expected considering that the limiting behaviors are very different: large positive pressures favor the formation of close-packed phases before eventual overlap of atomic cores, while the limit of large negative pressures is simply the individual non-interacting atoms of each species in the phase.
Finally, we probe a complementary question: if one were to incrementally tune external conditions from large positive to large negative pressures, how many observed metastable phases N can be accessed thermodynamically below any given pressure p? We calculate at pressure p, the number of experimentally observed phases from our dataset that cannot be thermodynamically accessed at any pressure > p. We present this data as a cumulative histogram of the fraction of phases, integrated from pressures p to +∞, separated into elements, binaries, ternaries, etc. in Fig. 5b. Hypothetically, if all experimentally reported compounds were thermodynamically stable ground states at some finite pressure, one would expect this cumulative fraction of unstable phases to be 1 and 0 for p → ∞ and p → −∞, respectively. Consistent with our previous observations, we find that (a) a sizable fraction of the phases do not lie on the extended N -V -E convex hull at all, i.e., they are not ground states under any pressure (represented by horizontal dashed lines in Fig. 5b), and (b) the rate at which additional metastable phases can be thermodynamically accessed is maximum near zero pressure (given by the slopes of the curves). In other words, the pressure density of thermodynamic ground states, dN dp , is maximum near p = 0. Whether this is an artifact of using a dataset of experimentally observed phases or is a fundamental property of matter, needs further analysis, and will be the subject of future work.

C. Discovery of new high-pressure compounds
So far, we have used the LAE to analyze the phase stability of experimentally reported high-pressure elemental and binary phases, and to probe the accessibility of ambient-metastable phases using pressure as a thermodynamic handle. Now, we go a step further by using the LAE to predict new intermetallic compounds by combining it with CSP methods. For this purpose, we focus on a unique subset of binary systems, namely, the combination of elements that are immiscible at ambient pressures. According to the data we compiled from the OQMD, there currently exist ∼1780 binary systems that do not contain any experimentally observed compounds. Any high-pressure phases that we identify in these systems are therefore true predictions of new materials.
For the dataset to be used for construction of the convex hull and calculation of transition pressures within the LAE, we use ambient-pressure formation energies and volumes of phases calculated in the OQMD. As mentioned in Section IV A, the OQMD contains calculations of more than 450,000 compounds including experimentally reported compounds from the ICSD, and hypothetical compounds generated by decoration of common structural prototypes with all the elements in the periodic table. The Strukturbericht symbols of the prototype structures considered in this section are listed below (23, 69): We screen for promising chemical systems that contain high-pressure phases in the following manner: for every ambient-immiscible binary system, we use the LAE to predict the thermodynamic phase stability and pressure range of stability of each hypothetical compound in that chemical space. We select systems that contain at least one hypothetical compound predicted to become stable below an arbitrary pressure threshold of 50 GPa. We then rank these systems according to the predicted transition pressures, from lowest to highest, and select 10 of the most promising systems for further investigation.
For each system, we further verify that no compound in that chemical space is reported in the ICSD or in phase diagrams available in the ASM Alloy Phase Diagram Database (70). At each composition where our FIG. 6. The convex hulls of formation enthalpy of ten ambient-immiscible binary systems calculated using structural search at 50 GPa via the MHM. Each cross denotes a phase sampled with the MHM. In all but the Zn-Ga system, we find at least one thermodynamically stable high-pressure phase.
model predicts a stable high-pressure phase we perform structural searches using the MHM, starting from the respective prototype structure from the OQMD, using simulation cells with up to 10 atoms/cell. Due to the set of binary prototypes currently calculated in the OQMD (see list above), the compositions we sample are limited to A 3 B, AB and AB 3 . Note that both the system size and the number of sampled compositions are far too low to give accurate predictions of the true high-pressure ground states. The structural searches are merely intended as proof-of-concept, i.e., to provide a sampling of configurations beyond the limited number of prototype structures. Of the ten selected ambient-immiscible binary systems, namely, As-Pb, Al-Si, Sn-Bi, Fe-In, Hg-In, Hg-Sn, Re-Sn, Re-Br, Re-Ga, and Zn-Ga, structural searches performed at 50 GPa using the MHM confirmed the existence of at least one new stable high-pressure phase in all but the Zn-Ga system (see Fig. 6). All thermodynamically stable structures at 50 GPa are provided in the Supplementary Materials (SM). The high-pressure phases predicted present a number of avenues for experimental synthesis and verification. Overall, the success of the linear enthalpy model in guiding more accurate, sophisticated techniques based on crystal structure prediction in discovering novel high-pressure phases is remarkable.

III. CONCLUSIONS
In summary, we present a method that allows an efficient screening for materials that are thermodynamically stable at non-ambient pressures using a simple linear approximation to the formation enthalpy of a phase. Using a generalized convex hull construction, the stability of thousands of compound can be evaluated at a low computational cost based on ambient-pressure data that is currently available in many materials databases without performing any additional DFT calculations. Through a large-scale analysis of experimentally reported compounds, we show that a large fraction of the observed ambient-metastable phases are in fact thermodynamic ground states at some finite pressure. Our method can be readily extended by further generalizing the convex hull construction and taking into account additional thermodynamic degrees of freedom, including temperature or surface areas of finite particles. Finally, we demonstrate the predictive power of this model when combined with a crystal structure prediction technique by discovering novel high-pressure phases in a set of ambient-immiscible binary intermetallic systems.

A. Calculation of thermodynamic quantities
The equilibrium formation energy and volume data for all the phases considered in our analysis using LAE were retrieved from the Open Quantum Materials Database (OQMD) (22,23).
The dataset consists of DFTcalculated properties of over 450,000 compounds which include (a) unique, ordered experimentally reported compounds from the Inorganic Crystal Structure Database (ICSD), and (b) hypothetical compounds generated by the decoration of common structural prototypes with all the elements in the periodic table. Details of the settings used to calculate the equilibrium formation energy and volume of compounds in the OQMD can be found in Ref. 23.
All other DFT calculations reported in this work, i.e., those performed as part of global structure searches, were performed using the Vienna Ab initio Simulation Package (VASP) (71)(72)(73). We use the projector augmented wave (PAW) formalism (74,75) and the PBE parameterization of the generalized gradient approximation to the exchange correlation functional (76) throughout. For all calculations, we use Γ-centered k-point meshes with about 8000 k-points per reciprocal atom and a planewave cutoff energy of 520 eV. All atomic and cell degrees of freedom of a structure are relaxed until the force components on all the atoms are within 0.01 eV/Å, and stresses are within a few kbar.

B. Structural searches
The minima hopping method (MHM) (77,78) implements a highly reliable algorithm to explore the low enthalpy phases of a compound at a specific pressure given solely the chemical composition (79)(80)(81). The low lying part of the enthalpy landscape is efficiently sampled by performing consecutive, short MD escape steps to overcome enthalpy barriers, followed by local geometry optimizations. The Bell-Evans-Polanyi principle is exploited through a feedback mechanism on the MD escape trials, and by aligning the initial MD velocities along soft-mode directions in order to accelerate the search (82,83). The MHM has been successfully applied to identify the structure and composition of many materials, also for systems at high pressures (55,56,59,(84)(85)(86). In this work, we performed MHM simulations only at the compositions where a high-pressure phase is predicted to be stable by the linear enthalpy model.

C. Software implementation
All convex hull constructions in this work were performed using the Qhull library (87) as implemented in the SciPy Python package (88). The GCLP calculations reported in this work were performed using the Cbc solver distributed with the PuLP Python library (89). An implementation of the framework described in Sections I A-I C has been made available as an open-source Python module (90). An implementation of the MHM is available through the Minhocao package (77,78).