Correlation analysis of strongly fluctuating atomic volumes, charges, and stresses in body-centered cubic refractory high-entropy alloys

Local lattice distortions in a series of body-centered cubic alloys, including refractory high-entropy alloys, are investigated by means of atomic volumes, atomic charges, and atomic stresses deﬁned by the Bader charge analysis based on ﬁrst-principles calculations. Analyzing the extensive data sets, we ﬁnd large distributions of these atomic properties for each element in each alloy, indicating a large impact of the varying local chemical environments. We show that these local-environment effects can be well understood and captured already by the ﬁrst and the second nearest neighbor shells. Based on this insight, we employ linear regression models up to the second nearest neighbor shell to accurately predict these atomic properties. Finally, we ﬁnd that the elementwise-averaged values of the atomic properties correlate linearly with the averaged valence-electron concentration of the considered alloys.


I. INTRODUCTION
Refractory high-entropy alloys (HEAs), composed mainly of refractory elements from groups 4, 5, and 6 in the Periodic Table, have attracted interest due to their excellent hightemperature mechanical properties [1]. For example, the first reported refractory HEAs, NbMoTaW and VNbMoTaW, show higher yield strengths than typical Ni-based superalloys in a wide temperature range [2,3]. Another appealing example is TiZrNbHfTa, which has higher compressive ductility than NbMoTaW and VNbMoTaW at a significantly lower density, while maintaining an acceptable yield strength [4]. In general refractory HEAs reveal much larger yield strengths and Vickers hardnesses than the constituent elements [4][5][6][7], which is an indication of the extraordinary solid solution strengthening in HEAs.
While models of solid solution strengthening for conventional alloys were proposed many decades ago [8][9][10], theories for HEAs were reported only recently. The development has been mainly focused on an extension of the traditional Labusch model [10] to multicomponent alloys [11,12] or an application of the average effective-medium * shoji.ishibashi@aist.go.jp approach [13][14][15]. In any such development, quantities such as lattice misfit and effective atomic volume play an important role, which implies that local lattice distortions constitute a good descriptor to estimate the magnitude of solid solution strengthening in HEAs.
Local lattice distortions in disordered alloys have been analyzed in various ways, e.g., via atomic size mismatch of the different constituents [16]. For CrMnFeCoNi-based face-centered cubic (fcc) HEAs, the root-mean-square (RMS) atomic displacements from the ideal lattice sites were found by atomistic simulations to correlate well with the normalized (by the shear modulus) yield strength [17]. A correlation between atomic size misfit parameters and atomic displacements of HEAs was also discussed in several works [18,19]. In other works atomic volumes were investigated by means of Voronoi tessellation [20,21]. The concept of atomic-level pressure has been employed to analyze local lattice distortions and solid solution strengthening in HEAs [22,23]. Such concepts have been successfully employed to design new alloys with higher strength than available before [21,23].
The majority of the investigations of local lattice distortions has so far focused on fcc based HEAs. Similar studies for bcc alloys-which are generally more prone to distortions-are lacking, in particular when it comes to an explanation in terms of the electronic structure. In the present study we remedy this deficiency and analyze atomic volumes, atomic charges, atomic stresses, and mutual correlations for an extended set of 15 body-centered cubic (bcc) refractory alloys including refractory HEAs utilizing first-principles calculations. In particular, the atomic volumes, atomic charges, and atomic stresses are calculated based on the Bader charge analysis [24,25], and the relation to the local chemical environment is established.
These equiatomic alloys were modeled using supercell models whose number of atoms is commensurate with the number of the elements of the alloys. Specifically, the binary and quaternary alloys were modeled using a 128-atom supercell (4 × 4 × 4 expansion of the two-atom conventional cell), the ternary alloys were modeled using 54-atom supercells (3 × 3 × 3 expansion of the two-atom conventional cell), and the quinary alloys were modeled using a 125-atom supercell (5 × 5 × 5 expansion of the one-atom primitive cell). Ideal chemical mixing of the elements was achieved utilizing the special quasirandom structure (SQS) approach [29]. Note that, although the supercell models for the ternary alloys have a smaller number of atoms, the employed SQSs well sample the local chemical environments around the atoms and hence provide good predictions of the atomic properties considered in this study (see Appendix C for details).
Atomic volumes, atomic charges, and atomic stresses were obtained based on the Bader charge analysis [24,25]. Bader regions are regions defined by boundaries for which the charge density gradient along the normal direction to the boundary surface is zero. Each Bader region is usually located at an atomic site. Occasionally, because of small fluctuations of the charge density, small Bader regions appear in interstitial regions. They are, however, almost negligible. The sum of local charges therein is at most 0.05% of the total charge. In practice, these small regions are assigned to an atom, to which the local charge maximum is the closest, to ensure that the sum of local values is equal to the corresponding quantity for the supercell. The Bader volume of each atom (V Bader ) is computed as the volume of the thus determined Bader region for the corresponding atom, and the Bader charge of each atom (ρ Bader ) is computed by integration of the charge density over V Bader .
To compute atomic stresses we utilized the concept of stress densities. Electronic-structure-based stress densities have been widely used to analyze local properties, e.g., at grain boundaries [25,[30][31][32][33][34][35][36], at surfaces [37][38][39], in superlattices [40], to analyze chemical bonding in molecules [41][42][43], in metal clusters [44][45][46], in metal complexes [47], in lithiumionic conductors [48], and in Si-Fe [49], as well as to analyze electronic shell structures of atoms [50]. In the present study, the method by Filippetti and Fiorentini [51], later modified to fit the plane-wave basis projector augmented-wave (PAW) method [37], was employed to compute the stress density. The stress density σ αβ (r) satisfies where α and β are indices for the Cartesian coordinates, σ αβ is the macroscopic stress tensor, and V denotes the simulation cell volume. Atomically resolved stress components were evaluated by integrating σ αβ within the Bader volumes of the corresponding atoms as As discussed in Refs. [25,37], the gauge-dependent component can be eliminated by integrating σ αβ within the Bader volumes. For the sake of simplicity, in the present study, only the diagonal average of σ Bader,αβ was considered which we denote as σ Bader . In the following, positive and negative values indicate compressive and tensile stresses, respectively. For the integration over the Bader volumes required to obtain the Bader charge ρ Bader and the average of the diagonal stress components σ Bader , a weighted integration scheme [52] was applied to improve the convergence.
The QMAS code [40] was employed for electronicstructure calculations and for calculations of the Bader volumes, the Bader charges, and the Bader stresses. Electronicstructure calculations were conducted employing the PAW method [53] in the framework of density functional theory (DFT) within the generalized gradient approximation (GGA) of the Perdew-Burke-Ernzerhof (PBE) form [54]. The planewave cutoff energy was set to 20 Ha (544 eV). The Brillouin zones were sampled by -centered 6 × 6 × 6, 4 × 4 × 4, 4 × 4 × 4 k-point meshes for the 54-atom, 125-atom, 128-atom supercell models, and the Gaussian-smearing technique [55] was employed with the full width at half maximum (FWHM) smearing set to 25 meV. The calculations were performed at the DFT-determined equilibrium lattice constants (see Appendix A for details). The total energies were minimized until for each ionic step the charge differences became less than 10 −8 per simulation cell, and internal atomic positions were relaxed until the residual forces became less than 5 × 10 −5 Ha/bohr (2.5 × 10 −3 eV/Å).
Valence-electron concentrations (VECs) of HEAs have been previously discussed in relation to phase stabilities [56][57][58][59]. In the present study, the VEC of each alloy was computed as the average of the VECs of the constituent elements weighted by their concentrations in atomic percent, where the VECs of the elements in groups 4, 5, and 6 were taken as 4, 5, and 6, respectively. only on the element and the overall alloy composition, but also very sensitively on the specific local chemical environment. A similar observation was made previously for atomic bond distances in fcc HEAs [18,21,60].

A. Distributions of atomic volumes, charges, and stresses
Furthermore, we can draw a relation between the average atomic Bader volumes and the corresponding atomic VECs. For a given alloy, the elements of the same period from the Periodic Table show decreasing average volumes with decreasing VEC (compare, e.g., Nb with Mo or Ta with W in the NbMoTaW HEA). It needs to be emphasized that this trend is opposite to the trend valid for the atomic volumes of pure metals. For the latter, it is known empirically (e.g., Slater [61] or Pauling [62]) and observed explicitly in our calculations (cf. dashed colored lines) that the atomic volumes increase with decreasing VEC. The discrepancy between the pure-metal atomic volumes and the Bader atomic volumes of the alloys is related to different local chemical environments in the solid solutions. Nonetheless, a rather simple Vegard's law (i.e., prediction of the alloy's volume by a linear mixing of the pure-metal atomic volumes) still works very well for all of the alloys. This indicates that there is a compensating effect among the constituent elements.
An apparently appealing alternative to a Bader volume analysis is based on atomic volumes obtained by a Voronoi tessellation which requires only structural input instead of electronic charge densities. Figure 1(b) shows the distribution of the Voronoi volumes (V Voronoi ) for the investigated alloys. Inspection of Fig. 1(b) in comparison to Fig. 1(a) reveals significant differences between the Bader and Voronoi volume distributions. A main difference which applies to all alloys and elements is that the Voronoi volumes show a much smaller standard deviation of the distributions (up to 0.6 Å 3 ). A more detailed comparison reveals further that the averages of the distributions strongly differ for many of the elements.  This also implies that the Bader and the Voronoi volumes could be used as distinct descriptors for, e.g., modeling the solid-solution strengthening in HEAs. Figure 3 shows the distributions of the atomic charges ρ Bader obtained from the Bader analysis for each element in the investigated equiatomic bcc alloys. A wide spread of the distributions can be observed for each element in each of the alloys similarly as for the atomic Bader volumes shown in Fig. 1(a). In fact, there is a strong linear negative correlation between the Bader volumes and charges as clarified by Fig. 2(b). Specifically, a positive ρ Bader indicates less electrons and results in a smaller V Bader . For a given alloy, we observe that up to about one electron of charge can be transferred between the different elements. Further, for a given alloy, the elements with higher VEC tend to have negative charges, and also the elements in later periods in the Periodic Table tend to show more negative charges.
One may have expected that the charge transfer can be qualitatively predicted by electronegativities as defined, e.g., by Pauling [63] or by Allen [64][65][66], but the present results reveal a few exceptions. Taking the example of VW, Fig. 3 shows that V and W are charged positively and negatively, respectively, which is opposite to what is expected from Allen electronegativities of the two elements. In contrast, the correlation between the VEC and the Bader charge still holds in such alloys, indicating that the VEC is a good descriptor for the Bader charge (see Sec. III D for further analyses). Note that a similar correlation has been also found between local VEC and C solution energies in CrMnFeCoNi [67]. Figure 4 shows the distributions of the atomic stresses σ Bader obtained from the Bader analysis for each element in the investigated equiatomic bcc alloys. A very similar picture arises as for the atomic volumes and atomic charges, in particular the atomic stresses are substantially distributed for each element in each alloy. Noteworthy, the local stresses experienced by the atoms are remarkably high-100 GPa and above. These values are even higher than those reported previously for fcc FeCoNi and CrFeCoNi [22], indicating larger local stresses in refractory bcc HEAs than in 3d-transition-element fcc HEAs. The computed atomic stresses are also substantially higher than those at dislocation cores in bcc Fe [68,69], fcc Al [70], and B2 TiNi [71] as well as higher than the atomic stresses at grain boundaries of bcc Fe [30][31][32] and fcc Al and Cu [25,[33][34][35].
Further, we find strong correlations between the Bader volumes and stresses [ Fig. 2(c)] as well as between the Bader charges and stresses [ Fig. 2(d)]. We can again draw a relation to the VECs. For a given alloy, the elements with lower VEC tend to show negative (tensile) stresses, and the elements with higher VEC tend to show positive (compressive) stresses. This trend is consistent with what is expected from V Bader ; for a given alloy, the elements with larger V Bader tend to show more positive (compressive) stresses, and vice versa. The observed results for the atomic stresses are consistent with previous works for fcc alloys. Strong distributions of atomic pressures were also found for fcc FeCoNi and CrFeCoNi [22]. Strong correlations between atomic charges and atomic pressures were found for CrMnFeCoNi-based fcc equiatomic alloys [23]. A previous work [17] reported that the RMS of the atomic displacements from the ideal lattice sites correlates with the yield strength normalized by the shear modulus for fcc CrMnFeCoNi-based HEAs. We therefore analyze the distributions of atomic displacements from the ideal lattice sites for each element in the investigated equiatomic bcc alloys. The where r ideal i and r i denote the internal atomic position of the ith atom at the ideal lattice sites and after the relaxation, respectively. Note that the reference ideal lattice is set to Figure 5 shows the results. Particularly large displacements are found in ZrNbHf, TiZrNbHfTa,  Table I. and TiZrMoHfTa, which include 60% or more of group-4 elements (Ti, Zr, Hf). This is probably because the bcc phase of the group-4 elements is known to be dynamically unstable at low temperatures [72][73][74][75], and hence the group-4 elements do not prefer to be located at the ideal bcc lattice sites at 0 K in DFT simulations. ZrNbHf and TiZrMoHfTa show in fact relatively low phonon frequencies compared with other refractory-element based alloys [76,77], indicating again less dynamical stability of these alloys in the bcc phase. The 023608-7 dynamically unstable group-4 elements cause large displacements, i.e., large local lattice distortions. Consequently, not only the group-4 elements but also Nb and Ta show large displacements due to the large local lattice distortions around the atoms, even though Nb and Ta themselves favor ideal bcc lattice sites in their elemental states. The relatively large lattice distortions in ZrNbHf, TiZrNbHfTa, and TiZrMoHfTa may also affect V Bader , ρ Bader , and σ Bader ; in these alloys, relatively large distributions are found for the atomic properties (Figs. 1, 3, and 4).

B. Dependence of the distributions on the local environment
A main outcome of the analysis performed in Sec. III A is the widespread distribution of all investigated atomic properties V Bader , ρ Bader , and σ Bader -even when considering the same element in any of the alloys. In order to shed light on the origin of these distributions, we now analyze their relation to the local chemical environments which are known to affect various properties of HEAs such as, e.g., stackingfault energies (SFEs) or chemical short and long-range order [12,[78][79][80][81][82][83][84][85][86][87][88][89][90][91][92][93].
To this end we perform a linear regression of the atomic properties as a function of the local composition around each atom. Since the atomic stresses are most relevant for mechanical properties [23] we focus here on σ Bader . We have verified that similar results hold for V Bader and ρ Bader . The first, second, and third nearest-neighbor (NN) shells of the atoms, which include 8, 6, and 12 atoms, respectively, are considered. The compositional dependence of σ Bader is fitted to the following linear equation: where x i, j denotes the concentration of the ith element in the jth nearest neighbor shell, N the number of elements in the alloy, and a 0 and a i, j linear regression coefficients. Note that, for an N-component alloy, the concentrations of only N − 1 elements are independent for each shell, and therefore the summation over the elements in Eq. (4) runs only up to N − 1. To evaluate the regression quality, the following adjusted coefficients of determination R 2 adj are employed: where p is the total number of explanatory variables and n is the sample size. The coefficient of determination R 2 is computed as where y i and f i are the raw and the predicted values of the i th data point, respectively, andȳ is the mean value of {y i }. R 2 is equal to 1 when the model is perfect and decreases when the regression becomes worse. Unlike R 2 , it is possible that R 2 adj decreases when increasing the number of explanatory variables, indicating overfitting.
The resultant R 2 adj values from the linear regressions with Eq. (4) are shown in Table III (Appendix B). For a few alloys (MoTa, NbMoTa, NbMoTaW), already the first shell (1NN rows) provides an excellent description of the local chemical environment. Inclusion of the second shell into the regression (1 + 2NN rows) provides a very good description of the local environment for most of the alloys. A clear exception is ZrNbHf for which R 2 adj is only 0.370 and 0.729 for Zr and Hf, respectively. This may be attributed to the large structural distortions in this alloy, as demonstrated by the local atomic displacements in Fig. 5. When considering up to the 3NN shell, R 2 adj does not increase significantly; for some cases, it becomes even smaller indicating overfitting. We can conclude that σ Bader can be well described and predicted based on the local compositions up to the 2NN shell.
In Sec. III A, we have indicated the correlation of the DFT computed atomic properties with the VEC at several places. It seems natural to ask whether these correlations can be quantified further. We thus investigate the following linearregression model with the VEC as the variables: where VEC 1NN and VEC 2NN are the differences of the local VEC in the 1NN and the 2NN shells from the VEC TABLE I. Linear regression results of the elementwise-averaged atomic properties V Bader , ρ Bader , and σ Bader as a linear function of the VEC of the alloys [Eq. (9)]. The column VEC M shows the VEC for the elements. The columns y M show the predicted values when VEC of the elements are given in Eq. (9). The columns R 2 show the coefficients of determination. Note that the results for Zr and Hf are obtained only from three alloys with low VECs, and the corresponding fits might hence be less predictable for high VECs than the results for the other elements. of the center atom, respectively. In writing Eq. (7) we have used the recognition of the above regression results to restrict the parameters up to the second shell. Note, however, that the regression model of Eq. (7) is simpler than the one of Eq. (4) in that it contains only two linear-regression parameters. The model accuracy is tested by cross-validation (see Appendix C for details).
The results of the linear regressions with Eq. (7) are summarized in Table IV (Appendix B). In most cases, R 2 adj is over 0.8, indicating good predictive capability using just the two parameters, VEC 1NN and VEC 2NN . Exceptions are found for group-4 elements in ZrNbHf, TiVNbTa, and TiZrMoHfTa; R 2 adj is at most about 0.75. The less accurate fits may be caused, as already discussed above, by the larger local lattice distortions for these alloys (Fig. 5). The slightly reduced R 2 adj compared with the regression model based on the local compositions, i.e., Eq. (4), should be attributed to the residual chemical differences among the chemical elements in the same group, which cannot be captured by the VEC alone. Comparing a 1NN and a 2NN , the former is substantially larger than the latter, indicating that the 1NN shell has a stronger impact on σ Bader than the 2NN shell, as intuitively expected. The main conclusion based on this analysis is that, given that local lattice distortions are not too strong, the local atomic properties can be well parametrized by local model Hamiltonians including the first two shells.

C. Universal models for a prediction of σ Bader
In Sec. III B, we have expounded the good performance of VEC 1NN and VEC 2NN in predicting σ Bader for the different alloys and elements separately. Here we would like to generalize the linear-regression model in order to make it applicable not only for a specific alloy composition, but rather universally to the class of bcc refractory alloys.
For that purpose we again apply Eq. (7); however, now not for the individual alloys and elements, but instead taking the data of all the 1609 atoms in all the investigated alloys as input to a single fit. The model accuracy is tested by cross-validation (Appendix C). For σ Bader , the such obtained linear-regression coefficients are a 0 = −6.7 GPa, The R 2 adj value is 0.822, indicating a rather good predictive capability of the fit even when using only the two descriptors VEC 1NN and VEC 2NN . Figure 6(a) compares the thus predicted σ Bader values with the original DFT values. The data points are found to be divided into several subgroups, which can be assigned to the different element types.
Based on this result, we anticipate that the predictive power of the model can be improved upon inclusion of element-specific information. Thus we next introduce the DFT-computed pure-metal atomic volumesṼ listed in Table II (Appendix A) into the fitting procedure. Specifically, we compute the averages ofṼ over the 1NN and 2NN shells of each atom and then subtractṼ of the respective center atom ( Ṽ 1NN and Ṽ 2NN , respectively). The corresponding The model accuracy is tested by cross-validation (Appendix C). The resultant linear-regression coefficients 023608-9   It must be noted that, even though the two universal linear-regression models provide high R 2 adj values, it does not mean that these universal models have a predictive power as good as those obtained for the individual alloys and elements. The reason is that the less-predictable cases seen for group-4 elements in element-and-alloywise models in Sec. III B and Table IV are smeared out in the universal models. The universal models should, nevertheless, be useful because they can be immediately applied to refractory alloys with, e.g., nonequiatomic compositions and/or chemical short-range order, without explicit DFT calculations, once their chemical configuration is given.

D. Relation to the averaged valence-electron concentration
The above analysis has shown a strong correlation between the computed atomic Bader properties and the VECs of the nearest neighbor shells. We show now that, by further data coarse graining, we are able to establish a simple linear relation between corresponding averaged properties. Figure 7 shows the atomic properties averaged over the individual elements (denoted by angle brackets) as a function of the overall VEC of the considered alloys. For each element, the three investigated atomic quantities depend linearly on the alloys' VEC, almost independently of their actual chemical composition. The dependence of each atomic property on the VEC is qualitatively similar among the elements. Specifically, V Bader decreases, ρ Bader increases, and σ Bader decreases with increasing VEC. To quantify the dependence on the VEC, the results were fitted to a linear equation, i.e., where y stands for the elementwise-averaged atomic property, i.e., V Bader , ρ Bader , and σ Bader , and a and b are linear regression coefficients. Table I summarizes the obtained regression results. Among the three investigated quantities, ρ Bader shows particularly high coefficients of determination R 2 for all the elements, indicating a strong linear relation between ρ Bader and the averaged VECs. It is also interesting that, when plugging the VECs of the constituent elements (VEC M in Table I) into the linear regression model, the thus predicted values of V Bader approximately recover the equilibrium volumes of the pure metals (see Table II).

IV. CONCLUSIONS
We have analyzed atomic volumes, charges, and stresses in various equiatomic bcc refractory alloys ranging from binaries up to high-entropy alloys (HEAs) employing a Bader analysis. The atomic properties show unexpectedly large distributions even for the same element in the same alloy, indicating substantial impact from the local chemical environment. These atomic properties are also strongly linearly correlated with each other in each alloy and for each element, indicating the physical consistency among them as all are based on the same Bader analysis concept. The elementwise averages of the computed atomic properties are found to correlate well with the VEC of a given alloy. From coarse graining the atomistic data, the atomic stresses for each element are found to be well predicted by linear equations as functions of the local compositions or the VECs in the 1NN and 2NN shells. This quantitatively demonstrates that distributions of atomic stresses, as well as those of atomic volumes and atomic charges, originate from local chemical deviations from the equiatomic alloy composition. The findings clearly emphasize  the importance of taking local chemical environments into account to analyze properties of disordered alloys. The variation of the local atomic properties between the different elements in the same alloy can be very significant. The computed atomic Bader volumes can vary by several Å 3 . From the Bader charges we could observe that up to about one electron of charge can be transferred between the different element types. The resulting Bader stresses showed a very remarkable local variation of the stresses acting on the atoms, in the range of 100 GPa. These local stress levels are significantly higher than those in fcc alloys and also higher than those in the vicinity of dislocation cores or at grain boundaries. The observed gigantic local stresses due to chemical disorder are therefore likely to impact the interaction among these atomic structural defects in chemically disordered alloys.
Atomic properties such as local atomic volumes have been successfully used as an input, e.g., for modeling solid-solution strengthening in HEAs. The physically consistent Bader-analysis-based atomic properties are found to be almost uncorrelated with Voronoi volumes. This indicates that the Bader and the Voronoi volumes capture different characteristics of the atoms and hence could be used as distinct descriptors when modeling the solid-solution strengthening in HEAs. Since the Bader-analysis-based atomic properties are also found to be well predicted based on local chemical compositions, it becomes possible to screen wide compositions to find, e.g., compositions with high strength without employing explicit and thus computationally expensive supercell calculations. The present insights into the correlation between the key descriptors can also be used to design coarse-grained descriptions of the electronic structure (e.g., tight binding or bond order potentials [94]) for such chemically complex alloy systems.

ACKNOWLEDGMENTS
We thank Andrei V. Ruban for sharing his code to generate the SQSs. This work was supported by "Materials research

APPENDIX A: OPTIMIZATION OF LATTICE CONSTANTS
Before performing the analysis of the local atomic properties discussed in the main text, we first determined the lattice constants of the investigated refractory alloys employing the VASP code [96][97][98]. The plane-wave basis PAW method [53] was employed with the GGA of the PBE form [54]. The plain-wave energy cutoff was set to 400 eV. The Brillouin zones were sampled by -centered 6 × 6 × 6, 4 × 4 × 4, 4 × 4 × 4 k-point meshes for the 54-atom, 125-atom, 128-atom supercell models, and the Methfessel-Paxton scheme [99] was employed with a smearing width of 0.1 eV. The total energies were minimized until the energy differences became less than 10 −3 eV per simulation cell for each ionic step, and internal atomic positions were relaxed until the residual forces became less than 5 × 10 −2 eV/Å. Several volumes were first computed with relaxing the internal atomic positions from the ideal lattice sites. The equilibrium lattice constants were then obtained by fitting the energy-volume relations to the Vinet equation of state [100]. Table II shows the obtained lattice constants.

APPENDIX B: RESULTS OF LINEAR REGRESSION
Adjusted coefficients of determination, R 2 adj , for the linear regression model corresponding to Eq. (4) are listed in Table III. The linear regression coefficients, a 0 , a 1NN , a 2NN , TABLE VI. Root-mean-squared errors (RMSEs; in GPa) of σ Bader for the linear regression model in Eq. (7) based on the local VEC applied to the permuted chemical configurations for the ternary alloys. The first configuration ABC is used for constructing the linear regression model (hence these values correspond to the respective ones in Table IV adjusted coefficients of determination, R 2 adj , and root-meansquared errors (RMSEs) for the linear regression model corresponding to Eq. (7) are listed in Table IV.

APPENDIX C: QUALITY OF THE LINEAR REGRESSION MODELS
In order to examine the quality of the linear regression model in Eq. (7) for the individual alloys and elements based on the local valence electron concentration (VEC), we performed a repeated k-fold cross-validation analysis. The k value was set to 3 for ternary and quinary alloys, and to 4 for binary and quaternary alloys. The repeat count was 10. The resultant RMSEs of σ Bader for the test sets are listed in Table V. There is no significant degradation compared with the RMSEs for the whole data sets (given in parentheses and corresponding to Table IV) demonstrating no overfitting and hence a good predictive power of the linear regression models. This also indicates that the present supercells well sample different local chemical environments, i.e., they are converged with respect to the supercell size.
To further test the transferability of the linear regression model in Eq. (7), we applied the obtained fits to chemical configurations not used for the training. The new configurations were constructed by permuting the chemical elements in the SQS models. For ZrNbHf, for example, a new distinct chemical configuration "NbHfZr" can be obtained by replacing Zr with Nb, Nb with Hf, and Hf with Zr. The internal atomic positions of the such obtained new chemical configurations were relaxed and then σ Bader was computed. Table VI shows the RMSEs of σ Bader for the supercell models with the permuted chemical configurations for the ternary alloys. No significant degradation is found for the RMSEs compared with the original configurations (ABC). This, again, indicates a good transferability of the linear regression model in Eq. (7).
We also performed a 10-times repeated eightfold crossvalidation analysis for the universal models corresponding to Eqs. (7) and (8) in Sec. III C. The resultant RMSEs are 26.2 GPa and 12.8 GPa, respectively. These values are basically the same as the RMSEs obtained for the whole data sets (i.e., 26.2 GPa and 12.7 GPa as also given in Fig. 6), clearly indicating no overfitting.