Enhancing the efficiency of density functionals with a novel iso-orbital indicator

Due to its efficiency and reasonable accuracy, density functional theory is one of the most widely used electronic structure theories in condensed matter physics, materials physics, and quantum chemistry. The accuracy and efficiency of a density functional is dependent on the basic ingredients it uses, and how the ingredients are built into the functional as a whole. An iso-orbital indicator based on the electron density, its gradients, and the kinetic energy density, has proven an essential dimensionless variable that allows density functionals to recognise and correctly treat various types of chemical bonding, both strong and weak. Density functionals constructed around the iso-orbital indicator usually require dense real-space grids for numerical implementation that deteriorate computational efficiency, with poor grid convergence compromising the improved accuracy. Here, a novel iso-orbital indicator is proposed based on the same ingredients that retains the capability to identify the same chemical bonds while significantly relieving the requirement of dense grids, and giving an improved recognition for tail regions of electron densities. The novel iso-orbital indicator is therefore expected to be the prime choice for further density functional development.


INTRODUCTION
Density functional theory (DFT) is, in principle, exact for the ground state energy and electron density of a system of electrons under a scalar external potential, conventionally solved through a set of Kohn-Sham (KS) auxiliary single-particle Schrödinger-like equations [1,2]. In practice however, the exchange-correlation energy, an essential but usually small portion of the total energy, must be approximated as a functional of the electron density. The computational efficiency of this scheme and the accuracy of modern exchange-correlation approximations has resulted in DFT becoming one of the most widely used electronic structure theories and arguably the only practical method for high-throughput discovery of novel materials currently available.
Exchange-correlation approximations can be broadly categorised by the ingredients used into 5 rungs of increasing non-locality [3]. The accuracy of an approximation usually increases when more ingredients are included through increased flexibility though this enhancement is often accompanied by a deterioration of efficiency, especially when non-local information is included. It is therefore critical to understand the ingredients of common density functional approximations and how they can be utilised to increase the accuracy of a functional whilst maximising the possible computational efficiency. As functionals of higher rungs are usually developed based on functionals of lower rungs, knowledge obtained for the ingredients and their combinations in lower-rung functionals can be transferred to the development of more complex functionals at higher rungs.
The most successful dimensionless variable so far [18,21,25,26] is, where τ vW (r) = |∇n(r)| 2 /8n(r) is the von-Weiszäcker kinetic energy density of a single-orbital system and τ UEG (r) = (3/10)(3π 2 ) 2/3 n(r) 5/3 is the kinetic energy density of the uniform electron gas. The α(r) variable is an iso-orbital indicator, recognising different types of orbital overlap environments and directly related to the electron localization function (ELF) [23,24]. Single-orbital regions are identified by α(r) = 0, highly-overlapped metallic regions with slowly-varying electron densities are revealed by α(r) ≈ 1 and regions of non-bonding density overlap are shown as α(r) 1. Despite the general success enjoyed by recent meta-GGA functionals for a wide range of systems [15,16,[19][20][21][26][27][28][29], it has been observed that many meta-GGAs, including those based on α, suffer numerical instabilities in self consistent field (SCF) calculations [27,30], and have an unacceptably slow convergence with respect to the density of the numerical integration points. The increased computational cost of dense numerical grids severely limits the usefulness of many meta-GGA functionals and restricts the complexity of systems they can be applied to. Additionally, the uncertainty in overall grid convergence necessitates a time consuming validation that calculated properties are properly converged in grid density, placing an undesirable burden of expertise on the user.
Within α-dependent functionals, It is understood that this numerical sensitivity originates from sharp oscillations in the exchange-correlation potential that are only properly captured by very fine grids [27], particularly in inter-shell regions where the local orbital overlap character can vary rapidly. Here we show that the rapid variations in the derivatives of α are largely responsible for these undesirable oscillations. Further, we propose a modified iso-orbital indicator quantity to correct this problem termed β, that contains similar information about local orbital overlap environments whilst having smoother derivatives more easily amenable to evaluation on numerical grids. This indicator variable can be used instead of α to construct  [35]. An acceptable convergence is assumed when difference remains below the SCF convergence tolerance of 1µE h (dotted line).
functionals at the meta-GGA level and higher, without suffering the numerical problems associated with the α variable.
To show how the β indicator can control the numerical problems observed in α-dependent meta-GGA functionals, we modify the existing meta-GGA made simple 2 (MS2) [31] and meta-GGA made very simple (MVS) [32] functionals by substituting β in place of α within the exchange enhancement factors, the only part of the chosen functionals that depend on α. These functionals were chosen as meta-GGA functionals with relatively simple constructions and well reported numerical instabilities [27], particularly in the case of MVS. Modification of the simple exchange enhancement factors for β dependence was trivial for these functionals, requiring only minor adjustments to the balances between internal parameters to preserve exact constraints obeyed by the parent functionals, derived in Sections 2 and 3 of the supplementary material. The resulting β modified meta-GGA functionals are termed MS2β and MVSβ after their parent MS2 and MVS functionals respectively and were implemented into the Turbomole package [33] using the XCFun library [34] to evaluate functional derivatives. The rate of grid convergence for the original and β modified functionals is shown in Figure 1 by plotting the difference in self consistent electronic energy relative to the same calculation on a very fine benchmark grid, as a function of grid point density for the carbon and lithium atoms, chosen as typically difficult systems with pronounced numerical sensitivity. In all cases the β modified functionals show a convergence in total energy at lower grid densities than their parent α functional, indicated by the difference to the benchmark grid energy remaining below the SCF convergence threshold of 1µE h .
The close relationship between the α and β variables can be shown by examining their limits in three important orbital overlap regions. Firstly, in single orbital regions τ (r) = τ vW (r) and α(r) = β(r) = 0. This bound is important in exchange functional development, allowing the strong lower bound on exchange energy of, to be enforced for all spin-unpolarised single-orbital systems and is naturally preserved through the β substitution, e.g., in MVSβ. Secondly, in slowly varying densities τ vW (r) → 0 and τ (r) ≈ τ UEG (r), so α(r) ≈ 1 and β(r) ≈ 1/2. Finally, in non-covalent density overlap regions where n(r) → 0, then τ UEG (r) → 0 as n(r) 5/3 while τ (r) − τ vW (r) → 0 as n(r), thus the denominator, τ UEG (r), decays faster than the numerator, τ (r) − τ vW (r), and α(r) → ∞. A different behaviour is observed for β(r), with both numerator and denominator decaying as n(r), thus β(r) approaches 1, though no simple relationship can be established between the asymptotic behaviour of α and β in non-covalent density overlap regions. These relations between α and β are summarised in Table I.    The α(solid blue) and β(dashed red) functions and density derivatives are plotted for the majority (Maj., ) and minority(Min., ) spin from self-consistent PBE [11] densities.
This similarity between α and β in highlighting local orbital overlap environment is shown graphically in Figures  2 a) and b) for the lithium and carbon atoms respectively. The inter-shell peak behaviour clearly visible for both variables as a rise away from single orbital like values of the 1s core. The difference between α and β is seen in the tail region of the carbon atom for the majority spin electrons, where α diverges upwards whilst β decays slowly towards 0. For the minority spin, both α and β approach 0 in the tail region. In general, β consistently indicates the tail regions of all densities with 0 as τ vW (r) is the leading order of τ (r) there, and identifies them completely when combined with the dimensionless reduced density gradient, s, that is divergent there. This is impossible for α which can have different values, α(r) = 0 for the tail regions of single orbital systems and α → ∞ otherwise.
The benefit of β over α for functional development is most clearly seen in the derivatives with respect to density, exemplified in Figure 2 c) and d). As previously noted, the sensitivity of α-dependent functionals to numerical integration grid point density is understood as arising from rapid oscillations in the exchange-correlation potential, expressed in the generalised Kohn-Sham scheme [27,36,37] as, ϕ pvxc ϕ q dr = ϕ p ∂e xc ∂n ϕ q dr + ∂e xc ∂∇n · ∇(ϕ p ϕ q )dr + 1 2 ∇ϕ p · ∂e xc ∂τ ∇ϕ q dr  Thus, functionals showing sharp oscillations in functional derivatives will show sharp variations in exchange-correlation potential that are challenging to capture in numerical integration schemes. By modifying meta-GGA functionals to use the β indicator in place of α, oscillations in functional derivatives are minimised and the resulting exchangecorrelation potential is smoothed. This relative smoothness in density derivative is clear in Figures 2 c) and d) for the lithium and carbon atoms respectively where the derivative of β with respect to density does not show the same rapid variation as that of α. Similar behaviour is observed for the density gradient and τ derivatives, as noted for α in Ref. [27] and in more complex molecular systems, which are presented in supplementary material Section 1. We can therefore understand the reduced grid sensitivity of the β modified functionals as a consequence of smoother functional derivatives producing smoother exchange-correlation potentials that can be more easily captured by numerical integration grids.
Whilst it is feasible to use the MS2β and MVSβ functionals constructed as a naive substitution of α for β, the parameters determined for the α functionals are unlikely to be optimal for use with β. For this reason we continue with two versions of each β modified funcitonal, one with the original parameters modified only to preserve exact constraints, and another for which the free parameters have been re-optimised (denoted using a * suffix) following the respective procedures of the parent functionals in Refs. [31] and [32] for MS2β* and MVSβ* respectively, again preserving exact constraints. Full optimisation details are given in supplementary material Section 2. Final parameters for each functional are given in Table II.
The performance of the β functionals, both with and without re-optimisation, was assessed for a number of small molecules by calculating the mean error (ME) and mean absolute error (MAE) of the small data sets of atomisation energies (AE6) and barrier heights (BH6) [38,39], the results for which are shown in Table III. The behaviour of the MS2 derived functionals was found to be markedly different to those derived from MVS, though it was observed for both functionals that a naive substitution of β for α degraded results compared to the original form for the considered data sets.
For the re-optimised MS2β* functional, the increase in ME and MAE caused by the naive β substitution is reversed, exceeding the accuracy of the original MS2 functional for AE6 whilst preserving BH6 performance. A converse relationship is seen between MVSβ and MVSβ*, with re-optimisation resulting in a further degradation of accuracy. In this case the single free parameter, e 1 , was adjusted to reproduce the linear term of expansion of the exchange energies of rare gas atoms with nuclear number Z [40], rather than fitting to empirical data sets. With suitable adjustment of the e 1 parameter the MVSβ* functional reproduces the required high-Z expansion linear coefficient to high accuracy, though we note that this adjustment causes significant underestimation of the individual rare gas atom electronic energies that comprise this fit, as well as considerable change from the equivalent parameter in MVS. This poor performance for total energies is also seen in the small molecule tests, for which both MVSβ and MVSβ* show a systematic over-binding.
The grid convergence of data set accuracy was established for the original and re-optimised β functionals by increasing the grid density and comparing resulting errors to the densest tested grid, summarised in Figure 3. A similar trend is seen to the convergence of atomic total energies, Figure 1, with β modified functionals reaching a converged energy at coarser numerical grids than the parent α functional. We note that whilst convergence for the BH6 set is relatively rapid in all cases, much greater variation is seen for the atomisation energies in the AE6 set, consistent with the larger energy scale of the test set.
The different behaviour of the α and β variables is most pronounced in regions of non-bonded density overlap where α(r) 1.0 and β is in the region 0.5 < β(r) < 1.0. This difference is most clearly examined in the Ar 2 dissociation curve, for which it has been shown meta-GGA functionals can be accurate around equilibrium [21,26,31]. The dissociation curves for Ar 2 are shown in Figure 4(a-b) for both the conventional and β modified MS2 and MVS derived functionals. Benchmark data is included from Ref. [43] for comparison. Convergence of mean absolute error for the AE6 and BH6 small molecule data sets [38,39], with respect to increasing numerical grid density for the MS2 (red, ×), MS2β* (blue, dashed, ), MVS (green, ) and MVSβ* (purple, dashed, ). The presence of both angular and radial components and variety of molecules prevents a simple quantitative measure of grid density, as such grid density is reported as the Turbomole grid level. In this way a higher grid level (larger x value) directly corresponds to denser integration grids, though steps between levels may not be uniform. All calculations were performed with the 6-311++G(3df,3pd) basis set [41,42]. In contrast to their thermochemical performance, MS2 derived functionals show a reversed trend in accuracy for predicted Ar 2 binding curve, shown in Figure 4 a). The naive β substitution, MS2β, closely matches the performance of the original MS2 functional across the whole of the binding curve. The re-optimised MS2β* functional however, shows some degradation of accuracy around equilibrium and repulsive wall regions. The reduced performance of MS2β* for Ar 2 compared to MS2 and MS2β contrasted with the improvement for the AE6 set indicates that the optimisation process may have over fit parameters to atomisation energies at the expense of accuracy for weakly bound systems. This raises the possibility that including weakly bound systems into the fitting data set for MS2β* may result in parameters that strike a preferable compromise in between systems, though as we offer the current β functionals only as a proof of concept this is beyond the scope of the current communication.
The MVSβ and MVSβ* functionals show a strong over-binding of the Ar dimer, Figure 4 b), consistent with the systematic over-binding observed in small molecule sets. The systematic over-binding seen for both Ar 2 and the small molecule sets can be understood as a consequence of the interpolation-extrapolation function within MVS that controls functional behaviour by interpolating between 0 ≤ α(r) ≤ 1 and extrapolating α(r) > 1 being inappropriate for β, despite parameters being rebalanced for the [0, 1) range of β. This raises an important conclusion for modifying existing meta-GGA functionals to use β in place of α: whilst there is a clear mapping between α and β in the specific orbital overlap limits, the same is not true for the transitions between these cases. This is particularly apparent in the α(r) > 1, (0.5 < β(r) < 1.0) region and as such, functions controlling interpolation-extrapolation behaviour between the limiting cases should be adjusted to account of the differing behaviour of β.
In conclusion, we have identified the source of the numerical sensitivities commonly suffered in calculations employing meta-GGA functionals as resulting from sharp oscillations in the functional derivatives of the dimensionless variable α(r) commonly employed in their construction and have addressed these sensitivities by constructing a related dimensionless variable, β(r), which imparts similar information about local orbital overlap environment whilst having smoother functional derivatives. The enhanced numerical stability and potential for improved description of density tail regions presented by β are an appealing opportunity for enhancing the numerical performance of future functionals for both the meta-GGA level and for higher level fully non-local functionals.
We have devloped two simple meta-GGA functionals, MS2β* and MVSβ*, as simple proof of concept functionals by replacing α with β from the original MS2 and MVS functionals, and adjusting internal parameters following the schemes in Refs [31] and [32] respectively. We find that in all cases numerical performance was improved in the new functionals with β forms showing converged properties at much coarser integration grids than the original functionals. Whilst the MS2β* functional preserves much of the good accuracy of its progenitor, the MVSβ* functional does not perform as well. This highlights a need for the underlying functional form to be adjusted for β to ensure performance for appropriate norms is preserved alongside an adherence to exact constraints. We therefore expect the modification of existing functionals to include β should be undertaken as a new construction that follows the design philosophies of the original α functional, rather than a straight substitution of α for β. We are optimistic that wholly novel functionals constructed around the β indicator can provide ever greater accuracy and utility for the wider DFT community.