Inapplicability of exact constraints and a minimal two-parameter generalization to the DFT+$U$ based correction of self-interaction error

In approximate density functional theory (DFT), the self-interaction error is an electron delocalization anomaly associated with underestimated insulating gaps. It exhibits a predominantly quadratic energy-density curve that is amenable to correction using efficient, constraint-resembling methods such as DFT + Hubbard $U$ (DFT+$U$). Constrained DFT (cDFT) enforces conditions on DFT exactly, by means of self-consistently optimized Lagrange multipliers, and while its use to automate error corrections is a compelling possibility, we show that it is limited by a fundamental incompatibility with constraints beyond linear order. We circumvent this problem by utilizing separate linear and quadratic correction terms, which may be interpreted either as distinct constraints, each with its own Hubbard $U$ type Lagrange multiplier, or as the components of a generalized DFT+$U$ functional. The latter approach prevails in our tests on a model one-electron system, $H_2^+$, in that it readily recovers the exact total-energy while symmetry-preserving pure constraints fail to do so. The generalized DFT+$U$ functional moreover enables the simultaneous correction of the total-energy and ionization potential or the correction of either together with the enforcement of Koopmans condition. For the latter case, we outline a practical, approximate scheme by which the required pair of Hubbard parameters, denoted as U1 and U2, may be calculated from first-principles.

Approximate density-functional theory (DFT) 1,2 underlies much of contemporary quantum-mechanical atomistic simulation, providing a widespread and valuable complement to experiment 3,4 . The predictive capacity of DFT is severely limited, however, by systematic errors 5-7 exhibited by tractable exchange-correlation functionals such as the local-density approximation (LDA) 8 or generalized-gradient approximations 9 . Perhaps the most prominent of these pathologies is the delocalization or many-electron self-interaction error (SIE) 8 , which is due to spuriously curved rather than correctly piecewiselinear total-energy profiles with respect to the total electron number 5,6 . This gives rise to the underestimated fundamental gaps 5 , charge-transfer energies 10 , and reaction barriers 11 characteristic of practical DFT. While the construction of viable, explicit density functionals free of pathologies such as SIE is extremely challenging 12 , significant progress has been made in the development of implicit functionals in the guise of corrective approaches. Examples include methods that operate by correcting SIE on a one-electron basis according to variationally optimized definitions, such as generalized [13][14][15] Perdew-Zunger 8 approaches, in which much progress has recently been made by generalizing to complex-valued orbitals [16][17][18] , and those which address many-electron SIE directly, such as Koopman's compliant functionals 7,19 .
An established, computationally very efficient DFT correction scheme is DFT+U [20][21][22][23][24][25] , originally developed to restore the Mott-Hubbard effects absent in the LDA description of transition-metal oxides. A simplified formulation [23][24][25] , in which the required Hubbard U param-eter is a linear-response property of the system under scrutiny 25 , is now routinely and diversely applied [26][27][28][29][30] . Beginning with Ref. 11, Marzari, Kulik, and co-workers have suggested and extensively developed 31,32 the interpretation of DFT+U as a correction for SIE, for systems in which it may be primarily attributed to distinct subspaces (otherwise, the related Koopman's compliant functionals are available 7,19 ). The SIE correcting DFT+U functional is given, wheren Iσ =P IρσP I , by Here,ρ σ is the Kohn-Sham density-matrix for spin σ andP I is a projection operator for the subspace I. DFT+U attains the status of an automatable, firstprinciples method when it is provided with calculated Hubbard U parameters 24,25,28,33,34 (particularly at their self-consistency 11,30,35 ), which may be thought of as subspace-averaged SIEs quantified in situ. The subspaces are usually pre-defined for corrective treatment, having been deemed responsible for the dominant SIEs on the basis of physical intuition and experience, although a further level of self-consistency over subspaces is possible using Wannier functions 36 . DFT+U effectively adds a set of penalty functionals promoting integer eigenvalues inn Iσ , and it replicates the effect of a derivative discontinuity in the energy, for each subspace I, by adding an occupancy-dependent potentialv Iσ = U I (P I /2 −n Iσ ). While DFT+U is effective and computationally efficient, even linear-scaling 37 , a considerable degree of care is needed to calculate the required U parameters, which sometimes pose numerical challenges 28,31 . Fully selfcontained calculations of the Hubbard U by means of automated variational extremization would be extremely useful for many practitioners, and expedient in highthroughput materials search contexts 38 . The constraintlike functional form of DFT+U , where the U I resemble the Lagrange multipliers of penalty functionals on the eigenvalues ofn Iσ , suggests the possible viability of such a method. Constrained density-functional theory (cDFT) 10,39,40 formalizes and automates the use of selfconsistent 41 penalty functionals in DFT, enforcing them as exact constraints by locating the lowest-energy compatible excited state of the underlying functional. It is effective for treating SIE, in its own right, for systems comprising well-separated fragments, where it may be used to break physical symmetries and to explore the integeroccupancy states at which SIE is typically reduced 10,42 . However, as we now demonstrate, cDFT is fundamentally incompatible with constraints beyond linear order, and therefore exact constraints cannot be used to correct SIE in an automated fashion. As a result, it seems that we cannot excite a SIE affected system to a state that will reliably exhibit less SIE, without breaking a symmetry.
The simplest conceivable SIE-targeting constraint functional is the quadratic form C 2 = I (N I − N I c ) 2 , where N I = Tr[n I ] is the total occupancy of a particularly error-prone subspace I and N I c is its targeted value, neglecting the spin index for concision. This constraint is a functional of subspace total occupancies, rather than the occupancy eigenvalues as in DFT+U , which is an important distinction for all but single-orbital sites. For N sites symmetry-equivalent subspaces with N I c = N c for all I, the total-energy of the system is given by where V c is a common cDFT Lagrange multiplier. This gives rise to a constraining potential of the formv c = 2V c I (N I − N c )P I , making explicit its dependence on the constraint non-satisfaction. This, in turn, implies an externally imposed interaction correction given byf c = 2V c IP IP I , which acts to modify the energy-density profile, and which is identical to that generated by DFT+U (Eq. 1) when V c = −U I /2.
Following Ref. 41 for the self-consistent cDFT problem, the Hellmann-Feynman theorem provides that the first energy derivative is simply the constraint functional, i.e., dW/dV c = C n , so that the total-energy W (V c ) always attains a stationary point upon constraint satisfaction, in this case when C 2 = 0. Fig. 1 depicts this function for an ideal system for the study of one-electron SIE 6,12 , H + 2 , simulated 43 using the PBE functional 9 . At the considered, intermediate bond-length of 4 a 0 , the overlap of the two atom-centered PBE 1s orbital subspaces yields a total occupancy double-counting of 24%, accounting for spillage. The observed asymptotic behavior of W (V c ) demonstrates that the C 2 constraint is unenforceable. Here, a target occupancy of N c = 0.5 e has been applied, necessitating a repulsive constraint and a positive V c , but the same qualitative outcome arises for any N c = N DFT .
The key to the failure of the C 2 constraint is the Lagrange Multiplier V c (eV) Total Energy (eV) The constrained total-energy of the stretched H + 2 system, with a target occupancy of Nc = 0.5 e per fixed atom-centered 1s-orbital subspace, against the cDFT Lagrange multiplier Vc. Also shown is the constraint functional C2 = (N − Nc) 2 , averaged over the two atoms, and its first and second derivatives, which all fall off rapidly with Vc. Fig. 2. This results in a diminishing returns as V c is increased, i.e., as the constraint asymptotically approaches satisfaction. Motivated by this, we investigate whether non-linear constraints of form C n = (N − N c ) n are unsatisfiable for any order n = 1 and for any target choice N c = N DFT . The n = 0 case is trivial, and the constraint is ill-defined for n < 0 since there the total-energy diverges upon constraint satisfaction. C n becomes imaginary for noninteger n with negative (N − N c ), so that we may limit our discussion to integers n ≥ 2. We begin by analyzing the derivatives of the total-energy W (V c ). The second derivative follows directly from the above discussion, as

fall-off of all self-consistent cDFT response functions
The energy derivative of order m generally involves cDFT response functions up to order m − 1, and positive integer powers of (N − N c ) which may vanish, depending on m and n, but not diverge. The cDFT response function dN/dV c may be gainfully expanded, ifv ext is the external potential, in terms of the intrinsic subspaceprojected interacting response function defined by χ = Tr[(dN/dv ext )P ], since this object is independent of the form of the constraint. The first-order cDFT response dN/dV c is thus expressed, by means of the chain rule in an expression which we have verified numerically. At any valid stationary point, C n = 0 and each of V c , χ, and its derivatives must remain finite. Thus, for n ≥ 2, the response dN/dV c and energy curvature d 2 W/dV 2 c both then vanish. The latter is therefore not a stationary point discriminant, and we move to higher derivatives, such as where the required second-order response is given by This object, and thus d 3 W/dV 3 c both vanish at stationary points for all n ≥ 2, due to the vanishing C n−1 in the first term of Eq. 5, and due to the vanishing first-order response (for which, see Eq. 3) in all remaining terms.
In general, the cDFT response function d m N/dV m c comprises terms proportional to response functions of the same type but of lower order, plus a single term which is proportional to a potentially non-vanishing mixed response function d m−1 χ/dV m−1 c multiplied by the necessarily vanishing C n−1 . This serves as an inductive proof that response functions at all orders, beginning with dN/dV c , vanish as we approach a vanishing C n , as illustrated in Fig. 2. Then, since each term in the m th energy derivative is always proportional to non-divergent powers of (N − N c ) and response functions of at most order m − 1, all energy derivatives tend to zero, as depicted in Fig. 1, proving the conjecture. Thus, non-linear constraints of SIE-targeting C n form cannot be enforced.
In order to cast the SIE-targeting C n functional into a viable form, one possible option remains. We may expand the single-site C 2 , for example, as , and afford an additional degree of freedom to the system by decoupling these two terms. Writing the result in the notation of DFT+U , by change of variables, we arrive at the constraint energy The vanishing response problem is now circumvented, by interpreting the Hubbard U parameters for linear and quadratic terms as separate Lagrange multipliers. Adapting Eq. 6 to multiple, multi-orbital sites and neglecting inter-eigenvalue terms, in the spirit of DFT+U , while retaining only the free-energy 44 (setting N I c = 0), we arrive at the generalized DFT+U correction given by Tr n Iσ − Tr n Iσ2 . Here, the DFT+U functional of Eq. 1 is recovered by setting U I 1 = U I 2 . Otherwise, the corrective potential is modified tov I U1U2 = U I 1P /2 − U I 2n I , so that the characteristic occupancy eigenvalue dividing an attractive from a repulsive potential is changed from 1/2 to U I 1 /2U I 2 . Self-consistency effects aside, the U 2 parameters are responsible for correcting the interaction and for any gap modification, while the U 1 parameters may be used to adjust the linear dependence of the energy on the subspace occupancies, and thereby to refine eigenvalue derived properties such as the ionization potential. We note a resemblance between Eq. 7 and the three-parameter DFT+U αβ functional proposed in Ref. 45, where here a third degree of freedom may be retained by using N I c = 0. Fig. 3 shows the total-energy W of the H + 2 system as before, but now against the U 1 and U 2 defined in Eq. 6, with a subspace target occupancy of N c = N exact = 0.602 e, (the population of each of the two PBE 1s orbital subspaces, calculated using the exact functional). The total-energy is non-uniquely maximized along the heavy white line where the constraint is satisfied, at ∼ 0.92 eV below the exact energy. To understand why the total-energy is always degenerate, and hence why the occupancy condition under-defines the pair (U 1 , U 2 ), it suffices to show that the Hessian of the constraint functional, H ij = d 2 W/dU i dU j 46 , is everywhere singular. The determinant of H ij is conveniently calculated in terms of the response functions, i.e., by using the ground-state expressions dW/dU 1 = N/2 and dW/dU 2 = −N 2 /2, as as required, for all U 1 and U 2 . This implies a vanish- against the Lagrange multipliers U1 and U2 defined in Eq. 6. The subspace target occupancy is set to Nc = Nexact, and the zero is set to the exact total-energy. The constraint is satisfied at the total-energy maximum along the solid white line. The ionization potential is exact along the thick dashed line. The linear-response Hubbard U2, together with the U1 value needed to correspondingly recover the exact subspace density, are shown using thin dashed lines.
ing energy curvature along the lines on which the corrective potential is constant. The linear-response Hubbard U at this bond length, calculated using a method adapted from Ref. 25, is 4.84 eV. If we intuitively set U 2 = U , then a corresponding U 1 = 3.16 eV is required to recover the exact subspace density. The line on which the ionization potential is exact (for the special case of H + 2 , this means that the occupied Kohn-Sham state eigenvalue and the ion-ion energy sum to the exact total-energy, written ε DFT + E ion-ion = E exact ), intercepts U 1 ≈ 0 eV at the linear-response U 2 = U , echoing the 'SIC' double-counting correction proposed in Ref. 47. Finally, for the constrained total-energy, we note that while it can be tuned to reach a maximum at the exact energy for a plausible target, N c = 0.511 e, an unreasonable U 1 = U 2 = −444.5 eV is required to do so. We conclude, therefore, that an SIE affected ground-state cannot be systematically excited to a state that is less so by means of exact constraints, without breaking a physical symmetry. Put another way, the total-energy cannot typically be SIE-corrected by altering the density alone, and a non-vanishing energy correction term is required.
Such a correction is provided by the generalized DFT+U term of Eq. 7, the total-energy generated by which is shown in Fig. 4. The zero of energy and the heavy dashed line show E exact , and the thin dashed lines indicate the Hubbard U 2 = 4.84 eV and corresponding U 1 = 4.44 eV required to recover it. E exact is attained by a traditional DFT+U calculation at U 1 = U 2 = 3.85 eV, and the intersection of the heavy solid and dashed lines yields the pair, U 1 = 5.73 eV and U 2 = 6.98 eV, at which E exact and N exact are located. At the same point, the Kohn-Sham eigenvalue ε DFT lies at ∼ 2.8 eV above ε exact , reflecting that an accurate total-energy at a particular occupancy may coincide with an inaccurate ionization energy, and vice versa, as was recently shown in detailed analyses of the residual SIE in hybrid functionals 48 and in DFT+U itself, at fractional total occupancies 32 .
The generalized DFT+U functional enables simultaneous correction of the ionization potential and totalenergy, or the correction of either together with Koopmans' condition 5,7,8 . In the one-electron case, as in H + 2 , Koopmans' condition may be enforced at ε exact . The approximately parallel solid curves of Fig. 5 illustrate the large U 1 and U 2 values required to do so, as a function of internuclear distance. To estimate these, we have used the convenient feature of H + 2 that the PBE 1s orbital subspace projectors closely match Kohn-Sham orbitals in spatial profile, almost exactly so at dissociation. As demonstrated by the occupancy curves in Fig. 5, this implies a negligible charge and kinetic self- consistency effect, vanishing entirely for U 1 = 2N DFT U 2 , in subspace-uniform corrections such as those in question. For N sites equivalent one-orbital subspaces spanning the energy window responsible for ε DFT , the density non-self-consistent U 1 and U 2 are derived from E exact ≈ E DFT + N sites U 1 N DFT − U 2 N 2 DFT /2, where N DFT = Tr [n DFT ], and ε exact ≈ ε DFT + (U 1 − 2U 2 N DFT ) /2, in which the subspace overlap and spillage are also neglected.
Since subspace response stiffening very typically results from the application of a conventional DFT+U correction 11,35 , it is promising to construct a charge non-selfconsistent first-principles calculation scheme for U 1 and U 2 , e.g., for use in refining DFT+U calculations in order to approximately enforce Koopmans' condition. For this, let us suppose we have calculated a conventional U which reconciles the total energy reasonably, so that DFT /2. We may combine this with the previous two equations and a further requirement for Koopmans' condition at an accurate energy, i.e., ε exact = E DFT [N ] − E DFT [N − 1], where the latter is the DFT-estimated total-energy of the ionized system. This results (see Fig. 5 open circles for data) in we define the 'Koopmans U '. We emphasize that only convenient, approximate DFT quantities are used in these formulae. The interdependence U 1 − U 2 N DFT ≈ U (1 − N DFT ), for any value of N sites , reveals the role of U in splitting U 1 and U 2 . In H + 2 , Koopmans' condition pushes both up to considerably higher values than are commonplace 28 for the conventional U of DFT+U , which, following Ref. 32 and given N PBE , lies close to its regime of minimal efficacy for eigenvalue correction. The Koopmans fraction of each parameter, U K /U 1 or U K / (N DFT U 2 ), generically denoted by 'U K /U ' in Fig. 5, lies close to unity for U 2 at all H + 2 bond lengths, and it exceeds unity slightly when the U K and U -related contributions tend to cancel. U 1 is also U K -dominated at short bond lengths, at which U 1 ≈ N DFT U 2 , before ultimately falling off to the average of U and U 2 in the fully dissociated limit. If the outlined proposed scheme is applied to finesse an existing DFT+U calculation that is already accurate for recovering the total-energy, using a linear-response 11,25 or otherwise calculated U , call it U 0 , then U = 0 eV is the appropriate value to use in our approximate formulae of Eq. 9. An approximately Koopmans' compliant DFT+U calculation then results from the use of the parameters U 0 +U K and U 0 +U K /N DFT+U0 , in place of U 1 and U 2 . The proposed scheme may be generalized to multi-orbital subspaces straightforwardly, in terms of the eigenvalues ofn I DFT instead of N DFT . The constant-N DFT approximation may be replaced by a linear-response approximation, in terms of χ, or lifted entirely by means of a parametrization of the occupancies and a numerical solution of the resulting equations.
To conclude, we have proven analytically, with stringent numerical tests, that non-linear constraints are incompatible with cDFT. It is not possible, therefore, to automate systematic SIE corrections of DFT+U type by means of cDFT, notwithstanding the great utility of the latter, e.g., for correcting SIE by promoting broken symmetry, integer-occupancy configurations well described by approximate functionals 10,42 . Nonetheless, we have found that the cDFT free-energy functionals, dubbed 'generalized DFT+U ' functionals, offer the intriguing capability of simultaneously correcting two central quantities in DFT, the total-energy and the highest occupied orbital energy. Our approximate formulae for the required parameters, which may differ greatly from the familiar Hubbard U , offer a framework within which to further develop double-counting techniques and first-principles schemes for the promising class of SIE correcting methods based on DFT+U 11,28,31,32 , as well as opening up possibilities for their diverse application. We envisage that SIE correction schemes of two or more parameters may also be useful for generalizing the exchange fraction of hybrid functionals 48 , and for DFT+U type corrections of perturbative many-body approximations such as GW 49 , the deviation from linearity of which is somewhat analogous to that of approximate DFT 50,51 . For the analysis and correction of spuriously self-interacting multi-reference systems, we may learn much from the exact solution of minimal models 52 .
This work was enabled by the Royal Irish Academy -