Space partitioning of exchange-correlation functionals with the projector augmented-wave method

We implement a Becke fuzzy cells type space partitioning scheme for the purposes of exchange-correlation within the GPAW projector augmented-wave method based density functional theory code. Space partitioning is needed in the situation where one needs to treat different parts of a combined system with different exchange-correlation functionals. For example, bulk and surface regions of a system could be treated with functionals that are specifically designed to capture the distinct physics of those regions. Here we use the space partitioning scheme to implement the quasi-nonuniform exchange-correlation scheme, which is a useful practical approach for calculating metallic alloys on the generalized gradient approximation level. We also confirm the correctness of our implementation with a set of test calculations.


I. INTRODUCTION
Density functional theory (DFT) 1,2 has become a standard technique for the computation of electronic properties of molecules and periodic systems. As a theory DFT is exact, but in practical calculations one of the functionals, the exchange-correlation (XC) functional, has to be always approximated. One of the great strengths of DFT is that even the simplest XC approximation, the local density approximation (LDA) [2][3][4][5] , turned out to be remarkably useful, albeit mostly for physics related applications and to a lesser degree for chemistry.
The next family of XC approximations beyond the LDA are generalized gradient approximations (GGAs), which offer fairly ubiquitous improvements over LDA. One of the most well-known GGA XC functionals is the Perdew-Burke-Ernzerhof (PBE) functional 6 , which has earned itself the rank of a "standard functional". Though even more sophisticated XC approximations, such as meta-GGAs 7-9 and hybrid XC functionals 10,11 , are being developed and further improved, GGAs are still the most sensible choice for many applications due to their excellent computational speed versus accuracy ratio.
The potential accuracy of GGAs, however, has a certain ceiling, because the information that a GGA functional has about any given system is limited to the electron density n and the gradient of the density ∇n. This deals with the computation of periodic and solid-state systems on GGA level, and in this realm there exists evidence that we might be approaching the accuracy limit. For example, a recent paper by Tran et al. 12 15 , and SG4 16 ) are all clustering above a mean absolute relative error (MARE) of about 0.5%. This suggests that the MARE of ∼ 0.5% represents an accuracy barrier that has proved difficult to breach on the GGA level. The accuracy limit of GGA functionals has important consequences.
In the work of Tian et al. 17 it has been discussed why GGA functionals often yield poor formation energies for metallic binary alloys, and the reason is connected to the GGA lattice constant accuracy limit. An accurate formation energy requires that the equation of states (and therefore lattice constants) of all alloy components are described accurately. On GGA level this is often impossible.
In this paper we utilize the "fuzzy cells" space partitioning concept of Becke 18 to circum-vent the accuracy limit of the GGA level by implementing a space partitioned GGA XC functional QNA 19,20 within the state-of-the-art electronic structure code GPAW 21,22 . Our starting point is the PBE-family (PBE, PBEsol), whose functionality is governed by two parameters noted as µ and β. The parameter µ gives the strength of GGA corrections over LDA exchange and β gives the strength of corrections over LDA correlation. Thus, in general, accurate description of XC effects only with semi-local description of PBE-ansatz with energy functional E xc [n(r); µ, β] = drn(r) (n(r), |∇n(r)| 2 , µ, β) is not possible, where is the XC energy per particle. A generic element within chemistry is obviously a single atom, and without any external fields, the external potential v ext (r) is solely a function of the atomic positions. Especially in case of solid alloys with metallic bonds we consider improving GGA by explicitly parametrising it in a volume around each atom species. In this approach, called quasi-nonuniform approximation (QNA) 19,20 , the XC functional is no longer a density functional theory in a strict sense, but becomes also a function of atomic positions and in- where R a is the atomic positions and µ a and β a are atom-specific parameters described later in the text. The approach has been previously implemented in exact muffin-tin orbitals (EMTO) method [23][24][25][26] and good results has been obtained for various binary alloys 17 .
However, in the original implementation volumes with a strict Voronoi partition were used, rendering the local PBE-ansatz parameters discontinuous with respect to r. Here we overcome the difficulty by employing the fuzzy cells space partitioning concept of Becke 18 , which allows the computation of analytic QNA forces and stress tensor. For efficient calculations, we implement the projector augmented wave method corrections 27,28 to QNA within the projector augmented-wave method based DFT code GPAW. Atomic Simulation Environment (ASE) 29,30 is used thorough the article for handling the atomic geometries and optimizations.

II. IMPLEMENTATION
The QNA scheme essentially generalizes the µ and β parameters of PBE XC functional 6 into space dependent µ(r) and β(r) fields where µ a and β a are optimized parameters corresponding to a given element occupying atomic site a. Consequently, the QNA XC energy can be written in the form is the PBE-type XC energy density per particle. The µ(r) and β(r) fields should interpolate sharply between atoms and in practice this creates the need to divide space into Voronoitype atomic site centered regions. Space division can be accomplished by appropriate weight fields w a (r). The value of w a (r) should approach unity close to atomic site a, and decay smoothly to zero away from site a. Additionally, it must always hold that a w a (r) = 1.
We define the weights as which follows the fuzzy cells concept first developed by Becke 18 . In the fuzzy cells scheme P a (r) are atomic site centered partial weights, which have the value one at the atomic site R a and decay to zero when the distance |r − R a | becomes large. P a (r) could be defined many different ways, but here we will use which is very similar to the expression developed in Ref. 31. The parameter λ controls the location of the transition from 1 to 0 and α controls the sharpness of the transition. We have found that values λ = 1.2 and α = 2.0 give partitioning that is very close to exact Voronoi cells, and also the most accurate formation energies. The calibration of formation energies has been done by calculating the formation energies of ordered Cu 3 Au and CuAu 3 (L1 2 ), and CuAu (L1 0 ) and then comparing them to previous EMTO QNA results 32 .
For periodic and solid-state calculation the expression of Eq. (6), and that of Ref. 31, for P a (r) is particularly beneficial, because the computational load of Eq. (6) scales only linearly as a function of nuclei. This is in contrast to the quadratic scaling of the original Becke form and others 33 , which are often used in chemistry. Chemistry calculations routinely employ computationally heavy hybrid XC functional, which means a quadratic scaling P a (r) is responsible for only a fraction of the total computational load. However, in solid state physics fast semilocal LDA and GGA XC functionals are popular, which can easily cause a quadratic scaling P a (r) to become a computational bottleneck.
Performing geometric relaxations using the QNA scheme requires the computation of forces and the stress tensor. For all-electron case, the XC potential can be evaluated in the where the dependence on µ(r) and β(r) is purely parametric as they are not explicit functions of density but of nuclear coordinates. This equation is now in useful form, as it allows simple analytical gradients.
We now consider the projector augmented wave method implementation and begin with a brief introduction of relevant concepts. The general idea in PAW-methods is that the Kohn-Sham equations are solved for smooth wave functions (ψ n (r)), but retaining one-toone mapping with the all-electron wave functions (ψ(r)). There exists a linear PAW transformation operator which defines a mapping ψ(r) = Tψ n (r) and the Kohn-Sham equations are derived to be T † HTψ n (r) = T † Tψ n (r). Furthermore, several pseudo quantities are defined, such as the pseudo charge densityρ(r) and the pseudo density (ñ σ (r)) which is written as (we consider here a spin-paired system for simplicity and drop the spin index) In GPAW code, in all of the grid, LCAO and plane wave modes, this density is defined in Cartesian real space grid with grid-spacing typically between 0.07-0.15Å. Furthermore, around each nucleus a, one can define the pseudo (ñ a (r)) and all-electron densities (n a (r)) which are defined in logarithmic radial grid where the angular part is expanded using 50 Space partitioning of exchange-correlation functionals with the projector augmented-wave method Lebedev points. For r > r c , where r c is the PAW cutoff of an atomic sphere, it holds that n a (r) = n a (r) with their derivatives also matched. The crux of PAW implementation of QNA, is to define analogous quantitiesμ(r),μ a (r) and µ a (r) andβ(r),β a (r), and β a (r) respectively. Although the end result is simpler than this, we utilize these quantities when deriving to quantify the approximations made. For now, the three versions per µ and β fulfill the same rules as density quantitiesñ,ñ a , and n a .
The energy gradients in PAW formalism can be in general written in form where "h.c." denotes the Hermitian conjugate. In case of the QNA XC functional, it does not have explicit wave function dependence and hence we only need to consider the partial derivative − dE dR a . For local and semi-local functionals, the XC energy in PAW formalism can be written as (10) where the term is typically called the PAW-correction and it introduces the atomic density matrix as defined in Ref. 34. By taking the partial derivative with respect to nuclear position of Eq. (10), we arrive at Eq. (A1), which is presented in Appendix A. The first term in the right hand side of Eq. (A1) is already handled by GPAW, and it is solved by noting that in Eq. (8), only the pseudo core densityñ c (r) depends on atomic positions i.e.
Furthermore, the density functional derivatives of the form δE/δn in Eq. (12) or via the atomic wise quantities in Eq. (A1) are readily evaluated in GPAW via the typical Euler- where σ(r) = |∇n(r)| 2 .
Thus, regarding the new implementation of QNA forces, we are left with only the partial derivatives with respect to various µ and β parameters. Due to the atomic site centered µ(r) and β(r) fields E QNA XC has an additional dependency on the positions of the nuclei, which created the extra derivative chain rules in Eq. (A1). However, we have not yet defined theμ,μ a and µ a (in the following, β is defined analogously). To this end, we make a typical approximation, where a quantity almost constant within an augmentation sphere is assumed to be constant. In other words, we setμ a (r) = µ a (r) = µ a . Outside the augmentation sphere, the fact thatμ a (r) deviates fromμ a does not matter, since the correction vanishes since n a (r) =ñ a (r) there also. Inside the augmentation sphere, where n a (r) and n a (r) deviate, the region is so close to atom a that µ a term dominates in Eq. (1).
With these approximations, the QNA XC becomes At this point we can readily evaluate the remaining partial derivatives Inside Eqs. (16) and (17)  and they have been written out in Appendix A. In order to get the δw a (r)/δR a derivatives in Eqs. (16) and (17) we notice that which gives, for example, The whole gradient with respect to R a (∇ R a ) is therefore easily obtained from the gradient of r using the ∇ operator: With the help of Eq. (20) we obtain the following expression for the δw a (r) / δR a derivatives: The stress tensor is needed in order to relax the unit cell. Analogously to the case of forces, additional terms will manifest in the stress tensor formula, because the µ(r) and Since XC energy is an integral in real space, it can be shown that the XC contribution to the total stress tensor can be written as For simplicity, here we drop the pseudo (∼) notation and just use generic µ(r) and β(r).
Below we expand each term one at a time (function arguments are dropped for simplicity).
Equation (24) is the LDA-level term and it can be written as Equation (25) is the GGA-level gradient term and it can be written as where we have used the fact that ∂|∇n| 2 /∂∇n = 2∇n. These LDA and GGA terms are already handled by GPAW. Equation (26) arises from the fact that the µ(r) field changes as a function of strain and it can be written as Equation (27) is the β(r) field change and it can be written as ∂F PBE X /∂µ and ∂H/∂β terms were already derived in the Forces section. The ∂w a / αβ derivative can be written as To get the ∂P a / αβ derivative we use Eq. (15) of Ref. 35: The above equations can be used to implement the needed stress tensor corrections in GPAW, or any other "stress tensor compatible" DFT code for that matter, but it has been shown that terms like those of Eq. (26) and Eq. (27), i.e. ones that are a consequence of the fact that the space has been partitioned, seem to be so small that they fall below the general numerical accuracy of DFT codes 35 .

III. TEST CALCULATIONS
The correctness of the analytical QNA forces and stress tensor can be straightforwardly it has been shown that GGA-level functionals struggle to predict the formation energies of Cu-Au binary alloys with acceptable accuracy 17,36 . Table I shows the formation energies of Cu-Au binary alloys calculated with PBE and QNA using the present GPAW implementation. We see that the GPAW results for QNA are in good agreement with previously published EMTO results. The present GPAW implementation differs from the EMTO implementation in the way that in EMTO the space is by construction divided into Voronoi-cells that surround the muffin-tin spheres, and therefore does not need the fuzzy cells formalism. Nevertheless, the results between the two codes agree, which indicates that QNA results are not sensitive to the underlying implementation and that the stress tensor can be succesfully used with QNA to optimize the unit cell geometry.

Appendix B: Computational details
The analytical force and stress tensor test used the planewave mode with an energy cutoff of 600 eV and a 10 × 10 × 10 grid of Monkhorst-Pack k-points 44 . The version 0.9.20000 of PAW setups were used. Fermi-Dirac smearing was used with a width of 0.01 eV.
The ordered Cu-Au calculations used the planewave mode and an energy cutoff of 550 eV.
Monkhorst-Pack scheme was used to generate the k-point grids whose sizes were 20×20×20.