The accurate computation of the electronic structure of molecular systems plays a central role in the development of methods in quantum chemistry,
but despite intense developments, no definitive solution to that problem has been found.
The theoretical challenge to be overcome falls into the category of the quantum many-body problem due to the intrinsic quantum nature
of the electrons and the Coulomb repulsion between them, inducing the so-called electronic correlation problem.
Tackling this problem translates to solving the Schr\"odinger equation for an $N$-electron system, and two roads have emerged to approximate the solution to this formidably complex mathematical problem: wave function theory (WFT) and density functional theory (DFT).
Although both WFT and DFT spring from the same problem, their formalisms are very different as the former deals with the complex
$N$-body wave function whereas the latter handles the much simpler one-body density.
The computational cost of DFT is very appealing as, in its Kohn-Sham (KS) formulation, it can be recast as a mean-field procedure.
Therefore, although constant efforts are made to reduce the computational cost of WFT, DFT still remains the workhorse of quantum chemistry.
From the theoretician's point of view, the complexity of the description of a given chemical system can be roughly
categorized by the strength of the electronic correlation appearing in its electronic structure.
Weakly correlated systems, such as closed-shell organic molecules near their equilibrium geometry, are typically dominated by the avoidance effects when electrons are near the electron coalescence point, which are often called short-range correlation effects.
Such systems are well described by the standard single-reference methods of quantum chemistry, and the main remaining issue for these systems is to push the limit in terms of the size of the chemical systems that can be treated.
The case of the so-called strongly correlated systems, which are ubiquitous in chemistry, is much more problematic as they exhibit
a much more exotic electronic structure.
Transition-metal-containing systems, low-spin open-shell systems, covalent bond breaking and excited states
all have in common that they cannot be even qualitatively described by a single electronic configuration.
It is now clear that the usual approximations in KS-DFT fail to give an accurate description of these situations and WFT has become
the standard for the treatment of strongly correlated systems.
From the theoretical point of view, the complexity of the strong correlation problem is, at least, two-fold:
i) the qualitative description of the wave function is determined by a primary set of electronic configurations (whose size can scale exponentially in many cases) among which near degeneracies and/or strong interactions appear in the Hamiltonian matrix,
Fulfilling these two objectives is a rather complicated task for a given approximate approach, especially if one adds the requirement of satisfying other formal properties, such as size-extensivity and additivity of the computed energy in the case of non-interacting fragments, or $S_z$ invariance.
%To tackle this complicated problem, many methods have been proposed and an exhaustive review of the zoology of methods for strong correlation goes beyond the scope and purpose of this article.
operators, even if promising alternative approaches have been proposed using stochastic techniques\cite{Thom-PRL-10,ScoTho-JCP-17,SpeNeuVigFraTho-JCP-18,DeuEmiShePie-PRL-17,DeuEmiMagShePie-JCP-18,DeuEmiYumShePie-JCP-19} or symmetry-broken approaches\cite{QiuHenZhaScu-JCP-17,QiuHenZhaScu-JCP-18,GomHenScu-JCP-19}.
In the MR approaches, the zeroth-order wave function consists of a linear combination of Slater determinants which are supposed to concentrate most of the strong interactions and near degeneracies inherent in the structure of the Hamiltonian for a strongly correlated system. The usual approach to build such a zeroth-order wave function is to perform a complete active space self-consistent field (CASSCF) calculation, whose variational property prevents any divergence, and which can provide extensive energies. Of course, the choice of the active space is a rather subtle art and the CASSCF results might strongly depend on the level of chemical/physical knowledge of the user.
On top of this zeroth-order wave function, weak correlation is introduced by the addition of other configurations through either configuration interaction\cite{WerKno-JCP-88,KnoWer-CPL-88} (MRCI) or perturbation theory (MRPT) and even coupled cluster (MRCC), each of which has its strengths and weaknesses.
The advantage of MRCI approaches lies essentially in their simple linear parametrisation of the wave function together with the variational property of their energies, whose inherent drawback is the lack of size extensivity unless the FCI limit is reached. On the other hand, MRPT and MRCC can provide extensive energies but at the price of rather complicated formalisms, and these approaches might be subject to divergences and/or convergence problems due to the nonlinearity of the parametrisation for MRCC or a poor choice of the zeroth-order Hamiltonian.
A natural alternative is to combine MRCI and MRPT, which falls in the category of selected CI (SCI), an approach which dates back to the late 1960s and has received renewed interest and applications during the last decade \cite{BenErn-PhysRev-1969,WhiHac-JCP-1969,HurMalRan-1973,EvaDauMal-ChemPhys-83,Cim-JCP-1985,Cim-JCC-1987,IllRubRic-JCP-88,PovRubIll-TCA-92,BunCarRam-JCP-06,AbrSheDav-CPL-05,MusEngels-JCC-06,BytRue-CP-09,GinSceCaf-CJC-13,CafGinScemRam-JCTC-14,GinSceCaf-JCP-15,CafAplGinScem-arxiv-16,CafAplGinSce-JCP-16,SchEva-JCP-16,LiuHofJCTC-16,HolUmrSha-JCP-17,ShaHolJeaAlaUmr-JCTC-17,SchEva-JCTC-17,PerCle-JCP-17,OhtJun-JCP-17,Zim-JCP-17,LiOttHolShaUmr-JCP-2018,ChiHolOttUmrShaZim-JPCA-18,SceBenJacCafLoo-JCP-18,LooSceBloGarCafJac-JCTC-18,GarSceGinCaffLoo-JCP-18,SceGarCafLoo-JCTC-18,GarGinMalSce-JCP-16,LooBogSceCafJac-JCTC-19}.
Among the SCI algorithms, the configuration interaction using a perturbative selection made iteratively (CIPSI) algorithm can be considered as a pioneer. The main idea of CIPSI and other related SCI algorithms is to iteratively select the most important Slater determinants thanks to perturbation theory in order to build an MRCI zeroth-order wave function which automatically concentrates the strongly interacting part of the wave function. On top of this MRCI zeroth-order wave function, a rather simple MRPT approach is used to recover the missing weak correlation and the process is iterated until a given convergence criterion is reached. It is important to notice that in the SCI algorithms, neither the SCI nor the MRPT is size extensive \textit{per se}, but the extensivity property is almost recovered by approaching the FCI limit.
When SCI calculations are affordable, their clear advantage is that they provide near-FCI wave functions and energies, whatever the level of knowledge of the user on the specific physical/chemical problem considered. The drawback of SCI is certainly its \textit{intrinsic} exponential scaling due to its linear parametrisation. Nevertheless, the prefactor of such an exponential scaling is lowered by the smart selection of the zeroth-order wave function together with the MRPT calculation.
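As an illustration of the perturbative selection step, and assuming the Epstein-Nesbet partitioning commonly used in CIPSI implementations (the symbols $\Psi^{(0)}$, $E^{(0)}$, $\alpha$ and $e_\alpha^{(2)}$ are introduced here only for this sketch and are not taken from the text), the second-order energy contribution of a determinant $\alpha$ outside the current variational space can be written as
\begin{equation}
e_\alpha^{(2)} = \frac{|\langle \alpha | \hat{H} | \Psi^{(0)} \rangle|^2}{E^{(0)} - \langle \alpha | \hat{H} | \alpha \rangle},
\end{equation}
where $\Psi^{(0)}$ is the current zeroth-order wave function and $E^{(0)}$ its variational energy. In such a scheme, the determinants with the largest $|e_\alpha^{(2)}|$ are added to the variational space, while the sum $E^{(0)} + \sum_\alpha e_\alpha^{(2)}$ provides the perturbatively corrected energy estimate at each iteration.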
Besides the difficulties of accurately describing the electronic structure within a given basis set, a crucial limitation of the applicability of WFT concerns the slow convergence of the energies and properties with respect to the quality of the basis set. As initially shown by the seminal work of Hylleraas\cite{Hyl-ZP-29} and further developed by Kutzelnigg \textit{et al.}\cite{Kut-TCA-85,KutKlo-JCP-91, NogKut-JCP-94}, the main convergence problem originates from the divergence of the Coulomb interaction at the electron coalescence point, which induces a discontinuity in the first derivative of the wave function (the so-called electron-electron cusp). Describing such a discontinuity with an incomplete basis set is impossible and, as a consequence, the convergence of the computed energies and properties can be strongly affected. To attenuate this problem, extrapolation techniques have been developed, either based on Hylleraas's expansion of the Coulomb operator\cite{HalHelJorKloKocOlsWil-CPL-98}, or more recently based on perturbative arguments\cite{IrmHulGru-arxiv-19}. A more rigorous approach to tackle the basis set convergence problem has been proposed by the so-called R12 and F12 methods\cite{Ten-TCA-12,TenNog-WIREs-12,HatKloKohTew-CR-12, KonBisVal-CR-12, GruHirOhnTen-JCP-17, MaWer-WIREs-18} which introduce a function explicitly depending on the interelectronic coordinates ensuring the correct cusp condition in the wave function, and the resulting correlation energies converge much faster than in usual WFT.
For instance, using the explicitly correlated version of coupled cluster with single, double and perturbative triple substitutions (CCSD(T)) in a triple-$\zeta$ quality basis set yields results of quintuple-$\zeta$ quality for the usual CCSD(T) method\cite{TewKloNeiHat-PCCP-07}, although an inherent computational overhead is introduced by the auxiliary basis sets needed to resolve the rather complex three- and four-electron integrals involved in the F12 theory.
An alternative point of view is to leave the short-range correlation effects to DFT and to use WFT to deal only with the long-range and/or strong-correlation effects. A rigorous approach to do so is the range-separated DFT (RSDFT) formalism (see Ref.~\onlinecite{TouColSav-PRA-04} and references therein) which relies on a splitting of the Coulomb interaction in terms of the interelectronic distance thanks to a range-separation parameter $\mu$. The advantage of such an approach is at least twofold: i) the DFT part deals only with the short-range part of the Coulomb interaction, and therefore the usual semi-local approximations to the unknown exchange-correlation functional are more suited to that correlation regime, ii) as the WFT part deals with a smooth non-divergent interaction, the exact wave function has no cusp and therefore the basis set convergence is much faster\cite{FraMusLupTou-JCP-15}.
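For concreteness, a common choice for this splitting, which we assume here purely for illustration, uses the error function:
\begin{equation}
\frac{1}{r_{12}} = \frac{\text{erf}(\mu r_{12})}{r_{12}} + \frac{\text{erfc}(\mu r_{12})}{r_{12}},
\end{equation}
where the first term is the smooth long-range part treated by WFT and the second term is the short-range part left to DFT. With this convention, the limit $\mu \rightarrow 0$ recovers standard KS-DFT, whereas $\mu \rightarrow \infty$ recovers pure WFT.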
Therefore, a number of approximate RSDFT schemes have been developed within single-reference \cite{AngGerSavTou-PRA-05, GolWerSto-PCCP-05, TouGerJanSavAng-PRL-09,JanHenScu-JCP-09, TouZhuSavJanAng-JCP-11, MusReiAngTou-JCP-15} or multi-reference \cite{LeiStoWerSav-CPL-97, FroTouJen-JCP-07, FroCimJen-PRA-10, HedKneKieJenRei-JCP-15, HedTouJen-JCP-18, FerGinTou-JCP-18} WFT approaches. Nevertheless, there are still some open issues in RSDFT, such as the dependence of the quality of the results on the value of the range-separation parameter $\mu$, which can be seen as an empirical parameter, and the remaining self-interaction errors.
Following this path, a very recent solution to the basis set convergence problem has been proposed by some of the present authors\cite{GinPraFerAssSavTou-JCP-18}, who used RSDFT to take into account only the correlation effects outside a given basis set. The key idea in that work is to realize that, as a wave function developed in an incomplete basis set is cuspless, it could also come from a Hamiltonian with a non-divergent electron-electron interaction. Therefore, the authors proposed a mapping with RSDFT through the introduction of an effective non-divergent interaction representing the usual Coulomb interaction projected in an incomplete basis set. First applications to weakly correlated molecular systems have been successfully carried out recently\cite{LooPraSceTouGin-JCPL-19}, together with the first attempt to generalize this approach to excited states\cite{GinSceTouLoo-JCP-19}.
The paper is organized as follows: in section \ref{sec:theory} we recall the mathematical framework of the basis set correction and we propose a practical extension for strongly correlated systems. Within the present development, two important formal properties are imposed: the extensivity of the correlation energies together with the $S_z$ independence of the results.
Then in section \ref{sec:results} we discuss the potential energy surfaces (PES) of N$_2$, F$_2$ and H$_{10}$ up to full dissociation as a prototype of strongly correlated problems. Finally, we conclude in section \ref{sec:conclusion}.
As the theoretical framework of the basis set correction has been presented in detail in Ref. \onlinecite{GinPraFerAssSavTou-JCP-18}, we briefly recall the main equations and concepts needed for this study in sections \ref{sec:basic}, \ref{sec:wee} and \ref{sec:mur}.
More specifically, in section \ref{sec:basic} we recall the basic mathematical framework of the present theory by introducing the density functional complementary to a basis set $\Bas$. Then in section \ref{sec:wee} we introduce an effective non divergent interaction in a basis set $\Bas$, which leads us to the definition of an effective range separation parameter varying in space in section \ref{sec:mur}. Thanks to the range separation parameter, we make a mapping with a specific class of RSDFT functionals and propose practical approximations for the unknown density functional complementary to a basis set $\Bas$, for which new approximations for the strong correlation regime are given in section \ref{sec:functional}.
The exact ground state energy $E_0$ of an $N$-electron system can be obtained by an elegant mathematical framework connecting WFT and DFT, namely the Levy-Lieb constrained search formalism, which reads
where $(v_{ne}(\br{})|\denr)$ is the nuclei-electron interaction for a given density $\denr$ and $F[\denr]$ is the so-called Levy-Lieb universal density functional
Nevertheless, in practical calculations the minimization is performed over the set $\setdenbasis$ of densities representable in a basis set $\Bas$. We assume from here on that the densities used in the equations belong to $\setdenbasis$.
In the present context it is important to notice that in order to recover the \textit{exact} ground state energy, the wave functions $\Psi$ involved in the definition of eq. \eqref{eq:levy_func} must be developed in a complete basis set.
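For reference, the two constrained-search expressions referred to above take the following standard form. This is a sketch written in the notation of the text, where $\hat{T}$ and $\hat{W}_{\text{ee}}$ denote the kinetic and electron-electron Coulomb operators (symbols assumed here):
\begin{equation}
E_0 = \min_{\denr} \left\{ F[\denr] + (v_{ne}(\br{})|\denr) \right\}, \qquad
F[\denr] = \min_{\Psi \rightarrow \denr} \langle \Psi | \hat{T} + \hat{W}_{\text{ee}} | \Psi \rangle ,
\end{equation}
where the notation $\Psi \rightarrow \denr$ means that the minimization is restricted to the wave functions $\Psi$ yielding the density $\denr$.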
An important step, originally proposed by some of the present authors in Ref. \onlinecite{GinPraFerAssSavTou-JCP-18},
was to split the minimization in the definition of $F[\denr]$ using $\wf{}{\Bas}$, which are wave functions developed in $\basis$
Therefore thanks to eq. \eqref{eq:def_levy_bas} one can properly connect the DFT formalism with the basis set error in WFT calculations. In other terms, the existence of $\efuncden{\denr}$ means that the correlation effects not taken into account in $\basis$ can be formulated as a density functional.
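In explicit form, and with $\hat{T}$ and $\hat{W}_{\text{ee}}$ denoting the kinetic and electron-electron Coulomb operators (notation assumed here), the splitting of the universal functional can be sketched as
\begin{equation}
F[\denr] = \min_{\wf{}{\Bas} \rightarrow \denr} \langle \wf{}{\Bas} | \hat{T} + \hat{W}_{\text{ee}} | \wf{}{\Bas} \rangle + \efuncden{\denr},
\end{equation}
so that $\efuncden{\denr}$ is, by construction, the part of the universal functional that cannot be reached by wave functions restricted to $\basis$.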
Assuming that the density $\denFCI$ associated to the ground state FCI wave function $\psifci$ is a good approximation of the exact density, one obtains the following approximation for the exact ground state energy (see equations 12-15 of Ref. \onlinecite{GinPraFerAssSavTou-JCP-18})
where $\efci$ is the ground state FCI energy within $\Bas$. As it was originally shown in Ref. \onlinecite{GinPraFerAssSavTou-JCP-18} and further emphasized in Ref. \onlinecite{LooPraSceTouGin-JCPL-19,GinSceTouLoo-JCP-19}, the main role of $\efuncbasisFCI$ is to correct for the basis set incompleteness errors, a large part of which originates from the lack of cusp in any wave function developed in an incomplete basis set.
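Given the definitions above, this approximation can be sketched as follows (our reading of the text, the precise derivation being given in the cited reference):
\begin{equation}
E_0 \approx \efci + \efuncbasisFCI ,
\end{equation}
that is, the FCI energy in $\Bas$ plus the complementary functional evaluated at the FCI density. As the FCI energy becomes exact in the limit of a complete basis set, the correction must vanish in that limit.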
The whole purpose of this paper is to determine approximations for $\efuncbasisFCI$ which are suited for treating strong correlation regimes. The two requirements for such approximations are that i) they must provide size-extensive energies, and ii) they must be invariant with respect to the $S_z$ component of a given spin multiplicity.
As originally shown by Kato\cite{kato}, the cusp in the exact wave function originates from the divergence of the Coulomb interaction at the coalescence point. Therefore, a cuspless wave function $\wf{}{\Bas}$ could also be obtained from a Hamiltonian with a non-divergent electron-electron interaction. In other words, the incompleteness of a finite basis set can be understood as the removal of the divergence of the usual Coulomb interaction at the electron coalescence point.
As originally derived in Ref. \onlinecite{GinPraFerAssSavTou-JCP-18} (see section D and appendices), one can obtain an effective non-divergent interaction, referred to here as $\wbasis$, which reproduces the expectation value of the Coulomb operator over a given wave function $\wf{}{\Bas}$. As we are interested in the behaviour at the coalescence point, we focus on the opposite-spin part of the electron-electron interaction.
$\Gam{pq}{rs}=2\mel*{\wf{}{\Bas}}{\aic{r_\downarrow}\aic{s_\uparrow}\ai{q_\uparrow}\ai{p_\downarrow}}{\wf{}{\Bas}}$ its associated two-body tensor, $\SO{p}{}$ are the spatial orthonormal orbitals,
As shown in Ref. \onlinecite{GinPraFerAssSavTou-JCP-18}, the effective interaction $\wbasis$ is necessarily finite at coalescence for an incomplete basis set, and tends to the regular Coulomb interaction in the limit of a complete basis set for any choice of wave function $\psibasis$, that is
The condition of equation \eqref{eq:cbs_wbasis} is fundamental as it guarantees the correct behaviour of the whole theory in the limit of a complete basis set.
As the effective interaction within a basis set $\wbasis$ is non divergent, one can fit such a function with a long-range interaction defined in the framework of RSDFT which depends on the range-separation parameter $\mu$
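The link between the effective interaction and $\mu$ can be made explicit with a short worked step, assuming that the long-range interaction takes the usual error-function form of RSDFT (the symbol $w^{\text{lr},\mu}$ is introduced here for illustration):
\begin{equation}
w^{\text{lr},\mu}(r_{12}) = \frac{\text{erf}(\mu r_{12})}{r_{12}}, \qquad
\lim_{r_{12}\rightarrow 0} w^{\text{lr},\mu}(r_{12}) = \frac{2\mu}{\sqrt{\pi}} ,
\end{equation}
where the limit follows from the expansion $\text{erf}(x) = 2x/\sqrt{\pi} + O(x^3)$. Equating this value with the value of the effective interaction at coalescence and solving for $\mu$ gives the prefactor $\sqrt{\pi}/2$, and therefore a range-separation parameter which varies in space, consistently with the valence-only definition of equation \eqref{eq:def_mur_val}.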
As all WFT calculations in this work are performed within the frozen-core approximation, we define the valence-only versions of the various quantities needed for the complementary basis set functional.
We split the basis set as $\Bas=\Cor\bigcup\BasFC$ (where $\Cor$ and $\BasFC$ are the sets of core and active MOs, respectively)
and define the valence only range separation parameter
\begin{equation}
\label{eq:def_mur_val}
\murpsival = \frac{\sqrt{\pi}}{2}\wbasiscoalval,
\end{equation}
where $\wbasisval$ is the valence-only effective interaction defined as
As originally proposed and motivated in Ref. \onlinecite{GinPraFerAssSavTou-JCP-18}, we approximate the complementary basis set functional $\efuncden{\denr}$ by using the so-called multi-determinant correlation functional (ECMD) introduced by Toulouse and co-workers\cite{TouGorSav-TCA-05}.
Following the recent work of some of the present authors\cite{LooPraSceTouGin-JCPL-19}, we propose to use a PBE-like functional which uses the total density $\denr$, spin polarisation $\zeta(\br{})$, reduced density gradient $s(\br{})=\nabla\denr/\denr^{4/3}$ and the on-top pair density $\ntwo(\br{})$. In the present work, all the density-related quantities are computed with the same wave function $\psibasis$ used to define $\murpsi$.
and where $\varepsilon_{\text{c,PBE}}(\argepbe)$ is the usual PBE correlation energy density\cite{PerBurErn-PRL-96}. Before introducing the different flavours of approximate functionals that we will use here (see section \ref{sec:def_func}), we would like to give some motivation for such a choice of functional form.
The actual functional form of $\ecmd(\argecmd)$ was originally proposed by some of the present authors in the context of RSDFT~\cite{FerGinTou-JCP-18} in order to fulfill the two following limits
which, as it was previously shown\cite{TouColSav-PRA-04, GoriSav-PRA-06,PazMorGorBac-PRB-06} by various authors, is the exact expression for the ECMD in the limit of large $\mu$, provided that $\ntwo$ is the \textit{exact} on-top pair density of the system.
In the context of RSDFT, some of the present authors have illustrated in Ref.~\onlinecite{FerGinTou-JCP-18} that the on-top pair density involved in eq. \eqref{eq:def_ecmdpbe} plays a crucial role when reaching the strong correlation regime. The importance of the on-top pair density in the strong correlation regime has also been acknowledged by Pernal and co-workers\cite{GritMeePer-PRA-18} and Gagliardi and co-workers\cite{CarTruGag-JPCA-17}.
Also, $\ecmd(\argecmd)$ vanishes when $\ntwo$ vanishes
which is exact for systems with a vanishing on-top pair density, such as the totally dissociated H$_2$ which is the archetype of strongly correlated systems.
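These limits can be checked on a simple interpolation of the type used in the cited references. As a sketch, we assume the following functional form, in which the arguments are omitted for brevity and the exact prefactor is our reading of Ref.~\onlinecite{FerGinTou-JCP-18} rather than a statement of the present text (the symbols $\bar{\varepsilon}$ and $\beta$ are used here for illustration):
\begin{equation}
\bar{\varepsilon}(\mu) = \frac{\varepsilon_{\text{c,PBE}}}{1+\beta\mu^3}, \qquad
\beta = \frac{3}{2\sqrt{\pi}(1-\sqrt{2})}\,\frac{\varepsilon_{\text{c,PBE}}}{\ntwo/n}.
\end{equation}
With this form, $\bar{\varepsilon}(\mu=0) = \varepsilon_{\text{c,PBE}}$, while for large $\mu$ one finds
\begin{equation}
\bar{\varepsilon}(\mu) \rightarrow \frac{\varepsilon_{\text{c,PBE}}}{\beta\mu^3} = \frac{2\sqrt{\pi}(1-\sqrt{2})}{3\mu^3}\,\frac{\ntwo}{n},
\end{equation}
which no longer depends on $\varepsilon_{\text{c,PBE}}$. Moreover, as both $\varepsilon_{\text{c,PBE}}$ and $1-\sqrt{2}$ are negative, $\beta$ is positive and diverges when $\ntwo \rightarrow 0$, so that $\bar{\varepsilon}$ vanishes for any $\mu > 0$ when the on-top pair density vanishes.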
Within the definitions of \eqref{eq:def_mur} and \eqref{eq:def_ecmdpbebasis}, any approximate complementary basis set functional $\efuncdenpbe{\argecmd}$ satisfies two important properties.
Because of the properties \eqref{eq:cbs_mu} and \eqref{eq:lim_muinf}, $\efuncdenpbe{\argecmd}$ vanishes when reaching the complete basis set limit, whatever the wave function $\psibasis$ used to define the range separation parameter $\mu_{\Psi^{\basis}}$:
Also, $\efuncdenpbe{\argecmd}$ vanishes for systems with vanishing on-top pair density, which guarantees the correct limit in the case of stretched H$_2$ and for one-electron systems.
Such a property is guaranteed independently by i) the definition of the effective interaction $\wbasis$ (see equation \eqref{eq:wbasis}) together with the condition \eqref{eq:lim_muinf}, ii) the fact that the $\ecmd(\argecmd)$ vanishes when the on-top pair density vanishes (see equation \eqref{eq:lim_n2}).
An important requirement for any electronic structure method is the extensivity of the energy, \textit{i.e.} the additivity of the energies in the case of non-interacting fragments, which is mandatory to avoid any ambiguity in computing interaction energies.
When two subsystems $A$ and $B$ dissociate into closed-shell systems, as in the case of weak interactions for instance, a simple RHF wave function leads to extensive energies.
When the two subsystems dissociate into open-shell systems, such as in covalent bond breaking, it is well known that the RHF approach fails and an alternative is to use a CASSCF wave function which, provided that the active space has been properly chosen, leads to additive energies.
Another important requirement is the independence of the energy with respect to the $S_z$ component of a given spin state, which is also a property of any exact wave function.
Such a property is also important in the context of covalent bond breaking where the ground state of the super system $A+B$ is in general low spin while the ground states of the fragments $A$ and $B$ are high spin and can therefore have multiple $S_z$ components.
\subsubsection{Condition for the functional $\efuncdenpbe{\argebasis}$ to obtain $S_z$ invariance}
A sufficient condition to achieve $S_z$ invariance is to eliminate all dependence on $S_z$, which in the case of $\ecmd(\argecmd)$ enters through the spin polarisation $\zeta(\br{})$ involved in the correlation energy density $\varepsilon_{\text{c,PBE}}(\argepbe)$ (see equation \eqref{eq:def_ecmdpbe}).
As originally shown by Perdew and co-workers\cite{PerSavBur-PRA-95}, the dependence on the spin polarisation in the KS-DFT framework can be removed by rewriting the spin polarisation of a single Slater determinant in terms of only the on-top pair density and the total density. In other terms, the spin density dependence usually introduced in the correlation functionals of KS-DFT tries to mimic the effect of the on-top pair density.
Based on this reasoning, a similar approach has been used in the context of multi configurational DFT in order to remove the $S_z$ dependency.
In practice, these approaches introduce the effective spin polarisation
which uses the on-top pair density $\ntwo_{\psibasis}$ of a given wave function $\psibasis$.
The advantages of this approach are at least twofold: i) the effective spin polarisation $\tilde{\zeta}$ is $S_z$ invariant, ii) it introduces an indirect dependence on the on-top pair density of the wave function $\psibasis$ which usually improves the treatment of strong correlation.
Nevertheless, the use of $\tilde{\zeta}$ presents several disadvantages as it can become complex when $n^2-4\ntwo_{\psibasis}<0$ and also
the formula of equation \eqref{eq:def_effspin} is exact only when the density $n$ and on-top pair density $\ntwo_{\psibasis}$ are obtained from a single determinant\cite{PerSavBur-PRA-95}, but it is applied to multi configurational wave functions.
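The single-determinant origin of the effective spin polarisation can be made explicit. As a sketch, we assume the normalisation in which the on-top pair density of a single determinant is $\ntwo = n_\uparrow n_\downarrow$, with $n = n_\uparrow + n_\downarrow$ (an assumption chosen to be consistent with the quantity $n^2 - 4\ntwo$ quoted above, the symbols $n_\uparrow$ and $n_\downarrow$ being introduced here for illustration). One then has
\begin{equation}
n^2 - 4\ntwo = (n_\uparrow + n_\downarrow)^2 - 4 n_\uparrow n_\downarrow = (n_\uparrow - n_\downarrow)^2
\quad\Rightarrow\quad
|\zeta| = \frac{|n_\uparrow - n_\downarrow|}{n} = \frac{\sqrt{n^2 - 4\ntwo}}{n}.
\end{equation}
For a multi-configurational wave function this identity no longer holds, and nothing prevents the argument of the square root from becoming negative, which is the difficulty mentioned above.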
An alternative to eliminate the $S_z$ dependence would be to simply set $\zeta(\br{})=0$, but this would lower the accuracy of the usual correlation functionals, such as the PBE correlation functional $\varepsilon_{\text{c,PBE}}(\argepbe)$ used here. Nevertheless, as the spin polarisation usually tries to mimic the on-top pair density and the function $\ecmd(\argecmd)$ explicitly depends on the on-top pair density (see equations \eqref{eq:def_ecmdpbe} and \eqref{eq:def_beta}), we propose here to use the $\ecmd$ functional with \textit{a zero spin polarisation}. This ensures $S_z$ invariance and, as will be shown numerically, only weakly affects the accuracy of the functional.
In the case of the present basis set correction, as $\efuncdenpbe{\argebasis}$ is an integral over $\mathbb{R}^3$ of local quantities, in the case of non overlapping fragments $A\ldots B$ it can be written as the sum of two local contributions: one coming from the integration over the region of the sub-system $A$ and the other one from the region of the sub-system $B$.
Therefore, a sufficient condition for the extensivity is that these quantities coincide in the isolated systems and in the subsystem of the super system $A\ldots B$.
As $\efuncdenpbe{\argebasis}$ depends only on quantities which are properties of the wave function $\psibasis$, a sufficient condition for the extensivity of these quantities is that the wave function factorises in the limit of non-interacting fragments, that is $\Psi_{A\ldots B}^{\basis}=\Psi_A^{\basis}\Psi_B^{\basis}$.
In the case where the two subsystems $A$ and $B$ dissociate into closed-shell systems, a simple HF wave function ensures this property, but when one or several covalent bonds are broken, the use of a properly chosen CASSCF wave function is sufficient to recover this property, as will be numerically illustrated in section \ref{sec:separability}.
The condition for the active space involved in the CASSCF wave function is that it has to lead to extensive energies in the limit of dissociated fragments.
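The argument can be summarised in one line. As a sketch, we write the functional as the integral of a local energy density $\bar{e}(\br{})$ and partition space into two regions $\Omega_A$ and $\Omega_B$, each containing one fragment (the symbols $\bar{e}$, $\Omega_A$ and $\Omega_B$ are introduced here for illustration):
\begin{equation}
\int_{\mathbb{R}^3} \bar{e}_{A\ldots B}(\br{})\, \text{d}\br{}
= \int_{\Omega_A} \bar{e}_{A\ldots B}(\br{})\, \text{d}\br{} + \int_{\Omega_B} \bar{e}_{A\ldots B}(\br{})\, \text{d}\br{}
= \int_{\Omega_A} \bar{e}_{A}(\br{})\, \text{d}\br{} + \int_{\Omega_B} \bar{e}_{B}(\br{})\, \text{d}\br{}.
\end{equation}
The first equality only uses the locality of the functional. The second one holds in the limit of non-overlapping fragments provided that the local quantities entering $\bar{e}$ (density, on-top pair density and range-separation parameter) coincide, on each fragment, with those of the isolated fragment, which is the case when the wave function factorises.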
As the present work focusses on the strong correlation regime, we propose here to investigate only approximate functionals which are $S_z$ invariant and size extensive in the case of covalent bond breaking. Therefore, the wave functions $\psibasis$ used throughout this paper are of CASSCF type in order to ensure extensivity of all density-related quantities.
The different flavours of functionals differ only in i) the type of on-top pair density used, and ii) the type of spin polarisation used.
Regarding the spin polarisation that enters into $\varepsilon_{\text{c,PBE}}(\argepbe)$, two different types of $S_z$ invariant formulations are used: i) the \textit{effective} spin polarization $\tilde{\zeta}$ defined in equation \eqref{eq:def_effspin}, and ii) a \textit{zero} spin polarization.
Regarding the approximation to the \textit{exact} on-top pair density entering in equation \eqref{eq:def_beta}, we use two different approximations. The first one is based on the uniform electron gas (UEG) and reads
where the pair-distribution function $g_0(n)$ is taken from equation (46) of Ref. \onlinecite{GorSav-PRA-06}. As the spin polarization appears in equation \eqref{eq:def_n2ueg}, we use the effective spin polarization $\tilde{\zeta}$ of equation \eqref{eq:def_effspin} in order to ensure $S_z$ invariance. Notice that, as we use a CASSCF wave function and $\tilde{\zeta}$ as spin polarization, $\ntwo_{\text{UEG}}$ will depend indirectly on the on-top pair density of the CASSCF wave function, as $\tilde{\zeta}$ depends on the on-top pair density.
Another approach to approximate the exact on-top pair density consists in taking advantage of the on-top pair density of the wave function $\psibasis$. Following the work of some of the present authors\cite{FerGinTou-JCP-18,GinSceTouLoo-JCP-19}, we introduce the extrapolated on-top pair density $\ntwoextrap$ as
i) The PBE-UEG-$\tilde{\zeta}$ which uses the UEG-like on-top pair density defined in equation \eqref{eq:def_n2ueg}, the effective spin polarization of equation \eqref{eq:def_effspin} and which reads
iii) and the PBE-ot-$0{\zeta}$ where no spin polarization is used and which therefore uses only the total density and the on-top pair density of equation \eqref{eq:def_n2extrap} and which reads
The purpose of the present paper being the study of the basis set correction in the regime of strong correlation, we propose to study the potential energy surfaces (PES) up to dissociation of an equally spaced H$_{10}$ chain, together with the C$_2$, N$_2$, O$_2$ and F$_2$ molecules.
In a given basis set, to compute the approximation of the exact ground state energy using equation \eqref{eq:e0approx}, one needs an approximation to both the FCI energy $\efci$ and the complementary basis set energy functional $\efuncbasisFCI$.
In the case of C$_2$, N$_2$, O$_2$ and F$_2$, the approximations to the FCI energies are obtained using converged frozen-core (1s orbitals are kept frozen) CIPSI calculations and the extrapolation scheme for the perturbative correction of Umrigar \textit{et al.}
(see Refs \onlinecite{HolUmrSha-JCP-17, SceGarCafLoo-JCTC-18, LooSceBloGarCafJac-JCTC-18, SceBenJacCafLoo-JCP-18, LooBogSceCafJac-JCTC-19, QP2} for more details) using the Quantum Package software\cite{QP2}. The estimated exact PES are obtained from Ref. \onlinecite{LieCle-JCP-74a}.
For all geometries and basis sets, the errors with respect to the actual FCI energies are estimated to be below 0.5 mH.
In the case of H$_{10}$, the approximation to $\efci$ together with the estimated exact curves are obtained from the data of Ref. \onlinecite{h10_prx}, where the authors performed MRCI+Q calculations with a minimal valence active space as reference (see below for the description of the active space).
Regarding the complementary basis set energy functional, we use full-valence CASSCF wave functions computed with the GAMESS-US software\cite{gamess} to obtain the wave functions $\psibasis$. Therefore, all density-related quantities (such as the total densities, different flavours of spin polarizations and on-top pair densities) together with the $\murpsi$ of equation \eqref{eq:def_mur} are obtained at the full-valence CASSCF level.
These CASSCF wave functions correspond to the following active spaces: ten electrons in ten orbitals for H$_{10}$, eight electrons in eight orbitals for C$_2$, ten electrons in eight orbitals for N$_2$, twelve electrons in eight orbitals for O$_2$ and fourteen electrons in eight orbitals for F$_2$.
Also, as the frozen core approximation is used in all near FCI calculations, we use the corresponding valence-only complementary functionals. Therefore, all density related quantities exclude any contribution from the core $1s$ orbitals, and the range-separation parameter is taken as the one defined in equation \eqref{eq:def_mur_val}.
The study of equally spaced H$_{10}$ chains is a good prototype for the study of the strong correlation regime as it consists in the simultaneous breaking of 10 covalent $\sigma$ bonds which all interact with each other. Also, as it is a relatively small system, benchmark calculations at the near-CBS level can be performed (see Ref. \onlinecite{h10_prx} for a detailed study of that problem).
We report in figures \ref{fig:H10_vdz}, \ref{fig:H10_vtz}, \ref{fig:H10_vqz} the PES computed using the cc-pVXZ (X=D,T,Q) basis sets of H$_{10}$, for different levels of approximations.
The computation of the atomization energies $D_0$ at each level of theory used here is reported in table \ref{tab:d0}. A general trend that can be observed from these data is that, in a given basis set, the quality of the potential energy surfaces are globally improved by adding the basis-set correction, whatever the level of approximation used for the functional $\efuncbasisFCI$. Also, no divergence of bizarre behaviour are found when stretching the bonds, which show that the functionals are robust when reaching the strong correlation regime.
More quantitatively, the values of $D_0$ are within chemical accuracy (\textit{i.e.}, an error below 1.4 mH) from the cc-pVTZ basis set onward when using the PBE-ot-$\tilde{\zeta}$ and PBE-ot-$0{\zeta}$ functionals, whereas such an accuracy is not reached in the cc-pVQZ basis set using MRCI+Q.
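To make explicit how the correction enters these atomization energies (a sketch in simplified notation, assuming that $D_0$ is evaluated as the difference between the energy of the dissociated system and that of the system at its equilibrium geometry $R_{\mathrm{eq}}$, and writing $\bar{E}^{\basis}(R)$ for the value of the complementary functional at geometry $R$ in the basis set $\basis$, a shorthand introduced here for illustration), the basis-set corrected energy and atomization energy read
\begin{equation}
	E^{\basis}_{\mathrm{corr}}(R) = E^{\basis}_{\mathrm{FCI}}(R) + \bar{E}^{\basis}(R), \qquad
	D_0^{\basis,\mathrm{corr}} = D_0^{\basis} + \left[ \bar{E}^{\basis}(R \to \infty) - \bar{E}^{\basis}(R_{\mathrm{eq}}) \right],
\end{equation}
where $E^{\basis}_{\mathrm{FCI}}$ stands for the near-FCI energy, so that only the differential part of the correction between the dissociated and bound geometries affects $D_0$.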
Looking in more detail at the performance of the different types of approximate functionals, the results show that PBE-ot-$\tilde{\zeta}$ and PBE-ot-$0{\zeta}$ are very similar (the maximal difference on $D_0$ being 0.3 mH), and that they are slightly more accurate than PBE-UEG-$\tilde{\zeta}$.
These observations provide two important clues about the role of the different physical ingredients used in the functionals:
i) the explicit use of the on-top pair density coming from the CASSCF wave function (see equation \eqref{eq:def_n2extrap}) is preferable to the use of the on-top pair density based on the UEG (see equation \eqref{eq:def_n2ueg}),
ii) removing the dependence on any kind of spin polarization does not lead to a significant loss of accuracy, provided that one uses a qualitatively correct on-top pair density. Point ii) is important as it shows that the spin polarization used in density-functional approximations (DFA) essentially serves to mimic the effect of the on-top pair density.
The study of the C$_2$, N$_2$, O$_2$ and F$_2$ molecules is complementary to that of the H$_{10}$ system: the level of strong correlation increases while stretching the bond, similarly to the case of H$_{10}$, but these systems also exhibit more important and more varied types of weak correlation due to their larger number of electrons. Indeed, the short-range correlation effects are known to have a strong differential effect on the computation of $D_0$, while the shape of the curve far from the equilibrium geometry is governed by dispersion forces, which are medium- to long-range weak correlation effects.
Also, O$_2$ has a triplet ground state and is therefore a good check of how the various types of functionals proposed here depend on the spin polarization.
We report in figures \ref{fig:C2_avdz}, \ref{fig:N2_avdz}, \ref{fig:O2_avdz} and \ref{fig:F2_avdz} (\ref{fig:C2_avtz}, \ref{fig:N2_avtz}, \ref{fig:O2_avtz} and \ref{fig:F2_avtz}) the potential energy curves of C$_2$, N$_2$, O$_2$ and F$_2$, respectively, computed using the aug-cc-pVDZ (aug-cc-pVTZ) basis set at different levels of computation. The atomization energies $D_0$ computed at each level of theory used here are reported in table \ref{tab:d0}.
Just as in the case of H$_{10}$, the quality of $D_0$ is globally improved by adding the basis-set correction, and it is remarkable that the PBE-ot-$\tilde{\zeta}$ and PBE-ot-$0{\zeta}$ functionals give very similar results.
The latter observation confirms that the dependence on the on-top pair density makes it possible to remove the dependence on any kind of spin polarization for a rather wide range of electron densities, and also for a purely high-spin system such as O$_2$.
More quantitatively, an error below 1.0 mH on the estimated exact valence-only $D_0$ is found for N$_2$, O$_2$ and F$_2$ in the aug-cc-pVTZ basis set with the PBE-ot-$0{\zeta}$ functional, whereas such a result is far out of reach in the same basis set at the near-FCI level.
In the case of C$_2$ in the aug-cc-pVTZ basis set, an error of about 5.5 mH is found with respect to the estimated exact $D_0$. Such an error is remarkably large compared with the other diatomic molecules studied here and might be associated with the level of strong correlation in the C$_2$ molecule.
Turning now to the performance of the basis-set correction along the whole PES, it is interesting to note that it fails to provide a noticeable improvement of the PES far from the equilibrium geometry.
Since the weak correlation effects in these regions are dominated by dispersion forces, which are long-range effects, the failure of the present approximations for the complementary basis-set functionals is easily understood. Indeed, the whole scheme designed here is based on the physics near the electron-electron cusp: the $\murpsi$ is designed by looking at the electron coalescence point, and the ECMD functionals are suited to short-range correlation effects. Therefore, the failure of the present basis-set correction to describe dispersion forces can be regarded as the expected, and in that sense sound, behaviour.
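The short-range character of the scheme can be made concrete with the long-range interaction of RS-DFT (a sketch, assuming the usual error-function form $\mathrm{erf}(\mu r_{12})/r_{12}$ for that interaction, and writing $W(\mathbf{r},\mathbf{r})$ for the value at coalescence of the effective interaction in the basis set $\basis$; this shorthand is introduced here for illustration only). Since
\begin{equation}
	\lim_{r_{12} \to 0} \frac{\mathrm{erf}(\mu r_{12})}{r_{12}} = \frac{2\mu}{\sqrt{\pi}},
\end{equation}
matching the two interactions at the coalescence point fixes a local range-separation parameter
\begin{equation}
	\mu(\mathbf{r}) = \frac{\sqrt{\pi}}{2}\, W(\mathbf{r},\mathbf{r}),
\end{equation}
which is determined solely by the interaction at $r_{12} = 0$ and therefore carries no direct information on the long-range part of the interaction that governs dispersion.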
In the present paper we have extended the recently proposed DFT-based basis set correction to strongly correlated systems.
We studied the H$_{10}$, C$_2$, N$_2$, O$_2$ and F$_2$ linear molecules up to their full dissociation limits at the near-FCI level in basis sets of increasing size, and investigated how the basis-set correction affects the convergence of the PES of these molecular systems toward the CBS limit.
The DFT-based basis-set correction relies on three aspects: i) the definition of an effective non-divergent electron-electron interaction obtained from the expectation value over a wave function $\psibasis$ of the regular Coulomb interaction projected into an incomplete basis set $\basis$, ii) the fitting of such an effective interaction with a long-range interaction used in RS-DFT, iii) the use of a complementary correlation functional of RS-DFT.
In the present paper, we investigated points i) and iii) in order to properly study atomization energies.
In this context, we propose a new scheme to design functionals fulfilling a) $S_z$ invariance and b) size extensivity. To fulfil these requirements, we propose to use CASSCF wave functions, which lead to size-extensive energies, and to develop functionals using only $S_z$-invariant density-related quantities.
The development of new $S_z$-invariant and size-extensive functionals has led us to investigate the role of two related quantities: the spin polarization and the on-top pair density.
To achieve $S_z$ invariance in the context of DFT based on multi-configurational wave functions, an effective spin polarization depending on the total density and the on-top pair density is commonly used. Nevertheless, such an effective spin polarization can be considered as \textit{ad hoc}, as its expression is formally valid only for a single-determinant wave function and it can become complex-valued for multi-configurational wave functions. Based on the previous work of some of the present authors, we use functionals depending \textit{explicitly} on the on-top pair density.
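The limitation of the effective spin polarization mentioned above can be sketched as follows (assuming that the on-top pair density $n_2$ is normalized such that a single determinant with spin densities $n_\alpha$ and $n_\beta$ gives $n_2 = 2 n_\alpha n_\beta$; with another normalization the numerical factor changes accordingly). Writing $n = n_\alpha + n_\beta$ and $m = n_\alpha - n_\beta$, one has $n^2 - m^2 = 4 n_\alpha n_\beta = 2 n_2$, so that
\begin{equation}
	\tilde{\zeta}(n, n_2) = \frac{|m|}{n} = \sqrt{1 - \frac{2 n_2}{n^2}} .
\end{equation}
For a multi-configurational wave function, nothing guarantees that $2 n_2 \le n^2$, and the square root becomes imaginary whenever $n_2 > n^2/2$.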