diff --git a/Manuscript/rsdft-cipsi-qmc.tex b/Manuscript/rsdft-cipsi-qmc.tex index 78ff3ea..d7c2796 100644 --- a/Manuscript/rsdft-cipsi-qmc.tex +++ b/Manuscript/rsdft-cipsi-qmc.tex @@ -249,7 +249,15 @@ still an active field of research. The present paper falls within this context. The central idea of the present work, and the launch-pad for the remainder of this study, is that one can combine the various strengths of WFT, DFT, and QMC in order to create a new hybrid method with more attractive features and higher accuracy. -In particular, we show here that one can combine CIPSI and KS-DFT via the range separation (RS) of the interelectronic Coulomb operator \cite{Sav-INC-96a,Toulouse_2004} to obtain accurate FN-DMC energies with compact multi-determinant trial wave functions. +In particular, we show here that one can combine CIPSI and KS-DFT via the range separation (RS) of the interelectronic Coulomb operator \cite{Sav-INC-96a,Toulouse_2004} --- a scheme that we label RS-DFT-CIPSI in the following --- to obtain accurate FN-DMC energies with compact multi-determinant trial wave functions. + + +The present manuscript is organized as follows. +In Sec.~\ref{sec:rsdft-cipsi}, we provide theoretical details about the CIPSI algorithm (Sec.~\ref{sec:CIPSI}) and range-separated DFT (Sec.~\ref{sec:rsdft}). +Computational details are reported in Sec.~\ref{sec:comp-details}. +In Sec.~\ref{sec:mu-dmc}, we discuss the influence of the range-separation parameter on the fixed-node error as well as the link between RS-DFT and Jastrow factors. +Section \ref{sec:atomization} examines the performance of the present scheme for the atomization energies of the Gaussian-1 set of molecules. +Finally, we draw our conclusion in Sec.~\ref{sec:conclusion}. Unless otherwise stated, atomic units are used. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% @@ -259,6 +267,7 @@ Unless otherwise stated, atomic units are used. 
%==================== \subsection{The CIPSI algorithm} +\label{sec:CIPSI} %==================== Beyond the single-determinant representation, the best multi-determinant wave function one can wish for --- in a given basis set --- is the FCI wave function. @@ -495,7 +504,7 @@ The first question we would like to address is the quality of the nodes of the wave function $\Psi^{\mu}$ obtained for intermediate values of the range separation parameter (\ie, $0 < \mu < +\infty$). For this purpose, we consider a weakly correlated molecular system, namely the water -molecule \titou{at its experimental geometry. \cite{Caffarel_2016}} +molecule at its experimental geometry. \cite{Caffarel_2016} We then generate trial wave functions $\Psi^\mu$ for multiple values of $\mu$, and compute the associated FN-DMC energy keeping fixed all the parameters impacting the nodal surface, such as the CI coefficients and the molecular orbitals. @@ -538,7 +547,7 @@ The take-home message of this first numerical study is that RS-DFT trial wave fu This is a key result of the present study. %====================================================== -\subsection{Link between RS-DFT and Jastrow factors } +\subsection{Link between RS-DFT and Jastrow factor} \label{sec:rsdft-j} %====================================================== The data presented in Sec.~\ref{sec:fndmc_mu} evidence that, in a finite basis, RS-DFT can provide @@ -583,7 +592,7 @@ To do so, we have made the following numerical experiment. First, we extract the 200 determinants with the largest weights in the FCI wave function out of a large CIPSI calculation obtained with the VDZ-BFD basis. Within this set of determinants, we solve the self-consistent equations of RS-DFT [see Eq.~\eqref{rs-dft-eigen-equation}] -for different values of $\mu$ \titou{using the srPBE functional}. This gives the CI expansions $\Psi^\mu$. +for different values of $\mu$ using the srPBE functional. This gives the CI expansions $\Psi^\mu$. 
Then, within the same set of determinants we optimize the CI coefficients $c_I$ [see Eq.~\eqref{eq:Slater}] in the presence of a simple one- and two-body Jastrow factor $e^J$ with $J = J_\text{eN} + J_\text{ee}$ and \begin{subequations} @@ -601,7 +610,7 @@ where the sum over $i < j$ loops over all unique electron pairs. In Eqs.~\eqref{eq:jast-eN} and \eqref{eq:jast-ee}, $r_{iA}$ is the distance between the $i$th electron and the $A$th nucleus while $r_{ij}$ is the interlectronic distance between electrons $i$ and $j$. The parameters $a=1/2$ and $b=0.89$ were fixed, and the parameters $\gamma_{\text{O}}=1.15$ and $\gamma_{\text{H}}=0.35$ -were obtained by energy minimization of a single \titou{HF?} determinant. +were obtained by energy minimization of a single determinant. The optimal CI expansion $\Psi^J$ is obtained by sampling the matrix elements of the Hamiltonian ($\mathbf{H}$) and the overlap ($\mathbf{S}$), in the basis of Jastrow-correlated determinants $e^J D_i$: @@ -740,16 +749,16 @@ As a conclusion of the first part of this study, we can highlight the following Atomization energies are challenging for post-HF methods because their calculation requires a perfect balance in the -description of atoms and molecules. Basis sets used in molecular -calculations are atom-centered, so they are always better adapted to +description of atoms and molecules. The mainstream one-electron basis sets employed in molecular +calculations are atom-centered, so they are, by construction, better adapted to atoms than molecules and atomization energies usually tend to be underestimated by variational methods. 
In the context of FN-DMC calculations, the nodal surface is imposed by -the trial wavefunction which is expanded on an atom-centered basis -set, so we expect the fixed-node error to be also tightly related to +the trial wavefunction which is expanded in an atom-centered basis +set, so we expect the fixed-node error to be also intimately related to the basis set incompleteness error. Increasing the size of the basis set improves the description of -the density and of electron correlation, but also reduces the +the density and of the electron correlation, but also reduces the imbalance in the quality of the description of the atoms and the molecule, leading to more accurate atomization energies. @@ -761,9 +770,9 @@ An extremely important feature required to get accurate atomization energies is size-consistency (or strict separability), since the numbers of correlated electron pairs in the isolated atoms are different from those of the molecules. -The energy computed within density functional theory is size-consistent, and -as it is a mean-field method the convergence to the complete basis set -(CBS) limit is relatively fast. Hence, DFT methods are very well adapted to +The energy computed within DFT is size-consistent, and +as it is a mean-field method the convergence to the CBS limit +is relatively fast. Hence, DFT methods are very well adapted to the calculation of atomization energies, especially with small basis sets. But going to the CBS limit will converge to biased atomization energies because of the use of approximate density functionals. @@ -773,10 +782,10 @@ the FCI energies to the CBS limit is much slower because of the description of short-range electron correlation using atom-centered functions. But ultimately the exact energy will be reached. 
-In the context of selected CI calculations, when the variational energy is -extrapolated to the FCI energy\cite{Holmes_2017} there is no +In the context of SCI calculations, when the variational energy is +extrapolated to the FCI energy \cite{Holmes_2017} there is no size-consistency error. But when the truncated SCI wave function is used -as a reference for post-Hartree-Fock methods such as SCI+PT2 +as a reference for post-HF methods such as SCI+PT2 or for QMC calculations, there is a residual size-consistency error originating from the truncation of the wave function. @@ -800,12 +809,12 @@ $a$ is determined by cusp conditions, and $b$ is obtained by energy or variance minimization.\cite{Coldwell_1977,Umrigar_2005} One can easily see that this parameterization of the two-body interaction is not size-consistent: the dissociation of a -diatomic molecule $AB$ with a parameter $b_{AB}$ +diatomic molecule \ce{AB} with a parameter $b_{\ce{AB}}$ will lead to two different two-body Jastrow factors, each -with its own optimal value $b_A$ and $b_B$. To remove the -size-consistency error on a PES using this ansätz for $J_\text{ee}$, +with its own optimal value $b_{\ce{A}}$ and $b_{\ce{B}}$. To remove the +size-consistency error on a PES using this ansatz for $J_\text{ee}$, one needs to impose that the parameters of $J_\text{ee}$ are fixed: -$b_A = b_B = b_{AB}$. +$b_{\ce{A}} = b_{\ce{B}} = b_{\ce{AB}}$. When pseudopotentials are used in a QMC calculation, it is common practice to localize the non-local part of the pseudopotential on the @@ -906,7 +915,7 @@ impacted by this spurious effect, as opposed to FCI. In this section, we investigate the impact of the spin contamination due to the short-range density functional on the FN-DMC energy. We have computed the energies of the carbon atom in its triplet state -with BFD pseudopotentials and the corresponding double-zeta basis +with BFD pseudopotentials and the corresponding double-$\zeta$ basis set. 
The calculation was done with $m_s=1$ (3 spin-up electrons and 1 spin-down electrons) and with $m_s=0$ (2 spin-up and 2 spin-down electrons). @@ -971,18 +980,18 @@ The 55 molecules of the benchmark for the Gaussian-1 theory\cite{Pople_1989,Curtiss_1990} were chosen to test the performance of the RS-DFT-CIPSI trial wave functions in the context of energy differences. Calculations were made in the double-, triple- -and quadruple-zeta basis sets with different values of $\mu$, and using -natural orbitals of a preliminary CIPSI calculation. +and quadruple-$\zeta$ basis sets with different values of $\mu$, and using +NOs from a preliminary CIPSI calculation \titou{as a starting point}. For comparison, we have computed the energies of all the atoms and -molecules at the DFT level with different density functionals, and at +molecules at the KS-DFT level with various semi-local and hybrid density functionals [PBE, BLYP, PBE0, and B3LYP], and at the CCSD(T) level. Table~\ref{tab:mad} gives the corresponding mean absolute errors (MAE), mean signed errors (MSE) and standard deviations (RMSD). For FCI (RS-DFT-CIPSI, $\mu=\infty$) we have -given extrapolated values at $\EPT\rightarrow 0$, and the error bars -correspond to the difference between the energies computed with a -two-point and with a three-point linear extrapolation. +provided the extrapolated values at $\EPT \to 0$, and the error bars +correspond to the difference between the energies \titou{computed with a +two-point and with a three-point linear extrapolation}. \cite{Loos_2018a,Loos_2019,Loos_2020b,Loos_2020c} -In this benchmark, the great majority of the systems are well +In this benchmark, the great majority of the systems are weakly correlated and are then well described by a single determinant. Therefore, the atomization energies calculated at the DFT level are relatively accurate, even when the basis set is small. The introduction of exact exchange (B3LYP and @@ -994,27 +1003,28 @@ and FCI energies. 
The imbalance of the quality of description of molecules compared to atoms is exhibited by a very negative value of the MSE for CCSD(T) and FCI/VDZ-BFD, which is reduced by a factor of two -when going to the triple-zeta basis, and again by a factor of two when -going to the quadruple-zeta basis. +when going to the triple-$\zeta$ basis, and again by a factor of two when +going to the quadruple-$\zeta$ basis. -This large imbalance at the double-zeta level affects the nodal +This large imbalance at the VDZ-BFD level affects the nodal surfaces, because although the FN-DMC energies obtained with near-FCI trial wave functions are much lower than the single-determinant FN-DMC -energies, the MAE obtained with FCI (7.38~$\pm$ 1.08~kcal/mol) is -larger than the single-determinant MAE (4.61~$\pm$ 0.34 kcal/mol). +energies, the MAE obtained with FCI ($7.38\pm1.08$ kcal/mol) is +larger than the single-determinant MAE ($4.61\pm 0.34$ kcal/mol). Using the FCI trial wave function the MSE is equal to the negative MAE which confirms that all the atomization energies are underestimated. This confirms that some of the basis-set incompleteness error is transferred in the fixed-node error. -Within the double-zeta basis set, the calculations could be done for the +Within the double-$\zeta$ basis set, the calculations could be performed for the whole range of values of $\mu$, and the optimal value of $\mu$ for the trial wave function was estimated for each system by searching for the minimum of the spline interpolation curve of the FN-DMC energy as a function of $\mu$. -This corresponds the the line of the table labelled by the \emph{Opt} -value of $\mu$. Using the optimal value of $\mu$ clearly improves the -MAE, the MSE an the RMSD compared the the FCI wave function. 
This +This corresponds to the line of Table~\ref{tab:mad} labelled as ``Opt.'' +\titou{The optimal $\mu$ value for each system is reported in the \SI.} +Using the optimal value of $\mu$ clearly improves the +MAE, the MSE and the RMSD compared to the FCI wave function. This result is in line with the common knowledge that re-optimizing the determinantal component of the trial wave function in the presence of electron correlation reduces the errors due to the basis set incompleteness. @@ -1022,8 +1032,8 @@ These calculations were done only for the smallest basis set because of the expensive computational cost of the QMC calculations when the trial wave function is expanded on more than a few million determinants. -At the RS-DFT-CIPSI level, we can remark that with the triple-zeta -basis set the MAE are larger for $\mu=1$~bohr$^{-1}$ than for the +At the RS-DFT-CIPSI level, one can see that with the VTZ-BFD +basis the MAEs are larger for $\mu=1$~bohr$^{-1}$ than for the FCI. For the largest systems, as shown in Fig.~\ref{fig:g2-ndet} there are many systems which did not reach the threshold $\EPT<1$~m\hartree{}, and the number of determinants exceeded @@ -1031,7 +1041,7 @@ $\EPT<1$~m\hartree{}, and the number of determinants exceeded small size-consistency error originating from the imbalanced truncation of the wave functions, which is not present in the extrapolated FCI energies. The same comment applies to -$\mu=0.5$~bohr$^{-1}$ with the quadruple-zeta basis set. +$\mu=0.5$~bohr$^{-1}$ with the quadruple-$\zeta$ basis set. %%% FIG 5 %%% @@ -1048,13 +1058,13 @@ $\mu=0.5$~bohr$^{-1}$ with the quadruple-zeta basis set. \end{figure*} %%% %%% %%% %%% -Searching for the optimal value of $\mu$ may be too costly, so we have -computed the MAD, MSE and RMSD for fixed values of $\mu$. The results -are illustrated in Fig.~\ref{fig:g2-dmc}. 
As seen on the figure and -in Table~\ref{tab:mad}, the best choice for a fixed value of $\mu$ is +Searching for the optimal value of $\mu$ may be too costly and time-consuming, so we have +computed the MAE, MSE and RMSD for fixed values of $\mu$. +As illustrated in Fig.~\ref{fig:g2-dmc} and Table~\ref{tab:mad}, +the best choice for a fixed value of $\mu$ is 0.5~bohr$^{-1}$ for all three basis sets. It is the value for which -the MAE (3.74(35), 2.46(18) and 2.06(35) kcal/mol) and RMSD (4.03(23), -3.02(06) and 2.74(13)~kcal/mol) are minimal. Note that these values +the MAE [$3.74(35)$, $2.46(18)$, and $2.06(35)$ kcal/mol] and RMSD [$4.03(23)$, +$3.02(06)$, and $2.74(13)$ kcal/mol] are minimal. Note that these values are even lower than those obtained with the optimal value of $\mu$. Although the FN-DMC energies are higher, the numbers show that they are more consistent from one system to another, giving improved @@ -1077,32 +1087,33 @@ The number of determinants in the trial wave functions are shown in Fig.~\ref{fig:g2-ndet}. As expected, the number of determinants is smaller when $\mu$ is small and larger when $\mu$ is large. It is important to remark that the median of the number of -determinants when $\mu=0.5$~bohr$^{-1}$ is below 100~000 determinants -with the quadruple-zeta basis set, making these calculations feasilble -with such a large basis set. At the double-zeta level, compared to the +determinants when $\mu=0.5$~bohr$^{-1}$ is below $100\,000$ determinants +with the VQZ-BFD basis, making these calculations feasible +with such a large basis set. At the double-$\zeta$ level, compared to the FCI trial wave functions the median of the number of determinants is reduced by more than two orders of magnitude. 
Moreover, going to $\mu=0.25$~bohr$^{-1}$ gives a median close to 100 -determinants at the double-zeta level, and close to 1~000 determinants -at the quadruple-zeta level for only a slight increase of the +determinants at the VDZ-BFD level, and close to $1\,000$ determinants +at the quadruple-$\zeta$ level for only a slight increase of the MAE. Hence, RS-DFT-CIPSI trial wave functions with small values of $\mu$ could be very useful for large systems to go beyond the single-determinant approximation at a very low computational cost -while keeping the size-consistency. +while ensuring size-consistency. Note that when $\mu=0$ the number of determinants is not equal to one because -we have used the natural orbitals of a first CIPSI calculation, and +we have used the natural orbitals of a preliminary CIPSI calculation, and not the srPBE orbitals. So the Kohn-Sham determinant is expressed as a linear combination of -determinants built with natural orbitals. It is possible to add -an extra step to the algorithm to compute the natural orbitals from the -RS-DFT/CIPSI wave function, and re-do the RS-DFT/CIPSI calculation with +determinants built with NOs. It is possible to add +an extra step to the algorithm to compute the NOs from the +RS-DFT-CIPSI wave function, and re-do the RS-DFT-CIPSI calculation with these orbitals to get an even more compact expansion. In that case, we would -have converged to the Kohn-Sham orbitals with $\mu=0$, and the +have converged to the KS orbitals with $\mu=0$, and the solution would have been the PBE single determinant. %%%%%%%%%%%%%%%%%%%% \section{Conclusion} +\label{sec:conclusion} %%%%%%%%%%%%%%%%%%%% In the present work, we have shown that introducing short-range correlation via