Merge branch 'master' of git.irsamc.ups-tlse.fr:scemama/RSDFT-CIPSI-QMC

2020-08-18 12:11:16 +02:00 · 9b31fd7e16
commit 9b31fd7e16
parent 1d8e3c382e 6eed47a25f
1 changed files with 68 additions and 57 deletions
--- a/Manuscript/rsdft-cipsi-qmc.tex
+++ b/Manuscript/rsdft-cipsi-qmc.tex
@@ -249,7 +249,15 @@ still an active field of research. The present paper falls
 within this context.

 The central idea of the present work, and the launch-pad for the remainder of this study, is that one can combine the various strengths of WFT, DFT, and QMC in order to create a new hybrid method with more attractive features and higher accuracy.
-In particular, we show here that one can combine CIPSI and KS-DFT via the range separation (RS) of the interelectronic Coulomb operator \cite{Sav-INC-96a,Toulouse_2004} to obtain accurate FN-DMC energies with compact multi-determinant trial wave functions.
+In particular, we show here that one can combine CIPSI and KS-DFT via the range separation (RS) of the interelectronic Coulomb operator \cite{Sav-INC-96a,Toulouse_2004} --- a scheme that we label RS-DFT-CIPSI in the following --- to obtain accurate FN-DMC energies with compact multi-determinant trial wave functions.
+
+
+The present manuscript is organized as follows.
+In Sec.~\ref{sec:rsdft-cipsi}, we provide theoretical details about the CIPSI algorithm (Sec.~\ref{sec:CIPSI}) and range-separated DFT (Sec.~\ref{sec:rsdft}).
+Computational details are reported in Sec.~\ref{sec:comp-details}.
+In Sec.~\ref{sec:mu-dmc}, we discuss the influence of the range-separation parameter on the fixed-node error as well as the link between RS-DFT and Jastrow factors.
+Section \ref{sec:atomization} examines the performance of the present scheme for the atomization energies of the Gaussian-1 set of molecules.
+Finally, we draw our conclusion in Sec.~\ref{sec:conclusion}.
 Unless otherwise stated, atomic units are used.

 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@@ -259,6 +267,7 @@ Unless otherwise stated, atomic units are used.

 %====================
 \subsection{The CIPSI algorithm}
+\label{sec:CIPSI}
 %====================
 Beyond the single-determinant representation, the best
 multi-determinant wave function one can wish for --- in a given basis set --- is the FCI wave function.
@@ -495,7 +504,7 @@ The first question we would like to address is the quality of the
 nodes of the wave function $\Psi^{\mu}$ obtained for intermediate values of the
 range separation parameter (\ie, $0 < \mu  < +\infty$).
 For this purpose, we consider a weakly correlated molecular system, namely the water
-molecule \titou{at its experimental geometry. \cite{Caffarel_2016}}
+molecule at its experimental geometry. \cite{Caffarel_2016}
 We then generate trial wave functions $\Psi^\mu$ for multiple values of
 $\mu$, and compute the associated FN-DMC energy keeping fixed all the
 parameters impacting the nodal surface, such as the CI coefficients and the molecular orbitals.
@@ -538,7 +547,7 @@ The take-home message of this first numerical study is that RS-DFT trial wave fu
 This is a key result of the present study.

 %======================================================
-\subsection{Link between RS-DFT and Jastrow factors }
+\subsection{Link between RS-DFT and Jastrow factor}
 \label{sec:rsdft-j}
 %======================================================
 The data presented in Sec.~\ref{sec:fndmc_mu} evidence that, in a finite basis, RS-DFT can provide 
@@ -583,7 +592,7 @@ To do so, we have made the following numerical experiment.
 First, we extract the 200 determinants with the largest weights in the FCI wave
 function out of a large CIPSI calculation obtained with the VDZ-BFD basis.  Within this set of determinants,
 we solve the self-consistent equations of RS-DFT [see Eq.~\eqref{rs-dft-eigen-equation}]
-for different values of $\mu$ \titou{using the srPBE functional}. This gives the CI expansions $\Psi^\mu$.
+for different values of $\mu$ using the srPBE functional. This gives the CI expansions $\Psi^\mu$.
 Then, within the same set of determinants we optimize the CI coefficients $c_I$ [see Eq.~\eqref{eq:Slater}] in the presence of
 a simple one- and two-body Jastrow factor $e^J$ with $J = J_\text{eN} + J_\text{ee}$ and
 \begin{subequations}
@@ -601,7 +610,7 @@ where the sum over $i < j$ loops over all unique electron pairs.
 In Eqs.~\eqref{eq:jast-eN} and \eqref{eq:jast-ee}, $r_{iA}$ is the distance between the $i$th electron and the $A$th nucleus while $r_{ij}$ is the interlectronic distance between electrons $i$ and $j$.
 The parameters $a=1/2$
 and $b=0.89$ were fixed, and the parameters $\gamma_{\text{O}}=1.15$ and $\gamma_{\text{H}}=0.35$
-were obtained by energy minimization of a single \titou{HF?} determinant.
+were obtained by energy minimization of a single determinant.
 The optimal CI expansion $\Psi^J$ is obtained by sampling the matrix elements
 of the Hamiltonian ($\mathbf{H}$) and the overlap ($\mathbf{S}$), in the
 basis of Jastrow-correlated determinants $e^J D_i$:
@@ -740,16 +749,16 @@ As a conclusion of the first part of this study, we can highlight the following

 Atomization energies are challenging for post-HF methods
 because their calculation requires a perfect balance in the
-description of atoms and molecules. Basis sets used in molecular
-calculations are atom-centered, so they are always better adapted to
+description of atoms and molecules. The mainstream one-electron basis sets employed in molecular
+calculations are atom-centered, so they are, by construction, better adapted to
 atoms than molecules and atomization energies usually tend to be
 underestimated by variational methods.
 In the context of FN-DMC calculations, the nodal surface is imposed by
-the trial wavefunction which is expanded on an atom-centered basis
-set, so we expect the fixed-node error to be also tightly related to
+the trial wavefunction which is expanded in an atom-centered basis
+set, so we expect the fixed-node error to be also intimately related to
 the basis set incompleteness error.
 Increasing the size of the basis set improves the description of
-the density and of electron correlation, but also reduces the
+the density and of the electron correlation, but also reduces the
 imbalance in the quality of the description of the atoms and the
 molecule, leading to more accurate atomization energies.

@@ -761,9 +770,9 @@ An extremely important feature required to get accurate
 atomization energies is size-consistency (or strict separability),
 since the numbers of correlated electron pairs in the isolated atoms
 are different from those of the molecules.
-The energy computed within density functional theory is size-consistent, and
-as it is a mean-field method the convergence to the complete basis set
-(CBS) limit is relatively fast. Hence, DFT methods are very well adapted to
+The energy computed within DFT is size-consistent, and
+as it is a mean-field method the convergence to the CBS limit 
+is relatively fast. Hence, DFT methods are very well adapted to
 the calculation of atomization energies, especially with small basis
 sets. But going to the CBS limit will converge to biased atomization
 energies because of the use of approximate density functionals.
@@ -773,10 +782,10 @@ the FCI energies to the CBS limit is much slower because of the
 description of short-range electron correlation using atom-centered
 functions. But ultimately the exact energy will be reached.

-In the context of selected CI calculations, when the variational energy is
-extrapolated to the FCI energy\cite{Holmes_2017} there is no
+In the context of SCI calculations, when the variational energy is
+extrapolated to the FCI energy \cite{Holmes_2017} there is no
 size-consistency error. But when the truncated SCI wave function is used
-as a reference for post-Hartree-Fock methods such as SCI+PT2
+as a reference for post-HF methods such as SCI+PT2
 or for QMC calculations, there is a residual size-consistency error
 originating from the truncation of the wave function.

@@ -800,12 +809,12 @@ $a$ is determined by cusp conditions, and $b$ is obtained by energy
 or variance minimization.\cite{Coldwell_1977,Umrigar_2005}
 One can easily see that this parameterization of the two-body
 interaction is not size-consistent: the dissociation of a
-diatomic molecule $AB$ with a parameter $b_{AB}$
+diatomic molecule \ce{AB} with a parameter $b_{\ce{AB}}$
 will lead to two different two-body Jastrow factors, each
-with its own optimal value $b_A$ and $b_B$. To remove the
-size-consistency error on a PES using this ansätz for $J_\text{ee}$,
+with its own optimal value $b_{\ce{A}}$ and $b_{\ce{B}}$. To remove the
+size-consistency error on a PES using this ansatz for $J_\text{ee}$,
 one needs to impose that the parameters of $J_\text{ee}$ are fixed:
-$b_A = b_B = b_{AB}$.
+$b_{\ce{A}} = b_{\ce{B}} = b_{\ce{AB}}$.

 When pseudopotentials are used in a QMC calculation, it is common
 practice to localize the non-local part of the pseudopotential on the
@@ -906,7 +915,7 @@ impacted by this spurious effect, as opposed to FCI.
 In this section, we investigate the impact of the spin contamination
 due to the short-range density functional on the FN-DMC energy. We have
 computed the energies of the carbon atom in its triplet state
-with BFD pseudopotentials and the corresponding double-zeta basis
+with BFD pseudopotentials and the corresponding double-$\zeta$ basis
 set. The calculation was done with $m_s=1$ (3 spin-up electrons
 and 1 spin-down electrons) and with $m_s=0$ (2 spin-up and 2
 spin-down electrons).
@@ -971,18 +980,18 @@ The 55 molecules of the benchmark for the Gaussian-1
 theory\cite{Pople_1989,Curtiss_1990} were chosen to test the
 performance of the RS-DFT-CIPSI trial wave functions in the context of
 energy differences.  Calculations were made in the double-, triple-
-and quadruple-zeta basis sets with different values of $\mu$, and using
-natural orbitals of a preliminary CIPSI calculation.
+and quadruple-$\zeta$ basis sets with different values of $\mu$, and using
+NOs from a preliminary CIPSI calculation \titou{as a starting point}.
 For comparison, we have computed the energies of all the atoms and
-molecules at the DFT level with different density functionals, and at
+molecules at the KS-DFT level with various semi-local and hybrid density functionals [PBE, BLYP, PBE0, and B3LYP], and at
 the CCSD(T) level. Table~\ref{tab:mad} gives the corresponding mean
 absolute errors (MAE), mean signed errors (MSE) and standard
 deviations (RMSD). For FCI (RS-DFT-CIPSI, $\mu=\infty$) we have
-given extrapolated values at $\EPT\rightarrow 0$, and the error bars
-correspond to the difference between the energies computed with a
-two-point and with a three-point linear extrapolation.
+provided the extrapolated values at $\EPT \to 0$, and the error bars
+correspond to the difference between the energies \titou{computed with a
+two-point and with a three-point linear extrapolation}. \cite{Loos_2018a,Loos_2019,Loos_2020b,Loos_2020c}

-In this benchmark, the great majority of the systems are well
+In this benchmark, the great majority of the systems are weakly correlated and are thus well
 described by a single determinant. Therefore, the atomization energies
 calculated at the DFT level are relatively accurate, even when
 the basis set is small. The introduction of exact exchange (B3LYP and
@@ -994,27 +1003,28 @@ and FCI energies.
 The imbalance of the quality of description of molecules compared
 to atoms is exhibited by a very negative value of the MSE for
 CCSD(T) and FCI/VDZ-BFD, which is reduced by a factor of two
-when going to the triple-zeta basis, and again by a factor of two when
-going to the quadruple-zeta basis.
+when going to the triple-$\zeta$ basis, and again by a factor of two when
+going to the quadruple-$\zeta$ basis.

-This large imbalance at the double-zeta level affects the nodal
+This large imbalance at the VDZ-BFD level affects the nodal
 surfaces, because although the FN-DMC energies obtained with near-FCI
 trial wave functions are much lower than the single-determinant FN-DMC
-energies, the MAE obtained with FCI (7.38~$\pm$ 1.08~kcal/mol) is
-larger than the single-determinant MAE (4.61~$\pm$ 0.34 kcal/mol).
+energies, the MAE obtained with FCI ($7.38\pm1.08$ kcal/mol) is
+larger than the single-determinant MAE ($4.61\pm 0.34$ kcal/mol).
 Using the FCI trial wave function the MSE is equal to the
 negative MAE which confirms that all the atomization energies are
 underestimated. This confirms that some of the basis-set
 incompleteness error is transferred in the fixed-node error.

-Within the double-zeta basis set, the calculations could be done for the
+Within the double-$\zeta$ basis set, the calculations could be performed for the
 whole range of values of $\mu$, and the optimal value of $\mu$ for the
 trial wave function was estimated for each system by searching for the
 minimum of the spline interpolation curve of the FN-DMC energy as a
 function of $\mu$.
-This corresponds the the line of the table labelled by the \emph{Opt}
-value of $\mu$. Using the optimal value of $\mu$ clearly improves the
-MAE, the MSE an the RMSD compared the the FCI wave function. This
+This corresponds to the line of Table~\ref{tab:mad} labelled as ``Opt.''
+\titou{The optimal $\mu$ value for each system is reported in the \SI.}
+Using the optimal value of $\mu$ clearly improves the
+MAE, the MSE and the RMSD compared to the FCI wave function. This
 result is in line with the common knowledge that re-optimizing
 the determinantal component of the trial wave function in the presence
 of electron correlation reduces the errors due to the basis set incompleteness.
@@ -1022,8 +1032,8 @@ These calculations were done only for the smallest basis set
 because of the expensive computational cost of the QMC calculations
 when the trial wave function is expanded on more than a few million
 determinants.
-At the RS-DFT-CIPSI level, we can remark that with the triple-zeta
-basis set the MAE are larger for $\mu=1$~bohr$^{-1}$ than for the
+At the RS-DFT-CIPSI level, one can see that with the VTZ-BFD
+basis the MAEs are larger for $\mu=1$~bohr$^{-1}$ than for the
 FCI. For the largest systems, as shown in Fig.~\ref{fig:g2-ndet}
 there are many systems which did not reach the threshold
 $\EPT<1$~m\hartree{}, and the number of determinants exceeded
@@ -1031,7 +1041,7 @@ $\EPT<1$~m\hartree{}, and the number of determinants exceeded
 small size-consistency error originating from the imbalanced
 truncation of the wave functions, which is not present in the
 extrapolated FCI energies. The same comment applies to
-$\mu=0.5$~bohr$^{-1}$ with the quadruple-zeta basis set.
+$\mu=0.5$~bohr$^{-1}$ with the quadruple-$\zeta$ basis set.


 %%% FIG 5 %%%
@@ -1048,13 +1058,13 @@ $\mu=0.5$~bohr$^{-1}$ with the quadruple-zeta basis set.
 \end{figure*}
 %%% %%% %%% %%%

-Searching for the optimal value of $\mu$ may be too costly, so we have
-computed the MAD, MSE and RMSD for fixed values of $\mu$. The results
-are illustrated in Fig.~\ref{fig:g2-dmc}. As seen on the figure and
-in Table~\ref{tab:mad}, the best choice for a fixed value of $\mu$ is
+Searching for the optimal value of $\mu$ may be too costly and time consuming, so we have
+computed the MAE, MSE and RMSD for fixed values of $\mu$.
+As illustrated in Fig.~\ref{fig:g2-dmc} and Table \ref{tab:mad}, 
+the best choice for a fixed value of $\mu$ is
 0.5~bohr$^{-1}$ for all three basis sets. It is the value for which
-the MAE (3.74(35), 2.46(18) and 2.06(35) kcal/mol) and RMSD (4.03(23),
-3.02(06) and 2.74(13)~kcal/mol) are minimal. Note that these values
+the MAE [$3.74(35)$, $2.46(18)$, and $2.06(35)$ kcal/mol] and RMSD [$4.03(23)$,
+$3.02(06)$, and $2.74(13)$ kcal/mol] are minimal. Note that these values
 are even lower than those obtained with the optimal value of
 $\mu$. Although the FN-DMC energies are higher, the numbers show that
 they are more consistent from one system to another, giving improved
@@ -1077,32 +1087,33 @@ The number of determinants in the trial wave functions are shown in
 Fig.~\ref{fig:g2-ndet}. As expected, the number of determinants
 is smaller when $\mu$ is small and larger when $\mu$ is large.
 It is important to remark that the median of the number of
-determinants when $\mu=0.5$~bohr$^{-1}$ is below 100~000 determinants
-with the quadruple-zeta basis set, making these calculations feasilble
-with such a large basis set. At the double-zeta level, compared to the
+determinants when $\mu=0.5$~bohr$^{-1}$ is below $100\,000$ determinants
+with the VQZ-BFD basis, making these calculations feasible
+with such a large basis set. At the double-$\zeta$ level, compared to the
 FCI trial wave functions the median of the number of determinants is
 reduced by more than two orders of magnitude.
 Moreover, going to $\mu=0.25$~bohr$^{-1}$ gives a median close to 100
-determinants at the double-zeta level, and close to 1~000 determinants
-at the quadruple-zeta level for only a slight increase of the
+determinants at the VDZ-BFD level, and close to $1\,000$ determinants
+at the quadruple-$\zeta$ level for only a slight increase of the
 MAE. Hence, RS-DFT-CIPSI trial wave functions with small values of
 $\mu$ could be very useful for large systems to go beyond the
 single-determinant approximation at a very low computational cost
-while keeping the size-consistency.
+while ensuring size-consistency.

 Note that when $\mu=0$ the number of determinants is not equal to one because
-we have used the natural orbitals of a first CIPSI calculation, and
+we have used the natural orbitals of a preliminary CIPSI calculation, and
 not the srPBE orbitals.
 So the Kohn-Sham determinant is expressed as a linear combination of
-determinants built with natural orbitals. It is possible to add
-an extra step to the algorithm to compute the natural orbitals from the
-RS-DFT/CIPSI wave function, and re-do the RS-DFT/CIPSI calculation with
+determinants built with NOs. It is possible to add
+an extra step to the algorithm to compute the NOs from the
+RS-DFT-CIPSI wave function, and re-do the RS-DFT-CIPSI calculation with
 these orbitals to get an even more compact expansion. In that case, we would
-have converged to the Kohn-Sham orbitals with $\mu=0$, and the
+have converged to the KS orbitals with $\mu=0$, and the
 solution would have been the PBE single determinant.

 %%%%%%%%%%%%%%%%%%%%
 \section{Conclusion}
+\label{sec:conclusion}
 %%%%%%%%%%%%%%%%%%%%

 In the present work, we have shown that introducing short-range correlation via