Merge branch 'master' of git.irsamc.ups-tlse.fr:scemama/RSDFT-CIPSI-QMC

This commit is contained in:
Anthony Scemama 2020-08-18 12:11:16 +02:00
commit 9b31fd7e16
1 changed files with 68 additions and 57 deletions

@ -249,7 +249,15 @@ still an active field of research. The present paper falls
within this context.
The central idea of the present work, and the launch-pad for the remainder of this study, is that one can combine the various strengths of WFT, DFT, and QMC in order to create a new hybrid method with more attractive features and higher accuracy.
In particular, we show here that one can combine CIPSI and KS-DFT via the range separation (RS) of the interelectronic Coulomb operator \cite{Sav-INC-96a,Toulouse_2004} to obtain accurate FN-DMC energies with compact multi-determinant trial wave functions.
In particular, we show here that one can combine CIPSI and KS-DFT via the range separation (RS) of the interelectronic Coulomb operator \cite{Sav-INC-96a,Toulouse_2004} --- a scheme that we label RS-DFT-CIPSI in the following --- to obtain accurate FN-DMC energies with compact multi-determinant trial wave functions.
The present manuscript is organized as follows.
In Sec.~\ref{sec:rsdft-cipsi}, we provide theoretical details about the CIPSI algorithm (Sec.~\ref{sec:CIPSI}) and range-separated DFT (Sec.~\ref{sec:rsdft}).
Computational details are reported in Sec.~\ref{sec:comp-details}.
In Sec.~\ref{sec:mu-dmc}, we discuss the influence of the range-separation parameter on the fixed-node error as well as the link between RS-DFT and Jastrow factors.
Section \ref{sec:atomization} examines the performance of the present scheme for the atomization energies of the Gaussian-1 set of molecules.
Finally, we draw our conclusion in Sec.~\ref{sec:conclusion}.
Unless otherwise stated, atomic units are used.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@ -259,6 +267,7 @@ Unless otherwise stated, atomic units are used.
%====================
\subsection{The CIPSI algorithm}
\label{sec:CIPSI}
%====================
Beyond the single-determinant representation, the best
multi-determinant wave function one can wish for --- in a given basis set --- is the FCI wave function.
@ -495,7 +504,7 @@ The first question we would like to address is the quality of the
nodes of the wave function $\Psi^{\mu}$ obtained for intermediate values of the
range separation parameter (\ie, $0 < \mu < +\infty$).
For this purpose, we consider a weakly correlated molecular system, namely the water
molecule \titou{at its experimental geometry. \cite{Caffarel_2016}}
molecule at its experimental geometry. \cite{Caffarel_2016}
We then generate trial wave functions $\Psi^\mu$ for multiple values of
$\mu$, and compute the associated FN-DMC energy keeping fixed all the
parameters impacting the nodal surface, such as the CI coefficients and the molecular orbitals.
@ -538,7 +547,7 @@ The take-home message of this first numerical study is that RS-DFT trial wave fu
This is a key result of the present study.
%======================================================
\subsection{Link between RS-DFT and Jastrow factors }
\subsection{Link between RS-DFT and Jastrow factor}
\label{sec:rsdft-j}
%======================================================
The data presented in Sec.~\ref{sec:fndmc_mu} evidence that, in a finite basis, RS-DFT can provide
@ -583,7 +592,7 @@ To do so, we have made the following numerical experiment.
First, we extract the 200 determinants with the largest weights in the FCI wave
function out of a large CIPSI calculation obtained with the VDZ-BFD basis. Within this set of determinants,
we solve the self-consistent equations of RS-DFT [see Eq.~\eqref{rs-dft-eigen-equation}]
for different values of $\mu$ \titou{using the srPBE functional}. This gives the CI expansions $\Psi^\mu$.
for different values of $\mu$ using the srPBE functional. This gives the CI expansions $\Psi^\mu$.
Then, within the same set of determinants we optimize the CI coefficients $c_I$ [see Eq.~\eqref{eq:Slater}] in the presence of
a simple one- and two-body Jastrow factor $e^J$ with $J = J_\text{eN} + J_\text{ee}$ and
\begin{subequations}
@ -601,7 +610,7 @@ where the sum over $i < j$ loops over all unique electron pairs.
In Eqs.~\eqref{eq:jast-eN} and \eqref{eq:jast-ee}, $r_{iA}$ is the distance between the $i$th electron and the $A$th nucleus while $r_{ij}$ is the interelectronic distance between electrons $i$ and $j$.
The parameters $a=1/2$
and $b=0.89$ were fixed, and the parameters $\gamma_{\text{O}}=1.15$ and $\gamma_{\text{H}}=0.35$
were obtained by energy minimization of a single \titou{HF?} determinant.
were obtained by energy minimization of a single determinant.
The optimal CI expansion $\Psi^J$ is obtained by sampling the matrix elements
of the Hamiltonian ($\mathbf{H}$) and the overlap ($\mathbf{S}$), in the
basis of Jastrow-correlated determinants $e^J D_i$:
@ -740,16 +749,16 @@ As a conclusion of the first part of this study, we can highlight the following
Atomization energies are challenging for post-HF methods
because their calculation requires a perfect balance in the
description of atoms and molecules. Basis sets used in molecular
calculations are atom-centered, so they are always better adapted to
description of atoms and molecules. The mainstream one-electron basis sets employed in molecular
calculations are atom-centered, so they are, by construction, better adapted to
atoms than molecules and atomization energies usually tend to be
underestimated by variational methods.
In the context of FN-DMC calculations, the nodal surface is imposed by
the trial wavefunction which is expanded on an atom-centered basis
set, so we expect the fixed-node error to be also tightly related to
the trial wavefunction which is expanded in an atom-centered basis
set, so we expect the fixed-node error to be also intimately related to
the basis set incompleteness error.
Increasing the size of the basis set improves the description of
the density and of electron correlation, but also reduces the
the density and of the electron correlation, but also reduces the
imbalance in the quality of the description of the atoms and the
molecule, leading to more accurate atomization energies.
@ -761,9 +770,9 @@ An extremely important feature required to get accurate
atomization energies is size-consistency (or strict separability),
since the numbers of correlated electron pairs in the isolated atoms
are different from those of the molecules.
The energy computed within density functional theory is size-consistent, and
as it is a mean-field method the convergence to the complete basis set
(CBS) limit is relatively fast. Hence, DFT methods are very well adapted to
The energy computed within DFT is size-consistent, and
as it is a mean-field method the convergence to the CBS limit
is relatively fast. Hence, DFT methods are very well adapted to
the calculation of atomization energies, especially with small basis
sets. But going to the CBS limit will converge to biased atomization
energies because of the use of approximate density functionals.
@ -773,10 +782,10 @@ the FCI energies to the CBS limit is much slower because of the
description of short-range electron correlation using atom-centered
functions. But ultimately the exact energy will be reached.
In the context of selected CI calculations, when the variational energy is
extrapolated to the FCI energy\cite{Holmes_2017} there is no
In the context of SCI calculations, when the variational energy is
extrapolated to the FCI energy \cite{Holmes_2017} there is no
size-consistency error. But when the truncated SCI wave function is used
as a reference for post-Hartree-Fock methods such as SCI+PT2
as a reference for post-HF methods such as SCI+PT2
or for QMC calculations, there is a residual size-consistency error
originating from the truncation of the wave function.
@ -800,12 +809,12 @@ $a$ is determined by cusp conditions, and $b$ is obtained by energy
or variance minimization.\cite{Coldwell_1977,Umrigar_2005}
One can easily see that this parameterization of the two-body
interaction is not size-consistent: the dissociation of a
diatomic molecule $AB$ with a parameter $b_{AB}$
diatomic molecule \ce{AB} with a parameter $b_{\ce{AB}}$
will lead to two different two-body Jastrow factors, each
with its own optimal value $b_A$ and $b_B$. To remove the
size-consistency error on a PES using this ansätz for $J_\text{ee}$,
with its own optimal value $b_{\ce{A}}$ and $b_{\ce{B}}$. To remove the
size-consistency error on a PES using this ans\"atz for $J_\text{ee}$,
one needs to impose that the parameters of $J_\text{ee}$ are fixed:
$b_A = b_B = b_{AB}$.
$b_{\ce{A}} = b_{\ce{B}} = b_{\ce{AB}}$.
When pseudopotentials are used in a QMC calculation, it is common
practice to localize the non-local part of the pseudopotential on the
@ -906,7 +915,7 @@ impacted by this spurious effect, as opposed to FCI.
In this section, we investigate the impact of the spin contamination
due to the short-range density functional on the FN-DMC energy. We have
computed the energies of the carbon atom in its triplet state
with BFD pseudopotentials and the corresponding double-zeta basis
with BFD pseudopotentials and the corresponding double-$\zeta$ basis
set. The calculation was done with $m_s=1$ (3 spin-up electrons
and 1 spin-down electron) and with $m_s=0$ (2 spin-up and 2
spin-down electrons).
@ -971,18 +980,18 @@ The 55 molecules of the benchmark for the Gaussian-1
theory\cite{Pople_1989,Curtiss_1990} were chosen to test the
performance of the RS-DFT-CIPSI trial wave functions in the context of
energy differences. Calculations were made in the double-, triple-
and quadruple-zeta basis sets with different values of $\mu$, and using
natural orbitals of a preliminary CIPSI calculation.
and quadruple-$\zeta$ basis sets with different values of $\mu$, and using
NOs from a preliminary CIPSI calculation \titou{as a starting point}.
For comparison, we have computed the energies of all the atoms and
molecules at the DFT level with different density functionals, and at
molecules at the KS-DFT level with various semi-local and hybrid density functionals [PBE, BLYP, PBE0, and B3LYP], and at
the CCSD(T) level. Table~\ref{tab:mad} gives the corresponding mean
absolute errors (MAE), mean signed errors (MSE) and standard
deviations (RMSD). For FCI (RS-DFT-CIPSI, $\mu=\infty$) we have
given extrapolated values at $\EPT\rightarrow 0$, and the error bars
correspond to the difference between the energies computed with a
two-point and with a three-point linear extrapolation.
provided the extrapolated values at $\EPT \to 0$, and the error bars
correspond to the difference between the energies \titou{computed with a
two-point and with a three-point linear extrapolation}. \cite{Loos_2018a,Loos_2019,Loos_2020b,Loos_2020c}
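The extrapolation and its error bar can be sketched as follows. This is only a minimal illustration, assuming that the variational energies are fitted linearly against the second-order corrections and that the fits use the last (most converged) points of the CIPSI run; the function names and the numbers are hypothetical and do not come from the present calculations.

```python
def linear_intercept(xs, ys):
    """Least-squares line y = a + b*x through the points; return the intercept a."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x


def extrapolate_to_fci(e_pt2, e_var):
    """Extrapolate E_var to E_PT2 -> 0.

    The lists are assumed to be ordered along the CIPSI iterations, so that
    the last entries have the smallest |E_PT2|.
    Returns (two-point estimate, three-point estimate, |difference|);
    the last value plays the role of the error bar.
    """
    two_point = linear_intercept(e_pt2[-2:], e_var[-2:])
    three_point = linear_intercept(e_pt2[-3:], e_var[-3:])
    return two_point, three_point, abs(two_point - three_point)


# Synthetic, exactly linear data (hartree): both fits give the same limit.
e_pt2 = [-0.04, -0.02, -0.01]
e_var = [-76.36, -76.38, -76.39]
two_point, three_point, error_bar = extrapolate_to_fci(e_pt2, e_var)
```

With exactly linear data the two estimates coincide and the error bar vanishes; any curvature in the actual $E_\text{var}(\EPT)$ data shows up as a nonzero difference.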
In this benchmark, the great majority of the systems are well
In this benchmark, the great majority of the systems are weakly correlated and are thus well
described by a single determinant. Therefore, the atomization energies
calculated at the DFT level are relatively accurate, even when
the basis set is small. The introduction of exact exchange (B3LYP and
@ -994,27 +1003,28 @@ and FCI energies.
The imbalance of the quality of description of molecules compared
to atoms is exhibited by a very negative value of the MSE for
CCSD(T) and FCI/VDZ-BFD, which is reduced by a factor of two
when going to the triple-zeta basis, and again by a factor of two when
going to the quadruple-zeta basis.
when going to the triple-$\zeta$ basis, and again by a factor of two when
going to the quadruple-$\zeta$ basis.
This large imbalance at the double-zeta level affects the nodal
This large imbalance at the VDZ-BFD level affects the nodal
surfaces, because although the FN-DMC energies obtained with near-FCI
trial wave functions are much lower than the single-determinant FN-DMC
energies, the MAE obtained with FCI (7.38~$\pm$ 1.08~kcal/mol) is
larger than the single-determinant MAE (4.61~$\pm$ 0.34 kcal/mol).
energies, the MAE obtained with FCI ($7.38\pm1.08$ kcal/mol) is
larger than the single-determinant MAE ($4.61\pm 0.34$ kcal/mol).
Using the FCI trial wave function, the MSE is equal to the
negative MAE, which shows that all the atomization energies are
underestimated. This confirms that some of the basis-set
incompleteness error is transferred to the fixed-node error.
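The argument relating the MSE to the MAE can be checked with a short sketch. It assumes that the error is defined as the computed atomization energy minus the reference value; the function name and the numbers are hypothetical.

```python
def mae_and_mse(computed, reference):
    """Mean absolute error and mean signed error of computed vs reference values."""
    errors = [c - r for c, r in zip(computed, reference)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sum(errors) / n
    return mae, mse


# Made-up atomization energies (kcal/mol), all underestimated:
# every signed error is negative, hence MSE = -MAE.
mae, mse = mae_and_mse([95.0, 180.5, 60.0], [100.0, 190.0, 62.5])

# As soon as one value is overestimated, the MSE is strictly above -MAE.
mae_mixed, mse_mixed = mae_and_mse([101.0, 98.0], [100.0, 100.0])
```

Since $|x| = -x$ only for $x \le 0$, the equality $\text{MSE} = -\text{MAE}$ holds if and only if no atomization energy in the set is overestimated.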
Within the double-zeta basis set, the calculations could be done for the
Within the double-$\zeta$ basis set, the calculations could be performed for the
whole range of values of $\mu$, and the optimal value of $\mu$ for the
trial wave function was estimated for each system by searching for the
minimum of the spline interpolation curve of the FN-DMC energy as a
function of $\mu$.
This corresponds the the line of the table labelled by the \emph{Opt}
value of $\mu$. Using the optimal value of $\mu$ clearly improves the
MAE, the MSE an the RMSD compared the the FCI wave function. This
This corresponds to the line of Table~\ref{tab:mad} labelled as ``Opt.''
\titou{The optimal $\mu$ value for each system is reported in the \SI.}
Using the optimal value of $\mu$ clearly improves the
MAE, the MSE and the RMSD compared to the FCI wave function. This
result is in line with the common knowledge that re-optimizing
the determinantal component of the trial wave function in the presence
of electron correlation reduces the errors due to the basis set incompleteness.
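The search for the optimal $\mu$ on a grid of FN-DMC energies can be illustrated with the sketch below. It is a simplified stand-in and not the procedure used here: instead of a spline, it fits a parabola through the lowest grid point and its two neighbours, and it ignores the statistical error bars of the FN-DMC energies. The function name, the grid, and the energies are hypothetical.

```python
def optimal_mu(mus, energies):
    """Estimate the mu that minimizes the energy on a finite grid.

    Stand-in for a spline: a parabola is fitted through the lowest grid
    point and its two neighbours, and the abscissa of its vertex is returned.
    `mus` must be sorted in increasing order and contain finite values only
    (mu = infinity, i.e. FCI, cannot be placed on the grid).
    """
    i = min(range(len(energies)), key=energies.__getitem__)
    if i == 0 or i == len(mus) - 1:
        # Minimum on the edge of the grid: no bracketing, return the grid value.
        return mus[i]
    x0, x1, x2 = mus[i - 1:i + 2]
    y0, y1, y2 = energies[i - 1:i + 2]
    num = (x1 - x0) ** 2 * (y1 - y2) - (x1 - x2) ** 2 * (y1 - y0)
    den = (x1 - x0) * (y1 - y2) - (x1 - x2) * (y1 - y0)
    if den == 0.0:
        # The three points are collinear (flat bottom): keep the grid value.
        return x1
    return x1 - 0.5 * num / den


# Synthetic energies with a minimum at mu = 0.7 bohr^-1.
mus = [0.0, 0.25, 0.5, 1.0, 2.0]
energies = [(mu - 0.7) ** 2 - 17.0 for mu in mus]
mu_opt = optimal_mu(mus, energies)
```

For noisy data a smoothing fit over more points would be more robust than a three-point parabola, since the position of the minimum is sensitive to the error bars of the neighbouring energies.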
@ -1022,8 +1032,8 @@ These calculations were done only for the smallest basis set
because of the expensive computational cost of the QMC calculations
when the trial wave function is expanded on more than a few million
determinants.
At the RS-DFT-CIPSI level, we can remark that with the triple-zeta
basis set the MAE are larger for $\mu=1$~bohr$^{-1}$ than for the
At the RS-DFT-CIPSI level, one can see that with the VTZ-BFD
basis the MAEs are larger for $\mu=1$~bohr$^{-1}$ than for the
FCI. For the largest systems, as shown in Fig.~\ref{fig:g2-ndet}
there are many systems which did not reach the threshold
$\EPT<1$~m\hartree{}, and the number of determinants exceeded
@ -1031,7 +1041,7 @@ $\EPT<1$~m\hartree{}, and the number of determinants exceeded
small size-consistency error originating from the imbalanced
truncation of the wave functions, which is not present in the
extrapolated FCI energies. The same comment applies to
$\mu=0.5$~bohr$^{-1}$ with the quadruple-zeta basis set.
$\mu=0.5$~bohr$^{-1}$ with the quadruple-$\zeta$ basis set.
%%% FIG 5 %%%
@ -1048,13 +1058,13 @@ $\mu=0.5$~bohr$^{-1}$ with the quadruple-zeta basis set.
\end{figure*}
%%% %%% %%% %%%
Searching for the optimal value of $\mu$ may be too costly, so we have
computed the MAD, MSE and RMSD for fixed values of $\mu$. The results
are illustrated in Fig.~\ref{fig:g2-dmc}. As seen on the figure and
in Table~\ref{tab:mad}, the best choice for a fixed value of $\mu$ is
Searching for the optimal value of $\mu$ may be too costly and time-consuming, so we have
computed the MAE, MSE and RMSD for fixed values of $\mu$.
As illustrated in Fig.~\ref{fig:g2-dmc} and Table~\ref{tab:mad},
the best choice for a fixed value of $\mu$ is
0.5~bohr$^{-1}$ for all three basis sets. It is the value for which
the MAE (3.74(35), 2.46(18) and 2.06(35) kcal/mol) and RMSD (4.03(23),
3.02(06) and 2.74(13)~kcal/mol) are minimal. Note that these values
the MAE [$3.74(35)$, $2.46(18)$, and $2.06(35)$ kcal/mol] and RMSD [$4.03(23)$,
$3.02(06)$, and $2.74(13)$ kcal/mol] are minimal. Note that these values
are even lower than those obtained with the optimal value of
$\mu$. Although the FN-DMC energies are higher, the numbers show that
they are more consistent from one system to another, giving improved
@ -1077,32 +1087,33 @@ The number of determinants in the trial wave functions are shown in
Fig.~\ref{fig:g2-ndet}. As expected, the number of determinants
is smaller when $\mu$ is small and larger when $\mu$ is large.
It is important to remark that the median of the number of
determinants when $\mu=0.5$~bohr$^{-1}$ is below 100~000 determinants
with the quadruple-zeta basis set, making these calculations feasilble
with such a large basis set. At the double-zeta level, compared to the
determinants when $\mu=0.5$~bohr$^{-1}$ is below $100\,000$ determinants
with the VQZ-BFD basis, making these calculations feasible
with such a large basis set. At the double-$\zeta$ level, compared to the
FCI trial wave functions the median of the number of determinants is
reduced by more than two orders of magnitude.
Moreover, going to $\mu=0.25$~bohr$^{-1}$ gives a median close to 100
determinants at the double-zeta level, and close to 1~000 determinants
at the quadruple-zeta level for only a slight increase of the
determinants at the VDZ-BFD level, and close to $1\,000$ determinants
at the quadruple-$\zeta$ level for only a slight increase of the
MAE. Hence, RS-DFT-CIPSI trial wave functions with small values of
$\mu$ could be very useful for large systems to go beyond the
single-determinant approximation at a very low computational cost
while keeping the size-consistency.
while ensuring size-consistency.
Note that when $\mu=0$ the number of determinants is not equal to one because
we have used the natural orbitals of a first CIPSI calculation, and
we have used the natural orbitals of a preliminary CIPSI calculation, and
not the srPBE orbitals.
So the Kohn-Sham determinant is expressed as a linear combination of
determinants built with natural orbitals. It is possible to add
an extra step to the algorithm to compute the natural orbitals from the
RS-DFT/CIPSI wave function, and re-do the RS-DFT/CIPSI calculation with
determinants built with NOs. It is possible to add
an extra step to the algorithm to compute the NOs from the
RS-DFT-CIPSI wave function, and re-do the RS-DFT-CIPSI calculation with
these orbitals to get an even more compact expansion. In that case, we would
have converged to the Kohn-Sham orbitals with $\mu=0$, and the
have converged to the KS orbitals with $\mu=0$, and the
solution would have been the PBE single determinant.
%%%%%%%%%%%%%%%%%%%%
\section{Conclusion}
\label{sec:conclusion}
%%%%%%%%%%%%%%%%%%%%
In the present work, we have shown that introducing short-range correlation via