up to results

This commit is contained in:
Pierre-Francois Loos 2022-03-06 17:30:42 +01:00
parent 13e0daf6c6
commit 4deafed3e9

View File

@ -1,5 +1,5 @@
\documentclass[aip,jcp,reprint,noshowkeys,superscriptaddress]{revtex4-1}
\usepackage{graphicx,dcolumn,bm,xcolor,microtype,multirow,amscd,amsmath,amssymb,amsfonts,physics,wrapfig,txfonts}
\usepackage{graphicx,dcolumn,bm,xcolor,microtype,multirow,amscd,amsmath,amssymb,amsfonts,physics,wrapfig,txfonts,siunitx}
\usepackage[version=4]{mhchem}
%\usepackage{natbib}
%\bibliographystyle{achemso}
@ -17,7 +17,7 @@
\newcommand{\trashAS}[1]{\textcolor{green}{\sout{#1}}}
\newcommand{\AS}[1]{\toto{(\underline{\bf AS}: #1)}}
\newcommand{\ant}[1]{\textcolor{orange}{#1}}
\newcommand{\SI}{\textcolor{blue}{Supporting Information}}
\newcommand{\SupInf}{\textcolor{blue}{Supporting Information}}
\newcommand{\mc}{\multicolumn}
\newcommand{\fnm}{\footnotemark}
@ -25,23 +25,13 @@
\newcommand{\tabc}[1]{\multicolumn{1}{c}{#1}}
\newcommand{\QP}{\textsc{quantum package}}
\newcommand{\hI}{\Hat{1}}
\newcommand{\hT}{\Hat{T}}
\newcommand{\hH}{\Hat{H}}
\newcommand{\bH}{\Bar{H}}
\newcommand{\ERI}[2]{v_{#1}^{#2}}
\newcommand{\EHF}{E_\text{HF}}
\newcommand{\EDOCI}{E_\text{DOCI}}
\newcommand{\EFCI}{E_\text{FCI}}
\newcommand{\ECC}{E_\text{CC}}
\newcommand{\EVCC}{E_\text{VCC}}
\newcommand{\EpCCD}{E_\text{pCCD}}
\newcommand{\Ndet}{N_\text{det}}
\newcommand{\Nb}{N}
\newcommand{\si}{\sigma}
\newcommand{\Nbas}{N}
\usepackage[
colorlinks=true,
@ -70,7 +60,7 @@
%Aiming at recovering both static and dynamic correlation,
We propose a novel partitioning of the Hilbert space, hierarchy configuration interaction (hCI),
where the excitation degree (with respect to a given reference determinant) and the seniority number (\ie, the number of unpaired electrons) are combined in a single hierarchy parameter \titou{in order to recover both static and dynamic correlations at a similar rate?}
The key appealing feature of hCI is that each hierarchy level accounts for all classes of determinants that share the same scaling with system size.
The key appealing feature of hCI is that each hierarchy level accounts \titou{for all classes of determinants that share the same scaling with system size.}
%number of electrons and basis functions.
%In this way, it accounts for low-seniority high-excitation determinants lacking in excitation-based CI, while keeping the same computational scaling with system size.
By surveying the dissociation of multiple molecular systems, we found that the overall performance of hCI usually exceeds or, at least, parallels that of excitation-based CI.
@ -112,7 +102,7 @@ In this context, one accounts for all determinants generated by exciting up to $
In this way, the excitation degree $e$ defines the following sequence of models:
CI with single excitations (CIS), CI with single and double excitations (CISD), CI with single, double, and triple excitations (CISDT), and so on.
Excitation-based CI manages to quickly recover weak (dynamic) correlation effects, but struggles in strong (static) correlation regimes.
Importantly, the number of determinants $\Ndet$ (which is the key parameter governing the computational cost) scales polynomially with the number of \titou{basis functions} $\Nb$ as $N^{2e}$.
Importantly, the number of determinants $\Ndet$ (which is the key parameter governing the computational cost) scales polynomially with the number of \titou{basis functions} $\Nbas$ as $N^{2e}$.
%This means that the contribution of higher excitations become progressively smaller.
Alternatively, seniority-based CI methods (sCI) have been proposed in both nuclear \cite{Ring_1980} and electronic \cite{Bytautas_2011} structure calculations.
@ -120,7 +110,7 @@ In short, the seniority number $s$ is the number of unpaired electrons in a give
By truncating at the seniority zero ($s = 0$) sector (sCI0), one obtains the well-known doubly-occupied CI (DOCI) method, \cite{Bytautas_2011,Allen_1962,Smith_1965,Veillard_1967}
which has been shown to be particularly effective at catching static correlation,
while higher sectors tend to contribute progressively less. \cite{Bytautas_2011,Bytautas_2015,Alcoba_2014b,Alcoba_2014}
However, already at the sCI0 level, $\Ndet$ scales exponentially with $\Nb$, since excitations of all excitation degrees are included.
However, already at the sCI0 level, $\Ndet$ scales exponentially with $\Nbas$, since excitations of all excitation degrees are included.
Therefore, despite the encouraging successes of seniority-based CI methods, their unfavorable computational scaling restricts applications to very small systems.
Besides CI, other methods that exploit the concept of seniority number have been pursued. \cite{Henderson_2014,Chen_2015,Bytautas_2018}
\titou{T2: I think we need to cite the papers of the Canadians here.}
@ -162,7 +152,7 @@ The color tones represent the determinants that are included at a given level of
We have three key justifications for this new CI hierarchy.
The first one is physical.
We know that the lower degrees of excitations and lower seniority sectors, when looked at individually, often carry the most important contribution to the FCI expansion.
By combining $e$ and $s$ as is Eq.~\eqref{eq:h}, we ensure that both directions in the excitation-seniority map (see Fig.~\ref{fig:allCI}) will be contemplated.
By combining $e$ and $s$ as is Eq.~\eqref{eq:h}, we ensure that both directions in the excitation-seniority map (see Fig.~\ref{fig:allCI}) are contemplated.
Rather than filling the map top-bottom (as in excitation-based CI) or left-right (as in seniority-based CI), the hCI methods fills it diagonally.
In this sense, we hope to recover dynamic correlation by moving right in the map (increasing the excitation degree while keeping a low seniority number),
at the same time as static correlation, by moving down (increasing the seniority number while keeping a low excitation degree).
@ -170,11 +160,11 @@ at the same time as static correlation, by moving down (increasing the seniority
The second justification is computational.
In the hCI class of methods, each level of theory accommodates additional determinants from different excitation-seniority sectors (each block of same color tone in Fig.~\ref{fig:allCI}).
The key insight behind hCI is that the number of additional determinants presents the same scaling with respect to $N$, for all excitation-seniority sectors entering at a given hierarchy $h$.
The key insight behind hCI is that the number of additional determinants presents the same scaling with respect to $\Nbas$, for all excitation-seniority sectors entering at a given hierarchy $h$.
This further justifies the parameter $h$ as being the simple average between $e$ and $s/2$.
Finally, the third justification for our hCI method is empirical and closely related to the computational motivation.
There are many possible ways to populate the Hilbert space starting from the a given reference determinant,
There are many possible ways to populate the Hilbert space starting from a given reference determinant,
and one can in principle formulate any systematic recipe that includes progressively more determinants.
Besides a physical or computational perspective, the question of what makes for a good recipe can be framed empirically.
Does our hCI class of methods perform better than excitation-based or seniority-based CI,
@ -183,47 +173,47 @@ in the sense of recovering most of the correlation energy with the least computa
Hybrid approaches based on both excitation degree and seniority number have been proposed before. \cite{Alcoba_2014,Raemdonck_2015,Alcoba_2018}
In these works, the authors established separate maximum values for the excitation and the seniority,
and either the union or the intersection between the two sets of determinants have been considered.
For the union case, $\Ndet$ grows exponentially with $N$,
For the union case, $\Ndet$ grows exponentially with $\Nbas$,
while in the intersection approach the Hilbert space is filled rectangle-wise in our excitation-seniority map.
In the latter case, the scaling of $\Ndet$ would be dominated by the rightmost bottom block.
Bytautas et al.\cite{Bytautas_2015} explored a different hybrid scheme combining determinants having a maximum seniority number and those from a complete active space.
Bytautas \textit{et al.}\cite{Bytautas_2015} explored a different hybrid scheme combining determinants having a maximum seniority number and those from a complete active space.
In comparison to previous approaches, our hybrid hCI scheme has two key advantages.
First, it is defined by a single parameter that unifies excitation degree and seniority number [see Eq.~\eqref{eq:h}].
Second and most importantly, each next level includes all classes of determinants sharing the same scaling with system size, as discussed before, thus keeping the method at a polynomial scaling.
Second and most importantly, each next level includes all classes of determinants sharing the same scaling with system size, as discussed before, thus preserving the polynomial cost of the method.
Each level of excitation-based CI has a hCI counterpart with the same scaling of $\Ndet$ with respect to $N$.
For example, $\Ndet \sim N^4$ in both hCI2 and CISD, whereas $\Ndet \sim N^6$ in hCI3 and CISDT, and so on.
Each level of excitation-based CI has a hCI counterpart with the same scaling of $\Ndet$ with respect to $\Nbas$.
For example, $\Ndet = \order*{\Nbas^4}$ in both hCI2 and CISD, whereas $\Ndet = \order*{\Nbas^6}$ in hCI3 and CISDT, and so on.
From this computational perspective, hCI can be seen as a more natural choice than the traditional excitation-based CI,
because if one can afford for, say, a CISDT calculation, than one could probably afford a hCI3 calculation, which has the same computational scaling.
Of course, in practice an integer-$h$ hCI method will have more determinants than its excitation-based counterpart (despite the same scaling),
because if one can afford for, say, a CISDT calculation, then one could probably afford a hCI3 calculation, \titou{which has the same computational scaling}.
Of course, in practice an integer-$h$ hCI method has more determinants than its excitation-based counterpart \titou{(despite the same scaling)},
and thus one should first ensure whether including the lower-triangular blocks (going from CISDT to hCI3 in our example)
is a better strategy than adding the next column (going from CISDT to CISDTQ).
Therefore, here we decided to discuss the results in terms of $\Ndet$, rather than the formal scaling of $\Ndet$,
Therefore, here we decided to discuss the results in terms of $\Ndet$, rather than the formal scaling of $\Ndet$ as a function of $\Nbas$,
which could make the comparison somewhat biased toward hCI.
It is interesting to compare the lowest levels of hCI (hCI1) and excitation-based CI (CIS).
Since single excitations do not connect with the reference (at least for HF orbitals), CIS provides the same energy as HF.
In contrast, the paired double excitations of hCI1 do connect with the reference (and the singles contribute indirectly via the doubles).
Therefore, while CIS based on HF orbitals does not improve with respect to the mean-field HF wave function,
the hCI1 counterpart already represents a minimally correlated model, with the same and favourable $\Ndet \sim N^2$ scaling.
hCI also allows for half-integer values of $h$, with no parallel in excitation-based CI.
the hCI1 counterpart already represents a minimally correlated model, with the same and favorable $\Ndet = \order*{\Nbas^2}$ scaling.
hCI also allows for half-integer values of $h$, with no equivalent in excitation-based CI.
This gives extra flexibility in terms of choice of method.
For a particular application with excitation-based CI, CISD might be too inaccurate, for example, while the improved accuracy of CISDT might be too expensive.
hCI2.5 could represent an alternative, being more accurate than hCI2 and less expensive than hCI3.
Our main goal here is to assess the performance of hCI against excitation-based and seniority-based CI.
To do so, we evaluated how fast different observables converge to the FCI limit as a function of $\Ndet$.
To do so, we have evaluated how fast different observables converge to the FCI limit as a function of $\Ndet$.
We have calculated the potential energy curves (PECs) for the dissociation of six systems,
\ce{HF}, \ce{F2}, \ce{N2}, ethylene, \ce{H4}, and \ce{H8},
which display a variable number of bond breaking.
For the latter two, we considered linearly arranged and equally spaced hydrogen atoms, and computed PECs along the symmetric dissociation coordinate.
For the latter two molecules, we considered linearly arranged with equally spaced hydrogen atoms, and computed PECs along the symmetric dissociation coordinate.
For ethylene, we considered the \ce{C=C} double bond breaking, while freezing the remaining internal coordinates.
Due to the (multiple) bond breaking, these are challenging systems for electronic structure methods,
being often considered when assessing novel methodologies.
We evaluated the convergence of four observables: the non-parallelity error (NPE), the distance error, the vibrational frequencies, and the equilibrium geometries.
The NPE is defined as the maximum minus the minimum differences between the PECs obtained at given CI level and the exact FCI result.
We define the distance error as the the maximum plus and the minimum differences between a given PEC and the FCI result.
We define the distance error as the maximum and the minimum differences between a given PEC and the FCI result.
Thus, while the NPE probes the similarity regarding the shape of the PECs, the distance error provides a measure of how their overall magnitudes compare.
From the PECs, we have also extracted the vibrational frequencies and equilibrium geometries (details can be found in the \SI).
From the PECs, we have also extracted the vibrational frequencies and equilibrium geometries (details can be found in the \SupInf).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\section{Computational details}
@ -231,22 +221,23 @@ From the PECs, we have also extracted the vibrational frequencies and equilibriu
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
The hCI method was implemented in {\QP} via a straightforward adaptation of the
\textit{configuration interaction using a perturbative selection made iteratively} (CIPSI) algorithm \cite{Huron_1973,Giner_2013,Giner_2015},
\textit{configuration interaction using a perturbative selection made iteratively} (CIPSI) algorithm \cite{Huron_1973,Giner_2013,Giner_2015,Garniron_2018},
by allowing only for determinants having a given maximum hierarchy $h$.
The excitation-based CI, seniority-based CI, and FCI calculations presented here were also performed with the CIPSI algorithm implemented in {\QP}. \cite{Huron_1973,Giner_2013,Giner_2015,Garniron_2019}
In practice, we consider the CI energy to be converged when the second-order perturbation correction lies below $10^{-5}$ Hartree,
The excitation-based CI, seniority-based CI, and FCI calculations presented here were also performed with the CIPSI algorithm implemented in {\QP}. \cite{Garniron_2019}
In practice, we consider the CI energy to be converged when the second-order perturbation correction lies below \SI{0.01}{\milli\hartree}, \cite{Garniron_2018}
which requires considerably fewer determinants than the formal number of determinants (understood as all those that belong to a given CI level, regardless of their weight or symmetry).
Nevertheless, we decided to present the results as functions of the formal number of determinants,
which are not related to the particular algorithmic choices of the CIPSI calculations.
All CI calculations were performed for the cc-pVDZ basis set and with frozen core orbitals.
For \ce{HF} we have also tested basis set effects, by considered the cc-pVTZ and cc-pVQZ basis sets.
\titou{Geometries? SI?}
The CI calculations were performed with both canonical Hartree-Fock (HF) orbitals and optimized orbitals.
The CI calculations were performed with both canonical HF orbitals and optimized orbitals.
In the latter case, the energy is obtained variationally in the CI space and in the orbital parameter space, hence an orbital-optimized CI (oo-CI) method.
We employed the algorithm described elsewhere \cite{Damour_2021} and also implemented in {\QP} for optimizing the orbitals within a CI wave function.
In order to avoid converging to a saddle point solution, we employed a similar strategy as recently described in Ref. \cite{Hollett_2022}.
In order to avoid converging to a saddle point solution, we employed a similar strategy as recently described in Ref.~\cite{Hollett_2022}.
Namely, whenever the eigenvalue of the orbital rotation Hessian is negative and the corresponding gradient component $g_i$ lies below a given threshold $g_0$,
then this gradient component is replaced by $g_0 |g_i|/g_i$.
then this gradient component is replaced by $g_0 \abs{g_i}/g_i$.
While we cannot ensure that the obtained solutions are global minima in the orbital parameter space, we verified that in all stationary solutions surveyed here
correspond to real minima (rather than maxima or saddle points).
@ -254,7 +245,7 @@ It is worth mentioning that obtaining smooth PECs for the orbital optimized calc
First, the orbital optimization started from the HF orbitals of each geometry.
This usually lead to discontinuous PECs, meaning that distinct solutions of the orbital optimization have been found with our algorithm.
Then, at some geometry or geometries that seem to present the lowest lying solution,
the optimized orbitals were employed as the guess orbitals for the neighbouring geometries, and so on, until a new PEC is obtained.
the optimized orbitals were employed as the guess orbitals for the neighboring geometries, and so on, until a new PEC is obtained.
This protocol is repeated until the PEC built from the lowest lying oo-CI solution becomes continuous.
%While we cannot guarantee that the presented solutions represent the global minima, we believe that in most cases the above protocol provides at least close enough solutions.
%Multiple solutions for the orbital optimization are usually found, meaning several local minimal in the orbital parameter landscape.
@ -274,14 +265,14 @@ We recall that saddle point solutions were purposely avoided in our orbital opti
In Fig.~\ref{fig:plot_stat} we present the NPEs for the six systems studied, and for the three classes of CI methods,
as functions of the number of determinants, $\Ndet$.
The corresponding PECs and the energy differences with respect to the FCI results can be found in the \SI.
The corresponding PECs and the energy differences with respect to the FCI results can be found in the \SupInf.
The main result contained in Fig.~\ref{fig:plot_stat} concerns the overall faster convergence of the hCI methods when compared to excitation-based and seniority-based CI methods.
This is observed for single bond breaking (\ce{HF} and \ce{F2}) as well as the more challenging double (ethylene), triple (\ce{N2}), and quadruple (\ce{H4}) bond breaking.
For \ce{H8}, hCI and excitation-based CI perform similarly.
The convergence with respect to $\Ndet$ is slower in the latter, more challenging cases, irrespective of the class of CI methods, as would be expected.
But more importantly, the superiority of the hCI methods appears to be highlighted in the multiple bond break systems (compare ethylene and \ce{N2} with \ce{HF} and \ce{F2} in Fig.~\ref{fig:plot_stat}).
For \ce{HF} we also evaluated the convergence is affected by increasing the basis sets, going from cc-pVDZ to cc-pVTZ and cc-pVQZ basis sets (see Fig.Sx in the \SI).
For \ce{HF} we also evaluated the convergence is affected by increasing the basis sets, going from cc-pVDZ to cc-pVTZ and cc-pVQZ basis sets (see Fig.Sx in the \SupInf).
While a larger $\Ndet$ is required to achieve the same level of convergence, as expected,
the convergence profiles remain very similar for all basis sets.
We thus believe that the main findings discussed here for the other systems would be equally basis set independent.
@ -297,7 +288,7 @@ We thus believe that the main findings discussed here for the other systems woul
For all systems (specially ethylene and \ce{N2}), hCI2 is better than CISD, two methods where $\Ndet$ scales as $N^4$.
hCI2.5 is better than CISDT (except for \ce{H8}), despite its lower computational cost, whereas hCI3 is much better than CISDT, and comparable in accuracy with CISDTQ (again for all systems).
Inspection of the PECs (see \SI) reveals that the lower NPEs observed for hCI stem mostly from the contribution of the dissociation region.
Inspection of the PECs (see \SupInf) reveals that the lower NPEs observed for hCI stem mostly from the contribution of the dissociation region.
This result demonstrates the importance of higher-order excitations with low seniority number in this strong correlation regime,
which are accounted for in hCI but not in excitation-based CI (for a given scaling with $\Ndet$).
@ -305,7 +296,7 @@ Meanwhile, the first level of seniority-based CI (sCI0, which is the same as DOC
tends to offer a rather low NPE when compared to the other CI methods with a similar $\Ndet$ (hCI2.5 and CISDT).
However, convergence is clearly slower for the next levels (sCI2 and sCI4), whereas excitation-based CI and specially hCI methods converge faster.
Furthermore, seniority-based CI becomes less attractive for larger basis set in view of its exponential scaling.
This can be seen in Fig.Sx of the \SI, which shows that augmenting the basis set leads to a much steeper increase of $\Ndet$ for seniority-based CI.
This can be seen in Fig.Sx of the \SupInf, which shows that augmenting the basis set leads to a much steeper increase of $\Ndet$ for seniority-based CI.
It is worth mentioning the surprisingly good performance of the hCI1 and hCI1.5 methods.
For \ce{HF}, \ce{F2}, and ethylene, they presented lower NPEs than the much more expensive CISDT method, being slightly higher in the case of \ce{N2}.
@ -317,7 +308,7 @@ become less apparent as progressively more bonds are being broken (compare for i
This reflects the fact that higher-order excitations are needed to properly describe multiple bond breaking,
and also hints at some cancelation of erros in low order hCI methods for single bond breaking.
In Fig.Sx of the \SI, we present the distance error, which is also found to decrease faster with the hCI methods.
In Fig.Sx of the \SupInf, we present the distance error, which is also found to decrease faster with the hCI methods.
Most of observations discussed for the NPE also hold for the distance error, with two main differences.
The convergence is always monotonic for the latter observable (which is expected from the definition of the observable),
and the performance of seniority-based CI is much poorer (due to the slow recovery of dynamic correlation).
@ -332,7 +323,7 @@ For both observables, hCI and excitation-based CI largely outperform seniority-b
Similarly to what we observed for the NPEs, the convergence of hCI was also found to be non-monotonic in some cases.
This oscillatory behavior is particularly evident for \ce{F2}, also noticeable for \ce{HF}, becoming less apparent for ethylene, virtually absent for \ce{N2},
and showing up again for \ce{H4} and \ce{H8}.
Results for \ce{HF} with larger basis sets (see Fig.Sx in the \SI) show very similar convergence behaviours, though with less oscillations for the hCI methods.
Results for \ce{HF} with larger basis sets (see Fig.Sx in the \SupInf) show very similar convergence behaviours, though with less oscillations for the hCI methods.
Interestingly, equilibrium geometries and vibrational frequencies of \ce{HF} and \ce{F2} (single bond breaking),
are rather accurate when evaluated at the hCI1.5 level, bearing in mind its relatively modest computational cost.
@ -360,7 +351,7 @@ Up to this point, all results and discussions have been based on CI calculations
Now we discuss the role of further optimizing the orbitals at each given CI calculation.
Due to the significantly higher computational cost and numerical difficulties for optimizing the orbitals at higher levels of CI,
such calculations were typically limited up to oo-CISD (for excitation-based), oo-DOCI (for seniority-based), and oo-hCI2 (for hCI).
The PECs and analogous results to those of Figs.~\ref{fig:plot_stat}, \ref{fig:xe}, and \ref{fig:freq} are shown in the \SI.
The PECs and analogous results to those of Figs.~\ref{fig:plot_stat}, \ref{fig:xe}, and \ref{fig:freq} are shown in the \SupInf.
At a given CI level, orbital optimization will lead to lower energies than with HF orbitals.
However, even though the energy is lowered (thus improved) at each geometry, such improvement may vary largely along the PEC, which may or may not decrease the NPE.
@ -383,12 +374,12 @@ due to the larger energy lowering at the Franck-Condon region than at dissociati
These results suggest that, when bond breaking involves one site, orbital optimization at the DOCI level does not have such an important role,
at least in the sense of decreasing the NPE.
Optimizing the orbitals at the CI level also tends to benefit the convergence of vibrational frequencies and equilibrium geometries (shown in Fig.Sx of the \SI).
Optimizing the orbitals at the CI level also tends to benefit the convergence of vibrational frequencies and equilibrium geometries (shown in Fig.Sx of the \SupInf).
The impact is often somewhat larger for hCI than for excitation-based CI, by a small margin.
The large oscillations observed in the hCI convergence with HF orbitals (for \ce{HF} and \ce{F2}) are significantly suppressed upon orbital optimization.
We come back to the surprisingly good performance of oo-CIS, which is interesting due to its low computational cost.
The PECs are compared with those of HF and FCI in Fig.Sx of the \SI.
The PECs are compared with those of HF and FCI in Fig.Sx of the \SupInf.
At this level, the orbital rotations provide an optimized reference (different from the HF solution), from which only single excitations are performed.
Since the reference is not the HF one, Brillouin's theorem no longer holds, and single excitations actually connect with the reference.
Thus, with only single excitations (and a reference that is optimized in the presence of these excitations), one obtains a minimally correlated model.