Pierre-Francois Loos 2020-11-23 21:12:33 +01:00
commit 94653f3c8d

@ -112,10 +112,10 @@ like absorption, fluorescence, phosphorescence or even chemoluminescence \cite{B
For a given level of theory, ground-state methods are usually more accurate than their excited-state analogs.
The reasons behind this are (at least) threefold: i) one might lack a proper variational principle for excited-state energies and one may have to rely on response theory
\cite{Monkhorst_1977,Helgaker_1989,Koch_1990,Koch_1990b,Christiansen_1995b,Christiansen_1998b,Hattig_2003,Kallay_2004,Hattig_2005c} formalisms which inherently introduce a
ground-state ``bias'', iii) accurately modeling the electronic structure of excited states usually requires larger one-electron basis sets (including diffuse functions most of the times) than their
ground-state ``bias'', ii) accurately modeling the electronic structure of excited states usually requires larger one-electron basis sets (including diffuse functions most of the time) than their
ground-state counterparts, and iii) excited states can be governed by different amounts of dynamic/static correlations, present very different physical natures ($\pi \to \pis$, $n \to \pis$, charge
transfer, double excitation, valence, Rydberg, singlet, doublet, triplet, etc.), yet be very close in energy to one another. Hence, designing excited-state methods able to tackle simultaneously
and on an equal footing all these types of excited states at an affordable cost remain an open challenge in theoretical computational chemistry as evidenced by the large number of review
and on an equal footing all these types of excited states at an affordable cost remains an open challenge in theoretical computational chemistry as evidenced by the large number of review
articles on this particular subject \cite{Roos_1996,Piecuch_2002,Dreuw_2005,Krylov_2006,Sneskov_2012,Gonzales_2012,Laurent_2013,Adamo_2013,Dreuw_2015,Ghosh_2018,Blase_2020,Loos_2020a}.
@ -146,7 +146,7 @@ CASPT2 \cite{Andersson_1990,Andersson_1992,Roos,Roos_1996} calculations (with th
transitions. These TBEs were quickly refined with the larger aug-cc-pVTZ basis set \cite{Silva-Junior_2010b,Silva-Junior_2010c}. In the same spirit, it is also worth mentioning Gordon's set of vertical transitions
(based on experimental values) \cite{Leang_2012} used to benchmark the performance of time-dependent density-functional theory (TD-DFT) \cite{Runge_1984,Casida_1995,Casida_2012,Ulrich_2012}, as well
as its extended version by Goerigk and coworkers who decided to replace the experimental reference values by CC3 excitation energies \cite{Schwabe_2017,Casanova-Paez_2019,Casanova_Paes_2020}.
For comparisons with experimental values, there also exist various sets of measured 0-0 energies used in various benchmarks, notably by the Furche \cite{Furche_2002,Send_2011a}, H\"attig \cite{Winter_2013}
For comparisons with experimental values, there also exist various sets of measured 0-0 energies used in various benchmarks, notably by the Furche \cite{Furche_2002,Send_2011a}, H\"attig \cite{Winter_2013}
and our \cite{Loos_2018,Loos_2019a,Loos_2019b} groups for gas-phase compounds and by Grimme \cite{Dierksen_2004,Goerigk_2010a} and one of us \cite{Jacquemin_2012,Jacquemin_2015b} for solvated dyes.
Let us also mention the new benchmark set of charge-transfer excited states recently introduced by Szalay and coworkers [based on equation-of-motion coupled cluster (EOM-CC) methods] \cite{Kozma_2020}
as well as the Gagliardi-Truhlar set employed to compare the accuracy of multiconfiguration pair-density functional theory \cite{Ghosh_2018} against the well-established CASPT2 method \cite{Hoyer_2016}.
@ -173,12 +173,12 @@ review the generic benchmark studies devoted to adiabatic and 0-0 energies perfo
A distinctive feature of the QUEST dataset is that it is based, to a large extent, on selected configuration interaction (SCI) reference excitation energies as well as high-order linear-response (LR) CC methods such as LR-CCSDT and
LR-CCSDTQ \cite{Noga_1987,Koch_1990,Kucharski_1991,Christiansen_1998b,Kucharski_2001,Kowalski_2001,Kallay_2003,Kallay_2004,Hirata_2000,Hirata_2004}. Recently, SCI methods have been a force to reckon with for
the computation of highly-accurate energies in small- and medium-sized molecules as they yield near full configuration interaction (FCI) quality energies for only a fraction of the computational cost of a genuine FCI calculation \cite{Booth_2009,Booth_2010,Cleland_2010,Booth_2011,Daday_2012,Blunt_2015,Ghanem_2019,Deustua_2017,Deustua_2018,Holmes_2017,Chien_2018,Li_2018,Yao_2020,Li_2020,Eriksen_2017,Eriksen_2018,Eriksen_2019a,Eriksen_2019b,Xu_2018,Xu_2020,Loos_2018a,Loos_2019,Loos_2020b,Loos_2020c,Loos_2020a,Loos_2020e,Eriksen_2021}.
Due to the fairly natural idea underlying these methods, the SCI family is composed by numerous members \cite{Bender_1969,Whitten_1969,Huron_1973,Abrams_2005,Bunge_2006,Bytautas_2009,Giner_2013,Caffarel_2014,Giner_2015,Garniron_2017b,Caffarel_2016a,Caffarel_2016b,Holmes_2016,Sharma_2017,Holmes_2017,Chien_2018,Scemama_2018,Scemama_2018b,Garniron_2018,Evangelista_2014,Schriber_2016,Schriber_2017,Liu_2016,Per_2017,Ohtsuka_2017,Zimmerman_2017,Li_2018,Ohtsuka_2017,Coe_2018,Loos_2019}.
Their fundamental philosophy consists, roughly speaking, in retaining only the most energetically relevant determinants of the FCI space following a given criterion to slow down the exponential increase of the size of the CI expansion.
Due to the fairly natural idea underlying these methods, the SCI family is composed of numerous members \cite{Bender_1969,Whitten_1969,Huron_1973,Abrams_2005,Bunge_2006,Bytautas_2009,Giner_2013,Caffarel_2014,Giner_2015,Garniron_2017b,Caffarel_2016a,Caffarel_2016b,Holmes_2016,Sharma_2017,Holmes_2017,Chien_2018,Scemama_2018,Scemama_2018b,Garniron_2018,Evangelista_2014,Tubman_2016,Tubman_2020,Schriber_2016,Schriber_2017,Liu_2016,Per_2017,Ohtsuka_2017,Zimmerman_2017,Li_2018,Ohtsuka_2017,Coe_2018,Loos_2019}.
Their fundamental philosophy consists, roughly speaking, in retaining only the most \alert{\textst{energetically}} relevant determinants of the FCI space following a given criterion to slow down the exponential increase of the size of the CI expansion.
Although SCI methods were originally developed in the late 1960s by Bender and Davidson \cite{Bender_1969} as well as Whitten and Hackmeyer \cite{Whitten_1969}, new efficient SCI algorithms have resurfaced recently.
Four examples are adaptive sampling CI (ASCI) \cite{Tubman_2016,Tubman_2018,Tubman_2020}, iCI \cite{Liu_2014,Liu_2016,Lei_2017,Zhang_2020}, semistochastic heat-bath CI (SHCI) \cite{Holmes_2016,Holmes_2017,Sharma_2017,Li_2018,Li_2020,Yao_2020}), and \textit{Configuration Interaction using a Perturbative Selection made Iteratively} (CIPSI) \cite{Huron_1973,Giner_2013,Giner_2015,Garniron_2019}.
These four flavors of SCI include a second-order perturbative (PT2) correction which is key to estimate the ``distance'' to the FCI solution (see below).
The SCI calculations performed for the QUEST set of excitation energies relies on the CIPSI algorithm, which is, from a historical point of view, one of the oldest SCI algorithm.
Three examples are \alert{\textst{adaptive sampling CI (ASCI)}, }iCI \cite{Liu_2014,Liu_2016,Lei_2017,Zhang_2020}, semistochastic heat-bath CI (SHCI) \cite{Holmes_2016,Holmes_2017,Sharma_2017,Li_2018,Li_2020,Yao_2020}, and \textit{Configuration Interaction using a Perturbative Selection made Iteratively} (CIPSI) \cite{Huron_1973,Giner_2013,Giner_2015,Garniron_2019}.
These flavors of SCI include a second-order perturbative (PT2) correction which is key to estimating the ``distance'' to the FCI solution (see below).
The SCI calculations performed for the QUEST set of excitation energies rely on the CIPSI algorithm, which is, from a historical point of view, one of the oldest SCI algorithms.
It was developed in 1973 by Huron, Rancurel, and Malrieu \cite{Huron_1973} (see also Refs.~\cite{Evangelisti_1983,Cimiraglia_1985,Cimiraglia_1987,Illas_1988,Povill_1992}).
Recently, the determinant-driven CIPSI algorithm has been efficiently implemented \cite{Garniron_2019} in the open-source programming environment QUANTUM PACKAGE by the Toulouse group, making it possible to perform massively
parallel computations \cite{Garniron_2017,Garniron_2018,Garniron_2019,Loos_2020e}. CIPSI is also frequently employed to provide accurate trial wave functions for quantum Monte Carlo calculations in molecules \cite{Caffarel_2014,Caffarel_2016a,Caffarel_2016b,Giner_2013,Giner_2015,Scemama_2015,Scemama_2016,Scemama_2018,Scemama_2018b,Scemama_2019,Dash_2018,Dash_2019,Scemama_2020} and more recently
@ -223,7 +223,7 @@ These basis sets are available from the \href{https://www.basissetexchange.org}{
In order to compute reference vertical energies, we have designed different strategies depending on the actual nature of the transition and the size of the system.
For small molecules (typically 1--3 non-hydrogen atoms), we mainly resort to SCI methods which can provide near-FCI excitation energies for compact basis sets.
Obviously, the smaller the molecule, the larger the basis we can afford.
For larger systems (\ie, 4--6 non-hydrogen atom), one cannot afford SCI calculations anymore expect in a few special occasions, and we then rely on LR-CC theory (LR-CCSDT and LR-CCSDTQ typically \cite{Kucharski_1991,Kallay_2003,Kallay_2004,Hirata_2000,Hirata_2004}) to obtain accurate transition energies.
For larger systems (\ie, 4--6 non-hydrogen atoms), one can no longer afford SCI calculations except on a few special occasions, and we then rely on LR-CC theory (LR-CCSDT and LR-CCSDTQ typically \cite{Kucharski_1991,Kallay_2003,Kallay_2004,Hirata_2000,Hirata_2004}) to obtain accurate transition energies.
In the following, we will omit the prefix LR for the sake of clarity, as equivalent values would be obtained with the equation-of-motion (EOM) formalism \cite{Rowe_1968,Stanton_1993}.
The CC calculations are performed with several codes.
@ -290,7 +290,7 @@ The definition of the active space considered for each system as well as the num
%------------------------------------------------
In this section, we present our scheme to estimate the extrapolation error in SCI calculations.
This new protocol is then applied to five- and six-membered ring molecules for which SCI calculations are particularly challenging even for small basis sets.
Note that the present method does only applied to ``state-averaged'' SCI calculations where ground- and excited-state energies are produced during the same calculation with the same set of molecular orbitals, not to ``state-specific'' calculations where one computes solely the energy of a single state (like conventional ground-state calculations).
Note that the present method applies only to ``state-averaged'' SCI calculations where ground- and excited-state energies are produced during the same calculation with the same set of molecular orbitals, not to ``state-specific'' calculations where one computes solely the energy of a single state (like conventional ground-state calculations).
For the $m$th excited state (where $m = 0$ corresponds to the ground state), we usually estimate its FCI energy $E_{\text{FCI}}^{(m)}$ by performing a linear extrapolation of its variational energy $E_\text{var}^{(m)}$ as a function of its rPT2 correction $E_{\text{rPT2}}^{(m)}$ as follows
\begin{equation}
@ -330,7 +330,7 @@ This choice ensures that the statistical uncertainty vanishes at the FCI limit.
We then search for a confidence interval $\mathcal{I}$ such that the true value of the excitation energy $\Delta E_{\text{FCI}}^{(m)}$ lies within one standard deviation of $\Delta E_\text{CIPSI}^{(m)}$, i.e., $P( \Delta E_{\text{FCI}}^{(m)} \in [ \Delta E_\text{CIPSI}^{(m)} \pm \sigma ] \; | \; \mathcal{G}) = 0.6827$.
The probability that $\Delta E_{\text{FCI}}^{(m)}$ is in an interval $\mathcal{I}$ is
\begin{equation}
P( \Delta E_{\text{FCI}}^{(m)} \in \mathcal{I} ) = P( \Delta E_{\text{FCI}}^{(m)} \in I | \mathcal{G}) \times P(\mathcal{G})
	P\qty( \Delta E_{\text{FCI}}^{(m)} \in \mathcal{I} ) = P\qty( \Delta E_{\text{FCI}}^{(m)} \in \mathcal{I} \Big| \mathcal{G}) \times P(\mathcal{G})
\end{equation}
where the probability $P(\mathcal{G})$ that the random variables are normally distributed can be deduced from the Jarque-Bera test $J$ as
\begin{equation}
@ -343,24 +343,24 @@ The inverse of the cumulative distribution function of the $t$-distribution, $t_
\beta = t_{\text{CDF}}^{-1} \qty[
\frac{1}{2} \qty( 1 + \frac{0.6827}{P(\mathcal{G})}), M ]
\end{equation}
such that $P( \Delta E_{\text{FCI}}^{(m)} \in [ \Delta E_{\text{CIPSI}}^{(m)} \pm \beta \sigma ] ) = p = 0.6827$.
such that $P\qty( \Delta E_{\text{FCI}}^{(m)} \in \qty[ \Delta E_{\text{CIPSI}}^{(m)} \pm \beta \sigma ] ) = p = 0.6827$.
Only the last $M>2$ computed energy differences are considered. $M$ is chosen such that $P(\mathcal{G})>0.8$ and such that the error bar is minimal.
If all the values of $P(\mathcal{G})$ are below $0.8$, $M$ is chosen such that $P(\mathcal{G})$ is maximal.
A Python code associated with this procedure is provided in the {\SupInf}.
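As a rough illustration of how the scaling factor $\beta$ defined above could be evaluated, a minimal Python sketch is given below. It is not the code distributed in the {\SupInf}. It assumes that the second argument $M$ of $t_{\text{CDF}}^{-1}$ is the number of degrees of freedom, it takes $P(\mathcal{G})$ as an input instead of deriving it from the Jarque-Bera test, and it inverts the $t$-distribution with a simple Simpson quadrature and bisection so that only the standard library is needed. The names \texttt{t\_pdf}, \texttt{t\_cdf}, \texttt{t\_ppf}, \texttt{beta\_factor}, and \texttt{error\_bar} are hypothetical.

```python
import math


def t_pdf(x, nu):
    """Density of Student's t-distribution with nu degrees of freedom."""
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi))
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(x * x / nu))


def t_cdf(x, nu, n=2000):
    """CDF for x >= 0: 1/2 plus a composite Simpson integral of the density on [0, x].

    n must be even; the accuracy degrades for very large x.
    """
    if x == 0.0:
        return 0.5
    h = x / n
    total = t_pdf(0.0, nu) + t_pdf(x, nu)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 0.5 + total * h / 3


def t_ppf(q, nu):
    """Inverse CDF for 0.5 <= q < 1, obtained by bracketing and bisection."""
    if not 0.5 <= q < 1.0:
        raise ValueError("q must lie in [0.5, 1)")
    lo, hi = 0.0, 1.0
    while t_cdf(hi, nu) < q:  # expand the bracket until it contains the root
        hi *= 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if t_cdf(mid, nu) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def beta_factor(p_gauss, m, p=0.6827):
    """beta = t_CDF^{-1}[(1 + p / P(G)) / 2, M], with M taken as degrees of freedom (assumption)."""
    if not p < p_gauss <= 1.0:
        # For P(G) <= p the argument of the inverse CDF reaches or exceeds 1.
        raise ValueError("P(G) must lie in (p, 1]")
    return t_ppf(0.5 * (1.0 + p / p_gauss), m)


def error_bar(sigma, p_gauss, m):
    """Half-width beta * sigma of the interval around the CIPSI estimate."""
    return beta_factor(p_gauss, m) * sigma
```

In this sketch, a lower $P(\mathcal{G})$ pushes the argument of the inverse distribution function toward one, so $\beta$ and the resulting error bar grow when the Gaussian hypothesis is less well supported.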
The singlet and triplet FCI/6-31+G(d) excitation energies and their corresponding error bars estimated with the method presented above based on Gaussian random variables are reported in Table \ref{tab:cycles}.
For the sake of comparison, we also report the CC3 and CCSDT vertical energies from Ref.~\cite{Loos_2020b} computed in the same basis. We note that there is for the vas majority of considered
For the sake of comparison, we also report the CC3 and CCSDT vertical energies from Ref.~\cite{Loos_2020b} computed in the same basis. We note that there is for the vast majority of considered
states a very good agreement between the CC3 and CCSDT values, indicating that the CC values can be trusted.
The estimated values of the excitation energies obtained via a three-point linear extrapolation considering the three largest CIPSI wave functions are also gathered in Table \ref{tab:cycles}.
In this case, the error bar is estimated via the extrapolation distance, \ie, the difference in excitation energies obtained with the three-point linear extrapolation and the largest CIPSI wave function.
This strategy has been considered in some of our previous works \cite{Loos_2020b,Loos_2020c,Loos_2020e}.
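The three-point procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the FCI estimate of each state is taken as the intercept at $E_{\text{rPT2}} = 0$ of a least-squares line through the last three $(E_{\text{rPT2}}, E_\text{var})$ pairs, the excitation energy associated with the largest CIPSI wave function is supplied by the caller (how it is defined is not specified here), and the function names \texttt{fci\_intercept} and \texttt{three\_point\_estimate} are hypothetical.

```python
def fci_intercept(e_var, e_rpt2, n_points=3):
    """Value at E_rPT2 = 0 of the least-squares line through the last n_points pairs.

    e_var and e_rpt2 are sequences ordered by increasing wave-function size.
    """
    if len(e_var) != len(e_rpt2):
        raise ValueError("e_var and e_rpt2 must have the same length")
    xs = list(e_rpt2)[-n_points:]
    ys = list(e_var)[-n_points:]
    if len(xs) < 2:
        raise ValueError("at least two points are needed for a linear fit")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("all E_rPT2 values are identical")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - slope * mean_x


def three_point_estimate(ground, excited, delta_e_largest):
    """Return (extrapolated excitation energy, extrapolation distance).

    ground and excited are (e_var, e_rpt2) tuples for the two states;
    delta_e_largest is the excitation energy obtained with the largest
    CIPSI wave function, supplied by the caller (assumption).
    """
    delta_e = fci_intercept(*excited) - fci_intercept(*ground)
    return delta_e, abs(delta_e - delta_e_largest)
```

Because both states are extrapolated independently before taking the difference, the error bar returned here is simply the distance between the extrapolated value and the unextrapolated one, with no statistical interpretation attached.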
The deviations from the CCSDT excitation energies for the same set of excitations are depicted in Fig.~\ref{fig:errors}, where the red dots correspond to the excitation energies and error bars estimated via the present method, and the blue dots correspond to the excitation energies obtained via a three-point linear fit and error bars estimated via the extrapolation distance.
These results contains a good balance between well-behaved and ill-behaved cases.
These results contain a good balance between well-behaved and ill-behaved cases.
For example, cyclopentadiene and furan correspond to well-behaved scenarios where the two flavors of extrapolations yield nearly identical estimates and the error bars associated with these two methods nicely overlap.
In these cases, one can observe that our method based on Gaussian random variables provides almost systematically smaller error bars.
Even in less idealistic situations (like in imidazole, pyrrole, and thiophene), the results are very satisfactory and stable.
The six-membered rings represent much more challenging cases for SCI methods, and even for these systems the newly developed method provides realistic error bars and makes it easy to detect problematic cases (like pyridine for instance).
The present scheme has also been tested on smaller systems when one can tightly converged the CIPSI calculations.
The present scheme has also been tested on smaller systems when one can tightly converge the CIPSI calculations.
In such cases, the agreement is nearly perfect in every scenario that we have encountered.
A selection of these results can be found in the {\SupInf}.