extra work on benchmarks

Pierre-Francois Loos 2019-11-01 22:45:34 +01:00
parent eeb2686f97
commit 4fcadb5c6e
2 changed files with 45 additions and 29 deletions

@ -1,7 +1,7 @@
%% This BibTeX bibliography file was created using BibDesk.
%% http://bibdesk.sourceforge.net/
%% Created for Pierre-Francois Loos at 2019-11-01 17:00:17 +0100
%% Created for Pierre-Francois Loos at 2019-11-01 22:45:06 +0100
%% Saved with string encoding Unicode (UTF-8)
@ -82,6 +82,15 @@
@string{theo = {J. Mol. Struct. (THEOCHEM)}}
@article{Loo20,
Author = {P. F. Loos and F. Lipparini and M. Boggio-Pasqua and A. Scemama and D. Jacquemin},
Date-Added = {2019-11-01 22:39:34 +0100},
Date-Modified = {2019-11-01 22:40:43 +0100},
Journal = {J. Chem. Theory Comput.},
Pages = {submitted},
Title = {Highly-Accurate Reference Excitation Energies and Benchmarks: Medium Size Molecules},
Year = {2020}}
@article{Kan17,
Author = {K{\'a}nn{\'a}r, D{\'a}niel and Tajti, Attila and Szalay, P{\'e}ter G.},
Date-Added = {2019-11-01 17:00:12 +0100},

@ -74,7 +74,7 @@
% \centering
% \includegraphics[width=\linewidth]{TOC}
%\end{wrapfigure}
In this Perspective, we provide a global overview of the successive steps that have made it possible to obtain increasingly accurate excited-state energies and properties, eventually leading to chemically accurate excitation energies.
We provide a global overview of the successive steps that have made it possible to obtain increasingly accurate excited-state energies and properties, eventually leading to chemically accurate excitation energies.
We describe
i) the evolution of ab initio reference methods, e.g., originally CASPT2 (Roos, Serrano-Andr\'es in the 1990s), then high-level CCn (as in the
acclaimed Thiel benchmark series in the 2000s), and now selected CI methods thanks to their resurgence in the past five years;
@ -99,7 +99,7 @@ Even more problematic, experimental spectra might not be available in gas phase,
For a more faithful comparison between theory and experiment, although more computationally demanding, the so-called 0-0 energies are definitely a safer playground. \cite{Loo19b}
%However, they require, from a theoretical point of view, access to the optimised excited-state geometry as well as its harmonic vibration frequencies.
Another feature that makes excited states particularly fascinating and challenging is that they can be both extremely close in energy to each other and of very different natures ($\pi \ra \pi^*$, $n \ra \pi^*$, charge transfer, double excitation, Rydberg, etc.).
Another feature that makes excited states particularly fascinating and challenging is that they can be both extremely close in energy to each other and of very different natures ($\pi \ra \pi^*$, $n \ra \pi^*$, charge transfer, double excitation, valence, Rydberg, singlet, triplet, etc.).
Therefore, it would be highly desirable to possess a computational method or protocol providing a balanced treatment of the entire ``spectrum'' of excited states.
And let's be honest, none of the existing methods provides this at an affordable cost.
@ -207,12 +207,10 @@ This has drastic consequences such as, for example, the complete absence of doub
%%%%%%%%%%%%%%%%%%%
%%% SCI METHODS %%%
%%%%%%%%%%%%%%%%%%%
Alternatively to CC and multiconfigurational methods, one can also compute transition energies for various types of excited states using selected configuration interaction (SCI) methods \cite{Ben69,Whi69,Hur73} which have recently demonstrated their ability to reach near full CI (FCI) quality energies for small molecules \cite{Gin13,Gin15,Caf16,Gar17b,Hol16,Sha17,Hol17,Chi18,Sce18,Sce18b,Loo18a,Gar18}.
The idea behind such methods is to avoid the exponential increase of the size of the CI expansion by retaining only the most energetically relevant determinants, thanks to the use of a second-order energetic criterion to perturbatively select determinants in the FCI space.
However, although the \textit{``exponential wall''} is pushed back, this type of method is only applicable to molecules with a small number of heavy atoms and relatively compact basis sets.
In the past five years, we have witnessed a resurgence of selected CI (SCI) methods thanks to the development and implementation of new and fast algorithms to cleverly select determinants in the FCI space.
SCI methods rely on the same principle as the usual CI approach, except that determinants are not chosen a priori based on occupation or excitation criteria but selected among the entire set of determinants based on their estimated contribution to the FCI wave function.
Indeed, it was noticed long ago that, even inside a predefined subspace of determinants, only a small number of them contribute significantly.
@ -220,6 +218,7 @@ Therefore, an on-the-fly selection of determinants is a rather natural idea that
The main advantage of SCI methods is that no a priori assumption is made on the type of electronic correlation.
Therefore, at the price of a brute-force calculation, an SCI calculation is less biased by the user's appreciation of the problem's complexity.
The approach that we have implemented in QUANTUM PACKAGE is based on the CIPSI algorithm developed by Huron, Rancurel, and Malrieu in 1973.
One of the strengths of our implementation is its parallel efficiency, which makes it possible to run on a very large number of cores.
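As a minimal sketch of the perturbative selection step (the notation below is ours and is introduced here only for illustration), consider a variational wave function $\Psi^{(0)}$ with energy $E^{(0)}$ expanded in the currently selected determinants.
Assuming an Epstein-Nesbet partitioning, each external determinant $\alpha$ is assigned the second-order energy contribution
\begin{equation}
	e_\alpha^{(2)} = \frac{\left| \langle \alpha | \hat{H} | \Psi^{(0)} \rangle \right|^2}{E^{(0)} - \langle \alpha | \hat{H} | \alpha \rangle},
\end{equation}
and the determinants with the largest $| e_\alpha^{(2)} |$ are added to the variational space before the Hamiltonian is diagonalized again.
The sum $E^{(2)} = \sum_\alpha e_\alpha^{(2)}$ over the external determinants then provides an estimate of the energy missing with respect to FCI.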
%%%%%%%%%%%%%%%
%%% SUMMARY %%%
@ -235,50 +234,58 @@ Although people usually don't really like reading, reviewing or even the idea of
%%%%%%%%%%%%%%%%%%%
%%% THIEL'S SET %%%
%%%%%%%%%%%%%%%%%%%
A major contribution originates from Thiel's group \cite{Sch08,Sil08,Sau09,Sil10b,Sil10c} with the introduction of the so-called Thiel's set of excitation energies. \cite{Sch08}
For the first time, this benchmark set gathers a large number of excitation energies for 28 medium-size organic molecules, with a total of 223 excited states (152 singlet and 71 triplet states).
In their first study they performed CC2, EOM-CCSD, CC3 and MS-CASPT2 calculations (in the TZVP basis) in order to provide (based on additional high-level literature data) best theoretical estimates (TBEs).
A major contribution was provided by the group of Walter Thiel \cite{Sch08,Sil08,Sau09,Sil10b,Sil10c} with the introduction of the so-called Thiel set of excitation energies. \cite{Sch08}
For the first time, this set was large, broad and accurate enough to be used as a proper benchmarking set for excited states.
More specifically, it gathers a large number of excitation energies for 28 medium-size organic molecules, with a total of 223 excited states (152 singlet and 71 triplet states).
In their first study, they performed CC2, EOM-CCSD, CC3 and MS-CASPT2 calculations (in the TZVP basis) in order to provide (based on additional high-level literature data) a list of best theoretical estimates (TBEs) for all these transitions.
Their main conclusion was that \textit{``CC3 and CASPT2 excitation energies are in excellent agreement for states which are dominated by single excitations''}.
These TBEs were later refined with the larger aug-cc-pVTZ basis set. \cite{Sil10b}
As evidence of the value of reference data, these TBEs were quickly applied to benchmark various computationally effective methods (see Ref.~\onlinecite{Loo18a} and references therein).
These TBEs were soon refined with the larger aug-cc-pVTZ basis set, \cite{Sil10b} highlighting the importance of diffuse functions (especially for Rydberg states).
As direct evidence of the value of reference data, these TBEs were quickly picked up to benchmark various computationally effective methods, from semi-empirical to state-of-the-art \textit{ab initio} methods (see Ref.~\onlinecite{Loo18a} and references therein).
Theoretical improvements were slow but steady.
In 2013, Watson et al.\cite{Wat13} proposed EOM-CCSDT-3/TZVP (which corresponds to an iterative approximation of the triples from EOM-CCSDT) excitation energies for Thiel's set.
Their quality was very similar to that of the CC3 values reported in Ref.~\onlinecite{Sau09}, and the authors could not determine which ones were the most accurate.
Similarly, Dreuw and coworkers performed ADC(3) calculations on Thiel's set and arrived at the same kind of conclusion: \cite{Har14}
\textit{``based on the quality of the existing benchmark set it is practically not possible to judge whether ADC(3) or CC3 is more accurate''}
Finally, let us mention the work of K\'ann\'ar and Szalay who reported EOM-CCSDT (with TZVP \cite{Kan14} and aug-cc-pVTZ \cite{Kan17}) for a subset of the original Thiel's set.
Our recent contribution \cite{Loo18a} has been able to bring answers to this question.
Theoretical improvements of Thiel's set were slow but steady, further highlighting its quality.
In 2013, Watson et al.\cite{Wat13} computed EOM-CCSDT-3/TZVP (an iterative approximation of the triples of EOM-CCSDT) excitation energies for the Thiel set.
Their quality was very similar to that of the CC3 values reported in Ref.~\onlinecite{Sau09}, and the authors could not determine which ones were more accurate.
Similarly, Dreuw and coworkers performed ADC(3) calculations on Thiel's set and arrived at the same kind of conclusion: \cite{Har14}
\textit{``based on the quality of the existing benchmark set it is practically not possible to judge whether ADC(3) or CC3 is more accurate''}.
Finally, let us mention the work of K\'ann\'ar and Szalay who reported EOM-CCSDT excitation energies \cite{Kan14,Kan17} for a subset of the original Thiel set.
%%%%%%%%%%%%%%%%%%%%%%%
%%% JACQUEMIN'S SET %%%
%%%%%%%%%%%%%%%%%%%%%%%
Indeed, very recently, we also made a contribution to this quest for highly-accurate excitation energies. \cite{Loo18a}
We studied 18 small molecules (water, hydrogen sulfide, ammonia, hydrogen chloride, dinitrogen, carbon monoxide, acetylene, ethylene, formaldehyde, methanimine, thioformaldehyde, acetaldehyde, cyclopropene, diazomethane, formamide, ketene, nitrosomethane, and the smallest streptocyanine) with sizes ranging from one to three nonhydrogen atoms.
For such systems, using SCI expansions of several million determinants, we were able to compute more than 100 highly accurate vertical excitation energies with typically augmented triple-$\zeta$ basis sets.
It allowed us to benchmark a series of 12 state-of-the-art excited-state wave function methods accounting for double and triple excitations.
We used this series of theoretical best estimates to benchmark a series of popular methods for excited-state calculations [CIS(D), ADC(2), CC2, STEOM-CCSD, CCSD, CCSDR(3), and CCSDT-3].
Recently, we also made what we think is a significant contribution to the quest for highly-accurate excitation energies. \cite{Loo18a}
More specifically, we studied 18 small molecules with sizes ranging from one to three non-hydrogen atoms.
For such systems, using a combination of high-order CC, SCI calculations (with expansions of several million determinants) and increasingly large diffuse basis sets, we were able to compute a list of 110 highly accurate vertical excitation energies for states of various characters.
Importantly, it allowed us to benchmark a series of 12 popular excited-state wave function methods accounting for double and triple excitations [CIS(D), ADC(2), CC2, STEOM-CCSD, CCSD, CCSDR(3), and CCSDT-3].
Our main conclusion was that, although less accurate than CC3, EOM-CCSDT-3 can be used as a reliable reference for benchmark studies, and that ADC(3) delivers quite large errors for this set of small compounds, with a clear tendency to overcorrect its second-order version ADC(2).
Even more recently, we provided accurate reference excitation energies for transitions involving a substantial amount of double excitation using a series of increasingly large diffuse-containing atomic basis sets. \cite{Loo19c}
In a second study, \cite{Loo19c} we also provided accurate reference excitation energies for transitions involving a substantial amount of double excitation using a series of increasingly large diffuse-containing atomic basis sets (up to aug-cc-pVTZ when technically feasible).
Our set gathered 20 vertical transitions from 14 small- and medium-sized molecules (acrolein, benzene, beryllium atom, butadiene, carbon dimer and trimer, ethylene, formaldehyde, glyoxal, hexatriene, nitrosomethane, nitroxyl, pyrazine, and tetrazine).
For the smallest molecules, we were able to obtain well converged excitation energies with an augmented quadruple-$\zeta$ basis set, while only augmented double-$\zeta$ bases were manageable for the largest systems (such as acrolein, butadiene, hexatriene, and benzene).
Note that the largest SCI expansion considered in this study had more than 200 million determinants.
An important addition to this second study was the computation of excitation energies with multiconfigurational methods (CASSCF, CASPT2, (X)MS-CASPT2, and NEVPT2) as well as high-order CC methods including perturbative and iterative triple corrections.
Our results clearly evidenced that the error in CC methods is intimately related to the amount of double-excitation character of the transition.
%For ``pure'' double excitations (i.e., for transitions which do not mix with single excitations), the error in CC3 can easily reach 1 eV, while it goes down to a few tenths of an electronvolt for more common transitions (such as in trans-butadiene) involving a significant amount of singles
In order to push further our analysis to larger compounds, we provided highly-accurate vertical transition energies obtained for 27 molecules encompassing 4, 5, and 6 non-hydrogen atoms (acetone, acrolein, benzene, butadiene, cyanoacetylene, cyanoformaldehyde, cyanogen, cyclopentadiene, cyclopropenone, cyclopropenethione, diacetylene, furan, glyoxal, imidazole, isobutene, methylenecyclopropene, propynal, pyrazine, pyridazine, pyridine, pyrimidine, pyrrole, tetrazine, thioacetone, thiophene, thiopropynal, and triazine).
Even more recently, in order to push our analysis further and provide more general conclusions, we provided highly-accurate vertical transition energies for larger compounds, with 27 molecules encompassing 4, 5, and 6 non-hydrogen atoms. \cite{Loo20}
To obtain these energies, we used CC approaches up to the highest possible order (CC3, CCSDT, and CCSDTQ), SCI approaches with up to several million determinants, and NEVPT2.
All approaches were combined with diffuse-containing atomic basis sets.
For all transitions, we report at least CC3/aug-cc-pVQZ transition energies as well as CC3/aug-cc-pVTZ oscillator strengths for all dipole-allowed transitions.
We show that CC3 almost systematically delivers transition energies in agreement with higher levels of theory ($\pm 0.04$ eV), except for transitions presenting a dominant double-excitation character.
This contribution encompasses a set of more than 200 highly-accurate transition energies for states of various natures (valence, Rydberg, singlet, triplet, $n \ra \pi^*$, $\pi \ra \pi^*$, etc.).
%%%%%%%%%%%%%%%%%
%%% COMPUTERS %%%
%%%%%%%%%%%%%%%%%
\alert{Here comes Toto's part on the awesomeness of computers.}
%%%%%%%%%%%%%%%%%%
%%% CONCLUSION %%%
%%%%%%%%%%%%%%%%%%
As concluding remarks, we would like to say that, even though Thiel's group contribution is pretty awesome, what we have done is not bad either.
Thanks to new technological advances, we hope to be able to push our quest for highly accurate excitation energies further.
%%%%%%%%%%%%%%%%%%%%%%%%
%%% ACKNOWLEDGEMENTS %%%
%%%%%%%%%%%%%%%%%%%%%%%%
PFL acknowledges funding from the \textit{``Centre National de la Recherche Scientifique''}.
PFL would like to thank Peter Gill for useful discussions.
He also acknowledges funding from the \textit{``Centre National de la Recherche Scientifique''}.
DJ acknowledges the R\'egion des Pays de la Loire for financial support.
%%%%%%%%%%%%%%%%%%%%