Based on 280 reference vertical transition energies of various natures (singlet, triplet, valence, Rydberg, $n\to\pi^*$, $\pi\to\pi^*$, and double excitations) extracted from the QUEST database, we assess the accuracy of third-order multireference perturbation theory, CASPT3, in the context of molecular excited states.
When one applies the disputable ionization-potential-electron-affinity (IPEA) shift, we show that CASPT3 provides a similar accuracy as its second-order counterpart, CASPT2, with the same mean absolute error of $0.11$ eV.
However, as already reported, we also observe that the accuracy of CASPT3 is almost insensitive to the IPEA shift, irrespective of the transition type and system size, with a small reduction of the mean absolute error to $0.09$ eV when the IPEA shift is switched off.
However, it rarely works this way in practice as the perturbative series may exhibit quite a large spectrum of behaviors. \cite{Olsen_1996,Christiansen_1996,Cremer_1996,Olsen_2000,Olsen_2019,Stillinger_2000,Goodson_2000a,Goodson_2000b,Goodson_2004,Sergeev_2005,Sergeev_2006,Goodson_2011}
For example, in single-reference M{\o}ller-Plesset (MP) perturbation theory, \cite{Moller_1934} erratic, slowly convergent, and divergent behaviors have been observed. \cite{Laidig_1985,Knowles_1985,Handy_1985,Gill_1986,Laidig_1987,Nobes_1987,Gill_1988,Gill_1988a,Lepetit_1988,Leininger_2000a,Malrieu_2003,Damour_2021}
Systematic improvement is thus difficult to achieve and it is extremely challenging to predict, \textit{a priori}, the evolution of the series when ramping up the perturbation order. \cite{Marie_2021a}
This has led, in certain specific contexts, to the development of empirical strategy like MP2.5 where one averages the second-order (MP2) and third-order (MP3) total energies, to obtain more accurate values. \cite{Pitonak_2009}
Extension of single-reference perturbation theory to electronic excited states is far from being trivial, and the algebraic diagrammatic
construction (ADC) approximation of the polarization propagator is probably the most natural. \cite{Schirmer_1982,Schirmer_1991,Barth_1995,Schirmer_2004,Schirmer_2018,Trofimov_1997,Trofimov_1997b,Trofimov_2002,Trofimov_2005,Trofimov_2006,Harbach_2014,Dreuw_2015}
However, the ADC series naturally inherits some of the drawbacks of its MP parent and it has been shown to be rather slowly convergent in the context of vertical excitation energies. \cite{Loos_2018a,Loos_2020a,Veril_2021}
This has led some of us to recently propose the ADC(2.5) composite approach, where, in the same spirit as MP2.5, one averages the second-order [ADC(2)] and third-order [ADC(3)] vertical transition energies. \cite{Loos_2020d}
Multi-reference perturbation theory is somewhat easier to generalize to excited states as one has the freedom to select the states of interest to include in the reference (zeroth-order) space via the so-called complete-active-space self-consistent field (CASSCF) formalism, hence effectively catching static correlation in the zeroth-order model space.
The missing dynamical correlation can then be recovered in the (first-order) outer space via low-order perturbation theory, as performed in the complete-active-space second-order perturbation theory (CASPT2) of Roos and coworkers, \cite{Andersson_1990,Andersson_1992,Roos_1995a} the multireference MP2 approach of Hirao, \cite{Hirao_1992} or the $N$-electron valence state second-order perturbation theory (NEVPT2) developed by Angeli, Malrieu, and coworkers. \cite{Angeli_2001a,Angeli_2001b,Angeli_2002,Angeli_2006}
As such, it has been employed in countless computational studies involving electronic excited states. %\cite{Serrano-Andres_1993a,Serrano-Andres_1993b,Serrano-Andres_1993c,Serrano-Andres_1995,Roos_1996,Serrano-Andres_1996a,Serrano-Andres_1996b,Serrano-Andres_1998b,Roos_1999,Merchan_1999,Roos_2002,Serrano-Andres_2002,Serrano-Andres_2005,Tozer_1999,Burcl0_2002,Peach_2008,Faber_2013,Schreiber_2008,Silva-Junior_2008,Sauer_2009,Silva-Junior_2010a,Silva-Junior_2010b,Silva-Junior_2010c}
In the context of excited states, its most severe drawback is certainly the intruder state problem (which is, by construction, absent in NEVPT2) that describes a situation where one or several determinants of the outer (first-order) space, known as perturbers, have an energy close to the zeroth-order CASSCF wave function, hence producing divergences in the denominators of the second-order perturbative energy.
The use of real-valued \cite{Roos_1995b,Roos_1996} or imaginary \cite{Forsberg_1997} level shifts has been successfully tested and is now routine in excited-state calculations. \cite{Schapiro_2013,Zobel_2017,Sarkar_2022}
A second pitfall was brought to light by Andersson \textit{et al.}\cite{Andersson_1993,Andersson_1995} and explained by the unbalanced treatment in the zeroth-order Hamiltonian of the open- and closed-shell electronic configurations.
A cure was quickly proposed via the introduction of an additional parameter in the zeroth-order Hamiltonian, the ionization-potential-electron-affinity (IPEA) shift. \cite{Ghigo_2004}
Although the introduction of an IPEA shift can provide a better agreement between experiment and theory, \cite{Pierloot_2006,Pierloot_2008,Suaud_2009,Kepenekian_2009,Daku_2012,Rudavskyi_2014,Vela_2016,Wen_2018} it has been shown that its application is not systematically justified and that its impact is significantly basis set dependent. \cite{Zobel_2017}
Very recently, based on the highly accurate vertical excitation energies of the QUEST database, \cite{Loos_2018a,Loos_2019,Loos_2020a,Loos_2020b,Loos_2020c,Veril_2021,Loos_2021c,Loos_2021b} we have reported an exhaustive benchmark of CASPT2 and NEVPT2 for 280 excited states of diverse natures (singlet, triplet, valence, Rydberg, $n\to\pis$, $\pi\to\pis$, and double excitations) computed with a large basis set (aug-cc-pVTZ) in 35 small- and medium-sized organic molecules containing from three to six non-hydrogen atoms. \cite{Sarkar_2022}
Our main take-home message was that both CASPT2 with IPEA shift and the partially-contracted version of NEVPT2 provide fairly reliable vertical transition energy estimates, with slight overestimations and mean absolute errors of \SI{0.11}{} and \SI{0.13}{\eV}, respectively.
In the electronic structure community, third-order perturbation theory has a fairly bad reputation especially within MP perturbation theory where it is rarely worth its extra computational cost. \cite{Rettig_2020}
Nonetheless, going against popular beliefs and one step further in the perturbative expansion, we propose here to assess the performance of the complete-active-space third-order perturbation theory (CASPT3) method developed by Werner \cite{Werner_1996} and implemented in MOLPRO \cite{Werner_2020} for the same set of electronic transitions as the one used in Ref.~\onlinecite{Sarkar_2022}
the present study provides, to the best of our knowledge, the first comprehensive benchmark of CASPT3 and allows assessing its accuracy in the framework of electronically excited states.
We underline that, although a third-order version of NEVPT has been developed \cite{Angeli_2006} and has been used in some applications \cite{Pastore_2006a,Pastore_2006b,Pastore_2007,Angeli_2007,Camacho_2010,Angeli_2011,Angeli_2012} by Angeli and coworkers, as far as we are aware of, no NEVPT3 implementation is publicly available.
Geometries and reference theoretical best estimates (TBEs) for the vertical excitation energies have been extracted from the QUEST database \cite{Veril_2021} and can be downloaded at \url{https://lcpq.github.io/QUESTDB_website}.
All the CASPT2 and CASPT3 calculations have been carried out in the frozen-core approximation and within the RS2 and RS3 contraction schemes as implemented in MOLPRO and described in Refs.~\onlinecite{Werner_1996} and \onlinecite{Werner_2020}.
The MOLPRO implementation of CASPT3 is based on a modification of the multi-reference configuration interaction (MRCI) module. \cite{Werner_1988,Knowles_1988}
For the sake of computational efficiency, the doubly-excited external configurations are internally contracted while the singly-excited internal and semi-internal configurations are left uncontracted. \cite{Werner_1996}
These perturbative calculations have been performed by considering a state-averaged (SA) CASSCF wave function where we have included the ground state and (at least) the excited states of interest.
Additionally, for the challenging transitions, we have steadily increased the size of the active space to carefully assess the convergence of the vertical excitation energies of interest.
Note that, compared to our previous CASPT2 benchmark study, \cite{Sarkar_2022} the active spaces of acrolein, pyrimidine, and pyridazine have been slightly reduced in order to make the CASPT3 calculations technically achievable.
Although these active space reductions are overall statistically negligible, this explains the small deviations that one may observe between the data reported here and in Ref.~\onlinecite{Sarkar_2022}.
A detailed discussion of each individual molecule can be found in Ref.~\onlinecite{Sarkar_2022} and in earlier works, \cite{Loos_2018a,Loos_2020b} where theoretical and experimental literature values are discussed.
We therefore decided to focus on global trends here.
The exhaustive list of CASPT2 and CASPT3 transitions can be found in Table \ref{tab:BigTab} and the distribution of the errors are represented in Fig.~\ref{fig:PT2_vs_PT3}.
The usual statistical indicators are used in the following, namely, the mean signed error (MSE), the mean absolute error (MAE), the root-mean-square error (RMSE), the standard deviation of the errors (SDE), as well as the largest positive and negative deviations [Max($+$) and Max($-$), respectively].
These are given in Table \ref{tab:stat} considering the 265 ``safe'' TBEs (out of 280) for which chemical accuracy is assumed (absolute error below \SI{0.043}{\eV}).
The MAEs determined for subsets of transitions (singlet, triplet, valence, Rydberg, $n\to\pis$, $\pi\to\pis$, and double excitations) and system sizes (3 non-H atoms, 4 non-H atoms, and 5-6 non-H atoms) can be found in Table \ref{tab:stat_subset}.
The reference TBEs of the QUEST database, their percentage of single excitations $\%T_1$ involved in the transition (computed at the CC3 level), their nature
(V and R stand for valence and Rydberg, respectively) are also reported.
From the different statistical quantities reported in Table \ref{tab:stat}, one can highlight the following trends.
First, as previously reported, \cite{Werner_1996,Grabarek_2016} CASPT3 vertical excitation energies are much less sensitive to the IPEA shift, which drastically alters the accuracy of CASPT2: the mean absolute deviation between the CASPT2(NOIPEA) and CASPT2(IPEA) data is \SI{0.329}{\eV} while it is only \SI{0.051}{\eV} between CASPT3(NOIPEA) and CASPT3(IPEA).
Consequently, the MAEs of CASPT3(IPEA) and CASPT3(NOIPEA) are amazingly close (\SI{0.11}{} and \SI{0.09}{\eV}), while the MAEs of CASPT2(IPEA) and CASPT2(NOIPEA) are remarkably different (\SI{0.11}{} and \SI{0.27}{\eV}).
Likewise, the MSEs of CASPT2(IPEA) and CASPT2(NOIPEA), \SI{0.06}{} and \SI{-0.26}{\eV}, clearly evidence the well-known global underestimation of the CASPT2(NOIPEA) excitation energies in molecular systems when large basis sets are used.
For CASPT3, the MSE with IPEA shift is only slightly larger without IPEA (\SI{0.10}{} and \SI{0.05}{\eV}, respectively).
Importantly, CASPT3 performs slightly better without IPEA shift, which is a nice outcome that holds for each group of transitions and system size (see the MAEs in Table \ref{tab:stat_subset}).
Because the relative size of the active space naturally decreases as the number of electrons and orbitals get larger, we observe that the MAEs of each subset increase with the size of the molecules.
Note that combining CASPT2 and CASPT3 via an hybrid protocol such as CASPT2.5, as proposed by Zhang and Truhlar in the context of spin splitting energies in transition metals, \cite{Zhang_2020} is not beneficial in the present situation.
It is worth mentioning that CASPT3(NOIPEA) yields MAEs for each subset that is almost systematically below \SI{0.1}{\eV}, except for the singlet subset which contains some states showing large (positive) deviations at both the CASPT2 and CASPT3 levels.
This is most notably the case for the $^1 B_u(\pi,\pis)$ state of butadiene, the $^1B_2(\pi,\pis)$ state of cyclopentadiene, the $^1A_1(\pi,\pis)$ state of cyclopropenone, the second $^1B_{1u}(\pi,\pis)$ state of pyrazine, the $^1B_2(\pi,\pis)$ state of pyridazine, and the $^1E'(\pi,\pis)$ state of triazine, for which both CASPT2(IPEA) and CASPT3(NOIPEA) overestimate the corresponding vertical transition energies by at least \SI{0.4}{\eV} with respect to the TBEs.
This can be tracked down to the relatively small active spaces that we have considered here and, more precisely, to the lack of direct $\sig$-$\pi$ coupling in the active space that is known to be important in ionic states, for example. \cite{Davidson_1996,Borden_1996,Boggio-Pasqua_2004,Angeli_2009,Garniron_2018,Tran_2019,BenAmor_2020}
For this family of states, it is particularly important to describe the dynamic response of the $\sig$-electron framework to the field of the $\pi$-electron system, a phenomenon known as dynamic $\sig$ polarization.
Because the dynamic $\sig$ polarization is generally more important for the ionic excited state than for the ground state, its contribution is expected to lower the vertical transition energy.
Furthermore, this part of the dynamic $\sig$-$\pi$ correlation needs to be included at the orbital optimization stage, otherwise the orbitals become too diffuse, resulting in artificial valence-Rydberg mixing which cannot be disentangled using non-degenerate perturbation theory such as the version of CASPT2 and CASPT3 considered here. \cite{Angeli_2009}
As an illustration of this problematic, we have chosen to address the specific case of the second $^1B_{1u}(\pi,\pis)$ state of pyrazine, which is known to exhibit a strong ionic character. \cite{Fulscher_1994}
As shown in Table \ref{tab:BigTab} (\#171), the TBE for the vertical transition energy to this state is \SI{7.98}{\eV}.
CASPT2(IPEA) and CASPT3(NOIPEA) locate this state at \SI{8.59}{} and \SI{8.57}{\eV}, respectively, providing a large overestimation of \SI{0.6}{\eV}.
This state was computed using a reference CASSCF wave function averaged over four states [the ground state, two valence $B_{1u}(\pi,\pis)$ states and one Rydberg $B_{1u}(\pi,3p_x)$ state] with an active space comprising the $\pi$ valence and three $3p_x$ orbitals.
(The $3p_x$ orbitals were included to recover part of the radial correlation.)
However, this strategy leads to a valence-Rydberg mixing due to the fact that the dynamic correlation is not sufficiently described at the CASSCF level.
The ionic $B_{1u}(\pi,\pis)$ state lies \SI{9.65}{\eV} vertically above the ground state, while the Rydberg $B_{1u}(\pi,3p_x)$ state is \SI{0.2}{\eV} below at the CASSCF level.
The Rydberg character of the ionic $B_{1u}(\pi,\pis)$ state is evident from the inspection of the CASSCF wave function and also from its value of $\expval*{x^2}$, which measures the spatial extent of the wave function out of the molecular plane (hence characteristic of the size of the $\pi$ orbitals in the considered state).
The $\expval*{x^2}$ value is \SI{31.9}{\bohr^2} for the ionic $B_{1u}(\pi,\pis)$ state and \SI{51.1}{\bohr^2} for the $B_{1u}(\pi,3p_x)$ Rydberg state, while it is only \SI{26.6}{\bohr^2} for the ground state.
To remove the artificial valence-Rydberg mixing in the reference CASSCF wave function, we included the dynamic $\sig$ polarization at the orbital optimization stage using a restricted active space self-consistent field (RASSCF) approach. \cite{Olsen_1988}
We selected the bonding $\sigCC$ and $\sigCN$ orbitals in the RAS1 partition and the corresponding anti-bonding $\sigsCC$ and $\sigsCN$ orbitals in RAS3 allowing a single hole in RAS1 and a single electron in RAS3.
The six valence $\pi$ orbitals were kept in RAS2 (full CI space).
In this way, the contraction of the $\pi$ orbitals as a result of the dynamic $\sig$ polarization is ensured and the interference of the Rydberg state is removed allowing to compute the two valence $B_{1u}(\pi,\pis)$ states without including the Rydberg $B_{1u}(\pi,3p_x)$ state in the state-averaging procedure.
The $\expval*{x^2}$ value associated with the ionic $B_{1u}(\pi,\pis)$ state is reduced to \SI{26.9}{\bohr^2}, providing a spatial extent similar to that of the ground state ($\expval*{x^2}=\SI{27.0}{\bohr^2}$ at the RASSCF level).
Using the RASSCF orbitals to perform the CASPT2(IPEA) and CASPT3(NOIPEA) calculations using a CAS-CI(6,6) reference, we obtain vertical transition energies of \SI{7.92}{} and \SI{8.10}{\eV}, respectively.
The agreement with the TBE is now within the expected accuracy of the method with an error of about \SI{0.1}{\eV}.
To be complete the vertical transition energy to the first $B_{1u}(\pi,\pis)$ state, which also possesses a significant ionic character, is improved too with respect to the TBE at \SI{6.88}{\eV} with transition energies of \SI{6.83}{\eV} and \SI{6.87}{\eV} at the CASPT2(IPEA) and CASPT3(NOIPEA) levels, respectively.
This represents a significant improvement compared to the \SI{7.14}{} and \SI{7.12}{\eV} values obtained at the same level of theory but using a reference SA4-CASSCF wave function.
We thus believe that the difficult cases listed above can be handled more rigorously provided that more suitable active spaces are used to describe the reference (zeroth-order) wave function prior to the CASPT2/CASPT3 calculations.
Comparatively, Liang \textit{et al.} have recently shown, for a larger set of transitions, that time-dependent density-functional theory with the best exchange-correlation functionals yield RMSEs of the order of \SI{0.3}{\eV}, \cite{Liang_2022} outperforming (more expensive) wave function methods like CIS(D). \cite{Head-Gordon_1994,Head-Gordon_1995}
Although, these two methods do not beat the approximate third-order coupled-cluster method, CC3, \cite{Christiansen_1995b,Koch_1997} for transitions with a dominant single excitation character (for which CC3 returns a MAEs below the chemical accuracy threshold of \SI{0.043}{\eV}) \cite{Veril_2021} it has the undisputable advantage to describe with the same accuracy both single and double excitations.
\caption{Ratio of the wall times associated with the computation of the third- and second-order energies as a function of the total number of contracted and uncontracted external configurations for benzene (see Table \ref{tab:timings} for raw data).
Calculations have been performed in the frozen-core approximation and with the aug-cc-pVTZ basis set on an Intel Xeon node (see main text).}
Table \ref{tab:timings} reports the evolution of the wall times associated with the computation of the second- and third-order energies in benzene with the aug-cc-pVTZ basis and the frozen-core approximation (42 electrons and 414 basis functions) for increasingly large active spaces.
It is particularly instructive to study the wall time ratio as the number of (contracted and uncontracted) external configurations grows (see Fig.~\ref{fig:timings}).
Overall, the PT3 step takes between 5 and 10 times longer than the PT2 step for the active spaces that we have considered here, and remains thus typically affordable for these kinds of calculations.
In the present study, we have benchmarked, using 280 highly-accurate electronic transitions extracted from the QUEST database, \cite{Veril_2021} the third-order multi-reference perturbation theory method, CASPT3, by computing vertical excitation energies with and without IPEA shift.
Therefore, if one can afford the extra computation cost associated with the third-order energy (which is only several times more than its second-order counterpart), one can eschew the delicate choice of the IPEA value in CASPT2, and rely solely on the CASPT3(NOIPEA) excitation energies.
Included in the {\SupMat} are the error distributions obtained for CASPT2 and CASPT3 with and without IPEA shift for various subsets of transitions, as well as the description and specification of the active space for each molecule.
PFL thanks the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (Grant agreement No.~863481) for funding.