\title{Accurate full configuration interaction correlation energy estimates for five- and six-membered rings}%DJ: it does not show up, but ring molecules => rings
%DJ: When I compile, there is a problem with the affiliations: you are all in Nantes except Yann => check?
%DJ: In terms of calculations/effort, I think I have earned the *. In terms of ideas, I do not think so. I leave it to you to judge and I will go along with your decision without the slightest problem.
\DMJ{Following} our recent work on the benzene molecule [\href{https://doi.org/10.1063/5.0027617}{J.~Chem.~Phys.~\textbf{153}, 176101 (2020)}], itself motivated by the blind challenge of Eriksen \textit{et al.} [\href{https://doi.org/10.1021/acs.jpclett.0c02621}{J.~Phys.~Chem.~Lett.~\textbf{11}, 8922 (2020)}] on the same system, we report accurate full configuration interaction (FCI) frozen-core correlation energy estimates for twelve five- and six-membered ring molecules (cyclopentadiene, furan, imidazole, pyrrole, thiophene, benzene, pyrazine, pyridazine, pyridine, pyrimidine, s-tetrazine, and s-triazine) in the standard correlation-consistent double-$\zeta$ Dunning basis set (cc-pVDZ).
Our FCI correlation energy estimates, with estimated error \DMJ{smaller than} 1 millihartree, are based on energetically optimized-orbital selected configuration interaction (SCI) calculations performed with the \textit{Configuration Interaction using a Perturbative Selection made Iteratively} (CIPSI) algorithm.
Having at our disposal these accurate reference energies, the respective performance and convergence properties of several popular and widely-used families of \DMJ{single-reference} quantum chemistry methods are investigated.
In particular, we study the convergence properties of i) the M{\o}ller-Plesset perturbation series up to fifth-order (MP2, MP3, MP4, and MP5), ii) the iterative approximate \trashDMJ{single-reference} coupled-cluster series CC2, CC3, and CC4, and iii) the \trashDMJ{single-reference} coupled-cluster series CCSD, CCSDT, and CCSDTQ.
Non-relativistic electronic structure theory relies heavily on approximations. \cite{Szabo_1996,Helgaker_2013,Jensen_2017}\DMJ{Thanks :-) :-) I would say that relativistic theory does too, by the way. Why this distinction?}
Loosely speaking, to make any method practical, three main approximations \DMJ{are typically} enforced.
The first fundamental approximation, known as the Born-Oppenheimer (or clamped-nuclei) approximation, consists in assuming that the motions of the nuclei and of the electrons are decoupled. \cite{Born_1927}
The second central approximation which makes calculations \DMJ{computationally achievable}\trashDMJ{feasible by a computer} is the basis set approximation where one introduces a set of pre-defined basis functions to represent the many-electron wave function of the system.
In most molecular calculations, a set of one-electron, atom-centered Gaussian basis functions is introduced to expand the so-called one-electron molecular orbitals which are then used to build the many-electron Slater \DMJ{determinant(s)}.
For example, in configuration interaction (CI) methods, the wave function is expanded as a linear combination of Slater determinants, while in (single-reference) coupled-cluster (CC) theory, \cite{Cizek_1966,Paldus_1972,Crawford_2000,Piecuch_2002b,Bartlett_2007,Shavitt_2009} a reference Slater determinant $\PsiO$ [usually taken as the Hartree-Fock (HF) wave function] is multiplied by a wave operator defined as the exponentiated excitation operator $\hT=\sum_{k=1}^\Nel\hT_k$ (where $\Nel$ is the number of electrons).\DMJ{Tk undefined then}
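For completeness, and because the cluster operators $\hT_k$ are not written out explicitly above, a common spin-orbital convention can be sketched as follows. The notation is an assumption made here for illustration only: $i,j,\ldots$ label occupied and $a,b,\ldots$ virtual spin orbitals of the reference determinant $\ket*{\Psi_0}$, and $t_{i}^{a}$, $t_{ij}^{ab}$ are the cluster amplitudes.
\[
	\ket*{\Psi_{\mathrm{CC}}} = e^{\hT} \ket*{\Psi_0},
	\qquad
	\hT_1 = \sum_{ia} t_{i}^{a} \, \cre{a} \ani{i},
	\qquad
	\hT_2 = \frac{1}{4} \sum_{ijab} t_{ij}^{ab} \, \cre{a} \cre{b} \ani{j} \ani{i}.
\]
More generally, $\hT_k$ generates all $k$-fold excitations out of the reference determinant.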
Truncating $\hT$ at increasing excitation degree defines the hierarchy of CC with singles and doubles (CCSD), \cite{Cizek_1966,Purvis_1982} CC with singles, doubles, and triples (CCSDT), \cite{Noga_1987a,Scuseria_1988} and CC with singles, doubles, triples, and quadruples (CCSDTQ), \cite{Oliphant_1991,Kucharski_1992} with corresponding \DMJ{formal} computational scalings of $\order*{\Norb^{6}}$, $\order*{\Norb^{8}}$, and $\order*{\Norb^{10}}$, respectively (where $\Norb$ denotes the number of orbitals).
Parallel to the ``complete'' CC series presented above, an alternative family of approximate iterative CC models has been developed by the Aarhus group in the context of CC response theory \cite{Christiansen_1998} where one skips the most expensive terms and avoids the storage of the higher-excitation amplitudes: CC2, \cite{Christiansen_1995a} CC3, \cite{Christiansen_1995b,Koch_1997} and CC4. \cite{Kallay_2005,Matthews_2021}
These iterative methods scale as $\order*{\Norb^{5}}$, $\order*{\Norb^{7}}$, and $\order*{\Norb^{9}}$, respectively, and can be seen as cheaper approximations of CCSD, CCSDT, and CCSDTQ.
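To make these formal scalings concrete, consider a purely illustrative estimate that ignores prefactors, memory requirements, and the number of iterations. Doubling the number of orbitals at fixed method multiplies the cost of an $\order*{\Norb^{p}}$ method by $2^{p}$, \ie,
\[
	2^{5} = 32 \ (\mathrm{CC2}),
	\quad
	2^{6} = 64 \ (\mathrm{CCSD}),
	\quad
	2^{7} = 128 \ (\mathrm{CC3}),
	\quad
	2^{8} = 256 \ (\mathrm{CCSDT}),
	\quad
	2^{9} = 512 \ (\mathrm{CC4}),
	\quad
	2^{10} = 1024 \ (\mathrm{CCSDTQ}).
\]
Each approximate model thus sits one power of $\Norb$ below its ``complete'' parent.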
Coupled-cluster methods have been particularly successful at accurately computing \DMJ{ground- and excited-state} properties for small- and medium-sized molecules.
A similar systematic truncation strategy can be applied to CI methods leading to the well-established family of methods known as CISD, CISDT, CISDTQ, \ldots~where one systematically increases the maximum excitation degree of the determinants taken into account.
Except for full CI (FCI) where all determinants from the Hilbert space (\ie, with excitation degree up to $\Nel$) are considered, truncated CI methods are variational but lack size-consistency.
The non-variationality of truncated CC methods being, \DMJ{in practice,} less of an issue than the size-inconsistency of the truncated CI methods, the former have naturally overshadowed the latter in the electronic structure landscape.
However, a different strategy \DMJ{recently came back into the limelight} in the context of CI methods. \cite{Bender_1969,Whitten_1969,Huron_1973}
Indeed, selected CI (SCI) methods, \cite{Booth_2009,Giner_2013,Evangelista_2014,Giner_2015,Caffarel_2016b,Holmes_2016,Tubman_2016,Liu_2016,Ohtsuka_2017,Zimmerman_2017,Coe_2018,Garniron_2018} where one iteratively selects the important determinants from the FCI space (usually) based on a perturbative criterion, have recently been shown to be highly successful at producing reference energies for \DMJ{both} ground and excited states in small- and medium-sized molecules \cite{Caffarel_2014,Caffarel_2016a,Scemama_2016,Holmes_2017,Li_2018,Scemama_2018,Scemama_2018b,Li_2020,Loos_2018a,Chien_2018,Loos_2019,Loos_2020b,Loos_2020c,Loos_2020e,Garniron_2019,Eriksen_2020,Yao_2020,Williams_2020,Veril_2021,Loos_2021} thanks to efficient deterministic, stochastic, or hybrid algorithms well suited for massive parallelization.
SCI methods are based on a well-known fact: amongst the very large number of determinants contained in the FCI space, only a tiny fraction of them significantly contributes to the energy.
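To give an order of magnitude, here is a back-of-the-envelope estimate that ignores spatial and spin symmetry; the symbols $N_\alpha$, $N_\beta$, and $N_{\mathrm{FCI}}$ are introduced here for illustration. The number of determinants obtained by distributing $N_\alpha$ spin-up and $N_\beta$ spin-down electrons in $\Norb$ orbitals is
\[
	N_{\mathrm{FCI}} = \binom{\Norb}{N_\alpha} \binom{\Norb}{N_\beta}.
\]
For frozen-core benzene in the cc-pVDZ basis, \ie, 30 correlated electrons in 108 orbitals ($N_\alpha = N_\beta = 15$), this gives $\binom{108}{15}^2$, which is roughly $10^{35}$ to $10^{36}$ determinants.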
Accordingly, the SCI+PT2 family of methods performs a sparse exploration of the FCI space by selecting iteratively only the most energetically relevant determinants of the variational space and supplementing it with a second-order perturbative correction (PT2). \cite{Huron_1973,Garniron_2017,Sharma_2017,Garniron_2018,Garniron_2019}
Although the formal scaling of such algorithms remains exponential, the prefactor is greatly reduced which explains their current attractiveness in the electronic structure community \DMJ{thanks to their} much wider applicability than their standard FCI parent.
Note that, very recently, several groups \cite{Aroeira_2021,Lee_2021,Magoulas_2021} have coupled CC and SCI methods via the externally-corrected CC methodology, \cite{Paldus_2017} showing promising performances for weakly and strongly correlated systems.
A rather different strategy in order to reach the holy grail FCI limit is to resort to M{\o}ller-Plesset (MP) perturbation theory, \cite{Moller_1934}
whose popularity originates from its black-box nature, size-extensivity, and relatively low computational scaling, making it easily applied to a broad range of molecular systems. \DMJ{Well, low scaling, not really; rather, there are good RI codes... I would say low computational requirements rather than low scaling, since MP2 scales like CC2.}
The second-order M{\o}ller-Plesset (MP2) method \cite{Moller_1934} [which scales as $\order*{\Norb^{5}}$] has been broadly adopted in quantum chemistry for several decades, and is now included in the increasingly popular double-hybrid functionals \cite{Grimme_2006} alongside exact \trashDMJ{HF} exchange. \DMJ{Either ``exact'' or ``of HF form'', but ``exact HF'' is a pleonasm to me.}
Its higher-order variants [MP3, MP4, \cite{Krishnan_1980} MP5, \cite{Kucharski_1989} and MP6, \cite{He_1996a,He_1996b} which scale as $\order*{\Norb^{6}}$, $\order*{\Norb^{7}}$, $\order*{\Norb^{8}}$, and $\order*{\Norb^{9}}$, respectively] have been investigated far less extensively.
However, it is now widely recognized that the series of MP approximations might show erratic, slowly convergent, or divergent behavior that limits its applicability and systematic improvability. \cite{Laidig_1985,Knowles_1985,Handy_1985,Gill_1986,Laidig_1987,Nobes_1987,Gill_1988,Gill_1988a,Lepetit_1988,Malrieu_2003,Marie_2021a}
\DMJ{The most iconic example of such coupling, namely the} CCSD(T) method, \cite{Raghavachari_1989} includes iteratively the single and double excitations and perturbatively (from MP4 and partially MP5) the triple excitations, \DMJ{leading to the so-called} ``gold-standard'' of quantum chemistry for weakly correlated systems thanks to its excellent accuracy/cost ratio.\trashDMJ{, is probably the most iconic example of such coupling.}
Motivated by the recent blind test of Eriksen \textit{et al.}\cite{Eriksen_2020}~reporting the performance of a large panel of emerging electronic structure methods [the many-body expansion FCI (MBE-FCI), \cite{Eriksen_2017,Eriksen_2018,Eriksen_2019a,Eriksen_2019b} adaptive sampling CI (ASCI), \cite{Tubman_2016,Tubman_2018,Tubman_2020} iterative CI (iCI), \cite{Liu_2014,Liu_2016,Lei_2017,Zhang_2020} semistochastic heat-bath CI (SHCI), \cite{Holmes_2016,Holmes_2017,Sharma_2017} the full coupled-cluster reduction (FCCR), \cite{Xu_2018,Xu_2020} density-matrix renormalization group (DMRG), \cite{White_1992,White_1993,Chan_2011} adaptive-shift FCI quantum Monte Carlo (AS-FCIQMC), \cite{Booth_2009,Cleland_2010,Ghanem_2019} and cluster-analysis-driven FCIQMC (CAD-FCIQMC) \cite{Deustua_2017,Deustua_2018}] on the non-relativistic frozen-core correlation energy of the benzene molecule in the standard correlation-consistent double-$\zeta$ Dunning basis set (cc-pVDZ), some of us have recently investigated the performance of the SCI method known as \textit{Configuration Interaction using a Perturbative Selection made Iteratively} (CIPSI) \cite{Huron_1973,Giner_2013,Giner_2015,Garniron_2018,Garniron_2019} on the very same system \cite{Loos_2020e} [see also Ref.~\onlinecite{Lee_2020} for a study of the performance of phaseless auxiliary-field quantum Monte Carlo (ph-AFQMC) \cite{Motta_2018}].
In the continuity of this recent work, we report here a \DMJ{very large} extension \DMJ{of this one-molecule test} by \DMJ{accurately} estimating the (frozen-core) FCI/cc-pVDZ correlation energy of twelve cyclic molecules (cyclopentadiene, furan, imidazole, pyrrole, thiophene, benzene, pyrazine, pyridazine, pyridine, pyrimidine, s-tetrazine, and s-triazine) with the help of CIPSI employing energetically-optimized orbitals at the same level of theory. \cite{Yao_2020,Yao_2021}
In particular, we study i) the MP perturbation series up to fifth-order (MP2, MP3, MP4, and MP5), ii) the CC2, CC3, and CC4 approximate series, and iii) the ``complete'' CC series up to quadruples (\ie, CCSD, CCSDT, and CCSDTQ).
The performance of the ground-state gold standard CCSD(T) and of the completely renormalized (CR) CC model, CR-CC(2,3), \cite{Kowalski_2000a,Kowalski_2000b,Piecuch_2002a,Piecuch_2002b,Piecuch_2005} is also investigated.
Five-membered rings (top) and six-membered rings (bottom) considered in this study.\DMJ{It is silly, but I do not like the placement of the double bonds in the six-membered rings. I would have drawn only single bonds everywhere.}
In Sec.~\ref{sec:OO-CIPSI}, we provide theoretical details about the CIPSI algorithm and the orbital optimization procedure \trashDMJ{that we have} employed here.
In Sec.~\ref{sec:res}, we report our reference FCI correlation energies for the five-membered and six-membered cyclic molecules obtained thanks to extrapolated orbital-optimized CIPSI calculations (Sec.~\ref{sec:cipsi_res}).
These reference correlation energies are then used to benchmark and study the convergence properties of various perturbative and CC methods (Sec.~\ref{sec:mpcc_res}).
Finally, we draw our conclusions in Sec.~\ref{sec:ccl}.
Here, we provide key details about the CIPSI method \cite{Huron_1973,Garniron_2019} as well as the orbital optimization procedure which has been shown to be highly effective in the context of SHCI by Umrigar and coworkers. \cite{Eriksen_2020,Yao_2020,Yao_2021}
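At the $k^\textsuperscript{th}$ iteration, the total CIPSI energy $\ECIPSI^{(k)}$ is defined as the sum of the variational energy%DJ: Changed kth to k^th, same below (search for textsuperscript if you do not like it)
\begin{equation}
	\Evar^{(k)} = \frac{\mel*{\Psivar^{(k)}}{\hH}{\Psivar^{(k)}}}{\braket*{\Psivar^{(k)}}{\Psivar^{(k)}}}
\end{equation}
and a second-order perturbative correction
\begin{equation}
	\EPT^{(k)}
	= \sum_{\alpha \in \cA_k} e_{\alpha}^{(k)}
	= \sum_{\alpha \in \cA_k} \frac{\abs*{\mel*{\Psivar^{(k)}}{\hH}{\alpha}}^2}{\Evar^{(k)} - \mel*{\alpha}{\hH}{\alpha}},
\end{equation}
where $\hH$ is the electronic Hamiltonian,
\begin{equation}
\label{eq:Psivar}
	\ket*{\Psivar^{(k)}} = \sum_{I \in \cI_k} c_I^{(k)} \ket*{I}
\end{equation}
is the variational wave function, $\cI_k$ is the set of internal determinants $\ket*{I}$ and $\cA_k$ is the set of external determinants (or perturbers) $\ket*{\alpha}$ which do not belong to the variational space at the $k^\textsuperscript{th}$ iteration but are linked to it via a nonzero matrix element, \ie, $\mel*{\Psivar^{(k)}}{\hH}{\alpha}\neq0$.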
The sets $\cI_k$ and $\cA_k$ define, at the $k^\textsuperscript{th}$ iteration, the internal and external spaces, respectively.
In the selection step, the perturbers corresponding to the largest $\abs*{e_{\alpha}^{(k)}}$ values are then added to the variational space at \DMJ{the next iteration.}
In practice, $\Evar^{(k)}$ is the lowest eigenvalue of the $\Ndet^{(k)}\times\Ndet^{(k)}$ CI matrix with elements $\mel{I}{\hH}{J}$ obtained via Davidson's algorithm. \cite{Davidson_1975}
The magnitude of $\EPT^{(k)}$ provides, at iteration $k$, a qualitative idea of the ``distance'' to the FCI limit. \cite{Garniron_2018}%DJ: I would not put distance in quotes; it is clear without them
Some of the technology presented here has been borrowed from complete-active-space self-consistent-field (CASSCF) methods \cite{Werner_1980,Werner_1985,Sun_2017,Kreplin_2019,Kreplin_2020} but one of the strengths of SCI methods is that one does not need to select an active space and to classify orbitals as active, inactive, and virtual orbitals.
Here, we detail our orbital optimization procedure within the CIPSI algorithm and we assume that the variational wave function is normalized, \ie, $\braket*{\Psivar}{\Psivar}=1$.
As stated in Sec.~\ref{sec:intro}, $\Evar$ depends on both the CI coefficients $\{ c_I \}_{1\le I \le\Ndet}$ [see Eq.~\eqref{eq:Psivar}] and the orbital rotation parameters $\{\kappa_{pq}\}_{1\le p,q \le\Norb}$.
Motivated by cost-saving arguments, we have chosen to optimize separately the CI and orbital coefficients by alternately diagonalizing the CI matrix after each selection step and then rotating the orbitals until the variational energy, for a given number of determinants, is minimal. \DMJ{We refer the interested reader to the recent work of Yao and Umrigar for a detailed comparison of coupled, uncoupled, and partially-coupled optimizations within SCI methods}. \cite{Yao_2021}
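\DMJ{In our approach}, we conveniently rewrite the variational energy as
\begin{equation}
\label{eq:Evar_c_k}
	\Evar(\boldsymbol{c},\bk) = \mel*{\Psivar}{e^{\Hat{\kappa}} \hH e^{-\Hat{\kappa}}}{\Psivar},
\end{equation}
where $\boldsymbol{c}$ gathers the CI coefficients, $\bk$ the orbital rotation parameters, and
\begin{equation}
	\Hat{\kappa} = \sum_{p<q} \sum_{\sigma} \kappa_{pq} \left( \cre{p\sigma} \ani{q\sigma} - \cre{q\sigma} \ani{p\sigma} \right)
\end{equation}
is a real-valued one-electron antisymmetric operator, which creates an orthogonal transformation of the orbital coefficients when exponentiated, $\ani{p\sigma}$ ($\cre{p\sigma}$) being the second quantization annihilation (creation) operator which annihilates (creates) a spin-$\sigma$ electron in the real-valued spatial orbital $\MO{p}(\br)$. \cite{Helgaker_2013}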
Because the size of the CI space is much larger than the orbital space, for each macroiteration, we perform multiple \textit{microiterations} which consist in iteratively minimizing the variational energy \eqref{eq:Evar_c_k} with respect to the $\Norb(\Norb-1)/2$ independent orbital rotation parameters for a fixed set of determinants.
After each microiteration (\ie, orbital rotation), the one- and two-electron integrals [see Eqs.~\eqref{eq:one} and \eqref{eq:two}] have to be updated.
Moreover, the CI matrix must be re-diagonalized and new one- and two-electron density matrices [see Eqs.~\eqref{eq:one_dm} and \eqref{eq:two_dm}] \DMJ{need to be} computed.
Microiterations are stopped when a stationary point is found, \ie, $\norm{\bg}_\infty < \tau$, where $\tau$ is a user-defined threshold which has been set to $10^{-4}$ a.u.~in the present study, and a new CIPSI selection step is performed.
Note that a tight convergence is not critical here as a new set of microiterations is performed at each macroiteration and a new production CIPSI run is performed from scratch using the final set of orbitals (see Sec.~\ref{sec:compdet}).
This procedure might sound computationally expensive but one has to realize that the microiterations are usually performed only for relatively compact variational spaces.
To enhance the convergence of the microiteration process, we employ an adaptation of the Newton-Raphson method known as ``trust region''. \cite{Nocedal_1999}
This popular variant defines a region where the quadratic approximation \eqref{eq:EvarTaylor} is an adequate representation of the objective energy function \eqref{eq:Evar_c_k} and it evolves during the optimization process in order to preserve the adequacy via a constraint on the step size preventing it from overstepping, \ie, $\norm{\bk}\leq\Delta$, where $\Delta$ is the trust radius.
By introducing a Lagrange multiplier $\lambda$ to control the trust-region size, one replaces Eq.~\eqref{eq:kappa_newton} by $\bk=-(\bH+\lambda\bI)^{-1}\cdot\bg$.
The addition of the level shift $\lambda\geq0$ removes the negative eigenvalues and ensures the positive definiteness of the Hessian matrix by reducing the step size.
By choosing the right value of $\lambda$, $\norm{\bk}$ is constrained \trashDMJ{into}\DMJ{within} a hypersphere of radius $\Delta$ and is able to evolve from the Newton direction at $\lambda=0$ to the steepest descent direction as $\lambda$ grows.
The evolution of the trust radius during the optimization and the use of a condition to reject the step when the energy rises ensure the convergence of the algorithm.
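The trust-region logic described above can be summarized as follows. This is the textbook formulation of Ref.~\onlinecite{Nocedal_1999}; the acceptance ratio $\rho$ and the model function $m$ are generic ingredients stated here as assumptions, not as a description of the actual implementation. At fixed CI coefficients, the quadratic model of the energy around $\bk=\boldsymbol{0}$ and the level-shifted step read
\[
	m(\bk) = \Evar(\boldsymbol{0}) + \bg \cdot \bk + \frac{1}{2} \bk \cdot \bH \cdot \bk,
	\qquad
	\bk(\lambda) = -(\bH+\lambda\bI)^{-1}\cdot\bg,
	\qquad
	\norm{\bk(\lambda)} \le \Delta,
\]
and the quality of a step is typically measured by the ratio of the actual to the predicted energy change,
\[
	\rho = \frac{\Evar(\bk) - \Evar(\boldsymbol{0})}{m(\bk) - m(\boldsymbol{0})}.
\]
A ratio close to one indicates a reliable quadratic model, in which case $\Delta$ may be enlarged, whereas a small or negative $\rho$ leads to a smaller $\Delta$ and, if the energy rises, to the rejection of the step. For large $\lambda$, the step tends to $-\bg/\lambda$, which is the steepest descent direction mentioned above.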
\caption{$\Delta\Evar$ (solid) and $\Delta\Evar+\EPT$ (dashed) as functions of the number of determinants $\Ndet$ in the variational space for the twelve cyclic molecules represented in Fig.~\ref{fig:mol}.
Two sets of orbitals are considered: natural orbitals (NOs, in red) and optimized orbitals (OOs, in blue).
\DMJ{Funny, there is an inflection around $10^5$ for all of them for Evar.}\DMJ{Why keep cc-pVDZ in all the captions rather than putting it in the actual caption?}
\caption{Total energy $E$ (in \si{\hartree}) and correlation energy $\Delta E$ (in \si{\milli\hartree}) for the frozen-core ground state of five-membered rings in the cc-pVDZ basis set.
For the CIPSI estimates of the FCI correlation energy, the fitting error associated with the weighted five-point linear fit is reported in parentheses.
\caption{Total energy $E$ (in \si{\hartree}) and correlation energy $\Delta E$ (in \si{\milli\hartree}) for the frozen-core ground state of six-membered rings in the cc-pVDZ basis set.
For the CIPSI estimates of the FCI correlation energy, the fitting error associated with the weighted five-point linear fit is reported in parentheses.\DMJ{Say ``see Table I'' directly rather than having a long caption?}
Extrapolated correlation energies $\Delta\Eextrap$ (in \si{\milli\hartree}) for the twelve cyclic molecules represented in Fig.~\ref{fig:mol} and their associated fitting errors (in \si{\milli\hartree}) obtained via weighted linear fits with a varying number of points.
\caption{$\Delta\Evar$ (solid) and $\Delta\Evar+\EPT$ (dashed) as functions of the number of determinants $\Ndet$ in the variational space for the benzene molecule.
\caption{Convergence of the correlation energy (in \si{\milli\hartree}) as a function of the \DMJ{formal computational scaling} for the twelve cyclic molecules represented in Fig.~\ref{fig:mol}.
The FCI estimate of the correlation energy is represented as a black line.\DMJ{It is not the cost but the formal scaling. I would also change it in your small legends + remove cc-pVDZ + this green is better + since your scale
is linear, the end is hard to see; one gets the impression that CC3 and CCSDT are already spot on...}
Mean absolute error (MAE), mean signed error (MSE), and minimum (Min) and maximum (Max) absolute errors (in \si{\milli\hartree}) with respect to the FCI correlation energy for various methods.
The geometries of the twelve systems considered in the present study were all obtained at the CC3/aug-cc-pVTZ level of theory and were extracted from a previous study. \cite{Loos_2020b}\DMJ{I changed the reference, which I think was wrong}
Note that, for the sake of consistency, the geometry of benzene considered here is different from the one of Ref.~\onlinecite{Loos_2020e} which was obtained at a lower level of theory [MP2/6-31G(d)]. \cite{Schreiber_2008}
The MP2, MP3, MP4, CC2, CC3, CC4, CCSD, CCSDT, and CCSDTQ calculations were performed with CFOUR, \cite{Matthews_2020} the CCSD(T) and CR-CC(2,3) calculations were made with GAMESS 2014R1, \cite{gamess} and the MP5 calculations were performed with GAUSSIAN 09. \cite{g09}\DMJ{I thought the CCSD(T) calculations were also done with Gaussian? Note that we could have run them with CFOUR as well...}
The CIPSI calculations were performed with QUANTUM PACKAGE. \cite{Garniron_2019}
In the current implementation, the selection step and the PT2 correction are computed simultaneously via a hybrid semistochastic algorithm.\cite{Garniron_2017,Garniron_2019}%(which explains the statistical error associated with the PT2 correction in the following).
Here, we employ the renormalized version of the PT2 correction which was recently implemented and tested for a more efficient extrapolation to the FCI limit thanks to a partial resummation of the higher orders of perturbation. \cite{Garniron_2019}
We refer the interested reader to Ref.~\onlinecite{Garniron_2019} \trashDMJ{where one can find all the details regarding the implementation of the PT2 correction and the CIPSI algorithm}\DMJ{for further details.}
\trashDMJ{For all these calculations, we consider} Dunning's correlation-consistent double-$\zeta$ basis (cc-pVDZ) \DMJ{has been applied in all calculations.}
Although the FCI energy has the enjoyable property of being independent of the set of one-electron orbitals used to construct the many-electron Slater determinants, the convergence properties of CIPSI, which is a truncated CI method, depend strongly on this orbital choice.
In the present study, we investigate, in particular, the convergence behavior of the CIPSI energy for two sets of orbitals: natural orbitals (NOs) and optimized orbitals (OOs).
Following our usual procedure, \cite{Scemama_2018,Scemama_2018b,Scemama_2019,Loos_2018a,Loos_2019,Loos_2020a,Loos_2020b,Loos_2020c,Loos_2020e} we perform first a preliminary SCI calculation using HF orbitals in order to generate a SCI wave function with at least $10^7$ determinants.
Natural orbitals are computed based on this wave function and they are used to perform a new CIPSI run up to $8\times10^7$ determinants.
Successive orbital optimizations are then performed, which consist in minimizing the variational CIPSI energy at each macroiteration up to approximately $2\times10^5$ determinants.
Once the orbital optimization has converged, we perform, as our production run, a new CIPSI calculation from scratch using this set of optimized orbitals up to $8\times10^7$ determinants.
Using optimized orbitals has the undeniable advantage of producing, for a given variational energy, more compact CI expansions (see Sec.~\ref{sec:res}).
For the benzene molecule, we also \DMJ{explored} the use of localized orbitals (LOs) which are produced with the Boys-Foster localization procedure \cite{Boys_1960} that we apply to the natural orbitals in several orbital windows in order to preserve a strict $\sigma$-$\pi$ separation in the planar systems considered here. \cite{Loos_2020e}
Because they take advantage of the local character of electron correlation, localized orbitals have been shown to provide faster convergence towards the FCI limit compared to natural orbitals. \cite{Angeli_2003,Angeli_2009,BenAmor_2011,Suaud_2017,Chien_2018,Eriksen_2020,Loos_2020e}
As we shall see below, employing optimized orbitals has the advantage of producing an even smoother and faster convergence of the SCI energy toward the FCI limit.
Note that both localized and optimized orbitals do break the spatial symmetry.
Unlike excited-state calculations where it is important to enforce that the wave functions are eigenfunctions of the $\Hat{S}^2$ spin operator, \cite{Chilkuri_2021} the present wave functions do not fulfill this property as we aim for the lowest possible energy of a closed-shell singlet state.
We have found that $\expval*{\Hat{S}^2}$ is, nonetheless, very close to zero for each system.\DMJ{Would you give a value, at least in the SI, to give an idea anyway? How many decimal places is ``very close to zero''?}
All the data (geometries, energies, etc) and supplementary material associated with the present manuscript are openly available in Zenodo at \url{http://doi.org/XX.XXXX/zenodo.XXXXXXX}.
Our motivation here is to generate FCI-quality reference correlation energies for the twelve cyclic molecules represented in Fig.~\ref{fig:mol} in order to benchmark \trashDMJ{, in a second time,} the performances \trashDMJ{and convergence properties} of various mainstream MP and CC methods (see Sec.~\ref{sec:mpcc_res}).
For the natural and optimized orbital sets, we report, in Fig.~\ref{fig:vsNdet}, the evolution of the variational correlation energy $\Delta\Evar=\Evar-\EHF$ (where $\EHF$ is the HF energy) and its perturbatively corrected value $\Delta\Evar+\EPT$ with respect to the number of determinants $\Ndet$ for each cyclic molecule.
As compared to natural orbitals (solid red lines), one can see that, for a given number of determinants, the use of optimized orbitals greatly lowers $\Delta\Evar$ (solid blue lines).
This indicates that, for a given number of determinants, $\EPT$ (which, we recall, provides a qualitative idea of the distance to the FCI limit) is much smaller for optimized orbitals than for natural orbitals.
This is further evidenced in Fig.~\ref{fig:vsEPT2} where we show the behavior of $\Delta\Evar$ as a function of $\EPT$ for both sets of orbitals.
From Fig.~\ref{fig:vsEPT2}, it is clear that \trashDMJ{, using optimized orbitals,} the behavior of $\Delta\Evar$ is much more linear and produces smaller $\EPT$ values \DMJ{when optimized orbitals are selected}, hence facilitating the extrapolation procedure to the FCI limit (see below).
The five-point weighted linear fits using the five largest variational wave functions are also represented (dashed black lines), while the FCI estimate of the correlation energy (solid black line) is reported for reference in Figs.~\ref{fig:vsNdet} and \ref{fig:vsEPT2}.
Figure \ref{fig:BenzenevsNdet} compares the convergence of $\Delta\Evar$ for \trashDMJ{the} natural, localized, and optimized \trashDMJ{sets of} orbitals \trashDMJ{in the particular case of}\DMJ{for} benzene.
As mentioned in Sec.~\ref{sec:compdet}, although both the localized and optimized orbitals break the spatial symmetry to take advantage of the local nature of electron correlation, the latter set further \DMJ{improves}\trashDMJ{improve} on the use of the former set.
More quantitatively, optimized orbitals produce the same variational energy as localized orbitals with, roughly, a ten-fold reduction in the number of determinants.
A similar improvement is observed going from natural to localized orbitals.
According to these observations, all our FCI correlation energy estimates have been produced \DMJ{from/using}\trashDMJ{with} the set of optimized orbitals.
To \trashDMJ{do so}\DMJ{this end}, we have \trashDMJ{then} extrapolated the orbital-optimized variational CIPSI correlation energies to $\EPT=0$ via a weighted five-point linear fit using the five largest variational wave functions (see Fig.~\ref{fig:vsEPT2}).
The fitting weights have been taken as the inverse square of the perturbative corrections.
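Written out explicitly, the extrapolation is a standard weighted least-squares fit. The sketch below introduces, for illustration only, the symbols $a$, $b$, $w_i$, and $\chi^2$, with the index $i$ running over the five largest variational wave functions. It amounts to minimizing
\[
	\chi^2(a,b) = \sum_{i=1}^{5} w_i \left[ \Delta\Evar^{(i)} - a - b \, \EPT^{(i)} \right]^2,
	\qquad
	w_i = \left[ \EPT^{(i)} \right]^{-2},
\]
so that the intercept $a$ is the extrapolated correlation energy $\Delta\Eextrap$ at $\EPT=0$ and $b$ is the slope. With $x_i = \EPT^{(i)}$, $y_i = \Delta\Evar^{(i)}$, $S = \sum_i w_i$, $S_x = \sum_i w_i x_i$, $S_y = \sum_i w_i y_i$, $S_{xx} = \sum_i w_i x_i^2$, and $S_{xy} = \sum_i w_i x_i y_i$, the intercept has the closed form
\[
	a = \frac{S_{xx} S_{y} - S_{x} S_{xy}}{S \, S_{xx} - S_{x}^2}.
\]
Because $w_i = x_i^{-2}$, the points with the smallest perturbative corrections, \ie, the largest wave functions, dominate the fit.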
Our final FCI correlation energy estimates are reported in Tables \ref{tab:Tab5-VDZ} and \ref{tab:Tab6-VDZ} for the five- and six-membered rings, respectively, alongside their corresponding fitting error.
The stability of these estimates is illustrated by the results gathered in Table \ref{tab:fit} \trashDMJ{where we report}\DMJ{that lists}\trashDMJ{, for each system,} the extrapolated correlation energies $\Delta\Eextrap$ and their associated fitting errors obtained via weighted linear fits varying the number of fitting points from $3$ to $7$.
Although we cannot provide a mathematically rigorous error bar, the data provided by Table \ref{tab:fit} show that the extrapolation procedure is robust and that our FCI estimates are very likely accurate to a few tenths of a millihartree.
Logically, the FCI estimates for the five-membered rings seem slightly more accurate than for the (larger) six-membered rings.
\trashDMJ{Note that it}\DMJ{It is also} pleasing to see that, although different geometries are considered, our present estimate of the frozen-core correlation energy of the benzene molecule in the cc-pVDZ basis is very close to the one reported in Refs.~\onlinecite{Eriksen_2020,Loos_2020e}.\DMJ{Why not give numbers here? A difference of only xxx???}
Again, the superiority of the latter set is clear as \DMJ{both} the variation in extrapolated values and \DMJ{the} fitting error \trashDMJ{is}\DMJ{are} much larger with the natural set.
Taking cyclopentadiene as an example, the extrapolated values vary by almost \SI{1}{\milli\hartree} with natural orbitals and less than \SI{0.1}{\milli\hartree} with the optimized set.
Using the CIPSI estimates of the FCI correlation energy produced in Sec.~\ref{sec:cipsi_res}, we now study the performance and convergence properties of three series of methods: i) MP2, MP3, MP4, and MP5, ii) CC2, CC3, and CC4, and iii) CCSD, CCSDT, and CCSDTQ.
\trashDMJ{All these}\DMJ{The raw} data are reported in Tables \ref{tab:Tab5-VDZ} and \ref{tab:Tab6-VDZ} for the five- and six-membered rings, respectively.
In Fig.~\ref{fig:MPCC}, we show, for each molecule, the convergence of the correlation energy for each series of methods as a function of the \DMJ{formal} computational \trashDMJ{cost}\DMJ{scaling} of the corresponding method.
\trashDMJ{The FCI correlation energy estimate is represented as a black line for reference.}\DMJ{Not needed if it is already in the caption, is it?}
\trashDMJ{Key}\DMJ{Selected} statistical quantities [mean absolute error (MAE), mean signed error (MSE), \trashDMJ{and} minimum (Min) and maximum (Max) absolute errors with respect to the FCI reference values] are also reported in Table \ref{tab:stats} for each method as well as their formal computational scaling. \DMJ{You say absolute min and max but, unless I am mistaken, all the values are always too small in absolute value? This is actually something I would state, because given the methods used it was not guaranteed, yet it holds anyway? One exception: MP4 for s-tetrazine?}
Unfortunately, CC with singles, doubles, triples, quadruples, and pentuples (CCSDTQP) calculations are out of \DMJ{computational} reach here. \cite{Hirata_2000,Kallay_2001}
As expected for the present set of weakly correlated systems, going from CCSD to CCSDTQ, one \trashDMJ{improves} systematically and quickly \DMJ{improves} the correlation energies with \DMJ{respective} MAEs of $39.4$, $4.5$, \SI{1.8}{\milli\hartree} for CCSD, CCSDT, and CCSDTQ\trashDMJ{, respectively}.
As usually observed, CCSD(T) (MAE of \SI{4.5}{\milli\hartree}) provides similar correlation energies than the more expensive CCSDT method by computing perturbatively (instead of iteratively) the triple excitations, while CCSD(T) and CR-CC(2,3) \trashDMJ{perform}\DMJ{performs} equally well.
\DMJ{Why move on to MP here, rather than discussing the other CC methods first?}
Second, let us look into the \trashDMJ{series of MP approximations }\DMJ{MP series}, which is known, as mentioned in Sec.~\ref{sec:intro}, to potentially exhibit ``surprising'' behaviors depending on the type of correlation at play.\cite{Laidig_1985,Knowles_1985,Handy_1985,Gill_1986,Laidig_1987,Nobes_1987,Gill_1988,Gill_1988a,Lepetit_1988,Malrieu_2003}
For each system, the MP series decreases monotonically up to MP4 but rises quite significantly when one takes into account the fifth-order correction.
We note that the MP4 correlation energy is always quite accurate (MAE of \SI{2.1}{\milli\hartree}) and is only a few millihartree higher than the FCI value (except in the case of s-tetrazine where the MP4 number is very slightly below the reference value): MP5 (MAE of \SI{9.4}{\milli\hartree}) is thus systematically worse than MP4 for these weakly-correlated systems.
Importantly here, one notices that MP4 [which scales as $\order*{N^7}$] is systematically on par with the \DMJ{much} more expensive $\order*{N^{10}}$ CCSDTQ method which exhibits a slightly smaller MAE of \SI{1.8}{\milli\hartree}.
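To put these formal scalings in perspective, one can consider what happens when the system size $N$ is multiplied by a factor $\lambda$ (a symbol introduced here for illustration only).
This is a back-of-the-envelope estimate that ignores prefactors and lower-order terms, and therefore says nothing about actual timings:
\begin{equation}
	\frac{(\lambda N)^{7}}{N^{7}} = \lambda^{7},
	\qquad
	\frac{(\lambda N)^{10}}{N^{10}} = \lambda^{10},
	\qquad
	\frac{\lambda^{10}}{\lambda^{7}} = \lambda^{3}.
\end{equation}
For instance, doubling the system size ($\lambda = 2$) formally increases the cost of an $\order*{N^7}$ method by a factor of $2^7 = 128$ and that of an $\order*{N^{10}}$ method by a factor of $2^{10} = 1024$.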
As observed in our recent study on excitation energies, \cite{Loos_2021} CC4, which returns a MAE of \SI{1.5}{\milli\hartree}, is an outstanding approximation to its CCSDTQ parent and is, in the present case, even slightly more accurate in terms of mean errors as well as maximum and minimum absolute errors.
Moreover, we observe that CC3 \trashDMJ{(MAE of \SI{2.7}{\milli\hartree})}\trashDMJ{and CC4 provide}\DMJ{provides very accurate} correlation energies \DMJ{with a MAE of \SI{2.7}{\milli\hartree},}\trashDMJ{that only deviate by one or two millihartree,} showing that \trashDMJ{the iterative CC3}\DMJ{this} method is particularly effective for ground-state energetics and outperforms both the perturbative CCSD(T) and iterative CCSDT models.
As a final remark, we would like to mention that even if the two families of CC methods studied here are known to be non-variational (see Sec.~\ref{sec:intro}), for the present set of weakly-correlated molecular systems, they never produce a lower energy than the FCI estimate, as illustrated by the systematic equality between MAEs and MSEs.\DMJ{OK, you say it here; I would have said it above, but OK.}
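The connection between the sign of the errors and the equality of the two means can be made explicit.
Writing $\Delta_i$ for the signed deviation of a given method from the FCI estimate for the $i$th of $M$ molecules (our notation, with the assumed convention that $\Delta_i > 0$ when the energy lies above the FCI estimate), one has
\begin{equation}
	\mathrm{MAE} - \mathrm{MSE}
	= \frac{1}{M} \sum_{i=1}^{M} \left( |\Delta_i| - \Delta_i \right)
	\geq 0,
\end{equation}
where each term of the sum is non-negative and vanishes only if $\Delta_i \geq 0$.
Hence, $\mathrm{MAE} = \mathrm{MSE}$ holds if and only if $\Delta_i \geq 0$ for every molecule, keeping in mind that such an equality can only be checked to within the number of digits reported.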
Using \trashDMJ{the}\DMJ{a} SCI algorithm named \textit{Configuration Interaction using a Perturbative Selection made Iteratively} (CIPSI), we have produced FCI-quality frozen-core correlation energies for twelve cyclic molecules (see Fig.~\ref{fig:mol}) in the correlation-consistent double-$\zeta$ Dunning basis set (cc-pVDZ).
These estimates, which are likely accurate to a few tenths of a millihartree, have been obtained by extrapolating CIPSI energies to the FCI limit based on a set of orbitals obtained by minimizing the CIPSI variational energy.
\trashDMJ{Compared to natural orbitals, we have shown that, by} Using energetically optimized orbitals, one can reduce the size of the variational space by one order of magnitude for the same variational energy \DMJ{as compared to natural orbitals}.
Thanks to these reference FCI energies, we have then benchmarked three families of popular electronic structure methods: i) the MP perturbation series up to fifth-order (MP2, MP3, MP4, and MP5), ii) the approximate CC series CC2, CC3, and CC4, and iii) the ``complete'' CC series CCSD, CCSDT, and CCSDTQ.
\trashDMJ{Our results have shown that,} With a $\order*{N^7}$ scaling, MP4 provides an interesting accuracy/cost ratio for this particular set of weakly correlated systems, while MP5 systematically worsens the perturbative estimates of the correlation energy.
\trashDMJ{We have evidenced that}\DMJ{In addition,} CC3 (where the triples are computed iteratively) \trashDMJ{also} outperforms the perturbative-triples CCSD(T) method with the same $\order*{N^7}$ scaling, its completely renormalized version CR-CC(2,3), \trashDMJ{but also}\DMJ{as well as} its more expensive parent, CCSDT.
A similar trend is observed for the methods including quadruple excitations, where the $\order*{N^9}$ CC4 model has been shown to be \DMJ{slightly} more accurate than CCSDTQ [which scales as $\order*{N^{10}}$], \DMJ{both methods
providing correlation energies within \SI{2}{\milli\hartree} of the FCI limit.}
Of course, the present trends are only valid for this particular class of (weakly-correlated) systems and it would be desirable to \trashDMJ{provide more}\DMJ{have a broader} variety \trashDMJ{in terms }of systems in the future by including more challenging systems such as, for example, transition metal compounds.
Some work along this line is currently being \trashDMJ{done}\DMJ{performed}.
As a perspective, we are currently investigating the performance of the present approach for excited states in order to expand the QUEST database of vertical excitation energies. \cite{Veril_2021}
We hope to report on this in the near future.
The compression of the variational space brought by optimized orbitals could also be beneficial in the context of quantum Monte Carlo methods to generate compact, yet accurate multi-determinant trial wave functions. \cite{Dash_2018,Dash_2019,Scemama_2020,Dash_2021}
This work was performed using HPC resources from GENCI-TGCC (2021-gen1738)\DMJ{, from the CCIPL computational center installed in Nantes,} and from CALMIP (Toulouse) under allocation 2021-18005, and was also supported by the European Centre of Excellence in Exascale Computing TREX --- Targeting Real Chemical Accuracy at the Exascale. This project has received funding from the European Union's Horizon 2020 --- Research and Innovation program --- under grant agreement no.~952165.
This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (Grant agreement No.~863481).