%We report (frozen-core) full configuration interaction (FCI) energies in finite Hilbert spaces for various five- and six-membered rings.
Following our recent work on the benzene molecule [\href{https://doi.org/10.1063/5.0027617}{J. Chem. Phys. \textbf{153}, 176101 (2020)}], itself motivated by the blind challenge of Eriksen \textit{et al.} [\href{https://doi.org/10.1021/acs.jpclett.0c02621}{J. Phys. Chem. Lett. \textbf{11}, 8922 (2020)}] on the same system, we report reference frozen-core correlation energies for twelve five- and six-membered ring molecules (cyclopentadiene, furan, imidazole, pyrrole, thiophene, benzene, pyrazine, pyridazine, pyridine, pyrimidine, tetrazine, and triazine) in the standard correlation-consistent double-$\zeta$ Dunning basis set (cc-pVDZ).
This corresponds to Hilbert spaces with sizes ranging from $10^{28}$ (for thiophene) to $10^{36}$ (for benzene).
Our estimates are based on energetically optimized-orbital selected configuration interaction (SCI) calculations performed with the \textit{Configuration Interaction using a Perturbative Selection made Iteratively} (CIPSI) algorithm.
In particular, we study the convergence properties of i) the M{\o}ller-Plesset perturbation series up to fifth order (MP2, MP3, MP4, and MP5), ii) the iterative approximate single-reference coupled-cluster series CC2, CC3, and CC4, and iii) the single-reference coupled-cluster series CCSD, CCSDT, and CCSDTQ.
Electronic structure theory relies heavily on approximations.
Loosely speaking, to make any theory useful, three main approximations must be enforced.
The first fundamental approximation, known as the Born-Oppenheimer approximation, usually consists in assuming that the motions of nuclei and electrons are decoupled.
The second central approximation, which makes calculations feasible on a computer, is the basis set approximation, where one introduces a set of pre-defined basis functions to represent the many-electron wave function of the system.
In most molecular calculations, a set of one-electron, atom-centered Gaussian basis functions is introduced to expand the so-called one-electron molecular orbitals, which are then used to build the many-electron Slater determinants.
The third and most relevant approximation in the present context is the ansatz (or form) of the electronic wave function $\Psi$.
For example, in configuration interaction (CI) methods, the wave function is expanded as a linear combination of Slater determinants, while in (single-reference) coupled-cluster (CC) theory, \cite{Cizek_1966,Paldus_1972,Crawford_2000,Piecuch_2002,Bartlett_2007,Shavitt_2009} a reference Slater determinant $\Psi_0$ [usually taken as the Hartree-Fock (HF) wave function] is multiplied by a wave operator defined as the exponentiated excitation operator $\Hat{T}=\sum_{k=1}^\Nel\Hat{T}_k$ (where $\Nel$ is the number of electrons).
Truncating $\Hat{T}$ at a given excitation degree defines a hierarchy of methods of increasing accuracy and cost: CC with singles and doubles (CCSD), \cite{Cizek_1966,Purvis_1982} CC with singles, doubles, and triples (CCSDT), \cite{Noga_1987a,Scuseria_1988} and CC with singles, doubles, triples, and quadruples (CCSDTQ), \cite{Oliphant_1991,Kucharski_1992} with corresponding computational scalings of $\order*{\Norb^{6}}$, $\order*{\Norb^{8}}$, and $\order*{\Norb^{10}}$, respectively (where $\Norb$ denotes the number of orbitals).
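Written out explicitly, and with the labels $\Psi_\text{CI}$ and $\Psi_\text{CC}$ introduced here purely for illustration, the two ans\"atze described above read
\begin{equation}
	\ket*{\Psi_\text{CI}} = \sum_{I} c_I \ket*{I},
	\qquad
	\ket*{\Psi_\text{CC}} = e^{\Hat{T}} \ket*{\Psi_0},
\end{equation}
so that CCSD, CCSDT, and CCSDTQ correspond to truncating the cluster operator as $\Hat{T} \approx \Hat{T}_1 + \Hat{T}_2$, $\Hat{T} \approx \Hat{T}_1 + \Hat{T}_2 + \Hat{T}_3$, and $\Hat{T} \approx \Hat{T}_1 + \Hat{T}_2 + \Hat{T}_3 + \Hat{T}_4$, respectively.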
Parallel to the ``complete'' CC series presented above, an alternative series of approximate iterative CC models has been developed by the Aarhus group in the context of CC response theory, \cite{Christiansen_1998} where one skips the most expensive terms and avoids the storage of the higher-excitation amplitudes: CC2, \cite{Christiansen_1995a} CC3, \cite{Christiansen_1995b,Koch_1997} and CC4. \cite{Kallay_2005,Matthews_2021}
These iterative methods scale as $\order*{\Norb^{5}}$, $\order*{\Norb^{7}}$, and $\order*{\Norb^{9}}$, respectively, and can be seen as cheaper approximations of CCSD, CCSDT, and CCSDTQ.
A similar systematic truncation strategy can be applied to CI methods leading to the well-established family of methods known as CISD, CISDT, CISDTQ, \ldots~where one systematically increases the maximum excitation degree of the determinants taken into account.
Except for full CI (FCI) where all determinants from the Hilbert space (\ie, with excitation degree up to $\Nel$) are considered, truncated CI methods are variational but lack size-consistency.
Because the non-variationality of truncated CC methods is less of an issue than the size-inconsistency of truncated CI methods, the former have naturally overshadowed the latter in the electronic structure landscape.
Recently, however, selected CI (SCI) methods, \cite{Booth_2009,Giner_2013,Evangelista_2014,Giner_2015,Holmes_2016,Tubman_2016,Liu_2016,Ohtsuka_2017,Zimmerman_2017,Coe_2018,Garniron_2018} where one iteratively selects the energetically relevant determinants from the FCI space (usually) based on a perturbative criterion, have been shown to be highly successful at producing reference energies for ground and excited states in small- and medium-size molecules \cite{Holmes_2017,Li_2018,Li_2020,Loos_2018a,Chien_2018,Loos_2019,Loos_2020b,Loos_2020c,Loos_2020e,Garniron_2019,Eriksen_2020,Yao_2020,Veril_2021,Loos_2021} thanks to efficient deterministic, stochastic, or hybrid algorithms well suited for massive parallelization.
SCI methods are based on a well-known fact: amongst the very large number of determinants belonging to the FCI space, only a relatively small fraction contributes significantly to the energy.
Accordingly, the SCI+PT2 family of methods performs a sparse exploration of the FCI space by selecting iteratively only the most energetically relevant determinants of the variational space and supplementing it with a second-order perturbative correction (PT2). \cite{Huron_1973,Garniron_2017,Sharma_2017,Garniron_2018,Garniron_2019}
Although the formal scaling of such algorithms remains exponential, the prefactor is greatly reduced, which explains their current attractiveness in the electronic structure community and their much wider applicability compared to their standard FCI parent.
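As a rough illustration of this exponential growth, one can count determinants while ignoring spin and spatial symmetries (a simplifying assumption made here).
The symbols $N_\alpha$ and $N_\beta$, introduced for this example only, denote the numbers of spin-up and spin-down correlated electrons.
For frozen-core benzene in the cc-pVDZ basis (30 correlated electrons in 108 orbitals), this simple count gives
\begin{equation}
	\binom{\Norb}{N_\alpha} \binom{\Norb}{N_\beta} = \binom{108}{15}^2 \approx 8 \times 10^{35},
\end{equation}
which is of the order of $10^{36}$.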
Note that, very recently, several groups \cite{Aroeira_2021,Lee_2021,Magoulas_2021} have coupled CC and SCI methods via the externally-corrected CC methodology, \cite{Paldus_2017} showing promising performance for weakly and strongly correlated systems.
A rather different strategy to reach the holy grail FCI limit is to resort to M{\o}ller-Plesset (MP) perturbation theory, \cite{Moller_1934}
whose popularity originates from its black-box nature, size-extensivity, and relatively low computational scaling, making it easily applicable to a broad range of molecular systems.
The second-order M{\o}ller-Plesset (MP2) method \cite{Moller_1934} [which scales as $\order*{\Norb^{5}}$] has been broadly adopted in quantum chemistry for several decades, and is now included in the increasingly popular double-hybrid functionals \cite{Grimme_2006} alongside exact HF exchange.
Its higher-order variants [MP3, MP4, \cite{Krishnan_1980} MP5, \cite{Kucharski_1989} and MP6, \cite{He_1996a,He_1996b} which scale as $\order*{\Norb^{6}}$, $\order*{\Norb^{7}}$, $\order*{\Norb^{8}}$, and $\order*{\Norb^{9}}$, respectively] have been investigated much more scarcely.
However, it is now widely recognised that the series of MP approximations might show erratic, slow, or divergent behavior that limits its applicability and systematic improvability. \cite{Laidig_1985,Knowles_1985,Handy_1985,Gill_1986,Laidig_1987,Nobes_1987,Gill_1988,Gill_1988a,Lepetit_1988,Malrieu_2003}
Again, MP perturbation theory and CC methods can be coupled.
The CCSD(T) method, \cite{Raghavachari_1989} known as the gold-standard of quantum chemistry for weakly correlated systems, is probably the most iconic example of such coupling.
Motivated by the recent blind test of Eriksen \textit{et al.}\cite{Eriksen_2020}~reporting the performance of a large panel of emerging electronic structure methods [the many-body expansion FCI (MBE-FCI), \cite{Eriksen_2017,Eriksen_2018,Eriksen_2019a,Eriksen_2019b} adaptive sampling CI (ASCI), \cite{Tubman_2016,Tubman_2018,Tubman_2020} iterative CI (iCI), \cite{Liu_2014,Liu_2016,Lei_2017,Zhang_2020} semistochastic heat-bath CI (SHCI), \cite{Holmes_2016,Holmes_2017,Sharma_2017} the full coupled-cluster reduction (FCCR), \cite{Xu_2018,Xu_2020} density-matrix renormalization group (DMRG), \cite{White_1992,White_1993,Chan_2011} adaptive-shift FCI quantum Monte Carlo (AS-FCIQMC), \cite{Booth_2009,Cleland_2010,Ghanem_2019} and cluster-analysis-driven FCIQMC (CAD-FCIQMC) \cite{Deustua_2017,Deustua_2018}] on the non-relativistic frozen-core correlation energy of the benzene molecule in the standard correlation-consistent double-$\zeta$ Dunning basis set (cc-pVDZ), some of us have recently investigated the performance of the \textit{Configuration Interaction using a Perturbative Selection made Iteratively} (CIPSI) method \cite{Huron_1973,Giner_2013,Giner_2015,Garniron_2018,Garniron_2019} on the very same system \cite{Loos_2020e} [see also Ref.~\onlinecite{Lee_2020} for a study of the performance of phaseless auxiliary-field quantum Monte Carlo (ph-AFQMC) \cite{Motta_2018}].
Continuing this recent work, we report here a significant extension by estimating the (frozen-core) FCI/cc-pVDZ correlation energy of twelve cyclic molecules (cyclopentadiene, furan, imidazole, pyrrole, thiophene, benzene, pyrazine, pyridazine, pyridine, pyrimidine, tetrazine, and triazine) with the help of CIPSI employing energetically-optimized orbitals at the same level of theory. \cite{Yao_2020,Yao_2021}
In particular, we study i) the MP perturbation series up to fifth order (MP2, MP3, MP4, and MP5), ii) the CC2, CC3, and CC4 approximate series, and iii) the ``complete'' CC series up to quadruples (\ie, CCSD, CCSDT, and CCSDTQ).
The geometries of the twelve systems considered in the present study have all been obtained at the CC3/aug-cc-pVTZ level of theory and have been extracted from a previous study. \cite{Loos_2020a}
Note that, for the sake of consistency, the geometry of benzene considered here is different from that of Ref.~\onlinecite{Loos_2020e}, which has been computed at a lower level of theory [MP2/6-31G(d)]. \cite{Schreiber_2008}
The MP2, MP3, MP4, CC2, CC3, CC4, CCSD, CCSDT, and CCSDTQ calculations have been performed with CFOUR, \cite{Matthews_2020} while the CCSD(T) and MP5 calculations have been performed with Gaussian 09. \cite{g09}
The CIPSI calculations have been performed with {\QP}. \cite{Garniron_2019}
In the current implementation, the selection step and the PT2 correction are computed simultaneously via a hybrid semistochastic algorithm \cite{Garniron_2017,Garniron_2019} (which explains the statistical error associated with the PT2 correction in the following).
Here, we employ the renormalized version of the PT2 correction, which has been recently implemented and tested for a more efficient extrapolation to the FCI limit thanks to a partial resummation of the higher orders of perturbation. \cite{Garniron_2019}
We refer the interested reader to Ref.~\onlinecite{Garniron_2019} where one can find all the details regarding the implementation of the PT2 correction and the CIPSI algorithm.
Although the FCI energy has the enjoyable property of being independent of the set of one-electron orbitals used to construct the many-electron Slater determinants, the convergence properties of CIPSI, as a truncated CI method, strongly depend on this orbital choice.
In the present study, we investigate the convergence behavior of the CIPSI energy for two sets of orbitals in particular: natural orbitals (NOs) and optimized orbitals (OOs).
Following our usual procedure, \cite{Scemama_2018,Scemama_2018b,Scemama_2019,Loos_2018a,Loos_2019,Loos_2020a,Loos_2020b,Loos_2020c,Loos_2020e} we first perform a preliminary SCI calculation using HF orbitals in order to generate an SCI wave function with at least $10^7$ determinants.
Natural orbitals are computed based on this wave function and they are used to perform a new CIPSI run.
Successive orbital optimizations are then performed, which consist in minimizing the variational CIPSI energy at each iteration up to approximately $2\times10^5$ determinants.
When convergence is achieved in terms of orbital optimization, as our ``production'' run, we perform a new CIPSI calculation from scratch using this set of optimized orbitals.
In some cases, we also explore the use of localized orbitals (LOs) which are produced with the help of the Boys-Foster localization procedure \cite{Boys_1960} that we apply to the natural orbitals in several orbital windows in order to preserve a strict $\sigma$-$\pi$ separation in the planar systems considered here.
Because they take advantage of the local character of electron correlation, localized orbitals have been shown to provide faster convergence towards the FCI limit compared to natural orbitals. \cite{Angeli_2003,Angeli_2009,BenAmor_2011,Suaud_2017,Chien_2018,Eriksen_2020,Loos_2020e}
As we shall see below, employing optimized orbitals has the advantage of producing an even smoother and faster convergence of the SCI energy toward the FCI limit.
Note that, unlike excited-state calculations where it is important to enforce that the wave functions are eigenfunctions of the $\Hat{S}^2$ spin operator, \cite{Chilkuri_2021} the present wave functions do not fulfil this property as we aim for the lowest possible energy of a singlet state.
We have found that $\expval*{\Hat{S}^2}$ is, nonetheless, very close to zero for each system.
At the $k$th CIPSI iteration,
\begin{equation}
\label{eq:Psivar}
	\ket*{\Psi_\text{var}^{(k)}} = \sum_{I \in \mathcal{I}_k} c_I^{(k)} \ket*{I}
\end{equation}
is the variational wave function, $\mathcal{I}_k$ is the set of internal determinants $\ket*{I}$ and $\mathcal{A}_k$ is the set of external determinants $\ket*{\alpha}$ which do not belong to the variational space but are linked to it via a nonzero matrix element, \ie, $\mel*{\Psi_\text{var}^{(k)}}{\Hat{H}}{\alpha}\neq0$.
In practice, $E_\text{var}^{(k)}$ is computed by diagonalizing the $\Ndet^{(k)}\times\Ndet^{(k)}$ CI matrix $\bH$ with elements $H_{IJ}=\mel{I}{\hH}{J}$ via Davidson's algorithm \cite{Davidson_1975} and the magnitude of $E_\text{PT2}$ provides a qualitative idea of the ``distance'' to the FCI limit. \cite{Garniron_2018}
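For orientation, these two quantities can be written out in the form commonly used for CIPSI (see Ref.~\onlinecite{Garniron_2019}).
The expressions below are a sketch that assumes an Epstein-Nesbet partitioning and real-valued wave functions:
\begin{equation}
	E_\text{var}^{(k)} = \frac{\mel*{\Psi_\text{var}^{(k)}}{\Hat{H}}{\Psi_\text{var}^{(k)}}}{\braket*{\Psi_\text{var}^{(k)}}{\Psi_\text{var}^{(k)}}},
	\qquad
	E_\text{PT2}^{(k)} = \sum_{\alpha \in \mathcal{A}_k} \frac{\mel*{\Psi_\text{var}^{(k)}}{\Hat{H}}{\alpha}^2}{E_\text{var}^{(k)} - \mel*{\alpha}{\Hat{H}}{\alpha}}.
\end{equation}
As the variational space grows, $E_\text{PT2}^{(k)}$ tends to zero and the sum $E_\text{var}^{(k)} + E_\text{PT2}^{(k)}$ approaches the FCI energy.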
Orbital optimization techniques at the SCI level are theoretically straightforward, but practically challenging. \cite{Yao_2020,Yao_2021}
Here, we detail our orbital optimization procedure within the CIPSI algorithm and we assume that the variational wave function is normalized, \ie, $\braket*{\Psi_\text{var}}{\Psi_\text{var}}=1$.
We write the variational energy as $E_\text{var}(\bc,\bX)$, where $\bc$ gathers the CI coefficients, $\bX$ gathers the orbital rotation parameters, and $\hX$ is the corresponding one-electron anti-Hermitian operator, which creates a rotation matrix when exponentiated, \ie, $\bR= e^{\bX}$.
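That the exponential of an anti-Hermitian matrix is indeed a rotation (unitary) matrix follows in one line.
Assuming only that $\bX^\dagger = -\bX$, and writing $\bm{1}$ for the identity matrix (a notation introduced here),
\begin{equation}
	\bR^\dagger \bR = e^{\bX^\dagger} e^{\bX} = e^{-\bX} e^{\bX} = \bm{1},
\end{equation}
so that the rotated orbitals remain orthonormal for any value of the rotation parameters.
For real orbitals, $\bX$ is antisymmetric and $\bR$ is orthogonal.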
From a more general point of view, the variational energy $E_\text{var}$ depends both on the CI coefficients $\{ c_I \}_{1\le I \le\Ndet^{(k)}}$ [see Eq.~\eqref{eq:Psivar}] and on the orbital rotation parameters $\{X_{pq}\}_{1\le p,q \le\Norb}$.
Although one could use a second-order method to minimize the energy with respect to both sets of parameters simultaneously, one has to realize that the CI space is much larger than the orbital space.
It is therefore more appropriate to perform a minimization of the variational energy with respect to the orbital rotation parameters and then compute the new CI coefficients by re-diagonalizing the CI matrix.
To do so, we need the first and second derivatives of the energy with respect to the orbital rotations. In principle, we would also need the coupling between these orbital rotations and the changes in the CI coefficients but, as we will see later, this last point is problematic.
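To make the role of these derivatives explicit, here is a minimal sketch of a generic second-order (Newton) update, which is not necessarily the exact scheme used in the present implementation.
Denoting by $\bm{x}$ the vector of orbital rotation parameters, and by $\bm{g}$ and $\bm{H}$ the gradient and Hessian of the energy at $\bm{x}=\bm{0}$ (the superscript $\top$ denotes the transpose), the energy is expanded to second order as
\begin{equation}
	E(\bm{x}) \approx E(\bm{0}) + \bm{g}^{\top} \bm{x} + \frac{1}{2} \bm{x}^{\top} \bm{H} \bm{x},
\end{equation}
and its stationary point defines the Newton step
\begin{equation}
	\bm{x} = - \bm{H}^{-1} \bm{g}.
\end{equation}
This step corresponds to a minimum of the quadratic model only if $\bm{H}$ is positive definite; in practice, such updates are usually safeguarded, for example with a trust region or a level shift.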
%with $\phi_k$ the $k$-th molecular orbital defined as a linear combination of the atomic orbitals, $\br$ the spatial coordinates of the electron and $\hat{h}$: the one-electron part of the Hamiltonian.
Differentiating this variational energy $E$ then yields the gradient $\bm{g}$ and the Hessian $\bm{H}$. \cite{Henderson2014Dec} The gradient of the energy with respect to the orbital rotations around $\bm{x}=0$ has elements $g_{pq} = \left. \partial E / \partial x_{pq} \right|_{\bm{x}=0}$.
\caption{Total energy $E$ (in \Eh) and correlation energy $\Delta E$ (in \mEh) for the frozen-core ground state of five-membered rings in the cc-pVDZ basis set.}
\caption{Total energy $E$ (in \Eh) and correlation energy $\Delta E$ (in \mEh) for the frozen-core ground state of six-membered rings in the cc-pVDZ basis set.}
\caption{$\Delta E_\text{var}$ (solid) and $\Delta E_\text{var}+ E_\text{PT2}$ (dashed) as functions of the number of determinants $\Ndet$ in the variational space for the twelve cyclic molecules represented in Fig.~\ref{fig:mol}.
Two sets of orbitals are considered: natural orbitals (NOs, in red) and optimized orbitals (OOs, in blue).
The CCSDTQ correlation energy is represented as a thick black line.}
This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (Grant agreement No.~863481).
The data that support the findings of this study are openly available in Zenodo at \href{http://doi.org/XX.XXXX/zenodo.XXXXXXX}{http://doi.org/XX.XXXX/zenodo.XXXXXXX}.