In electronic structure theory, configuration interaction (CI) methods allow for a systematic way to obtain approximate or exact solutions of the electronic Hamiltonian,
by expanding the wave function as a linear combination of Slater determinants (or configuration state functions).
At the full CI (FCI) level, the complete Hilbert space is spanned in the wave function expansion, leading to the exact solution for a given one-particle basis set.
Except for very small systems, the FCI limit is unattainable, and in practice the expansion of the CI wave function must be truncated.
The question is then how to construct an effective and computationally tractable hierarchy of truncated CI methods
that best recover the correlation energy, understood as the energy difference between the FCI and the mean-field restricted Hartree-Fock (HF) solutions.
%that lead as fast as possible to the FCI limit.
The best-known and most popular class of CI methods is excitation-based,
where one accounts for all determinants generated by exciting up to $d$ electrons from a given closed-shell reference, which is usually the restricted HF solution, although it does not have to be.
In this way, the excitation degree $d$ defines the hierarchy:
CI with single excitations (CIS), CI with single and double excitations (CISD), CI with single, double, and triple excitations (CISDT), and so on.
% scaling is based on the excitation degree $d$.
The excitation-based CI hierarchy manages to quickly recover weak (dynamic) correlation effects, but struggles in strong (static) correlation regimes.
Importantly, the number of determinants $N_{det}$ (which controls the computational cost) scales polynomially with the number of electrons $N$ as $N^{2d}$.
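As a rough illustration of this scaling, one may count the $d$-fold excitations directly. Denoting by $O$ and $V$ the numbers of occupied and virtual spin orbitals (symbols introduced here only for this estimate, and assumed to grow linearly with $N$), and disregarding spin and spatial symmetry restrictions, the number of determinants of excitation degree $d$ is approximately
\begin{equation}
{O \choose d}{V \choose d} \sim \frac{O^d V^d}{(d!)^2} \propto N^{2d}, \qquad d \ll O, V,
\end{equation}
so that the highest excitation degree retained dominates the size of the truncated expansion.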
%This means that the contribution of higher excitations become progressively smaller.
%In turn, seniority-based CI is specially targeted to describe static correlation.
The seniority-zero ($\Omega=0$) sector has been shown to be the most important for static correlation, while higher sectors tend to contribute progressively less~\cite{Bytautas_2011,Bytautas_2015,Alcoba_2014b,Alcoba_2014}.
However, already at the CI$\Omega$0 level the number of determinants scales exponentially with $N$, since excitations of all excitation degrees $d$ are included.
Therefore, despite the encouraging successes of seniority-based CI methods, their unfavourable computational scaling restricts applications to very small systems.
When targeting static correlation, seniority-based CI methods tend to perform better than excitation-based CI, despite the higher computational cost.
The latter class of methods, in contrast, is well suited for recovering dynamic correlation, and only at polynomial cost with system size.
% tackling
Ideally, we aim for a method that captures most of both static and dynamic correlation, with as few determinants as possible.
With this goal in mind, we propose a new partitioning of the Hilbert space, named configuration interaction order (CIo).
It combines both the excitation degree $d$ and the seniority number $\Omega$ into one single parameter $o$ (order),
\begin{equation}
\label{eq:o}
o = \frac{d+\Omega/2}{2},
\end{equation}
which assumes integer or half-integer values.
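As an illustration of Eq.~\ref{eq:o}, and assuming a closed-shell reference (for which $\Omega$ is even and a single excitation necessarily has $\Omega=2$), the lowest excitation-seniority sectors would be assigned the following orders:
\begin{equation}
\begin{array}{ll}
(d,\Omega)=(1,2),\,(2,0): & o = 1, \\
(d,\Omega)=(2,2): & o = 3/2, \\
(d,\Omega)=(2,4),\,(3,2),\,(4,0): & o = 2.
\end{array}
\end{equation}
For instance, a paired double excitation ($d=2$, $\Omega=0$) enters at the same order as the single excitations, since $(2+0)/2=(1+1)/2=1$.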
% open-shell
Fig.~\ref{fig:allCI} shows how the Hilbert space is populated in excitation-based CI, seniority-based CI, and our hybrid CIo methods.
\caption{Partitioning of the full Hilbert space into blocks of specific excitation degree $d$ (with respect to a closed-shell determinant) and seniority number $\Omega$.
Each of the three classes of CI methods truncates this $d$-$\Omega$ map differently, and each color tone represents the determinants added at a given CI level.}
We know that low-degree excitations and low-seniority sectors, when looked at individually, often make the most important contributions to the FCI expansion.
%carry the most important weights.
By combining $d$ and $\Omega$ as in Eq.~\ref{eq:o}, we ensure that both directions in the excitation-seniority map (see Fig.~\ref{fig:allCI}) are covered.
Rather than filling the map top-bottom (as in excitation-based CI) or left-right (as in seniority-based CI), the CIo hierarchy fills it diagonally.
In this sense, we hope to recover dynamic correlation by moving right in the map (increasing the excitation degree while keeping a low seniority number),
at the same time as static correlation, by moving down (increasing the seniority number while keeping a low excitation degree).
%dynamic correlation is recovered with traditional CI.
The second justification is computational.
%Fig.~\ref{fig:scaling} also illustrates how the number of determinants within each block scales with the number of occupied orbitals $O$ and the number of virtual orbitals $V$.
In the CIo class of methods, each next level of theory accommodates additional determinants from different excitation-seniority sectors (each block of Fig.~\ref{fig:allCI}).
The key realization of the CIo hierarchy is that the number of additional determinants has the same scaling with respect to $N$ for all excitation-seniority sectors entering at a given order $o$.
%to $O$ and $V$, for all excitation-seniority sectors of a given order $o$.
%This computational realization represents the second justification for the introduction of the CIo method.
This further justifies defining the parameter $o$ as the simple average of $d$ and $\Omega/2$.
%the number of determinants of CIo2 and CISD scale as $O^2V^2$, those of CIo3 and CISDT scale as $O^3V^3$, and so on.
From this computational perspective, the CIo hierarchy can be seen as a more natural choice than the traditional excitation-based CI,
because if one can afford, say, the $N_{det}\sim N^6$ cost of a CISDT calculation, then one can probably afford a CIo3 calculation, which has the same computational scaling.
The CIo1 counterpart already represents a minimally correlated model, with the very favourable $N_{det}\sim N^2$ scaling.
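The scalings quoted above for CIo1 and CIo3 are consistent with a simple power law in the order. As a sketch only, and under the assumption (inferred here from the quoted scalings) that each sector entering at order $o$ contributes a number of determinants growing as $N^{d+\Omega/2}$, one would have
\begin{equation}
N_{det}(\mathrm{CI}o) \sim N^{2o},
\end{equation}
which gives $N^2$ for CIo1 and $N^6$ for CIo3, the latter matching the $N^{2d}$ scaling of CISDT ($d=3$). Under the same assumption, the half-integer levels would fall in between, for example $N^5$ for CIo2.5.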
%number of determinants scaling only as $OV$.
%
In addition, CIo allows for half-integer orders $o$, with no parallel in excitation-based CI.
This gives extra flexibility in terms of choice of method.
%when evaluating the computational cost and desired accuracy of a calculation.
For a particular application with excitation-based CI, CISD might be too inaccurate, for example, while the price for the improved accuracy of CISDT might be too high.
With the CIo hierarchy, CIo2.5 represents an alternative, being more accurate than CIo2 and less expensive than CIo3.
Finally, the third justification for our CIo method is empirical and closely related to the computational motivation.
There are many possible ways to populate the Hilbert space starting from a given reference determinant,
and one can in principle formulate any systematic recipe that includes progressively more determinants.
Besides a physical or computational perspective, the question of what makes for a good recipe can be framed empirically.
Does our CIo class of methods perform better than excitation-based or seniority-based CI,
in the sense of recovering most of the correlation energy with the least computational effort?
Bytautas et al.~\cite{Bytautas_2015} explored a different hybrid scheme, combining determinants from a complete active space with a maximum seniority number.
First, the CIo hierarchy is defined by a single parameter that unifies excitation degree and seniority number (Eq.~\ref{eq:o}).
Second, each next level includes all classes of determinants sharing the same scaling with system size, as discussed above, thus keeping the method at a polynomial scaling.
The CIo method was implemented in {\QP} via a straightforward adaptation of the \textit{configuration interaction using a perturbative selection made iteratively} (CIPSI) algorithm \cite{Huron_1973,Giner_2013,Giner_2015},
by allowing only for determinants at a given order $o$.
In practice, the CI energy is converged (within a chosen threshold of) with considerably fewer determinants than the formal number of determinants at a given $o$.
The traditional excitation-based CI and the FCI calculations presented here were also performed with the CIPSI algorithm implemented in {\QP}. \cite{Huron_1973,Giner_2013,Giner_2015,Garniron_2019}
The CI calculations were performed with both canonical Hartree-Fock (HF) orbitals and optimized orbitals (oo).
In the latter case, the energy is obtained variationally in both the CI space and the orbital parameter space, giving rise to what may be called an orbital-optimized CI (ooCI) method.
%We have also performed orbital optimized CI (ooCI) calculations, where the energy is obtained variationally both in the CI space and in the orbital parameter space.
%
We employed the algorithm described elsewhere \cite{Damour_2021} and also implemented in {\QP} for optimizing the orbitals within a CI wave function.
In order to avoid converging to a saddle point solution, we employed a strategy similar to that recently described in Ref.~\cite{Hollett_2022}.
Namely, whenever an eigenvalue of the orbital rotation Hessian is negative and the corresponding gradient component $g_i$ lies below a given threshold $g_0$,
this gradient component is replaced by $g_0 |g_i|/g_i$.
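Written out explicitly, and under the assumptions that the gradient is expressed in the eigenbasis of the orbital rotation Hessian, that $\lambda_i$ (a symbol introduced here) denotes the eigenvalue associated with $g_i$, and that ``lies below'' refers to the magnitude $|g_i|$, the modified gradient component $\tilde{g}_i$ would read
\begin{equation}
\tilde{g}_i = \left\{
\begin{array}{ll}
g_0 \, |g_i|/g_i, & \lambda_i < 0 \ \mathrm{and} \ |g_i| < g_0, \\
g_i, & \mathrm{otherwise}.
\end{array}
\right.
\end{equation}
Since $|g_i|/g_i = \pm 1$, the replacement keeps the sign of $g_i$ while raising its magnitude to $g_0$, so that a finite step is taken along directions of negative curvature even when the gradient nearly vanishes there.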
While we can never ensure that the obtained solutions are global minima in the orbital parameter space, we verified that in all cases surveyed here, the stationary solutions are true minima (rather than maxima or saddle points).
Here we assess the performance of the hierarchy of CIo methods against its excitation-based and seniority-based counterparts.
To do so, we calculated the potential energy curves (PECs) for a total of six systems:
\ce{HF}, \ce{F2}, \ce{N2},
%\ce{Be2}, \ce{H2O},
ethylene, \ce{H4}, and \ce{H8}.
For the latter two, we considered linearly arranged and equally spaced hydrogen atoms, and computed PECs along the symmetric dissociation coordinates.
%For \ce{H2O}, we considered the symmetric stretching of the O$-$H bonds,
For ethylene, we considered the C$=$C double bond stretching, while freezing the remaining internal coordinates.
%in both cases freezing the remaining internal coordinates.
All CI calculations were performed with the cc-pVDZ basis set and with frozen core orbitals.
From the PECs, we have also extracted the equilibrium geometries and vibrational frequencies (details can be found in the SI).
It is worth mentioning that obtaining smooth PECs for the orbital-optimized calculations was far from trivial.
First, the orbital optimization was started from the HF orbitals at each geometry.
This usually led to discontinuous PECs, meaning that distinct solutions of the orbital optimization had been found with our algorithm.
Then, at the geometry or geometries that seemed to present the lowest-lying solution,
the optimized orbitals were employed as the guess orbitals for the neighbouring geometries, and so on, until a new PEC was obtained.
%orthonormalized
This protocol was repeated until the PEC built from the lowest-lying orbital-optimized solutions became continuous.
While we cannot guarantee that the presented solutions represent the global minima, we believe that in most cases the above protocol provides solutions that are at least close to them.
%Multiple solutions for the orbital optimization are usually found, meaning several local minimal in the orbital parameter landscape.
%meaning that the set of orbitals are stationary with respect to the energy.
We recall that saddle point solutions were purposely avoided in our orbital optimization algorithm. Were that not the case, even more stationary solutions would have been found.
%\subsection{Nonparallelity errors and dissociation energies}
\subsection{Nonparallelity errors}
In Fig.~\ref{fig:plot_stat} we present, for the six systems studied and for the three classes of CI methods,
the nonparallelity error (NPE) as a function of the formal number of determinants.
%the potential energy curves (PECs) and the corresponding differences with respect to the FCI result, as well as the nonparalellity error (NPE) and the distance error.
The NPE is defined as the difference between the maximum and the minimum deviations of the potential energy curve (PEC) obtained at a given CI level from the exact FCI result.
This is an important metric because it captures the resemblance between the shapes of the two PECs, which in turn determine the relevant physical observables, such as equilibrium geometries, vibrational frequencies, and dissociation energies.
The corresponding PECs and the energy differences with respect to the FCI results can be found in the SI.
%
In the SI we further present the distance error, defined as the sum of the maximum and the minimum differences between a given PEC and the FCI result.
Thus, while the NPE probes the similarity regarding the shape of the PECs, the distance error provides a measure of how their magnitudes compare.
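For concreteness, the two metrics can be written compactly. Denoting by $\Delta E(R) = E_{\mathrm{CI}}(R) - E_{\mathrm{FCI}}(R)$ the energy difference at geometry $R$ (notation introduced here for illustration only), the definitions above amount to
\begin{equation}
\mathrm{NPE} = \max_R \Delta E(R) - \min_R \Delta E(R),
\qquad
\mathrm{distance\ error} = \max_R \Delta E(R) + \min_R \Delta E(R).
\end{equation}
As a hypothetical example, a PEC lying uniformly 10 m$E_h$ above the FCI curve would have a vanishing NPE but a distance error of 20 m$E_h$.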
\fk{in progress...}
%%% FIG 2 %%%
\begin{figure}[h!]
\includegraphics[width=\linewidth]{plot_stat}
\caption{Nonparallelity errors as a function of the number of determinants, for the three classes of CI methods: seniority-based CI (blue), excitation-based CI (red), and our proposed hybrid CIo (green).
}
\label{fig:plot_stat}
\end{figure}
%%% %%% %%%
%We start by discussing the dissociation of \ce{F2}, which involves a single bond breaking.
%Now moving to a more challenging problem, the dissociation of \ce{N2}, where three bonds are broken.
%For different CI approaches, Fig.~\ref{fig:N2_pes} shows PECs and their differences with respect to FCI, as well as the NPE and distance errors.
%The associated differences with respect to the FCI result can be seen in the Supporting Information.
The main result contained in Fig.~\ref{fig:plot_stat} concerns the overall faster convergence of the CIo methods when compared to excitation-based and seniority-based CI methods.
This is observed both for single bond breaking (\ce{HF} and \ce{F2}) and for the more challenging double (ethylene) and triple (\ce{N2}) bond breaking.
The convergence with respect to the number of determinants is slower in the latter cases, irrespective of the class of CI methods, as would be expected.
More importantly, the superiority of the CIo methods appears to be more pronounced in the multiple-bond-breaking systems.
%Unless stated otherwise, from here on the performance of each method is probed by their NPE. Later we discuss other metrics.
For these four systems (more so for ethylene and \ce{N2}), CIo2 is better than CISD, two methods where the number of determinants scales as $N^4$.
CIo2.5 is better than CISDT, despite its lower computational cost, whereas CIo3 is much better than CISDT, and comparable in accuracy with CISDTQ.
Inspection of the PECs (see SI) reveals that the lower NPE in the CIo results stems mostly from the contribution of the dissociation region.
This result demonstrates the importance of higher-order excitations with low seniority number in this strong correlation regime, which are accounted for in CIo but not in excitation-based CI (for a given scaling of the number of determinants).
%The situation at the Franck-Condon region will be discussed later.
Meanwhile, the first level of seniority-based CI (CI$\Omega$0, which is the same as doubly-occupied CI) tends to offer a rather low NPE when compared to the other CI methods with a similar number of determinants (CIo2.5 and CISDT).
However, convergence is clearly slower for the next levels in this hierarchy (CI$\Omega$2 and CI$\Omega$4), while excitation-based CI and especially CIo methods converge faster.
%
For the symmetric dissociation of linear \ce{H4} and \ce{H8}, the performances of CIo and excitation-based CI are similar, both being superior to seniority-based CI.
It is worth mentioning the surprisingly good performance of the CIo1 and CIo1.5 methods.
For \ce{HF}, \ce{F2}, and ethylene, they present lower NPEs than the much more expensive CISDT method, whereas for \ce{N2} their NPEs are slightly higher.
For the same systems we also see the NPEs increase from CIo1.5 to CIo2, and decrease to lower values only at the CIo3 level.
Neither finding is observed for \ce{H4} and \ce{H8}.
It seems that both the relative success of the CIo1 and CIo1.5 methods and the relative worsening of the CIo2 method decrease as progressively more bonds are broken (compare for instance \ce{F2}, \ce{N2}, and \ce{H8} in Fig.~\ref{fig:plot_stat}).
This is because
\fk{in progress...}
Even so, it is important to remember that the CIo2 method remains superior to its excitation-based counterpart.
%Whereas in excitation-based CI, the NPE always decrease as one moves to higher orders,
\subsection{Equilibrium geometries and vibrational frequencies}
In Figs.~\ref{fig:xe} and \ref{fig:freq}, we present the convergence of the equilibrium geometries and vibrational frequencies, respectively,
with respect to the number of determinants, for the three types of CI approaches.
%, vibrational frequencies, and dissociation energies,
%
For \ce{F2}, the CIo method has an overall better convergence than the excitation-based CI counterpart, and much better than seniority-based CI.
The values oscillate around the FCI limit in CIo, whereas the convergence is monotonic in the two CI alternatives.
Interestingly, CIo1 and especially CIo1.5, two methods with a modest computational cost, provide very accurate equilibrium geometries and vibrational frequencies.
Orbital optimization does not change the overall picture.
It does, however, lead to a more monotonic convergence in the case of CIo, which is not necessarily an advantage.
In particular, ooCIo1 and ooCIo1.5 are less accurate than their non-optimized counterparts.
%
For \ce{HF} (results in the Supporting Information),
CIo and excitation-based CI are comparable to each other and superior to seniority-based CI, at least for HF orbitals.
Orbital optimization significantly improves the case for seniority-based CI, and leads to slightly better convergence for CIo with respect to excitation-based CI.
In the case of \ce{N2}, CIo and excitation-based CI present similar convergence behaviours, both being superior to seniority-based CI.
Also, CIo is slightly better than excitation-based CI for HF orbitals, whereas both are equally good with orbital optimization.
%the advantages of CIo are less evident, though stil present.
%
%A somewhat better convergence is also observed in the case of ethylene (see SI).
The same conclusions hold for ethylene, \ce{H4}, and \ce{H8} (see SI).
Most of the time, the convergence of CIo either exceeds or is comparable to that of excitation-based CI.
%%% FIG 3 %%%
\begin{figure}[h!]
\includegraphics[width=\linewidth]{xe}
\caption{Equilibrium geometries as a function of the number of determinants, for the three classes of CI methods: seniority-based CI (blue), excitation-based CI (red), and our proposed hybrid CIo (green).
}
\label{fig:xe}
\end{figure}
%%% %%% %%%
%%% FIG 4 %%%
\begin{figure}[h!]
\includegraphics[width=\linewidth]{freq}
\caption{Vibrational frequencies (or force constants) as a function of the number of determinants, for the three classes of CI methods: seniority-based CI (blue), excitation-based CI (red), and our proposed hybrid CIo (green).
}
\label{fig:freq}
\end{figure}
In CIo, each next step in the hierarchy brings different blocks of determinants which share the same computational scaling with respect to the number of electrons.
One of our key findings is that the NPE decreases faster with our hybrid CIo method than with either excitation-based or seniority-based CI.
One important conclusion is that orbital optimization is not necessarily a recommended strategy, depending on the properties one is interested in.
While orbital optimization will certainly improve the energy at a particular geometry, the improvement may vary considerably with the geometry, which may or may not decrease the NPE.
One should also bear in mind that orbital optimization is always accompanied by well-known challenges (several solutions, convergence issues)
and may imply a significant computational burden (associated with the calculation of the orbital gradient and Hessian, and with the many iterations that are often required).
In this sense, stepping up in the CI hierarchy might be a more straightforward and possibly cheaper alternative than optimizing the orbitals.
One interesting possibility to explore is to first employ a low order CI method to optimize the orbitals, and then to employ this set of orbitals at a higher level of CI.
This work was performed using HPC resources from CALMIP (Toulouse) under allocation 2021-18005.
This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (Grant agreement No.~863481).
%The data that support the findings of this study are openly available in Zenodo at \href{http://doi.org/XX.XXXX/zenodo.XXXXXXX}{http://doi.org/XX.XXXX/zenodo.XXXXXXX}.