woring on Toto part

This commit is contained in:
Pierre-Francois Loos 2020-09-08 11:56:02 +02:00
parent 0f1a2632fd
commit a507931e79

View File

@ -15,8 +15,7 @@
% \documentclass[blind,alpha-refs]{wiley-article} % \documentclass[blind,alpha-refs]{wiley-article}
% Add additional packages here if required % Add additional packages here if required
\usepackage{siunitx} \usepackage{graphicx,dcolumn,bm,xcolor,microtype,multirow,amscd,amsmath,amssymb,amsfonts,physics,longtable,mhchem,siunitx}
\usepackage{mhchem}
% macros % macros
\newcommand{\ra}{\rightarrow} \newcommand{\ra}{\rightarrow}
@ -192,17 +191,18 @@ All excited-state calculations are performed, except when explicitly mentioned,
All the SCI calculations are performed within the FC approximation using QUANTUM PACKAGE \cite{Garniron_2019} where the CIPSI algorithm \cite{Huron_1973} is implemented. Details regarding this specific CIPSI implementation can be found in Refs.~\cite{Garniron_2019} and \cite{Scemama_2019}. All the SCI calculations are performed within the FC approximation using QUANTUM PACKAGE \cite{Garniron_2019} where the CIPSI algorithm \cite{Huron_1973} is implemented. Details regarding this specific CIPSI implementation can be found in Refs.~\cite{Garniron_2019} and \cite{Scemama_2019}.
A state-averaged formalism is employed, i.e., the ground and excited states are described with the same set of determinants, but different CI coefficients. A state-averaged formalism is employed, i.e., the ground and excited states are described with the same set of determinants, but different CI coefficients.
Our usual protocol \cite{Scemama_2018,Scemama_2018b,Scemama_2019,Loos_2018a,Loos_2019,Loos_2020a,Loos_2020b,Loos_2020c} consists of performing a preliminary SCI calculation using Hartree-Fock orbitals in order to generate a SCI wave function with at least $10^7$ determinants. Our usual protocol \cite{Scemama_2018,Scemama_2018b,Scemama_2019,Loos_2018a,Loos_2019,Loos_2020a,Loos_2020b,Loos_2020c} consists of performing a preliminary CIPSI calculation using Hartree-Fock orbitals in order to generate a CIPSI wave function with at least $10^7$ determinants.
Natural orbitals are then computed based on this wave function, and a new, larger SCI calculation is performed with this new set of orbitals. Natural orbitals are then computed based on this wave function, and a new, larger CIPSI calculation is performed with this new set of orbitals.
This has the advantage to produce a smoother and faster convergence of the SCI energy toward the FCI limit. This has the advantage to produce a smoother and faster convergence of the SCI energy toward the FCI limit.
The SCI energy is defined as the sum of the variational energy (computed via diagonalization of the CI matrix in the reference space) and a PT2 correction which estimates the contribution of the determinants not included in the CI space \cite{Garniron_2017b}. The CIPSI energy $E_\text{CIPSI}$ is defined as the sum of the variational energy $E_\text{var}$ (computed via diagonalization of the CI matrix in the reference space) and a PT2 correction $E_\text{PT2}$ which estimates the contribution of the determinants not included in the CI space \cite{Garniron_2017b}.
By linearly extrapolating this second-order correction to zero, one can efficiently estimate the FCI limit for the total energies, and hence, compute the corresponding transition energies. By linearly extrapolating this second-order correction to zero, one can efficiently estimate the FCI limit for the total energies.
These extrapolated total energies (simply labeled as $E_\text{FCI}$ in the remainder of the paper) are then used to compute vertical excitation energies.
Depending on the set, we estimated the extrapolation error via different techniques. Depending on the set, we estimated the extrapolation error via different techniques.
For example, in Ref.~\cite{Loos_2020b}, we estimated the extrapolation error by the difference between the transition energies obtained with the largest SCI wave function and the FCI extrapolated value. For example, in Ref.~\cite{Loos_2020b}, we estimated the extrapolation error by the difference between the transition energies obtained with the largest SCI wave function and the FCI extrapolated value.
This definitely cannot be viewed as a true error bar, but it provides a rough idea of the quality of the FCI extrapolation and estimate. This definitely cannot be viewed as a true error bar, but it provides a rough idea of the quality of the FCI extrapolation and estimate.
Below, we provide a much cleaner way of estimating the extrapolation error in SCI methods, and we adopt this scheme for the five- and six-membered rings. Below, we provide a much cleaner way of estimating the extrapolation error in SCI methods, and we adopt this scheme for the five- and six-membered rings.
The particularity of the current implementation is that the selection step and the PT2 correction are computed \textit{simultaneously} via a hybrid semistochastic algorithm \cite{Garniron_2017,Garniron_2019}. The particularity of the current implementation is that the selection step and the PT2 correction are computed \textit{simultaneously} via a hybrid semistochastic algorithm \cite{Garniron_2017,Garniron_2019}.
Moreover, a renormalized version of the PT2 correction has been recently implemented for a more efficient extrapolation to the FCI limit \cite{Garniron_2019}. Moreover, a renormalized version of the PT2 correction (dubbed rPT2) has been recently implemented for a more efficient extrapolation to the FCI limit \cite{Garniron_2019}.
We refer the interested reader to Ref.~\cite{Garniron_2019} where one can find all the details regarding the implementation of the CIPSI algorithm. We refer the interested reader to Ref.~\cite{Garniron_2019} where one can find all the details regarding the implementation of the CIPSI algorithm.
Note that, all our SCI wave functions are eigenfunctions of the $\Hat{S}^2$ spin operator which is, unlike ground-state calculations, paramount in the case of excited states \cite{Applencourt_2018}. Note that, all our SCI wave functions are eigenfunctions of the $\Hat{S}^2$ spin operator which is, unlike ground-state calculations, paramount in the case of excited states \cite{Applencourt_2018}.
@ -241,8 +241,81 @@ The definition of the active space considered for each system as well as the num
\subsubsection{Estimating the extrapolation error} \subsubsection{Estimating the extrapolation error}
%------------------------------------------------ %------------------------------------------------
\alert{Here comes Anthony's part on error bars in SCI methods.} For the $m$th excited states (where $m = 0$ corresponds to the ground state), we usually estimate its FCI energy by performing a linear extrapolation of its variational energy $E_\text{var}^{(m)}$ as a function of its rPT2 correction $E_{\text{rPT2}}^{(m)}$ as follows
\begin{equation}
E_\text{var}^{(m)} = E_{\text{FCI}}^{(m)} - \alpha^{(m)} E_{\text{rPT2}}^{(m)}
\end{equation}
$E_\text{var}^{(m)}$ varies almost linearly as a function of $E_{\text{rPT2}}^{(m)}$, but with a coefficient $\alpha^{(m)}$ which deviates slightly from unity in well-behaved cases.
This implies that at any iteration of the CIPSI algorithm, the estimated error on the CIPSI energy is
\begin{equation}
E_{\text{CIPSI}}^{(m)} - E_{\text{FCI}}^{(m)}
= \qty(E_\text{var}^{(m)}+E_{\text{rPT2}}^{(m)}) - E_{\text{FCI}}^{(m)}
= \qty(1-\alpha^{(m)}) E_{\text{rPT2}}^{(m)}
\end{equation}
For the large systems considered here, $\abs{E_{\text{rPT2}}} > 2$ eV.
Therefore, the accuracy of the excitation energy estimates will strongly depend on our ability to compensate the errors in the calculations.
Because our selection procedure ensures that the rPT2 values of both states match as well as possible (a trick known as PT2 matching \cite{Dash_2018,Dash_2019}), i.e., $E_{\text{rPT2}} = E_{\text{rPT2}}^{(0)} \approx E_{\text{rPT2}}^{(m)}$, the extrapolated excitation energy associated with the $m$th excited state can be estimated as
\begin{equation}
\begin{split}
\Delta E^{(m)}
& = E^{(m)}_{\text{CIPSI}} - E^{(0)}_{\text{CIPSI}}
\\
& = \qty[ E^{(m)} + E_{\text{rPT2}} + \qty(\alpha^{(n)}-1) E_{\text{rPT2}} ]
- \qty[ E^{(0)} + E_{\text{rPT2}} + \qty(\alpha^{(0)}-1) E_{\text{rPT2}} ]
+ \order{E_{\text{rPT2}}^2 }
\end{split}
\end{equation}
which evidences that the error on $\Delta E^{(m)}$ can be expressed as $\qty(\alpha^{(m)}-\alpha^{(0)}) E_{\text{rPT2}} + \order{E_{\text{rPT2}}^2}$.
Moreover, using a common set of state-averaged natural orbitals for the ground and excited states tends to make the values of $\alpha^{(0)}$ and $\alpha^{(m)}$ very close to each other, such that the error on the energy difference is practically of the order of $E_{\text{rPT2}}^2$.
At the $n$th CIPSI iteration, we have access to the variational energies of both states, $E^{(0)}(n)$ and $E^{(m)}(n)$, as well as their the rPT2 corrections, $E_{\text{rPT2}}^{(0)}(n)$ and $E_{\text{rPT2}}^{(m)}(n)$.
The $m$th excitation energy at iteration $n$ is then modeled as a Gaussian random variable with mean and variance
\begin{gather}
\Delta E^{(m)}(n) = \qty[ E^{(m)}(n) + E_{\text{rPT2}}^{(m)}(n) ] - \qty[ E^{(0)}(n) + E_{\text{rPT2}}^{(0)}(n) ]
\\
\sigma^2(n) \propto \qty[E_{\text{rPT2}}^{(m)}(n)]^2 + \qty[E_{\text{rPT2}}^{(0)}(n)]^2
\end{gather}
and we treat all CIPSI iterations as samples coming from the same Gaussian process with weights $w(n) = 1/\sqrt{\sigma^2(n)}$.
The confidence interval is chosen to be equivalent to what one
would obtain using $\pm 1$ standard deviation with Gaussian-distributed
variables ($\mathcal{G}$). In other words, we will search for an interval $\mathcal{I}$
such that the probability $P( \Delta E_{\text{FCI}} \in \mathcal{I})$
that the true value of the excitation energy lies within the interval is
equal to
$P( \Delta E_{\text{FCI}} \in [ \Delta E \pm \sigma ] \; | \; \mathcal{G}) = 0.6827$.
The probability that the FCI excitation energy is in an interval
$\mathcal{I}$ is
\begin{equation}
P( \Delta E_{\text{FCI}} \in \mathcal{I} ) = P( E_{\text{FCI}} \in I | \mathcal{G}) \times P(\mathcal{G})
\end{equation}
where the probability $P(\mathcal{G})$ that the random variables are
normally distributed can be deduced from the Jarque-Bera test $J$ as
\begin{equation}
P(\mathcal{G}) = 1 - \chi^2_{\text{CDF}}(J,2)
\end{equation}
where $\chi^2_{\text{CDF}}(x,k)$ is the cumulative distribution function of the
$\chi^2$ distribution with $k$ degrees of freedom.
As the number of samples is usually small, we use Student's $t$ distribution to
estimate the statistical error. The inverse of the cumulative
distribution function of the $t$ distribution will allow us to find how
to scale the interval with a parameter $\beta$ such that
$P( \Delta E_{\text{FCI}} \in [ \Delta E \pm \beta \sigma ] ) = p$.
\begin{equation}
%\beta = t_{\text{CDF}}^{-1} \left[
%\frac{1}{2} \left( 1 + \frac{P( \Delta E_{\text{FCI}} \in [ \Delta E \pm \sigma ] \; | \; \mathcal{G}) }{P(\mathcal{G})}\right), n \right]
\beta = t_{\text{CDF}}^{-1} \left[
\frac{1}{2} \left( 1 + \frac{0.6827}{P(\mathcal{G})}\right), n \right]
\end{equation}
Only the last $M>2$ computed energy differences are considered. $M$ is chosen
such that $P(\mathcal{G})>0.8$ and such that the error bar is minimal.
If all the values of $P(\mathcal{G})$ are below $0.8$, $M$ is chosen such that
$P(\mathcal{G})$ is maximal.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{The QUEST database} \section{The QUEST database}
\label{sec:QUEST} \label{sec:QUEST}