forked from PTEROSOR/QUESTDB
toto part
parent
a507931e79
commit
830a893fe2
@@ -241,12 +241,12 @@ The definition of the active space considered for each system as well as the num
 \subsubsection{Estimating the extrapolation error}
 %------------------------------------------------

-For the $m$th excited states (where $m = 0$ corresponds to the ground state), we usually estimate its FCI energy by performing a linear extrapolation of its variational energy $E_\text{var}^{(m)}$ as a function of its rPT2 correction $E_{\text{rPT2}}^{(m)}$ as follows
+For the $m$th excited state (where $m = 0$ corresponds to the ground state), we usually estimate its FCI energy $E_{\text{FCI}}^{(m)}$ by performing a linear extrapolation of its variational energy $E_\text{var}^{(m)}$ as a function of its rPT2 correction $E_{\text{rPT2}}^{(m)}$ as follows
 \begin{equation}
 E_\text{var}^{(m)} = E_{\text{FCI}}^{(m)} - \alpha^{(m)} E_{\text{rPT2}}^{(m)}
 \end{equation}
 $E_\text{var}^{(m)}$ varies almost linearly as a function of $E_{\text{rPT2}}^{(m)}$, but with a coefficient $\alpha^{(m)}$ which deviates slightly from unity in well-behaved cases.
-This implies that at any iteration of the CIPSI algorithm, the estimated error on the CIPSI energy is
+This implies that, at any iteration of the CIPSI algorithm, the estimated error on the CIPSI energy is
 \begin{equation}
 E_{\text{CIPSI}}^{(m)} - E_{\text{FCI}}^{(m)}
 = \qty(E_\text{var}^{(m)}+E_{\text{rPT2}}^{(m)}) - E_{\text{FCI}}^{(m)}
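Note on the hunk above: the linear model $E_\text{var}^{(m)} = E_{\text{FCI}}^{(m)} - \alpha^{(m)} E_{\text{rPT2}}^{(m)}$ can be illustrated with a small fit. The following is only a minimal sketch, assuming an unweighted least-squares fit over a handful of CIPSI iterations; the function name `extrapolate_fci` and the sample numbers are hypothetical and do not come from the manuscript or from any CIPSI code.

```python
def extrapolate_fci(e_var, e_rpt2):
    """Fit E_var = E_FCI - alpha * E_rPT2 by ordinary least squares.

    Returns (E_FCI, alpha): the intercept at E_rPT2 = 0 and minus the slope.
    Unweighted fitting is an assumption made for this illustration.
    """
    n = len(e_var)
    if n < 2 or n != len(e_rpt2):
        raise ValueError("need at least two (E_var, E_rPT2) pairs of equal length")
    mean_x = sum(e_rpt2) / n
    mean_y = sum(e_var) / n
    sxx = sum((x - mean_x) ** 2 for x in e_rpt2)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(e_rpt2, e_var))
    slope = sxy / sxx                  # equals -alpha in the model above
    e_fci = mean_y - slope * mean_x    # intercept, i.e. value at E_rPT2 = 0
    return e_fci, -slope


# Synthetic data generated from the model itself (hypothetical values, in hartree).
e_rpt2_demo = [-0.040, -0.020, -0.010, -0.005]
e_var_demo = [-100.0 - 1.05 * x for x in e_rpt2_demo]
e_fci_demo, alpha_demo = extrapolate_fci(e_var_demo, e_rpt2_demo)
```

Because the synthetic points lie exactly on a line, the fit recovers the intercept and slope used to generate them; real CIPSI data would only be approximately linear.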
@@ -257,64 +257,43 @@ Therefore, the accuracy of the excitation energy estimates will strongly depend

 Because our selection procedure ensures that the rPT2 values of both states match as well as possible (a trick known as PT2 matching \cite{Dash_2018,Dash_2019}), i.e., $E_{\text{rPT2}} = E_{\text{rPT2}}^{(0)} \approx E_{\text{rPT2}}^{(m)}$, the extrapolated excitation energy associated with the $m$th excited state can be estimated as
 \begin{equation}
-\begin{split}
-\Delta E^{(m)}
-& = E^{(m)}_{\text{CIPSI}} - E^{(0)}_{\text{CIPSI}}
-\\
-& = \qty[ E^{(m)} + E_{\text{rPT2}} + \qty(\alpha^{(n)}-1) E_{\text{rPT2}} ]
-- \qty[ E^{(0)} + E_{\text{rPT2}} + \qty(\alpha^{(0)}-1) E_{\text{rPT2}} ]
+\Delta E_{\text{FCI}}^{(m)}
+= \qty[ E_\text{var}^{(m)} + E_{\text{rPT2}} + \qty(\alpha^{(m)}-1) E_{\text{rPT2}} ]
+- \qty[ E_\text{var}^{(0)} + E_{\text{rPT2}} + \qty(\alpha^{(0)}-1) E_{\text{rPT2}} ]
 + \order{E_{\text{rPT2}}^2 }
-\end{split}
 \end{equation}
-which evidences that the error on $\Delta E^{(m)}$ can be expressed as $\qty(\alpha^{(m)}-\alpha^{(0)}) E_{\text{rPT2}} + \order{E_{\text{rPT2}}^2}$.
+which evidences that the error in $\Delta E_{\text{FCI}}^{(m)}$ can be expressed as $\qty(\alpha^{(m)}-\alpha^{(0)}) E_{\text{rPT2}} + \order{E_{\text{rPT2}}^2}$.
 Moreover, using a common set of state-averaged natural orbitals for the ground and excited states tends to make the values of $\alpha^{(0)}$ and $\alpha^{(m)}$ very close to each other, such that the error on the energy difference is practically of the order of $E_{\text{rPT2}}^2$.

-At the $n$th CIPSI iteration, we have access to the variational energies of both states, $E^{(0)}(n)$ and $E^{(m)}(n)$, as well as their the rPT2 corrections, $E_{\text{rPT2}}^{(0)}(n)$ and $E_{\text{rPT2}}^{(m)}(n)$.
-The $m$th excitation energy at iteration $n$ is then modeled as a Gaussian random variable with mean and variance
+At the $n$th CIPSI iteration, we have access to the variational energies of both states, $E_\text{var}^{(0)}(n)$ and $E_\text{var}^{(m)}(n)$, as well as their rPT2 corrections, $E_{\text{rPT2}}^{(0)}(n)$ and $E_{\text{rPT2}}^{(m)}(n)$.
+The $m$th excitation energy at iteration $n$ is then assumed to be a Gaussian random variable with mean and variance
 \begin{gather}
-\Delta E^{(m)}(n) = \qty[ E^{(m)}(n) + E_{\text{rPT2}}^{(m)}(n) ] - \qty[ E^{(0)}(n) + E_{\text{rPT2}}^{(0)}(n) ]
+\Delta E_\text{CIPSI}^{(m)}(n) = \qty[ E_\text{var}^{(m)}(n) + E_{\text{rPT2}}^{(m)}(n) ] - \qty[ E_\text{var}^{(0)}(n) + E_{\text{rPT2}}^{(0)}(n) ]
 \\
 \sigma^2(n) \propto \qty[E_{\text{rPT2}}^{(m)}(n)]^2 + \qty[E_{\text{rPT2}}^{(0)}(n)]^2
 \end{gather}
-and we treat all CIPSI iterations as samples coming from the same Gaussian process with weights $w(n) = 1/\sqrt{\sigma^2(n)}$.
-
-The confidence interval is chosen to be equivalent to what one
-would obtain using $\pm 1$ standard deviation with Gaussian-distributed
-variables ($\mathcal{G}$). In other words, we will search for an interval $\mathcal{I}$
-such that the probability $P( \Delta E_{\text{FCI}} \in \mathcal{I})$
-that the true value of the excitation energy lies within the interval is
-equal to
-$P( \Delta E_{\text{FCI}} \in [ \Delta E \pm \sigma ] \; | \; \mathcal{G}) = 0.6827$.
-The probability that the FCI excitation energy is in an interval
-$\mathcal{I}$ is
-
+and we treat all CIPSI iterations as a set of Gaussian-distributed variables ($\mathcal{G}$) with weights $w(n) = 1/\sqrt{\sigma^2(n)}$.
+We then search for a confidence interval $\mathcal{I}$ such that the true value of the excitation energy $\Delta E_{\text{FCI}}^{(m)}$ lies within one standard deviation of $\Delta E_\text{CIPSI}^{(m)}$, i.e., $P( \Delta E_{\text{FCI}} \in [ \Delta E_\text{CIPSI}^{(m)} \pm \sigma ] \; | \; \mathcal{G}) = 0.6827$.
+The probability that $\Delta E_{\text{FCI}}^{(m)}$ is in an interval $\mathcal{I}$ is
 \begin{equation}
-P( \Delta E_{\text{FCI}} \in \mathcal{I} ) = P( E_{\text{FCI}} \in I | \mathcal{G}) \times P(\mathcal{G})
+P( \Delta E_{\text{FCI}}^{(m)} \in \mathcal{I} ) = P( \Delta E_{\text{FCI}}^{(m)} \in I | \mathcal{G}) \times P(\mathcal{G})
 \end{equation}
-where the probability $P(\mathcal{G})$ that the random variables are
-normally distributed can be deduced from the Jarque-Bera test $J$ as
-
+where the probability $P(\mathcal{G})$ that the random variables are normally distributed can be deduced from the Jarque-Bera test $J$ as
 \begin{equation}
 P(\mathcal{G}) = 1 - \chi^2_{\text{CDF}}(J,2)
 \end{equation}
-where $\chi^2_{\text{CDF}}(x,k)$ is the cumulative distribution function of the
-$\chi^2$ distribution with $k$ degrees of freedom.
-As the number of samples is usually small, we use Student's $t$ distribution to
-estimate the statistical error. The inverse of the cumulative
-distribution function of the $t$ distribution will allow us to find how
-to scale the interval with a parameter $\beta$ such that
-$P( \Delta E_{\text{FCI}} \in [ \Delta E \pm \beta \sigma ] ) = p$.
-
+where $\chi^2_{\text{CDF}}(x,k)$ is the cumulative distribution function of the $\chi^2$-distribution with $k$ degrees of freedom.
+As the number of samples is usually small, we use Student's $t$-distribution to estimate the statistical error.
+The inverse of the cumulative distribution function of the $t$-distribution allows us to find how to scale the interval by a parameter
 \begin{equation}
 %\beta = t_{\text{CDF}}^{-1} \left[
 %\frac{1}{2} \left( 1 + \frac{P( \Delta E_{\text{FCI}} \in [ \Delta E \pm \sigma ] \; | \; \mathcal{G}) }{P(\mathcal{G})}\right), n \right]
 \beta = t_{\text{CDF}}^{-1} \left[
 \frac{1}{2} \left( 1 + \frac{0.6827}{P(\mathcal{G})}\right), n \right]
 \end{equation}
-Only the last $M>2$ computed energy differences are considered. $M$ is chosen
-such that $P(\mathcal{G})>0.8$ and such that the error bar is minimal.
-If all the values of $P(\mathcal{G})$ are below $0.8$, $M$ is chosen such that
-$P(\mathcal{G})$ is maximal.
+such that $P( \Delta E_{\text{FCI}}^{(m)} \in [ \Delta E_{\text{CIPSI}}^{(m)} \pm \beta \sigma ] ) = p$.
+Only the last $M>2$ computed energy differences are considered. $M$ is chosen such that $P(\mathcal{G})>0.8$ and such that the error bar is minimal.
+If all the values of $P(\mathcal{G})$ are below $0.8$, $M$ is chosen such that $P(\mathcal{G})$ is maximal.

 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \section{The QUEST database}
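Note on the hunk above: the two formulas $P(\mathcal{G}) = 1 - \chi^2_{\text{CDF}}(J,2)$ and $\beta = t_{\text{CDF}}^{-1}[\ldots]$ can be made concrete with a short standard-library sketch. It is an illustration under stated assumptions, not the authors' script: all function names are hypothetical, the Jarque-Bera statistic is computed without the weights $w(n)$, the number of degrees of freedom passed to the $t$-distribution is left as a free argument `dof` because the diff only writes it as $n$, and the $t$ quantile is obtained by Simpson integration plus bisection rather than by a statistics library. The only closed form used is the $\chi^2$ CDF with two degrees of freedom, $1 - e^{-x/2}$, so that $P(\mathcal{G}) = e^{-J/2}$.

```python
import math


def jarque_bera(samples):
    """Unweighted Jarque-Bera statistic J = n/6 * (S^2 + (K - 3)^2 / 4)."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m3 = sum((x - mean) ** 3 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + 0.25 * (kurtosis - 3.0) ** 2)


def prob_gaussian(samples):
    """P(G) = 1 - chi2_CDF(J, 2), which reduces to exp(-J / 2)."""
    return math.exp(-0.5 * jarque_bera(samples))


def t_pdf(x, dof):
    """Probability density of Student's t-distribution."""
    norm = math.gamma(0.5 * (dof + 1)) / (math.sqrt(dof * math.pi) * math.gamma(0.5 * dof))
    return norm * (1.0 + x * x / dof) ** (-0.5 * (dof + 1))


def t_cdf(x, dof, steps=1000):
    """CDF of Student's t via Simpson's rule on [0, |x|] and symmetry (steps must be even)."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, dof) + t_pdf(a, dof)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * t_pdf(i * h, dof)
    half = total * h / 3.0
    return 0.5 + half if x >= 0 else 0.5 - half


def t_ppf(q, dof):
    """Inverse CDF of Student's t for 0.5 <= q < 1, found by bisection."""
    if q < 0.5:
        raise ValueError("this sketch only handles q >= 0.5")
    if q >= 1.0:
        return math.inf  # no finite interval reaches the requested probability
    lo, hi = 0.0, 1.0
    while t_cdf(hi, dof) < q:
        hi *= 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if t_cdf(mid, dof) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def beta_scale(p_gauss, dof, target=0.6827):
    """Scaling factor beta = t_ppf(0.5 * (1 + target / P(G)), dof)."""
    return t_ppf(0.5 * (1.0 + target / p_gauss), dof)
```

In this sketch, `beta_scale` returns infinity whenever $P(\mathcal{G})$ falls below the target probability, since the argument of the inverse CDF then exceeds one; how the actual workflow treats that case is not stated in the diff.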
@@ -324,17 +303,25 @@ and we treat all CIPSI iterations as samples coming from the same Gaussian proce
 %=======================
 \subsection{Overview}
 %=======================
-The QUEST database gathers more than \alert{470} highly-accurate excitation energies of various natures (valence, Rydberg, $n \ra \pis$, $\pi \ra \pis$, singlet, soublet, triplet, and double excitations) for molecules ranging from diatomics to molecules as large as naphthalene.
+The QUEST database gathers more than \alert{470} highly-accurate excitation energies of various natures (valence, Rydberg, $n \ra \pis$, $\pi \ra \pis$, singlet, doublet, triplet, and double excitations) for molecules ranging from diatomics to molecules as large as naphthalene.
 Each of the five subsets making up the QUEST dataset is detailed below.
 Throughout the present article, we report several statistical indicators: the mean signed error (MSE), mean absolute error (MAE), root-mean square error (RMSE), and standard deviation of the errors (SDE).

 %%% FIGURE 1 %%%
-\begin{figure}[bt]
+\begin{figure}[ht]
 \centering
 \includegraphics[width=0.5\linewidth]{fig1/fig1}
 \caption{Composition of each of the five subsets making up the present QUEST dataset of highly-accurate vertical excitation energies.}
 \end{figure}

+%%% FIGURE 2 %%%
+\begin{figure}[ht]
+\centering
+\includegraphics[width=0.8\linewidth]{fig2}
+\caption{Molecules each of the five subsets making up the present QUEST dataset of highly-accurate vertical excitation energies:
+QUEST\#1 (red), QUEST\#2 (magenta and/or underlined), QUEST\#3 (black), QUEST\#4 (green), and QUEST\#5 (blue).}
+\end{figure}
+
 %=======================
 \subsection{QUEST\#1}
 %=======================
Binary file not shown.
@@ -22,12 +22,12 @@ decoration={snake,
 \begin{tikzpicture}
 \begin{scope}[very thick,
 node distance=2cm,on grid,>=stealth',
-QUEST0/.style={circle,draw,fill=green!45},
-QUEST1/.style={rectangle,draw,fill=yellow!45},
-QUEST2/.style={rectangle,draw,fill=orange!45},
-QUEST3/.style={rectangle,draw,fill=red!45},
-QUEST4/.style={rectangle,draw,fill=violet!45},
-QUEST5/.style={rectangle,draw,fill=black!45}]
+QUEST0/.style={circle,draw,fill=orange!45},
+QUEST1/.style={rectangle,draw,fill=red!45},
+QUEST2/.style={rectangle,draw,fill=magenta!45},
+QUEST3/.style={rectangle,draw,fill=black!45},
+QUEST4/.style={rectangle,draw,fill=green!45},
+QUEST5/.style={rectangle,draw,fill=blue!45}]
 \node [QUEST0, align=center] (Q) at (4*0, 4*0) {QUEST \\ \tiny 470 highly-accurate \\ \tiny excitations };
 \node [QUEST1, align=center] (Q1) at (4*0.587785, 4*0.809017) {QUEST\#1 \\ \tiny small-sized molecules \\ \tiny \bf \red{JCTC 14 (2018) 4360}};
 \node [QUEST2, align=center] (Q2) at (4*0.951057, -4*0.309017) {QUEST\#2 \\ \tiny double excitations \\ \tiny \bf \red{JCTC 15 (2019) 1939}};
BIN
Manuscript/fig2.pdf
Normal file
Binary file not shown.