4$
- Loop over $\Ndetup$ determinants (rows of the $C$ matrix) \\
  Remove all the rows where $d(\mDup, \mDupp)>4$ ($\sim \mathcal{O}(\sqrt{\Ndet})$)
- Loop over $\Ndetdn$ determinants (columns of the $C$ matrix) \\
  Remove all the columns where $d(\mDdn, \mDdnn)>4$
- The number of remaining determinants is bounded by the size of the CISDTQ space

** Finer filtering

\[
\epsi = \sum_{\ma_\mi \in \mA_\mi} \frac{
 \langle \mPsi | \mH | \ma_\mi \rangle \langle \ma_\mi | \mH | \mPsi' \rangle}{\Evar - \langle \ma_\mi | \mH | \ma_\mi \rangle}
\]

- We know that all the $| \ma_\mi \rangle$ are singles and doubles with respect to $| \mi \rangle$
- $|\mPsi'\rangle$ is the projection of $\mPsi$ on the subspace of determinants in $\mD$ which are no more than quadruply excited with respect to $| \mi \rangle$
- For a subset of excitations $\mi \rightarrow \ma$, $|\mPsi'\rangle$ is filtered further with possible hole/particle constraints

** Monte Carlo sampling

\[
\epsi = \sum_{\ma_\mi \in \mA_\mi} \frac{
 \left( \langle \mPsi | \mH | \ma_\mi \rangle \right)^2}{\Evar - \langle \ma_\mi | \mH | \ma_\mi \rangle}
\]

1. $\langle \mPsi | \mH | \ma_\mi \rangle = \sum_{\mj \ge \mi} \mcj\, \langle\, \mj | \mH | \ma_\mi \rangle$
2. $\langle \ma_\mi | \mH | \ma_\mi \rangle$ is always large (otherwise $|\ma_\mi \rangle$ would be better in the \textcolor{red}{variational space}, and PT is questionable)

*** :B_ignoreheading:
:PROPERTIES:
:BEAMER_env: ignoreheading
:END:

- $\forall \mi \in \mD : \epsi \le 0$
- $|\epsi|$ is expected to decrease as $\mci^2$
- The computational cost decreases with $\mi$

*** Monte Carlo formulation

\[
\Ept = \sum_{\mi \in \mD} \epsi = \sum_{\mi \in \mD} \mpi \frac{\epsi}{\mpi} = \left\langle \frac{\epsi}{\mpi} \right\rangle_{\mpi}
\]

** Naive sampling

#+LATEX: \begin{columns}
#+LATEX: \begin{column}{0.2\textwidth}
Uniform sampling: $\mpi = \frac{1}{\Ndet}$
#+LATEX: \end{column}
#+LATEX: \begin{column}{0.8\textwidth}
#+CAPTION: F$_2$, cc-pVDZ, \textcolor{red}{$10^6$} determinants in the variational space
#+ATTR_LATEX: :height 0.75\textheight
[[./dist2_noise.png]]
#+LATEX: \end{column}
#+LATEX: \end{columns}

** Improved sampling

#+LATEX: \begin{columns}
#+LATEX: \begin{column}{0.2\textwidth}
Sampling: $\mpi = \mci^2$
#+LATEX: \end{column}
#+LATEX: \begin{column}{0.8\textwidth}
#+CAPTION: F$_2$, cc-pVDZ, \textcolor{red}{$10^6$} determinants in the variational space
#+ATTR_LATEX: :height 0.75\textheight
[[./eici2.png]]
#+LATEX: \end{column}
#+LATEX: \end{columns}

** Lazy evaluation

There are only $\Ndet$ contributions $\epsi$ $\longrightarrow$ all the $\epsi$ can be stored in memory.

*** Lazy Evaluation (Wikipedia)

In programming language theory, /lazy evaluation/, or /call-by-need/, is an evaluation strategy which delays the evaluation of an expression until its value is needed (non-strict evaluation) and which also avoids repeated evaluations (sharing).
*** :B_ignoreheading:
:PROPERTIES:
:BEAMER_env: ignoreheading
:END:

#+BEGIN_SRC python
def lazy_e(i):
    # Compute e[i] only on the first request, then reuse the stored value
    if not e_is_computed[i]:
        e[i] = compute_e(i)
        e_is_computed[i] = True
    return e[i]
#+END_SRC

** Monte Carlo with Lazy Evaluation

\[
\Ept = \sum_{\mi \in \mD} \epsi = \sum_{\mi \in \mD} \mpi \frac{\epsi}{\mpi} = \left\langle \frac{\epsi}{\mpi} \right\rangle_{\mpi}
\]

- Draw a generator determinant $|\mi\,\rangle$ with probability $\mpi$
- Increment $n_\mi$, the number of evaluations of $\epsi$
- If $\epsi$ is not already computed, compute it and store its value
- $\Ept \sim \sum_{\mi \in \mD} \frac{n_\mi}{\Nsamples} \frac{\epsi}{\mpi}$
- Statistical error: $\mathcal{O}\left(1/\sqrt{\Nsamples}\right)$
- Lazy evaluation: exponential acceleration (time to solution)

** Monte Carlo with Lazy Evaluation

#+CAPTION: F$_2$, cc-pVDZ, \textcolor{red}{$10^6$} determinants in the variational space
#+ATTR_LATEX: :height 0.75\textheight
[[./samples.pdf]]

** Monte Carlo with Lazy Evaluation

#+CAPTION: F$_2$, cc-pVDZ, \textcolor{red}{$10^6$} determinants in the variational space
#+ATTR_LATEX: :height 0.75\textheight
[[./lazy_e.pdf]]

** Monte Carlo with Lazy Evaluation

#+CAPTION: F$_2$, cc-pVDZ, \textcolor{red}{$10^6$} determinants in the variational space
#+ATTR_LATEX: :height 0.75\textheight
[[./lazy_err.pdf]]

** Monte Carlo with Variance reduction

- Noise can be smoothed out by averaging
- Split $\mD$ into $\mM$ \emph{equiprobable} sets: the "comb"

\[
\Ept = \sum_{\mi \in \mD} \epsi = \sum_{\mk=1}^{\mM} \sum_{\mi_\mk \in \mD_\mk} \epsik
\]

*** New Monte Carlo estimator

\[
\Ept = \left \langle \frac{1}{\mM} \sum_{\mk=1}^{\mM} \frac{\epsik}{\textcolor{red}{p_{I_k}}} \right \rangle_{\textcolor{red}{(p_{I_1}, \dots, p_{I_M})}}
\]

** Monte Carlo with Variance reduction

#+CAPTION: F$_2$, cc-pVDZ, \textcolor{red}{$10^6$} determinants in the variational space
#+ATTR_LATEX: :height 0.75\textheight
[[./Comb1.pdf]]

** Monte Carlo with Variance reduction

#+CAPTION: F$_2$, cc-pVDZ, \textcolor{red}{$10^6$} determinants in the variational space
#+ATTR_LATEX: :height 0.75\textheight
[[./Comb2.pdf]]

** Monte Carlo with Variance reduction

#+CAPTION: F$_2$, cc-pVDZ, \textcolor{red}{$10^6$} determinants in the variational space
#+ATTR_LATEX: :height 0.75\textheight
[[./Comb3.pdf]]

** Monte Carlo with Variance reduction

#+CAPTION: F$_2$, cc-pVDZ, \textcolor{red}{$10^6$} determinants in the variational space
#+ATTR_LATEX: :height 0.75\textheight
[[./dist2.png]]

** Monte Carlo with Variance reduction

#+CAPTION: F$_2$, cc-pVDZ, \textcolor{red}{$10^6$} determinants in the variational space
#+ATTR_LATEX: :height 0.75\textheight
[[./lazy_e.pdf]]

** Monte Carlo with Variance reduction

#+CAPTION: F$_2$, cc-pVDZ, \textcolor{red}{$10^6$} determinants in the variational space
#+ATTR_LATEX: :height 0.75\textheight
[[./comb_e.pdf]]

** Monte Carlo with Variance reduction

#+CAPTION: F$_2$, cc-pVDZ, \textcolor{red}{$10^6$} determinants in the variational space
#+ATTR_LATEX: :height 0.75\textheight
[[./comb_err.pdf]]

** Hybrid deterministic/stochastic scheme

- When all the determinants have been drawn, the \emph{exact} $\Ept$ can be computed
- $\Longrightarrow$ The result with zero statistical error can be reached in a finite time
- In typical wave functions, $90\%$ of the norm is on a few determinants
- Compute the first few contributions $\epsi$ deterministically, and perform the MC on the rest

\[
\Ept = \sum_{\mi \in \mD_D} \epsi + \left \langle \frac{1}{\mM} \sum_{\mk=1}^{\mM} \frac{\epsik}{\textcolor{red}{p_{I_k}}} \right \rangle_{\textcolor{red}{(p_{I \in \mD_S})}}
\]

** Hybrid deterministic/stochastic scheme

Make the deterministic part grow during the calculation.
*** At each MC step

- Draw a random number
- Find the determinants selected by the comb (increment the $n_\mi$'s)
- Compute the $\epsi$ which have not yet been computed
- Compute deterministically the contribution of the first determinant not yet computed
- If a tooth of the comb is completely filled $\Longrightarrow$ it joins the deterministic part

*** At any time

\[
\Ept(t) = \sum_{\mi \in \mD_D(t)} \epsi + \sum_{\mi \in \mD_S(t)} \frac{1}{\mM(t)} \frac{n_\mi(t)}{\Nsamples(t)} \frac{\epsi}{\mpi}
\]

** Hybrid deterministic/stochastic scheme

#+CAPTION: F$_2$, cc-pVDZ, \textcolor{red}{$10^6$} determinants in the variational space
#+ATTR_LATEX: :height 0.75\textheight
[[./lazy_e.pdf]]

** Hybrid deterministic/stochastic scheme

#+CAPTION: F$_2$, cc-pVDZ, \textcolor{red}{$10^6$} determinants in the variational space
#+ATTR_LATEX: :height 0.75\textheight
[[./comb_e.pdf]]

** Hybrid deterministic/stochastic scheme

#+CAPTION: F$_2$, cc-pVDZ, \textcolor{red}{$10^6$} determinants in the variational space
#+ATTR_LATEX: :height 0.75\textheight
[[./hybrid_e.pdf]]

** Hybrid deterministic/stochastic scheme

#+CAPTION: F$_2$, cc-pVDZ, \textcolor{red}{$10^6$} determinants in the variational space
#+ATTR_LATEX: :height 0.75\textheight
[[./hybrid_err.pdf]]

** Some timings: Cr$_2$, $2 \times 10^7$ determinants, 800 cores

|---------+-------------------------------------+--------------------------|
| Basis   | $\Ept$                              | Wall-clock time          |
|---------+-------------------------------------+--------------------------|
| cc-pVDZ | \textcolor{red}{$-0.068\,3(1)$}     | 14 min                   |
|         | \textcolor{red}{$-0.068\,36(1)$}    | 55 min                   |
|         | \textcolor{red}{$-0.068\,361(1)$}   | 2.4 hr                   |
|         | \textcolor{red}{$-0.068\,360\,604$} | 3 hr                     |
|---------+-------------------------------------+--------------------------|
| cc-pVTZ | \textcolor{red}{$-0.124\,4(5)$}     | 19 min                   |
|         | \textcolor{red}{$-0.124\,7(1)$}     | 58 min                   |
|         | \textcolor{red}{$-0.124\,63(1)$}    | 3.5 hr                   |
|         | \textcolor{red}{$-0.124\,642(1)$}   | 8.7 hr                   |
|         | ---                                 | $\sim$ 15 hr (estimated) |
|---------+-------------------------------------+--------------------------|
| cc-pVQZ | \textcolor{red}{$-0.155\,8(5)$}     | 56 min                   |
|         | \textcolor{red}{$-0.155\,9(1)$}     | 2.5 hr                   |
|         | \textcolor{red}{$-0.155\,95(1)$}    | 9.0 hr                   |
|         | \textcolor{red}{$-0.155\,952(1)$}   | 18.5 hr                  |
|         | ---                                 | $\sim$ 29 hr (estimated) |
|---------+-------------------------------------+--------------------------|

** Parallel efficiency

#+ATTR_LATEX: :height 0.8\textheight
[[./speedup_pt2.png]]
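
** Sketch: Monte Carlo with lazy evaluation

The steps listed in the "Monte Carlo with Lazy Evaluation" frame can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names =mc_lazy_pt2= and =toy_e= and the four toy weights are hypothetical and are not taken from the production code. The toy model uses $\epsi = -0.5\, \mpi$, so $\epsi/\mpi$ is constant and the estimate should equal $-0.5$ up to rounding.

#+BEGIN_SRC python
import random

def mc_lazy_pt2(p, compute_e, n_samples, seed=0):
    """Estimate sum_I e_I by drawing I with probability p[I].

    Each e_I is computed at most once and cached (lazy evaluation).
    Assumes every p[I] is strictly positive.
    """
    rng = random.Random(seed)
    n_det = len(p)
    e = [None] * n_det   # cache of e_I; None means "not computed yet"
    n = [0] * n_det      # n_I: number of times determinant I was drawn
    for i in rng.choices(range(n_det), weights=p, k=n_samples):
        n[i] += 1
        if e[i] is None:
            e[i] = compute_e(i)
    # E_PT2 ~ sum_I (n_I / N_samples) * (e_I / p_I)
    return sum(n[i] / n_samples * e[i] / p[i]
               for i in range(n_det) if n[i] > 0)

# Toy usage (hypothetical values): p_I = c_I^2, e_I = -0.5 * p_I
c2 = [0.6, 0.25, 0.1, 0.05]
calls = []               # records every real evaluation of e_I

def toy_e(i):
    calls.append(i)
    return -0.5 * c2[i]

estimate = mc_lazy_pt2(c2, toy_e, n_samples=1000)
print(estimate, len(calls))
#+END_SRC

The cache makes the number of expensive evaluations bounded by the number of determinants, whatever the number of samples.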