Under way

Anthony Scemama 2019-11-15 15:47:04 +01:00
parent 3512ac1a3b
commit 05f202a75e
2 changed files with 19 additions and 2 deletions


@ -1707,3 +1707,13 @@
publisher = {John Wiley {\&} Sons, Ltd},
doi = {10.1002/jcc.24713}
}
@article{Sce13b,
author = {Scemama, Anthony and Giner, Emmanuel},
title = {{An efficient implementation of Slater-Condon rules}},
journal = {arXiv},
year = {2013},
month = {Nov},
eprint = {1311.6244},
url = {https://arxiv.org/abs/1311.6244}
}


@ -334,8 +334,15 @@ It would surely stimulate further theoretical developments in excited-state meth
\alert{
To keep up with Moore's ``Law'' in the early 2000s, computer manufacturers had no choice but to propose multi-core chips in order to avoid an explosion of the energy requirements.
Doubling the number of floating-point operations per second (flops/s) by doubling the number of CPU cores only requires doubling the energy, while doubling the frequency multiplies the required energy by a factor close to 8.
This bifurcation in hardware design implied a \emph{change of paradigm}\cite{Sut05} in the implementation and design of computational algorithms, which needed to express a large degree of parallelism to benefit from a significant acceleration.
Fifteen years later, the community has made a significant effort to redesign the methods with parallel-friendly algorithms.\cite{Val10,Cle10,Gar17b,Pen16,Kri13,Sce13}
In particular, the change of paradigm to reach FCI accuracy with SCI methods came
from the use of determinant-driven algorithms, which were long considered inefficient
with respect to integral-driven algorithms.
The first important element making these algorithms efficient is the introduction of new bit manipulation instructions (BMI) in the hardware of modern processors, which enable an extremely fast evaluation of Slater-Condon rules\cite{Sce13b} for the direct calculation of
the Hamiltonian matrix elements over arbitrary determinants.
Massive parallelism can then be harnessed to perform the sparse matrix-vector multiplications required in Davidson's algorithm, and to compute the second-order perturbative correction with semi-stochastic algorithms.\cite{Gar17b,Sha17}
The next generation of supercomputers is going to generalize the presence of accelerators (graphics processing units, GPUs), leading to a new software crisis.
Fortunately, some authors have already prepared this transition.\cite{Dep11,Kim18,Sny15,Ufi08,Kal17}
}
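
To illustrate why bit manipulation makes the Slater-Condon rules cheap to apply, here is a minimal Python sketch. It is not the implementation of Ref. Sce13b. It assumes each determinant is stored as a pair of integers (one bit string per spin, bit i set when spin-orbital i is occupied), and all function names are hypothetical. The excitation degree between two determinants is obtained from an XOR followed by a population count, which compiled code can map to a single hardware instruction. The phase factor and the integrals needed for the actual matrix element are deliberately left out.

```python
def popcount(bits: int) -> int:
    """Count the set bits (one POPCNT instruction in compiled code)."""
    return bin(bits).count("1")


def orbital_indices(bits: int) -> list:
    """List the positions of the set bits, lowest first."""
    indices = []
    while bits:
        lowest = bits & -bits                  # isolate the lowest set bit
        indices.append(lowest.bit_length() - 1)  # its position (trailing zeros)
        bits ^= lowest                         # clear it and continue
    return indices


def excitation_degree(det1, det2) -> int:
    """Number of electrons that differ between two determinants.

    Each determinant is an (alpha_bits, beta_bits) pair. Every excited
    electron flips two bits (one hole, one particle), hence the division by 2.
    """
    (a1, b1), (a2, b2) = det1, det2
    return (popcount(a1 ^ a2) + popcount(b1 ^ b2)) // 2


def holes_and_particles(bits1: int, bits2: int):
    """Orbitals emptied (holes) and filled (particles) going from bits1 to bits2."""
    diff = bits1 ^ bits2
    return orbital_indices(diff & bits1), orbital_indices(diff & bits2)


def may_interact(det1, det2) -> bool:
    """Slater-Condon rules: the matrix element vanishes beyond double excitations."""
    return excitation_degree(det1, det2) <= 2


# Hypothetical example: 3 alpha and 3 beta electrons in 6 orbitals.
reference = (0b000111, 0b000111)
excited = (0b010011, 0b001011)   # alpha: 2 -> 4, beta: 2 -> 3
print(excitation_degree(reference, excited))
print(holes_and_particles(reference[0], excited[0]))
```

With this encoding, screening a pair of determinants costs a handful of integer operations per machine word, independently of the number of electrons, which is what makes a determinant-driven loop over arbitrary pairs affordable.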