* QMCkl source code
** Introduction
The ultimate goal of QMCkl is to provide a high-performance
implementation of the main kernels of QMC. In this particular
repository, we focus on the definition of the API and the tests,
and on a /pedagogical/ presentation of the algorithms. We expect the
HPC experts to use this repository as a reference for re-writing
optimized libraries.
Literate programming is particularly well suited to this context.
Source files are written in [[https://karl-voit.at/2017/09/23/orgmode-as-markup-only/][org-mode]] format, to provide useful
comments and LaTeX formulas close to the code. There are multiple
ways to convert org-mode files into other formats such as
HTML or PDF.
For a tutorial on literate programming with org-mode, follow
[[http://www.howardism.org/Technical/Emacs/literate-programming-tutorial.html][this link]].
The code is extracted from the org files using Emacs as a command-line
tool in the =Makefile=, and then the produced files are compiled.
*** Language used
Fortran is one of the most common languages used by the community,
and is simple enough to make the algorithms readable. Hence we
propose in this pedagogical implementation of QMCkl to use Fortran
to express the algorithms. For specific internal functions where
the C language is more natural, C is used.
As Fortran modules generate compiler-dependent files, the use of
modules is restricted to the internals of the library; exposing them
would break compatibility with C.
The external dependencies should be kept as small as possible, so
external libraries should be used /only/ if their use is strongly
justified.
*** Source code editing
Any text editor can be used to edit org-mode files. For a better
user experience, Emacs is recommended.
For users who dislike Emacs, it is good to know that Emacs can behave
like Vim when switched into "Evil" mode. There is also
[[https://www.spacemacs.org][Spacemacs]], which eases the transition for Vim users.
For users with a preference for Jupyter notebooks, the following
script converts a Jupyter notebook to an org-mode file:
#+BEGIN_SRC sh :tangle nb_to_org.sh
#!/bin/bash
# $ nb_to_org.sh notebook.ipynb
# produces the org-mode file notebook.org
set -e
nb=$(basename "$1" .ipynb)
jupyter nbconvert --to markdown "${nb}.ipynb" --output "${nb}.md"
pandoc "${nb}.md" -o "${nb}.org"
rm "${nb}.md"
#+END_SRC
Pandoc can also convert several Markdown flavors to org-mode.
*** Writing in Fortran
The Fortran source files should provide a C interface using the
=iso_c_binding= module. The name of the Fortran source files should end
with =_f.f90= to be properly handled by the Makefile.
*** Coding style
# TODO: decide on a coding style
To improve readability, we maintain a consistent coding style in the library.
- For C source files, we will use __(decide on a coding style)__
- For Fortran source files, we will use __(decide on a coding style)__
Coding style can be automatically checked with [[https://clang.llvm.org/docs/ClangFormat.html][clang-format]].
** Design of the library
The proposed API should allow the library to:
- deal with memory transfers between CPU and accelerators
- use different levels of floating-point precision
We chose a multi-layered design with low-level and high-level
functions (see below).
*** Naming conventions
Use =qmckl_= as a prefix for all exported functions and variables.
All exported header files should have a filename with the prefix
=qmckl_=.
If the name of the org-mode file is =xxx.org=, the name of the
produced C files should be =xxx.c= and =xxx.h= and the name of the
produced Fortran files should be =xxx.f90=.
*** Application programming interface
The application programming interface (API) is designed to be
compatible with the C programming language (not C++), to ensure
that the library will be easily usable in any language.
This implies that only the following data types are allowed in the API:
- 32-bit and 64-bit floats and arrays
- 32-bit and 64-bit integers and arrays
- Pointers should be represented as 64-bit integers (even on
  32-bit architectures)
- ASCII strings are represented as a pointer to a character array
  and terminated by a zero character (C convention).
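As an illustration of these constraints, the following minimal sketch shows a function whose signature uses only the types listed above, together with the round trip of a pointer through a 64-bit integer. All names (=qmckl_example_sum=, =example_handle_from_pointer=, =example_pointer_from_handle=) and the return-code convention are hypothetical and are not part of QMCkl.

#+BEGIN_SRC c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: store a pointer in a 64-bit integer handle and
   recover it. The intermediate intptr_t cast keeps the conversion
   well defined on both 32-bit and 64-bit architectures. */
int64_t example_handle_from_pointer(void* p) {
  return (int64_t) (intptr_t) p;
}

void* example_pointer_from_handle(const int64_t handle) {
  return (void*) (intptr_t) handle;
}

/* Hypothetical exported function: sums n doubles and reports the length
   of a zero-terminated ASCII label. Only fixed-width integers, floats,
   arrays and C strings appear in the signature.
   Returns 0 on success and 1 on invalid input (assumed convention). */
int32_t qmckl_example_sum(const int64_t n, const double* x,
                          const char* label,
                          double* result, int64_t* label_length) {
  if (n < 0 || x == NULL || label == NULL ||
      result == NULL || label_length == NULL) {
    return 1;
  }
  double s = 0.0;
  for (int64_t i = 0; i < n; ++i) {
    s += x[i];
  }
  *result = s;
  *label_length = (int64_t) strlen(label);
  return 0;
}
#+END_SRC

Because no structure, enumeration or C++ type crosses the interface, such a function can be called from Fortran through =iso_c_binding= or from any language with a C foreign-function interface.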
To facilitate use from languages other than C, we provide
bindings in separate repositories.
# TODO : Link to repositories for bindings
*** Global state
Global variables should be avoided in the library, because it is
possible that one single program needs to use multiple instances of
the library. To solve this problem we propose to use a pointer to a
=context= variable, built by the library with the
=qmckl_context_create= function. The =context= contains the global
state of the library, and is used as the first argument of many
QMCkl functions.
The state is modified and read through setters and getters, prefixed
by =qmckl_context_set_= and =qmckl_context_get_=.
When a context variable is modified by a setter, a copy of the old
data structure is made and updated, and the pointer to the new data
structure is returned, such that the old contexts can still be
accessed.
It is also possible to modify the state in an impure fashion, using
the =qmckl_context_update_= functions.
The context and its old versions can be destroyed with
=qmckl_context_destroy=.
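The copy-on-set behaviour described above can be sketched as follows. This is only an illustration under stated assumptions: the layout of the data structure, the single =precision= field, its default value, the setter and getter names =qmckl_context_set_precision= and =qmckl_context_get_precision=, and all signatures are hypothetical. Only the names =qmckl_context_create= and =qmckl_context_destroy= come from the text.

#+BEGIN_SRC c
#include <stdint.h>
#include <stdlib.h>

/* The context is exposed as a 64-bit integer holding a pointer. */
typedef int64_t qmckl_context;

/* Hypothetical internal layout: each version links to the previous one. */
typedef struct context_data {
  struct context_data* prev;  /* older version, or NULL */
  int32_t precision;          /* illustrative piece of global state */
} context_data;

/* Creates an initial context; returns 0 on allocation failure. */
qmckl_context qmckl_context_create(void) {
  context_data* c = malloc(sizeof *c);
  if (c == NULL) return 0;
  c->prev = NULL;
  c->precision = 53;  /* assumed default */
  return (qmckl_context) (intptr_t) c;
}

/* Pure setter: copies the old structure, updates the copy and returns
   the new context. The old context stays valid and unchanged. */
qmckl_context qmckl_context_set_precision(const qmckl_context ctx,
                                          const int32_t precision) {
  context_data* old = (context_data*) (intptr_t) ctx;
  if (old == NULL) return 0;
  context_data* next = malloc(sizeof *next);
  if (next == NULL) return 0;
  *next = *old;
  next->prev = old;
  next->precision = precision;
  return (qmckl_context) (intptr_t) next;
}

/* Getter: reads the state of the given version (ctx must be valid). */
int32_t qmckl_context_get_precision(const qmckl_context ctx) {
  return ((const context_data*) (intptr_t) ctx)->precision;
}

/* Destroys the given context and all of its older versions. */
void qmckl_context_destroy(const qmckl_context ctx) {
  context_data* c = (context_data*) (intptr_t) ctx;
  while (c != NULL) {
    context_data* prev = c->prev;
    free(c);
    c = prev;
  }
}
#+END_SRC

In this sketch, calling the setter on a context =c0= returns a new context =c1=, and the getter applied to =c0= still returns the old value. Destroying =c1= also frees =c0=, so older versions must not be used afterwards.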
*** Low-level functions
Low-level functions are very simple functions which are leaves of the
function call tree (they don't call any other QMCkl function).
These functions are /pure/ and unaware of the QMCkl =context=. They are
not allowed to allocate or deallocate memory; if they need
temporary memory, it should be provided by the caller as an argument.
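A minimal sketch of a function following these rules is given here. The name =qmckl_example_distance_sq=, its signature and its return codes are hypothetical. The point is only that the function calls nothing else, touches no global state, and receives its scratch space from the caller.

#+BEGIN_SRC c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical low-level kernel: squared Euclidean distance between the
   vectors a and b of length n. The scratch array work (at least n
   doubles) is provided by the caller, so the function never allocates.
   Returns 0 on success and 1 on invalid input (assumed convention). */
int32_t qmckl_example_distance_sq(const int64_t n,
                                  const double* a, const double* b,
                                  double* work, double* result) {
  if (n < 0 || a == NULL || b == NULL || work == NULL || result == NULL) {
    return 1;
  }
  /* First pass: store the component-wise differences in the scratch array. */
  for (int64_t i = 0; i < n; ++i) {
    work[i] = a[i] - b[i];
  }
  /* Second pass: accumulate the squares. */
  double s = 0.0;
  for (int64_t i = 0; i < n; ++i) {
    s += work[i] * work[i];
  }
  *result = s;
  return 0;
}
#+END_SRC

The output depends only on the arguments, which is the sense in which the function is pure. The scratch array is deliberately simple here; a real kernel would request it only when the intermediate values are actually needed.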
*** High-level functions
High-level functions are at the top of the function call tree.
They are able to choose which lower-level function to call
depending on the required precision, and do the corresponding type
conversions.
These functions are also responsible for allocating temporary
storage, to simplify the use of accelerators.
The high-level functions should be pure, unless the introduction of
non-purity is justified. All the side effects should be made in the
=context= variable.
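The dispatching role of a high-level function can be sketched as follows. Everything here is hypothetical: the kernels =sum_sq_f32= and =sum_sq_f64=, the function =qmckl_example_sum_sq=, the return codes, and the rule that 24 bits or fewer select single precision. To keep the example self-contained, the precision is passed directly as =precision_bits= instead of being read from a =context=.

#+BEGIN_SRC c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical low-level kernels: sum of squares in each precision. */
static float sum_sq_f32(const int64_t n, const float* x) {
  float s = 0.0f;
  for (int64_t i = 0; i < n; ++i) s += x[i] * x[i];
  return s;
}

static double sum_sq_f64(const int64_t n, const double* x) {
  double s = 0.0;
  for (int64_t i = 0; i < n; ++i) s += x[i] * x[i];
  return s;
}

/* Hypothetical high-level function: selects the kernel from the
   requested precision, converts the data if needed, and owns the
   temporary storage.
   Returns 0 on success, 1 on invalid input, 2 on allocation failure. */
int32_t qmckl_example_sum_sq(const int32_t precision_bits,
                             const int64_t n, const double* x,
                             double* result) {
  if (n < 0 || x == NULL || result == NULL) return 1;

  /* More than 24 bits requested: a float significand is not enough. */
  if (precision_bits > 24) {
    *result = sum_sq_f64(n, x);
    return 0;
  }

  /* Single precision suffices: allocate a buffer and convert the input. */
  float* tmp = malloc((size_t) (n > 0 ? n : 1) * sizeof *tmp);
  if (tmp == NULL) return 2;
  for (int64_t i = 0; i < n; ++i) tmp[i] = (float) x[i];
  *result = (double) sum_sq_f32(n, tmp);
  free(tmp);
  return 0;
}
#+END_SRC

The low-level kernels stay free of allocation and type conversion, so an accelerator version would only need to replace the kernels and the allocation strategy, not the calling code.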
# TODO : We need an identifier for impure functions
*** Numerical precision
The number of bits of precision required for a function should be
given as an input of low-level computational functions. This input will
be used to define the values of the different thresholds that might
be used to avoid computing unnecessary noise.
High-level functions will use the precision specified in the
=context= variable.
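As a minimal sketch of how a number of bits can be turned into a threshold, the hypothetical function below sums the Taylor series of \(e^x\) for \(0 \le x \le 1\) and stops as soon as a term falls below \(2^{-b}\), where \(b\) is the requested number of bits. The function name, its signature and the choice of series are assumptions made for illustration only.

#+BEGIN_SRC c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical low-level function: approximates exp(x) for 0 <= x <= 1
   by its Taylor series, ignoring every term smaller than
   2^(-precision_bits). n_terms receives the number of terms summed.
   Returns 0 on success and 1 on invalid input (assumed convention). */
int32_t qmckl_example_exp_series(const int32_t precision_bits,
                                 const double x,
                                 double* result, int32_t* n_terms) {
  if (precision_bits < 1 || precision_bits > 53 ||
      x < 0.0 || x > 1.0 || result == NULL || n_terms == NULL) {
    return 1;
  }

  /* threshold = 2^(-precision_bits), computed without libm */
  double threshold = 1.0;
  for (int32_t k = 0; k < precision_bits; ++k) threshold *= 0.5;

  double term = 1.0;  /* x^0 / 0! */
  double sum = 0.0;
  int32_t k = 0;
  while (term >= threshold) {
    sum += term;
    ++k;
    term *= x / (double) k;  /* x^k / k! */
  }

  *result = sum;
  *n_terms = k;
  return 0;
}
#+END_SRC

Requesting fewer bits makes the loop stop earlier, so work that would only refine digits beyond the requested precision is never done. The truncation error is of the order of the threshold.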
** Algorithms
Reducing the scaling of an algorithm usually also reduces
its arithmetic intensity (number of flops per byte). Therefore,
for small sizes, \(\mathcal{O}(N^3)\) and \(\mathcal{O}(N^2)\) algorithms
are better adapted than linear-scaling algorithms.
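As a rough illustration of the flops-per-byte argument, compare the product of two \(N \times N\) matrices with the vector update \(y \leftarrow \alpha x + y\) on vectors of length \(N\). The estimate assumes double precision (8 bytes per element) and that each array travels through memory exactly once:

\[
\frac{2N^3\ \mathrm{flops}}{3 \times 8N^2\ \mathrm{bytes}} = \frac{N}{12}\ \mathrm{flops/byte}
\qquad \mathrm{versus} \qquad
\frac{2N\ \mathrm{flops}}{3 \times 8N\ \mathrm{bytes}} = \frac{1}{12}\ \mathrm{flops/byte}.
\]

The cubic-scaling kernel reuses each loaded value about \(N\) times, whereas the linear-scaling one performs the same small amount of work per byte at every size and is therefore limited by memory bandwidth.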
As QMCkl is a general-purpose library, multiple algorithms should
be implemented, each adapted to a different range of problem sizes.
** Documentation
- [[qmckl.org][Main QMCkl header file]]
- [[qmckl_context.org][Context]]
** Acknowledgments
[[https://trex-coe.eu/sites/default/files/inline-images/euflag.jpg]]
The [[https://trex-coe.eu][TREX: Targeting Real Chemical Accuracy at the Exascale]] project has received funding from the European Union’s Horizon 2020 - Research and Innovation program - under grant agreement no. 952165. The content of this document does not represent the opinion of the European Union, and the European Union is not responsible for any use that might be made of such content.