#+TITLE: QMCkl source code documentation
#+EXPORT_FILE_NAME: index.html
#+SETUPFILE: https://fniessen.github.io/org-html-themes/org/theme-readtheorg.setup

* Introduction

The ultimate goal of QMCkl is to provide a high-performance
implementation of the main kernels of QMC. In this particular
repository, we focus on the definition of the API and the tests, and
on a /pedagogical/ presentation of the algorithms. We expect the
HPC experts to use this repository as a reference for re-writing
optimized libraries.

Literate programming is particularly well suited to this context.
Source files are written in [[https://karl-voit.at/2017/09/23/orgmode-as-markup-only/][org-mode]] format, to provide useful
comments and LaTeX formulas close to the code. There are multiple
ways to convert org-mode files into other formats such as HTML or
PDF. For a tutorial on literate programming with org-mode, follow
[[http://www.howardism.org/Technical/Emacs/literate-programming-tutorial.html][this link]].

The code is extracted from the org files using Emacs as a
command-line tool in the =Makefile=, and the produced files are then
compiled.

** Language used

Fortran is one of the most common languages used by the community,
and is simple enough to make the algorithms readable. Hence we
propose in this pedagogical implementation of QMCkl to use Fortran
to express the algorithms. For specific internal functions where
the C language is more natural, C is used.

Because Fortran modules generate compiler-dependent files, modules
are used only internally in the library; exposing them would break
compatibility with C.

The external dependencies should be kept as small as possible, so
external libraries should be used /only/ if their use is strongly
justified.

** Source code editing

Any text editor can be used to edit org-mode files. For a better
user experience Emacs is recommended. For users who dislike Emacs,
it is good to know that Emacs can behave like Vim when switched into
"Evil" mode. [[https://www.spacemacs.org][Spacemacs]] also eases the transition for Vim users.

For users with a preference for Jupyter notebooks, the following
script can convert Jupyter notebooks to org-mode files:

#+BEGIN_SRC sh :tangle nb_to_org.sh
#!/bin/bash
# $ nb_to_org.sh notebook.ipynb
# produces the org-mode file notebook.org

set -e

# Strip the directory and the .ipynb extension from the argument
nb=$(basename "$1" .ipynb)
# Convert the notebook to markdown, then the markdown to org-mode
jupyter nbconvert --to markdown "${nb}.ipynb" --output "${nb}.md"
pandoc "${nb}.md" -o "${nb}.org"
rm "${nb}.md"
#+END_SRC

Pandoc can also convert several markdown formats into org-mode.

** Writing in Fortran

The Fortran source files should provide a C interface using
=iso_c_binding=. The name of the Fortran source files should end
with =_f.f90= to be properly handled by the Makefile. The names of
the functions defined in Fortran should be the same as those
exposed in the API, suffixed by =_f=. The Fortran interfaces
should also be written in the =qmckl_f.f90= file.

For more guidelines on using Fortran to generate a C interface, see
[[http://fortranwiki.org/fortran/show/Generating+C+Interfaces][this link]].

** Coding style

# TODO: decide on a coding style

To improve readability, we maintain a consistent coding style in
the library.

- For C source files, we will use __(decide on a coding style)__
- For Fortran source files, we will use __(decide on a coding
  style)__

For C source files, the coding style can be checked automatically
with [[https://clang.llvm.org/docs/ClangFormat.html][clang-format]].

** Design of the library

The proposed API should allow the library to:
- deal with memory transfers between CPU and accelerators
- use different levels of floating-point precision

We chose a multi-layered design with low-level and high-level
functions (see below).

*** Naming conventions

Use =qmckl_= as a prefix for all exported functions and variables.
All exported header files should have a filename with the prefix
=qmckl_=.

If the name of the org-mode file is =xxx.org=, the name of the
produced C files should be =xxx.c= and =xxx.h= and the name of the
produced Fortran files should be =xxx.f90=.

Array names are in uppercase and scalar names are in lowercase.

In the names of the variables and functions, only the singular
form is allowed.

*** Application programming interface

The application programming interface (API) is designed to be
compatible with the C programming language (not C++), to ensure
that the library will be easily usable in /any/ language. This
implies that only the following data types are allowed in the API:

- 32-bit and 64-bit floats and arrays (=float= and =double=)
- 32-bit and 64-bit integers and arrays (=int32_t= and =int64_t=)
- Pointers should be represented as 64-bit integers (even on
  32-bit architectures)
- ASCII strings are represented as pointers to character arrays
  terminated by a zero character (C convention).

Complex numbers can be represented by an array of 2 floats.
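
As an illustration of these constraints, the following C sketch
declares a function whose signature uses only the allowed types.
The function =demo_norm2= and its arguments are hypothetical and are
not part of the QMCkl API (an exported function would carry the
=qmckl_= prefix). It assumes that complex numbers are stored as
interleaved (real, imaginary) pairs of =double=.

#+BEGIN_SRC c
#include <stdint.h>

/* Hypothetical example, not a QMCkl function.
 * Returns the sum of the squared moduli of n complex numbers.
 * Z holds 2*n doubles: Z[2*i] is the real part and Z[2*i+1] the
 * imaginary part of the i-th number (assumed storage layout).
 * The array name is uppercase, the scalar names are lowercase. */
double demo_norm2(const int64_t n, const double* Z)
{
  double result = 0.0;
  for (int64_t i = 0; i < n; ++i) {
    const double re = Z[2*i];
    const double im = Z[2*i+1];
    result += re*re + im*im;
  }
  return result;
}
#+END_SRC

Because the signature contains only a 64-bit integer, a pointer to
=double= and a =double= return value, such a function can be called
from any language that can call C.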

# TODO : Link to repositories for bindings
To facilitate use from languages other than C, we provide bindings
in separate repositories.

*** Global state

Global variables should be avoided in the library, because it is
possible that one single program needs to use multiple instances
of the library. To solve this problem we propose to use a pointer
to a =context= variable, built by the library with the
=qmckl_context_create= function. The =context= contains the global
state of the library, and is used as the first argument of many
QMCkl functions.

The internal structure of the context is not specified, to give a
maximum of freedom to the different implementations. The state is
read and modified through getters and setters, prefixed by
=qmckl_context_get_= and =qmckl_context_set_=. When a context
variable is modified by a setter, a copy of the old data structure
is made and updated, and the pointer to the new data structure is
returned, so that the old contexts can still be accessed. It is
also possible to modify the state in an impure fashion, using the
=qmckl_context_update_= functions. The context and its old
versions can be destroyed with =qmckl_context_destroy=.
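
The copy-on-write behavior described above can be sketched in C as
follows. This is only one possible layout: the structure
=demo_context=, its =precision= field, the default value and all
=demo_context_*= functions are assumptions made for the example, since
the actual internal structure of the context is deliberately left
unspecified.

#+BEGIN_SRC c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical context layout (the real one is unspecified). */
typedef struct demo_context {
  int32_t precision;              /* example of a state variable  */
  struct demo_context* previous;  /* older version, kept alive    */
} demo_context;

/* Plays the role of a create function: returns the pointer as a
 * 64-bit integer handle, or 0 on allocation failure. */
int64_t demo_context_create(void)
{
  demo_context* ctx = malloc(sizeof *ctx);
  if (ctx == NULL) return 0;
  ctx->precision = 53;            /* assumed default value */
  ctx->previous  = NULL;
  return (int64_t) (intptr_t) ctx;
}

/* Setter: copies the old context, updates the copy and returns the
 * new handle. The old context is left untouched. */
int64_t demo_context_set_precision(const int64_t context,
                                   const int32_t precision)
{
  demo_context* old = (demo_context*) (intptr_t) context;
  demo_context* new_ctx = malloc(sizeof *new_ctx);
  if (new_ctx == NULL) return 0;
  *new_ctx = *old;
  new_ctx->precision = precision;
  new_ctx->previous  = old;
  return (int64_t) (intptr_t) new_ctx;
}

/* Impure variant: modifies the context in place. */
void demo_context_update_precision(const int64_t context,
                                   const int32_t precision)
{
  ((demo_context*) (intptr_t) context)->precision = precision;
}

/* Getter. */
int32_t demo_context_get_precision(const int64_t context)
{
  return ((const demo_context*) (intptr_t) context)->precision;
}

/* Destroys the context and all its older versions. */
void demo_context_destroy(const int64_t context)
{
  demo_context* ctx = (demo_context*) (intptr_t) context;
  while (ctx != NULL) {
    demo_context* previous = ctx->previous;
    free(ctx);
    ctx = previous;
  }
}
#+END_SRC

In this sketch the caller keeps both handles after a call to the
setter, so a computation can be repeated with the old and the new
settings. A single call to the destroy function on the most recent
handle frees the whole chain.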
*** Low-level functions

Low-level functions are very simple functions which are leaves of
the function call tree (they don't call any other QMCkl function).

These functions are /pure/, and unaware of the QMCkl
=context=. They are not allowed to allocate or deallocate memory:
any temporary memory they need must be provided by the caller as
an input argument.
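
A minimal sketch of a function following these rules is given below.
The function =demo_sum_sq_dev=, its arguments and its return codes are
hypothetical and only illustrate the pattern: no context, no
allocation, and a scratch array =TMP= supplied by the caller.

#+BEGIN_SRC c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical low-level kernel, not a QMCkl function.
 * Computes the sum of squared deviations of X from its mean.
 * TMP is scratch space of at least n doubles provided by the
 * caller. Returns 0 on success and 1 on invalid input. */
int32_t demo_sum_sq_dev(const int64_t n, const double* X,
                        double* TMP, double* result)
{
  if (n <= 0 || X == NULL || TMP == NULL || result == NULL) return 1;

  double mean = 0.0;
  for (int64_t i = 0; i < n; ++i) mean += X[i];
  mean /= (double) n;

  /* The deviations are stored in the caller-provided scratch array */
  for (int64_t i = 0; i < n; ++i) TMP[i] = X[i] - mean;

  double sum = 0.0;
  for (int64_t i = 0; i < n; ++i) sum += TMP[i] * TMP[i];

  *result = sum;
  return 0;
}
#+END_SRC

Since the function touches only its arguments, the caller decides
where =TMP= lives, which is what makes it possible to place the
scratch space on an accelerator.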
*** High-level functions

High-level functions are at the top of the function call tree.
They are able to choose which lower-level function to call
depending on the required precision, and do the corresponding type
conversions. These functions are also responsible for allocating
temporary storage, to simplify the use of accelerators.

The high-level functions should be pure, unless the introduction
of non-purity is justified. All side effects should be confined to
the =context= variable.
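
The dispatch on precision can be sketched as follows. All names
(=demo_dot=, =demo_dot_s=, =demo_dot_d=) are hypothetical, and to keep
the example self-contained the precision is passed directly as an
argument, whereas in the design described here it would be read from
the =context=. The limit of 24 bits is the significand size of an
IEEE 754 single-precision number.

#+BEGIN_SRC c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical low-level kernels: pure, no allocation. */
static double demo_dot_d(const int64_t n, const double* A, const double* B)
{
  double result = 0.0;
  for (int64_t i = 0; i < n; ++i) result += A[i] * B[i];
  return result;
}

static float demo_dot_s(const int64_t n, const float* A, const float* B)
{
  float result = 0.0f;
  for (int64_t i = 0; i < n; ++i) result += A[i] * B[i];
  return result;
}

/* Hypothetical high-level function: chooses the kernel from the
 * requested number of bits of precision, allocates the temporary
 * storage and performs the type conversions.
 * Returns 0 on success, 1 on invalid input, 2 if allocation fails. */
int32_t demo_dot(const int32_t precision, const int64_t n,
                 const double* A, const double* B, double* result)
{
  if (n <= 0 || A == NULL || B == NULL || result == NULL) return 1;

  if (precision > 24) {           /* more than single precision */
    *result = demo_dot_d(n, A, B);
    return 0;
  }

  /* Single precision is enough: convert the inputs into a
   * temporary buffer holding both arrays. */
  float* TMP = malloc(2 * (size_t) n * sizeof *TMP);
  if (TMP == NULL) return 2;
  for (int64_t i = 0; i < n; ++i) {
    TMP[i]     = (float) A[i];
    TMP[n + i] = (float) B[i];
  }
  *result = (double) demo_dot_s(n, TMP, TMP + n);
  free(TMP);
  return 0;
}
#+END_SRC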

# TODO : We need an identifier for impure functions

*** Numerical precision

The number of bits of precision required for a function should be
given as an input of low-level computational functions. This input
will be used to define the values of the different thresholds that
might be used to avoid computing unnecessary noise. High-level
functions will use the precision specified in the =context=
variable.
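
One possible way to turn a number of bits into a threshold is
sketched below. The mapping \(\epsilon = 2^{-\text{precision}}\) and
the functions =demo_threshold= and =demo_screened_dot= are assumptions
made for the example; the text above does not prescribe how the
thresholds are derived.

#+BEGIN_SRC c
#include <stdint.h>

/* Hypothetical mapping: threshold = 2^(-precision). */
double demo_threshold(const int32_t precision)
{
  double eps = 1.0;
  for (int32_t i = 0; i < precision; ++i) eps *= 0.5;
  return eps;
}

/* Hypothetical low-level kernel: sum of C[i]*V[i] in which the
 * coefficients with a magnitude below the threshold are skipped. */
double demo_screened_dot(const int32_t precision, const int64_t n,
                         const double* C, const double* V)
{
  const double eps = demo_threshold(precision);
  double result = 0.0;
  for (int64_t i = 0; i < n; ++i) {
    const double c = (C[i] < 0.0) ? -C[i] : C[i];
    if (c < eps) continue;        /* negligible at this precision */
    result += C[i] * V[i];
  }
  return result;
}
#+END_SRC

With a low requested precision more terms are skipped, so the kernel
does less work; with 53 bits almost no term is discarded.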
** Algorithms

Reducing the scaling of an algorithm usually also reduces its
arithmetic intensity (number of flops per byte). Therefore,
for small sizes, \(\mathcal{O}(N^3)\) and \(\mathcal{O}(N^2)\)
algorithms are better adapted than linear-scaling algorithms. As
QMCkl is a general-purpose library, multiple algorithms should be
implemented, each adapted to a different range of problem sizes.

** Rules for the API

- =stdint.h= types should be used for integers (=int32_t=, =int64_t=)
- integers used for counting should always be =int64_t=
- floats should be =double= by default, unless explicitly mentioned
- pointers are converted to =int64_t= to increase portability
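
These rules can be sketched in C as follows. The helper names are
hypothetical; the conversion goes through =intptr_t=, which is one
standard way to store a pointer in an integer, and not necessarily the
one used in the library.

#+BEGIN_SRC c
#include <stdint.h>

/* Hypothetical helpers, not QMCkl functions. */

/* Stores a pointer in the 64-bit integer used across the API. */
int64_t demo_pointer_to_handle(const void* p)
{
  return (int64_t) (intptr_t) p;
}

/* Recovers the pointer from the 64-bit integer. */
void* demo_handle_to_pointer(const int64_t handle)
{
  return (void*) (intptr_t) handle;
}

/* Counters and array sizes are int64_t, floats are double. */
int64_t demo_count_positive(const int64_t n, const double* X)
{
  int64_t count = 0;
  for (int64_t i = 0; i < n; ++i) {
    if (X[i] > 0.0) ++count;
  }
  return count;
}
#+END_SRC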
* Documentation