17 KiB
Introduction
Installing QMCkl
The latest version fo QMCkl can be downloaded here, and the source code is accessible on the GitHub repository.
Installing from the released tarball (for end users)
QMCkl is built with GNU Autotools, so the usual
configure ; make ; make check ; make install
scheme will be used.
As usual, the C compiler can be specified with the CC
variable
and the Fortran compiler with the FC
variable. The compiler
options are defined using CFLAGS
and FCFLAGS
.
Installing from the source repository (for developers)
To compile from the source repository, additional dependencies are required to generated the source files:
- Emacs >= 26
- Autotools
- Python3
When the repository is downloaded, the Makefile is not yet
generated, as well as the configure script. ./autogen.sh
has
to be executed first.
Using QMCkl
The qmckl.h
header file installed in the ${prefix}/include
directory
has to be included in C codes when QMCkl functions are used:
#include "qmckl.h"
In Fortran programs, the qmckl_f.f90
installed in
${prefix}/share/qmckl/fortran
interface file should be copied in the source
code using the library, and the Fortran codes should use the qmckl
module as
use qmckl
Both files are located in the include/
directory.
Developing in QMCkl
Literate programming
In a traditional source code, most of the lines of source files of a program are code, scripts, Makefiles, and only a few lines are comments explaining parts of the code that are non-trivial to understand. The documentation of the prorgam is usually written in a separate directory, and is often outdated compared to the code.
Literate programming is a different approach to programming, where the program is considered as a publishable-quality document. Most of the lines of the source files are text, mathematical formulas, tables, figures, etc, and the lines of code are just the translation in a computer language of the ideas and algorithms expressed in the text. More importantly, the "document" is structured like a text document with sections, subsections, a bibliography, a table of contents etc, and the place where pieces of code appear are the places where they should belong for the reader to understand the logic of the program, not the places where the compiler expects to find them. Both the publishable-quality document and the binary executable are produced from the same source files.
Literate programming is particularly well adapted in this context, as the central part of this project is the documentation of an API. The implementation of the algorithms is just an expression of the algorithms in a language that can be compiled, so that the correctness of the algorithms can be tested.
We have chosen to write the source files in org-mode format,
as any text editor can be used to edit org-mode files. To
produce the documentation, there exists multiple possibilities to convert
org-mode files into different formats such as HTML or PDF. The source code is
easily extracted from the org-mode files invoking the Emacs text editor from
the command-line in the Makefile
, and then the produced files are compiled.
Moreover, within the Emacs text editor the source code blocks can be executed
interactively, in the same spirit as Jupyter notebooks.
Note that Emacs is not needed for end users because the distributed tarball contains the generated source code.
Source code editing
For a tutorial on literate programming with org-mode, follow this link.
Any text editor can be used to edit org-mode files. For a better user experience Emacs is recommended. For users hating Emacs, it is good to know that Emacs can behave like Vim when switched into ``Evil'' mode.
In the tools/init.el
file, we provide a minimal Emacs configuration
file for vim users. This file should be copied into .emacs.d/init.el
.
For users with a preference for Jupyter notebooks, we also provide the
tools/nb_to_org.sh
script can convert jupyter notebooks into org-mode
files.
Note that pandoc can be used to convert multiple markdown formats into org-mode.
Choice of the programming language
Most of the codes of the TREX CoE are written in Fortran with some scripts in Bash and Python. Outside of the CoE, Fortran is also important in QMC codes (Casino, Amolqc), and other important languages used by the community are C and C++ (QMCPack, QWalk), Julia and Rust are gaining in popularity. We want QMCkl to be compatible with all of these languages, so the QMCkl API has to be compatible with the C language since libraries with a C-compatible API can be used in every other language.
High-performance versions of QMCkl, with the same API, can be rewritten by HPC experts. These optimized libraries will be tuned for specific architectures, among which we can cite x86 based processors, and GPU accelerators. Nowadays, the most efficient software tools to take advantage of low-level features (intrinsics, prefetching, aligned or pinned memory allocation, …) are for C++ developers. It is highly probable that optimized implementations will be written in C++, but as the API is C-compatible this doesn't pose any problem for linking the library in other languages.
Fortran is one of the most common languages used by the community, and is simple enough to make the algorithms readable both by experts in QMC, and experts in HPC. Hence we propose in this pedagogical implementation of QMCkl to use Fortran to express the QMC algorithms. However, for internal functions related to system programming, the C language is more natural than Fortran.
As QMCkl appears like a C library, for each Fortran function there
is an iso_c_binding
interface to make the Fortran function
callable from C. It is this C interface which is exposed to the
user. As a consequence, the Fortran users of the library never
call directly the Fortran routines, but call instead the C binding
function and an iso_c_binding
is still required:
ISO_C_BINDING ISO_C_BINDING Fortran ---------------> C ---------------> Fortran
The name of the Fortran source files should end with _f.f90
to
be properly handled by the Makefile
and to avoid collision of
object files (*.o
) with the compiled C source files. The names
of the functions defined in Fortran should be the same as those
exposed in the API suffixed by _f
.
For more guidelines on using Fortran to generate a C interface, see this link.
Coding rules
The authors should follow the recommendations of the C99 SEI+CERT C Coding Standard.
Compliance can be checked with cppcheck
as:
cppcheck --addon=cert --enable=all *.c &> cppcheck.out
# or
make cppcheck ; cat cppcheck.out
Design of the library
The proposed API should allow the library to: deal with memory transfers between CPU and accelerators, and to use different levels of floating-point precision. We chose a multi-layered design with low-level and high-level functions (see below).
Naming conventions
To avoid namespace collisions, we use qmckl_
as a prefix for all exported
functions and variables. All exported header files should have a file name
prefixed with qmckl_
.
If the name of the org-mode file is xxx.org
, the name of the
produced C files should be xxx.c
and xxx.h
and the name of the
produced Fortran file should be xxx.f90
.
In the names of the variables and functions, only the singular form is allowed.
Application programming interface
In the C language, the number of bits used by the integer types can change
from one architecture to another one. To circumvent this problem, we choose to
use the integer types defined in <stdint.h>
where the number of bits used for
the integers are fixed.
To ensure that the library will be easily usable in any other language than C, we restrict the data types in the interfaces to the following:
- 32-bit and 64-bit integers, scalars and and arrays (
int32_t
andint64_t
) - 32-bit and 64-bit floats, scalars and and arrays (
float
anddouble
) - Pointers are always casted into 64-bit integers, even on legacy 32-bit architectures
- ASCII strings are represented as a pointers to character arrays
and terminated by a
'\0'
character (C convention). - Complex numbers can be represented by an array of 2 floats.
- Boolean variables are stored as integers,
1
fortrue
and0
forfalse
- Floating point variables should be by default
double
unless explicitly mentioned - integers used for counting should always be
int64_t
To facilitate the use in other languages than C, we will provide some bindings in other languages in other repositories.
Global state
Global variables should be avoided in the library, because it is
possible that one single program needs to use multiple instances
of the library. To solve this problem we propose to use a pointer
to a context
variable, built by the library with the
qmckl_context_create
function. The <<<=context=>>> contains the global
state of the library, and is used as the first argument of many
QMCkl functions.
The internal structure of the context is not specified, to give a
maximum of freedom to the different implementations. Modifying
the state is done by setters and getters, prefixed by
qmckl_set_
an qmckl_get_
.
Headers
A single qmckl.h
header to be distributed by the library
is built by concatenating some of the produced header files.
To facilitate building the qmckl.h
file, we separate types from
function declarations in headers. Types should be defined in header
files suffixed by _type.h
, and functions in files suffixed by
_func.h
.
As these files will be concatenated in a single file, they should
not be guarded by #ifndef *_H
, and they should not include other
produced headers.
Some particular types that are not exported need to be known by the
context, and there are some functions to update instances of these
types contained inside the context. For example, a
qmckl_numprec_struct
is present in the context, and the function
qmckl_set_numprec_range
takes a context as a parameter, and set a
value in the qmckl_numprec_struct
contained in the context.
Because of these intricate dependencies, a private header is
created, containing the qmckl_numprec_struct
. This header is
included in the private header file which defines the type of the
context. Header files for private types are suffixed by _private_type.h
and header files for private functions are suffixed by _private_func.h
.
Fortran interfaces should also be written in the *fh_func.f90
file,
and the types definitions should be written in the *fh_type.f90
file.
File | Scope | Description |
---|---|---|
*_type.h |
Public | Type definitions |
*_func.h |
Public | Function definitions |
*_private_type.h |
Private | Type definitions |
*_private_func.h |
Private | Function definitions |
*fh_type.f90 |
Public | Fortran type definitions |
*fh_func.f90 |
Public | Fortran function definitions |
Low-level functions
Low-level functions are very simple functions which are leaves of the function call tree (they don't call any other QMCkl function).
These functions are pure, and unaware of the QMCkl
context
. They are not allowed to allocate/deallocate memory, and
if they need temporary memory it should be provided in input.
High-level functions
High-level functions are at the top of the function call tree. They are able to choose which lower-level function to call depending on the required precision, and do the corresponding type conversions. These functions are also responsible for allocating temporary storage, to simplify the use of accelerators.
Numerical precision
The minimal number of bits of precision required for a function
should be given as an input of low-level computational
functions. This input will be used to define the values of the
different thresholds that might be used to avoid computing
unnecessary noise. High-level functions will use the precision
specified in the context
variable.
In order to automatize numerical accuracy tests, QMCkl uses Verificarlo and its CI functionality. You can read Verificarlo CI's documentation at the following link. Reading it is advised to understand the remainder of this section.
To enable support for Verificarlo CI tests when building the library, use the following configure command :
./configure CC=verificarlo-f FC=verificarlo-f --host=x86_64 --enable-vfc_ci
Note that this does require an install of Verificarlo with
Fortran support. Enabling the support for CI will define the
VFC_CI
preprocessor variable which use will be explained now.
As explained in the documentation, Verificarlo CI uses a probes
system to export variables from test programs to the tools itself.
To make tests easier to use, QMCkl has its own interface to the
probes system. To make the system very easy to use, these functions
are always defined, but will behave differently depending on the
VFC_CI
preprocessor variable. There are 3 functions at your
disposal. When the CI is enabled, they will place a vfc_ci
probe
as if calling vfc_probes
directly. Otherwise, they will either do
nothing or perform a check on the tested value and return its result
as a boolean that you are free to use or ignore.
Here are these 3 functions :
qmckl_probe
: place a normal probe witout any check. Won't do anything whenvfc_ci
is disabled (false is returned).qmckl_probe_check
: place a probe with an absolute check. Ifvfc_ci
is disabled, this will return the result of an absolute check (|val - ref| < accuracy target ?). If the check fails, true is returned (false otherwise).qmckl_probe_check_relative
: place a probe with a relative check. Ifvfc_ci
is disabled, this will return the result of a relative check (|val - ref| / ref < accuracy target?). If the check fails, true is returned (false otherwise).
If you need more detail on these functions or their Fortran
interfaces, have a look at the tools/qmckl_probes
files.
Finally, if you need to add a QMCkl kernel to the CI tests or modify an existing one, you should pay attention to the following points :
- you should add the new kernel to the
vfc_tests_config.json
file, which controls the backends and repetitions for each executable. More details can be found in thevfc_ci
documentation. - in order to call the
qmckl_probes
functions from Fortran, import theqmckl_probes_f
module. - if your tests include some asserts that rely on accurate FP values, you should probably wrap them inside a
#ifndef VFC_CI
statement, as the asserts would otherwise risk to fail when executed with the different Verificarlo backends.
Algorithms
Reducing the scaling of an algorithm usually implies also reducing its arithmetic complexity (number of flops per byte). Therefore, for small sizes \(\mathcal{O}(N^3)\) and \(\mathcal{O}(N^2)\) algorithms are better adapted than linear scaling algorithms. As QMCkl is a general purpose library, multiple algorithms should be implemented adapted to different problem sizes.