Update README.org

Anthony Scemama 2021-03-06 12:40:41 +01:00
parent 56f5d9d56d
commit fd4a50ddee
3 changed files with 204 additions and 76 deletions


@ -4,110 +4,140 @@
#+SETUPFILE: https://fniessen.github.io/org-html-themes/org/theme-readtheorg.setup
bibliography:../docs/references.bib
* Introduction
The ultimate goal of QMCkl is to provide a high-performance
The ultimate goal of the QMCkl library is to provide a high-performance
implementation of the main kernels of QMC. In this particular
repository, we focus on the definition of the API and the tests, and
implementation of the library, we focus on the definition of the API and the tests, and
on a /pedagogical/ presentation of the algorithms. We expect the
HPC experts to use this repository as a reference for re-writing
optimized libraries.
Literate programming is particularly adapted in this context.
Source files are written in [[https://karl-voit.at/2017/09/23/orgmode-as-markup-only/][org-mode]] format, to provide useful
comments and LaTeX formulas close to the code. There exist multiple
possibilities to convert org-mode files into different formats such
as HTML or pdf. For a tutorial on literate programming with
org-mode, follow [[http://www.howardism.org/Technical/Emacs/literate-programming-tutorial.html][this link]].
** Literate programming
In traditional source code, most of the lines in the source files of a program
are code, scripts, or Makefiles, and only a few lines are comments explaining
parts of the code that are non-trivial to understand. The documentation of
the program is usually written in a separate directory, and is often outdated
compared to the code.
The code is extracted from the org files using Emacs as a
command-line tool in the =Makefile=, and then the produced files are
compiled.
Literate programming cite:knuth_1992 is a different approach to programming,
where the program is considered as a publishable-quality document. Most of
the lines of the source files are text, mathematical formulas, tables,
figures, /etc/, and the lines of code are just the translation in a computer
language of the ideas and algorithms expressed in the text. More importantly,
the "document" is structured like a text document with sections, subsections,
a bibliography, a table of contents, /etc/, and the places where pieces of code
appear are the places where they belong for the reader to understand
the logic of the program, not the places where the compiler expects to find
them. Both the publishable-quality document and the binary executable are
produced from the same source files.
** Language used
Literate programming is particularly well adapted in this context, as the
central part of this project is the documentation of an API. The
implementation of the algorithms is just an expression of the algorithms in a
language that can be compiled, so that the correctness of the algorithms can
be tested.
Fortran is one of the most common languages used by the community,
and is simple enough to make the algorithms readable. Hence we
propose in this pedagogical implementation of QMCkl to use Fortran
to express the algorithms. For specific internal functions where
the C language is more natural, C is used.
We have chosen to write the source files in [[https://karl-voit.at/2017/09/23/orgmode-as-markup-only/][org-mode]] format,
cite:schulte_2012 as any text editor can be used to edit org-mode files. To
produce the documentation, there exist multiple ways to convert
org-mode files into different formats such as HTML or PDF. The source code is
easily extracted from the org-mode files by invoking the Emacs text editor from
the command line in the =Makefile=, and then the produced files are compiled.
Moreover, within the Emacs text editor the source code blocks can be executed
interactively, in the same spirit as Jupyter notebooks. cite:Kluyver_2016
As Fortran modules generate compiler-dependent files, the use of
modules is restricted to the internal use of the library, otherwise
the compliance with C is violated.
The external dependencies should be kept as small as possible, so
external libraries should be used /only/ if their use is strongly
justified.
** Source code editing
For a tutorial on literate programming with org-mode, follow [[http://www.howardism.org/Technical/Emacs/literate-programming-tutorial.html][this link]].
Any text editor can be used to edit org-mode files. For a better
user experience Emacs is recommended. For users hating Emacs, it
is good to know that Emacs can behave like Vim when switched into
``Evil'' mode. There also exists [[https://www.spacemacs.org][Spacemacs]] which helps the
transition for Vim users.
For users with a preference for Jupyter notebooks, the following
script can convert jupyter notebooks to org-mode files:
In the =tools/init.el= file, we provide a minimal Emacs configuration
file for vim users. This file should be copied into =.emacs.d/init.el=.
#+BEGIN_SRC sh :tangle nb_to_org.sh
#!/bin/bash
# $ nb_to_org.sh notebook.ipynb
# produces the org-mode file notebook.org
For users with a preference for Jupyter notebooks, we also provide the
=tools/nb_to_org.sh= script, which converts Jupyter notebooks into org-mode
files.
set -e
Note that pandoc can be used to convert multiple markdown formats into
org-mode.
nb=$(basename "$1" .ipynb)
jupyter nbconvert --to markdown "${nb}.ipynb" --output "${nb}.md"
pandoc "${nb}.md" -o "${nb}.org"
rm "${nb}.md"
#+END_SRC
And pandoc can convert multiple markdown formats into org-mode.
** Choice of the programming language
** Writing in Fortran
Most of the codes of the TREX CoE are written in Fortran with some scripts in
Bash and Python. Outside of the CoE, Fortran is also important (Casino, Amolqc),
and other important languages used by the community are C and C++ (QMCPack,
QWalk), and Julia is gaining in popularity. cite:poole_2020 The library we
design should be compatible with all of these languages. The QMCkl API has to
be compatible with the C language since libraries with a C-compatible API can be
used in every other language.
High-performance versions of QMCkl, with the same API, will be rewritten by
the experts in HPC. These optimized libraries will be tuned for specific
architectures, among which we can cite x86-based processors and GPU
accelerators. Nowadays, the most efficient software tools to take advantage of
low-level features of the processor (intrinsics) and of GPUs are for C++
developers. It is highly probable that the optimized implementations will be
written in C++, and this is in agreement with our choice to make the API
C-compatible.
Fortran is one of the most common languages used by the community, and is simple
enough to make the algorithms readable both by experts in QMC and by experts in
HPC. Hence we propose in this pedagogical implementation of QMCkl to use Fortran
to express the QMC algorithms. As the main language of the library is C, this
implies that the exposed C functions call the Fortran routines. However, for
internal functions related to system programming, the C language is more natural
than Fortran.
The <<<Fortran>>> source files should provide a C interface using the
~iso_c_binding~ module. The name of the Fortran source files should end with
=_f.f90= to be properly handled by the =Makefile=. The names of the functions
defined in Fortran should be the same as those exposed in the API suffixed by
=_f=. Fortran interfaces should also be written in the =qmckl_f.f90= file.
The Fortran source files should provide a C interface using
=iso_c_binding=. The name of the Fortran source files should end
with =_f.f90= to be properly handled by the Makefile. The names of
the functions defined in Fortran should be the same as those
exposed in the API suffixed by =_f=. Fortran interface files
should also be written in the =qmckl_f.f90= file.
For more guidelines on using Fortran to generate a C interface, see
[[http://fortranwiki.org/fortran/show/Generating+C+Interfaces][this link]].
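The call chain described in this section, where an exposed C function delegates to a Fortran routine whose name carries the =_f= suffix, can be sketched from the C side as follows. This is only an illustration under stated assumptions: the names =qmckl_example_dot= and =qmckl_example_dot_f= and the integer return code are hypothetical and not part of the QMCkl API. In the library, the =_f= routine would be written in Fortran with =bind(C)= and the scalar argument would carry the =value= attribute; here a C stand-in replaces it so that the sketch compiles on its own.

```c
#include <stddef.h>
#include <stdint.h>

/* Prototype of the hypothetical Fortran routine. In the library it would be
   implemented in a *_f.f90 file using bind(C) and the iso_c_binding kinds
   c_int32_t, c_int64_t and c_double (assumption: n is passed by value). */
int32_t qmckl_example_dot_f(const int64_t n, const double* A,
                            const double* B, double* result);

/* C stand-in for the Fortran routine, present only to keep this sketch
   self-contained. Returns 0 on success and 1 on invalid input. */
int32_t qmckl_example_dot_f(const int64_t n, const double* A,
                            const double* B, double* result) {
  if (n < 0 || A == NULL || B == NULL || result == NULL) {
    return 1;
  }
  double sum = 0.0;
  for (int64_t i = 0; i < n; ++i) {
    sum += A[i] * B[i];
  }
  *result = sum;
  return 0;
}

/* Exposed C function: same name without the _f suffix. It only forwards
   its arguments to the Fortran routine. */
int32_t qmckl_example_dot(const int64_t n, const double* A,
                          const double* B, double* result) {
  return qmckl_example_dot_f(n, A, B, result);
}
```

Keeping the C function as a thin forwarding layer means the algorithm itself stays readable in Fortran while callers only ever see a C-compatible symbol.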
** Coding style
# TODO: decide on a coding style
To improve readability, we maintain a consistent coding style in
the library.
# Coding style
# # TODO: decide on a coding style
- For C source files, we will use __(decide on a coding style)__
- For Fortran source files, we will use __(decide on a coding
style)__
# To improve readability, we maintain a consistent coding style in
# the library.
Coding style can be automatically checked with [[https://clang.llvm.org/docs/ClangFormat.html][clang-format]].
# - For C source files, we will use __(decide on a coding style)__
# - For Fortran source files, we will use __(decide on a coding
# style)__
# Coding style can be automatically checked with [[https://clang.llvm.org/docs/ClangFormat.html][clang-format]].
** Design of the library
The proposed API should allow the library to:
- deal with memory transfers between CPU and accelerators
- use different levels of floating-point precision
We chose a multi-layered design with low-level and high-level
The proposed API should allow the library to deal with memory transfers
between CPU and accelerators, and to use different levels of floating-point
precision. We chose a multi-layered design with low-level and high-level
functions (see below).
*** Naming conventions
Use =qmckl_= as a prefix for all exported functions and variables.
All exported header files should have a filename with the prefix
=qmckl_=.
To avoid namespace collisions, we use =qmckl_= as a prefix for all exported
functions and variables. All exported header files should have a file name
prefixed with =qmckl_=.
If the name of the org-mode file is =xxx.org=, the name of the
produced C files should be =xxx.c= and =xxx.h= and the name of the
produced Fortran files should be =xxx.f90=
produced Fortran file should be =xxx.f90=.
Arrays are in uppercase and scalars are in lowercase.
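As an illustration of these conventions, here is a minimal sketch of what a header and its implementation could look like. The file names =qmckl_example.h= and =qmckl_example.c= and the function =qmckl_example_scale= are hypothetical and chosen only for this example; they are assumed to be produced from a =qmckl_example.org= file.

```c
#include <stdint.h>

/* --- qmckl_example.h (hypothetical exported header, qmckl_ prefix) --- */
#ifndef QMCKL_EXAMPLE_H
#define QMCKL_EXAMPLE_H

/* Computes B[i] = alpha * A[i] for 0 <= i < n.
   Scalars (n, alpha) are lowercase, arrays (A, B) are uppercase. */
void qmckl_example_scale(const int64_t n, const double alpha,
                         const double* A, double* B);

#endif

/* --- qmckl_example.c (hypothetical implementation file) --- */
void qmckl_example_scale(const int64_t n, const double alpha,
                         const double* A, double* B) {
  for (int64_t i = 0; i < n; ++i) {
    B[i] = alpha * A[i];
  }
}
```

With the case convention, the role of each argument is visible from the prototype alone, without reading the documentation.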
@ -116,23 +146,30 @@ rm ${nb}.md
*** Application programming interface
The application programming interface (API) is designed to be
compatible with the C programming language (not C++), to ensure
that the library will be easily usable in /any/ language. This
implies that only the following data types are allowed in the API:
In the C language, the number of bits used by the integer types can change
from one architecture to another. To circumvent this problem, we choose to
use the integer types defined in ~<stdint.h>~ where the number of bits used for
the integers is fixed.
- 32-bit and 64-bit floats and arrays (=real= and =double=)
- 32-bit and 64-bit integers and arrays (=int32_t= and =int64_t=)
- Pointers should be represented as 64-bit integers (even on
32-bit architectures)
- ASCII strings are represented as pointers to character
arrays and terminated by a zero character (C convention).
Complex numbers can be represented by an array of 2 floats.
To ensure that the library will be easily usable in /any/ other language
than C, we restrict the data types in the interfaces to the following:
- 32-bit and 64-bit integers, scalars and arrays (~int32_t~ and ~int64_t~)
- 32-bit and 64-bit floats, scalars and arrays (~float~ and ~double~)
- Pointers are always cast into 64-bit integers, even on legacy 32-bit architectures
- ASCII strings are represented as pointers to character arrays
and terminated by a ~'\0'~ character (C convention).
- Complex numbers can be represented by an array of 2 floats.
- Boolean variables are stored as integers, ~1~ for ~true~ and ~0~ for ~false~
- Floating point variables should be ~double~ by default, unless explicitly mentioned
- Integers used for counting should always be ~int64_t~
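The restrictions listed above can be illustrated with a short sketch in C. All function names below are hypothetical and are not part of the QMCkl API; the compile-time checks only state the size assumptions that the rules rely on.

```c
#include <assert.h>
#include <stdint.h>

/* Compile-time checks of the size assumptions behind the rules above. */
static_assert(sizeof(float) == 4, "float is expected to be 32-bit");
static_assert(sizeof(double) == 8, "double is expected to be 64-bit");
static_assert(sizeof(void*) <= sizeof(int64_t), "pointers must fit in 64 bits");

/* Counting uses int64_t and floating-point data is double by default. */
double qmckl_example_sum(const int64_t n, const double* A) {
  double result = 0.0;
  for (int64_t i = 0; i < n; ++i) {
    result += A[i];
  }
  return result;
}

/* A boolean result is returned as an integer: 1 for true, 0 for false. */
int32_t qmckl_example_all_positive(const int64_t n, const double* A) {
  for (int64_t i = 0; i < n; ++i) {
    if (A[i] <= 0.0) {
      return 0;
    }
  }
  return 1;
}

/* A complex number is an array of 2 floats: Z[0] real part, Z[1] imaginary part.
   Returns the squared modulus. */
float qmckl_example_complex_norm2(const float* Z) {
  return Z[0] * Z[0] + Z[1] * Z[1];
}

/* Pointers cross the interface as 64-bit integers. */
int64_t qmckl_example_pointer_to_int(const void* p) {
  return (int64_t) (intptr_t) p;
}

const void* qmckl_example_int_to_pointer(const int64_t i) {
  return (const void*) (intptr_t) i;
}
```

Every argument and return value here maps onto a type that other languages can express through their C interoperability layer, which is the purpose of the restriction.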
To facilitate the use in other languages than C, we will provide some
bindings in other languages in other repositories.
# TODO : Link to repositories for bindings
To facilitate the use in other languages than C, we provide some
bindings in other languages in other repositories.
# To facilitate the use in other languages than C, we provide some
# bindings in other languages in other repositories.
*** Global state

80
tools/init.el Normal file

@ -0,0 +1,80 @@
(package-initialize)
(add-to-list 'package-archives
'("gnu" . "https://elpa.gnu.org/packages/"))
(add-to-list 'package-archives
'("melpa-stable" . "https://stable.melpa.org/packages/"))
(add-to-list 'package-archives
'("melpa" . "https://melpa.org/packages/"))
(setq package-archive-priorities '(("melpa-stable" . 100)
("melpa" . 50)
("gnu" . 10)))
;; cl-lib replaces the deprecated cl package; it provides cl-remove-if.
(require 'cl-lib)
(let* ((required-packages
'(htmlize
evil
org-evil
org-bullets
))
(missing-packages (cl-remove-if #'package-installed-p required-packages)))
(when missing-packages
(message "Missing packages: %s" missing-packages)
(package-refresh-contents)
(dolist (pkg missing-packages)
(package-install pkg)
(message "Package %s has been installed" pkg))))
(setq backup-directory-alist
`(("." . ,(concat user-emacs-directory "backups"))))
(setq backup-by-copying t)
(require 'org)
(setq org-format-latex-options (plist-put org-format-latex-options :scale 1.6))
(setq org-hide-leading-stars t)
(setq org-list-allow-alphabetical t)
(setq org-src-fontify-natively t)
(setq org-src-tab-acts-natively t)
(setq org-src-preserve-indentation t)
(setq org-hide-emphasis-markers nil)
(setq org-pretty-entities nil)
(setq org-confirm-babel-evaluate nil) ;; Do not ask for confirmation all the time!!
(org-babel-do-load-languages
'org-babel-load-languages
'(
(emacs-lisp . t)
(shell . t)
(python . t)
(C . t)
(org . t)
(makefile . t)
))
(add-hook 'org-babel-after-execute-hook 'org-display-inline-images)
(setq-default indent-tabs-mode nil)
(require 'evil)
(setq evil-want-C-i-jump nil)
(evil-mode 1)
(global-font-lock-mode t)
(global-superword-mode 1)
(line-number-mode 1)
(column-number-mode 1)
(evil-select-search-module 'evil-search-module 'evil-search)
(global-set-key (kbd "C-+") 'text-scale-increase)
(global-set-key (kbd "C--") 'text-scale-decrease)
(custom-set-variables
;; custom-set-variables was added by Custom.
;; If you edit it by hand, you could mess it up, so be careful.
;; Your init file should contain only one such instance.
;; If there is more than one, they won't work right.
'(ansi-color-faces-vector
[default default default italic underline success warning error])
'(custom-enabled-themes (quote (leuven)))
)

11
tools/nb_to_org.sh Executable file

@ -0,0 +1,11 @@
#!/bin/bash
# $ nb_to_org.sh notebook.ipynb
# produces the org-mode file notebook.org
set -e
nb=$(basename "$1" .ipynb)
jupyter nbconvert --to markdown "${nb}.ipynb" --output "${nb}.md"
pandoc "${nb}.md" -o "${nb}.org"
rm "${nb}.md"