diff --git a/INSTALL.rst b/INSTALL.rst index 336d350c..e37d31eb 100644 --- a/INSTALL.rst +++ b/INSTALL.rst @@ -2,9 +2,9 @@ Installation ============ -The |qp| can be downloaded on GitHub as an `archive -`_ or as a `git -repository `_. +|qp| can be downloaded on GitHub as an `archive +`_ or as a `git +repository `_. .. code:: bash @@ -19,16 +19,16 @@ Before anything, go into your :file:`quantum_package` directory and run This script will create the :file:`quantum_package.rc` bash script, which -sets all the environment variables required for the normal operation of the -*Quantum Package*. It will also initialize the git submodules that are +sets all the environment variables required for the normal operation of +|qp|. It will also initialize the git submodules that are required, and tell you which external dependencies are missing and need to be installed. The required dependencies are located in the -`external/qp2-dependencies` directory, such that once QP is configured the +`external/qp2-dependencies` directory, such that once |qp| is configured the internet connection is not needed any more. When all dependencies have been installed, (the :command:`configure` will -inform you) source the :file:`quantum_package.rc` in order to load all -environment variables and compile the |QP|. +inform you what is missing) source the :file:`quantum_package.rc` in order to +load all environment variables and compile |QP|. Now all the requirements are met, you can compile the programs using @@ -37,6 +37,15 @@ Now all the requirements are met, you can compile the programs using make +Installation of dependencies via a Conda environment +==================================================== + +.. code:: bash + + conda env create -f qp2.yml + + + Requirements ============ @@ -64,8 +73,8 @@ architecture. Modify it if needed, and run :command:`configure` with .. 
code:: bash - cp ./config/gfortran.example config/gfortran.cfg - ./configure -c config/gfortran.cfg + cp ./config/gfortran.example config/gfortran_avx.cfg + ./configure -c config/gfortran_avx.cfg .. note:: @@ -86,45 +95,33 @@ The command is to be used as follows: .. code:: bash - ./configure --install= + ./configure -i The following packages are supported by the :command:`configure` installer: * ninja -* irpf90 * zeromq * f77zmq * gmp * ocaml (:math:`\approx` 5 minutes) -* ezfio * docopt * resultsFile * bats +* zlib Example: .. code:: bash - ./configure -i ezfio + ./configure -i ninja -.. note:: - - When installing the ocaml package, you will be asked the location of where - it should be installed. A safe option is to enter the path proposed by the - |QP|: - - QP>> Please install it here: /your_quantum_package_directory/bin - - So just enter the proposition of the |QP| and press enter. If the :command:`configure` executable fails to install a specific dependency ----------------------------------------------------------------------------- -If the :command:`configure` executable does not succeed to install a specific -dependency, there are some proposition of how to download and install the -minimal dependencies to compile and use the |QP|. - +If the :command:`configure` executable does not succeed in installing a specific +dependency, you should try to install the dependency on your system by yourself. Before doing anything below, try to install the packages with your package manager (:command:`apt`, :command:`yum`, etc). @@ -149,11 +146,11 @@ IRPF90 *IRPF90* is a Fortran code generator for programming using the Implicit Reference to Parameters (IRP) method. -If you have *pip* for Python2, you can do +If you have *pip* for Python2, you can do .. 
code:: bash - python2 -m pip install --user irpf90 + python3 -m pip install --user irpf90 Otherwise, @@ -262,53 +259,6 @@ With Debian or Ubuntu, you can use sudo apt install libgmp-dev -libcap ------- - -Libcap is a library for getting and setting POSIX.1e draft 15 capabilities. - -* Download the latest version of libcap here: - ``_ - and move it in the :file:`${QP_ROOT}/external` directory - -* Extract the archive, go into the :file:`libcap-*/libcap` directory and run - the following command - -.. code:: bash - - prefix=$QP_ROOT make install - -With Debian or Ubuntu, you can use - -.. code:: bash - - sudo apt install libcap-dev - - -Bubblewrap ----------- - -Bubblewrap is an unprivileged sandboxing tool. - -* Download Bubblewrap here: - ``_ - and move it in the :file:`${QP_ROOT}/external` directory - -* Extract the archive, go into the :file:`bubblewrap-*` directory and run - the following commands - -.. code:: bash - - ./configure --prefix=$QP_ROOT && make -j 8 - make install-exec-am - - -With Debian or Ubuntu, you can use - -.. code:: bash - - sudo apt install bubblewrap - OCaml @@ -327,7 +277,7 @@ OCaml ``_ and move it in the :file:`${QP_ROOT}/external` directory -* If you use OCaml only with the |qp|, you can install the OPAM directory +* If you use OCaml only with |qp|, you can install the OPAM directory containing the compiler and all the installed libraries in the :file:`${QP_ROOT}/external` directory as @@ -352,14 +302,14 @@ OCaml .. code:: bash - opam init --comp=4.07.1 + opam init --comp=4.11.1 eval `${QP_ROOT}/bin/opam env` If the installation fails because of bwrap, you can initialize opam using: .. 
code:: bash - opam init --disable-sandboxing --comp=4.07.1 + opam init --disable-sandboxing --comp=4.11.1 eval `${QP_ROOT}/bin/opam env` * Install the required external OCaml libraries @@ -369,17 +319,6 @@ OCaml opam install ocamlbuild cryptokit zmq sexplib ppx_sexp_conv ppx_deriving getopt -EZFIO ------ - -*EZFIO* is the Easy Fortran Input/Output library generator. - -* Download EZFIO here : ``_ and move - the downloaded archive in the :file:`${QP_ROOT}/external` directory - -* Extract the archive, and rename it as :file:`${QP_ROOT}/external/ezfio` - - Docopt ------ @@ -406,7 +345,7 @@ resultsFile *resultsFile* is a Python package to extract data from output files of quantum chemistry codes. -If you have *pip* for Python3, you can do +If you have *pip* for Python3, you can do .. code:: bash @@ -414,3 +353,4 @@ If you have *pip* for Python3, you can do + diff --git a/RELEASE_NOTES.org b/RELEASE_NOTES.org index 98830f3f..01875f10 100644 --- a/RELEASE_NOTES.org +++ b/RELEASE_NOTES.org @@ -31,6 +31,7 @@ - Fixed bug in molden (Au -> Angs) - Fixed bug with non-contiguous MOs in active space and deleter MOs - Complete network-free installation + - Fixed bug in selection when computing full PT2 *** User interface @@ -58,6 +59,7 @@ symmetry in matrices - qp_export_as_tgz exports also plugin codes - Added a basis module containing basis set information + - Added qp_run truncate_wf *** Code @@ -85,7 +87,7 @@ - Using Intel IPP for sorting when using Intel compiler - Removed parallelism in sorting - Compute banned_excitations from exchange integrals to accelerate with local MOs - + diff --git a/bin/qp_convert_output_to_ezfio b/bin/qp_convert_output_to_ezfio index b6e99176..9412b090 100755 --- a/bin/qp_convert_output_to_ezfio +++ b/bin/qp_convert_output_to_ezfio @@ -195,48 +195,52 @@ def write_ezfio(res, filename): # P a r s i n g # # ~#~#~#~#~#~#~ # + inucl = {} + for i, a in enumerate(res.geometry): + inucl[a.coord] = i + nbasis = 0 - nucl_center = [] + nucl_index = [] 
curr_center = -1 nucl_shell_num = [] ang_mom = [] nshell = 0 - shell_prim_index = [1] + nshell_tot = 0 + shell_index = [] shell_prim_num = [] for b in res.basis: s = b.sym if str.count(s, "y") + str.count(s, "x") == 0: - c = b.center + c = inucl[b.center] nshell += 1 + nshell_tot += 1 if c != curr_center: curr_center = c - nucl_center.append(nbasis+1) nucl_shell_num.append(nshell) nshell = 0 nbasis += 1 + nucl_index.append(c+1) coefficient += b.coef[:len(b.prim)] exponent += [p.expo for p in b.prim] ang_mom.append(str.count(s, "z")) - shell_prim_index.append(len(exponent)+1) shell_prim_num.append(len(b.prim)) - - nucl_shell_num.append(nshell+1) - nucl_shell_num = nucl_shell_num[1:] + shell_index += [nshell_tot+1] * len(b.prim) # ~#~#~#~#~ # # W r i t e # # ~#~#~#~#~ # ezfio.set_basis_basis("Read from ResultsFile") - ezfio.set_basis_basis_nucleus_index(nucl_center) - ezfio.set_basis_prim_num(len(coefficient)) ezfio.set_basis_shell_num(len(ang_mom)) + ezfio.set_basis_basis_nucleus_index(nucl_index) + ezfio.set_basis_prim_num(len(coefficient)) + ezfio.set_basis_nucleus_shell_num(nucl_shell_num) ezfio.set_basis_prim_coef(coefficient) ezfio.set_basis_prim_expo(exponent) ezfio.set_basis_shell_ang_mom(ang_mom) ezfio.set_basis_shell_prim_num(shell_prim_num) - ezfio.set_basis_shell_prim_index(shell_prim_index) + ezfio.set_basis_shell_index(shell_index) print("OK") @@ -289,12 +293,17 @@ def write_ezfio(res, filename): for i in range(mo_num): energies.append(MOs[i].eigenvalue) + OccNum = [] if res.occ_num is not None: - OccNum = [] for i in MOindices: OccNum.append(res.occ_num[MO_type][i]) + else: + for i in range(res.num_beta): + OccNum.append(2.) + for i in range(res.num_beta,res.num_alpha): + OccNum.append(1.) - while len(OccNum) < mo_num: + while len(OccNum) < mo_num: OccNum.append(0.) 
MoMatrix = [] @@ -317,8 +326,9 @@ def write_ezfio(res, filename): # ~#~#~#~#~ # ezfio.set_mo_basis_mo_num(mo_num) - ezfio.set_mo_basis_mo_occ(OccNum) ezfio.set_mo_basis_mo_coef(MoMatrix) + ezfio.set_mo_basis_mo_occ(OccNum) + print("OK") diff --git a/bin/qp_gaussian b/bin/qp_gaussian new file mode 100755 index 00000000..a059a023 --- /dev/null +++ b/bin/qp_gaussian @@ -0,0 +1,83 @@ +#!/usr/bin/env python3 +# +""" +Runs a Quantum Package calculation using a Gaussian input file. + +Usage: + qp_gaussian INPUT + +""" + +# Requires pymatgen (https://pymatgen.org/) +# pip install pymatgen + + +import os +import sys +import os.path + +try: + import qp_path +except ImportError: + print("source quantum_package.rc") + +from docopt import docopt +import pymatgen +from pymatgen.io.gaussian import GaussianInput + + +def main(arguments): + + filename = arguments["INPUT"] + + with open(filename,'r') as f: + text = f.read() + + in_file = GaussianInput.from_string(text) + + d = in_file.as_dict() + charge = ("%d"%(d["charge"])).replace('-','m') + basis = d["basis_set"] + mult = d["spin_multiplicity"] + natoms = len(d["molecule"]["sites"]) + with open("g09.xyz","w") as f: + f.write("%d\n"%natoms) + f.write("%s\n"%d["title"]) + f.write("%s\n"%in_file.get_cart_coords()) + + if basis is None: + print("Basis set not found. 
Use '/' before basis set") + sys.exit(1) + + command = f"rm -rf g09.ezfio" + os.system(command) + + command = f"qp_create_ezfio -c {charge} -m {mult} g09.xyz -b {basis} -o g09.ezfio" + os.system(command) + + command = f"rm -rf g09.xyz" + os.system(command) + + command = f"qp_run scf g09.ezfio" + os.system(command) + + command = f"qp_set_frozen_core g09.ezfio" + os.system(command) + + if d["functional"] == "FCI": + command = f"qp_run fci g09.ezfio" + elif d["functional"] == "CIS": + command = f"qp_run cis g09.ezfio" + elif d["functional"] == "CISD": + command = f"qp_run cisd g09.ezfio" + + os.system(command) + + + + + + +if __name__ == '__main__': + ARGUMENTS = docopt(__doc__) + main(ARGUMENTS) diff --git a/config/gfortran.cfg b/config/gfortran.cfg index 342acae9..56bb6ba4 100644 --- a/config/gfortran.cfg +++ b/config/gfortran.cfg @@ -13,7 +13,7 @@ FC : gfortran -g -ffree-line-length-none -I . -fPIC LAPACK_LIB : -lblas -llapack IRPF90 : irpf90 -IRPF90_FLAGS : --ninja --align=32 --assert +IRPF90_FLAGS : --ninja --align=32 --assert -DSET_NESTED # Global options ################ @@ -35,14 +35,14 @@ OPENMP : 1 ; Append OpenMP flags # -ffast-math and the Fortran-specific # -fno-protect-parens and -fstack-arrays. [OPT] -FCFLAGS : -Ofast +FCFLAGS : -Ofast # Profiling flags ################# # [PROFILE] FC : -p -g -FCFLAGS : -Ofast +FCFLAGS : -Ofast # Debugging flags ################# @@ -58,5 +58,5 @@ FCFLAGS : -g -msse4.2 -fcheck=all -Waliasing -Wampersand -Wconversion -Wsurpris # [OPENMP] FC : -fopenmp -IRPF90_FLAGS : --openmp +IRPF90_FLAGS : --openmp diff --git a/config/gfortran_avx.cfg b/config/gfortran_avx.cfg index 4f45e3a1..747dff67 100644 --- a/config/gfortran_avx.cfg +++ b/config/gfortran_avx.cfg @@ -13,7 +13,7 @@ FC : gfortran -ffree-line-length-none -I . 
-mavx -g -fPIC LAPACK_LIB : -llapack -lblas IRPF90 : irpf90 -IRPF90_FLAGS : --ninja --align=32 +IRPF90_FLAGS : --ninja --align=32 -DSET_NESTED # Global options ################ @@ -42,7 +42,7 @@ FCFLAGS : -Ofast -mavx # [PROFILE] FC : -p -g -FCFLAGS : -Ofast +FCFLAGS : -Ofast # Debugging flags ################# @@ -51,12 +51,12 @@ FCFLAGS : -Ofast # -g : Extra debugging information # [DEBUG] -FCFLAGS : -fcheck=all -g +FCFLAGS : -fcheck=all -g # OpenMP flags ################# # [OPENMP] FC : -fopenmp -IRPF90_FLAGS : --openmp +IRPF90_FLAGS : --openmp diff --git a/config/gfortran_debug.cfg b/config/gfortran_debug.cfg index 926255e0..51e5a500 100644 --- a/config/gfortran_debug.cfg +++ b/config/gfortran_debug.cfg @@ -13,7 +13,7 @@ FC : gfortran -g -ffree-line-length-none -I . -fPIC LAPACK_LIB : -lblas -llapack IRPF90 : irpf90 -IRPF90_FLAGS : --ninja --align=32 --assert +IRPF90_FLAGS : --ninja --align=32 --assert -DSET_NESTED # Global options ################ @@ -35,14 +35,14 @@ OPENMP : 1 ; Append OpenMP flags # -ffast-math and the Fortran-specific # -fno-protect-parens and -fstack-arrays. [OPT] -FCFLAGS : -Ofast +FCFLAGS : -Ofast # Profiling flags ################# # [PROFILE] FC : -p -g -FCFLAGS : -Ofast +FCFLAGS : -Ofast # Debugging flags ################# @@ -59,5 +59,5 @@ FCFLAGS : -g -msse4.2 -fcheck=all -Waliasing -Wampersand -Wconversion -Wsurpris # [OPENMP] FC : -fopenmp -IRPF90_FLAGS : --openmp +IRPF90_FLAGS : --openmp diff --git a/config/gfortran_mpi.cfg b/config/gfortran_mpi.cfg index d72160c1..1af3ca45 100644 --- a/config/gfortran_mpi.cfg +++ b/config/gfortran_mpi.cfg @@ -13,7 +13,7 @@ FC : mpif90 -ffree-line-length-none -I . -g -fPIC LAPACK_LIB : -lblas -llapack IRPF90 : irpf90 -IRPF90_FLAGS : --ninja --align=32 -DMPI +IRPF90_FLAGS : --ninja --align=32 -DMPI -DSET_NESTED # Global options ################ @@ -35,14 +35,14 @@ OPENMP : 1 ; Append OpenMP flags # -ffast-math and the Fortran-specific # -fno-protect-parens and -fstack-arrays. 
[OPT] -FCFLAGS : -Ofast -msse4.2 +FCFLAGS : -Ofast -msse4.2 # Profiling flags ################# # [PROFILE] FC : -p -g -FCFLAGS : -Ofast -msse4.2 +FCFLAGS : -Ofast -msse4.2 # Debugging flags ################# @@ -51,7 +51,7 @@ FCFLAGS : -Ofast -msse4.2 # -g : Extra debugging information # [DEBUG] -FCFLAGS : -fcheck=all -g +FCFLAGS : -fcheck=all -g # OpenMP flags ################# diff --git a/config/ifort_avx.cfg b/config/ifort_2019_avx.cfg similarity index 96% rename from config/ifort_avx.cfg rename to config/ifort_2019_avx.cfg index a2cb4c8a..661a0e8f 100644 --- a/config/ifort_avx.cfg +++ b/config/ifort_2019_avx.cfg @@ -9,7 +9,7 @@ FC : ifort -fpic LAPACK_LIB : -mkl=parallel -lirc -lsvml -limf -lipps IRPF90 : irpf90 -IRPF90_FLAGS : --ninja --align=32 -DINTEL +IRPF90_FLAGS : --ninja --align=32 -DINTEL -DSET_NESTED # Global options ################ diff --git a/config/ifort_avx_mpi.cfg b/config/ifort_2019_avx_mpi.cfg similarity index 96% rename from config/ifort_avx_mpi.cfg rename to config/ifort_2019_avx_mpi.cfg index f2bb8889..2d212db5 100644 --- a/config/ifort_avx_mpi.cfg +++ b/config/ifort_2019_avx_mpi.cfg @@ -9,7 +9,7 @@ FC : mpiifort -fpic LAPACK_LIB : -mkl=parallel -lirc -lsvml -limf -lipps IRPF90 : irpf90 -IRPF90_FLAGS : --ninja --align=32 -DMPI -DINTEL +IRPF90_FLAGS : --ninja --align=32 -DMPI -DINTEL -DSET_NESTED # Global options ################ diff --git a/config/ifort_rome.cfg b/config/ifort_2019_rome.cfg similarity index 96% rename from config/ifort_rome.cfg rename to config/ifort_2019_rome.cfg index 5ed01227..e923a1dd 100644 --- a/config/ifort_rome.cfg +++ b/config/ifort_2019_rome.cfg @@ -9,7 +9,7 @@ FC : ifort -fpic LAPACK_LIB : -mkl=parallel -lirc -lsvml -limf -lipps IRPF90 : irpf90 -IRPF90_FLAGS : --ninja --align=32 -DINTEL +IRPF90_FLAGS : --ninja --align=32 -DINTEL -DSET_NESTED # Global options ################ diff --git a/config/ifort.cfg b/config/ifort_2019_sse4.cfg similarity index 96% rename from config/ifort.cfg rename to 
config/ifort_2019_sse4.cfg index 714c4b10..a3aa7cbd 100644 --- a/config/ifort.cfg +++ b/config/ifort_2019_sse4.cfg @@ -9,7 +9,7 @@ FC : ifort -fpic LAPACK_LIB : -mkl=parallel -lirc -lsvml -limf -lipps IRPF90 : irpf90 -IRPF90_FLAGS : --ninja --align=32 -DINTEL +IRPF90_FLAGS : --ninja --align=32 -DINTEL -DSET_NESTED # Global options ################ diff --git a/config/ifort_mpi.cfg b/config/ifort_2019_sse4_mpi.cfg similarity index 96% rename from config/ifort_mpi.cfg rename to config/ifort_2019_sse4_mpi.cfg index e0d489a0..6959d176 100644 --- a/config/ifort_mpi.cfg +++ b/config/ifort_2019_sse4_mpi.cfg @@ -9,7 +9,7 @@ FC : mpiifort -fpic LAPACK_LIB : -mkl=parallel -lirc -lsvml -limf -lipps IRPF90 : irpf90 -IRPF90_FLAGS : --ninja --align=32 -DMPI -DINTEL +IRPF90_FLAGS : --ninja --align=32 -DMPI -DINTEL -DSET_NESTED # Global options ################ diff --git a/config/ifort_xHost.cfg b/config/ifort_2019_xHost.cfg similarity index 96% rename from config/ifort_xHost.cfg rename to config/ifort_2019_xHost.cfg index ddb4aa2d..22d28803 100644 --- a/config/ifort_xHost.cfg +++ b/config/ifort_2019_xHost.cfg @@ -9,7 +9,7 @@ FC : ifort -fpic LAPACK_LIB : -mkl=parallel -lirc -lsvml -limf -lipps IRPF90 : irpf90 -IRPF90_FLAGS : --ninja --align=64 -DINTEL +IRPF90_FLAGS : --ninja --align=64 -DINTEL -DSET_NESTED # Global options ################ diff --git a/config/ifort_2021_avx.cfg b/config/ifort_2021_avx.cfg new file mode 100644 index 00000000..8fadda67 --- /dev/null +++ b/config/ifort_2021_avx.cfg @@ -0,0 +1,63 @@ +# Common flags +############## +# +# -mkl=[parallel|sequential] : Use the MKL library +# --ninja : Allow the utilisation of ninja. It is mandatory ! 
+# --align=32 : Align all provided arrays on a 32-byte boundary +# +[COMMON] +FC : ifort -fpic +LAPACK_LIB : -qmkl=parallel -lirc -lsvml -limf -lipps +IRPF90 : irpf90 +IRPF90_FLAGS : --ninja --align=32 -DINTEL + +# Global options +################ +# +# 1 : Activate +# 0 : Deactivate +# +[OPTION] +MODE : OPT ; [ OPT | PROFILE | DEBUG ] : Chooses the section below +CACHE : 0 ; Enable cache_compile.py +OPENMP : 1 ; Append OpenMP flags + +# Optimization flags +#################### +# +# -xHost : Compile a binary optimized for the current architecture +# -O2 : O3 not better than O2. +# -ip : Inter-procedural optimizations +# -ftz : Flushes denormal results to zero +# +[OPT] +FC : -traceback +FCFLAGS : -xAVX -O2 -ip -ftz -g + +# Profiling flags +################# +# +[PROFILE] +FC : -p -g +FCFLAGS : -xSSE4.2 -O2 -ip -ftz + +# Debugging flags +################# +# +# -traceback : Activate backtrace on runtime +# -fpe0 : All floating point exaceptions +# -C : Checks uninitialized variables, array subscripts, etc... +# -g : Extra debugging information +# -xSSE2 : Valgrind needs a very simple x86 executable +# +[DEBUG] +FC : -g -traceback +FCFLAGS : -xSSE2 -C -fpe0 -implicitnone + +# OpenMP flags +################# +# +[OPENMP] +FC : -qopenmp +IRPF90_FLAGS : --openmp + diff --git a/config/ifort_2021_avx_mpi.cfg b/config/ifort_2021_avx_mpi.cfg new file mode 100644 index 00000000..b6b74b73 --- /dev/null +++ b/config/ifort_2021_avx_mpi.cfg @@ -0,0 +1,64 @@ +# Common flags +############## +# +# -mkl=[parallel|sequential] : Use the MKL library +# --ninja : Allow the utilisation of ninja. It is mandatory ! 
+# --align=32 : Align all provided arrays on a 32-byte boundary +# +[COMMON] +FC : mpiifort -fpic +LAPACK_LIB : -qmkl=parallel -lirc -lsvml -limf -lipps +IRPF90 : irpf90 +IRPF90_FLAGS : --ninja --align=32 -DMPI -DINTEL + +# Global options +################ +# +# 1 : Activate +# 0 : Deactivate +# +[OPTION] +MODE : OPT ; [ OPT | PROFILE | DEBUG ] : Chooses the section below +CACHE : 0 ; Enable cache_compile.py +OPENMP : 1 ; Append OpenMP flags + +# Optimization flags +#################### +# +# -xHost : Compile a binary optimized for the current architecture +# -O2 : O3 not better than O2. +# -ip : Inter-procedural optimizations +# -ftz : Flushes denormal results to zero +# +[OPT] +FCFLAGS : -mavx -axAVX -O2 -ip -ftz -g -traceback + +# Profiling flags +################# +# +[PROFILE] +FC : -p -g +FCFLAGS : -march=corei7 -O2 -ip -ftz + + +# Debugging flags +################# +# +# -traceback : Activate backtrace on runtime +# -fpe0 : All floating point exaceptions +# -C : Checks uninitialized variables, array subscripts, etc... +# -g : Extra debugging information +# -xSSE2 : Valgrind needs a very simple x86 executable +# +[DEBUG] +FC : -g -traceback +FCFLAGS : -xSSE2 -C -fpe0 -implicitnone + + +# OpenMP flags +################# +# +[OPENMP] +FC : -qopenmp +IRPF90_FLAGS : --openmp + diff --git a/config/ifort_2021_sse4.cfg b/config/ifort_2021_sse4.cfg new file mode 100644 index 00000000..269023da --- /dev/null +++ b/config/ifort_2021_sse4.cfg @@ -0,0 +1,63 @@ +# Common flags +############## +# +# -mkl=[parallel|sequential] : Use the MKL library +# --ninja : Allow the utilisation of ninja. It is mandatory ! 
+# --align=32 : Align all provided arrays on a 32-byte boundary +# +[COMMON] +FC : ifort -fpic +LAPACK_LIB : -qmkl=parallel -lirc -lsvml -limf -lipps +IRPF90 : irpf90 +IRPF90_FLAGS : --ninja --align=32 -DINTEL + +# Global options +################ +# +# 1 : Activate +# 0 : Deactivate +# +[OPTION] +MODE : OPT ; [ OPT | PROFILE | DEBUG ] : Chooses the section below +CACHE : 0 ; Enable cache_compile.py +OPENMP : 1 ; Append OpenMP flags + +# Optimization flags +#################### +# +# -xHost : Compile a binary optimized for the current architecture +# -O2 : O3 not better than O2. +# -ip : Inter-procedural optimizations +# -ftz : Flushes denormal results to zero +# +[OPT] +FC : -traceback +FCFLAGS : -xSSE4.2 -O2 -ip -ftz -g + +# Profiling flags +################# +# +[PROFILE] +FC : -p -g +FCFLAGS : -xSSE4.2 -O2 -ip -ftz + +# Debugging flags +################# +# +# -traceback : Activate backtrace on runtime +# -fpe0 : All floating point exaceptions +# -C : Checks uninitialized variables, array subscripts, etc... +# -g : Extra debugging information +# -xSSE2 : Valgrind needs a very simple x86 executable +# +[DEBUG] +FC : -g -traceback +FCFLAGS : -xSSE2 -C -fpe0 -implicitnone + +# OpenMP flags +################# +# +[OPENMP] +FC : -qopenmp +IRPF90_FLAGS : --openmp + diff --git a/config/ifort_2021_sse4_mpi.cfg b/config/ifort_2021_sse4_mpi.cfg new file mode 100644 index 00000000..41df87bc --- /dev/null +++ b/config/ifort_2021_sse4_mpi.cfg @@ -0,0 +1,64 @@ +# Common flags +############## +# +# -mkl=[parallel|sequential] : Use the MKL library +# --ninja : Allow the utilisation of ninja. It is mandatory ! 
+# --align=32 : Align all provided arrays on a 32-byte boundary +# +[COMMON] +FC : mpiifort -fpic +LAPACK_LIB : -qmkl=parallel -lirc -lsvml -limf -lipps +IRPF90 : irpf90 +IRPF90_FLAGS : --ninja --align=32 -DMPI -DINTEL + +# Global options +################ +# +# 1 : Activate +# 0 : Deactivate +# +[OPTION] +MODE : OPT ; [ OPT | PROFILE | DEBUG ] : Chooses the section below +CACHE : 0 ; Enable cache_compile.py +OPENMP : 1 ; Append OpenMP flags + +# Optimization flags +#################### +# +# -xHost : Compile a binary optimized for the current architecture +# -O2 : O3 not better than O2. +# -ip : Inter-procedural optimizations +# -ftz : Flushes denormal results to zero +# +[OPT] +FCFLAGS : -msse4.2 -O2 -ip -ftz -g -traceback + +# Profiling flags +################# +# +[PROFILE] +FC : -p -g +FCFLAGS : -msse4.2 -O2 -ip -ftz + + +# Debugging flags +################# +# +# -traceback : Activate backtrace on runtime +# -fpe0 : All floating point exaceptions +# -C : Checks uninitialized variables, array subscripts, etc... +# -g : Extra debugging information +# -xSSE2 : Valgrind needs a very simple x86 executable +# +[DEBUG] +FC : -g -traceback +FCFLAGS : -xSSE2 -C -fpe0 -implicitnone + + +# OpenMP flags +################# +# +[OPENMP] +FC : -qopenmp +IRPF90_FLAGS : --openmp + diff --git a/config/ifort_2021_xHost.cfg b/config/ifort_2021_xHost.cfg new file mode 100644 index 00000000..05c271f3 --- /dev/null +++ b/config/ifort_2021_xHost.cfg @@ -0,0 +1,63 @@ +# Common flags +############## +# +# -mkl=[parallel|sequential] : Use the MKL library +# --ninja : Allow the utilisation of ninja. It is mandatory ! 
+# --align=32 : Align all provided arrays on a 32-byte boundary +# +[COMMON] +FC : ifort -fpic +LAPACK_LIB : -qmkl=parallel -lirc -lsvml -limf -lipps +IRPF90 : irpf90 +IRPF90_FLAGS : --ninja --align=64 -DINTEL + +# Global options +################ +# +# 1 : Activate +# 0 : Deactivate +# +[OPTION] +MODE : OPT ; [ OPT | PROFILE | DEBUG ] : Chooses the section below +CACHE : 0 ; Enable cache_compile.py +OPENMP : 1 ; Append OpenMP flags + +# Optimization flags +#################### +# +# -xHost : Compile a binary optimized for the current architecture +# -O2 : O3 not better than O2. +# -ip : Inter-procedural optimizations +# -ftz : Flushes denormal results to zero +# +[OPT] +FC : -traceback +FCFLAGS : -xHost -O2 -ip -ftz -g + +# Profiling flags +################# +# +[PROFILE] +FC : -p -g +FCFLAGS : -xSSE4.2 -O2 -ip -ftz + +# Debugging flags +################# +# +# -traceback : Activate backtrace on runtime +# -fpe0 : All floating point exaceptions +# -C : Checks uninitialized variables, array subscripts, etc... +# -g : Extra debugging information +# -xSSE2 : Valgrind needs a very simple x86 executable +# +[DEBUG] +FC : -g -traceback +FCFLAGS : -xSSE2 -C -fpe0 -implicitnone + +# OpenMP flags +################# +# +[OPENMP] +FC : -qopenmp +IRPF90_FLAGS : --openmp + diff --git a/config/ifort_debug.cfg b/config/ifort_debug.cfg deleted file mode 100644 index 9b718380..00000000 --- a/config/ifort_debug.cfg +++ /dev/null @@ -1,66 +0,0 @@ -# Common flags -############## -# -# -mkl=[parallel|sequential] : Use the MKL library -# --ninja : Allow the utilisation of ninja. It is mandatory ! 
-# --align=32 : Align all provided arrays on a 32-byte boundary -# -[COMMON] -FC : ifort -fpic -LAPACK_LIB : -mkl=parallel -lirc -lsvml -limf -lipps -IRPF90 : irpf90 -IRPF90_FLAGS : --ninja --align=32 --assert -DINTEL - -# Global options -################ -# -# 1 : Activate -# 0 : Deactivate -# -[OPTION] -MODE : DEBUG ; [ OPT | PROFILE | DEBUG ] : Chooses the section below -CACHE : 0 ; Enable cache_compile.py -OPENMP : 1 ; Append OpenMP flags - -# Optimization flags -#################### -# -# -xHost : Compile a binary optimized for the current architecture -# -O2 : O3 not better than O2. -# -ip : Inter-procedural optimizations -# -ftz : Flushes denormal results to zero -# -[OPT] -FC : -traceback -FCFLAGS : -msse4.2 -O2 -ip -ftz -g - - -# Profiling flags -################# -# -[PROFILE] -FC : -p -g -FCFLAGS : -msse4.2 -O2 -ip -ftz - - -# Debugging flags -################# -# -# -traceback : Activate backtrace on runtime -# -fpe0 : All floating point exaceptions -# -C : Checks uninitialized variables, array subscripts, etc... -# -g : Extra debugging information -# -msse4.2 : Valgrind needs a very simple x86 executable -# -[DEBUG] -FC : -g -traceback -FCFLAGS : -msse4.2 -check all -debug all -fpe-all=0 -implicitnone - - -# OpenMP flags -################# -# -[OPENMP] -FC : -qopenmp -IRPF90_FLAGS : --openmp - diff --git a/configure b/configure index 0debde04..65146c22 100755 --- a/configure +++ b/configure @@ -3,8 +3,6 @@ # Quantum Package configuration script # -TEMP=$(getopt -o d:c:i:h -l download:,config:,install:,help -n $0 -- "$@") || exit 1 -eval set -- "$TEMP" export QP_ROOT="$( cd "$(dirname "$0")" ; pwd -P )" echo "QP_ROOT="$QP_ROOT @@ -24,17 +22,17 @@ function help() Quantum Package configuration script. Usage: - $(basename $0) -c | --config= - $(basename $0) -h | --help - $(basename $0) -i | --install= + $(basename $0) -c + $(basename $0) -h + $(basename $0) -i Options: - -c, --config= Define a COMPILATION configuration file, - in "${QP_ROOT}/config/". 
- -h, --help Print the HELP message - -i, --install= INSTALL . Use at your OWN RISK: - no support will be provided for the installation of - dependencies. + -c Define a COMPILATION configuration file, + in "${QP_ROOT}/config/". + -h Print the HELP message + -i INSTALL . Use at your OWN RISK: + no support will be provided for the installation of + dependencies. Example: ./$(basename $0) -c config/gfortran.cfg @@ -68,32 +66,31 @@ function execute () { } PACKAGES="" +echo $@ -while true ; do - case "$1" in - -c|--config) - case "$2" in + +while getopts "d:c:i:h" c ; do + case "$c" in + c) + case "$OPTARG" in "") help ; break;; - *) if [[ -f $2 ]] ; then - CONFIG="$2" + *) if [[ -f $OPTARG ]] ; then + CONFIG="$OPTARG" else - error "error: configuration file $2 not found." + error "error: configuration file $OPTARG not found." exit 1 fi - esac - shift 2;; - -i|--install) - case "$2" in + esac;; + i) + case "$OPTARG" in "") help ; break;; - *) PACKAGES="${PACKAGE} $2" - esac - shift 2;; - -h|-help|--help) + *) PACKAGES="${PACKAGE} $OPTARG" + esac;; + h) help exit 0;; - --) shift ; break ;; *) - error $(basename $0)": unknown option $1, try --help" + error $(basename $0)": unknown option $c, try --help" exit 2;; esac done @@ -226,13 +223,11 @@ EOF execute << EOF cd "\${QP_ROOT}"/external - tar --gunzip --extract --file qp2-dependencies/f77_zmq-4.2.5.tar.gz - cd f77_zmq-* + tar --gunzip --extract --file qp2-dependencies/f77-zmq-4.3.2.tar.gz + cd f77-zmq-* + ./configure --prefix=\$QP_ROOT export ZMQ_H="\$QP_ROOT"/include/zmq.h - make - cp libf77zmq.a "\${QP_ROOT}"/lib - cp libf77zmq.so "\${QP_ROOT}"/lib - cp f77_zmq_free.h "\${QP_ROOT}"/include + make && make check && make install EOF diff --git a/ocaml/qp_create_ezfio.ml b/ocaml/qp_create_ezfio.ml index a4865e2b..8fcaf5fc 100644 --- a/ocaml/qp_create_ezfio.ml +++ b/ocaml/qp_create_ezfio.ml @@ -585,12 +585,16 @@ let run ?o b au c d m p cart xyz_file = let shell_prim_num = list_map List.length lc in - let shell_prim_idx = + 
let shell_idx = + let rec make_list n accu = function + | 0 -> accu + | i -> make_list n (n :: accu) (i-1) + in let rec aux count accu = function | [] -> List.rev accu | l::rest -> - let newcount = count+(List.length l) in - aux newcount (count::accu) rest + let new_l = make_list count accu (List.length l) in + aux (count+1) new_l rest in aux 1 [] lc in @@ -602,20 +606,12 @@ let run ?o b au c d m p cart xyz_file = ~rank:1 ~dim:[| shell_num |] ~data:shell_prim_num); Ezfio.set_basis_shell_ang_mom (Ezfio.ezfio_array_of_list ~rank:1 ~dim:[| shell_num |] ~data:ang_mom ) ; - Ezfio.set_basis_shell_prim_index (Ezfio.ezfio_array_of_list - ~rank:1 ~dim:[| shell_num |] ~data:shell_prim_idx) ; + Ezfio.set_basis_shell_index (Ezfio.ezfio_array_of_list + ~rank:1 ~dim:[| prim_num |] ~data:shell_idx) ; Ezfio.set_basis_basis_nucleus_index (Ezfio.ezfio_array_of_list - ~rank:1 ~dim:[| nucl_num |] - ~data:( - list_map (fun (_,n) -> Nucl_number.to_int n) basis - |> List.fold_left (fun accu i -> - match accu with - | [] -> [] - | (h,j) :: rest -> if j == i then ((h+1,j)::rest) else ((h+1,i)::(h+1,j)::rest) - ) [(0,0)] - |> List.rev - |> List.map fst - )) ; + ~rank:1 ~dim:[| shell_num |] + ~data:( list_map (fun (_,n) -> Nucl_number.to_int n) basis) + ) ; Ezfio.set_basis_nucleus_shell_num(Ezfio.ezfio_array_of_list ~rank:1 ~dim:[| nucl_num |] ~data:( diff --git a/scripts/verif_omp/check_omp.f90 b/scripts/verif_omp/check_omp.f90 new file mode 100644 index 00000000..ca6af8bd --- /dev/null +++ b/scripts/verif_omp/check_omp.f90 @@ -0,0 +1,175 @@ +program check_omp_v2 + + use omp_lib + + implicit none + + integer :: accu, accu2 + integer :: s, n_setting + logical :: verbose, test_versions + logical, allocatable :: is_working(:) + + verbose = .False. + test_versions = .True. + n_setting = 4 + + allocate(is_working(n_setting)) + + is_working = .False. + + ! 
set the number of threads + call omp_set_num_threads(2) + + do s = 1, n_setting + + accu = 0 + accu2 = 0 + + call omp_set_max_active_levels(1) + call omp_set_nested(.False.) + + if (s==1) then + !call set_multiple_levels_omp() + cycle + elseif (s==2) then + call omp_set_max_active_levels(5) + elseif (s==3) then + call omp_set_nested(.True.) + else + call omp_set_nested(.True.) + call omp_set_max_active_levels(5) + endif + + ! Level 1 + !$OMP PARALLEL + if (verbose) then + print*,'Num threads level 1:',omp_get_num_threads() + endif + + ! Level 2 + !$OMP PARALLEL + if (verbose) then + print*,'Num threads level 2:',omp_get_num_threads() + endif + + ! Level 3 + !$OMP PARALLEL + if (verbose) then + print*,'Num threads level 3:',omp_get_num_threads() + endif + + call check_omp_in_subroutine(accu2) + + ! Level 4 + !$OMP PARALLEL + + if (verbose) then + print*,'Num threads level 4:',omp_get_num_threads() + endif + + !$OMP ATOMIC + accu = accu + 1 + !$OMP END ATOMIC + + !$OMP END PARALLEL + + + !$OMP END PARALLEL + + + !$OMP END PARALLEL + + + !$OMP END PARALLEL + + if (verbose) then + print*,'Setting:',s,'accu=',accu + print*,'Setting:',s,'accu2=',accu2 + endif + + if (accu == 16 .and. accu2 == 16) then + is_working(s) = .True. 
+ endif + + enddo + + if (verbose) then + if (is_working(2)) then + print*,'The parallelization works on 4 levels with:' + print*,'call omp_set_max_active_levels(5)' + print*,'' + print*,'Please use the irpf90 flag -DSET_MAX_ACT in qp2/config/${compiler_name}.cfg' + elseif (is_working(3)) then + print*,'The parallelization works on 4 levels with:' + print*,'call omp_set_nested(.True.)' + print*,'' + print*,'Please use the irpf90 flag -DSET_NESTED in qp2/config/${compiler_name}.cfg' + elseif (is_working(4)) then + print*,'The parallelization works on 4 levels with:' + print*,'call omp_set_nested(.True.)' + print*,'+' + print*,'call omp_set_max_active_levels(5)' + print*,'' + print*,'Please use the irpf90 flags -DSET_NESTED -DSET_MAX_ACT in qp2/config/${compiler_name}.cfg' + else + print*,'The parallelization on multiple levels does not work with:' + print*,'call omp_set_max_active_levels(5)' + print*,'or' + print*,'call omp_set_nested(.True.)' + print*,'or' + print*,'call omp_set_nested(.True.)' + print*,'+' + print*,'call omp_set_max_active_levels(5)' + print*,'' + print*,'Try another compiler and good luck...' + endif + + ! if (is_working(1)) then + ! print*,'' + ! print*,'==========================================================' + ! print*,'Your actual set up works for parallelization with 4 levels' + ! print*,'==========================================================' + ! print*,'' + ! else + ! print*,'' + ! print*,'===================================================================' + ! print*,'Your actual set up does not work for parallelization with 4 levels' + ! print*,'Please look at the previous messages to understand the requirements' + ! print*,'===================================================================' + ! print*,'' + ! endif + endif + + ! List of working flags + if (test_versions) then + print*,'Tests:',is_working(2:4) + endif + + ! 
IRPF90_FLAGS + if (is_working(2)) then + print*,'-DSET_MAX_ACT' + elseif (is_working(3)) then + print*,'-DSET_NESTED' + elseif (is_working(4)) then + print*,'-DSET_MAX_ACT -DSET_NESTED' + else + print*,'ERROR' + endif + +end + +subroutine check_omp_in_subroutine(accu2) + + implicit none + + integer, intent(inout) :: accu2 + + !$OMP PARALLEL + + !$OMP ATOMIC + accu2 = accu2 + 1 + !$OMP END ATOMIC + + !$OMP END PARALLEL + +end diff --git a/scripts/verif_omp/check_required_setup.sh b/scripts/verif_omp/check_required_setup.sh new file mode 100755 index 00000000..367530b6 --- /dev/null +++ b/scripts/verif_omp/check_required_setup.sh @@ -0,0 +1,19 @@ +#!/bin/sh + +# take one argument which is the compiler used +# return the required IRPF90_FLAGS for the $1 compiler + +if [ -z "$1" ] +then + echo "Give the compiler in argument" +else + +$1 --version > /dev/null \ +&& $1 -O0 -fopenmp check_omp.f90 \ +&& ./a.out | tail -n 1 + + +# if there is an error or if the compiler is not found +$1 --version > /dev/null || echo 'compiler not found' + +fi diff --git a/scripts/verif_omp/study_omp.sh b/scripts/verif_omp/study_omp.sh new file mode 100755 index 00000000..900d04e1 --- /dev/null +++ b/scripts/verif_omp/study_omp.sh @@ -0,0 +1,30 @@ +#!/bin/sh + +# list of compilers +list_comp="ifort gfortran-7 gfortran-8 gfortran-9" + +# file to store the results +FILE=results.dat + +touch $FILE +rm $FILE + +# Comments +echo "1: omp_set_max_active_levels(5)" >> $FILE +echo "2: omp_set_nested(.True.)" >> $FILE +echo "3: 1 + 2" >> $FILE +echo "" >> $FILE +echo "1 2 3" >> $FILE + +# loop on the comp +for comp in $list_comp +do + $comp --version > /dev/null \ + && $comp -O0 -fopenmp check_omp.f90 \ + && echo $(./a.out | grep "Tests:" | cut -d ":" -f2- ) $(echo " : ") $($comp --version | head -n 1) >> $FILE + +done + +# Display +cat $FILE + diff --git a/scripts/verif_omp/update_comp.sh b/scripts/verif_omp/update_comp.sh new file mode 100755 index 00000000..14b644de --- /dev/null +++ 
b/scripts/verif_omp/update_comp.sh @@ -0,0 +1,49 @@ +#!/bin/bash + +# Compiler +COMP=$1 + +# Path to file.cfg +config_PATH="../../config/" +END="*.cfg" +CONFIG="/config/" + +#LIST=${config_PATH}${COMP}${END} # without ${QP_ROOT} +LIST=${QP_ROOT}${CONFIG}${COMP}${END} + +if [ -z "$1" ] +then + echo "Give the compiler in argument" +else + + # List of the config files for the compiler + #list_files=$(ls ../../config/$comp*.cfg) #does not give the right list + list_files=${LIST} + echo "Files that will be modified:" + echo $list_files + + # Flags that must be added + FLAGS=$(./check_required_setup.sh $COMP) + + # Add the flags + for file in $list_files + do + echo $file + BASE="IRPF90_FLAGS : --ninja" + ACTUAL=$(grep "$BASE" $file) + + # To have only one time each flag + grep " -DSET_MAX_ACT" $file && ${ACTUAL/" -DSET_MAX"/""} + grep " -DSET_NESTED" $file && ${ACTUAL/" -DSET_NESTED"/""} + SPACE=" " + + NEW=${ACTUAL}${SPACE}${FLAGS} + + # Debug + #echo ${NEW} + + sed "s/${ACTUAL}/${NEW}/" $file + # -i # to change the files + done + +fi diff --git a/src/ao_basis/aos.irp.f b/src/ao_basis/aos.irp.f index d2ce1ab2..1cbd3976 100644 --- a/src/ao_basis/aos.irp.f +++ b/src/ao_basis/aos.irp.f @@ -21,6 +21,21 @@ BEGIN_PROVIDER [ integer, ao_shell, (ao_num) ] enddo enddo +END_PROVIDER + +BEGIN_PROVIDER [ integer, ao_first_of_shell, (shell_num) ] + implicit none + BEGIN_DOC + ! 
Index of the first AO of each shell + END_DOC + integer :: i, j, k, n + k=1 + do i=1,shell_num + ao_first_of_shell(i) = k + n = shell_ang_mom(i)+1 + k = k+(n*(n+1))/2 + enddo + END_PROVIDER BEGIN_PROVIDER [ double precision, ao_coef_normalized, (ao_num,ao_prim_num_max) ] diff --git a/src/ao_one_e_ints/pot_ao_pseudo_ints.irp.f b/src/ao_one_e_ints/pot_ao_pseudo_ints.irp.f index 24f43311..e75ca056 100644 --- a/src/ao_one_e_ints/pot_ao_pseudo_ints.irp.f +++ b/src/ao_one_e_ints/pot_ao_pseudo_ints.irp.f @@ -28,6 +28,7 @@ BEGIN_PROVIDER [ double precision, ao_pseudo_integrals, (ao_num,ao_num)] END_PROVIDER BEGIN_PROVIDER [ double precision, ao_pseudo_integrals_local, (ao_num,ao_num)] + use omp_lib implicit none BEGIN_DOC ! Local pseudo-potential @@ -42,7 +43,6 @@ BEGIN_PROVIDER [ double precision, ao_pseudo_integrals_local, (ao_num,ao_num)] double precision :: wall_1, wall_2, wall_0 integer :: thread_num - integer :: omp_get_thread_num double precision :: c double precision :: Z @@ -158,6 +158,7 @@ BEGIN_PROVIDER [ double precision, ao_pseudo_integrals_local, (ao_num,ao_num)] BEGIN_PROVIDER [ double precision, ao_pseudo_integrals_non_local, (ao_num,ao_num)] + use omp_lib implicit none BEGIN_DOC ! 
Non-local pseudo-potential @@ -169,7 +170,6 @@ BEGIN_PROVIDER [ double precision, ao_pseudo_integrals_local, (ao_num,ao_num)] integer :: power_A(3),power_B(3) integer :: i,j,k,l,m double precision :: Vloc, Vpseudo - integer :: omp_get_thread_num double precision :: wall_1, wall_2, wall_0 integer :: thread_num diff --git a/src/basis/EZFIO.cfg b/src/basis/EZFIO.cfg index 7f2ede4c..a6864418 100644 --- a/src/basis/EZFIO.cfg +++ b/src/basis/EZFIO.cfg @@ -37,16 +37,16 @@ doc: Number of primitives in a shell size: (basis.shell_num) interface: ezfio, provider -[shell_prim_index] +[shell_index] type: integer -doc: Max number of primitives in a shell -size: (basis.shell_num) +doc: Index of the shell for each primitive +size: (basis.prim_num) interface: ezfio, provider [basis_nucleus_index] type: integer -doc: Index of the nucleus on which the shell is centered -size: (nuclei.nucl_num) +doc: Nucleus on which the shell is centered +size: (basis.shell_num) interface: ezfio, provider [prim_normalization_factor] diff --git a/src/basis/basis.irp.f b/src/basis/basis.irp.f index 6a406e28..b750d75a 100644 --- a/src/basis/basis.irp.f +++ b/src/basis/basis.irp.f @@ -30,8 +30,10 @@ BEGIN_PROVIDER [ double precision, shell_normalization_factor , (shell_num) ] powA(3) = 0 norm = 0.d0 - do k=shell_prim_index(i),shell_prim_index(i)+shell_prim_num(i)-1 - do j=shell_prim_index(i),shell_prim_index(i)+shell_prim_num(i)-1 + do k=1, prim_num + if (shell_index(k) /= i) cycle + do j=1, prim_num + if (shell_index(j) /= i) cycle call overlap_gaussian_xyz(C_A,C_A,prim_expo(j),prim_expo(k), & powA,powA,overlap_x,overlap_y,overlap_z,c,nz) norm = norm+c*prim_coef(j)*prim_coef(k) * prim_normalization_factor(j) * prim_normalization_factor(k) @@ -91,7 +93,8 @@ BEGIN_PROVIDER [ double precision, prim_normalization_factor , (prim_num) ] powA(2) = 0 powA(3) = 0 - do k=shell_prim_index(i),shell_prim_index(i)+shell_prim_num(i)-1 + do k=1, prim_num + if (shell_index(k) /= i) cycle call 
overlap_gaussian_xyz(C_A,C_A,prim_expo(k),prim_expo(k), & powA,powA,overlap_x,overlap_y,overlap_z,norm,nz) prim_normalization_factor(k) = 1.d0/dsqrt(norm) diff --git a/src/cipsi/pt2_stoch_routines.irp.f b/src/cipsi/pt2_stoch_routines.irp.f index 3594aaf2..3fa2641a 100644 --- a/src/cipsi/pt2_stoch_routines.irp.f +++ b/src/cipsi/pt2_stoch_routines.irp.f @@ -117,7 +117,6 @@ subroutine ZMQ_pt2(E, pt2_data, pt2_data_err, relative_error, N_in) integer(ZMQ_PTR) :: zmq_to_qp_run_socket, zmq_socket_pull integer, intent(in) :: N_in -! integer, intent(inout) :: N_in double precision, intent(in) :: relative_error, E(N_states) type(pt2_type), intent(inout) :: pt2_data, pt2_data_err ! @@ -288,7 +287,7 @@ subroutine ZMQ_pt2(E, pt2_data, pt2_data_err, relative_error, N_in) call write_int(6,nproc_target,'Number of threads for PT2') call write_double(6,mem,'Memory (Gb)') - call omp_set_max_active_levels(1) + call set_multiple_levels_omp(.False.) print '(A)', '========== ======================= ===================== ===================== ===========' @@ -315,14 +314,14 @@ subroutine ZMQ_pt2(E, pt2_data, pt2_data_err, relative_error, N_in) endif !$OMP END PARALLEL call end_parallel_job(zmq_to_qp_run_socket, zmq_socket_pull, 'pt2') - call omp_set_max_active_levels(8) + call set_multiple_levels_omp(.True.) print '(A)', '========== ======================= ===================== ===================== ===========' - do k=1,N_states - pt2_overlap(pt2_stoch_istate,k) = pt2_data % overlap(k,pt2_stoch_istate) - enddo - SOFT_TOUCH pt2_overlap + do k=1,N_states + pt2_overlap(pt2_stoch_istate,k) = pt2_data % overlap(k,pt2_stoch_istate) + enddo + SOFT_TOUCH pt2_overlap enddo FREE pt2_stoch_istate @@ -576,11 +575,11 @@ subroutine pt2_collector(zmq_socket_pull, E, relative_error, pt2_data, pt2_data_ endif do i=1,n_tasks if(index(i).gt.size(pt2_data_I,1).or.index(i).lt.1)then - print*,'PB !!!' 
- print*,'If you see this, send a bug report with the following content' - print*,irp_here - print*,'i,index(i),size(pt2_data_I,1) = ',i,index(i),size(pt2_data_I,1) - stop -1 + print*,'PB !!!' + print*,'If you see this, send a bug report with the following content' + print*,irp_here + print*,'i,index(i),size(pt2_data_I,1) = ',i,index(i),size(pt2_data_I,1) + stop -1 endif call pt2_add(pt2_data_I(index(i)),1.d0,pt2_data_task(i)) f(index(i)) -= 1 diff --git a/src/cipsi/run_pt2_slave.irp.f b/src/cipsi/run_pt2_slave.irp.f index a72d3dbb..f1001f89 100644 --- a/src/cipsi/run_pt2_slave.irp.f +++ b/src/cipsi/run_pt2_slave.irp.f @@ -31,12 +31,11 @@ subroutine run_pt2_slave(thread,iproc,energy) double precision, intent(in) :: energy(N_states_diag) integer, intent(in) :: thread, iproc - call run_pt2_slave_large(thread,iproc,energy) -! if (N_det > nproc*(elec_alpha_num * (mo_num-elec_alpha_num))**2) then -! call run_pt2_slave_large(thread,iproc,energy) -! else -! call run_pt2_slave_small(thread,iproc,energy) -! endif + if (N_det > 100000 ) then + call run_pt2_slave_large(thread,iproc,energy) + else + call run_pt2_slave_small(thread,iproc,energy) + endif end subroutine run_pt2_slave_small(thread,iproc,energy) @@ -67,7 +66,6 @@ subroutine run_pt2_slave_small(thread,iproc,energy) double precision, external :: memory_of_double, memory_of_int integer :: bsize ! Size of selection buffers -! logical :: sending allocate(task_id(pt2_n_tasks_max), task(pt2_n_tasks_max)) allocate(pt2_data(pt2_n_tasks_max), i_generator(pt2_n_tasks_max), subset(pt2_n_tasks_max)) @@ -85,7 +83,6 @@ subroutine run_pt2_slave_small(thread,iproc,energy) buffer_ready = .False. n_tasks = 1 -! sending = .False. done = .False. 
do while (.not.done) @@ -119,14 +116,13 @@ subroutine run_pt2_slave_small(thread,iproc,energy) do k=1,n_tasks call pt2_alloc(pt2_data(k),N_states) b%cur = 0 -!double precision :: time2 -!call wall_time(time2) + double precision :: time2 + call wall_time(time2) call select_connected(i_generator(k),energy,pt2_data(k),b,subset(k),pt2_F(i_generator(k))) -!call wall_time(time1) -!print *, i_generator(1), time1-time2, n_tasks, pt2_F(i_generator(1)) + call wall_time(time1) +! print *, i_generator(1), time1-time2, n_tasks, pt2_F(i_generator(1)) enddo call wall_time(time1) -!print *, '-->', i_generator(1), time1-time0, n_tasks integer, external :: tasks_done_to_taskserver if (tasks_done_to_taskserver(zmq_to_qp_run_socket,worker_id,task_id,n_tasks) == -1) then @@ -164,6 +160,11 @@ end subroutine subroutine run_pt2_slave_large(thread,iproc,energy) use selection_types use f77_zmq + BEGIN_DOC +! This subroutine can miss important determinants when the PT2 is completely +! computed. It should be called only for large workloads where the PT2 is +! interrupted before the end + END_DOC implicit none double precision, intent(in) :: energy(N_states_diag) @@ -234,30 +235,28 @@ subroutine run_pt2_slave_large(thread,iproc,energy) ASSERT (b%N == bsize) endif - double precision :: time0, time1 - call wall_time(time0) call pt2_alloc(pt2_data,N_states) b%cur = 0 call select_connected(i_generator,energy,pt2_data,b,subset,pt2_F(i_generator)) - call wall_time(time1) integer, external :: tasks_done_to_taskserver if (tasks_done_to_taskserver(zmq_to_qp_run_socket,worker_id,task_id,n_tasks) == -1) then done = .true. endif call sort_selection_buffer(b) - call push_pt2_results_async_recv(zmq_socket_push,b%mini,sending) call omp_set_lock(global_selection_buffer_lock) global_selection_buffer%mini = b%mini call merge_selection_buffers(b,global_selection_buffer) b%cur=0 call omp_unset_lock(global_selection_buffer_lock) - if ( iproc == 1 ) then + if ( iproc == 1 .or. i_generator < 100 .or. 
done) then call omp_set_lock(global_selection_buffer_lock) + call push_pt2_results_async_recv(zmq_socket_push,b%mini,sending) call push_pt2_results_async_send(zmq_socket_push, (/i_generator/), (/pt2_data/), global_selection_buffer, (/task_id/), 1,sending) global_selection_buffer%cur = 0 call omp_unset_lock(global_selection_buffer_lock) else + call push_pt2_results_async_recv(zmq_socket_push,b%mini,sending) call push_pt2_results_async_send(zmq_socket_push, (/i_generator/), (/pt2_data/), b, (/task_id/), 1,sending) endif diff --git a/src/cipsi/selection_buffer.irp.f b/src/cipsi/selection_buffer.irp.f index 10132086..79899139 100644 --- a/src/cipsi/selection_buffer.irp.f +++ b/src/cipsi/selection_buffer.irp.f @@ -60,6 +60,7 @@ subroutine add_to_selection_buffer(b, det, val) b%val(b%cur) = val if(b%cur == size(b%val)) then call sort_selection_buffer(b) + b%cur = b%cur-1 end if end if end subroutine @@ -86,8 +87,8 @@ subroutine merge_selection_buffers(b1, b2) double precision :: rss double precision, external :: memory_of_double sze = max(size(b1%val), size(b2%val)) - rss = memory_of_double(sze) + 2*N_int*memory_of_double(sze) - call check_mem(rss,irp_here) +! rss = memory_of_double(sze) + 2*N_int*memory_of_double(sze) +! call check_mem(rss,irp_here) allocate(val(sze), detmp(N_int, 2, sze)) i1=1 i2=1 @@ -144,8 +145,8 @@ subroutine sort_selection_buffer(b) double precision :: rss double precision, external :: memory_of_double, memory_of_int - rss = memory_of_int(b%cur) + 2*N_int*memory_of_double(size(b%det,3)) - call check_mem(rss,irp_here) +! rss = memory_of_int(b%cur) + 2*N_int*memory_of_double(size(b%det,3)) +! call check_mem(rss,irp_here) allocate(iorder(b%cur), detmp(N_int, 2, size(b%det,3))) do i=1,b%cur iorder(i) = i @@ -225,14 +226,14 @@ subroutine make_selection_buffer_s2(b) endif dup = .True. do k=1,N_int - if ( (tmp_array(k,1,i) /= tmp_array(k,1,j)) & - .or. (tmp_array(k,2,i) /= tmp_array(k,2,j)) ) then + if ( (tmp_array(k,1,i) /= tmp_array(k,1,j)) .or. 
& + (tmp_array(k,2,i) /= tmp_array(k,2,j)) ) then dup = .False. exit endif enddo if (dup) then - val(i) = max(val(i), val(j)) + val(i) = min(val(i), val(j)) duplicate(j) = .True. endif j+=1 @@ -282,9 +283,6 @@ subroutine make_selection_buffer_s2(b) call configuration_to_dets_size(o(1,1,i),sze,elec_alpha_num,N_int) n_d = n_d + sze if (n_d > b%cur) then -! if (n_d - b%cur > b%cur - n_d + sze) then -! n_d = n_d - sze -! endif exit endif enddo @@ -329,10 +327,11 @@ subroutine remove_duplicates_in_selection_buffer(b) integer(bit_kind), allocatable :: tmp_array(:,:,:) logical, allocatable :: duplicate(:) - n_d = b%cur logical :: found_duplicates double precision :: rss double precision, external :: memory_of_double + + n_d = b%cur rss = (4*N_int+4)*memory_of_double(n_d) call check_mem(rss,irp_here) diff --git a/src/cipsi/selection_weight.irp.f b/src/cipsi/selection_weight.irp.f index 3c09e59a..756c65a1 100644 --- a/src/cipsi/selection_weight.irp.f +++ b/src/cipsi/selection_weight.irp.f @@ -38,11 +38,11 @@ subroutine update_pt2_and_variance_weights(pt2_data, N_st) avg = sum(pt2(1:N_st)) / dble(N_st) + 1.d-32 ! Avoid future division by zero - dt = 8.d0 !* selection_factor + dt = 4.d0 !* selection_factor do k=1,N_st - element = exp(dt*(pt2(k)/avg - 1.d0)) - element = min(2.0d0 , element) - element = max(0.5d0 , element) + element = pt2(k) !exp(dt*(pt2(k)/avg - 1.d0)) +! element = min(2.0d0 , element) +! element = max(0.5d0 , element) pt2_match_weight(k) *= element enddo @@ -50,9 +50,9 @@ subroutine update_pt2_and_variance_weights(pt2_data, N_st) avg = sum(variance(1:N_st)) / dble(N_st) + 1.d-32 ! Avoid future division by zero do k=1,N_st - element = exp(dt*(variance(k)/avg -1.d0)) - element = min(2.0d0 , element) - element = max(0.5d0 , element) + element = variance(k) ! exp(dt*(variance(k)/avg -1.d0)) +! element = min(2.0d0 , element) +! 
element = max(0.5d0 , element) variance_match_weight(k) *= element enddo @@ -62,6 +62,9 @@ subroutine update_pt2_and_variance_weights(pt2_data, N_st) variance_match_weight(:) = 1.d0 endif + pt2_match_weight(:) = pt2_match_weight(:)/sum(pt2_match_weight(:)) + variance_match_weight(:) = variance_match_weight(:)/sum(variance_match_weight(:)) + threshold_davidson_pt2 = min(1.d-6, & max(threshold_davidson, 1.e-1 * PT2_relative_error * minval(abs(pt2(1:N_states)))) ) @@ -87,7 +90,7 @@ BEGIN_PROVIDER [ double precision, selection_weight, (N_states) ] selection_weight(1:N_states) = c0_weight(1:N_states) case (2) - print *, 'Using pt2-matching weight in selection' + print *, 'Using PT2-matching weight in selection' selection_weight(1:N_states) = c0_weight(1:N_states) * pt2_match_weight(1:N_states) print *, '# PT2 weight ', real(pt2_match_weight(:),4) @@ -97,7 +100,7 @@ BEGIN_PROVIDER [ double precision, selection_weight, (N_states) ] print *, '# var weight ', real(variance_match_weight(:),4) case (4) - print *, 'Using variance- and pt2-matching weights in selection' + print *, 'Using variance- and PT2-matching weights in selection' selection_weight(1:N_states) = c0_weight(1:N_states) * sqrt(variance_match_weight(1:N_states) * pt2_match_weight(1:N_states)) print *, '# PT2 weight ', real(pt2_match_weight(:),4) print *, '# var weight ', real(variance_match_weight(:),4) @@ -112,7 +115,7 @@ BEGIN_PROVIDER [ double precision, selection_weight, (N_states) ] selection_weight(1:N_states) = c0_weight(1:N_states) case (7) - print *, 'Input weights multiplied by variance- and pt2-matching' + print *, 'Input weights multiplied by variance- and PT2-matching' selection_weight(1:N_states) = c0_weight(1:N_states) * sqrt(variance_match_weight(1:N_states) * pt2_match_weight(1:N_states)) * state_average_weight(1:N_states) print *, '# PT2 weight ', real(pt2_match_weight(:),4) print *, '# var weight ', real(variance_match_weight(:),4) @@ -128,6 +131,7 @@ BEGIN_PROVIDER [ double precision, 
selection_weight, (N_states) ] print *, '# var weight ', real(variance_match_weight(:),4) end select + selection_weight(:) = selection_weight(:)/sum(selection_weight(:)) print *, '# Total weight ', real(selection_weight(:),4) END_PROVIDER diff --git a/src/cipsi/slave_cipsi.irp.f b/src/cipsi/slave_cipsi.irp.f index 510c667b..f96aaa6a 100644 --- a/src/cipsi/slave_cipsi.irp.f +++ b/src/cipsi/slave_cipsi.irp.f @@ -4,7 +4,7 @@ subroutine run_slave_cipsi ! Helper program for distributed parallelism END_DOC - call omp_set_max_active_levels(1) + call set_multiple_levels_omp(.False.) distributed_davidson = .False. read_wf = .False. SOFT_TOUCH read_wf distributed_davidson @@ -171,9 +171,9 @@ subroutine run_slave_main call write_double(6,(t1-t0),'Broadcast time') !--- - call omp_set_max_active_levels(8) + call set_multiple_levels_omp(.True.) call davidson_slave_tcp(0) - call omp_set_max_active_levels(1) + call set_multiple_levels_omp(.False.) print *, mpi_rank, ': Davidson done' !--- @@ -311,7 +311,7 @@ subroutine run_slave_main if (mpi_master) then print *, 'Running PT2' endif - !$OMP PARALLEL PRIVATE(i) NUM_THREADS(nproc_target+1) + !$OMP PARALLEL PRIVATE(i) NUM_THREADS(nproc_target) i = omp_get_thread_num() call run_pt2_slave(0,i,pt2_e0_denominator) !$OMP END PARALLEL diff --git a/src/cis/EZFIO.cfg b/src/cis/EZFIO.cfg index 7e0eeb03..955d1bef 100644 --- a/src/cis/EZFIO.cfg +++ b/src/cis/EZFIO.cfg @@ -5,4 +5,3 @@ interface: ezfio size: (determinants.n_states) - diff --git a/src/cis/cis.irp.f b/src/cis/cis.irp.f index c4100105..ab2294ad 100644 --- a/src/cis/cis.irp.f +++ b/src/cis/cis.irp.f @@ -79,6 +79,6 @@ subroutine run call ezfio_set_cis_energy(CI_energy) psi_coef = ci_eigenvectors SOFT_TOUCH psi_coef - call save_wavefunction_truncated(threshold_save_wf) + call save_wavefunction_truncated(save_threshold) end diff --git a/src/cisd/cisd.irp.f b/src/cisd/cisd.irp.f index 6c55e2ff..fca3b10e 100644 --- a/src/cisd/cisd.irp.f +++ b/src/cisd/cisd.irp.f @@ -63,7 +63,7 @@ 
subroutine run endif psi_coef = ci_eigenvectors SOFT_TOUCH psi_coef - call save_wavefunction + call save_wavefunction_truncated(save_threshold) call ezfio_set_cisd_energy(CI_energy) do i = 1,N_states diff --git a/src/csf/conversion.irp.f b/src/csf/conversion.irp.f index fecc6123..c8bc9199 100644 --- a/src/csf/conversion.irp.f +++ b/src/csf/conversion.irp.f @@ -1,3 +1,15 @@ +BEGIN_PROVIDER [ double precision, psi_csf_coef, (N_csf, N_states) ] + implicit none + BEGIN_DOC + ! Wave function in CSF basis + END_DOC + + double precision, allocatable :: buffer(:,:) + allocate ( buffer(N_det, N_states) ) + buffer(1:N_det, 1:N_states) = psi_coef(1:N_det, 1:N_states) + call convertWFfromDETtoCSF(N_states, buffer, psi_csf_coef) +END_PROVIDER + subroutine convertWFfromDETtoCSF(N_st,psi_coef_det_in, psi_coef_cfg_out) use cfunctions use bitmasks diff --git a/src/csf/sigma_vector.irp.f b/src/csf/sigma_vector.irp.f index 85ed5f84..0d24ae57 100644 --- a/src/csf/sigma_vector.irp.f +++ b/src/csf/sigma_vector.irp.f @@ -1,9 +1,12 @@ real*8 function logabsgamma(x) implicit none real*8, intent(in) :: x - logabsgamma = log(abs(gamma(x))) + logabsgamma = 1.d32 ! Avoid floating point exception + if (x>0.d0) then + logabsgamma = log(abs(gamma(x))) + endif end function logabsgamma - + BEGIN_PROVIDER [ integer, NSOMOMax] &BEGIN_PROVIDER [ integer, NCSFMax] &BEGIN_PROVIDER [ integer*8, NMO] @@ -56,24 +59,30 @@ endif endif ncfg = ncfgpersomo - ncfgprev - if(iand(MS,1) .EQ. 0) then - !dimcsfpercfg = max(1,nint((binom(i,i/2)-binom(i,i/2+1)))) - binom1 = dexp(logabsgamma(1.0d0*(i+1)) & - - logabsgamma(1.0d0*((i/2)+1)) & - - logabsgamma(1.0d0*(i-((i/2))+1))); - binom2 = dexp(logabsgamma(1.0d0*(i+1)) & - - logabsgamma(1.0d0*(((i/2)+1)+1)) & - - logabsgamma(1.0d0*(i-((i/2)+1)+1))); - dimcsfpercfg = max(1,nint(binom1 - binom2)) + if(i .EQ. 0 .OR. i .EQ. 1) then + dimcsfpercfg = 1 + elseif( i .EQ. 
3) then + dimcsfpercfg = 2 else - !dimcsfpercfg = max(1,nint((binom(i,(i+1)/2)-binom(i,(i+3)/2)))) - binom1 = dexp(logabsgamma(1.0d0*(i+1)) & - - logabsgamma(1.0d0*(((i+1)/2)+1)) & - - logabsgamma(1.0d0*(i-(((i+1)/2))+1))); - binom2 = dexp(logabsgamma(1.0d0*(i+1)) & - - logabsgamma(1.0d0*((((i+3)/2)+1)+1)) & - - logabsgamma(1.0d0*(i-(((i+3)/2)+1)+1))); - dimcsfpercfg = max(1,nint(binom1 - binom2)) + if(iand(MS,1) .EQ. 0) then + !dimcsfpercfg = max(1,nint((binom(i,i/2)-binom(i,i/2+1)))) + binom1 = dexp(logabsgamma(1.0d0*(i+1)) & + - logabsgamma(1.0d0*((i/2)+1)) & + - logabsgamma(1.0d0*(i-((i/2))+1))); + binom2 = dexp(logabsgamma(1.0d0*(i+1)) & + - logabsgamma(1.0d0*(((i/2)+1)+1)) & + - logabsgamma(1.0d0*(i-((i/2)+1)+1))); + dimcsfpercfg = max(1,nint(binom1 - binom2)) + else + !dimcsfpercfg = max(1,nint((binom(i,(i+1)/2)-binom(i,(i+3)/2)))) + binom1 = dexp(logabsgamma(1.0d0*(i+1)) & + - logabsgamma(1.0d0*(((i+1)/2)+1)) & + - logabsgamma(1.0d0*(i-(((i+1)/2))+1))); + binom2 = dexp(logabsgamma(1.0d0*(i+1)) & + - logabsgamma(1.0d0*((((i+3)/2)+1)+1)) & + - logabsgamma(1.0d0*(i-(((i+3)/2)+1)+1))); + dimcsfpercfg = max(1,nint(binom1 - binom2)) + endif endif n_CSF += ncfg * dimcsfpercfg if(cfg_seniority_index(i+2) > ncfgprev) then diff --git a/src/davidson/davidson_parallel.irp.f b/src/davidson/davidson_parallel.irp.f index 8fd023da..e627dfc9 100644 --- a/src/davidson/davidson_parallel.irp.f +++ b/src/davidson/davidson_parallel.irp.f @@ -508,7 +508,7 @@ subroutine H_S2_u_0_nstates_zmq(v_0,s_0,u_0,N_st,sze) endif - call omp_set_max_active_levels(5) + call set_multiple_levels_omp(.True.) 
!$OMP PARALLEL DEFAULT(shared) NUM_THREADS(2) PRIVATE(ithread) ithread = omp_get_thread_num() diff --git a/src/davidson/davidson_parallel_csf.irp.f b/src/davidson/davidson_parallel_csf.irp.f index fe651b1d..d8e9bffa 100644 --- a/src/davidson/davidson_parallel_csf.irp.f +++ b/src/davidson/davidson_parallel_csf.irp.f @@ -464,7 +464,8 @@ subroutine H_u_0_nstates_zmq(v_0,u_0,N_st,sze) print *, irp_here, ': Failed in zmq_set_running' endif - call omp_set_max_active_levels(4) + call set_multiple_levels_omp(.True.) + !$OMP PARALLEL DEFAULT(shared) NUM_THREADS(2) PRIVATE(ithread) ithread = omp_get_thread_num() if (ithread == 0 ) then diff --git a/src/davidson/davidson_parallel_nos2.irp.f b/src/davidson/davidson_parallel_nos2.irp.f index 84cbe3af..597b001f 100644 --- a/src/davidson/davidson_parallel_nos2.irp.f +++ b/src/davidson/davidson_parallel_nos2.irp.f @@ -464,7 +464,8 @@ subroutine H_u_0_nstates_zmq(v_0,u_0,N_st,sze) print *, irp_here, ': Failed in zmq_set_running' endif - call omp_set_max_active_levels(4) + call set_multiple_levels_omp(.True.) + !$OMP PARALLEL DEFAULT(shared) NUM_THREADS(2) PRIVATE(ithread) ithread = omp_get_thread_num() if (ithread == 0 ) then diff --git a/src/determinants/EZFIO.cfg b/src/determinants/EZFIO.cfg index 1e85693b..9eefa66c 100644 --- a/src/determinants/EZFIO.cfg +++ b/src/determinants/EZFIO.cfg @@ -42,13 +42,13 @@ default: 2 [weight_selection] type: integer -doc: Weight used in the selection. 0: input state-average weight, 1: 1./(c_0^2), 2: rPT2 matching, 3: variance matching, 4: variance and rPT2 matching, 5: variance minimization and matching, 6: CI coefficients 7: input state-average multiplied by variance and rPT2 matching 8: input state-average multiplied by rPT2 matching 9: input state-average multiplied by variance matching +doc: Weight used in the selection. 
0: input state-average weight, 1: 1./(c_0^2), 2: PT2 matching, 3: variance matching, 4: variance and PT2 matching, 5: variance minimization and matching, 6: CI coefficients 7: input state-average multiplied by variance and PT2 matching 8: input state-average multiplied by PT2 matching 9: input state-average multiplied by variance matching interface: ezfio,provider,ocaml default: 1 [threshold_generators] type: Threshold -doc: Thresholds on generators (fraction of the square of the norm) +doc: Thresholds on generators (fraction of the square of the norm) interface: ezfio,provider,ocaml default: 0.999 @@ -80,7 +80,7 @@ type: integer [psi_coef] interface: ezfio doc: Coefficients of the wave function -type: double precision +type: double precision size: (determinants.n_det,determinants.n_states) [psi_det] @@ -92,7 +92,7 @@ size: (determinants.n_int*determinants.bit_kind/8,2,determinants.n_det) [psi_coef_qp_edit] interface: ezfio doc: Coefficients of the wave function -type: double precision +type: double precision size: (determinants.n_det_qp_edit,determinants.n_states) [psi_det_qp_edit] @@ -126,19 +126,18 @@ default: 1. [thresh_sym] type: Threshold -doc: Thresholds to check if a determinant is connected with HF +doc: Thresholds to check if a determinant is connected with HF interface: ezfio,provider,ocaml default: 1.e-15 [pseudo_sym] type: logical -doc: If |true|, discard any Slater determinants with an interaction smaller than thresh_sym with HF. +doc: If |true|, discard any Slater determinants with an interaction smaller than thresh_sym with HF. 
interface: ezfio,provider,ocaml default: False - -[threshold_save_wf] +[save_threshold] type: Threshold -doc: Threshold on the coefficients of the wave function when saving it into the ezfio +doc: Cut-off to apply to the CI coefficients when the wave function is stored interface: ezfio,provider,ocaml default: 1.e-14 diff --git a/src/determinants/density_matrix.irp.f b/src/determinants/density_matrix.irp.f index fa4b3328..1a1d92b5 100644 --- a/src/determinants/density_matrix.irp.f +++ b/src/determinants/density_matrix.irp.f @@ -368,12 +368,12 @@ BEGIN_PROVIDER [ double precision, c0_weight, (N_states) ] c = maxval(psi_coef(:,i) * psi_coef(:,i)) c0_weight(i) = 1.d0/(c+1.d-20) enddo - c = 1.d0/minval(c0_weight(:)) + c = 1.d0/sum(c0_weight(:)) do i=1,N_states c0_weight(i) = c0_weight(i) * c enddo else - c0_weight = 1.d0 + c0_weight(:) = 1.d0 endif END_PROVIDER @@ -390,7 +390,7 @@ BEGIN_PROVIDER [ double precision, state_average_weight, (N_states) ] if (weight_one_e_dm == 0) then state_average_weight(:) = c0_weight(:) else if (weight_one_e_dm == 1) then - state_average_weight(:) = 1./N_states + state_average_weight(:) = 1.d0/N_states else call ezfio_has_determinants_state_average_weight(exists) if (exists) then diff --git a/src/determinants/determinants.irp.f b/src/determinants/determinants.irp.f index 2a6057de..b8c8658f 100644 --- a/src/determinants/determinants.irp.f +++ b/src/determinants/determinants.irp.f @@ -84,7 +84,6 @@ BEGIN_PROVIDER [ integer, psi_det_size ] else psi_det_size = 1 endif - psi_det_size = max(psi_det_size,100000) call write_int(6,psi_det_size,'Dimension of the psi arrays') endif IRP_IF MPI_DEBUG diff --git a/src/determinants/h_apply.irp.f b/src/determinants/h_apply.irp.f index 98fafb4a..d01ad1c7 100644 --- a/src/determinants/h_apply.irp.f +++ b/src/determinants/h_apply.irp.f @@ -322,10 +322,7 @@ subroutine fill_H_apply_buffer_no_selection(n_selected,det_buffer,Nint,iproc) ASSERT (sum(popcnt(H_apply_buffer(iproc)%det(:,2,i))) == elec_beta_num) 
enddo do i=1,n_selected - do j=1,N_int - H_apply_buffer(iproc)%det(j,1,i+H_apply_buffer(iproc)%N_det) = det_buffer(j,1,i) - H_apply_buffer(iproc)%det(j,2,i+H_apply_buffer(iproc)%N_det) = det_buffer(j,2,i) - enddo + H_apply_buffer(iproc)%det(:,:,i+H_apply_buffer(iproc)%N_det) = det_buffer(:,:,i) ASSERT (sum(popcnt(H_apply_buffer(iproc)%det(:,1,i+H_apply_buffer(iproc)%N_det)) )== elec_alpha_num) ASSERT (sum(popcnt(H_apply_buffer(iproc)%det(:,2,i+H_apply_buffer(iproc)%N_det))) == elec_beta_num) enddo diff --git a/src/dressing/run_dress_slave.irp.f b/src/dressing/run_dress_slave.irp.f index a33fb1dd..08b654c9 100644 --- a/src/dressing/run_dress_slave.irp.f +++ b/src/dressing/run_dress_slave.irp.f @@ -72,7 +72,7 @@ subroutine run_dress_slave(thread,iproce,energy) provide psi_energy ending = dress_N_cp+1 ntask_tbd = 0 - call omp_set_max_active_levels(8) + call set_multiple_levels_omp(.True.) !$OMP PARALLEL DEFAULT(SHARED) & !$OMP PRIVATE(interesting, breve_delta_m, task_id) & @@ -84,7 +84,7 @@ subroutine run_dress_slave(thread,iproce,energy) zmq_socket_push = new_zmq_push_socket(thread) integer, external :: connect_to_taskserver !$OMP CRITICAL - call omp_set_max_active_levels(1) + call set_multiple_levels_omp(.False.) if (connect_to_taskserver(zmq_to_qp_run_socket,worker_id,thread) == -1) then print *, irp_here, ': Unable to connect to task server' stop -1 @@ -296,7 +296,7 @@ subroutine run_dress_slave(thread,iproce,energy) !$OMP END CRITICAL !$OMP END PARALLEL - call omp_set_max_active_levels(1) + call set_multiple_levels_omp(.False.) ! do i=0,dress_N_cp+1 ! call omp_destroy_lock(lck_sto(i)) ! end do diff --git a/src/tools/truncate_wf.irp.f b/src/tools/truncate_wf.irp.f new file mode 100644 index 00000000..6c66c8ec --- /dev/null +++ b/src/tools/truncate_wf.irp.f @@ -0,0 +1,98 @@ +program truncate_wf + implicit none + BEGIN_DOC +! Truncate the wave function + END_DOC + read_wf = .True. 
+  if (s2_eig) then
+    call routine_s2
+  else
+    call routine
+  endif
+end
+
+subroutine routine
+  implicit none
+  integer :: ndet_max
+  print*, 'Max number of determinants ?'
+  read(5,*) ndet_max
+  integer(bit_kind), allocatable :: psi_det_tmp(:,:,:)
+  double precision, allocatable :: psi_coef_tmp(:,:)
+  allocate(psi_det_tmp(N_int,2,ndet_max),psi_coef_tmp(ndet_max, N_states))
+
+  integer :: i,j
+  double precision :: accu(N_states)
+  accu = 0.d0
+  do i = 1, ndet_max
+    do j = 1, N_int
+      psi_det_tmp(j,1,i) = psi_det_sorted(j,1,i)
+      psi_det_tmp(j,2,i) = psi_det_sorted(j,2,i)
+    enddo
+    do j = 1, N_states
+      psi_coef_tmp(i,j) = psi_coef_sorted(i,j)
+      accu(j) += psi_coef_tmp(i,j) **2
+    enddo
+  enddo
+  do j = 1, N_states
+    accu(j) = 1.d0/dsqrt(accu(j))
+  enddo
+  do j = 1, N_states
+    do i = 1, ndet_max
+      psi_coef_tmp(i,j) = psi_coef_tmp(i,j) * accu(j)
+    enddo
+  enddo
+
+  call save_wavefunction_general(ndet_max,N_states,psi_det_tmp,size(psi_coef_tmp,1),psi_coef_tmp)
+
+end
+
+subroutine routine_s2
+  implicit none
+  integer :: ndet_max
+  double precision :: wmin
+  integer(bit_kind), allocatable :: psi_det_tmp(:,:,:)
+  double precision, allocatable :: psi_coef_tmp(:,:)
+  integer :: i,j,k
+  double precision :: accu(N_states)
+
+  print *, 'Weights of the CFG'
+  do i=1,N_det
+    print *, i, real(weight_configuration(det_to_configuration(i),:)), real(sum(weight_configuration(det_to_configuration(i),:)))
+  enddo
+  print*, 'Min weight of the configuration?'
+  read(5,*) wmin
+
+  ndet_max = 0
+  do i=1,N_det
+    if (maxval(weight_configuration( det_to_configuration(i),:)) < wmin) cycle
+    ndet_max = ndet_max+1
+  enddo
+
+  allocate(psi_det_tmp(N_int,2,ndet_max),psi_coef_tmp(ndet_max, N_states))
+
+  accu = 0.d0
+  k=0
+  do i = 1, N_det
+    if (maxval(weight_configuration( det_to_configuration(i),:)) < wmin) cycle
+    k = k+1
+    do j = 1, N_int
+      psi_det_tmp(j,1,k) = psi_det(j,1,i)
+      psi_det_tmp(j,2,k) = psi_det(j,2,i)
+    enddo
+    do j = 1, N_states
+      psi_coef_tmp(k,j) = psi_coef(i,j)
+      accu(j) += psi_coef_tmp(k,j)**2
+    enddo
+  enddo
+  do j = 1, N_states
+    accu(j) = 1.d0/dsqrt(accu(j))
+  enddo
+  do j = 1, N_states
+    do i = 1, ndet_max
+      psi_coef_tmp(i,j) = psi_coef_tmp(i,j) * accu(j)
+    enddo
+  enddo
+
+  call save_wavefunction_general(ndet_max,N_states,psi_det_tmp,size(psi_coef_tmp,1),psi_coef_tmp)
+
+end
diff --git a/src/two_rdm_routines/davidson_like_state_av_2rdm.irp.f b/src/two_rdm_routines/davidson_like_state_av_2rdm.irp.f
index eb247dea..26ed5ae6 100644
--- a/src/two_rdm_routines/davidson_like_state_av_2rdm.irp.f
+++ b/src/two_rdm_routines/davidson_like_state_av_2rdm.irp.f
@@ -529,10 +529,14 @@ subroutine orb_range_2_rdm_state_av_openmp_work_$N_int(big_array,dim1,norb,list_
       c_average += c_1(l) * c_1(l) * state_weights(l)
     enddo
 
-    call update_keys_values(keys,values,nkeys,dim1,big_array,lock_2rdm)
+    if (nkeys > 0) then
+      call update_keys_values(keys,values,nkeys,dim1,big_array,lock_2rdm)
+    endif
     nkeys = 0
     call orb_range_diag_to_all_2_rdm_dm_buffer(tmp_det,c_average,orb_bitmask,list_orb_reverse,ispin,sze_buff,nkeys,keys,values)
-    call update_keys_values(keys,values,nkeys,dim1,big_array,lock_2rdm)
+    if (nkeys > 0) then
+      call update_keys_values(keys,values,nkeys,dim1,big_array,lock_2rdm)
+    endif
     nkeys = 0
 
   end do
diff --git a/src/utils/set_multiple_levels_omp.irp.f b/src/utils/set_multiple_levels_omp.irp.f
new file mode 100644
index 00000000..b4764e4a
--- /dev/null
+++ b/src/utils/set_multiple_levels_omp.irp.f
@@ -0,0 +1,26 @@
+subroutine set_multiple_levels_omp(activate)
+
+  BEGIN_DOC
+! If true, activate OpenMP nested parallelism. If false, deactivate.
+  END_DOC
+
+  implicit none
+  logical, intent(in) :: activate
+
+  if (activate) then
+    call omp_set_max_active_levels(5)
+
+    IRP_IF SET_NESTED
+      call omp_set_nested(.True.)
+    IRP_ENDIF
+
+  else
+
+    call omp_set_max_active_levels(1)
+
+    IRP_IF SET_NESTED
+      call omp_set_nested(.False.)
+    IRP_ENDIF
+  end if
+
+end
diff --git a/src/utils/util.irp.f b/src/utils/util.irp.f
index cfb42fd1..ef846bdb 100644
--- a/src/utils/util.irp.f
+++ b/src/utils/util.irp.f
@@ -300,12 +300,12 @@ subroutine wall_time(t)
 end
 
 BEGIN_PROVIDER [ integer, nproc ]
+  use omp_lib
   implicit none
   BEGIN_DOC
 ! Number of current OpenMP threads
   END_DOC
 
-  integer :: omp_get_num_threads
   nproc = 1
   !$OMP PARALLEL
   !$OMP MASTER