10
0
mirror of https://github.com/QuantumPackage/qp2.git synced 2024-12-23 04:43:45 +01:00

Merge pull request #199 from QuantumPackage/dev

Dev
This commit is contained in:
Anthony Scemama 2022-04-13 19:54:30 +02:00 committed by GitHub
commit ac0e6bc3ca
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
110 changed files with 3577 additions and 1142 deletions

View File

@ -2,9 +2,9 @@
Installation Installation
============ ============
The |qp| can be downloaded on GitHub as an `archive |qp| can be downloaded on GitHub as an `archive
<https://github.com/LCPQ/quantum_package/releases/latest>`_ or as a `git <https://github.com/QuantumPackage/qp2/releases>`_ or as a `git
repository <https://github.com/LCPQ/quantum_package>`_. repository <https://github.com/QuantumPackage/qp2>`_.
.. code:: bash .. code:: bash
@ -19,16 +19,16 @@ Before anything, go into your :file:`quantum_package` directory and run
This script will create the :file:`quantum_package.rc` bash script, which This script will create the :file:`quantum_package.rc` bash script, which
sets all the environment variables required for the normal operation of the sets all the environment variables required for the normal operation of
*Quantum Package*. It will also initialize the git submodules that are |qp|. It will also initialize the git submodules that are
required, and tell you which external dependencies are missing and need to be required, and tell you which external dependencies are missing and need to be
installed. The required dependencies are located in the installed. The required dependencies are located in the
`external/qp2-dependencies` directory, such that once QP is configured the `external/qp2-dependencies` directory, such that once |qp| is configured the
internet connection is not needed any more. internet connection is not needed any more.
When all dependencies have been installed, (the :command:`configure` will When all dependencies have been installed, (the :command:`configure` will
inform you) source the :file:`quantum_package.rc` in order to load all inform you what is missing) source the :file:`quantum_package.rc` in order to
environment variables and compile the |QP|. load all environment variables and compile |QP|.
Now all the requirements are met, you can compile the programs using Now all the requirements are met, you can compile the programs using
@ -37,6 +37,15 @@ Now all the requirements are met, you can compile the programs using
make make
Installation of dependencies via a Conda environment
====================================================
.. code:: bash
conda env create -f qp2.yml
Requirements Requirements
============ ============
@ -64,8 +73,8 @@ architecture. Modify it if needed, and run :command:`configure` with
.. code:: bash .. code:: bash
cp ./config/gfortran.example config/gfortran.cfg cp ./config/gfortran.example config/gfortran_avx.cfg
./configure -c config/gfortran.cfg ./configure -c config/gfortran_avx.cfg
.. note:: .. note::
@ -86,45 +95,33 @@ The command is to be used as follows:
.. code:: bash .. code:: bash
./configure --install=<package> ./configure -i <package>
The following packages are supported by the :command:`configure` installer: The following packages are supported by the :command:`configure` installer:
* ninja * ninja
* irpf90
* zeromq * zeromq
* f77zmq * f77zmq
* gmp * gmp
* ocaml (:math:`\approx` 5 minutes) * ocaml (:math:`\approx` 5 minutes)
* ezfio
* docopt * docopt
* resultsFile * resultsFile
* bats * bats
* zlib
Example: Example:
.. code:: bash .. code:: bash
./configure -i ezfio ./configure -i ninja
.. note::
When installing the ocaml package, you will be asked the location of where
it should be installed. A safe option is to enter the path proposed by the
|QP|:
QP>> Please install it here: /your_quantum_package_directory/bin
So just enter the proposition of the |QP| and press enter.
If the :command:`configure` executable fails to install a specific dependency If the :command:`configure` executable fails to install a specific dependency
----------------------------------------------------------------------------- -----------------------------------------------------------------------------
If the :command:`configure` executable does not succeed to install a specific If the :command:`configure` executable does not succeed in installing a specific
dependency, there are some proposition of how to download and install the dependency, you should try to install the dependency on your system by yourself.
minimal dependencies to compile and use the |QP|.
Before doing anything below, try to install the packages with your package manager Before doing anything below, try to install the packages with your package manager
(:command:`apt`, :command:`yum`, etc). (:command:`apt`, :command:`yum`, etc).
@ -153,7 +150,7 @@ If you have *pip* for Python2, you can do
.. code:: bash .. code:: bash
python2 -m pip install --user irpf90 python3 -m pip install --user irpf90
Otherwise, Otherwise,
@ -209,7 +206,7 @@ ZeroMQ and its Fortran binding
.. code:: bash .. code:: bash
cp f77_zmq_free.h ${QP_ROOT}/src/ZMQ/f77_zmq.h cp f77_zmq_free.h ${QP_ROOT}/src/zmq/f77_zmq.h
Zlib Zlib
@ -262,53 +259,6 @@ With Debian or Ubuntu, you can use
sudo apt install libgmp-dev sudo apt install libgmp-dev
libcap
------
Libcap is a library for getting and setting POSIX.1e draft 15 capabilities.
* Download the latest version of libcap here:
`<https://git.kernel.org/pub/scm/linux/kernel/git/morgan/libcap.git/snapshot/libcap-2.25.tar.gz>`_
and move it in the :file:`${QP_ROOT}/external` directory
* Extract the archive, go into the :file:`libcap-*/libcap` directory and run
the following command
.. code:: bash
prefix=$QP_ROOT make install
With Debian or Ubuntu, you can use
.. code:: bash
sudo apt install libcap-dev
Bubblewrap
----------
Bubblewrap is an unprivileged sandboxing tool.
* Download Bubblewrap here:
`<https://github.com/projectatomic/bubblewrap/releases/download/v0.3.3/bubblewrap-0.3.3.tar.xz>`_
and move it in the :file:`${QP_ROOT}/external` directory
* Extract the archive, go into the :file:`bubblewrap-*` directory and run
the following commands
.. code:: bash
./configure --prefix=$QP_ROOT && make -j 8
make install-exec-am
With Debian or Ubuntu, you can use
.. code:: bash
sudo apt install bubblewrap
OCaml OCaml
@ -327,7 +277,7 @@ OCaml
`<https://raw.githubusercontent.com/ocaml/opam/master/shell/install.sh>`_ `<https://raw.githubusercontent.com/ocaml/opam/master/shell/install.sh>`_
and move it in the :file:`${QP_ROOT}/external` directory and move it in the :file:`${QP_ROOT}/external` directory
* If you use OCaml only with the |qp|, you can install the OPAM directory * If you use OCaml only with |qp|, you can install the OPAM directory
containing the compiler and all the installed libraries in the containing the compiler and all the installed libraries in the
:file:`${QP_ROOT}/external` directory as :file:`${QP_ROOT}/external` directory as
@ -352,14 +302,14 @@ OCaml
.. code:: bash .. code:: bash
opam init --comp=4.07.1 opam init --comp=4.11.1
eval `${QP_ROOT}/bin/opam env` eval `${QP_ROOT}/bin/opam env`
If the installation fails because of bwrap, you can initialize opam using: If the installation fails because of bwrap, you can initialize opam using:
.. code:: bash .. code:: bash
opam init --disable-sandboxing --comp=4.07.1 opam init --disable-sandboxing --comp=4.11.1
eval `${QP_ROOT}/bin/opam env` eval `${QP_ROOT}/bin/opam env`
* Install the required external OCaml libraries * Install the required external OCaml libraries
@ -369,17 +319,6 @@ OCaml
opam install ocamlbuild cryptokit zmq sexplib ppx_sexp_conv ppx_deriving getopt opam install ocamlbuild cryptokit zmq sexplib ppx_sexp_conv ppx_deriving getopt
EZFIO
-----
*EZFIO* is the Easy Fortran Input/Output library generator.
* Download EZFIO here : `<https://gitlab.com/scemama/EZFIO/-/archive/master/EZFIO-master.tar.gz>`_ and move
the downloaded archive in the :file:`${QP_ROOT}/external` directory
* Extract the archive, and rename it as :file:`${QP_ROOT}/external/ezfio`
Docopt Docopt
------ ------
@ -414,3 +353,4 @@ If you have *pip* for Python3, you can do

View File

@ -30,7 +30,8 @@
- Fixed bug in DIIS - Fixed bug in DIIS
- Fixed bug in molden (Au -> Angs) - Fixed bug in molden (Au -> Angs)
- Fixed bug with non-contiguous MOs in active space and deleter MOs - Fixed bug with non-contiguous MOs in active space and deleter MOs
- Network-free installation - Complete network-free installation
- Fixed bug in selection when computing full PT2
- Updated version of f77-zmq - Updated version of f77-zmq
*** User interface *** User interface
@ -59,6 +60,7 @@
symmetry in matrices symmetry in matrices
- qp_export_as_tgz exports also plugin codes - qp_export_as_tgz exports also plugin codes
- Added a basis module containing basis set information - Added a basis module containing basis set information
- Added qp_run truncate_wf
*** Code *** Code

View File

@ -195,48 +195,52 @@ def write_ezfio(res, filename):
# P a r s i n g # # P a r s i n g #
# ~#~#~#~#~#~#~ # # ~#~#~#~#~#~#~ #
inucl = {}
for i, a in enumerate(res.geometry):
inucl[a.coord] = i
nbasis = 0 nbasis = 0
nucl_center = [] nucl_index = []
curr_center = -1 curr_center = -1
nucl_shell_num = [] nucl_shell_num = []
ang_mom = [] ang_mom = []
nshell = 0 nshell = 0
shell_prim_index = [1] nshell_tot = 0
shell_index = []
shell_prim_num = [] shell_prim_num = []
for b in res.basis: for b in res.basis:
s = b.sym s = b.sym
if str.count(s, "y") + str.count(s, "x") == 0: if str.count(s, "y") + str.count(s, "x") == 0:
c = b.center c = inucl[b.center]
nshell += 1 nshell += 1
nshell_tot += 1
if c != curr_center: if c != curr_center:
curr_center = c curr_center = c
nucl_center.append(nbasis+1)
nucl_shell_num.append(nshell) nucl_shell_num.append(nshell)
nshell = 0 nshell = 0
nbasis += 1 nbasis += 1
nucl_index.append(c+1)
coefficient += b.coef[:len(b.prim)] coefficient += b.coef[:len(b.prim)]
exponent += [p.expo for p in b.prim] exponent += [p.expo for p in b.prim]
ang_mom.append(str.count(s, "z")) ang_mom.append(str.count(s, "z"))
shell_prim_index.append(len(exponent)+1)
shell_prim_num.append(len(b.prim)) shell_prim_num.append(len(b.prim))
shell_index += [nshell_tot+1] * len(b.prim)
nucl_shell_num.append(nshell+1)
nucl_shell_num = nucl_shell_num[1:]
# ~#~#~#~#~ # # ~#~#~#~#~ #
# W r i t e # # W r i t e #
# ~#~#~#~#~ # # ~#~#~#~#~ #
ezfio.set_basis_basis("Read from ResultsFile") ezfio.set_basis_basis("Read from ResultsFile")
ezfio.set_basis_basis_nucleus_index(nucl_center)
ezfio.set_basis_prim_num(len(coefficient))
ezfio.set_basis_shell_num(len(ang_mom)) ezfio.set_basis_shell_num(len(ang_mom))
ezfio.set_basis_basis_nucleus_index(nucl_index)
ezfio.set_basis_prim_num(len(coefficient))
ezfio.set_basis_nucleus_shell_num(nucl_shell_num) ezfio.set_basis_nucleus_shell_num(nucl_shell_num)
ezfio.set_basis_prim_coef(coefficient) ezfio.set_basis_prim_coef(coefficient)
ezfio.set_basis_prim_expo(exponent) ezfio.set_basis_prim_expo(exponent)
ezfio.set_basis_shell_ang_mom(ang_mom) ezfio.set_basis_shell_ang_mom(ang_mom)
ezfio.set_basis_shell_prim_num(shell_prim_num) ezfio.set_basis_shell_prim_num(shell_prim_num)
ezfio.set_basis_shell_prim_index(shell_prim_index) ezfio.set_basis_shell_index(shell_index)
print("OK") print("OK")

View File

@ -6,6 +6,7 @@ Usage:
qp_plugins download <url> [-n <name>] qp_plugins download <url> [-n <name>]
qp_plugins install <name>... qp_plugins install <name>...
qp_plugins uninstall <name> qp_plugins uninstall <name>
qp_plugins remove <name>
qp_plugins update [-r <repo>] qp_plugins update [-r <repo>]
qp_plugins create -n <name> [-r <repo>] [<needed_modules>...] qp_plugins create -n <name> [-r <repo>] [<needed_modules>...]
@ -24,6 +25,8 @@ Options:
uninstall Uninstall a plugin uninstall Uninstall a plugin
remove Uninstall a plugin
update Update the repository update Update the repository
create create
@ -274,7 +277,7 @@ def main(arguments):
subprocess.check_call(["qp_create_ninja", "update"]) subprocess.check_call(["qp_create_ninja", "update"])
print("[ OK ]") print("[ OK ]")
elif arguments["uninstall"]: elif arguments["uninstall"] or arguments["remove"]:
m_instance = ModuleHandler([QP_SRC]) m_instance = ModuleHandler([QP_SRC])
d_descendant = m_instance.dict_descendant d_descendant = m_instance.dict_descendant

View File

@ -7,12 +7,13 @@ setting all MOs as Active, except the n/2 first ones which are set as Core.
If pseudo-potentials are used, all the MOs are set as Active. If pseudo-potentials are used, all the MOs are set as Active.
Usage: Usage:
qp_set_frozen_core [-q|--query] [(-l|-s|--large|--small)] EZFIO_DIR qp_set_frozen_core [-q|--query] [(-l|-s|-u|--large|--small|--unset)] EZFIO_DIR
Options: Options:
-q --query Prints in the standard output the number of frozen MOs -q --query Prints in the standard output the number of frozen MOs
-l --large Use a small core -l --large Use a small core
-s --small Use a large core -s --small Use a large core
-u --unset Unset frozen core
Default numbers of frozen electrons: Default numbers of frozen electrons:
@ -88,7 +89,9 @@ def main(arguments):
elif charge <= 54: n_frozen += 9 elif charge <= 54: n_frozen += 9
elif charge <= 86: n_frozen += 18 elif charge <= 86: n_frozen += 18
elif charge <= 118: n_frozen += 27 elif charge <= 118: n_frozen += 27
elif arguments["--unset"]:
n_frozen = 0
else: # default else: # default
for charge in ezfio.nuclei_nucl_charge: for charge in ezfio.nuclei_nucl_charge:
if charge <= 4: pass if charge <= 4: pass

View File

@ -13,7 +13,7 @@
FC : gfortran -g -ffree-line-length-none -I . -fPIC FC : gfortran -g -ffree-line-length-none -I . -fPIC
LAPACK_LIB : -lblas -llapack LAPACK_LIB : -lblas -llapack
IRPF90 : irpf90 IRPF90 : irpf90
IRPF90_FLAGS : --ninja --align=32 --assert IRPF90_FLAGS : --ninja --align=32 --assert -DSET_NESTED
# Global options # Global options
################ ################

View File

@ -13,7 +13,7 @@
FC : gfortran -ffree-line-length-none -I . -mavx -g -fPIC FC : gfortran -ffree-line-length-none -I . -mavx -g -fPIC
LAPACK_LIB : -llapack -lblas LAPACK_LIB : -llapack -lblas
IRPF90 : irpf90 IRPF90 : irpf90
IRPF90_FLAGS : --ninja --align=32 IRPF90_FLAGS : --ninja --align=32 -DSET_NESTED
# Global options # Global options
################ ################

View File

@ -13,7 +13,7 @@
FC : gfortran -g -ffree-line-length-none -I . -fPIC FC : gfortran -g -ffree-line-length-none -I . -fPIC
LAPACK_LIB : -lblas -llapack LAPACK_LIB : -lblas -llapack
IRPF90 : irpf90 IRPF90 : irpf90
IRPF90_FLAGS : --ninja --align=32 --assert IRPF90_FLAGS : --ninja --align=32 --assert -DSET_NESTED
# Global options # Global options
################ ################

View File

@ -13,7 +13,7 @@
FC : mpif90 -ffree-line-length-none -I . -g -fPIC FC : mpif90 -ffree-line-length-none -I . -g -fPIC
LAPACK_LIB : -lblas -llapack LAPACK_LIB : -lblas -llapack
IRPF90 : irpf90 IRPF90 : irpf90
IRPF90_FLAGS : --ninja --align=32 -DMPI IRPF90_FLAGS : --ninja --align=32 -DMPI -DSET_NESTED
# Global options # Global options
################ ################

63
config/ifort_2019_avx.cfg Normal file
View File

@ -0,0 +1,63 @@
# Common flags
##############
#
# -mkl=[parallel|sequential] : Use the MKL library
# --ninja : Allow the utilisation of ninja. It is mandatory !
# --align=32 : Align all provided arrays on a 32-byte boundary
#
[COMMON]
FC : ifort -fpic
LAPACK_LIB : -mkl=parallel -lirc -lsvml -limf -lipps
IRPF90 : irpf90
IRPF90_FLAGS : --ninja --align=32 -DINTEL -DSET_NESTED
# Global options
################
#
# 1 : Activate
# 0 : Deactivate
#
[OPTION]
MODE : OPT ; [ OPT | PROFILE | DEBUG ] : Chooses the section below
CACHE : 0 ; Enable cache_compile.py
OPENMP : 1 ; Append OpenMP flags
# Optimization flags
####################
#
# -xHost : Compile a binary optimized for the current architecture
# -O2 : O3 not better than O2.
# -ip : Inter-procedural optimizations
# -ftz : Flushes denormal results to zero
#
[OPT]
FC : -traceback
FCFLAGS : -xAVX -O2 -ip -ftz -g
# Profiling flags
#################
#
[PROFILE]
FC : -p -g
FCFLAGS : -xSSE4.2 -O2 -ip -ftz
# Debugging flags
#################
#
# -traceback : Activate backtrace on runtime
# -fpe0 : All floating point exaceptions
# -C : Checks uninitialized variables, array subscripts, etc...
# -g : Extra debugging information
# -xSSE2 : Valgrind needs a very simple x86 executable
#
[DEBUG]
FC : -g -traceback
FCFLAGS : -xSSE2 -C -fpe0 -implicitnone
# OpenMP flags
#################
#
[OPENMP]
FC : -qopenmp
IRPF90_FLAGS : --openmp

View File

@ -0,0 +1,64 @@
# Common flags
##############
#
# -mkl=[parallel|sequential] : Use the MKL library
# --ninja : Allow the utilisation of ninja. It is mandatory !
# --align=32 : Align all provided arrays on a 32-byte boundary
#
[COMMON]
FC : mpiifort -fpic
LAPACK_LIB : -mkl=parallel -lirc -lsvml -limf -lipps
IRPF90 : irpf90
IRPF90_FLAGS : --ninja --align=32 -DMPI -DINTEL -DSET_NESTED
# Global options
################
#
# 1 : Activate
# 0 : Deactivate
#
[OPTION]
MODE : OPT ; [ OPT | PROFILE | DEBUG ] : Chooses the section below
CACHE : 0 ; Enable cache_compile.py
OPENMP : 1 ; Append OpenMP flags
# Optimization flags
####################
#
# -xHost : Compile a binary optimized for the current architecture
# -O2 : O3 not better than O2.
# -ip : Inter-procedural optimizations
# -ftz : Flushes denormal results to zero
#
[OPT]
FCFLAGS : -mavx -axAVX -O2 -ip -ftz -g -traceback
# Profiling flags
#################
#
[PROFILE]
FC : -p -g
FCFLAGS : -march=corei7 -O2 -ip -ftz
# Debugging flags
#################
#
# -traceback : Activate backtrace on runtime
# -fpe0 : All floating point exaceptions
# -C : Checks uninitialized variables, array subscripts, etc...
# -g : Extra debugging information
# -xSSE2 : Valgrind needs a very simple x86 executable
#
[DEBUG]
FC : -g -traceback
FCFLAGS : -xSSE2 -C -fpe0 -implicitnone
# OpenMP flags
#################
#
[OPENMP]
FC : -qopenmp
IRPF90_FLAGS : --openmp

View File

@ -0,0 +1,63 @@
# Common flags
##############
#
# -mkl=[parallel|sequential] : Use the MKL library
# --ninja : Allow the utilisation of ninja. It is mandatory !
# --align=32 : Align all provided arrays on a 32-byte boundary
#
[COMMON]
FC : mpiifort -fpic
LAPACK_LIB : -mkl=parallel -lirc -lsvml -limf -lipps
IRPF90 : irpf90
IRPF90_FLAGS : --ninja --align=32 -DINTEL -DSET_NESTED
# Global options
################
#
# 1 : Activate
# 0 : Deactivate
#
[OPTION]
MODE : OPT ; [ OPT | PROFILE | DEBUG ] : Chooses the section below
CACHE : 0 ; Enable cache_compile.py
OPENMP : 1 ; Append OpenMP flags
# Optimization flags
####################
#
# -xHost : Compile a binary optimized for the current architecture
# -O2 : O3 not better than O2.
# -ip : Inter-procedural optimizations
# -ftz : Flushes denormal results to zero
#
[OPT]
FC : -traceback -shared-intel
FCFLAGS : -O2 -ip -g -march=core-avx2 -align array64byte -fma -ftz -fomit-frame-pointer
# Profiling flags
#################
#
[PROFILE]
FC : -p -g
FCFLAGS : -xSSE4.2 -O2 -ip -ftz
# Debugging flags
#################
#
# -traceback : Activate backtrace on runtime
# -fpe0 : All floating point exaceptions
# -C : Checks uninitialized variables, array subscripts, etc...
# -g : Extra debugging information
# -xSSE2 : Valgrind needs a very simple x86 executable
#
[DEBUG]
FC : -g -traceback
FCFLAGS : -xSSE2 -C -fpe0 -implicitnone
# OpenMP flags
#################
#
[OPENMP]
FC : -qopenmp
IRPF90_FLAGS : --openmp

View File

@ -0,0 +1,63 @@
# Common flags
##############
#
# -mkl=[parallel|sequential] : Use the MKL library
# --ninja : Allow the utilisation of ninja. It is mandatory !
# --align=32 : Align all provided arrays on a 32-byte boundary
#
[COMMON]
FC : ifort -fpic
LAPACK_LIB : -mkl=parallel -lirc -lsvml -limf -lipps
IRPF90 : irpf90
IRPF90_FLAGS : --ninja --align=32 -DINTEL -DSET_NESTED
# Global options
################
#
# 1 : Activate
# 0 : Deactivate
#
[OPTION]
MODE : OPT ; [ OPT | PROFILE | DEBUG ] : Chooses the section below
CACHE : 0 ; Enable cache_compile.py
OPENMP : 1 ; Append OpenMP flags
# Optimization flags
####################
#
# -xHost : Compile a binary optimized for the current architecture
# -O2 : O3 not better than O2.
# -ip : Inter-procedural optimizations
# -ftz : Flushes denormal results to zero
#
[OPT]
FC : -traceback -shared-intel
FCFLAGS : -O2 -ip -g -march=core-avx2 -align array64byte -fma -ftz -fomit-frame-pointer
# Profiling flags
#################
#
[PROFILE]
FC : -p -g
FCFLAGS : -xSSE4.2 -O2 -ip -ftz
# Debugging flags
#################
#
# -traceback : Activate backtrace on runtime
# -fpe0 : All floating point exaceptions
# -C : Checks uninitialized variables, array subscripts, etc...
# -g : Extra debugging information
# -xSSE2 : Valgrind needs a very simple x86 executable
#
[DEBUG]
FC : -g -traceback
FCFLAGS : -xSSE2 -C -fpe0 -implicitnone
# OpenMP flags
#################
#
[OPENMP]
FC : -qopenmp
IRPF90_FLAGS : --openmp

View File

@ -0,0 +1,63 @@
# Common flags
##############
#
# -mkl=[parallel|sequential] : Use the MKL library
# --ninja : Allow the utilisation of ninja. It is mandatory !
# --align=32 : Align all provided arrays on a 32-byte boundary
#
[COMMON]
FC : ifort -fpic
LAPACK_LIB : -mkl=parallel -lirc -lsvml -limf -lipps
IRPF90 : irpf90
IRPF90_FLAGS : --ninja --align=32 -DINTEL -DSET_NESTED
# Global options
################
#
# 1 : Activate
# 0 : Deactivate
#
[OPTION]
MODE : OPT ; [ OPT | PROFILE | DEBUG ] : Chooses the section below
CACHE : 0 ; Enable cache_compile.py
OPENMP : 1 ; Append OpenMP flags
# Optimization flags
####################
#
# -xHost : Compile a binary optimized for the current architecture
# -O2 : O3 not better than O2.
# -ip : Inter-procedural optimizations
# -ftz : Flushes denormal results to zero
#
[OPT]
FC : -traceback
FCFLAGS : -xSSE4.2 -O2 -ip -ftz -g
# Profiling flags
#################
#
[PROFILE]
FC : -p -g
FCFLAGS : -xSSE4.2 -O2 -ip -ftz
# Debugging flags
#################
#
# -traceback : Activate backtrace on runtime
# -fpe0 : All floating point exaceptions
# -C : Checks uninitialized variables, array subscripts, etc...
# -g : Extra debugging information
# -xSSE2 : Valgrind needs a very simple x86 executable
#
[DEBUG]
FC : -g -traceback
FCFLAGS : -xSSE2 -C -fpe0 -implicitnone
# OpenMP flags
#################
#
[OPENMP]
FC : -qopenmp
IRPF90_FLAGS : --openmp

View File

@ -0,0 +1,64 @@
# Common flags
##############
#
# -mkl=[parallel|sequential] : Use the MKL library
# --ninja : Allow the utilisation of ninja. It is mandatory !
# --align=32 : Align all provided arrays on a 32-byte boundary
#
[COMMON]
FC : mpiifort -fpic
LAPACK_LIB : -mkl=parallel -lirc -lsvml -limf -lipps
IRPF90 : irpf90
IRPF90_FLAGS : --ninja --align=32 -DMPI -DINTEL -DSET_NESTED
# Global options
################
#
# 1 : Activate
# 0 : Deactivate
#
[OPTION]
MODE : OPT ; [ OPT | PROFILE | DEBUG ] : Chooses the section below
CACHE : 0 ; Enable cache_compile.py
OPENMP : 1 ; Append OpenMP flags
# Optimization flags
####################
#
# -xHost : Compile a binary optimized for the current architecture
# -O2 : O3 not better than O2.
# -ip : Inter-procedural optimizations
# -ftz : Flushes denormal results to zero
#
[OPT]
FCFLAGS : -msse4.2 -O2 -ip -ftz -g -traceback
# Profiling flags
#################
#
[PROFILE]
FC : -p -g
FCFLAGS : -msse4.2 -O2 -ip -ftz
# Debugging flags
#################
#
# -traceback : Activate backtrace on runtime
# -fpe0 : All floating point exaceptions
# -C : Checks uninitialized variables, array subscripts, etc...
# -g : Extra debugging information
# -xSSE2 : Valgrind needs a very simple x86 executable
#
[DEBUG]
FC : -g -traceback
FCFLAGS : -xSSE2 -C -fpe0 -implicitnone
# OpenMP flags
#################
#
[OPENMP]
FC : -qopenmp
IRPF90_FLAGS : --openmp

View File

@ -0,0 +1,63 @@
# Common flags
##############
#
# -mkl=[parallel|sequential] : Use the MKL library
# --ninja : Allow the utilisation of ninja. It is mandatory !
# --align=32 : Align all provided arrays on a 32-byte boundary
#
[COMMON]
FC : ifort -fpic
LAPACK_LIB : -mkl=parallel -lirc -lsvml -limf -lipps
IRPF90 : irpf90
IRPF90_FLAGS : --ninja --align=64 -DINTEL -DSET_NESTED
# Global options
################
#
# 1 : Activate
# 0 : Deactivate
#
[OPTION]
MODE : OPT ; [ OPT | PROFILE | DEBUG ] : Chooses the section below
CACHE : 0 ; Enable cache_compile.py
OPENMP : 1 ; Append OpenMP flags
# Optimization flags
####################
#
# -xHost : Compile a binary optimized for the current architecture
# -O2 : O3 not better than O2.
# -ip : Inter-procedural optimizations
# -ftz : Flushes denormal results to zero
#
[OPT]
FC : -traceback
FCFLAGS : -xHost -O2 -ip -ftz -g
# Profiling flags
#################
#
[PROFILE]
FC : -p -g
FCFLAGS : -xSSE4.2 -O2 -ip -ftz
# Debugging flags
#################
#
# -traceback : Activate backtrace on runtime
# -fpe0 : All floating point exaceptions
# -C : Checks uninitialized variables, array subscripts, etc...
# -g : Extra debugging information
# -xSSE2 : Valgrind needs a very simple x86 executable
#
[DEBUG]
FC : -g -traceback
FCFLAGS : -xSSE2 -C -fpe0 -implicitnone
# OpenMP flags
#################
#
[OPENMP]
FC : -qopenmp
IRPF90_FLAGS : --openmp

View File

@ -0,0 +1,63 @@
# Common flags
##############
#
# -mkl=[parallel|sequential] : Use the MKL library
# --ninja : Allow the utilisation of ninja. It is mandatory !
# --align=32 : Align all provided arrays on a 32-byte boundary
#
[COMMON]
FC : mpiifort -fpic
LAPACK_LIB : -mkl=parallel -lirc -lsvml -limf -lipps
IRPF90 : irpf90
IRPF90_FLAGS : --ninja --align=32 -DINTEL
# Global options
################
#
# 1 : Activate
# 0 : Deactivate
#
[OPTION]
MODE : OPT ; [ OPT | PROFILE | DEBUG ] : Chooses the section below
CACHE : 0 ; Enable cache_compile.py
OPENMP : 1 ; Append OpenMP flags
# Optimization flags
####################
#
# -xHost : Compile a binary optimized for the current architecture
# -O2 : O3 not better than O2.
# -ip : Inter-procedural optimizations
# -ftz : Flushes denormal results to zero
#
[OPT]
FC : -traceback -shared-intel
FCFLAGS : -O2 -ip -g -march=core-avx2 -align array64byte -fma -ftz -fomit-frame-pointer
# Profiling flags
#################
#
[PROFILE]
FC : -p -g
FCFLAGS : -xSSE4.2 -O2 -ip -ftz
# Debugging flags
#################
#
# -traceback : Activate backtrace on runtime
# -fpe0 : All floating point exaceptions
# -C : Checks uninitialized variables, array subscripts, etc...
# -g : Extra debugging information
# -xSSE2 : Valgrind needs a very simple x86 executable
#
[DEBUG]
FC : -g -traceback
FCFLAGS : -xSSE2 -C -fpe0 -implicitnone
# OpenMP flags
#################
#
[OPENMP]
FC : -qopenmp
IRPF90_FLAGS : --openmp

View File

@ -9,7 +9,7 @@
FC : ifort -fpic FC : ifort -fpic
LAPACK_LIB : -mkl=parallel -lirc -lsvml -limf -lipps LAPACK_LIB : -mkl=parallel -lirc -lsvml -limf -lipps
IRPF90 : irpf90 IRPF90 : irpf90
IRPF90_FLAGS : --ninja --align=32 -DINTEL IRPF90_FLAGS : --ninja --align=32 -DINTEL
# Global options # Global options
################ ################

View File

@ -9,7 +9,7 @@
FC : ifort -fpic FC : ifort -fpic
LAPACK_LIB : -mkl=parallel -lirc -lsvml -limf -lipps LAPACK_LIB : -mkl=parallel -lirc -lsvml -limf -lipps
IRPF90 : irpf90 IRPF90 : irpf90
IRPF90_FLAGS : --ninja --align=32 -DINTEL IRPF90_FLAGS : --ninja --align=32 -DINTEL
# Global options # Global options
################ ################

View File

@ -1,66 +0,0 @@
# Common flags
##############
#
# -mkl=[parallel|sequential] : Use the MKL library
# --ninja : Allow the utilisation of ninja. It is mandatory !
# --align=32 : Align all provided arrays on a 32-byte boundary
#
[COMMON]
FC : ifort -fpic
LAPACK_LIB : -mkl=parallel -lirc -lsvml -limf -lipps
IRPF90 : irpf90
IRPF90_FLAGS : --ninja --align=32 --assert -DINTEL
# Global options
################
#
# 1 : Activate
# 0 : Deactivate
#
[OPTION]
MODE : DEBUG ; [ OPT | PROFILE | DEBUG ] : Chooses the section below
CACHE : 0 ; Enable cache_compile.py
OPENMP : 1 ; Append OpenMP flags
# Optimization flags
####################
#
# -xHost : Compile a binary optimized for the current architecture
# -O2 : O3 not better than O2.
# -ip : Inter-procedural optimizations
# -ftz : Flushes denormal results to zero
#
[OPT]
FC : -traceback
FCFLAGS : -msse4.2 -O2 -ip -ftz -g
# Profiling flags
#################
#
[PROFILE]
FC : -p -g
FCFLAGS : -msse4.2 -O2 -ip -ftz
# Debugging flags
#################
#
# -traceback : Activate backtrace on runtime
# -fpe0 : All floating point exaceptions
# -C : Checks uninitialized variables, array subscripts, etc...
# -g : Extra debugging information
# -msse4.2 : Valgrind needs a very simple x86 executable
#
[DEBUG]
FC : -g -traceback
FCFLAGS : -msse4.2 -check all -debug all -fpe-all=0 -implicitnone
# OpenMP flags
#################
#
[OPENMP]
FC : -qopenmp
IRPF90_FLAGS : --openmp

View File

@ -204,6 +204,9 @@ _qp_Complete()
uninstall) uninstall)
COMPREPLY=( $(compgen -W "$(qp_plugins list -i)" -- $cur ) ) COMPREPLY=( $(compgen -W "$(qp_plugins list -i)" -- $cur ) )
return 0;; return 0;;
remove)
COMPREPLY=( $(compgen -W "$(qp_plugins list -i)" -- $cur ) )
return 0;;
create) create)
COMPREPLY=( $(compgen -W "-n " -- $cur ) ) COMPREPLY=( $(compgen -W "-n " -- $cur ) )
return 0;; return 0;;

View File

@ -277,6 +277,16 @@ let run ?o b au c d m p cart xyz_file =
) nuclei ) nuclei
in in
let z_core =
List.map (fun x ->
Positive_int.to_int x.Pseudo.n_elec
|> float_of_int
) pseudo
in
let nucl_num = (List.length z_core) in
Ezfio.set_pseudo_nucl_charge_remove (Ezfio.ezfio_array_of_list
~rank:1 ~dim:[| nucl_num |] ~data:z_core);
let molecule = let molecule =
let n_elec_to_remove = let n_elec_to_remove =
List.fold_left (fun accu x -> List.fold_left (fun accu x ->
@ -585,12 +595,16 @@ let run ?o b au c d m p cart xyz_file =
let shell_prim_num = let shell_prim_num =
list_map List.length lc list_map List.length lc
in in
let shell_prim_idx = let shell_idx =
let rec make_list n accu = function
| 0 -> accu
| i -> make_list n (n :: accu) (i-1)
in
let rec aux count accu = function let rec aux count accu = function
| [] -> List.rev accu | [] -> List.rev accu
| l::rest -> | l::rest ->
let newcount = count+(List.length l) in let new_l = make_list count accu (List.length l) in
aux newcount (count::accu) rest aux (count+1) new_l rest
in in
aux 1 [] lc aux 1 [] lc
in in
@ -602,20 +616,12 @@ let run ?o b au c d m p cart xyz_file =
~rank:1 ~dim:[| shell_num |] ~data:shell_prim_num); ~rank:1 ~dim:[| shell_num |] ~data:shell_prim_num);
Ezfio.set_basis_shell_ang_mom (Ezfio.ezfio_array_of_list Ezfio.set_basis_shell_ang_mom (Ezfio.ezfio_array_of_list
~rank:1 ~dim:[| shell_num |] ~data:ang_mom ) ; ~rank:1 ~dim:[| shell_num |] ~data:ang_mom ) ;
Ezfio.set_basis_shell_prim_index (Ezfio.ezfio_array_of_list Ezfio.set_basis_shell_index (Ezfio.ezfio_array_of_list
~rank:1 ~dim:[| shell_num |] ~data:shell_prim_idx) ; ~rank:1 ~dim:[| prim_num |] ~data:shell_idx) ;
Ezfio.set_basis_basis_nucleus_index (Ezfio.ezfio_array_of_list Ezfio.set_basis_basis_nucleus_index (Ezfio.ezfio_array_of_list
~rank:1 ~dim:[| nucl_num |] ~rank:1 ~dim:[| shell_num |]
~data:( ~data:( list_map (fun (_,n) -> Nucl_number.to_int n) basis)
list_map (fun (_,n) -> Nucl_number.to_int n) basis ) ;
|> List.fold_left (fun accu i ->
match accu with
| [] -> []
| (h,j) :: rest -> if j == i then ((h+1,j)::rest) else ((h+1,i)::(h+1,j)::rest)
) [(0,0)]
|> List.rev
|> List.map fst
)) ;
Ezfio.set_basis_nucleus_shell_num(Ezfio.ezfio_array_of_list Ezfio.set_basis_nucleus_shell_num(Ezfio.ezfio_array_of_list
~rank:1 ~dim:[| nucl_num |] ~rank:1 ~dim:[| nucl_num |]
~data:( ~data:(

27
scripts/cipsi_save.sh Normal file
View File

@ -0,0 +1,27 @@
#!/bin/bash
#
# This script runs a CIPSI calculation as a sequence of single CIPSI iterations.
# After each iteration, the EZFIO directory is saved.
#
# Usage: cipsi_save [EZFIO_FILE] [NDET]
#
# Example: cipsi_save file.ezfio 10000
EZ=$1
NDETMAX=$2
qp set_file ${EZ}
qp reset -d
qp set determinants read_wf true
declare -i NDET
NDET=1
while [[ ${NDET} -lt ${NDETMAX} ]]
do
NDET=$(($NDET + $NDET))
qp set determinants n_det_max $NDET
qp run fci > ${EZ}.out
NDET=$(qp get determinants n_det)
mv ${EZ}.out ${EZ}.${NDET}.out
cp -r ${EZ} ${EZ}.${NDET}
done

View File

@ -116,6 +116,7 @@ def get_l_module_descendant(d_child, l_module):
print("Error: ", file=sys.stderr) print("Error: ", file=sys.stderr)
print("`{0}` is not a submodule".format(module), file=sys.stderr) print("`{0}` is not a submodule".format(module), file=sys.stderr)
print("Check the typo (spelling, case, '/', etc.) ", file=sys.stderr) print("Check the typo (spelling, case, '/', etc.) ", file=sys.stderr)
# pass
sys.exit(1) sys.exit(1)
return list(set(l)) return list(set(l))

View File

@ -0,0 +1,175 @@
program check_omp_v2
use omp_lib
implicit none
integer :: accu, accu2
integer :: s, n_setting
logical :: verbose, test_versions
logical, allocatable :: is_working(:)
verbose = .False.
test_versions = .True.
n_setting = 4
allocate(is_working(n_setting))
is_working = .False.
! set the number of threads
call omp_set_num_threads(2)
do s = 1, n_setting
accu = 0
accu2 = 0
call omp_set_max_active_levels(1)
call omp_set_nested(.False.)
if (s==1) then
!call set_multiple_levels_omp()
cycle
elseif (s==2) then
call omp_set_max_active_levels(5)
elseif (s==3) then
call omp_set_nested(.True.)
else
call omp_set_nested(.True.)
call omp_set_max_active_levels(5)
endif
! Level 1
!$OMP PARALLEL
if (verbose) then
print*,'Num threads level 1:',omp_get_num_threads()
endif
! Level 2
!$OMP PARALLEL
if (verbose) then
print*,'Num threads level 2:',omp_get_num_threads()
endif
! Level 3
!$OMP PARALLEL
if (verbose) then
print*,'Num threads level 3:',omp_get_num_threads()
endif
call check_omp_in_subroutine(accu2)
! Level 4
!$OMP PARALLEL
if (verbose) then
print*,'Num threads level 4:',omp_get_num_threads()
endif
!$OMP ATOMIC
accu = accu + 1
!$OMP END ATOMIC
!$OMP END PARALLEL
!$OMP END PARALLEL
!$OMP END PARALLEL
!$OMP END PARALLEL
if (verbose) then
print*,'Setting:',s,'accu=',accu
print*,'Setting:',s,'accu2=',accu2
endif
if (accu == 16 .and. accu2 == 16) then
is_working(s) = .True.
endif
enddo
if (verbose) then
if (is_working(2)) then
print*,'The parallelization works on 4 levels with:'
print*,'call omp_set_max_active_levels(5)'
print*,''
print*,'Please use the irpf90 flags -DSET_MAX_ACT in qp2/config/${compiler_name}.cfg'
elseif (is_working(3)) then
print*,'The parallelization works on 4 levels with:'
print*,'call omp_set_nested(.True.)'
print*,''
print*,'Please use the irpf90 flag -DSET_NESTED in qp2/config/${compiler_name}.cfg'
elseif (is_working(4)) then
print*,'The parallelization works on 4 levels with:'
print*,'call omp_set_nested(.True.)'
print*,'+'
print*,'call omp_set_max_active_levels(5)'
print*,''
print*,'Please use the irpf90 flags -DSET_NESTED -DSET_MAX_ACT in qp2/config/${compiler_name}.cfg'
else
print*,'The parallelization on multiple levels does not work with:'
print*,'call omp_set_max_active_levels(5)'
print*,'or'
print*,'call omp_set_nested(.True.)'
print*,'or'
print*,'call omp_set_nested(.True.)'
print*,'+'
print*,'call omp_set_max_active_levels(5)'
print*,''
print*,'Try an other compiler and good luck...'
endif
! if (is_working(1)) then
! print*,''
! print*,'=========================================================='
! print*,'Your actual set up works for parallelization with 4 levels'
! print*,'=========================================================='
! print*,''
! else
! print*,''
! print*,'==================================================================='
! print*,'Your actual set up does not work for parallelization with 4 levels'
! print*,'Please look at the previous messages to understand the requirements'
! print*,'==================================================================='
! print*,''
! endif
endif
! List of working flags
if (test_versions) then
print*,'Tests:',is_working(2:4)
endif
! IRPF90_FLAGS
if (is_working(2)) then
print*,'-DSET_MAX_ACT'
elseif (is_working(3)) then
print*,'-DSET_NESTED'
elseif (is_working(4)) then
print*,'-DSET_MAX_ACT -DSET_NESTED'
else
print*,'ERROR'
endif
end
subroutine check_omp_in_subroutine(accu2)
implicit none
integer, intent(inout) :: accu2
!$OMP PARALLEL
!$OMP ATOMIC
accu2 = accu2 + 1
!$OMP END ATOMIC
!$OMP END PARALLEL
end

View File

@ -0,0 +1,19 @@
#!/bin/sh
# take one argument which is the compiler used
# return the required IRPF90_FLAGS for the $1 compiler
if [ -z "$1" ]
then
echo "Give the compiler in argument"
else
$1 --version > /dev/null \
&& $1 -O0 -fopenmp check_omp.f90 \
&& ./a.out | tail -n 1
# if there is an error or if the compiler is not found
$1 --version > /dev/null || echo 'compiler not found'
fi

30
scripts/verif_omp/study_omp.sh Executable file
View File

@ -0,0 +1,30 @@
#!/bin/sh
# list of compilers
list_comp="ifort gfortran-7 gfortran-8 gfortran-9"
# file to store the results
FILE=results.dat
touch $FILE
rm $FILE
# Comments
echo "1: omp_set_max_active_levels(5)" >> $FILE
echo "2: omp_set_nested(.True.)" >> $FILE
echo "3: 1 + 2" >> $FILE
echo "" >> $FILE
echo "1 2 3" >> $FILE
# loop on the comp
for comp in $list_comp
do
$comp --version > /dev/null \
&& $comp -O0 -fopenmp check_omp.f90 \
&& echo $(./a.out | grep "Tests:" | cut -d ":" -f2- ) $(echo " : ") $($comp --version | head -n 1) >> $FILE
done
# Display
cat $FILE

View File

@ -0,0 +1,49 @@
#!/bin/bash
# Compiler
COMP=$1
# Path to file.cfg
config_PATH="../../config/"
END="*.cfg"
CONFIG="/config/"
#LIST=${config_PATH}${COMP}${END} # without ${QP_ROOT}
LIST=${QP_ROOT}${CONFIG}${COMP}${END}
if [ -z "$1" ]
then
echo "Give the compiler in argument"
else
# List of the config files for the compiler
#list_files=$(ls ../../config/$comp*.cfg) #does not give the right list
list_files=${LIST}
echo "Files that will be modified:"
echo $list_files
# Flags that must be added
FLAGS=$(./check_required_setup.sh $COMP)
# Add the flags
for file in $list_files
do
echo $file
BASE="IRPF90_FLAGS : --ninja"
ACTUAL=$(grep "$BASE" $file)
# To have only one time each flag
grep " -DSET_MAX_ACT" $file && ${ACTUAL/" -DSET_MAX"/""}
grep " -DSET_NESTED" $file && ${ACTUAL/" -DSET_NESTED"/""}
SPACE=" "
NEW=${ACTUAL}${SPACE}${FLAGS}
# Debug
#echo ${NEW}
sed "s/${ACTUAL}/${NEW}/" $file
# -i # to change the files
done
fi

View File

@ -12,21 +12,21 @@ double precision function ao_value(i,r)
integer :: power_ao(3) integer :: power_ao(3)
double precision :: accu,dx,dy,dz,r2 double precision :: accu,dx,dy,dz,r2
num_ao = ao_nucl(i) num_ao = ao_nucl(i)
power_ao(1:3)= ao_power(i,1:3) ! power_ao(1:3)= ao_power(i,1:3)
center_ao(1:3) = nucl_coord(num_ao,1:3) ! center_ao(1:3) = nucl_coord(num_ao,1:3)
dx = (r(1) - center_ao(1)) ! dx = (r(1) - center_ao(1))
dy = (r(2) - center_ao(2)) ! dy = (r(2) - center_ao(2))
dz = (r(3) - center_ao(3)) ! dz = (r(3) - center_ao(3))
r2 = dx*dx + dy*dy + dz*dz ! r2 = dx*dx + dy*dy + dz*dz
dx = dx**power_ao(1) ! dx = dx**power_ao(1)
dy = dy**power_ao(2) ! dy = dy**power_ao(2)
dz = dz**power_ao(3) ! dz = dz**power_ao(3)
accu = 0.d0 accu = 0.d0
do m=1,ao_prim_num(i) ! do m=1,ao_prim_num(i)
beta = ao_expo_ordered_transp(m,i) ! beta = ao_expo_ordered_transp(m,i)
accu += ao_coef_normalized_ordered_transp(m,i) * dexp(-beta*r2) ! accu += ao_coef_normalized_ordered_transp(m,i) * dexp(-beta*r2)
enddo ! enddo
ao_value = accu * dx * dy * dz ao_value = accu * dx * dy * dz
end end

View File

@ -1,7 +1,7 @@
! Spherical to cartesian transformation matrix obtained with ! Spherical to cartesian transformation matrix obtained with
! Horton (http://theochem.github.com/horton/, 2015) ! Horton (http://theochem.github.com/horton/, 2015)
! First index is the index of the carteisan AO, obtained by ao_power_index ! First index is the index of the cartesian AO, obtained by ao_power_index
! Second index is the index of the spherical AO ! Second index is the index of the spherical AO
BEGIN_PROVIDER [ double precision, cart_to_sphe_0, (1,1) ] BEGIN_PROVIDER [ double precision, cart_to_sphe_0, (1,1) ]

View File

@ -37,16 +37,16 @@ doc: Number of primitives in a shell
size: (basis.shell_num) size: (basis.shell_num)
interface: ezfio, provider interface: ezfio, provider
[shell_prim_index] [shell_index]
type: integer type: integer
doc: Index of the first primitive of the shell doc: Index of the shell for each primitive
size: (basis.shell_num) size: (basis.prim_num)
interface: ezfio, provider interface: ezfio, provider
[basis_nucleus_index] [basis_nucleus_index]
type: integer type: integer
doc: Index of the nucleus on which the shell is centered doc: Nucleus on which the shell is centered
size: (nuclei.nucl_num) size: (basis.shell_num)
interface: ezfio, provider interface: ezfio, provider
[prim_normalization_factor] [prim_normalization_factor]

View File

@ -30,8 +30,10 @@ BEGIN_PROVIDER [ double precision, shell_normalization_factor , (shell_num) ]
powA(3) = 0 powA(3) = 0
norm = 0.d0 norm = 0.d0
do k=shell_prim_index(i),shell_prim_index(i)+shell_prim_num(i)-1 do k=1, prim_num
do j=shell_prim_index(i),shell_prim_index(i)+shell_prim_num(i)-1 if (shell_index(k) /= i) cycle
do j=1, prim_num
if (shell_index(j) /= i) cycle
call overlap_gaussian_xyz(C_A,C_A,prim_expo(j),prim_expo(k), & call overlap_gaussian_xyz(C_A,C_A,prim_expo(j),prim_expo(k), &
powA,powA,overlap_x,overlap_y,overlap_z,c,nz) powA,powA,overlap_x,overlap_y,overlap_z,c,nz)
norm = norm+c*prim_coef(j)*prim_coef(k) * prim_normalization_factor(j) * prim_normalization_factor(k) norm = norm+c*prim_coef(j)*prim_coef(k) * prim_normalization_factor(j) * prim_normalization_factor(k)
@ -91,7 +93,8 @@ BEGIN_PROVIDER [ double precision, prim_normalization_factor , (prim_num) ]
powA(2) = 0 powA(2) = 0
powA(3) = 0 powA(3) = 0
do k=shell_prim_index(i),shell_prim_index(i)+shell_prim_num(i)-1 do k=1, prim_num
if (shell_index(k) /= i) cycle
call overlap_gaussian_xyz(C_A,C_A,prim_expo(k),prim_expo(k), & call overlap_gaussian_xyz(C_A,C_A,prim_expo(k),prim_expo(k), &
powA,powA,overlap_x,overlap_y,overlap_z,norm,nz) powA,powA,overlap_x,overlap_y,overlap_z,norm,nz)
prim_normalization_factor(k) = 1.d0/dsqrt(norm) prim_normalization_factor(k) = 1.d0/dsqrt(norm)

View File

@ -58,3 +58,17 @@ END_PROVIDER
enddo enddo
END_PROVIDER END_PROVIDER
BEGIN_PROVIDER [double precision, final_grid_points_transp, (n_points_final_grid,3)]
implicit none
BEGIN_DOC
! Transposed final_grid_points
END_DOC
integer :: i,j
do j=1,3
do i=1,n_points_final_grid
final_grid_points_transp(i,j) = final_grid_points(j,i)
enddo
enddo
END_PROVIDER

View File

@ -268,6 +268,21 @@ subroutine print_spindet(string,Nint)
end end
subroutine print_det_one_dimension(string,Nint)
use bitmasks
implicit none
BEGIN_DOC
! Subroutine to print the content of a determinant using the '+-' notation
END_DOC
integer, intent(in) :: Nint
integer(bit_kind), intent(in) :: string(Nint)
character*(2048) :: output(1)
call bitstring_to_str( output(1), string, Nint )
print *, trim(output(1))
end
logical function is_integer_in_string(bite,string,Nint) logical function is_integer_in_string(bite,string,Nint)
use bitmasks use bitmasks
implicit none implicit none

View File

@ -1,9 +1,3 @@
[pert_2rdm]
type: logical
doc: If true, computes the one- and two-body rdms with perturbation theory
interface: ezfio,provider,ocaml
default: False
[save_wf_after_selection] [save_wf_after_selection]
type: logical type: logical
doc: If true, saves the wave function after the selection, before the diagonalization doc: If true, saves the wave function after the selection, before the diagonalization
@ -40,3 +34,9 @@ doc: Maximum number of excitation for beta determinants with respect to the Hart
interface: ezfio,ocaml,provider interface: ezfio,ocaml,provider
default: -1 default: -1
[twice_hierarchy_max]
type: integer
doc: Twice the maximum hierarchy parameter (excitation degree plus half the seniority number). Using -1 selects all determinants
interface: ezfio,ocaml,provider
default: -1

View File

@ -2,5 +2,4 @@ perturbation
zmq zmq
mpi mpi
iterations iterations
two_body_rdm
csf csf

View File

@ -70,8 +70,8 @@ subroutine run_cipsi
do while ( & do while ( &
(N_det < N_det_max) .and. & (N_det < N_det_max) .and. &
(maxval(abs(pt2_data % pt2(1:N_states))) > pt2_max) .and. & (sum(abs(pt2_data % pt2(1:N_states)) * state_average_weight(1:N_states)) > pt2_max) .and. &
(maxval(abs(pt2_data % variance(1:N_states))) > variance_max) .and. & (sum(abs(pt2_data % variance(1:N_states)) * state_average_weight(1:N_states)) > variance_max) .and. &
(correlation_energy_ratio <= correlation_energy_ratio_max) & (correlation_energy_ratio <= correlation_energy_ratio_max) &
) )
write(*,'(A)') '--------------------------------------------------------------------------------' write(*,'(A)') '--------------------------------------------------------------------------------'

View File

@ -1,183 +0,0 @@
use bitmasks
use omp_lib
BEGIN_PROVIDER [ integer(omp_lock_kind), pert_2rdm_lock]
use f77_zmq
implicit none
call omp_init_lock(pert_2rdm_lock)
END_PROVIDER
BEGIN_PROVIDER [integer, n_orb_pert_rdm]
implicit none
n_orb_pert_rdm = n_act_orb
END_PROVIDER
BEGIN_PROVIDER [integer, list_orb_reverse_pert_rdm, (mo_num)]
implicit none
list_orb_reverse_pert_rdm = list_act_reverse
END_PROVIDER
BEGIN_PROVIDER [integer, list_orb_pert_rdm, (n_orb_pert_rdm)]
implicit none
list_orb_pert_rdm = list_act
END_PROVIDER
BEGIN_PROVIDER [double precision, pert_2rdm_provider, (n_orb_pert_rdm,n_orb_pert_rdm,n_orb_pert_rdm,n_orb_pert_rdm)]
implicit none
pert_2rdm_provider = 0.d0
END_PROVIDER
subroutine fill_buffer_double_rdm(i_generator, sp, h1, h2, bannedOrb, banned, fock_diag_tmp, E0, pt2_data, mat, buf, psi_det_connection, psi_coef_connection_reverse, n_det_connection)
use bitmasks
use selection_types
implicit none
integer, intent(in) :: n_det_connection
double precision, intent(in) :: psi_coef_connection_reverse(N_states,n_det_connection)
integer(bit_kind), intent(in) :: psi_det_connection(N_int,2,n_det_connection)
integer, intent(in) :: i_generator, sp, h1, h2
double precision, intent(in) :: mat(N_states, mo_num, mo_num)
logical, intent(in) :: bannedOrb(mo_num, 2), banned(mo_num, mo_num)
double precision, intent(in) :: fock_diag_tmp(mo_num)
double precision, intent(in) :: E0(N_states)
type(pt2_type), intent(inout) :: pt2_data
type(selection_buffer), intent(inout) :: buf
logical :: ok
integer :: s1, s2, p1, p2, ib, j, istate, jstate
integer(bit_kind) :: mask(N_int, 2), det(N_int, 2)
double precision :: e_pert, delta_E, val, Hii, sum_e_pert, tmp, alpha_h_psi, coef(N_states)
double precision, external :: diag_H_mat_elem_fock
double precision :: E_shift
logical, external :: detEq
double precision, allocatable :: values(:)
integer, allocatable :: keys(:,:)
integer :: nkeys
integer :: sze_buff
sze_buff = 5 * mo_num ** 2
allocate(keys(4,sze_buff),values(sze_buff))
nkeys = 0
if(sp == 3) then
s1 = 1
s2 = 2
else
s1 = sp
s2 = sp
end if
call apply_holes(psi_det_generators(1,1,i_generator), s1, h1, s2, h2, mask, ok, N_int)
E_shift = 0.d0
if (h0_type == 'CFG') then
j = det_to_configuration(i_generator)
E_shift = psi_det_Hii(i_generator) - psi_configuration_Hii(j)
endif
do p1=1,mo_num
if(bannedOrb(p1, s1)) cycle
ib = 1
if(sp /= 3) ib = p1+1
do p2=ib,mo_num
! -----
! /!\ Generating only single excited determinants doesn't work because a
! determinant generated by a single excitation may be doubly excited wrt
! to a determinant of the future. In that case, the determinant will be
! detected as already generated when generating in the future with a
! double excitation.
!
! if (.not.do_singles) then
! if ((h1 == p1) .or. (h2 == p2)) then
! cycle
! endif
! endif
!
! if (.not.do_doubles) then
! if ((h1 /= p1).and.(h2 /= p2)) then
! cycle
! endif
! endif
! -----
if(bannedOrb(p2, s2)) cycle
if(banned(p1,p2)) cycle
if( sum(abs(mat(1:N_states, p1, p2))) == 0d0) cycle
call apply_particles(mask, s1, p1, s2, p2, det, ok, N_int)
if (do_only_cas) then
integer, external :: number_of_holes, number_of_particles
if (number_of_particles(det)>0) then
cycle
endif
if (number_of_holes(det)>0) then
cycle
endif
endif
if (do_ddci) then
logical, external :: is_a_two_holes_two_particles
if (is_a_two_holes_two_particles(det)) then
cycle
endif
endif
if (do_only_1h1p) then
logical, external :: is_a_1h1p
if (.not.is_a_1h1p(det)) cycle
endif
Hii = diag_H_mat_elem_fock(psi_det_generators(1,1,i_generator),det,fock_diag_tmp,N_int)
sum_e_pert = 0d0
integer :: degree
call get_excitation_degree(det,HF_bitmask,degree,N_int)
if(degree == 2)cycle
do istate=1,N_states
delta_E = E0(istate) - Hii + E_shift
alpha_h_psi = mat(istate, p1, p2)
val = alpha_h_psi + alpha_h_psi
tmp = dsqrt(delta_E * delta_E + val * val)
if (delta_E < 0.d0) then
tmp = -tmp
endif
e_pert = 0.5d0 * (tmp - delta_E)
coef(istate) = e_pert / alpha_h_psi
print*,e_pert,coef,alpha_h_psi
pt2_data % pt2(istate) += e_pert
pt2_data % variance(istate) += alpha_h_psi * alpha_h_psi
enddo
do istate=1,N_states
alpha_h_psi = mat(istate, p1, p2)
e_pert = coef(istate) * alpha_h_psi
do jstate=1,N_states
pt2_data % overlap(jstate,jstate) = coef(istate) * coef(jstate)
enddo
if (weight_selection /= 5) then
! Energy selection
sum_e_pert = sum_e_pert + e_pert * selection_weight(istate)
else
! Variance selection
sum_e_pert = sum_e_pert - alpha_h_psi * alpha_h_psi * selection_weight(istate)
endif
end do
call give_2rdm_pert_contrib(det,coef,psi_det_connection,psi_coef_connection_reverse,n_det_connection,nkeys,keys,values,sze_buff)
if(sum_e_pert <= buf%mini) then
call add_to_selection_buffer(buf, det, sum_e_pert)
end if
end do
end do
call update_keys_values(keys,values,nkeys,n_orb_pert_rdm,pert_2rdm_provider,pert_2rdm_lock)
end

View File

@ -117,7 +117,6 @@ subroutine ZMQ_pt2(E, pt2_data, pt2_data_err, relative_error, N_in)
integer(ZMQ_PTR) :: zmq_to_qp_run_socket, zmq_socket_pull integer(ZMQ_PTR) :: zmq_to_qp_run_socket, zmq_socket_pull
integer, intent(in) :: N_in integer, intent(in) :: N_in
! integer, intent(inout) :: N_in
double precision, intent(in) :: relative_error, E(N_states) double precision, intent(in) :: relative_error, E(N_states)
type(pt2_type), intent(inout) :: pt2_data, pt2_data_err type(pt2_type), intent(inout) :: pt2_data, pt2_data_err
! !
@ -132,8 +131,8 @@ subroutine ZMQ_pt2(E, pt2_data, pt2_data_err, relative_error, N_in)
PROVIDE psi_bilinear_matrix_transp_rows_loc psi_bilinear_matrix_transp_columns PROVIDE psi_bilinear_matrix_transp_rows_loc psi_bilinear_matrix_transp_columns
PROVIDE psi_bilinear_matrix_transp_order psi_selectors_coef_transp psi_det_sorted PROVIDE psi_bilinear_matrix_transp_order psi_selectors_coef_transp psi_det_sorted
PROVIDE psi_det_hii selection_weight pseudo_sym PROVIDE psi_det_hii selection_weight pseudo_sym
PROVIDE n_act_orb n_inact_orb n_core_orb n_virt_orb n_del_orb seniority_max PROVIDE list_act list_inact list_core list_virt list_del seniority_max
PROVIDE pert_2rdm excitation_beta_max excitation_alpha_max excitation_max PROVIDE excitation_beta_max excitation_alpha_max excitation_max
if (h0_type == 'CFG') then if (h0_type == 'CFG') then
PROVIDE psi_configuration_hii det_to_configuration PROVIDE psi_configuration_hii det_to_configuration
@ -288,11 +287,11 @@ subroutine ZMQ_pt2(E, pt2_data, pt2_data_err, relative_error, N_in)
call write_int(6,nproc_target,'Number of threads for PT2') call write_int(6,nproc_target,'Number of threads for PT2')
call write_double(6,mem,'Memory (Gb)') call write_double(6,mem,'Memory (Gb)')
call omp_set_max_active_levels(1) call set_multiple_levels_omp(.False.)
print '(A)', '========== ======================= ===================== ===================== ===========' print '(A)', '========== ======================= ===================== ===================== ==========='
print '(A)', ' Samples Energy Variance Norm^2 Seconds' print '(A)', ' Samples Energy Variance Norm^2 Seconds'
print '(A)', '========== ======================= ===================== ===================== ===========' print '(A)', '========== ======================= ===================== ===================== ==========='
PROVIDE global_selection_buffer PROVIDE global_selection_buffer
@ -315,14 +314,14 @@ subroutine ZMQ_pt2(E, pt2_data, pt2_data_err, relative_error, N_in)
endif endif
!$OMP END PARALLEL !$OMP END PARALLEL
call end_parallel_job(zmq_to_qp_run_socket, zmq_socket_pull, 'pt2') call end_parallel_job(zmq_to_qp_run_socket, zmq_socket_pull, 'pt2')
call omp_set_max_active_levels(8) call set_multiple_levels_omp(.True.)
print '(A)', '========== ======================= ===================== ===================== ===========' print '(A)', '========== ======================= ===================== ===================== ==========='
do k=1,N_states do k=1,N_states
pt2_overlap(pt2_stoch_istate,k) = pt2_data % overlap(k,pt2_stoch_istate) pt2_overlap(pt2_stoch_istate,k) = pt2_data % overlap(k,pt2_stoch_istate)
enddo enddo
SOFT_TOUCH pt2_overlap SOFT_TOUCH pt2_overlap
enddo enddo
FREE pt2_stoch_istate FREE pt2_stoch_istate
@ -524,21 +523,21 @@ subroutine pt2_collector(zmq_socket_pull, E, relative_error, pt2_data, pt2_data_
! 1/(N-1.5) : see Brugger, The American Statistician (23) 4 p. 32 (1969) ! 1/(N-1.5) : see Brugger, The American Statistician (23) 4 p. 32 (1969)
if(c > 2) then if(c > 2) then
eqt = dabs((pt2_data_S2(t) % pt2(pt2_stoch_istate) / c) - (pt2_data_S(t) % pt2(pt2_stoch_istate)/c)**2) ! dabs for numerical stability eqt = dabs((pt2_data_S2(t) % pt2(pt2_stoch_istate) / c) - (pt2_data_S(t) % pt2(pt2_stoch_istate)/c)**2) ! dabs for numerical stability
eqt = sqrt(eqt / (dble(c) - 1.5d0)) eqt = dsqrt(eqt / (dble(c) - 1.5d0))
pt2_data_err % pt2(pt2_stoch_istate) = eqt pt2_data_err % pt2(pt2_stoch_istate) = eqt
eqt = dabs((pt2_data_S2(t) % variance(pt2_stoch_istate) / c) - (pt2_data_S(t) % variance(pt2_stoch_istate)/c)**2) ! dabs for numerical stability eqt = dabs((pt2_data_S2(t) % variance(pt2_stoch_istate) / c) - (pt2_data_S(t) % variance(pt2_stoch_istate)/c)**2) ! dabs for numerical stability
eqt = sqrt(eqt / (dble(c) - 1.5d0)) eqt = dsqrt(eqt / (dble(c) - 1.5d0))
pt2_data_err % variance(pt2_stoch_istate) = eqt pt2_data_err % variance(pt2_stoch_istate) = eqt
eqta(:) = dabs((pt2_data_S2(t) % overlap(:,pt2_stoch_istate) / c) - (pt2_data_S(t) % overlap(:,pt2_stoch_istate)/c)**2) ! dabs for numerical stability eqta(:) = dabs((pt2_data_S2(t) % overlap(:,pt2_stoch_istate) / c) - (pt2_data_S(t) % overlap(:,pt2_stoch_istate)/c)**2) ! dabs for numerical stability
eqta(:) = sqrt(eqta(:) / (dble(c) - 1.5d0)) eqta(:) = dsqrt(eqta(:) / (dble(c) - 1.5d0))
pt2_data_err % overlap(:,pt2_stoch_istate) = eqta(:) pt2_data_err % overlap(:,pt2_stoch_istate) = eqta(:)
if ((time - time1 > 1.d0) .or. (n==N_det_generators)) then if ((time - time1 > 1.d0) .or. (n==N_det_generators)) then
time1 = time time1 = time
print '(I10, X, F12.6, X, G10.3, X, F10.6, X, G10.3, X, F10.6, X, G10.3, X, F10.4)', c, & print '(I10, X, F12.6, X, G10.3, X, F10.6, X, G10.3, X, F10.6, X, G10.3, X, F10.1)', c, &
pt2_data % pt2(pt2_stoch_istate) +E, & pt2_data % pt2(pt2_stoch_istate) +E, &
pt2_data_err % pt2(pt2_stoch_istate), & pt2_data_err % pt2(pt2_stoch_istate), &
pt2_data % variance(pt2_stoch_istate), & pt2_data % variance(pt2_stoch_istate), &
@ -576,11 +575,11 @@ subroutine pt2_collector(zmq_socket_pull, E, relative_error, pt2_data, pt2_data_
endif endif
do i=1,n_tasks do i=1,n_tasks
if(index(i).gt.size(pt2_data_I,1).or.index(i).lt.1)then if(index(i).gt.size(pt2_data_I,1).or.index(i).lt.1)then
print*,'PB !!!' print*,'PB !!!'
print*,'If you see this, send a bug report with the following content' print*,'If you see this, send a bug report with the following content'
print*,irp_here print*,irp_here
print*,'i,index(i),size(pt2_data_I,1) = ',i,index(i),size(pt2_data_I,1) print*,'i,index(i),size(pt2_data_I,1) = ',i,index(i),size(pt2_data_I,1)
stop -1 stop -1
endif endif
call pt2_add(pt2_data_I(index(i)),1.d0,pt2_data_task(i)) call pt2_add(pt2_data_I(index(i)),1.d0,pt2_data_task(i))
f(index(i)) -= 1 f(index(i)) -= 1
@ -843,9 +842,8 @@ END_PROVIDER
do t=1, pt2_N_teeth do t=1, pt2_N_teeth
tooth_width = tilde_cW(pt2_n_0(t+1)) - tilde_cW(pt2_n_0(t)) tooth_width = tilde_cW(pt2_n_0(t+1)) - tilde_cW(pt2_n_0(t))
if (tooth_width == 0.d0) then if (tooth_width == 0.d0) then
tooth_width = sum(tilde_w(pt2_n_0(t):pt2_n_0(t+1))) tooth_width = max(1.d-15,sum(tilde_w(pt2_n_0(t):pt2_n_0(t+1))))
endif endif
ASSERT(tooth_width > 0.d0)
do i=pt2_n_0(t)+1, pt2_n_0(t+1) do i=pt2_n_0(t)+1, pt2_n_0(t+1)
pt2_w(i) = tilde_w(i) * pt2_W_T / tooth_width pt2_w(i) = tilde_w(i) * pt2_W_T / tooth_width
end do end do

View File

@ -31,12 +31,11 @@ subroutine run_pt2_slave(thread,iproc,energy)
double precision, intent(in) :: energy(N_states_diag) double precision, intent(in) :: energy(N_states_diag)
integer, intent(in) :: thread, iproc integer, intent(in) :: thread, iproc
call run_pt2_slave_large(thread,iproc,energy) if (N_det > 100000 ) then
! if (N_det > nproc*(elec_alpha_num * (mo_num-elec_alpha_num))**2) then call run_pt2_slave_large(thread,iproc,energy)
! call run_pt2_slave_large(thread,iproc,energy) else
! else call run_pt2_slave_small(thread,iproc,energy)
! call run_pt2_slave_small(thread,iproc,energy) endif
! endif
end end
subroutine run_pt2_slave_small(thread,iproc,energy) subroutine run_pt2_slave_small(thread,iproc,energy)
@ -67,7 +66,6 @@ subroutine run_pt2_slave_small(thread,iproc,energy)
double precision, external :: memory_of_double, memory_of_int double precision, external :: memory_of_double, memory_of_int
integer :: bsize ! Size of selection buffers integer :: bsize ! Size of selection buffers
! logical :: sending
allocate(task_id(pt2_n_tasks_max), task(pt2_n_tasks_max)) allocate(task_id(pt2_n_tasks_max), task(pt2_n_tasks_max))
allocate(pt2_data(pt2_n_tasks_max), i_generator(pt2_n_tasks_max), subset(pt2_n_tasks_max)) allocate(pt2_data(pt2_n_tasks_max), i_generator(pt2_n_tasks_max), subset(pt2_n_tasks_max))
@ -85,7 +83,6 @@ subroutine run_pt2_slave_small(thread,iproc,energy)
buffer_ready = .False. buffer_ready = .False.
n_tasks = 1 n_tasks = 1
! sending = .False.
done = .False. done = .False.
do while (.not.done) do while (.not.done)
@ -119,14 +116,13 @@ subroutine run_pt2_slave_small(thread,iproc,energy)
do k=1,n_tasks do k=1,n_tasks
call pt2_alloc(pt2_data(k),N_states) call pt2_alloc(pt2_data(k),N_states)
b%cur = 0 b%cur = 0
!double precision :: time2 ! double precision :: time2
!call wall_time(time2) ! call wall_time(time2)
call select_connected(i_generator(k),energy,pt2_data(k),b,subset(k),pt2_F(i_generator(k))) call select_connected(i_generator(k),energy,pt2_data(k),b,subset(k),pt2_F(i_generator(k)))
!call wall_time(time1) ! call wall_time(time1)
!print *, i_generator(1), time1-time2, n_tasks, pt2_F(i_generator(1)) ! print *, i_generator(1), time1-time2, n_tasks, pt2_F(i_generator(1))
enddo enddo
call wall_time(time1) call wall_time(time1)
!print *, '-->', i_generator(1), time1-time0, n_tasks
integer, external :: tasks_done_to_taskserver integer, external :: tasks_done_to_taskserver
if (tasks_done_to_taskserver(zmq_to_qp_run_socket,worker_id,task_id,n_tasks) == -1) then if (tasks_done_to_taskserver(zmq_to_qp_run_socket,worker_id,task_id,n_tasks) == -1) then
@ -164,6 +160,11 @@ end subroutine
subroutine run_pt2_slave_large(thread,iproc,energy) subroutine run_pt2_slave_large(thread,iproc,energy)
use selection_types use selection_types
use f77_zmq use f77_zmq
BEGIN_DOC
! This subroutine can miss important determinants when the PT2 is completely
! computed. It should be called only for large workloads where the PT2 is
! interrupted before the end
END_DOC
implicit none implicit none
double precision, intent(in) :: energy(N_states_diag) double precision, intent(in) :: energy(N_states_diag)
@ -189,8 +190,12 @@ subroutine run_pt2_slave_large(thread,iproc,energy)
integer :: bsize ! Size of selection buffers integer :: bsize ! Size of selection buffers
logical :: sending logical :: sending
double precision :: time_shift
PROVIDE global_selection_buffer global_selection_buffer_lock PROVIDE global_selection_buffer global_selection_buffer_lock
call random_number(time_shift)
time_shift = time_shift*15.d0
zmq_to_qp_run_socket = new_zmq_to_qp_run_socket() zmq_to_qp_run_socket = new_zmq_to_qp_run_socket()
@ -208,6 +213,9 @@ subroutine run_pt2_slave_large(thread,iproc,energy)
sending = .False. sending = .False.
done = .False. done = .False.
double precision :: time0, time1
call wall_time(time0)
time0 = time0+time_shift
do while (.not.done) do while (.not.done)
integer, external :: get_tasks_from_taskserver integer, external :: get_tasks_from_taskserver
@ -234,25 +242,28 @@ subroutine run_pt2_slave_large(thread,iproc,energy)
ASSERT (b%N == bsize) ASSERT (b%N == bsize)
endif endif
double precision :: time0, time1
call wall_time(time0)
call pt2_alloc(pt2_data,N_states) call pt2_alloc(pt2_data,N_states)
b%cur = 0 b%cur = 0
call select_connected(i_generator,energy,pt2_data,b,subset,pt2_F(i_generator)) call select_connected(i_generator,energy,pt2_data,b,subset,pt2_F(i_generator))
call wall_time(time1)
integer, external :: tasks_done_to_taskserver integer, external :: tasks_done_to_taskserver
if (tasks_done_to_taskserver(zmq_to_qp_run_socket,worker_id,task_id,n_tasks) == -1) then if (tasks_done_to_taskserver(zmq_to_qp_run_socket,worker_id,task_id,n_tasks) == -1) then
done = .true. done = .true.
endif endif
call sort_selection_buffer(b) call sort_selection_buffer(b)
call wall_time(time1)
! if (time1-time0 > 15.d0) then
call omp_set_lock(global_selection_buffer_lock)
global_selection_buffer%mini = b%mini
call merge_selection_buffers(b,global_selection_buffer)
b%cur=0
call omp_unset_lock(global_selection_buffer_lock)
call wall_time(time0)
! endif
call push_pt2_results_async_recv(zmq_socket_push,b%mini,sending) call push_pt2_results_async_recv(zmq_socket_push,b%mini,sending)
call omp_set_lock(global_selection_buffer_lock) if ( iproc == 1 .or. i_generator < 100 .or. done) then
global_selection_buffer%mini = b%mini
call merge_selection_buffers(b,global_selection_buffer)
b%cur=0
call omp_unset_lock(global_selection_buffer_lock)
if ( iproc == 1 ) then
call omp_set_lock(global_selection_buffer_lock) call omp_set_lock(global_selection_buffer_lock)
call push_pt2_results_async_send(zmq_socket_push, (/i_generator/), (/pt2_data/), global_selection_buffer, (/task_id/), 1,sending) call push_pt2_results_async_send(zmq_socket_push, (/i_generator/), (/pt2_data/), global_selection_buffer, (/task_id/), 1,sending)
global_selection_buffer%cur = 0 global_selection_buffer%cur = 0

View File

@ -195,7 +195,10 @@ subroutine select_singles_and_doubles(i_generator,hole_mask,particle_mask,fock_d
integer :: l_a, nmax, idx integer :: l_a, nmax, idx
integer, allocatable :: indices(:), exc_degree(:), iorder(:) integer, allocatable :: indices(:), exc_degree(:), iorder(:)
double precision, parameter :: norm_thr = 1.d-16
! Removed to avoid introducing determinants already presents in the wf
!double precision, parameter :: norm_thr = 1.d-16
allocate (indices(N_det), & allocate (indices(N_det), &
exc_degree(max(N_det_alpha_unique,N_det_beta_unique))) exc_degree(max(N_det_alpha_unique,N_det_beta_unique)))
@ -215,10 +218,11 @@ subroutine select_singles_and_doubles(i_generator,hole_mask,particle_mask,fock_d
i = psi_bilinear_matrix_rows(l_a) i = psi_bilinear_matrix_rows(l_a)
if (nt + exc_degree(i) <= 4) then if (nt + exc_degree(i) <= 4) then
idx = psi_det_sorted_order(psi_bilinear_matrix_order(l_a)) idx = psi_det_sorted_order(psi_bilinear_matrix_order(l_a))
if (psi_average_norm_contrib_sorted(idx) > norm_thr) then ! Removed to avoid introducing determinants already presents in the wf
!if (psi_average_norm_contrib_sorted(idx) > norm_thr) then
indices(k) = idx indices(k) = idx
k=k+1 k=k+1
endif !endif
endif endif
enddo enddo
enddo enddo
@ -242,10 +246,11 @@ subroutine select_singles_and_doubles(i_generator,hole_mask,particle_mask,fock_d
idx = psi_det_sorted_order( & idx = psi_det_sorted_order( &
psi_bilinear_matrix_order( & psi_bilinear_matrix_order( &
psi_bilinear_matrix_transp_order(l_a))) psi_bilinear_matrix_transp_order(l_a)))
if (psi_average_norm_contrib_sorted(idx) > norm_thr) then ! Removed to avoid introducing determinants already presents in the wf
!if (psi_average_norm_contrib_sorted(idx) > norm_thr) then
indices(k) = idx indices(k) = idx
k=k+1 k=k+1
endif !endif
endif endif
enddo enddo
enddo enddo
@ -464,27 +469,21 @@ subroutine select_singles_and_doubles(i_generator,hole_mask,particle_mask,fock_d
allocate (fullminilist (N_int, 2, fullinteresting(0)), & allocate (fullminilist (N_int, 2, fullinteresting(0)), &
minilist (N_int, 2, interesting(0)) ) minilist (N_int, 2, interesting(0)) )
if(pert_2rdm)then ! if(pert_2rdm)then
allocate(coef_fullminilist_rev(N_states,fullinteresting(0))) ! allocate(coef_fullminilist_rev(N_states,fullinteresting(0)))
do i=1,fullinteresting(0) ! do i=1,fullinteresting(0)
do j = 1, N_states ! do j = 1, N_states
coef_fullminilist_rev(j,i) = psi_coef_sorted(fullinteresting(i),j) ! coef_fullminilist_rev(j,i) = psi_coef_sorted(fullinteresting(i),j)
enddo ! enddo
enddo ! enddo
endif ! endif
do i=1,fullinteresting(0) do i=1,fullinteresting(0)
do k=1,N_int fullminilist(:,:,i) = psi_det_sorted(:,:,fullinteresting(i))
fullminilist(k,1,i) = psi_det_sorted(k,1,fullinteresting(i))
fullminilist(k,2,i) = psi_det_sorted(k,2,fullinteresting(i))
enddo
enddo enddo
do i=1,interesting(0) do i=1,interesting(0)
do k=1,N_int minilist(:,:,i) = psi_det_sorted(:,:,interesting(i))
minilist(k,1,i) = psi_det_sorted(k,1,interesting(i))
minilist(k,2,i) = psi_det_sorted(k,2,interesting(i))
enddo
enddo enddo
do s2=s1,2 do s2=s1,2
@ -531,19 +530,19 @@ subroutine select_singles_and_doubles(i_generator,hole_mask,particle_mask,fock_d
call splash_pq(mask, sp, minilist, i_generator, interesting(0), bannedOrb, banned, mat, interesting) call splash_pq(mask, sp, minilist, i_generator, interesting(0), bannedOrb, banned, mat, interesting)
if(.not.pert_2rdm)then ! if(.not.pert_2rdm)then
call fill_buffer_double(i_generator, sp, h1, h2, bannedOrb, banned, fock_diag_tmp, E0, pt2_data, mat, buf) call fill_buffer_double(i_generator, sp, h1, h2, bannedOrb, banned, fock_diag_tmp, E0, pt2_data, mat, buf)
else ! else
call fill_buffer_double_rdm(i_generator, sp, h1, h2, bannedOrb, banned, fock_diag_tmp, E0, pt2_data, mat, buf,fullminilist, coef_fullminilist_rev, fullinteresting(0)) ! call fill_buffer_double_rdm(i_generator, sp, h1, h2, bannedOrb, banned, fock_diag_tmp, E0, pt2_data, mat, buf,fullminilist, coef_fullminilist_rev, fullinteresting(0))
endif ! endif
end if end if
enddo enddo
if(s1 /= s2) monoBdo = .false. if(s1 /= s2) monoBdo = .false.
enddo enddo
deallocate(fullminilist,minilist) deallocate(fullminilist,minilist)
if(pert_2rdm)then ! if(pert_2rdm)then
deallocate(coef_fullminilist_rev) ! deallocate(coef_fullminilist_rev)
endif ! endif
enddo enddo
enddo enddo
deallocate(preinteresting, prefullinteresting, interesting, fullinteresting) deallocate(preinteresting, prefullinteresting, interesting, fullinteresting)
@ -572,6 +571,7 @@ subroutine fill_buffer_double(i_generator, sp, h1, h2, bannedOrb, banned, fock_d
double precision, external :: diag_H_mat_elem_fock double precision, external :: diag_H_mat_elem_fock
double precision :: E_shift double precision :: E_shift
double precision :: s_weight(N_states,N_states) double precision :: s_weight(N_states,N_states)
logical, external :: is_in_wavefunction
PROVIDE dominant_dets_of_cfgs N_dominant_dets_of_cfgs PROVIDE dominant_dets_of_cfgs N_dominant_dets_of_cfgs
do jstate=1,N_states do jstate=1,N_states
do istate=1,N_states do istate=1,N_states
@ -713,6 +713,25 @@ subroutine fill_buffer_double(i_generator, sp, h1, h2, bannedOrb, banned, fock_d
if (do_cycle) cycle if (do_cycle) cycle
endif endif
if (twice_hierarchy_max >= 0) then
s = 0
do k=1,N_int
s = s + popcnt(ieor(det(k,1),det(k,2)))
enddo
if ( mod(s,2)>0 ) stop 'For now, hierarchy CI is defined only for an even number of electrons'
if (excitation_ref == 1) then
call get_excitation_degree(HF_bitmask,det(1,1),degree,N_int)
else if (excitation_ref == 2) then
stop 'For now, hierarchy CI is defined only for a single reference determinant'
! do k=1,N_dominant_dets_of_cfgs
! call get_excitation_degree(dominant_dets_of_cfgs(1,1,k),det(1,1),degree,N_int)
! enddo
endif
integer :: twice_hierarchy
twice_hierarchy = degree + s/2
if (twice_hierarchy > twice_hierarchy_max) cycle
endif
Hii = diag_H_mat_elem_fock(psi_det_generators(1,1,i_generator),det,fock_diag_tmp,N_int) Hii = diag_H_mat_elem_fock(psi_det_generators(1,1,i_generator),det,fock_diag_tmp,N_int)
w = 0d0 w = 0d0
@ -834,8 +853,27 @@ subroutine fill_buffer_double(i_generator, sp, h1, h2, bannedOrb, banned, fock_d
endif endif
end select end select
! To force the inclusion of determinants with a positive pt2 contribution
if (e_pert(istate) > 1d-8) then
w = -huge(1.0)
endif
end do end do
!!!BEGIN_DEBUG
! ! To check if the pt2 is taking determinants already in the wf
! if (is_in_wavefunction(det(N_int,1),N_int)) then
! print*, 'A determinant contributing to the pt2 is already in'
! print*, 'the wave function:'
! call print_det(det(N_int,1),N_int)
! print*,'contribution to the pt2 for the states:', e_pert(:)
! print*,'error in the filtering in'
! print*, 'cipsi/selection.irp.f sub: selecte_singles_and_doubles'
! print*, 'abort'
! call abort
! endif
!!!END_DEBUG
integer(bit_kind) :: occ(N_int,2), n integer(bit_kind) :: occ(N_int,2), n
if (h0_type == 'CFG') then if (h0_type == 'CFG') then
@ -1556,7 +1594,7 @@ subroutine bitstring_to_list_in_selection( string, list, n_elements, Nint)
use bitmasks use bitmasks
implicit none implicit none
BEGIN_DOC BEGIN_DOC
! Gives the inidices(+1) of the bits set to 1 in the bit string ! Gives the indices(+1) of the bits set to 1 in the bit string
END_DOC END_DOC
integer, intent(in) :: Nint integer, intent(in) :: Nint
integer(bit_kind), intent(in) :: string(Nint) integer(bit_kind), intent(in) :: string(Nint)

View File

@ -60,6 +60,7 @@ subroutine add_to_selection_buffer(b, det, val)
b%val(b%cur) = val b%val(b%cur) = val
if(b%cur == size(b%val)) then if(b%cur == size(b%val)) then
call sort_selection_buffer(b) call sort_selection_buffer(b)
b%cur = b%cur-1
end if end if
end if end if
end subroutine end subroutine
@ -86,43 +87,56 @@ subroutine merge_selection_buffers(b1, b2)
double precision :: rss double precision :: rss
double precision, external :: memory_of_double double precision, external :: memory_of_double
sze = max(size(b1%val), size(b2%val)) sze = max(size(b1%val), size(b2%val))
rss = memory_of_double(sze) + 2*N_int*memory_of_double(sze) ! rss = memory_of_double(sze) + 2*N_int*memory_of_double(sze)
call check_mem(rss,irp_here) ! call check_mem(rss,irp_here)
allocate(val(sze), detmp(N_int, 2, sze)) allocate(val(sze), detmp(N_int, 2, sze))
i1=1 i1=1
i2=1 i2=1
do i=1,nmwen
if ( (i1 > b1%cur).and.(i2 > b2%cur) ) then select case (N_int)
exit BEGIN_TEMPLATE
else if (i1 > b1%cur) then case $case
val(i) = b2%val(i2) do i=1,nmwen
detmp(1:N_int,1,i) = b2%det(1:N_int,1,i2) if ( (i1 > b1%cur).and.(i2 > b2%cur) ) then
detmp(1:N_int,2,i) = b2%det(1:N_int,2,i2) exit
i2=i2+1 else if (i1 > b1%cur) then
else if (i2 > b2%cur) then val(i) = b2%val(i2)
val(i) = b1%val(i1) detmp(1:$N_int,1,i) = b2%det(1:$N_int,1,i2)
detmp(1:N_int,1,i) = b1%det(1:N_int,1,i1) detmp(1:$N_int,2,i) = b2%det(1:$N_int,2,i2)
detmp(1:N_int,2,i) = b1%det(1:N_int,2,i1) i2=i2+1
i1=i1+1 else if (i2 > b2%cur) then
else val(i) = b1%val(i1)
if (b1%val(i1) <= b2%val(i2)) then detmp(1:$N_int,1,i) = b1%det(1:$N_int,1,i1)
val(i) = b1%val(i1) detmp(1:$N_int,2,i) = b1%det(1:$N_int,2,i1)
detmp(1:N_int,1,i) = b1%det(1:N_int,1,i1) i1=i1+1
detmp(1:N_int,2,i) = b1%det(1:N_int,2,i1)
i1=i1+1
else else
val(i) = b2%val(i2) if (b1%val(i1) <= b2%val(i2)) then
detmp(1:N_int,1,i) = b2%det(1:N_int,1,i2) val(i) = b1%val(i1)
detmp(1:N_int,2,i) = b2%det(1:N_int,2,i2) detmp(1:$N_int,1,i) = b1%det(1:$N_int,1,i1)
i2=i2+1 detmp(1:$N_int,2,i) = b1%det(1:$N_int,2,i1)
i1=i1+1
else
val(i) = b2%val(i2)
detmp(1:$N_int,1,i) = b2%det(1:$N_int,1,i2)
detmp(1:$N_int,2,i) = b2%det(1:$N_int,2,i2)
i2=i2+1
endif
endif endif
endif enddo
enddo do i=nmwen+1,b2%N
val(i) = 0.d0
! detmp(1:$N_int,1,i) = 0_bit_kind
! detmp(1:$N_int,2,i) = 0_bit_kind
enddo
SUBST [ case, N_int ]
(1); 1;;
(2); 2;;
(3); 3;;
(4); 4;;
default; N_int;;
END_TEMPLATE
end select
deallocate(b2%det, b2%val) deallocate(b2%det, b2%val)
do i=nmwen+1,b2%N
val(i) = 0.d0
detmp(1:N_int,1:2,i) = 0_bit_kind
enddo
b2%det => detmp b2%det => detmp
b2%val => val b2%val => val
b2%mini = min(b2%mini,b2%val(b2%N)) b2%mini = min(b2%mini,b2%val(b2%N))
@ -144,8 +158,8 @@ subroutine sort_selection_buffer(b)
double precision :: rss double precision :: rss
double precision, external :: memory_of_double, memory_of_int double precision, external :: memory_of_double, memory_of_int
rss = memory_of_int(b%cur) + 2*N_int*memory_of_double(size(b%det,3)) ! rss = memory_of_int(b%cur) + 2*N_int*memory_of_double(size(b%det,3))
call check_mem(rss,irp_here) ! call check_mem(rss,irp_here)
allocate(iorder(b%cur), detmp(N_int, 2, size(b%det,3))) allocate(iorder(b%cur), detmp(N_int, 2, size(b%det,3)))
do i=1,b%cur do i=1,b%cur
iorder(i) = i iorder(i) = i
@ -225,14 +239,14 @@ subroutine make_selection_buffer_s2(b)
endif endif
dup = .True. dup = .True.
do k=1,N_int do k=1,N_int
if ( (tmp_array(k,1,i) /= tmp_array(k,1,j)) & if ( (tmp_array(k,1,i) /= tmp_array(k,1,j)) .or. &
.or. (tmp_array(k,2,i) /= tmp_array(k,2,j)) ) then (tmp_array(k,2,i) /= tmp_array(k,2,j)) ) then
dup = .False. dup = .False.
exit exit
endif endif
enddo enddo
if (dup) then if (dup) then
val(i) = max(val(i), val(j)) val(i) = min(val(i), val(j))
duplicate(j) = .True. duplicate(j) = .True.
endif endif
j+=1 j+=1
@ -282,9 +296,6 @@ subroutine make_selection_buffer_s2(b)
call configuration_to_dets_size(o(1,1,i),sze,elec_alpha_num,N_int) call configuration_to_dets_size(o(1,1,i),sze,elec_alpha_num,N_int)
n_d = n_d + sze n_d = n_d + sze
if (n_d > b%cur) then if (n_d > b%cur) then
! if (n_d - b%cur > b%cur - n_d + sze) then
! n_d = n_d - sze
! endif
exit exit
endif endif
enddo enddo
@ -329,10 +340,11 @@ subroutine remove_duplicates_in_selection_buffer(b)
integer(bit_kind), allocatable :: tmp_array(:,:,:) integer(bit_kind), allocatable :: tmp_array(:,:,:)
logical, allocatable :: duplicate(:) logical, allocatable :: duplicate(:)
n_d = b%cur
logical :: found_duplicates logical :: found_duplicates
double precision :: rss double precision :: rss
double precision, external :: memory_of_double double precision, external :: memory_of_double
n_d = b%cur
rss = (4*N_int+4)*memory_of_double(n_d) rss = (4*N_int+4)*memory_of_double(n_d)
call check_mem(rss,irp_here) call check_mem(rss,irp_here)

View File

@ -38,11 +38,11 @@ subroutine update_pt2_and_variance_weights(pt2_data, N_st)
avg = sum(pt2(1:N_st)) / dble(N_st) + 1.d-32 ! Avoid future division by zero avg = sum(pt2(1:N_st)) / dble(N_st) + 1.d-32 ! Avoid future division by zero
dt = 8.d0 !* selection_factor dt = 4.d0 !* selection_factor
do k=1,N_st do k=1,N_st
element = exp(dt*(pt2(k)/avg - 1.d0)) element = pt2(k) !exp(dt*(pt2(k)/avg - 1.d0))
element = min(2.0d0 , element) ! element = min(2.0d0 , element)
element = max(0.5d0 , element) ! element = max(0.5d0 , element)
pt2_match_weight(k) *= element pt2_match_weight(k) *= element
enddo enddo
@ -50,9 +50,9 @@ subroutine update_pt2_and_variance_weights(pt2_data, N_st)
avg = sum(variance(1:N_st)) / dble(N_st) + 1.d-32 ! Avoid future division by zero avg = sum(variance(1:N_st)) / dble(N_st) + 1.d-32 ! Avoid future division by zero
do k=1,N_st do k=1,N_st
element = exp(dt*(variance(k)/avg -1.d0)) element = variance(k) ! exp(dt*(variance(k)/avg -1.d0))
element = min(2.0d0 , element) ! element = min(2.0d0 , element)
element = max(0.5d0 , element) ! element = max(0.5d0 , element)
variance_match_weight(k) *= element variance_match_weight(k) *= element
enddo enddo
@ -62,6 +62,9 @@ subroutine update_pt2_and_variance_weights(pt2_data, N_st)
variance_match_weight(:) = 1.d0 variance_match_weight(:) = 1.d0
endif endif
pt2_match_weight(:) = pt2_match_weight(:)/sum(pt2_match_weight(:))
variance_match_weight(:) = variance_match_weight(:)/sum(variance_match_weight(:))
threshold_davidson_pt2 = min(1.d-6, & threshold_davidson_pt2 = min(1.d-6, &
max(threshold_davidson, 1.e-1 * PT2_relative_error * minval(abs(pt2(1:N_states)))) ) max(threshold_davidson, 1.e-1 * PT2_relative_error * minval(abs(pt2(1:N_states)))) )
@ -87,7 +90,7 @@ BEGIN_PROVIDER [ double precision, selection_weight, (N_states) ]
selection_weight(1:N_states) = c0_weight(1:N_states) selection_weight(1:N_states) = c0_weight(1:N_states)
case (2) case (2)
print *, 'Using pt2-matching weight in selection' print *, 'Using PT2-matching weight in selection'
selection_weight(1:N_states) = c0_weight(1:N_states) * pt2_match_weight(1:N_states) selection_weight(1:N_states) = c0_weight(1:N_states) * pt2_match_weight(1:N_states)
print *, '# PT2 weight ', real(pt2_match_weight(:),4) print *, '# PT2 weight ', real(pt2_match_weight(:),4)
@ -97,7 +100,7 @@ BEGIN_PROVIDER [ double precision, selection_weight, (N_states) ]
print *, '# var weight ', real(variance_match_weight(:),4) print *, '# var weight ', real(variance_match_weight(:),4)
case (4) case (4)
print *, 'Using variance- and pt2-matching weights in selection' print *, 'Using variance- and PT2-matching weights in selection'
selection_weight(1:N_states) = c0_weight(1:N_states) * sqrt(variance_match_weight(1:N_states) * pt2_match_weight(1:N_states)) selection_weight(1:N_states) = c0_weight(1:N_states) * sqrt(variance_match_weight(1:N_states) * pt2_match_weight(1:N_states))
print *, '# PT2 weight ', real(pt2_match_weight(:),4) print *, '# PT2 weight ', real(pt2_match_weight(:),4)
print *, '# var weight ', real(variance_match_weight(:),4) print *, '# var weight ', real(variance_match_weight(:),4)
@ -112,7 +115,7 @@ BEGIN_PROVIDER [ double precision, selection_weight, (N_states) ]
selection_weight(1:N_states) = c0_weight(1:N_states) selection_weight(1:N_states) = c0_weight(1:N_states)
case (7) case (7)
print *, 'Input weights multiplied by variance- and pt2-matching' print *, 'Input weights multiplied by variance- and PT2-matching'
selection_weight(1:N_states) = c0_weight(1:N_states) * sqrt(variance_match_weight(1:N_states) * pt2_match_weight(1:N_states)) * state_average_weight(1:N_states) selection_weight(1:N_states) = c0_weight(1:N_states) * sqrt(variance_match_weight(1:N_states) * pt2_match_weight(1:N_states)) * state_average_weight(1:N_states)
print *, '# PT2 weight ', real(pt2_match_weight(:),4) print *, '# PT2 weight ', real(pt2_match_weight(:),4)
print *, '# var weight ', real(variance_match_weight(:),4) print *, '# var weight ', real(variance_match_weight(:),4)
@ -128,6 +131,7 @@ BEGIN_PROVIDER [ double precision, selection_weight, (N_states) ]
print *, '# var weight ', real(variance_match_weight(:),4) print *, '# var weight ', real(variance_match_weight(:),4)
end select end select
selection_weight(:) = selection_weight(:)/sum(selection_weight(:))
print *, '# Total weight ', real(selection_weight(:),4) print *, '# Total weight ', real(selection_weight(:),4)
END_PROVIDER END_PROVIDER

View File

@ -4,7 +4,7 @@ subroutine run_slave_cipsi
! Helper program for distributed parallelism ! Helper program for distributed parallelism
END_DOC END_DOC
call omp_set_max_active_levels(1) call set_multiple_levels_omp(.False.)
distributed_davidson = .False. distributed_davidson = .False.
read_wf = .False. read_wf = .False.
SOFT_TOUCH read_wf distributed_davidson SOFT_TOUCH read_wf distributed_davidson
@ -171,9 +171,9 @@ subroutine run_slave_main
call write_double(6,(t1-t0),'Broadcast time') call write_double(6,(t1-t0),'Broadcast time')
!--- !---
call omp_set_max_active_levels(8) call set_multiple_levels_omp(.True.)
call davidson_slave_tcp(0) call davidson_slave_tcp(0)
call omp_set_max_active_levels(1) call set_multiple_levels_omp(.False.)
print *, mpi_rank, ': Davidson done' print *, mpi_rank, ': Davidson done'
!--- !---
@ -311,7 +311,7 @@ subroutine run_slave_main
if (mpi_master) then if (mpi_master) then
print *, 'Running PT2' print *, 'Running PT2'
endif endif
!$OMP PARALLEL PRIVATE(i) NUM_THREADS(nproc_target+1) !$OMP PARALLEL PRIVATE(i) NUM_THREADS(nproc_target)
i = omp_get_thread_num() i = omp_get_thread_num()
call run_pt2_slave(0,i,pt2_e0_denominator) call run_pt2_slave(0,i,pt2_e0_denominator)
!$OMP END PARALLEL !$OMP END PARALLEL

View File

@ -69,8 +69,8 @@ subroutine run_stochastic_cipsi
do while ( & do while ( &
(N_det < N_det_max) .and. & (N_det < N_det_max) .and. &
(maxval(abs(pt2_data % pt2(1:N_states))) > pt2_max) .and. & (sum(abs(pt2_data % pt2(1:N_states)) * state_average_weight(1:N_states)) > pt2_max) .and. &
(maxval(abs(pt2_data % variance(1:N_states))) > variance_max) .and. & (sum(abs(pt2_data % variance(1:N_states)) * state_average_weight(1:N_states)) > variance_max) .and. &
(correlation_energy_ratio <= correlation_energy_ratio_max) & (correlation_energy_ratio <= correlation_energy_ratio_max) &
) )
write(*,'(A)') '--------------------------------------------------------------------------------' write(*,'(A)') '--------------------------------------------------------------------------------'

View File

@ -1,223 +0,0 @@
use bitmasks
subroutine give_2rdm_pert_contrib(det,coef,psi_det_connection,psi_coef_connection_reverse,n_det_connection,nkeys,keys,values,sze_buff)
implicit none
integer, intent(in) :: n_det_connection,sze_buff
double precision, intent(in) :: coef(N_states)
integer(bit_kind), intent(in) :: det(N_int,2)
integer(bit_kind), intent(in) :: psi_det_connection(N_int,2,n_det_connection)
double precision, intent(in) :: psi_coef_connection_reverse(N_states,n_det_connection)
integer, intent(inout) :: keys(4,sze_buff),nkeys
double precision, intent(inout) :: values(sze_buff)
integer :: i,j
integer :: exc(0:2,2,2)
integer :: degree
double precision :: phase, contrib
do i = 1, n_det_connection
call get_excitation(det,psi_det_connection(1,1,i),exc,degree,phase,N_int)
if(degree.gt.2)cycle
contrib = 0.d0
do j = 1, N_states
contrib += state_average_weight(j) * psi_coef_connection_reverse(j,i) * phase * coef(j)
enddo
! case of single excitations
if(degree == 1)then
if (nkeys + 6 * elec_alpha_num .ge. sze_buff)then
call update_keys_values(keys,values,nkeys,n_orb_pert_rdm,pert_2rdm_provider,pert_2rdm_lock)
nkeys = 0
endif
call update_buffer_single_exc_rdm(det,psi_det_connection(1,1,i),exc,phase,contrib,nkeys,keys,values,sze_buff)
else
!! case of double excitations
! if (nkeys + 4 .ge. sze_buff)then
! call update_keys_values(keys,values,nkeys,n_orb_pert_rdm,pert_2rdm_provider,pert_2rdm_lock)
! nkeys = 0
! endif
! call update_buffer_double_exc_rdm(exc,phase,contrib,nkeys,keys,values,sze_buff)
endif
enddo
!call update_keys_values(keys,values,nkeys,n_orb_pert_rdm,pert_2rdm_provider,pert_2rdm_lock)
!nkeys = 0
end
subroutine update_buffer_single_exc_rdm(det1,det2,exc,phase,contrib,nkeys,keys,values,sze_buff)
implicit none
integer, intent(in) :: sze_buff
integer(bit_kind), intent(in) :: det1(N_int,2)
integer(bit_kind), intent(in) :: det2(N_int,2)
integer,intent(in) :: exc(0:2,2,2)
double precision,intent(in) :: phase, contrib
integer, intent(inout) :: nkeys, keys(4,sze_buff)
double precision, intent(inout):: values(sze_buff)
integer :: occ(N_int*bit_kind_size,2)
integer :: n_occ_ab(2),ispin,other_spin
integer :: h1,h2,p1,p2,i
call bitstring_to_list_ab(det1, occ, n_occ_ab, N_int)
if (exc(0,1,1) == 1) then
! Mono alpha
h1 = exc(1,1,1)
p1 = exc(1,2,1)
ispin = 1
other_spin = 2
else
! Mono beta
h1 = exc(1,1,2)
p1 = exc(1,2,2)
ispin = 2
other_spin = 1
endif
if(list_orb_reverse_pert_rdm(h1).lt.0)return
h1 = list_orb_reverse_pert_rdm(h1)
if(list_orb_reverse_pert_rdm(p1).lt.0)return
p1 = list_orb_reverse_pert_rdm(p1)
!update the alpha/beta part
do i = 1, n_occ_ab(other_spin)
h2 = occ(i,other_spin)
if(list_orb_reverse_pert_rdm(h2).lt.0)return
h2 = list_orb_reverse_pert_rdm(h2)
nkeys += 1
values(nkeys) = 0.5d0 * contrib * phase
keys(1,nkeys) = h1
keys(2,nkeys) = h2
keys(3,nkeys) = p1
keys(4,nkeys) = h2
nkeys += 1
values(nkeys) = 0.5d0 * contrib * phase
keys(1,nkeys) = h2
keys(2,nkeys) = h1
keys(3,nkeys) = h2
keys(4,nkeys) = p1
enddo
!update the same spin part
!do i = 1, n_occ_ab(ispin)
! h2 = occ(i,ispin)
! if(list_orb_reverse_pert_rdm(h2).lt.0)return
! h2 = list_orb_reverse_pert_rdm(h2)
! nkeys += 1
! values(nkeys) = 0.5d0 * contrib * phase
! keys(1,nkeys) = h1
! keys(2,nkeys) = h2
! keys(3,nkeys) = p1
! keys(4,nkeys) = h2
! nkeys += 1
! values(nkeys) = - 0.5d0 * contrib * phase
! keys(1,nkeys) = h1
! keys(2,nkeys) = h2
! keys(3,nkeys) = h2
! keys(4,nkeys) = p1
!
! nkeys += 1
! values(nkeys) = 0.5d0 * contrib * phase
! keys(1,nkeys) = h2
! keys(2,nkeys) = h1
! keys(3,nkeys) = h2
! keys(4,nkeys) = p1
! nkeys += 1
! values(nkeys) = - 0.5d0 * contrib * phase
! keys(1,nkeys) = h2
! keys(2,nkeys) = h1
! keys(3,nkeys) = p1
! keys(4,nkeys) = h2
!enddo
end
subroutine update_buffer_double_exc_rdm(exc,phase,contrib,nkeys,keys,values,sze_buff)
implicit none
integer, intent(in) :: sze_buff
integer,intent(in) :: exc(0:2,2,2)
double precision,intent(in) :: phase, contrib
integer, intent(inout) :: nkeys, keys(4,sze_buff)
double precision, intent(inout):: values(sze_buff)
integer :: h1,h2,p1,p2
if (exc(0,1,1) == 1) then
! Double alpha/beta
h1 = exc(1,1,1)
h2 = exc(1,1,2)
p1 = exc(1,2,1)
p2 = exc(1,2,2)
! check if the orbitals involved are within the orbital range
if(list_orb_reverse_pert_rdm(h1).lt.0)return
h1 = list_orb_reverse_pert_rdm(h1)
if(list_orb_reverse_pert_rdm(h2).lt.0)return
h2 = list_orb_reverse_pert_rdm(h2)
if(list_orb_reverse_pert_rdm(p1).lt.0)return
p1 = list_orb_reverse_pert_rdm(p1)
if(list_orb_reverse_pert_rdm(p2).lt.0)return
p2 = list_orb_reverse_pert_rdm(p2)
nkeys += 1
values(nkeys) = 0.5d0 * contrib * phase
keys(1,nkeys) = h1
keys(2,nkeys) = h2
keys(3,nkeys) = p1
keys(4,nkeys) = p2
nkeys += 1
values(nkeys) = 0.5d0 * contrib * phase
keys(1,nkeys) = p1
keys(2,nkeys) = p2
keys(3,nkeys) = h1
keys(4,nkeys) = h2
else
if (exc(0,1,1) == 2) then
! Double alpha/alpha
h1 = exc(1,1,1)
h2 = exc(2,1,1)
p1 = exc(1,2,1)
p2 = exc(2,2,1)
else if (exc(0,1,2) == 2) then
! Double beta
h1 = exc(1,1,2)
h2 = exc(2,1,2)
p1 = exc(1,2,2)
p2 = exc(2,2,2)
endif
! check if the orbitals involved are within the orbital range
if(list_orb_reverse_pert_rdm(h1).lt.0)return
h1 = list_orb_reverse_pert_rdm(h1)
if(list_orb_reverse_pert_rdm(h2).lt.0)return
h2 = list_orb_reverse_pert_rdm(h2)
if(list_orb_reverse_pert_rdm(p1).lt.0)return
p1 = list_orb_reverse_pert_rdm(p1)
if(list_orb_reverse_pert_rdm(p2).lt.0)return
p2 = list_orb_reverse_pert_rdm(p2)
nkeys += 1
values(nkeys) = 0.5d0 * contrib * phase
keys(1,nkeys) = h1
keys(2,nkeys) = h2
keys(3,nkeys) = p1
keys(4,nkeys) = p2
nkeys += 1
values(nkeys) = - 0.5d0 * contrib * phase
keys(1,nkeys) = h1
keys(2,nkeys) = h2
keys(3,nkeys) = p2
keys(4,nkeys) = p1
nkeys += 1
values(nkeys) = 0.5d0 * contrib * phase
keys(1,nkeys) = h2
keys(2,nkeys) = h1
keys(3,nkeys) = p2
keys(4,nkeys) = p1
nkeys += 1
values(nkeys) = - 0.5d0 * contrib * phase
keys(1,nkeys) = h2
keys(2,nkeys) = h1
keys(3,nkeys) = p1
keys(4,nkeys) = p2
endif
end

View File

@ -22,7 +22,7 @@ subroutine ZMQ_selection(N_in, pt2_data)
PROVIDE psi_bilinear_matrix_transp_rows_loc psi_bilinear_matrix_transp_columns PROVIDE psi_bilinear_matrix_transp_rows_loc psi_bilinear_matrix_transp_columns
PROVIDE psi_bilinear_matrix_transp_order selection_weight pseudo_sym PROVIDE psi_bilinear_matrix_transp_order selection_weight pseudo_sym
PROVIDE n_act_orb n_inact_orb n_core_orb n_virt_orb n_del_orb seniority_max PROVIDE n_act_orb n_inact_orb n_core_orb n_virt_orb n_del_orb seniority_max
PROVIDE pert_2rdm excitation_beta_max excitation_alpha_max excitation_max PROVIDE excitation_beta_max excitation_alpha_max excitation_max
call new_parallel_job(zmq_to_qp_run_socket,zmq_socket_pull,'selection') call new_parallel_job(zmq_to_qp_run_socket,zmq_socket_pull,'selection')

View File

@ -5,4 +5,3 @@ interface: ezfio
size: (determinants.n_states) size: (determinants.n_states)

View File

@ -62,6 +62,7 @@ subroutine run
else else
call H_apply_cis call H_apply_cis
endif endif
print*,''
print *, 'N_det = ', N_det print *, 'N_det = ', N_det
print*,'******************************' print*,'******************************'
print *, 'Energies of the states:' print *, 'Energies of the states:'
@ -69,16 +70,18 @@ subroutine run
print *, i, CI_energy(i) print *, i, CI_energy(i)
enddo enddo
if (N_states > 1) then if (N_states > 1) then
print*,'******************************' print*,''
print*,'Excitation energies ' print*,'******************************************************'
print*,'Excitation energies (au) (eV)'
do i = 2, N_states do i = 2, N_states
print*, i ,CI_energy(i) - CI_energy(1) print*, i ,CI_energy(i) - CI_energy(1), (CI_energy(i) - CI_energy(1))/0.0367502d0
enddo enddo
print*,''
endif endif
call ezfio_set_cis_energy(CI_energy) call ezfio_set_cis_energy(CI_energy)
psi_coef = ci_eigenvectors psi_coef = ci_eigenvectors
SOFT_TOUCH psi_coef SOFT_TOUCH psi_coef
call save_wavefunction_truncated(1.d-12) call save_wavefunction_truncated(save_threshold)
end end

View File

@ -63,7 +63,7 @@ subroutine run
endif endif
psi_coef = ci_eigenvectors psi_coef = ci_eigenvectors
SOFT_TOUCH psi_coef SOFT_TOUCH psi_coef
call save_wavefunction call save_wavefunction_truncated(save_threshold)
call ezfio_set_cisd_energy(CI_energy) call ezfio_set_cisd_energy(CI_energy)
do i = 1,N_states do i = 1,N_states

View File

@ -779,6 +779,7 @@ subroutine binary_search_cfg(cfgInp,addcfg)
end subroutine end subroutine
BEGIN_PROVIDER [ integer, psi_configuration_to_psi_det, (2,N_configuration) ] BEGIN_PROVIDER [ integer, psi_configuration_to_psi_det, (2,N_configuration) ]
&BEGIN_PROVIDER [ integer, psi_configuration_n_det, (N_configuration) ]
&BEGIN_PROVIDER [ integer, psi_configuration_to_psi_det_data, (N_det) ] &BEGIN_PROVIDER [ integer, psi_configuration_to_psi_det_data, (N_det) ]
implicit none implicit none
@ -867,6 +868,29 @@ end subroutine
enddo enddo
deallocate(dets, old_order) deallocate(dets, old_order)
integer :: ndet_conf
do i = 1, N_configuration
ndet_conf = psi_configuration_to_psi_det(2,i) - psi_configuration_to_psi_det(1,i) + 1
psi_configuration_n_det(i) = ndet_conf
enddo
END_PROVIDER END_PROVIDER
BEGIN_PROVIDER [ integer, n_elec_alpha_for_psi_configuration, (N_configuration)]
implicit none
integer :: i,j,k,l
integer(bit_kind) :: det_tmp(N_int,2),det_alpha(N_int)
n_elec_alpha_for_psi_configuration = 0
do i = 1, N_configuration
j = psi_configuration_to_psi_det(2,i)
det_tmp(:,:) = psi_det(:,:,j)
k = 0
do l = 1, N_int
det_alpha(N_int) = iand(det_tmp(l,1),psi_configuration(l,1,i))
k += popcnt(det_alpha(l))
enddo
n_elec_alpha_for_psi_configuration(i) = k
enddo
END_PROVIDER

View File

@ -1,3 +1,15 @@
BEGIN_PROVIDER [ double precision, psi_csf_coef, (N_csf, N_states) ]
implicit none
BEGIN_DOC
! Wafe function in CSF basis
END_DOC
double precision, allocatable :: buffer(:,:)
allocate ( buffer(N_det, N_states) )
buffer(1:N_det, 1:N_states) = psi_coef(1:N_det, 1:N_states)
call convertWFfromDETtoCSF(N_states, buffer, psi_csf_coef)
END_PROVIDER
subroutine convertWFfromDETtoCSF(N_st,psi_coef_det_in, psi_coef_cfg_out) subroutine convertWFfromDETtoCSF(N_st,psi_coef_det_in, psi_coef_cfg_out)
use cfunctions use cfunctions
use bitmasks use bitmasks
@ -26,6 +38,8 @@ subroutine convertWFfromDETtoCSF(N_st,psi_coef_det_in, psi_coef_cfg_out)
integer s, bfIcfg integer s, bfIcfg
integer countcsf integer countcsf
integer MS
MS = elec_alpha_num-elec_beta_num
countcsf = 0 countcsf = 0
phasedet = 1.0d0 phasedet = 1.0d0
do i = 1,N_configuration do i = 1,N_configuration
@ -44,12 +58,17 @@ subroutine convertWFfromDETtoCSF(N_st,psi_coef_det_in, psi_coef_cfg_out)
enddo enddo
enddo enddo
s = 0 s = 0 ! s == total number of SOMOs
do k=1,N_int do k=1,N_int
if (psi_configuration(k,1,i) == 0_bit_kind) cycle if (psi_configuration(k,1,i) == 0_bit_kind) cycle
s = s + popcnt(psi_configuration(k,1,i)) s = s + popcnt(psi_configuration(k,1,i))
enddo enddo
bfIcfg = max(1,nint((binom(s,(s+1)/2)-binom(s,((s+1)/2)+1))))
if(iand(s,1) .EQ. 0) then
bfIcfg = max(1,nint((binom(s,s/2)-binom(s,(s/2)+1))))
else
bfIcfg = max(1,nint((binom(s,(s+1)/2)-binom(s,((s+1)/2)+1))))
endif
! perhaps blocking with CFGs of same seniority ! perhaps blocking with CFGs of same seniority
! can be more efficient ! can be more efficient

View File

@ -1,7 +1,10 @@
real*8 function logabsgamma(x) real*8 function logabsgamma(x)
implicit none implicit none
real*8, intent(in) :: x real*8, intent(in) :: x
logabsgamma = log(abs(gamma(x))) logabsgamma = 1.d32 ! Avoid floating point exception
if (x>0.d0) then
logabsgamma = log(abs(gamma(x)))
endif
end function logabsgamma end function logabsgamma
BEGIN_PROVIDER [ integer, NSOMOMax] BEGIN_PROVIDER [ integer, NSOMOMax]
@ -48,11 +51,24 @@
if(cfg_seniority_index(i+2) > ncfgpersomo) then if(cfg_seniority_index(i+2) > ncfgpersomo) then
ncfgpersomo = cfg_seniority_index(i+2) ncfgpersomo = cfg_seniority_index(i+2)
else else
k = 0 ! l = i+k+2
do while(cfg_seniority_index(i+2+k) < ncfgpersomo) ! Loop over l with a constraint to ensure that l <= size(cfg_seniority_index,1)-1
k = k + 2 ! Old version commented just below
ncfgpersomo = cfg_seniority_index(i+2+k) do l = min(size(cfg_seniority_index,1)-1, i+2), size(cfg_seniority_index,1)-1, 2
if (cfg_seniority_index(l) >= ncfgpersomo) then
ncfgpersomo = cfg_seniority_index(l)
endif
enddo enddo
!k = 0
!if ((i+2+k) < size(cfg_seniority_index,1)) then
! do while(cfg_seniority_index(i+2+k) < ncfgpersomo)
! k = k + 2
! if ((i+2+k) >= size(cfg_seniority_index,1)) then
! exit
! endif
! ncfgpersomo = cfg_seniority_index(i+2+k)
! enddo
!endif
endif endif
endif endif
ncfg = ncfgpersomo - ncfgprev ncfg = ncfgpersomo - ncfgprev
@ -62,34 +78,33 @@
dimcsfpercfg = 2 dimcsfpercfg = 2
else else
if(iand(MS,1) .EQ. 0) then if(iand(MS,1) .EQ. 0) then
!dimcsfpercfg = max(1,nint((binom(i,i/2)-binom(i,i/2+1)))) dimcsfpercfg = max(1,nint((binom(i,i/2)-binom(i,i/2+1))))
binom1 = dexp(logabsgamma(1.0d0*(i+1)) &
- logabsgamma(1.0d0*((i/2)+1)) &
- logabsgamma(1.0d0*(i-((i/2))+1)));
binom2 = dexp(logabsgamma(1.0d0*(i+1)) &
- logabsgamma(1.0d0*(((i/2)+1)+1)) &
- logabsgamma(1.0d0*(i-((i/2)+1)+1)));
dimcsfpercfg = max(1,nint(binom1 - binom2))
else else
!dimcsfpercfg = max(1,nint((binom(i,(i+1)/2)-binom(i,(i+3)/2)))) dimcsfpercfg = max(1,nint((binom(i,(i+1)/2)-binom(i,(i+3)/2))))
binom1 = dexp(logabsgamma(1.0d0*(i+1)) &
- logabsgamma(1.0d0*(((i+1)/2)+1)) &
- logabsgamma(1.0d0*(i-(((i+1)/2))+1)));
binom2 = dexp(logabsgamma(1.0d0*(i+1)) &
- logabsgamma(1.0d0*((((i+3)/2)+1)+1)) &
- logabsgamma(1.0d0*(i-(((i+3)/2)+1)+1)));
dimcsfpercfg = max(1,nint(binom1 - binom2))
endif endif
endif endif
n_CSF += ncfg * dimcsfpercfg n_CSF += ncfg * dimcsfpercfg
if(cfg_seniority_index(i+2) > ncfgprev) then if(cfg_seniority_index(i+2) > ncfgprev) then
ncfgprev = cfg_seniority_index(i+2) ncfgprev = cfg_seniority_index(i+2)
else else
k = 0 ! l = i+k+2
do while(cfg_seniority_index(i+2+k) < ncfgprev) ! Loop over l with a constraint to ensure that l <= size(cfg_seniority_index,1)-1
k = k + 2 ! Old version commented just below
ncfgprev = cfg_seniority_index(i+2+k) do l = min(size(cfg_seniority_index,1)-1, i+2), size(cfg_seniority_index,1)-1, 2
if (cfg_seniority_index(l) >= ncfgprev) then
ncfgprev = cfg_seniority_index(l)
endif
enddo enddo
!k = 0
!if ((i+2+k) < size(cfg_seniority_index,1)) then
! do while(cfg_seniority_index(i+2+k) < ncfgprev)
! k = k + 2
! if ((i+2+k) >= size(cfg_seniority_index,1)) then
! exit
! endif
! ncfgprev = cfg_seniority_index(i+2+k)
! enddo
!endif
endif endif
enddo enddo
END_PROVIDER END_PROVIDER

View File

@ -0,0 +1,451 @@
subroutine davidson_general_ext_rout_diag_dressed(u_in,H_jj,Dress_jj,energies,sze,N_st,N_st_diag_in,converged,hcalc)
use mmap_module
implicit none
BEGIN_DOC
! Generic Davidson diagonalization with ONE DIAGONAL DRESSING OPERATOR
!
! Dress_jj : DIAGONAL DRESSING of the Hamiltonian
!
! H_jj : specific diagonal H matrix elements to diagonalize de Davidson
!
! u_in : guess coefficients on the various states. Overwritten on exit
!
! sze : leftmost dimension of u_in
!
! sze : Number of determinants
!
! N_st : Number of eigenstates
!
! N_st_diag_in : Number of states in which H is diagonalized. Assumed > sze
!
! Initial guess vectors are not necessarily orthonormal
!
! hcalc subroutine to compute W = H U (see routine hcalc_template for template of input/output)
END_DOC
integer, intent(in) :: sze, N_st, N_st_diag_in
double precision, intent(in) :: H_jj(sze),Dress_jj(sze)
double precision, intent(inout) :: u_in(sze,N_st_diag_in)
double precision, intent(out) :: energies(N_st)
external hcalc
integer :: iter, N_st_diag
integer :: i,j,k,l,m
logical, intent(inout) :: converged
double precision, external :: u_dot_v, u_dot_u
integer :: k_pairs, kl
integer :: iter2, itertot
double precision, allocatable :: y(:,:), h(:,:), lambda(:)
double precision, allocatable :: residual_norm(:)
character*(16384) :: write_buffer
double precision :: to_print(2,N_st)
double precision :: cpu, wall
integer :: shift, shift2, itermax, istate
double precision :: r1, r2, alpha
integer :: nproc_target
integer :: order(N_st_diag_in)
double precision :: cmax
double precision, allocatable :: U(:,:), overlap(:,:)!, S_d(:,:)
double precision, pointer :: W(:,:)
logical :: disk_based
double precision :: energy_shift(N_st_diag_in*davidson_sze_max)
include 'constants.include.F'
N_st_diag = N_st_diag_in
!DIR$ ATTRIBUTES ALIGN : $IRP_ALIGN :: U, W, y, h, lambda
if (N_st_diag*3 > sze) then
print *, 'error in Davidson :'
print *, 'Increase n_det_max_full to ', N_st_diag*3
stop -1
endif
itermax = max(2,min(davidson_sze_max, sze/N_st_diag))+1
itertot = 0
if (state_following) then
allocate(overlap(N_st_diag*itermax, N_st_diag*itermax))
else
allocate(overlap(1,1)) ! avoid 'if' for deallocate
endif
overlap = 0.d0
provide threshold_davidson !nthreads_davidson
call write_time(6)
write(6,'(A)') ''
write(6,'(A)') 'Davidson Diagonalization'
write(6,'(A)') '------------------------'
write(6,'(A)') ''
! Find max number of cores to fit in memory
! -----------------------------------------
nproc_target = nproc
double precision :: rss
integer :: maxab
maxab = sze
m=1
disk_based = .False.
call resident_memory(rss)
do
r1 = 8.d0 * &! bytes
( dble(sze)*(N_st_diag*itermax) &! U
+ 1.d0*dble(sze*m)*(N_st_diag*itermax) &! W
+ 2.0d0*(N_st_diag*itermax)**2 &! h,y
+ 2.d0*(N_st_diag*itermax) &! s2,lambda
+ 1.d0*(N_st_diag) &! residual_norm
! In H_S2_u_0_nstates_zmq
+ 3.d0*(N_st_diag*N_det) &! u_t, v_t, s_t on collector
+ 3.d0*(N_st_diag*N_det) &! u_t, v_t, s_t on slave
+ 0.5d0*maxab &! idx0 in H_S2_u_0_nstates_openmp_work_*
+ nproc_target * &! In OMP section
( 1.d0*(N_int*maxab) &! buffer
+ 3.5d0*(maxab) ) &! singles_a, singles_b, doubles, idx
) / 1024.d0**3
if (nproc_target == 0) then
call check_mem(r1,irp_here)
nproc_target = 1
exit
endif
if (r1+rss < qp_max_mem) then
exit
endif
if (itermax > 4) then
itermax = itermax - 1
else if (m==1.and.disk_based_davidson) then
m=0
disk_based = .True.
itermax = 6
else
nproc_target = nproc_target - 1
endif
enddo
nthreads_davidson = nproc_target
TOUCH nthreads_davidson
call write_int(6,N_st,'Number of states')
call write_int(6,N_st_diag,'Number of states in diagonalization')
call write_int(6,sze,'Number of basis functions')
call write_int(6,nproc_target,'Number of threads for diagonalization')
call write_double(6, r1, 'Memory(Gb)')
if (disk_based) then
print *, 'Using swap space to reduce RAM'
endif
double precision, allocatable :: H_jj_tmp(:)
ASSERT (N_st > 0)
ASSERT (sze > 0)
allocate(H_jj_tmp(sze))
do i=1,sze
H_jj_tmp(i) = H_jj(i) + Dress_jj(i)
enddo
!---------------
write(6,'(A)') ''
write_buffer = '====='
do i=1,N_st
write_buffer = trim(write_buffer)//' ================ ==========='
enddo
write(6,'(A)') write_buffer(1:6+41*N_st)
write_buffer = 'Iter'
do i=1,N_st
write_buffer = trim(write_buffer)//' Energy Residual '
enddo
write(6,'(A)') write_buffer(1:6+41*N_st)
write_buffer = '====='
do i=1,N_st
write_buffer = trim(write_buffer)//' ================ ==========='
enddo
write(6,'(A)') write_buffer(1:6+41*N_st)
allocate(W(sze,N_st_diag*itermax))
allocate( &
! Large
U(sze,N_st_diag*itermax), &
! Small
h(N_st_diag*itermax,N_st_diag*itermax), &
y(N_st_diag*itermax,N_st_diag*itermax), &
residual_norm(N_st_diag), &
lambda(N_st_diag*itermax))
h = 0.d0
U = 0.d0
y = 0.d0
ASSERT (N_st > 0)
ASSERT (N_st_diag >= N_st)
ASSERT (sze > 0)
! Davidson iterations
! ===================
converged = .False.
! Initialize from N_st to N_st_diat with gaussian random numbers
! to be sure to have overlap with any eigenvectors
do k=N_st+1,N_st_diag
u_in(k,k) = 10.d0
do i=1,sze
call random_number(r1)
call random_number(r2)
r1 = dsqrt(-2.d0*dlog(r1))
r2 = dtwo_pi*r2
u_in(i,k) = r1*dcos(r2)
enddo
enddo
! Normalize all states
do k=1,N_st_diag
call normalize(u_in(1,k),sze)
enddo
! Copy from the guess input "u_in" to the working vectors "U"
do k=1,N_st_diag
do i=1,sze
U(i,k) = u_in(i,k)
enddo
enddo
do while (.not.converged)
itertot = itertot+1
if (itertot == 8) then
exit
endif
do iter=1,itermax-1
shift = N_st_diag*(iter-1)
shift2 = N_st_diag*iter
if ((iter > 1).or.(itertot == 1)) then
! Compute |W_k> = \sum_i |i><i|H|u_k>
! -----------------------------------
! Gram-Schmidt to orthogonalize all new guess with the previous vectors
call ortho_qr(U,size(U,1),sze,shift2)
call ortho_qr(U,size(U,1),sze,shift2)
! it does W = H U with W(sze,N_st_diag),U(sze,N_st_diag)
! where sze is the size of the vector, N_st_diag is the number of states
call hcalc(W(1,shift+1),U(1,shift+1),N_st_diag,sze)
! Compute then the DIAGONAL PART OF THE DRESSING
! <i|W_k> += Dress_jj(i) * <i|U>
call dressing_diag_uv(W(1,shift+1),U(1,shift+1),Dress_jj,N_st_diag_in,sze)
else
! Already computed in update below
continue
endif
! Compute h_kl = <u_k | W_l> = <u_k| H |u_l>
! -------------------------------------------
call dgemm('T','N', shift2, shift2, sze, &
1.d0, U, size(U,1), W, size(W,1), &
0.d0, h, size(h,1))
! Diagonalize h y = lambda y
! ---------------
call lapack_diag(lambda,y,h,size(h,1),shift2)
if (state_following) then
overlap = -1.d0
do k=1,shift2
do i=1,shift2
overlap(k,i) = dabs(y(k,i))
enddo
enddo
do k=1,N_st
cmax = -1.d0
do i=1,N_st
if (overlap(i,k) > cmax) then
cmax = overlap(i,k)
order(k) = i
endif
enddo
do i=1,N_st_diag
overlap(order(k),i) = -1.d0
enddo
enddo
overlap = y
do k=1,N_st
l = order(k)
if (k /= l) then
y(1:shift2,k) = overlap(1:shift2,l)
endif
enddo
do k=1,N_st
overlap(k,1) = lambda(k)
enddo
do k=1,N_st
l = order(k)
if (k /= l) then
lambda(k) = overlap(l,1)
endif
enddo
endif
! Express eigenvectors of h in the determinant basis
! --------------------------------------------------
call dgemm('N','N', sze, N_st_diag, shift2, &
1.d0, U, size(U,1), y, size(y,1), 0.d0, U(1,shift2+1), size(U,1))
call dgemm('N','N', sze, N_st_diag, shift2, &
1.d0, W, size(W,1), y, size(y,1), 0.d0, W(1,shift2+1), size(W,1))
! Compute residual vector and davidson step
! -----------------------------------------
!$OMP PARALLEL DO DEFAULT(SHARED) PRIVATE(i,k)
do k=1,N_st_diag
do i=1,sze
U(i,shift2+k) = &
(lambda(k) * U(i,shift2+k) - W(i,shift2+k) ) &
/max(H_jj_tmp(i) - lambda (k),1.d-2)
enddo
if (k <= N_st) then
residual_norm(k) = u_dot_u(U(1,shift2+k),sze)
to_print(1,k) = lambda(k)
to_print(2,k) = residual_norm(k)
endif
enddo
!$OMP END PARALLEL DO
if ((itertot>1).and.(iter == 1)) then
!don't print
continue
else
write(*,'(1X,I3,1X,100(1X,F16.10,1X,F11.6,1X,E11.3))') iter-1, to_print(1:2,1:N_st)
endif
! Check convergence
if (iter > 1) then
converged = dabs(maxval(residual_norm(1:N_st))) < threshold_davidson
endif
do k=1,N_st
if (residual_norm(k) > 1.e8) then
print *, 'Davidson failed'
stop -1
endif
enddo
if (converged) then
exit
endif
logical, external :: qp_stop
if (qp_stop()) then
converged = .True.
exit
endif
enddo
call dgemm('N','N', sze, N_st_diag, shift2, 1.d0, &
W, size(W,1), y, size(y,1), 0.d0, u_in, size(u_in,1))
do k=1,N_st_diag
do i=1,sze
W(i,k) = u_in(i,k)
enddo
enddo
call dgemm('N','N', sze, N_st_diag, shift2, 1.d0, &
U, size(U,1), y, size(y,1), 0.d0, u_in, size(u_in,1))
do k=1,N_st_diag
do i=1,sze
U(i,k) = u_in(i,k)
enddo
enddo
call ortho_qr(U,size(U,1),sze,N_st_diag)
call ortho_qr(U,size(U,1),sze,N_st_diag)
do j=1,N_st_diag
k=1
do while ((k<sze).and.(U(k,j) == 0.d0))
k = k+1
enddo
if (U(k,j) * u_in(k,j) < 0.d0) then
do i=1,sze
W(i,j) = -W(i,j)
enddo
endif
enddo
enddo
do k=1,N_st
energies(k) = lambda(k)
enddo
write_buffer = '====='
do i=1,N_st
write_buffer = trim(write_buffer)//' ================ ==========='
enddo
write(6,'(A)') trim(write_buffer)
write(6,'(A)') ''
call write_time(6)
deallocate(W)
deallocate ( &
residual_norm, &
U, h, &
y, &
lambda &
)
deallocate(overlap)
FREE nthreads_davidson
end
subroutine dressing_diag_uv(v,u,dress_diag,N_st,sze)
implicit none
BEGIN_DOC
! Routine that computes the diagonal part of the dressing
!
! v(i) += u(i) * dress_diag(i)
!
! !!!!!!!! WARNING !!!!!!!! the vector v is not initialized
!
! !!!!!!!! SO MAKE SURE THERE ARE SOME MEANINGFUL VALUES IN THERE
END_DOC
integer, intent(in) :: N_st,sze
double precision, intent(in) :: u(sze,N_st),dress_diag(sze)
double precision, intent(inout) :: v(sze,N_st)
integer :: i,istate
do istate = 1, N_st
do i = 1, sze
v(i,istate) += dress_diag(i) * u(i,istate)
enddo
enddo
end

View File

@ -0,0 +1,518 @@
subroutine dav_double_dressed(u_in,H_jj,Dress_jj,Dressing_vec,idx_dress,energies,sze,N_st,N_st_diag,converged,hcalc)
use mmap_module
BEGIN_DOC
! Generic Davidson diagonalization with TWO DRESSING VECTORS
!
! Dress_jj : DIAGONAL DRESSING of the Hamiltonian
!
! Dressing_vec : COLUMN / LINE DRESSING VECTOR
!
! idx_dress : position of the basis function used to use the Dressing_vec (usually the largest coeff)
!
! H_jj : specific diagonal H matrix elements to diagonalize de Davidson
!
! u_in : guess coefficients on the various states. Overwritten on exit
!
! sze : leftmost dimension of u_in
!
! sze : Number of determinants
!
! N_st : Number of eigenstates
!
! N_st_diag : Number of states in which H is diagonalized. Assumed > sze
!
! Initial guess vectors are not necessarily orthonormal
!
! hcalc subroutine to compute W = H U (see routine hcalc_template for template of input/output)
END_DOC
implicit none
integer, intent(in) :: sze, N_st, N_st_diag, idx_dress
double precision, intent(in) :: H_jj(sze),Dress_jj(sze),Dressing_vec(sze,N_st)
double precision, intent(inout) :: u_in(sze,N_st_diag)
double precision, intent(out) :: energies(N_st_diag)
logical, intent(out) :: converged
external hcalc
double precision, allocatable :: H_jj_tmp(:)
ASSERT (N_st > 0)
ASSERT (sze > 0)
allocate(H_jj_tmp(sze))
do i=1,sze
H_jj_tmp(i) = H_jj(i) + Dress_jj(i)
enddo
do k=1,N_st
do i=1,sze
H_jj_tmp(i) += u_in(i,k) * Dressing_vec(i,k)
enddo
enddo
integer :: iter
integer :: i,j,k,l,m
double precision, external :: u_dot_v, u_dot_u
integer :: k_pairs, kl
integer :: iter2, itertot
double precision, allocatable :: y(:,:), h(:,:), lambda(:)
double precision, allocatable :: s_tmp(:,:)
double precision, allocatable :: residual_norm(:),inv_c_idx_dress_vec(:)
character*(16384) :: write_buffer
double precision :: to_print(2,N_st),inv_c_idx_dress
double precision :: cpu, wall
integer :: shift, shift2, itermax, istate
double precision :: r1, r2, alpha
logical :: state_ok(N_st_diag*davidson_sze_max)
integer :: nproc_target
integer :: order(N_st_diag)
double precision :: cmax
double precision, allocatable :: U(:,:), overlap(:,:)
double precision, pointer :: W(:,:)
logical :: disk_based
double precision :: energy_shift(N_st_diag*davidson_sze_max)
allocate(inv_c_idx_dress_vec(N_st))
inv_c_idx_dress = 1.d0/u_in(idx_dress,1)
do i = 1, N_st
inv_c_idx_dress_vec(i) = 1.d0/u_in(idx_dress,i)
enddo
include 'constants.include.F'
integer :: N_st_diag_in
N_st_diag_in = N_st_diag
!DIR$ ATTRIBUTES ALIGN : $IRP_ALIGN :: U, W, y, h, lambda
if (N_st_diag_in*3 > sze) then
print *, 'error in Davidson :'
print *, 'Increase n_det_max_full to ', N_st_diag_in*3
stop -1
endif
itermax = max(2,min(davidson_sze_max, sze/N_st_diag_in))+1
itertot = 0
if (state_following) then
allocate(overlap(N_st_diag_in*itermax, N_st_diag_in*itermax))
else
allocate(overlap(1,1)) ! avoid 'if' for deallocate
endif
overlap = 0.d0
call write_time(6)
write(6,'(A)') ''
write(6,'(A)') 'Davidson Diagonalization'
write(6,'(A)') '------------------------'
write(6,'(A)') ''
! Find max number of cores to fit in memory
! -----------------------------------------
nproc_target = nproc
double precision :: rss
integer :: maxab
maxab = max(N_det_alpha_unique, N_det_beta_unique)+1
m=1
disk_based = .False.
call resident_memory(rss)
do
r1 = 8.d0 * &! bytes
( dble(sze)*(N_st_diag_in*itermax) &! U
+ 1.0d0*dble(sze*m)*(N_st_diag_in*itermax) &! W
+ 3.0d0*(N_st_diag_in*itermax)**2 &! h,y,s_tmp
+ 1.d0*(N_st_diag_in*itermax) &! lambda
+ 1.d0*(N_st_diag_in) &! residual_norm
! In H_u_0_nstates_zmq
+ 2.d0*(N_st_diag_in*N_det) &! u_t, v_t, on collector
+ 2.d0*(N_st_diag_in*N_det) &! u_t, v_t, on slave
+ 0.5d0*maxab &! idx0 in H_u_0_nstates_openmp_work_*
+ nproc_target * &! In OMP section
( 1.d0*(N_int*maxab) &! buffer
+ 3.5d0*(maxab) ) &! singles_a, singles_b, doubles, idx
) / 1024.d0**3
if (nproc_target == 0) then
call check_mem(r1,irp_here)
nproc_target = 1
exit
endif
if (r1+rss < qp_max_mem) then
exit
endif
if (itermax > 4) then
itermax = itermax - 1
else if (m==1.and.disk_based_davidson) then
m=0
disk_based = .True.
itermax = 6
else
nproc_target = nproc_target - 1
endif
enddo
nthreads_davidson = nproc_target
TOUCH nthreads_davidson
call write_int(6,N_st,'Number of states')
call write_int(6,N_st_diag_in,'Number of states in diagonalization')
call write_int(6,sze,'Number of basis functions ')
call write_int(6,nproc_target,'Number of threads for diagonalization')
call write_double(6, r1, 'Memory(Gb)')
if (disk_based) then
print *, 'Using swap space to reduce RAM'
endif
!---------------
write(6,'(A)') ''
write_buffer = '====='
do i=1,N_st
write_buffer = trim(write_buffer)//' ================ ==========='
enddo
write(6,'(A)') write_buffer(1:6+41*N_st)
write_buffer = 'Iter'
do i=1,N_st
write_buffer = trim(write_buffer)//' Energy Residual '
enddo
write(6,'(A)') write_buffer(1:6+41*N_st)
write_buffer = '====='
do i=1,N_st
write_buffer = trim(write_buffer)//' ================ ==========='
enddo
write(6,'(A)') write_buffer(1:6+41*N_st)
allocate(W(sze,N_st_diag_in*itermax))
allocate( &
! Large
U(sze,N_st_diag_in*itermax), &
! Small
h(N_st_diag_in*itermax,N_st_diag_in*itermax), &
y(N_st_diag_in*itermax,N_st_diag_in*itermax), &
s_tmp(N_st_diag_in*itermax,N_st_diag_in*itermax), &
residual_norm(N_st_diag_in), &
lambda(N_st_diag_in*itermax))
h = 0.d0
U = 0.d0
y = 0.d0
s_tmp = 0.d0
ASSERT (N_st > 0)
ASSERT (N_st_diag_in >= N_st)
ASSERT (sze > 0)
! Davidson iterations
! ===================
converged = .False.
do k=N_st+1,N_st_diag_in
do i=1,sze
call random_number(r1)
call random_number(r2)
r1 = dsqrt(-2.d0*dlog(r1))
r2 = dtwo_pi*r2
u_in(i,k) = r1*dcos(r2) * u_in(i,k-N_st)
enddo
u_in(k,k) = u_in(k,k) + 10.d0
enddo
do k=1,N_st_diag_in
call normalize(u_in(1,k),sze)
enddo
do k=1,N_st_diag_in
do i=1,sze
U(i,k) = u_in(i,k)
enddo
enddo
do while (.not.converged)
itertot = itertot+1
if (itertot == 2) then
exit
endif
do iter=1,itermax-1
shift = N_st_diag_in*(iter-1)
shift2 = N_st_diag_in*iter
if ((iter > 1).or.(itertot == 1)) then
! Compute |W_k> = \sum_i |i><i|H|u_k>
! -----------------------------------
call hcalc(W(1,shift+1),U(1,shift+1),N_st_diag_in,sze)
! Compute then the DIAGONAL PART OF THE DRESSING
! <i|W_k> += Dress_jj(i) * <i|U>
call dressing_diag_uv(W(1,shift+1),U(1,shift+1),Dress_jj,N_st_diag_in,sze)
else
! Already computed in update below
continue
endif
if (N_st == 1) then
l = idx_dress
double precision :: f
f = inv_c_idx_dress
do istate=1,N_st_diag_in
do i=1,sze
W(i,shift+istate) += Dressing_vec(i,1) *f * U(l,shift+istate)
W(l,shift+istate) += Dressing_vec(i,1) *f * U(i,shift+istate)
enddo
enddo
else
print*,'dav_double_dressed routine not yet implemented for N_st > 1'
!
! call dgemm('T','N', N_st, N_st_diag_in, sze, 1.d0, &
! psi_coef, size(psi_coef,1), &
! U(1,shift+1), size(U,1), 0.d0, s_tmp, size(s_tmp,1))
!
! call dgemm('N','N', sze, N_st_diag_in, N_st, 1.0d0, &
! Dressing_vec, size(Dressing_vec,1), s_tmp, size(s_tmp,1), &
! 1.d0, W(1,shift+1), size(W,1))
!
!
! call dgemm('T','N', N_st, N_st_diag_in, sze, 1.d0, &
! Dressing_vec, size(Dressing_vec,1), &
! U(1,shift+1), size(U,1), 0.d0, s_tmp, size(s_tmp,1))
!
! call dgemm('N','N', sze, N_st_diag_in, N_st, 1.0d0, &
! psi_coef, size(psi_coef,1), s_tmp, size(s_tmp,1), &
! 1.d0, W(1,shift+1), size(W,1))
!
endif
! Compute h_kl = <u_k | W_l> = <u_k| H |u_l>
! -------------------------------------------
call dgemm('T','N', shift2, shift2, sze, &
1.d0, U, size(U,1), W, size(W,1), &
0.d0, h, size(h,1))
call dgemm('T','N', shift2, shift2, sze, &
1.d0, U, size(U,1), U, size(U,1), &
0.d0, s_tmp, size(s_tmp,1))
! Diagonalize h
! ---------------
integer :: lwork, info
double precision, allocatable :: work(:)
y = h
lwork = -1
allocate(work(1))
call dsygv(1,'V','U',shift2,y,size(y,1), &
s_tmp,size(s_tmp,1), lambda, work,lwork,info)
lwork = int(work(1))
deallocate(work)
allocate(work(lwork))
call dsygv(1,'V','U',shift2,y,size(y,1), &
s_tmp,size(s_tmp,1), lambda, work,lwork,info)
deallocate(work)
if (info /= 0) then
stop 'DSYGV Diagonalization failed'
endif
! Compute Energy for each eigenvector
! -----------------------------------
call dgemm('N','N',shift2,shift2,shift2, &
1.d0, h, size(h,1), y, size(y,1), &
0.d0, s_tmp, size(s_tmp,1))
call dgemm('T','N',shift2,shift2,shift2, &
1.d0, y, size(y,1), s_tmp, size(s_tmp,1), &
0.d0, h, size(h,1))
do k=1,shift2
lambda(k) = h(k,k)
enddo
if (state_following) then
overlap = -1.d0
do k=1,shift2
do i=1,shift2
overlap(k,i) = dabs(y(k,i))
enddo
enddo
do k=1,N_st
cmax = -1.d0
do i=1,N_st
if (overlap(i,k) > cmax) then
cmax = overlap(i,k)
order(k) = i
endif
enddo
do i=1,N_st_diag_in
overlap(order(k),i) = -1.d0
enddo
enddo
overlap = y
do k=1,N_st
l = order(k)
if (k /= l) then
y(1:shift2,k) = overlap(1:shift2,l)
endif
enddo
do k=1,N_st
overlap(k,1) = lambda(k)
enddo
endif
! Express eigenvectors of h in the determinant basis
! --------------------------------------------------
call dgemm('N','N', sze, N_st_diag_in, shift2, &
1.d0, U, size(U,1), y, size(y,1), 0.d0, U(1,shift2+1), size(U,1))
call dgemm('N','N', sze, N_st_diag_in, shift2, &
1.d0, W, size(W,1), y, size(y,1), 0.d0, W(1,shift2+1), size(W,1))
! Compute residual vector and davidson step
! -----------------------------------------
!$OMP PARALLEL DO DEFAULT(SHARED) PRIVATE(i,k)
do k=1,N_st_diag_in
do i=1,sze
U(i,shift2+k) = &
(lambda(k) * U(i,shift2+k) - W(i,shift2+k) ) &
/max(H_jj_tmp(i) - lambda (k),1.d-2)
enddo
if (k <= N_st) then
residual_norm(k) = u_dot_u(U(1,shift2+k),sze)
to_print(1,k) = lambda(k)
to_print(2,k) = residual_norm(k)
endif
enddo
!$OMP END PARALLEL DO
if ((itertot>1).and.(iter == 1)) then
!don't print
continue
else
write(*,'(1X,I3,1X,100(1X,F16.10,1X,E11.3))') iter-1, to_print(1:2,1:N_st)
endif
! Check convergence
if (iter > 1) then
converged = dabs(maxval(residual_norm(1:N_st))) < threshold_davidson
endif
do k=1,N_st
if (residual_norm(k) > 1.d8) then
print *, 'Davidson failed'
stop -1
endif
enddo
if (converged) then
exit
endif
logical, external :: qp_stop
if (qp_stop()) then
converged = .True.
exit
endif
enddo
! Re-contract U and update W
! --------------------------------
call dgemm('N','N', sze, N_st_diag_in, shift2, 1.d0, &
W, size(W,1), y, size(y,1), 0.d0, u_in, size(u_in,1))
do k=1,N_st_diag_in
do i=1,sze
W(i,k) = u_in(i,k)
enddo
enddo
call dgemm('N','N', sze, N_st_diag_in, shift2, 1.d0, &
U, size(U,1), y, size(y,1), 0.d0, u_in, size(u_in,1))
do k=1,N_st_diag_in
do i=1,sze
U(i,k) = u_in(i,k)
enddo
enddo
enddo
call nullify_small_elements(sze,N_st_diag_in,U,size(U,1),threshold_davidson_pt2)
do k=1,N_st_diag_in
do i=1,sze
u_in(i,k) = U(i,k)
enddo
enddo
do k=1,N_st_diag_in
energies(k) = lambda(k)
enddo
write_buffer = '======'
do i=1,N_st
write_buffer = trim(write_buffer)//' ================ ==========='
enddo
write(6,'(A)') trim(write_buffer)
write(6,'(A)') ''
call write_time(6)
deallocate(W)
deallocate ( &
residual_norm, &
U, overlap, &
h, y, s_tmp, &
lambda &
)
FREE nthreads_davidson
end
subroutine dressing_diag_uv(v,u,dress_diag,N_st,sze)
implicit none
BEGIN_DOC
! Routine that computes the diagonal part of the dressing
!
! v(i) += u(i) * dress_diag(i)
!
! !!!!!!!! WARNING !!!!!!!! the vector v is not initialized
!
! !!!!!!!! SO MAKE SURE THERE ARE SOME MEANINGFUL VALUES IN THERE
END_DOC
integer, intent(in) :: N_st,sze
double precision, intent(in) :: u(sze,N_st),dress_diag(sze)
double precision, intent(inout) :: v(sze,N_st)
integer :: i,istate
do istate = 1, N_st
do i = 1, sze
v(i,istate) += dress_diag(i) * u(i,istate)
enddo
enddo
end

View File

@ -0,0 +1,485 @@
subroutine davidson_general_ext_rout_dressed(u_in,H_jj,energies,sze,N_st,N_st_diag,dressing_state,dressing_vec,idress,converged,hcalc)
use mmap_module
implicit none
BEGIN_DOC
! Davidson diagonalization.
!
! u_in : guess coefficients on the various states. Overwritten
! on exit
!
! sze : leftmost dimension of u_in
!
! sze : Number of determinants
!
! N_st : Number of eigenstates
!
! Initial guess vectors are not necessarily orthonormal
END_DOC
integer, intent(in) :: sze, N_st, N_st_diag,idress
double precision, intent(inout) :: u_in(sze,N_st_diag)
double precision, intent(inout) :: H_jj(sze)
double precision, intent(out) :: energies(N_st_diag)
double precision, intent(in) :: dressing_vec(sze,N_st)
integer, intent(in) :: dressing_state
logical, intent(out) :: converged
external hcalc
double precision :: f
integer :: iter
integer :: i,j,k,l,m
double precision, external :: u_dot_v, u_dot_u
integer :: k_pairs, kl
integer :: iter2, itertot
double precision, allocatable :: y(:,:), h(:,:), lambda(:)
double precision, allocatable :: s_tmp(:,:)
double precision :: diag_h_mat_elem
double precision, allocatable :: residual_norm(:)
character*(16384) :: write_buffer
double precision :: to_print(2,N_st)
double precision :: cpu, wall
integer :: shift, shift2, itermax, istate
double precision :: r1, r2, alpha
logical :: state_ok(N_st_diag*davidson_sze_max)
integer :: nproc_target
integer :: order(N_st_diag)
double precision :: cmax
double precision, allocatable :: U(:,:), overlap(:,:)
double precision, pointer :: W(:,:)
logical :: disk_based
double precision :: energy_shift(N_st_diag*davidson_sze_max)
!!!! TO CHANGE !!!!
integer :: idx_dress(1)
idx_dress = idress
if (dressing_state > 0) then
do k=1,N_st
do i=1,sze
H_jj(i) += u_in(i,k) * dressing_vec(i,k)
enddo
enddo
endif
l = idx_dress(1)
f = 1.0d0/u_in(l,1)
include 'constants.include.F'
!DIR$ ATTRIBUTES ALIGN : $IRP_ALIGN :: U, W, y, h, lambda
if (N_st_diag*3 > sze) then
print *, 'error in Davidson :'
print *, 'Increase n_det_max_full to ', N_st_diag*3
stop -1
endif
itermax = max(2,min(davidson_sze_max, sze/N_st_diag))+1
itertot = 0
if (state_following) then
allocate(overlap(N_st_diag*itermax, N_st_diag*itermax))
else
allocate(overlap(1,1)) ! avoid 'if' for deallocate
endif
overlap = 0.d0
provide threshold_davidson !nthreads_davidson
call write_time(6)
write(6,'(A)') ''
write(6,'(A)') 'Davidson Diagonalization'
write(6,'(A)') '------------------------'
write(6,'(A)') ''
! Find max number of cores to fit in memory
! -----------------------------------------
nproc_target = nproc
double precision :: rss
integer :: maxab
! maxab = max(N_det_alpha_unique, N_det_beta_unique)+1
maxab = sze
m=1
disk_based = .False.
call resident_memory(rss)
do
r1 = 8.d0 * &! bytes
( dble(sze)*(N_st_diag*itermax) &! U
+ 1.0d0*dble(sze*m)*(N_st_diag*itermax) &! W
+ 3.0d0*(N_st_diag*itermax)**2 &! h,y,s_tmp
+ 1.d0*(N_st_diag*itermax) &! lambda
+ 1.d0*(N_st_diag) &! residual_norm
! In H_u_0_nstates_zmq
+ 2.d0*(N_st_diag*N_det) &! u_t, v_t, on collector
+ 2.d0*(N_st_diag*N_det) &! u_t, v_t, on slave
+ 0.5d0*maxab &! idx0 in H_u_0_nstates_openmp_work_*
+ nproc_target * &! In OMP section
( 1.d0*(N_int*maxab) &! buffer
+ 3.5d0*(maxab) ) &! singles_a, singles_b, doubles, idx
) / 1024.d0**3
if (nproc_target == 0) then
call check_mem(r1,irp_here)
nproc_target = 1
exit
endif
if (r1+rss < qp_max_mem) then
exit
endif
if (itermax > 4) then
itermax = itermax - 1
else if (m==1.and.disk_based_davidson) then
m=0
disk_based = .True.
itermax = 6
else
nproc_target = nproc_target - 1
endif
enddo
nthreads_davidson = nproc_target
TOUCH nthreads_davidson
call write_int(6,N_st,'Number of states')
call write_int(6,N_st_diag,'Number of states in diagonalization')
call write_int(6,sze,'Number of basis function')
call write_int(6,nproc_target,'Number of threads for diagonalization')
call write_double(6, r1, 'Memory(Gb)')
if (disk_based) then
print *, 'Using swap space to reduce RAM'
endif
!---------------
write(6,'(A)') ''
write_buffer = '====='
do i=1,N_st
write_buffer = trim(write_buffer)//' ================ ==========='
enddo
write(6,'(A)') write_buffer(1:6+41*N_st)
write_buffer = 'Iter'
do i=1,N_st
write_buffer = trim(write_buffer)//' Energy Residual '
enddo
write(6,'(A)') write_buffer(1:6+41*N_st)
write_buffer = '====='
do i=1,N_st
write_buffer = trim(write_buffer)//' ================ ==========='
enddo
write(6,'(A)') write_buffer(1:6+41*N_st)
allocate(W(sze,N_st_diag*itermax))
allocate( &
! Large
U(sze,N_st_diag*itermax), &
! Small
h(N_st_diag*itermax,N_st_diag*itermax), &
y(N_st_diag*itermax,N_st_diag*itermax), &
s_tmp(N_st_diag*itermax,N_st_diag*itermax), &
residual_norm(N_st_diag), &
lambda(N_st_diag*itermax))
h = 0.d0
U = 0.d0
y = 0.d0
s_tmp = 0.d0
ASSERT (N_st > 0)
ASSERT (N_st_diag >= N_st)
ASSERT (sze > 0)
! Davidson iterations
! ===================
converged = .False.
do k=N_st+1,N_st_diag
do i=1,sze
call random_number(r1)
call random_number(r2)
r1 = dsqrt(-2.d0*dlog(r1))
r2 = dtwo_pi*r2
u_in(i,k) = r1*dcos(r2) * u_in(i,k-N_st)
enddo
u_in(k,k) = u_in(k,k) + 10.d0
enddo
! Normalize all states
do k=1,N_st_diag
call normalize(u_in(1,k),sze)
enddo
! Copy from the guess input "u_in" to the working vectors "U"
do k=1,N_st_diag
do i=1,sze
U(i,k) = u_in(i,k)
enddo
enddo
do while (.not.converged)
itertot = itertot+1
if (itertot == 8) then
exit
endif
do iter=1,itermax-1
shift = N_st_diag*(iter-1)
shift2 = N_st_diag*iter
if ((iter > 1).or.(itertot == 1)) then
! Compute |W_k> = \sum_i |i><i|H|u_k>
! -----------------------------------
! Gram-Schmidt to orthogonalize all new guess with the previous vectors
call ortho_qr(U,size(U,1),sze,shift2)
call ortho_qr(U,size(U,1),sze,shift2)
! it does W = H U with W(sze,N_st_diag),U(sze,N_st_diag)
! where sze is the size of the vector, N_st_diag is the number of states
call hcalc(W(1,shift+1),U(1,shift+1),N_st_diag,sze)
else
! Already computed in update below
continue
endif
if (dressing_state > 0) then
if (N_st == 1) then
do istate=1,N_st_diag
do i=1,sze
W(i,shift+istate) += dressing_vec(i,1) *f * U(l,shift+istate)
W(l,shift+istate) += dressing_vec(i,1) *f * U(i,shift+istate)
enddo
enddo
else
print*,'Not implemented yet for multi state ...'
stop
! call dgemm('T','N', N_st, N_st_diag, sze, 1.d0, &
! psi_coef, size(psi_coef,1), &
! U(1,shift+1), size(U,1), 0.d0, s_tmp, size(s_tmp,1))
!
! call dgemm('N','N', sze, N_st_diag, N_st, 1.0d0, &
! dressing_vec, size(dressing_vec,1), s_tmp, size(s_tmp,1), &
! 1.d0, W(1,shift+1), size(W,1))
!
!
! call dgemm('T','N', N_st, N_st_diag, sze, 1.d0, &
! dressing_vec, size(dressing_vec,1), &
! U(1,shift+1), size(U,1), 0.d0, s_tmp, size(s_tmp,1))
!
! call dgemm('N','N', sze, N_st_diag, N_st, 1.0d0, &
! psi_coef, size(psi_coef,1), s_tmp, size(s_tmp,1), &
! 1.d0, W(1,shift+1), size(W,1))
endif
endif
! Compute h_kl = <u_k | W_l> = <u_k| H |u_l>
! -------------------------------------------
call dgemm('T','N', shift2, shift2, sze, &
1.d0, U, size(U,1), W, size(W,1), &
0.d0, h, size(h,1))
call dgemm('T','N', shift2, shift2, sze, &
1.d0, U, size(U,1), U, size(U,1), &
0.d0, s_tmp, size(s_tmp,1))
! Diagonalize h
! ---------------
integer :: lwork, info
double precision, allocatable :: work(:)
y = h
lwork = -1
allocate(work(1))
call dsygv(1,'V','U',shift2,y,size(y,1), &
s_tmp,size(s_tmp,1), lambda, work,lwork,info)
lwork = int(work(1))
deallocate(work)
allocate(work(lwork))
call dsygv(1,'V','U',shift2,y,size(y,1), &
s_tmp,size(s_tmp,1), lambda, work,lwork,info)
deallocate(work)
if (info /= 0) then
stop 'DSYGV Diagonalization failed'
endif
! Compute Energy for each eigenvector
! -----------------------------------
call dgemm('N','N',shift2,shift2,shift2, &
1.d0, h, size(h,1), y, size(y,1), &
0.d0, s_tmp, size(s_tmp,1))
call dgemm('T','N',shift2,shift2,shift2, &
1.d0, y, size(y,1), s_tmp, size(s_tmp,1), &
0.d0, h, size(h,1))
do k=1,shift2
lambda(k) = h(k,k)
enddo
if (state_following) then
overlap = -1.d0
do k=1,shift2
do i=1,shift2
overlap(k,i) = dabs(y(k,i))
enddo
enddo
do k=1,N_st
cmax = -1.d0
do i=1,N_st
if (overlap(i,k) > cmax) then
cmax = overlap(i,k)
order(k) = i
endif
enddo
do i=1,N_st_diag
overlap(order(k),i) = -1.d0
enddo
enddo
overlap = y
do k=1,N_st
l = order(k)
if (k /= l) then
y(1:shift2,k) = overlap(1:shift2,l)
endif
enddo
do k=1,N_st
overlap(k,1) = lambda(k)
enddo
endif
! Express eigenvectors of h in the determinant basis
! --------------------------------------------------
call dgemm('N','N', sze, N_st_diag, shift2, &
1.d0, U, size(U,1), y, size(y,1), 0.d0, U(1,shift2+1), size(U,1))
call dgemm('N','N', sze, N_st_diag, shift2, &
1.d0, W, size(W,1), y, size(y,1), 0.d0, W(1,shift2+1), size(W,1))
! Compute residual vector and davidson step
! -----------------------------------------
!$OMP PARALLEL DO DEFAULT(SHARED) PRIVATE(i,k)
do k=1,N_st_diag
do i=1,sze
U(i,shift2+k) = &
(lambda(k) * U(i,shift2+k) - W(i,shift2+k) ) &
/max(H_jj(i) - lambda (k),1.d-2)
enddo
if (k <= N_st) then
residual_norm(k) = u_dot_u(U(1,shift2+k),sze)
to_print(1,k) = lambda(k)
to_print(2,k) = residual_norm(k)
endif
enddo
!$OMP END PARALLEL DO
if ((itertot>1).and.(iter == 1)) then
!don't print
continue
else
write(*,'(1X,I3,1X,100(1X,F16.10,1X,E11.3))') iter-1, to_print(1:2,1:N_st)
endif
! Check convergence
if (iter > 1) then
converged = dabs(maxval(residual_norm(1:N_st))) < threshold_davidson
endif
do k=1,N_st
if (residual_norm(k) > 1.d8) then
print *, 'Davidson failed'
stop -1
endif
enddo
if (converged) then
exit
endif
logical, external :: qp_stop
if (qp_stop()) then
converged = .True.
exit
endif
enddo
! Re-contract U and update W
! --------------------------------
call dgemm('N','N', sze, N_st_diag, shift2, 1.d0, &
W, size(W,1), y, size(y,1), 0.d0, u_in, size(u_in,1))
do k=1,N_st_diag
do i=1,sze
W(i,k) = u_in(i,k)
enddo
enddo
call dgemm('N','N', sze, N_st_diag, shift2, 1.d0, &
U, size(U,1), y, size(y,1), 0.d0, u_in, size(u_in,1))
do k=1,N_st_diag
do i=1,sze
U(i,k) = u_in(i,k)
enddo
enddo
enddo
call nullify_small_elements(sze,N_st_diag,U,size(U,1),threshold_davidson_pt2)
do k=1,N_st_diag
do i=1,sze
u_in(i,k) = U(i,k)
enddo
enddo
do k=1,N_st_diag
energies(k) = lambda(k)
enddo
write_buffer = '======'
do i=1,N_st
write_buffer = trim(write_buffer)//' ================ ==========='
enddo
write(6,'(A)') trim(write_buffer)
write(6,'(A)') ''
call write_time(6)
deallocate(W)
deallocate ( &
residual_norm, &
U, overlap, &
h, y, s_tmp, &
lambda &
)
FREE nthreads_davidson
end

View File

@ -1,15 +1,15 @@
subroutine davidson_general_ext_rout(u_in,H_jj,energies,dim_in,sze,N_st,N_st_diag_in,converged,hcalc) subroutine davidson_general_ext_rout(u_in,H_jj,energies,sze,N_st,N_st_diag_in,converged,hcalc)
use mmap_module use mmap_module
implicit none implicit none
BEGIN_DOC BEGIN_DOC
! Davidson diagonalization with specific diagonal elements of the H matrix ! Generic Davidson diagonalization
! !
! H_jj : specific diagonal H matrix elements to diagonalize de Davidson ! H_jj : specific diagonal H matrix elements to diagonalize de Davidson
! !
! u_in : guess coefficients on the various states. Overwritten on exit ! u_in : guess coefficients on the various states. Overwritten on exit
! !
! dim_in : leftmost dimension of u_in ! sze : leftmost dimension of u_in
! !
! sze : Number of determinants ! sze : Number of determinants
! !
@ -21,9 +21,9 @@ subroutine davidson_general_ext_rout(u_in,H_jj,energies,dim_in,sze,N_st,N_st_dia
! !
! hcalc subroutine to compute W = H U (see routine hcalc_template for template of input/output) ! hcalc subroutine to compute W = H U (see routine hcalc_template for template of input/output)
END_DOC END_DOC
integer, intent(in) :: dim_in, sze, N_st, N_st_diag_in integer, intent(in) :: sze, N_st, N_st_diag_in
double precision, intent(in) :: H_jj(sze) double precision, intent(in) :: H_jj(sze)
double precision, intent(inout) :: u_in(dim_in,N_st_diag_in) double precision, intent(inout) :: u_in(sze,N_st_diag_in)
double precision, intent(out) :: energies(N_st) double precision, intent(out) :: energies(N_st)
external hcalc external hcalc
@ -157,19 +157,7 @@ subroutine davidson_general_ext_rout(u_in,H_jj,energies,dim_in,sze,N_st,N_st_dia
write(6,'(A)') write_buffer(1:6+41*N_st) write(6,'(A)') write_buffer(1:6+41*N_st)
! if (disk_based) then allocate(W(sze,N_st_diag*itermax))
! ! Create memory-mapped files for W and S
! type(c_ptr) :: ptr_w, ptr_s
! integer :: fd_s, fd_w
! call mmap(trim(ezfio_work_dir)//'davidson_w', (/int(sze,8),int(N_st_diag*itermax,8)/),&
! 8, fd_w, .False., ptr_w)
! call mmap(trim(ezfio_work_dir)//'davidson_s', (/int(sze,8),int(N_st_diag*itermax,8)/),&
! 4, fd_s, .False., ptr_s)
! call c_f_pointer(ptr_w, w, (/sze,N_st_diag*itermax/))
! call c_f_pointer(ptr_s, s, (/sze,N_st_diag*itermax/))
! else
allocate(W(sze,N_st_diag*itermax))
! endif
allocate( & allocate( &
! Large ! Large
@ -233,7 +221,6 @@ subroutine davidson_general_ext_rout(u_in,H_jj,energies,dim_in,sze,N_st,N_st_dia
if ((iter > 1).or.(itertot == 1)) then if ((iter > 1).or.(itertot == 1)) then
! Compute |W_k> = \sum_i |i><i|H|u_k> ! Compute |W_k> = \sum_i |i><i|H|u_k>
! ----------------------------------- ! -----------------------------------
! Gram-Schmidt to orthogonalize all new guess with the previous vectors ! Gram-Schmidt to orthogonalize all new guess with the previous vectors
call ortho_qr(U,size(U,1),sze,shift2) call ortho_qr(U,size(U,1),sze,shift2)
call ortho_qr(U,size(U,1),sze,shift2) call ortho_qr(U,size(U,1),sze,shift2)
@ -357,6 +344,9 @@ subroutine davidson_general_ext_rout(u_in,H_jj,energies,dim_in,sze,N_st,N_st_dia
enddo enddo
! Re-contract U and update W
! --------------------------------
call dgemm('N','N', sze, N_st_diag, shift2, 1.d0, & call dgemm('N','N', sze, N_st_diag, shift2, 1.d0, &
W, size(W,1), y, size(y,1), 0.d0, u_in, size(u_in,1)) W, size(W,1), y, size(y,1), 0.d0, u_in, size(u_in,1))
do k=1,N_st_diag do k=1,N_st_diag
@ -372,8 +362,8 @@ subroutine davidson_general_ext_rout(u_in,H_jj,energies,dim_in,sze,N_st,N_st_dia
U(i,k) = u_in(i,k) U(i,k) = u_in(i,k)
enddo enddo
enddo enddo
call ortho_qr(U,size(U,1),sze,N_st_diag) call ortho_qr(U,size(U,1),sze,N_st_diag)
call ortho_qr(U,size(U,1),sze,N_st_diag) call ortho_qr(U,size(U,1),sze,N_st_diag)
do j=1,N_st_diag do j=1,N_st_diag
k=1 k=1
do while ((k<sze).and.(U(k,j) == 0.d0)) do while ((k<sze).and.(U(k,j) == 0.d0))
@ -398,7 +388,7 @@ subroutine davidson_general_ext_rout(u_in,H_jj,energies,dim_in,sze,N_st,N_st_dia
write(6,'(A)') '' write(6,'(A)') ''
call write_time(6) call write_time(6)
deallocate(W) deallocate(W)
deallocate ( & deallocate ( &
residual_norm, & residual_norm, &

View File

@ -8,12 +8,13 @@ program test_dav
touch read_wf touch read_wf
PROVIDE threshold_davidson nthreads_davidson PROVIDE threshold_davidson nthreads_davidson
call routine call routine
call test_dav_dress
end end
subroutine routine subroutine routine
implicit none implicit none
double precision, allocatable :: u_in(:,:), H_jj(:), energies(:),h_mat(:,:) double precision, allocatable :: u_in(:,:), H_jj(:), energies(:),h_mat(:,:)
integer :: dim_in,sze,N_st,N_st_diag_in,dressing_state integer :: dim_in,sze,N_st,N_st_diag_in
logical :: converged logical :: converged
integer :: i,j integer :: i,j
external hcalc_template external hcalc_template
@ -21,9 +22,8 @@ subroutine routine
N_st_diag_in = N_states_diag N_st_diag_in = N_states_diag
sze = N_det sze = N_det
dim_in = sze dim_in = sze
dressing_state = 0
!!!! MARK THAT u_in mut dimensioned with "N_st_diag_in" as a second dimension !!!! MARK THAT u_in mut dimensioned with "N_st_diag_in" as a second dimension
allocate(u_in(dim_in,N_st_diag_in),H_jj(sze),h_mat(sze,sze),energies(N_st)) allocate(u_in(dim_in,N_st_diag_in),H_jj(sze),h_mat(sze,sze),energies(N_st_diag_in))
u_in = 0.d0 u_in = 0.d0
do i = 1, N_st do i = 1, N_st
u_in(1,i) = 1.d0 u_in(1,i) = 1.d0
@ -42,7 +42,38 @@ subroutine routine
print*,'energies = ',energies print*,'energies = ',energies
!!! hcalc_template is the routine that computes v = H u !!! hcalc_template is the routine that computes v = H u
!!! and you can use the routine "davidson_general_ext_rout" !!! and you can use the routine "davidson_general_ext_rout"
call davidson_general_ext_rout(u_in,H_jj,energies,dim_in,sze,N_st,N_st_diag_in,converged,hcalc_template) call davidson_general_ext_rout(u_in,H_jj,energies,sze,N_st,N_st_diag_in,converged,hcalc_template)
print*,'energies = ',energies print*,'energies = ',energies
end end
subroutine test_dav_dress
implicit none
double precision, allocatable :: u_in(:,:), H_jj(:), energies(:)
integer :: sze,N_st,N_st_diag_in,dressing_state
logical :: converged
integer :: i,j
external hcalc_template
double precision, allocatable :: dressing_vec(:)
integer :: idress
N_st = N_states
N_st_diag_in = N_states_diag
sze = N_det
dressing_state = 0
idress = 1
!!!! MARK THAT u_in mut dimensioned with "N_st_diag_in" as a second dimension
allocate(u_in(sze,N_st_diag_in),H_jj(sze),energies(N_st_diag_in))
allocate(dressing_vec(sze))
dressing_vec = 0.d0
u_in = 0.d0
do i = 1, N_st
u_in(1,i) = 1.d0
enddo
do i = 1, sze
H_jj(i) = H_matrix_all_dets(i,i) + nuclear_repulsion
enddo
print*,'dressing davidson '
call davidson_general_ext_rout_dressed(u_in,H_jj,energies,sze,N_st,N_st_diag_in,dressing_state,dressing_vec,idress,converged,hcalc_template)
print*,'energies(1) = ',energies(1)
end

View File

@ -508,7 +508,7 @@ subroutine H_S2_u_0_nstates_zmq(v_0,s_0,u_0,N_st,sze)
endif endif
call omp_set_max_active_levels(5) call set_multiple_levels_omp(.True.)
!$OMP PARALLEL DEFAULT(shared) NUM_THREADS(2) PRIVATE(ithread) !$OMP PARALLEL DEFAULT(shared) NUM_THREADS(2) PRIVATE(ithread)
ithread = omp_get_thread_num() ithread = omp_get_thread_num()

View File

@ -464,7 +464,8 @@ subroutine H_u_0_nstates_zmq(v_0,u_0,N_st,sze)
print *, irp_here, ': Failed in zmq_set_running' print *, irp_here, ': Failed in zmq_set_running'
endif endif
call omp_set_max_active_levels(4) call set_multiple_levels_omp(.True.)
!$OMP PARALLEL DEFAULT(shared) NUM_THREADS(2) PRIVATE(ithread) !$OMP PARALLEL DEFAULT(shared) NUM_THREADS(2) PRIVATE(ithread)
ithread = omp_get_thread_num() ithread = omp_get_thread_num()
if (ithread == 0 ) then if (ithread == 0 ) then

View File

@ -464,7 +464,8 @@ subroutine H_u_0_nstates_zmq(v_0,u_0,N_st,sze)
print *, irp_here, ': Failed in zmq_set_running' print *, irp_here, ': Failed in zmq_set_running'
endif endif
call omp_set_max_active_levels(4) call set_multiple_levels_omp(.True.)
!$OMP PARALLEL DEFAULT(shared) NUM_THREADS(2) PRIVATE(ithread) !$OMP PARALLEL DEFAULT(shared) NUM_THREADS(2) PRIVATE(ithread)
ithread = omp_get_thread_num() ithread = omp_get_thread_num()
if (ithread == 0 ) then if (ithread == 0 ) then

View File

@ -299,7 +299,7 @@ subroutine davidson_diag_csf_hjj(dets_in,u_in,H_jj,energies,dim_in,sze,sze_csf,N
shift = N_st_diag*(iter-1) shift = N_st_diag*(iter-1)
shift2 = N_st_diag*iter shift2 = N_st_diag*iter
if ((iter > 1).or.(itertot == 1)) then ! if ((iter > 1).or.(itertot == 1)) then
! Compute |W_k> = \sum_i |i><i|H|u_k> ! Compute |W_k> = \sum_i |i><i|H|u_k>
! ----------------------------------- ! -----------------------------------
@ -309,10 +309,10 @@ subroutine davidson_diag_csf_hjj(dets_in,u_in,H_jj,energies,dim_in,sze,sze_csf,N
else else
call H_u_0_nstates_openmp(W,U,N_st_diag,sze) call H_u_0_nstates_openmp(W,U,N_st_diag,sze)
endif endif
else ! else
! Already computed in update below ! ! Already computed in update below
continue ! continue
endif ! endif
if (dressing_state > 0) then if (dressing_state > 0) then
@ -508,17 +508,8 @@ subroutine davidson_diag_csf_hjj(dets_in,u_in,H_jj,energies,dim_in,sze,sze_csf,N
enddo enddo
! Re-contract U and update W ! Re-contract U
! -------------------------------- ! -------------
call dgemm('N','N', sze_csf, N_st_diag, shift2, 1.d0, &
W_csf, size(W_csf,1), y, size(y,1), 0.d0, u_in, size(u_in,1))
do k=1,N_st_diag
do i=1,sze_csf
W_csf(i,k) = u_in(i,k)
enddo
enddo
call convertWFfromCSFtoDET(N_st_diag,W_csf,W)
call dgemm('N','N', sze_csf, N_st_diag, shift2, 1.d0, & call dgemm('N','N', sze_csf, N_st_diag, shift2, 1.d0, &
U_csf, size(U_csf,1), y, size(y,1), 0.d0, u_in, size(u_in,1)) U_csf, size(U_csf,1), y, size(y,1), 0.d0, u_in, size(u_in,1))

View File

@ -349,7 +349,7 @@ subroutine davidson_diag_hjj_sjj(dets_in,u_in,H_jj,s2_out,energies,dim_in,sze,N_
shift = N_st_diag*(iter-1) shift = N_st_diag*(iter-1)
shift2 = N_st_diag*iter shift2 = N_st_diag*iter
if ((iter > 1).or.(itertot == 1)) then ! if ((iter > 1).or.(itertot == 1)) then
! Compute |W_k> = \sum_i |i><i|H|u_k> ! Compute |W_k> = \sum_i |i><i|H|u_k>
! ----------------------------------- ! -----------------------------------
@ -359,10 +359,10 @@ subroutine davidson_diag_hjj_sjj(dets_in,u_in,H_jj,s2_out,energies,dim_in,sze,N_
call H_S2_u_0_nstates_openmp(W(1,shift+1),S_d,U(1,shift+1),N_st_diag,sze) call H_S2_u_0_nstates_openmp(W(1,shift+1),S_d,U(1,shift+1),N_st_diag,sze)
endif endif
S(1:sze,shift+1:shift+N_st_diag) = real(S_d(1:sze,1:N_st_diag)) S(1:sze,shift+1:shift+N_st_diag) = real(S_d(1:sze,1:N_st_diag))
else ! else
! Already computed in update below ! ! Already computed in update below
continue ! continue
endif ! endif
if (dressing_state > 0) then if (dressing_state > 0) then

View File

@ -1,9 +1,9 @@
BEGIN_PROVIDER [ double precision, CI_energy, (N_states_diag) ] BEGIN_PROVIDER [ double precision, CI_energy, (N_states_diag) ]
implicit none implicit none
BEGIN_DOC BEGIN_DOC
! :c:data:`n_states` lowest eigenvalues of the |CI| matrix ! :c:data:`n_states` lowest eigenvalues of the |CI| matrix
END_DOC END_DOC
PROVIDE distributed_davidson
integer :: j integer :: j
character*(8) :: st character*(8) :: st
@ -247,6 +247,7 @@ subroutine diagonalize_CI
! eigenstates of the |CI| matrix. ! eigenstates of the |CI| matrix.
END_DOC END_DOC
integer :: i,j integer :: i,j
PROVIDE distributed_davidson
do j=1,N_states do j=1,N_states
do i=1,N_det do i=1,N_det
psi_coef(i,j) = CI_eigenvectors(i,j) psi_coef(i,j) = CI_eigenvectors(i,j)

View File

@ -203,7 +203,7 @@ subroutine H_S2_u_0_nstates_openmp_work_$N_int(v_t,s_t,u_t,N_st,sze,istart,iend,
integer, allocatable :: doubles(:) integer, allocatable :: doubles(:)
integer, allocatable :: singles_a(:) integer, allocatable :: singles_a(:)
integer, allocatable :: singles_b(:) integer, allocatable :: singles_b(:)
integer, allocatable :: idx(:), idx0(:) integer, allocatable :: idx(:), buffer_lrow(:), idx0(:)
integer :: maxab, n_singles_a, n_singles_b, kcol_prev integer :: maxab, n_singles_a, n_singles_b, kcol_prev
integer*8 :: k8 integer*8 :: k8
logical :: compute_singles logical :: compute_singles
@ -253,7 +253,7 @@ compute_singles=.True.
!$OMP PRIVATE(krow, kcol, tmp_det, spindet, k_a, k_b, i, & !$OMP PRIVATE(krow, kcol, tmp_det, spindet, k_a, k_b, i, &
!$OMP lcol, lrow, l_a, l_b, utl, kk, u_is_sparse, & !$OMP lcol, lrow, l_a, l_b, utl, kk, u_is_sparse, &
!$OMP buffer, doubles, n_doubles, umax, & !$OMP buffer, doubles, n_doubles, umax, &
!$OMP tmp_det2, hij, sij, idx, l, kcol_prev, & !$OMP tmp_det2, hij, sij, idx, buffer_lrow, l, kcol_prev, &
!$OMP singles_a, n_singles_a, singles_b, ratio, & !$OMP singles_a, n_singles_a, singles_b, ratio, &
!$OMP n_singles_b, k8, last_found,left,right,right_max) !$OMP n_singles_b, k8, last_found,left,right,right_max)
@ -264,7 +264,7 @@ compute_singles=.True.
singles_a(maxab), & singles_a(maxab), &
singles_b(maxab), & singles_b(maxab), &
doubles(maxab), & doubles(maxab), &
idx(maxab), utl(N_st,block_size)) idx(maxab), buffer_lrow(maxab), utl(N_st,block_size))
kcol_prev=-1 kcol_prev=-1
@ -332,18 +332,20 @@ compute_singles=.True.
l_a = psi_bilinear_matrix_columns_loc(lcol) l_a = psi_bilinear_matrix_columns_loc(lcol)
ASSERT (l_a <= N_det) ASSERT (l_a <= N_det)
!DIR$ UNROLL(8)
!DIR$ LOOP COUNT avg(50000)
do j=1,psi_bilinear_matrix_columns_loc(lcol+1) - psi_bilinear_matrix_columns_loc(lcol) do j=1,psi_bilinear_matrix_columns_loc(lcol+1) - psi_bilinear_matrix_columns_loc(lcol)
lrow = psi_bilinear_matrix_rows(l_a) lrow = psi_bilinear_matrix_rows(l_a)
ASSERT (lrow <= N_det_alpha_unique) ASSERT (lrow <= N_det_alpha_unique)
buffer(1:$N_int,j) = psi_det_alpha_unique(1:$N_int, lrow) ! hot spot buffer_lrow(j) = lrow
ASSERT (l_a <= N_det) ASSERT (l_a <= N_det)
idx(j) = l_a idx(j) = l_a
l_a = l_a+1 l_a = l_a+1
enddo enddo
do j=1,psi_bilinear_matrix_columns_loc(lcol+1) - psi_bilinear_matrix_columns_loc(lcol)
buffer(1:$N_int,j) = psi_det_alpha_unique(1:$N_int, buffer_lrow(j)) ! hot spot
enddo
j = j-1 j = j-1
call get_all_spin_singles_$N_int( & call get_all_spin_singles_$N_int( &
@ -789,7 +791,7 @@ compute_singles=.True.
end do end do
!$OMP END DO !$OMP END DO
deallocate(buffer, singles_a, singles_b, doubles, idx, utl) deallocate(buffer, singles_a, singles_b, doubles, idx, buffer_lrow, utl)
!$OMP END PARALLEL !$OMP END PARALLEL
end end

View File

@ -12,7 +12,7 @@ BEGIN_PROVIDER [ double precision, CI_energy_dressed, (N_states_diag) ]
enddo enddo
do j=1,min(N_det,N_states) do j=1,min(N_det,N_states)
write(st,'(I4)') j write(st,'(I4)') j
call write_double(6,CI_energy_dressed(j),'Energy of state '//trim(st)) call write_double(6,CI_energy_dressed(j),'Energy dressed of state '//trim(st))
call write_double(6,CI_eigenvectors_s2_dressed(j),'S^2 of state '//trim(st)) call write_double(6,CI_eigenvectors_s2_dressed(j),'S^2 of state '//trim(st))
enddo enddo
@ -21,133 +21,201 @@ END_PROVIDER
BEGIN_PROVIDER [ double precision, CI_electronic_energy_dressed, (N_states_diag) ] BEGIN_PROVIDER [ double precision, CI_electronic_energy_dressed, (N_states_diag) ]
&BEGIN_PROVIDER [ double precision, CI_eigenvectors_dressed, (N_det,N_states_diag) ] &BEGIN_PROVIDER [ double precision, CI_eigenvectors_dressed, (N_det,N_states_diag) ]
&BEGIN_PROVIDER [ double precision, CI_eigenvectors_s2_dressed, (N_states_diag) ] &BEGIN_PROVIDER [ double precision, CI_eigenvectors_s2_dressed, (N_states_diag) ]
BEGIN_DOC BEGIN_DOC
! Eigenvectors/values of the CI matrix ! Eigenvectors/values of the CI matrix
END_DOC END_DOC
implicit none implicit none
double precision :: ovrlp,u_dot_v double precision :: ovrlp,u_dot_v
integer :: i_good_state integer :: i_good_state
integer, allocatable :: index_good_state_array(:) integer, allocatable :: index_good_state_array(:)
logical, allocatable :: good_state_array(:) logical, allocatable :: good_state_array(:)
double precision, allocatable :: s2_values_tmp(:) double precision, allocatable :: s2_values_tmp(:)
integer :: i_other_state integer :: i_other_state
double precision, allocatable :: eigenvectors(:,:), eigenvectors_s2(:,:), eigenvalues(:) double precision, allocatable :: eigenvectors(:,:), eigenvectors_s2(:,:), eigenvalues(:)
integer :: i_state integer :: i_state
double precision :: e_0 double precision :: e_0
integer :: i,j,k,mrcc_state integer :: i,j,k,mrcc_state
double precision, allocatable :: s2_eigvalues(:) double precision, allocatable :: s2_eigvalues(:)
double precision, allocatable :: e_array(:) double precision, allocatable :: e_array(:)
integer, allocatable :: iorder(:) integer, allocatable :: iorder(:)
logical :: converged
logical :: do_csf
PROVIDE threshold_davidson nthreads_davidson PROVIDE threshold_davidson nthreads_davidson
! Guess values for the "N_states" states of the CI_eigenvectors_dressed ! Guess values for the "N_states" states of the CI_eigenvectors_dressed
do j=1,min(N_states,N_det) do j=1,min(N_states,N_det)
do i=1,N_det do i=1,N_det
CI_eigenvectors_dressed(i,j) = psi_coef(i,j) CI_eigenvectors_dressed(i,j) = psi_coef(i,j)
enddo enddo
enddo enddo
do j=min(N_states,N_det)+1,N_states_diag do j=min(N_states,N_det)+1,N_states_diag
do i=1,N_det do i=1,N_det
CI_eigenvectors_dressed(i,j) = 0.d0 CI_eigenvectors_dressed(i,j) = 0.d0
enddo enddo
enddo enddo
if (diag_algorithm == "Davidson") then do_csf = s2_eig .and. only_expected_s2 .and. csf_based
do j=1,min(N_states,N_det) if (diag_algorithm == "Davidson") then
do i=1,N_det
CI_eigenvectors_dressed(i,j) = psi_coef(i,j)
enddo
enddo
logical :: converged
converged = .False.
call davidson_diag_HS2(psi_det,CI_eigenvectors_dressed, CI_eigenvectors_s2_dressed,&
size(CI_eigenvectors_dressed,1), CI_electronic_energy_dressed,&
N_det,min(N_det,N_states),min(N_det,N_states_diag),N_int,1,converged)
else if (diag_algorithm == "Lapack") then do j=1,min(N_states,N_det)
do i=1,N_det
allocate (eigenvectors(size(H_matrix_dressed,1),N_det)) CI_eigenvectors_dressed(i,j) = psi_coef(i,j)
allocate (eigenvalues(N_det))
call lapack_diag(eigenvalues,eigenvectors, &
H_matrix_dressed,size(H_matrix_dressed,1),N_det)
CI_electronic_energy_dressed(:) = 0.d0
if (s2_eig) then
i_state = 0
allocate (s2_eigvalues(N_det))
allocate(index_good_state_array(N_det),good_state_array(N_det))
good_state_array = .False.
call u_0_S2_u_0(s2_eigvalues,eigenvectors,N_det,psi_det,N_int,&
N_det,size(eigenvectors,1))
do j=1,N_det
! Select at least n_states states with S^2 values closed to "expected_s2"
if(dabs(s2_eigvalues(j)-expected_s2).le.0.5d0)then
i_state +=1
index_good_state_array(i_state) = j
good_state_array(j) = .True.
endif
if(i_state.eq.N_states) then
exit
endif
enddo enddo
if(i_state .ne.0)then enddo
! Fill the first "i_state" states that have a correct S^2 value converged = .False.
do j = 1, i_state if (do_csf) then
do i=1,N_det call davidson_diag_H_csf(psi_det,CI_eigenvectors_dressed, &
CI_eigenvectors_dressed(i,j) = eigenvectors(i,index_good_state_array(j)) size(CI_eigenvectors_dressed,1),CI_electronic_energy_dressed, &
enddo N_det,N_csf,min(N_det,N_states),min(N_det,N_states_diag),N_int,1,converged)
CI_electronic_energy_dressed(j) = eigenvalues(index_good_state_array(j))
CI_eigenvectors_s2_dressed(j) = s2_eigvalues(index_good_state_array(j))
enddo
i_other_state = 0
do j = 1, N_det
if(good_state_array(j))cycle
i_other_state +=1
if(i_state+i_other_state.gt.n_states_diag)then
exit
endif
do i=1,N_det
CI_eigenvectors_dressed(i,i_state+i_other_state) = eigenvectors(i,j)
enddo
CI_electronic_energy_dressed(i_state+i_other_state) = eigenvalues(j)
CI_eigenvectors_s2_dressed(i_state+i_other_state) = s2_eigvalues(i_state+i_other_state)
enddo
else
print*,''
print*,'!!!!!!!! WARNING !!!!!!!!!'
print*,' Within the ',N_det,'determinants selected'
print*,' and the ',N_states_diag,'states requested'
print*,' We did not find any state with S^2 values close to ',expected_s2
print*,' We will then set the first N_states eigenvectors of the H matrix'
print*,' as the CI_eigenvectors_dressed'
print*,' You should consider more states and maybe ask for s2_eig to be .True. or just enlarge the CI space'
print*,''
do j=1,min(N_states_diag,N_det)
do i=1,N_det
CI_eigenvectors_dressed(i,j) = eigenvectors(i,j)
enddo
CI_electronic_energy_dressed(j) = eigenvalues(j)
CI_eigenvectors_s2_dressed(j) = s2_eigvalues(j)
enddo
endif
deallocate(index_good_state_array,good_state_array)
deallocate(s2_eigvalues)
else else
call u_0_S2_u_0(CI_eigenvectors_s2_dressed,eigenvectors,N_det,psi_det,N_int,& call davidson_diag_HS2(psi_det,CI_eigenvectors_dressed, CI_eigenvectors_s2_dressed,&
min(N_det,N_states_diag),size(eigenvectors,1)) size(CI_eigenvectors_dressed,1), CI_electronic_energy_dressed,&
! Select the "N_states_diag" states of lowest energy N_det,min(N_det,N_states),min(N_det,N_states_diag),N_int,1,converged)
do j=1,min(N_det,N_states_diag)
do i=1,N_det
CI_eigenvectors_dressed(i,j) = eigenvectors(i,j)
enddo
CI_electronic_energy_dressed(j) = eigenvalues(j)
enddo
endif endif
deallocate(eigenvectors,eigenvalues)
endif integer :: N_states_diag_save
N_states_diag_save = N_states_diag
do while (.not.converged)
double precision, allocatable :: CI_electronic_energy_tmp (:)
double precision, allocatable :: CI_eigenvectors_tmp (:,:)
double precision, allocatable :: CI_s2_tmp (:)
N_states_diag *= 2
TOUCH N_states_diag
if (do_csf) then
allocate (CI_electronic_energy_tmp (N_states_diag) )
allocate (CI_eigenvectors_tmp (N_det,N_states_diag) )
CI_electronic_energy_tmp(1:N_states_diag_save) = CI_electronic_energy_dressed(1:N_states_diag_save)
CI_eigenvectors_tmp(1:N_det,1:N_states_diag_save) = CI_eigenvectors_dressed(1:N_det,1:N_states_diag_save)
call davidson_diag_H_csf(psi_det,CI_eigenvectors_tmp, &
size(CI_eigenvectors_tmp,1),CI_electronic_energy_tmp, &
N_det,N_csf,min(N_det,N_states),min(N_det,N_states_diag),N_int,1,converged)
CI_electronic_energy_dressed(1:N_states_diag_save) = CI_electronic_energy_tmp(1:N_states_diag_save)
CI_eigenvectors_dressed(1:N_det,1:N_states_diag_save) = CI_eigenvectors_tmp(1:N_det,1:N_states_diag_save)
deallocate (CI_electronic_energy_tmp)
deallocate (CI_eigenvectors_tmp)
else
allocate (CI_electronic_energy_tmp (N_states_diag) )
allocate (CI_eigenvectors_tmp (N_det,N_states_diag) )
allocate (CI_s2_tmp (N_states_diag) )
CI_electronic_energy_tmp(1:N_states_diag_save) = CI_electronic_energy_dressed(1:N_states_diag_save)
CI_eigenvectors_tmp(1:N_det,1:N_states_diag_save) = CI_eigenvectors_dressed(1:N_det,1:N_states_diag_save)
CI_s2_tmp(1:N_states_diag_save) = CI_eigenvectors_s2_dressed(1:N_states_diag_save)
call davidson_diag_HS2(psi_det,CI_eigenvectors_tmp, CI_s2_tmp, &
size(CI_eigenvectors_tmp,1),CI_electronic_energy_tmp, &
N_det,min(N_det,N_states),min(N_det,N_states_diag),N_int,1,converged)
CI_electronic_energy_dressed(1:N_states_diag_save) = CI_electronic_energy_tmp(1:N_states_diag_save)
CI_eigenvectors_dressed(1:N_det,1:N_states_diag_save) = CI_eigenvectors_tmp(1:N_det,1:N_states_diag_save)
CI_eigenvectors_s2_dressed(1:N_states_diag_save) = CI_s2_tmp(1:N_states_diag_save)
deallocate (CI_electronic_energy_tmp)
deallocate (CI_eigenvectors_tmp)
deallocate (CI_s2_tmp)
endif
enddo
if (N_states_diag > N_states_diag_save) then
N_states_diag = N_states_diag_save
TOUCH N_states_diag
endif
else if (diag_algorithm == "Lapack") then
print *, 'Diagonalization of H using Lapack'
allocate (eigenvectors(size(H_matrix_dressed,1),N_det))
allocate (eigenvalues(N_det))
call lapack_diag(eigenvalues,eigenvectors, &
H_matrix_dressed,size(H_matrix_dressed,1),N_det)
CI_electronic_energy_dressed(:) = 0.d0
if (s2_eig) then
i_state = 0
allocate (s2_eigvalues(N_det))
allocate(index_good_state_array(N_det),good_state_array(N_det))
good_state_array = .False.
call u_0_S2_u_0(s2_eigvalues,eigenvectors,N_det,psi_det,N_int,&
N_det,size(eigenvectors,1))
do j=1,N_det
! Select at least n_states states with S^2 values closed to "expected_s2"
if(dabs(s2_eigvalues(j)-expected_s2).le.0.5d0)then
i_state +=1
index_good_state_array(i_state) = j
good_state_array(j) = .True.
endif
if(i_state.eq.N_states) then
exit
endif
enddo
if(i_state .ne.0)then
! Fill the first "i_state" states that have a correct S^2 value
do j = 1, i_state
do i=1,N_det
CI_eigenvectors_dressed(i,j) = eigenvectors(i,index_good_state_array(j))
enddo
CI_electronic_energy_dressed(j) = eigenvalues(index_good_state_array(j))
CI_eigenvectors_s2_dressed(j) = s2_eigvalues(index_good_state_array(j))
enddo
i_other_state = 0
do j = 1, N_det
if(good_state_array(j))cycle
i_other_state +=1
if(i_state+i_other_state.gt.n_states_diag)then
exit
endif
do i=1,N_det
CI_eigenvectors_dressed(i,i_state+i_other_state) = eigenvectors(i,j)
enddo
CI_electronic_energy_dressed(i_state+i_other_state) = eigenvalues(j)
CI_eigenvectors_s2_dressed(i_state+i_other_state) = s2_eigvalues(i_state+i_other_state)
enddo
else
print*,''
print*,'!!!!!!!! WARNING !!!!!!!!!'
print*,' Within the ',N_det,'determinants selected'
print*,' and the ',N_states_diag,'states requested'
print*,' We did not find any state with S^2 values close to ',expected_s2
print*,' We will then set the first N_states eigenvectors of the H matrix'
print*,' as the CI_eigenvectors_dressed'
print*,' You should consider more states and maybe ask for s2_eig to be .True. or just enlarge the CI space'
print*,''
do j=1,min(N_states_diag,N_det)
do i=1,N_det
CI_eigenvectors_dressed(i,j) = eigenvectors(i,j)
enddo
CI_electronic_energy_dressed(j) = eigenvalues(j)
CI_eigenvectors_s2_dressed(j) = s2_eigvalues(j)
enddo
endif
deallocate(index_good_state_array,good_state_array)
deallocate(s2_eigvalues)
else
call u_0_S2_u_0(CI_eigenvectors_s2_dressed,eigenvectors,N_det,psi_det,N_int,&
min(N_det,N_states_diag),size(eigenvectors,1))
! Select the "N_states_diag" states of lowest energy
do j=1,min(N_det,N_states_diag)
do i=1,N_det
CI_eigenvectors_dressed(i,j) = eigenvectors(i,j)
enddo
CI_electronic_energy_dressed(j) = eigenvalues(j)
enddo
endif
deallocate(eigenvectors,eigenvalues)
endif
END_PROVIDER END_PROVIDER

View File

@ -42,7 +42,7 @@ default: 2
[weight_selection] [weight_selection]
type: integer type: integer
doc: Weight used in the selection. 0: input state-average weight, 1: 1./(c_0^2), 2: rPT2 matching, 3: variance matching, 4: variance and rPT2 matching, 5: variance minimization and matching, 6: CI coefficients 7: input state-average multiplied by variance and rPT2 matching 8: input state-average multiplied by rPT2 matching 9: input state-average multiplied by variance matching doc: Weight used in the selection. 0: input state-average weight, 1: 1./(c_0^2), 2: PT2 matching, 3: variance matching, 4: variance and PT2 matching, 5: variance minimization and matching, 6: CI coefficients 7: input state-average multiplied by variance and PT2 matching 8: input state-average multiplied by PT2 matching 9: input state-average multiplied by variance matching
interface: ezfio,provider,ocaml interface: ezfio,provider,ocaml
default: 1 default: 1
@ -136,3 +136,8 @@ doc: If |true|, discard any Slater determinants with an interaction smaller than
interface: ezfio,provider,ocaml interface: ezfio,provider,ocaml
default: False default: False
[save_threshold]
type: Threshold
doc: Cut-off to apply to the CI coefficients when the wave function is stored
interface: ezfio,provider,ocaml
default: 1.e-14

View File

@ -268,6 +268,44 @@ subroutine set_natural_mos
soft_touch mo_occ soft_touch mo_occ
end end
subroutine save_natural_mos_canon_label
implicit none
BEGIN_DOC
! Save natural orbitals, obtained by diagonalization of the one-body density matrix in
! the |MO| basis
END_DOC
call set_natural_mos_canon_label
call nullify_small_elements(ao_num,mo_num,mo_coef,size(mo_coef,1),1.d-10)
call orthonormalize_mos
call save_mos
end
subroutine set_natural_mos_canon_label
implicit none
BEGIN_DOC
! Set natural orbitals, obtained by diagonalization of the one-body density matrix
! in the |MO| basis
END_DOC
character*(64) :: label
double precision, allocatable :: tmp(:,:)
label = "Canonical"
integer :: i,j,iorb,jorb
do i = 1, n_virt_orb
iorb = list_virt(i)
do j = 1, n_core_inact_act_orb
jorb = list_core_inact_act(j)
enddo
enddo
call mo_as_svd_vectors_of_mo_matrix_eig(one_e_dm_mo,size(one_e_dm_mo,1),mo_num,mo_num,mo_occ,label)
soft_touch mo_occ
end
subroutine set_natorb_no_ov_rot subroutine set_natorb_no_ov_rot
implicit none implicit none
BEGIN_DOC BEGIN_DOC
@ -330,12 +368,12 @@ BEGIN_PROVIDER [ double precision, c0_weight, (N_states) ]
c = maxval(psi_coef(:,i) * psi_coef(:,i)) c = maxval(psi_coef(:,i) * psi_coef(:,i))
c0_weight(i) = 1.d0/(c+1.d-20) c0_weight(i) = 1.d0/(c+1.d-20)
enddo enddo
c = 1.d0/minval(c0_weight(:)) c = 1.d0/sum(c0_weight(:))
do i=1,N_states do i=1,N_states
c0_weight(i) = c0_weight(i) * c c0_weight(i) = c0_weight(i) * c
enddo enddo
else else
c0_weight = 1.d0 c0_weight(:) = 1.d0
endif endif
END_PROVIDER END_PROVIDER
@ -352,7 +390,7 @@ BEGIN_PROVIDER [ double precision, state_average_weight, (N_states) ]
if (weight_one_e_dm == 0) then if (weight_one_e_dm == 0) then
state_average_weight(:) = c0_weight(:) state_average_weight(:) = c0_weight(:)
else if (weight_one_e_dm == 1) then else if (weight_one_e_dm == 1) then
state_average_weight(:) = 1./N_states state_average_weight(:) = 1.d0/N_states
else else
call ezfio_has_determinants_state_average_weight(exists) call ezfio_has_determinants_state_average_weight(exists)
if (exists) then if (exists) then

View File

@ -77,29 +77,31 @@ BEGIN_PROVIDER [ integer, psi_det_size ]
END_DOC END_DOC
PROVIDE ezfio_filename PROVIDE ezfio_filename
logical :: exists logical :: exists
if (mpi_master) then psi_det_size = 1
call ezfio_has_determinants_n_det(exists) PROVIDE mpi_master
if (exists) then if (read_wf) then
call ezfio_get_determinants_n_det(psi_det_size) if (mpi_master) then
else call ezfio_has_determinants_n_det(exists)
psi_det_size = 1 if (exists) then
call ezfio_get_determinants_n_det(psi_det_size)
else
psi_det_size = 1
endif
call write_int(6,psi_det_size,'Dimension of the psi arrays')
endif endif
psi_det_size = max(psi_det_size,100000) IRP_IF MPI_DEBUG
call write_int(6,psi_det_size,'Dimension of the psi arrays') print *, irp_here, mpi_rank
call MPI_BARRIER(MPI_COMM_WORLD, ierr)
IRP_ENDIF
IRP_IF MPI
include 'mpif.h'
integer :: ierr
call MPI_BCAST( psi_det_size, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)
if (ierr /= MPI_SUCCESS) then
stop 'Unable to read psi_det_size with MPI'
endif
IRP_ENDIF
endif endif
IRP_IF MPI_DEBUG
print *, irp_here, mpi_rank
call MPI_BARRIER(MPI_COMM_WORLD, ierr)
IRP_ENDIF
IRP_IF MPI
include 'mpif.h'
integer :: ierr
call MPI_BCAST( psi_det_size, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)
if (ierr /= MPI_SUCCESS) then
stop 'Unable to read psi_det_size with MPI'
endif
IRP_ENDIF
END_PROVIDER END_PROVIDER

View File

@ -9,7 +9,7 @@
double precision :: weight, r(3) double precision :: weight, r(3)
double precision :: cpu0,cpu1,nuclei_part_z,nuclei_part_y,nuclei_part_x double precision :: cpu0,cpu1,nuclei_part_z,nuclei_part_y,nuclei_part_x
call cpu_time(cpu0) ! call cpu_time(cpu0)
z_dipole_moment = 0.d0 z_dipole_moment = 0.d0
y_dipole_moment = 0.d0 y_dipole_moment = 0.d0
x_dipole_moment = 0.d0 x_dipole_moment = 0.d0
@ -26,10 +26,10 @@
enddo enddo
enddo enddo
print*,'electron part for z_dipole = ',z_dipole_moment ! print*,'electron part for z_dipole = ',z_dipole_moment
print*,'electron part for y_dipole = ',y_dipole_moment ! print*,'electron part for y_dipole = ',y_dipole_moment
print*,'electron part for x_dipole = ',x_dipole_moment ! print*,'electron part for x_dipole = ',x_dipole_moment
!
nuclei_part_z = 0.d0 nuclei_part_z = 0.d0
nuclei_part_y = 0.d0 nuclei_part_y = 0.d0
nuclei_part_x = 0.d0 nuclei_part_x = 0.d0
@ -38,28 +38,39 @@
nuclei_part_y += nucl_charge(i) * nucl_coord(i,2) nuclei_part_y += nucl_charge(i) * nucl_coord(i,2)
nuclei_part_x += nucl_charge(i) * nucl_coord(i,1) nuclei_part_x += nucl_charge(i) * nucl_coord(i,1)
enddo enddo
print*,'nuclei part for z_dipole = ',nuclei_part_z ! print*,'nuclei part for z_dipole = ',nuclei_part_z
print*,'nuclei part for y_dipole = ',nuclei_part_y ! print*,'nuclei part for y_dipole = ',nuclei_part_y
print*,'nuclei part for x_dipole = ',nuclei_part_x ! print*,'nuclei part for x_dipole = ',nuclei_part_x
!
do istate = 1, N_states do istate = 1, N_states
z_dipole_moment(istate) += nuclei_part_z z_dipole_moment(istate) += nuclei_part_z
y_dipole_moment(istate) += nuclei_part_y y_dipole_moment(istate) += nuclei_part_y
x_dipole_moment(istate) += nuclei_part_x x_dipole_moment(istate) += nuclei_part_x
enddo enddo
call cpu_time(cpu1) ! call cpu_time(cpu1)
print*,'Time to provide the dipole moment :',cpu1-cpu0 ! print*,'Time to provide the dipole moment :',cpu1-cpu0
END_PROVIDER END_PROVIDER
subroutine print_z_dipole_moment_only subroutine print_dipole_moments
implicit none implicit none
integer :: i
print*, '' print*, ''
print*, '' print*, ''
print*, '****************************************' print*, '****************************************'
print*, 'z_dipole_moment = ',z_dipole_moment write(*,'(A10)',advance='no') ' State : '
do i = 1,N_states
write(*,'(i16)',advance='no') i
end do
write(*,*) ''
write(*,'(A17,100(1pE16.8))') 'x_dipole_moment = ',x_dipole_moment
write(*,'(A17,100(1pE16.8))') 'y_dipole_moment = ',y_dipole_moment
write(*,'(A17,100(1pE16.8))') 'z_dipole_moment = ',z_dipole_moment
!print*, 'x_dipole_moment = ',x_dipole_moment
!print*, 'y_dipole_moment = ',y_dipole_moment
!print*, 'z_dipole_moment = ',z_dipole_moment
print*, '****************************************' print*, '****************************************'
end end

View File

@ -322,10 +322,7 @@ subroutine fill_H_apply_buffer_no_selection(n_selected,det_buffer,Nint,iproc)
ASSERT (sum(popcnt(H_apply_buffer(iproc)%det(:,2,i))) == elec_beta_num) ASSERT (sum(popcnt(H_apply_buffer(iproc)%det(:,2,i))) == elec_beta_num)
enddo enddo
do i=1,n_selected do i=1,n_selected
do j=1,N_int H_apply_buffer(iproc)%det(:,:,i+H_apply_buffer(iproc)%N_det) = det_buffer(:,:,i)
H_apply_buffer(iproc)%det(j,1,i+H_apply_buffer(iproc)%N_det) = det_buffer(j,1,i)
H_apply_buffer(iproc)%det(j,2,i+H_apply_buffer(iproc)%N_det) = det_buffer(j,2,i)
enddo
ASSERT (sum(popcnt(H_apply_buffer(iproc)%det(:,1,i+H_apply_buffer(iproc)%N_det)) )== elec_alpha_num) ASSERT (sum(popcnt(H_apply_buffer(iproc)%det(:,1,i+H_apply_buffer(iproc)%N_det)) )== elec_alpha_num)
ASSERT (sum(popcnt(H_apply_buffer(iproc)%det(:,2,i+H_apply_buffer(iproc)%N_det))) == elec_beta_num) ASSERT (sum(popcnt(H_apply_buffer(iproc)%det(:,2,i+H_apply_buffer(iproc)%N_det))) == elec_beta_num)
enddo enddo

View File

@ -103,13 +103,17 @@ BEGIN_PROVIDER [ double precision, expected_s2]
END_PROVIDER END_PROVIDER
BEGIN_PROVIDER [ double precision, s2_values, (N_states) ] BEGIN_PROVIDER [ double precision, s2_values, (N_states) ]
&BEGIN_PROVIDER [ double precision, s_values, (N_states) ]
implicit none implicit none
BEGIN_DOC BEGIN_DOC
! array of the averaged values of the S^2 operator on the various states ! array of the averaged values of the S^2 operator on the various states
END_DOC END_DOC
integer :: i integer :: i
call u_0_S2_u_0(s2_values,psi_coef,n_det,psi_det,N_int,N_states,psi_det_size) call u_0_S2_u_0(s2_values,psi_coef,n_det,psi_det,N_int,N_states,psi_det_size)
do i = 1, N_states
s_values(i) = 0.5d0 *(-1.d0 + dsqrt(1.d0 + 4 * s2_values(i)))
enddo
END_PROVIDER END_PROVIDER

View File

@ -438,7 +438,7 @@ subroutine bitstring_to_list_ab( string, list, n_elements, Nint)
use bitmasks use bitmasks
implicit none implicit none
BEGIN_DOC BEGIN_DOC
! Gives the inidices(+1) of the bits set to 1 in the bit string ! Gives the indices(+1) of the bits set to 1 in the bit string
! For alpha/beta determinants. ! For alpha/beta determinants.
END_DOC END_DOC
integer, intent(in) :: Nint integer, intent(in) :: Nint
@ -472,6 +472,35 @@ subroutine bitstring_to_list_ab( string, list, n_elements, Nint)
end end
!subroutine bitstring_to_list( string, list, n_elements, Nint)
! use bitmasks
! implicit none
! BEGIN_DOC
! ! Gives the indices(+1) of the bits set to 1 in the bit string
! END_DOC
! integer, intent(in) :: Nint
! integer(bit_kind), intent(in) :: string(Nint)
! integer, intent(out) :: list(Nint*bit_kind_size)
! integer, intent(out) :: n_elements
!
! integer :: i, j, ishift
! integer(bit_kind) :: l
!
! n_elements = 0
! ishift = 1
! do i=1,Nint
! l = string(i)
! do while (l /= 0_bit_kind)
! j = trailz(l)
! n_elements = n_elements + 1
! l = ibclr(l,j)
! list(n_elements) = ishift+j
! enddo
! ishift = ishift + bit_kind_size
! enddo
!
!end
subroutine i_H_j_s2(key_i,key_j,Nint,hij,s2) subroutine i_H_j_s2(key_i,key_j,Nint,hij,s2)
use bitmasks use bitmasks

View File

@ -585,7 +585,7 @@ END_PROVIDER
enddo enddo
!$OMP ENDDO !$OMP ENDDO
!$OMP END PARALLEL !$OMP END PARALLEL
call i8radix_sort(to_sort, psi_bilinear_matrix_transp_order, N_det,-1) call i8sort(to_sort, psi_bilinear_matrix_transp_order, N_det)
call iset_order(psi_bilinear_matrix_transp_rows,psi_bilinear_matrix_transp_order,N_det) call iset_order(psi_bilinear_matrix_transp_rows,psi_bilinear_matrix_transp_order,N_det)
call iset_order(psi_bilinear_matrix_transp_columns,psi_bilinear_matrix_transp_order,N_det) call iset_order(psi_bilinear_matrix_transp_columns,psi_bilinear_matrix_transp_order,N_det)
!$OMP PARALLEL DO DEFAULT(SHARED) PRIVATE(l) !$OMP PARALLEL DO DEFAULT(SHARED) PRIVATE(l)

View File

@ -8,6 +8,7 @@ BEGIN_PROVIDER [ double precision, H_matrix_all_dets,(N_det,N_det) ]
double precision :: hij double precision :: hij
integer :: degree(N_det),idx(0:N_det) integer :: degree(N_det),idx(0:N_det)
call i_H_j(psi_det(1,1,1),psi_det(1,1,1),N_int,hij) call i_H_j(psi_det(1,1,1),psi_det(1,1,1),N_int,hij)
print*,'Providing the H_matrix_all_dets ...'
!$OMP PARALLEL DO SCHEDULE(GUIDED) DEFAULT(NONE) PRIVATE(i,j,hij,degree,idx,k) & !$OMP PARALLEL DO SCHEDULE(GUIDED) DEFAULT(NONE) PRIVATE(i,j,hij,degree,idx,k) &
!$OMP SHARED (N_det, psi_det, N_int,H_matrix_all_dets) !$OMP SHARED (N_det, psi_det, N_int,H_matrix_all_dets)
do i =1,N_det do i =1,N_det
@ -18,6 +19,26 @@ BEGIN_PROVIDER [ double precision, H_matrix_all_dets,(N_det,N_det) ]
enddo enddo
enddo enddo
!$OMP END PARALLEL DO !$OMP END PARALLEL DO
print*,'H_matrix_all_dets done '
END_PROVIDER
BEGIN_PROVIDER [ double precision, H_matrix_diag_all_dets,(N_det) ]
use bitmasks
implicit none
BEGIN_DOC
! |H| matrix on the basis of the Slater determinants defined by psi_det
END_DOC
integer :: i
double precision :: hij
integer :: degree(N_det)
call i_H_j(psi_det(1,1,1),psi_det(1,1,1),N_int,hij)
!$OMP PARALLEL DO SCHEDULE(GUIDED) DEFAULT(NONE) PRIVATE(i,hij,degree) &
!$OMP SHARED (N_det, psi_det, N_int,H_matrix_diag_all_dets)
do i =1,N_det
call i_H_j(psi_det(1,1,i),psi_det(1,1,i),N_int,hij)
H_matrix_diag_all_dets(i) = hij
enddo
!$OMP END PARALLEL DO
END_PROVIDER END_PROVIDER

View File

@ -16,3 +16,8 @@ doc: Percentage of HF exchange in the DFT model
interface: ezfio,provider,ocaml interface: ezfio,provider,ocaml
default: 0. default: 0.
[mu_dft_type]
type: character*(32)
doc: type of mu(r) for rsdft [ cst ]
interface: ezfio, provider, ocaml
default: cst

View File

@ -6,3 +6,4 @@ ao_one_e_ints
ao_two_e_ints ao_two_e_ints
mo_two_e_erf_ints mo_two_e_erf_ints
ao_two_e_erf_ints ao_two_e_erf_ints
mu_of_r

View File

@ -8,3 +8,73 @@ BEGIN_PROVIDER [double precision, mu_erf_dft]
mu_erf_dft = mu_erf mu_erf_dft = mu_erf
END_PROVIDER END_PROVIDER
BEGIN_PROVIDER [double precision, mu_of_r_dft, (n_points_final_grid)]
implicit none
integer :: i
if(mu_dft_type == "Read")then
call ezfio_get_mu_of_r_mu_of_r_disk(mu_of_r_dft)
else
do i = 1, n_points_final_grid
if(mu_dft_type == "cst")then
mu_of_r_dft(i) = mu_erf_dft
else if(mu_dft_type == "hf")then
mu_of_r_dft(i) = mu_of_r_hf(i)
else if(mu_dft_type == "rsc")then
mu_of_r_dft(i) = mu_rsc_of_r(i)
else if(mu_dft_type == "grad_rho")then
mu_of_r_dft(i) = mu_grad_rho(i)
else
print*,'mu_dft_type is not of good type = ',mu_dft_type
print*,'it must be of type Read, cst, hf, rsc'
print*,'Stopping ...'
stop
endif
enddo
endif
END_PROVIDER
BEGIN_PROVIDER [double precision, mu_rsc_of_r, (n_points_final_grid)]
implicit none
integer :: i
double precision :: mu_rs_c,rho,r(3), dm_a, dm_b
do i = 1, n_points_final_grid
r(1) = final_grid_points(1,i)
r(2) = final_grid_points(2,i)
r(3) = final_grid_points(3,i)
call dm_dft_alpha_beta_at_r(r,dm_a,dm_b)
rho = dm_a + dm_b
mu_rsc_of_r(i) = mu_rs_c(rho)
enddo
END_PROVIDER
BEGIN_PROVIDER [double precision, mu_grad_rho, (n_points_final_grid)]
implicit none
integer :: i
double precision :: mu_grad_rho_func, r(3)
do i = 1, n_points_final_grid
r(1) = final_grid_points(1,i)
r(2) = final_grid_points(2,i)
r(3) = final_grid_points(3,i)
mu_grad_rho(i) = mu_grad_rho_func(r)
enddo
END_PROVIDER
BEGIN_PROVIDER [double precision, mu_of_r_dft_average]
implicit none
integer :: i
double precision :: mu_rs_c,rho,r(3), dm_a, dm_b
mu_of_r_dft_average = 0.d0
do i = 1, n_points_final_grid
r(1) = final_grid_points(1,i)
r(2) = final_grid_points(2,i)
r(3) = final_grid_points(3,i)
call dm_dft_alpha_beta_at_r(r,dm_a,dm_b)
rho = dm_a + dm_b
if(mu_of_r_dft(i).gt.1.d+3)cycle
mu_of_r_dft_average += rho * mu_of_r_dft(i) * final_weight_at_r_vector(i)
enddo
mu_of_r_dft_average = mu_of_r_dft_average / dble(elec_alpha_num + elec_beta_num)
print*,'mu_of_r_dft_average = ',mu_of_r_dft_average
END_PROVIDER

View File

@ -0,0 +1,37 @@
double precision function mu_rs_c(rho)
implicit none
double precision, intent(in) :: rho
include 'constants.include.F'
double precision :: cst_rs,alpha_rs,rs
cst_rs = (4.d0 * dacos(-1.d0)/3.d0)**(-1.d0/3.d0)
alpha_rs = 2.d0 * dsqrt((9.d0 * dacos(-1.d0)/4.d0)**(-1.d0/3.d0)) / sqpi
rs = cst_rs * rho**(-1.d0/3.d0)
mu_rs_c = alpha_rs/dsqrt(rs)
end
double precision function mu_grad_rho_func(r)
implicit none
double precision , intent(in) :: r(3)
integer :: m
double precision :: rho, dm_a, dm_b, grad_dm_a(3), grad_dm_b(3)
double precision :: eta, grad_rho(3), grad_sqr
eta = mu_erf
call density_and_grad_alpha_beta(r,dm_a,dm_b, grad_dm_a, grad_dm_b)
rho = dm_a + dm_b
do m = 1,3
grad_rho(m) = grad_dm_a(m) + grad_dm_b(m)
enddo
grad_sqr=0.d0
do m = 1,3
grad_sqr=grad_sqr+grad_rho(m)*grad_rho(m)
enddo
grad_sqr = dsqrt(grad_sqr)
if (rho<1.d-12) then
mu_grad_rho_func = 1.d-10
else
mu_grad_rho_func = eta * grad_sqr / rho
endif
end

View File

@ -91,7 +91,19 @@
enddo enddo
END_PROVIDER END_PROVIDER
BEGIN_PROVIDER[double precision, aos_lapl_in_r_array, (ao_num,n_points_final_grid,3)] BEGIN_PROVIDER [double precision, aos_lapl_in_r_array_transp, (ao_num, n_points_final_grid,3)]
implicit none
integer :: i,j,m
do i = 1, n_points_final_grid
do j = 1, ao_num
do m = 1, 3
aos_lapl_in_r_array_transp(j,i,m) = aos_lapl_in_r_array(m,j,i)
enddo
enddo
enddo
END_PROVIDER
BEGIN_PROVIDER [double precision, aos_lapl_in_r_array, (3,ao_num,n_points_final_grid)]
implicit none implicit none
BEGIN_DOC BEGIN_DOC
! aos_lapl_in_r_array(i,j,k) = value of the kth component of the laplacian of jth ao on the ith grid point ! aos_lapl_in_r_array(i,j,k) = value of the kth component of the laplacian of jth ao on the ith grid point
@ -100,20 +112,20 @@
END_DOC END_DOC
integer :: i,j,m integer :: i,j,m
double precision :: aos_array(ao_num), r(3) double precision :: aos_array(ao_num), r(3)
double precision :: aos_grad_array(ao_num,3) double precision :: aos_grad_array(3,ao_num)
double precision :: aos_lapl_array(ao_num,3) double precision :: aos_lapl_array(3,ao_num)
!$OMP PARALLEL DO & !$OMP PARALLEL DO &
!$OMP DEFAULT (NONE) & !$OMP DEFAULT (NONE) &
!$OMP PRIVATE (i,r,aos_array,aos_grad_array,aos_lapl_array,j,m) & !$OMP PRIVATE (i,r,aos_array,aos_grad_array,aos_lapl_array,j,m) &
!$OMP SHARED(aos_lapl_in_r_array,n_points_final_grid,ao_num,final_grid_points) !$OMP SHARED(aos_lapl_in_r_array,n_points_final_grid,ao_num,final_grid_points)
do m = 1, 3 do i = 1, n_points_final_grid
do i = 1, n_points_final_grid r(1) = final_grid_points(1,i)
r(1) = final_grid_points(1,i) r(2) = final_grid_points(2,i)
r(2) = final_grid_points(2,i) r(3) = final_grid_points(3,i)
r(3) = final_grid_points(3,i) call give_all_aos_and_grad_and_lapl_at_r(r,aos_array,aos_grad_array,aos_lapl_array)
call give_all_aos_and_grad_and_lapl_at_r(r,aos_array,aos_grad_array,aos_lapl_array) do j = 1, ao_num
do j = 1, ao_num do m = 1, 3
aos_lapl_in_r_array(j,i,m) = aos_lapl_array(j,m) aos_lapl_in_r_array(m,j,i) = aos_lapl_array(m,j)
enddo enddo
enddo enddo
enddo enddo

View File

@ -0,0 +1,39 @@
BEGIN_PROVIDER [ double precision, mo_grad_ints, (mo_num, mo_num,3)]
implicit none
BEGIN_DOC
! mo_grad_ints(i,j,m) = <phi_i^MO | d/dx | phi_j^MO>
END_DOC
integer :: i,j,ipoint,m
double precision :: weight
mo_grad_ints = 0.d0
do m = 1, 3
do ipoint = 1, n_points_final_grid
weight = final_weight_at_r_vector(ipoint)
do j = 1, mo_num
do i = 1, mo_num
mo_grad_ints(i,j,m) += mos_grad_in_r_array(j,ipoint,m) * mos_in_r_array(i,ipoint) * weight
enddo
enddo
enddo
enddo
END_PROVIDER
BEGIN_PROVIDER [ double precision, mo_grad_ints_transp, (3,mo_num, mo_num)]
implicit none
BEGIN_DOC
! mo_grad_ints(i,j,m) = <phi_i^MO | d/dx | phi_j^MO>
END_DOC
integer :: i,j,ipoint,m
double precision :: weight
do m = 1, 3
do j = 1, mo_num
do i = 1, mo_num
mo_grad_ints_transp(m,i,j) = mo_grad_ints(i,j,m)
enddo
enddo
enddo
END_PROVIDER

View File

@ -138,7 +138,7 @@
integer :: m integer :: m
mos_lapl_in_r_array = 0.d0 mos_lapl_in_r_array = 0.d0
do m=1,3 do m=1,3
call dgemm('N','N',mo_num,n_points_final_grid,ao_num,1.d0,mo_coef_transp,mo_num,aos_lapl_in_r_array(1,1,m),ao_num,0.d0,mos_lapl_in_r_array(1,1,m),mo_num) call dgemm('N','N',mo_num,n_points_final_grid,ao_num,1.d0,mo_coef_transp,mo_num,aos_lapl_in_r_array_transp(1,1,m),ao_num,0.d0,mos_lapl_in_r_array(1,1,m),mo_num)
enddo enddo
END_PROVIDER END_PROVIDER

View File

@ -1179,7 +1179,7 @@ subroutine bitstring_to_list_in_selection( string, list, n_elements, Nint)
use bitmasks use bitmasks
implicit none implicit none
BEGIN_DOC BEGIN_DOC
! Gives the inidices(+1) of the bits set to 1 in the bit string ! Gives the indices(+1) of the bits set to 1 in the bit string
END_DOC END_DOC
integer, intent(in) :: Nint integer, intent(in) :: Nint
integer(bit_kind), intent(in) :: string(Nint) integer(bit_kind), intent(in) :: string(Nint)

View File

@ -72,7 +72,7 @@ subroutine run_dress_slave(thread,iproce,energy)
provide psi_energy provide psi_energy
ending = dress_N_cp+1 ending = dress_N_cp+1
ntask_tbd = 0 ntask_tbd = 0
call omp_set_max_active_levels(8) call set_multiple_levels_omp(.True.)
!$OMP PARALLEL DEFAULT(SHARED) & !$OMP PARALLEL DEFAULT(SHARED) &
!$OMP PRIVATE(interesting, breve_delta_m, task_id) & !$OMP PRIVATE(interesting, breve_delta_m, task_id) &
@ -84,7 +84,7 @@ subroutine run_dress_slave(thread,iproce,energy)
zmq_socket_push = new_zmq_push_socket(thread) zmq_socket_push = new_zmq_push_socket(thread)
integer, external :: connect_to_taskserver integer, external :: connect_to_taskserver
!$OMP CRITICAL !$OMP CRITICAL
call omp_set_max_active_levels(1) call set_multiple_levels_omp(.False.)
if (connect_to_taskserver(zmq_to_qp_run_socket,worker_id,thread) == -1) then if (connect_to_taskserver(zmq_to_qp_run_socket,worker_id,thread) == -1) then
print *, irp_here, ': Unable to connect to task server' print *, irp_here, ': Unable to connect to task server'
stop -1 stop -1
@ -296,7 +296,7 @@ subroutine run_dress_slave(thread,iproce,energy)
!$OMP END CRITICAL !$OMP END CRITICAL
!$OMP END PARALLEL !$OMP END PARALLEL
call omp_set_max_active_levels(1) call set_multiple_levels_omp(.False.)
! do i=0,dress_N_cp+1 ! do i=0,dress_N_cp+1
! call omp_destroy_lock(lck_sto(i)) ! call omp_destroy_lock(lck_sto(i))
! end do ! end do

View File

@ -25,7 +25,7 @@ subroutine write_time(iunit)
ct = ct - output_cpu_time_0 ct = ct - output_cpu_time_0
call wall_time(wt) call wall_time(wt)
wt = wt - output_wall_time_0 wt = wt - output_wall_time_0
write(6,'(A,F14.6,A,F14.6,A)') & write(6,'(A,F14.2,A,F14.2,A)') &
'.. >>>>> [ WALL TIME: ', wt, ' s ] [ CPU TIME: ', ct, ' s ] <<<<< ..' '.. >>>>> [ WALL TIME: ', wt, ' s ] [ CPU TIME: ', ct, ' s ] <<<<< ..'
write(6,*) write(6,*)
end end

View File

@ -59,43 +59,43 @@ function run_stoch() {
@test "HCO" { # 12.2868s @test "HCO" { # 12.2868s
qp set_file hco.ezfio qp set_file hco.ezfio
run -113.389297812482 6.e-4 100000 run -113.393356604085 1.e-3 100000
} }
@test "H2O2" { # 12.9214s @test "H2O2" { # 12.9214s
qp set_file h2o2.ezfio qp set_file h2o2.ezfio
qp set_mo_class --core="[1-2]" --act="[3-24]" --del="[25-38]" qp set_mo_class --core="[1-2]" --act="[3-24]" --del="[25-38]"
run -151.00467 1.e-4 100000 run -151.005848404095 1.e-3 100000
} }
@test "HBO" { # 13.3144s @test "HBO" { # 13.3144s
[[ -n $TRAVIS ]] && skip [[ -n $TRAVIS ]] && skip
qp set_file hbo.ezfio qp set_file hbo.ezfio
run -100.212560384678 1.e-3 100000 run -100.214099486337 1.e-3 100000
} }
@test "H2O" { # 11.3727s @test "H2O" { # 11.3727s
[[ -n $TRAVIS ]] && skip [[ -n $TRAVIS ]] && skip
qp set_file h2o.ezfio qp set_file h2o.ezfio
run -76.2361605151999 3.e-4 100000 run -76.2361605151999 5.e-4 100000
} }
@test "ClO" { # 13.3755s @test "ClO" { # 13.3755s
[[ -n $TRAVIS ]] && skip [[ -n $TRAVIS ]] && skip
qp set_file clo.ezfio qp set_file clo.ezfio
run -534.545616787223 3.e-4 100000 run -534.546453546852 1.e-3 100000
} }
@test "SO" { # 13.4952s @test "SO" { # 13.4952s
[[ -n $TRAVIS ]] && skip [[ -n $TRAVIS ]] && skip
qp set_file so.ezfio qp set_file so.ezfio
run -26.0096209515081 1.e-3 100000 run -26.0176563764039 1.e-3 100000
} }
@test "H2S" { # 13.6745s @test "H2S" { # 13.6745s
[[ -n $TRAVIS ]] && skip [[ -n $TRAVIS ]] && skip
qp set_file h2s.ezfio qp set_file h2s.ezfio
run -398.859168655255 3.e-4 100000 run -398.859577605891 5.e-4 100000
} }
@test "OH" { # 13.865s @test "OH" { # 13.865s
@ -113,13 +113,13 @@ function run_stoch() {
@test "H3COH" { # 14.7299s @test "H3COH" { # 14.7299s
[[ -n $TRAVIS ]] && skip [[ -n $TRAVIS ]] && skip
qp set_file h3coh.ezfio qp set_file h3coh.ezfio
run -115.205191406072 3.e-4 100000 run -115.205632960026 1.e-3 100000
} }
@test "SiH3" { # 15.99s @test "SiH3" { # 15.99s
[[ -n $TRAVIS ]] && skip [[ -n $TRAVIS ]] && skip
qp set_file sih3.ezfio qp set_file sih3.ezfio
run -5.57241217753818 3.e-4 100000 run -5.57241217753818 5.e-4 100000
} }
@test "CH4" { # 16.1612s @test "CH4" { # 16.1612s
@ -132,28 +132,28 @@ function run_stoch() {
@test "ClF" { # 16.8864s @test "ClF" { # 16.8864s
[[ -n $TRAVIS ]] && skip [[ -n $TRAVIS ]] && skip
qp set_file clf.ezfio qp set_file clf.ezfio
run -559.169313755572 3.e-4 100000 run -559.169748890031 1.5e-3 100000
} }
@test "SO2" { # 17.5645s @test "SO2" { # 17.5645s
[[ -n $TRAVIS ]] && skip [[ -n $TRAVIS ]] && skip
qp set_file so2.ezfio qp set_file so2.ezfio
qp set_mo_class --core="[1-8]" --act="[9-87]" qp set_mo_class --core="[1-8]" --act="[9-87]"
run -41.5746738713298 3.e-4 100000 run -41.5746738713298 1.5e-3 100000
} }
@test "C2H2" { # 17.6827s @test "C2H2" { # 17.6827s
[[ -n $TRAVIS ]] && skip [[ -n $TRAVIS ]] && skip
qp set_file c2h2.ezfio qp set_file c2h2.ezfio
qp set_mo_class --act="[1-30]" --del="[31-36]" qp set_mo_class --act="[1-30]" --del="[31-36]"
run -12.3685464085969 3.e-4 100000 run -12.3685464085969 2.e-3 100000
} }
@test "N2" { # 18.0198s @test "N2" { # 18.0198s
[[ -n $TRAVIS ]] && skip [[ -n $TRAVIS ]] && skip
qp set_file n2.ezfio qp set_file n2.ezfio
qp set_mo_class --core="[1,2]" --act="[3-40]" --del="[41-60]" qp set_mo_class --core="[1,2]" --act="[3-40]" --del="[41-60]"
run -109.28681540699360 3.e-4 100000 run -109.287917088107 1.5e-3 100000
} }
@test "N2H4" { # 18.5006s @test "N2H4" { # 18.5006s
@ -167,7 +167,7 @@ function run_stoch() {
[[ -n $TRAVIS ]] && skip [[ -n $TRAVIS ]] && skip
qp set_file co2.ezfio qp set_file co2.ezfio
qp set_mo_class --core="[1,2]" --act="[3-30]" --del="[31-42]" qp set_mo_class --core="[1,2]" --act="[3-30]" --del="[31-42]"
run -187.968547952413 3.e-4 100000 run -187.970184372047 1.5e-3 100000
} }
@ -182,6 +182,6 @@ function run_stoch() {
[[ -n $TRAVIS ]] && skip [[ -n $TRAVIS ]] && skip
qp set_file hcn.ezfio qp set_file hcn.ezfio
qp set_mo_class --core="[1,2]" --act="[3-40]" --del="[41-55]" qp set_mo_class --core="[1,2]" --act="[3-40]" --del="[41-55]"
run -93.0771143355433 3.e-4 100000 run -93.0777619629755 1.e-3 100000
} }

View File

@ -21,7 +21,9 @@
weight = final_weight_at_r_vector(i) weight = final_weight_at_r_vector(i)
rhoa(istate) = one_e_dm_and_grad_alpha_in_r(4,i,istate) rhoa(istate) = one_e_dm_and_grad_alpha_in_r(4,i,istate)
rhob(istate) = one_e_dm_and_grad_beta_in_r(4,i,istate) rhob(istate) = one_e_dm_and_grad_beta_in_r(4,i,istate)
call ex_lda_sr(mu_erf_dft,rhoa(istate),rhob(istate),e_x,vx_a,vx_b) double precision :: mu_local
mu_local = mu_of_r_dft(i)
call ex_lda_sr(mu_local,rhoa(istate),rhob(istate),e_x,vx_a,vx_b)
energy_x_sr_lda(istate) += weight * e_x energy_x_sr_lda(istate) += weight * e_x
enddo enddo
enddo enddo
@ -48,7 +50,9 @@
weight = final_weight_at_r_vector(i) weight = final_weight_at_r_vector(i)
rhoa(istate) = one_e_dm_and_grad_alpha_in_r(4,i,istate) rhoa(istate) = one_e_dm_and_grad_alpha_in_r(4,i,istate)
rhob(istate) = one_e_dm_and_grad_beta_in_r(4,i,istate) rhob(istate) = one_e_dm_and_grad_beta_in_r(4,i,istate)
call ec_lda_sr(mu_erf_dft,rhoa(istate),rhob(istate),e_c,vc_a,vc_b) double precision :: mu_local
mu_local = mu_of_r_dft(i)
call ec_lda_sr(mu_local,rhoa(istate),rhob(istate),e_c,vc_a,vc_b)
energy_c_sr_lda(istate) += weight * e_c energy_c_sr_lda(istate) += weight * e_c
enddo enddo
enddo enddo
@ -122,8 +126,10 @@ END_PROVIDER
weight = final_weight_at_r_vector(i) weight = final_weight_at_r_vector(i)
rhoa(istate) = one_e_dm_and_grad_alpha_in_r(4,i,istate) rhoa(istate) = one_e_dm_and_grad_alpha_in_r(4,i,istate)
rhob(istate) = one_e_dm_and_grad_beta_in_r(4,i,istate) rhob(istate) = one_e_dm_and_grad_beta_in_r(4,i,istate)
call ec_lda_sr(mu_erf_dft,rhoa(istate),rhob(istate),e_c,sr_vc_a,sr_vc_b) double precision :: mu_local
call ex_lda_sr(mu_erf_dft,rhoa(istate),rhob(istate),e_x,sr_vx_a,sr_vx_b) mu_local = mu_of_r_dft(i)
call ec_lda_sr(mu_local,rhoa(istate),rhob(istate),e_c,sr_vc_a,sr_vc_b)
call ex_lda_sr(mu_local,rhoa(istate),rhob(istate),e_x,sr_vx_a,sr_vx_b)
do j =1, ao_num do j =1, ao_num
aos_sr_vc_alpha_lda_w(j,i,istate) = sr_vc_a * aos_in_r_array(j,i)*weight aos_sr_vc_alpha_lda_w(j,i,istate) = sr_vc_a * aos_in_r_array(j,i)*weight
aos_sr_vc_beta_lda_w(j,i,istate) = sr_vc_b * aos_in_r_array(j,i)*weight aos_sr_vc_beta_lda_w(j,i,istate) = sr_vc_b * aos_in_r_array(j,i)*weight
@ -147,8 +153,6 @@ END_PROVIDER
double precision :: mu,weight double precision :: mu,weight
double precision :: e_c,sr_vc_a,sr_vc_b,e_x,sr_vx_a,sr_vx_b double precision :: e_c,sr_vc_a,sr_vc_b,e_x,sr_vx_a,sr_vx_b
double precision, allocatable :: rhoa(:),rhob(:) double precision, allocatable :: rhoa(:),rhob(:)
double precision :: mu_local
mu_local = mu_erf_dft
allocate(rhoa(N_states), rhob(N_states)) allocate(rhoa(N_states), rhob(N_states))
do istate = 1, N_states do istate = 1, N_states
do i = 1, n_points_final_grid do i = 1, n_points_final_grid
@ -158,6 +162,8 @@ END_PROVIDER
weight = final_weight_at_r_vector(i) weight = final_weight_at_r_vector(i)
rhoa(istate) = one_e_dm_and_grad_alpha_in_r(4,i,istate) rhoa(istate) = one_e_dm_and_grad_alpha_in_r(4,i,istate)
rhob(istate) = one_e_dm_and_grad_beta_in_r(4,i,istate) rhob(istate) = one_e_dm_and_grad_beta_in_r(4,i,istate)
double precision :: mu_local
mu_local = mu_of_r_dft(i)
call ec_lda_sr(mu_local,rhoa(istate),rhob(istate),e_c,sr_vc_a,sr_vc_b) call ec_lda_sr(mu_local,rhoa(istate),rhob(istate),e_c,sr_vc_a,sr_vc_b)
call ex_lda_sr(mu_local,rhoa(istate),rhob(istate),e_x,sr_vx_a,sr_vx_b) call ex_lda_sr(mu_local,rhoa(istate),rhob(istate),e_x,sr_vx_a,sr_vx_b)
do j =1, ao_num do j =1, ao_num

View File

@ -36,8 +36,10 @@
grad_rho_a_b += grad_rho_a(m) * grad_rho_b(m) grad_rho_a_b += grad_rho_a(m) * grad_rho_b(m)
enddo enddo
double precision :: mu_local
mu_local = mu_of_r_dft(i)
! inputs ! inputs
call GGA_sr_type_functionals(mu_erf_dft,rho_a,rho_b,grad_rho_a_2,grad_rho_b_2,grad_rho_a_b, & ! outputs exchange call GGA_sr_type_functionals(mu_local,rho_a,rho_b,grad_rho_a_2,grad_rho_b_2,grad_rho_a_b, & ! outputs exchange
ex,vx_rho_a,vx_rho_b,vx_grad_rho_a_2,vx_grad_rho_b_2,vx_grad_rho_a_b, & ! outputs correlation ex,vx_rho_a,vx_rho_b,vx_grad_rho_a_2,vx_grad_rho_b_2,vx_grad_rho_a_b, & ! outputs correlation
ec,vc_rho_a,vc_rho_b,vc_grad_rho_a_2,vc_grad_rho_b_2,vc_grad_rho_a_b ) ec,vc_rho_a,vc_rho_b,vc_grad_rho_a_2,vc_grad_rho_b_2,vc_grad_rho_a_b )
energy_x_sr_pbe(istate) += ex * weight energy_x_sr_pbe(istate) += ex * weight
@ -135,8 +137,10 @@ END_PROVIDER
grad_rho_a_b += grad_rho_a(m) * grad_rho_b(m) grad_rho_a_b += grad_rho_a(m) * grad_rho_b(m)
enddo enddo
double precision :: mu_local
mu_local = mu_of_r_dft(i)
! inputs ! inputs
call GGA_sr_type_functionals(mu_erf_dft,rho_a,rho_b,grad_rho_a_2,grad_rho_b_2,grad_rho_a_b, & ! outputs exchange call GGA_sr_type_functionals(mu_local,rho_a,rho_b,grad_rho_a_2,grad_rho_b_2,grad_rho_a_b, & ! outputs exchange
ex,vx_rho_a,vx_rho_b,vx_grad_rho_a_2,vx_grad_rho_b_2,vx_grad_rho_a_b, & ! outputs correlation ex,vx_rho_a,vx_rho_b,vx_grad_rho_a_2,vx_grad_rho_b_2,vx_grad_rho_a_b, & ! outputs correlation
ec,vc_rho_a,vc_rho_b,vc_grad_rho_a_2,vc_grad_rho_b_2,vc_grad_rho_a_b ) ec,vc_rho_a,vc_rho_b,vc_grad_rho_a_2,vc_grad_rho_b_2,vc_grad_rho_a_b )
vx_rho_a *= weight vx_rho_a *= weight
@ -292,8 +296,10 @@ END_PROVIDER
grad_rho_a_b += grad_rho_a(m) * grad_rho_b(m) grad_rho_a_b += grad_rho_a(m) * grad_rho_b(m)
enddo enddo
double precision :: mu_local
mu_local = mu_of_r_dft(i)
! inputs ! inputs
call GGA_sr_type_functionals(mu_erf_dft,rho_a,rho_b,grad_rho_a_2,grad_rho_b_2,grad_rho_a_b, & ! outputs exchange call GGA_sr_type_functionals(mu_local,rho_a,rho_b,grad_rho_a_2,grad_rho_b_2,grad_rho_a_b, & ! outputs exchange
ex,vx_rho_a,vx_rho_b,vx_grad_rho_a_2,vx_grad_rho_b_2,vx_grad_rho_a_b, & ! outputs correlation ex,vx_rho_a,vx_rho_b,vx_grad_rho_a_2,vx_grad_rho_b_2,vx_grad_rho_a_b, & ! outputs correlation
ec,vc_rho_a,vc_rho_b,vc_grad_rho_a_2,vc_grad_rho_b_2,vc_grad_rho_a_b ) ec,vc_rho_a,vc_rho_b,vc_grad_rho_a_2,vc_grad_rho_b_2,vc_grad_rho_a_b )
vx_rho_a *= weight vx_rho_a *= weight

View File

@ -98,7 +98,7 @@ subroutine print_summary(e_,pt2_data,pt2_data_err,n_det_,n_configuration_,n_st,s
enddo enddo
endif endif
call print_energy_components() ! call print_energy_components()
end subroutine end subroutine

View File

@ -17,7 +17,7 @@ program rs_ks_scf
print*, '**************************' print*, '**************************'
print*, 'mu_erf_dft = ',mu_erf_dft print*, 'mu_erf_dft = ',mu_erf_dft
print*, '**************************' print*, '**************************'
call check_coherence_functional ! call check_coherence_functional
call create_guess call create_guess
call orthonormalize_mos call orthonormalize_mos
call run call run

View File

@ -1,6 +1,9 @@
subroutine give_all_mos_at_r(r,mos_array) subroutine give_all_mos_at_r(r,mos_array)
implicit none implicit none
BEGIN_DOC
! mos_array(i) = ith MO function evaluated at "r"
END_DOC
double precision, intent(in) :: r(3) double precision, intent(in) :: r(3)
double precision, intent(out) :: mos_array(mo_num) double precision, intent(out) :: mos_array(mo_num)
double precision :: aos_array(ao_num) double precision :: aos_array(ao_num)

View File

@ -7,7 +7,7 @@ subroutine hcore_guess
label = 'Guess' label = 'Guess'
call mo_as_eigvectors_of_mo_matrix(mo_one_e_integrals, & call mo_as_eigvectors_of_mo_matrix(mo_one_e_integrals, &
size(mo_one_e_integrals,1), & size(mo_one_e_integrals,1), &
size(mo_one_e_integrals,2),label,1,.false.) size(mo_one_e_integrals,2),label,1,.true.)
call nullify_small_elements(ao_num, mo_num, mo_coef, size(mo_coef,1), 1.d-12 ) call nullify_small_elements(ao_num, mo_num, mo_coef, size(mo_coef,1), 1.d-12 )
call save_mos call save_mos
TOUCH mo_coef mo_label TOUCH mo_coef mo_label

View File

@ -302,21 +302,21 @@ end
integer(key_kind) :: idx integer(key_kind) :: idx
double precision :: tmp double precision :: tmp
icount = 1 ! Avoid division by zero !icount = 1 ! Avoid division by zero
do j=1,mo_num !do j=1,mo_num
do i=1,j-1 ! do i=1,j-1
call two_e_integrals_index(i,j,j,i,idx) ! call two_e_integrals_index(i,j,j,i,idx)
!DIR$ FORCEINLINE ! !DIR$ FORCEINLINE
call map_get(mo_integrals_map,idx,tmp) ! call map_get(mo_integrals_map,idx,tmp)
banned_excitation(i,j) = dabs(tmp) < 1.d-14 ! banned_excitation(i,j) = dabs(tmp) < 1.d-14
banned_excitation(j,i) = banned_excitation(i,j) ! banned_excitation(j,i) = banned_excitation(i,j)
if (banned_excitation(i,j)) icount = icount+2 ! if (banned_excitation(i,j)) icount = icount+2
enddo ! enddo
enddo !enddo
use_banned_excitation = (mo_num*mo_num) / icount <= 100 !1% !use_banned_excitation = (mo_num*mo_num) / icount <= 100 !1%
if (use_banned_excitation) then !if (use_banned_excitation) then
print *, 'Using sparsity of exchange integrals' ! print *, 'Using sparsity of exchange integrals'
endif !endif
END_PROVIDER END_PROVIDER

View File

@ -10,7 +10,7 @@ type:integer
interface: ezfio,provider interface: ezfio,provider
[pseudo_n_k] [pseudo_n_k]
doc: Number of gaussians in the local component doc: Powers of r - 2 in the local component
type: integer type: integer
interface: ezfio,provider interface: ezfio,provider
size: (nuclei.nucl_num,pseudo.pseudo_klocmax) size: (nuclei.nucl_num,pseudo.pseudo_klocmax)
@ -38,7 +38,7 @@ type:integer
interface: ezfio,provider interface: ezfio,provider
[pseudo_n_kl] [pseudo_n_kl]
doc: Number of functions in the non-local component doc: Powers of r - 2 in the non-local component
type: integer type: integer
interface: ezfio,provider interface: ezfio,provider
size: (nuclei.nucl_num,pseudo.pseudo_kmax,0:pseudo.pseudo_lmax) size: (nuclei.nucl_num,pseudo.pseudo_kmax,0:pseudo.pseudo_lmax)

View File

@ -2,3 +2,4 @@ fci
mo_two_e_erf_ints mo_two_e_erf_ints
aux_quantities aux_quantities
hartree_fock hartree_fock
two_body_rdm

View File

@ -52,8 +52,8 @@ program molden
l += 1 l += 1
if (l > ao_num) exit if (l > ao_num) exit
enddo enddo
write(i_unit_output,*)''
enddo enddo
write(i_unit_output,*)''
enddo enddo

Some files were not shown because too many files have changed in this diff Show More