Aurélien Delval
26bbd6f341
Start work on cuBLAS implementation
...
TODO Replace CPU BLAS calls with cuBLAS calls (will probably require writing a Fortran interface to the functions we're interested in, at least the DGEMMs)
2022-04-01 09:19:56 +02:00
Aurélien Delval
9428eaa19e
Implement computation of tmp_c and dtmp_c in OpenACC
...
These two kernels seem to give a good speedup over the CPU BLAS
versions. However, the current GPU implementation of factor_een_deriv seems to
be slightly slower than the CPU version (on the tested machine).
TODO:
- Try to improve the factor_een_deriv GPU implementation
- Try out a cuBLAS implementation of tmp_c and dtmp_c
2022-03-30 16:16:06 +02:00
Aurelien Delval
99306473a4
Start OpenACC implementation in Jastrow, including compute_dtmp_c
2022-03-30 09:01:32 +02:00
Aurelien Delval
383c6ac78a
Add OFFLOAD_FLAGS, OFFLOAD_CFLAGS and OFFLOAD_FCFLAGS vars to configure
2022-03-28 07:58:01 +02:00
Aurelien Delval
bcc49ca312
Minor fixes to previous commit
...
TODO Start modifying dedicated function to implement offloading
Also, as of now, Fortran preprocessor flags must be passed manually;
we need to manage this in configure.ac in the future. For now, when
using gfortran, pass FCFLAGS="-cpp -DWITH_OPENMP_OFFLOAD" to
enable offloading.
2022-03-25 13:03:35 +01:00
Aurelien Delval
5e3231e7e3
Add selection mechanism for offload mode in Jastrow
...
This system adds an additional field to the QMCkl context to store the
offload mode currently in use for each kernel (in this commit, this has
been implemented for Jastrow as an example). This will be useful to test
different offloading versions that can be easily toggled on/off at
compilation and at runtime.
2022-03-24 16:35:29 +01:00
5ecb1d6326
Faster AOs
2022-03-21 18:32:39 +01:00
9124c9209a
Merge branch 'master' of github.com:TREX-CoE/qmckl
2022-03-11 13:16:48 +01:00
5b6f530255
Fix debug build
...
Added missing preprocessor wrapper
2022-03-01 14:17:38 +01:00
8b7b56b57b
Fix broken build
...
Recent HPC-related additions break the current build (make) process because the HPC-related functions are not wrapped in a preprocessor ifdef guard.
2022-02-28 22:28:11 +01:00
22cd823edf
Working on generalized contractions
2022-02-27 23:31:52 +01:00
26fe759209
Added examples.org
2022-02-27 12:35:58 +01:00
5e35df226a
Fortran interface
2022-02-27 11:18:26 +01:00
ad86cb7d67
Working on HPC version of AOs
2022-02-25 20:39:20 +01:00
b6a31b8c58
Optimize AOs
2022-02-25 16:30:16 +01:00
ff526a18cb
Fix Clang build
2022-02-25 13:57:13 +01:00
1a5b76157b
Updated documentation
2022-02-24 19:06:19 +01:00
d919c53c42
Fix bug in HPC AOs
2022-02-19 19:24:18 +01:00
73399e24ec
Fix Fortran strings in trexio interface
2022-02-18 01:24:37 +01:00
c93e7828c5
Added qmckl_context_touch for benchmarking
2022-02-17 22:29:53 +01:00
22e281560e
Accelerate AOs in HPC
2022-02-17 15:37:57 +01:00
41c0effa10
Accelerated AOs in HPC
2022-02-17 12:36:16 +01:00
7fe73e0104
Fix bug in fast AOs
2022-02-17 01:36:45 +01:00
cc4d0f62f8
Fix CI build
2022-02-16 19:49:05 +01:00
733d941c30
Optimized polynomials
2022-02-16 19:40:14 +01:00
e90e9a531c
Added HPC version of polynomials
2022-02-16 15:14:41 +01:00
7ab099f4f5
Prepare polynomials for HPC
2022-02-16 01:12:42 +01:00
1c681d4d7e
Rewrote AOs HPC in C
2022-02-16 00:21:37 +01:00
d83dad53cf
OpenMP in HPC version
2022-02-15 16:42:47 +01:00
685b7201fc
Accelerated AOs
2022-02-15 00:44:47 +01:00
v1j4y
8ed7a8b672
Added dim to factor_een and factor_een_deriv_e.
2022-02-11 17:35:07 +01:00
v1j4y
e2a678cc5c
Cleaned tmp_c and dtmp_c.
2022-02-11 17:31:17 +01:00
v1j4y
2f05df5109
Fixed een_rescaled_n_deriv_e.
2022-02-11 17:30:15 +01:00
v1j4y
367d0ff108
Fixed een_rescaled_n.
2022-02-11 17:19:36 +01:00
v1j4y
cf005084f1
Fixed een_rescaled_e_deriv_d.
2022-02-11 17:06:17 +01:00
v1j4y
a7ec3585a7
Reorder indices for een_rescaled_e.
2022-02-11 16:19:31 +01:00
vijay
6e4b7f6722
Merge branch 'master' into reorder_indices_jastrow
2022-02-11 16:13:44 +01:00
bac6bf9cb8
Merge branch 'master' of github.com:TREX-CoE/qmckl
2022-02-11 16:07:37 +01:00
dcb392c0af
Swap indices 1..5 with points in AOs/MOs
2022-02-11 16:07:25 +01:00
v1j4y
2c7a1eb2c6
Fix factor_en_deriv_e.
2022-02-11 16:06:19 +01:00
vijay
82c5f54573
Merge branch 'master' into reorder_indices_jastrow
2022-02-11 15:58:56 +01:00
v1j4y
1bb1e1f7d3
Fix bug in calculation of en_distance_rescaled_deriv_e.
2022-02-11 15:50:58 +01:00
v1j4y
f22e2b1d72
Working on factor_en.
2022-02-11 15:45:16 +01:00
v1j4y
19ad64a80b
Merge branch 'reorder_indices_jastrow' of https://github.com/v1j4y/qmckl into reorder_indices_jastrow
2022-02-11 15:38:15 +01:00
v1j4y
3348781cc2
Fixed ee_distances_rescaled_deriv_e.
2022-02-11 15:37:55 +01:00
vijay
fdf25fd6fa
Merge branch 'master' into reorder_indices_jastrow
2022-02-11 15:37:27 +01:00
v1j4y
88e2f62d7f
Fixed ee_distance_rescaled index order.
2022-02-11 15:36:08 +01:00
v1j4y
71b0bbfaff
Fix test for drift.
2022-02-11 15:28:11 +01:00
v1j4y
da3c8c7cf9
Working on ee_distance_deriv_e.
2022-02-11 15:27:18 +01:00
v1j4y
fa535bdcd1
Added size to factor_ee.
2022-02-11 15:17:57 +01:00