mirror of https://github.com/TREX-CoE/qmckl.git synced 2025-01-10 13:08:29 +01:00
Commit Graph

511 Commits

Author SHA1 Message Date
Gianfranco Abrusci
0a3f427ace removed unused variable in doc and hpc of compute_factor_ee_deriv_e 2022-04-07 16:21:29 +02:00
Gianfranco Abrusci
61495786db merged gpu with compute_factor_ee_deriv_e 2022-04-07 15:51:50 +02:00
Gianfranco Abrusci
12ccb09b86 test passed 2022-04-07 15:41:22 +02:00
Aurelien Delval
3cd30bc8f3 Fix OpenACC and OpenMP implementations 2022-04-07 13:57:20 +02:00
7dc02571e9 Fix build 2022-04-07 13:33:50 +02:00
Max Hoffer
7aad2a79a2 Merge branch 'gpu' into gpu 2022-04-06 17:17:16 +02:00
9cef7048d3 Fix CI 2022-04-06 17:10:23 +02:00
hoffer
fe277b7a6e Ok for openmp and Cublas 2022-04-06 17:04:00 +02:00
88e8404b2a Merge branch 'gpu' of github.com:TREX-CoE/qmckl into gpu 2022-04-06 16:38:19 +02:00
cc5f6914f6 Cleaning 2022-04-06 16:26:35 +02:00
hoffer
3b5221531c Add openmp and cublas 2022-04-06 16:20:29 +02:00
Gianfranco Abrusci
e496667189 debugging factor_ee_deriv_e 2022-04-06 15:59:12 +02:00
Gianfranco Abrusci
ff6d2e17f2 Merge branch 'gpu' into jastrow_hpc 2022-04-06 14:13:24 +02:00
Gianfranco Abrusci
b79a23897d qmckl_compute_een_rescaled_e_hpc (c version) working 2022-04-06 14:01:13 +02:00
0d5d14b8e4 Fix openacc 2022-04-06 11:51:36 +02:00
hoffer
39bcc569e0 Start implementing cublas 2022-04-06 11:16:17 +02:00
0966e1e2b1 Fix OpenACC 2022-04-06 10:42:00 +02:00
2323
72fad819bf Fix flags 2022-04-06 10:03:56 +02:00
a3a1cc6428 Merge branch 'gpu' of github.com:TREX-CoE/qmckl into gpu 2022-04-05 16:52:43 +02:00
c3424216de Fix info 2022-04-05 16:52:35 +02:00
Aurélien Delval
63c7f8ea72 Replace placeholder cuBLAS kernels with new C HPC implementation 2022-04-05 16:29:52 +02:00
Aurélien Delval
0ce0a93522 Fix preprocessor else and remove old cuBLAS interface 2022-04-05 14:37:57 +02:00
Aurélien Delval
eb71a752f5 Fixed naive GPU kernels and ignored variable issue 2022-04-05 14:28:35 +02:00
Gianfranco Abrusci
586eb92801 compute_cord_vect_full done 2022-04-05 14:23:20 +02:00
Aurélien Delval
bc43113b6f Merge branch 'gpu' into master 2022-04-05 11:46:12 +02:00
94035929e4 Fixed cppcheck 2022-04-05 11:45:02 +02:00
Aurélien Delval
0e43d33a1d Merge branch 'gpu' into master 2022-04-05 11:39:16 +02:00
6fb261d635 warnings 2022-04-05 11:15:42 +02:00
731fded4a8 warnings 2022-04-05 11:03:30 +02:00
Aurélien Delval
98097e8fa7 Convert GPU implementations to C
TODO: Fix naive implementation which seems to be incorrect (probably an
issue with indexing)
2022-04-05 11:02:08 +02:00
511eba5843 Fixed dgemm bug 2022-04-05 09:56:13 +02:00
bcdbc49d5f Cleaning 2022-04-04 23:53:58 +02:00
dd045452f6 Fixed documentation 2022-04-04 17:30:38 +02:00
1f9ea610d4 Moved C version of Jastrow into HPC 2022-04-04 16:56:33 +02:00
Aurélien Delval
84013a5f76 Cleanup before merging into QMCkl's GPU branch 2022-04-04 12:12:11 +02:00
7e56b3e2ed Merge branch 'master' into gpu 2022-04-04 12:11:57 +02:00
bac1eb33f0 Fixed configure for NVIDIA compilers 2022-04-04 12:11:26 +02:00
Gianfranco Abrusci
35e15205df Merge branch 'master' into jastrow_c 2022-04-04 11:22:17 +02:00
Aurélien Delval
26bbd6f341 Start work on cuBLAS implementation
TODO: Replace CPU BLAS calls with cuBLAS calls (will probably require writing a Fortran interface to the functions we're interested in, at least the DGEMMs)
2022-04-01 09:19:56 +02:00
Aurélien Delval
9428eaa19e Implement computation of tmp_c and dtmp_c in OpenACC
These 2 kernels seem to give good speedup compared to the CPU BLAS
versions. However, the current GPU implementation of factor_een_deriv seems to
be slightly slower (on the tested machine).

TODO:
- Try to improve the factor_een_deriv GPU implementation
- Try out a cuBLAS implementation of tmp_c and dtmp_c
2022-03-30 16:16:06 +02:00
Aurelien Delval
99306473a4 Start OpenACC implementation in Jastrow, including compute_dtmp_c 2022-03-30 09:01:32 +02:00
91811079d3 Fixed bugs. Travis OK. 2022-03-28 18:29:29 +02:00
b9cd2ed1ab Fix type error 2022-03-28 18:26:20 +02:00
bab87884cd Accelerated HPC AO->MO transformation 2022-03-28 17:58:03 +02:00
1b0bfd40be HPC version of AO->MO transformation 2022-03-28 17:37:50 +02:00
9b1f648437 Accelerated AO->MO transformation 2022-03-28 16:53:36 +02:00
Aurelien Delval
383c6ac78a Add OFFLOAD_FLAGS, OFFLOAD_CFLAGS and OFFLOAD_FCFLAGS vars to configure 2022-03-28 07:58:01 +02:00
Aurelien Delval
bcc49ca312 Minor fixes to previous commit
TODO: Start modifying dedicated functions to implement offloading

Also, as of now, Fortran preprocessor flags must be passed manually;
we need to manage this in configure.ac in the future. For now, when
using gfortran, you should pass FCFLAGS="-cpp -DWITH_OPENMP_OFFLOAD" to
enable offloading.
2022-03-25 13:03:35 +01:00
Aurelien Delval
5e3231e7e3 Add selection mechanism for offload mode in Jastrow
This system adds an additional field to the QMCkl context to store the
offload mode currently in use for each kernel (in this commit, this has
been implemented for Jastrow as an example). This will be useful to test
different offloading versions that can be easily toggled on/off at
compilation and at runtime.
2022-03-24 16:35:29 +01:00
5ecb1d6326 Faster AOs 2022-03-21 18:32:39 +01:00
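
Note on commits 5e3231e7e3 and bcc49ca312: the offload-mode selection they describe (a per-kernel field in the context, toggled at compilation and at runtime) could look roughly like the C sketch below. This is an illustration under stated assumptions, not the code from the repository. Only the macro name WITH_OPENMP_OFFLOAD comes from the commit message; the type and function names (offload_mode, demo_context, demo_set_jastrow_offload, demo_jastrow_kernel) are hypothetical, and a plain array sum stands in for a real Jastrow kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical offload modes; names are illustrative, not QMCkl's. */
typedef enum {
    OFFLOAD_NONE = 0,   /* plain CPU path, always available */
    OFFLOAD_OPENMP      /* OpenMP target offload, if compiled in */
} offload_mode;

/* Hypothetical context: one offload-mode field per kernel family. */
typedef struct {
    offload_mode jastrow_offload;
} demo_context;

/* Compile-time side: a mode is usable only if its code path was built. */
static bool offload_mode_available(offload_mode mode)
{
    switch (mode) {
    case OFFLOAD_NONE:
        return true;
    case OFFLOAD_OPENMP:
#ifdef WITH_OPENMP_OFFLOAD
        return true;
#else
        return false;
#endif
    }
    return false;
}

/* Runtime side: returns 0 on success, -1 if ctx is NULL or the mode
 * was not compiled in (the stored mode is then left unchanged). */
static int demo_set_jastrow_offload(demo_context *ctx, offload_mode mode)
{
    if (ctx == NULL || !offload_mode_available(mode)) {
        return -1;
    }
    ctx->jastrow_offload = mode;
    return 0;
}

/* CPU reference path: sum of n doubles (stand-in for a real kernel). */
static double demo_sum_cpu(const double *x, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i) {
        s += x[i];
    }
    return s;
}

/* Dispatch on the mode stored in the context; falls back to the CPU. */
static double demo_jastrow_kernel(const demo_context *ctx,
                                  const double *x, size_t n)
{
    assert(ctx != NULL);
    switch (ctx->jastrow_offload) {
#ifdef WITH_OPENMP_OFFLOAD
    case OFFLOAD_OPENMP: {
        double s = 0.0;
        #pragma omp target teams distribute parallel for \
            reduction(+:s) map(to: x[0:n]) map(tofrom: s)
        for (size_t i = 0; i < n; ++i) {
            s += x[i];
        }
        return s;
    }
#endif
    default:
        return demo_sum_cpu(x, n);
    }
}
```

In a build without -DWITH_OPENMP_OFFLOAD, demo_set_jastrow_offload(&ctx, OFFLOAD_OPENMP) returns -1 and the kernel keeps using the CPU path, so the same caller code works whether or not offloading was compiled in.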