mirror of https://github.com/TREX-CoE/qmckl.git synced 2024-11-05 13:44:07 +01:00
Commit Graph

1124 Commits

Author SHA1 Message Date
hoffer
fe277b7a6e Ok for openmp and Cublas 2022-04-06 17:04:00 +02:00
88e8404b2a Merge branch 'gpu' of github.com:TREX-CoE/qmckl into gpu 2022-04-06 16:38:19 +02:00
cc5f6914f6 Cleaning 2022-04-06 16:26:35 +02:00
hoffer
3b5221531c Add openmp and cublas 2022-04-06 16:20:29 +02:00
4224991b12 Merge pull request #74 from GianFree/jastrow_hpc
Jastrow hpc
2022-04-06 16:06:19 +02:00
Gianfranco Abrusci
e496667189 debugging factor_ee_deriv_e 2022-04-06 15:59:12 +02:00
Gianfranco Abrusci
ff6d2e17f2 Merge branch 'gpu' into jastrow_hpc 2022-04-06 14:13:24 +02:00
Gianfranco Abrusci
b79a23897d qmckl_compute_een_rescaled_e_hpc (c version) working 2022-04-06 14:01:13 +02:00
6c7634038f Improve configure 2022-04-06 13:48:37 +02:00
0d5d14b8e4 Fix openacc 2022-04-06 11:51:36 +02:00
hoffer
39bcc569e0 Start implementing cublas 2022-04-06 11:16:17 +02:00
0966e1e2b1 Fix OpenACC 2022-04-06 10:42:00 +02:00
2323
72fad819bf Fix flags 2022-04-06 10:03:56 +02:00
2323
f02e761b79 Fixed configure.ac for GPUs 2022-04-05 19:31:11 +02:00
2323
08f01ece89 Fix configure 2022-04-05 17:57:56 +02:00
0489831e18 Simplified configure 2022-04-05 17:06:29 +02:00
a3a1cc6428 Merge branch 'gpu' of github.com:TREX-CoE/qmckl into gpu 2022-04-05 16:52:43 +02:00
c3424216de Fix info 2022-04-05 16:52:35 +02:00
Aurélien Delval
63c7f8ea72 Replace placeholder cuBLAS kernels with new C HPC implementation 2022-04-05 16:29:52 +02:00
f8e6d5f06b Merge pull request #72 from PurplePachyderm/master
Merge in-progress work of GPU ports
2022-04-05 14:44:30 +02:00
Aurélien Delval
0ce0a93522 Fix preprocessor else and remove old cuBLAS interface 2022-04-05 14:37:57 +02:00
Aurélien Delval
eb71a752f5 Fixed naive GPU kernels and ignored variable issue 2022-04-05 14:28:35 +02:00
Gianfranco Abrusci
586eb92801 compute_cord_vect_full done 2022-04-05 14:23:20 +02:00
Aurélien Delval
bc43113b6f Merge branch 'gpu' into master 2022-04-05 11:46:12 +02:00
2f26ccd4f0 Merge branch 'gpu' of github.com:TREX-CoE/qmckl into gpu 2022-04-05 11:45:11 +02:00
94035929e4 Fixed cppcheck 2022-04-05 11:45:02 +02:00
c7dd46da05 Fixed cppcheck 2022-04-05 11:44:17 +02:00
Aurélien Delval
0e43d33a1d Merge branch 'gpu' into master 2022-04-05 11:39:16 +02:00
6fb261d635 warnings 2022-04-05 11:15:42 +02:00
731fded4a8 warnings 2022-04-05 11:03:30 +02:00
Aurélien Delval
98097e8fa7 Convert GPU implementations to C
TODO : Fix naive implementation which seems to be incorrect (probably an
issue with indexing)
2022-04-05 11:02:08 +02:00
hoffer
508b294190 Fix flag for nvc and nvfortran 2022-04-05 10:07:25 +02:00
511eba5843 Fixed dgemm bug 2022-04-05 09:56:13 +02:00
bcdbc49d5f Cleaning 2022-04-04 23:53:58 +02:00
dd045452f6 Fixed documentation 2022-04-04 17:30:38 +02:00
2a13d8e18d Merge branch 'gpu' of github.com:TREX-CoE/qmckl into gpu 2022-04-04 16:56:39 +02:00
1f9ea610d4 Moved C version of Jastrow into HPC 2022-04-04 16:56:33 +02:00
hoffer
31a05c47e2 Add flags for nvc and nvfortran to support offload 2022-04-04 12:41:00 +02:00
Aurélien Delval
84013a5f76 Cleanup before merging into QMCkl's GPU branch 2022-04-04 12:12:11 +02:00
7e56b3e2ed Merge branch 'master' into gpu 2022-04-04 12:11:57 +02:00
bac1eb33f0 Fixed configure for Nvidian compilers 2022-04-04 12:11:26 +02:00
9f03c32e20 Merge pull request #70 from GianFree/jastrow_c
Jastrow c
2022-04-04 11:55:33 +02:00
Gianfranco Abrusci
35e15205df Merge branch 'master' into jastrow_c 2022-04-04 11:22:17 +02:00
Aurélien Delval
1173bb2586 Update configure.ac with cuBLAS support
(forgotten in last commit)
2022-04-01 17:56:27 +02:00
Aurélien Delval
26bbd6f341 Start work on cuBLAS implementation
TODO Replace CPU BLAS calls by cuBLAS calls (will probably require to write a Fortran to the functions we're interested in, at least DGEMMs)
2022-04-01 09:19:56 +02:00
Aurélien Delval
9428eaa19e Implement computation of tmp_c and dtmp_c in OpenACC
These 2 kernels seem to give good speedup compared to the CPU BLAS
versions. However, the current GPU implementation of factor_een_deriv seems to
be slightly slower (on the tested machine).

TODO:
- Try to improve factor_een_deriv GPU implem
- Try out a cuBLAS implementation of tmp_c and dtmp_c
2022-03-30 16:16:06 +02:00
Aurelien Delval
99306473a4 Start OpenACC implementation in Jastro, including compute_dtmp_c 2022-03-30 09:01:32 +02:00
91811079d3 Fixed bugs. Travis OK. 2022-03-28 18:29:29 +02:00
b9cd2ed1ab Fix type error 2022-03-28 18:26:20 +02:00
bab87884cd Accelerated HPC AO->MO transformation 2022-03-28 17:58:03 +02:00