d1e88ad475
Fixed efence compilation
2022-05-06 11:29:46 +02:00
2ea9e50421
Fixed cppcheck
2022-05-06 00:15:40 +02:00
Aurelien Delval
ad531dddf9
Configure cuBLAS with --enable-gpu and clean code
2022-04-08 11:11:15 +02:00
Max Hoffer
9b806aa071
Merge branch 'gpu' into gpu
2022-04-08 10:43:42 +02:00
hoffer
d4f0ccee3b
Add cublas batch Dgemm
2022-04-08 10:44:48 +02:00
07cc64bb31
Changed enable-cublas into with-cublas
2022-04-08 10:32:38 +02:00
hoffer
69b9e0fb89
Add cublas batched
2022-04-07 18:44:59 +02:00
Gianfranco Abrusci
4ee83a48d0
Merge branch 'gpu' into jastrow_hpc
2022-04-07 18:43:11 +02:00
185c1c3cb7
Merge branch 'gpu' of github.com:TREX-CoE/qmckl into gpu
Conflicts:
org/qmckl_jastrow.org
2022-04-07 17:07:41 +02:00
47d63aa9d3
Fix cublas
2022-04-07 17:02:36 +02:00
Gianfranco Abrusci
0a3f427ace
removed unused variable in doc and hpc of compute_factor_ee_deriv_e
2022-04-07 16:21:29 +02:00
Gianfranco Abrusci
61495786db
merged gpu with compute_factor_ee_deriv_e
2022-04-07 15:51:50 +02:00
Gianfranco Abrusci
12ccb09b86
test passed
2022-04-07 15:41:22 +02:00
Aurelien Delval
3cd30bc8f3
Fix OpenACC and OpenMP implementations
2022-04-07 13:57:20 +02:00
a7fac59f04
Merge branch 'gpu' of github.com:TREX-CoE/qmckl into gpu
2022-04-07 13:35:08 +02:00
7dc02571e9
Fix build
2022-04-07 13:33:50 +02:00
d1dc35eaa4
First working OpenMP version
2022-04-06 17:58:05 +02:00
Max Hoffer
7aad2a79a2
Merge branch 'gpu' into gpu
2022-04-06 17:17:16 +02:00
aeec721774
Merge branch 'gpu' of github.com:TREX-CoE/qmckl into gpu
2022-04-06 17:11:26 +02:00
3ea90bc4a5
OpenMP
2022-04-06 17:11:21 +02:00
9cef7048d3
Fix CI
2022-04-06 17:10:23 +02:00
hoffer
fe277b7a6e
Ok for openmp and Cublas
2022-04-06 17:04:00 +02:00
88e8404b2a
Merge branch 'gpu' of github.com:TREX-CoE/qmckl into gpu
2022-04-06 16:38:19 +02:00
cc5f6914f6
Cleaning
2022-04-06 16:26:35 +02:00
hoffer
3b5221531c
Add openmp and cublas
2022-04-06 16:20:29 +02:00
Gianfranco Abrusci
e496667189
debugging factor_ee_deriv_e
2022-04-06 15:59:12 +02:00
Gianfranco Abrusci
ff6d2e17f2
Merge branch 'gpu' into jastrow_hpc
2022-04-06 14:13:24 +02:00
Gianfranco Abrusci
b79a23897d
qmckl_compute_een_rescaled_e_hpc (c version) working
2022-04-06 14:01:13 +02:00
0d5d14b8e4
Fix openacc
2022-04-06 11:51:36 +02:00
hoffer
39bcc569e0
Start implementing cublas
2022-04-06 11:16:17 +02:00
0966e1e2b1
Fix OpenACC
2022-04-06 10:42:00 +02:00
2323
72fad819bf
Fix flags
2022-04-06 10:03:56 +02:00
Aurélien Delval
63c7f8ea72
Replace placeholder cuBLAS kernels with new C HPC implementation
2022-04-05 16:29:52 +02:00
Aurélien Delval
0ce0a93522
Fix preprocessor else and remove old cuBLAS interface
2022-04-05 14:37:57 +02:00
Aurélien Delval
eb71a752f5
Fixed naive GPU kernels and ignored variable issue
2022-04-05 14:28:35 +02:00
Gianfranco Abrusci
586eb92801
compute_cord_vect_full done
2022-04-05 14:23:20 +02:00
Aurélien Delval
bc43113b6f
Merge branch 'gpu' into master
2022-04-05 11:46:12 +02:00
94035929e4
Fixed cppcheck
2022-04-05 11:45:02 +02:00
Aurélien Delval
0e43d33a1d
Merge branch 'gpu' into master
2022-04-05 11:39:16 +02:00
6fb261d635
warnings
2022-04-05 11:15:42 +02:00
Aurélien Delval
98097e8fa7
Convert GPU implementations to C
TODO: Fix the naive implementation, which seems to be incorrect (probably an
issue with indexing)
2022-04-05 11:02:08 +02:00
511eba5843
Fixed dgemm bug
2022-04-05 09:56:13 +02:00
bcdbc49d5f
Cleaning
2022-04-04 23:53:58 +02:00
dd045452f6
Fixed documentation
2022-04-04 17:30:38 +02:00
1f9ea610d4
Moved C version of Jastrow into HPC
2022-04-04 16:56:33 +02:00
Aurélien Delval
84013a5f76
Cleanup before merging into QMCkl's GPU branch
2022-04-04 12:12:11 +02:00
Aurélien Delval
26bbd6f341
Start work on cuBLAS implementation
TODO: Replace CPU BLAS calls with cuBLAS calls (will probably require writing a Fortran interface to the functions we're interested in, at least the DGEMMs)
2022-04-01 09:19:56 +02:00
Aurélien Delval
9428eaa19e
Implement computation of tmp_c and dtmp_c in OpenACC
These two kernels seem to give a good speedup compared to the CPU BLAS
versions. However, the current GPU implementation of factor_een_deriv seems to
be slightly slower (on the tested machine).
TODO:
- Try to improve the factor_een_deriv GPU implementation
- Try out a cuBLAS implementation of tmp_c and dtmp_c
2022-03-30 16:16:06 +02:00
Aurelien Delval
99306473a4
Start OpenACC implementation in Jastrow, including compute_dtmp_c
2022-03-30 09:01:32 +02:00
Aurelien Delval
383c6ac78a
Add OFFLOAD_FLAGS, OFFLOAD_CFLAGS and OFFLOAD_FCFLAGS vars to configure
2022-03-28 07:58:01 +02:00