6c7634038f
Improve configure
2022-04-06 13:48:37 +02:00
0d5d14b8e4
Fix OpenACC
2022-04-06 11:51:36 +02:00
hoffer
39bcc569e0
Start implementing cublas
2022-04-06 11:16:17 +02:00
0966e1e2b1
Fix OpenACC
2022-04-06 10:42:00 +02:00
2323
72fad819bf
Fix flags
2022-04-06 10:03:56 +02:00
2323
f02e761b79
Fixed configure.ac for GPUs
2022-04-05 19:31:11 +02:00
2323
08f01ece89
Fix configure
2022-04-05 17:57:56 +02:00
0489831e18
Simplified configure
2022-04-05 17:06:29 +02:00
a3a1cc6428
Merge branch 'gpu' of github.com:TREX-CoE/qmckl into gpu
2022-04-05 16:52:43 +02:00
c3424216de
Fix info
2022-04-05 16:52:35 +02:00
Aurélien Delval
63c7f8ea72
Replace placeholder cuBLAS kernels with new C HPC implementation
2022-04-05 16:29:52 +02:00
f8e6d5f06b
Merge pull request #72 from PurplePachyderm/master
...
Merge in-progress work of GPU ports
2022-04-05 14:44:30 +02:00
Aurélien Delval
0ce0a93522
Fix preprocessor else and remove old cuBLAS interface
2022-04-05 14:37:57 +02:00
Aurélien Delval
eb71a752f5
Fixed naive GPU kernels and ignored variable issue
2022-04-05 14:28:35 +02:00
Gianfranco Abrusci
586eb92801
compute_cord_vect_full done
2022-04-05 14:23:20 +02:00
Aurélien Delval
bc43113b6f
Merge branch 'gpu' into master
2022-04-05 11:46:12 +02:00
2f26ccd4f0
Merge branch 'gpu' of github.com:TREX-CoE/qmckl into gpu
2022-04-05 11:45:11 +02:00
94035929e4
Fixed cppcheck
2022-04-05 11:45:02 +02:00
c7dd46da05
Fixed cppcheck
2022-04-05 11:44:17 +02:00
Aurélien Delval
0e43d33a1d
Merge branch 'gpu' into master
2022-04-05 11:39:16 +02:00
6fb261d635
warnings
2022-04-05 11:15:42 +02:00
731fded4a8
warnings
2022-04-05 11:03:30 +02:00
Aurélien Delval
98097e8fa7
Convert GPU implementations to C
...
TODO: Fix the naive implementation, which seems to be incorrect (probably an
indexing issue)
2022-04-05 11:02:08 +02:00
hoffer
508b294190
Fix flag for nvc and nvfortran
2022-04-05 10:07:25 +02:00
511eba5843
Fixed dgemm bug
2022-04-05 09:56:13 +02:00
bcdbc49d5f
Cleaning
2022-04-04 23:53:58 +02:00
dd045452f6
Fixed documentation
2022-04-04 17:30:38 +02:00
2a13d8e18d
Merge branch 'gpu' of github.com:TREX-CoE/qmckl into gpu
2022-04-04 16:56:39 +02:00
1f9ea610d4
Moved C version of Jastrow into HPC
2022-04-04 16:56:33 +02:00
hoffer
31a05c47e2
Add flags for nvc and nvfortran to support offload
2022-04-04 12:41:00 +02:00
Aurélien Delval
84013a5f76
Cleanup before merging into QMCkl's GPU branch
2022-04-04 12:12:11 +02:00
7e56b3e2ed
Merge branch 'master' into gpu
2022-04-04 12:11:57 +02:00
bac1eb33f0
Fixed configure for Nvidia compilers
2022-04-04 12:11:26 +02:00
9f03c32e20
Merge pull request #70 from GianFree/jastrow_c
...
Jastrow c
2022-04-04 11:55:33 +02:00
Gianfranco Abrusci
35e15205df
Merge branch 'master' into jastrow_c
2022-04-04 11:22:17 +02:00
Aurélien Delval
1173bb2586
Update configure.ac with cuBLAS support
...
(forgotten in last commit)
2022-04-01 17:56:27 +02:00
Aurélien Delval
26bbd6f341
Start work on cuBLAS implementation
...
TODO: Replace CPU BLAS calls with cuBLAS calls (this will probably require writing a Fortran interface to the functions we're interested in, at least the DGEMMs)
2022-04-01 09:19:56 +02:00
Aurélien Delval
9428eaa19e
Implement computation of tmp_c and dtmp_c in OpenACC
...
These 2 kernels seem to give good speedup compared to the CPU BLAS
versions. However, the current GPU implementation of factor_een_deriv seems to
be slightly slower (on the tested machine).
TODO:
- Try to improve the factor_een_deriv GPU implementation
- Try out a cuBLAS implementation of tmp_c and dtmp_c
2022-03-30 16:16:06 +02:00
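For readers unfamiliar with the approach, a naive OpenACC matrix product of the kind the commit above contrasts with CPU BLAS could look roughly like the sketch below. This is a generic DGEMM-like loop, not the actual tmp_c or dtmp_c kernel: the function name naive_dgemm_acc, the column-major layout and the argument order are all assumptions made for illustration. Without an OpenACC-enabled compiler the pragmas are ignored and the loop runs on the CPU.

```c
#include <stddef.h>

/* Hypothetical sketch, not QMCkl code.
 * Computes C = A * B with column-major storage:
 * A is m x k, B is k x n, C is m x n. */
void naive_dgemm_acc(size_t m, size_t n, size_t k,
                     const double *A, const double *B, double *C)
{
    /* Copy the inputs to the device, copy the result back,
     * and parallelize over every (i, j) element of C. */
    #pragma acc parallel loop collapse(2) \
        copyin(A[0:m*k], B[0:k*n]) copyout(C[0:m*n])
    for (size_t j = 0; j < n; ++j) {
        for (size_t i = 0; i < m; ++i) {
            double sum = 0.0;
            /* The dot product for one element stays sequential. */
            #pragma acc loop seq
            for (size_t l = 0; l < k; ++l) {
                sum += A[i + l * m] * B[l + j * k];
            }
            C[i + j * m] = sum;
        }
    }
}
```

A tuned BLAS or cuBLAS DGEMM would normally be expected to outperform such a loop on large matrices; the sketch only shows the offloading pattern.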
Aurelien Delval
99306473a4
Start OpenACC implementation in Jastrow, including compute_dtmp_c
2022-03-30 09:01:32 +02:00
91811079d3
Fixed bugs. Travis OK.
2022-03-28 18:29:29 +02:00
b9cd2ed1ab
Fix type error
2022-03-28 18:26:20 +02:00
bab87884cd
Accelerated HPC AO->MO transformation
2022-03-28 17:58:03 +02:00
1b0bfd40be
HPC version of AO->MO transformation
2022-03-28 17:37:50 +02:00
9b1f648437
Accelerated AO->MO transformation
2022-03-28 16:53:36 +02:00
Aurelien Delval
383c6ac78a
Add OFFLOAD_FLAGS, OFFLOAD_CFLAGS and OFFLOAD_FCFLAGS vars to configure
2022-03-28 07:58:01 +02:00
Aurelien Delval
bcc49ca312
Minor fixes to previous commit
...
TODO: Start modifying the dedicated function to implement offloading.
Also, as of now, Fortran preprocessor flags must be passed manually;
this should be handled in configure.ac in the future. For now, when
using gfortran, pass FCFLAGS="-cpp -DWITH_OPENMP_OFFLOAD" to
enable offloading.
2022-03-25 13:03:35 +01:00
Aurelien Delval
5e3231e7e3
Add selection mechanism for offload mode in Jastrow
...
This system adds an additional field to the QMCkl context to store the
offload mode currently in use for each kernel (in this commit, this has
been implemented for Jastrow as an example). This will be useful to test
different offloading versions that can be easily toggled on/off at
compilation and at runtime.
2022-03-24 16:35:29 +01:00
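The selection mechanism described in the commit above (a per-kernel offload mode stored in the context, switchable at compile time and at runtime) might be sketched as follows. Every identifier here (offload_mode, example_context, example_set_jastrow_offload, example_kernel) is hypothetical and is not the QMCkl API; only the WITH_OPENMP_OFFLOAD macro name comes from the log. The kernel is a stand-in sum, not a Jastrow computation.

```c
#include <stddef.h>

/* Hypothetical names throughout: this is NOT the QMCkl API. */
typedef enum {
    OFFLOAD_NONE = 0,   /* plain CPU implementation */
    OFFLOAD_OPENMP = 1  /* OpenMP target offload */
} offload_mode;

typedef struct {
    offload_mode jastrow_offload; /* mode stored per kernel in the context */
} example_context;

/* Runtime toggle. Returns 0 on success, -1 if the context is NULL
 * or the requested mode was not compiled in. */
int example_set_jastrow_offload(example_context *ctx, offload_mode mode)
{
    if (ctx == NULL) return -1;
#ifndef WITH_OPENMP_OFFLOAD
    if (mode == OFFLOAD_OPENMP) return -1;
#endif
    ctx->jastrow_offload = mode;
    return 0;
}

static double sum_cpu(const double *x, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i) s += x[i];
    return s;
}

#ifdef WITH_OPENMP_OFFLOAD
static double sum_omp_offload(const double *x, size_t n)
{
    double s = 0.0;
    #pragma omp target teams distribute parallel for \
        reduction(+:s) map(to: x[0:n]) map(tofrom: s)
    for (size_t i = 0; i < n; ++i) s += x[i];
    return s;
}
#endif

/* Dispatch on the mode stored in the context; fall back to the CPU. */
double example_kernel(const example_context *ctx, const double *x, size_t n)
{
    switch (ctx->jastrow_offload) {
#ifdef WITH_OPENMP_OFFLOAD
    case OFFLOAD_OPENMP:
        return sum_omp_offload(x, n);
#endif
    default:
        return sum_cpu(x, n);
    }
}
```

Compiling with -DWITH_OPENMP_OFFLOAD (plus the compiler's own offload flags) enables the second code path; without it, only the CPU version is built and the setter rejects the offload mode.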
Aurélien Delval
79d4cf130b
Add detection of configure arguments to enable GPU offloading
...
As of now, only OpenMP offload will be implemented as a test.
2022-03-24 10:06:25 +01:00
5ecb1d6326
Faster AOs
2022-03-21 18:32:39 +01:00
Gianfranco Abrusci
3ce162a384
dtmp_c done
2022-03-17 22:27:10 +01:00