TREX/qmckl

mirror of https://github.com/TREX-CoE/qmckl.git synced 2024-11-05 13:44:07 +01:00

Author	SHA1	Message	Date
Anthony Scemama	6c7634038f	Improve configure	2022-04-06 13:48:37 +02:00
Anthony Scemama	0d5d14b8e4	Fix openacc	2022-04-06 11:51:36 +02:00
hoffer	39bcc569e0	Start implementing cublas	2022-04-06 11:16:17 +02:00
Anthony Scemama	0966e1e2b1	Fix OpenACC	2022-04-06 10:42:00 +02:00
2323	72fad819bf	Fix flags	2022-04-06 10:03:56 +02:00
2323	f02e761b79	Fixed configure.ac for GPUs	2022-04-05 19:31:11 +02:00
2323	08f01ece89	Fix configure	2022-04-05 17:57:56 +02:00
Anthony Scemama	0489831e18	Simplified configure	2022-04-05 17:06:29 +02:00
Anthony Scemama	a3a1cc6428	Merge branch 'gpu' of github.com:TREX-CoE/qmckl into gpu	2022-04-05 16:52:43 +02:00
Anthony Scemama	c3424216de	Fix info	2022-04-05 16:52:35 +02:00
Aurélien Delval	63c7f8ea72	Replace placeholder cuBLAS kernels with new C HPC implementation	2022-04-05 16:29:52 +02:00
Anthony Scemama	f8e6d5f06b	Merge pull request #72 from PurplePachyderm/master: Merge in-progress work of GPU ports	2022-04-05 14:44:30 +02:00
Aurélien Delval	0ce0a93522	Fix preprocessor else and remove old cuBLAS interface	2022-04-05 14:37:57 +02:00
Aurélien Delval	eb71a752f5	Fixed naive GPU kernels and ignored variable issue	2022-04-05 14:28:35 +02:00
Gianfranco Abrusci	586eb92801	compute_cord_vect_full done	2022-04-05 14:23:20 +02:00
Aurélien Delval	bc43113b6f	Merge branch 'gpu' into master	2022-04-05 11:46:12 +02:00
Anthony Scemama	2f26ccd4f0	Merge branch 'gpu' of github.com:TREX-CoE/qmckl into gpu	2022-04-05 11:45:11 +02:00
Anthony Scemama	94035929e4	Fixed cppcheck	2022-04-05 11:45:02 +02:00
Anthony Scemama	c7dd46da05	Fixed cppcheck	2022-04-05 11:44:17 +02:00
Aurélien Delval	0e43d33a1d	Merge branch 'gpu' into master	2022-04-05 11:39:16 +02:00
Anthony Scemama	6fb261d635	warnings	2022-04-05 11:15:42 +02:00
Anthony Scemama	731fded4a8	warnings	2022-04-05 11:03:30 +02:00
Aurélien Delval	98097e8fa7	Convert GPU implementations to C. TODO: Fix naive implementation which seems to be incorrect (probably an issue with indexing)	2022-04-05 11:02:08 +02:00
hoffer	508b294190	Fix flag for nvc and nvfortran	2022-04-05 10:07:25 +02:00
Anthony Scemama	511eba5843	Fixed dgemm bug	2022-04-05 09:56:13 +02:00
Anthony Scemama	bcdbc49d5f	Cleaning	2022-04-04 23:53:58 +02:00
Anthony Scemama	dd045452f6	Fixed documentation	2022-04-04 17:30:38 +02:00
Anthony Scemama	2a13d8e18d	Merge branch 'gpu' of github.com:TREX-CoE/qmckl into gpu	2022-04-04 16:56:39 +02:00
Anthony Scemama	1f9ea610d4	Moved C version of Jastrow into HPC	2022-04-04 16:56:33 +02:00
hoffer	31a05c47e2	Add flags for nvc and nvfortran to support offload	2022-04-04 12:41:00 +02:00
Aurélien Delval	84013a5f76	Cleanup before merging into QMCkl's GPU branch	2022-04-04 12:12:11 +02:00
Anthony Scemama	7e56b3e2ed	Merge branch 'master' into gpu	2022-04-04 12:11:57 +02:00
Anthony Scemama	bac1eb33f0	Fixed configure for Nvidia compilers	2022-04-04 12:11:26 +02:00
Anthony Scemama	9f03c32e20	Merge pull request #70 from GianFree/jastrow_c: Jastrow c	2022-04-04 11:55:33 +02:00
Gianfranco Abrusci	35e15205df	Merge branch 'master' into jastrow_c	2022-04-04 11:22:17 +02:00
Aurélien Delval	1173bb2586	Update configure.ac with cuBLAS support (forgotten in last commit)	2022-04-01 17:56:27 +02:00
Aurélien Delval	26bbd6f341	Start work on cuBLAS implementation. TODO: Replace CPU BLAS calls by cuBLAS calls (will probably require to write a Fortran to the functions we're interested in, at least DGEMMs)	2022-04-01 09:19:56 +02:00
Aurélien Delval	9428eaa19e	Implement computation of tmp_c and dtmp_c in OpenACC. These 2 kernels seem to give good speedup compared to the CPU BLAS versions. However, the current GPU implementation of factor_een_deriv seems to be slightly slower (on the tested machine). TODO: (1) Try to improve the factor_een_deriv GPU implementation; (2) Try out a cuBLAS implementation of tmp_c and dtmp_c	2022-03-30 16:16:06 +02:00
Aurelien Delval	99306473a4	Start OpenACC implementation in Jastrow, including compute_dtmp_c	2022-03-30 09:01:32 +02:00
Anthony Scemama	91811079d3	Fixed bugs. Travis OK.	2022-03-28 18:29:29 +02:00
Anthony Scemama	b9cd2ed1ab	Fix type error	2022-03-28 18:26:20 +02:00
Anthony Scemama	bab87884cd	Accelerated HPC AO->MO transformation	2022-03-28 17:58:03 +02:00
Anthony Scemama	1b0bfd40be	HPC version of AO->MO transformation	2022-03-28 17:37:50 +02:00
Anthony Scemama	9b1f648437	Accelerated AO->MO transformation	2022-03-28 16:53:36 +02:00
Aurelien Delval	383c6ac78a	Add OFFLOAD_FLAGS, OFFLOAD_CFLAGS and OFFLOAD_FCFLAGS vars to configure	2022-03-28 07:58:01 +02:00
Aurelien Delval	bcc49ca312	Minor fixes to previous commit. TODO: Start modifying dedicated function to implement offloading. Also, as of now, Fortran preprocessor flags should be passed manually, we need to manage this in the configure.ac in the future. For now, when using gfortran, you should pass FCFLAGS="-cpp -DWITH_OPENMP_OFFLOAD" to enable offloading.	2022-03-25 13:03:35 +01:00
Aurelien Delval	5e3231e7e3	Add selection mechanism for offload mode in Jastrow. This system adds an additional field to the QMCkl context to store the offload mode currently in use for each kernel (in this commit, this has been implemented for Jastrow as an example). This will be useful to test different offloading versions that can be easily toggled on/off at compilation and at runtime.	2022-03-24 16:35:29 +01:00
Aurélien Delval	79d4cf130b	Add detection of configure arguments to enable GPU offloading. As of now, only OpenMP offload will be implemented as a test.	2022-03-24 10:06:25 +01:00
Anthony Scemama	5ecb1d6326	Faster AOs	2022-03-21 18:32:39 +01:00
Gianfranco Abrusci	3ce162a384	dtmp_c done	2022-03-17 22:27:10 +01:00

1116 Commits