229 Commits

Author SHA1 Message Date
Francois Coppens
f7dbe3ddd8 - Sync and Async version
- OpenMP version
- PP defines cleanup
2022-11-08 15:35:25 +01:00
Francois Coppens
2d5a34faed Trivial change 2022-10-17 15:26:30 +02:00
Francois Coppens
90bc5090c2 Trivial change 2022-10-17 15:15:16 +02:00
Francois Coppens
6bb95f068d - Restructured tree
- Added NVTX annotations to GPU kernel.
2022-10-17 14:56:32 +02:00
Francois Coppens
7594e15576 Improved memory allocation on the GPU. 2022-10-10 11:01:53 +02:00
Francois Coppens
bba5cf5f2c Improved version.
- All static arrays replaced by dynamic ones
- Checks before and after running the kernels now use MKL DGEMM calls as much as possible to cut their overhead.
- Solved bugs due to dimension mismatches.

Overhead time is dramatically reduced because the naive 'matmul' is no longer called.
2022-10-02 10:20:11 +02:00
Francois Coppens
15f959099d Cleanup 2022-10-02 10:20:11 +02:00
Francois Coppens
c0d21dd9af Various 2022-10-02 10:20:11 +02:00
François Coppens
5fabf9b37a Changed other version from 2 to 3 as well 2022-10-02 10:20:11 +02:00
François Coppens
26d2e37c32 Changed names of GH action scripts
Changed VFC checkout action version from 2 to 3
2022-10-02 10:20:11 +02:00
Thukisdo
f1f6f20f5c Added back compilation GitHub action 2022-10-02 10:20:11 +02:00
Thukisdo
8293bd090c Merged random cycle generator into the main Sherman-Morrison repository 2022-10-02 10:20:11 +02:00
Francois Coppens
d76632e792 Removed QMCkl dependency. 2022-10-02 10:20:11 +02:00
Francois Coppens
cc97230b69 Disabled qmckl_test_c make target because of incompatible call signatures. Will look at that later. 2022-10-02 10:20:11 +02:00
Francois Coppens
9df1c039af Re-added qmckl submodule 2022-10-02 10:20:11 +02:00
François Coppens
7ae82a1d3d Update compile.yml
Disabled TREXIO
2022-10-02 10:20:11 +02:00
Francois Coppens
a63b1289d4 Cleanup: consolidated some pragmas. 2022-09-27 11:11:54 +02:00
Francois Coppens
4e7a334b78 - LAPACKE_dgetrf/ri replaced with cusolverDnDgetrf/rs.
- Solved sign bug in computation of determinant.

Most code is now executed on the device. Some OpenMP pragmas can be consolidated.
2022-09-26 17:06:50 +02:00
Francois Coppens
5a61ccc6b1 Added cuSOLVER and replaced LAPACKE_dgetrf with cusolverDnDgetrf. 2022-09-23 18:57:54 +02:00
Francois Coppens
00bdcba230 cuBLAS version of Woodbury KxK is working, but calls to LAPACKE dgetrf/ri need to be replaced with cuSOLVER calls so that intermediate results no longer have to be transferred to/from the device. 2022-09-22 14:37:00 +02:00
François Coppens
892358d0d1 Replaced all CBLAS dgemms with cuBLAS dgemms and dgeams. Works but not ideal. 2022-09-09 17:15:12 +02:00
François Coppens
87e319189e - Got rid of NVC compiler warnings
- Included lib paths for MKL/HDF5 and cuBLAS
- Cleaned Makefile
- Added GPU node session request script
2022-07-22 11:34:29 +02:00
Francois Coppens
fa03590f6f Resolved some warnings of icx 2022-07-21 13:57:28 +02:00
Francois Coppens
ebe38e79e3 Added cuBLAS offloaded kernel for Woodbury KxK 2022-07-21 12:21:51 +02:00
Francois Coppens
f35ad6a777 Small bugfix in qmckl_slagel_splitting() 2022-07-21 08:16:25 +02:00
Francois Coppens
0a083e2875 Added first version of K x K Woodbury kernel using only CBLAS and LAPACK calls 2022-07-20 19:09:55 +02:00
Francois Coppens
732045284a Added independent test harness, written in C. It has its own Makefile and datasets. It is completely independent of the main tree. 2022-07-11 14:48:59 +02:00
François Coppens
8bab304cb5
Create LICENSE 2021-10-28 14:40:30 +02:00
François Coppens
cb09cd0614
Merge pull request #54 from fmgjcoppens/performance-tuning
Performance tuning
2021-10-04 10:48:21 +02:00
Francois Coppens
c255a9e035 Updated qmckl submodule status 2021-10-04 10:44:56 +02:00
Francois Coppens
9b13f818f0 Small changes in tests. 2021-10-04 09:06:05 +02:00
Francois Coppens
b094b74e48 Small changes to help with performance measurements. 2021-09-30 16:36:18 +02:00
François Coppens
0bd71c1968
Merge pull request #53 from fmgjcoppens/qmckl_integration
Qmckl integration
2021-09-21 14:48:08 +02:00
Francois Coppens
5e9da43c93 Added submodule support in workflow. 2021-09-21 14:42:57 +02:00
Francois Coppens
846d236b5f Added QMCkl build to build-check GitHub workflow 2021-09-21 14:32:12 +02:00
Francois Coppens
0614971437 More minor bug fixes 2021-09-21 14:20:41 +02:00
Francois Coppens
c9d1abd29d Fixed minor bug 2021-09-21 13:42:44 +02:00
Francois Coppens
177411f472 Removed binary file. 2021-09-21 12:30:59 +02:00
Francois Coppens
71e7fcc1b3 Added test that uses SMW kernels in QMCkl from Fortran. 2021-09-21 12:30:43 +02:00
Aurélien Delval
8b39bc44c2
Merge pull request #52 from PurplePachyderm/dev
Update integration of vfc_probes
2021-09-02 12:44:05 +02:00
Aurélien Delval
7e42a000c4 Update integration of vfc_probes
vfc_probes used to be built along with the code in previous versions.
This has been removed so that the version used is the one provided
system-wide by Verificarlo. Moreover, vfc_test_h5.cpp has been updated to
reflect the name changes of the vfc_probes functions.
2021-09-02 12:37:22 +02:00
Pablo Oliveira
ecb6018cc0 Install HDF5 dependencies 2021-09-02 11:06:22 +02:00
vfcci
5796e8e970 [auto] Set up Verificarlo CI on this branch 2021-09-02 10:43:13 +02:00
Francois Coppens
3a90248cc1 Cleanup and compiler flags. 2021-07-30 11:51:04 +02:00
Francois Coppens
74bb333de1 - Passing break-down threshold as a function argument
- Renaming kernels to correspond with the ones in QMCkl
- In the qmckl-version of the test program, changing the way integer data is read from the HDF5 file.
2021-07-29 12:01:26 +02:00
Francois Coppens
6ce2055e59 * Removed dependency on qmckl_threshhold() and the accompanying preprocessor definition.
The break-down threshold now has to be passed explicitly as a function argument.
* Break-down threshold must now be passed on the command line together with the residual threshold.
2021-07-26 17:48:52 +02:00
Francois Coppens
7fb5ead349 Added and tested Woodbury 3x3 kernel to QMCkl.
Residual = wb3 14 9.92936e-07 1.90518e-11
ok -- cycle 14
Residual = qmckl_wb3 14 9.92936e-07 1.90518e-11
ok -- cycle 14
2021-07-22 11:44:37 +02:00
Francois Coppens
e188871df4 Fixed unsigned int/uint64_t/H5::PredType::STD_U32LE problem in qmckl_test_h5.cpp that caused the segmentation faults due to array indices running out of bounds. Naive Sherman-Morrison and Woodbury 2x2 kernels are working correctly from QMCkl with good accuracy.
Residual = sm1 23 2.665e-07 5.85161e-13
ok -- cycle 23
Residual = qmckl_sm1 23 2.665e-07 5.85161e-13
ok -- cycle 23
Residual = wb2 23 2.665e-07 5.85161e-13
ok -- cycle 23
Residual = qmckl_wb2 23 2.665e-07 5.85161e-13
ok -- cycle 23
2021-07-22 10:45:21 +02:00
Francois Coppens
675f5bef41 Changes in qmckl 2021-07-21 17:40:07 +02:00
Francois Coppens
e314987bb7 Added Woodbury 2x2 to QMCkl test program tests/qmckl_test_h5.cpp. For now it crashes with a segmentation fault when run on a cycle with 2 updates (qmckl_test_h5 wb2 3 3 1e-3 1). 2021-07-21 17:37:21 +02:00