Francois Coppens
f7dbe3ddd8
- Sync and Async version
- OpenMP version
- PP defines cleanup
2022-11-08 15:35:25 +01:00
Francois Coppens
2d5a34faed
Trivial change
2022-10-17 15:26:30 +02:00
Francois Coppens
90bc5090c2
Trivial change
2022-10-17 15:15:16 +02:00
Francois Coppens
6bb95f068d
- Restructured tree
- Added NVTX annotations to GPU kernel.
2022-10-17 14:56:32 +02:00
Francois Coppens
7594e15576
Improved memory allocation on the GPU.
2022-10-10 11:01:53 +02:00
Francois Coppens
bba5cf5f2c
Improved version.
- All static arrays replaced by dynamic ones
- Overhead from the checks before and after kernel runs reduced as much as possible by using MKL DGEMM calls.
- Solved bugs due to dimension mismatches.
Overhead time is dramatically reduced because the naive 'matmul' is no longer called.
2022-10-02 10:20:11 +02:00
Francois Coppens
15f959099d
Cleanup
2022-10-02 10:20:11 +02:00
Francois Coppens
c0d21dd9af
Various
2022-10-02 10:20:11 +02:00
François Coppens
5fabf9b37a
Changed other version from 2 to 3 as well
2022-10-02 10:20:11 +02:00
François Coppens
26d2e37c32
Changed names of GH action scripts
Changed VFC checkout action version from 2 to 3
2022-10-02 10:20:11 +02:00
Thukisdo
f1f6f20f5c
Added back compilation GitHub action
2022-10-02 10:20:11 +02:00
Thukisdo
8293bd090c
Merged random cycle generator into the main Sherman-Morrison repository
2022-10-02 10:20:11 +02:00
Francois Coppens
d76632e792
Removed QMCkl dependency.
2022-10-02 10:20:11 +02:00
Francois Coppens
cc97230b69
Disabled qmckl_test_c make target because of incompatible call signatures. Will look at that later.
2022-10-02 10:20:11 +02:00
Francois Coppens
9df1c039af
Re-added qmckl submodule
2022-10-02 10:20:11 +02:00
François Coppens
7ae82a1d3d
Update compile.yml
Disabled TREXIO
2022-10-02 10:20:11 +02:00
Francois Coppens
a63b1289d4
Cleanup: consolidated some pragmas.
2022-09-27 11:11:54 +02:00
Francois Coppens
4e7a334b78
- LAPACKE_dgetrf/ri replaced with cusolverDnDgetrf/rs.
- Solved sign bug in computation of determinant.
Most code is now executed on the device. Some OpenMP pragmas can be consolidated.
2022-09-26 17:06:50 +02:00
Francois Coppens
5a61ccc6b1
Added cuSOLVER and replaced LAPACKE_dgetrf with cusolverDnDgetrf.
2022-09-23 18:57:54 +02:00
Francois Coppens
00bdcba230
cuBLAS version of Woodbury KxK is working, but the calls to LAPACKE dgetrf/ri need to be replaced with cuSOLVER calls so that intermediate results no longer have to be transferred to/from the device.
2022-09-22 14:37:00 +02:00
François Coppens
892358d0d1
Replaced all CBLAS dgemms with cuBLAS dgemms and dgeams. Works but not ideal.
2022-09-09 17:15:12 +02:00
François Coppens
87e319189e
- Got rid of NVC compiler warnings
- Included lib paths for MKL/HDF5 and cuBLAS
- Cleaned Makefile
- Added GPU node session request script
2022-07-22 11:34:29 +02:00
Francois Coppens
fa03590f6f
Resolved some warnings of icx
2022-07-21 13:57:28 +02:00
Francois Coppens
ebe38e79e3
Added cuBLAS offloaded kernel for Woodbury KxK
2022-07-21 12:21:51 +02:00
Francois Coppens
f35ad6a777
Small bugfix in qmckl_slagel_splitting()
2022-07-21 08:16:25 +02:00
Francois Coppens
0a083e2875
Added first version of K x K Woodbury kernel using only CBLAS and LAPACK calls
2022-07-20 19:09:55 +02:00
Francois Coppens
732045284a
Added independent test harness, written in C. It has its own Makefile and datasets. It is completely independent of the main tree.
2022-07-11 14:48:59 +02:00
François Coppens
8bab304cb5
Create LICENSE
2021-10-28 14:40:30 +02:00
François Coppens
cb09cd0614
Merge pull request #54 from fmgjcoppens/performance-tuning
Performance tuning
2021-10-04 10:48:21 +02:00
Francois Coppens
c255a9e035
Updated qmckl submodule status
2021-10-04 10:44:56 +02:00
Francois Coppens
9b13f818f0
Small changes in tests.
2021-10-04 09:06:05 +02:00
Francois Coppens
b094b74e48
Small changes to help with performance measurements.
2021-09-30 16:36:18 +02:00
François Coppens
0bd71c1968
Merge pull request #53 from fmgjcoppens/qmckl_integration
Qmckl integration
2021-09-21 14:48:08 +02:00
Francois Coppens
5e9da43c93
Added submodule support in workflow.
2021-09-21 14:42:57 +02:00
Francois Coppens
846d236b5f
Added QMCkl build to build-check GitHub workflow
2021-09-21 14:32:12 +02:00
Francois Coppens
0614971437
More minor bug fixes
2021-09-21 14:20:41 +02:00
Francois Coppens
c9d1abd29d
Fixed minor bug
2021-09-21 13:42:44 +02:00
Francois Coppens
177411f472
Removed binary file.
2021-09-21 12:30:59 +02:00
Francois Coppens
71e7fcc1b3
Added test that uses SMW kernels in QMCkl from Fortran.
2021-09-21 12:30:43 +02:00
Aurélien Delval
8b39bc44c2
Merge pull request #52 from PurplePachyderm/dev
Update integration of vfc_probes
2021-09-02 12:44:05 +02:00
Aurélien Delval
7e42a000c4
Update integration of vfc_probes
vfc_probes used to be built along the code in the previous versions.
This has been removed so that the version used is the one provided
system-wide by Verificarlo. Moreover, vfc_test_h5.cpp has been updated to
reflect the name changes of vfc_probes functions.
2021-09-02 12:37:22 +02:00
Pablo Oliveira
ecb6018cc0
Install HDF5 dependencies
2021-09-02 11:06:22 +02:00
vfcci
5796e8e970
[auto] Set up Verificarlo CI on this branch
2021-09-02 10:43:13 +02:00
Francois Coppens
3a90248cc1
Cleanup and compiler flags.
2021-07-30 11:51:04 +02:00
Francois Coppens
74bb333de1
- Passing break-down threshold as a function argument
- Renaming kernels to correspond with the ones in QMCkl
- In the qmckl-version of the test program, changing the way integer data is read from the HDF5 file.
2021-07-29 12:01:26 +02:00
Francois Coppens
6ce2055e59
* Removed dependency on qmckl_threshhold() and the accompanying preprocessor definition.
The break-down threshold now has to be passed explicitly as a function argument.
* Break-down threshold must now be passed on the command line together with the residual threshold.
2021-07-26 17:48:52 +02:00
Francois Coppens
7fb5ead349
Added and tested Woodbury 3x3 kernel to QMCkl.
Residual = wb3 14 9.92936e-07 1.90518e-11
ok -- cycle 14
Residual = qmckl_wb3 14 9.92936e-07 1.90518e-11
ok -- cycle 14.
2021-07-22 11:44:37 +02:00
Francois Coppens
e188871df4
Fixed unsigned int/uint64_t/H5::PredType::STD_U32LE problem in qmckl_test_h5.cpp that caused the segmentation faults due to array indices running out of bounds. Naive Sherman-Morrison and Woodbury 2x2 kernels are working correctly from QMCkl with good accuracy.
Residual = sm1 23 2.665e-07 5.85161e-13
ok -- cycle 23
Residual = qmckl_sm1 23 2.665e-07 5.85161e-13
ok -- cycle 23
Residual = wb2 23 2.665e-07 5.85161e-13
ok -- cycle 23
Residual = qmckl_wb2 23 2.665e-07 5.85161e-13
ok -- cycle 23
2021-07-22 10:45:21 +02:00
Francois Coppens
675f5bef41
Changes in qmckl
2021-07-21 17:40:07 +02:00
Francois Coppens
e314987bb7
Added Woodbury 2x2 to QMCkl test program tests/qmckl_test_h5.cpp. For now it crashes with a segmentation fault when run on a cycle with 2 updates (qmckl_test_h5 wb2 3 3 1e-3 1).
2021-07-21 17:37:21 +02:00