BLAS functions
Table of Contents
1 Matrix operations
1.1 qmckl_dgemm
Matrix multiplication:
\[ C_{ij} = \beta C_{ij} + \alpha \sum_{k} A_{ik} \cdot B_{kj} \]
Variable | Type | In/Out | Description |
---|---|---|---|
context |
qmckl_context |
in | Global state |
TransA |
char |
in | 'T' is transposed |
TransB |
char |
in | 'T' is transposed |
m |
int64_t |
in | Number of rows of the input matrix |
n |
int64_t |
in | Number of columns of the input matrix |
k |
int64_t |
in | Number of columns of the input matrix |
alpha |
double |
in | Number of columns of the input matrix |
A |
double[][lda] |
in | Array containing the matrix \(A\) |
lda |
int64_t |
in | Leading dimension of array A |
B |
double[][ldb] |
in | Array containing the matrix \(B\) |
ldb |
int64_t |
in | Leading dimension of array B |
beta |
double |
in | Array containing the matrix \(B\) |
C |
double[][ldc] |
out | Array containing the matrix \(B\) |
ldc |
int64_t |
in | Leading dimension of array B |
Requirements:
context
is notQMCKL_NULL_CONTEXT
m > 0
n > 0
k > 0
lda >= m
ldb >= n
ldc >= n
A
is allocated with at least \(m \times k \times 8\) bytesB
is allocated with at least \(k \times n \times 8\) bytesC
is allocated with at least \(m \times n \times 8\) bytes
qmckl_exit_code qmckl_dgemm ( const qmckl_context context, const char TransA, const char TransB, const int64_t m, const int64_t n, const int64_t k, const double alpha, const double* A, const int64_t lda, const double* B, const int64_t ldb, const double beta, double* const C, const int64_t ldc );
integer function qmckl_dgemm_f(context, TransA, TransB, & m, n, k, alpha, A, LDA, B, LDB, beta, C, LDC) & result(info) use qmckl implicit none integer(qmckl_context), intent(in) :: context character , intent(in) :: TransA, TransB integer*8 , intent(in) :: m, n, k double precision , intent(in) :: alpha, beta integer*8 , intent(in) :: lda double precision , intent(in) :: A(lda,*) integer*8 , intent(in) :: ldb double precision , intent(in) :: B(ldb,*) integer*8 , intent(in) :: ldc double precision , intent(out) :: C(ldc,*) info = QMCKL_SUCCESS if (context == QMCKL_NULL_CONTEXT) then info = QMCKL_INVALID_CONTEXT return endif if (m <= 0_8) then info = QMCKL_INVALID_ARG_4 return endif if (n <= 0_8) then info = QMCKL_INVALID_ARG_5 return endif if (k <= 0_8) then info = QMCKL_INVALID_ARG_6 return endif if (LDA <= 0) then info = QMCKL_INVALID_ARG_9 return endif if (LDB <= 0) then info = QMCKL_INVALID_ARG_11 return endif if (LDC <= 0) then info = QMCKL_INVALID_ARG_14 return endif call dgemm(transA, transB, int(m,4), int(n,4), int(k,4), & alpha, A, int(LDA,4), B, int(LDB,4), beta, C, int(LDC,4)) end function qmckl_dgemm_f
1.2 qmckl_adjugate
Given a matrix \(\mathbf{A}\), the adjugate matrix \(\text{adj}(\mathbf{A})\) is the transpose of the cofactors matrix of \(\mathbf{A}\).
\[ \mathbf{B} = \text{adj}(\mathbf{A}) = \text{det}(\mathbf{A}) \, \mathbf{A}^{-1} \]
See also: https://en.wikipedia.org/wiki/Adjugate_matrix
Variable | Type | In/Out | Description |
---|---|---|---|
context |
qmckl_context |
in | Global state |
n |
int64_t |
in | Number of rows and columns of the input matrix |
A |
double[][lda] |
in | Array containing the \(n \times n\) matrix \(A\) |
lda |
int64_t |
in | Leading dimension of array A |
B |
double[][ldb] |
out | Adjugate of \(A\) |
ldb |
int64_t |
in | Leading dimension of array B |
det_l |
double |
inout | determinant of \(A\) |
Requirements:
context
is notQMCKL_NULL_CONTEXT
n > 0
lda >= m
A
is allocated with at least \(m \times m \times 8\) bytesldb >= m
B
is allocated with at least \(m \times m \times 8\) bytes
qmckl_exit_code qmckl_adjugate ( const qmckl_context context, const int64_t n, const double* A, const int64_t lda, double* const B, const int64_t ldb, double* det_l );
For small matrices (≤ 5× 5), we use specialized routines for performance motivations. For larger sizes, we rely on the LAPACK library.
integer function qmckl_adjugate_f(context, na, A, LDA, B, ldb, det_l) & result(info) use qmckl implicit none integer(qmckl_context) , intent(in) :: context double precision, intent(in) :: A (LDA,*) integer*8, intent(in) :: LDA double precision, intent(out) :: B (LDB,*) integer*8, intent(in) :: LDB integer*8, intent(in) :: na double precision, intent(inout) :: det_l integer :: i,j info = QMCKL_SUCCESS if (context == QMCKL_NULL_CONTEXT) then info = QMCKL_INVALID_CONTEXT return endif if (na <= 0_8) then info = QMCKL_INVALID_ARG_2 return endif if (LDA <= 0_8) then info = QMCKL_INVALID_ARG_4 return endif if (LDA < na) then info = QMCKL_INVALID_ARG_4 return endif select case (na) case (5) call adjugate5(A,LDA,B,LDB,na,det_l) case (4) call adjugate4(A,LDA,B,LDB,na,det_l) case (3) call adjugate3(A,LDA,B,LDB,na,det_l) case (2) call adjugate2(A,LDA,B,LDB,na,det_l) case (1) det_l = a(1,1) b(1,1) = 1.d0 case default call adjugate_general(context, na, A, LDA, B, LDB, det_l) end select end function qmckl_adjugate_f
subroutine adjugate_general(context, na, A, LDA, B, LDB, det_l) use qmckl implicit none integer(qmckl_context), intent(in) :: context double precision, intent(in) :: A (LDA,na) integer*8, intent(in) :: LDA double precision, intent(out) :: B (LDB,na) integer*8, intent(in) :: LDB integer*8, intent(in) :: na double precision, intent(inout) :: det_l double precision :: work(LDA*max(na,64)) integer :: inf integer :: ipiv(LDA) integer :: lwork integer :: i, j
We first copy the array A
into array B
.
B(1:na,1:na) = A(1:na,1:na)
Then, we compute the LU factorization \(LU=A\)
using the dgetrf
routine.
call dgetrf(na, na, B, LDB, ipiv, inf )
By convention, the determinant of \(L\) is equal to one, so the determinant of \(A\) is equal to the determinant of \(U\), which is simply computed as the product of its diagonal elements.
det_l = 1.d0 j=0 do i=1,na j = j+min(abs(ipiv(i)-i),1) det_l = det_l*B(i,i) enddo
As dgetrf
returns \(PLU=A\) where \(P\) is a permutation matrix, the
sign of the determinant is computed as \(-1^m\) where \(m\) is the
number of permutations.
if (iand(j,1) /= 0) then det_l = -det_l endif
Then, the inverse of \(A\) is computed using dgetri
:
lwork = SIZE(work) call dgetri(na, B, LDB, ipiv, work, lwork, inf )
and the adjugate matrix is computed as the product of the determinant with the inverse:
B(:,:) = B(:,:)*det_l end subroutine adjugate_general