BLAS functions

1. Matrix operations
- 1.1. qmckl_dgemm
- 1.2. qmckl_adjugate

1 Matrix operations

1.1 `qmckl_dgemm`

Matrix multiplication:

\[ C_{ij} = \beta C_{ij} + \alpha \sum_{k} A_{ik} \cdot B_{kj} \]

Variable	Type	In/Out	Description
`context`	`qmckl_context`	in	Global state
`TransA`	`char`	in	'T' is transposed
`TransB`	`char`	in	'T' is transposed
`m`	`int64_t`	in	Number of rows of the input matrix
`n`	`int64_t`	in	Number of columns of the input matrix
`k`	`int64_t`	in	Number of columns of the input matrix
`alpha`	`double`	in	Number of columns of the input matrix
`A`	`double[][lda]`	in	Array containing the matrix \(A\)
`lda`	`int64_t`	in	Leading dimension of array `A`
`B`	`double[][ldb]`	in	Array containing the matrix \(B\)
`ldb`	`int64_t`	in	Leading dimension of array `B`
`beta`	`double`	in	Array containing the matrix \(B\)
`C`	`double[][ldc]`	out	Array containing the matrix \(B\)
`ldc`	`int64_t`	in	Leading dimension of array `B`

Requirements:

context is not QMCKL_NULL_CONTEXT
m > 0
n > 0
k > 0
lda >= m
ldb >= n
ldc >= n
A is allocated with at least \(m \times k \times 8\) bytes
B is allocated with at least \(k \times n \times 8\) bytes
C is allocated with at least \(m \times n \times 8\) bytes

qmckl_exit_code qmckl_dgemm (
      const qmckl_context context,
      const char TransA,
      const char TransB,
      const int64_t m,
      const int64_t n,
      const int64_t k,
      const double alpha,
      const double* A,
      const int64_t lda,
      const double* B,
      const int64_t ldb,
      const double beta,
      double* const C,
      const int64_t ldc );

integer function qmckl_dgemm_f(context, TransA, TransB, &
     m, n, k, alpha, A, LDA, B, LDB, beta, C, LDC) &
     result(info)
  use qmckl
  implicit none
  integer(qmckl_context), intent(in)  :: context
  character             , intent(in)  :: TransA, TransB
  integer*8             , intent(in)  :: m, n, k
  double precision      , intent(in)  :: alpha, beta
  integer*8             , intent(in)  :: lda
  double precision      , intent(in)  :: A(lda,*)
  integer*8             , intent(in)  :: ldb
  double precision      , intent(in)  :: B(ldb,*)
  integer*8             , intent(in)  :: ldc
  double precision      , intent(out) :: C(ldc,*)

  info = QMCKL_SUCCESS

  if (context == QMCKL_NULL_CONTEXT) then
     info = QMCKL_INVALID_CONTEXT
     return
  endif

  if (m <= 0_8) then
     info = QMCKL_INVALID_ARG_4
     return
  endif

  if (n <= 0_8) then
     info = QMCKL_INVALID_ARG_5
     return
  endif

  if (k <= 0_8) then
     info = QMCKL_INVALID_ARG_6
     return
  endif

  if (LDA <= 0) then
     info = QMCKL_INVALID_ARG_9
     return
  endif

  if (LDB <= 0) then
     info = QMCKL_INVALID_ARG_11
     return
  endif

  if (LDC <= 0) then
     info = QMCKL_INVALID_ARG_14
     return
  endif

  call dgemm(transA, transB, int(m,4), int(n,4), int(k,4), &
       alpha, A, int(LDA,4), B, int(LDB,4), beta, C, int(LDC,4))

end function qmckl_dgemm_f

1.2 `qmckl_adjugate`

Given a matrix \(\mathbf{A}\), the adjugate matrix \(\text{adj}(\mathbf{A})\) is the transpose of the cofactors matrix of \(\mathbf{A}\).

\[ \mathbf{B} = \text{adj}(\mathbf{A}) = \text{det}(\mathbf{A}) \, \mathbf{A}^{-1} \]

Variable	Type	In/Out	Description
`context`	`qmckl_context`	in	Global state
`n`	`int64_t`	in	Number of rows and columns of the input matrix
`A`	`double[][lda]`	in	Array containing the \(n \times n\) matrix \(A\)
`lda`	`int64_t`	in	Leading dimension of array `A`
`B`	`double[][ldb]`	out	Adjugate of \(A\)
`ldb`	`int64_t`	in	Leading dimension of array `B`
`det_l`	`double`	inout	determinant of \(A\)

Requirements:

context is not QMCKL_NULL_CONTEXT
n > 0
lda >= m
A is allocated with at least \(m \times m \times 8\) bytes
ldb >= m
B is allocated with at least \(m \times m \times 8\) bytes

qmckl_exit_code qmckl_adjugate (
      const qmckl_context context,
      const int64_t n,
      const double* A,
      const int64_t lda,
      double* const B,
      const int64_t ldb,
      double* det_l );

For small matrices (≤ 5× 5), we use specialized routines for performance motivations. For larger sizes, we rely on the LAPACK library.

integer function qmckl_adjugate_f(context, na, A, LDA, B, ldb, det_l) &
     result(info)
  use qmckl
  implicit none
  integer(qmckl_context)  , intent(in)  :: context
  double precision, intent(in)          :: A (LDA,*)
  integer*8, intent(in)                 :: LDA
  double precision, intent(out)         :: B (LDB,*)
  integer*8, intent(in)                 :: LDB
  integer*8, intent(in)                 :: na
  double precision, intent(inout)       :: det_l

  integer    :: i,j

  info = QMCKL_SUCCESS

  if (context == QMCKL_NULL_CONTEXT) then
     info = QMCKL_INVALID_CONTEXT
     return
  endif

  if (na <= 0_8) then
     info = QMCKL_INVALID_ARG_2
     return
  endif

  if (LDA <= 0_8) then
     info = QMCKL_INVALID_ARG_4
     return
  endif

  if (LDA < na) then
     info = QMCKL_INVALID_ARG_4
     return
  endif

  select case (na)
  case (5)
     call adjugate5(A,LDA,B,LDB,na,det_l)
  case (4)
     call adjugate4(A,LDA,B,LDB,na,det_l)
  case (3)
     call adjugate3(A,LDA,B,LDB,na,det_l)
  case (2)
     call adjugate2(A,LDA,B,LDB,na,det_l)
  case (1)
    det_l = a(1,1)
    b(1,1) = 1.d0
  case default
     call adjugate_general(context, na, A, LDA, B, LDB, det_l)
  end select

end function qmckl_adjugate_f

subroutine adjugate_general(context, na, A, LDA, B, LDB, det_l)
  use qmckl
  implicit none
  integer(qmckl_context), intent(in) :: context
  double precision, intent(in)       :: A (LDA,na)
  integer*8, intent(in)              :: LDA
  double precision, intent(out)      :: B (LDB,na)
  integer*8, intent(in)              :: LDB
  integer*8, intent(in)              :: na
  double precision, intent(inout)    :: det_l

  double precision :: work(LDA*max(na,64))
  integer          :: inf
  integer          :: ipiv(LDA)
  integer          :: lwork
  integer(8)       :: i, j

We first copy the array A into array B.

B(1:na,1:na) = A(1:na,1:na)

Then, we compute the LU factorization \(LU=A\) using the dgetrf routine.

call dgetrf(na, na, B, LDB, ipiv, inf )

By convention, the determinant of \(L\) is equal to one, so the determinant of \(A\) is equal to the determinant of \(U\), which is simply computed as the product of its diagonal elements.

det_l = 1.d0
j=0_8
do i=1,na
 j = j+min(abs(ipiv(i)-i),1)
 det_l = det_l*B(i,i)
enddo

As dgetrf returns \(PLU=A\) where \(P\) is a permutation matrix, the sign of the determinant is computed as \(-1^m\) where \(m\) is the number of permutations.

if (iand(j,1_8) /= 0_8)  then
  det_l = -det_l
endif

Then, the inverse of \(A\) is computed using dgetri:

lwork = SIZE(work)
call dgetri(na, B, LDB, ipiv, work, lwork, inf )

and the adjugate matrix is computed as the product of the determinant with the inverse:

  B(:,:) = B(:,:)*det_l

end subroutine adjugate_general

BLAS functions

Table of Contents

1 Matrix operations

1.1 qmckl_dgemm

1.2 qmckl_adjugate

1.1 `qmckl_dgemm`

1.2 `qmckl_adjugate`