mirror of
https://github.com/TREX-CoE/qmckl.git
synced 2025-01-08 20:33:40 +01:00
- Updated Performance recommendations, did some rewriting of parts of the text and removed more typos.
This commit is contained in:
parent
78c574af49
commit
37d5ff61ff
@@ -2,7 +2,7 @@
#+SETUPFILE: ../tools/theme.setup
#+INCLUDE: ../tools/lib.org

Low- and high-level functions that use the Sherman-Morrison and
Low- and high-level functions that use the Sherman-Morrison and
Woodbury matrix inversion formulas to update the inverse of a
non-singular matrix

@@ -59,8 +59,8 @@ int main() {
This value sets the lower bound below which the
denominator $1+v_j^TS^{-1}u_j$ is considered to be too small and will most probably result in a singular matrix
$S$, or at least in an inverse of $S$ of very poor numerical quality. Therefore, when $1+v_j^TS^{-1}u_j \geq \epsilon$,
the update is applied as usual. If $1+v_j^TS^{-1}u_j \leq \epsilon$ the update is rejected and the kernel exits with
return code \texttt{QMCKL_FAILURE}.
the update is applied as usual and the kernel exits with return code \texttt{QMCKL_SUCCESS}.
If $1+v_j^TS^{-1}u_j < \epsilon$ the update is rejected and the kernel exits with return code \texttt{QMCKL_FAILURE}.
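The threshold test can be illustrated with a minimal C sketch. This is not the QMCkl implementation: the name ~sm_rank1_update~, the macro ~SM_MAX_DIM~, the integer return codes and the choice $v = e_{col}$ (an update of one column of $S$) are assumptions made for illustration. The absolute value in the test follows the ~fabs(den) < breakdown~ check that appears in ~qmckl_slagel_splitting_c~.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

#define SM_MAX_DIM 16 /* illustrative limit so the sketch needs no allocation */

/* Hypothetical helper, not part of the QMCkl API.
 * Updates the row-major dim x dim inverse Sinv of S for S <- S + u e_col^T.
 * Returns 0 if the update was applied, 1 if it was rejected because
 * |1 + v^T S^{-1} u| < breakdown (Sinv is then left untouched). */
int sm_rank1_update(size_t dim, const double *u, size_t col,
                    double breakdown, double *Sinv)
{
  assert(dim > 0 && dim <= SM_MAX_DIM && col < dim);
  double C[SM_MAX_DIM]; /* C = S^{-1} u */
  double D[SM_MAX_DIM]; /* D = v^T S^{-1}, i.e. row col of S^{-1} */

  for (size_t i = 0; i < dim; i++) {
    C[i] = 0.0;
    for (size_t k = 0; k < dim; k++) {
      C[i] += Sinv[i * dim + k] * u[k];
    }
  }

  /* Denominator 1 + v^T S^{-1} u with v = e_col */
  const double den = 1.0 + C[col];
  if (fabs(den) < breakdown) {
    return 1; /* reject: the updated matrix would be (nearly) singular */
  }

  /* Copy the row before S^{-1} is modified */
  for (size_t j = 0; j < dim; j++) {
    D[j] = Sinv[col * dim + j];
  }

  /* S^{-1} <- S^{-1} - C D / den */
  for (size_t i = 0; i < dim; i++) {
    for (size_t j = 0; j < dim; j++) {
      Sinv[i * dim + j] -= C[i] * D[j] / den;
    }
  }
  return 0;
}
```

Starting from the identity, adding $u = (1, 0)$ to column 0 is accepted (the denominator is 2), while $u = (-1, 0)$ zeroes the column, gives a denominator of 0 and is rejected.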

#+NAME: qmckl_sherman_morrison_args
| qmckl_context | context | in | Global state |
@@ -176,8 +176,9 @@ qmckl_exit_code qmckl_sherman_morrison_c(const qmckl_context context,

*** Performance

This function performs better when there is only 1 rank-1 update in the update cycle and the fail-rate of rank-1 updates is high.

This function performs best when there is only 1 rank-1 update in the update cycle. It is not useful to
use Sherman-Morrison with update splitting for these cycles since splitting can never resolve a situation
where applying the update causes singular behaviour.

** C interface :noexport:

@@ -449,7 +450,8 @@ qmckl_exit_code qmckl_woodbury_2_c(const qmckl_context context,

*** Performance

This function is most efficient when used in cases where there are only 2 rank-1 updates.
This function is most efficient when used in cases where there are only 2 rank-1 updates and
it is certain that they will not result in a singular matrix.

** C interface :noexport:

@@ -689,8 +691,8 @@ qmckl_exit_code qmckl_woodbury_3_c(const qmckl_context context,

*** Performance...

This function is most efficient when used in cases where there are only 3 rank-1 updates.

This function is most efficient when used in cases where there are only 3 rank-1 updates and
it is certain that they will not result in a singular matrix.

** C interface :noexport:

@@ -780,11 +782,13 @@ assert(rc == QMCKL_SUCCESS);

This is a variation on the 'Naive' Sherman-Morrison kernel. Whenever the denominator $1+v_j^T S^{-1} u_j$ in
the Sherman-Morrison formula is deemed to be too close to zero, the update $u_j$ is split in half:
$u_j \rightarrow \frac{1}{1} u_j$. One half is applied immediately --necessarily increasing the value of the
$u_j \rightarrow \frac{1}{2} u_j$. One half is applied immediately --necessarily increasing the value of the
denominator because of the split-- while the other half is put in a queue that will be applied when all the
remaining updates have been treated. The kernel is executed recursively until the queue is eiter empty and all
remaining updates have been treated.

The kernel is executed recursively until the queue is either empty and all
updates are applied successfully, or the size of the queue equals the number of initial updates. In the last
case the Slater-matrix that would have resulted from applying the updates is un-invertable and therefore the
case the Slater-matrix that would have resulted from applying the updates is singular and therefore the
kernel exits with an exit code.
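To see why the split raises the denominator, write $d = 1+v_j^T S^{-1} u_j$ for the rejected value. The following short worked step uses only the definitions above:

\begin{align*}
1 + v_j^T S^{-1} \left(\tfrac{1}{2} u_j\right) = 1 + \tfrac{1}{2}(d - 1) = \tfrac{1}{2}(1 + d).
\end{align*}

So when $|d| < \epsilon$, the first half is applied with a denominator within $\epsilon/2$ of $\frac{1}{2}$. If the second half were applied immediately afterwards, with no other update in between, the Sherman-Morrison formula gives its denominator as

\begin{align*}
1 + \frac{\tfrac{1}{2}(d-1)}{\tfrac{1}{2}(1+d)} = \frac{2d}{1+d},
\end{align*}

which is again close to zero. This is consistent with the second half being queued until the remaining updates have been treated.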

#+NAME: qmckl_sherman_morrison_splitting_args
@@ -877,7 +881,7 @@ qmckl_exit_code qmckl_sherman_morrison_splitting_c(const qmckl_context context,

*** Performance...

This kernel performs best when there are only 1 rank-1 update cycles and/or when the fail-rate is high.
This kernel performs best for update cycles with 2 or more rank-1 updates and a high fail-rate.

** C interface :noexport:

@@ -1099,7 +1103,7 @@ qmckl_exit_code qmckl_sherman_morrison_smw32s_c(const qmckl_context context,

*** Performance...

This kernel performs best when the number of rank-1 updates is larger than 3 and fail-rates are low.
This kernel performs best for update cycles with 2 or more rank-1 updates and a low fail-rate.

** C interface :noexport:

@@ -1176,7 +1180,7 @@ for (unsigned int i = 0; i < Dim; i++) {
}
assert(rc == QMCKL_SUCCESS);
#+end_src



* Helper Functions

@@ -1191,7 +1195,7 @@ These functions can only be used internally by the kernels in this module.
:END:

~qmckl_slagel_splitting~ is the non-recursive, inner part of the 'Sherman-Morrison with update splitting'-kernel.
It is used internally to apply a collection of $N$ of rank-1 updates to the inverse Slater-matrix $S^{-1}$ and
It is used internally to apply a collection of $N$ rank-1 updates to the inverse Slater-matrix $S^{-1}$ and
to split an update into two equal pieces if necessary. In case of a split, it applies the first half of the update,
while putting the second half in a waiting queue to be applied at the end.
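One pass of this splitting scheme might look as follows. This is a self-contained sketch under stated assumptions, not the QMCkl kernel: the names ~split_pass~, ~cols~, ~later_cols~ and ~SPLIT_MAX_DIM~ are hypothetical, column indices are 0-based, and the inverse is stored row-major with each update added to one column of $S$.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

#define SPLIT_MAX_DIM 16 /* illustrative limit so the sketch needs no allocation */

/* Hypothetical sketch, not the QMCkl kernel.
 * Applies n_updates column updates to the row-major dim x dim inverse Sinv.
 * Update l is the vector updates[l*dim .. l*dim+dim-1], added to column
 * cols[l] (0-based) of S. When |den| < breakdown the update is halved: the
 * first half is applied now, the second half is stored in later_updates
 * (with its column in later_cols). Returns the number of queued halves. */
size_t split_pass(size_t dim, size_t n_updates, const double *updates,
                  const size_t *cols, double breakdown, double *Sinv,
                  double *later_updates, size_t *later_cols)
{
  assert(dim > 0 && dim <= SPLIT_MAX_DIM);
  size_t later = 0;
  double C[SPLIT_MAX_DIM], D[SPLIT_MAX_DIM];

  for (size_t l = 0; l < n_updates; l++) {
    const double *u = &updates[l * dim];
    const size_t col = cols[l];

    /* C = S^{-1} u */
    for (size_t i = 0; i < dim; i++) {
      C[i] = 0.0;
      for (size_t k = 0; k < dim; k++) {
        C[i] += Sinv[i * dim + k] * u[k];
      }
    }

    double den = 1.0 + C[col];
    if (fabs(den) < breakdown) {
      for (size_t i = 0; i < dim; i++) {
        later_updates[later * dim + i] = u[i] / 2.0; /* queue the second half */
        C[i] /= 2.0;                                 /* keep the first half */
      }
      later_cols[later] = col;
      later++;
      den = 1.0 + C[col];
    }

    /* D = row col of S^{-1}, copied before S^{-1} is modified */
    for (size_t j = 0; j < dim; j++) {
      D[j] = Sinv[col * dim + j];
    }
    /* S^{-1} <- S^{-1} - C D / den */
    for (size_t i = 0; i < dim; i++) {
      for (size_t j = 0; j < dim; j++) {
        Sinv[i * dim + j] -= C[i] * D[j] / den;
      }
    }
  }
  return later;
}
```

A caller would run the pass again on the queued halves until the returned count is zero or stops decreasing. Keeping the pass itself non-recursive leaves that decision, and the failure handling, to the outer kernel.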

@@ -1279,9 +1283,9 @@ qmckl_exit_code qmckl_slagel_splitting_c(uint64_t Dim,

// Denominator
double den = 1 + C[Updates_index[l] - 1];
if (fabs(den) < breakdown) {
if (fabs(den) < breakdown) { // Decide here whether to split the update.

// U_l = U_l / 2 (do the split)
// U_l = U_l / 2: split the update in 2 equal halves and save the second half in later_updates
for (uint64_t i = 0; i < Dim; i++) {
later_updates[*later * Dim + i] = Updates[l * Dim + i] / 2.0;
C[i] /= 2.0;
@@ -1290,7 +1294,7 @@ qmckl_exit_code qmckl_slagel_splitting_c(uint64_t Dim,
(*later)++;

den = 1 + C[Updates_index[l] - 1];
}
} // From here onwards we continue with applying the first half of the update to Slater_inv
double iden = 1 / den;

// D = v^T x S^{-1}
@@ -1315,7 +1319,8 @@ qmckl_exit_code qmckl_slagel_splitting_c(uint64_t Dim,

*** Performance

This function performce better for cycles with 1 rank-1 update and with a high fail-rate.
This function cannot be used by itself; it is used in Sherman-Morrison with update splitting and in Woodbury 3x3 and 2x2
with Sherman-Morrison and update splitting. Please look at the performance recommendations for those two kernels.


** C interface :noexport:
