51 KiB
TREX Configuration file
- Metadata (metadata group)
- Electron (electron group)
- Nucleus (nucleus group)
- Effective core potentials (ecp group)
- Basis set (basis group)
- Atomic orbitals (ao group)
- Molecular orbitals (mo group)
- Slater determinants (determinant group)
- Configuration state functions (csf group)
- Excited states (state group)
- Reduced density matrices (rdm group)
- Cell (cell group)
- Periodic boundary calculations (pbc group)
- Quantum Monte Carlo data (qmc group)
This page contains information about the general structure of the
TREXIO library. The source code of the library can be automatically
generated based on the contents of the trex.json
configuration file,
which itself is generated from different sections (groups) presented
below.
All quantities are saved in TREXIO files in atomic units. The
dimensions of the arrays in the tables below are given in column-major
order (as in Fortran), and the ordering of the dimensions is reversed
in the produced trex.json
configuration file as the library is
written in C.
TREXIO currently supports int
, float
and str
types for both
single attributes and arrays. Note, that some attributes might have
dim
type (e.g. num
of the nucleus
group). This type is treated
exactly in the same way as int
with the only difference that dim
variables cannot be negative. This additional constraint is required
because dim
attributes are used internally to allocate memory and to
check array boundaries in the memory-safe API. Most of the times, the
dim
variables contain the num
suffix.
You may also encounter some dim readonly
variables.
It means that the value is automatically computed and written by the
TREXIO library, thus it is read-only and cannot be (over)written by the
user.
In Fortran, arrays are 1-based and in most other languages the
arrays are 0-based. Hence, we introduce the index
type which is a
1-based int
in the Fortran interface and 0-based otherwise.
For sparse data structures such as electron replusion integrals,
the data can be too large to fit in memory and the data needs to be
fetched using multiple function calls to perform I/O on buffers.
For more information on how to read/write sparse data structures, see
the examples. The sparse
data representation implies the
coordinate list representation, namely the user has to write a list
of indices and values.
For the Configuration Interfaction (CI) and Configuration State Function (CSF)
groups, the buffered
data type is introduced, which allows similar incremental
I/O as for sparse
data but without the need to write indices of the sparse values.
For determinant lists (integer bit fields), the special
attribute is present in the type.
This means that the source code is not produced by the generator, but hand-written.
Metadata (metadata group)
As we expect our files to be archived in open-data repositories, we need to give the possibility to the users to store some metadata inside the files. We propose to store the list of names of the codes which have participated to the creation of the file, a list of authors of the file, and a textual description.
Variable | Type | Dimensions (for arrays) | Description |
---|---|---|---|
code_num |
dim |
Number of codes used to produce the file | |
code |
str |
(metadata.code_num) |
Names of the codes used |
author_num |
dim |
Number of authors of the file | |
author |
str |
(metadata.author_num) |
Names of the authors of the file |
package_version |
str |
TREXIO version used to produce the file | |
description |
str |
Text describing the content of file | |
unsafe |
int |
1 : true, 0 : false |
Note: unsafe
attribute of the metadata
group indicates whether the file has been previously opened with 'u'
mode.
It is automatically written in the file upon the first unsafe opening.
If the user has checked that the TREXIO file is valid (e.g. using trexio-tools
) after unsafe operations,
then the unsafe
attribute value can be manually overwritten (in unsafe mode) from 1
to 0
.
"metadata": {
"code_num" : [ "dim", [] ]
, "code" : [ "str", [ "metadata.code_num" ] ]
, "author_num" : [ "dim", [] ]
, "author" : [ "str", [ "metadata.author_num" ] ]
, "package_version" : [ "str", [] ]
, "description" : [ "str", [] ]
, "unsafe" : [ "int", [] ]
} ,
Electron (electron group)
We consider wave functions expressed in the spin-free formalism, where the number of ↑ and ↓ electrons is fixed.
#+NAME:electron
Variable | Type | Dimensions | Description |
---|---|---|---|
num |
dim |
Number of electrons | |
up_num |
int |
Number of ↑-spin electrons | |
dn_num |
int |
Number of ↓-spin electrons |
"electron": {
"num" : [ "dim", [] ]
, "up_num" : [ "int", [] ]
, "dn_num" : [ "int", [] ]
} ,
Nucleus (nucleus group)
The nuclei are considered as fixed point charges. Coordinates are given in Cartesian $(x,y,z)$ format.
Variable | Type | Dimensions | Description |
---|---|---|---|
num |
dim |
Number of nuclei | |
charge |
float |
(nucleus.num) |
Charges of the nuclei |
coord |
float |
(3,nucleus.num) |
Coordinates of the atoms |
label |
str |
(nucleus.num) |
Atom labels |
point_group |
str |
Symmetry point group | |
repulsion |
float |
Nuclear repulsion energy |
"nucleus": {
"num" : [ "dim" , [] ]
, "charge" : [ "float", [ "nucleus.num" ] ]
, "coord" : [ "float", [ "nucleus.num", "3" ] ]
, "label" : [ "str" , [ "nucleus.num" ] ]
, "point_group" : [ "str" , [] ]
, "repulsion" : [ "float", [] ]
} ,
Effective core potentials (ecp group)
An effective core potential (ECP) $V_A^{\text{ECP}}$ replacing the core electrons of atom $A$ can be expressed as \[ V_A^{\text{ECP}} = V_{A \ell_{\max}+1} + \sum_{\ell=0}^{\ell_{\max}} \sum_{m=-\ell}^{\ell} | Y_{\ell m} \rangle \left[ V_{A \ell} - V_{A \ell_{\max}+1} \right] \langle Y_{\ell m} | \]
The first term in the equation above is sometimes attributed to the local channel, while the remaining terms correspond to the non-local channel projections.
The functions $V_{A\ell}$ are parameterized as: \[ V_{A \ell}(\mathbf{r}) = \sum_{q=1}^{N_{q \ell}} \beta_{A q \ell}\, |\mathbf{r}-\mathbf{R}_{A}|^{n_{A q \ell}}\, e^{-\alpha_{A q \ell} |\mathbf{r}-\mathbf{R}_{A}|^2 } \]
See http://dx.doi.org/10.1063/1.4984046 or https://doi.org/10.1063/1.5121006 for more info.
Variable | Type | Dimensions | Description |
---|---|---|---|
max_ang_mom_plus_1 |
int |
(nucleus.num) |
$\ell_{\max}+1$, one higher than the max angular momentum in the removed core orbitals |
z_core |
int |
(nucleus.num) |
Number of core electrons to remove per atom |
num |
dim |
Total number of ECP functions for all atoms and all values of $\ell$ | |
ang_mom |
int |
(ecp.num) |
One-to-one correspondence between ECP items and the angular momentum $\ell$ |
nucleus_index |
index |
(ecp.num) |
One-to-one correspondence between ECP items and the atom index |
exponent |
float |
(ecp.num) |
$\alpha_{A q \ell}$ all ECP exponents |
coefficient |
float |
(ecp.num) |
$\beta_{A q \ell}$ all ECP coefficients |
power |
int |
(ecp.num) |
$n_{A q \ell}$ all ECP powers |
There might be some confusion in the meaning of the $\ell_{\max}$. It can be attributed to the maximum angular momentum occupied in the core orbitals, which are removed by the ECP. On the other hand, it can be attributed to the maximum angular momentum of the ECP that replaces the core electrons. Note, that the latter $\ell_{\max}$ is always higher by 1 than the former.
Note for developers: avoid having variables with similar prefix in their name.
HDF5 back end might cause issues due to the way find_dataset
function works.
For example, in the ECP group we use max_ang_mom
and not ang_mom_max
.
The latter causes issues when written before the ang_mom
array in the TREXIO file.
Update: in fact, the aforementioned issue has only been observed when using HDF5 version 1.10.4
installed via apt-get
. Installing the same version from the conda-forge
channel and running it in
an isolated conda
environment works just fine. Thus, it seems to be a bug in the apt
-provided package.
If you encounter the aforementioned issue, please report it to our issue tracker on GitHub.
"ecp": {
"max_ang_mom_plus_1" : [ "int" , [ "nucleus.num" ] ]
, "z_core" : [ "int" , [ "nucleus.num" ] ]
, "num" : [ "dim" , [] ]
, "ang_mom" : [ "int" , [ "ecp.num" ] ]
, "nucleus_index" : [ "index", [ "ecp.num" ] ]
, "exponent" : [ "float", [ "ecp.num" ] ]
, "coefficient" : [ "float", [ "ecp.num" ] ]
, "power" : [ "int" , [ "ecp.num" ] ]
} ,
Example
For example, consider H_2 molecule with the following effective core potential (in GAMESS input format for the H atom):
H-ccECP GEN 0 1 3 1.00000000000000 1 21.24359508259891 21.24359508259891 3 21.24359508259891 -10.85192405303825 2 21.77696655044365 1 0.00000000000000 2 1.000000000000000
In TREXIO representation this would be:
num = 8 # lmax+1 per atom max_ang_mom_plus_1 = [ 1, 1 ] # number of core electrons to remove per atom zcore = [ 0, 0 ] # first 4 ECP elements correspond to the first H atom ; the remaining 4 elements are for the second H atom nucleus_index = [ 0, 0, 0, 0, 1, 1, 1, 1 ] # 3 first ECP elements correspond to potential of the P orbital (l=1), then 1 element for the S orbital (l=0) ; similar for the second H atom ang_mom = [ 1, 1, 1, 0, 1, 1, 1, 0 ] # ECP quantities that can be attributed to atoms and/or angular momenta based on the aforementioned ecp_nucleus and ecp_ang_mom arrays coefficient = [ 1.00000000000000, 21.24359508259891, -10.85192405303825, 0.00000000000000, 1.00000000000000, 21.24359508259891, -10.85192405303825, 0.00000000000000 ] exponent = [ 21.24359508259891, 21.24359508259891, 21.77696655044365, 1.000000000000000, 21.24359508259891, 21.24359508259891, 21.77696655044365, 1.000000000000000 ] power = [ -1, 1, 0, 0, -1, 1, 0, 0 ]
Basis set (basis group)
We consider here basis functions centered on nuclei. Hence, we enable the possibility to define dummy atoms to place basis functions in random positions.
The atomic basis set is defined as a list of shells. Each shell $s$ is centered on a center $A$, possesses a given angular momentum $l$ and a radial function $R_s$. The radial function is a linear combination of $N_{\text{prim}}$ primitive functions that can be of type Slater ($p=1$) or Gaussian ($p=2$), parameterized by exponents $\gamma_{ks}$ and coefficients $a_{ks}$: \[ R_s(\mathbf{r}) = \mathcal{N}_s \vert\mathbf{r}-\mathbf{R}_A\vert^{n_s} \sum_{k=1}^{N_{\text{prim}}} a_{ks}\, f_{ks}(\gamma_{ks},p)\, \exp \left( - \gamma_{ks} \vert \mathbf{r}-\mathbf{R}_A \vert ^p \right). \]
In the case of Gaussian functions, $n_s$ is always zero.
Different codes normalize functions at different levels. Computing normalization factors requires the ability to compute overlap integrals, so the normalization factors should be written in the file to ensure that the file is self-contained and does not need the client program to have the ability to compute such integrals.
Some codes assume that the contraction coefficients are for a linear combination of normalized primitives. This implies that a normalization constant for the primitive $ks$ needs to be computed and stored. If this normalization factor is not required, $f_{ks}=1$.
Some codes assume that the basis function are normalized. This implies the computation of an extra normalization factor, $\mathcal{N}_s$. If the the basis function is not considered normalized, $\mathcal{N}_s=1$.
All the basis set parameters are stored in one-dimensional arrays:
Variable | Type | Dimensions | Description |
---|---|---|---|
type |
str |
Type of basis set: "Gaussian" or "Slater" | |
prim_num |
dim |
Total number of primitives | |
shell_num |
dim |
Total number of shells | |
nucleus_index |
index |
(basis.shell_num) |
One-to-one correspondence between shells and atomic indices |
shell_ang_mom |
int |
(basis.shell_num) |
One-to-one correspondence between shells and angular momenta |
shell_factor |
float |
(basis.shell_num) |
Normalization factor of each shell ($\mathcal{N}_s$) |
r_power |
int |
(basis.shell_num) |
Power to which $r$ is raised ($n_s$) |
shell_index |
index |
(basis.prim_num) |
One-to-one correspondence between primitives and shell index |
exponent |
float |
(basis.prim_num) |
Exponents of the primitives ($\gamma_{ks}$) |
coefficient |
float |
(basis.prim_num) |
Coefficients of the primitives ($a_{ks}$) |
prim_factor |
float |
(basis.prim_num) |
Normalization coefficients for the primitives ($f_{ks}$) |
"basis": {
"type" : [ "str" , [] ]
, "prim_num" : [ "dim" , [] ]
, "shell_num" : [ "dim" , [] ]
, "nucleus_index" : [ "index", [ "basis.shell_num" ] ]
, "shell_ang_mom" : [ "int" , [ "basis.shell_num" ] ]
, "shell_factor" : [ "float", [ "basis.shell_num" ] ]
, "r_power" : [ "int" , [ "basis.shell_num" ] ]
, "shell_index" : [ "index", [ "basis.prim_num" ] ]
, "exponent" : [ "float", [ "basis.prim_num" ] ]
, "coefficient" : [ "float", [ "basis.prim_num" ] ]
, "prim_factor" : [ "float", [ "basis.prim_num" ] ]
} ,
Example
For example, consider H_2 with the following basis set (in GAMESS format), where both the AOs and primitives are considered normalized:
HYDROGEN S 5 1 3.387000E+01 6.068000E-03 2 5.095000E+00 4.530800E-02 3 1.159000E+00 2.028220E-01 4 3.258000E-01 5.039030E-01 5 1.027000E-01 3.834210E-01 S 1 1 3.258000E-01 1.000000E+00 S 1 1 1.027000E-01 1.000000E+00 P 1 1 1.407000E+00 1.000000E+00 P 1 1 3.880000E-01 1.000000E+00 D 1 1 1.057000E+00 1.000000E+00
In TREXIO representaion we have:
type = "Gaussian" prim_num = 20 shell_num = 12 # 6 shells per H atom nucleus_index = [ 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1 ] # 3 shells in S (l=0), 2 in P (l=1), 1 in D (l=2) shell_ang_mom = [ 0, 0, 0, 1, 1, 2, 0, 0, 0, 1, 1, 2 ] # no need to renormalize shells shell_factor = [ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1. ] # 5 primitives for the first S shell and then 1 primitive per remaining shells in each H atom shell_index = [ 0, 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 6, 6, 6, 6, 7, 8, 9, 10, 11 ] # parameters of the primitives (10 per H atom) exponent = [ 33.87, 5.095, 1.159, 0.3258, 0.1027, 0.3258, 0.1027, 1.407, 0.388, 1.057, 33.87, 5.095, 1.159, 0.3258, 0.1027, 0.3258, 0.1027, 1.407, 0.388, 1.057 ] coefficient = [ 0.006068, 0.045308, 0.202822, 0.503903, 0.383421, 1.0, 1.0, 1.0, 1.0, 1.0, 0.006068, 0.045308, 0.202822, 0.503903, 0.383421, 1.0, 1.0, 1.0, 1.0, 1.0 ] ` prim_factor = [ 1.0006253235944540e+01, 2.4169531573445120e+00, 7.9610924849766440e-01 3.0734305383061117e-01, 1.2929684417481876e-01, 3.0734305383061117e-01, 1.2929684417481876e-01, 2.1842769845268308e+00, 4.3649547399719840e-01, 1.8135965626177861e+00, 1.0006253235944540e+01, 2.4169531573445120e+00, 7.9610924849766440e-01, 3.0734305383061117e-01, 1.2929684417481876e-01, 3.0734305383061117e-01, 1.2929684417481876e-01, 2.1842769845268308e+00, 4.3649547399719840e-01, 1.8135965626177861e+00 ]
Atomic orbitals (ao group)
Going from the atomic basis set to AOs implies a systematic construction of all the angular functions of each shell. We consider two cases for the angular functions: the real-valued spherical harmonics, and the polynomials in Cartesian coordinates. In the case of real spherical harmonics, the AOs are ordered as $0, +1, -1, +2, -2, \dots, +m, -m$ (see Wikipedia). In the case of polynomials we impose the canonical (or alphabetical) ordering), i.e
\begin{eqnarray} p & : & p_x, p_y, p_z \nonumber \\ d & : & d_{xx}, d_{xy}, d_{xz}, d_{yy}, d_{yz}, d_{zz} \nonumber \\ f & : & f_{xxx}, f_{xxy}, f_{xxz}, f_{xyy}, f_{xyz}, f_{xzz}, f_{yyy}, f_{yyz}, f_{yzz}, …f_{zzz} \nonumber \\ {\rm etc.} \nonumber \end{eqnarray}Note that there is no exception for $p$ orbitals in spherical coordinates: the ordering is $0,+1,-1$ which corresponds $p_z, p_x, p_y$.
AOs are defined as
\[ \chi_i (\mathbf{r}) = \mathcal{N}_i\, P_{\eta(i)}(\mathbf{r})\, R_{\theta(i)} (\mathbf{r}) \]
where $i$ is the atomic orbital index, $P$ encodes for either the polynomials or the spherical harmonics, $\theta(i)$ returns the shell on which the AO is expanded, and $\eta(i)$ denotes which angular function is chosen. $\mathcal{N}_i$ is a normalization factor that enables the possibility to have different normalization coefficients within a shell, as in the GAMESS convention where $\mathcal{N}_{x^2} \ne \mathcal{N}_{xy}$ because \[ \left[ \iiint \left(x-X_A \right)^2 R_{\theta(i)} (\mathbf{r}) dx\, dy\, dz \right]^{-1/2} \ne \left[ \iiint \left( x-X_A \right) \left( y-Y_A \right) R_{\theta(i)} (\mathbf{r}) dx\, dy\, dz \right]^{-1/2}. \]
In such a case, one should set the normalization of the shell (in the Basis set section) to $\mathcal{N}_{z^2}$, which is the normalization factor of the atomic orbitals in spherical coordinates. The normalization factor of the $xy$ function which should be introduced here should be $\frac{\mathcal{N}_{xy}}{\mathcal{N}_{z^2}}$.
Variable | Type | Dimensions | Description |
---|---|---|---|
cartesian |
int |
1 : true, 0 : false |
|
num |
dim |
Total number of atomic orbitals | |
shell |
index |
(ao.num) |
basis set shell for each AO |
normalization |
float |
(ao.num) |
Normalization factors |
"ao": {
"cartesian" : [ "int" , [] ]
, "num" : [ "dim" , [] ]
, "shell" : [ "index", [ "ao.num" ] ]
, "normalization" : [ "float", [ "ao.num" ] ]
} ,
One-electron integrals (ao_1e_int
group)
- \[ \hat{V}_{\text{ne}} = \sum_{A=1}^{N_\text{nucl}} \sum_{i=1}^{N_\text{elec}} \frac{-Z_A }{\vert \mathbf{R}_A - \mathbf{r}_i \vert} \] : electron-nucleus attractive potential,
- \[ \hat{T}_{\text{e}} = \sum_{i=1}^{N_\text{elec}} -\frac{1}{2}\hat{\Delta}_i \] : electronic kinetic energy
- $\hat{h} = \hat{T}_\text{e} + \hat{V}_\text{ne} + \hat{V}_\text{ECP}$ : core electronic Hamiltonian
The one-electron integrals for a one-electron operator $\hat{O}$ are \[ \langle p \vert \hat{O} \vert q \rangle \], returned as a matrix over atomic orbitals.
Variable | Type | Dimensions | Description |
---|---|---|---|
overlap |
float |
(ao.num, ao.num) |
$\langle p \vert q \rangle$ (real part, general case) |
kinetic |
float |
(ao.num, ao.num) |
$\langle p \vert \hat{T}_e \vert q \rangle$ (real part, general case) |
potential_n_e |
float |
(ao.num, ao.num) |
$\langle p \vert \hat{V}_{\text{ne}} \vert q \rangle$ (real part, general case) |
ecp |
float |
(ao.num, ao.num) |
$\langle p \vert \hat{V}_{\text{ecp}} \vert q \rangle$ (real part, general case) |
core_hamiltonian |
float |
(ao.num, ao.num) |
$\langle p \vert \hat{h} \vert q \rangle$ (real part, general case) |
overlap_im |
float |
(ao.num, ao.num) |
$\langle p \vert q \rangle$ (imaginary part) (imaginary part) |
kinetic_im |
float |
(ao.num, ao.num) |
$\langle p \vert \hat{T}_e \vert q \rangle$ (imaginary part) |
potential_n_e_im |
float |
(ao.num, ao.num) |
$\langle p \vert \hat{V}_{\text{ne}} \vert q \rangle$ (imaginary part) |
ecp_im |
float |
(ao.num, ao.num) |
$\langle p \vert \hat{V}_{\text{ECP}} \vert q \rangle$ (imaginary part) |
core_hamiltonian_im |
float |
(ao.num, ao.num) |
$\langle p \vert \hat{h} \vert q \rangle$ (imaginary part) |
"ao_1e_int": {
"overlap" : [ "float", [ "ao.num", "ao.num" ] ]
, "kinetic" : [ "float", [ "ao.num", "ao.num" ] ]
, "potential_n_e" : [ "float", [ "ao.num", "ao.num" ] ]
, "ecp" : [ "float", [ "ao.num", "ao.num" ] ]
, "core_hamiltonian" : [ "float", [ "ao.num", "ao.num" ] ]
, "overlap_im" : [ "float", [ "ao.num", "ao.num" ] ]
, "kinetic_im" : [ "float", [ "ao.num", "ao.num" ] ]
, "potential_n_e_im" : [ "float", [ "ao.num", "ao.num" ] ]
, "ecp_im" : [ "float", [ "ao.num", "ao.num" ] ]
, "core_hamiltonian_im" : [ "float", [ "ao.num", "ao.num" ] ]
} ,
Two-electron integrals (ao_2e_int
group)
The two-electron integrals for a two-electron operator $\hat{O}$ are \[ \langle p q \vert \hat{O} \vert r s \rangle \] in physicists notation or \[ ( pr \vert \hat{O} \vert qs ) \] in chemists notation, where $p,q,r,s$ are indices over atomic orbitals.
Functions are provided to get the indices in physicists or chemists notation.
- \[ \hat{W}_{\text{ee}} = \sum_{i=2}^{N_\text{elec}} \sum_{j=1}^{i-1} \frac{1}{\vert \mathbf{r}_i - \mathbf{r}_j \vert} \] : electron-electron repulsive potential operator.
- \[ \hat{W}^{lr}_{\text{ee}} = \sum_{i=2}^{N_\text{elec}} \sum_{j=1}^{i-1} \frac{\text{erf}(\vert \mathbf{r}_i - \mathbf{r}_j \vert)}{\vert \mathbf{r}_i - \mathbf{r}_j \vert} \] : electron-electron long range potential
Variable | Type | Dimensions | Description |
---|---|---|---|
eri |
float sparse |
(ao.num, ao.num, ao.num, ao.num) |
Electron repulsion integrals |
eri_lr |
float sparse |
(ao.num, ao.num, ao.num, ao.num) |
Long-range Electron repulsion integrals |
"ao_2e_int": {
"eri" : [ "float sparse", [ "ao.num", "ao.num", "ao.num", "ao.num" ] ]
, "eri_lr" : [ "float sparse", [ "ao.num", "ao.num", "ao.num", "ao.num" ] ]
} ,
Molecular orbitals (mo group)
Variable | Type | Dimensions | Description |
---|---|---|---|
type |
str |
Free text to identify the set of MOs (HF, Natural, Local, CASSCF, etc) | |
num |
dim |
Number of MOs | |
coefficient |
float |
(ao.num, mo.num) |
MO coefficients (real part, general case) |
coefficient_im |
float |
(ao.num, mo.num) |
MO coefficients (imaginary part, for periodic calculations) |
class |
str |
(mo.num) |
Choose among: Core, Inactive, Active, Virtual, Deleted |
symmetry |
str |
(mo.num) |
Symmetry in the point group |
occupation |
float |
(mo.num) |
Occupation number |
energy |
float |
(mo.num) |
For canonical MOs, corresponding eigenvalue |
spin |
int |
(mo.num) |
For UHF wave functions, 0 is $\alpha$ and 1 is $\beta$ |
"mo": {
"type" : [ "str" , [] ]
, "num" : [ "dim" , [] ]
, "coefficient" : [ "float", [ "mo.num", "ao.num" ] ]
, "coefficient_im" : [ "float", [ "mo.num", "ao.num" ] ]
, "class" : [ "str" , [ "mo.num" ] ]
, "symmetry" : [ "str" , [ "mo.num" ] ]
, "occupation" : [ "float", [ "mo.num" ] ]
, "energy" : [ "float", [ "mo.num" ] ]
, "spin" : [ "int" , [ "mo.num" ] ]
} ,
One-electron integrals (mo_1e_int
group)
The operators as the same as those defined in the AO one-electron integrals section. Here, the integrals are given in the basis of molecular orbitals.
Variable | Type | Dimensions | Description |
---|---|---|---|
overlap |
float |
(mo.num, mo.num) |
$\langle i \vert j \rangle$ (real part, general case) |
kinetic |
float |
(mo.num, mo.num) |
$\langle i \vert \hat{T}_e \vert j \rangle$ (real part, general case) |
potential_n_e |
float |
(mo.num, mo.num) |
$\langle i \vert \hat{V}_{\text{ne}} \vert j \rangle$ (real part, general case) |
ecp |
float |
(mo.num, mo.num) |
$\langle i \vert \hat{V}_{\text{ECP}} \vert j \rangle$ (real part, general case) |
core_hamiltonian |
float |
(mo.num, mo.num) |
$\langle i \vert \hat{h} \vert j \rangle$ (real part, general case) |
overlap_im |
float |
(mo.num, mo.num) |
$\langle i \vert j \rangle$ (imaginary part) (imaginary part) |
kinetic_im |
float |
(mo.num, mo.num) |
$\langle i \vert \hat{T}_e \vert j \rangle$ (imaginary part) |
potential_n_e_im |
float |
(mo.num, mo.num) |
$\langle i \vert \hat{V}_{\text{ne}} \vert j \rangle$ (imaginary part) |
ecp_im |
float |
(mo.num, mo.num) |
$\langle i \vert \hat{V}_{\text{ECP}} \vert j \rangle$ (imaginary part) |
core_hamiltonian_im |
float |
(mo.num, mo.num) |
$\langle i \vert \hat{h} \vert j \rangle$ (imaginary part) |
"mo_1e_int": {
"overlap" : [ "float", [ "mo.num", "mo.num" ] ]
, "kinetic" : [ "float", [ "mo.num", "mo.num" ] ]
, "potential_n_e" : [ "float", [ "mo.num", "mo.num" ] ]
, "ecp" : [ "float", [ "mo.num", "mo.num" ] ]
, "core_hamiltonian" : [ "float", [ "mo.num", "mo.num" ] ]
, "overlap_im" : [ "float", [ "mo.num", "mo.num" ] ]
, "kinetic_im" : [ "float", [ "mo.num", "mo.num" ] ]
, "potential_n_e_im" : [ "float", [ "mo.num", "mo.num" ] ]
, "ecp_im" : [ "float", [ "mo.num", "mo.num" ] ]
, "core_hamiltonian_im" : [ "float", [ "mo.num", "mo.num" ] ]
} ,
Two-electron integrals (mo_2e_int
group)
The operators are the same as those defined in the AO two-electron integrals section. Here, the integrals are given in the basis of molecular orbitals.
The Cholesky decomposition of the integrals can also be stored:
\[ \A_{ijkl} = \sum_{\alpha} G_{il\alpha} G_{jl\alpha} \]
Variable | Type | Dimensions | Description |
---|---|---|---|
eri |
float sparse |
(mo.num, mo.num, mo.num, mo.num) |
Electron repulsion integrals |
eri_lr |
float sparse |
(mo.num, mo.num, mo.num, mo.num) |
Long-range Electron repulsion integrals |
eri_cholesky_num |
dim |
Number of Cholesky vectors for ERI | |
eri_cholesky |
float sparse |
(mo.num, mo.num, mo_2e_int.eri_cholesky_num) |
Cholesky decomposition of the ERI |
eri_lr_cholesky_num |
dim |
Number of Cholesky vectors for long range ERI | |
eri_lr_cholesky |
float sparse |
(mo.num, mo.num, mo_2e_int.eri_lr_cholesky_num) |
Cholesky decomposition of the long range ERI |
"mo_2e_int": {
"eri" : [ "float sparse", [ "mo.num", "mo.num", "mo.num", "mo.num" ] ]
, "eri_lr" : [ "float sparse", [ "mo.num", "mo.num", "mo.num", "mo.num" ] ]
, "eri_cholesky_num" : [ "dim" , [] ]
, "eri_cholesky" : [ "float sparse", [ "mo_2e_int.eri_cholesky_num", "mo.num", "mo.num" ] ]
, "eri_lr_cholesky_num" : [ "dim" , [] ]
, "eri_lr_cholesky" : [ "float sparse", [ "mo_2e_int.eri_lr_cholesky_num", "mo.num", "mo.num" ] ]
} ,
Slater determinants (determinant group)
The configuration interaction (CI) wave function $\Psi$ can be expanded in the basis of Slater determinants $D_I$ as follows
\[ \Psi = \sum_I C_I D_I \]
For relatively small expansions, a given determinant can be represented as a list of occupied orbitals. However, this becomes unfeasible for larger expansions and requires more advanced data structures. The bit field representation is used here, namely a given determinant is represented as $N_{\text{int}}$ 64-bit integers where j-th bit is set to 1 if there is an electron in the j-th orbital and 0 otherwise. This gives access to larger determinant expansions by optimising the storage of the determinant lists in the memory.
\[ D_I = \alpha_1 \alpha_2 \ldots \alpha_{n_\uparrow} \beta_1 \beta_2 \ldots \beta_{n_\downarrow} \]
where $\alpha$ and $\beta$ denote ↑-spin and ↓-spin electrons, respectively,
$n_\uparrow$ and $n_\downarrow$ correspond to electron.up_num
and electron.dn_num
, respectively.
Note: the special
attribute is present in the types, meaning that the source node is not
produced by the code generator.
An illustration on how to read determinants is presented in the examples.
Variable | Type | Dimensions | Description |
---|---|---|---|
num |
dim readonly |
Number of determinants | |
list |
int special |
(determinant.num) |
List of determinants as integer bit fields |
coefficient |
float buffered |
(determinant.num) |
Coefficients of the determinants from the CI expansion |
"determinant": {
"num" : [ "dim readonly" , [] ]
, "list" : [ "int special" , [ "determinant.num" ] ]
, "coefficient" : [ "float buffered", [ "determinant.num" ] ]
} ,
Configuration state functions (csf group)
The configuration interaction (CI) wave function $\Psi$ can be expanded in the basis of configuration state functions (CSFs) $\Psi_I$ as follows
\[ \Psi = \sum_I C_I \psi_I. \]
Each CSF is a linear combination of Slater determinants. Slater
determinants are stored in the determinant
section. In this group
we store the CI coefficients in the basis of CSFs, and the
matrix $\langle D_I | \psi_J \rangle$ needed to project the CSFs in
the basis of Slater determinants.
Variable | Type | Dimensions | Description |
---|---|---|---|
num |
dim readonly |
Number of CSFs | |
coefficient |
float buffered |
(csf.num) |
Coefficients of the CSFs from the CI expansion |
det_coefficient |
float sparse |
(determinant.num,csf.num) |
Projection on the determinant basis |
"csf": {
"num" : [ "dim readonly" , [] ]
, "coefficient" : [ "float buffered", [ "csf.num" ] ]
, "det_coefficient" : [ "float sparse" , [ "csf.num", "determinant.num" ] ]
} ,
Excited states (state group)
This group contains information about excited state. Since TREXIO version 2.3.0 the state-specific data (e.g. CI/CSF coeffcients, RDMs) is written in a separate file in order to avoid over-complicated internal logics and global state switches.
The file_name
and label
arrays have to be written only for the master file,
e.g. the one containing the ground state wave function.
The id
and current_label
attributes have to be specified for each file
(containing both ground and excited state data).
Variable | Type | Dimensions | Description |
---|---|---|---|
num |
dim |
Number of states (including the ground state) | |
id |
int |
Index of a current state (0 is ground state) | |
label |
str |
(state.num) |
Labels of all states related to this file (e.g. 'S' for singlets) |
current_label |
str |
Labels of the current state that is in a file | |
file_name |
str |
(state.num) |
Names of the TREXIO files linked to the current one (i.e. containing data for excited states) |
"state": {
"num" : [ "dim", [] ]
, "id" : [ "int", [] ]
, "label" : [ "str", [ "state.num" ] ]
, "current_label" : [ "str", [] ]
, "file_name" : [ "str", [ "state.num" ] ]
} ,
Reduced density matrices (rdm group)
The reduced density matrices are defined in the basis of molecular orbitals.
The ↑-spin and ↓-spin components of the one-body density matrix are given by
\begin{eqnarray*} \gamma_{ij}^{\uparrow} &=& \langle \Psi | \hat{a}^{\dagger}_{j\alpha}\, \hat{a}_{i\alpha} | \Psi \rangle \\ \gamma_{ij}^{\downarrow} &=& \langle \Psi | \hat{a}^{\dagger}_{j\beta} \, \hat{a}_{i\beta} | \Psi \rangle \end{eqnarray*}and the spin-summed one-body density matrix is \[ \gamma_{ij} = \gamma^{\uparrow}_{ij} + \gamma^{\downarrow}_{ij} \]
The $\uparrow \uparrow$, $\downarrow \downarrow$, $\uparrow \downarrow$, $\downarrow \uparrow$ components of the two-body density matrix are given by
\begin{eqnarray*} \Gamma_{ijkl}^{\uparrow \uparrow} &=& \langle \Psi | \hat{a}^{\dagger}_{k\alpha}\, \hat{a}^{\dagger}_{l\alpha} \hat{a}_{j\alpha}\, \hat{a}_{i\alpha} | \Psi \rangle \\ \Gamma_{ijkl}^{\downarrow \downarrow} &=& \langle \Psi | \hat{a}^{\dagger}_{k\beta}\, \hat{a}^{\dagger}_{l\beta} \hat{a}_{j\beta}\, \hat{a}_{i\beta} | \Psi \rangle \\ \Gamma_{ijkl}^{\uparrow \downarrow} &=& \langle \Psi | \hat{a}^{\dagger}_{k\alpha}\, \hat{a}^{\dagger}_{l\beta} \hat{a}_{j\beta}\, \hat{a}_{i\alpha} | \Psi \rangle \\ \Gamma_{ijkl}^{\downarrow \uparrow} &=& \langle \Psi | \hat{a}^{\dagger}_{k\beta}\, \hat{a}^{\dagger}_{l\alpha} \hat{a}_{j\alpha}\, \hat{a}_{i\beta} | \Psi \rangle \\ \end{eqnarray*}and the spin-summed one-body density matrix is \[ \Gamma_{ijkl} = \Gamma_{ijkl}^{\uparrow \uparrow} + \Gamma_{ijkl}^{\downarrow \downarrow} + \Gamma_{ijkl}^{\uparrow \downarrow} + \Gamma_{ijkl}^{\downarrow \uparrow} \]
The total energy can be computed as: \[ E = E_{\text{NN}} + \sum_{ij} \gamma_{ij} \langle j|h|i \rangle + \frac{1}{2} \sum_{ijlk} \Gamma_{ijkl} \langle k l | i j \rangle \]
To compress the storage, the Cholesky decomposition of the RDMs can be stored:
\[ \Gamma_{ijkl} = \sum_{\alpha} G_{ij\alpha} G_{kl\alpha} \]
Warning: as opposed to electron repulsion integrals, the decomposition is made such that the Cholesky vectors are expanded in a two-electron basis $f_{ij}(\mathbf{r}_1,\mathbf{r}_2) = \phi_i(\mathbf{r}_1) \phi_j(\mathbf{r}_2)$, whereas in electron repulsion integrals each Cholesky vector is expressed in a basis of a one-electron function $g_{ik}(\mathbf{r}_1) = \phi_i(\mathbf{r}_1) \phi_k(\mathbf{r}_1)$.
Variable | Type | Dimensions | Description |
---|---|---|---|
1e |
float |
(mo.num, mo.num) |
One body density matrix |
1e_up |
float |
(mo.num, mo.num) |
↑-spin component of the one body density matrix |
1e_dn |
float |
(mo.num, mo.num) |
↓-spin component of the one body density matrix |
2e |
float sparse |
(mo.num, mo.num, mo.num, mo.num) |
Two-body reduced density matrix (spin trace) |
2e_upup |
float sparse |
(mo.num, mo.num, mo.num, mo.num) |
↑↑ component of the two-body reduced density matrix |
2e_dndn |
float sparse |
(mo.num, mo.num, mo.num, mo.num) |
↓↓ component of the two-body reduced density matrix |
2e_updn |
float sparse |
(mo.num, mo.num, mo.num, mo.num) |
↑↓ component of the two-body reduced density matrix |
2e_dnup |
float sparse |
(mo.num, mo.num, mo.num, mo.num) |
↓↑ component of the two-body reduced density matrix |
2e_cholesky_num |
dim |
Number of Cholesky vectors | |
2e_cholesky |
float sparse |
(mo.num, mo.num, rdm.2e_cholesky_num) |
Cholesky decomposition of the Two-body RDM (spin trace) |
2e_upup_cholesky_num |
dim |
Number of Cholesky vectors | |
2e_upup_cholesky |
float sparse |
(mo.num, mo.num, rdm.2e_upup_cholesky_num) |
Cholesky decomposition of the Two-body RDM (↑↑) |
2e_dndn_cholesky_num |
dim |
Number of Cholesky vectors | |
2e_dndn_cholesky |
float sparse |
(mo.num, mo.num, rdm.2e_dndn_cholesky_num) |
Cholesky decomposition of the Two-body RDM (↓↓) |
2e_updn_cholesky_num |
dim |
Number of Cholesky vectors | |
2e_updn_cholesky |
float sparse |
(mo.num, mo.num, rdm.2e_updn_cholesky_num) |
Cholesky decomposition of the Two-body RDM (↑↓) |
2e_dnup_cholesky_num |
dim |
Number of Cholesky vectors | |
2e_dnup_cholesky |
float sparse |
(mo.num, mo.num, rdm.2e_dnup_cholesky_num) |
Cholesky decomposition of the Two-body RDM (↓↑) |
"rdm": {
"1e" : [ "float" , [ "mo.num", "mo.num" ] ]
, "1e_up" : [ "float" , [ "mo.num", "mo.num" ] ]
, "1e_dn" : [ "float" , [ "mo.num", "mo.num" ] ]
, "2e" : [ "float sparse", [ "mo.num", "mo.num", "mo.num", "mo.num" ] ]
, "2e_upup" : [ "float sparse", [ "mo.num", "mo.num", "mo.num", "mo.num" ] ]
, "2e_dndn" : [ "float sparse", [ "mo.num", "mo.num", "mo.num", "mo.num" ] ]
, "2e_updn" : [ "float sparse", [ "mo.num", "mo.num", "mo.num", "mo.num" ] ]
, "2e_dnup" : [ "float sparse", [ "mo.num", "mo.num", "mo.num", "mo.num" ] ]
, "2e_cholesky_num" : [ "dim" , [] ]
, "2e_cholesky" : [ "float sparse", [ "rdm.2e_cholesky_num", "mo.num", "mo.num" ] ]
, "2e_upup_cholesky_num" : [ "dim" , [] ]
, "2e_upup_cholesky" : [ "float sparse", [ "rdm.2e_upup_cholesky_num", "mo.num", "mo.num" ] ]
, "2e_dndn_cholesky_num" : [ "dim" , [] ]
, "2e_dndn_cholesky" : [ "float sparse", [ "rdm.2e_dndn_cholesky_num", "mo.num", "mo.num" ] ]
, "2e_updn_cholesky_num" : [ "dim" , [] ]
, "2e_updn_cholesky" : [ "float sparse", [ "rdm.2e_updn_cholesky_num", "mo.num", "mo.num" ] ]
, "2e_dnup_cholesky_num" : [ "dim" , [] ]
, "2e_dnup_cholesky" : [ "float sparse", [ "rdm.2e_dnup_cholesky_num", "mo.num", "mo.num" ] ]
} ,
Cell (cell group)
Variable | Type | Dimensions | Description |
---|---|---|---|
a |
float |
(3) |
First unit cell vector |
b |
float |
(3) |
Second unit cell vector |
c |
float |
(3) |
Third unit cell vector |
"cell": {
"a" : [ "float", [ "3" ] ]
, "b" : [ "float", [ "3" ] ]
, "c" : [ "float", [ "3" ] ]
} ,
Periodic boundary calculations (pbc group)
Variable | Type | Dimensions | Description |
---|---|---|---|
periodic |
int |
1 : true or 0 : false |
|
k_point |
float |
(3) |
k-point sampling |
"pbc": {
"periodic" : [ "int" , [] ]
, "k_point" : [ "float", [ "3" ] ]
} ,
Quantum Monte Carlo data (qmc group)
In quantum Monte Carlo calculations, the wave function is evaluated at points of the 3N-dimensional space. Some algorithms require multiple independent walkers, so it is possible to store multiple coordinates, as well as some quantities evaluated at those points.
By convention, the electron coordinates contain first all the electrons of $\uparrow$-spin and then all the $\downarrow$-spin.
Variable | Type | Dimensions | Description |
---|---|---|---|
num |
dim |
Number of 3N-dimensional points | |
point |
float |
(3, electron.num, qmc.num) |
3N-dimensional points |
psi |
float |
(qmc.num) |
Wave function evaluated at the points |
e_loc |
float |
(qmc.num) |
Local energy evaluated at the points |
"qmc": {
"num" : [ "dim" , [] ]
, "point" : [ "float", [ "qmc.num", "electron.num", "3" ] ]
, "psi" : [ "float", [ "qmc.num" ] ]
, "e_loc" : [ "float", [ "qmc.num" ] ]
}