diff --git a/docs/tutorial_benzene.html b/docs/tutorial_benzene.html
index 7a31a89..1914081 100644
--- a/docs/tutorial_benzene.html
+++ b/docs/tutorial_benzene.html
@@ -1,159 +1,15101 @@
-

TREXIO Tutorial

-

This interactive Tutorial covers some basic use cases of the TREXIO library based on the Python API. At this point, it is assumed that the TREXIO Python package has been sucessfully installed on the user machine or in the virtual environment. If this is not the case, feel free to follow the installation guide.

-

Importing TREXIO

-

First of all, let’s import the TREXIO package.

-
try:
-    import trexio
-except ImportError:
-    raise Exception("Unable to import trexio. Please check that trexio is properly installed.")
-

If no error occurs, then it means that the TREXIO package has been sucessfully imported. Within the current import, TREXIO attributes can be accessed using the corresponding trexio.attribute notation. If you prefer to bound a shorter name to the imported module (as commonly done by the NumPy users with import numpy as np), this is also possible. To do so, replace import trexio with import trexio as tr for example. To learn more about importing modules, see the corresponding page of the Python documentation.

-

Creating a new TREXIO file

-

TREXIO currently supports two back ends for file I/O:

-
    -
  1. TREXIO_HDF5, which relies on extensive use of the HDF5 library and the associated binary file format. This back end is optimized for high performance but it requires HDF5 to be installed on the user machine.

  2. -
  3. TREXIO_TEXT, which relies on basic I/O operations that are available in the standard C library. This back end is not optimized for performance but it is supposed to work “out-of-the-box” since there are no external dependencies.

  4. -
-

Armed with these new definitions, let’s proceed with the tutorial. The first task is to create a TREXIO file called benzene_demo.h5. But first we have to remove the file if it exists in the current directory

-
filename = 'benzene_demo.h5'
-
-import os
-try:
-    os.remove(filename)
-except:
-    print(f"File {filename} does not exist.")
-
File benzene_demo.h5 does not exist.
-

We are now ready to create a new TREXIO file:

-
demo_file = trexio.File(filename, mode='w', back_end=trexio.TREXIO_HDF5)
-

This creates an instance of the trexio.File class, which we refer to as demo_file in this tutorial. You can check that the corresponding file called benzene_demo.h5 exists in the root directory. It is now open for writing as indicated by the user-supplied argument mode='w'. The file has been initiated using TREXIO_HDF5 back end and will be accessed accordingly from now on. The information about back end is stored internally by TREXIO, which means that there is no need to specify it every time the I/O operation is performed. If the file named benzene_demo.h5 already exists, then it is re-opened for writing (and not truncated to prevent data loss).

-

Writing data in the TREXIO file

-

Prior to any work with TREXIO library, we highly recommend users to read about TREXIO internal configuration, which explains the structure of the wavefunction file. The reason is that TREXIO API has a naming convention, which is based on the groups and variables names that are pre-defined by the developers. In this Tutorial, we will only cover contents of the nucleus group. Note that custom groups and variables can be added to the TREXIO API.

-

In this Tutorial, we consider benzene molecule (C6H6) as an example. Since benzene has 12 atoms, let’s specify it in the previously created demo_file. In order to do so, one has to call trexio.write_nucleus_num function, which accepts an instance of the trexio.File class as a first argument and an int value corresponding to the number of nuclei as a second argument.

-
nucleus_num = 12
-
trexio.write_nucleus_num(demo_file, nucleus_num)
-

In fact, all API functions that contain write_ prefix can be used in a similar way. Variables that contain _num suffix are important part of the TREXIO file because some of them define dimensions of arrays. For example, nucleus_num variable corresponds to the number of atoms, which will be internally used to write/read the nucleus_coord array of nuclear coordinates. In order for TREXIO files to be self-consistent, overwriting num-suffixed variables is currently disabled.

-

The number of atoms is not sufficient to define a molecule. Let’s first create a list of nuclear charges, which correspond to benzene.

-
charges = [6., 6., 6., 6., 6., 6., 1., 1., 1., 1., 1., 1.]
-

According to the TREX configuration file, there is a charge attribute of the nucleus group, which has float type and [nucleus_num] dimension. The charges list defined above fits nicely in the description and can be written as follows

-
trexio.write_nucleus_charge(demo_file, charges)
-

Note: TREXIO function names only contain parts in singular form. This means that, both write_nucleus_charges and write_nuclear_charges are invalid API calls. These functions simply do not exist in the trexio Python package and the corresponding error message should appear.

-

Alternatively, one can provide a list of nuclear labels (chemical elements from the periodic table) that correspond to the aforementioned charges. There is a label attribute of the nucleus group, which has str type and [nucleus_num] dimension. Let’s create a list of 12 strings, which correspond to 6 carbon and 6 hydrogen atoms:

-
labels = [
-    'C',
-    'C',
-    'C',
-    'C',
-    'C',
-    'C',
-    'H',
-    'H',
-    'H',
-    'H',
-    'H',
-    'H']
-

This can now be written using the corresponding trexio.write_nucleus_label function:

-
trexio.write_nucleus_label(demo_file, labels)
-

Two examples above demonstrate how to write arrays of numbers or strings in the file. TREXIO also supports I/O operations on single numerical or string attributes. In fact, in this Tutorial you have already written one numerical attribute: nucleus_num. Let’s now write a string 'D6h', which indicates a point group of benzene molecule. According to the TREX configuration file, point_group is a str attribute of the nucleus group, thus it can be written in the demo_file as follows

-
point_group = 'D6h'
-
trexio.write_nucleus_point_group(demo_file, point_group)
-

Writing NumPy arrays (float or int types)

-

The aforementioned examples cover the majority of the currently implemented functionality related to writing data in the file. It is worth mentioning that I/O of numerical arrays in TREXIO Python API relies on extensive use of the NumPy package. This will be discussed in more details in the section about reading data. However, TREXIO write_ functions that work with numerical arrays also accept numpy.ndarray objects. For example, consider a coords list of nuclear coordinates that correspond to benzene molecule

-
coords = [
-    [0.00000000  ,  1.39250319 ,  0.00000000 ],
-    [-1.20594314 ,  0.69625160 ,  0.00000000 ],
-    [-1.20594314 , -0.69625160 ,  0.00000000 ],
-    [0.00000000  , -1.39250319 ,  0.00000000 ],
-    [1.20594314  , -0.69625160 ,  0.00000000 ],
-    [1.20594314  ,  0.69625160 ,  0.00000000 ],
-    [-2.14171677 ,  1.23652075 ,  0.00000000 ],
-    [-2.14171677 , -1.23652075 ,  0.00000000 ],
-    [0.00000000  , -2.47304151 ,  0.00000000 ],
-    [2.14171677  , -1.23652075 ,  0.00000000 ],
-    [2.14171677  ,  1.23652075 ,  0.00000000 ],
-    [0.00000000  ,  2.47304151 ,  0.00000000 ],
-    ]
-

Let’s take advantage of using NumPy arrays with fixed precision for floating point numbers. But first, try to import the numpy package

-
try:
-    import numpy as np
-except ImportError:
-    raise Exception("Unable to import numpy. Please check that numpy is properly installed.")
-

You can now convert the previously defined coords list into a numpy array with fixed float64 type as follows

-
coords_np = np.array(coords, dtype=np.float64)
-

TREXIO functions that write numerical arrays accept both lists and numpy arrays as a second argument. That is, both trexio.write_nucleus_coord(demo_file, coords) and trexio.write_nucleus_coord(demo_file, coords_np) are valid API calls. Let’s use the latter and see if it works

-
trexio.write_nucleus_coord(demo_file, coords_np)
-

Congratulations, you have just completed the nucleus section of the TREXIO file for benzene molecule! Note that TREXIO API is rather permissive and do not impose any strict ordering on the I/O operations. The only requirement is that dimensioning (_num suffixed) variables have to be written in the file before writing arrays that depend on these variables. For example, attempting to write nucleus_charge or nucleus_coord fails if nucleus_num has not been written.

-

TREXIO error handling

-

TREXIO Python API provides the trexio.Error class which simplifies exception handling in the Python scripts. This class wraps up TREXIO return codes and propagates them all the way from the C back end to the Python front end. Let’s try to write a negative number of basis set shells basis_num in the TREXIO file.

-
try:
-    trexio.write_basis_num(demo_file, -256)
-except trexio.Error as e:
-    print(f"TREXIO error message: {e.message}")
-
TREXIO error message: Invalid argument 2
-

The error message says Invalid argument 2, which indicates that the user-provided value -256 is not valid.

-

As mentioned before, _num-suffixed variables cannot be overwritten in the file. But what happens if you accidentally attempt to do so? Let’s have a look at the write_nucleus_num function as an example:

-
try:
-    trexio.write_nucleus_num(demo_file, 24)
-except trexio.Error as e:
-    print(f"TREXIO error message: {e.message}")
-
TREXIO error message: Attribute already exists
-

The API rightfully complains that the target attribute already exists and cannot be overwritten.

-

Alternatively, the aforementioned case can be handled using trexio.has_nucleus_num function as follows

-
-if not trexio.has_nucleus_num(demo_file):
-    trexio.write_nucleus_num(demo_file, 24)
-

TREXIO functions with has_ prefix return True if the corresponding variable exists and False otherwise.

-

What about writing arrays? Let’s try to write an list of 48 nuclear indices instead of 12

-
indices = [i for i in range(nucleus_num*4)]
-
try:
-    trexio.write_basis_nucleus_index(demo_file, indices)
-except trexio.Error as e:
-    print(f"TREXIO error message: {e.message}")
-
TREXIO error message: Access to memory beyond allocated
-

According to the TREX configuration file, nucleus_index attribute of a basis group is supposed to have [nucleus_num] elements. In the example above, we have tried to write 4 times more elements, which might lead to memory and/or file corruption. Luckily, TREXIO internally checks the array dimensions and returns an error in case of inconsistency.

-

Closing the TREXIO file

-

It is good practice to close the TREXIO file at the end of the session. In fact, trexio.File class has a destructor, which normally takes care of that. However, if you intend to re-open the TREXIO file, it has to be closed explicitly before. This can be done using the close method, i.e.

-
demo_file.close()
-

Good! You are now ready to inspect the contents of the benzene_demo.h5 file using the reading functionality of TREXIO.

-

Reading data from the TREXIO file

-

First, let’s try to open an existing TREXIO file in read-only mode. This can be done by creating a new instance of the trexio.File class but this time with mode='r' argument. Back end has to be specified as well.

-
demo_file_r = trexio.File(filename, mode='r', back_end=trexio.TREXIO_HDF5)
-

When reading data from the TREXIO file, the only required argument is a previously created instance of trexio.File class. In our case, it is demo_file_r. TREXIO functions with read_ prefix return the desired variable as an output. For example, nucleus_num value can be read from the file as follows

-
nucleus_num_r = trexio.read_nucleus_num(demo_file_r)
-
print(f"nucleus_num from {filename} file ---> {nucleus_num_r}")
-
nucleus_num from benzene_demo.h5 file ---> 12
-

The function call assigns nucleus_num_r to 12, which is consistent with the number of atoms in benzene that we wrote in the previous section.

-

All calls to functions that read data can be done in a very similar way. The key point here is a function name, which in turn defines the output format. Hopefully by now you got used to the TREXIO naming convention and the contents of the nucleus group. Which function would you call to read a point_group attribute of the nucleus group? What type does it return? See the answer below:

-
point_group_r = trexio.read_nucleus_point_group(demo_file_r)
-
print(f"nucleus_point_group from {filename} TREXIO file ---> {point_group_r}\n")
-print(f"Is return type of read_nucleus_point_group a string? ---> {isinstance(point_group_r, str)}")
-
nucleus_point_group from benzene_demo.h5 TREXIO file ---> D6h
+
+
+
+
 
-Is return type of read_nucleus_point_group a string? ---> True
-

The trexio.read_nucleus_point_group function call returns a string D6h, which is exactly what we provided in the previous section. Now, let’s read nuclear charges and labels.

-
labels_r = trexio.read_nucleus_label(demo_file_r)
-
print(f"nucleus_label from {filename} file \n---> {labels_r}")
-
nucleus_label from benzene_demo.h5 file 
----> ['C', 'C', 'C', 'C', 'C', 'C', 'H', 'H', 'H', 'H', 'H', 'H']
-
charges_r = trexio.read_nucleus_charge(demo_file_r)
-
print(f"nucleus_charge from {filename} file \n---> {charges_r}")
-
nucleus_charge from benzene_demo.h5 file 
----> [6. 6. 6. 6. 6. 6. 1. 1. 1. 1. 1. 1.]
+tutorial_benzene + + + + + + + + + + + + + + + + + + + + + +
+
+

TREXIO Tutorial

+
+
+
+
+

This tutorial covers some basic use cases of the TREXIO library based on the Python API. At this point, it is assumed that the TREXIO Python package has been successfully installed on the user machine or in the virtual environment. If this is not the case, feel free to follow the installation guide.

+ +
+
+
+
+

Importing TREXIO

+
+
+
+
+

First of all, let's import the TREXIO package.

+ +
+
+
+ +
+ +
+
+
+

If no error occurs, the TREXIO package has been successfully imported. Within the current import, TREXIO attributes can be accessed using the corresponding trexio.attribute notation. If you prefer to bind a shorter name to the imported module (as NumPy users commonly do with import numpy as np), this is also possible. To do so, replace import trexio with, for example, import trexio as tr. To learn more about importing modules, see the corresponding page of the Python documentation.

+ +
+
+
+
+

Creating a new TREXIO file

+
+
+
+
+

TREXIO currently supports two back ends for file I/O:

+
    +
  1. TREXIO_HDF5, which relies on extensive use of the HDF5 library and the associated binary file format. This back end is optimized for high performance but it requires HDF5 to be installed on the user machine.

    +
  2. +
  3. TREXIO_TEXT, which relies on basic I/O operations that are available in the standard C library. This back end is not optimized for performance but it is supposed to work "out-of-the-box" since there are no external dependencies.

    +
  4. +
+

Armed with these new definitions, let's proceed with the tutorial. The first task is to create a TREXIO file called benzene_demo.h5. But first we have to remove the file if it already exists in the current directory.
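A slightly narrower variant of this cleanup step, as a sketch: catching only FileNotFoundError avoids hiding unrelated problems (for instance, a permission error) behind a "does not exist" message. The helper name remove_if_exists is an illustrative assumption and is not part of TREXIO.

```python
import os


def remove_if_exists(path):
    """Delete `path` if it is present; return True if a file was removed.

    Hypothetical helper for this tutorial, not a TREXIO function.
    """
    try:
        os.remove(path)
    except FileNotFoundError:
        # Nothing to clean up: the file was never created.
        return False
    return True


filename = 'benzene_demo.h5'
if not remove_if_exists(filename):
    print(f"File {filename} does not exist.")
```

Any other OSError (such as missing permissions) still propagates, which is usually what you want before creating a fresh file.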

+ +
+
+
+ +
+ +
+ + + + +
+ +
+
+
+

We are now ready to create a new TREXIO file:

+ +
+
+
+ +
+ +
+
+
+

This creates an instance of the trexio.File class, which we refer to as demo_file in this tutorial. You can check that the corresponding file called benzene_demo.h5 exists in the current working directory. It is now open for writing, as indicated by the user-supplied argument mode='w'. The file has been initialized using the TREXIO_HDF5 back end and will be accessed accordingly from now on. The information about the back end is stored internally by TREXIO, which means that there is no need to specify it every time an I/O operation is performed. If a file named benzene_demo.h5 already exists, it is re-opened for writing (it is not truncated, which prevents data loss).

+ +
+
+
+
+

Writing data in the TREXIO file

+
+
+
+
+

Before working with the TREXIO library, we highly recommend reading about the TREXIO internal configuration, which explains the structure of the wavefunction file. The reason is that the TREXIO API follows a naming convention based on the group and variable names that are pre-defined by the developers. In this tutorial, we will only cover the contents of the nucleus group. Note that custom groups and variables can be added to the TREXIO API.

+ +
+
+
+
+

In this tutorial, we consider the benzene molecule (C6H6) as an example. Since benzene has 12 atoms, let's specify this in the previously created demo_file. To do so, call the trexio.write_nucleus_num function, which accepts an instance of the trexio.File class as its first argument and an int value corresponding to the number of nuclei as its second argument.

+ +
+
+
+ +
+ +
+
+ +
+ +
+
+
+

In fact, all API functions that contain the write_ prefix can be used in a similar way. Variables that contain the _num suffix (or have the dim type) are an important part of the TREXIO file because some of them define the dimensions of arrays. For example, the nucleus_num variable corresponds to the number of atoms, which is used internally to write/read the nucleus_coord array of nuclear coordinates. In order for TREXIO files to be self-consistent, the data in the file cannot be overwritten.

+ +
+
+
+
+

The number of atoms is not sufficient to define a molecule. Let's first create a list of nuclear charges, which correspond to benzene.

+ +
+
+
+ +
+ +
+
+
+

According to the TREX configuration file, there is a charge attribute of the nucleus group, which has float type and [nucleus_num] dimension. The charges list defined above fits nicely in the description and can be written as follows

+ +
+
+
+ +
+ +
+
+
+

Note: TREXIO function names only contain parts in singular form. This means that both write_nucleus_charges and write_nuclear_charges are invalid API calls. These functions simply do not exist in the trexio Python package, and attempting to call them should produce the corresponding error message.
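The naming pattern used so far can be sketched in a few lines of Python. The helper api_name below is hypothetical (it is not part of the trexio package); it only composes the strings and assumes the operation_group_attribute layout seen in the calls above.

```python
def api_name(operation, group, attribute):
    """Compose a name following the <operation>_<group>_<attribute> pattern.

    Illustrative helper only; it does not check that the function exists.
    """
    return f"{operation}_{group}_{attribute}"


# The three operations that this tutorial uses for the nucleus charges.
for operation in ("has", "read", "write"):
    print(api_name(operation, "nucleus", "charge"))
```

With the package installed, a check such as hasattr(trexio, api_name("write", "nucleus", "charges")) would be expected to return False, in line with the note on singular forms.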

+ +
+
+
+
+

Alternatively, one can provide a list of nuclear labels (chemical elements from the periodic table) that correspond to the aforementioned charges. There is a label attribute of the nucleus group, which has str type and [nucleus_num] dimension. Let's create a list of 12 strings, which correspond to 6 carbon and 6 hydrogen atoms:
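One way to keep the labels and the charges consistent with each other is to derive one from the other. The sketch below assumes a small lookup table, ATOMIC_NUMBER, that is introduced here for illustration and is not provided by TREXIO.

```python
# Atomic numbers of the two elements present in benzene
# (illustrative lookup table, not part of TREXIO).
ATOMIC_NUMBER = {'C': 6.0, 'H': 1.0}

nucleus_num = 12
labels = ['C'] * 6 + ['H'] * 6

# Derive the charges from the labels so the two lists cannot drift apart.
charges = [ATOMIC_NUMBER[label] for label in labels]

# Both attributes have the [nucleus_num] dimension.
assert len(labels) == len(charges) == nucleus_num
print(charges)
```

The resulting charges list matches the one written earlier with trexio.write_nucleus_charge.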

+ +
+
+
+ +
+ +
+
+
+

This can now be written using the corresponding trexio.write_nucleus_label function:

+ +
+
+
+ +
+ +
+
+
+

The two examples above demonstrate how to write arrays of numbers or strings in the file. TREXIO also supports I/O operations on single numerical or string attributes. In fact, you have already written one numerical attribute in this tutorial: nucleus_num. Let's now write the string 'D6h', which indicates the point group of the benzene molecule. According to the TREX configuration file, point_group is a str attribute of the nucleus group, so it can be written in the demo_file as follows

+ +
+
+
+ +
+ +
+
+ +
+ +
+
+
+

Writing NumPy arrays (float or int types)

+
+
+
+
+

The aforementioned examples cover the majority of the currently implemented functionality related to writing data in the file. It is worth mentioning that I/O of numerical arrays in the TREXIO Python API relies extensively on the NumPy package. This will be discussed in more detail in the section about reading data. However, TREXIO write_ functions that work with numerical arrays also accept numpy.ndarray objects. For example, consider a coords list of nuclear coordinates that correspond to the benzene molecule

+ +
+
+
+ +
+ +
+
+
+

Let's take advantage of using NumPy arrays with fixed precision for floating point numbers. But first, try to import the numpy package

+ +
+
+
+ +
+ +
+
+
+

You can now convert the previously defined coords list into a numpy array with fixed float64 type as follows

+ +
+
+
+ +
+ +
+
+
+

TREXIO functions that write numerical arrays accept both lists and numpy arrays as a second argument. That is, both trexio.write_nucleus_coord(demo_file, coords) and trexio.write_nucleus_coord(demo_file, coords_np) are valid API calls. Let's use the latter and see if it works

+ +
+
+
+ +
+ +
+
+
+

Congratulations, you have just completed the nucleus section of the TREXIO file for the benzene molecule! Note that the TREXIO API is rather permissive and does not impose any strict ordering on the I/O operations. The only requirement is that dimensioning (see dim type) attributes have to be written in the file before writing arrays that depend on these variables. For example, attempting to write nucleus_charge or nucleus_coord fails if nucleus_num has not been written.

+ +
+
+
+
+

TREXIO error handling

+
+
+
+
+

The TREXIO Python API provides the trexio.Error class, which simplifies exception handling in Python scripts. This class wraps TREXIO return codes and propagates them all the way from the C back end to the Python front end. Let's try to write a negative number of basis set shells, basis_num, in the TREXIO file.

+ +
+
+
+ +
+ +
+ + + + +
+ +
+
+
+

The error message says Invalid (negative or 0) dimension, which indicates that the user-provided value -256 is not valid.

+ +
+
+
+
+

As mentioned before, the data in the TREXIO file cannot be overwritten. But what happens if you accidentally attempt to do so? Let's have a look at the write_nucleus_num function as an example:

+ +
+
+
+ +
+ +
+ + + + +
+ +
+
+
+

The API rightfully complains that the target attribute already exists and cannot be overwritten.

+ +
+
+
+
+

Alternatively, the aforementioned case can be handled using trexio.has_nucleus_num function as follows

+ +
+
+
+ +
+ +
+
+
+

TREXIO functions with has_ prefix return True if the corresponding variable exists and False otherwise.

+ +
+
+
+
+

What about writing arrays? Let's try to write a list of 48 nuclear indices instead of 12

+ +
+
+
+ +
+ +
+
+ +
+ +
+ + + + +
+ +
+
+
+

According to the TREX configuration file, the nucleus_index attribute of the basis group is supposed to have [nucleus_num] elements. In the example above, we have tried to write four times as many elements, which might lead to memory and/or file corruption. Luckily, TREXIO internally checks the array dimensions and returns an error in case of inconsistency.
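The write rules encountered so far (dimensions must be positive, existing data cannot be overwritten, arrays must match their dimensioning variable, and the dimensioning variable must be written first) can be summarized with a toy in-memory model. Everything below is a sketch under stated assumptions: ToyError and ToyNucleusGroup are hypothetical stand-ins, nothing here calls TREXIO, and the messages for a missing nucleus_num and for a length mismatch are invented for illustration.

```python
class ToyError(Exception):
    """Stand-in for trexio.Error in this sketch."""


class ToyNucleusGroup:
    """Toy in-memory model of the write rules; not the TREXIO implementation."""

    def __init__(self):
        self._data = {}

    def has(self, name):
        return name in self._data

    def write_num(self, value):
        if value <= 0:
            raise ToyError("Invalid (negative or 0) dimension")
        if self.has("num"):
            raise ToyError("Attribute already exists")
        self._data["num"] = value

    def write_charge(self, charges):
        if not self.has("num"):
            # Invented message: arrays depend on the dimensioning variable.
            raise ToyError("nucleus_num has not been written")
        if len(charges) != self._data["num"]:
            # Invented message: the array must have nucleus_num elements.
            raise ToyError("Array length does not match nucleus_num")
        if self.has("charge"):
            raise ToyError("Dataset already exists")
        self._data["charge"] = list(charges)


group = ToyNucleusGroup()
attempts = [
    lambda: group.write_charge([6.0] * 12),              # too early: num is missing
    lambda: group.write_num(-256),                       # invalid dimension
    lambda: group.write_num(12),                         # succeeds
    lambda: group.write_num(24),                         # overwrite attempt
    lambda: group.write_charge([1.0] * 48),              # four times too many
    lambda: group.write_charge([6.0] * 6 + [1.0] * 6),   # succeeds
]

messages = []
for attempt in attempts:
    try:
        attempt()
        messages.append("ok")
    except ToyError as err:
        messages.append(str(err))

for message in messages:
    print(message)
```

Only the third and the last attempts succeed; each of the others trips exactly one of the rules.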

+ +
+
+
+
+

Closing the TREXIO file

+
+
+
+
+

It is good practice to close the TREXIO file at the end of the session. In fact, the trexio.File class has a destructor, which normally takes care of that. However, if you intend to re-open the TREXIO file, it has to be closed explicitly first. This can be done using the close method, i.e.

+ +
+
+
+ +
+ +
+
+
+

Good! You are now ready to inspect the contents of the benzene_demo.h5 file using the reading functionality of TREXIO.

+ +
+
+
+
+

Reading data from the TREXIO file

+
+
+
+
+

First, let's try to open an existing TREXIO file in read-only mode. This can be done by creating a new instance of the trexio.File class, but this time with the mode='r' argument. The back end has to be specified as well.

+ +
+
+
+ +
+ +
+
+
+

When reading data from the TREXIO file, the only required argument is a previously created instance of trexio.File class. In our case, it is demo_file_r. TREXIO functions with read_ prefix return the desired variable as an output. For example, nucleus_num value can be read from the file as follows

+ +
+
+
+ +
+ +
+
+ +
+ +
+ + + + +
+ +
+
+
+

The function call assigns 12 to nucleus_num_r, which is consistent with the number of atoms in benzene that we wrote in the previous section.

+ +
+
+
+
+

All calls to functions that read data can be made in a very similar way. The key point here is the function name, which in turn defines the output format. Hopefully, by now you have gotten used to the TREXIO naming convention and the contents of the nucleus group. Which function would you call to read the point_group attribute of the nucleus group? What type does it return? See the answer below:

+ +
+
+
+ +
+ +
+
+ +
+ +
+ + + + +
+ +
+
+
+

The trexio.read_nucleus_point_group function call returns a string D6h, which is exactly what we provided in the previous section. Now, let's read nuclear charges and labels.

+ +
+
+
+ +
+ +
+
+ +
+ +
+ + + + +
+ +
+
+ +
+ +
+
+ +
+ +
+ + + + +
+ +
+
+

The values are consistent with each other and with the previously written data. Not bad. What about the format of the output?

- -
nucleus_label return type: <class 'list'>
-

This makes sense, isn’t it? We have written a list of nuclear labels and have received back a list of values from the file. What about nuclear charges?

- -
nucleus_charge return type: <class 'numpy.ndarray'>
+ +
+
+
+ +
+ +
+ + + + +
+ +
+
+
+

This makes sense, doesn't it? We have written a list of nuclear labels and have received back a list of values from the file. What about nuclear charges?

+ +
+
+
+ +
+ +
+ + + + +
+ +
+
+

It looks like the trexio.read_nucleus_charge function returns a numpy.ndarray even though we provided a plain Python list to trexio.write_nucleus_charge in the previous section. Why is that? As mentioned before, the TREXIO Python API internally relies on the NumPy package to communicate arrays of float-like or int-like values. This prevents some memory leaks and grants additional flexibility to the API. What kind of flexibility? Check this out:

- -
return dtype in NumPy notation: ---> float64
-

It means that the default precision of the TREXIO output is double (np.float64) for arrays of floating point numbers like nucleus_charge. But what if you do not need this extra precision and would like to read nuclear charges in single (np.float32) or even reduced (e.g. np.float16) precision? TREXIO Python API provides an additional (optional) argument for this. This argument is called dtype and accepts one of the NumPy data types. For example,

- - -
return dtype in NumPy notation: ---> float32
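What the different precisions mean for storage can be sketched with NumPy alone, without touching the file. The astype conversion below is a stand-in used purely for illustration; it is not a claim about how TREXIO converts data internally.

```python
import numpy as np

charges = [6., 6., 6., 6., 6., 6., 1., 1., 1., 1., 1., 1.]

# Double precision: 8 bytes per element.
charges_f64 = np.array(charges, dtype=np.float64)
# Single precision: 4 bytes per element.
charges_f32 = charges_f64.astype(np.float32)

print(charges_f64.dtype, charges_f64.nbytes)
print(charges_f32.dtype, charges_f32.nbytes)
```

Nuclear charges are small integers and survive the conversion exactly; values with many significant digits, such as coordinates, generally lose digits in single precision.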
-

Reading multidimensional arrays

-

So far, we have only read flat 1D arrays. However, we have also written a 2D array of nuclear coordinates. Let’s now read it back from the file:

- - -
nucleus_coord from benzene_demo.h5 TREXIO file: 
+
+
+
+ + + + +
+
+
+ + +
+ + + + +
+
+
+
+
+ + +
+ + +
+ + + + +
+ +
+
+ +
+ +
+ + + + +
+ +
+
+

We can see that TREXIO returns a 2D array with 12 rows and 3 columns, which is consistent with the nucleus_coord dimensions [nucleus_num, 3]. What this means is that by default TREXIO reshapes the output flat array into a multidimensional one whenever applicable. This is done based on the shape specified in the TREX configuration file.
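The reshaping step can be illustrated with NumPy on placeholder data. This is a sketch that assumes row-major (C) ordering, which is NumPy's default; the values 0 to 35 are dummies standing in for the flat array of coordinates.

```python
import numpy as np

nucleus_num = 12

# Flat array of nucleus_num * 3 placeholder values (not real coordinates).
flat = np.arange(nucleus_num * 3, dtype=np.float64)

# Reshape to [nucleus_num, 3]: one row per nucleus, one column per x, y, z.
coords_2d = flat.reshape(nucleus_num, 3)

print(coords_2d.shape)
print(coords_2d[1])  # second nucleus: flat elements 3, 4 and 5
```

Because reshape does not copy the data here, the 2D result is simply another view of the same 36 numbers.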

+ +
+
+
+

In some cases, it might be a good idea to explicitly check that the data exists in the file before reading it. This can be achieved using the has_-prefixed functions of the API. For example,

- -

Conclusion

+ +
+
+
+ +
+ +
+
+
+

Conclusion

+
+
+
+

In this tutorial, you have created a TREXIO file using the HDF5 back end and have written the number of atoms, point group, nuclear charges, labels and coordinates that correspond to the benzene molecule. You have also learned how to read this data back from the TREXIO file and how to handle some TREXIO errors.

+ +
+
+ + + + + + + + +
diff --git a/src/pytrexio.i b/src/pytrexio.i
index 04c0837..1665eb9 100644
--- a/src/pytrexio.i
+++ b/src/pytrexio.i
@@ -28,12 +28,15 @@
    Useful when working with C pointers
 */
 %include typemaps.i
-/* Redefine the int32_t* and int64_t* num to be output
+/* Redefine the [int32_t*, int64_t*, float*, double*] num
+   pattern to be appended to the output tuple.
    Useful for TREXIO read_num functions where the
    num variable is modified by address
 */
 %apply int *OUTPUT { int32_t* const num};
 %apply int *OUTPUT { int64_t* const num};
+%apply float *OUTPUT { float* const num};
+%apply float *OUTPUT { double* const num};
 /* Does not work for arrays (SIGSEGV) */
@@ -66,23 +69,23 @@ import_array();
 %numpy_typemaps(int32_t, NPY_INT32, int64_t)
 %numpy_typemaps(int64_t, NPY_INT64, int64_t)
 /* Enable write|read_safe functions to convert numpy arrays from/to double arrays */
-%apply (double* ARGOUT_ARRAY1, int64_t DIM1) {(double * const dset_out, const int64_t dim_out)};
-%apply (double* IN_ARRAY1, int64_t DIM1) {(const double * dset_in, const int64_t dim_in)};
+%apply (double* ARGOUT_ARRAY1, int64_t DIM1) {(double* const dset_out, const int64_t dim_out)};
+%apply (double* IN_ARRAY1, int64_t DIM1) {(const double* dset_in, const int64_t dim_in)};
 /* Enable write|read_safe functions to convert numpy arrays from/to float arrays */
-%apply (float* ARGOUT_ARRAY1, int64_t DIM1) {(float * const dset_out, const int64_t dim_out)};
-%apply (float* IN_ARRAY1, int64_t DIM1) {(const float * dset_in, const int64_t dim_in)};
+%apply (float* ARGOUT_ARRAY1, int64_t DIM1) {(float* const dset_out, const int64_t dim_out)};
+%apply (float* IN_ARRAY1, int64_t DIM1) {(const float* dset_in, const int64_t dim_in)};
 /* Enable write|read_safe functions to convert numpy arrays from/to int32 arrays */
-%apply (int32_t* ARGOUT_ARRAY1, int64_t DIM1) {(int32_t * const dset_out, const int64_t dim_out)};
-%apply (int32_t* IN_ARRAY1, int64_t DIM1) {(const int32_t * dset_in, const int64_t dim_in)};
+%apply (int32_t* ARGOUT_ARRAY1, int64_t DIM1) {(int32_t* const dset_out, const int64_t dim_out)};
+%apply (int32_t* IN_ARRAY1, int64_t DIM1) {(const int32_t* dset_in, const int64_t dim_in)};
 /* Enable write|read_safe functions to convert numpy arrays from/to int64 arrays */
-%apply (int64_t* ARGOUT_ARRAY1, int64_t DIM1) {(int64_t * const dset_out, const int64_t dim_out)};
-%apply (int64_t* IN_ARRAY1, int64_t DIM1) {(const int64_t * dset_in, const int64_t dim_in)};
+%apply (int64_t* ARGOUT_ARRAY1, int64_t DIM1) {(int64_t* const dset_out, const int64_t dim_out)};
+%apply (int64_t* IN_ARRAY1, int64_t DIM1) {(const int64_t* dset_in, const int64_t dim_in)};
 /* This tells SWIG to treat char ** dset_in pattern as a special case
    Enables access to trexio_[...]_write_dset_str set of functions directly, i.e.
    by converting input list of strings from Python into char ** of C
 */
-%typemap(in) char ** dset_in {
+%typemap(in) char** dset_in {
   /* Check if is a list */
   if (PyList_Check($input)) {
     int size = PyList_Size($input);
@@ -105,7 +108,7 @@ import_array();
   }
 }
 /* This cleans up the char ** array we malloc-ed before */
-%typemap(freearg) char ** dset_in {
+%typemap(freearg) char** dset_in {
   free((char *) $1);
 }
diff --git a/src/templates_front/templator_front.org b/src/templates_front/templator_front.org
index 312e494..cc0d2c6 100644
--- a/src/templates_front/templator_front.org
+++ b/src/templates_front/templator_front.org
@@ -164,7 +164,7 @@ __trexio_path__ = None
    | ~TREXIO_INVALID_ID~          | 9  | 'Invalid ID'                        |
    | ~TREXIO_ALLOCATION_FAILED~   | 10 | 'Allocation failed'                 |
    | ~TREXIO_HAS_NOT~             | 11 | 'Element absent'                    |
-   | ~TREXIO_INVALID_NUM~         | 12 | 'Invalid dimensions'                |
+   | ~TREXIO_INVALID_NUM~         | 12 | 'Invalid (negative or 0) dimension' |
    | ~TREXIO_ATTR_ALREADY_EXISTS~ | 13 | 'Attribute already exists'          |
    | ~TREXIO_DSET_ALREADY_EXISTS~ | 14 | 'Dataset already exists'            |
    | ~TREXIO_OPEN_ERROR~          | 15 | 'Error opening file'                |
@@ -389,7 +389,7 @@ return '\n'.join(result)
     return "Element absent";
     break;
   case TREXIO_INVALID_NUM:
-    return "Invalid dimensions";
+    return "Invalid (negative or 0) dimension";
     break;
   case TREXIO_ATTR_ALREADY_EXISTS:
     return "Attribute already exists";
@@ -1054,6 +1054,7 @@ def close(trexio_file):
       , "charge" : [ "float", [ "nucleus.num" ] ]
       , "coord" : [ "float", [ "nucleus.num", "3" ] ]
       , "label" : [ "str" , [ "nucleus.num" ] ]
+      , "point_group" : [ "str" , [ ] ]
     }
   }
   #+end_src
@@ -1066,32 +1067,38 @@ def close(trexio_file):
   All templates presented below use the ~$var$~ notation to indicate
   the variable, which will be replaced by the
-  ~generator.py~. Sometimes the upper case is used, i.e. ~$VAR$~ (for
+  ~generator.py~. Sometimes the upper case is used, i.e. ~$VAR$~ (for
   example, in ~#define~ statements). More detailed description of
   each variable can be found below:
-  | Template variable              | Description                                         | Example              |
-  |--------------------------------+-----------------------------------------------------+----------------------|
-  | ~$group$~                      | Name of the group                                   | ~nucleus~            |
-  | ~$group_num$~                  | Name of the dimensioning variable (scalar)          | ~nucleus_num~        |
-  | ~$group_dset$~                 | Name of the dataset (vector/matrix/tensor)          | ~nucleus_coord~      |
-  | ~$group_dset_rank$~            | Rank of the dataset                                 | ~2~                  |
-  | ~$group_dset_dim$~             | Selected dimension of the dataset                   | ~nucleus_num~        |
-  | ~$group_dset_dim_list$~        | All dimensions of the dataset                       | ~{nucleus_num, 3}~   |
-  | ~$group_dset_dtype$~           | Basic type of the dataset (int/float/char)          | ~float~              |
-  | ~$group_dset_h5_dtype$~        | Type of the dataset in HDF5                         | ~double~             |
-  | ~$group_dset_std_dtype_in$~    | Input type of the dataset in TEXT [fscanf]          | ~%lf~                |
-  | ~$group_dset_std_dtype_out$~   | Output type of the dataset in TEXT [fprintf]        | ~%24.16e~            |
-  | ~$group_dset_dtype_default$~   | Default datatype of the dataset [C]                 | ~double/int32_t~     |
-  | ~$group_dset_dtype_single$~    | Single precision datatype of the dataset [C]        | ~float/int32_t~      |
-  | ~$group_dset_dtype_double$~    | Double precision datatype of the dataset [C]        | ~double/int64_t~     |
-  | ~$default_prec$~               | Default precision for read/write without suffix [C] | ~64/32~              |
-  | ~$group_dset_f_dtype_default$~ | Default datatype of the dataset [Fortran]           | ~real(8)/integer(4)~ |
-  | ~$group_dset_f_dtype_single$~  | Single precision datatype of the dataset [Fortran]  | ~real(4)/integer(4)~ |
-  | ~$group_dset_f_dtype_double$~  | Double precision datatype of the dataset [Fortran]  | ~real(8)/integer(8)~ |
-  | ~$group_dset_f_dims$~          | Dimensions in Fortran                               | ~(:,:)~              |
-  | ~$group_dset_py_dtype$~        | Standard datatype of the dataset [Python]           | ~float/int~          |
+  | Template variable              | Description                                         | Example               |
+  |--------------------------------+-----------------------------------------------------+-----------------------|
+  | ~$group$~                      | Name of the group                                   | ~nucleus~             |
+  | ~$group_num$~                  | Name of the numerical attribute (scalar)            | ~nucleus_num~         |
+  | ~$group_str$~                  | Name of the string attribute (scalar)               | ~nucleus_point_group~ |
+  | ~$group_dset$~                 | Name of the dataset (vector/matrix/tensor)          | ~nucleus_coord~       |
+  | ~$group_dset_rank$~            | Rank of the dataset                                 | ~2~                   |
+  | ~$group_dset_dim$~             | Selected dimension of the dataset                   | ~nucleus_num~         |
+  | ~$group_dset_dim_list$~        | All dimensions of the dataset                       | ~{nucleus_num, 3}~    |
+  | ~$group_dset_dtype$~           | Basic type of the dataset (int/float/char)          | ~float~               |
+  | ~$group_dset_h5_dtype$~        | Type of the dataset in HDF5                         | ~double~              |
+  | ~$group_dset_std_dtype_in$~    | Input type of the dataset in TEXT [fscanf]          | ~%lf~                 |
+  | ~$group_dset_std_dtype_out$~   | Output type of the dataset in TEXT [fprintf]        | ~%24.16e~             |
+  | ~$group_dset_dtype_default$~   | Default datatype of the dataset [C]                 | ~double/int32_t~      |
+  | ~$group_dset_dtype_single$~    | Single precision datatype of the dataset [C]        | ~float/int32_t~       |
+  | ~$group_dset_dtype_double$~    | Double precision datatype of the dataset [C]        | ~double/int64_t~      |
+  | ~$group_dset_f_dtype_default$~ | Default datatype of the dataset [Fortran]           | ~real(8)/integer(4)~  |
+  | ~$group_dset_f_dtype_single$~  | Single precision datatype of the dataset [Fortran]  | ~real(4)/integer(4)~  |
+  | ~$group_dset_f_dtype_double$~  | Double precision datatype of the dataset [Fortran]  | ~real(8)/integer(8)~  |
+  | ~$group_dset_f_dims$~          | Dimensions in Fortran                               | ~(:,:)~               |
+  | ~$group_dset_py_dtype$~        | Standard datatype of the dataset [Python]           | ~float/int~           |
+  | ~$default_prec$~               | Default precision for read/write without suffix [C] | ~64/32~               |
+  | ~$is_index$~                   | Expands to ~true~ if dataset has a type ~index~ [C] | ~true/false~          |
+
+  Some of the aforementioned template variables with ~group_dset~ prefix are duplicated with ~group_num~ prefix,
+  e.g. you might find $group_num_dtype_double$ in the templates corresponding to numerical attributes.
+  The expanding values are the same as for ~group_dset~ and thus are not listed in the table above.
   Note: parent group name is always added to the child objects upon
   construction of TREXIO (e.g. ~num~ of ~nucleus~ group becomes
@@ -1102,27 +1109,33 @@ def close(trexio_file):
   object) levels of =trex.json= . The parsed data
   is divided in 2 parts:
-  1) Dimensioning variables (contain ~num~ in their names). These are always scalar integers.
+  1) Single attributes. These can be numerical values or strings.
   2) Datasets. These can be vectors, matrices or tensors. The types are indicated in =trex.json=.
-  Currently supported types: int, float and strings.
+  Currently supported data types: int, float and strings.
   For each of the aforementioned objects, TREXIO provides *has*,
   *read* and *write* functionality. TREXIO supports I/O with single
   or double precision for integer and floating point numbers.
-** Templates for front end has/read/write a single dimensioning variable
+  *Note:* single integer attributes that contain ~num~ in their name (e.g. ~nucleus_num~) are
+  considered dimensioning variables and cannot be negative or 0.
An attempt to write negative or 0 + value will result in ~TREXIO_INVALID_NUM~ exit code. - This section concerns API calls related to dimensioning variables. +** Templates for front end has/read/write a single numerical attribute +*** Introduction - | Function name | Description | Precision | - |-------------------------------+---------------------------------------------------+-----------| - | ~trexio_has_$group_num$~ | Check if a dimensioning variable exists in a file | --- | - | ~trexio_read_$group_num$~ | Read a dimensioning variable | Single | - | ~trexio_write_$group_num$~ | Write a dimensioning variable | Single | - | ~trexio_read_$group_num$_32~ | Read a dimensioning variable | Single | - | ~trexio_write_$group_num$_32~ | Write a dimensioning variable | Single | - | ~trexio_read_$group_num$_64~ | Read a dimensioning variable | Double | - | ~trexio_write_$group_num$_64~ | Write a dimensioning variable | Double | + This section concerns API calls related to numerical attributes, + namely single values of int or float type. + + | Function name | Description | Precision | + |-------------------------------+----------------------------------------+-----------| + | ~trexio_has_$group_num$~ | Check if an attribute exists in a file | --- | + | ~trexio_read_$group_num$~ | Read an attribute | Single | + | ~trexio_write_$group_num$~ | Write an attribute | Single | + | ~trexio_read_$group_num$_32~ | Read an attribute | Single | + | ~trexio_write_$group_num$_32~ | Write an attribute | Single | + | ~trexio_read_$group_num$_64~ | Read an attribute | Double | + | ~trexio_write_$group_num$_64~ | Write an attribute | Double | *** C templates for front end @@ -1135,70 +1148,39 @@ def close(trexio_file): (non-suffixed) API call on dimensioning variables deals with single precision (see Table above). 
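The rule this change introduces for numerical attributes (a dimensioning variable must be strictly positive, an ordinary numerical attribute may be 0, and an explicit flag records whether a value was ever written) can be sketched in plain C as follows. This is only an illustration under stated assumptions: ~demo_exit_code~, ~demo_attr~, ~demo_write~, ~demo_read~ and the ~is_dimension~ parameter are hypothetical names invented for this sketch and are not part of the TREXIO API; only the ordering of the checks is taken from the templates in this diff.

```c
#include <assert.h>   /* only needed for the usage checks described below */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical exit codes, modelled on the names that appear in this diff. */
typedef enum {
  DEMO_SUCCESS = 0,
  DEMO_INVALID_ARG_1,
  DEMO_INVALID_NUM,
  DEMO_ATTR_ALREADY_EXISTS,
  DEMO_ATTR_MISSING
} demo_exit_code;

/* Hypothetical in-memory attribute: a value plus an explicit "is set" flag,
   so that a stored 0 can be told apart from "never written". */
typedef struct {
  int64_t value;
  bool    is_set;
} demo_attr;

/* Write once; reject non-positive values only for dimensioning variables. */
demo_exit_code demo_write(demo_attr* attr, int64_t num, bool is_dimension)
{
  if (attr == NULL)              return DEMO_INVALID_ARG_1;
  if (is_dimension && num <= 0)  return DEMO_INVALID_NUM;
  if (attr->is_set)              return DEMO_ATTR_ALREADY_EXISTS;

  attr->value  = num;
  attr->is_set = true;
  return DEMO_SUCCESS;
}

/* Read back the value; fail if nothing has been written yet. */
demo_exit_code demo_read(const demo_attr* attr, int64_t* num)
{
  if (attr == NULL || num == NULL) return DEMO_INVALID_ARG_1;
  if (!attr->is_set)               return DEMO_ATTR_MISSING;

  *num = attr->value;
  return DEMO_SUCCESS;
}
```

With this sketch, writing 0 with ~is_dimension~ set leaves the attribute unset and a later read reports it as missing, whereas writing 0 to an ordinary attribute succeeds and reads back as 0. This mirrors the ~mo_num~ and ~ao_cartesian~ cases exercised in the tests of this change.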
+**** Function declarations - #+begin_src c :tangle hrw_num_front.h :exports none + #+begin_src c :tangle hrw_attr_num_front.h :exports none trexio_exit_code trexio_has_$group_num$(trexio_t* const file); -trexio_exit_code trexio_read_$group_num$(trexio_t* const file, int32_t* const num); -trexio_exit_code trexio_write_$group_num$(trexio_t* const file, const int32_t num); -trexio_exit_code trexio_read_$group_num$_32(trexio_t* const file, int32_t* const num); -trexio_exit_code trexio_write_$group_num$_32(trexio_t* const file, const int32_t num); -trexio_exit_code trexio_read_$group_num$_64(trexio_t* const file, int64_t* const num); -trexio_exit_code trexio_write_$group_num$_64(trexio_t* const file, const int64_t num); +trexio_exit_code trexio_read_$group_num$(trexio_t* const file, $group_num_dtype_default$* const num); +trexio_exit_code trexio_write_$group_num$(trexio_t* const file, const $group_num_dtype_default$ num); +trexio_exit_code trexio_read_$group_num$_32(trexio_t* const file, $group_num_dtype_single$* const num); +trexio_exit_code trexio_write_$group_num$_32(trexio_t* const file, const $group_num_dtype_single$ num); +trexio_exit_code trexio_read_$group_num$_64(trexio_t* const file, $group_num_dtype_double$* const num); +trexio_exit_code trexio_write_$group_num$_64(trexio_t* const file, const $group_num_dtype_double$ num); #+end_src - #+begin_src c :tangle read_num_64_front.c +**** Source code for double precision functions + + #+begin_src c :tangle read_attr_num_64_front.c trexio_exit_code -trexio_read_$group_num$_64 (trexio_t* const file, int64_t* const num) +trexio_read_$group_num$_64 (trexio_t* const file, $group_num_dtype_double$* const num) { if (file == NULL) return TREXIO_INVALID_ARG_1; if (trexio_has_$group_num$(file) != TREXIO_SUCCESS) return TREXIO_ATTR_MISSING; - uint64_t u_num = 0; - trexio_exit_code rc = TREXIO_GROUP_READ_ERROR; - switch (file->back_end) { case TREXIO_TEXT: - rc = trexio_text_read_$group_num$(file, &u_num); + return 
trexio_text_read_$group_num$(file, num); break; case TREXIO_HDF5: - rc = trexio_hdf5_read_$group_num$(file, &u_num); + return trexio_hdf5_read_$group_num$(file, num); break; /* case TREXIO_JSON: - rc =trexio_json_read_$group_num$(file, &u_num); - break; -,*/ - } - - if (rc != TREXIO_SUCCESS) return rc; - - *num = (int64_t) u_num; - return TREXIO_SUCCESS; -} - #+end_src - - #+begin_src c :tangle write_num_64_front.c -trexio_exit_code -trexio_write_$group_num$_64 (trexio_t* const file, const int64_t num) -{ - if (file == NULL) return TREXIO_INVALID_ARG_1; - if (num < 0 ) return TREXIO_INVALID_ARG_2; - if (trexio_has_$group_num$(file) == TREXIO_SUCCESS) return TREXIO_ATTR_ALREADY_EXISTS; - - switch (file->back_end) { - - case TREXIO_TEXT: - return trexio_text_write_$group_num$(file, (int64_t) num); - break; - - case TREXIO_HDF5: - return trexio_hdf5_write_$group_num$(file, (int64_t) num); - break; -/* - case TREXIO_JSON: - return trexio_json_write_$group_num$(file, (int64_t) num); + return trexio_json_read_$group_num$(file, num); break; ,*/ } @@ -1207,60 +1189,26 @@ trexio_write_$group_num$_64 (trexio_t* const file, const int64_t num) } #+end_src - #+begin_src c :tangle read_num_32_front.c + #+begin_src c :tangle write_attr_num_64_front.c trexio_exit_code -trexio_read_$group_num$_32 (trexio_t* const file, int32_t* const num) +trexio_write_$group_num$_64 (trexio_t* const file, const $group_num_dtype_double$ num) { if (file == NULL) return TREXIO_INVALID_ARG_1; - if (trexio_has_$group_num$(file) != TREXIO_SUCCESS) return TREXIO_ATTR_MISSING; - - uint64_t u_num = 0; - trexio_exit_code rc = TREXIO_GROUP_READ_ERROR; - - switch (file->back_end) { - - case TREXIO_TEXT: - rc = trexio_text_read_$group_num$(file, &u_num); - break; - - case TREXIO_HDF5: - rc = trexio_hdf5_read_$group_num$(file, &u_num); - break; -/* - case TREXIO_JSON: - rc =trexio_json_read_$group_num$(file, &u_num); - break; -,*/ - } - - if (rc != TREXIO_SUCCESS) return rc; - - *num = (int32_t) u_num; - return 
TREXIO_SUCCESS; -} - #+end_src - - #+begin_src c :tangle write_num_32_front.c -trexio_exit_code -trexio_write_$group_num$_32 (trexio_t* const file, const int32_t num) -{ - - if (file == NULL) return TREXIO_INVALID_ARG_1; - if (num < 0 ) return TREXIO_INVALID_ARG_2; + //if (num <= 0L) return TREXIO_INVALID_NUM; /* this line is uncommented by the generator for dimensioning variables; do NOT remove! */ if (trexio_has_$group_num$(file) == TREXIO_SUCCESS) return TREXIO_ATTR_ALREADY_EXISTS; switch (file->back_end) { case TREXIO_TEXT: - return trexio_text_write_$group_num$(file, (int64_t) num); - break; - - case TREXIO_HDF5: - return trexio_hdf5_write_$group_num$(file, (int64_t) num); - break; -/* - case TREXIO_JSON: - return trexio_json_write_$group_num$(file, (int64_t) num); + return trexio_text_write_$group_num$(file, num); + break; + + case TREXIO_HDF5: + return trexio_hdf5_write_$group_num$(file, num); + break; +/* + case TREXIO_JSON: + return trexio_json_write_$group_num$(file, num); break; ,*/ } @@ -1269,23 +1217,89 @@ trexio_write_$group_num$_32 (trexio_t* const file, const int32_t num) } #+end_src - #+begin_src c :tangle read_num_def_front.c +**** Source code for single precision functions + + #+begin_src c :tangle read_attr_num_32_front.c trexio_exit_code -trexio_read_$group_num$ (trexio_t* const file, int32_t* const num) +trexio_read_$group_num$_32 (trexio_t* const file, $group_num_dtype_single$* const num) { - return trexio_read_$group_num$_32(file, num); + if (file == NULL) return TREXIO_INVALID_ARG_1; + if (trexio_has_$group_num$(file) != TREXIO_SUCCESS) return TREXIO_ATTR_MISSING; + + $group_num_dtype_double$ num_64 = 0; + trexio_exit_code rc = TREXIO_GROUP_READ_ERROR; + + switch (file->back_end) { + + case TREXIO_TEXT: + rc = trexio_text_read_$group_num$(file, &num_64); + break; + + case TREXIO_HDF5: + rc = trexio_hdf5_read_$group_num$(file, &num_64); + break; +/* + case TREXIO_JSON: + rc =trexio_json_read_$group_num$(file, &num_64); + break; +,*/ + } + + 
if (rc != TREXIO_SUCCESS) return rc; + + *num = ($group_num_dtype_single$) num_64; + return TREXIO_SUCCESS; } #+end_src - #+begin_src c :tangle write_num_def_front.c + #+begin_src c :tangle write_attr_num_32_front.c trexio_exit_code -trexio_write_$group_num$ (trexio_t* const file, const int32_t num) +trexio_write_$group_num$_32 (trexio_t* const file, const $group_num_dtype_single$ num) { - return trexio_write_$group_num$_32(file, num); + + if (file == NULL) return TREXIO_INVALID_ARG_1; + //if (num <= 0) return TREXIO_INVALID_NUM; /* this line is uncommented by the generator for dimensioning variables; do NOT remove! */ + if (trexio_has_$group_num$(file) == TREXIO_SUCCESS) return TREXIO_ATTR_ALREADY_EXISTS; + + switch (file->back_end) { + + case TREXIO_TEXT: + return trexio_text_write_$group_num$(file, ($group_num_dtype_double$) num); + break; + + case TREXIO_HDF5: + return trexio_hdf5_write_$group_num$(file, ($group_num_dtype_double$) num); + break; +/* + case TREXIO_JSON: + return trexio_json_write_$group_num$(file, ($group_num_dtype_double$) num); + break; +,*/ + } + + return TREXIO_FAILURE; } #+end_src - #+begin_src c :tangle has_num_front.c +**** Source code for default functions + + #+begin_src c :tangle read_attr_num_def_front.c +trexio_exit_code +trexio_read_$group_num$ (trexio_t* const file, $group_num_dtype_default$* const num) +{ + return trexio_read_$group_num$_$default_prec$(file, num); +} + #+end_src + + #+begin_src c :tangle write_attr_num_def_front.c +trexio_exit_code +trexio_write_$group_num$ (trexio_t* const file, const $group_num_dtype_default$ num) +{ + return trexio_write_$group_num$_$default_prec$(file, num); +} + #+end_src + + #+begin_src c :tangle has_attr_num_front.c trexio_exit_code trexio_has_$group_num$ (trexio_t* const file) { @@ -1319,67 +1333,67 @@ trexio_has_$group_num$ (trexio_t* const file) The ~Fortran~ templates that provide an access to the ~C~ API calls from Fortran. These templates are based on the use of ~iso_c_binding~. 
Pointers have to be passed by value. - #+begin_src f90 :tangle write_num_64_front_fortran.f90 + #+begin_src f90 :tangle write_attr_num_64_front_fortran.f90 interface integer function trexio_write_$group_num$_64 (trex_file, num) bind(C) use, intrinsic :: iso_c_binding integer(8), intent(in), value :: trex_file - integer(8), intent(in), value :: num + $group_num_f_dtype_double$, intent(in), value :: num end function trexio_write_$group_num$_64 end interface #+end_src - #+begin_src f90 :tangle read_num_64_front_fortran.f90 + #+begin_src f90 :tangle read_attr_num_64_front_fortran.f90 interface integer function trexio_read_$group_num$_64 (trex_file, num) bind(C) use, intrinsic :: iso_c_binding integer(8), intent(in), value :: trex_file - integer(8), intent(out) :: num + $group_num_f_dtype_double$, intent(out) :: num end function trexio_read_$group_num$_64 end interface #+end_src - #+begin_src f90 :tangle write_num_32_front_fortran.f90 + #+begin_src f90 :tangle write_attr_num_32_front_fortran.f90 interface integer function trexio_write_$group_num$_32 (trex_file, num) bind(C) use, intrinsic :: iso_c_binding integer(8), intent(in), value :: trex_file - integer(4), intent(in), value :: num + $group_num_f_dtype_single$, intent(in), value :: num end function trexio_write_$group_num$_32 end interface #+end_src - #+begin_src f90 :tangle read_num_32_front_fortran.f90 + #+begin_src f90 :tangle read_attr_num_32_front_fortran.f90 interface integer function trexio_read_$group_num$_32 (trex_file, num) bind(C) use, intrinsic :: iso_c_binding integer(8), intent(in), value :: trex_file - integer(4), intent(out) :: num + $group_num_f_dtype_single$, intent(out) :: num end function trexio_read_$group_num$_32 end interface #+end_src - #+begin_src f90 :tangle write_num_def_front_fortran.f90 + #+begin_src f90 :tangle write_attr_num_def_front_fortran.f90 interface integer function trexio_write_$group_num$ (trex_file, num) bind(C) use, intrinsic :: iso_c_binding integer(8), intent(in), value :: 
trex_file - integer(4), intent(in), value :: num + $group_num_f_dtype_default$, intent(in), value :: num end function trexio_write_$group_num$ end interface #+end_src - #+begin_src f90 :tangle read_num_def_front_fortran.f90 + #+begin_src f90 :tangle read_attr_num_def_front_fortran.f90 interface integer function trexio_read_$group_num$ (trex_file, num) bind(C) use, intrinsic :: iso_c_binding integer(8), intent(in), value :: trex_file - integer(4), intent(out) :: num + $group_num_f_dtype_default$, intent(out) :: num end function trexio_read_$group_num$ end interface #+end_src - #+begin_src f90 :tangle has_num_front_fortran.f90 + #+begin_src f90 :tangle has_attr_num_front_fortran.f90 interface integer function trexio_has_$group_num$ (trex_file) bind(C) use, intrinsic :: iso_c_binding @@ -1390,8 +1404,8 @@ end interface *** Python templates for front end - #+begin_src python :tangle write_num_front.py -def write_$group_num$(trexio_file, num_w: int) -> None: + #+begin_src python :tangle write_attr_num_front.py +def write_$group_num$(trexio_file, num_w: $group_num_py_dtype$) -> None: """Write the $group_num$ variable in the TREXIO file. Parameters: @@ -1415,8 +1429,8 @@ def write_$group_num$(trexio_file, num_w: int) -> None: raise #+end_src - #+begin_src python :tangle read_num_front.py -def read_$group_num$(trexio_file) -> int: + #+begin_src python :tangle read_attr_num_front.py +def read_$group_num$(trexio_file) -> $group_num_py_dtype$: """Read the $group_num$ variable from the TREXIO file. Parameter is a ~TREXIO File~ object that has been created by a call to ~open~ function. @@ -1440,7 +1454,7 @@ def read_$group_num$(trexio_file) -> int: return num_r #+end_src - #+begin_src python :tangle has_num_front.py + #+begin_src python :tangle has_attr_num_front.py def has_$group_num$(trexio_file) -> bool: """Check that $group_num$ variable exists in the TREXIO file. 
@@ -1468,6 +1482,7 @@ def has_$group_num$(trexio_file) -> bool: #+end_src ** Templates for front end has/read/write a dataset of numerical data +*** Introduction This section concerns API calls related to datasets. @@ -2324,6 +2339,7 @@ trexio_read_chunk_ao_2e_int_eri_value_64(trexio_t* const file, First parameter is the ~TREXIO~ file handle. Second parameter is the variable to be written/read to/from the ~TREXIO~ file (except for ~trexio_has_~ functions). +**** Function declarations #+begin_src c :tangle hrw_dset_str_front.h :exports none trexio_exit_code trexio_has_$group_dset$(trexio_t* const file); @@ -2333,6 +2349,8 @@ trexio_exit_code trexio_read_$group_dset$(trexio_t* const file, char** dset_out, trexio_exit_code trexio_write_$group_dset$(trexio_t* const file, const char** dset_in, const int32_t max_str_len); #+end_src +**** Source code for default functions + #+begin_src c :tangle read_dset_str_front.c trexio_exit_code trexio_read_$group_dset$_low (trexio_t* const file, char* dset_out, const int32_t max_str_len) @@ -2776,6 +2794,7 @@ def has_$group_dset$(trexio_file) -> bool: | ~trexio_write_$group_str$~ | Write a string attribute | *** C templates for front end +**** Function declarations #+begin_src c :tangle hrw_attr_str_front.h :exports none trexio_exit_code trexio_has_$group_str$(trexio_t* const file); @@ -2783,6 +2802,8 @@ trexio_exit_code trexio_read_$group_str$(trexio_t* const file, char* const str_o trexio_exit_code trexio_write_$group_str$(trexio_t* const file, const char* str, const int32_t max_str_len); #+end_src +**** Source code for default functions + #+begin_src c :tangle read_attr_str_front.c trexio_exit_code trexio_read_$group_str$ (trexio_t* const file, char* const str_out, const int32_t max_str_len) diff --git a/src/templates_hdf5/templator_hdf5.org b/src/templates_hdf5/templator_hdf5.org index 90a9881..8ff47d1 100644 --- a/src/templates_hdf5/templator_hdf5.org +++ b/src/templates_hdf5/templator_hdf5.org @@ -152,18 +152,18 @@ 
trexio_hdf5_deinit (trexio_t* const file) } #+end_src -** Template for HDF5 has/read/write a single dimensioning variable +** Template for HDF5 has/read/write the numerical attribute - #+begin_src c :tangle hrw_num_hdf5.h :exports none + #+begin_src c :tangle hrw_attr_num_hdf5.h :exports none trexio_exit_code trexio_hdf5_has_$group_num$ (trexio_t* const file); -trexio_exit_code trexio_hdf5_read_$group_num$ (trexio_t* const file, uint64_t* const num); -trexio_exit_code trexio_hdf5_write_$group_num$(trexio_t* const file, const uint64_t num); +trexio_exit_code trexio_hdf5_read_$group_num$ (trexio_t* const file, $group_num_dtype_double$* const num); +trexio_exit_code trexio_hdf5_write_$group_num$(trexio_t* const file, const $group_num_dtype_double$ num); #+end_src - #+begin_src c :tangle read_num_hdf5.c + #+begin_src c :tangle read_attr_num_hdf5.c trexio_exit_code -trexio_hdf5_read_$group_num$ (trexio_t* const file, uint64_t* const num) +trexio_hdf5_read_$group_num$ (trexio_t* const file, $group_num_dtype_double$* const num) { if (file == NULL) return TREXIO_INVALID_ARG_1; @@ -177,7 +177,7 @@ trexio_hdf5_read_$group_num$ (trexio_t* const file, uint64_t* const num) const hid_t num_id = H5Aopen(f->$group$_group, $GROUP_NUM$_NAME, H5P_DEFAULT); if (num_id <= 0) return TREXIO_INVALID_ID; - const herr_t status = H5Aread(num_id, H5T_NATIVE_UINT64, num); + const herr_t status = H5Aread(num_id, H5T_$GROUP_NUM_H5_DTYPE$, num); H5Aclose(num_id); @@ -189,66 +189,44 @@ trexio_hdf5_read_$group_num$ (trexio_t* const file, uint64_t* const num) #+end_src - #+begin_src c :tangle write_num_hdf5.c + #+begin_src c :tangle write_attr_num_hdf5.c trexio_exit_code -trexio_hdf5_write_$group_num$ (trexio_t* const file, const uint64_t num) +trexio_hdf5_write_$group_num$ (trexio_t* const file, const $group_num_dtype_double$ num) { if (file == NULL) return TREXIO_INVALID_ARG_1; - if (num == 0L ) return TREXIO_INVALID_ARG_2; trexio_hdf5_t* const f = (trexio_hdf5_t*) file; - if 
(H5Aexists(f->$group$_group, $GROUP_NUM$_NAME) == 0) { - - /* Write the dimensioning variables */ - const hid_t dtype = H5Tcopy(H5T_NATIVE_UINT64); - const hid_t dspace = H5Screate(H5S_SCALAR); - - const hid_t num_id = H5Acreate(f->$group$_group, $GROUP_NUM$_NAME, dtype, dspace, - H5P_DEFAULT, H5P_DEFAULT); - if (num_id <= 0) { - H5Sclose(dspace); - H5Tclose(dtype); - return TREXIO_INVALID_ID; - } - - const herr_t status = H5Awrite(num_id, dtype, &(num)); - if (status < 0) { - H5Aclose(num_id); - H5Sclose(dspace); - H5Tclose(dtype); - return TREXIO_FAILURE; - } - + /* Write the dimensioning variables */ + const hid_t dtype = H5Tcopy(H5T_$GROUP_NUM_H5_DTYPE$); + const hid_t dspace = H5Screate(H5S_SCALAR); + + const hid_t num_id = H5Acreate(f->$group$_group, $GROUP_NUM$_NAME, + dtype, dspace, H5P_DEFAULT, H5P_DEFAULT); + if (num_id <= 0) { H5Sclose(dspace); - H5Aclose(num_id); H5Tclose(dtype); - return TREXIO_SUCCESS; - - } else { - - uint64_t infile_num; - trexio_exit_code rc = trexio_hdf5_read_$group_num$(file, &(infile_num)); - if (rc != TREXIO_SUCCESS) return rc; - - const hid_t dtype = H5Tcopy(H5T_NATIVE_UINT64); - const hid_t num_id = H5Aopen(f->$group$_group, $GROUP_NUM$_NAME, H5P_DEFAULT); - if (num_id <= 0) return TREXIO_INVALID_ID; - - const herr_t status = H5Awrite(num_id, dtype, &(num)); - if (status < 0) return TREXIO_FAILURE; - - H5Aclose(num_id); - H5Tclose(dtype); - - return TREXIO_SUCCESS; + return TREXIO_INVALID_ID; } + + const herr_t status = H5Awrite(num_id, dtype, &(num)); + if (status < 0) { + H5Aclose(num_id); + H5Sclose(dspace); + H5Tclose(dtype); + return TREXIO_FAILURE; + } + + H5Sclose(dspace); + H5Aclose(num_id); + H5Tclose(dtype); + return TREXIO_SUCCESS; } #+end_src - #+begin_src c :tangle has_num_hdf5.c + #+begin_src c :tangle has_attr_num_hdf5.c trexio_exit_code trexio_hdf5_has_$group_num$ (trexio_t* const file) { @@ -270,7 +248,7 @@ trexio_hdf5_has_$group_num$ (trexio_t* const file) } #+end_src -** Template for HDF5 has/read/write a 
dataset of numerical data +** Template for HDF5 has/read/write the dataset of numerical data #+begin_src c :tangle hrw_dset_data_hdf5.h :exports none trexio_exit_code trexio_hdf5_has_$group_dset$(trexio_t* const file); @@ -341,12 +319,6 @@ trexio_hdf5_write_$group_dset$ (trexio_t* const file, const $group_dset_dtype$* if (file == NULL) return TREXIO_INVALID_ARG_1; if ($group_dset$ == NULL) return TREXIO_INVALID_ARG_2; - trexio_exit_code rc; - uint64_t $group_dset_dim$; - // error handling for rc is added by the generator - rc = trexio_hdf5_read_$group_dset_dim$(file, &($group_dset_dim$)); - if ($group_dset_dim$ == 0L) return TREXIO_INVALID_NUM; - trexio_hdf5_t* f = (trexio_hdf5_t*) file; if ( H5LTfind_dataset(f->$group$_group, $GROUP_DSET$_NAME) != 1 ) { @@ -400,7 +372,7 @@ trexio_hdf5_has_$group_dset$ (trexio_t* const file) } #+end_src -** Template for HDF5 has/read/write a dataset of strings +** Template for HDF5 has/read/write the dataset of strings #+begin_src c :tangle hrw_dset_str_hdf5.h :exports none trexio_exit_code trexio_hdf5_has_$group_dset$(trexio_t* const file); @@ -523,12 +495,6 @@ trexio_hdf5_write_$group_dset$ (trexio_t* const file, const char** $group_dset$, if (file == NULL) return TREXIO_INVALID_ARG_1; if ($group_dset$ == NULL) return TREXIO_INVALID_ARG_2; - trexio_exit_code rc; - uint64_t $group_dset_dim$; - // error handling for rc is added by the generator - rc = trexio_hdf5_read_$group_dset_dim$(file, &($group_dset_dim$)); - if ($group_dset_dim$ == 0L) return TREXIO_INVALID_NUM; - trexio_hdf5_t* f = (trexio_hdf5_t*) file; herr_t status; @@ -612,7 +578,7 @@ trexio_hdf5_has_$group_dset$ (trexio_t* const file) } #+end_src -** Template for HDF5 has/read/write a single string attribute +** Template for HDF5 has/read/write the string attribute #+begin_src c :tangle hrw_attr_str_hdf5.h :exports none trexio_exit_code trexio_hdf5_has_$group_str$ (trexio_t* const file); diff --git a/src/templates_text/build.sh b/src/templates_text/build.sh index 
dba0bfb..4cf6c64 100644 --- a/src/templates_text/build.sh +++ b/src/templates_text/build.sh @@ -19,19 +19,19 @@ cat populated/pop_flush_group_text.h >> trexio_text.h cat populated/pop_has_dset_data_text.c >> trexio_text.c cat populated/pop_has_dset_str_text.c >> trexio_text.c -cat populated/pop_has_num_text.c >> trexio_text.c +cat populated/pop_has_attr_num_text.c >> trexio_text.c cat populated/pop_has_attr_str_text.c >> trexio_text.c cat populated/pop_read_dset_data_text.c >> trexio_text.c cat populated/pop_read_dset_str_text.c >> trexio_text.c cat populated/pop_read_attr_str_text.c >> trexio_text.c -cat populated/pop_read_num_text.c >> trexio_text.c +cat populated/pop_read_attr_num_text.c >> trexio_text.c cat populated/pop_write_dset_data_text.c >> trexio_text.c cat populated/pop_write_dset_str_text.c >> trexio_text.c cat populated/pop_write_attr_str_text.c >> trexio_text.c -cat populated/pop_write_num_text.c >> trexio_text.c -cat populated/pop_hrw_num_text.h >> trexio_text.h +cat populated/pop_write_attr_num_text.c >> trexio_text.c cat populated/pop_hrw_dset_data_text.h >> trexio_text.h cat populated/pop_hrw_dset_str_text.h >> trexio_text.h +cat populated/pop_hrw_attr_num_text.h >> trexio_text.h cat populated/pop_hrw_attr_str_text.h >> trexio_text.h cat rdm_text.c >> trexio_text.c diff --git a/src/templates_text/templator_text.org b/src/templates_text/templator_text.org index 4fcc892..4d6cb71 100644 --- a/src/templates_text/templator_text.org +++ b/src/templates_text/templator_text.org @@ -46,6 +46,7 @@ #include #include #include +#include #+end_src @@ -78,7 +79,8 @@ #+begin_src c :tangle struct_text_group_dset.h typedef struct $group$_s { - uint64_t $group_num$; + $group_num_dtype_double$ $group_num$; + bool $group_num$_isSet; $group_dset_dtype$* $group_dset$; uint32_t rank_$group_dset$; uint32_t to_flush; @@ -346,18 +348,21 @@ trexio_text_read_$group$ (trexio_text_t* const file) } // END REPEAT GROUP_DSET_ALL + unsigned int local_isSet; // START REPEAT 
GROUP_NUM /* Read data */ rc = fscanf(f, "%1023s", buffer); - assert(!((rc != 1) || (strcmp(buffer, "$group_num$") != 0))); - if ((rc != 1) || (strcmp(buffer, "$group_num$") != 0)) { + assert(!((rc != 1) || (strcmp(buffer, "$group_num$_isSet") != 0))); + if ((rc != 1) || (strcmp(buffer, "$group_num$_isSet") != 0)) { FREE(buffer); fclose(f); FREE($group$); return NULL; } - - rc = fscanf(f, "%" SCNu64 "", &($group$->$group_num$)); + + /* additional parameter local_isSet is needed to suppress warning when fscanf into bool variable using %u or %d */ + rc = fscanf(f, "%u", &(local_isSet)); + $group$->$group_num$_isSet = (bool) local_isSet; assert(!(rc != 1)); if (rc != 1) { FREE(buffer); @@ -365,6 +370,26 @@ trexio_text_read_$group$ (trexio_text_t* const file) FREE($group$); return NULL; } + + if ($group$->$group_num$_isSet == true) { + rc = fscanf(f, "%1023s", buffer); + assert(!((rc != 1) || (strcmp(buffer, "$group_num$") != 0))); + if ((rc != 1) || (strcmp(buffer, "$group_num$") != 0)) { + FREE(buffer); + fclose(f); + FREE($group$); + return NULL; + } + + rc = fscanf(f, "%$group_num_std_dtype_in$", &($group$->$group_num$)); + assert(!(rc != 1)); + if (rc != 1) { + FREE(buffer); + fclose(f); + FREE($group$); + return NULL; + } + } // END REPEAT GROUP_NUM // START REPEAT GROUP_ATTR_STR @@ -458,49 +483,51 @@ trexio_text_read_$group$ (trexio_text_t* const file) // START REPEAT GROUP_DSET_STR /* Allocate arrays */ - $group$->$group_dset$ = CALLOC(size_$group_dset$, $group_dset_dtype$); - assert (!($group$->$group_dset$ == NULL)); - if ($group$->$group_dset$ == NULL) { - FREE(buffer); - fclose(f); - FREE($group$->$group_dset$); - FREE($group$); - return NULL; - } - - rc = fscanf(f, "%1023s", buffer); - assert(!((rc != 1) || (strcmp(buffer, "$group_dset$") != 0))); - if ((rc != 1) || (strcmp(buffer, "$group_dset$") != 0)) { - FREE(buffer); - fclose(f); - FREE($group$->$group_dset$); - FREE($group$); - return NULL; - } - - /* WARNING: this tmp array allows to avoid 
allocation of space for each element of array of string - , BUT it's size has to be number_of_str*max_len_str where max_len_str is somewhat arbitrary, e.g. 32. - ,*/ - char* tmp_$group_dset$; - if(size_$group_dset$ != 0) tmp_$group_dset$ = CALLOC(size_$group_dset$*32, char); - - for (uint64_t i=0 ; i<size_$group_dset$ ; ++i) { - $group$->$group_dset$[i] = tmp_$group_dset$; - /* conventional fcanf with "%s" only return the string before the first space character - ,* to read string with spaces use "%[^\n]" possible with space before or after, i.e. " %[^\n]" - ,* Q: depending on what ? */ - rc = fscanf(f, " %1023[^\n]", tmp_$group_dset$); - assert(!(rc != 1)); - if (rc != 1) { - FREE(buffer); - fclose(f); - FREE($group$->$group_dset$); - FREE($group$); - return NULL; + if(size_$group_dset$ != 0) { + $group$->$group_dset$ = CALLOC(size_$group_dset$, $group_dset_dtype$); + assert (!($group$->$group_dset$ == NULL)); + if ($group$->$group_dset$ == NULL) { + FREE(buffer); + fclose(f); + FREE($group$->$group_dset$); + FREE($group$); + return NULL; + } + + rc = fscanf(f, "%1023s", buffer); + assert(!((rc != 1) || (strcmp(buffer, "$group_dset$") != 0))); + if ((rc != 1) || (strcmp(buffer, "$group_dset$") != 0)) { + FREE(buffer); + fclose(f); + FREE($group$->$group_dset$); + FREE($group$); + return NULL; + } + + /* WARNING: this tmp array allows to avoid allocation of space for each element of array of string + , BUT its size has to be number_of_str*max_len_str where max_len_str is somewhat arbitrary, e.g. 32. + ,*/ + char* tmp_$group_dset$; + tmp_$group_dset$ = CALLOC(size_$group_dset$*32, char); + + for (uint64_t i=0 ; i<size_$group_dset$ ; ++i) { + $group$->$group_dset$[i] = tmp_$group_dset$; + /* conventional fscanf with "%s" only return the string before the first space character + ,* to read string with spaces use "%[^\n]" possible with space before or after, i.e. " %[^\n]" + ,* Q: depending on what ? 
*/ + rc = fscanf(f, " %1023[^\n]", tmp_$group_dset$); + assert(!(rc != 1)); + if (rc != 1) { + FREE(buffer); + fclose(f); + FREE($group$->$group_dset$); + FREE($group$); + return NULL; + } + + size_t tmp_$group_dset$_len = strlen($group$->$group_dset$[i]); + tmp_$group_dset$ += tmp_$group_dset$_len + 1; } - - size_t tmp_$group_dset$_len = strlen($group$->$group_dset$[i]); - tmp_$group_dset$ += tmp_$group_dset$_len + 1; } // END REPEAT GROUP_DSET_STR @@ -555,7 +582,8 @@ trexio_text_flush_$group$ (trexio_text_t* const file) // END REPEAT GROUP_DSET_ALL // START REPEAT GROUP_NUM - fprintf(f, "$group_num$ %" PRIu64 "\n", $group$->$group_num$); + fprintf(f, "$group_num$_isSet %u \n", $group$->$group_num$_isSet); + if ($group$->$group_num$_isSet == true) fprintf(f, "$group_num$ %$group_num_std_dtype_out$ \n", $group$->$group_num$); // END REPEAT GROUP_NUM // START REPEAT GROUP_ATTR_STR @@ -624,17 +652,17 @@ trexio_text_free_$group$ (trexio_text_t* const file) } #+end_src -** Template for has/read/write the num attribute +** Template for has/read/write the numerical attribute - #+begin_src c :tangle hrw_num_text.h :exports none + #+begin_src c :tangle hrw_attr_num_text.h :exports none trexio_exit_code trexio_text_has_$group_num$ (trexio_t* const file); -trexio_exit_code trexio_text_read_$group_num$ (trexio_t* const file, uint64_t* const num); -trexio_exit_code trexio_text_write_$group_num$(trexio_t* const file, const uint64_t num); +trexio_exit_code trexio_text_read_$group_num$ (trexio_t* const file, $group_num_dtype_double$* const num); +trexio_exit_code trexio_text_write_$group_num$(trexio_t* const file, const $group_num_dtype_double$ num); #+end_src - #+begin_src c :tangle read_num_text.c + #+begin_src c :tangle read_attr_num_text.c trexio_exit_code -trexio_text_read_$group_num$ (trexio_t* const file, uint64_t* const num) +trexio_text_read_$group_num$ (trexio_t* const file, $group_num_dtype_double$* const num) { if (file == NULL) return TREXIO_INVALID_ARG_1; @@ -650,9 
+678,9 @@ trexio_text_read_$group_num$ (trexio_t* const file, uint64_t* const num) } #+end_src - #+begin_src c :tangle write_num_text.c + #+begin_src c :tangle write_attr_num_text.c trexio_exit_code -trexio_text_write_$group_num$ (trexio_t* const file, const uint64_t num) +trexio_text_write_$group_num$ (trexio_t* const file, const $group_num_dtype_double$ num) { if (file == NULL) return TREXIO_INVALID_ARG_1; @@ -662,6 +690,7 @@ trexio_text_write_$group_num$ (trexio_t* const file, const uint64_t num) if ($group$ == NULL) return TREXIO_FAILURE; $group$->$group_num$ = num; + $group$->$group_num$_isSet = true; $group$->to_flush = 1; return TREXIO_SUCCESS; @@ -669,7 +698,7 @@ trexio_text_write_$group_num$ (trexio_t* const file, const uint64_t num) } #+end_src - #+begin_src c :tangle has_num_text.c + #+begin_src c :tangle has_attr_num_text.c trexio_exit_code trexio_text_has_$group_num$ (trexio_t* const file) { @@ -678,7 +707,7 @@ trexio_text_has_$group_num$ (trexio_t* const file) $group$_t* $group$ = trexio_text_read_$group$((trexio_text_t*) file); if ($group$ == NULL) return TREXIO_FAILURE; - if ($group$->$group_num$ > 0L){ + if ($group$->$group_num$_isSet == true){ return TREXIO_SUCCESS; } else { return TREXIO_HAS_NOT; diff --git a/tests/io_num_hdf5.c b/tests/io_num_hdf5.c index 2983a45..8cac31c 100644 --- a/tests/io_num_hdf5.c +++ b/tests/io_num_hdf5.c @@ -27,6 +27,14 @@ static int test_write_num (const char* file_name, const back_end_t backend) { rc = trexio_write_nucleus_num(file, num); assert (rc == TREXIO_SUCCESS); + // attempt to write 0 as dimensioning variable in an empty file; should FAIL and return TREXIO_INVALID_NUM + rc = trexio_write_mo_num(file, 0); + assert (rc == TREXIO_INVALID_NUM); + + // write numerical attribute ao_cartesian as 0 + rc = trexio_write_ao_cartesian(file, 0); + assert (rc == TREXIO_SUCCESS); + // close current session rc = trexio_close(file); assert (rc == TREXIO_SUCCESS); @@ -77,6 +85,7 @@ static int test_read_num (const char* 
file_name, const back_end_t backend) { // parameters to be read int num; + int cartesian; /*================= START OF TEST ==================*/ @@ -89,6 +98,15 @@ static int test_read_num (const char* file_name, const back_end_t backend) { assert (rc == TREXIO_SUCCESS); assert (num == 12); + // read non-existing numerical attribute from the file + rc = trexio_read_mo_num(file, &num); + assert (rc == TREXIO_ATTR_MISSING); + + // read ao_cartesian (zero) value from the file + rc = trexio_read_ao_cartesian(file, &cartesian); + assert (rc == TREXIO_SUCCESS); + assert (cartesian == 0); + // close current session rc = trexio_close(file); assert (rc == TREXIO_SUCCESS); diff --git a/tests/io_num_text.c b/tests/io_num_text.c index f8907fb..9f779cd 100644 --- a/tests/io_num_text.c +++ b/tests/io_num_text.c @@ -27,6 +27,14 @@ static int test_write_num (const char* file_name, const back_end_t backend) { rc = trexio_write_nucleus_num(file, num); assert (rc == TREXIO_SUCCESS); + // attempt to write 0 as dimensioning variable in an empty file; should FAIL and return TREXIO_INVALID_NUM + rc = trexio_write_mo_num(file, 0); + assert (rc == TREXIO_INVALID_NUM); + + // write numerical attribute ao_cartesian as 0 + rc = trexio_write_ao_cartesian(file, 0); + assert (rc == TREXIO_SUCCESS); + // close current session rc = trexio_close(file); assert (rc == TREXIO_SUCCESS); @@ -77,6 +85,7 @@ static int test_read_num (const char* file_name, const back_end_t backend) { // parameters to be read int num; + int cartesian; /*================= START OF TEST ==================*/ @@ -89,6 +98,15 @@ static int test_read_num (const char* file_name, const back_end_t backend) { assert (rc == TREXIO_SUCCESS); assert (num == 12); + // read non-existing numerical attribute from the file + rc = trexio_read_mo_num(file, &num); + assert (rc == TREXIO_ATTR_MISSING); + + // read ao_cartesian (zero) value from the file + rc = trexio_read_ao_cartesian(file, &cartesian); + assert (rc == TREXIO_SUCCESS); + 
assert (cartesian == 0); + // close current session rc = trexio_close(file); assert (rc == TREXIO_SUCCESS); diff --git a/tools/generator.py b/tools/generator.py index 03b3c96..30e06ad 100644 --- a/tools/generator.py +++ b/tools/generator.py @@ -41,7 +41,7 @@ for fname in files_todo['auxiliary']: iterative_populate_file(fname, template_paths, group_dict, detailed_dsets, detailed_nums, detailed_strs) # populate has/read/write_num functions with recursive scheme -for fname in files_todo['num']: +for fname in files_todo['attr_num']: recursive_populate_file(fname, template_paths, detailed_nums) # populate has/read/write_str functions with recursive scheme diff --git a/tools/generator_tools.py b/tools/generator_tools.py index 18c5395..d7cb9a0 100644 --- a/tools/generator_tools.py +++ b/tools/generator_tools.py @@ -39,7 +39,7 @@ def get_files_todo(source_files: dict) -> dict: files_todo = {} #files_todo['all'] = list(filter(lambda x: 'read' in x or 'write' in x or 'has' in x or 'hrw' in x or 'flush' in x or 'free' in x, all_files)) files_todo['all'] = [f for f in all_files if 'read' in f or 'write' in f or 'has' in f or 'flush' in f or 'free' in f or 'hrw' in f] - for key in ['dset_data', 'dset_str', 'num', 'attr_str', 'group']: + for key in ['dset_data', 'dset_str', 'attr_num', 'attr_str', 'group']: files_todo[key] = list(filter(lambda x: key in x, files_todo['all'])) files_todo['group'].append('struct_text_group_dset.h') @@ -100,10 +100,13 @@ def recursive_populate_file(fname: str, paths: dict, detailed_source: dict) -> N fname_new = join('populated',f'pop_{fname}') templ_path = get_template_path(fname, paths) - triggers = ['group_dset_dtype', 'group_dset_py_dtype', 'group_dset_h5_dtype', 'default_prec', 'is_index', - 'group_dset_f_dtype_default', 'group_dset_f_dtype_double', 'group_dset_f_dtype_single', - 'group_dset_dtype_default', 'group_dset_dtype_double', 'group_dset_dtype_single', + triggers = ['group_dset_dtype', 'group_dset_py_dtype', 'group_dset_h5_dtype', 
'default_prec', 'is_index', + 'group_dset_f_dtype_default', 'group_dset_f_dtype_double', 'group_dset_f_dtype_single', + 'group_dset_dtype_default', 'group_dset_dtype_double', 'group_dset_dtype_single', 'group_dset_rank', 'group_dset_dim_list', 'group_dset_f_dims', + 'group_num_f_dtype_default', 'group_num_f_dtype_double', 'group_num_f_dtype_single', + 'group_num_dtype_default', 'group_num_dtype_double', 'group_num_dtype_single', + 'group_num_h5_dtype', 'group_num_py_dtype', 'group_dset', 'group_num', 'group_str', 'group'] for item in detailed_source.keys(): @@ -126,6 +129,12 @@ def recursive_populate_file(fname: str, paths: dict, detailed_source: dict) -> N f_out.write(templine) num_written = [] continue + # special case to uncomment check for positive dimensioning variables in templates + elif 'uncommented by the generator for dimensioning' in line: + # only uncomment and write the line if the attribute has the `dim` type in trex.json + if 'dim' in detailed_source[item]['trex_json_int_type']: + templine = line.replace('//', '') + f_out.write(templine) # general case of recursive replacement of inline triggers else: populated_line = recursive_replace_line(line, triggers, detailed_source[item]) @@ -284,6 +293,7 @@ def special_populate_text_group(fname: str, paths: dict, group_dict: dict, detai templ_path = get_template_path(fname, paths) triggers = ['group_dset_dtype', 'group_dset_std_dtype_out', 'group_dset_std_dtype_in', + 'group_num_dtype_double', 'group_num_std_dtype_out', 'group_num_std_dtype_in', 'group_dset', 'group_num', 'group_str', 'group'] for group in group_dict.keys(): @@ -455,7 +465,7 @@ def get_detailed_num_dict (configuration: dict) -> dict: configuration (dict) : configuration from `trex.json` Returns: - num_dict (dict) : dictionary of num-suffixed variables + num_dict (dict) : dictionary of all numerical attributes (of types int, float, dim) """ num_dict = {} for k1,v1 in configuration.items(): @@ -468,6 +478,35 @@ def get_detailed_num_dict (configuration: dict) -> dict: 
tmp_dict['group_num'] = tmp_num num_dict[tmp_num] = tmp_dict + # TODO the arguments below are almost the same as for group_dset (except for trex_json_int_type) and can be exported from somewhere + if v2[0] == 'float': + tmp_dict['datatype'] = 'double' + tmp_dict['group_num_h5_dtype'] = 'native_double' + tmp_dict['group_num_f_dtype_default']= 'real(8)' + tmp_dict['group_num_f_dtype_double'] = 'real(8)' + tmp_dict['group_num_f_dtype_single'] = 'real(4)' + tmp_dict['group_num_dtype_default']= 'double' + tmp_dict['group_num_dtype_double'] = 'double' + tmp_dict['group_num_dtype_single'] = 'float' + tmp_dict['default_prec'] = '64' + tmp_dict['group_num_std_dtype_out'] = '24.16e' + tmp_dict['group_num_std_dtype_in'] = 'lf' + tmp_dict['group_num_py_dtype'] = 'float' + elif v2[0] in ['int', 'dim']: + tmp_dict['datatype'] = 'int64_t' + tmp_dict['group_num_h5_dtype'] = 'native_int64' + tmp_dict['group_num_f_dtype_default']= 'integer(4)' + tmp_dict['group_num_f_dtype_double'] = 'integer(8)' + tmp_dict['group_num_f_dtype_single'] = 'integer(4)' + tmp_dict['group_num_dtype_default']= 'int32_t' + tmp_dict['group_num_dtype_double'] = 'int64_t' + tmp_dict['group_num_dtype_single'] = 'int32_t' + tmp_dict['default_prec'] = '32' + tmp_dict['group_num_std_dtype_out'] = '" PRId64 "' + tmp_dict['group_num_std_dtype_in'] = '" SCNd64 "' + tmp_dict['group_num_py_dtype'] = 'int' + tmp_dict['trex_json_int_type'] = v2[0] + return num_dict @@ -634,7 +673,7 @@ def check_dim_consistency(num: dict, dset: dict) -> None: Consistency check to make sure that each dimensioning variable exists as a num attribute of some group. 
Parameters: - num (dict) : dictionary of num-suffixed variables + num (dict) : dictionary of numerical attributes dset (dict) : dictionary of datasets Returns: @@ -647,6 +686,8 @@ def check_dim_consistency(num: dict, dset: dict) -> None: if dim not in dim_tocheck: dim_tocheck.append(dim) + num_onlyDim = [attr_name for attr_name, specs in num.items() if specs['trex_json_int_type']=='dim'] + for dim in dim_tocheck: - if not dim in num.keys(): + if not dim in num_onlyDim: raise ValueError(f"Dimensioning variable {dim} is not a num attribute of any group.\n") diff --git a/trex.org b/trex.org index b7c5723..908a738 100644 --- a/trex.org +++ b/trex.org @@ -16,6 +16,14 @@ column-major order (as in Fortran), and the ordering of the dimensions is reversed in the produced ~trex.json~ configuration file as the library is written in C. +TREXIO currently supports ~int~, ~float~ and ~str~ types for both single attributes and arrays. +Note that some attributes might have ~dim~ type (e.g. ~num~ of the ~nucleus~ group). +This type is treated exactly the same as ~int~ with the only difference that ~dim~ variables +cannot be negative or zero. This additional constraint is required because ~dim~ attributes +are used internally to allocate memory and to check array boundaries in the memory-safe API. +Most of the time, ~dim~ variables carry the ~num~ suffix. + + In Fortran, the arrays are 1-based and in most other languages the arrays are 0-based. Hence, we introduce the ~index~ type which is an 1-based ~int~ in the Fortran interface and 0-based otherwise. @@ -35,9 +43,9 @@ arrays are 0-based. 
Hence, we introduce the ~index~ type which is an #+NAME: metadata | Variable | Type | Dimensions (for arrays) | Description | |-------------------+-------+-------------------------+------------------------------------------| - | ~code_num~ | ~int~ | | Number of codes used to produce the file | + | ~code_num~ | ~dim~ | | Number of codes used to produce the file | | ~code~ | ~str~ | ~(metadata.code_num)~ | Names of the codes used | - | ~author_num~ | ~int~ | | Number of authors of the file | + | ~author_num~ | ~dim~ | | Number of authors of the file | | ~author~ | ~str~ | ~(metadata.author_num)~ | Names of the authors of the file | | ~package_version~ | ~str~ | | TREXIO version used to produce the file | | ~description~ | ~str~ | | Text describing the content of file | @@ -47,9 +55,9 @@ arrays are 0-based. Hence, we introduce the ~index~ type which is an :RESULTS: #+begin_src python :tangle trex.json "metadata": { - "code_num" : [ "int", [] ] + "code_num" : [ "dim", [] ] , "code" : [ "str", [ "metadata.code_num" ] ] - , "author_num" : [ "int", [] ] + , "author_num" : [ "dim", [] ] , "author" : [ "str", [ "metadata.author_num" ] ] , "package_version" : [ "str", [] ] , "description" : [ "str", [] ] @@ -65,19 +73,19 @@ arrays are 0-based. 
Hence, we introduce the ~index~ type which is an #+NAME:electron | Variable | Type | Dimensions | Description | |----------+-------+------------+-------------------------------------| - | ~up_num~ | ~int~ | | Number of \uparrow-spin electrons | - | ~dn_num~ | ~int~ | | Number of \downarrow-spin electrons | + | ~up_num~ | ~dim~ | | Number of \uparrow-spin electrons | + | ~dn_num~ | ~dim~ | | Number of \downarrow-spin electrons | #+CALL: json(data=electron, title="electron") #+RESULTS: - :results: + :RESULTS: #+begin_src python :tangle trex.json "electron": { - "up_num" : [ "int", [] ] - , "dn_num" : [ "int", [] ] + "up_num" : [ "dim", [] ] + , "dn_num" : [ "dim", [] ] } , #+end_src - :end: + :END: * Nucleus (nucleus group) @@ -87,7 +95,7 @@ arrays are 0-based. Hence, we introduce the ~index~ type which is an #+NAME: nucleus | Variable | Type | Dimensions | Description | |---------------+---------+-------------------+--------------------------| - | ~num~ | ~int~ | | Number of nuclei | + | ~num~ | ~dim~ | | Number of nuclei | | ~charge~ | ~float~ | ~(nucleus.num)~ | Charges of the nuclei | | ~coord~ | ~float~ | ~(3,nucleus.num)~ | Coordinates of the atoms | | ~label~ | ~str~ | ~(nucleus.num)~ | Atom labels | @@ -95,17 +103,17 @@ arrays are 0-based. Hence, we introduce the ~index~ type which is an #+CALL: json(data=nucleus, title="nucleus") #+RESULTS: - :results: + :RESULTS: #+begin_src python :tangle trex.json "nucleus": { - "num" : [ "int" , [] ] - , "charge" : [ "float", [ "nucleus.num" ] ] - , "coord" : [ "float", [ "nucleus.num", "3" ] ] - , "label" : [ "str" , [ "nucleus.num" ] ] - , "point_group" : [ "str" , [] ] + "num" : [ "dim" , [] ] + , "charge" : [ "float", [ "nucleus.num" ] ] + , "coord" : [ "float", [ "nucleus.num", "3" ] ] + , "label" : [ "str" , [ "nucleus.num" ] ] + , "point_group" : [ "str" , [] ] } , #+end_src - :end: + :END: * Effective core potentials (ecp group) @@ -135,12 +143,12 @@ arrays are 0-based. 
Hence, we introduce the ~index~ type which is an | ~lmax_plus_1~ | ~int~ | ~(nucleus.num)~ | $\ell_{\max} + 1$, one higher than the maximum angular momentum in the removed core orbitals | | ~z_core~ | ~float~ | ~(nucleus.num)~ | Charges to remove | | ~local_n~ | ~int~ | ~(nucleus.num)~ | Number of local functions $N_{q \ell}$ | - | ~local_num_n_max~ | ~int~ | | Maximum value of ~local_n~, used for dimensioning arrays | - | ~local_exponent~ | ~float~ | ~(ecp.local_num_n_max, nucleus.num)~ | $\alpha_{A q \ell_{\max}}$ | - | ~local_coef~ | ~float~ | ~(ecp.local_num_n_max, nucleus.num)~ | $\beta_{A q \ell_{\max}}$ | + | ~local_num_n_max~ | ~dim~ | | Maximum value of ~local_n~, used for dimensioning arrays | + | ~local_exponent~ | ~float~ | ~(ecp.local_num_n_max, nucleus.num)~ | $\alpha_{A q \ell_{\max}}$ | + | ~local_coef~ | ~float~ | ~(ecp.local_num_n_max, nucleus.num)~ | $\beta_{A q \ell_{\max}}$ | | ~local_power~ | ~int~ | ~(ecp.local_num_n_max, nucleus.num)~ | $n_{A q \ell_{\max}}$ | | ~non_local_n~ | ~int~ | ~(nucleus.num)~ | $N_{q \ell_{\max}}$ | - | ~non_local_num_n_max~ | ~int~ | | Maximum value of ~non_local_n~, used for dimensioning arrays | + | ~non_local_num_n_max~ | ~dim~ | | Maximum value of ~non_local_n~, used for dimensioning arrays | | ~non_local_exponent~ | ~float~ | ~(ecp.non_local_num_n_max, nucleus.num)~ | $\alpha_{A q \ell}$ | | ~non_local_coef~ | ~float~ | ~(ecp.non_local_num_n_max, nucleus.num)~ | $\beta_{A q \ell}$ | | ~non_local_power~ | ~int~ | ~(ecp.non_local_num_n_max, nucleus.num)~ | $n_{A q \ell}$ | @@ -148,24 +156,24 @@ arrays are 0-based. 
Hence, we introduce the ~index~ type which is an #+CALL: json(data=ecp, title="ecp") #+RESULTS: - :results: + :RESULTS: #+begin_src python :tangle trex.json "ecp": { - "lmax_plus_1" : [ "int" , [ "nucleus.num" ] ] - , "z_core" : [ "float", [ "nucleus.num" ] ] - , "local_n" : [ "int" , [ "nucleus.num" ] ] - , "local_num_n_max" : [ "int" , [] ] - , "local_exponent" : [ "float", [ "nucleus.num", "ecp.local_num_n_max" ] ] - , "local_coef" : [ "float", [ "nucleus.num", "ecp.local_num_n_max" ] ] - , "local_power" : [ "int" , [ "nucleus.num", "ecp.local_num_n_max" ] ] - , "non_local_n" : [ "int" , [ "nucleus.num" ] ] - , "non_local_num_n_max" : [ "int" , [] ] - , "non_local_exponent" : [ "float", [ "nucleus.num", "ecp.non_local_num_n_max" ] ] - , "non_local_coef" : [ "float", [ "nucleus.num", "ecp.non_local_num_n_max" ] ] - , "non_local_power" : [ "int" , [ "nucleus.num", "ecp.non_local_num_n_max" ] ] + "lmax_plus_1" : [ "int" , [ "nucleus.num" ] ] + , "z_core" : [ "float", [ "nucleus.num" ] ] + , "local_n" : [ "int" , [ "nucleus.num" ] ] + , "local_num_n_max" : [ "dim" , [] ] + , "local_exponent" : [ "float", [ "nucleus.num", "ecp.local_num_n_max" ] ] + , "local_coef" : [ "float", [ "nucleus.num", "ecp.local_num_n_max" ] ] + , "local_power" : [ "int" , [ "nucleus.num", "ecp.local_num_n_max" ] ] + , "non_local_n" : [ "int" , [ "nucleus.num" ] ] + , "non_local_num_n_max" : [ "dim" , [] ] + , "non_local_exponent" : [ "float", [ "nucleus.num", "ecp.non_local_num_n_max" ] ] + , "non_local_coef" : [ "float", [ "nucleus.num", "ecp.non_local_num_n_max" ] ] + , "non_local_power" : [ "int" , [ "nucleus.num", "ecp.non_local_num_n_max" ] ] } , #+end_src - :end: + :END: * Basis set (basis group) @@ -210,8 +218,8 @@ arrays are 0-based. 
Hence, we introduce the ~index~ type which is an | Variable | Type | Dimensions | Description | |---------------------+---------+--------------------+----------------------------------------------------------| | ~type~ | ~str~ | | Type of basis set: "Gaussian" or "Slater" | - | ~num~ | ~int~ | | Total Number of shells | - | ~prim_num~ | ~int~ | | Total number of primitives | + | ~num~ | ~dim~ | | Total Number of shells | + | ~prim_num~ | ~dim~ | | Total number of primitives | | ~nucleus_index~ | ~index~ | ~(nucleus.num)~ | Index of the first shell of each nucleus ($A$) | | ~nucleus_shell_num~ | ~int~ | ~(nucleus.num)~ | Number of shells for each nucleus | | ~shell_ang_mom~ | ~int~ | ~(basis.num)~ | Angular momentum ~0:S, 1:P, 2:D, ...~ | @@ -225,24 +233,24 @@ arrays are 0-based. Hence, we introduce the ~index~ type which is an #+CALL: json(data=basis, title="basis") #+RESULTS: - :results: + :RESULTS: #+begin_src python :tangle trex.json "basis": { - "type" : [ "str" , [] ] - , "num" : [ "int" , [] ] - , "prim_num" : [ "int" , [] ] - , "nucleus_index" : [ "index", [ "nucleus.num" ] ] - , "nucleus_shell_num" : [ "int" , [ "nucleus.num" ] ] - , "shell_ang_mom" : [ "int" , [ "basis.num" ] ] - , "shell_prim_num" : [ "int" , [ "basis.num" ] ] - , "shell_factor" : [ "float", [ "basis.num" ] ] - , "shell_prim_index" : [ "index", [ "basis.num" ] ] - , "exponent" : [ "float", [ "basis.prim_num" ] ] - , "coefficient" : [ "float", [ "basis.prim_num" ] ] - , "prim_factor" : [ "float", [ "basis.prim_num" ] ] + "type" : [ "str" , [] ] + , "num" : [ "dim" , [] ] + , "prim_num" : [ "dim" , [] ] + , "nucleus_index" : [ "index", [ "nucleus.num" ] ] + , "nucleus_shell_num" : [ "int" , [ "nucleus.num" ] ] + , "shell_ang_mom" : [ "int" , [ "basis.num" ] ] + , "shell_prim_num" : [ "int" , [ "basis.num" ] ] + , "shell_factor" : [ "float", [ "basis.num" ] ] + , "shell_prim_index" : [ "index", [ "basis.num" ] ] + , "exponent" : [ "float", [ "basis.prim_num" ] ] + , "coefficient" : [ 
"float", [ "basis.prim_num" ] ] + , "prim_factor" : [ "float", [ "basis.prim_num" ] ] } , #+end_src - :end: + :END: For example, consider H_2 with the following basis set (in GAMESS format), where both the AOs and primitives are considered normalized: @@ -348,23 +356,23 @@ prim_factor = | Variable | Type | Dimensions | Description | |-----------------+---------+------------+---------------------------------| | ~cartesian~ | ~int~ | | ~1~: true, ~0~: false | - | ~num~ | ~int~ | | Total number of atomic orbitals | + | ~num~ | ~dim~ | | Total number of atomic orbitals | | ~shell~ | ~index~ | ~(ao.num)~ | basis set shell for each AO | | ~normalization~ | ~float~ | ~(ao.num)~ | Normalization factors | #+CALL: json(data=ao, title="ao") #+RESULTS: - :results: + :RESULTS: #+begin_src python :tangle trex.json "ao": { - "cartesian" : [ "int" , [] ] - , "num" : [ "int" , [] ] - , "shell" : [ "index", [ "ao.num" ] ] - , "normalization" : [ "float", [ "ao.num" ] ] + "cartesian" : [ "int" , [] ] + , "num" : [ "dim" , [] ] + , "shell" : [ "index", [ "ao.num" ] ] + , "normalization" : [ "float", [ "ao.num" ] ] } , #+end_src - :end: + :END: ** One-electron integrals (~ao_1e_int~ group) :PROPERTIES: @@ -453,7 +461,7 @@ prim_factor = | Variable | Type | Dimensions | Description | |---------------+---------+--------------------+--------------------------------------------------------------------------| | ~type~ | ~str~ | | Free text to identify the set of MOs (HF, Natural, Local, CASSCF, /etc/) | - | ~num~ | ~int~ | | Number of MOs | + | ~num~ | ~dim~ | | Number of MOs | | ~coefficient~ | ~float~ | ~(ao.num, mo.num)~ | MO coefficients | | ~class~ | ~str~ | ~(mo.num)~ | Choose among: Core, Inactive, Active, Virtual, Deleted | | ~symmetry~ | ~str~ | ~(mo.num)~ | Symmetry in the point group | @@ -462,18 +470,18 @@ prim_factor = #+CALL: json(data=mo, title="mo") #+RESULTS: - :results: + :RESULTS: #+begin_src python :tangle trex.json "mo": { - "type" : [ "str" , [] ] - , "num" : [ "int" , 
[] ] - , "coefficient" : [ "float", [ "mo.num", "ao.num" ] ] - , "class" : [ "str" , [ "mo.num" ] ] - , "symmetry" : [ "str" , [ "mo.num" ] ] - , "occupation" : [ "float", [ "mo.num" ] ] + "type" : [ "str" , [] ] + , "num" : [ "dim" , [] ] + , "coefficient" : [ "float", [ "mo.num", "ao.num" ] ] + , "class" : [ "str" , [ "mo.num" ] ] + , "symmetry" : [ "str" , [ "mo.num" ] ] + , "occupation" : [ "float", [ "mo.num" ] ] } , #+end_src - :end: + :END: ** One-electron integrals (~mo_1e_int~ group)