While the H5MD specification is agnostic towards the software used to write and read H5MD files, this document provides implementation hints for common HDF5 libraries.
The step
and time
datasets are in monotonically increasing
order, which permits the use of a binary
search to
efficiently determine the index of an element of the value
dataset
corresponding to a given step or time.
For storage efficiency, small-sized time-independent data should be stored using the compact dataset layout (see §5.4.5). The raw data of a compact dataset, which may not exceed 64 kb, is stored in the header block of the dataset object.
species
data¶The species
data group in the particles
group of a H5MD file may
use the enum datatype (see
§6.5.3)
that is compatible with the integer datatype. This allows to map
numerical species identifiers to a string while keeping the performance
of integer data.
An enumerated-type dataset can be read using, e.g., H5T_NATIVE_INT
as the memory datatype. For writing, however, the memory datatype needs
to be an enumerated type. A program extending an existing
enumerated-type dataset needs to be aware of this.
This section describes the usage of the HDF5 C API.
#include <hdf5.h>
The H5MD specification recommends HDF5 file format version 2. This format is supported by HDF5 library version 1.8 or later, but needs to be enabled explicitly when creating or opening an HDF5 file for writing.
This is achieved using the function H5Pset_libver_bounds.
/* Create HDF5 file with HDF5 file format version 2. */
hid_t fapl_id = H5Pcreate(H5P_FILE_ACCESS);
H5Pset_libver_bounds(fapl_id, H5F_LIBVER_18, H5F_LIBVER_18);
hid_t file_id = H5Fcreate("name.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl_id);
Compact datasets are created by setting the layout property of the dataset creation property list.
hid_t dcpl_id = H5Pcreate(H5P_DATASET_CREATE);
H5Pset_layout(dcpl_id, H5D_COMPACT);
hid_t dset_id = H5Dcreate(loc_id, "name", type_id, space_id, H5P_DEFAULT, dcpl_id, H5P_DEFAULT);
For HDF5 file format version 2 or later, the HDF5 library automatically tracks access, modification, change, and birth time of an object, which applies to the file, and its groups and datasets.
The function H5Oget_info may be used to query object times in seconds since the epoch.
H5O_info_t info;
H5Oget_info(file_id, &info);
printf("atime %ld\n", info.atime);
printf("mtime %ld\n", info.mtime);
printf("ctime %ld\n", info.ctime);
printf("btime %ld\n", info.btime);
This section describes the usage of HDF5 for Python.
import h5py
The class h5py.File
takes a “libver”
argument
to set the file format.
f = h5py.File("name.h5", libver="18")
Note that h5py up to version 2.1.3 lacks a binding for the
H5F_LIBVER_18
constant.
The function numpy.searchsorted may be used to perform a binary search:
""" Returns value of H5MD element at given step. """
def read_step(group, step):
index = np.searchsorted(group["step"], step)
assert(group["step"][index] == step)
return group["value"][index]
The low-level function h5py.h5o.get_info retrieves object times.
info = h5py.h5o.get_info(f.id)
print(info.atime)
print(info.mtime)
print(info.ctime)
print(info.btime)
Note that h5py up to version 2.1.3 lacks bindings for the above
mentioned H5O_info_t
members.