Home
HDF5 is a low-level I/O library.
Other libraries that are built on top of HDF5 or fulfil the same role:
Silo:
- Built on top of HDF5.
Exodus:
- Unstructured grids.
- Built on top of NetCDF.
eXtensible Data Model and Format (XDMF).
XIOS (XML IO Server).
- Heavily used in the climate community.
Two main libraries:
- hdf5
- NetCDF
An HDF5 file is organised into groups; a group is a structure containing zero or more groups or datasets.
An HDF5 dataset is a multidimensional array of data elements, defined by its name, datatype (atomic or composite), dataspace (rank, current sizes, maximum sizes, etc.) and storage layout (contiguous, compact or chunked).
The API comes in two levels, high and low. The high-level APIs are shortcuts:
- Dimension Scales (H5DS): describes the dimensions of a dataset, e.g. their coordinates and units.
- Lite (H5LT): simple write of a dataset in one line.
- Image (H5IM): writes an image in one call.
- Table (H5TB): represents data as a database table (not well implemented in HDF5 by default).
- Finally, Packet Table (H5PT).
The low-level API, which is covered now:
- H5F file
- H5G Group
- H5S dataspaces
- H5D dataset
From the point of view of HDF5, the syntax is the only difference between programming languages.
- hid_t: handle (identifier) for any HDF5 object.
- hsize_t: unsigned integer type used for dimension sizes and counts.
- herr_t: return type used to report errors (negative on failure).
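file = H5Fcreate("example.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
- "example.h5": file name.
- H5F_ACC_TRUNC: file creation mode; an existing file with the same name is truncated.
- H5P_DEFAULT (twice): default file creation property list, then default file access property list.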
dimsf[0] = NY;
dimsf[1] = NX;
dataspace = H5Screate_simple(RANK, dimsf, NULL);
- RANK: number of dimensions of the dataset.
- dimsf: size of each dimension.
- NULL: maximum sizes; NULL means the maximum size is the same as the current size.
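dataset = H5Dcreate(file, "IntArray", H5T_NATIVE_INT, dataspace, H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
- file: identifier of the HDF5 file (or group) in which the dataset is created.
- "IntArray": dataset name.
- H5T_NATIVE_INT: type of the data the dataset will contain.
- dataspace: shape of the dataset.
- H5P_DEFAULT: default options (link creation, dataset creation and dataset access property lists).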
Predefined datatypes come in two families:
- STANDARD: e.g. H5T_IEEE_F32BE means an IEEE 32-bit float with big-endian representation.
- NATIVE: aliases to the STANDARD types that depend on the platform of compilation. You do not need to know which STANDARD types are the correct ones to use, and the code stays portable.
Datatypes are also classed as:
- ATOMIC: the finest grain of manipulation possible with HDF5.
- COMPOSITE: an aggregation of one or more datatypes.
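status = H5Dwrite(dataset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT, data);
- dataset: object representing the dataset to write.
- H5T_NATIVE_INT: type of the data in the memory buffer.
- H5S_ALL: I want to write everything (given once for the memory dataspace and once for the file dataspace).
- data: buffer containing the data to write.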
HDF5 will do the translation if you use a different type from that of the data set.
H5Sclose(dataspace);
H5Dclose(dataset);
H5Fclose(file); // This should flush the data. Can also use H5Fflush().
Closing everything is good practice to avoid memory leaks. If files are opened at the start of a simulation and then not closed, the file may be corrupted if the simulation crashes.
A hyperslab selection is described by four parameters:
- start: starting location of the hyperslab.
- stride: the number of elements separating the start of one block from the start of the next.
- count: how many blocks to select in each dimension.
- block: size of each block.
Each of these is an array whose length equals the rank of the dataset.
These concepts are also used for the parallel IO.
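To make the four hyperslab parameters concrete, here is a small plain-C sketch (no HDF5 calls) that lists the indices a one-dimensional selection would cover. The function `hyperslab_indices_1d` is a hypothetical helper written for this illustration, not part of the HDF5 API.

```c
#include <stddef.h>

/* Lists the 1D indices covered by a hyperslab with the given start, stride,
   count and block. At most out_cap indices are stored in out; the return
   value is the total number of selected elements (count * block). */
size_t hyperslab_indices_1d(size_t start, size_t stride, size_t count,
                            size_t block, size_t *out, size_t out_cap)
{
    size_t n = 0;
    for (size_t c = 0; c < count; c++) {       /* one iteration per block */
        for (size_t b = 0; b < block; b++) {   /* elements inside a block */
            if (n < out_cap)
                out[n] = start + c * stride + b;
            n++;
        }
    }
    return n;
}
```

For example, start = 1, stride = 4, count = 3 and block = 2 select the indices 1, 2, 5, 6, 9 and 10: three blocks of two elements, with block starts four elements apart. In several dimensions the same rule applies independently along each dimension.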
Different dataspaces:
- Null: contains nothing.
- Scalar
- Simple:
  - rank
  - current size
  - maximum size (can be unlimited).
- Selections: subset of a dataset.
For example, dataspaces can be used to alter the dimensionality of the data: 2D -> 1D by selection.
hid_t space_id;
herr_t status;
hsize_t dims[2], start[2], count[2];
hsize_t *stride = NULL, *block = NULL;
dims[0] = ny; dims[1] = nx;
start[0] = 2; start[1] = 1;
count[0] = 6; count[1] = 4;
space_id = H5Screate_simple(2, dims, NULL);
status = H5Sselect_hyperslab(space_id, H5S_SELECT_SET, start, stride, count, block);
The stride and block arrays are treated as 1 in every dimension if NULL is passed.
status = H5Sselect_hyperslab(space_id_mem, H5S_SELECT_SET, start_mem, stride_mem, count_mem, block_mem);
status = H5Sselect_hyperslab(space_id_disk, H5S_SELECT_SET, start_disk, stride_disk, count_disk, block_disk);
status = H5Dwrite(dataset, H5T_NATIVE_INT, space_id_mem, space_id_disk, H5P_DEFAULT, data);
HDF5 files are not human-readable, so there are some tools to help. The three main tools are h5ls, h5dump and h5diff.