The HDF5 Format
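Hierarchical Data Format version 5 (HDF5) files are organized in a hierarchical structure. The two primary structures are:
- Groups: containers that hold datasets and other groups.
- Datasets: multidimensional arrays of data elements.
HDF5 attributes are small named datasets that are attached to primary datasets, groups, or named datatypes.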
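The hierarchy described above can be pictured with a small, purely conceptual model. The sketch below uses only the Python standard library and does not read or write HDF5 files. The class names (Group, Dataset), the lookup and walk helpers, and the sample names (run1, temperature, units) are all hypothetical, chosen for illustration. Named datatypes are left out to keep it short.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple, Union


@dataclass
class Dataset:
    """A multidimensional array of values plus its attached attributes."""
    shape: Tuple[int, ...]
    values: List[float]  # stored flattened, in row-major order
    attrs: Dict[str, object] = field(default_factory=dict)


@dataclass
class Group:
    """A container that holds datasets and other groups by name."""
    members: Dict[str, Union["Group", Dataset]] = field(default_factory=dict)
    attrs: Dict[str, object] = field(default_factory=dict)

    def lookup(self, path: str) -> Union["Group", Dataset]:
        """Resolve a slash-separated path such as '/run1/temperature'."""
        node: Union[Group, Dataset] = self
        for name in path.strip("/").split("/"):
            if not name:
                continue  # an empty path refers to this group itself
            if not isinstance(node, Group):
                raise KeyError(f"{name!r} lies below a dataset, not a group")
            node = node.members[name]
        return node

    def walk(self, prefix: str = "") -> Iterator[str]:
        """Yield the path of every object below this group, depth first."""
        for name, member in self.members.items():
            path = f"{prefix}/{name}"
            yield path
            if isinstance(member, Group):
                yield from member.walk(path)


# Build a tiny hierarchy: root group -> group "run1" -> dataset "temperature".
root = Group(attrs={"title": "example file"})
run1 = Group()
root.members["run1"] = run1
run1.members["temperature"] = Dataset(
    shape=(2, 3),
    values=[20.1, 20.4, 20.9, 21.0, 21.3, 21.7],
    attrs={"units": "degC"},  # an attribute attached to the dataset
)

print(list(root.walk()))                                # ['/run1', '/run1/temperature']
print(root.lookup("/run1/temperature").attrs["units"])  # degC
```

In this model, objects are addressed by slash-separated paths, much like files in a directory tree, and attributes can be attached to groups as well as to datasets.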
HDF4 versus HDF5
HDF5 was designed to address some of the limitations of the HDF4 format, in addition to providing new functionality.
The limitations of the HDF4 format include:
- A file cannot store more than 20,000 complex objects and cannot be larger than 2 gigabytes.
- The data models are inconsistent, there are too many object types, and datatypes are too restrictive.
- The C library source is old and complex, does not support parallel I/O effectively, and is not thread-safe.
HDF5 includes the following improvements:
- Files can be larger and can contain more objects.
- A more comprehensive data model with two basic structures: multidimensional datasets and groups.
- Simpler, better-engineered library and API, with support for parallel I/O and threads.
Note
The HDF5 format is not compatible with HDF4, although a conversion routine (h4toh5) is available from NCSA (http://hdf.ncsa.uiuc.edu/h4toh5/).