Datatype, Dataspace and Dataset¶
The following classes are all defined within namespace HIPP::IO.
Class H5Datatype¶
-
class
H5Datatype¶ In HDF5 format, every dataset has a datatype which describes the type (i.e., the interpretation of each byte) of its elements.
HDF5 pre-defines several sets of datatypes (standard or platform-dependent). The predefined datatypes are listed below in Predefined Datatypes.
Users may define new datatypes, derived from predefined datatypes or other user-defined datatypes.
An H5Datatype instance can be copy-constructed, copy-assigned, move-constructed and move-assigned. The copy and move operations and the destructor are all noexcept. The copy operations are shallow copies, i.e., the source and target instances refer to the same underlying datatype object (like a shared_ptr).
-
enum
class_t¶
-
enumerator
class_t::COMPOUND_C¶ -
enumerator
class_t::OPAQUE_C¶ -
enumerator
class_t::ENUM_C¶ -
enumerator
class_t::STRING_C¶ Enumerators of datatype classes. They can be passed into the method create() (to create new user-defined datatypes) or returned by member_class() (to detect the classes of existing datatypes).
-
bool
equal(const H5Datatype &dtype) const¶ -
size_t
size() const¶ Datatype information query functions.
equal(dtype) tests whether the current instance is the same datatype as dtype.
size() returns the size in bytes of the current instance.
-
static H5Datatype
create(class_t c, size_t size)¶ -
template<typename record_t, typename field_t, typename ...Args>
static H5Datatype create_compound(const string &field_name, field_t record_t::* field_ptr, Args&&... args)¶ -
H5Datatype &
insert(const string &name, size_t offset, const H5Datatype &dtype)¶ -
template<typename record_t, typename field_t>
H5Datatype &insert(const string &name, field_t record_t::* field_ptr)¶ -
H5Datatype &
pack()¶ -
unsigned
nmembers() const¶ -
unsigned
member_index(const string &name) const¶ -
class_t
member_class(unsigned idx) const¶ -
size_t
member_offset(unsigned idx) const¶ -
H5Datatype
member_type(unsigned idx) const¶ -
string
member_name(unsigned idx) const¶ -
template<typename record_t, typename field_t>
static constexpr size_t offset(field_t record_t::* field_ptr) noexcept¶ Compound datatype creation and information-query methods.
create() creates a new datatype with datatype class c and size in bytes size. Valid datatype classes are defined in Datatype Classes.
create_compound(name1, ptr1, ...) directly creates a new compound datatype (usually corresponding to a C++ struct/class). For each member of the datatype, you pass two arguments to describe it: the name and a member-pointer to it. The member can be one of the numeric Predefined Datatypes or a raw array of one (e.g., int, float[3][4][5]). The library infers its (native) datatype, offset and size from the member-pointer.
insert() inserts a member into the compound datatype. There are two overloads:
insert(name, offset, dtype) inserts a member whose name is name, offset is offset and datatype is dtype (which may be a predefined or user-defined datatype).
insert(name, field_ptr) inserts a member whose name is name and whose datatype, offset and size (if it is a raw array) are inferred from the member-pointer field_ptr. This is valid for any of the numeric Predefined Datatypes or a raw array of one.
offset(field_ptr) returns the offset of the member referred to by field_ptr, suitable for passing to insert(name, offset, dtype).
pack() recursively removes the padding between members to make the datatype more memory efficient.
nmembers() returns the number of members in the current compound datatype instance.
member_index(name) returns the index of the member named name for a compound/enum datatype. member_name(idx) converts the index back to the name. The index can be any number in the range [0, N-1], where N is returned by nmembers().
member_class(idx), member_offset(idx) and member_type(idx) return the datatype class, offset and datatype of the member at index idx.
To create a compound datatype for, e.g., a structured C++ type T, call H5Datatype::create(H5Datatype::COMPOUND_C, sizeof(T)) to get a new datatype instance, and call insert() to add the information of each field of T.
For example, a dark matter halo in a cosmological simulation can be described by the following C++ type:
    // for storing the properties of a dark matter halo
    class DarkMatterHalo {
    public:
        long long id;
        double position[3];
        float tidal_tensor[3][3];
        double radius;
    };
To create a corresponding HDF5 datatype for I/O, you write:
    /* Create compound datatype for DarkMatterHalo. */
    auto dtype = H5Datatype::create(
        H5Datatype::COMPOUND_C, sizeof(DarkMatterHalo));
    dtype.insert("ID", H5Datatype::offset(&DarkMatterHalo::id), NATIVE_LLONG_T)
        .insert("Position", &DarkMatterHalo::position)
        .insert("Tidal Tensor", &DarkMatterHalo::tidal_tensor)
        .insert("Radius", &DarkMatterHalo::radius);
Note that you can insert each field by (name, offset, datatype) (like “ID” above) or simply by (name, field_ptr), where field_ptr is the pointer to that member (like “Position”, “Tidal Tensor”, “Radius” above).
If your C++ structure contains only numeric types (such as DarkMatterHalo here), it is easier to create the compound datatype directly using a single function call:
    /* Another way to create a compound datatype. */
    auto dtype = H5Datatype::create_compound(
        "ID", &DarkMatterHalo::id,
        "Position", &DarkMatterHalo::position,
        "Tidal Tensor", &DarkMatterHalo::tidal_tensor,
        "Radius", &DarkMatterHalo::radius);
Now you perform I/O using the new datatype:
    /* Write halo instances into a new file */
    vector<DarkMatterHalo> halos(10), halos_in(10);
    H5File file("halos.h5", "w");
    file.create_dataset("Halos", dtype, {10}).write(halos.data(), dtype);

    /* Load it back */
    file.open_dataset("Halos").read(halos_in.data(), dtype);
Using h5dump halos.h5, you see the output:
    HDF5 "halos.h5" {
    GROUP "/" {
       DATASET "Halos" {
          DATATYPE H5T_COMPOUND {
             H5T_STD_I64LE "ID";
             H5T_ARRAY { [3] H5T_IEEE_F64LE } "Position";
             H5T_ARRAY { [3][3] H5T_IEEE_F32LE } "Tidal Tensor";
             H5T_IEEE_F64LE "Radius";
          }
          DATASPACE SIMPLE { ( 10 ) / ( 10 ) }
          DATA {
          (0): { 0,[ 0, 0, 0 ], [ 0, 0, 0, 0, 0, 0, 0, 0, 0 ], 0},
          ....
          }
       }
    }
    }
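To make the roles of offset, size and pack() more concrete, the following standalone sketch builds the same kind of member table for DarkMatterHalo using only the standard offsetof and sizeof facilities. It does not use HIPP and is not the library's implementation: MemberInfo, halo_layout() and packed_size() are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Same fields as the DarkMatterHalo example in the text.
struct DarkMatterHalo {
    long long id;
    double position[3];
    float tidal_tensor[3][3];
    double radius;
};

// Hypothetical helper type (not part of HIPP): one row of a compound layout.
struct MemberInfo {
    std::string name;
    std::size_t offset;  // byte offset of the member inside the struct
    std::size_t size;    // size of the member in bytes
};

// Build the layout table with the standard offsetof/sizeof facilities.
inline std::vector<MemberInfo> halo_layout() {
    return {
        {"ID", offsetof(DarkMatterHalo, id),
         sizeof(DarkMatterHalo::id)},
        {"Position", offsetof(DarkMatterHalo, position),
         sizeof(DarkMatterHalo::position)},
        {"Tidal Tensor", offsetof(DarkMatterHalo, tidal_tensor),
         sizeof(DarkMatterHalo::tidal_tensor)},
        {"Radius", offsetof(DarkMatterHalo, radius),
         sizeof(DarkMatterHalo::radius)},
    };
}

// Size the record would occupy if all padding between members were removed.
inline std::size_t packed_size(const std::vector<MemberInfo> &members) {
    std::size_t total = 0;
    for (const auto &m : members) total += m.size;
    return total;
}
```

On platforms where the compiler pads the struct for alignment, packed_size(halo_layout()) is smaller than sizeof(DarkMatterHalo); that difference is the kind of padding a packed datatype no longer carries.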
-
H5Datatype
create_array(const vector<hsize_t> &dims) const¶ -
template<typename raw_array_t>
static H5Datatype create_array()¶ -
unsigned
array_ndims() const¶ -
vector<hsize_t>
array_dims() const¶ Array datatype creation and information-query functions.
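As a rough illustration of how the dimensions of a raw array type such as float[3][4][5] can be recovered at compile time, here is a standalone sketch. It is an assumption about one possible technique, not HIPP's actual implementation; RawArrayDims and raw_array_dims() are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Base case: a non-array type contributes no dimensions.
template <typename T>
struct RawArrayDims {
    static void append(std::vector<std::size_t> &) {}
};

// Recursive case: T[N] contributes N, then the dimensions of T.
template <typename T, std::size_t N>
struct RawArrayDims<T[N]> {
    static void append(std::vector<std::size_t> &dims) {
        dims.push_back(N);
        RawArrayDims<T>::append(dims);
    }
};

// Hypothetical helper: extents of a raw array type, outermost dimension first.
template <typename T>
std::vector<std::size_t> raw_array_dims() {
    std::vector<std::size_t> dims;
    RawArrayDims<T>::append(dims);
    return dims;
}
```

For example, raw_array_dims<float[3][4][5]>() yields the extents 3, 4 and 5, while a non-array type such as int yields an empty list.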
-
enum
Predefined datatypes¶
-
extern const H5Datatype
NATIVE_CHAR_T¶ -
extern const H5Datatype
NATIVE_SCHAR_T¶ -
extern const H5Datatype
NATIVE_SHORT_T¶ -
extern const H5Datatype
NATIVE_INT_T¶ -
extern const H5Datatype
NATIVE_LONG_T¶ -
extern const H5Datatype
NATIVE_LLONG_T¶ -
extern const H5Datatype
NATIVE_UCHAR_T¶ -
extern const H5Datatype
NATIVE_USHORT_T¶ -
extern const H5Datatype
NATIVE_UINT_T¶ -
extern const H5Datatype
NATIVE_ULONG_T¶ -
extern const H5Datatype
NATIVE_ULLONG_T¶ -
extern const H5Datatype
NATIVE_FLOAT_T¶ -
extern const H5Datatype
NATIVE_DOUBLE_T¶ The predefined datatypes that correspond to the native numeric types on this platform: char, signed char, short, int, long, long long, unsigned char, unsigned short, unsigned int, unsigned long, unsigned long long, float and double.
-
extern const H5Datatype
STD_I8LE_T¶ -
extern const H5Datatype
STD_I16LE_T¶ -
extern const H5Datatype
STD_I32LE_T¶ -
extern const H5Datatype
STD_I64LE_T¶ -
extern const H5Datatype
STD_U8LE_T¶ -
extern const H5Datatype
STD_U16LE_T¶ -
extern const H5Datatype
STD_U32LE_T¶ -
extern const H5Datatype
STD_U64LE_T¶ -
extern const H5Datatype
IEEE_F32LE_T¶ -
extern const H5Datatype
IEEE_F64LE_T¶ -
extern const H5Datatype
STD_I8BE_T¶ -
extern const H5Datatype
STD_I16BE_T¶ -
extern const H5Datatype
STD_I32BE_T¶ -
extern const H5Datatype
STD_I64BE_T¶ -
extern const H5Datatype
STD_U8BE_T¶ -
extern const H5Datatype
STD_U16BE_T¶ -
extern const H5Datatype
STD_U32BE_T¶ -
extern const H5Datatype
STD_U64BE_T¶ -
extern const H5Datatype
IEEE_F32BE_T¶ -
extern const H5Datatype
IEEE_F64BE_T¶ The predefined datatypes that correspond to the standard (i.e., machine-independent) numeric types. They are usually used as the “file-type” when constructing a dataset, and only when you need a specific storage type in the target file.
The “LE” version is the little-endian type, and the “BE” version is the big-endian type. “8”, “16”, “32”, “64” are the sizes of the datatypes in bits.
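The difference between the little-endian and big-endian variants can be sketched with a 32-bit integer. The helpers below are hypothetical and independent of HIPP; they only show the byte order that a 32-bit little-endian or big-endian storage type implies.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Bytes of a 32-bit unsigned integer in little-endian order
// (least significant byte first).
inline std::array<std::uint8_t, 4> to_le_bytes(std::uint32_t v) {
    std::array<std::uint8_t, 4> out{};
    for (int i = 0; i < 4; ++i)
        out[i] = static_cast<std::uint8_t>((v >> (8 * i)) & 0xFFu);
    return out;
}

// Bytes of a 32-bit unsigned integer in big-endian order
// (most significant byte first).
inline std::array<std::uint8_t, 4> to_be_bytes(std::uint32_t v) {
    std::array<std::uint8_t, 4> out{};
    for (int i = 0; i < 4; ++i)
        out[3 - i] = static_cast<std::uint8_t>((v >> (8 * i)) & 0xFFu);
    return out;
}
```

With the value 0x01020304, the little-endian layout starts with byte 0x04 and the big-endian layout starts with byte 0x01; both occupy 4 bytes, i.e., 32 bits.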
-
extern const H5Datatype
C_S1_T¶ The elementary type of a C string (i.e., const char *). A fixed-length C string can be obtained by copying this instance and calling resize to change it to the desired length.
Class H5Dataspace¶
-
class
H5Dataspace¶ In HDF5 format, every dataset has a dataspace which describes its shape. In the I/O process, a dataspace instance also defines which part of the data in memory or in the file is involved.
An H5Dataspace instance can be copy-constructed, copy-assigned, move-constructed and move-assigned. The copy and move operations and the destructor are all noexcept. The copy operations are shallow copies, i.e., the source and target instances refer to the same underlying dataspace object (like a shared_ptr).
-
enum
class_t¶
-
enumerator
class_t::NULL_C¶ -
enumerator
class_t::SIMPLE_C¶ -
enumerator
class_t::SCALAR_C¶ Enumerators of dataspace classes. They can be passed into the create() method to create new dataspaces.
-
static const H5Dataspace
allval¶ -
static const H5Dataspace
nullval¶ -
static const H5Dataspace
scalarval¶ Predefined dataspaces.
allval represents all data in a dataset or a memory buffer; its exact meaning depends on the context.
nullval represents an empty dataspace.
scalarval represents the dataspace for a single element, although its datatype may be complex.
-
H5Dataspace(const vector<hsize_t> &dims)¶ -
H5Dataspace(const vector<hsize_t> &dims, const vector<hsize_t> &maxdims)¶ Constructors - create simple dataspaces (i.e., regular arrays, dataspace class = SIMPLE_C).
dims specifies the shape (i.e., the number of elements in each dimension). If maxdims is also provided, the maximal number of elements in each dimension may be larger than the number currently used, which means you may extend the shape later.
-
static H5Dataspace
create(class_t type)¶ -
static H5Dataspace
create_null()¶ -
static H5Dataspace
create_scalar()¶ -
static H5Dataspace
create_simple()¶ Create a new dataspace instance.
type may be one of the Dataspace Classes. For convenience, we also provide three functions to create null, scalar and simple dataspaces, respectively.
-
int
ndims() const¶ -
vector<hsize_t>
dims() const¶ -
vector<hsize_t>
maxdims() const¶ -
hsize_t
size() const¶ Query the information of the current dataspace instance.
ndims() returns the number of dimensions (i.e., rank).
dims() returns the number of elements in each dimension.
maxdims() returns the maximal number of elements in each dimension.
size() returns the total number of elements (i.e., the product of all values returned by dims()).
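The relation between dims, maxdims and size can be written down in a few lines. The sketch below is standalone and does not call HIPP; extent_t is a stand-in for hsize_t, and total_elements() and fits_within() are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

using extent_t = std::uint64_t;  // stand-in for hsize_t (assumption)

// Total number of elements: the product of the extents in all dimensions.
inline extent_t total_elements(const std::vector<extent_t> &dims) {
    return std::accumulate(dims.begin(), dims.end(), extent_t{1},
                           std::multiplies<extent_t>());
}

// True if a shape has the same rank as maxdims and does not exceed it
// in any dimension.
inline bool fits_within(const std::vector<extent_t> &dims,
                        const std::vector<extent_t> &maxdims) {
    if (dims.size() != maxdims.size()) return false;
    for (std::size_t i = 0; i < dims.size(); ++i)
        if (dims[i] > maxdims[i]) return false;
    return true;
}
```

A shape of {2, 3, 4} therefore holds 24 elements, and a shape of {10} can later grow to at most {20} if that was given as maxdims.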
-
void
select_hyperslab(const vector<hsize_t> &start, const vector<hsize_t> &count)¶ -
void
select_hyperslab(const string &op, const hsize_t *start, const hsize_t *count, const hsize_t *stride = NULL, const hsize_t *block = NULL)¶ -
hssize_t
get_select_npoint() const¶ Select a hyperslab in the current dataspace.
op can be “set”, “or” (“|”), “and” (“&”), “xor” (“^”), “notb” or “nota”.
In the first overload, op = “set”, and stride = 1 and block = 1 in all dimensions.
In the second overload, setting stride or block to NULL means “1” in all dimensions.
get_select_npoint() returns the number of elements in the current selection.
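The meaning of start, count, stride and block follows the usual HDF5 hyperslab convention: along each dimension, count blocks of block consecutive elements are selected, with the first elements of successive blocks separated by stride. A minimal standalone sketch for the “set” operation is given below; index_t stands in for hsize_t, and both helper functions are hypothetical and not part of HIPP.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using index_t = std::uint64_t;  // stand-in for hsize_t (assumption)

// Indices selected along ONE dimension by (start, count, stride, block).
inline std::vector<index_t> selected_indices_1d(index_t start, index_t count,
                                                index_t stride = 1,
                                                index_t block = 1) {
    std::vector<index_t> out;
    for (index_t b = 0; b < count; ++b)      // loop over blocks
        for (index_t j = 0; j < block; ++j)  // loop inside one block
            out.push_back(start + b * stride + j);
    return out;
}

// Number of elements selected by a "set" selection over all dimensions:
// the product of count[i] * block[i].
inline index_t selected_npoints(const std::vector<index_t> &count,
                                const std::vector<index_t> &block) {
    assert(count.size() == block.size());
    index_t n = 1;
    for (std::size_t i = 0; i < count.size(); ++i) n *= count[i] * block[i];
    return n;
}
```

For instance, start = 1, count = 3, stride = 4, block = 2 selects the indices 1, 2, 5, 6, 9 and 10 along that dimension, while the defaults stride = 1 and block = 1 select count consecutive elements beginning at start.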
Class H5Dataset¶
-
class
H5Dataset¶ The API for HDF5 datasets.
An H5Dataset instance can be copy-constructed, copy-assigned, move-constructed and move-assigned. The copy and move operations and the destructor are all noexcept. The copy operations are shallow copies, i.e., the source and target instances refer to the same underlying dataset object (like a shared_ptr).
-
H5Dataspace
dataspace()¶ -
const H5Dataspace
dataspace() const¶ -
H5Datatype
datatype()¶ -
const H5Datatype
datatype() const¶ Retrieve the information (i.e., dataspace and datatype) of the dataset instance.
-
template<typename T>
H5Attr create_attr(const string &name, const vector<hsize_t> &dims, const string &flag = "trunc")¶ -
H5Attr
create_attr(const string &name, const H5Datatype &dtype, const vector<hsize_t> &dims, const string &flag = "trunc")¶ -
template<typename T>
H5Attr create_attr_scalar(const string &name, const string &flag = "trunc")¶ -
H5Attr
create_attr_str(const string &name, size_t len, const string &flag = "trunc")¶ Create a new attribute (or open an existing attribute) under the current dataset.
The template parameter and argument list are the same as those of H5File::create_dataset() and its variants. The difference is that you cannot specify any property list.
-
H5Attr
open_attr(const string &name)¶ -
bool
attr_exists(const string &name) const¶ open_attr() opens an existing attribute named name. If it does not exist, an ErrH5 error is thrown.
attr_exists() checks whether an attribute named name exists.
-
template<typename T, typename A>
void write(const vector<T, A> &buff, const H5Dataspace &memspace = H5Dataspace::allval, const H5Dataspace &filespace = H5Dataspace::allval, const H5Proplist &xprop = H5Proplist::defaultval)¶ -
template<typename T>
void write(const T *buff, const H5Dataspace &memspace = H5Dataspace::allval, const H5Dataspace &filespace = H5Dataspace::allval, const H5Proplist &xprop = H5Proplist::defaultval)¶ -
void
write(const string &buff, const H5Proplist &xprop = H5Proplist::defaultval)¶ -
template<typename T, typename A>
void write(const vector<T, A> &buff, const H5Datatype &memtype, const H5Dataspace &memspace = H5Dataspace::allval, const H5Dataspace &filespace = H5Dataspace::allval, const H5Proplist &xprop = H5Proplist::defaultval)¶ -
template<typename T>
void write(const T *buff, const H5Datatype &memtype, const H5Dataspace &memspace = H5Dataspace::allval, const H5Dataspace &filespace = H5Dataspace::allval, const H5Proplist &xprop = H5Proplist::defaultval)¶ Write data in buff into the dataset. The type and number of elements in buff must be compatible with the dataset. Five overloads are provided:
(1) const vector<T, A> &buff: write a vector of elements of type T. T must be one of the numeric Predefined Datatypes (e.g., int, float) or std::string. For numeric types, the total number of elements in the vector must be compatible with the dataset's size (i.e., the product of its actual dims). For strings, the number of strings and their maximal length must be compatible with the dataset's dims.
(2) const T *buff: same as the vector version (1), but uses data in the raw buffer.
(3) const string &buff: write a single string.
(4) const vector<T, A> &buff, const H5Datatype &memtype: same as the vector version (1), but T can be any type whose datatype is described by memtype.
(5) const T *buff, const H5Datatype &memtype: same as (4), but uses a raw buffer.
-
template<typename T, typename A>
void read(vector<T, A> &buff, const H5Dataspace &memspace = H5Dataspace::allval, const H5Dataspace &filespace = H5Dataspace::allval, const H5Proplist &xprop = H5Proplist::defaultval) const¶ -
template<typename T>
void read(T *buff, const H5Dataspace &memspace = H5Dataspace::allval, const H5Dataspace &filespace = H5Dataspace::allval, const H5Proplist &xprop = H5Proplist::defaultval) const¶ -
void
read(string &buff, const H5Proplist &xprop = H5Proplist::defaultval) const¶ -
template<typename T, typename A>
void read(vector<T, A> &buff, const H5Datatype &memtype, const H5Dataspace &memspace = H5Dataspace::allval, const H5Dataspace &filespace = H5Dataspace::allval, const H5Proplist &xprop = H5Proplist::defaultval) const¶ -
template<typename T>
void read(T *buff, const H5Datatype &memtype, const H5Dataspace &memspace = H5Dataspace::allval, const H5Dataspace &filespace = H5Dataspace::allval, const H5Proplist &xprop = H5Proplist::defaultval) const¶ Read data in the dataset instance into buff. This is the inverse of the write() method, so the same five overloads are provided. The detailed requirements for each overload are the same as for write().
The first and the third overloads automatically resize the buffer. In all other cases the buffer must already have the correct shape.
-
static H5Proplist
create_proplist(const string &cls)¶ -
H5Proplist
proplist(const string &cls) const¶ Dataset property list manipulation methods.
create_proplist(cls) creates a property list with the given class cls. Possible values for cls include:
“c” or “create”: properties for dataset creation.
“a” or “access”: properties for dataset access.
“x” or “xfer” or “transfer”: properties for dataset transfer.
proplist() retrieves the property list of the current dataset instance. cls can be either “c” (or “create”) or “a” (or “access”).
-
H5Dataspace