Variable Length Array Datatypes

HDF5 provides support for variable length arrays, but IDL itself does not. As a result, in order to store data in an HDF5 variable length array you must:

  1. Create a series of vectors of data in IDL, each with a potentially different length. All vectors must be of the same data type.
  2. Store a pointer to each data vector in the PDATA field of an IDL_H5_VLEN structure. The IDL_H5_VLEN structure is defined as follows:
     { IDL_H5_VLEN, pdata:PTR_NEW() }
  3. Create an array of IDL_H5_VLEN structures that will be stored as an HDF5 variable length array.
  4. Create a base HDF5 datatype from one of the data vectors.
  5. Create an HDF5 variable length datatype from the base datatype.
  6. Create an HDF5 dataspace of the appropriate size.
  7. Create an HDF5 dataset.
  8. Write the array of IDL_H5_VLEN structures to the HDF5 dataset.

Note
IDL string arrays are a special case: see Variable Length String Arrays for details.

Creating a variable length array datatype is a two-step process. First, you must create a base datatype using the H5T_IDL_CREATE function; all data in the variable length array must be of this datatype. Second, you create a variable length array datatype using the base datatype as an input to the H5T_VLEN_CREATE function.

Note
No explicit size is provided to the H5T_VLEN_CREATE function; sizes are determined as needed by the data being written.

Example: Writing a Variable Length Array

; Create a file to hold the data 
file = 'h5_test.h5' 
fid = H5F_CREATE(file) 
 
; Create three sets of integer data (two vectors and a scalar) 
a = INDGEN(2) 
b = INDGEN(3) 
c = 3 
 
; Create an array of three IDL_H5_VLEN structures 
sArray = REPLICATE({IDL_H5_VLEN},3) 
 
; Populate the IDL_H5_VLEN structures with pointers to 
; the three data vectors 
sArray[0].pdata = PTR_NEW(a) 
sArray[1].pdata = PTR_NEW(b) 
sArray[2].pdata = PTR_NEW(c) 
 
; Create a datatype based on one of the data vectors 
dt1 = H5T_IDL_CREATE(a) 
 
; Create a variable length datatype based on the previously- 
; created datatype 
dt = H5T_VLEN_CREATE(dt1) 
 
; Create a dataspace 
ds = H5S_CREATE_SIMPLE(N_ELEMENTS(sArray)) 
 
; Create the dataset 
d = H5D_CREATE(fid,'dataset', dt, ds) 
 
; Write the array of structures to the dataset 
H5D_WRITE, d, sArray 

Examples: Reading a Variable Length Array

Using the H5D_READ function to read data written as a variable length array creates an array of IDL_H5_VLEN structures. The following examples show how to refer to individual data elements of various HDF5 datatypes.

Atomic HDF5 Datatypes

To read and access data stored in variable length arrays of atomic HDF5 datatypes, simply dereference the pointer stored in the PDATA field of the appropriate IDL_H5_VLEN structure. For example, to retrieve the variable b from the data written in the above example:

data = H5D_READ(d) 
b = *data[1].pdata 
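
To visit every element rather than a single one, a loop over the returned structures is one option. The following is a minimal sketch that continues the writing example above; it assumes the identifiers d, ds, dt, dt1, and fid and the array sArray from that example are still defined in the current session. The cleanup calls at the end are a suggested housekeeping step, not something the example above requires.

; Read the dataset back as an array of IDL_H5_VLEN structures 
data = H5D_READ(d) 
 
; Print each stored vector; the lengths may differ per element 
FOR i = 0, N_ELEMENTS(data) - 1 DO PRINT, *data[i].pdata 
 
; Release the HDF5 identifiers created in the writing example 
H5D_CLOSE, d 
H5S_CLOSE, ds 
H5T_CLOSE, dt 
H5T_CLOSE, dt1 
H5F_CLOSE, fid 
 
; Free the heap variables referenced by both structure arrays 
PTR_FREE, data.pdata, sArray.pdata 

Because each PDATA field is an ordinary IDL pointer, the data it references stays on the heap until it is freed explicitly.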

Compound HDF5 Datatypes

If you have a variable length array of compound datatypes, the field named tag of the jth structure of the ith element of the variable length array would be accessed as follows:

data = H5D_READ(d) 
a = (*data[i].pdata)[j].tag 
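
The access pattern can be tried without an HDF5 file by building the same layout in memory. This is an illustrative sketch only: the anonymous structure with fields tag and label is hypothetical, and data is constructed by hand here rather than returned by H5D_READ.

; Two hypothetical compound records (fields chosen for illustration) 
recs = REPLICATE({tag: 0L, label: ''}, 2) 
recs[1].tag = 42L 
 
; One variable length element that points at the records 
data = REPLICATE({IDL_H5_VLEN}, 1) 
data[0].pdata = PTR_NEW(recs) 
 
; i = 0, j = 1: prints 42 
PRINT, (*data[0].pdata)[1].tag 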

Variable Length Arrays of Variable Length Arrays

If you have a variable length array of variable length arrays of integers, the kth integer of the jth element of a variable length array stored in the ith element of a variable length array would be accessed as follows:

data = H5D_READ(d) 
a = (*(*data[i].pdata)[j].pdata)[k] 
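
The double dereference is easier to follow on a small hand-built structure. The sketch below assumes no HDF5 file at all; the names inner and outer are hypothetical, and outer stands in for the array that H5D_READ would return.

; Inner variable length array: two integer vectors of different length 
inner = REPLICATE({IDL_H5_VLEN}, 2) 
inner[0].pdata = PTR_NEW([10, 11]) 
inner[1].pdata = PTR_NEW([20, 21, 22]) 
 
; Outer variable length array: one element that points at inner 
outer = REPLICATE({IDL_H5_VLEN}, 1) 
outer[0].pdata = PTR_NEW(inner) 
 
; i = 0, j = 1, k = 2: prints 22 
PRINT, (*(*outer[0].pdata)[1].pdata)[2] 

Reading the expression from the inside out, *outer[0].pdata yields the inner structure array, [1].pdata selects its second pointer, and the outer dereference followed by [2] selects the third integer.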

Compound Datatypes Containing Variable Length Arrays

If you have a compound datatype containing a variable length array, the kth data element of the jth variable length array in the ith compound datatype would be accessed as follows:

data = H5D_READ(d) 
a = (*data[i].vl_array[j].pdata)[k] 
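
As a rough in-memory illustration of this layout, the sketch below builds an array of structures whose vl_array field holds IDL_H5_VLEN structures. The anonymous structure and the sizes used are assumptions made for the example; a real dataset would define them through its compound datatype.

; Three hypothetical compound records, each holding two variable 
; length elements in the field vl_array 
rec = {vl_array: REPLICATE({IDL_H5_VLEN}, 2)} 
data = REPLICATE(rec, 3) 
 
; Attach a vector to the second variable length element of record 0 
data[0].vl_array[1].pdata = PTR_NEW([5, 6, 7]) 
 
; i = 0, j = 1, k = 2: prints 7 
PRINT, (*data[0].vl_array[1].pdata)[2] 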

Variable Length String Arrays

Because the data vectors referenced by the pointers stored in the PDATA field of the IDL_H5_VLEN structure must all have the same type and dimension, strings are handled as vectors of individual characters rather than as atomic units. This means that each element in a string array must be assigned to an individual IDL_H5_VLEN structure:

str = ['dog', 'dragon', 'duck'] 
sArray = REPLICATE({IDL_H5_VLEN},3) 
sArray[0].pdata = PTR_NEW(str[0]) 
sArray[1].pdata = PTR_NEW(str[1]) 
sArray[2].pdata = PTR_NEW(str[2]) 

Use the H5T_STR_TO_VLEN function to assist in converting between an IDL string array and an HDF5 variable length string array. The following achieves the same result as the above five lines:

str = ['dog', 'dragon', 'duck'] 
sArray = H5T_STR_TO_VLEN(str) 

Similarly, if you have an HDF5 variable length array containing string data, use the H5T_VLEN_TO_STR function to access the string data:

data = H5D_READ(d) 
str = H5T_VLEN_TO_STR(data)
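
The two conversion functions can also be exercised together without writing a file. This is a small sketch under the assumption that the output of H5T_STR_TO_VLEN can be passed directly to H5T_VLEN_TO_STR; the variable name str2 is introduced here for illustration.

; Convert an IDL string array to IDL_H5_VLEN structures and back 
str = ['dog', 'dragon', 'duck'] 
sArray = H5T_STR_TO_VLEN(str) 
str2 = H5T_VLEN_TO_STR(sArray) 
 
; Display the recovered strings for comparison with the originals 
PRINT, str2 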