sidpy.hdf.dtype_utils.check_dtype

sidpy.hdf.dtype_utils.check_dtype(h5_dset)

Checks the datatype of the input HDF5 dataset and provides the appropriate function to convert it to a real-valued (float) array, along with related metadata about the conversion

Parameters:

h5_dset (h5py.Dataset) – Dataset of interest

Returns:

  • func (callable) – function that will convert the dataset to a real-valued (float) array

  • is_complex (bool) – is the input dataset complex?

  • is_compound (bool) – is the input dataset compound?

  • n_features (Unsigned int) – the length of the second dimension of the data after func is applied to it

  • type_mult (Unsigned int) – the number of bytes that a single element of the original dataset will occupy once func is applied to it; useful for estimating the memory needed by the converted data
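
Taken together, the return values describe the real-valued form of the data: n_features is the expected number of columns and type_mult is the number of bytes each element of the original dataset will occupy once func has been applied. The snippet below is a minimal sketch, purely for illustration, that uses an in-memory HDF5 file (nothing is written to disk) to show how these values can be used to predict the size of the converted data; a complete, file-based example follows.

>>> import numpy as np
>>> import h5py
>>> from sidpy.hdf import dtype_utils
>>> # Purely illustrative: an in-memory HDF5 file that is never written to disk
>>> with h5py.File('scratch.h5', mode='w', driver='core', backing_store=False) as h5_f:
>>>     dset = h5_f.create_dataset('complex', data=np.ones((3, 4), dtype=np.complex64))
>>>     func, is_complex, is_compound, n_features, type_mult = dtype_utils.check_dtype(dset)
>>>     print((dset.shape[0], n_features))  # (3, 8): expected shape once flattened to real values
>>>     print(dset.size * type_mult)        # 96: bytes occupied by the flattened, real-valued data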

Examples

>>> import os
>>> import numpy as np
>>> import h5py
>>> import sidpy
>>> struct_dtype = np.dtype({'names': ['r', 'g', 'b'],
>>>                          'formats': [np.float32, np.uint16, np.float64]})
>>> file_path = 'dtype_utils_example.h5'
>>> if os.path.exists(file_path):
>>>     os.remove(file_path)
>>> with h5py.File(file_path, mode='w') as h5_f:
>>>     num_elems = (5, 7)
>>>     structured_array = np.zeros(shape=num_elems, dtype=struct_dtype)
>>>     structured_array['r'] = 450 * np.random.random(size=num_elems)
>>>     structured_array['g'] = np.random.randint(0, high=1024, size=num_elems)
>>>     structured_array['b'] = 3178 * np.random.random(size=num_elems)
>>>     _ = h5_f.create_dataset('compound', data=structured_array)
>>>     _ = h5_f.create_dataset('real', data=450 * np.random.random(size=num_elems), dtype=np.float16)
>>>     _ = h5_f.create_dataset('complex', data=np.random.random(size=num_elems) + 1j * np.random.random(size=num_elems),
>>>                             dtype=np.complex64)
>>>     h5_f.flush()
>>> # Now, let's test the function on compound-, complex-, and real-valued HDF5 datasets:
>>> def check_dataset(h5_dset):
>>>     print('     Dataset being tested: {}'.format(h5_dset))
>>>     func, is_complex, is_compound, n_features, type_mult = sidpy.hdf.dtype_utils.check_dtype(h5_dset)
>>>     print('     Function to transform to real: %s' % func)
>>>     print('     is_complex? %s' % is_complex)
>>>     print('     is_compound? %s' % is_compound)
>>>     print('     Shape of dataset in its current form: {}'.format(h5_dset.shape))
>>>     print('     After flattening to real, shape is expected to be: ({}, {})'.format(h5_dset.shape[0], n_features))
>>>     print('     Byte-size of a single element after conversion to real: {}'.format(type_mult))
>>> with h5py.File(file_path, mode='r') as h5_f:
>>>     print('Checking a compound-valued dataset:')
>>>     check_dataset(h5_f['compound'])
>>>     print('- - - - - - - - - - - - - - - - - -')
>>>     print('Checking a complex-valued dataset:')
>>>     check_dataset(h5_f['complex'])
>>>     print('- - - - - - - - - - - - - - - - - -')
>>>     print('Checking a real-valued dataset:')
>>>     check_dataset(h5_f['real'])
>>> os.remove(file_path)
Checking a compound-valued dataset:
     Dataset being tested: <HDF5 dataset "compound": shape (5, 7), type "|V14">
     Function to transform to real: <function flatten_compound_to_real at 0x112c130d0>
     is_complex? False
     is_compound? True
     Shape of dataset in its current form: (5, 7)
     After flattening to real, shape is expected to be: (5, 21)
     Byte-size of a single element after conversion to real: 12
- - - - - - - - - - - - - - - - - -
Checking a complex-valued dataset:
     Dataset being tested: <HDF5 dataset "complex": shape (5, 7), type "<c8">
     Function to transform to real: <function flatten_complex_to_real at 0x112c13048>
     is_complex? True
     is_compound? False
     Shape of dataset in its current form: (5, 7)
     After flattening to real, shape is expected to be: (5, 14)
     Byte-size of a single element after conversion to real: 8
- - - - - - - - - - - - - - - - - -
Checking a real-valued dataset:
     Dataset being tested: <HDF5 dataset "real": shape (5, 7), type "<f2">
     Function to transform to real: <class 'numpy.float32'>
     is_complex? False
     is_compound? False
     Shape of dataset in its current form: (5, 7)
     After flattening to real, shape is expected to be: (5, 7)
     Byte-size of a single element after conversion to real: 4
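
The callable returned as func performs the actual conversion. As a brief, illustrative follow-up (assuming that flatten_complex_to_real, the function returned for the complex dataset above, also accepts an in-memory NumPy array and not just an HDF5 dataset), applying it to complex-valued data yields the real-valued array whose shape was predicted by n_features:

>>> import numpy as np
>>> from sidpy.hdf import dtype_utils
>>> cplx = (np.random.random(size=(5, 7)) + 1j * np.random.random(size=(5, 7))).astype(np.complex64)
>>> print(dtype_utils.flatten_complex_to_real(cplx).shape)
(5, 14)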