nanshe.io.hdf5.serializers module

The module serializers performs IO of NumPy objects to and from HDF5 files.

Overview

The module serializers provides an easy way to serialize unusual NumPy objects to and from files. In particular, it provides support for structured arrays and masked arrays.

API

class nanshe.io.hdf5.serializers.HDF5MaskedDataset(group, shape=None, dtype=None, data=None, chunks=True, **kwargs)[source]

Bases: object

Provides a masked-array abstraction over the HDF5 Group in which the contents of a masked array are serialized.

Note

This behaves roughly like an h5py.Dataset and roughly like a numpy.ma.masked_array. Internally, it uses an h5py.Group to contain the components of the masked array and allow interaction with them.
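To make the note above concrete, here is a minimal NumPy-only sketch of the components a masked array breaks down into. The names data, mask, and fill_value mirror the attributes listed for this class; how HDF5MaskedDataset actually lays them out inside its h5py.Group is not stated here, so the dictionary below is only an illustrative stand-in for that Group.

```python
import numpy as np

# A small masked array: the second element is masked out.
arr = np.ma.masked_array(
    [1.0, 2.0, 3.0], mask=[False, True, False], fill_value=-1.0
)

# The pieces that would need to be stored separately.
# (A plain dict stands in for the HDF5 Group; this is an assumption.)
components = {
    "data": np.ma.getdata(arr),       # underlying values, masked or not
    "mask": np.ma.getmaskarray(arr),  # boolean array, True where masked
    "fill_value": arr.fill_value,     # value substituted for masked entries
}

# Rebuilding from the components yields an equivalent masked array.
rebuilt = np.ma.masked_array(
    components["data"],
    mask=components["mask"],
    fill_value=components["fill_value"],
)
```

Keeping the three pieces separate is what allows a container to expose both dataset-like access (shape, dtype) and masked-array-like access (mask, fill_value).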

data
dims
dtype
fill_value
group
mask
name
ndim
resize(size, axis=None)[source]
shape
size
nanshe.io.hdf5.serializers.create_numpy_structured_array_in_HDF5(*args, **kwargs)[source]

Serializes a NumPy structured array to an HDF5 file by using the HDF5 compound data type. It also handles normal NumPy arrays and scalars.

Note

HDF5 does not support generic Python objects. So, serialization of objects to something else (perhaps strs of fixed size) must be performed first.

Parameters:
  • file_handle (HDF5 file) – either an HDF5 filename or Group.
  • internalPath (str) – an internal path for the HDF5 file.
  • data (numpy.ndarray) – the NumPy structured array to save (or normal NumPy array).
  • overwrite (bool) – whether to overwrite what is already there (defaults to False).
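
The note about generic Python objects can be illustrated with a small NumPy-only sketch: object-typed strings are converted to fixed-size byte strings so that every field of the structured array has a fixed width. The field names (id, name) and the sample values are hypothetical, and the sketch stops before any HDF5 call, so it does not show this function's own behavior.

```python
import numpy as np

# Hypothetical input: labels held as generic Python objects.
names = np.array(["soma", "dendrite"], dtype=object)
ids = np.array([1, 2])

# Pick a fixed width large enough for the longest label.
width = max(len(n) for n in names)

# A structured dtype whose fields all have a fixed size.
record_dtype = np.dtype([("id", np.int64), ("name", "S%d" % width)])

records = np.empty(len(names), dtype=record_dtype)
records["id"] = ids
# Encode each label to bytes so it fits the fixed-size string field.
records["name"] = [n.encode("ascii") for n in names]
```

An array prepared this way contains no object fields, which is the precondition the note describes for the data argument.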
nanshe.io.hdf5.serializers.hdf5_wrapper(hdf5_args=[], hdf5_kwargs=[], hdf5_result='')[source]

Stores array results in the specified HDF5 files.

A wrapper that takes a callable, reads any of its input arguments that are HDF5 Datasets into NumPy arrays, and passes those arrays to the decorated callable as normal arguments. The result is then stored as an HDF5 Dataset.

Parameters:
  • hdf5_args (Sequence) – A sequence of indices that represent arguments passed in that are expected to be HDF5 Datasets that will be read in and provided as NumPy arrays.
  • hdf5_kwargs (Sequence) – A sequence of keyword arguments that are expected to be HDF5 Datasets that will be read in and provided as NumPy arrays.
  • hdf5_result (bytes) – Which HDF5 Dataset to use for storing the result.
Returns:

Does the actual decoration.

Return type:

callable
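
The general pattern described above (a decorator factory that converts selected arguments before the call and stores the result afterwards) can be sketched in pure Python. This is not the nanshe implementation: FakeDataset, load_args, and the store parameter are hypothetical stand-ins, a plain dict replaces the HDF5 file, and lists replace NumPy arrays.

```python
import functools


class FakeDataset:
    """Hypothetical stand-in for an HDF5 Dataset; indexing with [...] reads it."""

    def __init__(self, values):
        self._values = list(values)

    def __getitem__(self, key):
        # Return a copy of the stored values, as a full read would.
        return list(self._values)


def load_args(arg_indices=(), kwarg_names=(), result_key="", store=None):
    """Hypothetical analogue of a wrapper taking argument indices and names."""

    def decorator(func):
        @functools.wraps(func)
        def wrapped(*args, **kwargs):
            args = list(args)
            # Read the selected positional arguments into memory.
            for i in arg_indices:
                args[i] = args[i][...]
            # Read the selected keyword arguments into memory.
            for name in kwarg_names:
                if name in kwargs:
                    kwargs[name] = kwargs[name][...]
            result = func(*args, **kwargs)
            # Store the result under the requested key (assumed behavior).
            if store is not None and result_key:
                store[result_key] = result
            return result

        return wrapped

    return decorator


store = {}


@load_args(arg_indices=[0], kwarg_names=["weights"], result_key="total", store=store)
def weighted_sum(values, weights=None):
    return sum(v * w for v, w in zip(values, weights))


total = weighted_sum(FakeDataset([1, 2, 3]), weights=FakeDataset([10, 20, 30]))
```

The decorated function only ever sees in-memory values, so it can be written and tested without any knowledge of the storage layer.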

nanshe.io.hdf5.serializers.read_numpy_structured_array_from_HDF5(*args, **kwargs)[source]

Reads a NumPy structured array from an HDF5 file by using the HDF5 compound data type. It also handles normal NumPy arrays and scalars.

Note

HDF5 does not support generic Python objects. So, serialization of objects to something else (perhaps strs of fixed size) must be performed first.

Parameters:
  • file_handle (HDF5 file) – either an HDF5 filename or Group.
  • internalPath (str) – an internal path for the HDF5 file.

Note

TODO: Write doctests.

Returns: the NumPy structured array.
Return type: data (numpy.ndarray)
nanshe.io.hdf5.serializers.split_hdf5_path(*args, **kwargs)[source]

Splits an HDF5 path (e.g. a.h5/b) into its internal and external components.

Parameters: path (HDF5 path) – a path to the HDF5 file.

Note

TODO: Write doctests.

Returns: the external and internal paths for the HDF5 file.
Return type: tuple
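
As an illustration of the kind of split described above, here is a minimal sketch that separates a path such as a.h5/b into an external (filesystem) part and an internal part. It is not the nanshe implementation: the function name split_path_sketch, the recognized extensions (.h5 and .hdf5), the leading slash on the internal path, and the "/" default when no internal path is given are all assumptions.

```python
import re

# External part: shortest prefix ending in a recognized extension.
# Internal part: everything after it, starting at the next "/".
_HDF5_PATH = re.compile(r"^(?P<external>.*?\.(?:h5|hdf5))(?P<internal>/.*)?$")


def split_path_sketch(path):
    """Split 'file.h5/group/dataset' into ('file.h5', '/group/dataset')."""
    match = _HDF5_PATH.match(path)
    if match is None:
        raise ValueError("No HDF5 file extension found in path: %r" % (path,))
    # Assume the root group when no internal path is present.
    return match.group("external"), match.group("internal") or "/"


external, internal = split_path_sketch("a.h5/b")
```

With these assumptions, external is "a.h5" and internal is "/b"; a path without a recognized extension raises ValueError instead of guessing where the file name ends.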