Write/read SIDpy Dataset via pyNSID¶
Author: Maxim Ziatdinov
Date: September 2020
update: - Gerd Duscher 01/2021 (compatibility to pyNSID version 0.0.2)
A fast introduction into how to write SIDpy datasets to NSID formatted HDF5 files
Start with standard imports:
[1]:
# Ensure python 3 compatibility:
from __future__ import (absolute_import, division, print_function,
unicode_literals)
import warnings
import h5py
import matplotlib.pylab as plt
import numpy as np
# we will also need a sidpy package
try:
import sidpy
except ModuleNotFoundError:
!pip3 install sidpy
import sidpy
import pyNSID
warnings.filterwarnings("ignore", module="numpy.core.fromnumeric")
warnings.filterwarnings("ignore", module="pyNSID.io.nsi_reader")
Creating sidpy.Dataset object(s)¶
Let’s create a simple sidpy Dataset from a numpy array:
[2]:
dataset = sidpy.Dataset.from_array(np.random.random([4, 5, 10]), name='new')
dataset
[2]:
|
Let’s also define the dataset attributes…
[3]:
dataset.data_type = 'SPECTRAL_IMAGE'
dataset.units = 'nA'
dataset.quantity = 'Current'
… and set individual dimensions. In the case of spectroscopic datasets, the first two dimensions are typically spatial units (e.g. nm) and the third one can be energy (e.g. \(meV\) or \(nm^{-1}\)).
[4]:
dataset.set_dimension(0, sidpy.Dimension(np.arange(dataset.shape[0]), 'x',
units='nm', quantity='Length',
dimension_type='spatial'))
dataset.set_dimension(1, sidpy.Dimension(np.linspace(-2, 2, num=dataset.shape[1], endpoint=True), 'y',
units='nm', quantity='Length',
dimension_type='spatial'))
dataset.set_dimension(2, sidpy.Dimension(np.sin(np.linspace(0, 2 * np.pi, num=dataset.shape[2])), 'bias',
units='mV', quantity='Voltage',
dimension_type='spectral'))
[5]:
print(dataset.dim_0)
print(dataset.dim_1)
print(dataset.dim_2)
x: Length (nm) of size (4,)
y: Length (nm) of size (5,)
bias: Voltage (mV) of size (10,)
Writing sidpy.Dataset object(s) to HDF5 files¶
Load NSID-formatted h5 file:
[6]:
hf = h5py.File("test.hf5", 'a')
Let’s create a new channel where we are going to save our sidpy dataset:
[7]:
hf.create_group('Measurement_000/Channel_000')
print(*hf["Measurement_000"].keys())
Channel_000
Now let’s write our sidpy dataset into the newly created channel:
[8]:
pyNSID.hdf_io.write_nsid_dataset(dataset, hf['Measurement_000/Channel_000'], main_data_name="new_spectrum");
C:\Users\gduscher\Anaconda3\lib\site-packages\pyNSID\io\hdf_utils.py:351: FutureWarning: validate_h5_dimension may be removed in a future version
warn('validate_h5_dimension may be removed in a future version',
Close h5 file:
[9]:
hf.close()
Reading sidpy.Dataset object(s)¶
Load back the file:
[10]:
hf = h5py.File("test.hf5", 'r+')
print(*hf["Measurement_000"].keys())
Channel_000
Find our dataset:
[11]:
dataset_hdf5 = pyNSID.io.hdf_utils.find_dataset(hf,'new_spectrum')[0]
dataset_hdf5
[11]:
<HDF5 dataset "new_spectrum": shape (4, 5, 10), type "<f8">
[12]:
hf.close()
Read the dataset stored in HDF5 format as a sidpy object (Dataset) using NSIDReader:
[17]:
dr = pyNSID.NSIDReader('test.hf5')
dataset_sid = dr.read()[0]
assert isinstance(dataset_sid, sidpy.Dataset)
dataset_sid.plot()
[18]:
print(dataset_sid)
dataset_sid
sidpy.Dataset of type SPECTRAL_IMAGE with:
dask.array<generic, shape=(4, 5, 10), dtype=float64, chunksize=(4, 5, 10), chunktype=numpy.ndarray>
data contains: Current (nA)
and Dimensions:
x: Length (nm) of size (4,)
y: Length (nm) of size (5,)
bias: Voltage (mV) of size (10,)
[18]:
|
Close File¶
[19]:
dataset_sid.h5_dataset.file.close()
[ ]: