Band Excitation Relaxation Spectroscopy Data Processing

Jessica Kong, Suhas Somnath, Chris R. Smith, Stephen Jesse

University of Washington 7/15/2019

This Jupyter notebook demonstrates how to use pycroscopy and pyUSID to process Band Excitation relaxation spectra. The code used to configure the notebook, select a file, and explore the file parameters (i.e., the code before the SHO fits section) was taken from other BE notebooks written by Suhas Somnath, Chris R. Smith, and Stephen Jesse.

[Image: notebook_rules.png]

Image courtesy of Jean Bilheux from the neutron imaging GitHub repository.

Configure the notebook

[1]:
# Make sure needed packages are installed and up-to-date
import sys
!conda install --yes --prefix {sys.prefix} numpy scipy matplotlib scikit-learn ipython ipywidgets h5py
# pip installs pyUSID automatically as a dependency of pycroscopy
!{sys.executable} -m pip install -U --no-deps pycroscopy

[2]:
# Ensure python 3 compatibility
from __future__ import division, print_function, absolute_import

# Import necessary libraries:
# General utilities:
import os
import shutil

# Computation:
import numpy as np
import h5py

# Visualization:
# import ipympl
import matplotlib.pyplot as plt
import matplotlib.widgets as mpw
from IPython.display import display, clear_output, HTML
import ipywidgets as widgets

import pyUSID as usid
# Finally, pycroscopy itself
sys.path.append('..')
import pycroscopy as px

# Make Notebook take up most of page width
display(HTML(data="""
<style>
    div#notebook-container    { width: 90%; }
    div#menubar-container     { width: 65%; }
    div#maintoolbar-container { width: 99%; }
</style>
"""))
[3]:
# set up notebook to show plots within the notebook
%matplotlib notebook

Set some basic parameters for computation

This notebook performs some functional fitting whose duration can be substantially decreased by using more memory and CPU cores. We have provided default values below but you may choose to change them if necessary.

[4]:
max_mem         = 1024*8  # Maximum memory to use, in MB
max_cores       = None    # Number of logical cores to use in fitting. None uses all but 2 available cores.

Make the data pycroscopy compatible

Converting the raw data into a pycroscopy-compatible hierarchical data format (HDF5, or .h5) file gives you access to the fast fitting algorithms and powerful analysis functions within pycroscopy. HDF5 files:

  • are like smart containers that can store matrices with data, folders to organize these datasets, images, metadata like experimental parameters, links or shortcuts to datasets, etc.

  • are readily compatible with high-performance computing facilities

  • scale very efficiently from a few kilobytes to several terabytes

  • can be read and modified using nearly any language, including Python, Matlab, C/C++, Java, Fortran, Igor Pro, etc.

You can load either of the following:

  • Any .mat or .txt parameter file from the original experiment

  • A .h5 file generated from the raw data using pycroscopy - skips translation

You can select the desired file type by choosing the second option in the pull-down menu at the bottom right of the file window.
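As a standalone illustration of the container idea (a hypothetical `demo.h5` file with made-up contents, not part of the BE workflow), a small h5py example showing a dataset, a group, and metadata attributes living in one file:

```python
import h5py
import numpy as np

# Create a tiny HDF5 file with a group, a dataset, and a metadata attribute
with h5py.File('demo.h5', 'w') as h5_f:
    grp = h5_f.create_group('Measurement_000')
    grp.create_dataset('Raw_Data', data=np.arange(12).reshape(4, 3))
    grp.attrs['BE_amplitude_[V]'] = 1.0  # metadata stored alongside the data

# Read it back - datasets are addressed by path, like files in folders
with h5py.File('demo.h5', 'r') as h5_f:
    print(h5_f['Measurement_000/Raw_Data'].shape)             # (4, 3)
    print(h5_f['Measurement_000'].attrs['BE_amplitude_[V]'])  # 1.0
```

The real files produced by the translators below follow the same pattern, just with the USID layout of main and ancillary datasets.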

[5]:
input_file_path = usid.io_utils.file_dialog(caption='Select translated .h5 file or raw experiment data',
                                          file_filter='Parameters for raw BE data (*.txt *.mat *.xls *.xlsx);; \
                                          Translated file (*.h5)')

(data_dir, filename) = os.path.split(input_file_path)

if input_file_path.endswith('.h5'):
    # No translation here
    h5_path = input_file_path
    force = True # Set this to true to force patching of the datafile.
    tl = px.io.translators.LabViewH5Patcher()
    tl.translate(h5_path, force_patch=force)
else:
    # Set the data to be translated
    data_path = input_file_path

    (junk, base_name) = os.path.split(data_dir)

    # Check if the data is in the new or old format. Initialize the correct translator for the format.
    if base_name == 'newdataformat':
        (junk, base_name) = os.path.split(junk)
        translator = px.io.translators.BEPSndfTranslator(max_mem_mb=max_mem)
    else:
        translator = px.io.translators.BEodfTranslator(max_mem_mb=max_mem)
    if base_name.endswith('_d'):
        base_name = base_name[:-2]
    # Translate the data
    h5_path = translator.translate(data_path, show_plots=True, save_plots=False)

h5_file = h5py.File(h5_path, 'r+')
print('Working on:\n' + h5_path)

h5_main = usid.hdf_utils.find_dataset(h5_file, 'Raw_Data')[0]
Working on:
Z:/Jessica Kong/Data/ORNL Summer-Fall 2018/MeasurementsByKyle/Jess_copy/+6 V Relaxation/BTO_031919_Post+6VPole_RS_+2Vbias_0017.h5

The file contents are stored in a tree structure, just like files on a conventional computer. The data are stored as a 2D matrix (position, spectroscopic value) regardless of the dimensionality of the measurement. Thus, the positions are arranged as row0-col0, row0-col1, …, row0-colN, row1-col0, …, and the data for each position are stored in the order they were chronologically collected.

The main dataset is always accompanied by four ancillary datasets that explain the position and spectroscopic value of any given element in the dataset.
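The position flattening described above can be illustrated with plain numpy (a hypothetical 2x3 grid with 4 spectroscopic points per position, not the actual dataset):

```python
import numpy as np

rows, cols, spec_pts = 2, 3, 4
data_3d = np.arange(rows * cols * spec_pts).reshape(rows, cols, spec_pts)

# Flatten positions row-major: row0-col0, row0-col1, row0-col2, row1-col0, ...
data_2d = data_3d.reshape(rows * cols, spec_pts)

# Flattened position index 4 corresponds to row 1, col 1
print(np.array_equal(data_2d[4], data_3d[1, 1]))  # True
print(data_2d.shape)  # (6, 4)
```

The (100, 7905) shape of `Raw_Data` below is exactly this layout: 10 x 10 positions by 7905 spectroscopic values.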

[6]:
print('Datasets and datagroups within the file:\n------------------------------------')
usid.hdf_utils.print_tree(h5_file)

print('\nThe main dataset:\n------------------------------------')
print(h5_main)
print('\nThe ancillary datasets:\n------------------------------------')
print(h5_main.h5_pos_inds)
print(h5_main.h5_pos_vals)
print(h5_main.h5_spec_inds)
print(h5_main.h5_spec_vals)

print('\nMetadata or attributes in a datagroup\n------------------------------------')
for key, val in usid.hdf_utils.get_attributes(h5_main.parent.parent).items():
    print('{} : {}'.format(key, val))
Datasets and datagroups within the file:
------------------------------------
/
├ Measurement_000
  ---------------
  ├ Channel_000
    -----------
    ├ Bin_FFT
    ├ Bin_Frequencies
    ├ Bin_Indices
    ├ Bin_Step
    ├ Excitation_Waveform
    ├ Noise_Floor
    ├ Position_Indices
    ├ Position_Values
    ├ Raw_Data
    ├ Raw_Data-SHO_Fit_000
      --------------------
      ├ Guess
      ├ Spectroscopic_Indices
      ├ Spectroscopic_Values
    ├ Raw_Data-SHO_Fit_001
      --------------------
      ├ Fit
      ├ Fit-Double_Exp_000
        ------------------
        ├ Double_Exp_Fit
        ├ Spectroscopic_Indices
        ├ Spectroscopic_Values
        ├ completed_positions
      ├ Guess
      ├ Spectroscopic_Indices
      ├ Spectroscopic_Values
    ├ Spatially_Averaged_Plot_Group_000
      ---------------------------------
      ├ Bin_Frequencies
      ├ Mean_Spectrogram
      ├ Spectroscopic_Parameter
      ├ Step_Averaged_Response
    ├ Spectroscopic_Indices
    ├ Spectroscopic_Values

The main dataset:
------------------------------------
<HDF5 dataset "Raw_Data": shape (100, 7905), type "<c8">
located at:
        /Measurement_000/Channel_000/Raw_Data
Data contains:
        quantity (a.u.)
Data dimensions and original shape:
Position Dimensions:
        X - size: 10
        Y - size: 10
Spectroscopic Dimensions:
        Frequency - size: 31
        Pulse_Repeat - size: 250
        DC_Offset - size: 2
        Field - size: 1
Data Type:
        complex64

The ancillary datasets:
------------------------------------
<HDF5 dataset "Position_Indices": shape (100, 2), type "<u4">
<HDF5 dataset "Position_Values": shape (100, 2), type "<f4">
<HDF5 dataset "Spectroscopic_Indices": shape (4, 7905), type "<u4">
<HDF5 dataset "Spectroscopic_Values": shape (4, 7905), type "<f4">

Metadata or attributes in a datagroup
------------------------------------
AFM_InvOLS : 1.1009831191079487e-09
AFM_XLVDT_sensor : 3.8194705242817696e-06
AFM_XPiezo_sensitivity : -1.7495e-07
AFM_YLVDT_sensor : 3.9885994791750315e-06
AFM_YPiezo_sensitivity : 1.8273e-07
AFM_ZLVDT_sensor : 2.43377e-06
AFM_ZPiezo_sensitivity : 2.68351e-08
BE_amplitude_[V] : 1.0
BE_auto_smooth_cond : 1.0
BE_band_width_[Hz] : 100000.0
BE_center_frequency_[Hz] : 384000.0
BE_phase_content : chirp-sinc hybrid
BE_phase_variation : 1.0
BE_pulse_duration_[s] : 0.002
BE_repeats : 4.0
BE_signal_type_ring : 1.0
BE_smoothing : 3208.717049919559
BE_window_adjustment : 0.215625
IO_AFM_platform : Cypher AR14
IO_AI_range : 10.0
IO_AO_range : 10.0
IO_Channel_001_type : none
IO_Channel_002_type : none
IO_Channel_003_type : none
IO_analog_output_amplifier_ring : 1.0
IO_card : 6124
IO_deflection_detector_ring : 0
IO_rate : 4000000.0
IO_sensitivity : 1.0
VS_envelope_max_amp_[V] : 4.0
VS_envelope_min_amp_[V] : 1.0
VS_envelope_shape : 0
VS_mode : VS_Relaxation
VS_num_kernel_repeats : 1
VS_num_meas_per_read_step : 250
VS_num_meas_per_write_step : 5
VS_num_steps_per_envelope : 4
VS_pulse_kernel_polarity : 1
VS_start_kernel_with : 0
VS_step_transition_duration_[s]_ : 2.0
data_type : BERelaxData
grid_current_col : 9
grid_current_row : 9
grid_num_cols : 10
grid_num_rows : 10
grid_set_point_[V] : 1.0
grid_settle_time_[s] : 0.1
grid_transit_time_[s] : 0.1
num_UDVS_bins : 7905
num_UDVS_steps : 255
num_bins : 7905
num_pix : 100
num_steps : 255

Get some basic parameters from the H5 file

This information will be vital for further analysis and visualization of the data.

[7]:
h5_pos_inds = h5_main.h5_pos_inds
pos_dims = h5_main.pos_dim_sizes
pos_labels = h5_main.pos_dim_labels
print(pos_labels, pos_dims)

parm_dict = h5_main.parent.parent.attrs
is_ckpfm = h5_file.attrs['data_type'] == 'cKPFMData'
if is_ckpfm:
    num_write_steps = parm_dict['VS_num_DC_write_steps']
    num_read_steps = parm_dict['VS_num_read_steps']
    num_fields = 2
['X', 'Y'] [10, 10]

Perform SHO fits to data to extract PFM response

[8]:
sho_fit_points = 5  # The number of data points at each step to use when fitting
sho_override = False  # Force recompute if True

h5_main_SHO_fitter = px.analysis.BESHOfitter(h5_main, parallel=True, verbose=True)
h5_main_SHO_guess = h5_main_SHO_fitter.do_guess(strategy='complex_gaussian', options={'num_points':sho_fit_points},
                                   processors=max_cores, max_mem=max_mem, override=sho_override)
h5_main_SHO_fit = h5_main_SHO_fitter.do_fit(processors=max_cores, max_mem=max_mem, override=sho_override)
Allowed to read 17078 pixels per chunk
Looking at group - Raw_Data-SHO_Fit_000
Looking for new attribute named: num_points
New parm: num_points    - new parm not in group *****

Looking at group - Raw_Data-SHO_Fit_001
Looking for new attribute named: num_points
New parm: num_points    - match: True
Looking for new attribute named: frequencies
New parm: frequencies   - match: True
Looking for new attribute named: strategy
New parm: strategy      - match: True

Returned previously computed results at /Measurement_000/Channel_000/Raw_Data-SHO_Fit_001/Guess
Groups that matched the nomenclature: [<HDF5 group "/Measurement_000/Channel_000/Raw_Data-SHO_Fit_000" (3 members)>, <HDF5 group "/Measurement_000/Channel_000/Raw_Data-SHO_Fit_001" (5 members)>]
Looking for new attribute named: jac
New parm: jac   - match: True
Looking for new attribute named: solver_type
New parm: solver_type   - match: True
Looking for new attribute named: class
New parm: class         - match: True
Looking for new attribute named: obj_func
New parm: obj_func      - match: True
Looking for new attribute named: xvals
New parm: xvals         - match: True

Returned previously computed results at /Measurement_000/Channel_000/Raw_Data-SHO_Fit_001/Fit

Perform functional fits to PFM relaxation data

[9]:
starts_with = 'write'
# Experimentally obtained tip sensitivity
tip_sens = 83.53 * 1e3  # pm/V
# Phase offset to center the raw phase at 0; can be obtained by plotting:
# plt.hist(np.concatenate(h5_main_SHO_fit['Phase [rad]'].reshape(-1, 1)), bins=1000);
phase = 3.0
# Fit method; can be 'Exponential', 'Double_Exp', or 'Logistic'
fit_method = 'Double_Exp'
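The 'Double_Exp' option fits each relaxation transient to a biexponential decay. A minimal self-contained sketch with scipy on synthetic data (illustrative parameter values; the exact parameterization inside BERelaxFit may differ):

```python
import numpy as np
from scipy.optimize import curve_fit

def double_exp(t, a1, tau1, a2, tau2, y0):
    """Biexponential relaxation: two decay amplitudes and time constants plus an offset."""
    return a1 * np.exp(-t / tau1) + a2 * np.exp(-t / tau2) + y0

# Synthetic noise-free relaxation curve with known parameters
t = np.linspace(0, 1.0, 250)            # 250 read steps, as in this dataset
true = (40.0, 0.02, 15.0, 0.3, 5.0)     # fast and slow components plus offset
y = double_exp(t, *true)

# Recover the parameters; a reasonable initial guess helps convergence
popt, _ = curve_fit(double_exp, t, y, p0=(30, 0.05, 10, 0.5, 0))
print(np.round(popt, 3))
```

BERelaxFit applies the same kind of fit to the SHO amplitude transient at every pixel and voltage step, yielding maps of the relaxation amplitudes and time constants.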
[10]:
decayfit = px.analysis.BERelaxFit(h5_main_SHO_fit, sens=tip_sens, phase_off=phase,
                                  starts_with=starts_with, fit_method=fit_method)
# Start and end positions should be 0 and the spatial dimension of h5_main; they are
# currently user-specified because this has yet to be integrated into the BERelaxFit process class.
decayfit._start_pos = 0
decayfit._end_pos = h5_main.shape[0]
decayfit._cores = 1
# decayfit.compute()
Consider calling test() to check results before calling compute() which computes on the entire dataset and writes back to the HDF5 file

Visualize the data

[11]:

h5_main_fit = h5_file['Measurement_000/Channel_000/Raw_Data-SHO_Fit_001/Fit-Double_Exp_000/Double_Exp_Fit']

# Make widget sliders for the x- and y-axes, voltage, and time
slide_x = widgets.IntSlider(min=0, max=h5_main_SHO_fit.pos_dim_sizes[0] - 1, description='Col')
slide_y = widgets.IntSlider(min=0, max=h5_main_SHO_fit.pos_dim_sizes[1] - 1, description='Row')
slide_v = widgets.IntSlider(min=0, max=decayfit.no_rs_spectra - 1, step=1, description='App. Voltage',
                            continuous_update=False, value=0)
slide_t = widgets.IntSlider(min=0, max=decayfit.no_time_steps - 1, description='Time Slice',
                            continuous_update=False)

# Make plotting interactive with widgets
widgets.interact(px.viz.be_viz_utils.viz_berelaxfit, berelaxfit=widgets.fixed(h5_main_fit),
                 t_time=slide_t, x_col=slide_x, h_row=slide_y, bias_ind=slide_v,
                 sensitivity=widgets.fixed(tip_sens), phase_offset=widgets.fixed(phase),
                 fit_method=widgets.fixed(fit_method), starts_with=widgets.fixed(starts_with));
[ ]: