sidpy.proc.fitter_refactor.SidpyFitterRefactor
- class sidpy.proc.fitter_refactor.SidpyFitterRefactor(dataset, model_function, guess_function, ind_dims=None, num_params=None)
Bases: object
A parallelized fitter for sidpy.Datasets that supports K-Means-based initial guesses for improved convergence on large datasets.
- dataset
The original sidpy dataset containing data and metadata.
- Type:
sidpy.Dataset
- dask_data
The underlying dask array used for parallel computation.
- Type:
dask.array.Array
- guess_func
The function to generate initial guesses. Expected signature: f(x_axis, y_data).
- Type:
callable
Initializes the SidpyFitterRefactor.
Inputs
- dataset : sidpy.Dataset
Dataset to be fitted.
- model_function : callable
The model function to use for fitting.
- guess_function : callable
The function that generates initial parameters for the model.
- ind_dims : int or tuple of int, optional
The indices of the dimensions to fit over. Defaults to the dataset's spectral dimensions.
- num_params : int, optional (required for 2D or higher-dimensional fitting)
The number of parameters the fitting function expects.
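As a rough illustration of the two callables, the sketch below defines a Gaussian model and a matching guess function. The guess function follows the f(x_axis, y_data) signature stated above; the model signature f(x, *params) is an assumption borrowed from the usual SciPy convention, and the names `gaussian_model` and `gaussian_guess` are hypothetical.

```python
import numpy as np


def gaussian_model(x, amp, center, width):
    """Hypothetical model function; the f(x, *params) signature is an assumption."""
    return amp * np.exp(-0.5 * ((x - center) / width) ** 2)


def gaussian_guess(x_axis, y_data):
    """Hypothetical guess function using the documented f(x_axis, y_data) signature."""
    peak = int(np.argmax(y_data))
    amp = y_data[peak]
    center = x_axis[peak]
    # Estimate the width from the full width at half maximum (FWHM = 2.355 * sigma).
    dx = x_axis[1] - x_axis[0]
    fwhm = np.count_nonzero(y_data > 0.5 * amp) * dx
    return np.array([amp, center, fwhm / 2.355])


# Synthetic, noise-free spectrum used only to exercise the guess function.
x_axis = np.linspace(-5.0, 5.0, 201)
y_data = gaussian_model(x_axis, 2.0, 1.0, 0.5)
guess = gaussian_guess(x_axis, y_data)
```

Callables like these could then be passed to the constructor, for example `SidpyFitterRefactor(dataset, gaussian_model, gaussian_guess)`, where `dataset` is an existing sidpy.Dataset.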
Methods
- do_fit – Executes the parallel fit.
- Parallelized guess logic across all pixels.
- do_kmeans_guess – Performs K-Means clustering to find representative spectra for prior fitting.
- reconstruct_function – Reconstructs a Python function from source code stored in metadata.
- Prepares the calculation by rechunking and determining the parameter count.
- Converts the fit results into sidpy.Dataset(s).
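To make the idea of a per-pixel guess concrete, here is a serial NumPy sketch that applies a guess function along the spectral axis of a small data cube. It is not the class's implementation, which runs in parallel on a dask array; `peak_guess` and the synthetic `cube` are hypothetical, and the spectral dimension is assumed to be the last axis.

```python
import numpy as np


def peak_guess(x_axis, y_data):
    """Hypothetical guess function: returns [peak height, peak position]."""
    peak = int(np.argmax(y_data))
    return np.array([y_data[peak], x_axis[peak]])


x_axis = np.linspace(0.0, 1.0, 11)

# Synthetic 2 x 3 pixel image with one Gaussian peak per pixel (last axis is spectral).
centers = np.array([[0.2, 0.5, 0.8], [0.1, 0.4, 0.9]])
cube = np.exp(-(((x_axis - centers[..., None]) / 0.1) ** 2))

# Apply the guess function to every spectrum; the spectral axis is replaced
# by a parameter axis of length 2.
guesses = np.apply_along_axis(lambda y: peak_guess(x_axis, y), -1, cube)
```

The result keeps the spatial shape of the input and replaces the spectral axis with one entry per parameter, which is the layout a per-pixel fit can consume directly.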
- do_fit(guesses=None, use_kmeans=False, n_clusters=10, fit_parameter_labels=None, loss='linear', f_scale=1.0, return_cov=False)
Executes the parallel fit.
- Parameters:
guesses (dask.array.Array, optional) – Initial guesses. If None, generated automatically.
use_kmeans (bool, optional) – Whether to use K-means priors. Default is False.
n_clusters (int, optional) – Number of clusters if use_kmeans is True. Default is 10.
fit_parameter_labels (list of str, optional) – List of string labels for the fit parameters (e.g. [‘Amp’, ‘Phase’]). These are simply saved in metadata.
loss (str, optional) – Loss function for least_squares (e.g., ‘linear’, ‘soft_l1’, ‘huber’, ‘cauchy’, ‘arctan’). Default is ‘linear’.
f_scale (float, optional) – Value of soft margin between inlier and outlier residuals. Default is 1.0.
return_cov (bool, optional) – If True, returns a tuple (fit_dataset, cov_dataset). The cov_dataset contains the covariance matrix for the fit parameters. CAUTION: This significantly increases memory usage.
- Returns:
If return_cov is False: returns the Fit Parameter dataset. If return_cov is True: returns (Fit Parameter dataset, Covariance Matrix dataset).
- Return type:
sidpy.Dataset or tuple(sidpy.Dataset, sidpy.Dataset)
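The loss names listed above match those accepted by scipy.optimize.least_squares, so the sketch below assumes that solver and shows, for a single spectrum, what loss and f_scale change and how a parameter covariance matrix can be estimated from the Jacobian. The covariance formula is the common Gauss-Newton estimate and is an assumption about what return_cov produces, not a confirmed detail of this class; the `line` model and the data are hypothetical.

```python
import numpy as np
from scipy.optimize import least_squares


def line(x, slope, intercept):
    """Hypothetical model used only for this illustration."""
    return slope * x + intercept


def residuals(params, y_data):
    return line(x, *params) - y_data


rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y_clean = line(x, 2.0, 1.0) + rng.normal(scale=0.1, size=x.size)

# Corrupt every tenth point with a large outlier.
y_outliers = y_clean.copy()
y_outliers[::10] += 15.0

# 'linear' is ordinary least squares; 'soft_l1' down-weights residuals
# that are large compared with f_scale.
plain = least_squares(residuals, x0=[1.0, 0.0], args=(y_outliers,), loss="linear")
robust = least_squares(residuals, x0=[1.0, 0.0], args=(y_outliers,),
                       loss="soft_l1", f_scale=0.1)

# Covariance estimate for a linear-loss fit on the clean data:
# cov = inv(J^T J) * residual variance, where cost = 0.5 * sum(residuals**2).
clean = least_squares(residuals, x0=[1.0, 0.0], args=(y_clean,), loss="linear")
dof = x.size - clean.x.size
sigma2 = 2.0 * clean.cost / dof
cov = np.linalg.inv(clean.jac.T @ clean.jac) * sigma2
```

Each pixel yields an n_params × n_params covariance matrix, so storing one per pixel scales memory with the square of the parameter count, which is consistent with the caution attached to return_cov.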
- do_kmeans_guess(n_clusters=10)
Performs K-Means clustering to find representative spectra for prior fitting. Dask-ML KMeans is used so that the clustering scales to large datasets.
- Parameters:
n_clusters (int, optional) – Number of clusters to use for K-Means. Default is 10.
- Returns:
A dask array containing the initial guesses for every pixel.
- Return type:
dask.array.Array
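The general idea behind K-Means priors can be sketched as follows: cluster the spectra, fit only the cluster centroids, then give every pixel the fitted parameters of its cluster as its initial guess. This is a small in-memory illustration under stated assumptions, not the class's code: a plain NumPy Lloyd iteration stands in for Dask-ML KMeans, and `gaussian`, `simple_kmeans`, and the synthetic data are hypothetical.

```python
import numpy as np
from scipy.optimize import least_squares


def gaussian(x, amp, center, width):
    """Hypothetical model used only for this illustration."""
    return amp * np.exp(-0.5 * ((x - center) / width) ** 2)


def simple_kmeans(spectra, n_clusters, n_iter=20):
    """Plain NumPy Lloyd iterations (a stand-in for Dask-ML KMeans)."""
    # Deterministic initialisation: evenly spaced spectra become the first centroids.
    start = np.linspace(0, len(spectra) - 1, n_clusters).astype(int)
    centroids = spectra[start].copy()
    for _ in range(n_iter):
        # Squared distance of every spectrum to every centroid.
        dists = ((spectra[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for k in range(n_clusters):
            members = spectra[labels == k]
            if len(members):
                centroids[k] = members.mean(axis=0)
    return centroids, labels


# Synthetic data: 40 pixels drawn from two populations with different peak centers.
rng = np.random.default_rng(0)
x = np.linspace(-5.0, 5.0, 64)
true_centers = np.repeat([-2.0, 2.0], 20)
spectra = gaussian(x, 1.0, true_centers[:, None], 0.8)
spectra += rng.normal(scale=0.02, size=spectra.shape)

centroids, labels = simple_kmeans(spectra, n_clusters=2)

# Fit each centroid once, starting from a crude peak-based guess.
centroid_params = []
for centroid in centroids:
    p0 = [centroid.max(), x[np.argmax(centroid)], 1.0]
    fit = least_squares(lambda p, c=centroid: gaussian(x, *p) - c, p0)
    centroid_params.append(fit.x)
centroid_params = np.array(centroid_params)

# Every pixel inherits the fitted parameters of its cluster as its initial guess.
guesses = centroid_params[labels]
```

Only n_clusters fits are needed to seed all pixels, and the averaged centroids are less noisy than individual spectra, which is why such guesses tend to help convergence on large datasets.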
- static reconstruct_function(source_code_input, context=None)
Reconstructs a Python function from source code stored in metadata. Robustly handles lists, strings, and indentation issues.
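One way such a reconstruction could work is sketched below using only the standard library. It is an assumption about the general approach rather than the actual implementation: `rebuild_function` is a hypothetical helper, and the treatment of `context` as a namespace of names available to the rebuilt function is a guess at the parameter's purpose.

```python
import ast
import math
import textwrap


def rebuild_function(source, context=None):
    """Hypothetical helper: rebuild a function from stored source text."""
    # Accept either a single string or a list of lines.
    if isinstance(source, (list, tuple)):
        source = "\n".join(line.rstrip("\n") for line in source)
    # Remove common leading indentation, e.g. from a method or nested definition.
    source = textwrap.dedent(source)
    tree = ast.parse(source)
    # Use the first top-level function definition found in the source.
    func_name = next(
        node.name for node in tree.body if isinstance(node, ast.FunctionDef)
    )
    namespace = dict(context or {})
    exec(compile(tree, "<metadata>", "exec"), namespace)
    return namespace[func_name]


# Source stored as an indented list of lines.
stored_lines = ["    def line(x, m, b):", "        return m * x + b"]
line_func = rebuild_function(stored_lines)

# Source stored as a string that relies on a name supplied through the context.
stored_text = "def root(x):\n    return math.sqrt(x)\n"
root_func = rebuild_function(stored_text, context={"math": math})
```

Because this approach executes the stored text with exec, it should only be applied to metadata from trusted files.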