giotto.meta_transformers.LandscapeGenerator
class giotto.meta_transformers.LandscapeGenerator(metric='euclidean', max_edge_length=inf, homology_dimensions=(0, 1), scaler_metric='bottleneck', scaler_metric_params=None, scaler_function=<function amax>, filter_epsilon=0.0, n_layers=1, n_values=100, n_jobs=None)
Meta transformer returning persistence landscapes directly from point clouds.
Implements a feature generation pipeline which computes persistence diagrams, scales and filters them, and then computes their persistence landscapes.
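The two intermediate stages (scaling, then filtering) can be illustrated on hand-written diagrams. The following is a minimal pure-Python sketch, not the library's implementation: the helper names (`bottleneck_amplitude`, `scale`, `filter_diagram`) are hypothetical, the amplitude is assumed to be half the largest lifetime, and filtering is assumed to discard points whose lifetime does not exceed the cutoff. The library's exact normalisation and cutoff rule may differ.

```python
# Toy persistence diagrams for one homology dimension:
# each point is a (birth, death) pair.
diagrams = [
    [(0.0, 1.0), (0.2, 0.3), (0.5, 2.5)],
    [(0.0, 4.0), (1.0, 1.5)],
]


def bottleneck_amplitude(diagram):
    """Bottleneck distance to the empty diagram, assumed here to be
    half the largest lifetime (the library's convention may differ)."""
    return max(((d - b) / 2.0 for b, d in diagram), default=0.0)


def scale(diagrams, scaler_function=max):
    """Divide every birth/death value by a single scalar obtained by
    applying scaler_function to the collection of amplitudes."""
    factor = scaler_function([bottleneck_amplitude(dgm) for dgm in diagrams])
    return [[(b / factor, d / factor) for b, d in dgm] for dgm in diagrams]


def filter_diagram(diagram, epsilon=0.0):
    """Keep only points whose lifetime exceeds epsilon (assumed rule)."""
    return [(b, d) for b, d in diagram if d - b > epsilon]


scaled = scale(diagrams)
filtered = [filter_diagram(dgm, epsilon=0.1) for dgm in scaled]
```

In this toy input the largest amplitude is 2.0, so all coordinates are halved, and the short-lived point `(0.2, 0.3)` is dropped by the filter. The sketch does not guard against an all-zero collection of amplitudes, which would make the scaling factor zero.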
- Parameters
- metric : string or callable, optional, default: 'euclidean'
If set to 'precomputed', each entry in X along axis 0 is interpreted to be a distance matrix. Otherwise, entries are interpreted as feature arrays, and metric determines a rule with which to calculate distances between pairs of instances (i.e. rows) in these arrays. If metric is a string, it must be one of the options allowed by scipy.spatial.distance.pdist for its metric parameter, or a metric listed in sklearn.metrics.pairwise.PAIRWISE_DISTANCE_FUNCTIONS, including "euclidean", "manhattan" or "cosine". If metric is a callable function, it is called on each pair of instances and the resulting value recorded. The callable should take two arrays from the entry in X as input, and return a value indicating the distance between them.
- max_edge_length : float, optional, default: numpy.inf
Upper bound on the maximum value of the Vietoris-Rips filtration parameter. Points whose distance is greater than this value will never be connected by an edge, and topological features at scales larger than this value will not be detected.
- homology_dimensions : iterable, optional, default: (0, 1)
Dimensions (non-negative integers) of the topological features to be detected.
- scaler_metric : 'bottleneck' | 'wasserstein' | 'landscape' | 'betti' | 'heat', optional, default: 'bottleneck'
Distance or dissimilarity function used to define the amplitude of a subdiagram as its distance from the diagonal diagram:
'bottleneck' and 'wasserstein' refer to the identically named perfect-matching–based notions of distance.
'landscape' refers to the \(L^p\) distance between persistence landscapes.
'betti' refers to the \(L^p\) distance between Betti curves.
'heat' refers to the \(L^p\) distance between Gaussian-smoothed diagrams.
- scaler_metric_params : dict or None, optional, default: None
Additional keyword arguments for scaler_metric:
If scaler_metric == 'bottleneck' there are no available arguments.
If scaler_metric == 'wasserstein' the only argument is p (int, default: 2).
If scaler_metric == 'betti' the available arguments are p (float, default: 2.) and n_values (int, default: 100).
If scaler_metric == 'landscape' the available arguments are p (float, default: 2.), n_values (int, default: 100) and n_layers (int, default: 1).
If scaler_metric == 'heat' the available arguments are p (float, default: 2.), sigma (float, default: 1.) and n_values (int, default: 100).
- scaler_function : callable, optional, default: numpy.max
Function used to extract a single positive scalar from the collection of norms of diagrams.
- filter_epsilon : float, optional, default: 0.
The cutoff value controlling the amount of filtering.
- n_layers : int, optional, default: 1
How many layers to consider in the persistence landscape.
- n_values : int, optional, default: 100
Length of array used to sample the continuous persistence landscapes.
- n_jobs : int or None, optional, default: None
The number of jobs to use for the computation. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.
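To make the roles of n_layers and n_values concrete, here is a minimal pure-Python sketch of sampling a persistence landscape from a single diagram. It is an illustration under stated assumptions rather than the library's code: the function name `landscape` is hypothetical, the sampling grid is assumed to run evenly from the smallest birth to the largest death, and no rescaling factor is applied to the tent functions.

```python
def landscape(diagram, n_layers=1, n_values=100):
    """Sample the first n_layers landscape layers at n_values grid points.

    diagram is a list of (birth, death) pairs. Layer k at position t is
    the (k+1)-th largest value among the tent functions
    max(0, min(t - birth, death - t)) of all points.
    """
    lo = min(b for b, _ in diagram)
    hi = max(d for _, d in diagram)
    # Even grid over [lo, hi]; a single sample collapses onto lo.
    step = (hi - lo) / (n_values - 1) if n_values > 1 else 0.0
    grid = [lo + i * step for i in range(n_values)]

    layers = [[0.0] * n_values for _ in range(n_layers)]
    for j, t in enumerate(grid):
        # Tent heights at t, largest first.
        tents = sorted(
            (max(0.0, min(t - b, d - t)) for b, d in diagram),
            reverse=True,
        )
        for k in range(min(n_layers, len(tents))):
            layers[k][j] = tents[k]
    return grid, layers


grid, layers = landscape([(0.0, 4.0), (1.0, 3.0)], n_layers=2, n_values=5)
# layers[0] is [0.0, 1.0, 2.0, 1.0, 0.0]
# layers[1] is [0.0, 0.0, 1.0, 0.0, 0.0]
```

Requesting more layers than the diagram has points simply leaves the extra layers at zero in this sketch.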
Methods
- fit(self, X[, y]): Create a giotto Pipeline object and fit it.
- fit_transform(self, X[, y]): Fit to data, then transform it.
- get_params(self[, deep]): Get parameters for this estimator.
- set_params(self, **params): Set the parameters of this estimator.
- transform(self, X[, y]): Extract persistence landscapes from the sample point clouds in X.
- __init__(self, metric='euclidean', max_edge_length=inf, homology_dimensions=(0, 1), scaler_metric='bottleneck', scaler_metric_params=None, scaler_function=<function amax>, filter_epsilon=0.0, n_layers=1, n_values=100, n_jobs=None)
Initialize self. See help(type(self)) for accurate signature.
- fit(self, X, y=None)
Create a giotto Pipeline object and fit it. Then, return the estimator. This method exists to implement the usual scikit-learn API and hence work in pipelines.
- Parameters
- X : ndarray, shape (n_samples, n_points, n_dimensions)
Input data. n_samples is the number of point clouds, n_points is the number of points per point cloud and n_dimensions is the number of features for each point of the point cloud (i.e. the dimension of the point cloud space).
- y : None
Ignored.
- Returns
- self : object
- fit_transform(self, X, y=None, **fit_params)
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters
- X : ndarray, shape (n_samples, n_points, n_dimensions)
Training set, in the same format accepted by fit.
- y : None
Target values. Ignored by this transformer.
- Returns
- X_new : ndarray, shape (n_samples, n_homology_dimensions, n_layers, n_values)
Transformed array.
- get_params(self, deep=True)
Get parameters for this estimator.
- Parameters
- deep : boolean, optional
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
- params : mapping of string to any
Parameter names mapped to their values.
- set_params(self, **params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Returns
- self
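The `<component>__<parameter>` naming convention can be illustrated with a small pure-Python sketch. The helper `split_param` and the component name `scaler` are hypothetical and used only for illustration. The sketch mirrors the general scikit-learn convention of splitting on the first double underscore and says nothing about which component names this class actually exposes.

```python
def split_param(name):
    """Split a parameter name on its first double underscore.

    Returns (component, parameter) for nested names, or
    (None, parameter) for a parameter of the estimator itself.
    """
    component, sep, parameter = name.partition("__")
    if sep:
        return component, parameter
    return None, component


# A plain parameter of the estimator itself.
plain = split_param("n_layers")
# A nested parameter; "scaler" is a hypothetical component name.
nested = split_param("scaler__metric")
```

Because only the first double underscore is used as the separator, deeper nesting such as `outer__inner__param` would be resolved one level at a time.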
- transform(self, X, y=None)
Extract persistence landscapes from the sample point clouds in X.
- Parameters
- X : ndarray, shape (n_samples, n_points, n_dimensions)
Input data. n_samples is the number of point clouds, n_points is the number of points per point cloud and n_dimensions is the number of features for each point of the point cloud (i.e. the dimension of the point cloud space).
- y : None
There is no need for a target in a transformer, yet the pipeline API requires this parameter.
- Returns
- Xt : ndarray, shape (n_samples, n_homology_dimensions, n_layers, n_values)
For each point cloud in X, one discretised persistence landscape per homology dimension in homology_dimensions, consisting of n_layers layers, each sampled at n_values values.
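The four-dimensional output shape can be navigated as sketched below. This is a pure-Python illustration using a zero-filled nested list as a stand-in for the returned array. The helper names are hypothetical, and the assumption that axis 1 follows the order of homology_dimensions is not confirmed by the text above.

```python
homology_dimensions = (0, 1)
n_samples, n_layers, n_values = 3, 2, 100

# Placeholder with the documented shape
# (n_samples, n_homology_dimensions, n_layers, n_values);
# real values would come from transform.
Xt = [
    [[[0.0] * n_values for _ in range(n_layers)] for _ in homology_dimensions]
    for _ in range(n_samples)
]


def nested_shape(a):
    """Return the shape of a regular nested list as a tuple."""
    shape = []
    while isinstance(a, list):
        shape.append(len(a))
        a = a[0]
    return tuple(shape)


def flatten_sample(sample):
    """Concatenate all layers of all homology dimensions into one vector."""
    return [v for dim in sample for layer in dim for v in layer]


# First landscape layer in homology dimension 1 for the first point cloud
# (assumes axis 1 is ordered like homology_dimensions).
first_layer_h1 = Xt[0][homology_dimensions.index(1)][0]

# One flat feature vector per point cloud, e.g. for a downstream estimator.
features = [flatten_sample(sample) for sample in Xt]
```

Flattening yields n_homology_dimensions * n_layers * n_values features per sample, which is 400 in this example.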