giotto.diagrams.Scaler

class giotto.diagrams.Scaler(metric='bottleneck', metric_params=None, function=<function amax>, n_jobs=None)

Linear scaling of persistence diagrams.
A positive scale factor is calculated during fit by considering all available persistence diagrams and homology dimensions. During transform, all birth-death pairs are divided by this factor.

The value of the scale factor depends on two things:

- A way of computing, for each homology dimension, the amplitude in that dimension of a persistence diagram consisting of birth-death-dimension triples [b, d, q]. Together, metric and metric_params define this in the same way as in Amplitude.
- A scalar-valued function which is applied to the resulting two-dimensional array of amplitudes.
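
The two ingredients above can be sketched in plain NumPy. This is only an illustration of the idea, not the library's implementation: the amplitude used here (half of the longest lifetime in each homology dimension) is a stand-in chosen for simplicity, the helper name half_max_lifetime is hypothetical, and the layout of the amplitude array (one row per diagram, one column per homology dimension) is an assumption.

```python
import numpy as np

def half_max_lifetime(diagrams, dims):
    """Stand-in amplitude: half of the longest lifetime, per diagram and dimension."""
    amplitudes = np.zeros((diagrams.shape[0], len(dims)))
    for j, q in enumerate(dims):
        in_dim = diagrams[:, :, 2] == q
        # Lifetimes of triples outside dimension q are replaced by 0
        lifetimes = np.where(in_dim, diagrams[:, :, 1] - diagrams[:, :, 0], 0.0)
        amplitudes[:, j] = lifetimes.max(axis=1) / 2.0
    return amplitudes

# Two toy diagrams, three [b, d, q] triples each
X = np.array([
    [[0.0, 2.0, 0], [0.0, 1.0, 0], [1.0, 4.0, 1]],
    [[0.0, 6.0, 0], [0.5, 1.5, 1], [2.0, 3.0, 1]],
])

dims = np.unique(X[:, :, 2])
amplitudes = half_max_lifetime(X, dims)  # two-dimensional array of amplitudes
scale = np.max(amplitudes)               # the scalar-valued function step

# Divide births and deaths by the factor; the dimension column is left alone
X_scaled = X.copy()
X_scaled[:, :, :2] /= scale
```

Swapping np.max for another reduction such as np.median changes only the last step: the amplitude array is the same, but the resulting factor differs.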
 Parameters
 metric : 'bottleneck' | 'wasserstein' | 'landscape' | 'betti' | 'heat', optional, default: 'bottleneck'
    Distance or dissimilarity function used to define the amplitude of a subdiagram as its distance from the diagonal diagram:

    - 'bottleneck' and 'wasserstein' refer to the identically named perfect-matching-based notions of distance.
    - 'landscape' refers to the \(L^p\) distance between persistence landscapes.
    - 'betti' refers to the \(L^p\) distance between Betti curves.
    - 'heat' refers to the \(L^p\) distance between Gaussian-smoothed diagrams.
 metric_params : dict or None, optional, default: None
    Additional keyword arguments for the metric function:

    - If metric == 'bottleneck' there are no available arguments.
    - If metric == 'wasserstein' the only argument is p (int, default: 2).
    - If metric == 'betti' the available arguments are p (float, default: 2.) and n_values (int, default: 100).
    - If metric == 'landscape' the available arguments are p (float, default: 2.), n_values (int, default: 100) and n_layers (int, default: 1).
    - If metric == 'heat' the available arguments are p (float, default: 2.), sigma (float, default: 1.) and n_values (int, default: 100).
 function : callable, optional, default: numpy.max
    Function used to extract a positive scalar from the collection of amplitude vectors in fit.
 n_jobs : int or None, optional, default: None
    The number of jobs to use for the computation. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.
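
The per-metric defaults listed under metric_params can be summarised as a lookup table. The sketch below transcribes those defaults and merges user-supplied values over them. The helper resolve_metric_params is hypothetical and is not part of the library, whose own validation may behave differently.

```python
# Defaults transcribed from the metric_params description
DEFAULT_METRIC_PARAMS = {
    'bottleneck': {},
    'wasserstein': {'p': 2},
    'betti': {'p': 2., 'n_values': 100},
    'landscape': {'p': 2., 'n_values': 100, 'n_layers': 1},
    'heat': {'p': 2., 'sigma': 1., 'n_values': 100},
}

def resolve_metric_params(metric, metric_params=None):
    """Hypothetical helper: merge user values over the documented defaults."""
    if metric not in DEFAULT_METRIC_PARAMS:
        raise ValueError(f"unknown metric: {metric!r}")
    defaults = DEFAULT_METRIC_PARAMS[metric]
    params = dict(metric_params or {})
    # Reject arguments that the chosen metric does not list
    unknown = set(params) - set(defaults)
    if unknown:
        raise ValueError(
            f"unexpected arguments for {metric!r}: {sorted(unknown)}"
        )
    return {**defaults, **params}

heat_params = resolve_metric_params('heat', {'sigma': 0.5})
landscape_params = resolve_metric_params('landscape')
```

Here heat_params keeps the documented p and n_values and overrides only sigma, while passing any argument together with 'bottleneck' raises a ValueError in this sketch.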
 Attributes
Notes
To compute scaling factors without first splitting the computation between different homology dimensions, data should be first transformed by an instance of ForgetDimension.

Methods

fit(self, X[, y])                   Store all observed homology dimensions in homology_dimensions_ and compute scale_.
fit_transform(self, X[, y])         Fit to data, then transform it.
get_params(self[, deep])            Get parameters for this estimator.
inverse_transform(self, X[, copy])  Scale back the data to the original representation.
set_params(self, **params)          Set the parameters of this estimator.
transform(self, X[, y])             Divide all birth and death values in X by scale_.
__init__(self, metric='bottleneck', metric_params=None, function=<function amax>, n_jobs=None)
Initialize self. See help(type(self)) for accurate signature.

fit(self, X, y=None)
Store all observed homology dimensions in homology_dimensions_ and compute scale_. Then, return the estimator.

 Parameters
 X : ndarray, shape (n_samples, n_features, 3)
    Input data. Array of persistence diagrams, each a collection of triples [b, d, q] representing persistent topological features through their birth (b), death (d) and homology dimension (q).
 y : None
    There is no need for a target in a transformer, yet the pipeline API requires this parameter.

 Returns
 self : object
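
As a rough illustration of the input layout described above, the following NumPy sketch builds a toy array of shape (n_samples, n_features, 3) and reads off the homology dimensions that occur in it. The values are made up, and the variable observed_dimensions is only an analogue of what fit is described as storing in homology_dimensions_; the library's actual container type is not assumed here.

```python
import numpy as np

# Hypothetical toy input: 2 diagrams (n_samples), 3 triples each (n_features)
X = np.array([
    [[0.0, 1.0, 0], [0.2, 0.9, 1], [0.3, 0.6, 1]],
    [[0.0, 2.0, 0], [0.0, 0.5, 0], [1.0, 1.5, 1]],
])

# The last axis holds birth (b), death (d) and homology dimension (q)
births, deaths, dims = X[:, :, 0], X[:, :, 1], X[:, :, 2]

# Every homology dimension seen in any diagram
observed_dimensions = sorted(set(dims.ravel().tolist()))
```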

fit_transform(self, X, y=None, **fit_params)
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

 Parameters
 X : numpy array of shape [n_samples, n_features]
    Training set.
 y : numpy array of shape [n_samples]
    Target values.

 Returns
 X_new : numpy array of shape [n_samples, n_features_new]
    Transformed array.

get_params(self, deep=True)
Get parameters for this estimator.

 Parameters
 deep : boolean, optional
    If True, will return the parameters for this estimator and contained subobjects that are estimators.

 Returns
 params : mapping of string to any
    Parameter names mapped to their values.

inverse_transform(self, X, copy=None)
Scale back the data to the original representation. Multiplies by the scale found in fit.

 Parameters
 X : ndarray, shape (n_samples, n_features, 3)
    Data to apply the inverse transform to.

 Returns
 Xs : ndarray, shape (n_samples, n_features, 3)
    Rescaled diagrams.
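
The relationship between scaling and its inverse can be checked with a small NumPy round trip. This sketch assumes a fixed factor of 3.0 in place of a fitted scale_, and the functions scale_diagrams and unscale_diagrams are hypothetical stand-ins for transform and inverse_transform, not the library's code.

```python
import numpy as np

scale = 3.0  # stands in for a fitted scale_ value

# One toy diagram with two [b, d, q] triples
X = np.array([[[0.0, 6.0, 0], [1.5, 4.5, 1]]])

def scale_diagrams(X, scale):
    """Divide births and deaths by scale; leave the dimension column as is."""
    Xs = X.copy()
    Xs[:, :, :2] = X[:, :, :2] / scale
    return Xs

def unscale_diagrams(Xs, scale):
    """Multiply births and deaths by scale, undoing scale_diagrams."""
    X = Xs.copy()
    X[:, :, :2] = Xs[:, :, :2] * scale
    return X

Xs = scale_diagrams(X, scale)
X_back = unscale_diagrams(Xs, scale)
```

Up to floating-point rounding, X_back matches X, and the homology dimension column is identical in all three arrays.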

set_params(self, **params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

 Returns
 self

transform(self, X, y=None)
Divide all birth and death values in X by scale_.

 Parameters
 X : ndarray, shape (n_samples, n_features, 3)
    Input data. Array of persistence diagrams, each a collection of triples [b, d, q] representing persistent topological features through their birth (b), death (d) and homology dimension (q).
 y : None
    There is no need for a target in a transformer, yet the pipeline API requires this parameter.

 Returns
 Xs : ndarray, shape (n_samples, n_features, 3)
    Rescaled diagrams.