giotto.homology.ConsistentRescaling

class giotto.homology.ConsistentRescaling(metric='euclidean', metric_params={}, neighbor_rank=1, n_jobs=None)
Rescaling of distances between pairs of points by the geometric mean of the distances to the respective \(k\)th nearest neighbours.
Based on ideas in [1]. The computation during transform depends on the nature of the array X. If each entry in X along axis 0 represents a distance matrix \(D\), then the corresponding entry in the transformed array is the distance matrix \(D'_{ij} = D_{ij}/\sqrt{D_{ik_i}D_{jk_j}}\), where \(k_i\) is the index of the \(k\)th nearest neighbour of point \(i\), so that \(D_{ik_i}\) is the \(k\)th smallest distance from point \(i\) to another point (and similarly for \(j\)). The rank \(k\) is set by neighbor_rank. If the entries in X represent point clouds, their distance matrices are first computed and then rescaled according to the same formula.
Parameters
metric : string or callable, optional, default: 'euclidean'
    If set to 'precomputed', each entry in X along axis 0 is interpreted to be a distance matrix. Otherwise, entries are interpreted as feature arrays, and metric determines a rule with which to calculate distances between pairs of instances (i.e. rows) in these arrays. If metric is a string, it must be one of the options allowed by scipy.spatial.distance.pdist for its metric parameter, or a metric listed in sklearn.metrics.pairwise.PAIRWISE_DISTANCE_FUNCTIONS, including “euclidean”, “manhattan” or “cosine”. If metric is a callable function, it is called on each pair of instances and the resulting value recorded. The callable should take two arrays from the entry in X as input, and return a value indicating the distance between them.
metric_params : dict, optional, default: {}
    Additional keyword arguments for the metric function.
neighbor_rank : int, optional, default: 1
    Rank of the neighbors used to modify the metric structure according to the “consistent rescaling” procedure.
n_jobs : int or None, optional, default: None
    The number of jobs to use for the computation. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.
References
[1] T. Berry and T. Sauer, “Consistent manifold representation for topological data analysis”; Foundations of data analysis 1, pp. 1–38, 2019; doi: 10.3934/fods.2019001.
Examples
>>> import numpy as np
>>> from giotto.homology import ConsistentRescaling
>>> X = np.array([[[0, 0], [1, 2], [5, 6]]])
>>> cr = ConsistentRescaling()
>>> X_rescaled = cr.fit_transform(X)
>>> print(X_rescaled.shape)
(1, 3, 3)
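The rescaling formula from the class description can be sketched in plain NumPy for a single point cloud. This is only an illustration of the formula under stated assumptions, not the library's implementation: the helper name `consistent_rescaling` is hypothetical, the metric is fixed to Euclidean, and the points are assumed to be distinct so that every nearest-neighbour distance is positive.

```python
import numpy as np


def consistent_rescaling(points, neighbor_rank=1):
    """Rescale the Euclidean distance matrix of one point cloud.

    Hypothetical helper written for illustration; assumes distinct points.
    """
    # Pairwise Euclidean distances D_ij, computed by broadcasting.
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    # Sorting each row puts the zero self-distance in column 0, so column
    # `neighbor_rank` holds the distance to the k-th nearest neighbour.
    kth = np.sort(dist, axis=1)[:, neighbor_rank]
    # Divide D_ij by the geometric mean sqrt(D_{i k_i} * D_{j k_j}).
    return dist / np.sqrt(kth[:, None] * kth[None, :])


points = np.array([[0.0, 0.0], [1.0, 2.0], [5.0, 6.0]])
rescaled = consistent_rescaling(points)
print(rescaled.shape)
```

With `neighbor_rank=1`, the first two points are each other's nearest neighbour, so their rescaled distance comes out as 1.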
Methods
fit(self, X[, y])
    Do nothing and return the estimator unchanged.
fit_transform(self, X[, y])
    Fit to data, then transform it.
get_params(self[, deep])
    Get parameters for this estimator.
set_params(self, **params)
    Set the parameters of this estimator.
transform(self, X[, y])
    For each entry in the input data array X, find the metric structure after consistent rescaling and encode it as a distance matrix.

__init__(self, metric='euclidean', metric_params={}, neighbor_rank=1, n_jobs=None)
Initialize self. See help(type(self)) for accurate signature.

fit(self, X, y=None)
Do nothing and return the estimator unchanged.
This method exists to implement the usual scikit-learn API and hence to work in pipelines.
Parameters
X : ndarray, shape (n_samples, n_points, n_points) or (n_samples, n_points, n_dimensions)
    Input data. If metric == 'precomputed', the input should be an ndarray in which each entry along axis 0 is a distance matrix of shape (n_points, n_points). Otherwise, each such entry will be interpreted as an array of n_points row vectors in n_dimensions-dimensional space.
y : None
    There is no need for a target in a transformer, yet the pipeline API requires this parameter.
Returns
self : object

fit_transform(self, X, y=None, **fit_params)
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
Parameters
X : numpy array of shape [n_samples, n_features]
    Training set.
y : numpy array of shape [n_samples]
    Target values.
Returns
X_new : numpy array of shape [n_samples, n_features_new]
    Transformed array.

get_params(self, deep=True)
Get parameters for this estimator.
Parameters
deep : boolean, optional
    If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns
params : mapping of string to any
    Parameter names mapped to their values.

set_params(self, **params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Returns
self
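The `<component>__<parameter>` naming can be illustrated with a small scikit-learn pipeline. This sketch uses stock scikit-learn estimators (`StandardScaler`, `PCA`) purely as stand-ins, and the step names `scale` and `pca` are arbitrary; the same convention would be expected to apply to a ConsistentRescaling step, since it exposes the same get_params/set_params interface.

```python
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Two named steps; the names become the <component> prefixes.
pipe = Pipeline([("scale", StandardScaler()), ("pca", PCA(n_components=2))])

# Update parameters of the nested estimators through the pipeline.
pipe.set_params(pca__n_components=1, scale__with_mean=False)

# deep=True also lists the parameters of the contained estimators.
params = pipe.get_params(deep=True)
print(params["pca__n_components"], params["scale__with_mean"])
```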

transform(self, X, y=None)
For each entry in the input data array X, find the metric structure after consistent rescaling and encode it as a distance matrix. Then, arrange all results in a single ndarray of appropriate shape.
Parameters
X : ndarray, shape (n_samples, n_points, n_points) or (n_samples, n_points, n_dimensions)
    Input data. If metric == 'precomputed', the input should be an ndarray in which each entry along axis 0 is a distance matrix of shape (n_points, n_points). Otherwise, each such entry will be interpreted as an array of n_points row vectors in n_dimensions-dimensional space.
y : None
    There is no need for a target in a transformer, yet the pipeline API requires this parameter.
Returns
Xt : ndarray, shape (n_samples, n_points, n_points)
    Array containing (as entries along axis 0) the distance matrices after consistent rescaling.
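The two input layouts accepted by transform can be built side by side in NumPy. The sketch below only constructs the arrays and does not call the library; the random data and the variable names `clouds` and `dist_matrices` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Point-cloud layout: (n_samples, n_points, n_dimensions).
clouds = rng.random((2, 4, 3))

# Precomputed layout: one Euclidean distance matrix per sample,
# giving shape (n_samples, n_points, n_points).
diffs = clouds[:, :, None, :] - clouds[:, None, :, :]
dist_matrices = np.sqrt((diffs ** 2).sum(axis=-1))

print(clouds.shape, dist_matrices.shape)  # (2, 4, 3) (2, 4, 4)
```

Going by the parameter description, `clouds` would be passed with the default metric='euclidean', while `dist_matrices` would be passed with metric='precomputed'; in both cases the returned array has shape (n_samples, n_points, n_points).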