giotto.homology.ConsistentRescaling
class giotto.homology.ConsistentRescaling(metric='euclidean', metric_params={}, neighbor_rank=1, n_jobs=None)
Rescaling of distances between pairs of points by the geometric mean of the distances to the respective \(k\)-th nearest neighbours.
Based on ideas in [1]. The computation during transform depends on the nature of the array X. If each entry in X along axis 0 represents a distance matrix \(D\), then the corresponding entry in the transformed array is the distance matrix \(D'_{ij} = D_{ij}/\sqrt{D_{ik_i}D_{jk_j}}\), where \(k_i\) is the index of the \(k\)-th nearest neighbour of point \(i\), i.e. of the \(k\)-th smallest off-diagonal value in row \(i\) (and similarly for \(j\)), with \(k\) given by neighbor_rank. If the entries in X represent point clouds, their distance matrices are first computed and then rescaled according to the same formula.
- Parameters
- metric : string or callable, optional, default: 'euclidean'
If set to 'precomputed', each entry in X along axis 0 is interpreted to be a distance matrix. Otherwise, entries are interpreted as feature arrays, and metric determines a rule with which to calculate distances between pairs of instances (i.e. rows) in these arrays. If metric is a string, it must be one of the options allowed by scipy.spatial.distance.pdist for its metric parameter, or a metric listed in sklearn.metrics.pairwise.PAIRWISE_DISTANCE_FUNCTIONS, including “euclidean”, “manhattan” or “cosine”. If metric is a callable, it is called on each pair of instances and the resulting value is recorded. The callable should take two arrays from the entry in X as input and return a value indicating the distance between them.
- metric_params : dict, optional, default: {}
Additional keyword arguments for the metric function.
- neighbor_rank : int, optional, default: 1
Rank of the neighbors used to modify the metric structure according to the “consistent rescaling” procedure.
- n_jobs : int or None, optional, default: None
The number of jobs to use for the computation. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.
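As an illustration of the callable form of metric, the following dependency-free sketch shows a custom distance function being called on each pair of rows, with metric_params forwarded as keyword arguments. The names weighted_manhattan and pairwise are hypothetical and are not part of giotto; the sketch only mirrors the behaviour described above and is not the library's implementation.

```python
def weighted_manhattan(u, v, weights=(1.0, 1.0)):
    """Hypothetical metric: weighted sum of absolute coordinate differences."""
    return sum(w * abs(a - b) for w, a, b in zip(weights, u, v))


def pairwise(points, metric, metric_params=None):
    """Call `metric` on every pair of rows, forwarding `metric_params` as keyword arguments."""
    metric_params = metric_params or {}
    return [[metric(p, q, **metric_params) for q in points] for p in points]


points = [(0, 0), (1, 2), (5, 6)]
# The second coordinate counts half as much as the first one
dist = pairwise(points, weighted_manhattan, {"weights": (1.0, 0.5)})
for row in dist:
    print(row)
```

Under these assumptions, the same function and dictionary would be the values passed as metric and metric_params.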
References
[1] T. Berry and T. Sauer, “Consistent manifold representation for topological data analysis”; Foundations of Data Science 1, pp. 1–38, 2019; doi: 10.3934/fods.2019001.
Examples
>>> import numpy as np
>>> from giotto.homology import ConsistentRescaling
>>> X = np.array([[[0, 0], [1, 2], [5, 6]]])
>>> cr = ConsistentRescaling()
>>> X_rescaled = cr.fit_transform(X)
>>> print(X_rescaled.shape)
(1, 3, 3)
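The rescaling formula from the class description can also be worked through by hand. The sketch below is a plain-Python illustration on the same three points, written so that it runs without giotto installed; the helper consistent_rescaling is a hypothetical name, and the code assumes distinct points (so that only the diagonal of the distance matrix is zero). It is not the library's implementation.

```python
import math


def consistent_rescaling(dist, neighbor_rank=1):
    """Rescale a square distance matrix as D_ij / sqrt(D_ik_i * D_jk_j)."""
    # Sorting a row in ascending order puts the zero self-distance first,
    # so position `neighbor_rank` holds the distance to the k-th nearest neighbour.
    kth = [sorted(row)[neighbor_rank] for row in dist]
    n = len(dist)
    return [
        [dist[i][j] / math.sqrt(kth[i] * kth[j]) for j in range(n)]
        for i in range(n)
    ]


# Same point cloud as in the example above
points = [(0, 0), (1, 2), (5, 6)]
dist = [[math.dist(p, q) for q in points] for p in points]
rescaled = consistent_rescaling(dist)
for row in rescaled:
    print([round(value, 4) for value in row])
```

The first two points are each other's nearest neighbours, so their rescaled distance is 1; the third point is farther from both, and its distances are divided by a larger geometric mean.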
Methods
fit(self, X[, y]): Do nothing and return the estimator unchanged.
fit_transform(self, X[, y]): Fit to data, then transform it.
get_params(self[, deep]): Get parameters for this estimator.
set_params(self, **params): Set the parameters of this estimator.
transform(self, X[, y]): For each entry in the input data array X, find the metric structure after consistent rescaling and encode it as a distance matrix.
__init__(self, metric='euclidean', metric_params={}, neighbor_rank=1, n_jobs=None)
Initialize self. See help(type(self)) for accurate signature.
fit(self, X, y=None)
Do nothing and return the estimator unchanged.
This method is there to implement the usual scikit-learn API and hence work in pipelines.
- Parameters
- X : ndarray, shape (n_samples, n_points, n_points) or (n_samples, n_points, n_dimensions)
Input data. If metric == 'precomputed', the input should be an ndarray in which each entry along axis 0 is a distance matrix of shape (n_points, n_points). Otherwise, each such entry will be interpreted as an array of n_points row vectors in n_dimensions-dimensional space.
- y : None
There is no need for a target in a transformer, yet the pipeline API requires this parameter.
- Returns
- self : object
fit_transform(self, X, y=None, **fit_params)
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters
- X : numpy array of shape [n_samples, n_features]
Training set.
- y : numpy array of shape [n_samples]
Target values.
- Returns
- X_new : numpy array of shape [n_samples, n_features_new]
Transformed array.
get_params(self, deep=True)
Get parameters for this estimator.
- Parameters
- deep : boolean, optional
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
- params : mapping of string to any
Parameter names mapped to their values.
set_params(self, **params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Returns
- self
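The <component>__<parameter> naming convention can be illustrated with a small stand-in class. The sketch below is not scikit-learn's or giotto's code: Toy is a hypothetical object that only mimics how a double-underscore name is split and routed to a nested component.

```python
class Toy:
    """Hypothetical stand-in for an estimator, storing parameters as attributes."""

    def __init__(self, **params):
        self.__dict__.update(params)

    def set_params(self, **params):
        for key, value in params.items():
            # Split "component__parameter" at the first double underscore
            name, delim, sub_key = key.partition("__")
            if delim:
                # Route the remainder of the name to the nested component
                getattr(self, name).set_params(**{sub_key: value})
            else:
                setattr(self, name, value)
        return self


# A nested object with one component called "rescaling"
pipeline = Toy(rescaling=Toy(neighbor_rank=1, metric="euclidean"))
returned = pipeline.set_params(rescaling__neighbor_rank=3)
print(pipeline.rescaling.neighbor_rank)
```

Returning self, as the method documented above does, allows calls to be chained.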
transform(self, X, y=None)
For each entry in the input data array X, find the metric structure after consistent rescaling and encode it as a distance matrix. Then, arrange all results in a single ndarray of appropriate shape.
- Parameters
- X : ndarray, shape (n_samples, n_points, n_points) or (n_samples, n_points, n_dimensions)
Input data. If metric == 'precomputed', the input should be an ndarray in which each entry along axis 0 is a distance matrix of shape (n_points, n_points). Otherwise, each such entry will be interpreted as an array of n_points row vectors in n_dimensions-dimensional space.
- y : None
There is no need for a target in a transformer, yet the pipeline API requires this parameter.
- Returns
- Xt : ndarray, shape (n_samples, n_points, n_points)
Array containing (as entries along axis 0) the distance matrices after consistent rescaling.
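The shape contract for precomputed input can be sketched in plain Python as follows. Two distance matrices are stacked along the first axis, and each is rescaled independently with a neighbour rank of 2. The helper rescale is a hypothetical name and the code is an illustration of the formula, not the library's implementation. The second matrix is the first with all distances doubled; since the formula divides each distance by a geometric mean of distances, both entries come out the same.

```python
import math


def rescale(dist, neighbor_rank=1):
    """Hypothetical helper: D_ij / sqrt(D_ik_i * D_jk_j) for one distance matrix."""
    # Ascending sort puts the zero self-distance first (assumes distinct points)
    kth = [sorted(row)[neighbor_rank] for row in dist]
    return [
        [d / math.sqrt(kth[i] * kth[j]) for j, d in enumerate(row)]
        for i, row in enumerate(dist)
    ]


# Distances between the points 0, 1, 3 and 7 on a line
line = [[0, 1, 3, 7], [1, 0, 2, 6], [3, 2, 0, 4], [7, 6, 4, 0]]
doubled = [[2 * d for d in row] for row in line]
X = [line, doubled]  # two samples, each of shape (n_points, n_points)

Xt = [rescale(dist, neighbor_rank=2) for dist in X]
print(len(Xt), len(Xt[0]), len(Xt[0][0]))
```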