_types¶
This module provides a set of types that can be used as building blocks
in the aggregation of a Clustering
object.
Cluster parameters¶
- class commonnn._types.ClusterParameters(*args, **kwargs)¶
Input parameters for clustering procedure
- classmethod from_mapping(type cls, parameters: dict, **kwargs)¶
- get_fparam(self, AINDEX i)¶
- get_iparam(self, AINDEX i)¶
- to_dict(self)¶
Return a Python dictionary of cluster parameter key-value pairs
- class commonnn._types.CommonNNParameters¶
- class commonnn._types.RadiusParameters¶
Cluster labels¶
- class commonnn._types.Labels(labels, consider=None, *, meta=None)¶
Represents cluster label assignments
- classmethod from_length(type cls, n: int, meta=None) Type[Labels] ¶
Construct all-zero labels of length n
- classmethod from_sequence(type cls, labels, *, consider=None, meta=None) Type[Labels] ¶
Construct from any sequence (not supporting the buffer protocol)
- sort_by_size(self, member_cutoff: Optional[int] = None, max_clusters: Optional[int] = None, bundle=None)¶
Sort labels by cluster size in-place
Re-assigns cluster numbers so that the biggest cluster (that is not noise) is cluster 1. Also filters out clusters that do not have at least
member_cutoff
members (default 2). Optionally, keeps only the
max_clusters
largest clusters.
- Parameters:
member_cutoff – Valid clusters need to have at least this many members.
max_clusters – Only keep this many clusters.
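The relabelling described above can be sketched in pure Python. This is an illustration only, not the library's implementation: the function name `sort_labels_by_size` is hypothetical, it returns a new list instead of working in-place, and it assumes that noise is labelled 0 and that filtered-out clusters are mapped to noise.

```python
from collections import Counter

NOISE = 0  # assumption: label 0 marks noise points


def sort_labels_by_size(labels, member_cutoff=2, max_clusters=None):
    """Return a relabelled copy in which the biggest cluster is cluster 1."""
    # Count members per cluster, ignoring noise
    counts = Counter(label for label in labels if label != NOISE)
    # Keep clusters with enough members, largest first;
    # ties are broken by the original label to stay deterministic
    ranked = sorted(
        (label for label, size in counts.items() if size >= member_cutoff),
        key=lambda label: (-counts[label], label),
    )
    if max_clusters is not None:
        ranked = ranked[:max_clusters]
    new_label = {old: new for new, old in enumerate(ranked, start=1)}
    # Assumption: clusters that were filtered out become noise
    return [new_label.get(label, NOISE) for label in labels]


labels = [3, 3, 0, 7, 7, 7, 5, 3, 7]
print(sort_labels_by_size(labels))                  # [2, 2, 0, 1, 1, 1, 0, 2, 1]
print(sort_labels_by_size(labels, max_clusters=1))  # [0, 0, 0, 1, 1, 1, 0, 0, 1]
```

Cluster 7 has four members and becomes cluster 1, cluster 3 has three members and becomes cluster 2, and the single-member cluster 5 falls below the default cutoff.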
- to_mapping(self)¶
Convert labels container to
mapping
of labels to lists of point indices
- to_set(self)¶
Convert labels container to
set
of unique labels
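The two conversions can be pictured with plain Python containers. The helper names below are hypothetical and operate on an ordinary list of labels rather than on a Labels object; they only illustrate the shape of the results described above.

```python
from collections import defaultdict


def labels_to_mapping(labels):
    """Map each label to the list of point indices carrying that label."""
    mapping = defaultdict(list)
    for index, label in enumerate(labels):
        mapping[label].append(index)
    return dict(mapping)


def labels_to_set(labels):
    """Collect the unique labels."""
    return set(labels)


labels = [1, 1, 0, 2, 1]
print(labels_to_mapping(labels))  # {1: [0, 1, 4], 0: [2], 2: [3]}
print(labels_to_set(labels))      # {0, 1, 2}
```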
- class commonnn._types.ReferenceIndices¶
Root and parent indices relating child with parent clusterings
Input data¶
Types used as input data to a clustering have to adhere to the input
data interface which is defined through
InputDataExtInterface
for Cython extension
types. For pure Python types the input data interface is defined through
the abstract base class InputData
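and the specialised abstract classes InputDataComponents, InputDataPairwiseDistances, InputDataPairwiseDistancesComputer, InputDataNeighbourhoods, and InputDataNeighbourhoodsComputer.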
- class commonnn._types.InputDataExtInterface¶
Defines the input data interface for Cython extension types
- compute_distances(self, InputDataExtInterface input_data)¶
- compute_neighbourhoods(self, InputDataExtInterface input_data, AVALUE r, ABOOL is_sorted, ABOOL is_selfcounting)¶
- classmethod get_builder_kwargs(type cls)¶
- get_component(self, point: int, dimension: int) int ¶
- get_distance(self, point_a: int, point_b: int) int ¶
- get_n_neighbours(self, point: int) int ¶
- get_neighbour(self, point: int, member: int) int ¶
- class commonnn._types.InputData¶
Defines the input data interface
- abstract property data¶
Return underlying data (only for user convenience, not to be relied on)
- classmethod get_builder_kwargs(cls)¶
- abstract get_subset(self, indices: Container) Type['InputData'] ¶
Return input data subset
- property meta¶
- abstract property n_points: int¶
Return total number of points
- class commonnn._types.InputDataComponents¶
Extends the input data interface for point coordinates
- abstract get_component(self, point: int, dimension: int) float ¶
Return one component of point coordinates
- abstract property n_dim: int¶
Return total number of dimensions
- abstract to_components_array(self) Type[np.ndarray] ¶
Return input data as NumPy array of shape (#points, #components)
- class commonnn._types.InputDataPairwiseDistances¶
Extends the input data interface for inter-point distances
- abstract get_distance(self, point_a: int, point_b: int) float ¶
Return the pairwise distance between two points
- class commonnn._types.InputDataPairwiseDistancesComputer¶
Extends the distance input data interface for computable distances
- abstract compute_distances(self, input_data: Type['InputData']) None ¶
Pre-compute pairwise distances
- class commonnn._types.InputDataNeighbourhoods¶
Extends the input data interface for point neighbourhoods
- abstract get_n_neighbours(self, point: int) int ¶
Return number of neighbours for point
- abstract get_neighbour(self, point: int, member: int) int ¶
Return a member for point
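A minimal sketch of what a neighbourhoods-style input data type might look like in pure Python. The class `SketchNeighbourhoods` is hypothetical and does not inherit from the library's abstract classes; it also assumes that `member` is a position within the neighbourhood of `point`, which the source does not spell out.

```python
class SketchNeighbourhoods:
    """Hypothetical stand-in storing one neighbour list per point."""

    def __init__(self, neighbourhoods):
        # Copy into plain lists so that positions can be indexed
        self._data = [list(neighbours) for neighbours in neighbourhoods]

    @property
    def n_points(self):
        """Total number of points"""
        return len(self._data)

    def get_n_neighbours(self, point):
        """Number of neighbours stored for point"""
        return len(self._data[point])

    def get_neighbour(self, point, member):
        """Neighbour at position member in the neighbourhood of point"""
        return self._data[point][member]


data = SketchNeighbourhoods([[0, 1], [0, 1, 2], [1, 2]])
print(data.n_points)             # 3
print(data.get_n_neighbours(1))  # 3
print(data.get_neighbour(1, 2))  # 2
```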
- class commonnn._types.InputDataNeighbourhoodsComputer¶
Extends the neighbourhood input data interface for computable neighbourhoods
- abstract compute_neighbourhoods(self, input_data: Type['InputData'], r: float, is_sorted: bool = False, is_selfcounting: bool = True) None ¶
Pre-compute neighbourhoods at radius
- class commonnn._types.InputDataExtComponentsMemoryview(data, meta=None, *)¶
Implements the input data interface
Stores point components as a 2D Cython typed memoryview.
- by_parts(self) Iterator ¶
Yield data by parts
- Returns:
Generator of 2D
numpy.ndarray
(parts)
- get_subset(self, indices: Sequence) Type['InputDataExtComponentsMemoryview'] ¶
- to_components_array(self)¶
- class commonnn._types.InputDataExtDistancesMemoryview(data, meta=None, *)¶
Implements the input data interface
Stores inter-point distances as a 2D Cython typed memoryview.
- get_subset(self, indices: Sequence) Type['InputDataExtComponentsMemoryview'] ¶
- to_distance_array(self)¶
- class commonnn._types.InputDataExtDistancesLinearMemoryview¶
Implements the input data interface
Stores inter-point distances as 1D Cython typed memoryview
- get_subset(self, indices: Sequence) Type['InputDataExtComponentsMemoryview'] ¶
- to_distance_array(self)¶
- class commonnn._types.InputDataExtNeighbourhoodsMemoryview¶
Implements the input data interface
Neighbours of points stored using a Cython memoryview.
- get_subset(self, indices: Sequence) Type['InputDataExtNeighbourhoodsMemoryview'] ¶
- to_n_neighbours_array(self)¶
- to_neighbourhoods_array(self)¶
- class commonnn._types.InputDataExtNeighbourhoodsVector¶
Implements the input data interface
Neighbours of points are stored using a C++ std::vector of vectors.
- get_n_neighbours(self, point: int) int ¶
- get_neighbour(self, point: int, member: int) int ¶
- get_subset(self, indices: Sequence) Type['InputDataExtNeighbourhoodsVector'] ¶
Return input data subset
- to_n_neighbours_array(self)¶
- to_neighbourhoods_array(self)¶
- class commonnn._types.InputDataNeighbourhoodsSequence(data: Sequence, *, meta=None)¶
Implements the input data interface
Neighbours of points stored as a sequence.
- property data¶
- get_n_neighbours(self, point: int) int ¶
- get_neighbour(self, point: int, member: int) int ¶
- get_subset(self, indices: Container) Type['InputDataNeighbourhoodsSequence'] ¶
- property n_neighbours¶
- property n_points¶
- to_n_neighbours_array(self)¶
- to_neighbourhoods_array(self)¶
- class commonnn._types.InputDataSklearnKDTree(data: Type[np.ndarray], *, meta=None, **kwargs)¶
Implements the input data interface
Components stored as a NumPy array. Neighbour queries delegated to a pre-built KDTree.
- build_tree(self, **kwargs)¶
- clear_cached(self)¶
- compute_neighbourhoods(self, input_data: Type['InputData'], radius: float, is_sorted: bool = False, is_selfcounting: bool = True)¶
- property data¶
- get_component(self, point: int, dimension: int) float ¶
- get_n_neighbours(self, point: int) int ¶
- get_neighbour(self, point: int, member: int) int ¶
Return a member for point
- get_subset(self, indices: Container) Type['InputDataSklearnKDTree'] ¶
Return input data subset
- property n_dim¶
- property n_neighbours¶
- property n_points¶
- to_components_array(self)¶
- to_n_neighbours_array(self)¶
Neighbour containers¶
- class commonnn._types.NeighboursExtInterface¶
- assign(self, member: int)¶
- contains(self, member: int)¶
- enough(self, member_cutoff: int)¶
- classmethod get_builder_kwargs(type cls)¶
- get_member(self, index: int)¶
- reset(self)¶
- class commonnn._types.Neighbours¶
Defines the neighbours interface
- abstract assign(self, member: int) None ¶
Add a member to this container
- abstract contains(self, member: int) bool ¶
Return True if member is in neighbours container
- abstract enough(self, member_cutoff: int) bool ¶
Return True if there are enough points
- classmethod get_builder_kwargs(cls)¶
- abstract get_member(self, index: int) int ¶
Return indexable neighbours container
- abstract property n_points: int¶
Return total number of points
- abstract reset(self) None ¶
Reset/empty this container
- abstract to_neighbours_array(self)¶
Return point indices as NumPy array
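The interface above can be illustrated with a small list-backed container. `SketchNeighbours` is a hypothetical class written for this example and is not the library's NeighboursList. Whether `enough` uses a strict comparison is not stated in the source; this sketch assumes `>`. The `to_neighbours_array` method is left out to keep the example free of NumPy.

```python
class SketchNeighbours:
    """Hypothetical list-backed neighbours container."""

    def __init__(self, neighbours=None):
        self._neighbours = list(neighbours) if neighbours is not None else []

    @property
    def n_points(self):
        """Total number of points in the container"""
        return len(self._neighbours)

    def assign(self, member):
        """Add a member to this container"""
        self._neighbours.append(member)

    def contains(self, member):
        """Containment check by iteration over the list"""
        return member in self._neighbours

    def enough(self, member_cutoff):
        """Assumption: 'enough' means strictly more than member_cutoff"""
        return len(self._neighbours) > member_cutoff

    def get_member(self, index):
        """Return the member stored at position index"""
        return self._neighbours[index]

    def reset(self):
        """Empty this container"""
        self._neighbours.clear()


neighbours = SketchNeighbours([4, 2])
neighbours.assign(7)
print(neighbours.n_points, neighbours.contains(2), neighbours.get_member(2))  # 3 True 7
```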
- class commonnn._types.NeighboursExtVector¶
Implements the neighbours interface
Uses an underlying C++ std::vector.
- Keyword Arguments:
neighbours – A sequence of labels suitable to be cast to a vector.
initial_size – Number of elements reserved for the size of vector.
- to_neighbours_array(self)¶
- class commonnn._types.NeighboursExtSet¶
Implements the neighbours interface
Uses an underlying C++ std::set.
- Keyword Arguments:
neighbours – A sequence of labels suitable to be cast to a C++ set.
- to_neighbours_array(self)¶
- class commonnn._types.NeighboursExtUnorderedSet¶
Implements the neighbours interface
Uses an underlying C++ std::unordered_set.
- Keyword Arguments:
neighbours – A sequence of labels suitable to be cast to a C++ set.
- to_neighbours_array(self)¶
- class commonnn._types.NeighboursExtVectorUnorderedSet¶
Implements the neighbours interface
Uses a combination of an underlying C++ std::vector and a std::unordered_set.
- Keyword Arguments:
neighbours – A sequence of labels suitable to be cast to a C++ vector.
- to_neighbours_array(self)¶
- class commonnn._types.NeighboursList(neighbours=None)¶
Implements the neighbours interface
- assign(self, member: int)¶
- contains(self, member: int) bool ¶
- enough(self, member_cutoff: int) bool ¶
- get_member(self, index: int) int ¶
- property n_points¶
- property neighbours¶
- reset(self)¶
- to_neighbours_array(self)¶
- class commonnn._types.NeighboursSet(neighbours=None)¶
Implements the neighbours interface
- assign(self, member: int)¶
- contains(self, member: int) bool ¶
- enough(self, member_cutoff: int)¶
- get_member(self, index: int) int ¶
- property n_points¶
- property neighbours¶
- reset(self)¶
- to_neighbours_array(self)¶
Neighbours getter¶
- class commonnn._types.NeighboursGetterExtInterface¶
- get(self, AINDEX index, InputDataExtInterface input_data, NeighboursExtInterface neighbours, ClusterParameters cluster_params)¶
- classmethod get_builder_kwargs(type cls)¶
- get_other(self, AINDEX index, InputDataExtInterface input_data, InputDataExtInterface other_input_data, NeighboursExtInterface neighbours, ClusterParameters cluster_params)¶
- is_selfcounting¶
- Type:
bool
- is_sorted¶
- Type:
bool
- class commonnn._types.NeighboursGetter¶
Defines the neighbours-getter interface
- abstract get(self, index: int, input_data: Type['InputData'], neighbours: Type['Neighbours'], cluster_params: Type['ClusterParameters']) None ¶
Collect neighbours for point in input data
- classmethod get_builder_kwargs(cls)¶
- get_other(self, index: int, input_data: Type['InputData'], other_input_data: Type['InputData'], neighbours: Type['Neighbours'], cluster_params: Type['ClusterParameters']) None ¶
Collect neighbours in input data for point in other input data
- abstract property is_selfcounting: bool¶
Return True if points count as their own neighbour
- abstract property is_sorted: bool¶
Return True if neighbour indices are sorted
- class commonnn._types.NeighboursGetterExtBruteForce(distance_getter: Type['DistanceGetterExtInterface'])¶
Implements the neighbours getter interface
This getter retrieves the neighbours of a point by comparing the distances (from a distance getter) between the point and all other points to the radius cutoff (\(r_{ij} \leq r\)). The resulting neighbour containers are in general not sorted and include points as their own neighbour (self counting).
- Parameters:
distance_getter – An object implementing the distance getter interface. Has to be a Cython extension type.
- classmethod get_builder_kwargs(type cls)¶
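The brute-force strategy described above, comparing the distance to every point against the radius cutoff, can be sketched as follows. The function `brute_force_neighbours` is hypothetical; it works on a plain list of coordinate tuples and uses the Euclidean distance from the standard library in place of a distance getter.

```python
import math


def brute_force_neighbours(index, points, r):
    """Collect all indices j with distance(points[index], points[j]) <= r."""
    neighbours = []
    for j, other in enumerate(points):
        # The point itself has distance 0 and is therefore included
        # (self counting)
        if math.dist(points[index], other) <= r:
            neighbours.append(j)
    return neighbours


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
print(brute_force_neighbours(0, points, r=1.5))  # [0, 1]
```

Each query visits every point once, so collecting the neighbourhoods of all points this way costs a number of distance evaluations that grows quadratically with the number of points.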
- class commonnn._types.NeighboursGetterExtLookup¶
Implements the neighbours getter interface
- class commonnn._types.NeighboursGetterBruteForce(distance_getter: Type['DistanceGetter'])¶
Implements the neighbours getter interface
- get(self, index: int, input_data: Type['InputData'], neighbours: Type['Neighbours'], cluster_params: Type['ClusterParameters'])¶
- classmethod get_builder_kwargs(cls)¶
- get_other(self, index: int, input_data: Type['InputData'], other_input_data: Type['InputData'], neighbours: Type['Neighbours'], cluster_params: Type['ClusterParameters'])¶
- property is_selfcounting: bool¶
- property is_sorted: bool¶
- class commonnn._types.NeighboursGetterLookup(is_sorted=False, is_selfcounting=False)¶
Implements the neighbours getter interface
- get(self, index: int, input_data: Type['InputData'], neighbours: Type['Neighbours'], cluster_params: Type['ClusterParameters']) None ¶
- get_other(self, index: int, input_data: Type['InputData'], other_input_data: Type['InputData'], neighbours: Type['Neighbours'], cluster_params: Type['ClusterParameters'])¶
- property is_selfcounting: bool¶
- property is_sorted: bool¶
- class commonnn._types.NeighboursGetterRecomputeLookup(is_sorted=False, is_selfcounting=True)¶
Implements the neighbours getter interface
- get(self, index: int, input_data: Type['InputData'], neighbours: Type['Neighbours'], cluster_params: Type['ClusterParameters']) None ¶
- get_other(self, index: int, input_data: Type['InputData'], other_input_data: Type['InputData'], neighbours: Type['Neighbours'], cluster_params: Type['ClusterParameters'])¶
- property is_selfcounting: bool¶
- property is_sorted: bool¶
Distance getter¶
- class commonnn._types.DistanceGetterExtInterface¶
- classmethod get_builder_kwargs(type cls)¶
- get_single(self, AINDEX point_a, AINDEX point_b, InputDataExtInterface input_data)¶
- get_single_other(self, AINDEX point_a, AINDEX point_b, InputDataExtInterface input_data, InputDataExtInterface other_input_data)¶
- class commonnn._types.DistanceGetter¶
Defines the distance getter interface
- classmethod get_builder_kwargs(cls)¶
- abstract get_single(self, point_a: int, point_b: int, input_data: Type['InputData']) float ¶
Get distance between two points in input data
- abstract get_single_other(self, point_a: int, point_b: int, input_data: Type['InputData'], other_input_data: Type['InputData']) float ¶
Get distance between two points in input data and other input data
- class commonnn._types.DistanceGetterExtMetric¶
Implements the distance getter interface
- classmethod get_builder_kwargs(type cls)¶
- get_single(self, AINDEX point_a, AINDEX point_b, InputDataExtInterface input_data)¶
- get_single_other(self, AINDEX point_a, AINDEX point_b, InputDataExtInterface input_data, InputDataExtInterface other_input_data)¶
- class commonnn._types.DistanceGetterExtLookup¶
Implements the distance getter interface
- class commonnn._types.DistanceGetterMetric(metric: Type['Metric'])¶
Implements the distance getter interface
- classmethod get_builder_kwargs(cls)¶
- get_single(self, point_a: int, point_b: int, input_data: Type['InputData'])¶
- get_single_other(self, point_a: int, point_b: int, input_data: Type['InputData'], other_input_data: Type['InputData'])¶
Metrics¶
- class commonnn._types.MetricExtInterface¶
Defines the metric interface for extension types
- adjust_radius(self, AVALUE radius_cutoff) float ¶
- calc_distance(self, AINDEX index_a, AINDEX index_b, InputDataExtInterface input_data) float ¶
- calc_distance_other(self, AINDEX index_a, AINDEX index_b, InputDataExtInterface input_data, InputDataExtInterface other_input_data) float ¶
- classmethod get_builder_kwargs(type cls)¶
- class commonnn._types.Metric¶
Defines the metric interface
- abstract calc_distance(self, index_a: int, index_b: int, input_data: Type['InputData']) float ¶
Return distance between two points in input data
- abstract calc_distance_other(self, index_a: int, index_b: int, input_data: Type['InputData'], other_input_data: Type['InputData']) float ¶
Return distance between two points in input data and other input data
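A minimal sketch of a metric that follows this interface, built on component access. Both classes are hypothetical stand-ins that do not inherit from the library's abstract classes. For `calc_distance_other`, the sketch assumes that `index_a` refers to `input_data` and `index_b` to `other_input_data`; the source does not say which index belongs to which data set.

```python
import math


class SketchComponents:
    """Hypothetical input data exposing point coordinates."""

    def __init__(self, rows):
        self._rows = [[float(x) for x in row] for row in rows]

    @property
    def n_dim(self):
        return len(self._rows[0]) if self._rows else 0

    def get_component(self, point, dimension):
        return self._rows[point][dimension]


class SketchMetricEuclidean:
    """Hypothetical Euclidean metric using only get_component."""

    def calc_distance(self, index_a, index_b, input_data):
        return self.calc_distance_other(index_a, index_b, input_data, input_data)

    def calc_distance_other(self, index_a, index_b, input_data, other_input_data):
        # Assumption: index_a indexes input_data, index_b indexes other_input_data
        total = 0.0
        for d in range(input_data.n_dim):
            diff = (
                input_data.get_component(index_a, d)
                - other_input_data.get_component(index_b, d)
            )
            total += diff * diff
        return math.sqrt(total)


data = SketchComponents([[0, 0], [3, 4]])
other = SketchComponents([[0, 0], [6, 8]])
metric = SketchMetricEuclidean()
print(metric.calc_distance(0, 1, data))               # 5.0
print(metric.calc_distance_other(0, 1, data, other))  # 10.0
```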
- class commonnn._types.MetricExtDummy¶
Implements the metric interface
- class commonnn._types.MetricExtPrecomputed¶
Implements the metric interface
- class commonnn._types.MetricExtEuclidean¶
Implements the metric interface
- class commonnn._types.MetricExtEuclideanReduced¶
Implements the metric interface
- class commonnn._types.MetricExtEuclideanPeriodicReduced¶
Implements the metric interface
- class commonnn._types.MetricDummy¶
Implements the metric interface
- adjust_radius(self, radius_cutoff: float) float ¶
- calc_distance(self, index_a: int, index_b: int, input_data: Type['InputData']) float ¶
- calc_distance_other(self, index_a: int, index_b: int, input_data: Type['InputData'], other_input_data: Type['InputData']) float ¶
- class commonnn._types.MetricEuclidean¶
Implements the metric interface
- adjust_radius(self, radius_cutoff: float) float ¶
- calc_distance(self, index_a: int, index_b: int, input_data: Type['InputData']) float ¶
- calc_distance_other(self, index_a: int, index_b: int, input_data: Type['InputData'], other_input_data: Type['InputData']) float ¶
- class commonnn._types.MetricEuclideanReduced¶
Implements the metric interface
- adjust_radius(self, radius_cutoff: float) float ¶
- calc_distance(self, index_a: int, index_b: int, input_data: Type['InputData']) float ¶
- calc_distance_other(self, index_a: int, index_b: int, input_data: Type['InputData'], other_input_data: Type['InputData']) float ¶
Similarity checker¶
- class commonnn._types.SimilarityCheckerExtInterface¶
Defines the similarity checker interface for extension types
- check(self, NeighboursExtInterface neighbours_a, NeighboursExtInterface neighbours_b, ClusterParameters cluster_params)¶
- get(self, NeighboursExtInterface neighbours_a, NeighboursExtInterface neighbours_b, ClusterParameters cluster_params)¶
- classmethod get_builder_kwargs(type cls)¶
- class commonnn._types.SimilarityChecker¶
Defines the similarity checker interface
- abstract check(self, neighbours_a: Type['Neighbours'], neighbours_b: Type['Neighbours'], cluster_params: Type['ClusterParameters']) bool ¶
Return True if a and b have sufficiently many common neighbours
- abstract get(self, neighbours_a: Type['Neighbours'], neighbours_b: Type['Neighbours'], cluster_params: Type['ClusterParameters']) int ¶
Return number of common neighbours
- classmethod get_builder_kwargs(cls)¶
- class commonnn._types.SimilarityCheckerExtContains¶
Implements the similarity checker interface
- Strategy:
Loops over members of one neighbours container and checks if they are contained in the other neighbours container. Breaks early when the similarity criterion is reached. The performance and time complexity of the check depend on the neighbour containers used. Worst-case time complexity is \(\mathcal{O}(n \cdot m)\), with \(n\) and \(m\) being the lengths of the neighbours containers, if the containment check is performed by iteration. Worst-case time complexity is \(\mathcal{O}(n)\) if the containment check can be performed as a lookup in constant time. Note that no switching of the neighbours containers is done to ensure that the first container is the one with the shorter length (compare
commonnn._types.SimilarityCheckerExtSwitchContains
).
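The containment strategy can be sketched in a few lines of Python. The function name and the `cutoff` parameter are hypothetical, and the sketch assumes that the similarity criterion is met once the number of common neighbours reaches `cutoff`.

```python
def check_contains(neighbours_a, neighbours_b, cutoff):
    """Return True once cutoff common neighbours have been found."""
    if cutoff <= 0:
        return True
    common = 0
    for member in neighbours_a:
        # Cost of this test depends on the container type of neighbours_b:
        # iteration for a list, hash lookup for a set
        if member in neighbours_b:
            common += 1
            if common >= cutoff:
                return True  # break early
    return False


print(check_contains([1, 2, 3, 4], {2, 4, 6}, cutoff=2))  # True
print(check_contains([1, 2, 3, 4], {2, 4, 6}, cutoff=3))  # False
```

Passing a list as the second container makes every `in` test a linear scan, while a set makes it a hash lookup, which is the difference between the two worst cases named above.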
- class commonnn._types.SimilarityCheckerExtSwitchContains¶
Implements the similarity checker interface
- Strategy:
Loops over members of one neighbours container and checks if they are contained in the other neighbours container. Breaks early when the similarity criterion is reached. The performance and time complexity of the check depend on the neighbour containers used. Worst-case time complexity is \(\mathcal{O}(n \cdot m)\), with \(n\) and \(m\) being the lengths of the neighbours containers, if the containment check is performed by iteration. Worst-case time complexity is \(\mathcal{O}(n)\) if the containment check can be performed as a lookup in constant time. Note that the neighbours containers are switched to ensure that the first container is the one with the shorter length (compare
SimilarityCheckerExtContains
).
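A sketch of the switching variant under the same assumptions as before (hypothetical function name, hypothetical `cutoff` parameter, criterion met when the number of common neighbours reaches `cutoff`). The only difference is the swap at the top, which makes the loop run over the shorter container.

```python
def check_switch_contains(neighbours_a, neighbours_b, cutoff):
    """Like a plain containment check, but iterate over the shorter container."""
    if cutoff <= 0:
        return True
    # Switch so that neighbours_a is the shorter of the two
    if len(neighbours_b) < len(neighbours_a):
        neighbours_a, neighbours_b = neighbours_b, neighbours_a
    common = 0
    for member in neighbours_a:
        if member in neighbours_b:
            common += 1
            if common >= cutoff:
                return True  # break early
    return False


long_one = set(range(1000))
short_one = {5, 10, 2000}
print(check_switch_contains(long_one, short_one, cutoff=2))  # True
print(check_switch_contains(long_one, short_one, cutoff=3))  # False
```

With hash-based containers the loop then runs at most as many times as the shorter container has members, here three times instead of up to a thousand.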
- class commonnn._types.SimilarityCheckerExtScreensorted¶
Implements the similarity checker interface
- Strategy:
Loops over members of two neighbour containers alternately and checks if neighbours are contained in both containers. Requires that the containers are sorted in ascending order to return the correct result. Sorting is neither checked nor enforced. Breaks early when the similarity criterion is reached. The performance of the check depends on the neighbour containers used. Worst-case time complexity is \(\mathcal{O}(n + m)\), with \(n\) and \(m\) being the lengths of the neighbours containers.
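The screening of two sorted containers corresponds to a classic two-pointer walk. The sketch below uses a hypothetical function name and `cutoff` parameter and assumes the criterion is met when the number of common neighbours reaches `cutoff`. As in the description, the inputs must already be sorted in ascending order; the sketch does not verify this.

```python
def check_screensorted(neighbours_a, neighbours_b, cutoff):
    """Count common members of two ascending sequences with two pointers."""
    if cutoff <= 0:
        return True
    i = j = common = 0
    while i < len(neighbours_a) and j < len(neighbours_b):
        a, b = neighbours_a[i], neighbours_b[j]
        if a == b:
            common += 1
            if common >= cutoff:
                return True  # break early
            i += 1
            j += 1
        elif a < b:
            i += 1  # advance the pointer behind the smaller value
        else:
            j += 1
    return False


print(check_screensorted([1, 3, 5, 7], [3, 4, 5, 8], cutoff=2))  # True
print(check_screensorted([1, 3, 5, 7], [3, 4, 5, 8], cutoff=3))  # False
```

Every iteration advances at least one pointer, which is where the \(\mathcal{O}(n + m)\) bound comes from.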
- class commonnn._types.SimilarityCheckerContains¶
Implements the similarity checker interface
- Strategy:
Loops over members of one neighbours container and checks if they are contained in the other neighbours container. Breaks early when the similarity criterion is reached. The performance and time complexity of the check depend on the neighbour containers used. Worst-case time complexity is \(\mathcal{O}(n \cdot m)\), with \(n\) and \(m\) being the lengths of the neighbours containers, if the containment check is performed by iteration. Worst-case time complexity is \(\mathcal{O}(n)\) if the containment check can be performed as a lookup in constant time. Note that no switching of the neighbours containers is done to ensure that the first container is the one with the shorter length (compare
commonnn._types.SimilarityCheckerSwitchContains
).
- check(self, neighbours_a: Type['Neighbours'], neighbours_b: Type['Neighbours'], cluster_params: Type['ClusterParameters']) bool ¶
- get(self, neighbours_a: Type['Neighbours'], neighbours_b: Type['Neighbours'], cluster_params: Type['ClusterParameters']) int ¶
Return number of common neighbours
- class commonnn._types.SimilarityCheckerSwitchContains¶
Implements the similarity checker interface
- Strategy:
Loops over members of one neighbours container and checks if they are contained in the other neighbours container. Breaks early when the similarity criterion is reached. The performance and time complexity of the check depend on the neighbour containers used. Worst-case time complexity is \(\mathcal{O}(n \cdot m)\), with \(n\) and \(m\) being the lengths of the neighbours containers, if the containment check is performed by iteration. Worst-case time complexity is \(\mathcal{O}(n)\) if the containment check can be performed as a lookup in constant time. Note that the neighbours containers are switched to ensure that the first container is the one with the shorter length (compare
commonnn._types.SimilarityCheckerContains
).
- check(self, neighbours_a: Type['Neighbours'], neighbours_b: Type['Neighbours'], cluster_params: Type['ClusterParameters']) bool ¶
- get(self, neighbours_a: Type['Neighbours'], neighbours_b: Type['Neighbours'], cluster_params: Type['ClusterParameters']) int ¶
Return number of common neighbours
Queues¶
Queues can optionally be used by a fitter.
- class commonnn._types.QueueExtInterface¶
- classmethod get_builder_kwargs(type cls)¶
- is_empty(self) bool ¶
- pop(self) int ¶
- push(self, value: int)¶
- size(self) int ¶
- class commonnn._types.PriorityQueueExtInterface¶
- classmethod get_builder_kwargs(type cls)¶
- is_empty(self) bool ¶
- pop(self)¶
- push(self, a: int, b: int, weight: float)¶
- reset(self) None ¶
- size(self) int ¶
- class commonnn._types.Queue¶
Defines the queue interface
- classmethod get_builder_kwargs(cls)¶
- abstract is_empty(self) bool ¶
Return True if there are no values in the queue
- abstract pop(self)¶
Retrieve value from the queue
- abstract push(self, value)¶
Put value into the queue
- abstract size(self) int ¶
Get number of items in the queue
- class commonnn._types.PriorityQueue¶
Defines the priority queue interface
- classmethod get_builder_kwargs(cls)¶
- abstract is_empty(self) bool ¶
Return True if there are no values in the queue
- abstract pop(self)¶
Retrieve values from the queue
- abstract push(self, a, b, weight) None ¶
Put values into the queue
- abstract reset(self) None ¶
Reset the queue
- abstract size(self) int ¶
Get number of items in the queue
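One way to satisfy the priority queue interface in pure Python is a binary heap from the standard library. The class below is a hypothetical sketch, not a type shipped with the library. It assumes that the entry with the largest weight is popped first and that `pop` returns the triple `(a, b, weight)`; the source states neither the ordering nor the return shape.

```python
import heapq


class SketchPriorityQueue:
    """Hypothetical heap-backed priority queue."""

    def __init__(self):
        self._heap = []

    def push(self, a, b, weight):
        # heapq is a min-heap, so the weight is negated to pop the
        # largest weight first (an assumption about the ordering)
        heapq.heappush(self._heap, (-weight, a, b))

    def pop(self):
        neg_weight, a, b = heapq.heappop(self._heap)
        return a, b, -neg_weight

    def is_empty(self):
        return not self._heap

    def size(self):
        return len(self._heap)

    def reset(self):
        self._heap.clear()


queue = SketchPriorityQueue()
queue.push(0, 1, 0.5)
queue.push(1, 2, 2.0)
queue.push(2, 3, 1.0)
print(queue.pop())   # (1, 2, 2.0)
print(queue.size())  # 2
```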
- class commonnn._types.QueueExtLIFOVector¶
Implements the queue interface
- class commonnn._types.QueueExtFIFOQueue¶
Implements the queue interface
- class commonnn._types.QueueFIFODeque¶
Implements the queue interface
- is_empty(self) bool ¶
- pop(self)¶
- push(self, value)¶
Append value to back/right end
- size(self) int ¶
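A first-in, first-out queue with this interface can be sketched on top of `collections.deque`. The class `SketchQueueFIFO` is hypothetical. Values are appended at the right end, as the docstring above describes, and the sketch assumes that `pop` removes from the left end so that values leave in insertion order.

```python
from collections import deque


class SketchQueueFIFO:
    """Hypothetical deque-backed FIFO queue."""

    def __init__(self):
        self._queue = deque()

    def push(self, value):
        """Append value to the back/right end"""
        self._queue.append(value)

    def pop(self):
        """Assumption: retrieve the oldest value from the front/left end"""
        return self._queue.popleft()

    def is_empty(self):
        return not self._queue

    def size(self):
        return len(self._queue)


queue = SketchQueueFIFO()
for value in (3, 1, 2):
    queue.push(value)
print(queue.pop(), queue.pop(), queue.size())  # 3 1 1
```

A deque supports appending and popping at both ends in constant time, which a plain list does not offer for removal at the front.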