_types

This module provides a set of types that can be used as building blocks in the aggregation of a Clustering object.

Cluster parameters

class commonnn._types.ClusterParameters(*args, **kwargs)

Input parameters for clustering procedure

classmethod from_mapping(type cls, parameters: dict, **kwargs)
get_fparam(self, AINDEX i)
get_iparam(self, AINDEX i)
to_dict(self)

Return a Python dictionary of cluster parameter key-value pairs

class commonnn._types.CommonNNParameters
class commonnn._types.RadiusParameters

Cluster labels

class commonnn._types.Labels(labels, consider=None, *, meta=None)

Represents cluster label assignments

classmethod from_length(type cls, n: int, meta=None) Type[Labels]

Construct all-zero labels of length n

classmethod from_sequence(type cls, labels, *, consider=None, meta=None) Type[Labels]

Construct from any sequence (not supporting the buffer protocol)

sort_by_size(self, member_cutoff: Optional[int] = None, max_clusters: Optional[int] = None, bundle=None)

Sort labels by cluster size in-place

Re-assigns cluster numbers so that the biggest cluster (that is not noise) is cluster 1. Also filters out clusters that have fewer than member_cutoff members (default 2). Optionally keeps only the max_clusters largest clusters.

Parameters:
  • member_cutoff – Valid clusters need to have at least this many members.

  • max_clusters – Only keep this many clusters.
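
The relabelling described above can be sketched in plain Python. This is not the library's implementation: the helper name `sort_labels_by_size` is hypothetical, it returns a new list instead of working in-place, it assumes that noise carries the label 0, and it breaks ties between equally sized clusters in favour of the smaller original label.

```python
from collections import Counter


def sort_labels_by_size(labels, member_cutoff=2, max_clusters=None, noise=0):
    """Return relabelled copy: biggest non-noise cluster becomes 1, and so on."""
    # Count members per cluster, ignoring noise (assumed to be label 0).
    sizes = Counter(label for label in labels if label != noise)
    # Rank clusters by decreasing size; ties go to the smaller old label.
    ranked = sorted(sizes, key=lambda label: (-sizes[label], label))
    # Drop clusters with fewer than member_cutoff members.
    kept = [label for label in ranked if sizes[label] >= member_cutoff]
    if max_clusters is not None:
        kept = kept[:max_clusters]
    new_label = {old: new for new, old in enumerate(kept, start=1)}
    # Everything that was filtered out is turned into noise.
    return [new_label.get(label, noise) for label in labels]


sorted_labels = sort_labels_by_size([3, 3, 3, 1, 1, 2, 0])
```

In this sketch, cluster 2 has a single member and therefore ends up as noise under the default member_cutoff of 2.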

to_mapping(self)

Convert labels container to mapping of labels to lists of point indices

to_set(self)

Convert labels container to set of unique labels
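
To illustrate what the two conversions produce, here is a minimal stand-alone sketch that works on a plain list of labels. The function names `labels_to_mapping` and `labels_to_set` are hypothetical stand-ins and not part of the documented API.

```python
from collections import defaultdict


def labels_to_mapping(labels):
    """Map each label to the list of point indices that carry it."""
    mapping = defaultdict(list)
    for index, label in enumerate(labels):
        mapping[label].append(index)
    return dict(mapping)


def labels_to_set(labels):
    """Collect the unique labels."""
    return set(labels)


labels = [1, 1, 0, 2, 1]
mapping = labels_to_mapping(labels)
unique = labels_to_set(labels)
```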

class commonnn._types.ReferenceIndices

Root and parent indices relating child with parent clusterings

Input data

Types used as input data to a clustering have to adhere to the input data interface, which is defined through InputDataExtInterface for Cython extension types. For pure Python types, the input data interface is defined through the abstract base class InputData and the specialised abstract classes listed below.


class commonnn._types.InputDataExtInterface

Defines the input data interface for Cython extension types

compute_distances(self, InputDataExtInterface input_data)
compute_neighbourhoods(self, InputDataExtInterface input_data, AVALUE r, ABOOL is_sorted, ABOOL is_selfcounting)
classmethod get_builder_kwargs(type cls)
get_component(self, point: int, dimension: int) int
get_distance(self, point_a: int, point_b: int) int
get_n_neighbours(self, point: int) int
get_neighbour(self, point: int, member: int) int
class commonnn._types.InputData

Defines the input data interface

abstract property data

Return underlying data (only for user convenience, not to be relied on)

classmethod get_builder_kwargs(cls)
abstract get_subset(self, indices: Container) Type['InputData']

Return input data subset

property meta
abstract property n_points: int

Return total number of points

class commonnn._types.InputDataComponents

Extends the input data interface for point coordinates

abstract get_component(self, point: int, dimension: int) float

Return one component of point coordinates

abstract property n_dim: int

Return total number of dimensions

abstract to_components_array(self) Type[np.ndarray]

Return input data as NumPy array of shape (#points, #components)
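
As a rough illustration of the component interface, the following stand-alone class stores coordinates as nested lists and offers the accessors named above. It is a sketch only: `ListComponents` is a hypothetical name, it does not inherit from the library's abstract classes, and `to_components_array` is left out to keep the example free of NumPy.

```python
class ListComponents:
    """Point coordinates stored as a list of equally long rows."""

    def __init__(self, data):
        self._data = [list(row) for row in data]

    @property
    def n_points(self):
        return len(self._data)

    @property
    def n_dim(self):
        return len(self._data[0]) if self._data else 0

    def get_component(self, point, dimension):
        # One coordinate of one point.
        return self._data[point][dimension]

    def get_subset(self, indices):
        # New container holding only the selected points.
        return type(self)(self._data[i] for i in indices)


points = ListComponents([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
subset = points.get_subset([0, 2])
```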

class commonnn._types.InputDataPairwiseDistances

Extends the input data interface for inter-point distances

abstract get_distance(self, point_a: int, point_b: int) float

Return the pairwise distance between two points

class commonnn._types.InputDataPairwiseDistancesComputer

Extends the distance input data interface for computable distances

abstract compute_distances(self, input_data: Type['InputData']) None

Pre-compute pairwise distances

class commonnn._types.InputDataNeighbourhoods

Extends the input data interface for point neighbourhoods

abstract get_n_neighbours(self, point: int) int

Return number of neighbours for point

abstract get_neighbour(self, point: int, member: int) int

Return a member for point

class commonnn._types.InputDataNeighbourhoodsComputer

Extends the neighbourhood input data interface for computable neighbourhoods

abstract compute_neighbourhoods(self, input_data: Type['InputData'], r: float, is_sorted: bool = False, is_selfcounting: bool = True) None

Pre-compute neighbourhoods at radius

class commonnn._types.InputDataExtComponentsMemoryview(data, meta=None, *)

Implements the input data interface

Stores point components as a 2D Cython typed memoryview.

by_parts(self) Iterator

Yield data by parts

Returns:

Generator of 2D numpy.ndarray (parts)

get_subset(self, indices: Sequence) Type['InputDataExtComponentsMemoryview']
to_components_array(self)
class commonnn._types.InputDataExtDistancesMemoryview(data, meta=None, *)

Implements the input data interface

Stores inter-point distances as a 2D Cython typed memoryview.

get_subset(self, indices: Sequence) Type['InputDataExtComponentsMemoryview']
to_distance_array(self)
class commonnn._types.InputDataExtDistancesLinearMemoryview

Implements the input data interface

Stores inter-point distances as 1D Cython typed memoryview

get_subset(self, indices: Sequence) Type['InputDataExtComponentsMemoryview']
to_distance_array(self)
class commonnn._types.InputDataExtNeighbourhoodsMemoryview

Implements the input data interface

Neighbours of points stored using a Cython memoryview.

get_subset(self, indices: Sequence) Type['InputDataExtNeighbourhoodsMemoryview']
to_n_neighbours_array(self)
to_neighbourhoods_array(self)
class commonnn._types.InputDataExtNeighbourhoodsVector

Implements the input data interface

Neighbours of points are stored using a C++ std::vector of vectors.

get_n_neighbours(self, point: int) int
get_neighbour(self, point: int, member: int) int
get_subset(self, indices: Sequence) Type['InputDataExtNeighbourhoodsVector']

Return input data subset

to_n_neighbours_array(self)
to_neighbourhoods_array(self)
class commonnn._types.InputDataNeighbourhoodsSequence(data: Sequence, *, meta=None)

Implements the input data interface

Neighbours of points stored as a sequence.

property data
get_n_neighbours(self, point: int) int
get_neighbour(self, point: int, member: int) int
get_subset(self, indices: Container) Type['InputDataNeighbourhoodsSequence']
property n_neighbours
property n_points
to_n_neighbours_array(self)
to_neighbourhoods_array(self)
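
The two accessors `get_n_neighbours` and `get_neighbour` are enough to walk through the neighbourhood of a point. The sketch below shows this with a stand-alone class; `SequenceNeighbourhoods` is a hypothetical name and not the class documented above.

```python
class SequenceNeighbourhoods:
    """Neighbour indices of each point stored as a list of lists."""

    def __init__(self, data):
        self._data = [list(neighbours) for neighbours in data]
        self._n_neighbours = [len(neighbours) for neighbours in self._data]

    @property
    def n_points(self):
        return len(self._data)

    def get_n_neighbours(self, point):
        return self._n_neighbours[point]

    def get_neighbour(self, point, member):
        # member is the position within the neighbour list of point.
        return self._data[point][member]


neighbourhoods = SequenceNeighbourhoods([[0, 1], [0, 1, 2], [1, 2]])
# Collect all neighbours of point 1 through the interface methods only.
neighbours_of_1 = [
    neighbourhoods.get_neighbour(1, member)
    for member in range(neighbourhoods.get_n_neighbours(1))
]
```
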
class commonnn._types.InputDataSklearnKDTree(data: Type[np.ndarray], *, meta=None, **kwargs)

Implements the input data interface

Components stored as a NumPy array. Neighbour queries delegated to pre-built KDTree.

build_tree(self, **kwargs)
clear_cached(self)
compute_neighbourhoods(self, input_data: Type['InputData'], radius: float, is_sorted: bool = False, is_selfcounting: bool = True)
property data
get_component(self, point: int, dimension: int) float
get_n_neighbours(self, point: int) int
get_neighbour(self, point: int, member: int) int

Return a member for point

get_subset(self, indices: Container) Type['InputDataSklearnKDTree']

Return input data subset

property n_dim
property n_neighbours
property n_points
to_components_array(self)
to_n_neighbours_array(self)

Neighbour containers

class commonnn._types.NeighboursExtInterface
assign(self, member: int)
contains(self, member: int)
enough(self, member_cutoff: int)
classmethod get_builder_kwargs(type cls)
get_member(self, index: int)
reset(self)
class commonnn._types.Neighbours

Defines the neighbours interface

abstract assign(self, member: int) None

Add a member to this container

abstract contains(self, member: int) bool

Return True if member is in neighbours container

abstract enough(self, member_cutoff: int) bool

Return True if there are enough points

classmethod get_builder_kwargs(cls)
abstract get_member(self, index: int) int

Return indexable neighbours container

abstract property n_points: int

Return total number of points

abstract reset(self) None

Reset/empty this container

abstract to_neighbours_array(self)

Return point indices as NumPy array

class commonnn._types.NeighboursExtVector

Implements the neighbours interface

Uses an underlying C++ std::vector.

Keyword Arguments:
  • neighbours – A sequence of labels suitable to be cast to a vector.

  • initial_size – Number of elements reserved for the size of vector.

to_neighbours_array(self)
class commonnn._types.NeighboursExtSet

Implements the neighbours interface

Uses an underlying C++ std::set.

Keyword Arguments:

neighbours – A sequence of labels suitable to be cast to a C++ set.

to_neighbours_array(self)
class commonnn._types.NeighboursExtUnorderedSet

Implements the neighbours interface

Uses an underlying C++ std::unordered_set.

Keyword Arguments:

neighbours – A sequence of labels suitable to be cast to a C++ set.

to_neighbours_array(self)
class commonnn._types.NeighboursExtVectorUnorderedSet

Implements the neighbours interface

Uses a combination of an underlying C++ std::vector and a std::unordered_set.

Keyword Arguments:

neighbours – A sequence of labels suitable to be cast to a C++ vector.

to_neighbours_array(self)
class commonnn._types.NeighboursList(neighbours=None)

Implements the neighbours interface

assign(self, member: int)
contains(self, member: int) bool
enough(self, member_cutoff: int) bool
get_member(self, index: int) int
property n_points
property neighbours
reset(self)
to_neighbours_array(self)
class commonnn._types.NeighboursSet(neighbours=None)

Implements the neighbours interface

assign(self, member: int)
contains(self, member: int) bool
enough(self, member_cutoff: int)
get_member(self, index: int) int
property n_points
property neighbours
reset(self)
to_neighbours_array(self)
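
A set-backed neighbours container could look roughly like the following sketch. `SetNeighbours` is a hypothetical stand-alone class, not the library's NeighboursSet. In particular, the comparison used in `enough` (at least member_cutoff points) is an assumption, since the exact criterion is not stated here, and `get_member` is omitted because a plain set is not indexable.

```python
class SetNeighbours:
    """Neighbour indices kept in a Python set."""

    def __init__(self, neighbours=None):
        self._neighbours = set(neighbours) if neighbours is not None else set()

    @property
    def n_points(self):
        return len(self._neighbours)

    def assign(self, member):
        # Adding a member twice has no effect in a set.
        self._neighbours.add(member)

    def contains(self, member):
        return member in self._neighbours

    def enough(self, member_cutoff):
        # Assumption: "enough" means at least member_cutoff points.
        return self.n_points >= member_cutoff

    def reset(self):
        self._neighbours.clear()


container = SetNeighbours([1, 2])
container.assign(5)
container.assign(5)
```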

Neighbours getter

class commonnn._types.NeighboursGetterExtInterface
get(self, AINDEX index, InputDataExtInterface input_data, NeighboursExtInterface neighbours, ClusterParameters cluster_params)
classmethod get_builder_kwargs(type cls)
get_other(self, AINDEX index, InputDataExtInterface input_data, InputDataExtInterface other_input_data, NeighboursExtInterface neighbours, ClusterParameters cluster_params)
is_selfcounting

Type: bool

is_sorted

Type: bool

class commonnn._types.NeighboursGetter

Defines the neighbours getter interface

abstract get(self, index: int, input_data: Type['InputData'], neighbours: Type['Neighbours'], cluster_params: Type['ClusterParameters']) None

Collect neighbours for point in input data

classmethod get_builder_kwargs(cls)
get_other(self, index: int, input_data: Type['InputData'], other_input_data: Type['InputData'], neighbours: Type['Neighbours'], cluster_params: Type['ClusterParameters']) None

Collect neighbours in input data for point in other input data

abstract property is_selfcounting: bool

Return True if points count as their own neighbour

abstract property is_sorted: bool

Return True if neighbour indices are sorted

class commonnn._types.NeighboursGetterExtBruteForce(distance_getter: Type['DistanceGetterExtInterface'])

Implements the neighbours getter interface

This getter retrieves the neighbours of a point by comparing the distances (from a distance getter) between the point and all other points to the radius cutoff (\(r_{ij} \leq r\)). The resulting neighbour containers are in general not sorted and include points as their own neighbour (self counting).

Parameters:

distance_getter – An object implementing the distance getter interface. Has to be a Cython extension type.

classmethod get_builder_kwargs(type cls)
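
The brute-force criterion \(r_{ij} \leq r\) can be sketched in a few lines of plain Python. The function `brute_force_neighbours` is a hypothetical helper that assumes Euclidean distances between coordinate tuples; the actual getter delegates distance calculation to its distance getter.

```python
import math


def brute_force_neighbours(points, index, radius):
    """Return all indices j whose distance to points[index] is <= radius."""
    centre = points[index]
    return [
        j for j, other in enumerate(points)
        if math.dist(centre, other) <= radius
    ]


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
neighbours = brute_force_neighbours(points, 0, 2.0)
```

Point 0 appears in its own neighbour list because its distance to itself is 0, which is the self-counting behaviour described above. The result of this sketch happens to be sorted only because the points are visited in index order.
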
class commonnn._types.NeighboursGetterExtLookup

Implements the neighbours getter interface

class commonnn._types.NeighboursGetterBruteForce(distance_getter: Type['DistanceGetter'])

Implements the neighbours getter interface

get(self, index: int, input_data: Type['InputData'], neighbours: Type['Neighbours'], cluster_params: Type['ClusterParameters'])
classmethod get_builder_kwargs(cls)
get_other(self, index: int, input_data: Type['InputData'], other_input_data: Type['InputData'], neighbours: Type['Neighbours'], cluster_params: Type['ClusterParameters'])
property is_selfcounting: bool
property is_sorted: bool
class commonnn._types.NeighboursGetterLookup(is_sorted=False, is_selfcounting=False)

Implements the neighbours getter interface

get(self, index: int, input_data: Type['InputData'], neighbours: Type['Neighbours'], cluster_params: Type['ClusterParameters']) None
get_other(self, index: int, input_data: Type['InputData'], other_input_data: Type['InputData'], neighbours: Type['Neighbours'], cluster_params: Type['ClusterParameters'])
property is_selfcounting: bool
property is_sorted: bool
class commonnn._types.NeighboursGetterRecomputeLookup(is_sorted=False, is_selfcounting=True)

Implements the neighbours getter interface

get(self, index: int, input_data: Type['InputData'], neighbours: Type['Neighbours'], cluster_params: Type['ClusterParameters']) None
get_other(self, index: int, input_data: Type['InputData'], other_input_data: Type['InputData'], neighbours: Type['Neighbours'], cluster_params: Type['ClusterParameters'])
property is_selfcounting: bool
property is_sorted: bool

Distance getter

class commonnn._types.DistanceGetterExtInterface
classmethod get_builder_kwargs(type cls)
get_single(self, AINDEX point_a, AINDEX point_b, InputDataExtInterface input_data)
get_single_other(self, AINDEX point_a, AINDEX point_b, InputDataExtInterface input_data, InputDataExtInterface other_input_data)
class commonnn._types.DistanceGetter

Defines the distance getter interface

classmethod get_builder_kwargs(cls)
abstract get_single(self, point_a: int, point_b: int, input_data: Type['InputData']) float

Get distance between two points in input data

abstract get_single_other(self, point_a: int, point_b: int, input_data: Type['InputData'], other_input_data: Type['InputData']) float

Get distance between two points in input data and other input data

class commonnn._types.DistanceGetterExtMetric

Implements the distance getter interface

classmethod get_builder_kwargs(type cls)
get_single(self, AINDEX point_a, AINDEX point_b, InputDataExtInterface input_data)
get_single_other(self, AINDEX point_a, AINDEX point_b, InputDataExtInterface input_data, InputDataExtInterface other_input_data)
class commonnn._types.DistanceGetterExtLookup

Implements the distance getter interface

class commonnn._types.DistanceGetterMetric(metric: Type['Metric'])

Implements the distance getter interface

classmethod get_builder_kwargs(cls)
get_single(self, point_a: int, point_b: int, input_data: Type['InputData'])
get_single_other(self, point_a: int, point_b: int, input_data: Type['InputData'], other_input_data: Type['InputData'])
class commonnn._types.DistanceGetterLookup

Implements the distance getter interface

get_single(self, AINDEX point_a, AINDEX point_b, InputDataExtInterface input_data)
get_single_other(self, AINDEX point_a, AINDEX point_b, InputDataExtInterface input_data, InputDataExtInterface other_input_data)

Metrics

class commonnn._types.MetricExtInterface

Defines the metric interface for extension types

adjust_radius(self, AVALUE radius_cutoff) float
calc_distance(self, AINDEX index_a, AINDEX index_b, InputDataExtInterface input_data) float
calc_distance_other(self, AINDEX index_a, AINDEX index_b, InputDataExtInterface input_data, InputDataExtInterface other_input_data) float
classmethod get_builder_kwargs(type cls)
class commonnn._types.Metric

Defines the metric interface

abstract calc_distance(self, index_a: int, index_b: int, input_data: Type['InputData']) float

Return distance between two points in input data

abstract calc_distance_other(self, index_a: int, index_b: int, input_data: Type['InputData'], other_input_data: Type['InputData']) float

Return distance between two points in input data and other input data

class commonnn._types.MetricExtDummy

Implements the metric interface

class commonnn._types.MetricExtPrecomputed

Implements the metric interface

class commonnn._types.MetricExtEuclidean

Implements the metric interface

class commonnn._types.MetricExtEuclideanReduced

Implements the metric interface

class commonnn._types.MetricExtEuclideanPeriodicReduced

Implements the metric interface

class commonnn._types.MetricDummy

Implements the metric interface

adjust_radius(self, radius_cutoff: float) float
calc_distance(self, index_a: int, index_b: int, input_data: Type['InputData']) float
calc_distance_other(self, index_a: int, index_b: int, input_data: Type['InputData'], other_input_data: Type['InputData']) float
class commonnn._types.MetricEuclidean

Implements the metric interface

adjust_radius(self, radius_cutoff: float) float
calc_distance(self, index_a: int, index_b: int, input_data: Type['InputData']) float
calc_distance_other(self, index_a: int, index_b: int, input_data: Type['InputData'], other_input_data: Type['InputData']) float
class commonnn._types.MetricEuclideanReduced

Implements the metric interface

adjust_radius(self, radius_cutoff: float) float
calc_distance(self, index_a: int, index_b: int, input_data: Type['InputData']) float
calc_distance_other(self, index_a: int, index_b: int, input_data: Type['InputData'], other_input_data: Type['InputData']) float
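
The entries above do not spell out what "reduced" means or what `adjust_radius` returns. One plausible reading, taken purely as an assumption in this sketch, is that the reduced metric skips the square root and that the radius cutoff is squared to compensate. The function names below are hypothetical.

```python
import math


def euclidean(a, b):
    """Ordinary Euclidean distance between two coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def euclidean_reduced(a, b):
    """Assumed 'reduced' variant: squared distance, no square root."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def adjust_radius_reduced(radius_cutoff):
    """Assumed adjustment: compare squared distances to a squared radius."""
    return radius_cutoff ** 2


a, b = (0.0, 0.0), (3.0, 4.0)
within_plain = euclidean(a, b) <= 5.0
within_reduced = euclidean_reduced(a, b) <= adjust_radius_reduced(5.0)
```

Under this assumption both comparisons give the same answer, while the reduced variant avoids one square root per distance.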

Similarity checker

class commonnn._types.SimilarityCheckerExtInterface

Defines the similarity checker interface for extension types

check(self, NeighboursExtInterface neighbours_a, NeighboursExtInterface neighbours_b, ClusterParameters cluster_params)
get(self, NeighboursExtInterface neighbours_a, NeighboursExtInterface neighbours_b, ClusterParameters cluster_params)
classmethod get_builder_kwargs(type cls)
class commonnn._types.SimilarityChecker

Defines the similarity checker interface

abstract check(self, neighbours_a: Type['Neighbours'], neighbours_b: Type['Neighbours'], cluster_params: Type['ClusterParameters']) bool

Return True if a and b have sufficiently many common neighbours

abstract get(self, neighbours_a: Type['Neighbours'], neighbours_b: Type['Neighbours'], cluster_params: Type['ClusterParameters']) int

Return number of common neighbours

classmethod get_builder_kwargs(cls)
class commonnn._types.SimilarityCheckerExtContains

Implements the similarity checker interface

Strategy:

Loops over the members of one neighbours container and checks if they are contained in the other neighbours container. Breaks early when the similarity criterion is reached. The performance and time complexity of the check depend on the neighbours containers used. Worst-case time complexity is \(\mathcal{O}(n \cdot m)\), with \(n\) and \(m\) being the lengths of the neighbours containers, if the containment check is performed by iteration. It is \(\mathcal{O}(n)\) if the containment check can be performed as a constant-time lookup, so that the whole check runs in linear time. Note that the neighbours containers are not switched to ensure that the first container is the shorter one (compare commonnn._types.SimilarityCheckerExtSwitchContains).
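
The strategy can be sketched in plain Python as follows. This is an illustration, not the extension type's code: `check_contains` is a hypothetical function, and the threshold `c` (the required number of common neighbours) as well as the comparison `common >= c` are assumptions about the similarity criterion.

```python
def check_contains(neighbours_a, neighbours_b, c):
    """Return True as soon as c members of a are also found in b."""
    if c == 0:
        return True
    common = 0
    for member in neighbours_a:
        # With a list this lookup scans b; with a set it is a hash lookup.
        if member in neighbours_b:
            common += 1
            if common >= c:
                return True  # break early
    return False


shares_two = check_contains([1, 2, 3, 4], {2, 4, 6}, 2)
shares_three = check_contains([1, 2, 3, 4], {2, 4, 6}, 3)
```

Passing a list as the second container corresponds to the containment check by iteration, passing a set to the lookup case.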

class commonnn._types.SimilarityCheckerExtSwitchContains

Implements the similarity checker interface

Strategy:

Loops over the members of one neighbours container and checks if they are contained in the other neighbours container. Breaks early when the similarity criterion is reached. The performance and time complexity of the check depend on the neighbours containers used. Worst-case time complexity is \(\mathcal{O}(n \cdot m)\), with \(n\) and \(m\) being the lengths of the neighbours containers, if the containment check is performed by iteration. It is \(\mathcal{O}(n)\) if the containment check can be performed as a constant-time lookup, so that the whole check runs in linear time. Note that the neighbours containers are switched if necessary to ensure that the first container is the shorter one (compare commonnn._types.SimilarityCheckerExtContains).
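
The switching step can be illustrated with a small counting helper. `count_common_switch` is a hypothetical function that counts all common members without breaking early; it only shows how swapping makes the loop run over the shorter container.

```python
def count_common_switch(neighbours_a, neighbours_b):
    """Count common members, iterating over the shorter container."""
    if len(neighbours_a) > len(neighbours_b):
        # Swap so that the loop below runs over the shorter container.
        neighbours_a, neighbours_b = neighbours_b, neighbours_a
    return sum(1 for member in neighbours_a if member in neighbours_b)


common = count_common_switch({1, 2, 3, 4, 5}, {2, 5})
```

Here the loop visits two members instead of five, and the result does not depend on the order of the arguments.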

class commonnn._types.SimilarityCheckerExtScreensorted

Implements the similarity checker interface

Strategy:

Loops over the members of two neighbours containers alternately and checks if neighbours are contained in both containers. Requires that the containers are sorted in ascending order to return the correct result. Sorting is neither checked nor enforced. Breaks early when the similarity criterion is reached. The performance of the check depends on the neighbours containers used. Worst-case time complexity is \(\mathcal{O}(n + m)\), with \(n\) and \(m\) being the lengths of the neighbours containers.
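
A common way to realise such a screening of two sorted sequences is a two-pointer walk, sketched below. `check_screensorted` is a hypothetical function; the threshold `c` and the comparison `common >= c` are assumptions about the similarity criterion, and the inputs are assumed to be indexable and sorted in ascending order.

```python
def check_screensorted(neighbours_a, neighbours_b, c):
    """Two-pointer walk over two ascending sequences, stopping at c matches."""
    if c == 0:
        return True
    i = j = common = 0
    while i < len(neighbours_a) and j < len(neighbours_b):
        a, b = neighbours_a[i], neighbours_b[j]
        if a == b:
            common += 1
            if common >= c:
                return True  # break early
            i += 1
            j += 1
        elif a < b:
            i += 1  # a cannot occur later in b, advance in a
        else:
            j += 1  # b cannot occur later in a, advance in b
    return False


shares_two = check_screensorted([1, 3, 5, 7], [2, 3, 4, 7, 9], 2)
shares_three = check_screensorted([1, 3, 5, 7], [2, 3, 4, 7, 9], 3)
```

Each step advances at least one of the two positions, which is where the \(\mathcal{O}(n + m)\) bound comes from.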

class commonnn._types.SimilarityCheckerContains

Implements the similarity checker interface

Strategy:

Loops over the members of one neighbours container and checks if they are contained in the other neighbours container. Breaks early when the similarity criterion is reached. The performance and time complexity of the check depend on the neighbours containers used. Worst-case time complexity is \(\mathcal{O}(n \cdot m)\), with \(n\) and \(m\) being the lengths of the neighbours containers, if the containment check is performed by iteration. It is \(\mathcal{O}(n)\) if the containment check can be performed as a constant-time lookup, so that the whole check runs in linear time. Note that the neighbours containers are not switched to ensure that the first container is the shorter one (compare commonnn._types.SimilarityCheckerSwitchContains).

check(self, neighbours_a: Type['Neighbours'], neighbours_b: Type['Neighbours'], cluster_params: Type['ClusterParameters']) bool
get(self, neighbours_a: Type['Neighbours'], neighbours_b: Type['Neighbours'], cluster_params: Type['ClusterParameters']) int

Return number of common neighbours

class commonnn._types.SimilarityCheckerSwitchContains

Implements the similarity checker interface

Strategy:

Loops over the members of one neighbours container and checks if they are contained in the other neighbours container. Breaks early when the similarity criterion is reached. The performance and time complexity of the check depend on the neighbours containers used. Worst-case time complexity is \(\mathcal{O}(n \cdot m)\), with \(n\) and \(m\) being the lengths of the neighbours containers, if the containment check is performed by iteration. It is \(\mathcal{O}(n)\) if the containment check can be performed as a constant-time lookup, so that the whole check runs in linear time. Note that the neighbours containers are switched if necessary to ensure that the first container is the shorter one (compare commonnn._types.SimilarityCheckerContains).

check(self, neighbours_a: Type['Neighbours'], neighbours_b: Type['Neighbours'], cluster_params: Type['ClusterParameters']) bool
get(self, neighbours_a: Type['Neighbours'], neighbours_b: Type['Neighbours'], cluster_params: Type['ClusterParameters']) int

Return number of common neighbours


Queues

Queues can optionally be used by a fitter.


class commonnn._types.QueueExtInterface
classmethod get_builder_kwargs(type cls)
is_empty(self) bool
pop(self) int
push(self, value: int)
size(self) int

class commonnn._types.PriorityQueueExtInterface
classmethod get_builder_kwargs(type cls)
is_empty(self) bool
pop(self)
push(self, a: int, b: int, weight: float)
reset(self) None
size(self) int

class commonnn._types.Queue

Defines the queue interface

classmethod get_builder_kwargs(cls)
abstract is_empty(self) bool

Return True if there are no values in the queue

abstract pop(self)

Retrieve value from the queue

abstract push(self, value)

Put value into the queue

abstract size(self) int

Get number of items in the queue


class commonnn._types.PriorityQueue

Defines the priority queue interface

classmethod get_builder_kwargs(cls)
abstract is_empty(self) bool

Return True if there are no values in the queue

abstract pop(self)

Retrieve values from the queue

abstract push(self, a, b, weight) None

Put values into the queue

abstract reset(self) None

Reset the queue

abstract size(self) int

Get number of items in the queue


class commonnn._types.QueueExtLIFOVector

Implements the queue interface


class commonnn._types.QueueExtFIFOQueue

Implements the queue interface


class commonnn._types.QueueFIFODeque

Implements the queue interface

is_empty(self) bool
pop(self)
push(self, value)

Append value to back/right end

size(self) int
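
A first-in-first-out queue on top of `collections.deque` can be sketched as follows. `FIFOQueue` is a hypothetical stand-alone class; that values are popped from the front/left end is an assumption derived from the FIFO name, while pushing to the back/right end follows the description above.

```python
from collections import deque


class FIFOQueue:
    """First-in-first-out queue backed by collections.deque."""

    def __init__(self):
        self._queue = deque()

    def push(self, value):
        # Append value to the back/right end.
        self._queue.append(value)

    def pop(self):
        # Assumption: retrieve from the front/left end (FIFO order).
        return self._queue.popleft()

    def is_empty(self):
        return not self._queue

    def size(self):
        return len(self._queue)


queue = FIFOQueue()
for value in (10, 20, 30):
    queue.push(value)
first = queue.pop()
```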

class commonnn._types.PriorityQueueMaxHeap

Implements the priority queue interface

is_empty(self) bool

Return True if there are no values in the queue

pop(self)

Retrieve values from the queue

push(self, a, b, weight) None

Put values into the queue

reset(self) None
size(self)
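
A max-heap priority queue with this method set can be sketched with the standard `heapq` module, which provides a min-heap, by storing negated weights. `MaxHeapPriorityQueue` is a hypothetical stand-alone class; that `pop` returns the tuple `(a, b, weight)` with the largest weight is an assumption, since the return value is not documented here.

```python
import heapq


class MaxHeapPriorityQueue:
    """Priority queue returning the entry with the largest weight first."""

    def __init__(self):
        self._heap = []

    def push(self, a, b, weight):
        # heapq is a min-heap, so the weight is negated on the way in.
        heapq.heappush(self._heap, (-weight, a, b))

    def pop(self):
        # Assumption: return (a, b, weight) of the heaviest entry.
        negated_weight, a, b = heapq.heappop(self._heap)
        return a, b, -negated_weight

    def reset(self):
        self._heap.clear()

    def is_empty(self):
        return not self._heap

    def size(self):
        return len(self._heap)


queue = MaxHeapPriorityQueue()
queue.push(0, 1, 0.5)
queue.push(1, 2, 2.0)
queue.push(2, 3, 1.0)
heaviest = queue.pop()
```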