scipy 1.10.1

leaders(Z, T)

Returns the root nodes in a hierarchical clustering corresponding to a cut defined by a flat cluster assignment vector T. See the fcluster function for more information on the format of T.

For each flat cluster j of the k flat clusters represented in the n-sized flat cluster assignment vector T, this function finds the lowest cluster node i in the linkage tree Z, such that:

  • its leaf descendants belong only to flat cluster j (i.e., T[p]==j for all p in S(i), where S(i) is the set of ids of the leaves descendant of cluster node i), and
  • there does not exist a leaf that is not a descendant of i but also belongs to cluster j (i.e., T[q]!=j for all q not in S(i)). If this condition is violated, T is not a valid cluster assignment vector, and an exception will be thrown.
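The two conditions above can be checked directly. The following sketch (an illustration of the definition, not the SciPy implementation) walks the linkage tree with `to_tree` and tests whether a candidate node i is the leader of flat cluster j:

```python
# A minimal sketch (not the SciPy implementation) that checks the two
# leader conditions for a candidate linkage node, given a linkage matrix
# Z and a flat cluster assignment vector T.
from scipy.cluster.hierarchy import to_tree

def is_leader(Z, T, i, j):
    """Return True if linkage node i is the leader of flat cluster j."""
    root = to_tree(Z)
    n = len(T)

    # Locate node i in the tree.
    def find(node):
        if node is None:
            return None
        if node.get_id() == i:
            return node
        if node.is_leaf():
            return None
        return find(node.get_left()) or find(node.get_right())

    node = find(root)
    S = set(node.pre_order())  # S(i): ids of the leaves descendant of node i
    # Condition 1: every leaf under i belongs to flat cluster j.
    cond1 = all(T[p] == j for p in S)
    # Condition 2: no leaf outside S(i) belongs to flat cluster j.
    cond2 = all(T[q] != j for q in range(n) if q not in S)
    return cond1 and cond2
```

By construction, every (L[k], M[k]) pair returned by `leaders` satisfies this predicate, while the tree root fails condition 2 whenever more than one flat cluster exists.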

Parameters

Z : ndarray

The hierarchical clustering encoded as a matrix. See linkage for more information.

T : ndarray

The flat cluster assignment vector.

Returns

L : ndarray

The leader linkage node ids stored as a k-element 1-D array, where k is the number of flat clusters found in T.

L[j]=i is the linkage cluster node id that is the leader of the flat cluster with id M[j]. If i < n, i corresponds to an original observation, otherwise it corresponds to a non-singleton cluster.

M : ndarray

The flat cluster ids corresponding to the leaders in L, stored as a k-element 1-D array, where k is the number of flat clusters found in T. This allows the set of flat cluster ids to be any arbitrary set of k integers.

For example: if L[3]=2 and M[3]=8, the flat cluster with id 8's leader is linkage node 2.
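Because ``L`` and ``M`` are aligned element by element, a lookup from flat cluster id to leader node id is a one-liner. A small sketch (the data set below is illustrative, not this page's example):

```python
# Build a lookup from flat cluster id to its leader linkage node id,
# using the aligned arrays L and M returned by leaders().
from scipy.cluster.hierarchy import ward, fcluster, leaders
from scipy.spatial.distance import pdist

# Two well-separated groups of three points each.
X = [[0, 0], [0, 1], [1, 0], [4, 4], [3, 4], [4, 3]]
Z = ward(pdist(X))
T = fcluster(Z, 2, criterion='maxclust')
L, M = leaders(Z, T)

# leader_of[cluster_id] -> linkage node id of that cluster's leader.
leader_of = dict(zip(M, L))
```

In the L[3]=2, M[3]=8 reading above, this dict would map key 8 to value 2.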


See Also

fcluster : for the creation of flat cluster assignments.

Examples

>>> from scipy.cluster.hierarchy import ward, fcluster, leaders
>>> from scipy.spatial.distance import pdist

Given a linkage matrix ``Z`` - obtained after applying a clustering method to a dataset ``X`` - and a flat cluster assignment array ``T``:

>>> X = [[0, 0], [0, 1], [1, 0],
...      [0, 4], [0, 3], [1, 4],
...      [4, 0], [3, 0], [4, 1],
...      [4, 4], [3, 4], [4, 3]]
>>> Z = ward(pdist(X))
>>> Z
array([[ 0.        ,  1.        ,  1.        ,  2.        ],
       [ 3.        ,  4.        ,  1.        ,  2.        ],
       [ 6.        ,  7.        ,  1.        ,  2.        ],
       [ 9.        , 10.        ,  1.        ,  2.        ],
       [ 2.        , 12.        ,  1.29099445,  3.        ],
       [ 5.        , 13.        ,  1.29099445,  3.        ],
       [ 8.        , 14.        ,  1.29099445,  3.        ],
       [11.        , 15.        ,  1.29099445,  3.        ],
       [16.        , 17.        ,  5.77350269,  6.        ],
       [18.        , 19.        ,  5.77350269,  6.        ],
       [20.        , 21.        ,  8.16496581, 12.        ]])
>>> T = fcluster(Z, 3, criterion='distance')
>>> T
array([1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4], dtype=int32)

`scipy.cluster.hierarchy.leaders` returns the indices of the nodes in the dendrogram that are the leaders of each flat cluster:

>>> L, M = leaders(Z, T)
>>> L
array([16, 17, 18, 19], dtype=int32)

(remember that indices 0-11 point to the 12 data points in ``X``, whereas indices 12-22 point to the 11 rows of ``Z``)

`scipy.cluster.hierarchy.leaders` also returns the indices of the flat clusters in ``T``:

>>> M
array([1, 2, 3, 4], dtype=int32)
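Given a leader node id, the member observations of its flat cluster can be recovered by walking the linkage tree. A sketch using `to_tree` (a general tree utility, not an API dedicated to this task), on the same data set as above:

```python
# Recover each flat cluster's member observations from its leader node.
from scipy.cluster.hierarchy import ward, fcluster, leaders, to_tree
from scipy.spatial.distance import pdist

X = [[0, 0], [0, 1], [1, 0],
     [0, 4], [0, 3], [1, 4],
     [4, 0], [3, 0], [4, 1],
     [4, 4], [3, 4], [4, 3]]
Z = ward(pdist(X))
T = fcluster(Z, 3, criterion='distance')
L, M = leaders(Z, T)

# rd=True also returns a list of all tree nodes, indexed by node id.
root, nodes = to_tree(Z, rd=True)
for leader, cluster_id in zip(L, M):
    members = sorted(nodes[leader].pre_order())  # leaf ids under the leader
    print(f"flat cluster {cluster_id}: observations {members}")
```

Each leader's leaf set is exactly the set of observations assigned that cluster id in ``T``, per the two conditions stated at the top of this page.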


GitHub : /scipy/cluster/hierarchy.py#4060