ncutdc
PURPOSE
Minimum Normalised Cut Divisive Clustering
SYNOPSIS
function [idx,t] = ncutdc(X, K, varargin)
DESCRIPTION
[IDX,T] = NCUTDC(X, K, VARARGIN)
[IDX, T] = NCUTDC(X, K) produces a divisive hierarchical clustering of the
N-by-D data matrix X into K clusters. This algorithm uses a hierarchy of
binary partitions each splitting the observations with the hyperplane that
minimises the normalised cut criterion. The algorithm can return fewer
clusters if no valid hyperplane separators are found.
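To make the criterion concrete, the following is a naive sketch of how the normalised cut of a single hyperplane split could be evaluated. It is not the code used inside ncutdc or ncutpp. It assumes Gaussian similarities between the projected observations, and the values of v, b and sigma are hypothetical choices made only for this illustration.

```matlab
% Naive O(N^2) evaluation of the normalised cut of one hyperplane split.
% Assumption: similarities are Gaussian in the projected (1-D) distances.
rng(1);                                  % reproducible synthetic data
X = [randn(50,2); randn(50,2) + 5];      % two separated groups in 2-D
v = [1; 1] / sqrt(2);                    % unit projection vector (hypothetical)
b = 2.5 * sqrt(2);                       % split point along v, midway between the group means
sigma = 1;                               % kernel scale (hypothetical value)

p = X * v;                               % N-by-1 projections onto v
D2 = bsxfun(@minus, p, p').^2;           % squared pairwise distances of the projections
W = exp(-D2 / (2 * sigma^2));            % pairwise similarities (self-similarities included)
A = p < b;                               % observations on one side of the hyperplane

cutAB = sum(sum(W(A, ~A)));              % total similarity across the split
volA  = sum(sum(W(A, :)));               % total similarity attached to side A
volB  = sum(sum(W(~A, :)));              % total similarity attached to side B
ncutval = cutAB / volA + cutAB / volB;   % normalised cut of this split
```

A split through a low-density region gives a small ncutval, whereas a hyperplane that passes through the middle of a group gives a larger one.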
[IDX, T] = NCUTDC(X, K) returns the cluster assignment IDX and the binary
tree T containing the cluster hierarchy.
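A minimal usage sketch for the two-argument form follows. It assumes the toolbox providing ncutdc is on the MATLAB path; the call is guarded so that the snippet also runs where it is not. The synthetic data and the variable nFound are illustrative only.

```matlab
rng(1);                                  % reproducible synthetic data
X = [randn(100,2); randn(100,2) + 6];    % N = 200 observations, D = 2
K = 2;                                   % requested number of clusters

if exist('ncutdc', 'file') == 2          % run only when ncutdc is on the path
    [idx, t] = ncutdc(X, K);             % idx: cluster assignment, t: cluster hierarchy
    nFound = numel(unique(idx));         % can be smaller than K if no valid separator is found
    fprintf('%d cluster(s) found\n', nFound);
end
```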
[IDX, T] = NCUTDC(X, K, 'PARAM1',val1, 'PARAM2',val2, ...) specifies optional parameters
in the form of Name,Value pairs.
'v0' - Function handle. v0(X) returns a D-by-S matrix of initial projection vectors
(default: v0 = @(X)(pca(X,'NumComponents',1)) -- 1st principal component)
'sigma' - Function handle. sigma(X,pars) returns the scaling parameter (sigma) as a function of the data matrix
(default: sigma = 100*sqrt(l)*N^(-0.2), where l = max(eig(cov(X))))
'split_index' - Criterion determining which cluster to split
Function Handle: index = split_index(v, X, pars)
(v: projection vector, X:data matrix, pars: parameters structure)
Cluster with MAXIMUM INDEX is split at each step of the algorithm
Two standard choices of split index can be enabled by setting 'split_index' to
one of the strings below:
+ 'fval': Split cluster whose hyperplane achieves the lowest normalised cut value
+ 'size': Split largest cluster
(default: split_index = 'fval')
'minsize' - Minimum cluster size (integer)
(default: minsize = 1)
'maxit' - Number of BFGS iterations to perform for each value of alpha (default: 50)
'ftol' - Stopping criterion for change in objective function value over consecutive iterations
(default: 1.e-7)
'verb' - Verbosity. Values greater than 0 enable visualisation during execution.
Enabling this option slows down the algorithm considerably.
(default: 0)
'labels' - True cluster labels. Specifying these enables the computation of performance over
successive iterations and a better visualisation of how clusters are split.
'colours' - Matrix containing colour specification for observations in different clusters
Number of rows must be equal to the number of true clusters (if 'labels' has been specified) or equal to 2.
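The Name,Value options listed above can be combined in a single call. The sketch below is illustrative only. my_sigma and my_split are hypothetical handles: the first mirrors the documented default formula for sigma, and the second presumably behaves like the 'size' option. The value given for 'minsize' is arbitrary, and the call to ncutdc is guarded so that the snippet also runs where the toolbox is not installed.

```matlab
rng(2);                                                  % reproducible synthetic data
X = [randn(150,3); randn(150,3) + 4; randn(150,3) - 4];  % N = 450, D = 3

% Hypothetical scaling handle reproducing the documented default formula.
% pars is accepted to match the sigma(X,pars) signature but is not used.
my_sigma = @(X, pars) 100 * sqrt(max(eig(cov(X)))) * size(X, 1)^(-0.2);

% Hypothetical split index. The cluster with the MAXIMUM index is split,
% so returning the number of observations favours the largest cluster.
my_split = @(v, X, pars) size(X, 1);

s = my_sigma(X, []);                     % scaling parameter for the full data set

if exist('ncutdc', 'file') == 2          % run only when ncutdc is on the path
    [idx, t] = ncutdc(X, 3, 'sigma', my_sigma, 'split_index', my_split, ...
                      'minsize', 10, 'maxit', 50, 'ftol', 1e-7);
end
```

Passing handles rather than fixed numbers lets the scaling parameter and the split index be recomputed for each cluster as the hierarchy grows.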
Reference:
D.P. Hofmeyr. Clustering by minimum cut hyperplanes. IEEE Transactions on Pattern
Analysis and Machine Intelligence, 39(8):1547–1560, 2017.
CROSS-REFERENCE INFORMATION
This function calls:
- ifelse Shorthand for ternary operator: if-then-else
- myparser Function used to parse optional arguments in form of Name,Value pairs for a number of OPC algorithms
- palette Determines colours used for visualisation
- pcacomp Returns the principal components of (X) specified in vector (index)
- tree2clusters Assigns cluster labels from a cluster hierarchy (ctree object)
- ctree Class implementing cluster hierarchy in tree data structure
- ncut_sigma Default scaling parameter employed by Gaussian kernel in minimum normalised cut projection pursuit
- ncutpp Minimum normalised cut projection pursuit
Generated on Tue 17-Jul-2018 18:58:09 by m2html © 2005