
How to use a custom distance in evalclusters?

Daniel on 28 Jul 2015
Answered: Taro Ichimura on 10 Jun 2016
For evalclusters, the help section says under distance: "You can also specify a function for the distance metric by using the function_handle (@) operator. The distance function must be of the form d2 = distfun(XI,XJ), where XI is a 1-by-n vector corresponding to a single row of the input matrix X, and XJ is an m2-by-n matrix corresponding to multiple rows of X. distfun must return an m2-by-1 vector of distances d2, whose kth element is the distance between XI and XJ(k,:)."
I have my own distance function (ANewDistanceFunction) and want evalclusters to use it. When I try
GapKmeansCustomDistance = evalclusters(allClustersN,'kmeans','gap','Klist', [1:10], 'Distance', @ANewDistanceFunction);
MATLAB gives me this error: "Warning: Clustering reference data into 7 clusters using function 'getKmeansFunc/nested' generated the following error: Invalid 'Distance' argument, must be a character string."
I've made sure that my custom distance function returns an m-by-1 vector, where m is the number of rows in XJ. (XJ is one of the inputs for WeightedDistance.)
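For reference, a distance function of the quoted form can be checked on its own before it is passed to anything else. The sketch below is only an illustration: the weight vector w and the weighted Euclidean formula are assumptions standing in for the actual ANewDistanceFunction, which is not shown in this thread.

```matlab
% Hypothetical stand-in for the custom distance: a weighted Euclidean metric.
% w is an assumed 1-by-n weight vector (n = number of columns of X).
w = [1 2];
distfun = @(XI, XJ) sqrt(bsxfun(@minus, XJ, XI).^2 * w(:));

% XI is one row (1-by-n), XJ is several rows (m2-by-n).
X  = rand(5, 2);
XI = X(1, :);
XJ = X(2:end, :);

d2 = distfun(XI, XJ);

% The result must be m2-by-1, one distance per row of XJ.
assert(isequal(size(d2), [size(XJ, 1), 1]));

% pdist accepts a handle of this form, which is a quick end-to-end check.
D = pdist(X, distfun);
```

If the size assertion and the pdist call both pass, the function has the shape the documentation describes, so a remaining error would come from the clustering algorithm rather than from the function itself.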
How do I call my custom function here so that evalclusters understands what I'm trying to do?

Answers (2)

Matt Cohen on 30 Jul 2015
Edited: Matt Cohen on 30 Jul 2015
Hi Daniel,
I understand that you are receiving an error message while using a custom distance function with "evalclusters" and the k-means algorithm.
The "evalclusters" function does not accept a function handle for 'Distance' when k-means is the clustering algorithm. The 'Distance' argument accepts a function handle only when the clustering function itself accepts a function handle as its distance metric, and "kmeans" does not.
As mentioned in the 'Distance' section of the "evalclusters" documentation:
"In all other cases, the distance metric specified for 'Distance' must match the distance metric used in the clustering algorithm to obtain meaningful results."
You could try defining your own clustering algorithm, one that ideally uses your distance function as a measure, and use its handle as the 'clust' argument in order to make this work. The documentation for "evalclusters" provides an example that uses a function handle to specify the clustering algorithm. This might be useful if you decide to pursue this option.
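A minimal sketch of this first option, under stated assumptions: the weighted Euclidean distfun, the weights w, and the synthetic data X are all hypothetical placeholders, and the clustering function is built from pdist, linkage, and cluster rather than from k-means. I have not verified that every release accepts a function handle for 'Distance' together with a function-handle clustering algorithm, so treat that part as an assumption too.

```matlab
rng(1);                                   % reproducible synthetic data
X = [randn(50, 2); randn(50, 2) + 4];     % placeholder for allClustersN

% Hypothetical custom distance of the form d2 = distfun(XI, XJ).
w = [1 2];
distfun = @(XI, XJ) sqrt(bsxfun(@minus, XJ, XI).^2 * w(:));

% Clustering function of the form C = clustfun(DATA, K): average-linkage
% hierarchical clustering on the custom distances, cut into K clusters.
clustfun = @(DATA, K) cluster(linkage(pdist(DATA, distfun), 'average'), ...
                              'maxclust', K);

% Pass the handle as the clustering algorithm. 'B' (number of reference
% data sets) is lowered from its default only to keep the run short.
eva = evalclusters(X, clustfun, 'gap', 'KList', 1:5, ...
                   'Distance', distfun, 'B', 20);
disp(eva.OptimalK)
```

The clustering handle must return one cluster index per row of DATA, which is what cluster(..., 'maxclust', K) produces.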
You could also try using 'linkage' as the clustering algorithm input argument. This performs the clustering using the "clusterdata" function, which allows for function handles as the distance metric.
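A rough sketch of the 'linkage' route, again with placeholder data and a hypothetical weighted Euclidean distance in place of ANewDistanceFunction:

```matlab
rng(1);                                   % reproducible synthetic data
X = [randn(50, 2); randn(50, 2) + 4];     % placeholder for allClustersN

% Hypothetical custom distance of the form d2 = distfun(XI, XJ).
w = [1 2];
distfun = @(XI, XJ) sqrt(bsxfun(@minus, XJ, XI).^2 * w(:));

% Same call as in the question, but with 'linkage' instead of 'kmeans'.
% 'B' is reduced from its default only to keep the example quick.
eva = evalclusters(X, 'linkage', 'gap', 'KList', 1:5, ...
                   'Distance', distfun, 'B', 20);
disp(eva.OptimalK)
```

Note that this evaluates hierarchical clusterings rather than k-means clusterings, so the suggested number of clusters can differ from what k-means would give on the same data.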
I hope this helps.
Matt

Taro Ichimura on 10 Jun 2016
