What exactly does the sumd output of the kmeans function calculate?
I am doing basic experiments with the kmeans function. As a really simple example, say I have a data set of 4 items with 1 attribute, which is their value:
Data=[1;2;3;4];
If I split this data set into 2 clusters, I should get one centroid at 1.5 and another at 3.5:
[idx,C,sumd]=kmeans(Data,2)
C =
1.5000
3.5000
and that is what I get. However, to my understanding, sumd in this case should be:
abs(1-1.5)+abs(2-1.5) or abs(3-3.5)+abs(4-3.5)
ans =
1
but I am getting sumd as:
sumd =
0.5000
0.5000
for both clusters, instead of 1 for each.
My question is what exactly does sumd calculate?
Accepted Answer
More Answers (1)
the cyclist
on 8 May 2018
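It's because the default distance metric is the squared Euclidean distance, which kmeans uses both for the minimization and for the reported sumd. Each point is 0.5 from its centroid, so it contributes 0.5^2 = 0.25, and each cluster sums to 0.5. See the Distance input parameter.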
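To check the numbers by hand, here is a minimal sketch that recomputes both sums for the data in the question. It does not call kmeans. The centroids are copied from the output above, and the cluster assignment idx is an assumption (a real run may label the two clusters the other way round). The variable names sumd_sq and sumd_abs are only for illustration.

```matlab
Data = [1;2;3;4];
C    = [1.5; 3.5];     % centroids reported in the question
idx  = [1;1;2;2];      % assumed assignment; kmeans may swap the labels

sumd_sq  = zeros(2,1); % within-cluster sums of squared Euclidean distances
sumd_abs = zeros(2,1); % within-cluster sums of absolute (city block) distances
for k = 1:2
    d = Data(idx == k) - C(k);   % signed offsets of the cluster's points from its centroid
    sumd_sq(k)  = sum(d.^2);     % what the default 'sqeuclidean' metric reports
    sumd_abs(k) = sum(abs(d));   % what the question expected
end

disp(sumd_sq)    % 0.5 and 0.5
disp(sumd_abs)   % 1 and 1
```

If the absolute-distance sums are what you want kmeans to report, passing 'Distance','cityblock' should do it, e.g. kmeans(Data,2,'Distance','cityblock'). Note that with that metric the centroids are component-wise medians rather than means, and kmeans can converge to a different partition depending on the random start.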
1 Comment
Onur Kapucu
on 8 May 2018