Question about the "evalclusters" function for the gap statistic

%% Example: clustering evaluation based on the gap statistic
YYc = rand(100,1);  % 100 observations of one variable; evalclusters treats rows as observations
% 'SearchMethod' is the method for selecting the optimal number of clusters: 'globalMaxSE' (default) | 'firstMaxSE'
evaluation = evalclusters(YYc,"kmeans","gap","KList",2:10,"SearchMethod","globalMaxSE","B",40);
I find that almost all references use "firstMaxSE" as the search method, and very little of the literature is based on "globalMaxSE". I want to know why the developers designed "globalMaxSE" and where I can find a reference for it.

Answers (1)

Himanshu on 4 Oct 2024
Hello,
I see that you are trying to understand why the "globalMaxSE" search method is included in the "evalclusters" function for gap statistic and how to find relevant references.
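"globalMaxSE" evaluates every value in "KList", finds the largest gap value, and selects the smallest number of clusters whose gap value is within one standard error of that maximum, i.e. Gap(K) >= GAPMAX - SE(GAPMAX). "firstMaxSE" instead selects the smallest K for which Gap(K) >= Gap(K+1) - SE(K+1), so it can stop at the first local maximum.
Because it looks at the whole curve, "globalMaxSE" is less sensitive to an early local peak, which can give a more reliable choice when the gap statistic curve is noisy or has several peaks.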
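To make the difference concrete, here is a minimal sketch of the two selection rules applied by hand. The gap values and standard errors are made-up numbers chosen only for illustration, and the rules are written from my reading of the "evalclusters" documentation rather than from the function's source, so please check them against your release.

```matlab
% Hypothetical gap values and standard errors for K = 2..7.
% These numbers are invented for illustration, not output of evalclusters.
K   = 2:7;
gap = [0.30 0.42 0.40 0.55 0.53 0.50];
se  = [0.03 0.03 0.03 0.03 0.03 0.03];

% 'globalMaxSE' (as documented): smallest K with Gap(K) >= GAPMAX - SE(GAPMAX)
[gapMax, iMax] = max(gap);
kGlobal = K(find(gap >= gapMax - se(iMax), 1, 'first'));

% 'firstMaxSE' (as documented): smallest K with Gap(K) >= Gap(K+1) - SE(K+1)
% Note: this sketch returns an empty kFirst if no K satisfies the rule.
iFirst = find(gap(1:end-1) >= gap(2:end) - se(2:end), 1, 'first');
kFirst = K(iFirst);

fprintf('globalMaxSE picks K = %d, firstMaxSE picks K = %d\n', kGlobal, kFirst);
```

With these numbers the curve has a small early peak at K = 3 and a higher one at K = 5, so "firstMaxSE" stops at 3 while "globalMaxSE" picks 5. On a real "evaluation" object, the corresponding values should be available in the "CriterionValues", "SE", and "InspectedK" properties.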
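For references, I would recommend starting with the original gap statistic paper by Tibshirani, Walther, and Hastie (2001) and the documentation pages for "evalclusters", then exploring other academic papers on clustering and the gap statistic, as they may discuss variations in methods for selecting the optimal number of clusters.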
I hope this helps.


Release

R2021a
