Using kmedoids with custom distance function with several input variables
I have a matrix X = [XCat XNum] where:
XCat is a matrix made of dummy variables resulting from encoding categorical variables
XNum is a matrix of continuous variables.
I want to apply a clustering algorithm that takes into account the categorical nature of some of the features in X. So I created a custom distance function that uses the Hamming distance for the encoded categorical variables (dummies) and the L1 (cityblock) distance for the continuous variables. This is the function:
function D = MixDistance(XCat,XNum)
% Mixed categorical/numerical distance
% INPUT:
% XCat = nObs x nCat matrix of dummy-encoded categorical features
% XNum = nObs x nNum matrix of numerical features
% OUTPUT:
% D = nObs x nObs matrix of pairwise distances
% Number of categorical and numerical features
nCat = size(XCat,2);
nNum = size(XNum,2);
% Compute distances, separately
DCat = pdist2(XCat, XCat, 'hamming');
DNum = pdist2(XNum, XNum, 'cityblock');
% Compute relative weight based on the number of categorical variables
wCat = nCat/(nCat + nNum);
D = wCat*DCat + (1 - wCat)*DNum;
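As a quick check of what this distance evaluates to, here is a small worked example on two hypothetical observations (the values are invented for illustration, not taken from my data). It uses only base MATLAB and assumes, as pdist2 does for 'hamming', that the Hamming distance is the fraction of coordinates that differ.

```matlab
% Two hypothetical observations: 3 dummy columns, 1 continuous column
XCat = [1 0 0; 0 1 0];
XNum = [0.5; 2.0];

nCat = size(XCat,2);
nNum = size(XNum,2);
wCat = nCat/(nCat + nNum);                 % 3/4

dCat = mean(XCat(1,:) ~= XCat(2,:));       % 2 of 3 dummies differ -> 2/3
dNum = sum(abs(XNum(1,:) - XNum(2,:)));    % |0.5 - 2.0| = 1.5

d = wCat*dCat + (1 - wCat)*dNum;           % 0.75*(2/3) + 0.25*1.5 = 0.875
```

The Hamming term is bounded by 1 while the cityblock term is not, so the balance between the two parts depends on how XNum is scaled.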
Now, one might be tempted to call kmedoids like this:
[IDX, C, SUMD, D, MIDX, INFO] = kmedoids(X,3,'distance', @MixDistance,'replicates',3);
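but of course it doesn't work, as the function MixDistance needs XCat and XNum as inputs, not just X.
Also, this doesn't work either, because MixDistance(XCat, XNum) is evaluated immediately, so kmedoids receives a numeric matrix rather than a function handle: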
[IDX, C, SUMD, D, MIDX, INFO] = kmedoids(X,3,'distance', MixDistance(XCat, XNum),'replicates',3);
Any idea?
Or alternatively, any idea on clustering when data are mixed, that is BOTH categorical AND continuous?
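One possible way around this, as a minimal sketch rather than a definitive answer: kmedoids calls a custom distance as distfun(ZI,ZJ), where ZI is a single row of X (1-by-n) and ZJ is an m-by-n block of rows, and expects an m-by-1 vector back. An anonymous function can therefore capture the number of categorical columns and split X internally. The sample data and the name mixDist below are hypothetical; the sketch assumes the dummy columns come first in X, R2016b or later (for implicit expansion), and the Statistics and Machine Learning Toolbox for kmedoids.

```matlab
% Hypothetical sample data: 30 observations, 4 dummy columns, 2 continuous
rng(0);
XCat = double(rand(30,4) > 0.5);
XNum = randn(30,2);
X    = [XCat XNum];

nCat = size(XCat,2);
nNum = size(XNum,2);
wCat = nCat/(nCat + nNum);   % same weighting as in MixDistance

% Distance from one row ZI (1-by-n) to every row of ZJ (m-by-n), as m-by-1.
% Hamming = fraction of differing dummies; cityblock = sum of |differences|.
mixDist = @(ZI,ZJ) ...
    wCat       * mean(ZJ(:,1:nCat) ~= ZI(1:nCat), 2) + ...
    (1 - wCat) * sum(abs(ZJ(:,nCat+1:end) - ZI(nCat+1:end)), 2);

[IDX, C] = kmedoids(X, 3, 'Distance', mixDist, 'Replicates', 3);
```

For row i, mixDist(X(i,:), X) should agree with the i-th column of MixDistance(XCat, XNum), which is an easy check to run before trusting the clusters.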
2 Comments
the cyclist
on 5 Feb 2021
Can you upload a sample of the X data in a MAT file, to make it easier for folks to investigate?
Raffaele Zenti
on 5 Feb 2021