How can I use multidimensional matrices as input for clustering in MATLAB?

Hi all,
I am trying to cluster 2-dimensional time series signals with a similarity measure (maybe corr2).
For example, suppose I have the following signals (1st column is time, 2nd column is amplitude):
A = [1.2,1.5; 2.4,0.5; 3.2,1.5; 4.1,1.0]
B = [1.0,1.0; 2.0,0; 3.0,1.0; 4.0,0.5]
C = [1.1,1.2; 2.2,1.3; 3.3,1.5; 4.2,1.7]
D = [1.3,1.3; 2.1,1.4; 3.2,1.7; 4.3, 2.0]
E = [1.3,1.8; 2.3,0.4; 3.1,1.6; 4.3,0.8]
After applying a clustering technique, such as K-means (hard partition) or Fuzzy (soft partition) or something else, I am expecting 2 clusters like this:
Cluster 1 (A, B, E: see figure 1) and cluster 2 (C, D: see figure 2)
Figure 1: Plot of A, B, E
Figure 2: Plot of C, D
I have been searching for a way to cluster 2-dimensional matrices but I have not found any MATLAB code I can adopt to cluster my matrices.
How can I use multidimensional matrices like A, B, C, D and E as input for clustering in MATLAB?

Accepted Answer

Ameer Hamza on 12 Mar 2020
You can first convert each matrix to a linear array (a vector) and then apply a clustering algorithm. Because the corresponding elements of every matrix have the same units, the original shape of the input does not matter to the clustering algorithm, as long as every matrix is flattened in the same order. For example:
A = [1.2,1.5; 2.4,0.5; 3.2,1.5; 4.1,1.0];
B = [1.0,1.0; 2.0,0; 3.0,1.0; 4.0,0.5];
C = [1.1,1.2; 2.2,1.3; 3.3,1.5; 4.2,1.7];
D = [1.3,1.3; 2.1,1.4; 3.2,1.7; 4.3, 2.0];
E = [1.3,1.8; 2.3,0.4; 3.1,1.6; 4.3,0.8];
M = [A(:) B(:) C(:) D(:) E(:)]'; % each matrix A,B,...,E is now a row of matrix M
idx = kmeans(M, 2, 'Distance', 'correlation')
Result:
idx =
1
1
2
2
1
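It identified the 1st, 2nd, and 5th rows (A, B, E) as one cluster and the 3rd and 4th rows (C, D) as the other. Because kmeans starts from a random initialization, the label numbers 1 and 2 may be swapped from run to run.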
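As a small extension of the example above, the following sketch maps the cluster indices back to the matrix names and makes the run repeatable. The seed value, the `names` cell array, and the use of `'Replicates'` are assumptions added here for illustration; they are not part of the original answer.

```matlab
% Sketch: repeatable clustering with readable output (assumed additions:
% rng seed, 'Replicates', and the names cell array).
rng(1);  % fix the random seed so repeated runs start from the same state

A = [1.2,1.5; 2.4,0.5; 3.2,1.5; 4.1,1.0];
B = [1.0,1.0; 2.0,0; 3.0,1.0; 4.0,0.5];
C = [1.1,1.2; 2.2,1.3; 3.3,1.5; 4.2,1.7];
D = [1.3,1.3; 2.1,1.4; 3.2,1.7; 4.3,2.0];
E = [1.3,1.8; 2.3,0.4; 3.1,1.6; 4.3,0.8];
names = {'A','B','C','D','E'};

% Each flattened matrix becomes one row (one observation) of M
M = [A(:) B(:) C(:) D(:) E(:)]';

% Run kmeans several times and keep the solution with the lowest total distance
idx = kmeans(M, 2, 'Distance', 'correlation', 'Replicates', 5);

% Print the members of each cluster by name
for k = 1:2
    fprintf('Cluster %d: %s\n', k, strjoin(names(idx == k), ', '));
end
```

Which group receives label 1 and which receives label 2 is arbitrary, so it is safer to compare cluster membership than the label numbers themselves.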
  2 Comments
SOPAE YI on 12 Mar 2020
Edited: SOPAE YI on 13 Mar 2020
Hi Mr. Hamza,
Your answer is AWESOME!!
Thanks a million!!
I have learned something :--)
One more question,
Since most of my data have different lengths…
Can we cluster matrices of different sizes like the following?
A = [1.2,1.5; 2.4,0.5; 3.2,1.5];
B = [1.0,1.0; 2.0,0; 3.0,1.0; 4.0,0.5];
C = [1.1,1.2; 2.2,1.3; 3.3,1.5];
D = [1.3,1.3; 2.1,1.4; 3.2,1.7; 4.3,2.0; 5.0,1.9];
E = [1.3,1.8; 2.3,0.4; 3.1,1.6; 4.3,0.8];
I noticed that interp1 loses the time information. The time (1st column) of each point is important in my research, and I would like to keep the time information of all matrices.
I do not have to keep the actual time values; I just want to keep the proportional time intervals between points (the values in the 1st column). For example, Z = [1, 1.2; 7, 2.1; 9, 3.4] can be rewritten as ZZ = [1, 1.2; 4, 2.1; 5, 3.4]. Because both have the same ratio (3 to 1) of time intervals, they carry the same time information.
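The idea of keeping only proportional time intervals can be sketched by rescaling each time column to the range [0, 1]. The matrices `P` and `Q` and the helper `normalizeTime` below are hypothetical, chosen only so that both signals have interval ratios of 3 to 1; this is one possible reading of the requirement, not a method given in the thread.

```matlab
% Sketch: rescale time to [0, 1] so only the relative spacing of points remains.
% P, Q, and normalizeTime are hypothetical names used for illustration.
P = [1, 1.2; 7, 2.1; 9, 3.4];   % time intervals 6 and 2 (ratio 3 to 1)
Q = [1, 1.2; 4, 2.1; 5, 3.4];   % time intervals 3 and 1 (ratio 3 to 1)

% Shift so the first sample is at 0, then scale so the last sample is at 1
normalizeTime = @(t) (t - t(1)) ./ (t(end) - t(1));

tP = normalizeTime(P(:,1));
tQ = normalizeTime(Q(:,1));

% Signals with the same interval ratios get the same normalized time column
sameTiming = max(abs(tP - tQ)) < 1e-12;
disp(sameTiming)
```

This normalization discards the absolute start time and total duration, which matches the stated requirement only if those two quantities really are irrelevant.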
Ameer Hamza on 13 Mar 2020
The clustering algorithm must take data with a consistent dimension, so I am not sure whether there is an easy way to apply clustering when the input matrices are not the same size. Please refer to my comment on your other question to see how you can avoid losing information in matrix B.
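One possible workaround, which is not taken from this thread, is to resample every signal onto a common grid of normalized time so that all rows have the same length. The sketch below assumes linear interpolation with interp1 and a hypothetical grid size `nPts`. Interpolation necessarily replaces the original sample positions with the grid positions, so this only suits cases where the shape of the signal over relative time is what matters.

```matlab
% Sketch: bring signals of different lengths to a common length before kmeans.
% nPts and the normalized-time grid are assumptions, not part of the thread.
A = [1.2,1.5; 2.4,0.5; 3.2,1.5];
B = [1.0,1.0; 2.0,0; 3.0,1.0; 4.0,0.5];
C = [1.1,1.2; 2.2,1.3; 3.3,1.5];
D = [1.3,1.3; 2.1,1.4; 3.2,1.7; 4.3,2.0; 5.0,1.9];
E = [1.3,1.8; 2.3,0.4; 3.1,1.6; 4.3,0.8];
signals = {A, B, C, D, E};

nPts = 4;                     % hypothetical number of resampling points
tq = linspace(0, 1, nPts);    % common grid in normalized time
M = zeros(numel(signals), nPts);

for i = 1:numel(signals)
    t = signals{i}(:,1);
    y = signals{i}(:,2);
    tn = (t - t(1)) ./ (t(end) - t(1));     % normalized time in [0, 1]
    M(i,:) = interp1(tn, y, tq, 'linear');  % amplitude on the common grid
end

% Every row of M now has the same length, so kmeans accepts it
idx = kmeans(M, 2, 'Distance', 'correlation');
```

The choice of `nPts` trades detail against smoothing: a coarse grid can skip over peaks in the longer signals, while a fine grid adds interpolated points to the shorter ones.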
