clusterization of data in 1-D vector

Question

paganelle on 28 Oct 2020


Direct link to this question

https://in.mathworks.com/matlabcentral/answers/628468-clusterization-of-data-in-1-d-vector

Commented: paganelle on 28 Oct 2020

I have a large logical vector that looks like V = [0 0 0 0 0 0 0 1 1 1 0 0 1 1 1 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 ..............]

I need to find the position of each group of 1s (let's say the center of each group), but if two groups of 1s are too close to each other (say, fewer than 3 zeros in between), I need to treat them as a single group. That is, in the first stage I need to find the groups (bold-underlined elements), and then find the center element of each group (a shift of +/-1 element does not matter).

1st stage (clusterization): [0 0 0 0 0 0 0 1 1 1 0 0 1 1 1 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 ..............]

2nd stage (find a center of each cluster): [0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 ..............]

My current implementation is as follows: I smooth the entire vector (it has a couple of million elements). The span is chosen to equal the maximum expected length of a group, and then I look for local maxima (islocalmax) with 'MinSeparation' set to the minimum distance between groups. It works, but it is really slow (I have 360x180 = 64800 vectors; yes, it is a LAT/LONG grid, with ~10M elements in each vector).

Is there any way to speed this up? I believe there should be some "textbook" examples of this!


Answer 1

Adam Danz on 28 Oct 2020


Direct link to this answer

https://in.mathworks.com/matlabcentral/answers/628468-clusterization-of-data-in-1-d-vector#answer_526348

Edited: Adam Danz on 28 Oct 2020


There are lots of alternatives.

Input A is a vector of 1s and 0s.
n is the minimum number of 0s required between two groups of 1s for them to count as separate groups.
T is a table showing the start and stop index of each group of 1s, where groups separated by fewer than n zeros are merged, along with the length of each group.

A = [0 0 0 1 1 1 0 0 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 1 1 0 1 0 1 1 0 1 0 0 0 0 1 1 1 1];
% Length of each group of consecutive 1s
T = table();
T.OnesLength = diff(find([0;A(:);0]==0))-1;
T(T.OnesLength==0,:) = []; 
% Index of 1st '1' in each group of consecutive 1s
T.OnesStart = find(diff([0;A(:)])==1);
% Index of last '1' in each group of consecutive 1s
T.OnesStop = T.OnesStart + T.OnesLength - 1; 
% Determine the number of 0s between consecutive 1s
ZerosBetween = [T.OnesStart(2:end) - T.OnesStop(1:end-1); NaN]-1;
disp(T)
    OnesLength    OnesStart    OnesStop
    __________    _________    ________

        3             4            6   
        3             9           11   
        6            18           23   
        2            29           30   
        1            32           32   
        2            34           35   
        1            37           37   
        4            42           45   
% Join groups of consecutive 1s that have fewer than n zeros between them.
n = 3; 
joinGroups = ZerosBetween < n;
% t(i) is the first row and f(i) the last row of the i-th chain of groups to merge
t = find(diff([0;joinGroups])==1);
f = find(diff([0;joinGroups])==-1);
T.remove = false(height(T),1); 
for i = 1:numel(t)
    T.OnesStop(t(i)) = T.OnesStop(f(i));
    % Merged length = all 1s in the chain plus the zeros between them
    T.OnesLength(t(i)) = sum(T.OnesLength(t(i):f(i))) + sum(ZerosBetween(t(i):f(i)-1));  
    T.remove(t(i)+1:f(i)) = true; 
end
T(T.remove,:) = []; 
T.remove = [];
disp(T)
    OnesLength    OnesStart    OnesStop
    __________    _________    ________

        8             4           11   
        6            18           23   
        9            29           37   
        4            42           45   

Now you can use the segment length and the start/stop indices to compute the segment centers.
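For the center step, here is a minimal self-contained sketch rather than a definitive implementation. It uses the same diff/find idea as above but merges the groups without a loop, then marks one center per cluster. The variable names (starts, stops, gaps, keepGap, clusterStart, clusterStop, centers, C) are my own and not part of the code above, and it assumes A contains at least one 1.

```matlab
% Sketch: merge runs of 1s separated by fewer than n zeros, then mark centers.
% Assumes A contains at least one 1. All variable names are illustrative.
A = [0 0 0 1 1 1 0 0 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 1 1 0 1 0 1 1 0 1 0 0 0 0 1 1 1 1];
n = 3;

d = diff([0; A(:); 0]);          % +1 where a run of 1s begins, -1 just past its end
starts = find(d == 1);           % index of the first 1 in each run
stops  = find(d == -1) - 1;      % index of the last 1 in each run

% Number of zeros between each run and the next one
gaps = starts(2:end) - stops(1:end-1) - 1;
keepGap = gaps(:) >= n;          % true where the gap is wide enough to separate clusters

% A cluster begins at the first run and after every kept gap;
% it ends before every kept gap and at the last run.
clusterStart = starts([true; keepGap]);
clusterStop  = stops([keepGap; true]);

% Center index of each cluster (rounded down when the cluster length is even)
centers = floor((clusterStart + clusterStop) / 2);

% Indicator vector, same size as A, with a single 1 at each center
C = zeros(size(A));
C(centers) = 1;

disp([clusterStart clusterStop centers])
```

For the example A this gives clusters 4-11, 18-23, 29-37 and 42-45, the same start/stop pairs as the merged table above, with centers at 7, 20, 33 and 43. Since a shift of +/-1 does not matter here, floor is an arbitrary choice for even-length clusters.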

1 Comment

paganelle on 28 Oct 2020

Perfect way, thank you!

It is ~5 times faster than the method I used previously.
