- Input A is a vector of 1s and 0s.
- n is the minimum number of 0s that separates two groups of 1s; groups with fewer than n zeros between them are joined.
- T is a table showing the start index, stop index, and length of each group of 1s, after groups separated by fewer than n zeros have been joined.
clusterization of data in 1-D vector
I have a large logical vector that looks like V = [0 0 0 0 0 0 0 1 1 1 0 0 1 1 1 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 ..............]
I need to find the position of each group of 1s (let's say, the center of each group), but if two groups of ones are too close to each other (say, fewer than 3 zeros in between), I need to treat them as a single group. That is, at the first stage I need to find the groups (bold-underlined elements) and then find the center element of each group (a shift of +/-1 element does not matter).
1st stage (clusterization): [0 0 0 0 0 0 0 1 1 1 0 0 1 1 1 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 ..............]
2nd stage (find a center of each cluster): [0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 ..............]
The way I have implemented it so far is as follows: I smooth the entire vector (it is a couple of million elements). The span is chosen to be equal to the maximum expected length of a group, and then I look for local maxima (islocalmax) with 'MinSeparation' set to the minimum distance between groups. It works, but it is really slow (I have 360x180 = 64800 vectors; yes, it is a LAT/LONG grid with ~10M elements in each vector).
Is there any way to speed this up? I believe there should be some "textbook" examples of this!
Accepted Answer
Adam Danz on 28 Oct 2020 (edited by Adam Danz on 28 Oct 2020)
There are lots of alternatives.
A = [0 0 0 1 1 1 0 0 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 1 1 0 1 0 1 1 0 1 0 0 0 0 1 1 1 1];
% Length of each group of consecutive 1s
T = table();
T.OnesLength = diff(find([0;A(:);0]==0))-1;
T(T.OnesLength==0,:) = [];
% Index of 1st '1' in each group of consecutive 1s
T.OnesStart = find(diff([0;A(:)])==1);
% Index of last '1' in each group of consecutive 1s
T.OnesStop = T.OnesStart + T.OnesLength - 1;
% Determine the number of 0s between consecutive groups of 1s (NaN after the last group)
ZerosBetween = [T.OnesStart(2:end) - T.OnesStop(1:end-1); NaN]-1;
disp(T)
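% Join groups of consecutive 1s that have fewer than n zeros between them.
n = 3;
joinGroups = ZerosBetween < n;
% t: first group of each chain to be merged; f: last group of that chain.
% The trailing NaN in ZerosBetween keeps the last element of joinGroups false,
% so every chain that starts in t has a matching end in f.
t = find(diff([0;joinGroups])==1);
f = find(diff([0;joinGroups])==-1);
T.remove = false(height(T),1);
for i = 1:numel(t)
    % Extend the first group of the chain to the stop index of the last one.
    T.OnesStop(t(i)) = T.OnesStop(f(i));
    % Merged length = all 1s in the chain plus the 0s bridged between them.
    T.OnesLength(t(i)) = sum(T.OnesLength(t(i):f(i))) + sum(ZerosBetween(t(i):f(i)-1));
    T.remove(t(i)+1:f(i)) = true;
end
% Drop the rows absorbed into an earlier group, then the helper column.
T(T.remove,:) = [];
T.remove = [];
disp(T)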
Now you can use the segment length and the start/stop indices to compute the segment centers.
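One way that last step might look, as a minimal sketch rather than the answer's own code: it rebuilds the merged start/stop indices without the table or the loop (a swapped-in, vectorized variant of the same idea), then takes the midpoint of each group. The names onesStart, onesStop, isBoundary, groupStart, groupStop, centers and C are made up for this illustration, and using floor for even-length groups is an arbitrary choice that relies on the question's +/-1 tolerance.

```matlab
% Example input and threshold taken from the answer above.
A = [0 0 0 1 1 1 0 0 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 1 1 0 1 0 1 1 0 1 0 0 0 0 1 1 1 1];
n = 3;  % groups separated by fewer than n zeros are merged

% Start/stop index of every run of 1s, before merging.
d = diff([0; A(:); 0]);
onesStart = find(d == 1);
onesStop  = find(d == -1) - 1;

if isempty(onesStart)
    % No 1s in the input, so there are no groups and no centers.
    groupStart = zeros(0, 1);
    groupStop  = zeros(0, 1);
    centers    = zeros(0, 1);
else
    % Number of 0s between neighbouring runs of 1s.
    zerosBetween = onesStart(2:end) - onesStop(1:end-1) - 1;
    % A gap of n or more zeros is a boundary between two merged groups.
    % (:) forces a column so the concatenations below are well defined
    % even when there is only one run.
    isBoundary = zerosBetween(:) >= n;
    groupStart = onesStart([true; isBoundary]);   % run that opens each group
    groupStop  = onesStop([isBoundary; true]);    % run that closes each group
    % Midpoint of each merged group; floor picks the left of the two
    % middle elements when the group length is even.
    centers = floor((groupStart + groupStop) / 2);
end

% Logical marker vector, like the "2nd stage" vector in the question.
C = false(size(A));
C(centers) = true;

disp([groupStart, groupStop, centers])
```

For the example A with n = 3, this gives merged groups 4-11, 18-23, 29-37 and 42-45 with centers 7, 20, 33 and 43. Whether it is actually faster than the smoothing approach on vectors with ~10M elements would need to be timed on real data.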