Cluster the data, separate the data

9 views (last 30 days)
Marine on 23 Sep 2019
Commented: MS on 26 Sep 2019
Hi! I was measuring tilt for let's say 71 nozzles continiously. I got the inclinations. The results are shown in the attached pic. Now I have to split the data on 71 groups but the upper and low limits and time of measurements are different. When aplitting the data I just need the data with high density. Should I write a loop for defining the data points that are close in values until I reach some pick value and store it. Or there some solution to do. I would be appreciate. Thanks.

Adam Danz on 23 Sep 2019
I'm not forming a mental picture based on this description. How are the data organized? Is density a measurment you already have or something you need to compute? What should defined the splits? Inclination? time? density?
Marine on 23 Sep 2019
The data are presented in the form vector1=inclinations and vector2=time. Density is not measured. Basically, the measured time (x-axis) vs inclinations (y-axis) result in the plot as shown in the attached pic. I need to separate the inclinations on 75 groups. Each group has to include the data where the most points accure with high density. In the pic, as example, the data I have to extract is shown in black squares, that represent more steady results. I just need this data, as it's real measurements.
Adam Danz on 23 Sep 2019
This will require some trial-error. If the time is at a fixed sampling rate, I'd personally start with differentiating the inclinations vector using abs(diff(vector1)) and then I'd plot the results to see if I could use some kind of threshold to detect the large spikes. If there wasn't a clear threshold I'd move on to findpeaks() which will require you to read the documentation and play around with those options to find what works for you.

MS on 23 Sep 2019
As a quick answer (since you didn't upload the data), you may try to detect the peaks of the signal and then get the segments between them. For example, see the findpeaks method at: https://www.mathworks.com/help/signal/examples/peak-analysis.html#responsive_offcanvas.

Show 1 older comment
MS on 24 Sep 2019
Hi,
See the following example demonstrating a process to detect step changes and then get the segments between them. Attached an image showing the result. You can tune the thresholds and the filtering criteria to match your application. You can procees by extracting the segments showing a step change of zero.
% Segment of the data as an example: 1500 sample
Time75x_new = Time75x(1:1500);
X75_new = X75(1:1500);
% preprocessing
% filter
filtered_X75 = smooth(X75_new);
% normalize
normalized_X75 = filtered_X75 / max(filtered_X75);
% search for step changes
% initialize baseline value
baseline = normalized_X75(1);
% set thresholds for step up and step down detection
upper_threshold = 0.05;
lower_threshold = 0.05;
% initialize an empty array to record the status
steps = zeros (1,length(X75_new));
% loop
for i=2:length(X75_new)
% get new value
new_value = normalized_X75(i);
% check if a step change occurred
if ((new_value > (baseline + upper_threshold)) || (new_value < (baseline - lower_threshold)))
% set steps to 1
steps(i) = 1;
% update baseline
baseline = normalized_X75(i);
else
% keep the step zero
% update baseline
baseline = normalized_X75(i);
end
end % end for
% the steps may be noisy, we can correct them by different methods,
% for example, we may take a temporal window of 5 elements to validate the
% step
refined_steps = zeros(1,length(steps));
for i=6:length(steps)-6
window_before = steps(i-5:i-1);
window_after = steps(i+1:i+5);
if ((sum(window_before) && sum(window_after)) == 0)
refined_steps(i) = 0;
else
refined_steps(i) = 1;
end
end
% plot original data
subplot(4,1,1);
plot(Time75x_new,X75_new);
title('Original data sample');
% plot normalized data
subplot(4,1,2);
plot(Time75x_new,normalized_X75);
title('Filtered and normalized data sample');
% plot steps
subplot(4,1,3);
plot(Time75x_new,steps);
title('Steps detected');
% plot refined steps
subplot(4,1,4);
plot(Time75x_new,refined_steps);
title('Refined steps');
% You can proceed by selecting the segments where the refined steps are zero
Marine on 26 Sep 2019
Amazing! Yes it helps a lot. I put the code in a fuction.m and just by changing parameters get nessecary results.
function [fig, res, mean_res, ind] = mean_ang(N, time, x, upper_threshold, lower_threshold, window)
% preprocessing
% filter
filtered = smooth(x);
% normalize
normalized_x = filtered / max(filtered);
% search for step changes
% initialize baseline value
baseline = normalized_x(1);
% initialize an empty array to record the status
steps = zeros (1,length(x));
% loop
for i=2:length(x)
% get new value
new_value = normalized_x(i);
% check if a step change occurred
if ((new_value > (baseline + upper_threshold)) || (new_value < (baseline - lower_threshold)))
% set steps to 1
steps(i) = 1;
% update baseline
baseline = normalized_x(i);
else
% keep the step zero
% update baseline
baseline = normalized_x(i);
end
end % end for
% the steps may be noisy, we can correct them by different methods,
% for example, we may take a temporal window of 5 elements to validate the
% step
refined_steps = zeros(1,length(steps));
for i=window:length(steps)-window
window_before = steps(i-(window-1):i-1);
window_after = steps(i+1:i+(window-1));
if ((sum(window_before) && sum(window_after)) == 0)
refined_steps(i) = 0;
else
refined_steps(i) = 1;
end
end
fig=figure('units','normalized','outerposition',[0 0 1 1]);
% plot original data
subplot(4,1,1);
plot(time,x);
title('Original data sample');
% plot normalized data
subplot(4,1,2);
plot(time,normalized_x);
title('Filtered and normalized data sample');
% plot steps
subplot(4,1,3);
plot(time,steps);
title('Steps detected');
% plot refined steps
subplot(4,1,4);
plot(time,refined_steps);
title('Refined steps');
% You can proceed by selecting the segments where the refined steps are zero
% find indeces for nozzles
flag = 1;
start_indices = [];
end_indices = [];
for i=1:numel(refined_steps)
if ((refined_steps(i)==0) && flag == 1)
start_indices = [start_indices i];
flag = 0;
elseif (flag == 0 && (refined_steps(i)~=0) )
end_indices = [end_indices i-1];
flag = 1;
end
end
ind=([start_indices(1:length(end_indices)); end_indices])';
res=zeros(length(x), N);
mean_res=zeros(1, N);
%find mean value
for k=1:N
res(1:length(x(ind(k,1):ind(k,2))),k)=(x(ind(k,1):ind(k,2)));
mean_res(1,k)=mean(res(1:length(x(ind(k,1):ind(k,2))),k));
end
end
MS on 26 Sep 2019