How to remove outliers from a vector when calculating a moving average?

Question

Cai Chin on 6 Nov 2020


Direct link to this question

https://in.mathworks.com/matlabcentral/answers/638690-how-to-remove-outliers-from-a-vector-when-calculating-a-moving-average

Edited: Cai Chin on 9 Nov 2020

I am using MATLAB R2020a on macOS. I am calculating an exponentially weighted moving mean using the dsp.MovingAverage function and am trying to remove vector elements in real time based on two conditions: if the new element causes the mean to exceed 1.5 times the 'overall' mean so far, or if it causes the mean to fall below 0.5 times the 'overall' mean so far.

In other words, the weighted mean with the current element is compared to the previous weighted mean. If the current element causes the weighted mean to rise above 1.5 times the previous mean or fall below 0.5 times the previous mean, it should be ignored and the recursive equation applied to the next element instead, and so on. In the end, I'd like to have a vector with the outliers removed.

This is the function I am using to calculate the exponentially weighted moving mean:

movavgExp = dsp.MovingAverage('Method', 'Exponential weighting', 'ForgettingFactor', 0.4);
mean_cycle_period_exp = movavgExp(cycle_period_step_change);

I tried doing this by creating a for loop which manipulates the algorithm used by the dsp.MovingAverage function as outlined here:

https://uk.mathworks.com/help/dsp/ref/dsp.movingaverage-system-object.html

However, this manual method of finding the weighted mean produces a different graphical output to the function.

% Create a vector of cycle periods rising from 0.7 to 0.82 seconds in 60 steps of 0.002 seconds
cycle_period_step_change = 0.7:0.002:0.82;
% Calculate weights manually 
lambda = 0.1;
w = zeros(length(cycle_period_step_change),1); 
w(1) = 1; % initialize the weight for the first sample
for i = 2:length(cycle_period_step_change)
    w(i) = lambda*w(i-1) + 1; % calculate the successive weights
end
% Calculate moving mean with weights manually
x = zeros(length(cycle_period_step_change), 1);
x(1) = 2;
for i = 2:length(cycle_period_step_change)
    x(i) = (1 - 1/w(i))*x(i - 1) + (1/w(i))*x(i);
end

[Figure: output when the weighted mean is calculated manually]

[Figure: output when the weighted mean is calculated with the dsp.MovingAverage function]
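A few things in the manual loop above could plausibly explain the mismatch: the update reads x(i) from the zero-initialised output vector instead of from cycle_period_step_change, the first mean is set to 2 rather than to the first sample, and lambda is 0.1 while the System object was created with a ForgettingFactor of 0.4. Below is a minimal sketch of the recursion described on the linked documentation page with those points changed. The names u and m are introduced here for illustration only. It may also be worth checking the input orientation: as far as I understand the documentation, dsp.MovingAverage treats each column as a separate channel, so a row vector may need to be transposed before it is passed in.

```matlab
% Input signal from the question, stored as a column vector
u = (0.7:0.002:0.82).';
lambda = 0.4;            % same forgetting factor as the dsp.MovingAverage call

N = numel(u);
w = zeros(N, 1);         % weighting factors
m = zeros(N, 1);         % running exponentially weighted mean

w(1) = 1;                % weight for the first sample
m(1) = u(1);             % with w(1) = 1 the first mean equals the first sample
for k = 2:N
    w(k) = lambda*w(k-1) + 1;                       % successive weights
    m(k) = (1 - 1/w(k))*m(k-1) + (1/w(k))*u(k);     % new sample comes from u, not m
end
```

Keeping the input (u) and the running mean (m) in separate vectors makes it harder to mix the two up inside the update.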

Furthermore, when I implement a for loop to exclude outliers from the moving mean in real-time, it produces an error:

% Calculate moving mean with weights manually
x = zeros(length(cycle_period_step_change), 1);
x(1) = 2;
for i = 2:length(cycle_period_step_change)
    x(i) = (1 - 1/w(i))*x(i - 1) + (1/w(i))*x(i);
      if x(i) > 1.5*(1 - 1/w(i - 1))*x(i - 2) + (1/w(i - 1))*x(i - 1)
        x(i) = [];
      elseif x(i) < 0.5*(1 - 1/w(i - 1))*x(i - 2) + (1/w(i - 1))*x(i - 1)
        x(i) = [];
      end
end   
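Looking at the loop above, three things stand out. On the first pass (i = 2) the condition indexes x(i - 2), which is x(0), and MATLAB indices start at 1. Assigning x(i) = [] shortens x while the loop still runs up to the original length. And because of operator precedence, the factors 1.5 and 0.5 multiply only the first term of the sum rather than the whole previous mean. Here is a rough sketch of one way to gate the update instead, marking rejected samples in a logical mask rather than deleting inside the loop. It assumes the first sample can be trusted to seed the mean, and the two outliers injected into the step signal are made up purely for illustration.

```matlab
% Step signal from the question with two artificial outliers injected
% (positions and values are hypothetical, chosen only for this demo)
u = (0.7:0.002:0.82).';
u(20) = 3.0;      % hypothetical spike
u(40) = 0.004;    % hypothetical dropout

lambda = 0.4;                 % forgetting factor
keep = false(size(u));        % logical mask of accepted samples
m = nan(size(u));             % running mean after each sample

% Assumption: the first sample is trusted and seeds the mean
w = 1;
mPrev = u(1);
keep(1) = true;
m(1) = mPrev;

for k = 2:numel(u)
    % Candidate weight and mean if sample k were accepted
    wCand = lambda*w + 1;
    mCand = (1 - 1/wCand)*mPrev + (1/wCand)*u(k);

    % Accept only if the candidate mean stays within 0.5x to 1.5x
    % of the previous mean; otherwise w and mPrev are left unchanged
    if mCand >= 0.5*mPrev && mCand <= 1.5*mPrev
        w = wCand;
        mPrev = mCand;
        keep(k) = true;
    end
    m(k) = mPrev;
end

cleaned = u(keep);   % signal with the rejected samples removed
```

Indexing with the mask once, after the loop, keeps the loop counter and the vector length in step with each other.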

I would very much appreciate any suggestions on how to tackle this, thanks in advance!


Answer 1

Image Analyst on 6 Nov 2020


Direct link to this answer

https://in.mathworks.com/matlabcentral/answers/638690-how-to-remove-outliers-from-a-vector-when-calculating-a-moving-average#answer_536455

Did you run the signal through rmoutliers() first?


Cai Chin on 7 Nov 2020

Hi, thanks again for your suggestion. I hadn't considered removing the outliers before calculating the exponentially weighted moving mean since I wanted it to be done on a real-time basis, i.e. element by element as the signal trace is processed. That is, if an element is identified to be an outlier, it is removed from the weighted mean before moving on to the next element.

This is my signal, each value represents the cycle period of an ECG trace:

cycle_periods = [0.0040, 0.7740, 2.3040, 0.8340, 0.0040, 0.8630, 0.8020, 1.5400, 3.0100, 0.7760, 0.0040, 0.7620, 0.7470, 0.7610, 0.0040, 0.7910, 0.0040, 0.7760, 0.7560, 0.7640, 0.7880, 0.0040, 0.8140, 0.0040, 0.8150, 0.7800, 0.7560, 0.7830, 0.0040, 0.8010, 0.7720, 0.7560, 0.7800, 0.0050, 0.7960, 0.7840, 0.7600, 0.7690, 0.0040, 0.8110, 0.0040, 0.8590, 0.0050, 0.8350, 2.4050, 0.8200, 0.0040, 0.8140, 0.0050, 0.8090, 0.7890, 0.7740, 0.0030, 0.7480, 0.7420, 0.7650, 0.8110, 0.0040, 0.835, 0.0040, 0.8330, 0.0040, 0.8280, 0.0040, 3.9230, 0.0040, 0.8390, 0.0040, 0.8430, 0.8110, 0.8050, 0.8060, 0.8230, 0.0040, 0.8260, 0.0040, 0.8330, 0.0040, 0.8250, 0.7930, 0.7810, 0.7840, 0.0050, 0.7990, 0.0050, 0.8130, 0.0050, 0.7910, 0.7950]

An element would be considered an outlier if it is greater than 1.5 times the exponentially weighted mean thus far, or if it is smaller than 0.5 times that value.

Thanks again!

Image Analyst on 8 Nov 2020

Anything wrong with this?

cycle_periods = [0.0040, 0.7740, 2.3040, 0.8340, 0.0040, 0.8630, 0.8020, 1.5400, 3.0100, 0.7760, 0.0040, 0.7620, 0.7470, 0.7610, 0.0040, 0.7910, 0.0040, 0.7760, 0.7560, 0.7640, 0.7880, 0.0040, 0.8140, 0.0040, 0.8150, 0.7800, 0.7560, 0.7830, 0.0040, 0.8010, 0.7720, 0.7560, 0.7800, 0.0050, 0.7960, 0.7840, 0.7600, 0.7690, 0.0040, 0.8110, 0.0040, 0.8590, 0.0050, 0.8350, 2.4050, 0.8200, 0.0040, 0.8140, 0.0050, 0.8090, 0.7890, 0.7740, 0.0030, 0.7480, 0.7420, 0.7650, 0.8110, 0.0040, 0.835, 0.0040, 0.8330, 0.0040, 0.8280, 0.0040, 3.9230, 0.0040, 0.8390, 0.0040, 0.8430, 0.8110, 0.8050, 0.8060, 0.8230, 0.0040, 0.8260, 0.0040, 0.8330, 0.0040, 0.8250, 0.7930, 0.7810, 0.7840, 0.0050, 0.7990, 0.0050, 0.8130, 0.0050, 0.7910, 0.7950]
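% Plot the original signal in the top panel
subplot(2, 1, 1);
plot(cycle_periods, 'b.-', 'LineWidth', 2, 'MarkerSize', 20);
grid on;
title('Original Signal with Outliers', 'FontSize', 18);
yl = ylim; % Remember the y-axis limits of the original plot.
% Remove outliers using the default rmoutliers() settings
repairedSignal = rmoutliers(cycle_periods);
% Plot the repaired signal in the bottom panel
subplot(2, 1, 2);
plot(repairedSignal, 'b.-', 'LineWidth', 2, 'MarkerSize', 20);
grid on;
ylim(yl); % Use same scale as original plot.
title('Outliers Removed', 'FontSize', 18);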

If so, please say what elements you want to be excluded and included in your process after the outliers have been removed.

Cai Chin on 9 Nov 2020

Hi, thank you for this. However, I don't think it's really what I'm looking for. The 'rmoutliers' function seems to be removing a lot of the values that I would like to keep. Essentially, I would like the exponentially weighted moving mean to include only samples that do not exceed 1.5 times the mean calculated before that sample and do not fall below 0.5 times the mean calculated before that sample. Elements should therefore be included or excluded based on the history of the signal rather than on the signal as a whole, which is what your method seems to do, making it 'too' adaptive.

I have edited my question to clarify what I would like to do further, and the problems I encountered with my initial approach - I would really appreciate any help with this, thanks again!
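The rule described in this last comment (compare each sample with the mean calculated before it) might be sketched as follows. This uses only the first 20 values of cycle_periods from this thread to keep it short. Because the very first value (0.004) is itself an outlier, the sketch seeds the mean with the median of a short warm-up window. That seeding step and the nSeed value are assumptions made here, not something from the thread, and they mean the first few samples are not handled strictly in real time.

```matlab
% First 20 values of cycle_periods from the thread (seconds)
x = [0.0040, 0.7740, 2.3040, 0.8340, 0.0040, 0.8630, 0.8020, 1.5400, ...
     3.0100, 0.7760, 0.0040, 0.7620, 0.7470, 0.7610, 0.0040, 0.7910, ...
     0.0040, 0.7760, 0.7560, 0.7640];

lambda = 0.4;    % forgetting factor from the question
nSeed = 10;      % hypothetical warm-up length, not from the thread

% Assumption: seed the mean with the median of the warm-up window,
% since the first sample cannot be trusted as a starting value
m = median(x(1:nSeed));
w = 1;

keep = false(size(x));        % logical mask of accepted samples
for k = 1:numel(x)
    if x(k) >= 0.5*m && x(k) <= 1.5*m
        % Accept the sample and fold it into the weighted mean
        w = lambda*w + 1;
        m = (1 - 1/w)*m + (1/w)*x(k);
        keep(k) = true;
    end
    % A rejected sample leaves m and w unchanged
end

cleaned = x(keep);            % samples that passed the test
rejectedIdx = find(~keep);    % positions of the rejected samples
```

On these 20 values the near-zero periods and the long periods (1.54 s and above) fall outside the band and are skipped, while the values around 0.75 to 0.86 s are kept.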
