How do I smooth or interpolate an array of data after cutting out some parts of it?

I have a large array of data from which I need to cut out short parts. The cut-out parts always have the same length (for example, 60 data points). Then I would like to replace the missing values with vectors from the remaining data. After that, I would like to smooth or interpolate the data to get rid of sudden changes in the data set. I don't have any experience with interpolation, so I need some suggestions on how to do it.
I have attached my data file. I have replaced the deleted parts with NaNs.
Dyuman Joshi on 13 Sep 2023
Which type of interpolation do you want to do?
Linear? Spline? or some other method?
Bence Laczó on 17 Sep 2023
I wanted to use the pchip (Piecewise Cubic Hermite Interpolating Polynomial) method.


Answers (2)

Robert Daly on 13 Sep 2023
I get the idea that maybe you have some bad data points you would like to get rid of and then fill with interpolated data.
I would suggest something like this:
X = [0:10];
Y= [1,1.2,1.25,1.3,1.32,5,1.32,1.3,1.25,1.2,1];
figure
plot(X,Y)
bad = Y > 4; % some criterion to determine what your bad data is.
% bad needs to be a logical vector the same length as your original data
% with 1 (true) where your bad data is and 0 (false) where the good data
% is.
Yi = Y; % copy all of the data to a new variable i.e. interpolated Y data
Yi(bad) = interp1(X(~bad),Y(~bad),X(bad));
% '~' is the not operator so '~bad' is the "good" data
% feed the good data as input to the interpolation function as source data
% (first two parameters)
% and get it to evaluate at all of the x points where the bad data was
% i.e. last parameter.
% use the result to replace the bad bits of data in Yi
hold on
plot(X,Yi)
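The same masking approach also works when the bad samples are already marked as NaN, and interp1 accepts a method argument such as 'pchip' (the method the asker mentions in the comments). A minimal sketch on synthetic data; the signal and the gap position are made up for illustration and are not taken from the attached file:

```matlab
% Synthetic stand-in for the real data (assumption, not the attached file)
x = 1:200;
y = sin(2*pi*x/80);
y(90:109) = NaN;               % pretend these samples were cut out

bad = isnan(y);                % logical mask of the missing samples
yi = y;                        % copy, then overwrite only the missing part
yi(bad) = interp1(x(~bad), y(~bad), x(bad), 'pchip');

plot(x, yi, x, y, 'LineWidth', 1.5)
legend({'pchip-filled','original with gap'})
```

Note that pchip is shape-preserving, so the filled values stay between the two samples that border the gap. It will not recreate a peak that fell inside the cut-out region.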

DGM on 17 Sep 2023
Edited: DGM on 17 Sep 2023
I don't know why nobody has mentioned fillmissing() or other inpainting tools.
load data.mat
% pick a region to inspect
idxrange = 2000:2299;
% use different interpolation
filled1 = fillmissing(data,'linear');
filled2 = fillmissing(data,'pchip');
filled3 = fillmissing(data,'makima');
filled4 = fillmissing(data,'movmean',100); % window size needs to be at least as wide as the gaps
% plot the filled and original data
plot(filled1(idxrange)); hold on
plot(filled2(idxrange))
plot(filled3(idxrange))
plot(filled4(idxrange))
plot(data(idxrange))
legend({'linear','pchip','makima','movmean','original'})
Of course, given what the data looks like, I have to question what's even appropriate here. Using a moving mean/median large enough to span these gaps basically makes a bunch of baloney data. I guess a spline does too, but I guess it all depends what you want.
Similarly, I don't see why the question of smoothing is necessary. The transitions at the boundary of the filled region are no less smooth than the surrounding data. If you wanted to smooth the data, I guess you could, but I don't know what that means in terms of validity.
load data.mat
idxrange = 1050:1149;
% smooth it? these all are using default parameters
smoo1 = smoothdata(data,'movmean');
smoo2 = smoothdata(data,'lowess');
smoo3 = smoothdata(data,'sgolay');
% plot the smoothed and original data
plot(smoo1(idxrange)); hold on
plot(smoo2(idxrange))
plot(smoo3(idxrange))
plot(data(idxrange),'linewidth',1.5)
legend({'movmean','lowess','sgolay','original'})
I'm just using defaults in the example. You can adjust the settings as needed.
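If smoothing is wanted only at the transitions, one option is to fill first and then smooth just the filled region plus a small margin, leaving the rest of the raw data untouched. This is only a sketch on synthetic data: the margin w and the test signal are assumptions chosen for illustration, not values taken from the attached file.

```matlab
% Synthetic stand-in for the real data (assumption)
x = 1:300;
data = sin(2*pi*x/100) + 0.05*cos(2*pi*x/7);
data(140:199) = NaN;                       % one 60-sample gap

bad = isnan(data);
filled = fillmissing(data, 'pchip');       % fill the gap first

w = 10;                                    % assumed margin (samples) on each side of a gap
near = movmax(double(bad), 2*w+1) > 0;     % gap samples plus w neighbours on each side

smoothed = smoothdata(filled, 'movmean', 2*w+1);
out = filled;
out(near) = smoothed(near);                % replace only in and around the gaps

plot(x, out, x, data, 'LineWidth', 1.5)
legend({'filled + locally smoothed','original with gap'})
```

The window length passed to smoothdata controls how strongly the transition is flattened; whether any smoothing is justified still depends on what the data is used for.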
Robert Daly on 18 Sep 2023
"I don't know why nobody has mentioned fillmissing() ..."
Lol because I am a dinosaur and learned how to do this sort of thing before fillmissing was introduced in 2016.
Bence Laczó on 18 Sep 2023
Edited: Bence Laczó on 18 Sep 2023
Actually, your solution is not exactly the one I wanted, but it definitely solves one of my problems, so thank you very much for your help!
My main goal was to fill the missing data with other randomly selected parts of the raw data. Unfortunately, due to large low-frequency noise, there can be large differences between the end of the data and the start of the substituted data part. I wanted to smooth out this difference to avoid sudden changes in the data. I am not sure if that is possible at all.
I have uploaded another raw data file and plotted part of it (between 29900 and 30200) with a substituted data part, which demonstrates my problem.
plot(data(29900:30200))
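One possible way to avoid the jump, sketched here under assumptions and not confirmed by anyone in this thread: take the donor segment together with one neighbouring sample on each side, then add a linear ramp so that those two neighbours land exactly on the samples bordering the gap. The synthetic signal, the gap position gapStart and the variable names are all hypothetical.

```matlab
rng(0);                                    % reproducible random choice
n = 1000; L = 60;                          % signal length, gap length
data = cumsum(0.1*randn(n,1)) + 0.05*randn(n,1);   % synthetic drifting signal (assumption)
gapStart = 400;
gapIdx = gapStart:(gapStart+L-1);
data(gapIdx) = NaN;                        % the cut-out part

% Collect donor start positions whose window (plus one sample each side) is NaN-free
isBad = isnan(data);
candidates = [];
for s = 2:(n-L)
    if ~any(isBad(s-1:s+L))
        candidates(end+1) = s; %#ok<AGROW>
    end
end
s = candidates(randi(numel(candidates)));  % random donor position
donor = data(s-1:s+L);                     % L+2 samples, including both neighbours

% Linear ramp that maps the donor's neighbours onto the gap's neighbours
leftOff  = data(gapStart-1) - donor(1);
rightOff = data(gapStart+L) - donor(end);
ramp = linspace(leftOff, rightOff, L+2)';
patched = donor + ramp;

filled = data;
filled(gapIdx) = patched(2:end-1);         % keep only the L interior samples
```

With this correction, the step at each boundary is about the size of a normal sample-to-sample step in the donor segment. The ramp does change the low-frequency content of the substituted part, so whether this is acceptable depends on the later analysis.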

Release

R2021a
