# How to interpolate and leave NaNs if long gaps?

K E on 5 Aug 2015
Commented: Andrew Pullin on 7 Feb 2017
I have a temperature measurement (x) that is sampled nearly regularly in time (t), except for data dropouts (missing t,x values). I want to make an interpolated xi that is regular in time, but mark the dropouts with NaNs rather than interpolating across the gaps. How do I do this? I would like to avoid looping through each interpolated value and replacing it with a NaN if it is far from any measurement, because the real data vector is long.
x=rand(1,60); % Fake temperature measurement
t=[1:10 21:30 41:50 61:70 81:90 101:110]; % Time samples with dropouts, e.g. t=11,12
tNoise = t + rand(1,length(t))/100; % Time step varies slightly between measurements so add a little noise
timeStep=median(diff(t)); % Time step if there were no dropouts
ti=min(tNoise):timeStep:max(tNoise); % Time vector without dropouts
xi=interp1(tNoise,x,ti); % No NaNs, so dropouts are filled with 'fake' data
% How to replace fake data with NaNs if xi value is more than timeStep away from any value in t?
plot(ti,xi, 'b.', tNoise,x, 'co'); % Show fake data between actual sample time
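
For reference, the masking step asked about in the comment above can be sketched without a loop. This is only a sketch, not a tested general solution: it looks up the time of the nearest real sample for every point of ti with a second `interp1` call, then blanks the points that lie too far from it. The names `tNearest` and `maxDist` are made up here, and the threshold of half a time step is an assumption (a threshold of exactly one time step is sensitive to the timing noise at the edges of each gap).

```matlab
% Same fake data as in the question
x = rand(1,60);
t = [1:10 21:30 41:50 61:70 81:90 101:110];
tNoise = t + rand(1,length(t))/100;
timeStep = median(diff(t));
ti = min(tNoise):timeStep:max(tNoise);
xi = interp1(tNoise, x, ti);            % linear interpolation, gaps are filled

% Time of the nearest real sample for every point of ti
tNearest = interp1(tNoise, tNoise, ti, 'nearest');

% Assumed threshold: anything further than half a step from a sample is a gap
maxDist = timeStep/2;
xi(abs(ti - tNearest) > maxDist) = NaN;

plot(ti, xi, 'b.', tNoise, x, 'co');    % gaps are no longer drawn
```

The lookup is vectorised, so it should scale to long records, but the threshold has to be chosen to suit the jitter in the real time stamps.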

David Young on 5 Aug 2015
I'd adopt a slightly different approach. I'd quantize the times first to get an array of just the times you want, then interpolate at those times only. The quantization also gives you the indices of those times in an array of evenly spaced times, so you can copy the interpolated data into just those elements of the array and leave the rest as NaN. The approach assumes that the noise in the times is small compared to the time step.
My version looks like this.
% Data (with t starting at -5 to demonstrate generality)
x=rand(1,60); % Fake temperature measurement
t=[-5:4 21:30 41:50 61:70 81:90 101:110]; % Time samples with dropouts, e.g. t=11,12
tNoise = t + rand(1,length(t))/100; % Time step varies slightly between measurements so add a little noise
timeStep=median(diff(t)); % Time step if there were no dropouts
% quantise tNoise
t1 = tNoise(1); % starting time
tCount = round((tNoise-t1)/timeStep); % time in units of timeStep
tIndex = tCount + 1; % index of these times in array of regular times
tQuant = t1 + timeStep * tCount; % time to nearest timeStep
% interpolate for these points only (need extrapolation for first or last)
xiAtTquant = interp1(tNoise, x, tQuant, 'linear', 'extrap');
% evenly spaced times for whole sequence
ti = linspace(tQuant(1), tQuant(end), tIndex(end));
% output array of x values
xi = NaN(1, tIndex(end));
xi(tIndex) = xiAtTquant;
% plot
plot(ti,xi, 'bx', tNoise,x, 'co'); % Gaps stay NaN, so nothing is plotted between the blocks of samples
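
To make the indexing easier to follow, here is the same idea on a tiny hand-made data set. The numbers are invented purely for illustration; the variable names follow the code above.

```matlab
% Five samples at roughly t = 0, 1, 2, then a dropout, then roughly t = 5, 6
tNoise   = [0.00 1.01 2.00 5.01 6.00];
x        = [10 11 12 15 16];
timeStep = 1;

tCount = round((tNoise - tNoise(1)) / timeStep);  % [0 1 2 5 6]
tIndex = tCount + 1;                              % [1 2 3 6 7]
tQuant = tNoise(1) + timeStep * tCount;           % [0 1 2 5 6]

% Interpolate only at the quantized sample times
xiAtTquant = interp1(tNoise, x, tQuant, 'linear', 'extrap');

xi = NaN(1, tIndex(end));   % seven slots, all NaN to start with
xi(tIndex) = xiAtTquant;    % slots 4 and 5 (t = 3 and t = 4) are never written

disp(isnan(xi))             % 0 0 0 1 1 0 0
```

Because only the slots listed in tIndex are written, the dropout shows up as NaN without any test on distances.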
##### 2 Comments
Andrew Pullin on 7 Feb 2017
This was an immensely helpful answer. I am surprised that a function like this does not already exist in MATLAB, so that datasets can be treated blindly as large blocks. In Mathematica, you can plot with "exclusions", which will do the gap skipping.
You should generalize this function into a "gapify" function and put it on the File Exchange. Or I should do that ...
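
A rough sketch of what such a function might look like, wrapping the steps from the answer above. To be clear, `gapify` is a hypothetical name and signature, not an existing MATLAB or File Exchange function. The sketch assumes t is sorted in increasing order, that its jitter is small compared to timeStep, and that no two samples round to the same grid slot (if they do, the later one overwrites the earlier one).

```matlab
function [ti, xi] = gapify(t, x, timeStep)
% GAPIFY  Hypothetical helper: resample x(t) onto a regular grid with
%   spacing timeStep, leaving NaN where no sample falls near a grid point.
%   Example: [ti, xi] = gapify(tNoise, x, 1); plot(ti, xi, 'bx')

    tCount = round((t - t(1)) / timeStep);   % grid position of each sample
    tIndex = tCount + 1;                     % 1-based index into the output
    tQuant = t(1) + timeStep * tCount;       % nearest regular time

    % Interpolate at the quantized sample times only
    xq = interp1(t, x, tQuant, 'linear', 'extrap');

    ti = t(1) + timeStep * (0:tCount(end));  % full regular time axis
    xi = NaN(size(ti));                      % gaps stay NaN
    xi(tIndex) = xq;
end
```

The time step is passed in explicitly rather than estimated inside the function, because a step estimated from noisy time stamps can drift by a fraction of a sample over a long record and push samples into the wrong slot.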