How to get 15-min average values from 10 min average values?
Does somebody have an idea? In the end, the average value for 1 hour should stay the same.
Answers (3)
Andrei Bobrov
on 19 Jul 2017
TT = timetable(minutes(10*(1:30)'),randi([-20,100],30,1),'VariableNames',{'var'}); % Let TT be your data: 30 samples, 10 minutes apart
TT_15min = retime(TT,minutes(15*(1:20)'),'pchip'); % 15-minute grid over the same 300-minute span
Grzegorz Knor
on 19 Jul 2017
t1 = 10:10:60
av1 = [3, 5, 6, 4, 6, 3]
t2 = 15:15:60
av2 = interp1(t1,av1,t2)
plot(t1,av1,'Marker','.')
hold on
plot(t2,av2,'Marker','o')
John D'Errico
on 19 Jul 2017
Edited: John D'Errico
on 19 Jul 2017
Note that any use of a standard interpolation tool will NOT ensure maintaining the hourly average. They simply are not designed to solve that problem.
Could you solve the problem with custom written interpolating spline code, designed to employ that constraint? Yes, I am sure you could do so. But I seriously doubt it is worth the effort to maintain an exact hourly average. As well, one would need to understand fully the goals.
Let's try a simple example, so we can see what the problem is.
% 3 hours worth of data.
T10 = 0:10:180;
Y10 = rand(size(T10));
% Our new sampling should be at increments of
% 15 minutes. So our new times will be at
T15 = 0:15:180;
% simplistically, that means we could just retain
% the data at times [0,30,60,90,120,150,180] minutes.
% then average the values at 10 and 20 minutes to
% gain a new value at 15. Do the same thing for 45
% minutes, taking the average from 40 and 50. This
% is implicitly what interp1 will do if you use
% linear interpolation. Is that right here? Not really.
Y15 = interp1(T10,Y10,T15);
plot(T10,Y10,'r-o',T15,Y15,'bx-')
So, the red curve is the 10 minute data. The blue curve is the 15 minute data, created using linear interpolation. As you can see, the 0,30,60,90... points are identical to the originals, and the intermediate points are the average of their neighbors.
Is that a good thing? Perhaps. It is simple. Even a spline interpolation would have been easy.
Y15s = interp1(T10,Y10,T15,'spline');
But does either maintain an hourly average? No. The problem arises from what I'll call a weighting issue. Let's compute the hourly averages to see:
% The hourly average of the three hours, for
% the 10 minute sampling was:
Ybar10 = mean(reshape(Y10(1:18),[],3))
Ybar10 =
0.68547 0.57097 0.32839
% Whereas if we compute the same thing for
% the 15 minute data, we get a different result.
Ybar15 = mean(reshape(Y15(1:12),[],3))
Ybar15 =
0.74018 0.60488 0.26241
So it is different. Is it seriously different? Not hugely so.
Again, the problem has an easily identifiable source, IF we write out what we did. The mean for the first hour is essentially:
(Y10(1) + Y10(2) + Y10(3) + Y10(4) + Y10(5) + Y10(6))/6
That is, we take the values at times [0,10,20,30,40,50] and average them. Then we do the same for times 60:10:110 and 120:10:170. This is a "rectangle rule" based hourly average. I could have been more sophisticated and split the value at time 60 between the two hourly averages, then done the same for the value at time 120. That would have been a "trapezoidal rule" scheme to compute an hourly average.
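As a minimal sketch of the two schemes just described, the snippet below computes rectangle-rule and trapezoidal-rule hourly averages side by side. The data values are made up for illustration (they are not the random data used above), and the variable names YbarRect, YbarTrap and w are assumptions of this sketch.

```matlab
% Hypothetical example data: 3 hours sampled every 10 minutes (19 points).
T10 = 0:10:180;
Y10 = [3 5 6 4 6 3 2 4 5 7 6 5 4 3 2 3 4 5 6];

% Rectangle rule: each hour averages its first 6 samples
% (times 0:10:50, 60:10:110, 120:10:170).
YbarRect = mean(reshape(Y10(1:18), 6, 3), 1);

% Trapezoidal rule: each hour uses 7 samples, with the two
% endpoint samples (e.g. times 0 and 60) given half weight.
w = [0.5 1 1 1 1 1 0.5] / 6;
YbarTrap = zeros(1, 3);
for h = 1:3
    idx = (h-1)*6 + (1:7);   % samples spanning hour h, endpoints included
    YbarTrap(h) = sum(w .* Y10(idx));
end

% Row 1: rectangle rule, row 2: trapezoidal rule.
disp([YbarRect; YbarTrap])
```

The trapezoidal version is the same as trapz(T10(idx), Y10(idx))/60 for each hour, so the sample on an hour boundary is shared equally between the two neighboring hours.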
So why don't we get the same result for the 15 minute average? For the first hour, that would be written as
(Y15(1) + Y15(2) + Y15(3) + Y15(4))/4
So we use the values at times 0, 15, 30 and 45 to compute the first hourly average. But if we go back to the original data, this reduces to:
(Y10(1) + (Y10(2) + Y10(3))/2 + Y10(4) + (Y10(5) + Y10(6))/2)/4
So we got the 15 minute value by averaging the results for times 10 and 20, etc. Expanding, we see the first hourly average, in terms of the original data:
Y10(1)/4 + Y10(2)/8 + Y10(3)/8 + Y10(4)/4 + Y10(5)/8 + Y10(6)/8
So this "average" has different weights for each time than the basic average of 10 minute values. In that original average, the weights were all 1/6. Here the different samples have different importance in the final 15 minute compilation versus the 10 minute compilation.
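The weighting argument can be checked numerically. The sketch below uses made-up values for the first hour (not the random data from the example) and compares the hourly mean of the linearly interpolated 15 minute values against the weighted sum of the 10 minute values. The names w15, mean15, weighted and mean10 are assumptions of this sketch.

```matlab
% Hypothetical 10-minute values for the first hour, plus the sample at 60 min.
T10 = 0:10:60;
Y10 = [3 5 6 4 6 3 2];

% Linear interpolation onto the 15-minute grid, as in the text.
T15 = 0:15:60;
Y15 = interp1(T10, Y10, T15);

% Hourly mean from the 15-minute values at times 0, 15, 30, 45.
mean15 = mean(Y15(1:4));

% The same number written as a weighted sum of the 10-minute values.
w15 = [1/4 1/8 1/8 1/4 1/8 1/8];
weighted = sum(w15 .* Y10(1:6));

% Plain 10-minute mean, where every weight is 1/6.
mean10 = mean(Y10(1:6));

fprintf('15-min mean: %.4f, weighted sum: %.4f, 10-min mean: %.4f\n', ...
        mean15, weighted, mean10);
```

The first two numbers agree with each other, while the third differs unless the data happen to make the two weightings coincide.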
The point of all this? Just using simple interpolation will almost always fail to produce a result that satisfies your stated goal of maintaining an hourly average. In fact, ANY interpolation, linear, spline, pchip, interp1, retime, etc., ALL will fail in general. (Only for very simple data will the results be correct. I.e., if your data is absolutely constant, then you can do anything you darn well wish to do, and anything will work.)
Can you fix it? Well, yes I arguably can fix it. Is it worth the effort? Sigh. Do I really need to get that deeply into the linear algebra to show how to make it work perfectly? Note that the simple interpolation will not be too terrible if your curve is fairly smooth.
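For what it's worth, here is one simple alternative that does keep the hourly average. It is not the constrained spline mentioned earlier, and it rests on an assumption the thread does not confirm: that each 10 minute value is the average over its own 10 minute interval, and that the signal can be treated as constant within that interval. Under that assumption, each 15 minute average is just the time-weighted mean of the 10 minute bins it overlaps. The data and the names A10, A15, W and blocks are hypothetical.

```matlab
% Hypothetical 10-minute interval averages covering 2 hours (12 bins).
A10 = [3 5 6 4 6 3 2 4 5 7 6 5];

% Every 30-minute block holds three 10-minute bins [a b c] and becomes
% two 15-minute bins:
%   first  15 min = (10*a + 5*b)/15
%   second 15 min = ( 5*b + 10*c)/15
W = [2 1 0; 0 1 2] / 3;             % overlap weights, each row sums to 1
blocks = reshape(A10, 3, []);       % one column per 30-minute block
A15 = reshape(W * blocks, 1, []);   % 15-minute averages in time order

% Hourly averages at the two resolutions.
hourly10 = mean(reshape(A10, 6, []), 1);
hourly15 = mean(reshape(A15, 4, []), 1);
disp([hourly10; hourly15])
```

Both rows of the displayed matrix match, because each 30 minute block keeps its total. The sketch needs the number of 10 minute values to be a multiple of 3 (and a multiple of 6 for the hourly check). The result is a piecewise-constant rebinning, so it will look less smooth than a spline.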