Repeating a time series from yyy-MM-dd to yyyy-MM-dd-hh-mm
You are now following this question
- You will see updates in your followed content feed.
- You may receive emails, depending on your communication preferences.
An Error Occurred
Unable to complete the action because of changes made to the page. Reload the page to see its updated state.
Show older comments
0 votes
Share a link to this question
I have a matrix where the first column is a time series with dates down to one minute resolution, in the format: yyyy-MM-dd-hh-mm. And in the columns next to it I want to add several types of measured data.
The problem is that some of this data has a lower resolution, for example only measured once a day in the format: yyyy-MM-dd.
What I want to do is insert this data in the high resolution matrix and having the daily values repeated across all hours and minutes of the correct day so that there are no gaps between the values.
Example: I have carbon dioxide data from 2011 measured daily. I want the measured value for 2011-01-01 to be repeated in the bigger matrix all the way from 2011-01-01-00-00 to 2011-01-01-23-59 where then 2011-01-02 starts and so on..
I imagine this shouldn’t be too complicated to do but I’m having trouble figuring out how to do it. Any suggestions?
Thanks.
Accepted Answer
Guillaume
on 17 Feb 2015
Actually, for this I would use date vectors. You can then use ismember with the 'rows' option to locate the matching rows of minute and daily data (using only the year month day columns of the date vectors):
%demo data, yours does not overlap:
DailyDatenum = [734504:734868; 1:365]';
MinuteDatenum = [734502:1/24/60:734870]';
Dailydv = datevec(DailyDatenum(:, 1));
Minutedv = datevec(MinuteDatenum(:, 1));
[ispresent, row] = ismember(Minutedv(:, 1:3), Dailydv(:, 1:3), 'rows');
MinuteDatenum(ispresent, 2) = DailyDatenum(row(ispresent), 2);
6 Comments
PetterS
on 18 Feb 2015
Oh I just realized that the minute time series I attached was for 2000 instead of 2011, that was a mistake! But when I changed that this method worked beautifully and did just what I had hoped, thank you so much! However I have two additional questions:
1. I also have more data that I want to fit in the same time series, some of it is in really strange intervals like every 7 min for a while and then a gap for a few days and then 15 min intervals and so on… really random stuff. Since this method breaks down the time so elegantly in date vectors, could I use it to automatically fit that data in the same way and fill in the gaps between the data? And what would I have to change in that case, would:
ismember(Minutedv(:, 1:5), Dailydv(:, 1:5), 'rows')
do the same thing but also include hours and minutes? Or is there something else I should adjust?
2. Could I use this method but instead of having it repeat the data linearly interpolate between the data gaps? Like in the code Star Strider suggested, but with the simplicity of using date vectors?
Guillaume
on 18 Feb 2015
For 1., I'm not sure I understand what you want to do. Do you just want to fill missing data with whatever previous value is available?
2. For interpolating, you probably need a different method.
PetterS
on 19 Feb 2015
Ok, maybe I will drop the interpolation part then. But what I mean by my first point is simply that I have several sets of data all measured during 2011, and there is very often gaps in the data.
What I want to do is place the measured data, whenever it is possible, in the correct time position (according to when it was measured) in the big time series (the one referred to as MinuteDatenum here) and then where there are gaps – fill them in with the closest existing value in the series.
Do you understand what I mean? Or would it be easier if I attached a .mat file with the values?
Guillaume
on 19 Feb 2015
That's a lot more complicated than your initial question. You can use ismember on the first 5 columns of the date vectors to insert your randomly spaced data into the minute data, but that won't fill the gaps. To find the beginnings and ends of gaps, a combination of diff and find should do the trick. You can then fill them using interpolation of the last value.
%assuming
%randomdv a n*6 matrix of date vectors, not necessarily ordered, e.g:
randomdv = [2011 01 01 12 01 00
2011 01 01 12 04 00
2011 01 01 12 05 00
2011 01 01 12 30 00
2011 01 01 12 21 00
2011 01 01 12 20 00];
%randomdata a n*1 vector of corresponding data, e.g:
randomdata = [1 4 5 30 21 20]';
%minutedv a m*6 matrix of date vectors in minute increment, not necessarily ordered, e.g:
minutedv = [repmat([2011 01 01 12], 30, 1) (1:30)'];
filloption = 'interpolate'; %or 'fillwithlast'
minutedata = nan(size(minutedv, 1), 1); %output
[ispresent, row] = ismember(minutedv(:, 1:5), randomdv(:, 1:5), 'rows');
minutedata(ispresent) = randomdata(row(ispresent));
%at this point we just copied randomdata in the corresponding row of minutedata
%find the NaN runs:
nantransitions = find(diff([0; isnan(minutedata); 0]));
runstarts = nantransitions(1:2:end);
runends = nantransitions(2:2:end)-1;
for run = 1:numel(runstarts)
runstart = runstarts(run);
runend = runends(run);
switch filloption
case 'fillwithlast'
filldata = minutedata(runstart - 1);
case 'interpolate'
filldata = linspace(minutedata(runstart-1), minutedata(runend+1), runend-runstart+3);
filldata = filldata(2:end-1);
otherwise
error('fill method not defined');
end
minutedata(runstart:runend) = filldata;
end
PetterS
on 20 Feb 2015
What an amazing answer, thank you so much this is something I never would have figured out myself!
The code works great when I use your demo data, but when I try it with my real data it only works until the point where you have written “at this point we just copied..”, when I run the loop after that it gives me the error ”Attempted to access minutedata(0); index must be a positive integer or logical” .
This is because the first element in my nantransitions is 1 (which makes runstarts(1)=1 ) and therefore filldata = minutedata(runstart - 1) becomes impossible.
Maybe this is very easy to resolve but it’s a bit difficult for me to figure out how I should tweak the code to avoid this because I don’t fully understand what the whole nantransition part is about. Do you have any suggestions in mind about what to do when the first element of nantransitions =1?
Guillaume
on 20 Feb 2015
Yes, I should have mentioned that the code assumes that there's no missing data at the beginning (for 'fillwithlast' and 'interpolate') and the end (for 'interpolate' only) of minutedata. Because, if there is, there's nothing to replicate / interpolate between.
You can deal with these special case anywhere within the for loop, in the case statements for example if you want different behaviour:
case 'fillwithlast'
if runstart == 0
%missing data at the beginning
filldata = minutedata(runend+1); %not sure what you want to do. This will fail if everything is NaN since runend+1 is then past the end
else
filldata = minutedata(runstart - 1);
end
case 'interpolate'
if runstart == 0
filldata = minutedata(runend+1); %maybe?
elseif runend == numel(minutedata)
filldata = minutedata(runstart-1); %maybe?
else
filldata = linspace(minutedata(runstart-1), minutedata(runend+1), runend-runstart+3);
filldata = filldata(2:end-1);
end
nantransitions is a temporary variable to help in finding the beginning and end of each run of NaN. Since isnan returns a vector of 0 and 1, (e.g. [0 0 1 1 1 0 1 1 ]), you know that you go from non-nan to nan when the diff is 1 and from nan to non-nan when the diff is -1. The rest of the diff is always 0. Therefore, you've got a transition whenever the diff is non-zero. I just flank it with 0 to make sure that the first transition is always non-nan to nan and the last nan to non-nan (and to make sure that nan at the beginning or end are caught). Hence odd transitions are non-nan to nan and even transitions are nan to non-nan.
More Answers (0)
Categories
Find more on Logical in Help Center and File Exchange
Products
Tags
See Also
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!Select a Web Site
Choose a web site to get translated content where available and see local events and offers. Based on your location, we recommend that you select: .
You can also select a web site from the following list
How to Get Best Site Performance
Select the China site (in Chinese or English) for best site performance. Other MathWorks country sites are not optimized for visits from your location.
Americas
- América Latina (Español)
- Canada (English)
- United States (English)
Europe
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom (English)