Repeating a time series from yyy-MM-dd to yyyy-MM-dd-hh-mm

Question

0 votes

I have a matrix where the first column is a time series with dates down to one minute resolution, in the format: yyyy-MM-dd-hh-mm. And in the columns next to it I want to add several types of measured data.

The problem is that some of this data has a lower resolution, for example only measured once a day in the format: yyyy-MM-dd.

What I want to do is insert this data in the high resolution matrix and having the daily values repeated across all hours and minutes of the correct day so that there are no gaps between the values.

Example: I have carbon dioxide data from 2011 measured daily. I want the measured value for 2011-01-01 to be repeated in the bigger matrix all the way from 2011-01-01-00-00 to 2011-01-01-23-59 where then 2011-01-02 starts and so on..

I imagine this shouldn’t be too complicated to do but I’m having trouble figuring out how to do it. Any suggestions?

Thanks.

0 Comments
Show -2 older comments Hide -2 older comments

Sign in to comment.

Sign in to answer this question.

Follow Question

Answer 1

Guillaume on 17 Feb 2015

Open in MATLAB Online

0 votes

Actually, for this I would use date vectors. You can then use ismember with the 'rows' option to locate the matching rows of minute and daily data (using only the year month day columns of the date vectors):

%demo data, yours does not overlap:
DailyDatenum = [734504:734868; 1:365]';
MinuteDatenum = [734502:1/24/60:734870]';
Dailydv = datevec(DailyDatenum(:, 1));
Minutedv = datevec(MinuteDatenum(:, 1));
[ispresent, row] = ismember(Minutedv(:, 1:3), Dailydv(:, 1:3), 'rows');
MinuteDatenum(ispresent, 2) = DailyDatenum(row(ispresent), 2);

6 Comments
Show 4 older comments Hide 4 older comments

PetterS on 18 Feb 2015

Open in MATLAB Online

Oh I just realized that the minute time series I attached was for 2000 instead of 2011, that was a mistake! But when I changed that this method worked beautifully and did just what I had hoped, thank you so much! However I have two additional questions:

1. I also have more data that I want to fit in the same time series, some of it is in really strange intervals like every 7 min for a while and then a gap for a few days and then 15 min intervals and so on… really random stuff. Since this method breaks down the time so elegantly in date vectors, could I use it to automatically fit that data in the same way and fill in the gaps between the data? And what would I have to change in that case, would:

ismember(Minutedv(:, 1:5), Dailydv(:, 1:5), 'rows')

do the same thing but also include hours and minutes? Or is there something else I should adjust?

2. Could I use this method but instead of having it repeat the data linearly interpolate between the data gaps? Like in the code Star Strider suggested, but with the simplicity of using date vectors?

Guillaume on 19 Feb 2015

Open in MATLAB Online

That's a lot more complicated than your initial question. You can use ismember on the first 5 columns of the date vectors to insert your randomly spaced data into the minute data, but that won't fill the gaps. To find the beginnings and ends of gaps, a combination of diff and find should do the trick. You can then fill them using interpolation of the last value.

%assuming 
%randomdv a n*6 matrix of date vectors, not necessarily ordered, e.g:
randomdv = [2011 01 01 12 01 00
            2011 01 01 12 04 00
            2011 01 01 12 05 00
            2011 01 01 12 30 00
            2011 01 01 12 21 00
            2011 01 01 12 20 00];  
%randomdata a n*1 vector of corresponding data, e.g:
randomdata = [1 4 5 30 21 20]';   
%minutedv a m*6 matrix of date vectors in minute increment, not necessarily ordered, e.g:
minutedv = [repmat([2011 01 01 12], 30, 1) (1:30)'];
filloption = 'interpolate'; %or 'fillwithlast'
minutedata = nan(size(minutedv, 1), 1); %output
[ispresent, row] = ismember(minutedv(:, 1:5), randomdv(:, 1:5), 'rows');
minutedata(ispresent) = randomdata(row(ispresent));
%at this point we just copied randomdata in the corresponding row of minutedata
%find the NaN runs:
nantransitions = find(diff([0; isnan(minutedata); 0]));
runstarts = nantransitions(1:2:end);
runends = nantransitions(2:2:end)-1;
for run = 1:numel(runstarts)
   runstart = runstarts(run);
   runend = runends(run);
   switch filloption
      case 'fillwithlast'
         filldata = minutedata(runstart - 1);
      case 'interpolate'
         filldata = linspace(minutedata(runstart-1), minutedata(runend+1), runend-runstart+3);
         filldata = filldata(2:end-1);
      otherwise
          error('fill method not defined');
   end
   minutedata(runstart:runend) = filldata;
end

PetterS on 20 Feb 2015

What an amazing answer, thank you so much this is something I never would have figured out myself!

The code works great when I use your demo data, but when I try it with my real data it only works until the point where you have written “at this point we just copied..”, when I run the loop after that it gives me the error ”Attempted to access minutedata(0); index must be a positive integer or logical” .

This is because the first element in my nantransitions is 1 (which makes runstarts(1)=1 ) and therefore filldata = minutedata(runstart - 1) becomes impossible.

Maybe this is very easy to resolve but it’s a bit difficult for me to figure out how I should tweak the code to avoid this because I don’t fully understand what the whole nantransition part is about. Do you have any suggestions in mind about what to do when the first element of nantransitions =1?

Guillaume on 20 Feb 2015

Open in MATLAB Online

Yes, I should have mentioned that the code assumes that there's no missing data at the beginning (for 'fillwithlast' and 'interpolate') and the end (for 'interpolate' only) of minutedata. Because, if there is, there's nothing to replicate / interpolate between.

You can deal with these special case anywhere within the for loop, in the case statements for example if you want different behaviour:

        case 'fillwithlast'
           if runstart == 0
              %missing data at the beginning
              filldata = minutedata(runend+1); %not sure what you want to do. This will fail if everything is NaN since runend+1 is then past the end
           else
              filldata = minutedata(runstart - 1);
           end
        case 'interpolate'
           if runstart == 0
              filldata = minutedata(runend+1); %maybe?
           elseif runend == numel(minutedata)
              filldata = minutedata(runstart-1); %maybe?
           else
              filldata = linspace(minutedata(runstart-1), minutedata(runend+1), runend-runstart+3);
              filldata = filldata(2:end-1);
           end

nantransitions is a temporary variable to help in finding the beginning and end of each run of NaN. Since isnan returns a vector of 0 and 1, (e.g. [0 0 1 1 1 0 1 1 ]), you know that you go from non-nan to nan when the diff is 1 and from nan to non-nan when the diff is -1. The rest of the diff is always 0. Therefore, you've got a transition whenever the diff is non-zero. I just flank it with 0 to make sure that the first transition is always non-nan to nan and the last nan to non-nan (and to make sure that nan at the beginning or end are caught). Hence odd transitions are non-nan to nan and even transitions are nan to non-nan.

Sign in to comment.

Repeating a time series from yyy-MM-dd to yyyy-MM-dd-hh-mm

0 Comments
Show -2 older comments Hide -2 older comments

Accepted Answer

6 Comments
Show 4 older comments Hide 4 older comments

More Answers (0)

Categories

Products

Tags

Community Treasure Hunt

Repeating a time series from yyy-MM-dd to yyyy-MM-dd-hh-mm

0 Comments Show -2 older comments Hide -2 older comments

Accepted Answer

6 Comments Show 4 older comments Hide 4 older comments

More Answers (0)

Categories

Products

Tags

See Also

Community Treasure Hunt

0 Comments
Show -2 older comments Hide -2 older comments

6 Comments
Show 4 older comments Hide 4 older comments