I'm trying to organize data so it can easily be averaged by date

I have a cell array of data where the first column is the full date. I've been asked to organize it in a matrix so that indexing with something like H(1,2,:) returns all the data from the second day of January. The dates are in the format yyyymmddHHMM, if that helps. I'm mostly looking for guidance on how to achieve something like this. Any help is appreciated.
Susan Santiago on 13 Oct 2018
I uploaded my workspace because there are many files and they're all .dat files, which can't be uploaded here. I think that's probably clearer anyway. The cell I'm concerned with is named C. Each row of C represents one data file.

Accepted Answer

jonas on 13 Oct 2018
Edited: jonas on 13 Oct 2018
"...something like H(1,2,:), it would return all the data from the second day of January."
Not a very good approach, in my opinion. How do you deal with the fact that different months have different numbers of days? By padding with NaNs?
It is much easier to put all your data in a timetable. You can then easily access specific days.
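% One timestamp per day, from 1 Jan 2000 through 1 Jan 2001
t = datetime(2000,1,1):days(1):datetime(2001,1,1);
% Placeholder data (all zeros); the semicolon suppresses the long printout
TT = timetable(t,zeros(length(t),1));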
You want to access data for a specific date? Easy:
TT('2001-1-1',:)
ans =
  timetable
       Time        Var1
    ___________    ____
    01-Jan-2001     0
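Since the question is about averaging by date, here is a minimal sketch of how yyyymmddHHMM stamps could be turned into a timetable and averaged per day. The sample timestamps, the values, and the names `stamps`, `temp` and `dailyMean` are made up for illustration; the real data in C would have to be reshaped into a similar layout first.

```matlab
% Hypothetical timestamps in the yyyymmddHHMM layout from the question
stamps = {'201801010000'; '201801010030'; '201801020000'; '201801020030'};
temp   = [1; 3; 10; 20];   % made-up measurements, one per timestamp

% datetime format letters are case sensitive: MM = month, mm = minute,
% HH = hour on the 24-hour clock
t = datetime(stamps, 'InputFormat', 'yyyyMMddHHmm');

% Passing the times via 'RowTimes' keeps the time column named "Time"
TT = timetable(temp, 'RowTimes', t);

% One row per calendar day, holding the mean of that day's samples
dailyMean = retime(TT, 'daily', 'mean');
```

retime needs R2016b or later. Replacing 'mean' with another method such as 'max' or 'sum' gives other per-day summaries without changing the rest of the code.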
Susan Santiago on 15 Oct 2018
Edited: Susan Santiago on 15 Oct 2018
Thanks! One last thing: is there any way to change my weirdo code so it's not just giving a daily average but all the results from the day? One of the main uses of this matrix is going to be plotting the data. Thanks again. And if you don't mind, how would I get just one variable from the matrix?
jonas on 15 Oct 2018
Edited: jonas on 15 Oct 2018
1. Yes and no. You have data every half hour, if I remember correctly. You could add a fourth dimension to the matrix for the "hour of day". That still would not work, because you have half hours. You could add a fifth dimension called "minute of day"... you can see how absurd this method is becoming, especially since about 98% of the minutes would be NaNs.
2. All variables are stored in the third dimension:
A(1,1,5)
outputs the "fifth" variable from the first of January. What is the fifth variable? You would have to look it up in some kind of table every time you want to extract one variable.
I fully understand that you want to comply with your professor's instructions. However, if you show him/her these two methods and explain the advantages of using tables (indexing by variable names, options for interpolation, easier access to specific dates, the possibility of storing different classes, easier plotting, as well as a variety of table-specific options that we have not even talked about), he/she would be crazy stubborn to opt for the array.
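To make the comparison concrete, here is a rough sketch of getting all the half-hourly rows for one day, and one variable by name, from a timetable. The data and the variable names `Temp` and `Humidity` are hypothetical, not taken from the uploaded workspace.

```matlab
% Hypothetical half-hourly data covering 1 and 2 January 2018
t  = (datetime(2018,1,1):minutes(30):datetime(2018,1,2,23,30,0))';
n  = numel(t);
TT = timetable((1:n)', (n:-1:1)', 'RowTimes', t, ...
               'VariableNames', {'Temp','Humidity'});

% Every half-hourly row from 2 January, not just a daily mean.
% timerange includes the start time and excludes the end time by default.
jan2 = TT(timerange(datetime(2018,1,2), datetime(2018,1,3)), :);

% A single variable, picked by name instead of by position
tempJan2 = jan2.Temp;

% The row times can go straight onto the x-axis
plot(jan2.Time, tempJan2)
```

Compared with A(1,1,5), nothing here depends on remembering which position a variable occupies, and no NaN padding is needed for missing times.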

More Answers (1)

Peter Perkins on 17 Oct 2018
"return all the data from the second day of January"
Imagine having this timetable:
>> tt = array2timetable(rand(100,2),'RowTimes',datetime(2018,1,1,0:8:792,0,0));
>> head(tt)
ans =
  8×2 timetable
            Time             Var1        Var2
    ____________________    _______    ________
    01-Jan-2018 00:00:00    0.85071     0.55903
    01-Jan-2018 08:00:00    0.56056      0.8541
    01-Jan-2018 16:00:00    0.92961     0.34788
    02-Jan-2018 00:00:00    0.69667     0.44603
    02-Jan-2018 08:00:00    0.58279    0.054239
    02-Jan-2018 16:00:00     0.8154     0.17711
    03-Jan-2018 00:00:00    0.87901     0.66281
    03-Jan-2018 08:00:00    0.98891     0.33083
In recent versions of MATLAB (R2018a and later IIRC), you can do this:
>> tt(timerange('01-Jan-2018','day'),:)
ans =
  3×2 timetable
            Time             Var1       Var2
    ____________________    _______    _______
    01-Jan-2018 00:00:00    0.85071    0.55903
    01-Jan-2018 08:00:00    0.56056     0.8541
    01-Jan-2018 16:00:00    0.92961    0.34788
>> tt(timerange(datetime(2018,1,1),'day'),:)
ans =
  3×2 timetable
            Time             Var1       Var2
    ____________________    _______    _______
    01-Jan-2018 00:00:00    0.85071    0.55903
    01-Jan-2018 08:00:00    0.56056     0.8541
    01-Jan-2018 16:00:00    0.92961    0.34788
In earlier versions, you can do the same thing, with a bit more typing:
>> tt(timerange(datetime(2018,1,1),datetime(2018,1,2)),:)
ans =
  3×2 timetable
            Time             Var1       Var2
    ____________________    _______    _______
    01-Jan-2018 00:00:00    0.85071    0.55903
    01-Jan-2018 08:00:00    0.56056     0.8541
    01-Jan-2018 16:00:00    0.92961    0.34788
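If the goal is literally "the second day of January" regardless of the year, a logical mask built from the month and day of the row times is another option. This is only a sketch on the same kind of random timetable as above; the names `isJan2` and `jan2` are introduced here for illustration.

```matlab
% Same shape of data as in the answer, rebuilt so the snippet runs by itself
tt = array2timetable(rand(100,2), ...
                     'RowTimes', datetime(2018,1,1,0:8:792,0,0));

% Logical mask: rows whose timestamp falls on 2 January of any year
isJan2 = month(tt.Time) == 1 & day(tt.Time) == 2;

% All variables for those rows
jan2 = tt(isJan2, :);
```

Unlike timerange, the mask does not require the matching rows to be contiguous, so it would also pick up 2 January from several different years in one step.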
