extract especific rows from matrix

Hi, I have a two columns data. The first column includes time yyyymmddhhmm (which means Year; month; day, hour and minutes). The second column contains numerical data. The interval between time tables are 15 minutes so you see the first row is 202002041000 and the second row is 202002041015.
I have a very big matrix (3500*2) and some times some data are missed, so the pattern is not uniform. I want to write a code to extract the rows which are for every 30 minutes (the rows that ends to 30). So maybe a code which check whether the first column data ends to 30 helps.
Could you please help me with this
202002041000 25
202002041015 35
202002041030 54
202002041045 54
202002041100 23

 Accepted Answer

One option —
c = [202002041000 25
202002041015 35
202002041030 54
202002041045 54
202002041100 23
202002041200 23]; % Last Row Added To Test Code
T1 = array2table(c);
T1.c1 = datetime(num2str(T1.c1),'InputFormat','yyyyMMddHHmm')
T1 = 6×2 table
c1 c2 ____________________ __ 04-Feb-2020 10:00:00 25 04-Feb-2020 10:15:00 35 04-Feb-2020 10:30:00 54 04-Feb-2020 10:45:00 54 04-Feb-2020 11:00:00 23 04-Feb-2020 12:00:00 23
TT1 = table2timetable(T1)
TT1 = 6×1 timetable
c1 c2 ____________________ __ 04-Feb-2020 10:00:00 25 04-Feb-2020 10:15:00 35 04-Feb-2020 10:30:00 54 04-Feb-2020 10:45:00 54 04-Feb-2020 11:00:00 23 04-Feb-2020 12:00:00 23
newTimes = TT1.c1(1):minutes(30):TT1.c1(end);
TT2 = retime(TT1,'regular','fillwithmissing','TimeStep',minutes(30))
TT2 = 5×1 timetable
c1 c2 ____________________ ___ 04-Feb-2020 10:00:00 25 04-Feb-2020 10:30:00 54 04-Feb-2020 11:00:00 23 04-Feb-2020 11:30:00 NaN 04-Feb-2020 12:00:00 23
This fills the missing data with NaN. To interpolate the missing values instead, replace 'fillwithmissing' with 'linear' or any of the other applicable options.
It would help to have more data, especially with missing times and more detail on the desired result for the missing data, however this will work for a start.
See the documentation for the various functions to understand how the code works.
.

11 Comments

Hi Star,
Thank you very much for your answer. Maybe I was not clear enough.
It is not like there is some times that they did not report the data for.
In fact, some rows are missing.
See, I have data for every 15 minutes. I wish to have a code that is able to collect rows that reports data for every 30 minutes. So for instance we have follwoing data:
04-Feb-2020 10:00:00 25
04-Feb-2020 10:30:00 54
04-Feb-2020 11:00:00 23
04-Feb-2020 11:30:00 25
04-Feb-2020 12:00:00 23
04-Feb-2020 12:15:00 30
04-Feb-2020 12:30:00 45
04-Feb-2020 12:45:00 50
04-Feb-2020 13:00:00 60
I wish to have a code to report me:
04-Feb-2020 10:30:00 54
04-Feb-2020 12:30:00 45
Just consider that some times the data has gap. For instance I jumed 1 hour, not 30 minutes.
So, the code should first check whether time ends with 30:00 and if yes, it should collect and report that data.
Hi again star, thank you so much for your following up.
The missing data should be interpolated. But the important point for me is that current data have 15 minutes intervals. I wish to have a code that is able to extracts the data hourly.
For instance:
04-Feb-2020 10:00:00 25
04-Feb-2020 10:15:00 30
04-Feb-2020 10:30:00 54
04-Feb-2020 10:45:00 65
04-Feb-2020 11:00:00 23
04-Feb-2020 11:15:00 27
04-Feb-2020 11:30:00 25
04-Feb-2020 11:45:00 63
04-Feb-2020 12:00:00 23
04-Feb-2020 12:15:00 30
04-Feb-2020 12:30:00 45
04-Feb-2020 12:45:00 50
04-Feb-2020 13:00:00 60
a code which extract data and its time every 1 hour. So, you can see the result of above time table should be like this:
04-Feb-2020 10:30:00 54
04-Feb-2020 11:30:00 25
04-Feb-2020 12:30:00 45
The data I have is for 10 years of data in 15 minutes interval.
I am attachinng the data file here.
Thank you. Armin
I am still not certain what the desired result is. I am reading this as wanting only the data for every hour beginning with the first time the data are collected at the 30-minute time (on the half-hour).
The data in the Comment appear to be much differernt from the data originally posted.
Also, while there do not appear to be any missing values here, are the missing data to be treated as NaN values, or interpolated?
c = {'04-Feb-2020 10:00:00' 25
'04-Feb-2020 10:30:00' 54
'04-Feb-2020 11:00:00' 23
'04-Feb-2020 11:30:00' 25
'04-Feb-2020 12:00:00' 23
'04-Feb-2020 12:15:00' 30
'04-Feb-2020 12:30:00' 45
'04-Feb-2020 12:45:00' 50
'04-Feb-2020 13:00:00' 60};
T1 = cell2table(c);
T1.c1= datetime(T1.c1) % Original Table
T1 = 9×2 table
c1 c2 ____________________ __ 04-Feb-2020 10:00:00 25 04-Feb-2020 10:30:00 54 04-Feb-2020 11:00:00 23 04-Feb-2020 11:30:00 25 04-Feb-2020 12:00:00 23 04-Feb-2020 12:15:00 30 04-Feb-2020 12:30:00 45 04-Feb-2020 12:45:00 50 04-Feb-2020 13:00:00 60
Iv = find(minute(T1.c1) == 30); % Data On The Half-Hour
TT1 = table2timetable(T1) % Create 'timetable'
TT1 = 9×1 timetable
c1 c2 ____________________ __ 04-Feb-2020 10:00:00 25 04-Feb-2020 10:30:00 54 04-Feb-2020 11:00:00 23 04-Feb-2020 11:30:00 25 04-Feb-2020 12:00:00 23 04-Feb-2020 12:15:00 30 04-Feb-2020 12:30:00 45 04-Feb-2020 12:45:00 50 04-Feb-2020 13:00:00 60
newTimes = TT1.c1(Iv(1)):hours(1):TT1.c1(Iv(end)); % Times To Sample (Or Interppolate)
newTimes = 1×3 datetime array
04-Feb-2020 10:30:00 04-Feb-2020 11:30:00 04-Feb-2020 12:30:00
TT2 = retime(TT1,newTimes) % Selected Data
TT2 = 3×1 timetable
c1 c2 ____________________ __ 04-Feb-2020 10:30:00 54 04-Feb-2020 11:30:00 25 04-Feb-2020 12:30:00 45
I do not understand the reason for omitting the last row (for 12:30) from the posted desired result. The code includes it here.
.
Hi Star,
I am sorry. I uploaded wrong file. In this email, I am sending the right file. Sorry again. Could you please take a look at that?
Does this produce the desired result?
T1 = readtable('https://www.mathworks.com/matlabcentral/answers/uploaded_files/776583/file.xlsx', 'VariableNamingRule','preserve')
T1 = 349546×2 table
datetime 67201_00060 ____________________ ___________ 16-Oct-2010 00:00:00 28.8 16-Oct-2010 00:15:00 28.8 16-Oct-2010 00:30:00 28.8 16-Oct-2010 00:45:00 28.8 16-Oct-2010 01:00:00 28.8 16-Oct-2010 01:15:00 28.8 16-Oct-2010 01:30:00 27.8 16-Oct-2010 01:45:00 27.8 16-Oct-2010 02:00:00 27.8 16-Oct-2010 02:15:00 27.8 16-Oct-2010 02:30:00 27.8 16-Oct-2010 02:45:00 27.8 16-Oct-2010 03:00:00 27.8 16-Oct-2010 03:15:00 25.9 16-Oct-2010 03:30:00 25.9 16-Oct-2010 03:45:00 25.9
Iv = find(minute(T1.datetime) == 30); % Data On The Half-Hour
TT1 = table2timetable(T1) % Create 'timetable'
TT1 = 349546×1 timetable
datetime 67201_00060 ____________________ ___________ 16-Oct-2010 00:00:00 28.8 16-Oct-2010 00:15:00 28.8 16-Oct-2010 00:30:00 28.8 16-Oct-2010 00:45:00 28.8 16-Oct-2010 01:00:00 28.8 16-Oct-2010 01:15:00 28.8 16-Oct-2010 01:30:00 27.8 16-Oct-2010 01:45:00 27.8 16-Oct-2010 02:00:00 27.8 16-Oct-2010 02:15:00 27.8 16-Oct-2010 02:30:00 27.8 16-Oct-2010 02:45:00 27.8 16-Oct-2010 03:00:00 27.8 16-Oct-2010 03:15:00 25.9 16-Oct-2010 03:30:00 25.9 16-Oct-2010 03:45:00 25.9
newTimes = TT1.datetime(Iv(1)):hours(1):TT1.datetime(Iv(end)); % Times To Sample (Or Interppolate)
TT2 = retime(TT1,newTimes) % Selected Data
TT2 = 87696×1 timetable
datetime 67201_00060 ____________________ ___________ 16-Oct-2010 00:30:00 28.8 16-Oct-2010 01:30:00 27.8 16-Oct-2010 02:30:00 27.8 16-Oct-2010 03:30:00 25.9 16-Oct-2010 04:30:00 25.9 16-Oct-2010 05:30:00 25 16-Oct-2010 06:30:00 25 16-Oct-2010 07:30:00 25 16-Oct-2010 08:30:00 24 16-Oct-2010 09:30:00 23.1 16-Oct-2010 10:30:00 23.1 16-Oct-2010 11:30:00 23.1 16-Oct-2010 12:30:00 22.2 16-Oct-2010 13:30:00 22.2 16-Oct-2010 14:30:00 22.2 16-Oct-2010 15:30:00 21.4
This produces rows for every hour on the half-hour, as I believe was the requested result.
.
That's grat/ It is exactly what I wanted. Thank you so much :)
As always, my pleasure!
.
Hi Star, good morning,
Thank you for your help. Does this code work on Matlab 2018 or a newer version is needed?
and also using what command I can write a csv file for the results?
Good moirning!
I only tested ot in R2021b, since I use that here and on my own computer offline. I no longer have access to either R2018 version, so I cannot test it with them. (The online documentation for R2018b is still available, so I am referring to it here.)
I always recommend upgrading to the most recent version, since there are a number of new features that are not present in R2018b.
The following applies to R2018b. (No changes will be required to run my code in R2021b.)
It will be necessary to omit the 'VariableNamingRule ,'preserve' name-value pair from the readtable call because that option (nor any similar option) does not exist in R2018b.
It looks like the retime call will work without modification.
To write it, use the writetable function. It may be necessary to convert the timetable back to a table with the timetable2table function first, since it may not support writing timetable arrays. It will be necessary to experiment first to be certain that step is necessary. (That also appears to be true for R2021b.)
.
Thank you so much. I could successuly do that in Matlab online 2021b.
Thanks again.
As always, my pleasure!
.

Sign in to comment.

More Answers (0)

Products

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!