How to check data by years?
Dear all,
I have a time series that starts in 1996 and ends in 2017. The data are hourly, so every day has 24 points. Each hourly value has a quality flag that tells me whether the value is usable; the flags are letters. So far I have cell arrays of the years and of the flags (24 flags per day, because the flags are hourly).
I want to count the 'D' flags in every year. Basically, I want a table that gives, for every year from 1996 to 2017, the number of 'D' flags.
Thank you, M
2 Comments
Jan
on 28 Aug 2017
Start by explaining the problem with useful details. What class and size do the data have? Is it a table or struct, a matrix or a cell? What is "the flag D"? Post a relevant part of the inputs and explain what you want as output. Then it is easier to post an answer.
Accepted Answer
KL
on 29 Aug 2017
Assuming that you're working with tables, something like this might help,
% Create dummy data: hourly timestamps from 1 Jan 2016 00:00 to 31 Dec 2017 23:00
dt = datetime([2016 1 1 0 0 0]):hours(1):datetime([2017 12 31 23 0 0]);
Flags = randi([1,5],size(dt)); % random numeric flags from 1 to 5
[Y,M,D,H] = datevec(dt); % split each timestamp into year, month, day, hour
T = table(Y',M',D',H',Flags','VariableNames',{'Year','Month','Day','Hour','Flags'});
% Example scenario: fetch all rows with flag 4 from year 2016
T_1 = T(T.Year==2016 & T.Flags==4,:);
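To go from selecting the rows of one year to a count per year, the same logical test can be summed inside a loop over the years. This is only a sketch on random placeholder data; the names perYear and NumFlag4 are made up for the illustration, and flag 4 is simply the flag from the example scenario.

```matlab
% Hypothetical sample data (not the asker's): hourly timestamps over two years
dt = (datetime(2016,1,1,0,0,0):hours(1):datetime(2017,12,31,23,0,0))';
Flags = randi([1,5], size(dt));          % random numeric flags 1..5 (placeholders)
T = table(year(dt), Flags, 'VariableNames', {'Year','Flags'});

% Count the rows carrying flag 4 in each year
years  = unique(T.Year);                 % sorted list of the years present
counts = zeros(size(years));
for k = 1:numel(years)
    counts(k) = sum(T.Year == years(k) & T.Flags == 4);
end
perYear = table(years, counts, 'VariableNames', {'Year','NumFlag4'})
```

For letter flags stored as a char column, the test T.Flags == 4 would become T.Flags == 'D'.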
5 Comments
KL
on 30 Aug 2017
It looks like your QFLAG column has multiple subcolumns, one for each hour of the day. I'd sort that out by combining them into a single column and adding a column called HOUR to the table (see my table).
%create dummy data
dt = datetime([2016 1 1 0 0 0]):hours(1):datetime([2016 1 1 12 00 00]);
Flags = ['M';'D';'M';'D';'M';'D';'M';'D';'M';'D';'M';'D';'M']; %now I have Ms and Ds
[Y M D H] = datevec(dt);
T = table(Y',M',D',H',Flags,'VariableNames',{'Year','Month','Day','Hour','Flags'});
%example scenario, fetch all rows with flag 'D' from year 2016
T_1 = T(T.Year==2016 & T.Flags=='D',:); % now T.Flags=='D'
and the result is,
>> T
T =
    Year    Month    Day    Hour    Flags
    ____    _____    ___    ____    _____
    2016    1        1       0      M
    2016    1        1       1      D
    2016    1        1       2      M
    2016    1        1       3      D
    2016    1        1       4      M
    2016    1        1       5      D
    2016    1        1       6      M
    2016    1        1       7      D
    2016    1        1       8      M
    2016    1        1       9      D
    2016    1        1      10      M
    2016    1        1      11      D
    2016    1        1      12      M
>> T_1
T_1 =
    Year    Month    Day    Hour    Flags
    ____    _____    ___    ____    _____
    2016    1        1       1      D
    2016    1        1       3      D
    2016    1        1       5      D
    2016    1        1       7      D
    2016    1        1       9      D
    2016    1        1      11      D
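If the flags stay in the one-row-per-day layout with 24 hourly subcolumns, the 'D' flags can also be counted per year without reshaping. The sketch below guesses at that layout: dayYear (a numeric year for each day) and qflag (an nDays-by-24 cell array of letters) are hypothetical names and a hand-made toy data set, not the asker's actual variables. If the years are stored as text, they would need converting to numbers first.

```matlab
% Hypothetical daily layout: one row per day, 24 hourly flags per row
dayYear = [1996; 1996; 1997; 1997; 1997];   % year of each day (assumed numeric)
qflag = repmat({'M'}, 5, 24);               % 5-by-24 cell array, all 'M' to start
qflag(1, [2 5]) = {'D'};                    % two 'D' flags on day 1 (1996)
qflag(4, 1:3)   = {'D'};                    % three 'D' flags on day 4 (1997)

dPerDay = sum(strcmp(qflag, 'D'), 2);       % number of 'D' flags on each day
[yrs, ~, idx] = unique(dayYear);            % idx maps every day to its year group
dPerYear = accumarray(idx, dPerDay);        % add up the daily counts within each year
result = table(yrs, dPerYear, 'VariableNames', {'Year','NumD'})
```

With this toy input the table lists 2 'D' flags for 1996 and 3 for 1997.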
More Answers (1)
Steven Lord
on 30 Aug 2017
Build a sample table.
rng default
n = 60;
theyear = randi([2010 2020], n, 1);
themonth = randi(12, n, 1);
theday = randi(28, n, 1);
flags = randi(5, n, 1);
T = table(theyear, themonth, theday, flags, ...
'VariableNames', {'Year', 'Month', 'Day', 'Flag'});
% Select the Flag variable for all rows in the table
% whose Year variable is equal to 2017
flagsFor2017 = T{T.Year == 2017, 'Flag'};
% How many are there?
size(flagsFor2017, 1)
% Look at the raw data
flagsFor2017
% Bin them. I use 1:6 here because the flags can range from 1 to 5
% If I had used 1:5 then 4 and 5 would have been in the same bin
histcounts(flagsFor2017, 1:6)
Alternatively, if you want to process all the years and all the flags at once, use histcounts2 or histogram2.
histogram2(T.Year, T.Flag)
From visual inspection of this particular data set, flag 2 in 2017 was the most common combination.
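For numbers rather than a plot, histcounts2 can return the same year-by-flag counts as a matrix. This is a sketch on randomly generated stand-in data; the edge vectors, the FlagN column names and the counts table are choices made for this illustration only.

```matlab
% Hypothetical sample data: 60 random (year, flag) pairs
rng default
n = 60;
Year = randi([2010 2020], n, 1);
Flag = randi(5, n, 1);

% Edges need one more entry than there are bins so that the last
% year (2020) and the last flag (5) each get a bin of their own
yearEdges = 2010:2021;
flagEdges = 1:6;
N = histcounts2(Year, Flag, yearEdges, flagEdges);   % rows: years, columns: flags

% Wrap the count matrix in a table with one row per year
counts = array2table(N, 'VariableNames', {'Flag1','Flag2','Flag3','Flag4','Flag5'});
counts.Year = (2010:2020)';
counts
```

Here N(8,2) is the number of rows with year 2017 and flag 2, because 2017 is the eighth year bin.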