how to select data based on month

20 views (last 30 days)
I have thousand of data points (Nx2, col 1=date and col 2=data)for a year for different dates for different sites. I want to make an average for the data for each month. My data are in excel and and date format is yyyymmdd (attached image).I easily can sort data suing sortrows and then I can calculate the average as below. But this is not an efficient method.Looking for an efficient method. Thanks in advance. M1=mean(data(1:30,2)); M2=mean(data(31:69, 2)); M3=mean(data(70:112, 2)); ..............

Accepted Answer

Mohammad Abouali
Mohammad Abouali on 7 Apr 2015
Edited: Mohammad Abouali on 7 Apr 2015
Here is what you should do:
I have assumed that you store those two columns in a csv file as follow:
% reading the CSV File
data=csvread('sample.csv');
% extracting data into a MATLAB Table variable
T=table();
T.year=floor(data(:,1)/10000);
T.mm=floor((data(:,1)-T.year*10000)/100);
T.dd=data(:,1)-T.year*10000-T.mm*100;
T.value=data(:,2);
Now To get yearly average:
grpstats(T,'year','mean','DataVars',{'value'})
ans =
year GroupCount mean_value
____ __________ __________
2010 2010 365 0.49589
2011 2011 365 0.51351
2012 2012 366 0.50263
2013 2013 365 0.52623
2014 2014 365 0.51586
2015 2015 360 0.5009
To get monthly average
grpstats(T,{'year','mm'},'mean','DataVars',{'value'})
ans =
year mm GroupCount mean_value
____ __ __________ __________
2010_1 2010 1 31 0.5204
2010_2 2010 2 28 0.57318
2010_3 2010 3 31 0.494
2010_4 2010 4 30 0.41664
2010_5 2010 5 31 0.49894
2010_6 2010 6 30 0.47679
2010_7 2010 7 31 0.46417
// I have chopped rest of the results.
I have attached the sample.csv for you to compare the format of your data set to this one. This data is randomly generated.
for more information on the command refer to grpstat()
P.S. please include some data next time.
  7 Comments
Stephanie Allen
Stephanie Allen on 29 Oct 2015
Hi Mohammed, Great answer, works very well! Is it possible to average more than just one column of data? If so how is this possible? Many thanks!
Mohammad Abouali
Mohammad Abouali on 4 Nov 2015
Hi, Stephanie,
Yes it is possible. You can list all the data columns that you want as DataVars. something like this:
grpstats(T,{'year','mm'},'mean','DataVars',{'value','Column2', 'column3'})
where column2 and column3 should be the name of the other data columns that you want.
If you want all columns then simply don't provide any 'DataVars' option; so something like:
grpstats(T,{'year','mm'},'mean')
If you use this, it automatically calculates the mean for all data columns.

Sign in to comment.

More Answers (1)

Chad Greene
Chad Greene on 7 Apr 2015
Or with a time vector t in datenum format, you can use downsample_ts. Syntax would be:
monthlymean_x = downsample_ts(x,t,'monthly','mean');

Categories

Find more on Data Type Conversion in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!