# How can I convert daily rainfall data into monthly data?

Govind Kumar on 12 Jun 2019
Edited: Adam Danz on 12 Jun 2019
I have daily rainfall data covering 113 years for 17415 stations, stored as a 41273*129*135 array, where 41273 is the number of days and 129*135 is the grid of stations. The data is in .mat format.

Bob Nbob on 12 Jun 2019
What do you mean by, 'convert daily rainfall data in monthly data'? Are you looking to get a single average value? Are you looking to reshape the layout of the data to capture the daily information from each month as a separate 'block,' if you will?
Govind Kumar on 12 Jun 2019
I want the sum for each month, but leap years and the different number of days in each month are giving me trouble. I want the monthly sums in 1356*129*135 format (113*12 = 1356).

Adam Danz on 12 Jun 2019
Edited: Adam Danz on 12 Jun 2019
I suggest reorganizing your data into a timetable with 41273 rows and 4 columns: Time, Lat, Long, RainFall.
Then you can use retime() to perform monthly operations such as sum, mean, etc.
Here's an example.
% Create fake data
dt = datetime('01/Jan/1900') : datetime('31/Dec/2012');
long = randi(360,size(dt))-180;
lat = randi(180,size(dt))-90;
rainfall = rand(size(dt))*10;
% Put data into timetable
tt = timetable(dt',lat',long',rainfall','VariableNames',{'Lat','Long','Rainfall'});
% Calculate monthly rainfall combining all locations
% (note: retime applies the sum to every variable, so Lat and Long are summed too)
monthlyRainfall = retime(tt,'Monthly','Sum');
See the link I provided to retime() for many other ways to use this function with timetables.
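Since leap years and unequal month lengths were the original worry, a quick sanity check may help. The sketch below is not part of the answer above. It assumes exactly one unit of rain per day during 1904 (a leap year), so each monthly sum should equal the number of days in that month. The variable names `ttCheck` and `moCheck` are illustrative.

```matlab
% One unit of rain per day for every day of 1904 (a leap year)
dtCheck = (datetime(1904,1,1) : datetime(1904,12,31))';
ttCheck = timetable(dtCheck, ones(size(dtCheck)), 'VariableNames', {'Rainfall'});
% Monthly sums: retime bins rows by calendar month, so no manual day counting is needed
moCheck = retime(ttCheck, 'monthly', 'sum');
% Each sum equals the month length, including 29 days for February
disp(moCheck.Rainfall')   % 31 29 31 30 31 30 31 31 30 31 30 31
```

If the sums match the calendar, the same call can be trusted to handle the month boundaries in the real data.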
[Update]
To convert your 3D array to a timetable, follow this demo. It uses 1000 days of fake data rather than 41273 days, which would take a long time to generate. There are two ways to represent your 17415 locations, and both are included in the code; choose whichever suits you best.
% Here's a fake version of your data except for 1000 days (to save processing time).
% With the size of your data, some of these lines could consume some significant time.
% The vectors must be column vectors.
data = rand(1000, 129, 135);
dt = datetime('01/Jan/1900') + (0:999)'; %column vector of 1000 dates
loc1 = (1:129)'; %Column vector
loc2 = (1:135)'; %Column vector
% Create the columns for the table. Each of these vars should be same size.
dataVec = data(:);
dtVec = repmat(dt,size(data,2)*size(data,3),1);
loc1Vec = repmat(repelem(loc1,size(data,1)),size(data,3),1);
loc2Vec = repelem(loc2,size(data,1)*size(data,2),1);
% Create time table
tt = timetable(dtVec,loc1Vec,loc2Vec,dataVec,'VariableNames',{'Lat','Long','Rainfall'});
% --OR--
[~,~,locGroups] = unique([loc1Vec,loc2Vec],'rows');
tt = timetable(dtVec,locGroups,dataVec,'VariableNames',{'LocationGroup','Rainfall'});
% Look at first 10 rows
tt(1:10,:)
[Update II]
If you'd like to perform operations for each month AND for each location, you should organize your timetable so that each location gets its own column.
data = rand(90, 129, 135);
dt = datetime('01/Jan/1900') + (0:89)'; %column vector of 90 dates
rainMat = reshape(data,size(data,1),[]);
locNames = strsplit(sprintf('Loc%d ',1:size(rainMat,2)));
tt = array2timetable(rainMat,'RowTimes',dt,'VariableNames', locNames(1:end-1));
moSum = retime(tt,'Monthly','Sum');
%View the first 2 months, first 6 locations
moSum(1:2,1:6)
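The question asked for the result as a months*129*135 array. One possible way to get there, sketched under the assumption that the columns were produced by the same `reshape` as above, is to pull the numeric values out of the monthly timetable and reshape them back. The name `monthly3D` is illustrative, and the default variable names from `array2timetable` are used here for brevity.

```matlab
% Small fake data set: 90 days starting 1 Jan 1900 (covers Jan, Feb, Mar 1900)
data = rand(90, 129, 135);
dt = datetime(1900,1,1) + caldays(0:89)';
% Flatten the station grid into columns and sum per calendar month
rainMat = reshape(data, size(data,1), []);
tt = array2timetable(rainMat, 'RowTimes', dt);
moSum = retime(tt, 'monthly', 'sum');
% Undo the flattening: rows are months, the other two dimensions are the station grid
monthly3D = reshape(moSum.Variables, [], size(data,2), size(data,3));
size(monthly3D)   % 3 129 135
```

With the full 41273-day record, the same steps should give a 1356*129*135 array.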

Walter Roberson on 12 Jun 2019
You can use splitapply() passing it a function handle that does retime()
Adam Danz on 12 Jun 2019
What have you tried to do to fix that? Have you read about retime() to understand how it works? Have you checked out the examples in that link?
If you want monthly sum across all locations,
moSum = retime(tt(:,'Rainfall'),'Monthly','Sum');
If you want monthly sums for each location, you'll have to organize your timetable so that each location gets its own column. I'll update my answer again but I'll also include that section here:
data = rand(90, 129, 135);
dt = datetime('01/Jan/1900') + (0:89)'; %column vector of 90 dates
rainMat = reshape(data,size(data,1),[]);
locNames = strsplit(sprintf('Loc%d ',1:size(rainMat,2)));
tt = array2timetable(rainMat,'RowTimes',dt,'VariableNames', locNames(1:end-1));
moSum = retime(tt,'Monthly','Sum');
%View the first 2 months, first 6 locations
moSum(1:2,1:6)
I encourage you to explore these new functions and try to do some problem solving on your own before asking for solutions. You'll learn so much by going through that process.
Adam Danz on 12 Jun 2019
@Walter Roberson, I was trying to get splitapply() to work earlier and kept running into errors; eventually I gave up. Here's a reproducible example that fails.
data = rand(90, 129, 135);
dt = datetime('01/Jan/1900') + (0:89)'; %column vector of 90 dates
loc1 = (1:129)'; %Column vector
loc2 = (1:135)'; %Column vector
% Create the columns for the table. Each of these vars should be same size.
dataVec = data(:);
dtVec = repmat(dt,size(data,2)*size(data,3),1);
loc1Vec = repmat(repelem(loc1,size(data,1)),size(data,3),1);
loc2Vec = repelem(loc2,size(data,1)*size(data,2),1);
% Create time table
[~,~,locGroups] = unique([loc1Vec,loc2Vec],'rows');
tt = timetable(dtVec,locGroups,dataVec,'VariableNames',{'LocationGroup','Rainfall'});
splitapply(@(x)retime(x(:,'Rainfall'),'Monthly','Sum'),tt,tt.LocationGroup)
ERROR (last line above)
Applying the function '@(x)retime(x(:,'Rainfall'),'Monthly','Sum')' to the
1st group of data generated the following error:
Too many input arguments.
Error in jff (line 14)
splitapply(@(x)retime(x(:,'Rainfall'),'Monthly','Sum'),tt,tt.LocationGroup)
Maybe my brain is shot for the day but I couldn't figure out what was wrong. Anyway, the alternative I proposed above (reorganizing the timetable so that each location gets a column) is a cleaner solution, IMHO.
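One likely explanation for the error, going by the documented behavior of splitapply(): when the data input is a table, each table variable is passed to the function as a separate argument. The timetable here has two variables (LocationGroup and Rainfall), so a one-input anonymous function receives two inputs, which would produce "Too many input arguments". A possible workaround, sketched below on a smaller fake data set, is to group by location and by month start with findgroups() and pass only the rainfall vector to splitapply(). The location index here is a simple linear index over the station grid, which is an assumption and not the same numbering that unique(...,'rows') would give.

```matlab
% Small fake data set: 90 days (Jan-Mar 1900) on a 3x4 station grid
data = rand(90, 3, 4);
dt = datetime(1900,1,1) + caldays(0:89)';
[nDays, n1, n2] = size(data);
% Long-format vectors; data(:) runs through days fastest, then stations
dataVec = data(:);
dtVec = repmat(dt, n1*n2, 1);
locGroups = repelem((1:n1*n2)', nDays);   % assumed linear station index
% Group by station and by the first day of each calendar month
monthStart = dateshift(dtVec, 'start', 'month');
[G, loc, mo] = findgroups(locGroups, monthStart);
% Pass a plain vector, so the function receives exactly one input per group
moSum = splitapply(@sum, dataVec, G);
result = table(loc, mo, moSum, 'VariableNames', {'LocationGroup','Month','Rainfall'});
result(1:3,:)
```

This gives one row per station and month (36 rows for this fake data set). The one-column-per-location layout is still the more direct route to a months*129*135 array.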