Extracting information from a tall timetable using a loop

I'm trying to extract certain time ranges from a tall timetable using a loop, and I'm wondering how to do that most efficiently. In particular, gathering the data takes a lot of time, and I want to avoid doing that within every cycle of the loop.
My current idea for the code is shown below. It doesn't work when it comes to the calculations at the end. (Gathering inside the loop works but takes forever.)
location = 'C:\Folder'
ds = datastore(location)
TT = tall(ds)
x = {};
tic
for i =
Strt = minutes(RTImport.Start(i)) %looking up the start point for extraction from another table
endT = Strt + minutes(8) %calculate end time for extraction
S = timerange(Strt,endT,'closed') %defining the timerange
TT8 = TT(S,:) %pull the information from the tall TT
Av = mean(TT8.variable,'omitnan') %doing some calculations
x{i} = Av %writing the result x(i)
end
toc
gather(x) %trying to perform all calculations on the tall table at once, but this doesn't work
location = 'folder'
write(location,x) %write is not supported for x
I'd like to do this as efficiently as possible. Also, if someone could point out the syntax for calculating the mean of several columns of a timetable (the mean of each individual column), I would be most obliged.

Accepted Answer

Sindar on 1 Nov 2020
Looks like you could do everything with groupsummary, assuming you can figure out how to define the bins
G = groupsummary(TT,'TotalItemsSold',groupbins,'mean',["Var1";"Var2";"Var3"]);
I don't have much experience with datetimes, but spitballing some ideas for the case where the ranges don't overlap:
  • create an N-by-2 matrix of start and end times
  • flatten it into a list of bin edges
  • throw out the bins that run from one range's end to the next range's start (worst case, you might need to do this after computing the means)
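The steps above could be sketched roughly as follows. This is only an illustration on a small in-memory timetable, not the original poster's data: the sample values, the names starts, stops, edges and win, and the use of discretize to assign bins are all assumptions, and it has not been checked against a tall timetable. It also assumes the windows are sorted and do not overlap.

```matlab
% In-memory stand-in for the tall timetable (hypothetical sample data)
t = minutes(0:0.5:59.5)';
Var1 = (1:numel(t))';
TT = timetable(Var1, 'RowTimes', t);

% Assumed: sorted, non-overlapping 8-minute windows
starts = minutes([5; 20; 40]);
stops  = starts + minutes(8);

% N-by-2 matrix of [start stop], flattened row by row into bin edges:
% start1, stop1, start2, stop2, ...
edges = reshape([starts stops]', [], 1);

% Bin index for every row; rows outside all edges get NaN
bin = discretize(TT.Properties.RowTimes, edges);

% Odd bins are start-to-stop windows; even bins are the gaps in between
keep = mod(bin, 2) == 1;
TT.win = (bin + 1) / 2;   % window number 1..N for the kept rows

% One mean per window
G = groupsummary(TT(keep, :), 'win', 'mean', 'Var1');
```

Note that discretize includes the left edge of each bin and excludes the right edge (except for the last bin), so the window boundaries are not treated exactly like timerange(...,'closed').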
  3 Comments
Sindar on 1 Nov 2020
Edited: Sindar on 1 Nov 2020
If you still end up needing to defer an array of results, this ended up working for me:
% run the deferred operations to compute x
% the trick: each cell of x contains the recipe for a deferred
% operation. gather runs each recipe, so MATLAB knows the answers,
% but they aren't immediately stored in the variable
% this will take a while, but seems to be the fastest way
gather(x{:})
% update x variable by looking at the answers stored above, then
% reshaping to the correct matrix
x=reshape(gather([x{:}]),size(x));
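An alternative worth trying is the multiple-output form of gather, which evaluates several tall arrays in one call and returns each result to its own output. The sketch below is a toy example under stated assumptions: the tall array t and the logical-index condition are made up for illustration and stand in for the timerange extraction in the question.

```matlab
% Toy tall array standing in for the tall timetable variable (assumed data)
t = tall((1:10)');

% Build the deferred calculations without gathering inside the loop
x = cell(1, 3);
for i = 1:3
    x{i} = mean(t(t > i), 'omitnan');   % still an unevaluated tall scalar
end

% Evaluate all deferred results in a single gather call and
% write each one back into its own cell
[x{:}] = gather(x{:});

% The cells now hold ordinary in-memory doubles
results = cell2mat(x);
```

After this, results is a plain numeric row vector, so it can be saved with the usual in-memory functions instead of the tall write.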
Lutetium on 1 Nov 2020
I managed to calculate the averages of all the columns (skipping the NaNs) using this code:
func = @(x) mean(x,'omitnan'); %ignoring NaN
varfun(func,TT8,'OutputFormat','table')
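As a small illustration of the same varfun pattern restricted to selected columns, here is a sketch on an in-memory timetable. The variables A, B and C and their values are invented for the example, and the behavior on a tall timetable is not checked here.

```matlab
% Hypothetical in-memory timetable with three numeric columns
A = [1; NaN; 3];
B = [2; 4; 6];
C = [10; 20; 30];
TT8 = timetable(A, B, C, 'RowTimes', minutes(0:2)');

func = @(x) mean(x, 'omitnan');   % ignore NaN in each column

% Mean of columns A and B only, returned as a plain numeric row vector
m = varfun(func, TT8, 'InputVariables', {'A', 'B'}, 'OutputFormat', 'uniform');
```

With 'OutputFormat','uniform' the result is a numeric array with one element per selected column; 'table' keeps the output as a table with one variable per column instead.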
For now I'm running the code that gathers the info inside the loop, since I need some results on Monday, and so far it looks like I'll make it :)
I'll definitely try your approach for future data extraction. It seems to be exactly what I was looking for. I appreciate your help! Thanks
