Preallocating a complex structure

Amlan Rath on 14 Feb 2018
Commented: Jan on 16 Feb 2018
Hi! This is my first time asking a question here so forgive me if I am breaking any rules.
I am dealing with 19,000+ samples of oil production data, and my objective is to perform various calculations and curve fitting, then plot and save the results of each sample in a separate figure. However, this process in a loop takes 15-16 hours! I tried putting 'cla' after each figure was generated and saved, based on answers to a similar question here, but that did not help much. Reading a little further, it looks like the issue is that I am not preallocating my structure. I know how to preallocate a simple structure, but I am not sure if I am doing it right in my case, where there are structures and matrices within the structure.
clc
clear
close all
%%loading and defining data
load('middleton_nuM.mat')
load('middleton_date.mat')
[n,m]=size(middleton_nuM);
thold=30;
ft=fittype('power1');
data=struct('well_id',zeros(1,n),'start_date',cell(1,n),'end_date',cell(1,n), ...
    'prod',zeros(1,n),'keep',zeros(1,n),'diff',zeros(1,n), ...
    'dxdy_sort',zeros(1,n),'dxdy2',zeros(1,n),'f',zeros(1,n), ...
    'GOF',zeros(1,n),'Rsq2',zeros(1,n),'cumfit',zeros(1,n), ...
    'cumall',zeros(1,n),'der',zeros(1,n),'std',zeros(1,n),'ratio',zeros(1,n));
This is what I did but it clearly is not working to give me any advantage in processing time. Any help in sorting this out will be greatly appreciated! I have attached a screenshot of my structure.
  9 Comments
Stephen23 on 16 Feb 2018
@Amlan Rath: if you really are serious about reducing the runtime then you need to make some changes to how you write code. The slowest parts of your code are related to the graphics, so that is where you should focus your attention:
  • Convert all scripts to functions.
  • Do NOT call clear, close, or clc.
  • Do NOT print/display anything in the command window.
  • Instead of creating a new figure for each plot use just one figure and update its contents.
  • Obtain and use explicit graphics handles for all relevant graphics objects.
  • Save the figures as .fig files first (you can easily post-process to convert to raster image format).
  • Always load into an output variable (which is a structure): S = load(...).
Read the first link that I gave you in my earlier comment.
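A minimal sketch of the "one figure, explicit handles, save as .fig" points from the list above. The data (x, Y), the output folder, and the file name pattern are hypothetical stand-ins, not taken from the original code, and the invisible figure is an assumption made here to avoid on-screen redraws:

```matlab
% Synthetic stand-in data (hypothetical): 5 samples, 50 points each.
x = 1:50;
Y = rand(5, 50);
outDir = tempdir;                       % assumption: any existing, writable folder

% Create the figure, axes, line and title ONCE and keep their handles.
fig    = figure('Visible', 'off');      % assumption: no on-screen drawing needed
ax     = axes('Parent', fig);
hLine  = plot(ax, x, Y(1, :));
hTitle = title(ax, '');

for k = 1:size(Y, 1)
    % Update the existing graphics objects instead of creating new ones.
    set(hLine,  'YData',  Y(k, :));
    set(hTitle, 'String', sprintf('Sample %d', k));
    savefig(fig, fullfile(outDir, sprintf('sample_%05d.fig', k)));
end
close(fig);
```

Note that a figure saved while invisible also reopens invisible, so when post-processing the files you would open them with openfig(filename, 'visible').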
Jan on 16 Feb 2018
@Amlan Rath: My name is "Jan Simon". Calling me "Jan" in the forum is short and polite.
The output of the profiler shows that most of the time is spent inside saveas, printingGenerateOutput and loadPrefs. This is strange. Are you working on a network drive with a very slow connection? I do not see why loadPrefs is called 11730 times, nor why saving these fairly simple figures takes so much time.
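One rough way to check the slow-drive idea is to time a single saveas call into the local temp folder and compare it with the same call into the real output folder. This is only a sketch: the plotted data and the file name are made up for the test.

```matlab
% Hypothetical check: how long does one saveas take on a local disk?
fig = figure('Visible', 'off');
plot(1:10, (1:10).^2);

localFile = fullfile(tempdir, 'saveas_timing_test.png');   % made-up file name
tic;
saveas(fig, localFile);
tLocal = toc;
fprintf('saveas to local temp folder: %.3f s\n', tLocal);

close(fig);
delete(localFile);
```

If the same call into the actual output folder is much slower, the drive is the likely bottleneck rather than the plotting itself.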


Answers (1)

Guillaume on 14 Feb 2018
Edited: Guillaume on 14 Feb 2018
The code you wrote does preallocate the structure with the correct size, though probably by mistake. It also unnecessarily preallocates a fair number of big vectors that are going to get replaced by your loop.
A more efficient way of allocating that structure would be simply:
data = struct('well_id', cell(1, n), 'start_date', [], 'end_date', [], ... and so on for the other fields regardless of their type)
You only need one of the inputs to be a cell array to create a structure array. If you do so, you do not want any of the other inputs to be a vector, as struct will create n copies of that 1xn vector, one for each element of your structure array.
While the above and your original code both allocate a 1xn structure, neither preallocates any of the vectors, cell arrays, structures, etc. that are going to go into each of the fields. These are still going to be created in your loop, and unfortunately there's nothing you can do about that because they're all different sizes.
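A small sketch of the difference between the two struct calls, using a made-up n = 4 and only a few of the field names so the sizes are easy to inspect:

```matlab
n = 4;   % small stand-in for the number of wells (hypothetical value)

% Original style: the cell(1, n) input makes a 1-by-n struct array,
% but the zeros(1, n) input is copied into EVERY element.
a = struct('well_id', zeros(1, n), 'start_date', cell(1, n));

% Suggested style: one cell input sets the size, all other fields start empty.
b = struct('well_id', cell(1, n), 'start_date', [], 'end_date', []);

sizeA      = size(a);             % 1-by-n struct array
sizeAField = size(a(1).well_id);  % each element holds its own 1-by-n vector
sizeB      = size(b);             % also a 1-by-n struct array
emptyField = isempty(b(1).well_id);   % fields are empty until the loop fills them
```

Both calls produce a struct array of the same size. The second simply avoids creating n throwaway vectors per field.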
The slow speed of your loop probably does not come from the lack of preallocation in any case.
edit: Now that you've posted the code
Most of the structure could be created without a loop, but it would still need a loop of some sort for the nan so I'm not sure you could gain much speed for the filling of the structure. I strongly suspect that the slow part is the processing loop, which is completely independent of the way the structure is allocated and filled.
  1 Comment
Amlan Rath on 14 Feb 2018
Edited: Amlan Rath on 14 Feb 2018
So is there any other way to speed things up?