MATLAB slows down every other for-loop cycle

Hi everyone,
We have a strange problem executing some code on different data files. The code reads long data files of the same type and approximately the same length (25,000-26,000 samples), does some processing, and stores the extracted features on disk. The strange behavior is as follows: the first iteration takes approximately 240 minutes, the second takes about 50 minutes, the third 240 minutes, the fourth 50 minutes, and so on.
In short: every odd iteration takes 240 minutes and every even one 50 minutes, independent of the file/data (if we change the order of the files, the same behavior is observed).
During all calculations the CPU runs at about 50% and memory usage remains stable. (Of course, we tested this without running other heavy tasks that could inflate execution times.)
Does anyone have an explanation for this, or an idea what the problem might be? We suspect it might be a bug in MATLAB's internal handling of loops.
Thanks in advance for an answer.
(We used MATLAB R2017b.)

13 Comments

Without more information about the code that shows this behavior, only speculation is possible.
I would consider a bug in a feature that is used as frequently as loops improbable, though.
Hi Simon,
An obvious question is: how is the index used? Posting code that reproduces a 240/50-minute test isn't going to happen, so at the moment diagnostics done by you are more likely to be effective than speculation done here. There are the profile command, the memory command, the timeit command, strategically placed tic and toc around parts of the code, and probably others I can't think of.
The first thing to test is probably to take your code outside of the for loop and run a single case with an even index and a single case with an odd index, and see if you replicate the behaviour.
If you do then put
profile on
yourcode
profile off
profile viewer
around each one and run them in turn, taking any notes or screenshots etc from the profiling of one before running the other.
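Spelled out, the suggestion above might look like this (extractFeatures, oddCase and evenCase are placeholder names for your code and test inputs, not actual names from the question):

```matlab
% Profile one odd-index case and one even-index case separately,
% saving the results of the first run before starting the second.
% 'extractFeatures', 'oddCase' and 'evenCase' are placeholder names.
profile on
extractFeatures(oddCase);
profile off
profsave(profile('info'), 'profile_odd');    % HTML report for the odd case

profile on
extractFeatures(evenCase);
profile off
profsave(profile('info'), 'profile_even');   % HTML report for the even case
```

profsave writes each report to its own folder, so the first run's numbers are preserved before the profiler is reused.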
There is nothing inherent in a for loop that runs slower for odd indices than even indices, I'm 99% certain of that.
Thank you for your comments.
As already mentioned, the code which is run is the SAME for every index.
As suggested, we made some tests using the profiler. The code for feature extraction is run on exactly the same data (this data represents 1% of a large data file). The execution times are significantly different. By examining the nested functions, the main difference seems to be in findpeaks->findLocalMaxima (as visible on the figure).
Accumulation of all these times leads to the final difference in execution time.
Do you have any idea what may cause this problem?
Those profiler viewers show a huge number of differences across almost every function. One has a self time of 64s, the other 15s, filtfilt takes 15s on one and 1.7s on the other, etc, etc.
So why do you think that findPeaks->findLocalMaxima is what is causing the difference?
> As already mentioned, the code which is run is the SAME for every index.
But the input data is not. Can you run the test again and supply exactly the same input data to the code at each loop iteration? This would help narrow down whether it is a code issue or really a MATLAB issue.
Also, when working with large data sets, allocating, reallocating and deallocating memory can take up significant amounts of time.
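One way to run the suggested test, sketched with hypothetical names for the data and the feature-extraction function:

```matlab
% Feed the identical input into every iteration so any remaining
% odd/even timing difference cannot come from the data itself.
% 'segments' and 'extractFeatures' are placeholder names.
fixedInput = segments(1);
t = zeros(1, 10);
for k = 1:10
    tic
    extractFeatures(fixedInput);
    t(k) = toc;
end
disp(t)   % with identical input, odd and even entries should look alike
</imports>
```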
@Adam
> So why do you think that findPeaks->findLocalMaxima is what is causing the difference?
Yes, it's true that the self time is significantly different as well; that was a slight error in reasoning. But still, the time difference of nested functions that are called a large number of times will accumulate in the end.
Yes, we also thought of allocation etc. being the cause, but a large time difference on every other iteration would still be strange (the difference between odd and even loop iterations). Or is there an explanation for that?
@Christoph
> But the input data is not.
Yes! In the previously mentioned example we took care to use the exact same data twice! The resulting profiler output is shown in the image.
But still, literally every function in those listings is taking significantly different times between each run so something must be going wrong with the testing conditions.
> But still, literally every function in those listings is taking significantly different times between each run so something must be going wrong with the testing conditions.
The results do look odd indeed. Could it be that MATLAB needs more time for the first run in order to load/cache all the relevant functions, and the second run is faster because everything is already in memory?
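That hypothesis is easy to check with two back-to-back timed runs (again with placeholder names for the code and data):

```matlab
% The first call pays for loading/parsing the function files; later calls
% reuse the cached versions. 'extractFeatures' and 'data' are placeholders.
tic; extractFeatures(data); tFirst  = toc;
tic; extractFeatures(data); tSecond = toc;
fprintf('first run: %.2f s, second run: %.2f s\n', tFirst, tSecond);
```

If the second run is consistently faster, caching explains the first-run penalty, but it would not by itself explain a penalty recurring on every odd iteration.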
Thank you for the answers and suggestions.
I'll check again whether there is an error in the indexing of the segments, such that a longer segment of data is processed in the odd case. I'll keep you updated.
@Christoph
Yes, I think that's the case, but it doesn't explain the slower execution time for index 3,5,7, etc.
Try profiling, e.g., index 5 and index 8, wrapping the profile commands around each with a breakpoint afterwards so you can gather the information from one before continuing to the next (and making sure the profiler is off between one test and the next).
I have some more information...
The process is the following: a large data file (.mat) is loaded. It contains data segments in a structure (~10,000 samples each, always the same amount; for example, 20,000 segments of 10,000 samples). The feature-extraction code is run on those segments (always the same size!).
If the data file is small (only a few segments, e.g. 140), the runtime is the same for all segments. If the data file is large (e.g. 20,000 segments), the runtime differs every other run (as written before), even for the same data and the same size.
Our conclusion is that it definitely is a memory problem, because the difference depends only on the size of the loaded data file. However, the RAM is not full and there is no hard-drive activity during execution (no swapping).
(To avoid any misunderstanding: the loaded data file varies in size, but the code is always executed on segments of the same size.)
> However the RAM is not full and there is no activity on the hard-drive during the execution (no swapping).
It doesn't necessarily have to be a memory-size problem. Frequent allocation, reallocation, and deallocation of memory can cause significant delays.
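For example, growing an array inside the loop reallocates and copies it on every iteration, whereas preallocating it once avoids that cost entirely (illustrative names, assuming a fixed number of features per segment):

```matlab
nSegments = 20000;
nFeatures = 16;                        % assumed, for illustration

% Slow: 'features' is reallocated and copied on every iteration.
features = [];
for k = 1:nSegments
    features = [features; extractFeatures(segments(k))]; %#ok<AGROW>
end

% Fast: allocate once, then fill in place.
features = zeros(nSegments, nFeatures);
for k = 1:nSegments
    features(k, :) = extractFeatures(segments(k));
end
```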
Does the code have any non-MATLAB pieces that occasionally run some form of garbage collection?


Answers (0)

Asked on 29 Nov 2017
Commented on 5 Dec 2017
