How to efficiently integrate big data without using memory / (How to create big data)

  • In a study I will produce large arrays.
  • Each array will be at least 500 MB in size.
  • Each array will have the same number of rows.
  • The total size of the dataset will be approximately 20 GB or more.
  • Somehow I have to create a single variable/array of about 20 GB that contains all of the data.
matfile seems to be a good solution. However, as the size of the file increases, it gets slower. How can I handle this problem?
Mehmet OZC on 18 Aug 2015
Edited: Mehmet OZC on 18 Aug 2015
It works to a degree. When I try to append a 2 GB file to a 4 GB file, it gets slower. MATLAB does wonderful things, so I believe it can handle this. Or is it impossible to create a really large file on an ordinary computer?


Accepted Answer

JMP Phillips on 19 Aug 2015
Edited: Walter Roberson on 19 Aug 2015
Here are some things you could try:
Use the matfile function, which allows you to access and change variables directly in MAT-files without loading them into memory.
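As a rough illustration of this approach, the sketch below builds one large variable on disk by writing equally tall blocks into consecutive column ranges. The file name, the variable name data, and the tiny block sizes are placeholders chosen for the example, not part of the original question. It assumes that allocating the full array once up front (by assigning to its last element) avoids repeatedly growing the variable in the file, which is a plausible cause of the slowdown described in the question. matfile creates new files in Version 7.3 format, which is the format that supports variables larger than 2 GB.

```matlab
% Sketch: build one large variable on disk, block by block.
% All names and sizes below are hypothetical placeholders.
fname = fullfile(tempdir, 'bigdata_demo.mat');
if exist(fname, 'file'), delete(fname); end

nRows   = 1000;   % every block has the same number of rows
nCols   = 50;     % columns per block (tiny here; much larger in practice)
nBlocks = 4;

m = matfile(fname, 'Writable', true);   % new file is created as Version 7.3

% Assign to the last element once so the full array is allocated on disk
% up front instead of being enlarged by every write.
m.data(nRows, nCols*nBlocks) = 0;

for k = 1:nBlocks
    block = k * ones(nRows, nCols);      % stand-in for one produced array
    cols  = (k-1)*nCols + (1:nCols);     % column range for this block
    m.data(:, cols) = block;             % only this block is held in memory
end

[r, c] = size(m, 'data');   % query the size without loading the variable
```

Writing whole blocks in a single assignment keeps the number of file accesses low; writing element by element in a loop would be much slower.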
Structure your data differently:
  • If you are representing the data as doubles, maybe you can afford less precision, e.g. use int32. For example, with a scaling factor of 1e4 you can represent a double value such as 100.3425 as the integer 1003425.
  • Use the 64-bit version of MATLAB.
  • Try disabling compression when saving the files, with the -v6 option.
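The scaling idea from the first bullet can be sketched as follows. The sample values and variable names are made up for illustration. An int32 element takes 4 bytes instead of the 8 bytes of a double, at the cost of a fixed precision and a limited value range, so this only fits data whose magnitude stays below intmax('int32') divided by the scale.

```matlab
% Sketch: store doubles as scaled int32 values (hypothetical sample data).
scale = 1e4;                          % keeps four decimal places
x     = [100.3425; -7.25; 0.0001];    % example double values

xi = int32(round(x * scale));         % stored form, 4 bytes per element
xr = double(xi) / scale;              % values recovered for computation

maxErr   = max(abs(xr - x));                  % rounding error, at most 0.5/scale
maxValue = double(intmax('int32')) / scale;   % largest magnitude that fits
```

With a scale of 1e4 the representable range is roughly ±214748, so it is worth checking the data range before converting.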
Optimize your PC for your task:
Mehmet OZC on 19 Aug 2015
In one of the links provided above I came across the following code:
example = matfile('example.mat','Writable',true);
[nrowsB,ncolsB] = size(example,'B');
% Update B in place, one row at a time, so only a single row is in memory
for row = 1:nrowsB
    example.B(row,:) = row * example.B(row,:);
end
That solved my problem. Thanks.
