What are the ideal computer specs for parallel computing with large data sets?

Omer Tigli on 26 Apr 2022
Commented: John D'Errico on 26 Apr 2022
I am processing data that takes up about 20 GB as a .mat file. The custom functions that process the data take a long time. While MATLAB was processing the data, CPU usage in Task Manager was only 8-9%, which corresponds to one of the 12 processors. To use more processors, I installed Parallel Computing Toolbox and converted one of the main loops to a parfor loop. CPU usage rose at first (40-50%), then memory usage reached 100%, and after a few minutes CPU usage dropped back to 8-9%.
I assume that, because the RAM capacity (32 GB) is low compared to the large data set, the MATLAB workers started by Parallel Computing Toolbox each need memory to work on the data, which in turn increases the demand on RAM. This maxes out the RAM, and the algorithm then releases some of the workers to reduce the demand and reverts to using only one processor. Does that explanation make sense? Is there an ideal computer spec that would perform better with Parallel Computing Toolbox and speed up the data processing?

Accepted Answer

Jan on 26 Apr 2022
If the RAM is exhausted, the SSD or HDD is used as virtual RAM. This slows down the processing massively. You can confirm this by watching the disk load as well.
The "ideal" computer is simply huge. If the compressed data uses 20 GB of disk space, it could need 20 GB or 100 GB of RAM, depending on the compression. So MORE RAM is better.
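As a rough illustration of why RAM runs out so quickly, here is a back-of-the-envelope sketch of the arithmetic. It is written in Python only because it is plain arithmetic and uses no MATLAB API. The function names, the expansion factor, and the assumption that each worker holds its own copy (or a fixed fraction) of the data are all hypothetical simplifications; how much memory the workers actually need depends on the code, which is not shown in the question.

```python
def estimate_peak_ram_gb(disk_size_gb, expansion_factor, n_workers,
                         fraction_per_worker=1.0):
    """Rough peak RAM estimate in GB (hypothetical model, not measured).

    Assumes one full copy of the data in the client session, plus
    `fraction_per_worker` of the data held by each worker
    (1.0 = every worker gets a full copy).
    """
    in_memory_gb = disk_size_gb * expansion_factor
    per_worker_gb = in_memory_gb * fraction_per_worker
    return in_memory_gb + n_workers * per_worker_gb


def max_workers_that_fit(ram_gb, disk_size_gb, expansion_factor,
                         fraction_per_worker=1.0):
    """Largest worker count whose estimated peak stays within `ram_gb`."""
    if fraction_per_worker <= 0:
        raise ValueError("fraction_per_worker must be positive")
    in_memory_gb = disk_size_gb * expansion_factor
    remaining_gb = ram_gb - in_memory_gb
    if remaining_gb <= 0:
        return 0  # even the client copy alone does not fit
    return int(remaining_gb // (in_memory_gb * fraction_per_worker))


if __name__ == "__main__":
    # Numbers from the question: 20 GB file, 32 GB RAM, 12 processors.
    # The expansion factor of 1.0 is an optimistic assumption.
    print(estimate_peak_ram_gb(20, 1.0, 12))
    print(max_workers_that_fit(32, 20, 1.0))
```

Under these assumptions, 12 workers each holding a full copy would need far more than 32 GB, and not even one extra full copy fits next to the client's copy. If each worker only needs its own slice of the data, the estimate drops considerably, which is why the details of the code matter.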
Without knowing any details about the code, I cannot provide more than this vague answer.
  1 Comment
John D'Errico on 26 Apr 2022
That's about it. Huge is necessary. The ideal spec is as big as you can afford. Fast data transfer speeds matter too, obviously.
And if you were to choose the absolutely perfect computer for THIS particular problem, it is a 100% certainty that tomorrow, you would need a larger computer, because you will be solving larger problems tomorrow or next month. A fundamental law of computing is involved:
Computer problems expand to the size of the resources available, and just a bit more.
