fwrite and MATLAB for a raid0 disk - Only one lane?

Hello everyone,
I have a RAID 0 NVMe volume (four NVMe drives combined through a PCIe adapter card).
The volume works great (up to 12 GB/s OUTSIDE MATLAB, PCIe 3.0), but I cannot reach that speed in MATLAB.
It looks like MATLAB is using a single bus lane (i.e. 3.5 GB/s) to write the data to the disk. A simple example:
data = randn(1024, 1024, 1024, 'double'); %8 GB
fid = fopen('test.bin', 'W');
tic;
fwrite(fid, data(:), 'double');
toc;
fclose(fid);
This takes about 2.3 seconds, which is about 3.5 GB/s, as if only one lane were being used, whereas the RAID 0 array uses 4 lanes (4x4 PCIe).
I am running out of ideas. This is not related to the disk or the RAID 0 setup itself: I tested many RAID 0 configurations (BIOS, VROC, Windows RAID), and the issue only occurs in MATLAB. Using HDF5 files does not solve the issue, so it seems to be related to MATLAB itself.
FYI: I need this speed. In my field/lab we generate about 1 TB of data every 5 minutes, and the bottleneck is always saving the data.
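For reference, the throughput figure above can be reproduced with a few lines of arithmetic. This is only a sketch: the helper name `gib_per_second` is made up for illustration, and it assumes the "8 GB" array is 1024^3 doubles (8 GiB), as in the MATLAB snippet.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of MATLAB or any library):
// throughput in GiB/s for `bytes` written in `seconds`.
double gib_per_second(std::uint64_t bytes, double seconds) {
    assert(seconds > 0.0);
    const double gib = 1024.0 * 1024.0 * 1024.0;  // bytes per GiB
    return static_cast<double>(bytes) / gib / seconds;
}
```

Calling `gib_per_second(8ULL * 1024 * 1024 * 1024, 2.3)` gives roughly 3.48, which matches the "about 3.5 GB/s" quoted above.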
EDIT 1: Removed "b" argument from "fopen"
EDIT 2: Added type "double" to "fwrite"
Thank you very much.
  5 Comments
Walter Roberson on 30 Mar 2022
Getting high speed transfer to disk can require using special system calls. I do not have any information about how it is done in Windows; in Linux apparently there are methods that can avoid round-trips to user mode. It is unlikely that MATLAB implements those methods.
In Windows... I don't know. Is WriteFileEx still used in practice? https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-writefileex That does asynchronous writes, which historically have been an important step in performance improvement. Or perhaps WriteFileGather() https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-writefilegather ?
In a logging situation, you would like to be able to grab a buffer full of input, schedule it to be written, and continue on without waiting for the I/O to complete.
I suspect that MATLAB simply uses the C or C++ fwrite() https://www.cplusplus.com/reference/cstdio/fwrite/ which waits for the I/O to complete.
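The "grab a buffer, schedule the write, continue" pattern described above can be sketched in portable C++. Note the assumptions: this uses `std::async` from the standard library instead of the Windows calls `WriteFileEx` or `WriteFileGather`, the names `write_block` and `write_double_buffered` are hypothetical, and filling a vector with a letter stands in for real data acquisition. It shows the overlap of acquisition and writing only; it is not a claim about how MATLAB does its I/O, nor a tuned implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <future>
#include <utility>
#include <vector>

// Hypothetical helper: writes one buffer to an already-open file and
// returns the number of bytes actually written.
static std::size_t write_block(std::FILE* f, std::vector<char> block) {
    return std::fwrite(block.data(), 1, block.size(), f);
}

// Writes `blocks` buffers of `block_bytes` each. While buffer i-1 is being
// written on a background thread, buffer i is filled on the calling thread.
// Only one write is in flight at a time, so the FILE* is never shared
// between two concurrent writers. Returns the total bytes written.
std::size_t write_double_buffered(const char* path, std::size_t blocks,
                                  std::size_t block_bytes) {
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return 0;
    std::size_t total = 0;
    std::future<std::size_t> pending;
    for (std::size_t i = 0; i < blocks; ++i) {
        // Stand-in for acquisition: fill the next buffer.
        std::vector<char> buf(block_bytes, static_cast<char>('A' + i % 26));
        // Wait for the previous write before scheduling the next one.
        if (pending.valid()) total += pending.get();
        pending = std::async(std::launch::async, write_block, f, std::move(buf));
    }
    if (pending.valid()) total += pending.get();
    std::fclose(f);
    return total;
}
```

A call such as `write_double_buffered("log.bin", 4, 1024)` should return 4096 when every write succeeds. Whether this overlap helps on a given RAID 0 array would have to be measured.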
Vincent Perrot on 30 Mar 2022
@Walter Roberson I wrote a MEX file using WriteFile, without success. I will try asynchronous writes with WriteFileEx and also try WriteFileGather.
I have also contacted support to get some answers about this.
I tried fwrite/ofstream/WriteFile (in MEX files), even in chunks, without any success.
Thanks for taking the time, I will read those links and try those approaches.


Answers (2)

Jan on 29 Mar 2022
Edited: Jan on 29 Mar 2022
What about trying it as C-Mex?
data = randn(1024, 1024, 1024, 'double'); %8 GB
tic
uglyCWrite(data);
toc
// Short hack, UNTESTED!!!
// uglyCWrite.c
#include "mex.h"
#include <stdio.h>
#include <stdlib.h>
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
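    double *data;
    size_t n, w;
    FILE *fid;                           // FILE (upper case) is the stdio type
    if (nrhs < 1)
        mexErrMsgIdAndTxt("uglyCWrite:nrhs", "One input array required.");
    data = (double *) mxGetData(prhs[0]);
    n = mxGetNumberOfElements(prhs[0]);  // number of elements
    w = mxGetElementSize(prhs[0]);       // bytes per element
    fid = fopen("test.bin", "wb");       // binary mode: no newline translation on Windows
    if (fid == NULL)
        mexErrMsgIdAndTxt("uglyCWrite:fopen", "Cannot open test.bin.");
    fwrite(data, w, n, fid);             // fwrite takes the element size first, then the count
    fclose(fid);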
}
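To compare against the MEX timing, the same write can be timed in a standalone program outside MATLAB. The following is a minimal sketch, not part of the answer above: `timed_write` is a hypothetical name, and the measured time only covers handing the data to the operating system (fwrite plus fclose), not a guaranteed flush to the physical drives.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: writes n doubles to `path` in binary mode and returns
// the elapsed seconds, or -1.0 if the file could not be opened, the write was
// short, or closing the file failed.
double timed_write(const char* path, const double* data, std::size_t n) {
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return -1.0;
    auto t0 = std::chrono::steady_clock::now();
    std::size_t written = std::fwrite(data, sizeof(double), n, f);
    int close_status = std::fclose(f);  // closing flushes the stdio buffer
    auto t1 = std::chrono::steady_clock::now();
    if (written != n || close_status != 0) return -1.0;
    return std::chrono::duration<double>(t1 - t0).count();
}
```

Dividing the byte count by the returned time gives a throughput figure that can be set next to the `tic`/`toc` result from MATLAB.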
  2 Comments
Vincent Perrot on 29 Mar 2022
Edited: Vincent Perrot on 29 Mar 2022
Thank you for taking the time to put that piece of code together.
This morning I tested several MEX implementations from this post: https://stackoverflow.com/questions/70126690/write-binary-file-to-disk-super-fast-in-mex
Those are not faster than fwrite in MATLAB:
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <windows.h>

void writeBinFile(int16_t *data, size_t size)
{
    FILE *fID;
    fID = fopen("file_fopen.bin", "wb");  // C's fopen has no "W" mode; use "wb" for binary output
    fwrite(data, sizeof(int16_t), size, fID);
    fclose(fID);
}

void writeBinFileFast(int16_t *data, size_t size)
{
    std::ofstream file("file_ostream.bin", std::ios::out | std::ios::binary);
    file.write((char *)&data[0], size * sizeof(int16_t));
    file.close();
}

void writeBinFilePartByPart(int16_t *int_data, size_t size)
{
    size_t part = 64 * 1024 * 1024;  // bytes per WriteFile call (64 MiB)
    size = size * sizeof(int16_t);   // from here on, size is in bytes
    char *data = reinterpret_cast<char *> (int_data);
    HANDLE file = CreateFileA (
        "windows_test.bin",
        GENERIC_WRITE,
        0,
        NULL,
        CREATE_ALWAYS,
        FILE_FLAG_SEQUENTIAL_SCAN,
        NULL);
    // Expand file size (SetFilePointerEx takes a 64-bit offset, so files over 2 GB are handled)
    LARGE_INTEGER pos;
    pos.QuadPart = (LONGLONG) size;
    SetFilePointerEx (file, pos, NULL, FILE_BEGIN);
    SetEndOfFile (file);
    pos.QuadPart = 0;
    SetFilePointerEx (file, pos, NULL, FILE_BEGIN);
    DWORD written;
    if (size < part)
    {
        WriteFile (file, data, (DWORD) size, &written, NULL);
        CloseHandle (file);
        return;
    }
    size_t rem = size % part;
    for (size_t i = 0; i < size-rem; i += part)
    {
        WriteFile (file, data+i, (DWORD) part, &written, NULL);
    }
    if (rem)
        WriteFile (file, data+size-rem, (DWORD) rem, &written, NULL);
    CloseHandle (file);
}
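The part-by-part loop above splits the buffer into fixed-size pieces plus a remainder. The splitting logic can be checked on its own, without touching the disk. This is a sketch under stated assumptions: `plan_chunks` is a hypothetical helper, not something from the linked Stack Overflow post.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: splits `size` bytes into (offset, length) pieces of at
// most `part` bytes each. The last piece carries the remainder, if any.
std::vector<std::pair<std::size_t, std::size_t>> plan_chunks(std::size_t size,
                                                            std::size_t part) {
    assert(part > 0);
    std::vector<std::pair<std::size_t, std::size_t>> chunks;
    for (std::size_t off = 0; off < size; off += part) {
        std::size_t len = (size - off < part) ? size - off : part;
        chunks.emplace_back(off, len);
    }
    return chunks;
}
```

For example, `plan_chunks(150, 64)` yields the pieces (0, 64), (64, 64) and (128, 22). With a 64-bit `size_t`, an 8 GiB buffer split into 64 MiB parts gives 128 pieces and no remainder.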



Jeremy Hughes on 29 Mar 2022
I was playing around with this and found that this is much faster (by a factor of 3 on my machine):
fwrite(fid,data(:),"double");
  1 Comment
Vincent Perrot on 29 Mar 2022
Edited: Vincent Perrot on 29 Mar 2022
Thank you.
Sadly, we had already tried that; it is how I got the 3.5 GB/s I mentioned in my first message.
I played around with the code and forgot to put it back in my question, sorry about that.
I have edited my question; we are still at 3.5 GB/s instead of roughly 12 GB/s.

Release

R2021a
