Read a specified data range from a large binary file with fread
I have a large binary data file that’s around 70 GB. Unfortunately, my laptop doesn’t have enough RAM to read all the elements in this data file. The data L is a matrix with dimensions 28500 x 8031. To mitigate the RAM usage, I’m wondering if it’s possible to just read a specific range of data instead of the whole file. Specifically, I’d like to read only the 1338th, 1339th, and 1340th columns.
Here is my function,
%~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~%
fclose all;
ncol = 8031;
formatype = 'float32'; % format type: 'float32' for .bin file and 'single' for .tda file
% Load the whole data and then transpose it
fileID = fopen(filename);
if fileID < 0
    error('This result file does not exist.');
else
    L = fread(fileID, [ncol, inf], formatype);
    L = transpose(L);
end
%~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~%
Thank you.
Best,
Answers (2)
recent works
on 8 Sep 2023
Yes, it is possible to read a specific range of data from a binary file in MATLAB. You can use the fread() function to read data from a file, and specify the start and end indices of the range you want to read.
In your case, you want to read the 1338th, 1339th, and 1340th columns.
fileID = fopen(filename);
L = fread(fileID, [ncol, 3], formatype, 1338, 1340);
This code will open the file, read the data from the 1338th to the 1340th column, and store it in the variable L.
The fread() function has a number of other parameters that you can use to control how the data is read.
3 Comments
Walter Roberson
on 8 Sep 2023
This answer is incorrect.
The full syntax for fread is
A = fread(FID,SIZE,PRECISION,SKIP,MACHINEFORMAT)
There is no option at all for specifying indices.
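As a rough sketch of how positioning is done instead: fseek() moves the file position, and the SKIP argument of fread() gives the number of bytes to skip after each value read. The demo file name and its contents below are made up purely for illustration.

```matlab
% Hypothetical demo file: ten float32 values, 1..10
demoFile = fullfile(tempdir, 'fread_skip_demo.bin');
fid = fopen(demoFile, 'w');
fwrite(fid, 1:10, 'float32');
fclose(fid);

bytesPerValue = 4;                 % size of one float32
fid = fopen(demoFile, 'r');

% Elements 4..6: seek past the first 3 values, then read 3 values
fseek(fid, 3 * bytesPerValue, 'bof');
range = fread(fid, [1, 3], 'float32=>single');

% Every third element, starting at the first:
% SKIP is the number of bytes skipped after each value that is read
fseek(fid, 0, 'bof');
strided = fread(fid, [1, inf], 'float32=>single', 2 * bytesPerValue);

fclose(fid);
delete(demoFile);
```

The 'float32=>single' precision keeps the result in single precision; plain 'float32' would return doubles and use twice the memory.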
Walter Roberson
on 8 Sep 2023
There are two common ways that blocks of binary data can be arranged in a file.
If the data is stored so that for a given location in the file, the next location in the file is generally for the same column but the next row in that column, then that arrangement is called "Column Major Order". This is the arrangement that MATLAB uses internally for its arrays, and is the order that MATLAB would use when asked to write binary files.
If the data is stored so that for a given location in the file, the next location in the file is generally for the same row but the next column in that row, then that arrangement is called "Row Major Order". This is the arrangement that C and C++ and a number of other programming languages use internally, so it is common to find files that are stored this way. Your code reads the file as [ncol, inf] and then transposes, which suggests that your file is stored this way.
If the file is in Column Major Order, then the steps to read a column are:
- [Column Major Only!] Use fseek() to seek to the file location of the beginning of the column. Multiply (column number minus 1) by the number of rows in the array, and multiply that by the number of bytes per entry (4 bytes for float32) to get the byte offset to seek relative to the beginning of the file. Then use fread() with size [number of rows in array, number of columns to read now] and precision 'single'. If the columns are not adjacent, repeat this, using fseek() to get to the beginning of each non-contiguous column.
If the file is in Row Major Order, then reading a column is a bit more of a nuisance:
- [Row Major Only!] Use fseek() to seek to the file location of the first entry of the column. Multiply (column number minus 1) by the number of bytes per entry (4 bytes for float32) to get the byte offset relative to the beginning of the file. Then use fread() with size [number of rows in array, 1] and precision 'single'. Use a skip equal to (number of columns in file minus 1) times the number of bytes per entry (4 bytes for float32); the skip is applied after each value is read, so this lands on the same column in the next row. This reads only one column at a time; to get the other columns you will need to fseek() again. (If the number of adjacent columns were to increase relative to the number of rows in the file, then a different reading strategy would become viable.)
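Both cases can be sketched on a small stand-in matrix. The sizes, file names, and variable names below are illustrative assumptions, not taken from the real data. File A stores each column contiguously (what fwrite() produces from a MATLAB array); file B stores each row contiguously, which is the layout implied by reading with [ncol, inf] and transposing. For file B the sketch uses the block form of the precision string ('N*float32') so that adjacent columns come back in a single fread() call, rather than one call per column.

```matlab
% Small stand-in for the real matrix (sizes are illustrative only)
nrow = 5;  ncol = 8;
M = single(reshape(1:nrow*ncol, nrow, ncol));
cols = 3:5;                        % adjacent columns to extract
nc = numel(cols);
bytesPerValue = 4;                 % float32

% File A: each column is contiguous (what fwrite(fid, M) produces)
fileA = fullfile(tempdir, 'demo_columns_contiguous.bin');
fid = fopen(fileA, 'w');  fwrite(fid, M, 'float32');  fclose(fid);

% File B: each row is contiguous (the layout implied by the question)
fileB = fullfile(tempdir, 'demo_rows_contiguous.bin');
fid = fopen(fileB, 'w');  fwrite(fid, M.', 'float32');  fclose(fid);

% Case A: one seek, then one contiguous read of nrow*nc values
fid = fopen(fileA, 'r');
fseek(fid, (cols(1) - 1) * nrow * bytesPerValue, 'bof');
A = fread(fid, [nrow, nc], 'float32=>single');
fclose(fid);

% Case B: seek to the first wanted column in row 1, then read nc values
% per row and skip the remaining (ncol - nc) values before the next block
fid = fopen(fileB, 'r');
fseek(fid, (cols(1) - 1) * bytesPerValue, 'bof');
precision = sprintf('%d*float32=>single', nc);
B = fread(fid, [nc, nrow], precision, (ncol - nc) * bytesPerValue);
fclose(fid);
B = B.';                           % back to nrow-by-nc

delete(fileA);  delete(fileB);
```

For the file in the question, the same pattern would presumably use ncol = 8031 and cols = 1338:1340, with [nc, inf] as the size if the number of rows is not known in advance. Only the three requested columns are held in memory.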