Loading in Data from large files
I'm going to try to explain what I'm trying to do below, and I welcome any suggestions the community can provide.
- Problem: I am trying to read in a large (>10 GB) binary file and parse specific data out of it. I can already parse the data, but MATLAB runs out of RAM when parsing such large amounts at once.
- Current logic: I have been using memmapfile to load the data, and it worked until I started having to deal with these large file sizes. I am aware that memmapfile can skip a specified offset and start mapping at a later point, but I need to load a specific amount of data at a time. I'm trying to avoid fread since it takes so long.
- Goal: I am looking to parse through these binary files in smaller sections, using a command or some programmed logic to read in a small percentage of the binary file at a time, run the computations, grab what I need and store it elsewhere, then grab the next small percentage of the file and repeat. I've included some detailed notes below on my intent to try and help, but this is a project I am not allowed to share code for (copyright).
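In case it helps frame the question, here is a minimal sketch of the windowed-mapping idea I'm describing, using memmapfile with the 'Offset' option so only one window is mapped at a time. The file name, header size, and uint8 layout are placeholders, not my real format:

```matlab
% Map a ~500 MB window of the file at a time instead of the whole thing.
headerBytes = 1024;             % hypothetical fixed header size
winBytes    = 500 * 2^20;       % window size (~500 MB)
info        = dir('data.bin');  % placeholder file name
fileBytes   = info.bytes;

offset = headerBytes;
while offset < fileBytes
    n = min(winBytes, fileBytes - offset);
    m = memmapfile('data.bin', ...
        'Offset', offset, ...
        'Format', {'uint8', [n 1], 'raw'});
    chunk = m.Data.raw;         % only this window is paged in
    % ... parse chunk here ...
    clear m                     % release the mapping before remapping
    offset = offset + n;
end
```

The open question is how to handle records that straddle two windows, which is what the procedure below gets at.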
Procedure for what I want to do:
- Load in the header info of the file. This is static info that is easy to load, and I can already do this reliably.
- Load in a percentage of the file. For this specific logic, let's assume my file is 10 GB and I want to read it in 500 MB sections.
- Parse the 500 MB that was read in. I'm aware that I may only read 492 MB of complete records, in which case I need to make sure I read 508 MB the next time.
- Store the parsed data in a structure.
- Clear the used variables so they can be reused for the next section of the file.
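The steps above could be sketched as a chunked read loop that carries any incomplete record over to the next pass (this is how I'd handle the 492 MB / 508 MB case). parseRecords here is a stand-in for my actual parser, which I can't share; it is assumed to return the parsed structs plus how many bytes it consumed:

```matlab
fid = fopen('data.bin', 'r');          % placeholder file name
hdr = fread(fid, 1024, '*uint8');      % hypothetical fixed-size header
chunkBytes = 500 * 2^20;               % ~500 MB per read
leftover = uint8([]);                  % partial record carried to next pass
results = struct([]);                  % accumulated parsed output

while ~feof(fid)
    raw = fread(fid, chunkBytes, '*uint8');
    buf = [leftover; raw];             % prepend bytes left from last chunk
    % parseRecords is a placeholder: returns parsed structs and the
    % number of bytes it consumed (complete records only).
    [parsed, nUsed] = parseRecords(buf);
    results = [results, parsed];       %#ok<AGROW>
    leftover = buf(nUsed+1:end);       % keep the incomplete tail
end
fclose(fid);
```

The leftover buffer means each fread can stay a fixed size; the "read 508 MB next time" bookkeeping falls out of carrying the unconsumed tail forward.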
I hope this helps. I'll try to keep an eye on this post moving forward, but it might take me some time to respond as I'll be traveling.