resampling of large dataset

Oluwatomisin Faniyan on 6 Jun 2022
Commented: Jan on 7 Jun 2022
I need to downsample my EEG recording from 30,000 Hz to 512 Hz. The problem is that I cannot load the data because the recording is too large.
Below is the code I have been working on, but it cannot load my data. I am quite new to MATLAB and would appreciate any help.
for u=1:length(list)
[signal,timestamps,info]=load_open_ephys_data(fullfile(list(u).folder,list(u).name));
SIGNALS(:,u)=signal;
CHANNELS{u}=info.header.channel;
end
% downsample signals from the recorded rate (30,000 Hz) to 512 Hz
fsold=info.header.sampleRate;
fsnew=512;
SIGNALS=resample(SIGNALS,fsnew,fsold);
  5 Comments
Oluwatomisin Faniyan on 7 Jun 2022
Hi,
I am new to MATLAB.
This is the error message I got. My code is in the attached file.
Requested 2137305088x1 (15.9GB) array exceeds maximum array size preference (15.9GB). This might cause MATLAB to become
unresponsive.
Error in load_open_ephys_data (line 239)
data(current_sample+1:current_sample+nsamples) = block;
Error in untitled3 (line 30)
[signal,timestamps,info]=load_open_ephys_data(fullfile(list(u).folder,list(u).name));

Answers (1)

Jan on 7 Jun 2022
Edited: Jan on 7 Jun 2022
SIGNALS(:,u)=signal;
This lets the array SIGNALS grow in each iteration, so MATLAB must create a new, larger array and copy the old data every time. The solution is to pre-allocate the array with its final size.
If a single signal already exhausts your RAM, you would need much more to store all signals, especially if the pre-allocation was forgotten.
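A minimal sketch of the pre-allocation idea, using small made-up sizes. The names nSamples and nChannels and the random stand-in data are assumptions for illustration only; in the real script each column would come from the import function.

```matlab
% Sketch: allocate SIGNALS once instead of growing it inside the loop.
% nSamples and nChannels are hypothetical sizes chosen for illustration.
nSamples  = 1000;
nChannels = 4;

SIGNALS  = zeros(nSamples, nChannels, 'single');  % single needs half the memory of double
CHANNELS = cell(1, nChannels);

for u = 1:nChannels
    signal = randn(nSamples, 1);       % stand-in for one imported channel
    SIGNALS(:, u) = signal;            % writes into an existing column, no reallocation
    CHANNELS{u}   = sprintf('CH%d', u);
end
```

Pre-allocation avoids the repeated copying, but it does not reduce the total memory needed, so it cannot help on its own when one channel already hits the array size limit.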
The actual problem occurs in the function load_open_ephys_data . We cannot see its code, so there is no chance to suggest an improvement. Are you using this: https://github.com/open-ephys/analysis-tools/blob/master/load_open_ephys_data.m ? Changes in the caller will not solve this problem.
If one imported signal exhausts the limit of 15.9GB already, collecting a bunch of them must fail tremendously.
One way would be to downsample the signal during the import already.
By the way, clear all is a waste of time, because it removes all loaded functions from the RAM. Reloading them from the slow disk takes time, but has no benefits. Use functions instead of scripts to keep the workspaces clean.
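As a rough sketch of what downsampling during the import could look like, the code below low-pass filters and decimates a signal chunk by chunk, carrying the filter state from one chunk to the next. It does not read Open Ephys files: the vector x is a random stand-in for samples that would arrive block by block from the file, and the chunk length, the filter order and the cutoff are assumptions for illustration. Because 30000/512 is not an integer, the sketch only performs an integer stage from 30,000 Hz to 2,000 Hz.

```matlab
% Sketch: chunk-wise FIR decimation with the filter state carried across chunks.
fsOld = 30000;                 % original sample rate in Hz
M     = 15;                    % integer decimation factor, 30000/15 = 2000 Hz
fsMid = fsOld / M;

% Windowed-sinc low-pass filter (assumed design: cutoff at 80% of the new Nyquist rate)
N   = 300;                                     % filter order, must be even here
n   = (-N/2 : N/2).';
fc  = 0.8 * (fsMid / 2) / fsOld;               % cutoff in cycles per sample
h   = 2 * fc * ones(size(n));                  % value of the sinc kernel at n = 0
idx = (n ~= 0);
h(idx) = sin(2*pi*fc*n(idx)) ./ (pi*n(idx));
w   = 0.54 - 0.46 * cos(2*pi*(0:N).' / N);     % Hamming window
b   = h .* w;
b   = b / sum(b);                              % unity gain at DC

x        = randn(100000, 1);   % stand-in for data read from the file in blocks
chunkLen = 15000;              % multiple of M keeps the decimation phase aligned
zi       = zeros(N, 1);        % initial filter state, length(b) - 1 elements
y        = zeros(ceil(numel(x) / M), 1);       % pre-allocated output
pos      = 0;

for k = 1:chunkLen:numel(x)
    chunk   = x(k : min(k + chunkLen - 1, numel(x)));
    [f, zi] = filter(b, 1, chunk, zi);         % zi carries the state to the next chunk
    d       = f(1:M:end);                      % keep every M-th filtered sample
    y(pos+1 : pos+numel(d)) = d;
    pos     = pos + numel(d);
end
y = y(1:pos);
```

Only one chunk of the full-rate signal is in memory at a time, and the output is 15 times shorter than the input. The remaining step from 2,000 Hz to 512 Hz could then be done on the much smaller array, for example with resample(y, 32, 125) from the Signal Processing Toolbox.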
Avoid this:
[outpath filesep 'EDF_FILES' filesep answer{5} filesep answer{4} filesep answer{1}]
Smarter and safer:
fullfile(outpath, 'EDF_FILES', answer{5}, answer{4}, answer{1})
I took a look into https://github.com/open-ephys/analysis-tools/blob/master/load_open_ephys_data.m . This is inefficient code. It imports the signal as UINT16 and stores it in a double array, which is pre-allocated with MAX_NUMBER_OF_CONTINUOUS_SAMPLES = 1e8 elements, but evidently this number of elements is exceeded during the reading.
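A back-of-the-envelope calculation shows why the storage type and the sample rate matter here. The sample count is taken from the error message above; treating it as the length of one channel is an assumption, since the file may contain even more samples.

```matlab
% Sketch: memory needed for one channel, using the array size from the error message.
nSamples = 2137305088;        % requested array length reported by MATLAB
fsOld    = 30000;             % original sample rate in Hz
fsNew    = 512;               % target sample rate in Hz
GiB      = 2^30;

bytesDouble = nSamples * 8;   % stored as double (8 bytes per sample)
bytesInt16  = nSamples * 2;   % stored as a 16-bit integer (2 bytes per sample)

nSamplesNew = ceil(nSamples * fsNew / fsOld);   % length after downsampling
bytesNew    = nSamplesNew * 8;                  % downsampled, stored as double

fprintf('double at %d Hz: %.2f GiB\n', fsOld, bytesDouble / GiB);
fprintf('int16  at %d Hz: %.2f GiB\n', fsOld, bytesInt16 / GiB);
fprintf('double at %d Hz: %.2f GiB\n', fsNew, bytesNew / GiB);
```

As a double array the channel needs about 15.9 GiB, which matches the error message. Keeping 16-bit integers would need a quarter of that, and the downsampled signal is smaller by a factor of roughly 59.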
The header of the file is read as code and processed by EVAL. This is an extremely bad idea and makes it impossible to test and debug the code exhaustively. I would never use this function for productive work, especially not for clinical applications.
You have to modify the function load_open_ephys_data so that it performs the downsampling directly during the import. This is not tricky, actually, but unfortunately the code is a bunch of IF-branches for different file versions, with strange decisions for data types, and the mentioned evil EVAL is a disqualification.
If you have not worked with MATLAB before, a clean solution is to hire a professional programmer to rewrite this function from scratch.
  2 Comments
Jan on 7 Jun 2022
You are welcome. Sorry for the non-constructive answer.
