Matlab Save file and data corruption

Mike on 5 Dec 2020
Commented: Mike on 6 Dec 2020
Hello all,
I am having a data corruption issue in MATLAB. The issue is as follows:
I am running a function from the command line. The function takes a path to a directory containing a number of binary data files. It loops over the data files and 'maps' each one (they are datagram binary files, so I am mapping packet locations for quicker reading later). The map is a structure array, f_map, which is saved to a .mat file. In addition to the map file, an index file is created that keeps track of the mapping success in map_index. The code is shown below.
I keep having to add spaghetti-code checks so that the function does not fail or corrupt the data files. Originally I didn't save the index, but when saving f_map failed I had to re-create map_index, so I switched to saving the index each time. Then I ran into save corrupting map_index. So I used a matfile object, which is better suited for this, and when that ran into corruption issues too I decided to ask for help.
Additional information:
Dataset and map file directories are on network drives. I access them through a mapped drive path, as our IT suggested back in the day.
Files are ~2.15GB each.
Map files are saved as .mat v7.3
My machine has 32GB of ram, a 1TB disk and a 1Gbps ethernet connection.
I am doing this work remotely.
The program, if it didn't crash, would likely still take two full days to map.
Does anyone have any ideas or improvements to mitigate this situation?
function map_index = map_all_SCORE_datafiles(data_dir_path,map_dir_path)
% Function generates a packet map of each data file in data_dir_path. Maps
% for each file are saved in the map_dir_path as .mat files. An index of
% map files stores the map file information and mapping success.
%
% USAGE:
% map_all_SCORE_datafiles(data_dir_path,map_dir_path);
%
% map_index = map_all_SCORE_datafiles(data_dir_path,map_dir_path);
%
%
% INPUTS:
% data_dir_path - string containing the directory path to
% the binary datafiles
%
% map_dir_path - string containing the directory path to
% save the index files.
%
%
% OUTPUTS:
% map_index - a structure array containing information
% on the success of the function
% execution
%
%
%% Parameters
fname_map_index = 'index_of_map_files';
save_ext = '.mat';
%% Check for directory existence
% Check for data directory
if ~exist(data_dir_path,'dir')
error('data directory not found')
end
% Check for map directory
if ~exist(map_dir_path,'dir')
error('map file directory not found');
end
%% Get the list of all files in the data
d_dir = dir(fullfile(data_dir_path,'**\**'));
d_dir = d_dir(~[d_dir.isdir]);
num_files = numel(d_dir);
% sort directory by time (using the seconds in the file name)
temp = cellfun(@(x) x(6:end),{d_dir(:).name}','UniformOutput',false);
temp = str2double(temp); % str2double avoids the eval that str2num relies on
[~,inx] = sort(temp);
d_dir = d_dir(inx);
%% Preallocate the mapfile_index structure array
%check for an index file
file_index = fullfile(map_dir_path,[fname_map_index,save_ext]);
if exist(file_index,'file')
mp_io_inx = matfile(file_index,'Writable',true);
[~,sz_index] = size(mp_io_inx.map_index);
if sz_index ~= num_files
% this index is not the same size as the dataset, idk why it would
% be that way but it means that we need to recreate the index file
delete(mp_io_inx); clear mp_io_inx; % clear should work fine but sometimes the matfile object keeps control of the object....
mk_index = true;
else
mk_index = false;
end
else
% index not found, create it
mk_index = true;
end
if mk_index == true
map_index.raw_file = [];
map_index.raw_folder = [];
map_index.mapped = [];
map_index.file_size = [];
map_index.packet_count = [];
map_index.start_time = [];
map_index.end_time = [];
map_index.map_folder = [];
map_index(num_files).map_file_name = [];
save(file_index,'map_index');
mp_io_inx = matfile(file_index,'Writable',true);
end
%% Get list of existing map files in map directory
mp_dir = dir(map_dir_path);
mp_dir = mp_dir(~[mp_dir.isdir]);
mp_dir_fname = {mp_dir(:).name};
%% Loop over each file in the data directory and map if necessary.
fprintf(1,'Mapping dataset at : %s \n\n',data_dir_path)
t_func_start = tic;
t_iter_av = 0;
for ii = 1:num_files
t_iter_start = tic;
%% update command window
fprintf(1,'working on file #%d of %d \n',ii,num_files);
%% manage file names
raw_file_name = fullfile(d_dir(ii).folder, d_dir(ii).name);
map_file_name = ['map_',d_dir(ii).name,save_ext];
map_file_path = fullfile(map_dir_path,map_file_name);
fprintf(1,'full file name: %s \n',raw_file_name);
%% Check if already mapped
[map_check] = check_map_file(mp_dir_fname,map_file_path,mp_io_inx.map_index(1,ii));
if map_check == 0
% Map not present, or mapfile data does not match map index data
%% map data file
[f_map, map_stats,map_success] = map_SCORE_datafile(raw_file_name,1);
%% update stats
t_map_index.raw_file = d_dir(ii).name;
t_map_index.raw_folder = d_dir(ii).folder;
if map_success
t_map_index.mapped = 1;
t_map_index.file_size = map_stats.file_size;
t_map_index.packet_count = map_stats.packet_count;
t_map_index.start_time = map_stats.start_time;
t_map_index.end_time = map_stats.end_time;
t_map_index.map_folder = map_dir_path;
t_map_index.map_file_name = map_file_name;
fprintf(1,'\tfile size : %0.2fGB \n',map_stats.file_size/1e9);
fprintf(1,'\tpacket count: %d \n',map_stats.packet_count);
else
t_map_index.mapped = 0;
fprintf(1,'\tmapping operation failed!\n');
end
%% save data map - The try/catch block is because of save() failing to close the file, corrupting it.
if map_success
fprintf(1,'\tsaving map file : %s \n',map_file_path);
try
save(map_file_path,'f_map','map_stats');
catch
% it failed to save. try again
fprintf(1,'\t cannot save map file: trying again \n');
pause(1)
try
% function form of delete also works when the path contains spaces
if exist(map_file_path,'file'), delete(map_file_path); end
save(map_file_path,'f_map','map_stats');
catch
% it failed a second time, skip it.
fprintf(1,'\t saving failed for second time. skip file \n');
% remove any partial file; function form also works when the path contains spaces
if exist(map_file_path,'file'), delete(map_file_path); end
t_map_index.mapped = 0;
t_map_index.map_folder = [];
t_map_index.map_file_name = [];
end
end
end
%% Update index info
mp_io_inx.map_index(1,ii) = t_map_index; % ERRORS:Error closing file ...\data_file_maps\index_of_map_files.mat. The file may be corrupt.
elseif map_check == 2
% Update the map index instead of mapping file. This only happens because the index gets corrupted from above
load(map_file_path,'map_stats');
temp_index = mp_io_inx.map_index(1,ii);
temp_index.raw_file = d_dir(ii).name;
temp_index.raw_folder = d_dir(ii).folder;
temp_index.mapped = 1;
temp_index.file_size = map_stats.file_size;
temp_index.packet_count = map_stats.packet_count;
temp_index.start_time = map_stats.start_time;
temp_index.end_time = map_stats.end_time;
temp_index.map_folder = map_dir_path;
temp_index.map_file_name = map_file_name;
mp_io_inx.map_index(1,ii) = temp_index; % ERRORS:Error closing file ...\data_file_maps\index_of_map_files.mat. The file may be corrupt.
fprintf(1,'\tfile size : %0.2fGB \n',map_stats.file_size/1e9);
fprintf(1,'\tpacket count: %d \n',map_stats.packet_count);
end
%% update window
t_iter_end = toc(t_iter_start); %seconds
t_iter_av = t_iter_av + (t_iter_end-t_iter_av)/ii; %seconds
t_iter_rem = (t_iter_av*(num_files-ii))/60; % minutes
fprintf(1,'mapping elapsed time : %0.2f \n',t_iter_end);
fprintf(1,'estimated time remaining: %0.2f min (%0.2f hr) \n',...
t_iter_rem, t_iter_rem/60);
fprintf(1,' \n\n');
end
t_func_end = toc(t_func_start); % seconds
fprintf(1,'process finished!!!\n');
fprintf(1,'elapsed time : %0.2f hr\n',t_func_end/3600);
map_index = mp_io_inx.map_index; % Pull the map index back in for user inspection at end of run
t_files_read = sum([map_index(:).mapped]);
fprintf(1,'total files mapped: %d of %d \n\n',t_files_read,num_files);
end
%% Helper function
function [map_check] = check_map_file(mp_dir_fname,map_file_path,map_index)
% Helper function. Checks if map_file already exists and if the data is
% up-to-date
[~,map_file_name,ext] = fileparts(map_file_path);
%% Check if we have mapped this file already
if any(strcmp(mp_dir_fname,[map_file_name,ext]))
% mapfile is present -> check data consistency
% Check mapfile can be opened and has the map stats.
warning('off','MATLAB:whos:UnableToRead'); % suppress the warning while probing a map file that may be unreadable
mp_io = matfile(map_file_path);
try
map_stats = mp_io.map_stats;
catch
map_stats = [];
end
warning('on','MATLAB:whos:UnableToRead');
delete(mp_io); clear mp_io;
if isstruct(map_stats)
% mapfile can be opened -> check stats match
temp = struct2cell(map_index);
temp = temp([1,4:7]);
temp2 = struct2cell(map_stats);
if all(cellfun(@isempty,temp))
% This is an empty map entry, just fill the entry. I hate
% matlab at this point
fprintf(1,'\tMap check: file already mapped, map_index out-of-date -> update index, Continue\n');
map_check = 2;
elseif isequal(temp,temp2)
% map stats match the map index info -> return true
fprintf(1,'\tMap check: file already mapped, map_index is up-to-date -> Continue\n');
map_check = 1;
else
% map stats do not match the index info -> return false
fprintf(1,'\tMap check: map_index is out of date -> re-map raw file.\n');
map_check = 0;
end
else
% mapfile cannot be opened -> return false
fprintf(1,'\tMap check: cannot open map_file -> re-map raw file.\n');
map_check = 0;
end
else
% File is not present -> map
fprintf(1,'\tMap check: file not yet mapped.\n');
map_check = 0;
end
end
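One possible mitigation for the failed saves, sketched here under the assumption that the trouble comes from writing the v7.3 file directly to the mapped network drive: save to a local scratch file first, check that it reads back, and only then move it to the map directory. The folder demo_map_dir and the placeholder f_map and map_stats values are hypothetical stand-ins, not the real data.

```matlab
% Hypothetical stand-ins for the variables used in the function above.
f_map = struct('offset', {0, 512, 1024});            % placeholder packet map
map_stats = struct('file_size', 1536, 'packet_count', 3);
map_dir_path = fullfile(tempdir, 'demo_map_dir');    % stands in for the network folder
if ~exist(map_dir_path, 'dir')
    mkdir(map_dir_path);
end
map_file_path = fullfile(map_dir_path, 'map_demo.mat');

% 1) Save to a local scratch file first.
local_file = [tempname, '.mat'];
save(local_file, 'f_map', 'map_stats', '-v7.3');

% 2) Verify that the local copy reads back before it goes anywhere.
check = load(local_file, 'map_stats');
assert(isequal(check.map_stats, map_stats), 'local save could not be verified');

% 3) Move the finished file to its destination ('f' overwrites an existing file).
[ok, msg] = movefile(local_file, map_file_path, 'f');
if ~ok
    warning('Could not move map file: %s', msg);
end
```

With this layout a failed transfer leaves the verified local copy in place, so the move can be retried without re-mapping the raw file. Whether it actually avoids the corruption on this particular storage setup is untested.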
  4 Comments
Mike on 6 Dec 2020
@WalterRobertson: The network is not OneDrive; I should say we have a local network, which our IT department manages. Checking the drive info, I can say that it is definitely NTFS. Doing a little more research, it looks like the data is stored on an SVM cluster (storage virtual machine?).
@dpb: Initially no. But what I am noticing now is that the loop fails on iteration 408. The map index file is consistently being corrupted on that loop iteration. I guess I will report back shortly.
Thanks for taking the time to read!
Mike on 6 Dec 2020
After stepping through line by line in that iteration it is directly failing at this code block:
elseif map_check == 2
...
%% Update index info
try
mp_io_inx.map_index(1,ii) = t_map_index; %<- This line here
catch
keyboard
end
...
I am able to load and view the matfile object and everything. But the second it tries to do that write operation, on iteration 408, it fails. Something spooky is going on (/s). I checked that there is no variable name clashing with the local function (though they should have separate workspaces anyway, no?).
For giggles, I put a breakpoint inside the loop, at the top before anything starts. I set the iterator ii = 408, and the function was able to write to the map index file using the matfile object (obviously the old map index file was corrupted, so this info was written to a new, empty map index file). Rerunning it now, it should technically be able to skip over this index. We shall see what happens.
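To narrow down a failure like the one on iteration 408, it may help to capture the exception in the catch block rather than only dropping to keyboard. The sketch below assumes a small two-entry index written to a local scratch file; the field subset and the value 'demo_file' are hypothetical, and it is meant only to show the logging pattern, not to reproduce the corruption.

```matlab
% Hypothetical two-entry index in a local scratch file
% (a stand-in for index_of_map_files.mat).
file_index = [tempname, '.mat'];
map_index = struct('raw_file', {[], []}, 'mapped', {[], []});
save(file_index, 'map_index', '-v7.3');
mp_io_inx = matfile(file_index, 'Writable', true);

t_map_index = struct('raw_file', 'demo_file', 'mapped', 1);
ii = 2;
write_error = [];              % stays empty if the write goes through
try
    mp_io_inx.map_index(1, ii) = t_map_index;
catch err
    write_error = err;         % keep the MException for later inspection
    fprintf(2, 'index write failed at ii = %d\n', ii);
    fprintf(2, '  identifier: %s\n', err.identifier);
    fprintf(2, '  message   : %s\n', err.message);
    % Full report with the call stack, without command-window hyperlinks.
    fprintf(2, '%s\n', getReport(err, 'extended', 'hyperlinks', 'off'));
end
```

The identifier and stack make it easier to tell a file-level problem (the file cannot be closed or reopened) from a data-level one (for example, a struct whose fields do not match the index), which the bare "file may be corrupt" message does not distinguish.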

Answers (0)

Release

R2020a