Converting .txt to matlab matrix
I have a group of .txt files that I'd like to convert into matlab matrices. Each file contains approximately 2000 lines. Each line begins with a header and then gives a value, for example "variable: 500". The variables repeat, so I'd like to create a matrix where each line gets sorted into the appropriate column of the matrix so I'll have columns of values for each variable over different trials (rows). I don't want the text to be included in the matrix, just the values. Also, not all of the values are numerical.
I'm new to matlab and I'm really not sure how to get this done.
UPDATE
I would like each file to be imported into a separate matrix.
The variables are in the same order in each file and in each iteration.
Here is an example:
****header start****
text text text
text text text
text text text
****header end****
level: 7
****log frame start****
A: valueA
B: valueB
C: img.bmp
D: valueD
E: 7483
F: 0
****log frame end****
level: 7
****log frame start****
A: valueA
B: valueB
C: img.bmp
D: valueD
E: 7483
F: 0
****log frame end****
****log frame start****
sessioninfo: sessioninfo
sessiondate: date
sessiontime: time
Answers (1)
Walter Roberson
on 11 Sep 2017
Untested code, since you did not provide sample files.
project_dir = '.';     %or name of directory files are in
file_ext = '.txt';     %use appropriate extension for your files
dinfo = dir( fullfile( project_dir, ['*', file_ext] ) );
filenames = fullfile( project_dir, {dinfo.name} );   %list of all file names
num_files = length(filenames);
results = cell(num_files, 1);
varnames = cell(num_files, 1);
for K = 1 : num_files
    thisfile = filenames{K};
    fid = fopen(thisfile, 'rt');
    if fid < 0; error('Could not open file %s', thisfile); end
    known_variables = {};   %SEE NOTE
    data = {};              %SEE NOTE
    row_num = 0;            %SEE NOTE
    last_idx = inf;         %SEE NOTE
    while ~feof(fid)
        thisline = fgetl(fid);
        if ~ischar(thisline); break; end    %end of file detected ?
        %match "tag: value"; lines without a colon (headers, frame markers) are skipped
        tokens = regexp(thisline, '(?<tag>\w+)\s*:\s*(?<value>.*?)\s*$', 'names');
        if ~isempty(tokens)
            [tf, idx] = ismember(tokens.tag, known_variables);
            %keep numeric values as numbers, everything else as text
            val_as_number = str2double(tokens.value);
            if isnan(val_as_number)
                val_to_store = tokens.value;
            else
                val_to_store = val_as_number;
            end
            if ~tf
                %first time this tag is seen: give it a new column
                known_variables{end+1} = tokens.tag;
                idx = length(known_variables);
            end
            if last_idx >= idx
                %column index did not increase, so a new row has started
                row_num = row_num + 1;
            end
            data{row_num, idx} = val_to_store;
            last_idx = idx;
        end
    end
    fclose(fid);
    results{K} = data;
    varnames{K} = known_variables;
end
The end result of this is the variable results, which will be a cell array with one entry per file, and each entry will be a cell array with one column per variable and as many rows as needed to accommodate all repetitions.
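To make the per-line handling concrete, here is a small sketch that applies the same regular expression and str2double test to a few lines taken from the example in the question. The names sample_lines and parsed are made up for this illustration; they are not part of the code above.

```matlab
% Lines copied from the example file in the question.
sample_lines = {'E: 7483', 'C: img.bmp', '****log frame start****'};
parsed = cell(numel(sample_lines), 2);   % hypothetical holder: {tag, value}

for k = 1:numel(sample_lines)
    tokens = regexp(sample_lines{k}, ...
        '(?<tag>\w+)\s*:\s*(?<value>.*?)\s*$', 'names');
    if isempty(tokens)
        continue            % no "tag: value" pair on this line
    end
    num = str2double(tokens.value);
    if isnan(num)
        value = tokens.value;   % not a number: keep the text
    else
        value = num;            % numeric: keep the double
    end
    parsed(k, :) = {tokens.tag, value};
end
```

The first line should end up stored as a number and the second as text, which is why the result has to be a cell array rather than a plain numeric matrix.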
The code does not assume that all variables are present for each row. It does, though, assume that the order of variables is consistent within any one file, so if it sees A B C B then it will assume the second B is in a different row.
Unfortunately, the logic handles newly inserted variables poorly. The sequence A B C A D B C becomes three rows (A B C empty; A empty empty D; empty B C empty), whereas if insertions could be detected it would be two rows (A empty B C; A D B C).
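The row-assignment rule can be replayed on a bare sequence of tags to see this behaviour. This is only a sketch of the same rule with the file reading stripped out; the variable layout is a name invented for the illustration and stores the tag itself in each cell so the placement is visible.

```matlab
% Replay the "new row when the column index does not increase" rule.
tags = {'A', 'B', 'C', 'A', 'D', 'B', 'C'};
known_variables = {};
layout = {};            % hypothetical: which tag landed in which cell
row_num = 0;
last_idx = inf;

for t = 1:numel(tags)
    [tf, idx] = ismember(tags{t}, known_variables);
    if ~tf
        known_variables{end+1} = tags{t};   % new tag gets the next column
        idx = numel(known_variables);
    end
    if last_idx >= idx
        row_num = row_num + 1;              % index went backwards: new row
    end
    layout{row_num, idx} = tags{t};
    last_idx = idx;
end

disp(known_variables)
disp(layout)
```

D is first seen after A on the second pass, so it is appended as a fourth column, and the B that follows has a smaller column index and therefore starts a third row.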
Because the order (and presence) of the variable names is not assumed in advance, the code also outputs the cell array varnames each entry of which is the variable column order for the associated results{K} entry.
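As one possible way to use the two outputs together, the sketch below pulls a single variable out of one file's cell array as a numeric column. The data and names values are made-up stand-ins for results{K} and varnames{K}, and treating missing or non-numeric entries as NaN is an assumption of this example, not something the code above does.

```matlab
% Stand-ins for results{K} and varnames{K} (values invented for the example).
data  = {'valueA', 7483, 0; ...
         'valueA', [],   1; ...
         'valueA', 7500, 0};
names = {'A', 'E', 'F'};

col   = find(strcmp(names, 'E'));   % column that holds variable E
cells = data(:, col);

% Anything that is not a numeric scalar (text or a missing entry) becomes NaN.
is_num   = cellfun(@(c) isnumeric(c) && isscalar(c), cells);
E_values = nan(size(cells));
E_values(is_num) = [cells{is_num}];
```

Columns that hold text, such as the image file names, would stay in the cell array; only columns that are numeric throughout can sensibly become a matrix column.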
If you look in the code you will see some lines marked %SEE NOTE. Those lines are per-file initialization, causing each file to have its own variable order and to accumulate rows only for the individual file. Your Question was not clear as to whether you wanted per-file processing or if you wanted to aggregate all of the files into one data table. If you want to aggregate all of the files together then those %SEE NOTE lines should be moved to before the "for K" line. If you want to force the variable order to be consistent between files but want each file to be processed separately, then move just the initialization of known_variables to before the "for K" line.
1 Comment
Jessica Jacobs
on 12 Sep 2017