This question is closed. Reopen it to edit or answer.
Speeding up text reading
I am currently reading a "large" .txt file, for which I used the Import Tool in MATLAB and then copied the generated code into my script. However, I think there is a section of the code (below) that slows the process down quite significantly. I don't know what every single bit means, so I was wondering how I can get rid of whatever is slowing the reading down without affecting the output. Perhaps there are unnecessary processes or steps?
Thanks a lot in advance!
formatSpec = '%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%[^\n\r]';
                            fileID = fopen(filename,'r','n','UTF-8');
                            fseek(fileID, 3, 'bof');
                            textscan(fileID, '%[^\n\r]', startRow(1,b), 'ReturnOnError', false);
                            dataArray = textscan(fileID, formatSpec, endRow(1,b)-startRow(1,b), 'Delimiter', delimiter, 'ReturnOnError', false);
                            raw = repmat({''},length(dataArray{1}),length(dataArray)-1);
                            for col=1:length(dataArray)-1
                                raw(1:length(dataArray{col}),col) = dataArray{col};
                            end
                            numericData = NaN(size(dataArray{1},1),size(dataArray,2));
                            for col=[1:419]
                                rawData = dataArray{col};
                                for row=1:size(rawData, 1)
                                    regexstr = '(?<prefix>.*?)(?<numbers>([-]*(\d+[\,]*)+[\.]{0,1}\d*[eEdD]{0,1}[-+]*\d*[i]{0,1})|([-]*(\d+[\,]*)*[\.]{1,1}\d+[eEdD]{0,1}[-+]*\d*[i]{0,1}))(?<suffix>.*)';
                                    try
                                        result = regexp(rawData{row}, regexstr, 'names');
                                        numbers = result.numbers;
                                        invalidThousandsSeparator = false;
                                        if any(numbers==',')
                                            thousandsRegExp = '^\d+?(\,\d{3})*\.{0,1}\d*$';
                                            if isempty(regexp(numbers, thousandsRegExp, 'once'))
                                                numbers = NaN;
                                                invalidThousandsSeparator = true;
                                            end
                                        end
                                        if ~invalidThousandsSeparator
                                            numbers = textscan(strrep(numbers, ',', ''), '%f');
                                            numericData(row, col) = numbers{1};
                                            raw{row, col} = numbers{1};
                                        end
                                    catch me
                                    end
                                end
                            end
                            R = cellfun(@(x) ~isnumeric(x) && ~islogical(x),raw);
                            raw(R) = {NaN};
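For context, the nested loop above calls regexp and textscan once per cell, which is probably where most of the time goes for a file with 419 columns. A minimal sketch of a vectorized alternative follows, assuming every field is either a plain number (optionally with thousands commas) or text that should become NaN. The small dataArray here is a hypothetical stand-in for the textscan output, not the real file. Unlike the generated regular expression, str2double does not pull a number out of surrounding text, so a field such as '12 kg' would become NaN rather than 12.

```matlab
% Hypothetical stand-in for the '%s' columns returned by textscan:
% one cell per column, each holding an N-by-1 cell array of text.
% With the real output, drop the trailing '%[^\n\r]' column first,
% e.g. dataArray = dataArray(1:end-1);
dataArray = {{'1.5'; '2,000'; 'abc'}, {'-3e2'; ''; '7'}};

% Concatenate the columns into one N-by-M cell array of text.
% This assumes all columns have the same number of rows.
rawText = [dataArray{:}];

% str2double converts the whole cell array in a single call.
% Text that is not a plain number (including empty text) becomes NaN.
numericData = str2double(rawText);

disp(numericData)
```

If every column is known to be purely numeric, another option might be to use %f instead of %s in formatSpec so that textscan does the conversion itself, although fields that cannot be parsed as numbers would then need separate handling.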
