Reading a variable that has different names in different data files

Assume I am using the following lines to read some data files that have a table structure:
for k = 1 : nfiles
    fullFileName = fullnames{k};
    %reading the file as a table
    opts = detectImportOptions(fullFileName);
    getvaropts(opts,{'YEAR','MONTH','DAY','HOUR','MIN','GDALT','NE8'});
    opts.SelectedVariableNames = {'YEAR','MONTH','DAY','HOUR','MIN','GDALT','NE8'};
    t = readtable(fullFileName,opts);
end
Later on I discovered that some files do not have the variable NE8 or GDALT, or have them under different names. How can I skip the files that do not have these variables, and how can I read the variables from files that store them under different names?
  2 Comments
Walter Roberson on 2 Oct 2021
Call detectImportOptions and then examine the variable names returned in the options object (opts.VariableNames) to see whether the file has the variables you need.
getvaropts(opts,{'YEAR','MONTH','DAY','HOUR','MIN','GDALT','NE8'});
That line is not doing anything useful for you -- not unless you remove the semi-colon so that you can display the output.
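As a minimal sketch of that check, the snippet below writes a small hypothetical CSV file (the file and its contents are made up for illustration), detects its import options, and lists which of the wanted names are missing. The variable names `demoFile`, `wanted`, `present`, and `missing` are assumptions, not part of the original code.

```matlab
% Hypothetical demo file; in practice fullFileName comes from the loop above.
demoFile = [tempname, '.csv'];
demoTable = table([2021; 2021], [10; 10], [250.5; 260.0], ...
    'VariableNames', {'YEAR', 'MONTH', 'GDALT'});
writetable(demoTable, demoFile);

% Detect the import options and compare the header names with the wanted ones.
opts = detectImportOptions(demoFile);
wanted = {'YEAR', 'MONTH', 'GDALT', 'NE8'};
present = ismember(wanted, opts.VariableNames);  % logical mask, one entry per wanted name
missing = wanted(~present);                      % names that this file does not contain
disp(missing)

delete(demoFile);  % clean up the temporary file
```

If `missing` is non-empty, setting `opts.SelectedVariableNames` to the full `wanted` list would raise the "Unknown variable name" error, so the file can be skipped or handled separately before that assignment.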
MA on 2 Oct 2021
Edited: MA on 2 Oct 2021
This is the error I am getting:
Error using matlab.io.ImportOptions/getNumericSelection (line 518)
Unknown variable name: 'POP'.
Error in matlab.io.ImportOptions/set.SelectedVariableNames (line 178)
rhs = getNumericSelection(obj,rhs);
Error in extractingtables (line 30)
opts.SelectedVariableNames = {'YEAR','MONTH','DAY','HOUR','MIN','GDALT','NE8','POP'};
In one of the files we do have such a variable, as seen in the image below:
There are files that contain the NE8 variable, files that contain it under the name POP, and files that do not have it at all. I am getting errors in both the second and third cases, since not all the files have 'POP' and some files do not have 'NE8'.


Accepted Answer

Walter Roberson on 2 Oct 2021
The assumption below is that if GDALT is missing then you do not want the file, but that if NE8 is missing then you can use POP interchangeably, and that if both NE8 and POP are missing then you do not want the file.
needed_vars = {'YEAR','MONTH','DAY','HOUR','MIN','GDALT'};  %every one of these must be present
wanted_one_of_vars = {'NE8', 'POP'};  %alternative names for the same quantity
opts = detectImportOptions(fullFileName);
vars = opts.VariableNames;
if all(ismember(needed_vars, vars)) && any(ismember(wanted_one_of_vars, vars))
    %prefer NE8 when it is present, otherwise fall back to POP
    if ismember(wanted_one_of_vars{1}, vars)
        NE8_varname = wanted_one_of_vars{1};
    else
        NE8_varname = wanted_one_of_vars{2};
    end
else
    continue; %or whatever you need to do for files that do not have all the variables
end
opts.SelectedVariableNames = [needed_vars, {NE8_varname}];
t = readtable(fullFileName,opts);
  3 Comments
Walter Roberson on 10 Oct 2021
needed_vars = {'YEAR','MONTH','DAY','HOUR','MIN','GDALT'};
wanted_one_of_vars = {'NE8', 'POP'};
opts = detectImportOptions(fullFileName);
vars = opts.VariableNames;
if all(ismember(needed_vars, vars)) && any(ismember(wanted_one_of_vars, vars))
    if ismember(wanted_one_of_vars{1}, vars)
        NE8_varname = wanted_one_of_vars{1};
    else
        NE8_varname = wanted_one_of_vars{2};
    end
else
    continue; %or whatever you need to do for files that do not have all the variables
end
opts.SelectedVariableNames = [needed_vars, {NE8_varname}];
t = readtable(fullFileName,opts);
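One possible follow-up, sketched here under the assumption that the tables from all files should end up with the same column names: after reading, rename whichever column was selected (NE8 or POP) to a single canonical name. The demo table and the name `col` are hypothetical; `NE8_varname` is the variable chosen by the code above.

```matlab
% Hypothetical table standing in for one that was read from a file using POP.
t = table([2021; 2021], [300; 310], [1.2e11; 1.5e11], ...
    'VariableNames', {'YEAR', 'GDALT', 'POP'});
NE8_varname = 'POP';  % as chosen by the selection logic above

% Rename the selected column to the canonical name NE8 if it is not already NE8.
if ~strcmp(NE8_varname, 'NE8')
    col = find(strcmp(t.Properties.VariableNames, NE8_varname), 1);
    t.Properties.VariableNames{col} = 'NE8';
end
disp(t.Properties.VariableNames)
```

With identical variable names in every table, the per-file tables can then be stacked with vertical concatenation, which requires matching variable names.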


More Answers (1)

Sulaymon Eshkabilov on 2 Oct 2021
Have you tried this way of reading the data, e.g.:
for k = 1 : nfiles
    fullFileName = fullnames{k};
    %reading the file as a table
    opts = detectImportOptions(fullFileName);
    t{k} = readtable(fullFileName,opts);
end
That results in nfiles tables stored in a cell array called t, which can be separated in another step.
  3 Comments
Sulaymon Eshkabilov on 2 Oct 2021
Yes, indeed what you are saying is correct. Maybe in this case, you had better import just part of the data by indexes and ignore the rest without using the variable names since they are not consistent for all data files.
MA on 2 Oct 2021
Yes, I thought of that as well, but the issue is that the columns for the variables 'GDALT' and 'NE8' are in different positions in different files. In one file they may be the 6th and 8th columns, in another the 10th and 15th. That is why I am using their names in the first place, but now NE8 appears under a different name as well in some files.
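To illustrate why name-based lookup copes with shifting column positions, here is a small sketch that uses the second output of `ismember` to find where each wanted name sits in a header list. The two header lists (`varsA`, `varsB`) are made up for illustration; in practice they would come from `opts.VariableNames` for each file.

```matlab
% Hypothetical header lists from two files with different column orders.
varsA = {'YEAR', 'MONTH', 'DAY', 'HOUR', 'MIN', 'GDALT', 'X', 'NE8'};
varsB = {'YEAR', 'GDALT', 'MONTH', 'POP'};
wanted = {'GDALT', 'NE8'};

% The second output gives the column position of each wanted name, or 0 if absent.
[foundA, locA] = ismember(wanted, varsA);
[foundB, locB] = ismember(wanted, varsB);

disp(locA)  % positions of GDALT and NE8 in the first header list
disp(locB)  % a 0 entry marks a name missing from the second header list
```

A zero in the location vector signals that the name is absent, which is the case where an alternative name such as POP would need to be tried.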


Release

R2021b
