Filter excel files by sheetnames
Show older comments
Hi, I am trying to filter out excel files by checking whether it has a sheet named "Cooking_is_fun" within the excel.
This is the code I have been working on.
function hassheet = CheckForWorksheet(filepath, filelist, sheetname)
hassheet = false(size(filelist));
excel = actxserver('Excel.Application');
cleanupobj = onCleanup(@() excel.Quit);
for fileidx = 1:numel(filelist)
workbook = excel.Workbooks.Open(filepath{fileidx}, false, true);
sheetnames = arrayfun(@(i) workbook.Sheets.Item(i).Name, 1:workbook.Sheets.Count, 'UniformOutput', false);
if ismember(sheetname, sheetnames)
hassheet(fileidx) = true;
end
end
end
%now call the function, note that data is a pre-existing structure array that I have.
filepath = {data.AllExcelPath};
filelist = {data.AllExcel};
hassheet = CheckForWorksheet(SubFolder, filelist, 'Cooking_is_fun');
wantedExcel = filelist(hassheet);
However, this kept on giving me errors such as:
Cell contents reference from a non-cell array object.
Error in Open_Excel_Files_All/CheckForWorksheet
workbook = excel.Workbooks.Open(filepath{fileidx}, false, true);
Error in Open_Excel_Files_All
hassheet = CheckForWorksheet(SubFolder, filelist,'Cooking_is_fun');
My goal is to:
be able to write the function and call the function in the same script, so that when I send the code to someone they don't have to run a function and a m-file separately.
be able to extract wanted excel files based on whether it contain a sheet named "Cooking_is_fun".
I am fairly new to matlab and am still confused with many basic concepts. Please advice me with any tips/guide.
Thank you very much for reading and any help will be greatly appreciated :)
2 Comments
Guillaume
on 13 Jul 2016
Note that I've just realised that there's a minor bug (I never close the workbooks I open) in the original code I wrote for CheckForWorksheet. Fixed in the original discussion.
chlor thanks
on 13 Jul 2016
Edited: chlor thanks
on 13 Jul 2016
Accepted Answer
More Answers (2)
Walter Roberson
on 12 Jul 2016
1 vote
You construct a variable named filepath as a cell array, but what you pass as the first parameter in your call is SubFolder which we do not know the contents of.
8 Comments
Guillaume
on 12 Jul 2016
Note that this a continuation of http://uk.mathworks.com/matlabcentral/answers/293943-sort-excel-files-by-file-content.
@chlor, It would have been better to continue the conversion in that question.
1. You've removed the documentation (the comments) that I wrote for the function. It clearly stated what the purpose of each input was. In particular:
%folder: folder where the excel files are located (1D char array / string)
%filelist: names of excel files (cell array of 1D char arrays / string array)
2. Then you went on to change the code for opening the excel file. My original code constructed a full path for the file using the folder name and one of the file name from the filelist.
Your new code doesn't make much sense. You're iterating over the number of elements in filelist, yet never use filelist anywhere and use that iterator to extract elements in a cell array filepath instead. Is there a guarantee that the two cell arrays have the same size? Probably not. Is filepath supposed to be a cell array in the first place? probably not.
What was the purpose of the code change? If you're already passing the full path of the files, then change the signature of the function to:
function hassheet = CheckExcelFilesForWorksheet(filepaths, sheetname)
%HASSHEET Check whether or not the given excel files have a sheet with the given name
%filepaths: full path of excel files (cell array of 1D char arrays / string array)
%sheetname: name of sheet to find in excel files (1D char array / string)
%hassheet: array the same size as filelist, indicating whether or not the excel file has a sheet with sheetname (logical)
and change the loop to:
for fileidx = 1:numel(filepaths)
%open without updating link and as read only. read only ensure the file can be opened even if it's already in use:
workbook = excel.Workbooks.Open(filepaths{fileidx}), false, true);
Note that comments in code are a good thing. Also note that the comments immediately after the function signature show when you type help hasssheet or doc hassheet so are even more essential.
chlor thanks
on 12 Jul 2016
chlor thanks
on 12 Jul 2016
chlor thanks
on 12 Jul 2016
Edited: chlor thanks
on 12 Jul 2016
Guillaume
on 12 Jul 2016
You were correct that it should have been size(filelist). I've edited the post to fix the error.
If you get a 'not enough input argument' errors that is probably because you haven't supplied enough arguments when you called the function hassheet.
Can you clarify what exactly are the inputs that you'd like to pass to the function, what they contain and what type (e.g. cell arrays) they are.
Note that for testing if a sheet belongs to an excel file, the code is simply:
%inputs are fullpathoffile and sheetname, both strings
%output is hassheet, logical
excel = actxserver('Excel.Application');
workbook = excel.Workbooks.Open(fullpathoffile, false, true);
sheetnames = arrayfun(@(i) workbook.Sheets.Item(i).Name, 1:workbook.Sheets.Count, 'UniformOutput', false);
hassheet = ismember(sheetname, sheetnames);
excel.Quit;
Because the above starts and close excel each time, it is better to wrap the check in a loop as I did with the function and only start and close excel once.
chlor thanks
on 12 Jul 2016
Edited: chlor thanks
on 12 Jul 2016
Guillaume
on 12 Jul 2016
In general, you cannot call a function without input arguments just to test it, regardless of the version of matlab. If you try to call the built-in sin function without an input, you're going to get the same error.
Some functions may work without any input (e.g. spy) but they're in the minority and their behaviour is most likely different from when inputs are supplied.
chlor thanks
on 12 Jul 2016
Edited: chlor thanks
on 12 Jul 2016
Image Analyst
on 12 Jul 2016
0 votes
Rather than using your own CheckForWorksheet() function, you might use the built in function for doing that. From the help:
xlsfinfo
Determine if file contains Microsoft Excel spreadsheet
[status,sheets] = xlsfinfo(filename) additionally returns the name of each spreadsheet in the file.
6 Comments
Guillaume
on 13 Jul 2016
He! I wasn't aware of xlsfinfo. Yes, it can be used as a replacement for part of the code I've written. Internally, it is more or less the same code anyway (if you have excel installed).
However, i find Mathwork's description, "Determine if file contains Microsoft Excel spreadsheet", misleading. It simply returns all the spreadsheet, it does not check if the file contains a particular spreadsheet. You still need an ismember call for that.
chlor thanks
on 13 Jul 2016
Edited: chlor thanks
on 19 Jul 2016
Image Analyst
on 13 Jul 2016
As of R2015b, Excel doesn't launch and shutdown each time, at least not with xlsread() and xlswrite() so I imagine it would be the same with xlsfinfo().
chlor thanks
on 14 Jul 2016
Edited: chlor thanks
on 14 Jul 2016
Image Analyst
on 14 Jul 2016
It's the version of MATLAB you have that matters, not the version of Excel.
Categories
Find more on Spreadsheets in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!