How to create a loop that runs a function through subfolders in a directory?

Hello,
I currently have a folder with 11 subfolders in it. The subfolders have some text files, and I need to run a function through all the subfolders to extract the data from the text files. At the moment I am extracting the data one by one through each folder, because I can't figure out how to successfully do this with a loop. Any suggestions?

 Accepted Answer

You can get the sub-directories using dir:
D = dir; % D is a struct array; the first elements are usually '.' and '..', used for navigation.
for k = 3:length(D) % skip the first two entries
    currD = D(k).name; % Get the current subdirectory name
    % Run your function. Note, I am not sure how your function is written,
    % but you may need some of the following
    fList = dir(currD); % Get the file list in the subdirectory (before changing into it)
    cd(currD) % change into the subdirectory
    cd('..') % go back up before the next iteration
end
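Or you could use the struct as your looping variable (dir returns a column, so transpose it to loop over one element at a time):
for k = D(3:end)' % again skipping the first two entries
    currD = k.name;
    cd(currD)
    cd('..') % return to the parent folder before the next iteration
end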

6 Comments

The function extracts the data from the text files and saves it in a file in the directory where the text files are (so in each of the subfolders). I am currently giving your code a go; it works smoothly for the first subfolder, but when it goes into the second subfolder, it gets stuck in the function. I'll try to find what the problem is, thanks for the help!
@Brendan, since the official documentation of dir has nothing to say about '.' and '..', we can't assume that they're going to be the first two entries returned or even that they're going to be returned at all. So for correctness, this would work better:
D = dir;
D = D(~ismember({D.name}, {'.', '..'}));
for k = 1:numel(D)
%...
This is more future-proof if MathWorks decides to change the order of the returned entries or remove them altogether. To this day, it still baffles me why MathWorks decided to return these entries in the first place.
Also, I would avoid cd'ing into the directory. Just pass the path to the subfunction and use full paths for any I/O.
@Guillaume I am currently running this code with your suggestions and it seems like it is working! The code is running through each folder, applying the function and then going onto the next folder. Thank you so much!
@Guillaume Great suggestion! This will likely save me much time if/when some of my old code breaks and I can steer future code away from this. I wish I could up vote a comment!
@Guillaume can you clarify what you mean when you say "Also, I would avoid cd'ing into the directory. Just pass the path to the subfunction and use full paths for any I/O."? I'm not sure how to write that code within the loop structure. Thank you!
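For what it's worth, here is a minimal sketch of what "pass the path and use full paths" could look like inside the loop. It is only an illustration under some assumptions: rootDir, processFile and results.mat are hypothetical names, not anything from the original code, so substitute your own folder, function and output file.

```matlab
% Hypothetical top-level folder; replace with the folder that holds the subfolders.
rootDir = pwd;

D = dir(rootDir);
% Keep only real subfolders: drop plain files as well as '.' and '..'.
D = D([D.isdir] & ~ismember({D.name}, {'.', '..'}));

for k = 1:numel(D)
    subDir = fullfile(rootDir, D(k).name);        % full path, so no cd is needed
    txtFiles = dir(fullfile(subDir, '*.txt'));    % text files in this subfolder
    for j = 1:numel(txtFiles)
        inFile = fullfile(subDir, txtFiles(j).name);
        % processFile stands in for your own extraction function:
        % result = processFile(inFile);
        fprintf('Would process %s\n', inFile);
    end
    outFile = fullfile(subDir, 'results.mat');    % hypothetical output file name
    % save(outFile, 'result');
end
```

Because every read and write uses a path built with fullfile, the current folder never changes, so an error halfway through cannot leave MATLAB sitting in the wrong directory.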


More Answers (2)

You can run my function getfn to get all txt files in the current directory and all subdirectories. You can then work on the returned filenames.
fn = getfn(pwd, 'txt$')
with getfn defined as
function filenames = getfn(mydir, pattern)
%GETFN Get filenames in directory and subdirectories.
%
%   FILENAMES = GETFN(MYDIR, PATTERN)
%
% Example: Get all files that end with 'txt' in the current directory and
%          all subdirectories
%
%   fn = getfn(pwd, 'txt$')
%
% Thorsten.Hansen@psychol.uni-giessen.de 2016-07-06
    if nargin == 0
        mydir = pwd;
    end
    % computes common variable FILENAMES: get all files in MYDIR and
    % recursively traverses subdirectories to get all files in these
    % subdirectories:
    getfnrec(mydir)
    % if PATTERN is given, select only those files that match the PATTERN:
    if nargin > 1
        idx = ~cellfun(@isempty, regexp(filenames, pattern));
        filenames = filenames(idx);
    end
    function getfnrec(mydir)
        % nested function, works on common variable FILENAMES
        % recursively traverses subdirectories and returns filenames
        % with path relative to the top level directory
        d = dir(mydir);
        filenames = {d(~[d.isdir]).name};
        filenames = strcat(mydir, filesep, filenames);
        dirnames = {d([d.isdir]).name};
        dirnames = setdiff(dirnames, {'.', '..'});
        for i = 1:numel(dirnames)
            fulldirname = [mydir filesep dirnames{i}];
            filenames = [filenames, getfn(fulldirname)];
        end
    end % nested function
end

The easiest way to get subfolders is to use the built-in function genpath(). No need to even worry about dot or dot dot. genpath() returns a single character vector listing the folder and all of its subfolders, separated by pathsep; split it with strsplit() to get a cell array of folders. Then use those folders with fullfile() and dir() to get the files in those folders, for example .dat or .xlsx files or whatever. See my attached demo.
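Since genpath returns one long character vector rather than a cell array, a rough sketch of this approach might look like the following. rootDir and the '*.txt' filter are assumptions for illustration; use your own starting folder and file extension.

```matlab
rootDir = pwd;                                   % hypothetical starting folder
folders = strsplit(genpath(rootDir), pathsep);   % split the char vector into a cell array
folders = folders(~cellfun(@isempty, folders));  % drop any empty entry left by a trailing separator

for k = 1:numel(folders)
    files = dir(fullfile(folders{k}, '*.txt'));  % text files in this folder
    for j = 1:numel(files)
        fprintf('%s\n', fullfile(folders{k}, files(j).name));
    end
end
```

If the folder count comes out as 1 and no files are found, a likely cause is indexing the genpath output directly instead of splitting it on pathsep first.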

3 Comments

I have seen your code before and tried it, but the output I get is that the number of folders is 1 and that there are no files in it. I'm pretty sure I'm doing something wrong, but I'm stuck trying to fix it.
"The easiest way to get subfolders is to use the built in function genpath"
With one big caveat: folders starting with + or @ or folders named private and any of their subfolders will not be returned.
genpath is not really designed for listing arbitrary paths. It's only supposed to work for listing MATLAB toolbox folders.
It's probably OK for most use cases, but of course, in three years the script is going to be used to generate that crucial report that would have won you the Nobel prize if only it hadn't skipped over that folder named private.
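If skipped folders are a concern, one alternative is the recursive '**' wildcard that dir accepts in R2016b and later. The sketch below assumes such a release; rootDir and the '*.txt' filter are placeholders. It should not be subject to genpath's filtering of private, + and @ folders, though that is worth checking on your own folder tree.

```matlab
rootDir = pwd;                                   % hypothetical starting folder
files = dir(fullfile(rootDir, '**', '*.txt'));   % recursive wildcard, needs R2016b or later
files = files(~[files.isdir]);                   % ignore any folders whose names match *.txt

for k = 1:numel(files)
    % The folder field holds the location of each match.
    fullName = fullfile(files(k).folder, files(k).name);
    fprintf('%s\n', fullName);
end
```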
