folders2labels

Get list of labels from folder names

Since R2021a

Description

Use this function when you are working on a machine learning or deep learning classification problem and your labeled data is stored in folders whose names are the corresponding labels.

lbls = folders2labels(loc) creates a list of labels based on the folder names specified by the location loc.

lbls = folders2labels(loc,Name,Value) specifies additional input arguments using name-value pairs. For example, 'FileExtensions','.mat' includes only .mat files in the scan for labels.

lbls = folders2labels(ds) creates a list of labels based on the files contained in ds. ds can be a datastore, a matlab.io.datastore.FileSet object, or a matlab.io.datastore.BlockedFileSet object.

[lbls,files] = folders2labels(___) additionally returns a list of files. The ith element of lbls corresponds to the label of the ith file in files.

Examples

Create a folder called Files in the current folder containing three subfolders, Files_1, Files_2, and Files_3. Add to each subfolder a random number of files, each containing a random signal of random size.

mkdir Files

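for kj = 1:3
    % Create subfolder Files_kj in the current folder.
    fname = "Files_" + kj;
    mkdir(fname)
    % Save between one and four signal files and move each into the subfolder.
    for jk = 1:randi(4)
        sname = "sig_" + kj + "_" + jk;
        % Random signal with 30 to 50 samples and one or two channels.
        sgn = randn(randi([30 50]),randi(2));
        save(sname,"sgn")
        movefile(sname + ".mat",fname)
    end
    % Move the populated subfolder into Files.
    movefile(fname,"Files")
end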

List the contents of the folders.

dir("*/*/*")
Files Found in: Files/Files_1

.            ..           sig_1_1.mat  sig_1_2.mat  sig_1_3.mat  sig_1_4.mat  

Files Found in: Files/Files_2

.            ..           sig_2_1.mat  sig_2_2.mat  

Files Found in: Files/Files_3

.            ..           sig_3_1.mat  sig_3_2.mat  sig_3_3.mat  

Create a list of labels based on the folder names.

lbls = folders2labels("Files")
lbls = 9x1 categorical
     Files_1 
     Files_1 
     Files_1 
     Files_1 
     Files_2 
     Files_2 
     Files_3 
     Files_3 
     Files_3 
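To check how the files are distributed across labels, you can summarize the categorical output. The following is a minimal sketch that assumes the Files folder created earlier in this example still exists. The variable names classNames, classCounts, and classSummary are illustrative only.

```matlab
% Assumes the Files folder created earlier in this example still exists.
lbls = folders2labels("Files");

% categories returns the distinct label names. countcats returns how many
% elements of lbls fall in each category, in the same order.
classNames = categories(lbls);
classCounts = countcats(lbls);

% Show one row per label with its file count.
classSummary = table(classNames,classCounts)
```

Because the number of files in each subfolder is random, the counts depend on the state of the random number generator when the folders were created.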

List the names of the files associated with the labels.

[~,files] = folders2labels("Files");
[~,fnames] = fileparts(files)
fnames = 9x1 string
    "sig_1_1"
    "sig_1_2"
    "sig_1_3"
    "sig_1_4"
    "sig_2_1"
    "sig_2_2"
    "sig_3_1"
    "sig_3_2"
    "sig_3_3"

Remove the Files directory you created at the beginning of the example.

rmdir Files s

Input Arguments

loc

Files or folders to scan for labels, specified as a character vector, a cell array of character vectors, a string scalar, or a string array. The value gives the location of local or remote files or folders.

  • Local files or folders — Specify loc as a local path to files or folders. If the files are not in the current folder, then the local path must specify full or relative paths. Files within subfolders of the specified folder are included by default. You can use the wildcard character (*) when specifying the local path. This character specifies that the file search include all matching files or all files in the matching folders.

  • A remote location specified using an internationalized resource identifier (IRI).

  • Remote files or folders — Specify loc to be the full paths of the files or folders as a uniform resource locator (URL) of the form hdfs:///path_to_file. For more information, see Work with Remote Data.

By default, folders2labels scans files of all formats. To restrict the scan to a custom list of file extensions, use the FileExtensions argument.

Example: 'whale.mat'

Example: '../dir/data/signal.mat'

Example: "../dir/data/"

Example: {'dataFiles/Files_1/' 'dataFiles/Files_2/'}

Example: ["dataFiles/Files_1/" "dataFiles/Files_2/"]

Data Types: char | string | cell
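The forms of loc described above can be sketched as follows. This is an illustration under stated assumptions, not output from the original page: the folder f2l_loc_demo, the class names noise and tone, and all variable names are hypothetical, and the demo tree is created in the temporary folder so that the snippet does not depend on existing data.

```matlab
% Hypothetical demo tree: <tempdir>/f2l_loc_demo/{noise,tone}/sig_k.mat
root = fullfile(tempdir,"f2l_loc_demo");
if isfolder(root), rmdir(root,"s"); end
for c = ["noise" "tone"]
    mkdir(fullfile(root,c))
    for k = 1:2
        sgn = randn(10,1);
        save(fullfile(root,c,"sig_" + k + ".mat"),"sgn")
    end
end

% A single folder, specified as a string scalar.
lblsRoot = folders2labels(root);

% Several folders, specified as a string array.
lblsPair = folders2labels([fullfile(root,"noise") fullfile(root,"tone")]);

% A wildcard that matches the MAT-files in one folder.
lblsWild = folders2labels(fullfile(root,"tone","*.mat"));

% Remove the demo tree.
rmdir(root,"s")
```

Each call returns one label per file found, so comparing numel(lblsRoot), numel(lblsPair), and numel(lblsWild) is a quick way to confirm which files each form of loc picked up.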

ds

Data repository, specified as a datastore, a matlab.io.datastore.FileSet object, or a matlab.io.datastore.BlockedFileSet object.
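As an illustration of passing a data repository instead of a path, the sketch below builds a matlab.io.datastore.FileSet object over a small demo tree and hands it to folders2labels. The folder f2l_ds_demo, the class names, and the variable names are hypothetical and chosen only for this example.

```matlab
% Hypothetical demo tree: <tempdir>/f2l_ds_demo/{noise,tone}/sig_k.mat
root = fullfile(tempdir,"f2l_ds_demo");
if isfolder(root), rmdir(root,"s"); end
for c = ["noise" "tone"]
    mkdir(fullfile(root,c))
    for k = 1:2
        sgn = randn(10,1);
        save(fullfile(root,c,"sig_" + k + ".mat"),"sgn")
    end
end

% Collect the MAT-files in the tree, including those in subfolders.
fs = matlab.io.datastore.FileSet(root,"IncludeSubfolders",true, ...
    "FileExtensions",".mat");

% Derive labels from the files listed in the FileSet object.
[lbls,files] = folders2labels(fs);

% Row i pairs the ith file with its label.
fileLabels = table(files,lbls)

% Remove the demo tree.
rmdir(root,"s")
```

With this approach the file selection (subfolders, extensions) is decided when the FileSet object is constructed, so the same object can be reused wherever the file list is needed.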

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: folders2labels('C:\dir\signaldata','FileExtensions','.csv') specifies a local path and includes only CSV files in the scan for labels.

IncludeSubfolders

Subfolder inclusion flag, specified as true or false. Specify true to include all files and subfolders within each folder or false to include only the files within each folder.

Example: 'IncludeSubfolders',true

Data Types: logical | double
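The effect of this flag can be sketched by scanning a folder that contains one file directly and one file in a nested subfolder. The folder names f2l_sub_demo, tone, and extra and the variable names are hypothetical. The Name=Value syntax used here requires R2021a or later.

```matlab
% Hypothetical demo tree:
%   <tempdir>/f2l_sub_demo/tone/sig_1.mat
%   <tempdir>/f2l_sub_demo/tone/extra/sig_2.mat
root = fullfile(tempdir,"f2l_sub_demo");
if isfolder(root), rmdir(root,"s"); end
mkdir(fullfile(root,"tone","extra"))
sgn = randn(10,1);
save(fullfile(root,"tone","sig_1.mat"),"sgn")
save(fullfile(root,"tone","extra","sig_2.mat"),"sgn")

toneDir = fullfile(root,"tone");

% Scan the folder and everything beneath it.
[~,filesAll] = folders2labels(toneDir,IncludeSubfolders=true);

% Scan only the files that sit directly in the folder.
[~,filesTop] = folders2labels(toneDir,IncludeSubfolders=false);

fprintf("With subfolders: %d file(s); without: %d file(s)\n", ...
    numel(filesAll),numel(filesTop))

% Remove the demo tree.
rmdir(root,"s")
```

Based on the description above, the scan without subfolders should find no more files than the scan with subfolders.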

FileExtensions

Signal file extensions, specified as a string scalar, string array, character vector, or cell array of character vectors.

Example: 'FileExtensions','.csv'

Data Types: string | char | cell
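A minimal sketch of filtering by extension follows. It assumes a demo tree in which every class folder holds one MAT-file and one CSV file. The folder f2l_ext_demo, the class names, and the variable names are hypothetical. The Name=Value syntax used here requires R2021a or later.

```matlab
% Hypothetical demo tree: each class folder holds sig.mat and sig.csv.
root = fullfile(tempdir,"f2l_ext_demo");
if isfolder(root), rmdir(root,"s"); end
for c = ["noise" "tone"]
    mkdir(fullfile(root,c))
    sgn = randn(10,1);
    save(fullfile(root,c,"sig.mat"),"sgn")
    writematrix(sgn,fullfile(root,c,"sig.csv"))
end

% Restrict the scan to MAT-files.
[lblsMat,filesMat] = folders2labels(root,FileExtensions=".mat");

% Allow both extensions by passing a string array.
[lblsBoth,filesBoth] = folders2labels(root,FileExtensions=[".mat" ".csv"]);

% Inspect the extensions of the files kept by the first scan.
[~,~,ext] = fileparts(filesMat);
disp(unique(ext))

% Remove the demo tree.
rmdir(root,"s")
```

Filtering by extension is useful when label folders also contain metadata or notes that should not be counted as labeled samples.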

Output Arguments

lbls

List of labels, returned as a categorical vector.

files

List of files, returned as a string vector. The ith element of lbls is the label of the ith file in files.

Version History

Introduced in R2021a