getIndexByKey

Class: BioIndexedFile

Retrieve indices from source file associated with BioIndexedFile object using alphanumeric key

Syntax

Indices = getIndexByKey(BioIFobj, Key)
[Indices, LogicalVals] = getIndexByKey(BioIFobj, Key)

Description

Indices = getIndexByKey(BioIFobj, Key) returns Indices, a numeric vector of the indices of the entries in the source file associated with BioIFobj, a BioIndexedFile object, that have the keys specified by Key. Key is a character vector or cell array of character vectors specifying one or more alphanumeric keys. If the keys in the source file are unique, there is a one-to-one relationship between the number and order of elements in Key and the output Indices. If the keys in the source file are not unique, Indices contains all indices of entries that match each specified key, grouped at the position that the key occupies in the Key cell array.

[Indices, LogicalVals] = getIndexByKey(BioIFobj, Key) also returns LogicalVals, a logical vector with the same number of elements as Indices. LogicalVals flags only the last match for each key, such that there is a one-to-one relationship between the number and order of elements in Key and Indices(LogicalVals).
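The relationship between Indices and LogicalVals for repeated keys can be sketched in plain MATLAB, without a BioIndexedFile object. This is only an illustrative emulation of the documented behavior, not the toolbox implementation. The key list fileKeys and its contents are hypothetical, and the sketch assumes that every requested key occurs at least once.

```matlab
% Hypothetical key column of a source file; entries 2 and 4 share a key
fileKeys = {'readA'; 'readB'; 'readC'; 'readB'; 'readD'};
Key = {'readB', 'readD'};

Indices = [];
LogicalVals = logical([]);
for k = 1:numel(Key)
    % All entries whose key matches the k-th requested key
    hits = find(strcmp(fileKeys, Key{k}));
    % Flag only the last match (assumes the key occurs at least once)
    mask = false(size(hits));
    mask(end) = true;
    % Matches for each key stay grouped in the order of Key
    Indices = [Indices; hits];
    LogicalVals = [LogicalVals; mask];
end

% One index per requested key, in the same order as Key
LastIndices = Indices(LogicalVals);
```

In this sketch, Indices has three elements for two keys because 'readB' is repeated, while Indices(LogicalVals) has exactly one element per key.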

Input Arguments

BioIFobj

Object of the BioIndexedFile class.

Key

Character vector or cell array of character vectors specifying one or more keys in the source file associated with BioIFobj, the BioIndexedFile object.

Output Arguments

Indices

Numeric vector of the indices of entries in the source file that have the alphanumeric keys specified by Key.

LogicalVals

Logical vector containing the same number of elements as Indices. The vector indicates only the last match for each key specified in Key, such that there is a one-to-one relationship between the number and order of elements in Key and Indices(LogicalVals).

Tip

Some files contain repeated keys. For example, SAM-formatted files use the same key for entries that are paired-end reads. Use the Indices(LogicalVals) syntax to return only the last index of a repeated key. For more information, see Examples.

Examples

Construct a BioIndexedFile object to access a table containing cross-references between gene names and gene ontology (GO) terms:

% Create variable containing full absolute path of source file
sourcefile = which('yeastgenes.sgd');
% Create a BioIndexedFile object from the source file. Indicate
% the source file is a tab-delimited file where contiguous rows
% with the same key are considered a single entry. Store the
% index file in the Current Folder. Indicate that keys are
% located in column 3 and that header lines are prefaced with !
gene2goObj = BioIndexedFile('mrtab', sourcefile, '.', ...
                            'KeyColumn', 3, 'HeaderPrefix','!')

Return the indices for the entries in the source file that are specified by the keys AAC1 and AAD10.

% Access indices for entries that have the keys AAC1 and AAD10
indices = getIndexByKey(gene2goObj, {'AAC1' 'AAD10'})
indices =

           3
           5

Construct a BioIndexedFile object to access a SAM-formatted file that has repeated keys.

% Create variable containing full absolute path of source file
samsourcefile = which('ex1.sam');
% Create a BioIndexedFile object from the source file. Store the
% index file in the Current Folder. 
samObj = BioIndexedFile('sam', samsourcefile, '.')

Return only the last indices for the entries in the source file that are specified by two keys, B7_593:7:15:244:876 and EAS56_65:4:296:78:421, both of which are repeated keys.

% Return all indices for entries that have two specific keys
[Indices, LogicalVal] = getIndexByKey(samObj, ...
                  {'B7_593:7:15:244:876', 'EAS56_65:4:296:78:421'})
Indices =

        3058
        3238
        3292
        3293

LogicalVal =

     0
     1
     0
     1

% Return only the last index for each key
LastIndices = Indices(LogicalVal)
LastIndices =

        3238
        3293

Tips

Use this method to determine the indices of specific entries with known keys.
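Once the indices are known, they can be passed on to retrieve the entries themselves. The following is a minimal sketch that reuses the SAM example above. It assumes that Bioinformatics Toolbox and its sample file ex1.sam are available, and that getEntryByIndex is the companion BioIndexedFile method for reading entries by index. Check the signature in your release before relying on it.

```matlab
% Assumes Bioinformatics Toolbox and its sample file ex1.sam are on the path
samsourcefile = which('ex1.sam');
% Store the index file in the Current Folder
samObj = BioIndexedFile('sam', samsourcefile, '.');

keys = {'B7_593:7:15:244:876', 'EAS56_65:4:296:78:421'};
[Indices, LogicalVals] = getIndexByKey(samObj, keys);

% Keep one index per key (the last match for each repeated key)
lastIndices = Indices(LogicalVals);

% Assumption: getEntryByIndex returns the source-file entries at these indices
lastEntries = getEntryByIndex(samObj, lastIndices);
```

Filtering with LogicalVals before the lookup keeps the result aligned with keys, one entry per key. Passing Indices unfiltered would instead return every entry that shares a repeated key.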