bioinfo.pipeline.block.MakeBlastDatabase

Create local BLAST database

Since R2024a

Description

A MakeBlastDatabase block enables you to create a local BLAST+ database [1][2].

bioinfo.pipeline.block.MakeBlastDatabase requires the BLAST+ Support Package for Bioinformatics Toolbox™. If this support package is not installed, then the function provides a download link. For details, see Bioinformatics Toolbox Software Support Packages.

Creation

Description

b = bioinfo.pipeline.block.MakeBlastDatabase creates a MakeBlastDatabase block.

b = bioinfo.pipeline.block.MakeBlastDatabase(options) also specifies additional options.

b = bioinfo.pipeline.block.MakeBlastDatabase(Name=Value) specifies additional options as the property names and values of a MakeDatabaseOptions object. These property values are assigned to the Options property of the block.
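
The creation syntaxes above can be sketched as follows. This is a minimal illustration that assumes the BLAST+ support package is installed. The Title option is taken from the example on this page, and the value "Demo_DB" is a hypothetical placeholder.

```matlab
% Default block with a default options object.
b1 = bioinfo.pipeline.block.MakeBlastDatabase;

% Name=Value syntax: Title is a MakeDatabaseOptions property.
% "Demo_DB" is a hypothetical title used only for illustration.
b2 = bioinfo.pipeline.block.MakeBlastDatabase(Title="Demo_DB");

% The value is assigned to the Options property of the block.
disp(b2.Options.Title)

% Options syntax: reuse an existing MakeDatabaseOptions object.
b3 = bioinfo.pipeline.block.MakeBlastDatabase(b2.Options);
```

Passing an options object is convenient when several blocks should share the same settings.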

Input Arguments

options

BLAST database options, specified as a MakeDatabaseOptions object, string, or character vector.

If you specify a string or character vector, it must use the native makeblastdb syntax, with each option prefixed by a dash.

Data Types: char | string
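
As a hedged sketch of the string form, the following passes options in native makeblastdb syntax. The -title and -parse_seqids flags belong to the makeblastdb command-line tool, and "Demo_DB" is a hypothetical title. How the block combines such a string with its other properties is not covered by this sketch.

```matlab
% Options in native makeblastdb syntax; each flag starts with a dash.
% "Demo_DB" is a hypothetical title used only for illustration.
nativeOpts = "-title Demo_DB -parse_seqids";

% Pass the option string when creating the block.
b = bioinfo.pipeline.block.MakeBlastDatabase(nativeOpts);
```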

Properties

ErrorHandler

Function to handle errors from the run method of the block, specified as a function handle. The handle specifies the function to call if the run method encounters an error within a pipeline. For the pipeline to continue after a block fails, ErrorHandler must return a structure that is compatible with the output ports of the block. The error handling function is called with the following two inputs:

  • Structure with these fields:

    Field        Description
    identifier   Identifier of the error that occurred
    message      Text of the error message
    index        Linear index indicating which block process failed in the parallel run. By default, the index is 1 because there is only one run per block. For details on how block inputs can be split across different dimensions for multiple run calls, see Bioinformatics Pipeline SplitDimension.

  • Input structure passed to the run method when it fails

Data Types: function_handle
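
A minimal sketch of an error handler with the two inputs described above. The function name handleDatabaseError, the warning identifier, and the empty placeholder value returned for BlastDatabase are assumptions made for illustration. Whether downstream blocks can work with such a placeholder depends on the pipeline.

```matlab
b = bioinfo.pipeline.block.MakeBlastDatabase;
b.ErrorHandler = @handleDatabaseError;

function out = handleDatabaseError(err, inputs)
    % err has the fields identifier, message, and index.
    % inputs is the input structure passed to the failed run call.
    warning("MakeBlastDatabase:runFailed", ...
        "Run %d failed (%s): %s. Input fields: %s", ...
        err.index, err.identifier, err.message, ...
        strjoin(fieldnames(inputs), ", "));

    % Return a structure whose field matches the block output port.
    % The empty string is a hypothetical placeholder value.
    out = struct("BlastDatabase", "");
end
```

Returning a structure with the BlastDatabase field is what lets the pipeline continue after the failure.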

Inputs

This property is read-only.

Input ports of the block, specified as a structure. The field names of the structure are the names of the block input ports, and the field values are bioinfo.pipeline.Input objects. These objects describe the input port behaviors. The input port names are the expected field names of the input structure that you pass to the block run method.

The MakeBlastDatabase block Inputs structure has the following field:

  • InputFile — Sequence file name. The file must be a text file with one or more sequences in the FASTA format. This input is required and must be satisfied.

Data Types: struct

Outputs

This property is read-only.

Output ports of the block, specified as a structure. The field names of the structure are the names of the block output ports, and the field values are bioinfo.pipeline.Output objects. These objects describe the output port behaviors. The field names of the output structure returned by the block run method are the same as the output port names.

The MakeBlastDatabase block Outputs structure has one field, BlastDatabase, which contains the full path to the output database.

Data Types: struct
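
To check the port names described above, you can inspect the Inputs and Outputs structures directly. This is a small sketch that assumes the BLAST+ support package is installed.

```matlab
b = bioinfo.pipeline.block.MakeBlastDatabase;

% List the port names; these are the field names of each structure.
disp(fieldnames(b.Inputs))
disp(fieldnames(b.Outputs))

% Each field value is a port object that describes the port behavior.
disp(class(b.Inputs.InputFile))
disp(class(b.Outputs.BlastDatabase))
```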

Options

BLAST database options, specified as a MakeDatabaseOptions object. The default value is a default MakeDatabaseOptions object.

DatabaseFilename

Name of the output BLAST database, specified as a string scalar or character vector.

Data Types: char | string

Type

Type of BLAST database to create, specified as "nucleotide" or "protein".

Data Types: char | string
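
As a brief sketch, the properties above can be set after creating the block. The database name and title below are hypothetical placeholders.

```matlab
b = bioinfo.pipeline.block.MakeBlastDatabase;

% Hypothetical name for the output database.
b.DatabaseFilename = "demo_prot_db";

% Build a protein database instead of a nucleotide one.
b.Type = "protein";

% Further settings live on the Options property.
b.Options.Title = "Demo_Protein_DB";   % hypothetical title
```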

Object Functions

compile       Perform block-specific additional checks and validations
copy          Copy array of handle objects
emptyInputs   Create input structure for use with run method
eval          Evaluate block object
run           Run block object
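
The following sketch shows how emptyInputs and run might be combined to run the block on its own, outside a pipeline. It assumes that run accepts the block and an input structure, that the InputFile field accepts a file name, and that a FASTA file with the hypothetical name mySequences.fasta exists in the current folder.

```matlab
b = bioinfo.pipeline.block.MakeBlastDatabase;
b.DatabaseFilename = "demo_nucl_db";   % hypothetical database name
b.Type = "nucleotide";

% Create an input structure whose fields match the block input ports.
in = emptyInputs(b);

% Hypothetical FASTA file; replace it with a real file.
in.InputFile = "mySequences.fasta";

% Run the block. The output structure fields match the output port names.
out = run(b, in);
disp(out.BlastDatabase)
```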

Examples

Import the Pipeline class and the block classes so that you can create these objects without specifying the full namespace.

import bioinfo.pipeline.Pipeline
import bioinfo.pipeline.block.*

Create a pipeline.

P = Pipeline;

Create an SRAFasterqDump block to download paired-end sequencing data in the FASTA format using the run accession number SRR26273031.

sraBlock                     = SRAFasterqDump;
sraBlock.Inputs.SRRID.Value  = "SRR26273031";
sraBlock.Options.FastaOutput = true;
addBlock(P,sraBlock);

Create a local nucleotide BLAST+ database.

bpDatabase                  = MakeBlastDatabase;
bpDatabase.DatabaseFilename = "SRR26273031_nucl_db";
bpDatabase.Type             = "nucleotide";
bpDatabase.Options.Title    = "SRR26273031_Nucleotide_DB";
addBlock(P,bpDatabase);

Connect sraBlock and bpDatabase.

connect(P,sraBlock,bpDatabase,["Reads","InputFile"]);

Create a BLASTN block to search the created BLAST+ nucleotide database using the blastn query program. One of the required block inputs is the name of the FASTA file that contains the nucleotide query sequences.

bnBlock                             = BLASTN;
queryFile                           = which("queryFile.fasta");
bnBlock.Inputs.QueryFile.Value      = queryFile;

Connect bpDatabase and bnBlock.

addBlock(P,bnBlock);
connect(P,bpDatabase,bnBlock,["BlastDatabase","BlastDatabase"]);

Perform the blastn search by running the pipeline.

run(P);

The BLAST report is saved in the results folder of the BLASTN block.

blastnResults = results(P,bnBlock)
blastnResults = struct with fields:
    BlastReport: [1×1 bioinfo.pipeline.datatype.File]

Display the location of the file using the unwrap function.

unwrap(blastnResults.BlastReport)
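
To look at the report contents rather than only its location, a sketch such as the following could be used. It assumes that unwrap returns a single file path and that the report is a plain-text file. The 20-line limit is arbitrary.

```matlab
% Get the report location as text.
reportFile = unwrap(blastnResults.BlastReport);

% Read the report into a string array, one element per line.
reportLines = readlines(reportFile);

% Show at most the first 20 lines.
nShow = min(20, numel(reportLines));
disp(reportLines(1:nShow))
```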

You can also run other query programs by creating the corresponding query block. For example, create a TBLASTX block, which searches translated nucleotide queries against a translated nucleotide database.

tbxBlock                         = TBLASTX;
tbxBlock.Inputs.QueryFile.Value  = queryFile;
addBlock(P,tbxBlock);
connect(P,bpDatabase,tbxBlock,["BlastDatabase","BlastDatabase"]);

Perform the tblastx search by running the pipeline.

run(P);

The BLAST report is saved in the results folder of the TBLASTX block.

tblastxResults = results(P,tbxBlock)
tblastxResults = struct with fields:
    BlastReport: [1×1 bioinfo.pipeline.datatype.File]

Display the location of the file using the unwrap function.

unwrap(tblastxResults.BlastReport)

References

[1] Camacho, Christiam, George Coulouris, Vahram Avagyan, Ning Ma, Jason Papadopoulos, Kevin Bealer, and Thomas L. Madden. “BLAST+: Architecture and Applications.” BMC Bioinformatics 10, no. 1 (December 2009): 421.

[2] “BLAST: Basic Local Alignment Search Tool.” https://blast.ncbi.nlm.nih.gov/Blast.cgi.

Version History

Introduced in R2024a